A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation
From Computer Vision Freiburg: Recent work has shown that optical flow estimation can be formulated as a supervised learning task and can be successfully solved with convolutional networks. Training of the so-called FlowNet was enabled by a large synthetically generated dataset. The present paper extends the concept of optical flow estimation via convolutional networks to disparity and scene flow estimation. To this end, we propose three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks. Our datasets are the first large-scale datasets to enable training and evaluating scene flow methods. Besides the datasets, we present a convolutional network for real-time disparity estimation that provides state-of-the-art results. By combining a flow and disparity estimation network and training it jointly, we demonstrate the first scene flow estimation with a convolutional network.
This video shows impressions from various parts of our dataset, as well as state-of-the-art real-time disparity estimation results produced by one of our new CNNs... (full paper)
From Manuel Ruder, Alexey Dosovitskiy, Thomas Brox of the University of Freiburg:
In the past, manually re-drawing an image in a certain artistic style required a professional artist and a long time. Doing this for a video sequence single-handed was beyond imagination. Nowadays computers provide new possibilities. We present an approach that transfers the style from one image (for example, a painting) to a whole video sequence. We make use of recent advances in style transfer in still images and propose new initializations and loss functions applicable to videos. This allows us to generate consistent and stable stylized video sequences, even in cases with large motion and strong occlusion. We show that the proposed method clearly outperforms simpler baselines both qualitatively and quantitatively... (pdf paper)
Efficient 3D Object Segmentation from Densely Sampled Light Fields with Applications to 3D Reconstruction
From Kaan Yücer, Alexander Sorkine-Hornung, Oliver Wang, Olga Sorkine-Hornung:
Precise object segmentation in image data is a fundamental problem with various applications, including 3D object reconstruction. We present an efficient algorithm to automatically segment a static foreground object from highly cluttered background in light fields. A key insight and contribution of our paper is that a significant increase of the available input data can enable the design of novel, highly efficient approaches. In particular, the central idea of our method is to exploit high spatio-angular sampling on the order of thousands of input frames, e.g. captured as a hand-held video, such that new structures are revealed due to the increased coherence in the data. We first show how purely local gradient information contained in slices of such a dense light field can be combined with information about the camera trajectory to make efficient estimates of the foreground and background. These estimates are then propagated to textureless regions using edge-aware filtering in the epipolar volume. Finally, we enforce global consistency in a gathering step to derive a precise object segmentation both in 2D and 3D space, which captures fine geometric details even in very cluttered scenes. The design of each of these steps is motivated by efficiency and scalability, allowing us to handle large, real-world video datasets on a standard desktop computer... (paper)
Sher Minn Chong wrote a good introduction to image processing in Python:
In this article, I will go through some basic building blocks of image processing, and share some code and approaches to basic how-tos. All code written is in Python and uses OpenCV, a powerful image processing and computer vision library...
... When we’re trying to gather information about an image, we’ll first need to break it up into the features we are interested in. This is called segmentation. Image segmentation is the process of representing an image in segments to make it more meaningful and easier to analyze.
One of the simplest ways of segmenting an image is thresholding. The basic idea of thresholding is to replace each pixel in an image with a white pixel if a channel value of that pixel exceeds a certain threshold... (full tutorial) (iPython Notebook)
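As a rough illustration of the thresholding idea quoted above, here is a minimal pure-Python sketch. It is not taken from the tutorial: the function name `threshold`, the sample image, and the choice to make pixels at or below the threshold black are assumptions for illustration. The tutorial itself uses OpenCV, where `cv2.threshold` with `cv2.THRESH_BINARY` does the same job on real images.

```python
def threshold(image, thresh, max_value=255):
    """Binary-threshold a grayscale image given as a list of rows.

    Pixels strictly above `thresh` become `max_value` (white);
    all others become 0 (black). Hypothetical helper, not from the tutorial.
    """
    return [
        [max_value if pixel > thresh else 0 for pixel in row]
        for row in image
    ]


# A tiny 2x3 grayscale "image" with values in 0..255 (made-up data).
gray = [
    [12, 200, 90],
    [130, 40, 255],
]

binary = threshold(gray, 127)
print(binary)  # [[0, 255, 0], [255, 0, 255]]
```

The result contains only two values, so bright regions separate cleanly from dark ones. Choosing the threshold is the hard part in practice and depends on the image.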
From University of Tokyo:
In our laboratory, the Lumipen system has been proposed to solve the time-geometric inconsistency caused by the delay when using dynamic objects. It consists of a projector and a high-speed optical axis controller with high-speed vision and mirrors, called Saccade Mirror (1ms Auto Pan-Tilt technology). Lumipen can provide projected images that are fixed on dynamic objects such as bouncing balls. However, the robustness of the tracking is sensitive to the simultaneous projection on the object, as well as the environmental lighting... (full article)
From David Stolarsky:
The goal of Frankenimage is to reconstruct input (target) images with pieces of images from a large image database (the database images).
Frankenimage is deliberately in contrast with traditional photomosaics. In traditional photomosaics, more often than not, the database images that are composed together to make up the target image are so small as to be little more than glorified pixels. Frankenimage aims instead for component database images to be as large as possible in the final composition, taking advantage of structure in each database image, instead of just its average color. In this way, database images retain their own meaning, allowing for real artistic juxtaposition to be achieved between target and component images... (full description and pseudo code)
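To make the contrast concrete, the sketch below shows the traditional photomosaic baseline described above, where each database image is reduced to its average color and matched to a target region by color distance alone. This is not Frankenimage's method. The names `average_color` and `closest_tile` and the tiny database are hypothetical, chosen only for illustration.

```python
def average_color(tile):
    """Mean (R, G, B) of a tile given as rows of (r, g, b) tuples."""
    pixels = [pixel for row in tile for pixel in row]
    count = len(pixels)
    return tuple(sum(pixel[c] for pixel in pixels) / count for c in range(3))


def closest_tile(target_color, database):
    """Name of the database tile whose average color is nearest to target_color."""
    def squared_distance(name):
        avg = average_color(database[name])
        return sum((a - b) ** 2 for a, b in zip(avg, target_color))

    return min(database, key=squared_distance)


# Two made-up 1x2 database tiles.
database = {
    "red": [[(250, 10, 10), (240, 20, 0)]],
    "blue": [[(0, 0, 255), (10, 20, 230)]],
}

match = closest_tile((200, 30, 30), database)
print(match)  # red
```

Because only the average color is compared, any structure inside a tile is ignored. That is the limitation the description above says Frankenimage aims to avoid.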