In this paper, we introduce the TOF technology and the theory of operation. We also compare TOF sensors with 2D machine vision and other 3D vision technologies and highlight TOF sensors’ differentiating advantages.
Time-of-Flight Camera – An Introduction
Larry Li | Texas Instruments / Mouser Electronics
Reprinted with permission from Mouser Electronics
3D Time-of-Flight (TOF) technology is revolutionizing the machine vision industry by providing 3D imaging using a low-cost CMOS pixel array together with an active modulated light source. Compact construction and ease of use, together with high accuracy and frame rate, make TOF cameras an attractive solution for a wide range of applications. In this article, we cover the basics of TOF operation and compare TOF with other 2D/3D vision technologies. We then explore various applications that benefit from TOF sensing, such as gesturing and 3D scanning and printing. Finally, resources that help readers get started with Texas Instruments’ 3D TOF solution are provided.
2. Theory of Operation
A 3D time-of-flight (TOF) camera works by illuminating the scene with a modulated light source and observing the reflected light. The phase shift between the illumination and the reflection is measured and translated to distance. Figure 1 illustrates the basic TOF concept. Typically, the illumination comes from a solid-state laser or an LED operating in the near-infrared range (~850 nm), which is invisible to the human eye. An imaging sensor designed to respond to the same spectrum receives the light and converts the photonic energy to electrical current. Note that the light entering the sensor has an ambient component and a reflected component. Distance (depth) information is embedded only in the reflected component. Therefore, a high ambient component reduces the signal-to-noise ratio (SNR).
Figure 1: 3D time-of-flight camera operation.
To detect phase shifts between the illumination and the reflection, the light source is pulsed or modulated by a continuous-wave (CW) source, typically a sinusoid or square wave. Square wave modulation is more common because it can be easily realized using digital circuits.
Pulsed modulation can be achieved by integrating photoelectrons from the reflected light, or by starting a fast counter at the first detection of the reflection. The latter requires a fast photo-detector, usually a single-photon avalanche diode (SPAD). This counting approach necessitates fast electronics, since achieving 1 millimeter accuracy requires timing a pulse of 6.6 picoseconds in duration. This level of accuracy is nearly impossible to achieve in silicon at room temperature.
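The picosecond figure follows from the round trip that light makes to the target and back. The short Python sketch below is only an illustration of that arithmetic; the function name is hypothetical and not part of any TOF toolkit.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def round_trip_time(distance_m: float) -> float:
    """Time for light to reach a target at distance_m and return, in seconds."""
    return 2.0 * distance_m / C


# Resolving 1 mm of depth means resolving this much round-trip time.
dt_ps = round_trip_time(0.001) * 1e12
print(f"1 mm of depth corresponds to {dt_ps:.2f} ps of round-trip time")
```

The result is a little under 7 ps, in line with the figure quoted above.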
Figure 2: Two time-of-flight methods: pulsed (top) and continuous-wave (bottom).
The pulsed method is straightforward. The light source illuminates for a brief period (∆t), and the reflected energy is sampled at every pixel, in parallel, using two out-of-phase windows, C1 and C2, with the same ∆t. Electrical charges accumulated during these samples, Q1 and Q2, are measured and used to compute distance using the formula:
\[ d = \frac{1}{2}\, c\, \Delta t \left( \frac{Q_2}{Q_1 + Q_2} \right) \tag{1} \]
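As a minimal sketch, the pulsed computation can be written in a few lines of Python. It assumes the commonly cited relation in which the fraction of charge landing in the delayed window C2 scales the half round-trip length of the pulse; the function and parameter names are illustrative only.

```python
C = 299_792_458.0  # speed of light, m/s


def pulsed_distance(q1: float, q2: float, pulse_width_s: float) -> float:
    """Distance from two out-of-phase charge samples.

    Assumed relation: d = 0.5 * c * dt * Q2 / (Q1 + Q2),
    where Q1 is collected in window C1 and Q2 in the delayed window C2.
    """
    total = q1 + q2
    if total <= 0:
        raise ValueError("no reflected charge collected")
    return 0.5 * C * pulse_width_s * q2 / total


# Equal charge in both windows puts the target at half the maximum range.
print(pulsed_distance(q1=1.0, q2=1.0, pulse_width_s=50e-9))
```

Under this relation the maximum measurable range is c·∆t/2, reached when all of the reflected charge falls in the second window.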
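In contrast, the CW method takes multiple samples per measurement, with each sample phase-stepped by 90 degrees, for a total of four samples. Using this technique, the phase angle between illumination and reflection, φ, and the distance, d, can be calculated by
\[ \varphi = \arctan\left( \frac{Q_3 - Q_4}{Q_1 - Q_2} \right) \tag{2} \]
\[ d = \frac{c}{4 \pi f}\, \varphi \tag{3} \]
where f is the modulation frequency. It follows that the measured pixel intensity (A) and offset (B) can be computed by:
\[ A = \frac{\sqrt{(Q_1 - Q_2)^2 + (Q_3 - Q_4)^2}}{2} \tag{4} \]
\[ B = \frac{Q_1 + Q_2 + Q_3 + Q_4}{4} \tag{5} \]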
In all of the equations, c is the speed-of-light constant.
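The four-sample CW computation can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: it uses `math.atan2` instead of a plain arctangent so that the quadrant is resolved, and the `synthetic_samples` helper is a hypothetical ideal-pixel model included only to exercise the function.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def cw_measure(q1, q2, q3, q4, f_mod_hz):
    """Return (phase, distance, amplitude, offset) from four phase-stepped samples."""
    # atan2 keeps the sign of numerator and denominator, so the phase
    # covers the full circle; the modulo maps it into [0, 2*pi).
    phase = math.atan2(q3 - q4, q1 - q2) % (2.0 * math.pi)
    distance = C * phase / (4.0 * math.pi * f_mod_hz)
    amplitude = math.hypot(q1 - q2, q3 - q4) / 2.0
    offset = (q1 + q2 + q3 + q4) / 4.0
    return phase, distance, amplitude, offset


def synthetic_samples(phase, amplitude, offset):
    """Hypothetical ideal samples for a given phase, amplitude and offset."""
    return (
        offset + amplitude * math.cos(phase),
        offset - amplitude * math.cos(phase),
        offset + amplitude * math.sin(phase),
        offset - amplitude * math.sin(phase),
    )


samples = synthetic_samples(phase=1.0, amplitude=100.0, offset=400.0)
print(cw_measure(*samples, f_mod_hz=20e6))
```

Adding a constant to all four samples, or scaling them by a common gain, leaves the recovered phase unchanged, which is easy to confirm by modifying `samples` before the call.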
At first glance, the complexity of the CW method, as compared to the pulsed method, may seem unjustified, but a closer look at the CW equations reveals that the terms (Q3 – Q4) and (Q1 – Q2) remove the effect of any constant offset from the measurements. Furthermore, the quotient in the phase equation cancels the effects of constant gains, such as system amplification and attenuation, or the reflected intensity, from the distance measurements. These are desirable properties.
The reflected amplitude (A) and offset (B) do have an impact on the depth measurement accuracy. The depth measurement uncertainty (standard deviation, σ) can be approximated by:
\[ \sigma = \frac{c}{4 \sqrt{2}\, \pi f} \cdot \frac{\sqrt{A + B}}{c_d\, A} \tag{6} \]
The modulation contrast, c_d, describes how well the TOF sensor separates and collects the photoelectrons. The reflected amplitude, A, is a function of the optical power. The offset, B, is a function of the ambient light and residual system offset. One may infer from Equation 6 that high amplitude, high modulation frequency and high modulation contrast will increase accuracy, while high offset can lead to saturation and reduce accuracy.
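To make these trends concrete, the sketch below evaluates the commonly cited approximation for CW depth noise, σ = c / (4√2·π·f) · √(A + B) / (c_d·A), which is assumed here to be the form of Equation 6. The numeric values are arbitrary illustrations, not measured sensor data.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def depth_sigma(amplitude, offset, f_mod_hz, contrast):
    """Approximate depth standard deviation in metres (assumed form of Eq. 6)."""
    if amplitude <= 0 or contrast <= 0 or f_mod_hz <= 0:
        raise ValueError("amplitude, contrast and frequency must be positive")
    scale = C / (4.0 * math.sqrt(2.0) * math.pi * f_mod_hz)
    return scale * math.sqrt(amplitude + offset) / (contrast * amplitude)


base = depth_sigma(amplitude=100.0, offset=400.0, f_mod_hz=20e6, contrast=0.5)
faster = depth_sigma(amplitude=100.0, offset=400.0, f_mod_hz=40e6, contrast=0.5)
brighter_ambient = depth_sigma(amplitude=100.0, offset=1600.0, f_mod_hz=20e6, contrast=0.5)
print(base, faster, brighter_ambient)
```

Doubling the modulation frequency halves the estimated noise, while a larger offset (more ambient light) increases it.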
At high frequency, the modulation contrast can begin to attenuate due to the physical property of the silicon. This puts a practical upper limit on the modulation frequency. TOF sensors with high roll-off frequency generally can deliver higher accuracy.
Because the CW measurement is based on phase, which wraps around every 2π, the measured distance also aliases. The distance at which aliasing occurs is called the ambiguity distance, d_amb, and is defined as:
\[ d_{amb} = \frac{c}{2 f} \tag{7} \]
Since the distance wraps, d_amb is also the maximum measurable distance. If one wishes to extend the measurable distance, one may reduce the modulation frequency, but at the cost of reduced accuracy, according to Equation 6.
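As a quick illustration, the ambiguity distance for a given modulation frequency follows from setting the phase to 2π in the distance relation, giving c / (2f). The frequencies below are arbitrary examples.

```python
C = 299_792_458.0  # speed of light, m/s


def ambiguity_distance(f_mod_hz: float) -> float:
    """Distance at which the CW phase wraps around: c / (2 f)."""
    return C / (2.0 * f_mod_hz)


for f_mhz in (20, 50, 100):
    print(f"{f_mhz} MHz -> {ambiguity_distance(f_mhz * 1e6):.2f} m")
```

Lowering the frequency from 100 MHz to 20 MHz stretches the unambiguous range from roughly 1.5 m to roughly 7.5 m, which is the trade-off against accuracy described above.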
Instead of accepting this compromise, advanced TOF systems deploy multi-frequency techniques to extend the distance without reducing the modulation frequency. Multi-frequency techniques work by adding one or more modulation frequencies to the mix. Each modulation frequency has a different ambiguity distance, but the true location is the one where the different frequencies agree. The frequency at which the two modulations agree, called the beat frequency, is usually lower and corresponds to a much longer ambiguity distance. The dual-frequency concept is illustrated below.
Figure 3: Extending distance using a multi-frequency technique.
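One simple way to picture the dual-frequency idea is a brute-force search: list every distance consistent with the first frequency, then keep the candidate that best matches the wrapped reading at the second frequency. The sketch below is a hypothetical illustration of that search, not the method used by any particular sensor; `max_range_m` is an assumed parameter supplied by the caller, typically the ambiguity distance of the beat frequency.

```python
C = 299_792_458.0  # speed of light, m/s


def ambiguity_distance(f_mod_hz):
    return C / (2.0 * f_mod_hz)


def wrap_error(a, b, period):
    """Smallest circular difference between a and b on a ring of given period."""
    diff = (a - b) % period
    return min(diff, period - diff)


def unwrap_dual(d1, d2, f1_hz, f2_hz, max_range_m):
    """Pick the distance <= max_range_m that agrees with both wrapped readings."""
    amb1 = ambiguity_distance(f1_hz)
    amb2 = ambiguity_distance(f2_hz)
    best, best_err = None, float("inf")
    k = 0
    while d1 + k * amb1 <= max_range_m:
        candidate = d1 + k * amb1          # consistent with frequency 1
        err = wrap_error(candidate, d2, amb2)  # mismatch at frequency 2
        if err < best_err:
            best, best_err = candidate, err
        k += 1
    return best


f1, f2 = 80e6, 60e6
true_distance = 5.0
d1 = true_distance % ambiguity_distance(f1)  # what an 80 MHz pixel would report
d2 = true_distance % ambiguity_distance(f2)  # what a 60 MHz pixel would report
print(unwrap_dual(d1, d2, f1, f2, max_range_m=ambiguity_distance(20e6)))
```

Here each frequency alone wraps within about 2.5 m, yet the pair recovers a 5 m target because their 20 MHz beat frequency has an ambiguity distance of about 7.5 m. With noisy readings, a practical system would also need to reject candidates whose mismatch is too large.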
3. Point Cloud
In TOF sensors, distance is measured for every pixel in a 2D addressable array, resulting in a depth map. A depth map is a collection of 3D points (each point also known as a voxel). As an example, a QVGA sensor will have a depth map of 320 x 240 voxels. A 2D representation of a depth map is a gray-scale image, as illustrated by the soda cans example in Figure 4: the brighter the intensity, the closer the voxel.
Figure 4: Depth map of soda cans.
Alternatively, a depth map can be rendered in a three-dimensional space as a collection of points, or point-cloud. The 3D points can be mathematically connected to form a mesh onto which a texture surface can be mapped. If the texture is from a real-time color image of the same subject, a life-like 3D rendering of the subject will emerge, as is illustrated by the avatar in Figure 5. One may be able to rotate the avatar to view different perspectives.
Figure 5: Avatar formed from point-cloud.
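The step from depth map to point cloud can be sketched with a simple pinhole camera model. The example below is a minimal illustration under loud assumptions: the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical calibration values, and each depth value is treated as the z coordinate along the optical axis (a real TOF pixel reports radial distance, which would need an extra correction).

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Convert a depth map (rows of z values in metres) to a list of (x, y, z).

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


tiny_depth_map = [
    [1.0, 2.0],
    [0.0, 4.0],  # 0.0 marks an invalid pixel
]
cloud = depth_to_points(tiny_depth_map, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
print(cloud)
```

The resulting list of points is what a mesh or texture-mapping step would consume.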
4. Other Vision Technologies
Time-of-flight technology is not the only vision technology available. In this section, we will compare TOF to the classical 2D machine vision and other 3D vision technologies. A table summarizing the comparison is included at the end of this section.
2D Machine Vision
Most machine vision systems deployed today are 2D, a cost-effective approach when lighting is closely controlled. They are well-suited for inspection applications where defects are detected using well-known image processing techniques, such as edge detection, template matching and morphology open/close. These algorithms extract critical feature parameters that are compared to a database for pass-fail determination. To detect defects along the z-axis, an additional 1D sensor or 3D vision is often deployed.
2D vision can also be used in unstructured environments with the aid of advanced image processing algorithms that work around complications caused by varying illumination and shading conditions. Take the images in Figure 6, for example. These images are of the same face, but under very different lighting. The shading differences can make face recognition difficult even for humans.
In contrast, computer recognition using point cloud data from TOF sensors is largely unaffected by shading, since illumination is provided by the TOF sensor itself, and the depth measurement is extracted from phase measurement, not image intensity.
Figure 6: Same face, different shading.
3D Machine Vision
Robust 3D vision overcomes many problems of 2D vision, as the depth measurement can be used to easily separate foreground from background. This is particularly useful for scene understanding, where the first step is to segment the subject of interest (foreground) from other parts of the image (background).
Gesture recognition, for example, involves scene understanding. Using distance as a discriminator, a TOF sensor enables separation of the face, hands, and fingers from the rest of the image, so gesture recognition can be achieved with high confidence.
Figure 7: Advantages of 3D vision over 2D.
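Using distance as a discriminator can be as simple as thresholding the depth map. The sketch below is a deliberately minimal illustration; the near and far limits are arbitrary assumptions, and real gesture pipelines add filtering and connected-component analysis on top.

```python
def foreground_mask(depth, near_m, far_m):
    """Mark pixels whose depth lies within [near_m, far_m] as foreground."""
    return [[near_m <= z <= far_m for z in row] for row in depth]


# A hand at about 0.5 m in front of a wall at about 2.5 m.
depth_map = [
    [2.5, 2.5, 2.5],
    [2.5, 0.5, 0.6],
    [2.5, 0.5, 2.5],
]
mask = foreground_mask(depth_map, near_m=0.2, far_m=1.0)
for row in mask:
    print("".join("#" if is_fg else "." for is_fg in row))
```

No color or texture information is involved, so the separation works regardless of how the scene is lit.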
In the next two subsections we will compare the TOF technology with two other 3D vision technologies: stereo vision and structured-light.
Stereo Vision vs. TOF
Stereo vision generally uses two cameras separated by a distance, in a physical arrangement similar to the human eyes. Given a point-like object in space, the camera separation leads to a measurable disparity between the object positions in the two camera images. Using a simple pin-hole camera model, the angular position of the object in each image can be computed; we denote these angles by α and β. With these angles, the depth, z, can be computed.
Figure 8: Stereopsis--depth through disparity measurement.
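The triangulation step can be sketched as follows, assuming α and β are the angles between the baseline and the rays from each camera to the object, and b is the camera separation. Under that convention, which is an assumption about the figure's geometry, the depth is z = b / (1/tan α + 1/tan β).

```python
import math


def stereo_depth(baseline_m, alpha_rad, beta_rad):
    """Depth of a point seen at angles alpha and beta from the two cameras.

    Both angles are measured between the baseline and the viewing ray.
    """
    return baseline_m / (1.0 / math.tan(alpha_rad) + 1.0 / math.tan(beta_rad))


# A point centred between two cameras 10 cm apart, seen at 45 degrees from each.
print(stereo_depth(0.10, math.radians(45), math.radians(45)))
```

As the object moves farther away, both angles approach 90 degrees and small angular errors translate into large depth errors.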
A major challenge in stereo vision is solving the correspondence problem: given a point in one image, how does one find the same point in the other image? Until the correspondence is established, disparity, and therefore depth, cannot be accurately determined. Solving the correspondence problem involves complex, computationally intensive algorithms for feature extraction and matching. Feature extraction and matching also require sufficient intensity and color variation in the image for robust correlation. This requirement renders stereo vision less effective if the subject lacks these variations, for example when measuring the distance to a uniformly colored wall. TOF sensing does not have this limitation because it does not depend on color or texture to measure the distance.
In stereo vision, the depth resolution error is a quadratic function of the distance. A TOF sensor, which works off reflected light, is also sensitive to distance. The difference is that, for TOF, this shortcoming can be remedied by increasing the illumination energy when necessary, and the intensity information is used as a “confidence” metric to maximize accuracy using Kalman filter-like techniques.
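The quadratic growth can be seen from the standard disparity model z = f·b/d, which is assumed here for illustration: differentiating gives a depth error of roughly z²·δd / (f·b) for a disparity error δd. The focal length, baseline and matching error below are hypothetical values.

```python
def stereo_depth_error(z_m, focal_px, baseline_m, disparity_err_px):
    """Approximate depth error for a given disparity error, from z = f*b/d."""
    return (z_m ** 2) * disparity_err_px / (focal_px * baseline_m)


for z in (1.0, 2.0, 4.0):
    err = stereo_depth_error(z, focal_px=500.0, baseline_m=0.10, disparity_err_px=0.25)
    print(f"z = {z} m -> depth error of about {err * 1000:.0f} mm")
```

Each doubling of the distance quadruples the estimated depth error under this model.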
Stereo vision has some advantages. The implementation cost is very low, as most common off-the-shelf cameras can be used. Also, the human-like physical configuration makes stereo vision well-suited for capturing images for intuitive presentation to humans, so that both humans and machines are looking at the same images.
Structured-Light vs. TOF
Structured light works by projecting known patterns onto the subject and inspecting the pattern distortion. Successive projections of coded or phase-shifted patterns are often required to extract a single depth frame, which leads to a lower frame rate. A low frame rate means the subject must remain relatively still during the projection sequence to avoid blurring. The reflected pattern is sensitive to optical interference from the environment; therefore, structured light tends to be better suited for indoor applications. A major advantage of structured light is that it can achieve relatively high spatial (X-Y) resolution by using off-the-shelf DLP projectors and HD color cameras. Figure 9 shows the structured-light concept.
Figure 9: Structured-light concept.
By comparison, TOF is less sensitive to mechanical alignment and environmental lighting conditions, and is more mechanically compact. The current TOF technology has lower resolution than today’s structured-light, but is rapidly improving.
The comparison of TOF cameras with stereo vision and structured light is summarized in Table 1. The key takeaway is that TOF is a cost-effective, mechanically compact depth imaging solution that is unaffected by varying environmental illumination and that vastly simplifies the figure-ground separation commonly required in scene understanding. This powerful combination makes TOF sensors well-suited for a wide variety of applications.
Figure 10: Comparison of 3D Imaging Technologies
TOF technology can be applied to applications ranging from automotive, industrial and healthcare to smart advertising, gaming and entertainment. A TOF sensor could also serve as an excellent input device for both stationary and portable computing devices. In automotive, TOF sensors could enable autonomous driving and increased surrounding awareness for safety. In the industrial segment, TOF sensors could be used as a human-machine interface (HMI) and for enforcing safety envelopes in automation cells where humans and robots may need to work in close proximity. In smart advertising, using TOF sensors for gesture input and human recognition, digital signage could become highly interactive, targeting media content to the specific live audience. In healthcare, gesture recognition offers non-contact human-machine interaction, fostering a more sanitary operating environment. The gesturing capability is particularly well-suited for consumer electronics, especially gaming, portable computing, and home entertainment. The TOF sensor’s natural interface provides an intuitive gaming interface for first-person video games. This same interface could also replace remote controls, mice and touch screens.
Generally speaking, TOF applications can be categorized into gesture and non-gesture applications. Gesture applications emphasize human interaction and speed, while non-gesture applications emphasize measurement accuracy.
Figure 11: TOF technology applies to a wide range of applications.
Gesture applications translate human movements (faces, hands, fingers or whole-body) into symbolic directives to command gaming consoles, smart televisions, or portable computing devices. For example, channel surfing can be done by waving a hand, and a presentation can be scrolled by flicking a finger. These applications usually require fast response time, low- to medium-range operation, centimeter-level accuracy, and low power consumption.
Figure 13: Gesture recognition using a 3D-TOF sensor and SoftKinetic iisu® middleware.
TOF sensors can be used in non-gesture applications as well. For instance, in automotive, a TOF camera can increase safety by alerting the driver when it detects people and objects in the vicinity of the car, and it can support computer-assisted driving. In robotics and automation, TOF sensors can help detect product defects and enforce the safety envelopes required for humans and robots to work in close proximity. With 3D printing rapidly becoming popular and affordable, TOF cameras can be used to perform 3D scanning to enable a “3D copier” capability. In all of these applications, spatial accuracy is important.
In this paper, we introduced the TOF technology and the theory of operation. We also compared TOF sensors with 2D machine vision and other 3D vision technologies and highlighted TOF sensors’ differentiating advantages. We also explored a wide range of applications that TOF sensors enable or enhance. To help readers get started, we introduced TI 3D-TOF chipset and CDK, as well as third-party software resources.