Binocular stereo vision is an important form of machine vision. Based on the parallax principle, it recovers three-dimensional geometric information about a scene from multiple images. A binocular stereo vision system typically acquires two digital images of the measured object from different viewpoints, either with two cameras at the same time or with a single camera at different times, and then uses the parallax principle to recover the object's three-dimensional geometry and reconstruct its three-dimensional contour and position. Binocular stereo vision systems have broad application prospects in the field of machine vision.
In the 1980s, Marr of the MIT Artificial Intelligence Laboratory proposed a computational theory of vision and applied it to binocular matching, so that two images with parallax could produce a stereoscopic image with depth. This laid the theoretical foundation for the development of binocular stereo vision. Compared with other stereoscopic methods, such as lenticular-plate three-dimensional imaging, three-dimensional display, and holography, binocular stereo vision directly simulates the way human eyes process a scene and is reliable and simple. It has great application value in many fields, such as pose detection and control in micro-manipulation systems, robot navigation, aerial surveying, three-dimensional measurement, and virtual reality.
Principle and structure of binocular stereo vision
The three-dimensional measurement of binocular stereo vision is based on the parallax principle. Figure 1 shows a simplified schematic of parallel binocular stereo imaging. The distance between the projection centers of the two cameras, i.e., the baseline distance, is B. The origin of each camera coordinate system is at the optical center of its lens, with the axes oriented as shown in Figure 1. In reality the imaging plane of a camera lies behind the optical center of the lens; in Figure 1 the left and right imaging planes are drawn in front of the optical centers, and the u-axis and v-axis of this virtual image-plane coordinate system o1uv are aligned with the x-axis and y-axis of the camera coordinate system, which simplifies the calculation. The origin of each image coordinate system is at the intersection of the camera optical axis with the image plane, O1 or O2. A point P has image coordinates P1 (u1, v1) in the left image and P2 (u2, v2) in the right image. Assuming the image planes of the two cameras lie in the same plane, the vertical image coordinates of P are equal, i.e., v1 = v2. From similar triangles:
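The equations themselves are not reproduced in this English text; from the definitions above, the standard similar-triangle relations for this parallel configuration would be

    u_1 = f x_c / z_c ,    u_2 = f (x_c - B) / z_c ,    v_1 = v_2 = f y_c / z_c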
Where (xc, yc, zc) are the coordinates of point P in the left camera coordinate system, B is the baseline distance, f is the focal length of the two cameras (assumed equal), and (u1, v1) and (u2, v2) are the coordinates of point P in the left and right images, respectively.
Parallax (disparity) is defined as the difference between the positions of the same point in the two images.
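With the notation above, the disparity is

    d = u_1 - u_2 = f B / z_c

i.e., the horizontal offset between the two image points of P.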
From these relations, the coordinates of a spatial point P in the left camera coordinate system can be calculated as follows:
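Following from the projection relations and the disparity definition above (a standard derivation under the ideal parallel geometry):

    x_c = B u_1 / d ,    y_c = B v_1 / d ,    z_c = B f / d

so the depth z_c is inversely proportional to the disparity d.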
Therefore, as long as the corresponding projections of a spatial point on the left and right image planes can be found, and the internal and external parameters of the cameras have been obtained through camera calibration, the three-dimensional coordinates of that point can be determined.
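As a minimal sketch of this step (hypothetical function and variable names, assuming the ideal parallel-camera model above, with the focal length f expressed in pixels and the baseline B known from calibration):

    def triangulate_parallel(u1, v1, u2, f, B):
        """Recover (xc, yc, zc) in the left camera frame for an ideal
        parallel stereo rig.  u1, v1, u2 are pixel coordinates measured
        from each image's principal point; f is the focal length in pixels
        and B is the baseline (same unit as the result)."""
        d = u1 - u2                  # disparity
        if d <= 0:
            raise ValueError("non-positive disparity: bad match or point too far away")
        zc = f * B / d               # depth from disparity
        xc = B * u1 / d
        yc = B * v1 / d
        return xc, yc, zc

    # Example with assumed values: f = 800 px, B = 120 mm
    print(triangulate_parallel(420, 240, 380, f=800, B=120.0))   # -> (1260.0, 720.0, 2400.0)

A real system would first rectify the images and use the full calibrated camera matrices rather than this idealized model.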
The binocular vision measurement probe consists of two cameras and one semiconductor laser.
The semiconductor laser serves as the light source: the point beam it emits passes through a cylindrical lens and is spread into a straight line, and this line laser is projected onto the workpiece surface as the measurement mark line. The laser wavelength is 650 nm, and the width of the scanned laser line is about 1 mm. Two ordinary CCD cameras placed at a certain angle form the sensor for depth measurement. The focal length of the lenses affects the angle between the lens optical axes and the line laser, the distance between the probe and the measured object, and the measurable depth of field.
Vision measurement is a non-contact measurement based on the principle of laser triangulation. The light emitted by the laser is expanded in one direction by the cylindrical lens into a light stripe, which is projected onto the surface of the object being measured. Because of changes in the curvature or depth of the object surface, the stripe is deformed, and the image of the deformed stripe is captured by the camera. In this way, the distance or position of the measured point can be obtained, through the triangular geometric relationship, from the emission angle of the laser beam and its imaging position in the camera.
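One common way to write this triangular relationship (a sketch of the principle, not necessarily the exact geometry of this probe): if the laser emitter and the camera optical center are separated by a baseline b, the laser beam leaves at angle \alpha to the baseline, and the imaged stripe position gives the camera viewing angle \beta to the baseline, then the perpendicular distance of the measured point from the baseline is

    z = b / (\cot\alpha + \cot\beta)

With b and \alpha fixed by the probe design, \beta (and hence z) follows directly from where the stripe falls on the image sensor.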
Much as human eyes judge the distance of objects, the binocular vision sensor images the light stripe with two cameras at the same time and, by matching the two images, obtains the positions of all the stripe pixels in both images. From the parallax, the position and depth of each point can be calculated. If the coordinate of the scan line along the scanning direction is also recorded by the scanning mechanism, the complete contour (i.e., the 3D coordinate points) of the scanned object can be obtained.
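A minimal sketch of this per-frame workflow (hypothetical helper names; it assumes rectified grayscale images so that corresponding stripe points lie on the same image row, and uses the same disparity-to-depth formulas as above):

    import numpy as np

    def stripe_column(gray_row):
        """Locate the laser stripe in one image row as the brightest column.
        A real system would add thresholding and sub-pixel peak fitting."""
        return int(np.argmax(gray_row))

    def profile_from_pair(left_img, right_img, scan_y, f, B):
        """Convert one pair of images of the light stripe into 3D points
        (x, scan_y, z), where scan_y is the scanning-stage coordinate
        recorded for this frame."""
        points = []
        for row in range(left_img.shape[0]):
            u1 = stripe_column(left_img[row])
            u2 = stripe_column(right_img[row])
            d = u1 - u2
            if d > 0:                        # keep only plausible matches
                points.append((B * u1 / d, scan_y, f * B / d))
        return points

Stacking the profiles obtained at successive scan positions yields the 3D point cloud of the scanned surface.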
Generally speaking, the larger the disparity (u1 - u2) of the binocular sensor, the higher the measurement accuracy, and increasing the baseline length improves the accuracy of the visual measurement. However, for a lens of a given focal length, too large a baseline increases the angle between the two optical axes, which causes large image distortion; this is unfavourable for CCD calibration and feature matching and instead reduces the measurement accuracy. Two lenses with a focal length of 8 mm are selected, and a matching baseline length is found through experiments, which ensures that the binocular vision sensor achieves high measurement accuracy within the depth of field of the lenses.
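This trade-off can also be read directly from the depth formula above: differentiating z_c = B f / d gives, approximately,

    \delta z \approx (z_c^2 / (B f)) \, \delta d

so for a given matching (disparity) error \delta d, a longer baseline B or focal length f reduces the depth error, provided that calibration and matching quality can still be maintained.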
Technical characteristics of binocular vision
The realization of binocular stereo vision technology can be divided into the following steps: image acquisition, camera calibration, feature extraction, image matching, and three-dimensional reconstruction. The implementation methods and technical characteristics of each step are introduced in turn below.
Image acquisition
Image acquisition for binocular stereo vision captures the same scene with two cameras (CCDs), or with one camera moved or rotated between two positions, to obtain a stereo image pair. The pinhole model is shown in Figure 1. It is assumed that cameras C1 and C2 have equal focal lengths and internal parameters, that their optical axes are parallel to each other, and that the two imaging planes x1o1y1 and x2o2y2 lie in the same plane; P1 and P2 are the imaging points of the space point P on C1 and C2, respectively. In general, however, the internal parameters of the two cameras in the pinhole model cannot be exactly the same, and there is no guarantee that this ideal parallel arrangement holds in practice.
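For reference, the internal and external parameters enter through the standard pinhole projection, which maps a world point (X_w, Y_w, Z_w) to an image point (u, v):

    s \, [u, v, 1]^T = K \, [R \mid t] \, [X_w, Y_w, Z_w, 1]^T

where K contains the internal parameters (focal lengths and principal point), R and t are the external parameters (rotation and translation of the camera), and s is a scale factor. Camera calibration estimates these quantities for each camera of the stereo pair.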