The present disclosure generally relates to an image processing technology, in particular, to a system and a method for processing images related to depth.
Depth information may be used in object detection, three-dimensional object generation, or other implementations. In conventional approaches, a specific processor (for example, a digital signal processor (DSP) or a central processing unit (CPU)) may perform the depth calculation on the images through software computation to generate the depth information. However, a compact device (such as a smartphone, a tablet, or a head-mounted display) generally has less computing power than other computing devices (such as a desktop computer, a laptop, or a server), so the software computation of the compact device may not provide enough efficiency for depth calculation.
Accordingly, the present disclosure is directed to a system and a method for processing images related to depth, to provide a specific circuit to handle depth-related calculations with hardware computation.
In one of the exemplary embodiments, a system for processing images related to depth includes, but is not limited to, a first image sensor, a second image sensor, a first image processing circuit, and a second image processing circuit. The first image sensor is used for generating a first image. The second image sensor is used for generating a second image. The first image processing circuit is coupled to the first image sensor and the second image sensor. The first image processing circuit is configured to generate depth data corresponding to one or more objects identified in the first image and the second image and generate a first data packet including two of the first image, the second image, and the depth data. The second image processing circuit is coupled to the first image processing circuit. The second image processing circuit is configured to receive the first data packet and perform stereo matching on the first image and the second image.
In one of the exemplary embodiments, a method for processing images related to depth includes, but is not limited to, the following steps. A first image sensor generates a first image. A second image sensor generates a second image. A first image processing circuit generates depth data corresponding to one or more objects identified in the first image and the second image. The first image processing circuit generates a first data packet including two of the first image, the second image, and the depth data. A second image processing circuit receives the first data packet. The second image processing circuit performs stereo matching on the first image and the second image.
It should be understood, however, that this Summary may not contain all of the aspects and embodiments of the present disclosure, is not meant to be limiting or restrictive in any manner, and that the invention as disclosed herein is and will be understood by those of ordinary skill in the art to encompass obvious improvements and modifications thereto.
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments of the disclosure and, together with the description, serve to explain the principles of the disclosure.
Reference will now be made in detail to the present preferred embodiments of the disclosure, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the description to refer to the same or like parts.
The first image sensor 111 and the second image sensor 113 could be infrared ray (IR) sensors, color image sensors, red-green-blue (RGB) sensors, RGB-IR sensors, or depth cameras. In one embodiment, the first image sensor 111 and the second image sensor 113 correspond to the left and right eyes, respectively.
In one embodiment, the system 100 further includes a third image sensor 115. The third image sensor 115 could be an IR sensor, a color image sensor, an RGB sensor, an RGB-IR sensor, or a depth camera.
It should be noted that, for an IR sensor or an RGB-IR sensor, the system 100 may further include an infrared light source (not shown), so that the IR-related sensors can detect the infrared light.
The first image processing circuit 120 could be an image signal processor (ISP), an image chip, or other image-related processors. The first image processing circuit 120 is coupled to the first image sensor 111, the second image sensor 113, and the third image sensor 115.
The depth calculation circuit 122 includes, but is not limited to, an image analysis circuit 123, an object extraction circuit 124, an object depth calculation circuit 125, an overlapped object depth calculation circuit 126, and a multiplexer 127.
In one embodiment, the image analysis circuit 123 is configured to determine whether to adjust pixel values of the images M1 and M2 generated by the first image sensor 111 and the second image sensor 113 to enhance picture quality. For example, when the raw images are too dark, the image analysis circuit 123 may increase exposure values of the raw images to improve picture quality for the following object extraction operation. In some embodiments, the pixel values may be related to chroma, contrast, or other image-related parameters.
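For illustration only, a minimal sketch of the kind of brightness check and gain adjustment described above might look as follows; the threshold, the gain limit, and the 8-bit value range are assumptions rather than parameters taken from this disclosure.

```python
import numpy as np

def adjust_exposure(raw, mean_threshold=60.0, max_gain=4.0):
    """If the raw image is too dark on average, scale its pixel values up.

    mean_threshold and max_gain are illustrative values; a real circuit
    would use tuned parameters and may also adjust chroma or contrast.
    """
    mean_level = raw.astype(np.float32).mean()
    if mean_level >= mean_threshold:
        return raw  # bright enough; keep the raw image unchanged
    gain = min(mean_threshold / max(mean_level, 1.0), max_gain)
    return np.clip(raw.astype(np.float32) * gain, 0, 255).astype(np.uint8)

# Example: a dark 8-bit raw image is brightened before object extraction.
dark_raw = np.full((480, 640), 20, dtype=np.uint8)
brightened = adjust_exposure(dark_raw)
```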
The object extraction circuit 124 is coupled to the image analysis circuit 123. In one embodiment, the object extraction circuit 124 is configured to identify one or more objects in the real world from the raw images generated by the first image sensor 111 and the second image sensor 113. For example, the object extraction circuit 124 extracts features from the raw images and compares the features with predefined objects' features.
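As a hypothetical sketch of that feature comparison (the feature representation, the object templates, and the distance threshold are all assumptions for illustration):

```python
import numpy as np

def identify_objects(image_feature, object_templates, max_distance=0.5):
    """Compare a feature vector extracted from a raw image against
    predefined objects' feature vectors; a small distance counts as a match."""
    matches = []
    for name, template in object_templates.items():
        if np.linalg.norm(image_feature - template) < max_distance:
            matches.append(name)
    return matches

# Example with made-up templates for a "hand" and a "cup".
templates = {"hand": np.array([0.9, 0.1, 0.3]), "cup": np.array([0.2, 0.8, 0.5])}
found = identify_objects(np.array([0.85, 0.15, 0.3]), templates)  # -> ["hand"]
```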
The object depth calculation circuit 125 is coupled to the object extraction circuit 124. In one embodiment, the object depth calculation circuit 125 is configured to calculate a first depth of the one or more objects by triangulation, according to the distance between the first and second image sensors 111 and 113 and the pixel distance between the positions at which the one or more objects are located in the two raw images of the first and second image sensors 111 and 113.
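Triangulation with two parallel sensors reduces to depth = focal length × baseline / disparity. A minimal sketch under that assumption (the focal length in pixels is assumed to come from calibration and is not specified in this disclosure):

```python
def triangulate_depth(baseline_m, focal_px, disparity_px):
    """First depth by triangulation: depth = f * B / d.

    baseline_m   -- distance between the two image sensors, in meters
    focal_px     -- focal length expressed in pixels (assumed calibrated)
    disparity_px -- pixel distance between the object's positions in the
                    two raw images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return baseline_m * focal_px / disparity_px

# Example: a 6 cm baseline, 700 px focal length, and 20 px disparity
# place the object at 2.1 m.
depth_m = triangulate_depth(0.06, 700.0, 20.0)
```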
The overlapped object depth calculation circuit 126 is coupled to the object depth calculation circuit 125. In one embodiment, the overlapped object depth calculation circuit 126 is configured to calculate a second depth of two overlapped objects among the one or more objects and output the depth data D including the first depth and the second depth.
In some embodiments, the depth calculation is performed by hardware computation, so that the first image processing circuit 120 could be considered as a depth hardware engine.
The multiplexer 127 is coupled to the overlapped object depth calculation circuit 126. In one embodiment, the multiplexer 127 is configured to output one of the raw image M1 generated by the first image sensor 111, the raw image M2 generated by the second image sensor 113, and the depth data D according to a control signal. The control signal may be generated based on the requirement of the second image processing circuit 130 or other circuits, and the embodiment is not limited thereto.
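The selection behavior can be sketched as follows; the control-signal encoding is an assumption, since the disclosure does not fix one:

```python
def mux_output(control, image_m1, image_m2, depth_data):
    """Mirror of multiplexer 127: forward one of the raw images or the
    depth data according to a control signal (encoding assumed here)."""
    selections = {0: image_m1, 1: image_m2, 2: depth_data}
    return selections[control]
```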
The second image processing circuit 130 could be a vision processing unit (VPU), an artificial intelligence (AI) accelerator for vision tasks, or other image-related processors. The second image processing circuit 130 is coupled to the first image processing circuit 120. In one embodiment, the second image processing circuit 130 is configured to perform stereo matching on the images. The stereo matching process is used to extract depth information or three-dimensional information from digital images. For example, the second image processing circuit 130 may compare the two images M1 and M2 from the first image sensor 111 and the second image sensor 113, and the depth information or three-dimensional information could be obtained based on the disparity. In another example, the second image processing circuit 130 is configured to determine one or more extracted objects with a specific figure or pattern (e.g., a hand gesture) according to the images M1 and M2 from the first image sensor 111 and the second image sensor 113.
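This disclosure does not spell out the stereo matching algorithm itself; a common choice is naive block matching over the sum of absolute differences (SAD), sketched below for two rectified grayscale images. The block size and disparity range are assumed parameters.

```python
import numpy as np

def disparity_map(left, right, block=5, max_disp=64):
    """Naive SAD block matching on rectified grayscale images: for each
    left-image pixel, find the horizontal shift of the right image that
    minimizes the sum of absolute differences. Disparity relates to
    depth via depth = f * B / disparity."""
    h, w = left.shape
    half = block // 2
    left_f, right_f = left.astype(np.float32), right.astype(np.float32)
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left_f[y - half:y + half + 1, x - half:x + half + 1]
            best_d, best_cost = 0, np.inf
            for d in range(min(max_disp, x - half) + 1):
                cand = right_f[y - half:y + half + 1,
                               x - d - half:x - d + half + 1]
                cost = np.abs(patch - cand).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```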
The third image processing circuit 140 could be an ISP, an image chip, or other image-related processors. The third image processing circuit 140 is coupled to the second image processing circuit 130. In one embodiment, the third image processing circuit 140 is configured to perform automatic white balance and exposure value calibrations on the images outputted from the second image processing circuit 130 to improve picture quality for object recognition and depth calculation. In some embodiments, the third image processing circuit 140 may calibrate chroma, contrast, or other image-related parameters. In still some embodiments, two or more of the first, second, and third image processing circuits 120, 130, and 140 may be integrated into a single chip or a digital circuit.
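The white-balance algorithm is likewise not fixed by this disclosure; one widely used heuristic is the gray-world method, sketched here as an assumed stand-in:

```python
import numpy as np

def gray_world_white_balance(rgb):
    """Gray-world automatic white balance: scale each color channel so
    that its mean matches the overall mean of the image."""
    rgb_f = rgb.astype(np.float32)
    channel_means = rgb_f.reshape(-1, 3).mean(axis=0)
    gains = channel_means.mean() / np.maximum(channel_means, 1e-6)
    return np.clip(rgb_f * gains, 0, 255).astype(np.uint8)
```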
The fourth image processing circuit 150 could be a digital signal processor (DSP), an image chip, or other image-related processors. The fourth image processing circuit 150 is coupled to the third image processing circuit 140. In one embodiment, the fourth image processing circuit 150 is configured to perform stereography conversion on the image outputted from the second or third image processing circuit 130 or 140 according to the depth data generated by the first image processing circuit 120, to generate a stereography. For example, the stereography includes the three-dimensional object(s) projected onto a two-dimensional surface. In some embodiments, the fourth image processing circuit 150 may be omitted.
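One simple way to project three-dimensional object points onto a two-dimensional surface is a pinhole projection; the sketch below assumes known intrinsics (focal length in pixels and principal point), which this disclosure does not specify.

```python
import numpy as np

def project_points(points_xyz, focal_px, cx, cy):
    """Pinhole projection of 3-D points (meters, camera frame) onto a
    2-D image plane: u = f*x/z + cx, v = f*y/z + cy."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    u = focal_px * x / z + cx
    v = focal_px * y / z + cy
    return np.stack([u, v], axis=1)

# Example: a point 2.1 m ahead and 0.1 m to the right of the camera.
uv = project_points(np.array([[0.1, 0.0, 2.1]]), 700.0, 320.0, 240.0)
```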
The CPU 160 could be a microprocessor, a microcontroller, a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA). The functions of the CPU 160 may also be implemented by an independent electronic device or an integrated circuit (IC), and operations of the CPU 160 may also be implemented by software. The CPU 160 is coupled to the third image processing circuit 140 or the fourth image processing circuit 150. In one embodiment, the CPU 160 is configured to generate a computation result regarding applications for hand motion detection and tracking, space scanning, object scanning, augmented reality (AR) see-through, 6 degrees of freedom (DoF), and Simultaneous Localization and Mapping (SLAM) based on the images outputted from the first, second, third and/or fourth image processing circuit 120, 130, 140, 150 and/or the corresponding depth data.
The system 100 may include one or more memories (not shown) to store data used in or outputted from the first, second, third, or fourth image processing circuit 120, 130, 140, 150, or the CPU 160.
To better understand the operating process provided in one or more embodiments of the disclosure, several embodiments will be exemplified below to elaborate the operating process of the system 100. The devices and modules in the system 100 are applied in the following embodiments to explain the method for processing images related to depth provided herein. Each step of the method can be adjusted according to actual implementation situations and should not be limited to what is described herein.
In one embodiment, the third image sensor 115 may generate the third image. In some embodiments, the first image is an infrared image, the second image is another infrared image, and the third image is a color image. In still some embodiments, the first image is a color image, the second image is another color image, and the third image is an infrared image.
The first image processing circuit 120 may generate the depth data corresponding to one or more objects identified in the first image and the second image (step S330). As mentioned in the operation of the depth calculation circuit 122, the first depth of the one or more non-overlapped objects and/or the second depth of the one or more overlapped objects identified in the first image and the second image are generated to form the depth data. In some embodiments, the depth calculation circuit 122 may generate the depth data corresponding to the one or more objects identified in the first image, the second image, and the third image.
The first image processing circuit 120 may generate a first data packet including two of the first image, the second image, and the depth data (step S340). In one embodiment, two data among the first image, the second image, and the depth data are combined to generate the first data packet. The format of the data packet is defined based on the transmission interface of the first image processing circuit 120. In another embodiment, two data among the first image, the second image, the depth data, and dummy data are combined to generate the first data packet. The dummy data may include specific values or random values.
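As a sketch of combining two payloads into one packet (the header layout here, a channel identifier plus two payload lengths, is hypothetical; the real format is fixed by the transmission interface):

```python
import struct

def build_packet(payload_a: bytes, payload_b: bytes, channel_id: int = 0) -> bytes:
    """Combine two of {first image, second image, depth data, dummy data}
    into one data packet with an assumed little-endian header."""
    header = struct.pack("<BII", channel_id, len(payload_a), len(payload_b))
    return header + payload_a + payload_b

# Example: pack the first image together with the depth data.
packet = build_packet(b"\x10" * (640 * 480), b"\x00" * 1024, channel_id=0)
```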
In still another embodiment, the first image processing circuit 120 may generate a second data packet including the other two of the first image, the second image, the depth data, and the dummy data different from the first data packet.
In one embodiment, the first image of the first image sensor 111 is a first color infrared image, and the second image of the second image sensor 113 is a second color infrared image.
Furthermore, the first image processing circuit 120 may generate the first data packet including two of the following data: the color portions C4 and C5 of the first color infrared image CIR1 and the second color infrared image CIR2, the infrared portions IR3 and IR4 of the first color infrared image CIR1 and the second color infrared image CIR2, the first depth data D1, and the second depth data D2. The first image processing circuit 120 may further generate a third data packet including another two of these data, different from the first data packet, and generate a fourth data packet including the remaining two of these data, different from the first data packet and the third data packet.
For example, the first image processing circuit 120 combines the color portions C4 and C5 of the first color infrared image CIR1 and the second color infrared image CIR2 into the first data packet, combines the infrared portions IR3 and IR4 of the first color infrared image CIR1 and the second color infrared image CIR2 into the third data packet, and combines the first depth data D1 and the second depth data D2 into the fourth data packet. The data structure could be, for example, as sketched below.
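A hypothetical layout of those three packets (field names are illustrative only; the channel identifiers echo the values described later in this disclosure):

```python
# Hypothetical packet layout; not the transmission interface's actual format.
first_packet  = {"channel_id": "01", "payloads": ("C4", "C5")}    # color portions
third_packet  = {"channel_id": "10", "payloads": ("IR3", "IR4")}  # infrared portions
fourth_packet = {"channel_id": "11", "payloads": ("D1", "D2")}    # depth data
```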
The second image processing circuit 130 may receive the first data packet from the first image processing circuit 120 (step S350). Specifically, the first image processing circuit 120 provides a first transmission interface, and the second image processing circuit 130 provides a second transmission interface connected with the first transmission interface. The first and second transmission interfaces could be camera serial interface (CSI)-3, CSI-2, another mobile industry processor interface (MIPI), or other transmission interfaces. The first and second transmission interfaces provide multiple data lines to transmit data.
In one embodiment, the first transmission interface transmits the first data packet to the second transmission interface over a first channel, and the first data packet further includes a first channel identifier merely corresponding to the first channel. The first channel is a logical channel that is identified by the firmware or software of the first and second transmission interfaces. However, the first data packet is still transmitted over the physical data lines.
In another embodiment, the first transmission interface transmits the second data packet to the second transmission interface over a second channel different from the first channel, and the second data packet further comprises a second channel identifier merely corresponding to the second channel. The first channel and the second channel correspond to different memory blocks in the second transmission interface. The second transmission interface may identify the channel identifier to know which channel the data packet belongs to. For example, the first channel identifier is ‘00’, and the second channel identifier is ‘01’. Then, the second image processing circuit 130 may store the data packet to a corresponding memory block.
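The routing of received packets into per-channel memory blocks might be sketched like this (the dictionary-based routing and packet fields are assumptions for illustration):

```python
def route_packet(packet, memory_blocks):
    """Store an incoming data packet in the memory block corresponding
    to its channel identifier, as the second transmission interface does."""
    memory_blocks[packet["channel_id"]].append(packet["payloads"])

# Example: channel '00' and channel '01' map to different memory blocks.
memory_blocks = {"00": [], "01": []}
route_packet({"channel_id": "00", "payloads": ("M1", "M2")}, memory_blocks)
route_packet({"channel_id": "01", "payloads": ("D", "dummy")}, memory_blocks)
```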
In some embodiments, the first channel is a physical channel/way, and the second channel is a virtual channel/way. It is assumed that the transmission interfaces provide multiple logical channels via the data lines between the two transmission interfaces. If there are two or more types of data packets, the first transmission interface may arrange different data packets on different logical channels. There may be merely one logical channel called the physical channel, and the other logical channels would be called virtual channels.
In another embodiment, there are three different data packets, which are the first, third, and fourth data packets as mentioned before. The first transmission interface may transmit the first data packet to the second transmission interface over a third channel, transmit the third data packet to the second transmission interface over a fourth channel, and transmit the fourth data packet to the second transmission interface over a fifth channel. The first data packet further includes a third channel identifier merely corresponding to the third channel, the third data packet further includes a fourth channel identifier merely corresponding to the fourth channel, and the fourth data packet further includes a fifth channel identifier merely corresponding to the fifth channel. For example, the third channel identifier is ‘01’, the fourth channel identifier is ‘10’, and the fifth channel identifier is ‘11’. Similarly, the third, fourth, and fifth channels correspond to different memory blocks in the second transmission interface. Then, the second image processing circuit 130 may retrieve desired data from a corresponding memory block.
In some embodiments, the third channel is a physical channel, the fourth channel is a virtual channel, and the fifth channel is another virtual channel.
It should be noted that the data in the same data packet may be provided to a specific application.
The second image processing circuit 130 may perform stereo matching on the first image and the second image (step S360). Specifically, the second image processing circuit 130 may combine the first image, the second image, and the depth data based on the stereo matching algorithm, to generate a matching image related to the depth data.
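A hedged sketch of one such combination, assuming rectified images, a per-pixel depth map in meters, and calibration values (baseline and focal length) that this disclosure does not specify: depth is converted back to disparity, the second image is shifted onto the first image's viewpoint, and the two are averaged into a matching image.

```python
import numpy as np

def fuse_matching_image(first, second, depth, baseline_m=0.06, focal_px=700.0):
    """Combine two rectified grayscale images and a depth map into a
    single matching image (illustrative fusion, not a mandated method)."""
    h, w = first.shape
    # Convert depth back to per-pixel disparity: d = f * B / depth.
    disparity = np.where(depth > 0,
                         baseline_m * focal_px / np.maximum(depth, 1e-6),
                         0.0)
    fused = first.astype(np.float32).copy()
    xs = np.arange(w)
    for y in range(h):
        src = np.clip(xs - disparity[y].astype(int), 0, w - 1)
        fused[y] = (fused[y] + second[y, src]) / 2.0
    return fused.astype(first.dtype)
```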
In one embodiment, the fourth image processing circuit 150 may convert the color matching image into a stereography according to the depth data.
In another embodiment, the CPU 160 may generate a computation result regarding applications for hand motion detection and tracking, space scanning, object scanning, AR see-through, 6 DoF, and SLAM based on the stereography and the corresponding depth data.
Under the architecture of the system 100, the present disclosure first calculates the depth data corresponding to the images M1 and M2 using the first image processing circuit 120 (i.e., the depth hardware engine), so as to replace the software calculations of a digital signal processor in the prior art. Afterward, with the operations of the second image processing circuit 130 and the third image processing circuit 140, the images M1 and M2 with better picture quality and the corresponding depth data with higher accuracy may be obtained. Therefore, the accuracy and efficiency of the CPU 160 in handling applications (such as hand motion detection and tracking, space scanning, object scanning, AR see-through, and SLAM) may be improved.
It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the present disclosure without departing from the scope or spirit of the disclosure. In view of the foregoing, it is intended that the present disclosure cover modifications and variations of this disclosure provided they fall within the scope of the following claims and their equivalents.
This application is a continuation-in-part application of and claims the priority benefits of U.S. application Ser. No. 16/386,267, filed on Apr. 17, 2019, U.S. application Ser. No. 16/387,528, filed on Apr. 17, 2019, and U.S. application Ser. No. 16/386,273, filed on Apr. 17, 2019, all now pending. The entirety of each of the above-mentioned patent applications is hereby incorporated by reference herein and made a part of this specification.
Relation | Application Number | Date | Country
---|---|---|---
Parent | 16386267 | Apr 2019 | US
Child | 17020821 | | US
Parent | 16387528 | Apr 2019 | US
Child | 16386267 | | US
Parent | 16386273 | Apr 2019 | US
Child | 16387528 | | US