The invention relates to the automatic detection, location and identification of objects in a 3D volume.
It generally applies to the field of target detection, to the medical field, to the field of microelectronics, and also to similar fields. It makes it possible in particular to respond to application-based queries encountered in many situations, such as the automatic detection of small fossils (2-3 microns) in large 3D volumes reconstructed by scanning oil exploration core samples; the identification of camouflaged objects in a complex 3D scene; the identification of benign pigment disorders liable to progress to a carcinoma or a melanoma based on a three-dimensional skin reconstruction; the identification of “carcinogenic anomalies” in OCT (Optical Coherence Tomography) cross sections; or the automatic detection, location and identification of carcinogenic tumours/deficient areas resulting from three-dimensional reconstructions of tomographic scans or MRI (Magnetic Resonance Imaging).
At present, the abovementioned application-based queries are most often handled by an expert in the field (geophysicist, physicist, radiologist, dermatologist, etc.), who identifies the objects of interest in 3D volumes using viewing tools such as MIP (Maximum Intensity Projection).
However, this handling is difficult for the expert in the field to carry out because the mass of data to be processed is very large. In addition, the expert's identification success rate is limited and rarely exceeds 90%.
The present invention aims to improve the situation.
To this end, the present invention proposes a method for detecting, locating and identifying objects contained in a complex scene.
According to a general definition of the invention, the method comprises the following steps: taking k 2D cross sections in the 3D volume of the scene to be processed; for each 2D cross section, automatically detecting, locating and identifying the objects of interest through artificial intelligence and semantic segmentation; and concatenating the results of all of the k 2D cross sections in 3D.
The invention thus makes it possible to significantly improve the quality of the detection, location and identification of objects in a complex scene: from the 3D volume of the scene to be processed, k 2D cross sections are obtained, the objects to be processed are detected, located and identified in each of them through artificial intelligence and semantic segmentation, and the results are concatenated in 3D. This gives the ability to detect, locate and identify objects automatically with very high accuracy even when the objects mask one another.
According to some preferred embodiments, the invention comprises one or more of the following features, which may be used separately or in partial combination with one another or in full combination with one another:
Advantageously, the method comprises the following steps, in order to concatenate the results of all of the k output 2D cross sections in 3D, for one object of interest from among said objects of interest: defining, for each output 2D cross section, a local three-dimensional reference system, one of the dimensions of which is perpendicular to the plane defined by the 2D cross section, and associating said reference system with said 2D cross section; identifying, in the output 2D cross sections, subsets or slices of the object of interest; transforming each identified subset or slice of the object of interest by changing the reference system, from the local three-dimensional reference system of the 2D cross section to which it belongs to a predetermined absolute Cartesian reference frame; concatenating the transformed subsets or slices into a 3D icon.
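Purely by way of illustration, a minimal sketch of this change of reference system is given below, assuming that each 2D cross section is parametrised in the absolute Cartesian reference frame by an origin and two orthonormal in-plane basis vectors; the function name and this parametrisation are assumptions of the example, not features of the invention.

```python
import numpy as np

def slice_pixels_to_absolute(pixels_ij, origin, u_axis, v_axis):
    """Transform pixel coordinates (i, j) of a subset/slice identified in one
    output 2D cross section into the predetermined absolute Cartesian frame.

    The local three-dimensional reference system of the cross section is assumed
    to be given by its origin and two orthonormal in-plane vectors (u_axis, v_axis);
    the third axis, u x v, is perpendicular to the plane defined by the cross section.
    """
    pixels_ij = np.asarray(pixels_ij, dtype=float)          # shape (P, 2)
    return origin + pixels_ij[:, :1] * u_axis + pixels_ij[:, 1:] * v_axis  # shape (P, 3)

# Example: a horizontal cross section taken at height z = 42 (voxel units);
# its local axes coincide with the absolute x and y axes.
origin = np.array([0.0, 0.0, 42.0])
u_axis = np.array([1.0, 0.0, 0.0])
v_axis = np.array([0.0, 1.0, 0.0])
absolute_points = slice_pixels_to_absolute([[10, 20], [11, 20]], origin, u_axis, v_axis)
```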
The invention also relates to a system for implementing the method defined above.
The invention furthermore relates to a computer program comprising program instructions for executing a method as defined above when said program is executed on a computer.
Other features and advantages of the invention will become apparent on reading the following description of one preferred embodiment of the invention, given by way of example and with reference to the appended drawings.
The invention relates to the automatic detection, location and identification of objects in three-dimensional (3D) imaging forming a three-dimensional (3D) volume in voxels (volumetric pixels).
For example and without limitation, the 3D imaging corresponds to a complex scene in which objects may mask one another, as illustrated in
In practice, the three-dimensional volume may be obtained using a transmission-based or fluorescence-based reconstruction method (Optical Projection Tomography, nuclear imaging or X-Ray Computed Tomography) or a reflection-based reconstruction method, using the reflection of a laser wave or solar reflection in the visible band (between 0.4 μm and 0.7 μm), the near infrared band (between 0.7 μm and 1 μm) or the SWIR band (Short Wave InfraRed, between 1 μm and 3 μm), or taking into account the thermal emission of the object (thermal imaging between 3 μm and 5 μm and between 8 μm and 12 μm); this three-dimensional reconstruction process is described in the patent “Optronic system and method dedicated to identification for formulating three-dimensional images” (U.S. Pat. No. 8,836,762B2, EP2333481B1).
All of the voxels resulting from a three-dimensional reconstruction with the associated intensity are used, this reconstruction having preferably been obtained through reflection.
First of all, with reference to
A correspondence table TAB “Index Class×Label Class” for all of the classes of objects of interest is thus created. For example, at the end of the indexing, the following elements are obtained: Class(n)→{Index(n), Label(n)}, n={1, 2, . . . , N}, N being the number of classes of objects of interest.
By way of example, the Index(n) is at the value “n”, and the Index(background) is at the value “0”.
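Purely by way of illustration, such a correspondence table may be represented as a simple mapping, as in the minimal sketch below; the class labels used are hypothetical placeholders, not those of the invention.

```python
# Minimal sketch of the TAB "Index Class x Label Class" correspondence table.
# The class labels below are hypothetical placeholders.
CLASS_LABELS = ["vehicle", "building", "vegetation"]   # Label(n), n = 1..N

TAB = {0: "background"}                                # Index(background) = 0
TAB.update({n: label for n, label in enumerate(CLASS_LABELS, start=1)})  # Index(n) = n

# Example: Class(2) -> {Index(2) = 2, Label(2) = "building"}
assert TAB[2] == "building"
```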
The detection, location and identification method according to the invention comprises the following general steps, which are described with reference to
In a first step referenced 10, k 2D cross sections are taken in the reconstructed 3D volume: 3D volume → {CrossSection(k)}, k = {1, 2, . . . , K}, K being the number of 2D cross sections taken.
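By way of illustration, a minimal sketch of this cross-sectioning step is given below, assuming a voxel volume stored as a NumPy array and axis-aligned (horizontal or vertical) cross sections; oblique cross sections would additionally require resampling. The function name and the equal spacing of the cross sections are assumptions of the example.

```python
import numpy as np

def take_cross_sections(volume, axis, k):
    """Extract K equally spaced axis-aligned 2D cross sections (in pixels)
    from the reconstructed 3D volume (in voxels)."""
    depth = volume.shape[axis]
    positions = np.linspace(0, depth - 1, num=k, dtype=int)
    return [np.take(volume, int(pos), axis=axis) for pos in positions]

# Example: K = 10 horizontal cross sections of a 256x256x256 volume.
volume = np.zeros((256, 256, 256), dtype=np.float32)
cross_sections = take_cross_sections(volume, axis=0, k=10)   # {CrossSection(k)}, K = 10
```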
With reference to step 20, for each input 2D cross section thus obtained, the objects of interest are automatically detected, located and identified using a specialized artificial intelligence (AI) method.
The artificial intelligence (AI) method generates the following elements at output, for each object of interest Object(k,m) detected in the CrossSection(k): its label Label(k,m), a 2D bounding box 2Dboundingbox(k,m) and a 2D icon 2Dicon(k,m) defined by this bounding box.
For example, the artificial intelligence (AI) method is based on deep learning of the “Faster R-CNN (Regions with Convolutional Neural Network features) object classification” type.
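A minimal sketch of such a detection step is given below, using the publicly available torchvision implementation of Faster R-CNN as a generic stand-in for the specialized AI method; the pretrained weights, score threshold and input conventions are assumptions of the example, not those of the invention. Note that torchvision returns each box as (x1, y1, x2, y2), which maps directly onto the 2Dboundingbox = [(x1,x2),(y1,y2)] notation used here.

```python
import torch
import torchvision

# Pretrained Faster R-CNN used only as a generic stand-in for the specialized
# detection/identification network; the invention's own model and training differ.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
detector.eval()

def detect_objects(cross_section, score_threshold=0.5):
    """Return the 2D bounding boxes and label indices of the objects of interest
    detected in one input 2D cross section (a 3xHxW float tensor in [0, 1])."""
    with torch.no_grad():
        output = detector([cross_section])[0]
    keep = output["scores"] >= score_threshold
    return output["boxes"][keep], output["labels"][keep]   # 2Dboundingbox(k,m), Index(k,m)
```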
Next, the method applies semantic segmentation to each 2Dicon defined by a bounding box 2Dboundingbox.
In practice, the generated Segmented2Dicon(k,m), of the same size as the 2Dicon(k,m), has, for each pixel, either the value of the Index(k,m) of the Object(k,m) identified in the CrossSection(k), or the value of the Index(background). This therefore gives, at output, 2Dicon(k,m) → Segmented2Dicon(k,m).
For example, the semantic segmentation is performed using deep learning, such as a Mask R-CNN (Regions with Convolutional Neural Network features) designed for the semantic segmentation of images.
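A minimal sketch of this segmentation step is given below, using the publicly available torchvision implementation of Mask R-CNN as a generic stand-in; the weights, thresholds and the choice of keeping only the most confident instance per icon are assumptions of the example.

```python
import torch
import torchvision

# Pretrained Mask R-CNN used only as a generic stand-in for the segmentation network.
segmenter = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
segmenter.eval()

BACKGROUND_INDEX = 0   # Index(background)

def segment_icon(icon, object_index, mask_threshold=0.5):
    """Build Segmented2Dicon(k,m): same size as the 2Dicon(k,m), each pixel set
    either to Index(k,m) of the identified object or to Index(background)."""
    with torch.no_grad():
        output = segmenter([icon])[0]                        # icon: 3xHxW float tensor in [0, 1]
    segmented = torch.full(icon.shape[-2:], BACKGROUND_INDEX, dtype=torch.int64)
    if len(output["masks"]) > 0:
        best = output["scores"].argmax()                     # keep the most confident instance
        mask = output["masks"][best, 0] >= mask_threshold    # HxW boolean mask
        segmented[mask] = object_index
    return segmented
```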
With reference to step 30, the results of all of the 2D cross sections are finally concatenated in 3D.
In one set of embodiments of the invention, the results of the 2D cross sections are concatenated in 3D through the following steps: a local three-dimensional reference system, one of the dimensions of which is perpendicular to the plane defined by the 2D cross section, is defined and associated with each output 2D cross section; the subsets or slices of the object of interest are identified in the output 2D cross sections; each identified subset or slice is transformed by changing the reference system, from the local three-dimensional reference system of the 2D cross section to which it belongs to a predetermined absolute Cartesian reference frame; and the transformed subsets or slices are concatenated into a 3D icon.
This makes it possible to reconstruct the three-dimensional object while ensuring continuity at the limits of the subsets or slices of the object.
At the output, this then gives the following elements:
Concatenation of the Labels(k,m) → Generation of the consolidated Labels(n)
Concatenation of the 2Dboundingboxes(k,m) → Generation of the 3Dboundingboxes(n)
{Object(n), Label(n), 3Dboundingbox(n), Segmented3Dicon(n)}, n belonging to {1, 2, . . . , N}, N being the number of classes of objects of interest.
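As a minimal sketch only, and assuming horizontal cross sections taken along the z axis so that each CrossSection(k) contributes a 2D bounding box at a known height, the consolidation of the labels and the generation of a 3D bounding box from the 2D bounding boxes might look as follows; the majority vote on the labels and the function name are assumptions of the example.

```python
from collections import Counter

def concatenate_detections(detections):
    """Concatenate the per-cross-section results for one object of interest.

    `detections` is a list of (label, (x1, x2), (y1, y2), z) tuples, one per output
    2D cross section in which the object was found, z being the height of that
    cross section in the absolute reference frame (horizontal cross sections assumed).
    Returns the consolidated Label(n) and 3Dboundingbox(n) = [(x1,x2),(y1,y2),(z1,z2)].
    """
    labels = [label for label, _, _, _ in detections]
    consolidated_label = Counter(labels).most_common(1)[0][0]   # majority vote over Labels(k,m)

    x1 = min(x[0] for _, x, _, _ in detections)
    x2 = max(x[1] for _, x, _, _ in detections)
    y1 = min(y[0] for _, _, y, _ in detections)
    y2 = max(y[1] for _, _, y, _ in detections)
    z1 = min(z for _, _, _, z in detections)
    z2 = max(z for _, _, _, z in detections)
    return consolidated_label, [(x1, x2), (y1, y2), (z1, z2)]

# Example: the same object of interest found in three consecutive horizontal cross sections.
label, box3d = concatenate_detections([
    ("vehicle", (10, 40), (5, 25), 12),
    ("vehicle", (11, 41), (5, 26), 13),
    ("vehicle", (10, 42), (6, 26), 14),
])
```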
The method according to the invention exhibits multiple specific details.
The first specific detail, also called resolution, relates to the number and the angle of the 2D cross sections; for example, the 2D cross sections belong to the group formed by main cross sections, horizontal cross sections, vertical cross sections and oblique cross sections. The higher the number of 2D cross sections, the better the detection resolution. In addition, 2D cross sections at different angles may provide better detection results and will be used in the 3D concatenation of the results, which will be described in more detail below.
The second specific detail relates to the 2D bounding boxes, 2Dboundingbox=[(x1,x2),(y1,y2)]. The smaller the size of the 2D bounding boxes, the better the detection resolution.
The third specific detail relates to the 3D bounding boxes, 3Dboundingbox=[(x1,x2),(y1,y2),(z1,z2)]. The smaller the size of the 3D bounding boxes, the better the detection resolution.
As seen above, the choice and the number of 2D cross sections will impact the resolution of the detection.
From the 3D volume 11 (reconstructed in voxels), the cross-sectioning module 12 generates 2D cross sections 15 (in pixels) in response to the command from the choosing module 13. The 2D cross sections 15 (in pixels) are managed and indexed by the management module 14 in accordance with the indexing table TAB (
The output 2D cross section 23 generated by the AI method 22 comprises a 2D bounding box 24 bounding an object of interest 25.
With reference to
For example, the complex scene contains a vehicle camouflaged in the bushes. The 2D shot is an air-to-ground shot with 2D images of 415×693 pixels.
The fields of application of the invention are broad, covering the detection, classification, recognition and identification of objects of interest.
Foreign application priority data: No. 1908109, Jul. 2019, FR (national).
PCT filing: PCT/EP2020/069056, filed Jul. 7, 2020 (WO).