The present invention relates to a three-dimensional oral scan data reconstruction technology, and more particularly, to a device and a method for reconstructing three-dimensional oral scan data, which can reduce geometric distortion of a scan model by using a computed tomography (CT) image.
In the dental field, including orthodontics and implants, establishing an accurate diagnostic plan and treatment is very important. For this purpose, a 3D computed tomography image (hereinafter referred to as a ‘CT image’) and 3D oral scan data are used. The CT image, which is reconstructed by imaging all the teeth in the oral cavity at once, has the advantages that tooth root information can be obtained and geometric distortion does not occur, while the 3D oral scan data has the advantage of higher tooth crown precision than the CT image. Accordingly, both the CT image and the 3D oral scan data are generally used for diagnosis and treatment.
However, the 3D oral scan data has a disadvantage of causing geometric distortion because it is an image reconstructed by stitching together multiple pieces of local data within the oral cavity using a mathematical algorithm. In other words, the 3D oral scan data is an image reconstructed by generating multiple scan frames by locally scanning each part of the oral cavity with an oral scanner, and then matching the point clouds included in each generated scan frame into one in a 3D coordinate system. Accordingly, even if the matching error between one frame pair is small, errors accumulate when all frames are matched, and the accumulated errors cause geometric distortion in the 3D oral model, reducing the reliability of the 3D oral scan data.
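To make the accumulation effect concrete, the following minimal sketch (with hypothetical error magnitudes chosen only for illustration, not taken from the source) shows how a small per-pair rotational bias compounds over a chain of frame matches:

```python
import numpy as np

def rot_z(deg):
    """Rotation matrix about the z-axis by `deg` degrees."""
    r = np.radians(deg)
    c, s = np.cos(r), np.sin(r)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Suppose each pairwise frame match carries a tiny 0.2-degree rotational bias.
# Chaining 100 frames compounds it into a large global distortion.
pairwise_error = rot_z(0.2)
chained = np.eye(3)
for _ in range(100):
    chained = chained @ pairwise_error

point = np.array([30.0, 0.0, 0.0])           # a point ~30 mm from the origin
drift = np.linalg.norm(chained @ point - point)
print(f"drift after 100 frames: {drift:.1f} mm")  # ~10 mm of accumulated drift
```

Even a 0.2-degree bias per pair grows to roughly a centimeter of displacement at molar distance after 100 frames, which is exactly the kind of global distortion addressed here.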
Therefore, research and development on data reconstruction technology that can reduce the geometric distortion of the scan model is required to establish accurate diagnostic plans and treatments in the dental field, including orthodontics and implants.
An object of the present invention is to provide a device and a method for reconstructing 3D oral scan data, which may reduce geometric distortion of a scan model by using a CT image with no geometric distortion.
Further, another object of the present invention is to provide an oral scanner and a computed tomography apparatus which may reduce the geometric distortion of the scan model by using the CT image with no geometric distortion.
Further, yet another object of the present invention is to provide a computer program stored in a computer-readable recording medium implementing a method for reconstructing 3D oral scan data, which may reduce the geometric distortion of the scan model by using the CT image, and a recording medium storing the program.
In addition, the present invention is not limited to the above-described objects, and besides, various objects may be additionally provided through techniques described through embodiments and claims described later.
In order to achieve the object, an aspect of the present invention provides a method for reconstructing 3D oral scan data by using a computed tomography (CT) image and scan data whose coordinates are matched based on detected 3D feature points of the CT image and the scan data, the method including: (a) a process of matching scan key frames of the scan data initially positioned by the 3D feature points, and obtaining a rigid transform of the scan key frames that minimizes a 3D distance between the scan key frames and the CT image; and (b) a process of reconstructing a 3D oral scan model by correcting and rematching 3D coordinate information of the original scan key frames by using the obtained rigid transform of the scan key frames.
Further, in the process (a), if the number of scan frames for a scan frame set S={s1, s2, . . . } is |S|, a rigid transform {Ti}i=1|S| of each scan frame that minimizes a loss L may be obtained as in [Equation] below.
Where pi,k represents a point of an i-th scan key frame, C(i,j) represents a set of all points that are shared between an i-th key frame and a j-th key frame, ci,k represents a point of the 3D coordinate information of the CT image corresponding to pi,k, Ni represents the number of points where pi,k and ci,k correspond to each other, and w1,i and w2,i represent weights corresponding to the respective losses.
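The referenced equation is not reproduced in this text; one plausible form of the loss consistent with the definitions above, stated as an assumption rather than the verbatim equation of the invention, is:

$$L\big(\{T_i\}\big)=\sum_{i=1}^{|S|}\left[w_{1,i}\sum_{j\neq i}\sum_{k\in C(i,j)}\big\|T_i\,p_{i,k}-T_j\,p_{j,k}\big\|^{2}+\frac{w_{2,i}}{N_i}\sum_{k=1}^{N_i}\big\|T_i\,p_{i,k}-c_{i,k}\big\|^{2}\right]$$

The first term keeps overlapping key frames consistent with one another, and the second keeps each key frame close to the CT geometry.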
Further, in the process (a), when w1,i becomes larger upon matching the scan key frames of the scan data, matching between the scan key frames may be emphasized, and when w2,i becomes larger, matching between the scan key frame and the CT image may be emphasized.
In addition, in the process (a), by comparison with a rigid transform Ti in which L(Ti) is minimal with respect to fixed w1,i and w2,i, a rigid transform obtained by increasing only w1,i may decrease the matching error between the scan key frames at the cost of increasing the matching error between the scan key frame and the CT image, and in contrast, a rigid transform obtained by increasing only w2,i may decrease the matching error between the scan key frame and the CT image at the cost of increasing the matching error between the scan key frames.
In addition, in the process (a), when an artifact is severe in the CT image, the size of w2,i is partially decreased or set to 0 and the size of w1,i is increased to reduce the influence of the CT image when matching the scan key frames, and in contrast, when the matching error between the scan key frames is large, the size of w1,i is decreased or set to 0, or the size of w2,i is increased to increase the influence of the CT image, thereby reducing the matching error between the scan key frames.
Further, in order to achieve the object, another aspect of the present invention provides a method for reconstructing 3D oral scan data by using a computed tomography image, which includes: (a) a process of generating 3D coordinate information from a computed tomography (CT) image; (b) a process of detecting 3D coordinate information of the CT image and a 3D feature point of scan data; (c) a process of matching the 3D coordinate information of the CT image and coordinates of the scan data by using the 3D feature points; (d) a process of matching scan key frames of the scan data initially positioned by the 3D feature points, and obtaining a rigid transform of the scan key frames to minimize a 3D distance between the scan key frames and the CT image; and (e) a process of reconstructing a 3D oral scan model by correcting and rematching 3D coordinate information of an original scan key frame by using the obtained rigid transform of the scan key frames.
Further, the process (b) may include: (b-1) a process of generating a 2D rendering image for the 3D coordinate information of the scan data and the CT image; (b-2) a process of detecting adjacent 2D points between teeth in the 2D rendering image; (b-3) a process of detecting adjacent 3D points between teeth based on the detected 2D points; and (b-4) a process of obtaining a direction perpendicular to a virtual straight line passing through the two adjacent detected 3D points, obtaining a center point which becomes the center between the two adjacent 3D points in the obtained perpendicular direction, and then sampling points on the 3D coordinate information in the perpendicular direction from the two adjacent 3D points and the center point to detect the 3D feature points of the 3D coordinate information of the CT image and of the scan data.
Further, in the process (c), a rotation matrix R and a translation vector t are obtained by using [Equation 1] below, with the detected 3D feature points as initial values, so that the coordinates of the scan data and the CT image are approximately matched, and then the coordinates of the scan data and the CT image are matched again by using [Equation 2] below with respect to all 3D coordinate information generated from the CT image and all points of the scan data to minimize errors.
Where ‘C’ represents the 3D coordinate information generated from the CT image, and ‘D’ represents the scan data.
Further, in the process (c), if the number of 3D feature points of each of the scan data and the CT image is n with respect to all 3D coordinate information C generated from the CT image and the scan data D, the scan data is transformed into R1D+t1 by obtaining an initial rigid transform R1, t1 minimizing the error E(R1, t1) by using [Equation 1] below with respect to the 3D feature points {ci}i=0n⊆C of the CT image and the 3D feature points {di}i=0n⊆D of the scan data, to approximately match the coordinates of the scan data and the CT image, and the coordinate information of the scan data D is transformed into R(R1D+t1)+t by obtaining the rigid transform R, t minimizing the error E(R, t) by using [Equation 2] below with respect to the 3D coordinate information C generated from the CT image and all points R1D+t1 of the coordinate-shifted scan data, to match the coordinates of the scan data and the CT image.
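The bodies of [Equation 1] and [Equation 2] are not reproduced in this text; a plausible reconstruction consistent with the surrounding description — least-squares rigid registration, first over the n feature-point pairs and then over correspondences for all points — is, as an assumption:

$$E(R_1,t_1)=\sum_{i=0}^{n}\big\|c_i-(R_1 d_i+t_1)\big\|^{2},\qquad E(R,t)=\sum_{d\in R_1D+t_1}\big\|c(d)-(R\,d+t)\big\|^{2},$$

where $c(d)\in C$ denotes the CT point corresponding to (for example, nearest to) the coordinate-shifted scan point $d$.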
Further, in the process (d), if the number of scan frames for a scan frame set S={s1, s2, . . . } is |S|, a rigid transform {Ti}i=1|S| of each scan frame that minimizes a loss L may be obtained as in [Equation] below.
Where pi,k represents a point of an i-th scan key frame, C(i,j) represents a set of all points that are shared between an i-th key frame and a j-th key frame, ci,k represents a point of the 3D coordinate information of the CT image corresponding to pi,k, Ni represents the number of points where pi,k and ci,k correspond to each other, and w1,i and w2,i represent weights corresponding to the respective losses.
Further, in the process (d), when w1,i becomes larger upon matching the scan key frames of the scan data, matching between the scan key frames may be emphasized, and when w2,i becomes larger, matching between the scan key frame and the CT image may be emphasized.
In addition, in the process (d), by comparison with a rigid transform Ti in which L(Ti) is minimal with respect to fixed w1,i and w2,i, under the same conditions a rigid transform obtained by increasing only w1,i may decrease the matching error between the scan key frames at the cost of increasing the matching error between the scan key frame and the CT image, and in contrast, a rigid transform obtained by increasing only w2,i may decrease the matching error between the scan key frame and the CT image at the cost of increasing the matching error between the scan key frames.
In addition, in the process (d), when an artifact is severe in the CT image, the size of w2,i is partially decreased or set to 0 and the size of w1,i is increased to reduce the influence of the CT image when matching the scan key frames, and in contrast, when the matching error between the scan key frames is large, the size of w1,i is decreased or set to 0, or the size of w2,i is increased to increase the influence of the CT image, thereby reducing the matching error between the scan key frames.
Further, in order to achieve the object, yet another aspect of the present invention provides a device for reconstructing 3D oral scan data by using a computed tomography image, which includes: a 3D coordinate information generation unit generating 3D coordinate information from a computed tomography (CT) image; a 3D feature point detection unit detecting 3D coordinate information of the CT image and a 3D feature point of scan data; a coordinate matching unit matching the 3D coordinate information of the CT image and coordinates of the scan data by using the 3D feature points; a rematching and rigid transform calculation unit matching scan key frames of the scan data initially positioned by the 3D feature points, and obtaining a rigid transform of the scan key frames to minimize a 3D distance between the scan key frames and the CT image; and an oral scan model reconstruction unit reconstructing a 3D oral scan model by correcting and rematching 3D coordinate information of an original scan key frame by using the obtained rigid transform of the scan key frames.
Further, the 3D feature point detection unit may generate a 2D rendering image for the 3D coordinate information of the scan data and the CT image, detect adjacent 2D points between teeth in the 2D rendering image, detect adjacent 3D points between teeth based on the detected 2D points, obtain a direction perpendicular to a virtual straight line passing through the two adjacent detected 3D points, obtain a center point which becomes the center between the two adjacent 3D points in the obtained perpendicular direction, and then sample points on the 3D coordinate information in the perpendicular direction from the two adjacent 3D points and the center point to detect the 3D feature points of the 3D coordinate information of the CT image and of the scan data.
Further, the coordinate matching unit obtains a rotation matrix R and a translation vector t by using [Equation 1] below, with the detected 3D feature points as initial values, to approximately match the coordinates of the scan data and the CT image, and then matches the coordinates of the scan data and the CT image again by using [Equation 2] below with respect to all 3D coordinate information generated from the CT image and all points of the scan data to minimize errors.
Where ‘C’ represents the 3D coordinate information generated from the CT image, and ‘D’ represents the scan data.
Further, if the number of 3D feature points of each of the scan data and the CT image is n with respect to all 3D coordinate information C generated from the CT image and the scan data D, the coordinate matching unit transforms the scan data into R1D+t1 by obtaining an initial rigid transform R1, t1 minimizing the error E(R1, t1) by using [Equation 1] below with respect to the 3D feature points {ci}i=0n⊆C of the CT image and the 3D feature points {di}i=0n⊆D of the scan data, to approximately match the coordinates of the scan data and the CT image, and transforms the coordinate information of the scan data D into R(R1D+t1)+t by obtaining the rigid transform R, t minimizing the error E(R, t) by using [Equation 2] below with respect to the 3D coordinate information C generated from the CT image and all points R1D+t1 of the coordinate-shifted scan data, to match the coordinates of the scan data and the CT image.
Further, if the number of scan frames for a scan frame set S={s1, s2, . . . } is |S|, the rematching and rigid transform calculation unit may obtain a rigid transform {Ti}i=1|S| of each scan frame that minimizes a loss L as in [Equation] below.
Where pi,k represents a point of an i-th scan key frame, C(i,j) represents a set of all points that are shared between an i-th key frame and a j-th key frame, ci,k represents a point of the 3D coordinate information of the CT image corresponding to pi,k, Ni represents the number of points where pi,k and ci,k correspond to each other, and w1,i and w2,i represent weights corresponding to the respective losses.
Further, the rematching and rigid transform calculation unit may emphasize matching between the scan key frames when w1,i becomes larger upon matching the scan key frames of the scan data, and emphasize matching between the scan key frame and the CT image when w2,i becomes larger.
Further, by comparison with a rigid transform Ti in which L(Ti) is minimal with respect to fixed w1,i and w2,i, under the same conditions the rematching and rigid transform calculation unit may decrease the matching error between the scan key frames at the cost of increasing the matching error between the scan key frame and the CT image in a rigid transform obtained by increasing only w1,i, and in contrast, decrease the matching error between the scan key frame and the CT image at the cost of increasing the matching error between the scan key frames in a rigid transform obtained by increasing only w2,i.
Further, when an artifact is severe in the CT image, the rematching and rigid transform calculation unit partially decreases the size of w2,i or sets it to 0 and increases the size of w1,i to reduce the influence of the CT image when matching the scan key frames, and in contrast, when the matching error between the scan key frames is large, decreases the size of w1,i or sets it to 0 and increases the size of w2,i to increase the influence of the CT image, thereby reducing the matching error between the scan key frames.
Further, in order to achieve the object, still yet another aspect of the present invention provides an oral scanner including the device for reconstructing 3D oral scan data by using the computed tomography image described above.
In addition, in order to achieve the object, still yet another aspect of the present invention provides a computed tomography (CT) apparatus including the device for reconstructing 3D oral scan data by using a computed tomography image.
Further, in order to achieve the object, still yet another aspect of the present invention provides a computer-readable recording medium storing a program for implementing the method for reconstructing 3D oral scan data by using the computed tomography image described above.
In addition, in order to achieve the object, still yet another aspect of the present invention provides a computer program stored in a computer-readable recording medium for implementing the method for reconstructing 3D oral scan data by using a computed tomography image.
As described above, with the device and the method for reconstructing 3D oral scan data by using a computed tomography image according to the embodiments of the present invention, the accumulated matching errors of the scan frames are reduced by rematching the scan frames according to the geometric structure of a CT image, which has no geometric distortion, thereby reducing the geometric distortion of the oral scan model.
Hereinafter, advantages and features of the present invention, and methods for accomplishing the same will be more clearly understood from embodiments described in detail below with reference to the accompanying drawings. However, the present invention is not limited to the embodiments set forth below, and will be embodied in various different forms. The embodiments are just for rendering the disclosure of the present invention complete and are set forth to provide a complete understanding of the scope of the invention to a person with ordinary skill in the technical field to which the present invention pertains, and the present invention will be defined by the scope of the claims.
Further, it is also to be understood that the terminology used herein is for the purpose of describing embodiments only and is not intended to limit the present invention. In addition, in this specification, singular forms include plural forms unless the context clearly indicates otherwise. For example, the terms ‘include’ (or ‘comprise’) and/or ‘including’ (or ‘comprising’) used in the specification do not exclude the presence or addition of components and processes other than those mentioned. Throughout the whole specification, the same reference numerals denote the same elements. ‘And/or’ includes each and every combination of one or more of the mentioned items.
Further, unless otherwise defined, all terms (including technical and scientific terms) used in the present specification may be used as the meaning which may be commonly understood by the person with ordinary skill in the art, to which the present invention pertains. Terms defined in commonly used dictionaries should not be interpreted in an idealized or excessive sense unless expressly and specifically defined.
Referring to the accompanying drawings, the device for reconstructing 3D oral scan data by using a computed tomography image according to an embodiment of the present invention includes a scan data acquisition unit 11, a CT image acquisition unit 12, a 3D coordinate information generation unit 13, a 3D feature point detection unit 14, a coordinate matching unit, a rematching and rigid transform calculation unit, and an oral scan model reconstruction unit. The scan data acquisition unit 11 acquires scan data obtained by scanning the inside of a patient's oral cavity, for example, by using an oral scanner.
The CT image acquisition unit 12, which acquires a CT image obtained by capturing the inside of the patient's oral cavity, may be, for example, a CT apparatus 2. Alternatively, the CT image of the inside of the patient's oral cavity captured by the CT apparatus 2 may be provided by the CT apparatus 2 through wired or wireless communication. At this time, the CT apparatus 2 may be, for example, a cone beam computed tomography (CBCT) apparatus.
The 3D coordinate information generation unit 13 generates 3D coordinate information of the CT image acquired by the CT image acquisition unit 12 (S2). For example, the 3D coordinate information generation unit 13 divides the CT image into upper and lower jaws using deep learning, and then generates 3D coordinate information for the upper and lower jaws of the divided CT image, respectively. At this time, the 3D coordinate information may mean a depth map, a 3D point cloud, a 3D mesh, or the like.
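As one concrete illustration of this step, the following sketch converts a segmented CT volume into a surface mesh and point cloud; the binary jaw mask file and the voxel spacing are hypothetical, and the deep-learning segmentation itself is outside the sketch:

```python
import numpy as np
from skimage import measure  # scikit-image

# jaw_mask: hypothetical binary volume (z, y, x) produced by a deep-learning
# upper/lower jaw segmentation of the CT image.
jaw_mask = np.load("jaw_mask.npy")

# Extract a surface mesh at the mask boundary; `spacing` maps voxels to mm.
verts, faces, normals, _ = measure.marching_cubes(
    jaw_mask.astype(np.float32), level=0.5, spacing=(0.3, 0.3, 0.3)
)

point_cloud = verts      # (N, 3) 3D coordinate information as a point cloud
mesh = (verts, faces)    # or keep the triangle mesh representation
print(point_cloud.shape)
```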
The 3D feature point detection unit 14 detects a 3D feature point for the 3D coordinate information of the CT image generated by the 3D coordinate information generation unit 13 and a 3D feature point for the scan data acquired by the scan data acquisition unit 11 (S3). The 3D feature point detection process S3 proceeds as follows.
Rendering is performed in the z-axis direction on the acquired scan data and the generated 3D coordinate information of the CT image to generate 2D rendering images for the scan data and the CT image (S31).
Subsequently, adjacent points 3a and 3b between teeth (hereinafter referred to as ‘2D points’) are detected in the 2D rendering images generated in the process S31 from the 3D coordinate information of the scan data and the CT image, respectively (S32). For example, the 2D point detection process S32 may be performed by using a deep learning network trained to find the adjacent points between teeth, i.e., the 2D points 3a and 3b, in the 2D rendering image.
Subsequently, 3D inter-teeth adjacent points 4a and 4b (hereinafter referred to as ‘3D points’) are detected by using the 2D points 3a and 3b detected in the process S32 (S33). For example, the 2D points 3a and 3b detected through deep learning are projected onto the original 3D data (the scan data and the CT image) to detect the 3D points 4a and 4b for the scan data and the CT image, respectively (S33).
Subsequently, a 3D feature point 5a for the scan data and a 3D feature point 5b for the 3D coordinate information of the CT image are detected by using the respective 3D points 4a and 4b detected in the process S33 (S34).
The process S34 of detecting the 3D feature points 5a and 5b using the 3D points 4a and 4b is performed as follows.
First, a direction perpendicular to a virtual straight line passing through the two adjacent 3D points 4b detected from the 3D coordinate information of the CT image is obtained (S341).
Subsequently, a center point which becomes the center between the two adjacent 3D points in the obtained perpendicular direction is obtained, and then points on the 3D coordinate information of the CT image are sampled in the perpendicular direction from the two adjacent 3D points and the center point to detect the 3D feature point 5b for the CT image (S342).
Subsequently, the processes are repeatedly performed by the same method as the above-described processes S341 and S342 to detect the 3D feature point 5a for the scan data (S343). For example, in the scan data, the direction perpendicular to the virtual straight line passing through the two adjacent 3D points is obtained, the center point which becomes the center between the two adjacent 3D points in the obtained perpendicular direction is obtained, and then points on the 3D coordinate information of the scan data are sampled in the perpendicular direction from the two adjacent 3D points and the center point to detect the 3D feature point 5a for the scan data.
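A minimal sketch of this sampling procedure is shown below; the choice of the perpendicular direction within the horizontal plane and the nearest-neighbor snapping onto the surface are assumptions of the sketch, since the text does not pin down either detail:

```python
import numpy as np

def detect_feature_points(p1, p2, surface_points, n_samples=5, step=0.5):
    """Sketch of processes S341-S342: sample surface points along the
    direction perpendicular to the line through two adjacent 3D points.

    p1, p2         : (3,) adjacent inter-tooth 3D points (4a/4b in the text)
    surface_points : (N, 3) 3D coordinate information (point cloud)
    """
    line_dir = (p2 - p1) / np.linalg.norm(p2 - p1)
    # One perpendicular direction: rotate the line direction 90 degrees
    # in the x-y plane (an assumption of this sketch).
    perp = np.array([-line_dir[1], line_dir[0], 0.0])
    perp /= np.linalg.norm(perp)

    center = (p1 + p2) / 2.0   # center point between the two adjacent points

    samples = []
    for anchor in (p1, center, p2):
        for s in range(-n_samples, n_samples + 1):
            target = anchor + s * step * perp
            # Snap the sampled location onto the surface (nearest point).
            nearest = surface_points[np.argmin(
                np.linalg.norm(surface_points - target, axis=1))]
            samples.append(nearest)
    return np.asarray(samples)  # candidate 3D feature points (5a/5b)
```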
In the present invention, the 3D feature point for the 3D coordinate information of the CT image is detected first, and then the 3D feature point for the scan data is detected, but this is only an example; the process of detecting the 3D feature point for the scan data may be performed simultaneously with, or before, the process of detecting the 3D feature point for the 3D coordinate information of the CT image. In other words, the order is not limited.
Then, when the 3D feature points 5a and 5b for the 3D coordinate information of the scan data and the CT image are detected, the coordinate matching unit matches the 3D coordinate information of the CT image and the coordinates of the scan data by using the detected 3D feature points (S4).
In the process of matching the coordinates of the scan data and the CT image, first, a rotation matrix R and a translation vector t in which an error E(R,t) becomes minimal are obtained as in [Equation 1] below by using the 3D feature point as an initial value.
Where ‘C’ represents the 3D coordinate information generated from the CT image, and ‘D’ represents the scan data.
In other words, the process of matching the coordinates of the scan data and the CT image first approximately matches the coordinates of the scan data and the CT image by obtaining the approximate rotation matrix R and translation vector t using [Equation 1] above based on the 3D feature points, and then matches the coordinates of the scan data and the CT image again with respect to all 3D coordinate information (including the 3D feature points) generated from the CT image and all points (including the 3D feature points) of the scan data to minimize errors.
For example, if the number of 3D feature points of each of the scan data and the CT image is n with respect to all 3D coordinate information C generated from the CT image and the scan data D, the scan data is transformed into R1D+t1 by obtaining an initial rigid transform R1, t1 minimizing the error E(R1, t1) by using [Equation 2] below with respect to the 3D feature points {ci}i=0n⊆C of the CT image and the 3D feature points {di}i=0n⊆D of the scan data, to approximately match the coordinates of the scan data and the CT image. In addition, the coordinate information of the scan data D is transformed into R(R1D+t1)+t by obtaining the rigid transform R, t minimizing the error E(R, t) by using [Equation 3] below with respect to the 3D coordinate information C generated from the CT image and all points R1D+t1 of the coordinate-shifted scan data, to match the coordinates of the scan data and the CT image.
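For illustration, an initial feature-point alignment of this kind is commonly computed with the standard SVD-based (Kabsch) least-squares solution; the following is a minimal sketch under the assumption of known one-to-one correspondences, not necessarily the exact procedure of the invention:

```python
import numpy as np

def rigid_fit(src, dst):
    """Least-squares rigid transform (R, t) minimizing sum ||dst - (R src + t)||^2.

    Standard SVD-based (Kabsch) solution, e.g. src = scan features d_i,
    dst = CT features c_i; shown as a sketch, not the patented implementation.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
    R = Vt.T @ S @ U.T
    t = dst_c - R @ src_c
    return R, t

# Initial alignment from the n detected feature pairs; the scan data D would
# then be refined with closest-point correspondences over all points:
# R1, t1 = rigid_fit(scan_features, ct_features)
# D_aligned = D @ R1.T + t1
```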
Subsequently, the rematching and rigid transform calculation unit matches the scan key frames of the scan data initially positioned by the 3D feature points, and obtains a rigid transform of the scan key frames that minimizes the 3D distance between the scan key frames and the CT image (S5).
In this specification, the scan key frame is a term used to expand the semantic range of the scan frame. For example, the scan key frame may be a single scan frame or data representing adjacent scan frames; alternatively, it may be reprojected data. Here, if the scan key frame is one scan frame and the number of key frames is equal to the total number of frames, all frames are matched; if it is a frame made by combining several scan frames into one rather than a single frame unit (a result obtained by scanning a predetermined range), frame sets that have already been matched for each section are matched between sets. Reprojected data means that scan data that has already been matched once is reprojected toward the camera again to create a new frame.
This method may reduce the accumulated matching errors of the scan frames because the scan frames are matched according to the geometric structure of the CT image. A method for obtaining a rigid transform that satisfies this is as follows.
For example, if the number of scan frames for a scan frame set S={s1, s2, . . . } is |S|, a rigid transform {Ti}i=1|S| of each scan frame that minimizes the loss L may be obtained as in [Equation 4] below.
Where pi,k represents a point of an i-th scan key frame, C(i,j) represents a set of all points that are shared between an i-th key frame and a j-th key frame, ci,k represents a point of the 3D coordinate information of the CT image corresponding to pi,k, Ni represents the number of points where pi,k and ci,k correspond to each other, and w1,i and w2,i represent weights corresponding to the respective losses.
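A minimal sketch of evaluating a loss of this form follows; the data layout and names are assumptions, and in practice an optimizer over the transform parameters (e.g., nonlinear least squares) would minimize it:

```python
import numpy as np

def combined_loss(transforms, frames, overlaps, ct_points, w1, w2):
    """Evaluate a loss of the form discussed above: a frame-to-frame
    consistency term weighted by w1[i] plus a frame-to-CT term weighted
    by w2[i] and normalized by the number of CT correspondences N_i.

    transforms : list of (R, t) pairs, one rigid transform per key frame
    frames     : list of (N_i, 3) arrays of key frame points p_i
    overlaps   : dict {(i, j): (idx_i, idx_j)} of shared-point indices C(i, j)
    ct_points  : list of (N_i, 3) arrays of CT points c_i matched to frames
    """
    apply_T = lambda T, p: p @ T[0].T + T[1]
    total = 0.0
    for (i, j), (idx_i, idx_j) in overlaps.items():
        diff = apply_T(transforms[i], frames[i][idx_i]) \
             - apply_T(transforms[j], frames[j][idx_j])
        total += w1[i] * np.sum(diff ** 2)             # frame-to-frame term
    for i, p_i in enumerate(frames):
        diff = apply_T(transforms[i], p_i) - ct_points[i]
        total += w2[i] * np.sum(diff ** 2) / len(p_i)  # frame-to-CT term
    return total
```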
Further, when w1,i increases, matching between the scan key frames may be emphasized, and when w2,i increases, matching between the scan key frame and the CT image may be emphasized. In some cases, each weight may be ‘0’. By comparison with a rigid transform Ti in which L(Ti) is minimal with respect to fixed w1,i and w2,i, under the same conditions a rigid transform obtained by increasing only w1,i decreases the matching error between the scan key frames at the cost of increasing the matching error between the scan key frame and the CT image. In contrast, a rigid transform obtained by increasing only w2,i decreases the matching error between the scan key frame and the CT image at the cost of increasing the matching error between the scan key frames.
Meanwhile, a direction in which a 3D distance between scan key frame data and CT image data corresponding thereto is minimal means that the scan key frame data and the CT image data are close to each other.
A method for minimizing the 3D distance between the scan key frame data and the CT image data may minimize the 3D difference between the coordinates of points pk on the scan key frame and the corresponding points ck on the CT image, as in [Equation 5] below.
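[Equation 5] is not reproduced in this text; a plausible point-to-point form consistent with the description (an assumed reconstruction) is:

$$\min_{T_i}\sum_{k}\big\|T_i\,p_k-c_k\big\|^{2}$$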
Further, as in [Equation 6] below, the inner product of the normal vector nk of each point with the difference between the 3D coordinates of the respective points of the scan key frame data and the CT image data may also be minimized. Here, the normal vector means the normal vector of points on the target data to which the data is to be moved.
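Under the same caveat, a plausible point-to-plane form of [Equation 6], using the normal vector nk of the target point, is:

$$\min_{T_i}\sum_{k}\big(n_k\cdot(T_i\,p_k-c_k)\big)^{2}$$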
Further, as in [Equation 7] below, a difference between depth map values dpk and dck, obtained by projecting the respective points of the scan key frame data and the CT image data onto a depth map, may also be minimized.
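Likewise, a plausible form of [Equation 7], comparing depth map values at the projections of corresponding points (an assumed reconstruction), is:

$$\min_{T_i}\sum_{k}\big\|d_{p_k}-d_{c_k}\big\|^{2}$$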
As such, the method for minimizing the 3D distance between the scan key frame data and the CT image data may be calculated by various methods.
Further, when an artifact is severe in the CT image, the size of w2,i is partially decreased or set to 0 and the size of w1,i is increased to reduce the influence of the CT image when matching the scan key frames. In contrast, when the matching error between the scan key frames is large, the size of w1,i is decreased or set to 0 and the size of w2,i is increased to increase the influence of the CT image, thereby reducing the matching error between the scan key frames.
Therefore, in the present invention, as with the rigid transform obtained in [Equation 4] above, by using a rigid transform that considers both the relationship with the CT image (the w2,i part) and the relationship between the scan frames (the w1,i part), the 3D coordinate information of the scan frames is corrected and rematched, and the artifact problem is compensated for by emphasizing the matching weight between the scan frames when the artifact of the CT image (an unnecessary part captured when capturing the CT image) is severe.
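The weight strategy above can be sketched as a simple rule; the thresholds below are hypothetical and only illustrate the two regimes described:

```python
def adapt_weights(artifact_score, frame_error, base=1.0):
    """Sketch of the adaptive weighting described above; the threshold
    values are hypothetical, chosen only to illustrate the two regimes.

    artifact_score : estimated severity of CT artifacts for this region
    frame_error    : current frame-to-frame matching error
    Returns (w1, w2) for the frame-to-frame and frame-to-CT loss terms.
    """
    if artifact_score > 0.8:     # severe CT artifact: trust the scan frames
        return 2.0 * base, 0.0   # raise w1, zero out w2
    if frame_error > 0.5:        # large frame-to-frame error: trust the CT
        return 0.0, 2.0 * base   # zero out w1, raise w2
    return base, base            # otherwise balance both terms
```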
Subsequently, the oral scan model reconstruction unit reconstructs the 3D oral scan model by correcting and rematching the 3D coordinate information of the original scan key frames by using the obtained rigid transform of the scan key frames, so that the scan frames follow the geometric structure of the CT image and the accumulated matching errors are reduced.
Meanwhile, the device and the method for reconstructing the 3D oral scan data according to the embodiments of the present invention may be provided as one module or in software form in the oral scanner or the CT apparatus. For example, the oral scanner may include the components of the device for reconstructing the 3D oral scan data described above.
Further, the device and the method for reconstructing the 3D oral scan data according to the embodiments of the present invention may be implemented, for example, in the form of a recording medium executable by a computer (or a computer program product), such as a program module stored in a computer-readable recording medium and executed by the computer. Here, the computer-readable recording medium may include computer storage media (e.g., a memory, a hard disk, a magnetic/optical medium, or a solid-state drive (SSD)). Further, the computer-readable media may be any available media accessible by the computer and include, for example, all of volatile and non-volatile media and removable and non-removable media.
Further, the device and the method for reconstructing the 3D oral scan data according to the embodiments of the present invention may include instructions executable in whole or in part by the computer, and the computer program may include programmable machine instructions processed by a processor, and may be implemented by a high-level programming language, an object-oriented programming language, an assembly language, or a machine language.
As above, preferred embodiments of the invention have been described and illustrated using specific terminology, but such terminology is only intended to clearly describe the present invention. In addition, it is apparent that various modifications and changes can be made to the embodiments and described terms of the present invention without departing from the technical spirit and scope of the following claims. These modified embodiments should not be understood individually from the spirit and scope of the present invention, but should be regarded as falling within the scope of the claims of the present invention.
Priority application: 10-2022-0018687, filed February 2022, KR (national).
PCT filing: PCT/KR2022/019837, filed December 7, 2022 (WO).