This invention relates generally to medical diagnostics, and more specifically to an automated system and method for guided implant surgery planning, enabling fully integrated scan-to-print planning that combines AI and user interface tools for added editing capabilities and customization.
Modern image generation systems play an important role in disease detection and treatment planning. A few existing systems and methods are discussed below. One common method is dental radiography, which provides dental radiographic images that enable the dental professional to identify many conditions that may otherwise go undetected and to see conditions that cannot be identified clinically. Another technology is cone beam computed tomography (CBCT), which allows structures in the oral-maxillofacial complex to be viewed in three dimensions. Hence, cone beam computed tomography technology is generally preferred over dental radiography.
However, CBCT has one or more limitations, such as the time and complexity involved for personnel to become fully acquainted with the imaging software and to correctly use digital imaging and communications in medicine (DICOM) data. The American Dental Association (ADA) also suggests that the CBCT image should be evaluated by a dentist with appropriate training and education in CBCT interpretation. Further, many dental professionals who incorporate this technology into their practices have not had the training required to interpret data on anatomic areas beyond the maxilla and the mandible. To address the foregoing issues, deep learning has been applied to various medical imaging problems to interpret the generated images, but its use remains limited within the field of dental radiography. Further, most applications only work with 2D X-ray images.
In an existing article entitled “Teeth and jaw 3D reconstruction in stomatology”, Proceedings of the International Conference on Medical Information Visualisation—BioMedical Visualisation, pp 23-28, 2007, Krsek et al. describe a method dealing with problems of 3D tissue reconstruction in stomatology. In this process, 3D geometry models of teeth and jaw bones were created based on input computed tomography (CT) image data. The input discrete CT data were segmented by a nearly automatic procedure, with manual correction and verification. Creation of segmented tissue 3D geometry models was based on vectorization of input discrete data extended by smoothing and decimation. The actual segmentation operation was primarily based on selecting a threshold of Hounsfield Unit values. However, this method fails to be sufficiently robust for practical use.
An existing patent, U.S. Pat. No. 8,849,016, entitled “Panoramic image generation from CBCT dental images” to Shoupu Chen et al., discloses a method for forming a panoramic image from a computed tomography image volume. The method acquires image data elements for one or more computed tomographic volume images of a subject, identifies a subset of the acquired computed tomographic images that contain one or more features of interest, and defines, from the subset of the acquired computed tomographic images, a sub-volume having a curved shape that includes one or more of the contained features of interest. The curved shape is unfolded by defining a set of unfold lines, wherein each unfold line extends at least between two curved surfaces of the curved shape sub-volume, and re-aligning the image data elements within the curved shape sub-volume according to a re-alignment of the unfold lines. One or more views of the unfolded sub-volume are displayed.
Another existing patent application, US20080232539, entitled “Method for the reconstruction of a panoramic image of an object, and a computed tomography scanner implementing said method” to Alessandro Pasini et al., discloses a method for the reconstruction of a panoramic image of the dental arches of a patient, a computer program product, and a computed tomography scanner implementing said method. The method involves acquiring volumetric tomographic data of the object; extracting, from the volumetric tomographic data, tomographic data corresponding to at least three sections of the object identified by respective mutually parallel planes; determining, on each section extracted, a respective trajectory that a profile of the object follows in an area corresponding to said section; determining a first surface transverse to said planes such as to comprise the trajectories; and generating the panoramic image on the basis of a part of the volumetric tomographic data identified as a function of said surface. However, the above references also fail to address the aforementioned problems regarding cone beam computed tomography technology and image generation systems.
Therefore, there is a need for an automated parsing pipeline system and method for anatomical localization and condition classification. There is a need for training an AI/ML model for performing segmentation of any dental volumetric image for providing dental practitioners with an automated diagnostic tool. Additionally, while individual imaging techniques, such as CBCT, are powerful on their own, when combined, they can provide a more accurate 3D representation of a patient. In practice, volumetric CBCT images are already being merged with surface Intraoral Scans (IOS) to improve planning for computer-guided surgery. However, this superimposition must currently be done manually. One method, for example, involves manually identifying and specifying matching points in both the volumetric images and surface scans. The process of manual alignment is time-consuming. An automated system capable of aligning volumetric images and surface scans would benefit dental practitioners by reducing the time and effort required to align said images prior to use in surgical and clinical applications.
Additionally, automated systems are also capable of providing measurements useful to the selection and planning of implants and crowns. CBCT imaging is a powerful tool for improving the safety and outcomes of dental implant surgery. Such images allow dental practitioners to plan the size and placement of implants, and to be aware of complicating factors, such as insufficient bone at the implant site, or the need for guided tissue regeneration or sinus elevation. Pre-surgical planning can increase success rates and avoid negative outcomes. However, while CBCT is currently being used by some dental practitioners to plan for implant surgery, the usefulness and success of such methods are dependent upon the practitioner's ability to correctly interpret CBCT images. Many practitioners lack the training and experience to use volumetric imaging effectively. An automated system capable of predicting implant and crown design and placement would address this training shortfall and make CBCT technology accessible to a wider group of practitioners and their patients.
In an existing article entitled “A deep learning approach for dental implant planning in cone-beam computed tomography,” Bayrakdar et al. demonstrate that an artificial intelligence (AI) system is capable of detecting anatomical features both present (mandibular canal) and absent (missing teeth) for the purpose of implant planning. The AI also had some success in measuring bone height in the premolar sections of the mandible and maxilla. However, additional anatomical features (nasal fossa in the maxilla, mandibular accessory canals) were not reliably detected, and bone thickness measurements differed significantly from traditional manual measurements in all locations. These deficiencies indicate a need to improve AI accuracy via deep learning. An improved AI capable of accurately measuring dental anatomy, identifying sites of implant and crown placement, and predicting the size, shape, and orientation of implants and crowns would make CBCT accessible to more dental practitioners, streamline the implant and crown planning process, and increase treatment success rates. Improved AI accuracy via deep learning techniques is critical for the design and fabrication of surgical templates or guides, on the basis of the input imagery and an image processing/reconstruction framework.
Dental implant surgery planning is a critical step in restorative dentistry, and it significantly contributes to the success of the surgical procedures and long-term dental health of patients. Despite technological advancements, there are several limitations in current practices, mostly centered around the laborious nature of manually determining implant sites, sleeve, crown, and implant characteristics, as well as surgical guide design.
Manual implant site determination:
Sleeve, Crown, and Implant Characteristics:
Surgical Guide Design:
Subjectivity and Variability:
Limited Integration:
Time Consuming:
An advanced AI system can utilize superimposition of CBCT and Intra-oral scans, segment key objects in the superimposed scan, and automatically detect missing or problematic teeth, thus suggesting implant site candidates. Moreover, such a system could recommend implant, sleeve, and crown characteristics based on a vast dataset of successful implant cases, ensuring a better fit and overall outcome. This AI system could provide a more user-friendly interface for practitioners to make modifications, thereby reducing the time and complexity involved.
The ability to export planning results as a comprehensive PDF report or a 3-D model of the surgical guide would enable easier communication between practitioners, better patient education, and would contribute to more precise, accurate, and efficient planning and execution of dental implant surgeries. There is clearly a void in the market for such an automated system and method for guided implant surgery planning.
A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. Embodiments disclosed include an automated parsing pipeline system and method for anatomical localization and condition classification.
In an embodiment, the system comprises an input event source, a memory unit in communication with the input event source, a processor in communication with the memory unit, a volumetric image processor in communication with the processor, a voxel parsing engine in communication with the volumetric image processor and a localizing layer in communication with the voxel parsing engine. In one embodiment, the memory unit is a non-transitory storage element storing encoded information. In one embodiment, at least one volumetric image data is received from the input event source by the volumetric image processor. In one embodiment, the input event source is a radio-image gathering source.
The processor is configured to parse the at least one received volumetric image data into at least a single image frame field of view by the volumetric image processor. The processor is further configured to localize anatomical structures residing in the at least single field of view by assigning each voxel a distinct anatomical structure by the voxel parsing engine. In one embodiment, the single image frame field of view is pre-processed for localization, which involves rescaling using linear interpolation. The pre-processing involves use of any one of several normalization schemes to account for variations in image value intensity depending on at least one of an input or output of the volumetric image. In one embodiment, localization is achieved using a V-Net-based fully convolutional neural network.
The processor is further configured to select all voxels belonging to the localized anatomical structure by finding a minimal bounding rectangle around the voxels and the surrounding region for cropping as a defined anatomical structure by the localization layer. The bounding rectangle extends by at least 15 mm vertically and 8 mm horizontally (equally in all directions) to capture the tooth and surrounding context. In one embodiment, the automated parsing pipeline system further comprises a detection module. The processor is configured to detect or classify the conditions for each defined anatomical structure within the cropped image by a detection module or classification layer. In one embodiment, the classification is achieved using a DenseNet 3-D convolutional neural network.
In another embodiment, an automated parsing pipeline method for anatomical localization and condition classification is disclosed. At one step, at least one volumetric image data is received from an input event source by a volumetric image processor. At another step, the received volumetric image data is parsed into at least a single image frame field of view by the volumetric image processor. At another step, the single image frame field of view is pre-processed by controlling image intensity value by the volumetric image processor. At another step, the anatomical structure residing in the single pre-processed field of view is localized by assigning each voxel a distinct anatomical structure ID by the voxel parsing engine. At another step, all voxels belonging to the localized anatomical structure are assigned a distinct identifier, and segmentation is based on a distribution approach. Optionally, a segmented polygonal mesh may be generated from the distribution-based segmentation. Further optionally, the polygonal mesh may be generated from a coarse-to-fine model segmentation of coarse input volumetric images. In other embodiments, all voxels belonging to the localized anatomical structure may be selected by finding a minimal bounding rectangle around the voxels and the surrounding region for cropping as a defined anatomical structure by the localization layer. In another embodiment, the method includes a step of classifying the conditions for each defined anatomical structure within the cropped image by the classification layer.
In another embodiment, the system comprises an input event source, a memory unit in communication with the input event source, a processor in communication with the memory unit, an image processor in communication with the processor, a segmentation layer in communication with the image processor, a mesh layer in communication with the segmentation layer, and an alignment module in communication with both the segmentation layer and mesh layer. In one embodiment, the memory unit is a non-transitory storage element storing encoded information. In one embodiment, at least one volumetric image datum and at least one surface scan datum are received from the input event source by the image processor. In one embodiment, the input event source is at least one radio-image gathering source. In one embodiment, the volumetric image is a three-dimensional voxel array of a maxillofacial anatomy of a patient and the surface scan is a polygonal mesh corresponding to the maxillofacial anatomy of the same patient.
In yet another aspect, a system and method for surgical design and fabrication entails the steps of: receiving an input mesh with a calculated sequence of points along edges of the input mesh; finding a geodesic path along the mesh edges using at least one flip-out technique; generating an inner and outer surface from the edge-flipped mesh by finding a direction of a plane insertion that minimizes an undercut area and calculating a height map offset in the direction of the insertion for triangulating and clipping by curve; (collision detection/rule-surface generation clause); and (fabrication).
The processor is configured to segment both volumetric images and surface scan images into a set of distinct anatomical structures. In one embodiment, the volumetric image is segmented by assigning an anatomical structure identifier to each volumetric image voxel, and the surface scan image is segmented by assigning an anatomical structure identifier to each vertex or face of the surface scan's mesh. The volumetric image and the surface scan image have at least one distinct anatomical structure in common.
The processor is further configured to convert both the volumetric image and the surface scan image into point clouds/point sets that can be aligned. In one embodiment, a polygonal mesh is extracted from the volumetric image. Both the original surface scan polygonal mesh and the extracted volumetric image mesh are converted to point clouds. In one embodiment, both the volumetric image and surface scan image are processed by applying a binary erosion on the voxels corresponding to an anatomical structure, producing an eroded mask. The eroded mask is subtracted from a non-eroded mask, revealing voxels on the boundary. A random subset of boundary voxels is selected as a point set by selecting a number of points similar to a number of points on a corresponding structure in a polygonal mesh. Once both the volumetric image and surface scan image are converted to point clouds/point sets, the volumetric image and surface scan image point cloud/point sets are aligned. In one embodiment, alignment is accomplished using point set registration. Alternatively, each of the volumetric and surface scan meshes may be converted into a format featuring coordinates of assigned structures, landmarks, etc. for alignment based on common coordinates/structures, landmarks, etc.
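By way of non-limiting illustration only, the erosion-based extraction of a boundary point set described above may be sketched as follows. The function name `boundary_point_set`, the cubic test mask, and the use of SciPy's default structuring element are assumptions made for this sketch and are not prescribed by the disclosure.

```python
import numpy as np
from scipy.ndimage import binary_erosion

def boundary_point_set(mask, n_points, seed=0):
    """Sample a point set from the boundary voxels of a binary structure mask."""
    # Erode the structure, then subtract the eroded mask from the original
    # mask so that only the voxels on the boundary remain.
    eroded = binary_erosion(mask)
    boundary = mask & ~eroded
    coords = np.argwhere(boundary)
    # Select a random subset, e.g. sized to match the number of points on
    # the corresponding structure in the surface scan mesh.
    rng = np.random.default_rng(seed)
    n = min(n_points, len(coords))
    idx = rng.choice(len(coords), size=n, replace=False)
    return coords[idx].astype(np.float64)

# Hypothetical structure: a 5x5x5 voxel cube inside a 7x7x7 volume.
mask = np.zeros((7, 7, 7), dtype=bool)
mask[1:6, 1:6, 1:6] = True
points = boundary_point_set(mask, n_points=50)
```

The resulting coordinates are in voxel units; in practice they would be scaled by the voxel spacing before being registered against a surface scan point cloud.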
In another embodiment of the invention, an automated pipeline for the prediction of at least one tooth crown or dental implant feature, such as but not limited to location, orientation, dimensions, or geometry, is disclosed. At one step, either a volumetric image, such as but not limited to a CBCT image, or a surface scan image, such as but not limited to an IOS image, is received and segmented into a set of distinct anatomical structures, such as but not limited to individual teeth, maxilla, mandible, mandibular canal, maxillary sinus, fossae, and a missing tooth. Segmentation is performed by assigning each voxel, or each vertex or face of the volumetric and surface scan image, respectively, to one of the anatomical structures. In one embodiment, the position and angulation of the roots of a segmented missing tooth are used to suggest an implant from a library of prototypes. This is done by selecting a set of points on the surface of the segmented image and prototype, and running a pointset-matching algorithm to identify the closest matching prototype.
In an alternative embodiment, in an additional step, a “missing tooth” or “phantom crown” (terms used interchangeably hereinafter) feature is predicted in the location of a segmented missing tooth. In one embodiment, a neural network is trained to predict a “phantom crown” by inputting a segmented radiological image, removing a random subset of teeth from the input image and replacing them with background, and instructing the neural network to predict, for missing tooth sites, a tooth segmentation using the tooth removed from each site as a training target. In one embodiment, the predicted “phantom crown” is the output. In an alternative embodiment, the “phantom crown” is used to suggest an implant or crown from a library of prototypes by selecting a set of points on the surface of either the “phantom crown” or the prototype and running a pointset matching algorithm. In an alternative embodiment, at least one of a cylindrical or conical shape is imposed along the location/orientation, dimension, or geometry indicated by the phantom crown, defining an “allowed placement zone” for implant placement. An implant is suggested from a library of prototypes or practitioner inventory by selecting a set of points on the surface of either the segmented image or the prototype and running a pointset matching algorithm. In yet other embodiments, a rule-based approach may be employed to predict crown/implant features, rather than relying strictly on neural network outputs along the prediction pipeline.
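By way of non-limiting illustration only, the construction of one training example for the “phantom crown” network may be sketched as follows. The function name `make_training_pair`, the integer label encoding, and the use of label 0 for background are assumptions made for this sketch.

```python
import numpy as np

def make_training_pair(label_map, n_remove, seed=0, background=0):
    """Remove a random subset of teeth and return (input, target, removed ids)."""
    rng = np.random.default_rng(seed)
    tooth_ids = np.unique(label_map)
    tooth_ids = tooth_ids[tooth_ids != background]
    n = min(n_remove, len(tooth_ids))
    removed = rng.choice(tooth_ids, size=n, replace=False)
    removed_mask = np.isin(label_map, removed)
    # Network input: the removed teeth are replaced with background.
    network_input = np.where(removed_mask, background, label_map)
    # Training target: only the removed teeth, at their original sites.
    target = np.where(removed_mask, label_map, background)
    return network_input, target, sorted(removed.tolist())

# Hypothetical segmented image containing three teeth labeled 1, 2 and 3.
label_map = np.array([[[0, 1, 1, 2, 2, 3, 3, 0]]])
network_input, target, removed_ids = make_training_pair(label_map, n_remove=1)
```

Because the input and target are disjoint, adding them voxel by voxel recovers the original segmentation, which is a convenient sanity check on the generated pairs.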
Other aspects include a system and method for an automated surgical guide design, comprising a geodesic module; a slicing module; and a processor coupled to a memory element with stored instructions that, when implemented by the processor, cause the processor to: receive an input mesh with a calculated sequence of points on the input mesh; find geodesic line segments on the mesh between the points by the geodesic module; slice out from the mesh a part that is inside the area bounded by the geodesic line segments by the slicing module; find an insertion direction that minimizes an undercut area; generate a height map in the direction of the insertion with offsets a and b for inner and outer surfaces for rendering a three-dimensional mask for triangulating and smoothing into the surgical guide; and (optionally) fabricate the designed guide on or off-site.
In yet another embodiment, the present invention relates to a system and method for guided implant surgery planning, incorporating innovative tools and features for efficient and effective dental treatment planning. This system is designed to be instrumental in both the preliminary stages of planning and the eventual surgical guide design, facilitating improved outcomes for dental implant surgeries.
In one embodiment, the system is configured to receive and interpret volumetric or surface scan images, which can depict a three-dimensional voxel array or a polygonal mesh of a patient's maxillofacial anatomy. Through the implementation of encoded instructions, the system can segment these images into distinct anatomical structures, including missing teeth that require replacement.
In a preferred embodiment, the system can further predict the shape and position of a tooth crown or implant, to replace a segmented missing tooth. To achieve this, the system imposes either a cylindrical or conical shape along a mesh-planned location, orientation, dimension, or geometry. These aspects are configured to align with the predicted crown or implant shape and positioning.
Moreover, in another embodiment, the system can generate an implant planning interface for a practitioner, which provides an edit-ready report outlining the imposed shape and position of the predicted crown or implant. This interface incorporates user-friendly elements to facilitate the modification of this information according to the specific requirements of the patient and the expertise of the practitioner.
In an additional embodiment, the system can superimpose a volumetric image and surface scan image, combining these inputs to generate a 3-D model of the patient's head. This model can then be segmented to distinguish different anatomical structures, providing a clear visual guide for practitioners. In this segmented model, every tooth is numbered, and key anatomical features such as the maxilla, mandible, nerve canal, incisive canal, sinus, airways, cranial, and soft tissue are outlined.
In another embodiment, the system offers a series of interface features, designed to facilitate user interaction with the system. These features include multiplanar reformation, panoramic reconstruction, a 3D scene, a toolbar, an objects panel, and a planning panel. The objects panel provides a representation of all segmented anatomic and artificial objects in a tree structure, enabling the practitioner to adjust their visibility and alter their color and transparency level.
In a further embodiment, the present invention also proposes a method for generating an implant planning interface, wherein the practitioner can alter the visibility of segmented structures in preparation of an edit-ready report. This method incorporates the reception and segmentation of volumetric or surface scan images, and the generation of an interface where a practitioner can prepare an edit-ready report, comprising at least one of the shape and position of a predicted crown or implant, or an edit-ready surgical guide design for fabrication. The system and method described herein provide a comprehensive and practical solution for dental implant planning, facilitating improved efficiency and accuracy in the planning and execution of dental implant surgeries.
Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
The drawings illustrate the design and utility of embodiments of the present invention, in which similar elements are referred to by common reference numerals. In order to better appreciate the advantages and objects of the embodiments of the present invention, reference should be made to the accompanying drawings that illustrate these embodiments. However, the drawings depict only some embodiments of the invention, and should not be taken as limiting its scope. With this caveat, embodiments of the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the invention. It will be apparent, however, to one skilled in the art that the invention can be practiced without these specific details.
Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments, but not other embodiments.
The present embodiments disclose a system and method for an automated and AI-aided alignment of volumetric images and surface scan images for improved dental diagnostics. In addition to the various segmentation/localization techniques for assigning structures to each of the received volumetric and surface scan images, as described previously, the automated alignment pipeline additionally features an alignment layer for aligning the converted meshes/erosion points from each of the image types.
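By way of non-limiting illustration only, one building block that an alignment layer might use is a least-squares rigid fit between two point sets. The sketch below uses the Kabsch method and assumes that point correspondences are already known, which is a simplification: the disclosure does not name a specific registration algorithm, and a practical point set registration would typically estimate the correspondences as well. The function name `rigid_align` is hypothetical.

```python
import numpy as np

def rigid_align(source, target):
    """Return rotation r and translation t so that r @ source[i] + t fits target[i]."""
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    # Cross-covariance of the centered point sets.
    h = (source - src_c).T @ (target - tgt_c)
    u, _, vt = np.linalg.svd(h)
    # Guard against a reflection so that r is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = tgt_c - r @ src_c
    return r, t

# Synthetic check: rotate and translate a point set, then recover the motion.
rng = np.random.default_rng(0)
source = rng.normal(size=(30, 3))
angle = np.pi / 6
true_r = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
true_t = np.array([2.0, -1.0, 0.5])
target = source @ true_r.T + true_t
r, t = rigid_align(source, target)
```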
Specific embodiments of the invention will now be described in detail with reference to the accompanying
In one embodiment, input data is provided via the input event source 101. In one embodiment, the input data is volumetric image data and the input event source 101 is a radio-image gathering source. In one embodiment, the input data is 2D image data. The volumetric image data comprises a 3-D pixel array. The volumetric image processor 103a is configured to receive the volumetric image data from the radio-image gathering source. Initially, the volumetric image data is pre-processed, which involves conversion of the 3-D pixel array into an array of Hounsfield Unit (HU) radio intensity measurements.
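By way of non-limiting illustration only, the conversion of stored voxel values into Hounsfield Units may be sketched as a linear mapping, as is conventional for DICOM CT data via the Rescale Slope and Rescale Intercept attributes. The function name `to_hounsfield` and the example slope and intercept values are assumptions made for this sketch; actual values are read from the image metadata.

```python
import numpy as np

def to_hounsfield(raw, slope=1.0, intercept=-1024.0):
    """Map stored voxel values to Hounsfield Units: HU = raw * slope + intercept."""
    return raw.astype(np.float32) * slope + intercept

# Hypothetical stored values for a tiny 1x2x2 volume.
raw_volume = np.array([[[0, 1024], [2024, 4024]]], dtype=np.uint16)
hu_volume = to_hounsfield(raw_volume, slope=1.0, intercept=-1024.0)
```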
The processor 103 is further configured to parse at least one received volumetric image data 103b into at least a single image frame field of view by the volumetric image processor.
The processor 103 is further configured to localize anatomical structures residing in the single image frame field of view by assigning each voxel a distinct anatomical structure by the voxel parsing engine 104. In one embodiment, the single image frame field of view is pre-processed for localization, which involves rescaling using linear interpolation. The pre-processing involves use of any one of several normalization schemes to account for variations in image value intensity depending on at least one of an input or output of the volumetric image. In one embodiment, localization is achieved using a V-Net-based fully convolutional neural network. In one embodiment, the V-Net is a 3D generalization of U-Net.
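By way of non-limiting illustration only, the pre-processing for localization may be sketched as a linear-interpolation resampling followed by an intensity normalization. The disclosure leaves the normalization scheme open; the clipped min-max scheme, the HU window of -1000 to 3000, and the function name `rescale_and_normalize` below are assumptions made for this sketch.

```python
import numpy as np
from scipy.ndimage import zoom

def rescale_and_normalize(volume, spacing_mm, target_mm=1.0,
                          hu_min=-1000.0, hu_max=3000.0):
    """Resample to target_mm voxels (linear interpolation) and scale to [0, 1]."""
    factors = [s / target_mm for s in spacing_mm]
    # order=1 selects linear interpolation.
    resampled = zoom(volume.astype(np.float32), factors, order=1)
    # Assumed normalization: clip to a fixed HU window, then min-max scale.
    clipped = np.clip(resampled, hu_min, hu_max)
    return (clipped - hu_min) / (hu_max - hu_min)

# Hypothetical 20x20x20 volume with 0.2 mm voxels, downscaled to 1.0 mm.
volume = np.linspace(-1000, 3000, 20 ** 3, dtype=np.float32).reshape(20, 20, 20)
prepared = rescale_and_normalize(volume, spacing_mm=(0.2, 0.2, 0.2))
```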
The processor 103 is further configured to select all voxels belonging to the localized anatomical structure by finding a minimal bounding rectangle around the voxels and the surrounding region for cropping as a defined anatomical structure by the localization layer. The bounding rectangle extends by at least 15 mm vertically and 8 mm horizontally (equally in all directions) to capture the tooth and surrounding context.
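By way of non-limiting illustration only, the selection of a minimal bounding region with a surrounding margin may be sketched as follows. The function name `crop_box`, the (vertical, horizontal, horizontal) axis order, and the example voxel spacing are assumptions made for this sketch; the 15 mm and 8 mm margins are taken from the description above.

```python
import numpy as np

def crop_box(mask, spacing_mm, margin_mm=(15.0, 8.0, 8.0)):
    """Return slices for the minimal bounding box of mask, grown by margin_mm."""
    coords = np.argwhere(mask)
    lo = coords.min(axis=0)
    hi = coords.max(axis=0) + 1  # exclusive upper bound
    # Convert the millimeter margins into whole voxels along each axis.
    pad = np.ceil(np.asarray(margin_mm) / np.asarray(spacing_mm)).astype(int)
    # Grow the box equally in both directions, clamped to the volume extent.
    lo = np.maximum(lo - pad, 0)
    hi = np.minimum(hi + pad, mask.shape)
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))

# Hypothetical tooth mask in a 60x40x40 volume with 1.0 mm voxels.
mask = np.zeros((60, 40, 40), dtype=bool)
mask[30:35, 18:22, 18:22] = True
box = crop_box(mask, spacing_mm=(1.0, 1.0, 1.0))
cropped = mask[box]
```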
In one embodiment, the localization layer 105 includes 33-class semantic segmentation in 3D. In one embodiment, the system is configured to classify each voxel as one of 32 teeth or background, and the resulting segmentation assigns each voxel to one of 33 classes. In another embodiment, the system is configured to classify each voxel as either tooth or other anatomical structure of interest. In the case of localizing only teeth, the classification includes, but is not limited to, 2 classes. Individual instances of every class (teeth) could then be split, e.g., by separately predicting a boundary between them. In some embodiments, the anatomical structures being localized include, but are not limited to, teeth, upper and lower jaw bone, sinuses, lower jaw canal, and joint.
In one embodiment, the system utilizes a fully convolutional network. In another embodiment, the system works on downscaled images (typically from 0.1-0.2 mm voxel resolution to 1.0 mm resolution) and a grayscale (1-channel) image (for example, a 1×100×100×100-dimensional tensor). In yet another embodiment, the system outputs a 33-channel image (for example, a 33×100×100×100-dimensional tensor) that is interpreted as a probability distribution over non-tooth and each of the 32 possible (for an adult human) teeth, for every voxel.
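By way of non-limiting illustration only, the interpretation of a 33-channel network output as a per-voxel probability distribution may be sketched as follows. The use of a softmax over the channel axis, the convention that channel 0 denotes non-tooth, and the function name `labels_from_logits` are assumptions made for this sketch.

```python
import numpy as np

def labels_from_logits(logits):
    """logits: array of shape (33, D, H, W). Returns (probabilities, label map)."""
    # Numerically stable softmax over the channel axis.
    shifted = logits - logits.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=0, keepdims=True)
    # The most probable class per voxel: 0 for non-tooth, 1..32 for teeth.
    return probs, probs.argmax(axis=0)

# Hypothetical raw network output on a small 4x4x4 grid.
rng = np.random.default_rng(0)
logits = rng.normal(size=(33, 4, 4, 4))
logits[7, 0, 0, 0] = 50.0  # make class 7 dominate at one voxel
probs, labels = labels_from_logits(logits)
```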
In an alternative embodiment, the system provides 2-class segmentation, in which each voxel is labeled or classified as tooth or not tooth. The system additionally outputs an assignment of each tooth voxel to a separate “tooth instance”.
In one embodiment, the system comprises a V-Net predicting multiple “energy levels”, which are later used to find boundaries. In another embodiment, a recurrent neural network could be used for step-by-step prediction of teeth, keeping track of the teeth that were output in the preceding step. In yet another embodiment, Mask-RCNN generalized to 3D could be used by the system. In yet another embodiment, the system could take multiple crops from the 3D image in original resolution, perform instance segmentation, and then join the crops to form a mask for the entire original image. In another embodiment, the system could apply either segmentation or object detection in 2D to segment axial slices. This would allow images to be processed in original resolution (albeit in 2D instead of 3D), with the 3D shape then inferred from the 2D segmentation.
In one embodiment, the system could be implemented utilizing descriptor learning in a multitask learning framework, i.e., a single network learning to output predictions for multiple dental conditions. This could be achieved by balancing the loss between tasks to make sure every class of every task has approximately the same impact on the learning. The loss is balanced by maintaining a running average of the gradient that the network receives from every class-task pair and normalizing it. Alternatively, descriptor learning could be achieved by training the network on batches consisting of data about a single condition (task) and sampling examples into these batches in such a way that all classes have the same number of examples in a batch (which is generally not possible in a multitask setup). Further, standard data augmentation could be applied to 3D tooth images to perform scaling, cropping, rotation, and vertical flips. All augmentations and the final image resize to target dimensions are then combined into a single affine transform and applied at once.
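By way of non-limiting illustration only, the combination of several augmentations into a single affine transform may be sketched with 4×4 homogeneous matrices, so that the image need only be resampled once. The helper names, the choice of rotation axis and flip axis, and the representation of a crop as a translation are assumptions made for this sketch.

```python
import numpy as np

def scaling(s):
    """Uniform scaling (also usable for the final resize)."""
    return np.diag([s, s, s, 1.0])

def rotation_z(angle_rad):
    """Rotation about the third coordinate axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    m = np.eye(4)
    m[:2, :2] = [[c, -s], [s, c]]
    return m

def flip(axis):
    """Mirror along one coordinate axis."""
    m = np.eye(4)
    m[axis, axis] = -1.0
    return m

def translation(t):
    """Shift of the origin, e.g. to express a crop offset."""
    m = np.eye(4)
    m[:3, 3] = t
    return m

# Compose once; the rightmost matrix is applied first.
combined = translation([5.0, 0.0, 0.0]) @ flip(2) @ rotation_z(np.pi / 2) @ scaling(2.0)

# Applying the combined matrix matches applying each step in sequence.
point = np.array([1.0, 0.0, 0.0, 1.0])
result = combined @ point
```

Composing the matrices before resampling avoids interpolating the image once per augmentation, which would otherwise accumulate blur.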
Advantageously, in some embodiment, to accumulate positive cases faster, weak model could be trained and run the model on all of unlabeled data. From resulting predictions, teeth model that gives high scores on some rare pathology of interest are selected. Then, the teeth are sent to be labelled by humans or users and added to the dataset (both positive and negative human labels). This allows to quickly and cost-efficiently build up more balanced dataset for rare pathologies.
In some embodiments, the system could use coarse segmentation mask from localizer as an input instead of tooth image. In some embodiments, the descriptor could be trained to output fine segmentation mask from some of the intermediate layers. In some embodiments, the descriptor could be trained to predict tooth number.
As an alternative to multitask learning approach, “one network per condition” could be employed, i.e. models for different conditions are completely separate models that share no parameters. Another alternative is to have a small shared base network and use separate subnetworks connected to this base network, responsible for specific conditions/diagnoses.
The anatomical structures residing in the at least single field of view are localized by assigning each voxel a distinct anatomical structure by the voxel parsing engine 208b.
The processor 208 is configured to select all voxels belonging to the localized anatomical structure by finding a minimal bounding rectangle around the voxels and the surrounding region for cropping as a defined anatomical structure by the localization layer 208c. Then, the conditions for each defined anatomical structure within the cropped image are classified by a detection module or classification layer 208d.
At step 304, a tooth or anatomical structure inside the pre-processed and parsed volumetric image is localized and identified by tooth number. At step 306, the identified tooth and surrounding context within the localized volumetric image are extracted. At step 308, a visual report is reconstructed with the localized and defined anatomical structure. In some embodiments, the visual reports include, but are not limited to, an endodontic report (with focus on the tooth's root/canal system and its treatment state), an implantation report (with focus on the area where the tooth is missing), and a dystopic tooth report for tooth extraction (with focus on the area of dystopic/impacted teeth).
At step 314, the received volumetric image data is parsed into at least a single image frame field of view by the volumetric image processor. The at least single image frame field of view is pre-processed by controlling image intensity values by the volumetric image processor. At step 316, an anatomical structure residing in the at least single pre-processed field of view is localized by assigning each voxel a distinct anatomical structure ID by the voxel parsing engine. At step 318, all voxels belonging to the localized anatomical structure are selected by finding a minimal bounding rectangle around the voxels and the surrounding region for cropping as a defined anatomical structure by the localization layer. At step 320, a visual report is reconstructed with the defined and localized anatomical structure. At step 322, conditions for each defined anatomical structure are classified within the cropped image by the classification layer.
Referring to
Problem: Tooth localization is formulated as a 33-class semantic segmentation problem, in which each of the 32 teeth and the background are interpreted as separate classes.
Model: A V-Net-based fully convolutional network is used. The V-Net is 6 levels deep, with widths of 32, 64, 128, 256, 512, and 1024. The final layer has an output width of 33, interpreted as a softmax distribution over each voxel, assigning it to either the background or one of 32 teeth. Each block contains 3×3×3 convolutions with padding of 1 and stride of 1, followed by ReLU non-linear activations and dropout with a rate of 0.1. Instance normalization is used before each convolution. Batch normalization was not suitable in this case because there is only one example in each batch (owing to GPU memory limits), so batch statistics cannot be determined.
Different architecture modifications were tried during the research stage. For example, an architecture with 64; 64; 128; 128; 256; 256 units per layer leads to the vanishing gradient flow and, thus, no training. On the other hand, reducing architecture layers to the first three (three down and three up) gives a comparable result to the proposed model, though the final loss remains higher.
Loss function: Let R be the ground truth segmentation with voxel values r_i (0 or 1 for each class), and P the predicted probabilistic map for each class with voxel values p_i. As a loss function we use soft negative multi-class Jaccard similarity, which can be defined as:
$$L(R, P) = 1 - \frac{1}{N} \sum_{c=1}^{N} \frac{\sum_i p_{c,i}\, r_{c,i} + \varepsilon}{\sum_i \left(p_{c,i} + r_{c,i} - p_{c,i}\, r_{c,i}\right) + \varepsilon}$$
where c indexes the classes, i indexes the voxels, N is the number of classes, which in our case is 32, and ε is a loss function stability coefficient that helps to avoid the numerical issue of dividing by zero. The model is then trained to convergence using an Adam optimizer with a learning rate of 1e-4 and a weight decay of 1e-8. A batch size of 1 is used due to the large memory requirements of volumetric data and models. The training is stopped after 200 epochs and the latest checkpoint is used (validation loss does not increase after reaching the convergence plateau).
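The soft Jaccard loss described above can be sketched as follows. This is a minimal illustration that assumes the common form of the loss, namely one minus the class-averaged ratio of soft intersection to soft union, with the stability coefficient added to both numerator and denominator; the function name, the array layout (classes first), and the default coefficient are assumptions.

```python
import numpy as np

def soft_jaccard_loss(pred, truth, eps=1e-6):
    """Soft multi-class Jaccard loss (assumed form).

    pred:  predicted probabilities, shape (C, D, H, W)
    truth: one-hot ground truth,    shape (C, D, H, W)
    """
    n_classes = pred.shape[0]
    p = pred.reshape(n_classes, -1)
    r = truth.reshape(n_classes, -1)
    intersection = (p * r).sum(axis=1)
    union = (p + r - p * r).sum(axis=1)
    jaccard = (intersection + eps) / (union + eps)
    return 1.0 - jaccard.mean()
```

A perfect prediction gives a loss close to 0 and a fully disjoint prediction gives a loss close to 1, so the value reported on a test set can be read as an average per-class overlap deficit.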
Results: The localization model achieves a loss value of 0.28 on a test set. The background class loss is 0.0027, which means the model is a capable 2-way “tooth/not a tooth” segmentor. The localization intersection over union (IoU) between the tooth's ground truth volumetric bounding box and the model-predicted bounding box is also defined. In the case where a tooth is missing from the ground truth and the model predicted any positive voxels for it (i.e. the ground truth bounding box is not defined), the localization IoU is set to 0. In the case where a tooth is missing from the ground truth and the model did not predict any positive voxels for it, the localization IoU is set to 1. As a human-interpretable metric, tooth localization accuracy is used, defined as the percentage of teeth that have a localization IoU greater than 0.3. The relatively low threshold value of 0.3 was chosen from the manual observation that even low localization IoU values are enough to approximately localize teeth for the downstream processing. The localization model achieved a value of 0.963 on this localization accuracy metric on the test set, which, on average, equates to the incorrect localization of about 1 of 32 teeth.
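The localization IoU and the accuracy metric described above can be sketched as follows. The box representation (lower corner followed by an exclusive upper corner) and the function names are assumptions for illustration; the case in which a ground truth box exists but nothing is predicted is not stated in the text and is treated here as an IoU of 0.

```python
def box_volume(box):
    """box = (z0, y0, x0, z1, y1, x1), upper corner exclusive."""
    return (max(0, box[3] - box[0])
            * max(0, box[4] - box[1])
            * max(0, box[5] - box[2]))

def localization_iou(truth_box, pred_box):
    """IoU of two axis-aligned 3D boxes; None means 'tooth absent'."""
    if truth_box is None:
        # Missing tooth: 1 if the model also predicted nothing, else 0.
        return 1.0 if pred_box is None else 0.0
    if pred_box is None:
        # Assumption: a present tooth with no prediction scores 0.
        return 0.0
    lo = [max(truth_box[i], pred_box[i]) for i in range(3)]
    hi = [min(truth_box[i + 3], pred_box[i + 3]) for i in range(3)]
    inter = box_volume((*lo, *hi))
    union = box_volume(truth_box) + box_volume(pred_box) - inter
    return inter / union if union > 0 else 0.0

def localization_accuracy(pairs, threshold=0.3):
    """Fraction of (truth_box, pred_box) pairs with IoU above the threshold."""
    ious = [localization_iou(t, p) for t, p in pairs]
    return sum(iou > threshold for iou in ious) / len(ious)
```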
Referring to
In order to focus the downstream classification model on describing a specific tooth of interest, the tooth and its surroundings are extracted from the original study as a rectangular volumetric region centered on the tooth. The upstream segmentation mask is used to obtain the coordinates of the tooth. The predicted volumetric binary mask of each tooth is preprocessed by applying erosion and dilation, and then selecting the largest connected component. A minimum bounding rectangle is found around the predicted volumetric mask. Then, the bounding box is extended by 15 mm vertically and 8 mm horizontally (equally in all directions) to capture the tooth context and to correct for possibly weak localizer performance. Finally, the corresponding sub-volume is extracted from the original clipped image, rescaled to 64³ voxels, and passed on to the classifier. An example of a sub-volume bounding box is presented in
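The bounding box extraction and extension step can be sketched as follows. This is a minimal illustration that assumes the first array axis is the vertical one and that the voxel spacing in millimetres is known; the function name and argument layout are hypothetical, and the morphological cleanup and the final rescaling to 64³ are not shown.

```python
import numpy as np

def extended_bounding_box(mask, spacing_mm, margin_mm=(15.0, 8.0, 8.0)):
    """Return slices for the mask's bounding box, grown by a margin in mm.

    mask:       boolean array of shape (Z, Y, X), one predicted tooth
    spacing_mm: voxel size along each axis, in millimetres
    margin_mm:  margin per axis (assumed: 15 mm vertical, 8 mm horizontal)
    """
    coords = np.argwhere(mask)
    if coords.size == 0:
        return None  # nothing was predicted for this tooth
    lo = coords.min(axis=0)
    hi = coords.max(axis=0) + 1
    margin_vox = np.ceil(np.asarray(margin_mm) / np.asarray(spacing_mm)).astype(int)
    # Clamp the extended box to the volume so the crop stays valid.
    lo = np.maximum(lo - margin_vox, 0)
    hi = np.minimum(hi + margin_vox, mask.shape)
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))
```

The returned slices can be applied directly to the intensity volume, for example `volume[box]`, to obtain the sub-volume that is passed downstream.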
Referring to
Model: The classification model has a DenseNet architecture. The only difference between the original DenseNet and the implementation of DenseNet by the present invention is the replacement of the 2D convolution layers with 3D ones. 4 dense blocks of 6 layers are used, with a growth rate of 48 and a compression factor of 0.5. After passing the 64³ input through 4 dense blocks followed by down-sampling transitions, the resulting feature map is 548×2×2×2. This feature map is flattened and passed through a final linear layer that outputs 6 logits, one for each type of abnormality.
Loss function: Since tooth conditions are not mutually exclusive, binary cross entropy is used as the loss. To handle class imbalance, each condition loss is weighted inversely proportionally to its frequency (positive rate) in the training set. Suppose that F_i is the frequency of condition i, p_i is its predicted probability (sigmoid on the output of the network), and t_i is the ground truth. Then $L_i = \frac{1}{F_i}\, t_i \log p_i + F_i\,(1 - t_i)\log(1 - p_i)$ is the loss function for condition i. The final example loss is taken as the average of the 6 condition losses.
Results: The classification model achieved an average area under the receiver operating characteristic curve (ROC AUC) of 0.94 across the 6 conditions. Per-condition scores are presented in the above table. Receiver operating characteristic (ROC) curves 700 of the 6 predicted conditions are illustrated in
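The frequency-weighted binary cross entropy can be sketched as follows. The weighting follows the expression in the text; the leading minus sign (so that a smaller value means a better prediction) and the clamping of probabilities away from 0 and 1 are assumptions added for this illustration, as is the function name.

```python
import math

def weighted_bce(probs, targets, freqs, eps=1e-7):
    """Average frequency-weighted binary cross entropy over conditions.

    probs:   predicted probability per condition (after sigmoid)
    targets: ground truth per condition (0 or 1)
    freqs:   positive rate of each condition in the training set
    """
    losses = []
    for p, t, f in zip(probs, targets, freqs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        # Positives are up-weighted by 1/F_i, negatives down-weighted by F_i.
        # Assumption: negated so that the quantity is minimized.
        li = -((1.0 / f) * t * math.log(p) + f * (1 - t) * math.log(1.0 - p))
        losses.append(li)
    return sum(losses) / len(losses)
```

With a rare condition (small frequency), a missed positive is penalized far more heavily than a false alarm, which is the intended effect of the weighting.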
The automated segmentation pipeline may segment/localize volumetric images by distinct anatomical structure/identifiers based on a distribution approach, as opposed to the bounding box approach described in detail above. In accordance with an exemplary embodiment of this alternative automated segmentation pipeline, as illustrated by
The processor 803 is further configured to parse at least one received volumetric image data 803b into at least a single image frame field of view by the volumetric image processor and further configured to localize anatomical structures residing in the single image frame field of view by assigning each voxel a distinct anatomical structure by the voxel parsing engine 804. Optionally, in one embodiment, the single image frame field of view may be pre-processed for segmentation/localization, which involves rescaling using linear interpolation. The pre-processing involves use of any one of a normalization schemes to account for variations in image value intensity depending on at least one of an input or output of volumetric image. In one embodiment, localization/segmentation is achieved using a V-Net-based fully convolutional neural network. In one embodiment, the V-Net is a 3D generalization of UNet.
The processor 803 is further configured to select all voxels belonging to the localized anatomical structure. The processor 803 is configured to parse the received volumetric image data into at least a single image frame field of view by the said volumetric image processor 803a. The anatomical structures residing in the at least single field of view is localized by assigning each voxel a distinct anatomical structure (identifier) by the voxel parsing engine 803b. The distribution-based approach is an alternative to the minimum bounding box approach detailed in earlier figure descriptions above: selecting all voxels belonging to the localized anatomical structure by finding a minimal bounding rectangle around the voxels and the surrounding region for cropping as a defined anatomical structure by the localization layer. Whether segmented based on distribution or bounding box, the conditions for each defined anatomical structure within the cropped/segmented/mesh-converted image may then be optionally classified by a detection module or classification layer 806.
In a preferred embodiment, the processor is configured for receiving a volumetric image comprising a jaw/tooth structure in terms of voxels, and assigning each voxel a distinct anatomical identifier based on a probabilistic distribution for each anatomical structure. A computer segmentation model is applied to output a probability distribution or a discrete assignment of each voxel in the image to one or more classes (probabilistic or discrete segmentation).
In one embodiment, the voxel parsing engine 803b or a localization layer (not shown) may perform 33 class semantic segmentation in 3D for dental volumetric images. In one embodiment, the system is configured to classify each voxel as one of 32 teeth or background and the resulting segmentation assigns each voxel to one of 33 classes. In another embodiment, the system is configured to classify each voxel as either tooth or other anatomical structure of interest. In the case of localizing only teeth, the classification includes, but not limited to, 2 classes. Then individual instances of every class (teeth) could be split, e.g., by separately predicting a boundary between them. In some embodiments, the anatomical structure being localized, includes, but not limited to, teeth, upper and lower jaw bone, sinuses, lower jaw canal and joint.
For example, each tooth in a human may have a distinct number based on its anatomy, order (1-8), and quadrant (upper, lower, left, right). Additionally, any number of dental features (maxilla, mandible, mandibular canal, sinuses, airways, outer contour of soft tissue, etc.) constitute a distinct anatomical structure that can be unambiguously coded by a number.
In one embodiment, a model of a probability distribution over anatomical structures via semantic segmentation may be performed: using a standard fully-convolutional network, such as VNet or 3D UNet, to transform I×H×W×D tensor of input image with I color channels per voxel, to H×W×D×C tensor defining class probabilities per voxel, where C is the number of possible classes (anatomical structures). In the case where classes do not overlap, this could be converted to probabilities via applying a softmax activation along the C dimension. In case of a class overlap, a sigmoid activation function may be applied to each class in C independently.
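The two activation choices described above can be sketched as follows, assuming the network output is an array of raw scores with the class dimension last (H×W×D×C). The function names are hypothetical.

```python
import numpy as np

def softmax_over_classes(logits):
    """Mutually exclusive classes: probabilities sum to 1 for each voxel."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sigmoid_per_class(logits):
    """Overlapping classes: each class gets an independent probability."""
    return 1.0 / (1.0 + np.exp(-logits))
```

With softmax, a discrete label per voxel can be taken as the argmax along the last axis; with sigmoid, each class is thresholded separately, so a voxel may belong to several structures at once.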
Alternatively, an instance or panoptic segmentation may be applied to potentially identify several distinct instances of a single class. This works both for cases where there is a semantic ordering of classes (as in case 1, which can be alternatively modeled by semantic segmentation), and for cases where there is no natural semantic ordering of classes, such as in segmenting multiple caries lesions on a tooth.
Instance or panoptic segmentation could be achieved, for example, by using a fully-convolutional network to obtain several output tensors:
After these steps, we obtain an assignment of each voxel to object instance, and assignment of instances to classes. Again, while not shown, the automated segmentation pipeline system may further comprise a detection module. The detection module is configured to detect or classify the conditions for each defined anatomical structure within the cropped image by a detection module or classification layer. In one embodiment, the classification is achieved using a DenseNet 3-D convolutional neural network. In continuing reference to
The fine model runs in higher resolution than the coarse model, and typically cannot process the image as a whole. Hence, two techniques are proposed to split volumes in sub-images:
Now in reference to
In one embodiment, an input data is provided via the input event source. In one embodiment, the input data is a volumetric image data and/or surface scan image and the input event source is any one of an image gathering source. In one embodiment, the input data is 2D image data. In another embodiment, the volumetric and/or surface scan image data comprises 3-D voxel array. In another embodiment, the volumetric image received from the input source may be a three-dimensional voxel array of a maxillofacial anatomy of a patient and the surface scan image received may be a polygonal mesh corresponding to the maxillofacial anatomy of the same patient. The image processor 1203a is configured to receive the image data from the image gathering source. In one embodiment, the image data is pre-processed, which involves conversion of 3-D pixel array into an array of Hounsfield Unit (HU) radio intensity measurements.
The processor 1203 is further configured to localize/segment anatomical structures residing in the single image frame field of view by assigning each voxel/pixel/face/vertex/vertices a distinct anatomical structure by the segmentation or localization layer 1204. In one embodiment, the single image frame field of view is pre-processed for localization, which involves rescaling using linear interpolation (not shown). The pre-processing 1203b involves use of any one of a normalization schemes to account for variations in image value intensity depending on at least one of an input or output of volumetric image.
In one embodiment, the localization layer 1204 may perform 33 class semantic segmentation in 3D for dental volumetric images. In one embodiment, the system is configured to classify each voxel as one of 32 teeth or background and the resulting segmentation assigns each voxel to one of 33 classes. In another embodiment, the system is configured to classify each voxel as either tooth or other anatomical structure of interest. In the case of localizing only teeth, the classification includes, but not limited to, 2 classes. Then individual instances of every class (teeth) could be split, e.g., by separately predicting a boundary between them. In some embodiments, the anatomical structure being localized, includes, but not limited to, teeth, upper and lower jaw bone, sinuses, lower jaw canal and joint. Segmentation/localization entails, according to a certain embodiment, selecting for all voxels belonging to the localized anatomical structure by finding a minimal bounding rectangle around the voxels and the surrounding region.
In one embodiment, a model of a probability distribution over anatomical structures via semantic segmentation may be performed: using a standard fully-convolutional network, such as VNet or 3D UNet, to transform I×H×W×D tensor of input image with I color channels per voxel, to H×W×D×C tensor defining class probabilities per voxel, where C is the number of possible classes (anatomical structures). In the case where classes do not overlap, this could be converted to probabilities via applying a softmax activation along the C dimension. In case of a class overlap, a sigmoid activation function may be applied to each class in C independently.
Alternatively, an instance or panoptic segmentation may be applied to potentially identify several distinct instances of a single class. This works both for cases where there is a semantic ordering of classes (as in case 1, which can be alternatively modeled by semantic segmentation), and for cases where there is no natural semantic ordering of classes, such as in segmenting multiple caries lesions on a tooth.
In continuing reference to
Once segmented, a polygonal mesh from the volumetric image featuring common structures with the polygonal mesh from the surface scan image is extracted/generated by the mesh layer 1205. The meshes from both the volumetric image and from the surface scan image are then converted to point clouds; and the converted meshes are then aligned via point clouds using a point set registration by the alignment module 1206. In one embodiment, the surface scan image mesh is extracted or generated from the surface scan image, while in other embodiments, the surface scan mesh is received de novo or directly from the input source for downstream processing. In yet other embodiments, as shown in
Now in reference to
In a preferred embodiment, the mesh extraction is performed by a Marching Cubes algorithm. Alternatively, the polygonal mesh is extracted as an isosurface from a three-dimensional discrete scalar field. Other, less conventional extraction techniques may be used as well. Preferred alignment methods, such as Iterative Closest Point or Deformable Mesh Alignment, may be performed. Essentially any means for aligning two partially overlapping meshes given an initial guess for the relative transform may be used, so long as one mesh is derived from a CBCT (volumetric image) and the other from an IOS (surface scan image). The aligned CBCT and IOS are then used for orthodontic treatment and implant planning. CBCT provides knowledge about internal structures (bone, nerves, sinuses, and tooth roots), while IOS provides very precise visible structures (gingiva and tooth crowns). Both scans are needed for high-quality digital dentistry.
The implementation essentially consists of the following steps:
As shown in
Exemplary Dental Anatomical Landmarks:
Following the localization of landmarks common to both the volumetric and surface scan images, the images are aligned by minimizing the distance between the corresponding landmarks present in both images 1605. Alignment may be performed alternatively between: a polygonal mesh of a volumetric image and a polygonal mesh of a surface scan image; a point set of a volumetric image and a point set of a surface scan image; a mesh of a volumetric image and a point set of a surface scan image; or a point set of a volumetric image and a mesh of a surface scan image.
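One way to minimize the distance between corresponding landmarks is a least-squares rigid fit. The following is a minimal sketch using the Kabsch method (an SVD of the cross-covariance of the centered landmarks); the choice of this particular method and the function name are assumptions, since the text does not specify how the minimization is carried out.

```python
import numpy as np

def rigid_align(source, target):
    """Least-squares rotation and translation mapping source onto target.

    source, target: (N, 3) arrays of corresponding landmarks, N >= 3,
    not all collinear. Returns (rotation, translation) such that
    target is approximately source @ rotation.T + translation.
    """
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    h = (source - src_c).T @ (target - tgt_c)  # 3x3 cross-covariance
    u, _, vt = np.linalg.svd(h)
    # Guard against a reflection being returned instead of a rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = tgt_c - rotation @ src_c
    return rotation, translation
```

The resulting transform can serve as the initial guess for a finer mesh or point-set registration between the volumetric and surface scan data.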
Alternatively, volumetric images and surface scan images may be combined into a single image via a fusion of tooth meshes.
Once both are segmented and numerated, the volumetric tooth mesh and the surface scan tooth mesh are matched by their numbers. For each numbered tooth, the faces of the volumetric tooth mesh also present in the surface scan tooth crown mesh are identified. In one embodiment, this is accomplished by, for each face of the surface scan mesh, identifying the nearest face of the volumetric tooth mesh. Next, each face in the volumetric tooth mesh found to match a face in the surface scan tooth crown mesh is removed from the volumetric tooth mesh 1708. Border vertices on the volumetric and surface scan meshes are identified by finding edges adjacent to a single triangle. The two meshes can then be fused by triangulating the border vertices 1710.
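The identification of border vertices by finding edges adjacent to a single triangle can be sketched as follows, assuming a mesh given as a list of triangles of vertex indices. The function name is hypothetical, and the subsequent triangulation of the two borders is not shown.

```python
from collections import Counter

def border_vertices(triangles):
    """Return the set of vertex indices lying on the open border of a mesh.

    An edge is a border edge when it belongs to exactly one triangle.
    """
    edge_counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each edge with sorted endpoints so direction is ignored.
            edge_counts[(min(u, v), max(u, v))] += 1
    border = set()
    for (u, v), count in edge_counts.items():
        if count == 1:
            border.update((u, v))
    return border
```

For a closed surface every edge is shared by two triangles and the result is empty; after faces are removed from the volumetric tooth mesh, the rim of the resulting hole is what this function returns.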
Now in reference to
One aspect of the invention is an automated pipeline for segmenting volumetric and/or surface scan images to predict the placement and/or design of dental implants and/or crowns. In this aspect, the radiological image is segmented into various anatomies, including missing teeth. Identification and measurement of the anatomies enables a trained neural network, to predict the location, orientation, size, and type of any missing tooth, referred to as a phantom crown. The measured and predicted anatomies then enable the AI to suggest features of a suggested implant and/or crown, such as location, orientation, dimensions, geometry and/or specific model, and/or match the phantom crown or specified features to a preexisting library of implants and/or crowns.
While not shown, the system may further comprise an input event source; a memory unit in communication with the input event source; a processor in communication with the memory unit; an image processor in communication with the processor; a localizing layer or segmenting layer in communication with the prediction module, and optionally, a matching module. In an embodiment, the memory unit is a non-transitory storage element storing encoded information. The encoded instructions when implemented by the processor, configure the automated system to segment and predict crown and dental implant features for more accurate and efficient planning.
The processor is further configured to localize/segment anatomical structures residing in the single image frame field of view by assigning each voxel/pixel/face/vertex/vertices a distinct anatomical structure by the segmentation or localization layer 1804. In one embodiment, the single image frame field of view may be pre-processed by the image processor 1803 for localization, which involves rescaling using linear interpolation (not shown). The pre-processing involves use of any one of a normalization schemes to account for variations in image value intensity depending on at least one of an input or output of volumetric image.
In continuing reference to
The prediction module 1805 (neural network) is trained to predict the phantom crown by: inputting one of a manually or a machine-produced segmentation of a radiological image, wherein the segmentation comprises of tooth segmentation, or tooth and anatomy segmentation; removing a random subset of segmented teeth from the input segmentation and replacing the teeth with background; and instructing the prediction module/neural network to predict one or more sites of missing teeth, and for the missing tooth sites, predict a tooth segmentation, wherein the training target is the removed segmented tooth.
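The construction of a training example by removing a random subset of segmented teeth can be sketched as follows. This is a minimal illustration that assumes an integer label volume in which 0 is background and every other label is a tooth; the function name and the `max_removed` parameter are hypothetical.

```python
import numpy as np

def make_training_pair(segmentation, rng, max_removed=3, background=0):
    """Return (model_input, target) for phantom crown training.

    model_input: segmentation with a random subset of teeth set to background
    target:      only the removed teeth, which the network should predict
    """
    tooth_ids = [int(i) for i in np.unique(segmentation) if i != background]
    if not tooth_ids:
        return segmentation.copy(), np.zeros_like(segmentation)
    n_removed = int(rng.integers(1, min(max_removed, len(tooth_ids)) + 1))
    removed = rng.choice(tooth_ids, size=n_removed, replace=False)
    removed_mask = np.isin(segmentation, removed)
    model_input = np.where(removed_mask, background, segmentation)
    target = np.where(removed_mask, segmentation, background)
    return model_input, target
```

Because the removed teeth are known exactly, no manual annotation of missing-tooth sites is needed to obtain the training target.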
While not shown, the image processor/alignment module may align the meshes/points extracted/generated from each of the surface scan image and volumetric image for downstream processing, such as predicting the crown and/or implant features. In one embodiment, a polygonal mesh from the volumetric image featuring common structures with the polygonal mesh from the surface scan image is extracted/generated by the mesh layer. The meshes from both the volumetric image and from the surface scan image are then converted to point clouds; and the converted meshes are then aligned via point clouds using a point set registration by the alignment module. In one embodiment, the surface scan image mesh is extracted or generated from the surface scan image, while in other embodiments, the surface scan mesh is received de novo or directly from the input source for downstream processing, such as predicting crown and/or implant features for dental planning.
While not shown, in addition to volumetric cone beam computed tomography (CBCT), the radiographic image input may be a volumetric computed tomography (CT) or magnetic resonance imaging (MRI) image or a surface scan such as an intraoral (IOS) or facial scan image. In one embodiment, a volumetric image and a surface scan image may be merged into one image via conversion of the volumetric image to a polygonal mesh, and merging via alignment of points or meshes, or by fusing a surface scan tooth mesh to a volumetric scan via triangulation of the border vertices. Additionally, while not shown, a volumetric image may be normalized by eliminating the values lying outside a standard range to derive zero mean and unit standard deviation, and a surface scan image may be normalized by centering and rescaling mesh vertices to fit a unit sphere.
Furthermore, while not shown, segmentation of a volumetric image may be accomplished in one embodiment by finding a minimal bounding rectangle around the voxels and the surrounding region for cropping as a defined anatomical structure by the localization layer. The bounding rectangle extends equally in all directions to capture the tooth and surrounding context. In an alternative embodiment, a model of a probability distribution over anatomical structures via semantic segmentation may be performed using a standard fully-convolutional network, such as VNet or 3D UNet. Segmentation of a surface scan image may be accomplished by assigning each vertex and/or face of a mesh a distinct anatomical structure identifier. Also not shown, individual voxels segmented as teeth may be further segmented as dental crown if the distance between the voxel and the tooth's highest point is within a predefined threshold. In one embodiment, the predefined threshold of distance between the voxel and the tooth's highest point is not greater than 6 mm for the lower (upper) jaw tooth.
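The two normalizations mentioned above can be sketched as follows. This is a minimal illustration in which out-of-range values are clipped to the range limits (one possible reading of eliminating them), and the range limits themselves are placeholder values; the function names are hypothetical.

```python
import numpy as np

def normalize_volume(volume, lo=-1000.0, hi=2000.0):
    """Clip intensities to [lo, hi], then shift to zero mean and unit std.

    The default range is an assumed placeholder, not a value from the text.
    """
    clipped = np.clip(volume.astype(np.float64), lo, hi)
    std = clipped.std()
    return (clipped - clipped.mean()) / (std if std > 0 else 1.0)

def normalize_vertices(vertices):
    """Center mesh vertices and rescale them to fit a sphere of radius 1."""
    centered = vertices - vertices.mean(axis=0)
    radius = np.linalg.norm(centered, axis=1).max()
    return centered / (radius if radius > 0 else 1.0)
```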
Now in reference to
Though not shown, the method for predicting at least one of a tooth crown or implant feature may comprise the steps of: receiving at least one of a volumetric or surface scan image, wherein the volumetric image is a three-dimensional voxel array of a maxillofacial anatomy of a patient and the surface scan image is a polygonal mesh of a maxillofacial anatomy of the patient; segmenting at least one of the volumetric image or surface scan image into a set of distinct anatomical structures by assigning each voxel an identifier by structure and assigning at least one of a vertices, face, or points on the mesh an identifier by structure for the volumetric image and surface scan image, respectively, wherein the distinct anatomical structures include at least one of a tooth, jaw, mandibular canal, maxillary sinus, fossae, and a missing tooth; and predicting at least one of a tooth crown or implant feature in place of the segmented missing tooth, wherein the predicted crown and/or implant feature is at least one of generated or selected from a library.
While also not shown, the method for predicting at least one of a tooth crown and implant feature may comprise the steps of: predicting at least one of a tooth crown shape and position in place of a segmented missing tooth; imposing at least one of a cylindrical or conical shape along a planned location/orientation, dimension, or geometry with a pre-defined distance to surrounding structures to avoid contact with the implant for predicting at least one of an allowed placement zone for implant shape and positioning; and generating a report comprising data or derived data related to at least the predicted crown and/or implant shape and position for crown/implant planning. In addition to pointset-matching against a library or inventory, the predicted crown or dental implant features may be predicted and generated for future production or for probing against a library/inventory (manufacturer inventory or practitioner supply, etc.).
While not illustrated in
The SDG pipeline may be designed to seamlessly import DICOM format images (natively) from or to the image gathering or input event source 2301 and support 16-bit imaging, achieving highly accurate, pixel-perfect annotations for downstream processing—including for, inter alia, surgical guide design and (optionally) fabrication. In other embodiments, an input mesh, or an input mesh with calculated sequence of points are received to or from the input event source 2301, allowing for the geodesic module 2303a to find geodesic line segment on the mesh between the points.
In one embodiment, the geodesic module 2303a determines the geodesic line segments between points, forming a geodesic path, by first finding a raw path that follows mesh edges using Dijkstra's algorithm. This is followed by an intrinsic edge flip, which starts by calculating a tangent vector for each edge. Once the tangent vectors are determined, a flip operation is performed to shorten a rough path segment. After the flip, a new edge is obtained, for which the tangent vector must also be calculated. If the method described above is used to construct a geodesic line between a sequence of points, a polygonal chain of geodesic lines is obtained, which is not smooth at the transition points from one segment to another. The geodesic line segments are smoothed out by adding new points at a distance d along the tangent, and opposite to the tangent, to the geodesic line segments at the first and last points of each segment; for each segment, a new line is generated passing through the new points.
The chain is smoothed by shortening all curves between control points to geodesics; inserting a new control point vertex at the midpoint between each pair of old control points; and unmarking all old control points except the first and last. If more than 2 points are left, the process returns to the first step in the smoothing process (the ‘shortening step’), and the working set is reduced to exclude the first and last control points. However, the geodesic line now does not pass through the original points. In order to fix this, new control points are inputted into a midpoint subdivision method 2502 (
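The first stage, finding a raw path that follows mesh edges, can be sketched with Dijkstra's algorithm on the graph formed by the mesh edges, weighted by Euclidean edge length. The function names and the mesh representation (a list of vertex coordinates and a list of index triples) are assumptions; the subsequent edge flips and smoothing are not shown.

```python
import heapq
import math

def mesh_edge_graph(vertices, triangles):
    """Adjacency map {vertex: {neighbor: edge_length}} built from triangles."""
    graph = {i: {} for i in range(len(vertices))}
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            w = math.dist(vertices[u], vertices[v])
            graph[u][v] = w
            graph[v][u] = w
    return graph

def dijkstra_path(graph, start, goal):
    """Shortest path along mesh edges as a list of vertex indices, or None."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == goal:
            break
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]
```

The path returned here is only an upper bound on the geodesic length, since it is restricted to existing edges; that restriction is what the later shortening stage is meant to remove.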
In continuing reference to
The process for generating a height map and rendering the 3-D mesh, as shown in
In summary, the SDG pipeline starts by receiving an input mesh with a sequence of points, which is then processed by the geodesic module. This module determines the geodesic line segments between the points by first finding a raw path using Dijkstra's algorithm and then smoothing the path by adding new points and lines. The process involves shortening the curves, inserting new control points, and repeating until all curves are smoothed. The slicing module then slices out a portion of the mesh defined by the geodesic line segments. This process involves dividing the triangles that the line crosses into smaller ones and assigning IDs to each triangle on either side of the line. The insertion module then determines the best insertion direction to minimize the undercut area and generates a height map in that direction. This height map is used to render a 3D mesh for the final design of the surgical guide. The pipeline may also, optionally, be designed to fabricate the guide on-site or off-site. There are several other methods for designing surgical guides, including computer-aided design (CAD) and computer-aided manufacturing (CAM) techniques, as well as image-based methods that use medical imaging data such as CT or MRI scans. Another technique for finding the insertion direction is the use of 3D scans of the patient's anatomy, which can generate a height map to determine the best insertion direction. Additionally, computational simulations can be used to predict the behavior of the patient's anatomy during the procedure and help determine the best insertion direction. The SDG pipeline, by contrast, is a computer-based process that combines receiving an input mesh, determining geodesic line segments, slicing out a portion of the mesh, determining the best insertion direction, and generating a height map for rendering a 3D mesh. The resulting 3D mesh can then be visualized, manipulated, and analyzed for various purposes.
The use of 3D mesh extraction in dentistry allows for more accurate and precise treatment planning and the production of high-quality, patient-specific surgical and orthodontic devices, including the design and, optionally, the fabrication of surgical guides.
Now in reference to
In a preferred embodiment, the system then proceeds to segment the received image into distinct anatomical structures, which include but are not limited to the teeth (which are individually segmented and numbered), maxilla, mandible, nerve canal, incisive canal, sinus, airways, cranial, and soft tissue. These segmented structures appear as distinct 3D objects on the 3D model and are outlined by contours on a multiplanar reformation or panoramic reconstruction. Following segmentation, the system predicts the shape and position of a tooth crown or implant in place of a segmented missing or problematic tooth. In an embodiment, the prediction of the crown, implant, and sleeve shape and position is used as an input for generating a clinically appropriate plan for the placement of the implant/crown, which is based on the available anatomic conditions as determined from the 3-D model. This clinically appropriate plan, subject to approval or modification by the practitioner, then aids in deriving a clinically appropriate shape of the surgical guide, a 3D model of the surgical guide saved as an STL file, and a report saved as a PDF file.
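For illustration only, the saving of a surgical guide model as an STL file may be sketched as a serialization of a triangle mesh into the ASCII variant of the STL format. The function names and the default solid name "guide" are hypothetical; the disclosure does not state whether the ASCII or the binary variant is used.

```python
import math


def facet_normal(p, q, r):
    """Unit normal of triangle (p, q, r), or a zero vector if degenerate."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(cx * cx + cy * cy + cz * cz)
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return (cx / length, cy / length, cz / length)


def to_ascii_stl(vertices, triangles, name="guide"):
    """Serialize an indexed triangle mesh as ASCII STL text."""
    lines = [f"solid {name}"]
    for a, b, c in triangles:
        p, q, r = vertices[a], vertices[b], vertices[c]
        nx, ny, nz = facet_normal(p, q, r)
        lines.append(f"  facet normal {nx:e} {ny:e} {nz:e}")
        lines.append("    outer loop")
        for x, y, z in (p, q, r):
            lines.append(f"      vertex {x:e} {y:e} {z:e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file with an .stl extension; a binary writer would be more compact for large guide meshes.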
In an embodiment, the system provides an implant planning interface for a practitioner, allowing the practitioner to modify the received image, the segmented image, and/or the generated plan with respect to the crown/implant shape, position, or site. The interface includes a toolbar that allows the practitioner to activate or deactivate instruments that support the implant planning. It also includes an objects panel that represents all segmented anatomic and artificial objects in a tree structure, and a planning panel that enables selection of the implant system, implant site, and other critical steps of the planning process.
In continuing reference to
In one embodiment, the implant planning interface system is capable of employing advanced AI algorithms/routines to segment the processed image, creating a foundation for the implant/crown plan. The system natively imports DICOM format images from the image gathering or input event source 2601. It supports 16-bit imaging, which allows for precise annotations crucial for surgical guide design and optional fabrication. The system receives a CBCT scan image, an IOS scan image, or both, which may be used separately or superimposed, furnishing it with comprehensive and accurate anatomical data for creating the implant plan.
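As a minimal sketch of what handling 16-bit image data may involve, the following decodes little-endian 16-bit stored pixel values and applies the linear rescale defined by the DICOM Rescale Slope and Rescale Intercept attributes. The function names are hypothetical, and reading the attributes from an actual DICOM file (for example with a DICOM parsing library) is assumed to happen elsewhere.

```python
import struct


def decode_16bit_pixels(raw_bytes, signed=False):
    """Decode little-endian 16-bit stored values from raw pixel bytes."""
    count = len(raw_bytes) // 2
    code = "h" if signed else "H"
    return list(struct.unpack(f"<{count}{code}", raw_bytes[: count * 2]))


def rescale(stored_values, rescale_slope, rescale_intercept):
    """Apply output = stored * slope + intercept to every stored value."""
    return [v * rescale_slope + rescale_intercept for v in stored_values]
```

For example, with a slope of 1.0 and an intercept of -1024.0 (values chosen here purely for illustration), a stored value of 1024 maps to 0.0.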
In one embodiment, the segmentation module 2603b of the system processes the received images, segmenting them into distinct anatomical structures that help predict the crown/implant shape and position. This function is key for determining the appropriate plan for the implant/crown placement, which takes into account each patient's unique anatomical structures. Advanced AI algorithms facilitate this segmentation, ensuring accuracy and precision. Alternatively, other segmentation techniques, such as probabilistic approaches to localizing and identifying relevant anatomical structures, may be employed.
The system includes an interface module 2603d that facilitates system interaction. This interface permits practitioners and patients to modify at least one of the received image, segmented image, predicted image, or even the generated plan. Modifications can occur on any one of the 3-D models, surgical guides, or reports, thereby providing practitioners with a highly interactive and personalized planning tool. The implant planning interface system as illustrated in
Now in reference to
Following this, the Segmentation Module 2702 takes over. This module is responsible for utilizing state-of-the-art artificial intelligence algorithms to split the processed images into separate anatomical structures. The segmented structures act as the foundational elements for the construction of the implant/crown plan. Each segmented tooth is assigned a specific number to ensure systematic and organized representation of the patient's dental anatomy.
The Prediction Module 2703 is the subsequent component in the system's process flow. This module has a pivotal role as it utilizes proprietary algorithms to generate clinically feasible predictions about the shape, position, and size of the implant and crown or the implant site. It accomplishes this by considering available anatomical data from the 3D patient model obtained from the Segmentation Module 2702.
The Prediction Module 2703 may be implemented in several ways. In one implementation, it might use machine learning models trained on a large dataset of prior successful implant cases. The module could use factors such as bone density, the patient's age, health condition, and oral hygiene habits to make predictions. Alternatively, it could apply mathematical models that calculate the optimal implant position, size, and shape based on the segmented anatomical data. These predictions may be reviewed and adjusted in the subsequent Implant Planning Interface 2704, ensuring that they align with clinical needs and patient-specific conditions.
The process culminates at the Implant Planning Interface 2704. This module enables the practitioner to interact with the system, allowing for the assessment or modification of the received image, the segmented image, or the generated plan concerning the crown/implant shape, position, size, or implant site. The Interface 2704 is user-friendly, offering multiplanar reformation views and various manipulation tools for panning, zooming, scrolling, and rotating the view. Additionally, the Interface 2704 includes a capability to generate a panoramic reconstruction of the CBCT scan.
In another embodiment, the Implant Planning Interface 2704 presents a 3D visualization of the patient's head, comprising segmented anatomical and artificial objects. An extensive toolbar is present within the Interface 2704, offering an array of implant planning tools, including view modes, brightness-contrast instruments, ruler instruments, and a heatmap of teeth contacts instrument. Notably, the Interface 2704 provides undo/redo functions for flexibility during the planning process.
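The undo/redo functions mentioned above may, for example, be realized with two stacks of plan states. The following is a minimal sketch under that assumption; the class name PlanHistory and the use of whole-state snapshots (rather than command objects) are hypothetical choices, not details of the disclosed interface.

```python
class PlanHistory:
    """Two-stack undo/redo over immutable plan snapshots."""

    def __init__(self, initial_state):
        self.state = initial_state
        self._undo = []
        self._redo = []

    def apply(self, new_state):
        """Record an edit; a new edit discards any redo history."""
        self._undo.append(self.state)
        self.state = new_state
        self._redo.clear()

    def undo(self):
        """Step back one edit, if any, and return the current state."""
        if self._undo:
            self._redo.append(self.state)
            self.state = self._undo.pop()
        return self.state

    def redo(self):
        """Reapply the most recently undone edit, if any."""
        if self._redo:
            self._undo.append(self.state)
            self.state = self._redo.pop()
        return self.state
```

Snapshots keep the sketch short; for large plans, storing only the changed properties per edit would use less memory.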
Further, the Implant Planning Interface 2704 includes an objects panel that displays all segmented anatomic and artificial objects in a tree structure. This allows practitioners to control the visibility of these objects and adjust their color and transparency level.
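One possible data structure for such an objects panel is a tree of nodes, each carrying visibility, color, and transparency settings. The sketch below is illustrative only; the class name SceneObject, its fields, and the object names used in the usage note are assumptions.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class SceneObject:
    """A node in the objects tree (anatomic or artificial object)."""

    name: str
    visible: bool = True
    color: Tuple[int, int, int] = (255, 255, 255)  # RGB, 0-255
    opacity: float = 1.0  # 1.0 is opaque, 0.0 is fully transparent
    children: List["SceneObject"] = field(default_factory=list)

    def set_visible(self, visible: bool) -> None:
        """Toggle visibility of this node and all of its descendants."""
        self.visible = visible
        for child in self.children:
            child.set_visible(visible)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "SceneObject"]]:
        """Yield (depth, node) pairs in display order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)

    def find(self, name: str) -> Optional["SceneObject"]:
        """Return the first node with the given name, or None."""
        for _, node in self.walk():
            if node.name == name:
                return node
        return None
```

A panel built this way could, for instance, hide a hypothetical "Teeth" group and all of its numbered teeth with a single call to set_visible(False) while leaving the mandible visible.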
Lastly, in a preferred embodiment, the Implant Planning Interface 2704 offers a complete planning panel for implant treatment planning. The panel outlines steps for selecting planning options, adjusting AI-generated implant properties, generating the surgical guide, and approving planning results. In addition, the designer or manufacturer, style, and other fabrication criteria may be selected from a drop-down menu in the planning panel. As shown in
Advantageously, the present invention provides an end-to-end pipeline for detecting the state or condition of the teeth in dental 3D CBCT scans. The condition of the teeth is detected by localizing each present tooth inside an image volume and predicting the condition of the tooth from the volumetric image of the tooth and its surroundings. Further, the performance of the localization model makes it possible to build a high-quality 2D panoramic reconstruction, which provides a familiar and convenient way for a dentist to inspect a 3D CBCT image. The performance of the pipeline is improved by adding volumetric data augmentations during training; reformulating the localization task as instance segmentation instead of semantic segmentation; reformulating the localization task as object detection; and using different class imbalance handling approaches for the classification model. Alternatively, the jaw region of interest is localized and extracted as a first step in the pipeline. The jaw region typically takes around 30% of the image volume and has adequate visual distinction. Extracting it with a shallow/small model would allow for larger downstream models. Further, the diagnostic coverage of the present invention extends from basic tooth conditions to other diagnostically relevant conditions and pathologies. Furthermore, the segmentation pipeline may extend further to align and/or fuse IOS and CBCT scans for more global and granular resolution, in support of optimal treatment planning and dental outcomes. In addition, as described in detail above, the pipeline may further predict crown and implant features for dental and implant planning based on a “phantom” tooth feature prediction.
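The jaw region-of-interest extraction described above may be sketched, under simplifying assumptions, as a bounding-box crop driven by a coarse binary jaw mask. The mask is assumed to come from the shallow/small localization model, which is not shown; the function names, the nested-list volume layout indexed as [z][y][x], and the margin parameter are hypothetical.

```python
def bounding_box(mask, margin=0):
    """Half-open (lo, hi) bounds per axis around the nonzero voxels of `mask`.

    Returns None when the mask is empty. Bounds are clamped to the volume.
    """
    coords = [
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        return None
    shape = (len(mask), len(mask[0]), len(mask[0][0]))
    box = []
    for axis in range(3):
        values = [c[axis] for c in coords]
        lo = max(min(values) - margin, 0)
        hi = min(max(values) + margin + 1, shape[axis])
        box.append((lo, hi))
    return tuple(box)


def crop(volume, box):
    """Extract the sub-volume delimited by `box` from a [z][y][x] volume."""
    (z0, z1), (y0, y1), (x0, x1) = box
    return [[row[x0:x1] for row in plane[y0:y1]] for plane in volume[z0:z1]]
```

Cropping before the downstream models run reduces the number of voxels they must process, which is the reason the passage gives for allowing larger downstream models.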
The figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. It should also be noted that, in some alternative implementations, the functions noted/illustrated may occur out of the order noted. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
Since various possible embodiments might be made of the above invention, and since various changes might be made in the embodiments above set forth, it is to be understood that all matter herein described or shown in the accompanying drawings is to be interpreted as illustrative and not to be considered in a limiting sense. Thus, it will be understood by those skilled in the art of creating independent multi-layered virtual workspace applications designed for use with independent multiple input systems that although the preferred and alternate embodiments have been shown and described in accordance with the Patent Statutes, the invention is not limited thereto or thereby.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Some portions of embodiments disclosed are implemented as a program product for use with an embedded processor. The program(s) of the program product defines functions of the embodiments (including the methods described herein) and can be contained on a variety of signal-bearing media. Illustrative signal-bearing media include, but are not limited to: (i) information permanently stored on non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM disks readable by a CD-ROM drive); (ii) alterable information stored on writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive, solid-state disk drive, etc.); and (iii) information conveyed to a computer by a communications medium, such as through a computer or telephone network, including wireless communications. The latter embodiment specifically includes information downloaded from the Internet and other networks. Such signal-bearing media, when carrying computer-readable instructions that direct the functions of the present invention, represent embodiments of the present invention.
In general, the routines executed to implement the embodiments of the invention may be part of an operating system or a specific application, component, program, module, object, or sequence of instructions. The computer program of the present invention typically is comprised of a multitude of instructions that will be translated by the native computer into a machine-accessible format and hence executable instructions. Also, programs are comprised of variables and data structures that either reside locally to the program or are found in memory or on storage devices. In addition, various programs described may be identified based upon the application for which they are implemented in a specific embodiment of the invention. However, it should be appreciated that any particular program nomenclature that follows is used merely for convenience, and thus the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature.
The present invention and some of its advantages have been described in detail for some embodiments. It should be understood that although the system and process are described with reference to automated segmentation pipeline systems and methods, the system and process may be used in other contexts as well. It should also be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims. An embodiment of the invention may achieve multiple objectives, but not every embodiment falling within the scope of the attached claims will achieve every objective. Moreover, the scope of the present application is not intended to be limited to the particular embodiments of the process, machine, manufacture, and composition of matter, means, methods and steps described in the specification. A person having ordinary skill in the art will readily appreciate from the disclosure of the present invention that processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed are equivalent to, and fall within the scope of, what is claimed. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.
Relation | Number | Date | Country
---|---|---|---
Parent | 18114508 | Feb 2023 | US
Child | 18210593 | | US
Parent | 17868098 | Jul 2022 | US
Child | 18114508 | | US
Parent | 17854894 | Jun 2022 | US
Child | 17868098 | | US
Parent | 17564565 | Dec 2021 | US
Child | 17854894 | | US
Parent | 17215315 | Mar 2021 | US
Child | 17564565 | | US
Parent | 16783615 | Feb 2020 | US
Child | 17215315 | | US
Parent | 16175067 | Oct 2018 | US
Child | 16783615 | | US