Orthodontic clear tray aligners allow patients to receive high-quality, customizable treatment options. A potential patient who is considering treatment may browse past clinical cases. A high-level overview of one pipeline is as follows: a potential patient arrives at the doctor's office; the doctor takes a scan of their teeth, extracting a three-dimensional (3D) mesh; and this 3D mesh is processed by an algorithm to produce a mesh of the patient's teeth in their final alignment.
A common question from patients is, “What would my new smile look like?” While they are able to view previous clinical cases and even the 3D mesh of their newly aligned teeth, neither option gives the patient a true sense of what their teeth and smile may look like after orthodontic treatment. Because of this, potential patients may not fully commit to receiving treatment.
A method for displaying teeth after planned orthodontic treatment includes receiving a digital 3D model of teeth or rendered images of teeth, and an image of a person. The method uses a generator network to produce a generated image of the person showing teeth of the person after the planned orthodontic treatment. The method uses a discriminator network processing input images, generated images, and real images to train the generator network.
Embodiments include an automated system to generate an image of a person's smile showing post-treatment aligner results before treatment has begun. The system utilizes data including a person's image as well as their corresponding 3D scan in order to learn how to generate a photo-realistic image. Though the system is trained to generate a person's smile from their pre-treatment scan, the scan can be swapped out with a post-treatment scan in order to give the person the ability to view potential post-treatment results. Alternatively or in addition, the system can be used to show persons their appearance after each stage, or selected stages, of treatment.
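The scan swap described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the person's photo and a 2D render of the scan are same-sized grids of RGB tuples and that the two are combined by channel-wise concatenation; the names `build_condition` and `preview` are hypothetical, and `generator` stands for any callable that maps the combined input to an image.

```python
def build_condition(photo, teeth_render):
    """Concatenate the per-pixel channels of a photo and a same-sized teeth render.

    Channel-wise concatenation is an assumption of this sketch, not a detail
    stated in the description.
    """
    same_height = len(photo) == len(teeth_render)
    if not same_height or any(len(p) != len(r) for p, r in zip(photo, teeth_render)):
        raise ValueError("photo and render must have the same height and width")
    return [
        [tuple(p) + tuple(r) for p, r in zip(photo_row, render_row)]
        for photo_row, render_row in zip(photo, teeth_render)
    ]


def preview(generator, photo, teeth_render):
    """Run a (hypothetical) generator callable on the combined input."""
    return generator(build_condition(photo, teeth_render))


# One-pixel toy inputs; a real system would use full-size images.
photo = [[(10, 20, 30)]]
pre_treatment_render = [[(1, 1, 1)]]
post_treatment_render = [[(9, 9, 9)]]


def placeholder_generator(condition):
    # Stand-in that echoes its input; it is not a trained network.
    return condition


# Training would pair the photo with the pre-treatment render; at preview
# time only the render argument changes.
trained_on = preview(placeholder_generator, photo, pre_treatment_render)
shown_to_person = preview(placeholder_generator, photo, post_treatment_render)
```

In this sketch the photo channels are identical in both calls, so any difference in the output can only come from the substituted render.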
The ability for a person to view a post-treatment photo of themselves smiling, before any treatment has begun, may give them confidence moving forward with the treatment process, as well as help convince those who may be uncertain. Additionally, the person would be able to provide feedback to the doctor or practitioner if any aesthetic changes are requested, and the doctor or practitioner can modify the alignment of the mesh to meet the person's needs.
The system is built upon generative machine learning or deep learning models known as Generative Adversarial Networks (GANs). This class of algorithms contains a pair of differentiable functions, often deep neural networks, whose goal is to learn an unknown data distribution. The first function, known as the generator, produces a data sample given some input (e.g., random noise, conditional class label, or others). The second function, known as the discriminator, attempts to distinguish the “fake” data generated by the generator from the “real” data coming from the true data distribution. As the generator continuously tries to fool the discriminator into classifying data as “real,” the generated data becomes more realistic. The generator and feature extracting network also receive the pixel-wise difference between the generated and ground truth images in the form of a loss function.
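The adversarial objective and the pixel-wise term can be sketched as scalar loss functions. This is a minimal illustration under stated assumptions, not the loss used by the system: binary cross-entropy is assumed for the adversarial part, mean absolute error is assumed for the pixel-wise difference, and the weight `pixel_weight=100.0` is an arbitrary illustrative value. All function names are hypothetical.

```python
import math


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """The discriminator is rewarded for scoring real data near 1 and generated data near 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def l1_pixel_loss(generated, truth):
    """Mean absolute pixel-wise difference between two equal-length flat pixel lists."""
    return sum(abs(g - t) for g, t in zip(generated, truth)) / len(truth)


def generator_loss(d_fake, generated, truth, pixel_weight=100.0):
    """The generator is rewarded when the discriminator scores its output near 1,
    and penalized by the weighted pixel-wise difference from the ground truth."""
    return bce(d_fake, 1.0) + pixel_weight * l1_pixel_loss(generated, truth)
```

With identical generated and ground truth pixels, the generator loss reduces to the adversarial term alone, so it falls as the discriminator's score for the generated image rises.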
The system uses a conditional GAN (cGAN), where the generator is conditioned on either a two-dimensional (2D) rendered image of the person's scanned teeth, or the 3D mesh model of their teeth, along with an image of the person smiling with their teeth blocked out. The generator (see
As shown in
The discriminator, also represented as a CNN, has two training steps. As shown in
The images of the person smiling with their teeth blocked out can be generated by finding features of the smile in the images (e.g., corners of the mouth), extracting the bounds of those features, and whiting out those features.
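The blocking-out step can be sketched as follows, assuming the smile features have already been located by some upstream detector that is not shown. The image is represented as rows of RGB tuples, the blocked region is assumed to be the axis-aligned bounding box of the feature points, and the names `mouth_bounds` and `white_out` are hypothetical.

```python
def mouth_bounds(landmarks, margin=0):
    """Return (left, top, right, bottom) enclosing the (x, y) feature points."""
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    return min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin


def white_out(image, bounds, white=(255, 255, 255)):
    """Return a copy of the image with every pixel inside the bounds set to white."""
    left, top, right, bottom = bounds
    height, width = len(image), len(image[0])
    out = [list(row) for row in image]  # copy so the input image is left unchanged
    # Clip the box to the image so a margin cannot index outside it.
    for y in range(max(top, 0), min(bottom, height - 1) + 1):
        for x in range(max(left, 0), min(right, width - 1) + 1):
            out[y][x] = white
    return out


# Toy 4-row by 5-column black image with two assumed mouth-corner points.
black = (0, 0, 0)
image = [[black] * 5 for _ in range(4)]
corners = [(1, 1), (3, 2)]
masked = white_out(image, mouth_bounds(corners))
```

A rectangular box is the simplest choice; a polygon traced through the feature points would follow the lips more closely at the cost of a point-in-polygon test.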
The generator network, discriminator network, and feature extracting network can be implemented in, for example, software or firmware modules for execution by a processor such as processor 20. The generated images can be displayed on, for example, display device 16.
The dataset for the following experiment consisted of approximately 5,000 patients. Each patient has a front-facing photo of themselves smiling, as well as a scan of their teeth. For this experiment, we used a 2D render of the scan as the conditional information.
To test the viability of this approach, we swapped the scans of different patients and then generated their corresponding photos. For example, Column A in
In
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/IB2022/057323 | 8/5/2022 | WO | |

| Number | Date | Country |
|---|---|---|
| 63231823 | Aug 2021 | US |