Aspects of the present invention relate to generation of training sets for disease detection models that use facial imagery to identify diseases and disorders. In particular, aspects of the invention relate to the use of existing facial imagery to generate additional facial imagery as part of training sets to train disease detection models.
Facial recognition models can be useful to identify certain types of facial diseases and disorders. Such models can supplement a doctor's examination, to help identify the correct disease or disorder before resorting to more expensive diagnostic tools such as diagnostic imaging (e.g. CT scans, MRI). These models also can provide early warning of onset of a disease or disorder.
It would be helpful to provide more robust training data to improve the performance of the facial recognition models, particularly for specific diseases or disorders.
In view of the foregoing, according to aspects of the invention, transformations may be performed on existing facial images, whether affected or unaffected by disease or disorder, in order to generate additional training data for a facial recognition model. The transformations can be tailored to particular facial disorders, and can be applied in differing degrees to facial images to generate transformed facial images to be added to training sets for facial recognition models. In some aspects, the transformations may be applied to different portions of a facial image to focus on the kinds of facial anomalies that may be unique to a particular disease or disorder.
Aspects of the present invention will be described in detail with reference to the accompanying drawings, in which:
Aspects of the present invention generate synthetic disease face images by adjusting a degree of disease symptoms for use in development of disease detection models. Such models can be used in primary care facilities, emergency rooms (ER), as well as in doctors' offices, prior to undertaking expensive imaging, such as CT or MRI.
Embodiments of the present invention thus can be helpful in various clinical practices, including early diagnosis and treatment planning. Acquiring a sufficient amount of training data for facial recognition models such as these can be very difficult and/or very expensive, thereby limiting the volume and quality of training data available. Existing 2D image data augmentation techniques, such as rotation, color shifting, and contrast adjustment, are not effective when the images are of people. With limited training data in the form of real facial images showing a disease or disorder, a trained model may not perform as well at inference time, and may be inaccurate in diagnosing the cause of a patient's facial appearance.
Depending on the embodiment, images may be taken of someone's entire face or cranium, or of portions of the face or cranium, such as eyes, mouth, nose, ears, cheeks, or jaws.
In embodiments discussed herein, there is specific reference to two different diseases or disorders that present in the human face. Stroke is one of these. Moon face is another. Stroke is but one example of a medical condition that can cause facial drooping, in which various parts of a person's face are paralyzed and therefore seem to droop. Moon face (referred to sometimes as moon facies), in which portions of a person's face, including for example their cheeks and/or surrounding areas appear rounder or puffier, may result from different syndromes or treatments. In embodiments, training sets may be generated to enable an artificial intelligence/machine learning (AI/ML) system to detect these and other different medical conditions.
Aspects of the present invention enable the provision of supervised learning in a neural network, which may be any of a variety of neural networks, as ordinarily skilled artisans will appreciate, as well as any of a variety of machine learning systems. Recognizing that some ordinarily skilled artisans apply different definitions to different types of machine learning systems, the inventive techniques are applicable across a range of such systems, whether referred to as machine learning systems, or deep learning systems, or by another name. The inventive techniques also are applicable across a range of neural networks, for which a non-exhaustive but exemplary list includes convolutional neural networks (CNN), fully convolutional networks (FCN), and recurrent neural networks (RNN). The inventive techniques also can be applicable to vision transformer (ViT) networks. Sequence models that model the progression of a disease or disorder can be useful for monitoring patients over time.
In an embodiment, an FCN such as U-Net may be used as an example of an image segmentation network, to generate facial images as masks. Labeling each pixel of an image enables detailed manipulation of particular facial or cranial attributes to simulate the effects of different diseases or disorders. In an embodiment, image segmentation according to aspects of the invention provides a pixel-by-pixel map of the healthy image 105 to generate the facemask 115.
In
Depending on the embodiment, the n transforms may pertain to one particular facial feature (e.g. drooping eyelid), or may pertain to a plurality of facial features (e.g. not only drooping eyelid but also drooping mouth), or to a plurality of facial features for different diseases or disorders (e.g. drooping facial portions, other nerve-related facial anomalies, moon face, etc.). The resulting set of disease facemasks can be augmented to address additional diseases or disorders presenting as alterations of one or more facial features.
In
The disease face images 180 may form part of a training set that may be input to a detection network, such as a convolutional neural network (CNN), to train the detection network. Once the detection network is trained, in
Different kinds of strokes can cause facial drooping to different degrees, for example ischemic stroke, hemorrhagic stroke, transient ischemic attack (mini-stroke or TIA), brain stem stroke, or even a stroke resulting from unknown causes, sometimes referred to as cryptogenic stroke.
A number of other diseases or disorders also can cause facial drooping to different degrees, including but not necessarily limited to trigeminal neuralgia, Bell's palsy, shingles (herpes zoster oticus, also called Ramsay Hunt syndrome), Treacher Collins syndrome (mandibulofacial dysostosis), Jacobsen syndrome, or Crouzon syndrome. Some of the syndromes and disorders just mentioned are rarer than others, so that doctors may need greater aid in diagnosing them.
Other diseases, disorders, or in some cases medical treatments can cause moon face, for example Cushing's syndrome, or the administration of certain steroids such as prednisone.
There are other diseases or disorders which may affect different parts of the head and/or face. A non-exhaustive list of examples may include:
From the foregoing, ordinarily skilled artisans will appreciate that embodiments of the invention enable the generation of synthetic or artificial disease face images by adjusting a degree of disease symptoms on available normal face images. The generated disease face images may be used along with the real disease face images to train a disease recognition model.
In an embodiment, facial indications of disease or disorder may be interpreted in an end-to-end approach, using various kinds of AI/ML approaches, including deep neural networks, without a requirement that there be any measurements of a subject's face as part of any determination of the extent to which a disease or disorder is present.
An algorithm in accordance with aspects of the invention is able to modify normal face images to generate disease face images with a range of effects. Controlling a degree of disease severity in facemasks generated by the segmentation network allows this range.
According to aspects of the invention, it is possible to apply different transformations to normal facial images in order to simulate different diseases or disorders. For example, strokes involving the brain often cause central facial weakness involving the mouth and eyes. Face drooping is one of the most common signs of such a stroke. One side of a stroke victim's face, for instance, may become numb or weak. In an embodiment, in order to generate realistic stroke-displaying facial images, the transformation may be applied to specific facial regions that a stroke usually affects (e.g., mouth, lips, and eye) without modifying other facial regions. Face segmentation masks help to apply the transformation to desired regions by excluding other regions from the transformation. Shear transformations of different degrees, for example 0.1 and 0.2, may be applied to specific regions of normal facial masks in order to generate facial distortion classes associated with a particular disease or disorder.
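By way of illustration only, the region-restricted shear described above might be sketched as follows. The sketch assumes that a facemask is a small integer label map, that the values 0.1 and 0.2 are shear factors (rows of vertical displacement per column of horizontal distance), and that the class labels SKIN and MOUTH and the function shear_region are hypothetical names; none of these details is specified in the description.

```python
import math
from typing import List

Mask = List[List[int]]  # label map: one integer class label per pixel

# Hypothetical class labels; the description does not define a labeling scheme.
SKIN = 1
MOUTH = 3


def shear_region(mask: Mask, label: int, shear: float, fill: int = SKIN) -> Mask:
    """Return a copy of `mask` with the `label` region sheared vertically.

    A region pixel in column x moves down by round(shear * (x - x_min)) rows,
    so the far side of the region sags relative to the near side. Pixels with
    any other label stay where they are unless a moved pixel lands on them.
    """
    height, width = len(mask), len(mask[0])
    coords = [(y, x) for y in range(height) for x in range(width)
              if mask[y][x] == label]
    out = [row[:] for row in mask]
    if not coords:
        return out
    x_min = min(x for _, x in coords)
    for y, x in coords:           # vacate the original region
        out[y][x] = fill
    for y, x in coords:           # redraw each pixel at its sheared row
        new_y = y + math.floor(shear * (x - x_min) + 0.5)
        if 0 <= new_y < height:   # pixels pushed off the mask are dropped
            out[new_y][x] = label
    return out


# A 6 x 12 mask of skin with a one-pixel-high mouth spanning columns 1..10.
healthy = [[SKIN] * 12 for _ in range(6)]
for x in range(1, 11):
    healthy[2][x] = MOUTH

# One disease facemask per degree of droop; the healthy mask is not modified.
mild = shear_region(healthy, MOUTH, 0.1)
severe = shear_region(healthy, MOUTH, 0.2)
```

Because only pixels carrying the selected label are moved, the remaining regions of the mask are excluded from the transformation, and each shear value yields a separate distortion class.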
In an embodiment, a mask to simulate a moon face condition may be generated by adding different amounts of soft tissue to different facial regions (especially cheek and chin regions, for example), facilitating the synthesizing of realistic moon face images. Similar to the work with artificial training data sets for diagnosing strokes or other disorders or diseases, facial segmentation masks help to apply the transformation to desired regions by excluding other regions from modification.
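By way of illustration only, one simple way to model the addition of soft tissue is to grow a labeled region outward into the background by a chosen number of pixels (a 4-neighbor morphological dilation). This is an assumed stand-in, not a technique named in the description, and the labels BACKGROUND, CHEEK, and NOSE and the function grow_region are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # label map: one integer class label per pixel

# Hypothetical class labels; the description does not define a labeling scheme.
BACKGROUND = 0
CHEEK = 2
NOSE = 4


def grow_region(mask: Mask, label: int, amount: int,
                into: int = BACKGROUND) -> Mask:
    """Return a copy of `mask` with the `label` region grown by `amount` pixels.

    On each pass, every `into` pixel with a 4-connected neighbor carrying
    `label` takes that label. Pixels with any other label are never
    overwritten, so neighboring facial regions are excluded from the change.
    """
    height, width = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for _ in range(amount):
        current = [row[:] for row in out]  # snapshot so growth is 1 px per pass
        for y in range(height):
            for x in range(width):
                if current[y][x] != into:
                    continue
                neighbors = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if any(0 <= ny < height and 0 <= nx < width
                       and current[ny][nx] == label
                       for ny, nx in neighbors):
                    out[y][x] = label
    return out


# A 5 x 7 mask with a single cheek pixel, grown by two different amounts.
face = [[BACKGROUND] * 7 for _ in range(5)]
face[2][3] = CHEEK
slightly_puffy = grow_region(face, CHEEK, 1)
very_puffy = grow_region(face, CHEEK, 2)
```

Calling the function with different amounts for the cheek and chin labels would give different degrees of roundness per region.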
A generated disease facial mask and a normal facial image may be used as input to a generative adversarial network (GAN) model to output synthesized facial images depicting a disease or disorder. Finally, a trained CNN model may be used to detect the patient's condition and stage of severity: normal stage, watch stage (not severe, but requiring monitoring), or disease or disorder stage (more severe).
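By way of illustration only, the final step of mapping a detection network's raw outputs to one of the three stages might look like the following. The sketch assumes the network emits one score (logit) per stage in the order shown. That ordering, the softmax conversion, and the names STAGES and predict_stage are assumptions, not details given in the description.

```python
import math
from typing import List, Sequence, Tuple

# Assumed output order of the detection network's three classes.
STAGES = ("normal", "watch", "disease")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def predict_stage(logits: Sequence[float]) -> Tuple[str, float]:
    """Return the most probable stage and its probability."""
    if len(logits) != len(STAGES):
        raise ValueError("expected one score per stage")
    probabilities = softmax(logits)
    best = max(range(len(STAGES)), key=probabilities.__getitem__)
    return STAGES[best], probabilities[best]


# Hypothetical scores for one face image.
stage, confidence = predict_stage([0.1, 2.0, 0.3])
```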
For stroke patients, it should be noted that either side of a patient's face may be affected. Accordingly, training data should include examples in which the left side of a patient's face is affected and examples in which the right side is affected. For other disorders, the facial effects may be different, for example, affecting the eye but not the mouth, or equally affecting both sides of a patient's face.
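By way of illustration only, one way to obtain both left-side and right-side examples is to mirror each disease facemask horizontally. This is an assumed approach, not one stated in the description. The paired healthy image would need to be mirrored in the same way so that mask and image stay aligned, and if the label map distinguishes left from right features, those labels would need to be swapped. The function names and the labels used below are hypothetical.

```python
from typing import Dict, List, Optional

Mask = List[List[int]]  # label map: one integer class label per pixel


def mirror_mask(mask: Mask, swap: Optional[Dict[int, int]] = None) -> Mask:
    """Return a left-right mirrored copy of `mask`.

    `swap` optionally maps side-specific labels to their counterparts,
    for example {LEFT_EYE: RIGHT_EYE, RIGHT_EYE: LEFT_EYE}.
    """
    swap = swap or {}
    return [[swap.get(value, value) for value in reversed(row)] for row in mask]


def with_both_sides(masks: List[Mask],
                    swap: Optional[Dict[int, int]] = None) -> List[Mask]:
    """Return each mask followed by its mirrored counterpart."""
    both: List[Mask] = []
    for mask in masks:
        both.append(mask)
        both.append(mirror_mask(mask, swap))
    return both


# Hypothetical labels: 2 = left eye, 3 = right eye, 1 = drooped mouth corner.
one_sided = [[1, 0, 0],
             [2, 0, 3]]
mirrored = mirror_mask(one_sided, swap={2: 3, 3: 2})
```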
At 415, performance of one or more transforms (n transforms) of the facemask begins by setting a counter, m, to 1. At 420, one of the n transforms is performed to produce a disease facemask. Depending on the embodiment, the n transforms may pertain to a particular portion of a face or cranium, or to a particular degree of transformation, or both. At 425, that produced disease facemask is added to a disease facemask set. At 430, a check is made to see whether all n transforms have been performed. If not, then at 435 the counter m is incremented, and flow returns to 420. This cycle continues until all n transforms have been performed (m=n at 430 is answered in the affirmative). This just-described portion of
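By way of illustration only, the loop over the n transforms at 415 through 435 might be sketched as follows. The sketch assumes each transform is a function that takes a facemask and returns a new disease facemask. The names build_disease_facemask_set and shift_down, and the placeholder transforms themselves, are hypothetical.

```python
from typing import Callable, List, Sequence

Mask = List[List[int]]               # label map: one class label per pixel
Transform = Callable[[Mask], Mask]   # takes a facemask, returns a disease facemask


def build_disease_facemask_set(facemask: Mask,
                               transforms: Sequence[Transform]) -> List[Mask]:
    """Apply each of the n transforms to `facemask` and collect the results."""
    disease_facemask_set: List[Mask] = []
    # 415: the counter m starts at 1; 430/435: it advances until m == n.
    for m, transform in enumerate(transforms, start=1):
        disease_facemask = transform(facemask)          # 420: perform transform m
        disease_facemask_set.append(disease_facemask)   # 425: add to the set
    return disease_facemask_set


def shift_down(rows: int) -> Transform:
    """Placeholder transform: cyclically move every mask row down by `rows`."""
    def apply(mask: Mask) -> Mask:
        k = rows % len(mask)
        return [row[:] for row in mask[-k:] + mask[:-k]] if k else [row[:] for row in mask]
    return apply


facemask = [[1, 1], [3, 3], [1, 1]]
disease_facemasks = build_disease_facemask_set(
    facemask, [shift_down(1), shift_down(2)])
```

Each entry in the list of transforms can differ in the facial region it targets, in its degree, or in both.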
After the n transforms have been performed, at 440 the counter is reset, so that m=1 again. At 445, a healthy face image and one of the n disease facemasks are input to a disease face generation network to generate a disease face image. At 450, that disease face image is added to the disease face image training set. At 455, a check is made to see whether all n of the disease facemasks have been used. If not, then at 460, the counter m is incremented, and flow returns to 445. This cycle continues until all n facemasks have been used with the healthy face image (m=n at 455 is answered in the affirmative). Then, at 465, a check is made to see whether there are additional healthy face images to process. If so, flow returns to 405, and another healthy face image is input to the disease face generation network with the n disease facemasks to generate another set of disease face images. In an embodiment, once all of the healthy face images have been used, at 470 the synthetic disease face training set may be said to be complete. This just-described portion of
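By way of illustration only, the generation loop at 440 through 470 might be sketched as follows. The disease face generation network is represented by a caller-supplied function, and the steps that precede 415 are represented by a function that returns the n disease facemasks for a given healthy image. Both functions, and the name build_training_set, are hypothetical stand-ins.

```python
from typing import Callable, List, Sequence, TypeVar

Image = TypeVar("Image")
Mask = TypeVar("Mask")
Generated = TypeVar("Generated")


def build_training_set(
    healthy_images: Sequence[Image],
    facemasks_for: Callable[[Image], Sequence[Mask]],
    generate: Callable[[Image, Mask], Generated],
) -> List[Generated]:
    """Pair every healthy image with each of its n disease facemasks."""
    training_set: List[Generated] = []
    for healthy_image in healthy_images:                  # 465: more images?
        disease_facemasks = facemasks_for(healthy_image)  # stand-in for earlier steps
        for disease_facemask in disease_facemasks:        # 440-460: m = 1..n
            disease_face = generate(healthy_image, disease_facemask)  # 445
            training_set.append(disease_face)                         # 450
    return training_set                                   # 470: set complete


# Stand-ins: images and masks are strings, and "generation" just pairs them.
images = ["face_a", "face_b"]
training_set = build_training_set(
    images,
    facemasks_for=lambda image: [image + "_droop_0.1", image + "_droop_0.2"],
    generate=lambda image, mask: (image, mask),
)
```

With two healthy images and n = 2 facemasks each, the resulting set holds four synthetic entries, one per image and facemask pair.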
In an embodiment, the synthetic disease face training set may be augmented by actual disease face images.
While aspects of the present invention have been described in detail with reference to various drawings, ordinarily skilled artisans will appreciate that there may be numerous variations within the scope and spirit of the invention. Accordingly, the invention is limited only by the following claims.