The present disclosure relates to a mouth and nose occluded detecting method and a system thereof. More particularly, the present disclosure relates to a mouth and nose occluded detecting method and a system thereof based on a convolutional neural network.
Because a protector cannot always stay with a patient, and in order to prevent the patient from being choked by mouth occluding or nose occluding, an occluded detection system is usually utilized to assist the protector so as to reduce the burden. However, conventional occluded detection systems usually produce misjudgments due to the light from the environment or the color of the patient's clothing.
Hence, how to prevent the light from the environment or the color of the clothing from affecting the mouth and nose occluded detecting system is a target of the industry.
According to one embodiment of the present disclosure, a mouth and nose occluded detecting method includes a detecting step and a warning step. The detecting step includes a facial detecting step, an image extracting step and an occluded determining step. In the facial detecting step, an image is captured by an image capturing device, wherein a facial portion image is obtained from the image according to a facial detection. In the image extracting step, a mouth portion is extracted from the facial portion image according to an image extraction so as to obtain a mouth portion image. In the occluded determining step, the mouth portion image is entered into an occluding convolutional neural network so as to produce a determining result, wherein the determining result is an occluding state or a normal state. In the warning step, a warning is provided according to the determining result: when the determining result is the normal state, the detecting step is performed again; when the determining result is the occluding state, the warning is provided.
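The steps above can be sketched as the following loop; the function names are hypothetical stand-ins for the facial detecting, image extracting, occluded determining and warning steps, not identifiers from the disclosure.

```python
# Hypothetical sketch of the detect -> extract -> determine -> warn loop.
# All callables are supplied by the caller; none are defined by the disclosure.

def detect_and_warn(capture_image, detect_face, extract_mouth,
                    classify_mouth, warn, max_frames=None):
    """Repeat the detecting step until an occluding state is determined."""
    frames = 0
    while max_frames is None or frames < max_frames:
        image = capture_image()            # facial detecting step: capture
        face = detect_face(image)          # facial detection on the image
        mouth = extract_mouth(face)        # image extracting step
        result = classify_mouth(mouth)     # occluded determining step
        if result == "occluding":          # warning step
            warn()
            return "occluding"
        frames += 1                        # normal state: detect again
    return "normal"
```

A caller would plug in the camera capture, the facial detector, the mouth extractor and the trained network as the four callables, with `warn` driving a light or a buzzer.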
According to another embodiment of the present disclosure, a mouth and nose occluded detecting system includes an image capturing device, a processor and a warning device. The image capturing device is for capturing an image. The processor is electronically connected to the image capturing device, and includes a facial detecting module, an image extracting module and an occluded determining module. The facial detecting module is electronically connected to the image capturing device, wherein the facial detecting module captures the image by the image capturing device, and a facial portion image is obtained from the image according to a facial detection. The image extracting module is electronically connected to the facial detecting module, wherein the image extracting module extracts a mouth portion from the facial portion image according to an image extraction so as to obtain a mouth portion image. The occluded determining module is electronically connected to the image extracting module, wherein the occluded determining module enters the mouth portion image into an occluding convolutional neural network so as to produce a determining result. The warning device is signally connected to the processor, wherein the warning device provides a warning according to the determining result: when the determining result is a normal state, the detecting step is performed again; when the determining result is an occluding state, the warning is provided.
The present disclosure can be more fully understood by reading the following detailed description of the embodiment, with reference made to the accompanying drawings as follows:
The embodiments will be described with the drawings. For clarity, some practical details will be described below. However, it should be noted that the present disclosure should not be limited by these practical details; that is, in some embodiments, the practical details are unnecessary. In addition, for simplifying the drawings, some conventional structures and elements will be simply illustrated, and repeated elements may be represented by the same labels.
In detail, the detecting step s110 includes a facial detecting step s111, an image extracting step s112 and an occluded determining step s113. In the facial detecting step s111, an image is captured by an image capturing device 410 (shown in the drawings).
In order to increase the number of the training samples of the occluding convolutional neural network, the image processing of the image processing step s132 can be an image flipping, a histogram equalization, a log transform, a gamma processing or a Laplace processing. A target of processing the occluding images is to simulate the illuminance of the image and the profile of the image in various situations so as to train the occluding convolutional neural network. The histogram equalization evenly redistributes the brightness of the occluding image so as to increase the brightness of a dark portion in the occluding image and decrease the brightness of a bright portion in the occluding image. The log transform increases the brightness of the dark portion in the occluding image. The gamma processing increases the brightness of the dark portion in the occluding image and decreases the brightness of the bright portion in the occluding image by adjusting a gamma value of the occluding image. The Laplace processing obtains an image profile, an image shape and a distribution status of the occluding image by a second-order partial differential. In other words, the occluding image can be processed by each of the histogram equalization, the log transform, the gamma processing and the Laplace processing, and the occluding image can also be flipped and then processed by each of the histogram equalization, the log transform, the gamma processing and the Laplace processing, so that nine post-processing occluded detection images are obtained to increase the number of the training samples of the occluded convolutional neural network. It should be mentioned that the image processing is disclosed above, but it should not be limited to the description of the embodiments herein.
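A minimal numpy sketch of the five image processings named above, applied to a grayscale image with values in [0, 255]; the specific gamma value and Laplace kernel below are assumptions for illustration, not values from the disclosure.

```python
import numpy as np

def flip(img):                      # horizontal image flipping
    return img[:, ::-1]

def log_transform(img):             # brightens dark portions
    return 255.0 * np.log1p(img) / np.log(256.0)

def gamma_process(img, gamma=0.5):  # assumed gamma < 1 brightens dark portions
    return 255.0 * (img / 255.0) ** gamma

def hist_equalize(img):             # evens out the brightness distribution
    hist = np.bincount(img.astype(np.uint8).ravel(), minlength=256)
    cdf = hist.cumsum() / img.size
    return 255.0 * cdf[img.astype(np.uint8)]

def laplace(img):                   # second-order derivative: image profile
    k = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]])  # assumed 4-neighbour kernel
    pad = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    return sum(k[i, j] * pad[i:i + h, j:j + w]
               for i in range(3) for j in range(3))

def augment(img):
    """One occluding image -> nine augmented variants, as described above:
    the flipped image, four processings of the original, and the same four
    processings of the flipped image."""
    ops = [hist_equalize, log_transform, gamma_process, laplace]
    flipped = flip(img)
    return [flipped] + [op(img) for op in ops] + [op(flipped) for op in ops]
```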
Table 1 shows an accuracy rate of a first example and an accuracy rate of a first comparative example, wherein an occluded convolutional neural network structure 200 (shown in the drawings) is utilized.
The occluded convolutional neural network structure 200 can further include a softmax layer sl, wherein the softmax layer sl is for calculating a probability of the occluding state and a probability of the normal state so as to produce the determining result. The softmax layer sl involves at least one image state, a number of the image states, the mouth portion image 310, at least one image state parameter, at least one image state probability and an image state probability set, wherein y(i) is the image state, K is the number of the image states, x(i) is the mouth portion image 310, θ is an image state parameter set, each of θ1, θ2, . . . , θK is an image state parameter, p(y(i)=k|x(i);θ) is the image state probability, hθ(x(i)) is the image state probability set and T denotes a matrix transpose. The softmax layer sl corresponds to formula (1).
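Formula (1) itself is not reproduced in this text; based on the symbol definitions above, the standard softmax form consistent with them would be:

```latex
h_{\theta}\!\left(x^{(i)}\right) =
\begin{bmatrix}
p\!\left(y^{(i)}=1 \mid x^{(i)};\theta\right) \\
\vdots \\
p\!\left(y^{(i)}=K \mid x^{(i)};\theta\right)
\end{bmatrix}
= \frac{1}{\sum_{j=1}^{K} e^{\theta_{j}^{T} x^{(i)}}}
\begin{bmatrix}
e^{\theta_{1}^{T} x^{(i)}} \\
\vdots \\
e^{\theta_{K}^{T} x^{(i)}}
\end{bmatrix}
\tag{1}
```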
Therefore, the occluded convolutional neural network is for determining the probability of each image state of the mouth portion image 310, wherein the image states include at least the occluding state and the normal state. When the probability of the occluding state is greater than the probability of the normal state, the determining result is the occluding state. When the probability of the normal state is greater than the probability of the occluding state, the determining result is the normal state.
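The decision rule above can be sketched as follows, assuming the two logits feeding the softmax layer are available; the function names are illustrative only.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def determine(occluding_logit, normal_logit):
    """Return the more probable image state, as described above."""
    p_occluding, p_normal = softmax([occluding_logit, normal_logit])
    return "occluding" if p_occluding > p_normal else "normal"
```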
Table 3 shows an occluded convolutional neural network structure 200 of a second example and the occluded convolutional neural network structures of a second comparative example, a third comparative example, a fourth comparative example, a fifth comparative example and a sixth comparative example, respectively. Table 4 shows an accuracy of the occluded convolutional neural network of the second example and an accuracy of the occluded convolutional neural network of each of the second comparative example, the third comparative example, the fourth comparative example, the fifth comparative example and the sixth comparative example. As shown in Table 3 and Table 4, the accuracy of the occluded convolutional neural network of the second example is greater than the accuracy of the occluded convolutional neural network of each of the second comparative example, the third comparative example, the fourth comparative example, the fifth comparative example and the sixth comparative example.
In order to obtain the facial portion image from the image, the facial detection can utilize a Multi-task cascaded convolutional network for detecting a facial portion of the image, wherein the Multi-task cascaded convolutional network includes a Proposal-Net (P-Net), a Refine-Net (R-Net) and an Output-Net (O-Net). The Proposal-Net obtains a plurality of bounding boxes by a Proposal-Net convolutional neural network. The Refine-Net removes non-facial bounding boxes by a Refine-Net convolutional neural network. The Output-Net outputs a facial feature by an Output-Net convolutional neural network. Therefore, the facial portion image is obtained by entering the image into the Multi-task cascaded convolutional network.
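The three-stage data flow described above can be sketched as a simple function composition; the stage functions below are stubs standing in for the trained P-Net, R-Net and O-Net convolutional networks, since a real Multi-task cascaded convolutional network requires learned weights.

```python
# Illustrative data-flow sketch of the P-Net -> R-Net -> O-Net cascade.
# The three stage callables are stand-ins, not trained networks.

def mtcnn_cascade(image, p_net, r_net, o_net):
    candidates = p_net(image)                 # P-Net: propose bounding boxes
    faces = r_net(image, candidates)          # R-Net: remove non-facial boxes
    return [o_net(image, box) for box in faces]  # O-Net: output facial features
```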
Because the size of the face of each person differs, the mouth and nose occluded detecting method s100 can further include normalizing the mouth portion image 310 to obtain a post-normalizing mouth portion image, so that a misjudgment of the mouth and nose occluded detecting method s100 due to the difference of the face sizes can be avoided. Table 5 shows an accuracy rate of an occluded convolutional neural network of a third example and an accuracy rate of an occluded convolutional neural network of each of a seventh comparative example, an eighth comparative example, a ninth comparative example and a tenth comparative example. A size of a post-normalizing mouth portion image of the third example is 50×50, a size of a post-normalizing mouth portion image of the seventh comparative example is 25×25, a size of a post-normalizing mouth portion image of the eighth comparative example is 75×75, a size of a post-normalizing mouth portion image of the ninth comparative example is 100×100 and a size of a post-normalizing mouth portion image of the tenth comparative example is 150×150. As shown in Table 5, the accuracy rate of the occluded convolutional neural network of the third example is greater than the accuracy rate of the occluded convolutional neural network of each of the seventh comparative example, the eighth comparative example, the ninth comparative example and the tenth comparative example. In other words, when the size of the post-normalizing mouth portion image is 50×50, the accuracy rate of the occluded convolutional neural network is the highest.
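A minimal sketch of the normalization, assuming a nearest-neighbour resize to 50×50; a production system would more likely use a library routine such as cv2.resize.

```python
import numpy as np

def normalize_mouth(img, size=50):
    """Resize a mouth portion image to size x size by nearest-neighbour
    sampling, so faces of different sizes yield inputs of one fixed size."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size   # source row index for each output row
    cols = np.arange(size) * w // size   # source column index for each output column
    return img[rows][:, cols]
```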
In detail, the image capturing device 410 is for capturing an image of the patient so as to produce the image, wherein the image capturing device 410 is a camera. The facial detecting module 421 of the processor 420 is for obtaining the facial portion image from the image by the facial detection, wherein the facial detection utilizes a Multi-task cascaded convolutional network for detecting a facial portion of the image. The image extracting module 422 of the processor 420 extracts the mouth portion from the facial portion image by an image extraction so as to obtain the mouth portion image 310, wherein the image extraction utilizes a nine-square division so as to obtain a nine-square facial portion image 300, and the mouth portion image 310 is obtained by extracting a three-square image from a lower part of the nine-square facial portion image 300. The occluded determining module 423 of the processor 420 is for producing the determining result by entering the mouth portion image 310 into the occluding convolutional neural network, wherein the determining result is the occluding state or the normal state. The processor 420 is a micro-processor, a central processing unit or another electronic processing unit. The warning device 430 provides the warning according to the determining result. When the determining result is the normal state, the image capturing device 410 captures the image of the patient again so as to continuously monitor the state of the patient. When the determining result is the occluding state, the warning is provided so as to notify the protector to provide treatment expeditiously. The warning device 430 can be an image warning device (e.g., a flashing light) or a voice warning device (e.g., a buzzer). The mouth and nose occluded detecting system 400 can be applied to a computer or a cell phone.
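The nine-square division can be sketched as follows: the facial portion image is treated as a 3×3 grid, and the bottom row of the grid (the three lower squares) is kept as the mouth portion image. This is an illustrative reading of the extraction described above, not code from the disclosure.

```python
import numpy as np

def extract_mouth_region(face_img):
    """Keep the lower three squares of the 3x3 nine-square division:
    the bottom third of the rows, across the full width."""
    h = face_img.shape[0]
    return face_img[2 * h // 3:, :]
```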
In order to improve an accuracy rate of the mouth and nose occluded detecting system 400, the occluded convolutional neural network can include six convolutional layers, three pooling layers, a hidden layer hl and an output layer op, wherein the occluded convolutional neural network structure 200 is the same as that shown in the drawings.
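The disclosure does not give kernel sizes or strides in this text; assuming 3×3 convolutions with padding 1 and 2×2 max pooling, arranged as three blocks of two convolutional layers followed by one pooling layer (six convolutional layers and three pooling layers in total), the feature-map size of a 50×50 input can be traced as:

```python
# Shape arithmetic for the assumed structure: three blocks of
# (conv 3x3 pad 1, conv 3x3 pad 1, max-pool 2x2 stride 2).

def conv_out(size, kernel=3, stride=1, pad=1):
    """Output size of a square convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output size of a square max-pooling layer."""
    return (size - kernel) // stride + 1

def trace_structure(size=50):
    """Feature-map side length after each of the three blocks."""
    sizes = [size]
    for _ in range(3):
        size = conv_out(conv_out(size))  # two convolutions keep the size
        size = pool_out(size)            # pooling halves it (floor)
        sizes.append(size)
    return sizes
```

Under these assumptions a 50×50 post-normalizing mouth portion image shrinks to 25×25, 12×12 and finally 6×6 before the hidden layer hl and the output layer op.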
Hence, the mouth and nose occluded detecting method and the mouth and nose occluded detecting system can provide the following advantages:
(1) The accuracy rate of the occluded convolutional neural network can be increased by increasing the number of the training samples of the occluded convolutional neural network via the image processing.
(2) An influence of environmental factors is decreased by entering the mouth portion image into the occluded convolutional neural network, so as to prevent the occluded convolutional neural network from producing a misjudgment and to increase the accuracy rate of the occluded convolutional neural network.
(3) The mouth and nose occluded detecting method and the mouth and nose occluded detecting system utilize the occluded convolutional neural network structure so as to increase the accuracy rate of the occluded convolutional neural network.
Although the present disclosure has been described in considerable detail with reference to certain embodiments thereof, other embodiments are possible. Therefore, the spirit and scope of the appended claims should not be limited to the description of the embodiments contained herein.
It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the present disclosure without departing from the scope or spirit of the disclosure. In view of the foregoing, it is intended that the present disclosure cover modifications and variations of this disclosure provided they fall within the scope of the following claims.