This application claims priority to and the benefit of Taiwan Application Serial Number 109138489, filed on Nov. 4, 2020, the entire content of which is incorporated herein by reference as if fully set forth below in its entirety and for all applicable purposes.
The disclosure generally relates to a recognition system and a recognition method, and more particularly, to a recognition system of human body posture and a recognition method of human body posture.
The recognition method of human body posture is commonly applied in public places to review the states of people in the field through their body postures and thereby ensure public safety. For example, when people fall down on the road, in the traffic environment, or on public transportation, not only are they injured and in need of urgent care, but the fall may also cause chaos, which endangers public safety.
To preserve public safety in the field, cameras can be deployed in public places to monitor the field. However, current image recognition techniques are influenced by the complexity of the area or location, the camera angle, and variations in light intensity, such that it is difficult to recognize the state of the people in the area by processing the image. When the area or location is complicated or there are many people in the area, people in the image overlap and the entire image of each person cannot be captured. Because the current image recognition algorithm uses gray images, neither the left/right sides and distances of the people nor the image contents can be determined. Therefore, current image processing techniques are inefficient at training the model and recognizing the images.
The disclosure can be more fully understood by reading the following detailed description of the embodiments, with reference made to the accompanying drawings as described below. It should be noted that the features in the drawings are not necessarily to scale. In fact, the dimensions of the features may be arbitrarily increased or decreased for clarity of discussion.
An embodiment of the present disclosure provides a recognition system of human body posture which includes a source image device, a storage device, and a processing device. The source image device is configured to receive a plurality of pending recognition images. The storage device is configured to store a posture recognition model, and the posture recognition model is configured to receive a skeleton image as input and output a recognition result. The skeleton image includes a skeleton and the skeleton includes a plurality of joints and a plurality of limbs. Each of the limbs corresponds to a limb color, and each of the limb colors is different from each other. The processing device is coupled with the source image device and the storage device, and the processing device is configured to: generate the skeleton images from the pending recognition images; input the skeleton images into the posture recognition model respectively to output the recognition result which corresponds to the skeleton images inputted; and determine whether abnormal information is sent according to the recognition result.
One aspect of the present disclosure is to provide a recognition method of human body posture including: receiving a plurality of pending recognition images; generating a plurality of skeleton images from the pending recognition images, wherein the skeleton image comprises a skeleton, the skeleton comprises a plurality of joints and a plurality of limbs, each of the limbs corresponds to a limb color, and each of the limb colors is different from each other; inputting the skeleton images into a posture recognition model respectively to output a recognition result which corresponds to the skeleton images inputted; and determining whether abnormal information is sent according to the recognition result.
One aspect of the present disclosure is to provide a non-transitory computer-readable storage medium including instructions stored thereon, and the instructions are configured to cause a processor to: receive a plurality of pending recognition images; generate a plurality of skeleton images from the pending recognition images, wherein the skeleton image comprises a skeleton, the skeleton includes a plurality of joints and a plurality of limbs, each of the limbs corresponds to a limb color, and each of the limb colors is different from each other; input the skeleton images into a posture recognition model respectively to output a recognition result which corresponds to the skeleton images inputted; and determine whether abnormal information should be sent according to the recognition result.
It is to be understood that both the foregoing general description and the following detailed description are by examples, and are intended to provide further explanation of the disclosure as claimed.
The technical terms “first”, “second” and the similar terms are used to describe elements for distinguishing the same or similar elements or operations and are not intended to limit the technical elements and the order of the operations in the present disclosure. Furthermore, the element symbols/alphabets can be used repeatedly in each embodiment of the present disclosure. The same and similar technical terms can be represented by the same or similar symbols/alphabets in each embodiment. The repeated symbols/alphabets are provided for simplicity and clarity and they should not be interpreted to limit the relation of the technical terms among the embodiments.
A surveillance system can provide videos captured by cameras disposed at different scenes (such as the MRT, train stations, department stores, and so on). The administrator has to watch the monitor of the surveillance system at all times and check the monitoring screen to determine whether any accident occurs in the scene. However, this manual determination carries risk. If the administrator looks away for a short while, or the display is flawed or damaged, an accident may be missed and lead to a bad outcome because control of the scene is lost.
Reference is made to
Reference is made to
As shown in
In some embodiments, the source image device 210 (such as a camera) receives a plurality of pending recognition images. The pending recognition image is an image captured from a live stream or video. For example, a video with a frame rate of 30 frames per second (fps) shows 30 frames per second, and the pending recognition image is any static frame of the video. In another embodiment, the source image device 210 receives a live stream, or receives pending recognition images captured from a stored video.
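As a rough illustration of how static frames might be picked from a video as pending recognition images, the following sketch computes frame indices from the frame rate. The function name `frame_indices` and the idea of sampling at a fixed interval are assumptions made for this example; the disclosure only states that any static frame of the video can serve as a pending recognition image.

```python
from typing import List


def frame_indices(fps: float, duration_s: float, sample_every_s: float) -> List[int]:
    """Return indices of the frames to treat as pending recognition images.

    fps, duration_s and sample_every_s are hypothetical parameters: the text
    only says a 30 fps video shows 30 frames per second.
    """
    total_frames = int(fps * duration_s)
    # Number of frames to skip between two samples (at least one frame).
    step = max(1, round(fps * sample_every_s))
    return list(range(0, total_frames, step))


# A 2-second clip at 30 fps, sampled every half second.
indices = frame_indices(fps=30, duration_s=2, sample_every_s=0.5)
print(indices)
```

Sampling fewer frames than the video provides is one possible way to limit the recognition workload; sampling every frame corresponds to a step of one.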
In some embodiments, the storage device 230 stores a posture recognition model. After the posture recognition model receives a skeleton image as input, a recognition result is outputted. For example, the posture recognition model stores a plurality of skeleton images and corresponding human body postures. When the pending recognition image is inputted into the posture recognition model and a determination is made that the pending recognition image includes the skeleton image, the skeleton image can be applied for recognizing the human body posture and the recognition result is outputted. The posture recognition model can be, but is not limited to, a convolutional neural network (CNN) model. The CNN can be LeNet, AlexNet, VGGNet, GoogLeNet (Inception), ResNet, and so on, and the CNN model of the disclosure is not limited herein.
In some embodiments, the skeleton image that the processing device 220 generates from the pending recognition images for the posture recognition model includes one or more skeletons. The method for acquiring the skeleton image from the images is, for example, a human body keypoint detection algorithm. The human body keypoint detection algorithm is performed to detect the human body keypoints, such as the joints, to sketch the skeleton or the information of each body part of the human body. The human body keypoint detection algorithm can be, but is not limited to, the OpenPose algorithm, the regional multi-person pose estimation (RMPE) algorithm, the DeepCut algorithm, the Mask R-CNN algorithm, and so on. It should be noted that any algorithm performed to detect the human body parts can also be applied in the present disclosure. After the human body keypoint detection algorithm is performed to obtain the joint locations of the human body, the skeleton image can be sketched by linking the coordinates of the joint locations.
It should be noted that the pending recognition images are the images or pictures captured from the live stream or videos, and the pending recognition image may not include the human body or may include one or more than one human body. When the processing device 220 generates the skeleton image from one pending recognition image, and if the pending recognition image does not include the skeleton image, the pending recognition image will not be inputted into the posture recognition model. One or more skeleton images can be captured from the pending recognition image, and each skeleton image of the pending recognition image will be inputted one-by-one into the posture recognition model for recognition.
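The handling of images with zero, one, or several human bodies can be sketched as follows. This is only a minimal sketch: `recognize_pending_image`, `fake_extract`, and `fake_model` are hypothetical stand-ins for the keypoint detection step and the posture recognition model, and the label strings are invented for the example.

```python
from typing import Any, Callable, List


def recognize_pending_image(
    image: Any,
    extract_skeletons: Callable[[Any], List[Any]],
    posture_model: Callable[[Any], str],
) -> List[str]:
    """Run the posture model once per skeleton image found in one pending image."""
    skeleton_images = extract_skeletons(image)
    if not skeleton_images:
        # No human body detected: the image is not passed to the model.
        return []
    # Each skeleton image is recognized one by one.
    return [posture_model(skeleton) for skeleton in skeleton_images]


# Hypothetical stand-ins so the sketch runs without a real detector or model.
def fake_extract(image: dict) -> List[str]:
    return image.get("skeletons", [])


def fake_model(skeleton_image: str) -> str:
    return "falling_down" if skeleton_image == "lying" else "standing"


empty_result = recognize_pending_image({"skeletons": []}, fake_extract, fake_model)
crowd_result = recognize_pending_image(
    {"skeletons": ["upright", "lying"]}, fake_extract, fake_model
)
print(empty_result, crowd_result)
```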
For further descriptions of the skeleton image in the present disclosure, reference is made to
In some embodiments, the skeleton of each skeleton image includes a plurality of joints and a plurality of limbs. Each limb has a corresponding limb color, and each limb color is different. For example, after the coordinates of the joints are computed, the lines between the joint coordinates (i.e., limbs) can be obtained to sketch the skeleton image.
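Linking joint coordinates into limbs might look like the sketch below. The joint names, the `LIMB_PAIRS` table, and the choice to skip a limb when one of its joints was not detected are all assumptions for illustration; the disclosure does not specify a joint naming scheme or how missing joints are handled.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]

# Hypothetical pairs of joints that are connected by a limb.
LIMB_PAIRS: List[Tuple[str, str]] = [
    ("head", "neck"),
    ("neck", "left_shoulder"),
    ("neck", "right_shoulder"),
    ("left_shoulder", "left_elbow"),
    ("right_shoulder", "right_elbow"),
]


def link_joints(joints: Dict[str, Point]) -> List[Tuple[str, Point, Point]]:
    """Return one line segment (limb) per joint pair whose two ends were detected."""
    limbs = []
    for start, end in LIMB_PAIRS:
        if start in joints and end in joints:
            limbs.append((f"{start}-{end}", joints[start], joints[end]))
    return limbs


# Example joint coordinates (x, y) in pixels; the right side was not detected.
joints = {
    "head": (50, 10),
    "neck": (50, 30),
    "left_shoulder": (35, 35),
    "left_elbow": (30, 60),
}
limbs = link_joints(joints)
for name, start_point, end_point in limbs:
    print(name, start_point, end_point)
```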
In some embodiments, the skeleton image 310 of
In some embodiments, each of the limbs 321, 322, 323, 324, 325, and 326 has a corresponding limb color, and each limb color is different. For example, the color of the limb 321 is red, the color of the limb 322 is light green, the color of the limb 323 is dark green, the color of the limb 324 is purple, the color of the limb 325 is yellow, and the color of the limb 326 is aquamarine. Because each of the limb colors is different, the left side and the right side of the human body can be recognized from the skeleton. When the limbs of the skeleton partially overlap, the limb colors assist the recognition of the human body posture, such that the determination can be made more easily and more accurately. Furthermore, because the distance between the human body and the camera (the source image device) differs case by case, the sharpness of the skeleton image may differ, for example blurred or clear. To adapt the skeleton to the corresponding distance from the camera, the larger the ratio of the pixel amount of the human body image of the skeleton image to the pixel amount of the pending recognition image is, the thinner the limbs of the skeleton are. Conversely, the smaller the ratio is, the wider the lines of the limbs of the skeleton are.
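To make the role of distinct limb colors concrete, the sketch below draws two crossing limbs onto a tiny RGB canvas. The color names per limb come from the example above, but the RGB triples, the `draw_limb` helper, the canvas size, and the end point coordinates are assumptions chosen for illustration only.

```python
from typing import Dict, List, Tuple

Color = Tuple[int, int, int]
Point = Tuple[int, int]

# Limb reference numerals mapped to colors; the RGB values are assumed.
LIMB_COLORS: Dict[int, Color] = {
    321: (255, 0, 0),      # red
    322: (144, 238, 144),  # light green
    323: (0, 100, 0),      # dark green
    324: (128, 0, 128),    # purple
    325: (255, 255, 0),    # yellow
    326: (127, 255, 212),  # aquamarine
}


def draw_limb(canvas: List[List[Color]], start: Point, end: Point, color: Color) -> None:
    """Paint a one-pixel-wide line by stepping evenly from start to end."""
    (x0, y0), (x1, y1) = start, end
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)
    for i in range(steps + 1):
        x = round(x0 + (x1 - x0) * i / steps)
        y = round(y0 + (y1 - y0) * i / steps)
        canvas[y][x] = color


# 5x5 black canvas with two limbs that cross at the center pixel.
canvas = [[(0, 0, 0) for _ in range(5)] for _ in range(5)]
draw_limb(canvas, (0, 0), (4, 4), LIMB_COLORS[321])
draw_limb(canvas, (0, 4), (4, 0), LIMB_COLORS[324])
print(canvas[0][0], canvas[4][0], canvas[2][2])
```

In a gray image the two crossing lines would merge into a single X-shaped blob, whereas here every pixel away from the crossing point still identifies which limb it belongs to.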
In some embodiments, the processing device 220 acquires the human body image from the pending recognition images and applies the human body keypoint detection algorithm to acquire a plurality of key point coordinates of the human body image. Then, the processing device 220 obtains the skeleton image and its limbs of the human body according to the link between the key point coordinates. In some embodiments, the key point coordinates correspond to the joints of the skeleton image.
Reference is further made to
In some embodiments, the processing device 220 inputs the four skeleton images into the posture recognition model to output the recognition result. For example, the processing device 220 obtains a first skeleton image (not shown in
Similarly, the processing device 220 obtains a second skeleton image (not shown in
Similarly, the processing device 220 obtains a third skeleton image (not shown in
Similarly, the processing device 220 obtains a fourth skeleton image (not shown in
In some embodiments, the processing device 220 determines whether abnormal information should be sent according to the recognition result. As described above, the processing device 220 determines that the falling-down posture of the human body of the passenger is recognized in the pending recognition image 100 in
For further describing the recognition method of human body posture in the disclosure, reference is made to
In step S403, receiving a plurality of pending recognition images is performed. In some embodiments, the recognition system 200 of human body posture receives a plurality of pending recognition images to recognize the pending recognition images.
In step S405, generating the skeleton image from the pending recognition images is performed. In some embodiments, the recognition system 200 of human body posture executes the human body keypoint detection algorithm on the pending recognition images to compute the skeleton image corresponding to each human body of the pending recognition images.
In some embodiments, the recognition method 400 of human body posture acquires the human body image from the pending recognition image and obtains the corresponding key point coordinates from the human body image. Then, the skeleton image and its limbs corresponding to the human body can be obtained according to the links between the key point coordinates. The key point coordinates correspond to the joints of the skeleton image.
In step S410, tagging a color feature to each limb of the skeleton image is performed, such that the color feature of each limb is different from each other. In some embodiments, each limb of the skeleton image that is pre-stored in the posture recognition model corresponds to a limb color. For example, the head part is tagged with the color red. When the limbs of the skeleton image generated from the pending recognition image are tagged with color features, the same tagging rule is applied, i.e., when the head part is recognized, that limb part is tagged with the color red.
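The requirement that training and recognition share one tagging rule can be sketched with a single lookup table. Only the head-to-red entry comes from the text; the other limb names and colors, as well as the `tag_limb_colors` helper, are hypothetical.

```python
from typing import Dict, List

# One shared rule table used for both training and pending images.
LIMB_COLOR_RULE: Dict[str, str] = {
    "head": "red",          # stated in the text
    "left_arm": "yellow",   # hypothetical entry
    "right_arm": "purple",  # hypothetical entry
}


def tag_limb_colors(limb_names: List[str]) -> Dict[str, str]:
    """Tag each recognized limb with the color given by the shared rule."""
    unknown = [name for name in limb_names if name not in LIMB_COLOR_RULE]
    if unknown:
        # Refuse to invent a color, so the rule stays identical everywhere.
        raise KeyError(f"no color rule for limbs: {unknown}")
    return {name: LIMB_COLOR_RULE[name] for name in limb_names}


training_tags = tag_limb_colors(["head", "left_arm", "right_arm"])
pending_tags = tag_limb_colors(["head", "right_arm"])
print(training_tags, pending_tags)
```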
In step S415, inputting each skeleton image that is obtained from the pending recognition images to a posture recognition model is performed. In some embodiments, if multiple skeleton images are computed from the pending recognition image, each skeleton image will be inputted into the posture recognition model for determining each human body posture.
In some embodiments, the recognition method of human body posture 400 further adjusts a line width of each limb in the skeleton image. For example, the ratio of the pixel amount of the human body image corresponding to the skeleton image to the pixel amount of the pending recognition image is computed, and the line width of the skeleton image is adjusted according to that ratio. In some embodiments, the larger the ratio is (e.g., 18%), which represents that the human body is close to the camera, the thinner the line width in the skeleton image is. Conversely, the smaller the ratio is (e.g., 3%), which represents that the human body is farther from the camera, the wider the line width in the skeleton image is. In some embodiments, because the distances between the human bodies and the camera differ, the sharpness of the skeleton image varies correspondingly. If skeletons adjusted for their different distances are used for comparison, the accuracy is enhanced. A human body that is farther from the camera has a smaller pixel amount ratio and its skeleton lines are blurred, so the line width of the skeleton is increased correspondingly. On the other hand, a human body that is close to the camera has a larger pixel amount ratio and its skeleton lines are clear, so the line width of the skeleton is adjusted to thin lines, such that the structure of the skeleton can be presented and the accuracy of recognizing the human body posture can be increased.
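One possible way to turn the pixel amount ratio into a line width is a clamped linear mapping, sketched below. The 18% and 3% ratios are the examples given above; the linear interpolation, the 2 to 8 pixel width range, and the function name `limb_line_width` are assumptions, since the disclosure only states that a larger ratio gives thinner lines and a smaller ratio gives wider lines.

```python
def limb_line_width(
    body_pixels: int,
    image_pixels: int,
    near_ratio: float = 0.18,  # example ratio for a body close to the camera
    far_ratio: float = 0.03,   # example ratio for a body far from the camera
    min_width: int = 2,        # assumed width in pixels for near bodies
    max_width: int = 8,        # assumed width in pixels for far bodies
) -> int:
    """Map the body-to-image pixel ratio to a limb line width in pixels."""
    ratio = body_pixels / image_pixels
    # Clamp so that very near or very far bodies use the extreme widths.
    ratio = min(max(ratio, far_ratio), near_ratio)
    # far_weight is 0.0 for a near body and 1.0 for a far body.
    far_weight = (near_ratio - ratio) / (near_ratio - far_ratio)
    return round(min_width + far_weight * (max_width - min_width))


near_width = limb_line_width(body_pixels=18, image_pixels=100)
far_width = limb_line_width(body_pixels=3, image_pixels=100)
print(near_width, far_width)
```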
In step S420, outputting a recognition result to determine whether abnormal information is outputted according to the recognition result is performed. In some embodiments, if the recognition result satisfies the abnormal state, such as the falling-down posture, the determination that the abnormal state occurs in the scene is made. At this time, the recognition method of human body posture 400 will send the abnormal information for administrators to review.
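The decision in step S420 can be sketched as a check of the recognition results against a set of abnormal postures. The falling-down posture is the example from the text; the label strings, the message fields, and the `abnormal_information` helper are hypothetical.

```python
from typing import Dict, List, Optional

# Postures treated as abnormal; only falling down is named in the text.
ABNORMAL_POSTURES = {"falling_down"}


def abnormal_information(results: List[str], scene: str) -> Optional[Dict[str, object]]:
    """Return a message for the administrator, or None when nothing is abnormal."""
    abnormal = [posture for posture in results if posture in ABNORMAL_POSTURES]
    if not abnormal:
        return None
    return {"scene": scene, "postures": abnormal}


alert = abnormal_information(["standing", "falling_down"], scene="platform camera")
no_alert = abnormal_information(["standing", "sitting"], scene="platform camera")
print(alert, no_alert)
```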
The training method for the posture recognition model is described below.
In some embodiments, the posture recognition model is trained by using the plurality of training images. Reference is further made to
In some embodiments, the processing device 220 executes the human body keypoint detection algorithm on the training images to obtain a plurality of trained skeleton images, such that each limb of every trained skeleton image includes the corresponding limb color.
In some embodiments, the processing device 220 tags each trained skeleton image with its corresponding recognition result. For example, an operating interface is provided for the administrator to select the trained skeleton image and to record the corresponding human body posture. The operating interface also shows the original training image for the administrator to confirm and to record the corresponding human body posture. The skeleton images that include the limb colors and are tagged with the corresponding recognition results are inputted into the training model. For example, a deep learning algorithm is applied to train the model. The processing device 220 trains on the trained skeleton images, which have the corresponding limb colors and the corresponding recognition results, to generate the posture recognition model.
In some embodiments, the processing device 220 computes a spatial feature from the pixel amount of the human body image of the trained skeleton image in each training image. The processing device 220 obtains the trained skeleton according to the plurality of key point coordinates of each training image and the spatial feature of the human body image. For example, one or more human bodies are included in the training image, and the human body image corresponding to each human body can be further obtained from the training image. In some embodiments, the depth of field data of the human body image includes or corresponds to the distance between the human body and the camera. In some embodiments, the distance between the human body and the camera can be estimated from the ratio of the pixel amount of the human body image to the pixel amount of the training image to obtain the spatial feature. The spatial feature can be the depth of field data of the human body image. In some embodiments, the processing device 220 adjusts the line width of the skeleton in the skeleton image of the human body image according to the depth of field data.
In some embodiments, if the depth of field data of the human body image indicates that the distance between the human body and the camera is far, the line width of the skeleton in the skeleton image of the human body image is widened. In other embodiments, if the depth of field data of the human body image indicates that the distance between the human body and the camera is close, the line width of the skeleton in the skeleton image of the human body image is narrowed.
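The tagged training data described above might be organized as in the following sketch before being handed to a training routine. The `TrainingSample` record, the file names, the posture labels, and the integer encoding are assumptions for illustration; the disclosure does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrainingSample:
    """One colored skeleton image together with the posture an administrator recorded."""
    skeleton_image: str  # hypothetical file name of the skeleton image
    posture: str         # hypothetical posture label


def encode_labels(samples: List[TrainingSample]) -> Tuple[List[str], List[int]]:
    """Map posture names to integer class indices, as classifiers usually expect."""
    classes = sorted({sample.posture for sample in samples})
    index = {name: i for i, name in enumerate(classes)}
    return classes, [index[sample.posture] for sample in samples]


samples = [
    TrainingSample("skeleton_0001.png", "standing"),
    TrainingSample("skeleton_0002.png", "falling_down"),
    TrainingSample("skeleton_0003.png", "standing"),
]
classes, labels = encode_labels(samples)
print(classes, labels)
```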
In some embodiments, the recognition method of human body posture 400 adjusts the size of the skeleton image with an equal proportion, such that the size-adjusted skeleton images are applied to train the posture recognition model.
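Resizing with an equal proportion means both sides are scaled by the same factor. A minimal sketch, assuming the target is given as the length of the longer side (the value 224 and the function name `scaled_size` are chosen only for the example):

```python
from typing import Tuple


def scaled_size(width: int, height: int, target_long_side: int) -> Tuple[int, int]:
    """Scale both sides by one factor so the longer side equals the target."""
    scale = target_long_side / max(width, height)
    # Keep every side at least one pixel after rounding.
    return max(1, round(width * scale)), max(1, round(height * scale))


new_width, new_height = scaled_size(640, 480, target_long_side=224)
print(new_width, new_height)
```

Because a single factor is used, the proportions of the skeleton are preserved, so a posture looks the same to the model before and after resizing.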
Some embodiments provide a non-transitory computer-readable storage medium storing multiple instructions. When the instructions are loaded into the processor or the processing device 220 in
As described above, the recognition system of human body posture and the recognition method of human body posture in the present disclosure apply the skeleton images which are acquired from the human body images to perform posture comparisons. When limbs or human bodies overlap, the conventional method of recognizing images by gray level cannot improve the efficiency. In contrast, because each limb of the skeleton image has a different color feature in the present disclosure, the accuracy of the visual recognition performed by the processing device can be improved. In addition, because the human body image is small when the human body is at a far position, the accuracy of the visual recognition performed by the processing device would decrease. Hence, in the present disclosure, the line width of the skeleton of a human body that is at a far position is widened according to the depth information of the human body image, such that the connection relationship between the limbs can be recognized more easily. Furthermore, because the skeleton image size is smaller than the training image size and the pending recognition image size, the computation time of training images and recognizing postures can be reduced, which increases the training and recognizing efficiency. Accordingly, the system and the method provided in the present disclosure, which apply the color feature of the limbs and the space information, have high processing efficiency and accuracy in training images and recognizing postures.
It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the present disclosure without departing from the scope or spirit of the disclosure. In view of the foregoing, it is intended that the present disclosure cover modifications and variations of this disclosure provided they fall within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
109138489 | Nov 2020 | TW | national |