The invention relates to the field of monitoring a human, such as a human operator.
Human error has been cited as a primary cause or contributing factor in disasters and accidents in many and diverse industries and fields. For example, traffic accidents involving vehicles are often attributed to human error and are one of the leading causes of injury and death in many developed countries. Similarly, it was found that distraction (e.g., mental distraction) of a worker affects performance at work and is one of the causes of workplace accidents.
Therefore, monitoring human operators, such as workers or drivers of vehicles, is an important component of accident analysis and prevention.
Image based systems are used to monitor human operators, such as drivers. Typically, these systems use infrared or near-infrared (IR) illumination and an infrared camera positioned on a vehicle's steering column, to enable capturing reflection of light coming from the eyes of the driver and to monitor the driver's state based on features of the eye, such as scleral and corneal reflection and pupil parameters.
Image based eye tracking typically requires a high resolution camera to be located very close to a person's eye (e.g., head mounted glasses or headsets). Image sensors located further away (e.g., ~40 cm to 120 cm) typically require IR illumination to capture corneal reflection. These systems calculate or estimate the eye gaze direction from reflection coming from the eyes and based on the geometry between the person's eye, the IR illumination source and image sensor. Thus, the position of the illumination source and image sensor is usually limited, and the system can typically only track the eyes of a single person.
Some additional problems associated with these systems include low light images (since the image sensor typically captures only about 8-12% of IR light) and reflections from glasses/sunglasses due to the IR illumination. Additionally, the use of a high resolution camera and many IR LEDs gives rise to high cost, high power consumption and possible heating issues of the system. Additionally, the use of many IR LEDs may pose potential eye safety issues, due to the high amount of IR illumination used. The use of IR illumination limits the use of these systems to dark environments and to short distance imaging because, under IR illumination, objects at a far distance will not be visible.
Embodiments of the invention provide systems and methods that enable accurate face and eye tracking from a distance (e.g., ~40 cm to ~5 m) without requiring constant IR illumination. IR illumination is used only in low lighting conditions or when sunglasses penetration is needed.
Thus, face and eye tracking, according to embodiments of the invention, utilize reduced IR illumination, providing a cost-effective system with lower power consumption, fewer heating issues and considerably fewer safety issues than existing systems.
Systems and methods according to embodiments of the invention provide well-lit images, with no reflections from glasses/sunglasses, enabling accurate monitoring of people.
Systems and methods according to embodiments of the invention do not require high resolution imaging and can operate at long distances, with flexible image sensor and illumination positioning. Thus, systems and methods according to embodiments of the invention can be used to monitor multiple people in a vehicle setting and for other purposes, as described herein.
The invention will now be described in relation to certain examples and embodiments with reference to the following illustrative drawing figures so that it may be more fully understood.
Embodiments of the invention provide systems and methods for monitoring one or a plurality of people in a space. The space may include an enclosed space, such as a vehicle cabin or a room and/or an open space such as a park.
Embodiments of the invention may be used to monitor people in general, such as, operators and passengers of vehicles, elderly or sick people, children, participants in a video call and users of devices, such as gaming and other devices.
The term “user” may refer to any person monitored according to embodiments of the invention.
A user may be monitored to determine the user's direction of gaze and/or to determine a general state of a user. The user's direction of gaze or general state of the user (e.g., the level of distraction of the user) can be indicative of a physiological or psychological condition of the user, such as illness, drowsiness, fatigue, anxiety, sobriety, inattentional blindness and readiness to take control of a machine (e.g., vehicle). The user's intention, while operating a machine, can also be deduced from monitoring the user, e.g., based on direction of gaze of the user's eyes.
Monitoring a person can include tracking the person throughout images of the space and/or using image processing techniques to determine the state of the person from the images of the space. Tracking of the head or face of a person, e.g., to detect head and/or eye movement, may be done by applying optical flow methods, histogram of gradients, deep neural networks or other appropriate detection and tracking methods. Other body parts may be tracked using similar optical flow methods.
The state of the person may sometimes be determined by running computer vision algorithms (e.g., face detection and/or eye detection algorithms) including machine learning and deep learning processes, to extract biometrics of the person. A human's head or face may be tracked in a set of images of the person and biometric parameters of the human can be extracted based on the tracking. In one embodiment biometric parameter values of a specific human obtained from a first set of images are used to represent the baseline or normal state of the human and may thus be used as a reference frame for biometric parameter values of that same human obtained from a second, later captured, set of images.
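For illustration only, the following sketch shows one possible way to track facial feature points across a set of images using OpenCV Haar-cascade face detection and Lucas-Kanade optical flow. The function names and parameter values are illustrative assumptions and are not part of the described system.

```python
# Illustrative sketch: track face corner features across frames with
# Lucas-Kanade optical flow (OpenCV). Parameter values are assumptions.
import cv2
import numpy as np

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def init_face_points(gray):
    """Detect a face and pick trackable corner points inside it."""
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    roi = gray[y:y + h, x:x + w]
    pts = cv2.goodFeaturesToTrack(roi, maxCorners=50, qualityLevel=0.01, minDistance=5)
    if pts is None:
        return None
    return pts + np.array([[x, y]], dtype=np.float32)  # shift to full-image coordinates

def track(prev_gray, gray, prev_pts):
    """Propagate the points to the next frame; keep only successfully tracked points."""
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, prev_pts, None)
    return nxt[status.ravel() == 1]
```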
People and body parts of people (e.g., limbs) may be detected by using object detection and/or motion detection and/or color detection algorithms. Also, machine learning models, such as support vector machines, may be used to detect humans and body parts of humans.
Parameters such as direction of gaze or posture or position of a person's head may be determined by applying appropriate algorithms (and/or combination of algorithms) on image data, such as motion detection algorithms, color detection algorithms, detection of landmarks, 3D alignment, gradient detection, support vector machine, color channel separation and calculations, frequency domain algorithms and shape detection algorithms.
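As a non-limiting illustration of how head posture may be estimated from detected landmarks and a 3D alignment step, the sketch below solves a perspective-n-point problem with OpenCV. The generic 3D model coordinates, the focal-length guess and the landmark ordering are assumptions, and the 2D landmarks are assumed to come from an upstream detector that is not shown.

```python
# Illustrative sketch: estimate head orientation from 2D facial landmarks via
# cv2.solvePnP against a generic 3D face model (values are assumptions).
import cv2
import numpy as np

MODEL_3D = np.array([         # generic model points (approx. mm), illustrative only
    (0.0, 0.0, 0.0),          # nose tip
    (0.0, -330.0, -65.0),     # chin
    (-225.0, 170.0, -135.0),  # left eye outer corner
    (225.0, 170.0, -135.0),   # right eye outer corner
    (-150.0, -150.0, -125.0), # left mouth corner
    (150.0, -150.0, -125.0),  # right mouth corner
], dtype=np.float64)

def head_pose(landmarks_2d, frame_size):
    """landmarks_2d: 6x2 array of pixel coordinates in the same order as MODEL_3D."""
    h, w = frame_size
    focal = w                                    # rough focal-length guess (assumption)
    cam = np.array([[focal, 0, w / 2],
                    [0, focal, h / 2],
                    [0, 0, 1]], dtype=np.float64)
    dist = np.zeros((4, 1))                      # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_3D,
                                  np.asarray(landmarks_2d, dtype=np.float64),
                                  cam, dist, flags=cv2.SOLVEPNP_ITERATIVE)
    return rvec, tvec                            # head rotation (Rodrigues vector) and translation
```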
Combinations of the above and similar techniques can be applied on images of one or more people in a space, in order to provide a monitoring algorithm, according to embodiments of the invention. Possibly, different monitoring algorithms can be applied in different instances, as further explained below.
In the following description, various aspects of the invention will be described. For purposes of explanation, specific configurations and details are set forth in order to provide a thorough understanding of the invention. However, it will also be apparent to one skilled in the art that the invention may be practiced without the specific details presented herein. Furthermore, well known features may be omitted or simplified in order not to obscure the invention.
Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining,” “detecting”, “identifying”, “extracting”, “obtaining”, “applying”, or the like, refer to the action and/or processes of a computer or computing system, or similar electronic computing device, that manipulates and/or transforms data represented as physical, such as electronic, quantities within the computing system's registers and/or memories into other data similarly represented as physical quantities within the computing system's memories, registers or other such information storage, transmission or display devices.
In one embodiment of the invention a system for monitoring a person includes a camera having an image sensor that can collect both visible and IR light. The camera can capture an image of a space which includes one or more people. A processor can determine visibility conditions from the image of the space and may control a device (such as the camera and/or illumination source) based on the determined visibility conditions. Typically, “visibility conditions” relate to conditions which enable detecting an eye of the person (or people) in the image. In some embodiments the visibility conditions enable detecting direction of gaze of the eye of the person (or people).
In one embodiment, which is schematically illustrated in
The processor then determines visibility of the eye of the person in the image, relative to a predetermined threshold (114) and controls a device based on the visibility of the eye relative to the predetermined threshold (116).
In one example, the device may be an illumination device and the processor controls the illumination device to illuminate at an intensity or wavelength, based on the visibility of the eye relative to the predetermined threshold. For example, as schematically illustrated in
In another example, which is schematically illustrated in
In one example, the illumination device is an IR illumination device, e.g., as described below. A processor may activate an IR illumination device when the visibility of the eye is below a threshold. This embodiment, among others, enables use of an illumination source (e.g., IR illumination source) only in low illumination conditions and/or low visibility conditions. The illumination source is not activated in high illumination and/or visibility conditions, thereby providing a safe to use and cost-effective solution.
In other examples the processor controls one or more illumination devices to provide a first illumination (e.g., a first wavelength, e.g., IR illumination) if visibility of the eye is below the predetermined threshold and to provide a second illumination (e.g., a second wavelength, e.g., visible light) if visibility of the eye is above the predetermined threshold. The first and second illuminations may differ in intensity, such that the processor changes the intensity of illumination of one or more illumination devices based on visibility of the eye in an image.
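A minimal sketch of this control step is shown below, assuming a normalized eye-visibility score and a hypothetical set_illumination driver function (neither is an API defined in this description): the first illumination is selected when the score falls below the threshold and the second illumination otherwise.

```python
# Illustrative sketch of the control step: compare an eye-visibility score
# against a threshold and pick an illumination mode. Names and values are
# hypothetical placeholders.
VIS_THRESHOLD = 0.5   # assumed normalized visibility threshold

def control_illumination(eye_visibility, set_illumination):
    if eye_visibility < VIS_THRESHOLD:
        # low visibility: first illumination (e.g., IR and/or higher intensity)
        set_illumination(mode="ir", intensity=1.0)
    else:
        # sufficient visibility: second illumination (e.g., visible light, lower intensity)
        set_illumination(mode="visible", intensity=0.3)
```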
In another example, which is schematically illustrated in
In some embodiments, devices at different locations in the space can be controlled based on the visibility of the eye relative to a predetermined threshold. For example, the first and second illumination source or first or second camera may be illumination sources or cameras located at different locations or positions within the space and/or may be directed in different angles.
As described above, a processor may control imaging parameters (e.g., illumination and/or camera sensitivity) based on visibility of an eye.
Visibility of the eye is a measure that may be dependent on parameters of the image which includes the eye. For example, parameters of an image that may affect visibility of an eye include the number of pixels corresponding to the eye and/or the pixel values of the pixels corresponding to the eye. In these cases, the predetermined threshold may be a predetermined number of pixels (which may include an average or percentage or other statistical representation) and/or a predetermined value. Visibility of the eye may be calculated from one or a combination of parameters of the image of the eye.
Thus, in one embodiment, a processor may detect a human face and/or eye in an image (e.g., by using known methods, such as OpenCV face detection) and may then determine which pixels correspond to the eye (e.g., by using segmentation methods) to determine if the number and/or value of the pixels corresponding to the eye is below the predetermined threshold.
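By way of illustration, the sketch below estimates eye visibility from the number and mean value of pixels detected as the eye, using OpenCV Haar cascades for face and eye detection; the pixel-count and pixel-value thresholds are illustrative assumptions.

```python
# Illustrative sketch: estimate eye visibility from the number and brightness
# of eye pixels, using OpenCV Haar cascades. Thresholds are assumptions.
import cv2

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

def eye_visibility(frame_bgr):
    """Return (pixel_count, mean_value) for the first detected eye, or (0, 0.0)."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.1, 5):
        face_roi = gray[y:y + h, x:x + w]
        eyes = eye_cascade.detectMultiScale(face_roi, 1.1, 5)
        if len(eyes) > 0:
            ex, ey, ew, eh = eyes[0]
            eye_roi = face_roi[ey:ey + eh, ex:ex + ew]
            return eye_roi.size, float(eye_roi.mean())
    return 0, 0.0

def below_threshold(frame_bgr, min_pixels=400, min_mean=40):
    count, mean_val = eye_visibility(frame_bgr)
    return count < min_pixels or mean_val < min_mean
```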
Parameters of an image, which may affect visibility of the eye, may be influenced by the environment, such as by the wavelength or intensity of ambient light. In other cases, visibility of the eye may be affected by actions of the person being imaged, for example, when a person moves his face away from the illumination source and/or from the camera. In this case, the processor may detect, for example, an area in the face that is less visible and may control an illumination device and/or camera which are located at an appropriate location and/or angle within the space to enable illuminating the less visible area and/or capturing an image of the less visible area, to provide another image of the person and to increase the visibility of the person's eye.
Typically, the processor has access to location and/or positioning information of the illumination devices and/or cameras. Location and/or positioning information may include real-world coordinates or other location marks within the space or information regarding locations and/or angles of the camera relative to the illumination device or to another location reference. Such information may be input by a user or may be calculated, e.g., based on prior calibration or by using image processing techniques to determine distances of objects within images.
The processor may also determine location and/or positioning information of the person within the space (and/or in relation to the illumination devices and/or cameras), e.g., by user input and/or by applying image processing techniques to determine locations of objects within images. The processor may then use the location and/or positioning information of the person and the location and/or positioning information of the illumination devices and/or cameras to calculate which illumination device and/or camera (located at known or calculated locations) are appropriate to enable illuminating less visible areas.
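One possible, simplified way to use such location and positioning information is sketched below: each candidate illumination device (or camera) is scored by how directly it points at the low-visibility region, and the best-scoring device is selected. The device record fields and the cosine-based score are illustrative assumptions.

```python
# Illustrative sketch: pick the device whose known position/angle best covers a
# target region (e.g., a low-visibility face area). Field names are assumptions.
import math

def pick_device(devices, target_xyz):
    """devices: list of dicts with 'position' (x, y, z) and a unit 'direction' vector."""
    best, best_score = None, -1.0
    for d in devices:
        to_target = [t - p for t, p in zip(target_xyz, d["position"])]
        norm = math.sqrt(sum(v * v for v in to_target)) or 1.0
        to_target = [v / norm for v in to_target]
        # cosine between device pointing direction and direction to the target:
        score = sum(a * b for a, b in zip(to_target, d["direction"]))
        if score > best_score:
            best, best_score = d, score
    return best
```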
In another example a person may put an accessory, such as glasses (e.g., sunglasses) over his eyes, thereby lowering visibility of his eyes.
In one embodiment, which is schematically illustrated in
In one embodiment, the device is an IR illumination device (e.g., a LED or other light source illuminating within the infra-red and near infra-red wavelength range (referred to as IR), for example, 850 nm or 940 nm or within this range). In this embodiment a processor obtains an image of at least a face of a person and detects, in the image, an accessory, such as sunglasses in vicinity of the face of the person. The processor may then operate the IR illumination device based on detection, in the image, of the sunglasses in vicinity of the face of the person.
In one embodiment the processor is configured to turn on/off the IR illumination device, based on the detection of the sunglasses. Thus, IR illumination may be turned on when sunglasses are detected in the image and turned off when sunglasses are not detected in the image. In another embodiment the processor can turn on/off the IR illumination device based on the detection of the sunglasses and based on a location and/or positioning of the IR illumination device in the space. For example, the processor may turn on an IR illumination device that is located within the space at a location and/or angle that enables illuminating the person's face (when sunglasses are detected in the image), whereas the processor can turn off an IR illumination device that is located at a location and/or angle within the space that will not enable illuminating the person's face. The processor may determine which location and/or positioning of which illumination device enables illuminating the person's face, e.g., as described above.
In another embodiment the processor is configured to change intensity of illumination of the IR illumination device, based on the detection of the sunglasses, typically increasing intensity when sunglasses are detected in the image.
In an embodiment, which is schematically illustrated in
For example, the processor may determine that the visibility of the eye is below a predetermined threshold when sunglasses are detected in vicinity of the person's face in an image, and may control a camera and/or illumination device to create imaging parameters which will increase visibility of the eye. For example, a specific IR illumination intensity may be used if sunglasses are detected, so as to penetrate the sunglasses to be able to obtain data regarding pixels associated with the eyes. In other embodiments, different IR illumination intensities may be used, until the eyes are clearly visible for processing.
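The idea of trying different IR intensities until the eyes become clearly visible can be sketched as a simple search loop, shown below; set_ir_intensity, grab_frame and eye_visibility_score are hypothetical helpers standing in for the illumination driver, image capture and visibility measure described above.

```python
# Illustrative sketch: step IR intensity upward until the eye-visibility score
# passes a threshold. Helper names and values are hypothetical placeholders.
def find_working_ir_intensity(set_ir_intensity, grab_frame, eye_visibility_score,
                              threshold=0.5, step=0.1, max_intensity=1.0):
    intensity = step
    while intensity <= max_intensity:
        set_ir_intensity(intensity)
        frame = grab_frame()
        if eye_visibility_score(frame) >= threshold:
            return intensity          # eyes clearly visible at this intensity
        intensity += step
    return None                       # eyes not recovered even at full intensity
```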
Accessories such as sunglasses may be detected based on their shape and/or color, using appropriate object detection algorithms. In some embodiments, glasses may be detected in vicinity of a person's face when there is an overlap between a detected face and an object determined to be glasses. In other embodiments, glasses are determined to be in vicinity of a person's face when an object determined to be glasses is located in an area of the eyes within a detected face. Other methods of determining vicinity of accessories to a person's face in an image, may be used.
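A simplified sketch of the overlap-based vicinity test is shown below: glasses are considered to be in vicinity of the face when most of the glasses bounding box falls inside an assumed eye band of the face bounding box (here taken as roughly 20%-45% of the face height, an illustrative assumption).

```python
# Illustrative sketch: decide whether detected glasses are "in vicinity of" a
# face by checking overlap between the glasses box and an assumed eye band of
# the face box. Bounding boxes are (x, y, w, h); fractions are assumptions.
def overlap_area(a, b):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = min(ax + aw, bx + bw) - max(ax, bx)
    dy = min(ay + ah, by + bh) - max(ay, by)
    return dx * dy if dx > 0 and dy > 0 else 0

def glasses_near_face(face_box, glasses_box):
    fx, fy, fw, fh = face_box
    eye_band = (fx, fy + int(0.20 * fh), fw, int(0.25 * fh))  # assumed eye region of the face
    inter = overlap_area(eye_band, glasses_box)
    return inter > 0.5 * (glasses_box[2] * glasses_box[3])    # majority of glasses box in eye band
```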
In some embodiments, a processor detects the illuminance (e.g., by determining lux) in the space and determines the visibility of the eye relative to a predetermined threshold, based on the illuminance in the space.
In the example schematically illustrated in
In some embodiments illuminance under a predetermined value may indicate that visibility of the eye is below a predetermined threshold, and an illumination source can be controlled to illuminate at an intensity to increase illuminance in the space to a minimal illuminance required to enable visibility of the person's eye above the predetermined threshold.
In one embodiment a processor can turn on/off an IR illumination device, based on visibility of the eye. Typically, the processor will turn on the IR illumination device when illuminance in the space is low (e.g., at night) and will turn off the IR illumination device when illuminance is high. In some embodiments a processor can change intensity of illumination of the IR illumination device, based on visibility of the eye. Typically, the intensity of illumination will be increased in low illuminance conditions and will be decreased in high illuminance conditions.
Illuminance in the space can be calculated, for example, by applying image analysis algorithms on the image to determine the intensity of the image and taking into account specifications of the camera capturing the image (e.g., shutter opening time, sensitivity of the sensor, etc.).
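A rough, uncalibrated sketch of such a calculation is given below: mean image brightness is normalized by exposure time and gain to obtain an illuminance proxy. The scaling constant and the comparison threshold are illustrative assumptions rather than a true lux computation.

```python
# Illustrative sketch: a scene-brightness proxy from mean pixel value,
# normalized by exposure time and gain. The constant k is an assumption.
import cv2

def approx_illuminance(frame_bgr, exposure_s, gain, k=250.0):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    mean_val = float(gray.mean()) / 255.0          # normalized image brightness
    return k * mean_val / (exposure_s * gain)      # brighter scene or shorter exposure -> higher estimate

def needs_more_light(frame_bgr, exposure_s, gain, min_lux=50.0):
    return approx_illuminance(frame_bgr, exposure_s, gain) < min_lux
```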
In other embodiments detecting illuminance in the space can include receiving data from a lux meter, as further detailed below.
In an embodiment, which is schematically illustrated in
The combined value may be calculated, for example, by assigning a weight to the visibility of the eye of each person (e.g., based on location of the person within the space) and adding the weighted visibility values. For example, if it is desired to monitor people located at a rear of a vehicle cabin, the visibility of eye of each person can be assigned a higher weight if the person is determined to be located at the rear, than if the person is determined to be located in the front of the cabin. The combined value of visibility for people at the rear of the cabin will thus be higher, causing higher illumination of the rear and possibly activation of cameras located and/or positioned so as to capture the eyes of people at the rear of the cabin with increased visibility. Thus, according to embodiments of the invention, differently located illumination devices and/or different illumination conditions and/or differently located cameras and/or different sensitivities of cameras may be used to obtain an image which provides the highest visibility of eyes of a desired group of people. In other embodiments, different images may be obtained, each of the images captured in different illumination conditions and/or using different cameras, each of the images showing a highest visibility of a different person or group of people in the space.
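For illustration, the weighted combination could be computed as sketched below, where the per-location weights and the dictionary structure are assumptions chosen to reflect the rear-of-cabin example above.

```python
# Illustrative sketch: combine per-person eye-visibility scores into one value
# using location-based weights (rear-seat occupants weighted higher). The
# weights and location labels are assumptions.
def combined_visibility(people, weights=None):
    """people: list of dicts such as {'visibility': 0.7, 'location': 'rear'}."""
    weights = weights or {"rear": 1.0, "front": 0.3}
    return sum(weights.get(p["location"], 0.5) * p["visibility"] for p in people)

# Example: rear occupants dominate the combined value, driving rear illumination.
# combined_visibility([{'visibility': 0.4, 'location': 'rear'},
#                      {'visibility': 0.9, 'location': 'front'}])
```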
For example, it may be advantageous to know in which direction a group of people are looking. E.g., if all passengers in a vehicle cabin are looking in the same direction, this could assist in understanding the scene and conditions in the vehicle cabin. Similarly, if most people in a space such as a classroom or conference room are looking at a specific gaze target, this could assist in understanding the scene in the classroom or conference room. Similarly, determining a gaze direction of one or more people interacting with a service device (such as a robot in a store or a user interface device displaying instructions, advertisements, etc.) can assist the service device in better understanding the nature of the interaction with the person.
In one embodiment, different types of people can be monitored in a space, thereby providing a better understanding of the scene in the space. In one embodiment, a system for monitoring a space includes a camera to capture an image of the space and a processor to detect first and second person types in the image of the space. The processor can apply a first monitoring algorithm on the first person type and a second monitoring algorithm on the second person type and may output a status of the space based on results of the first monitoring algorithm and the second monitoring algorithm.
The status of the space may include information regarding behavior of the people in the space (e.g., compliance with rules, peaceful/violent behavior, etc.) or information regarding the mood of people in the space or other information regarding the people being monitored in the space.
Different person types may include, for example, service providers and service receivers, such as a driver and passengers in a vehicle, a cashier and customers in a shop, a teacher and students in a classroom, etc. The processor may detect the first person type and the second person type based on, for example, the location of the first person type and second person type in the space. E.g., a driver in a vehicle will typically be located at the front of the vehicle while the passengers will be located in other places in the cabin. A cashier will typically be located behind a cash register whereas customers will be located on the other side of the cash register, etc. Alternatively, or in addition, the processor can detect the first person type and the second person type based on appearance of the first person type and second person type in the image of the space. E.g., a driver or other service provider may be wearing a uniform and can thus be detected based on the different colors of his uniform and/or shape of his hat, etc.
In one embodiment, which is schematically illustrated in
The first and second monitoring algorithms may differ in the specific techniques and/or in the order of steps performed to monitor the person. For example, a driver can be monitored using a monitoring algorithm that determines the driver's awareness (e.g., by monitoring eye blinks) whereas the passengers may be monitored using a monitoring algorithm that determines the passengers' mood (detecting smiles, yawns, etc.).
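A simplified dispatch of this kind is sketched below, assuming the driver is identified by seat position in the image; classify_by_location, monitor_awareness and monitor_mood are hypothetical placeholders standing in for the first and second monitoring algorithms.

```python
# Illustrative sketch: classify each detected person as driver vs. passenger by
# position in the cabin image, then dispatch to a different monitoring routine.
def classify_by_location(bbox, frame_width):
    x, y, w, h = bbox
    cx = x + w / 2
    # assumption: the driver occupies the left third of the cabin image
    return "driver" if cx < frame_width / 3 else "passenger"

def monitor_people(detections, frame_width, monitor_awareness, monitor_mood):
    results = []
    for bbox in detections:
        if classify_by_location(bbox, frame_width) == "driver":
            results.append(("driver", monitor_awareness(bbox)))   # e.g., blink-based awareness
        else:
            results.append(("passenger", monitor_mood(bbox)))     # e.g., smile/yawn-based mood
    return results
```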
The camera 611 typically includes a CCD or CMOS or other appropriate image sensor. The camera 611 may be part of a 2D or 3D camera, for example, part of a standard camera provided with mobile devices such as smart phones or tablets. In one embodiment several image sensors may be used to obtain a 3D or stereoscopic image of person 615. In one embodiment camera 611 obtains images at a high frame rate (e.g., 30 frames per second or higher) to achieve real-time imaging.
In some embodiments, camera 611 includes an image sensor capable of capturing a plurality of illumination wavelengths, e.g., capable of capturing both IR and visible light. Camera 611 may include a bandpass filter 11 that enables visible light to pass, optionally together with a bandpass around near infrared light (e.g. 850 nm or 940 nm).
In some embodiments the system 600 includes one or more illumination sources 613 such as an infra-red or near infra-red (also referred to as IR) illumination source (e.g., an LED illuminating at 850 nm or 940 nm). The use of an IR illumination source enables obtaining image data of the space even in low lighting conditions, e.g., at night, and when visibility of a person's eye is low.
Typically, both illumination source 613 and camera 611 are in communication with a processor 610 and one or more memory unit(s) 612, and may transmit data to the processor 610 and/or be controlled by signals generated by the processor 610.
Processor 610 may also be in communication with an illuminance sensor such as lux meter 619, that can measure the illuminance in space 614, possibly in specific locations within space 614, e.g., in vicinity of the person 615 and/or in vicinity of the person's face or eyes. Data provided from the lux meter 619 can be used by processor 610 to calculate if illumination from illumination source 613 is below a threshold, and if so, processor 610 can control the illumination source 613 to illuminate at an intensity to bring the lux in the space 614 to a minimal illuminance required to enable visibility of the person's eye above a predetermined threshold.
Communication between components of the system 600 and/or external components (such as lux meter 619 and devices that can be controlled according to embodiments of the invention) may be through wired or wireless connection. For example, the system 600 may include an internet connection.
Processor 610 may include, for example, one or more processors and may be a central processing unit (CPU), a digital signal processor (DSP), a Graphical Processing Unit (GPU), a microprocessor, a controller, a chip, a microchip, an integrated circuit (IC), or any other suitable multi-purpose or specific processor or controller.
In some embodiments processor 610 is a dedicated unit. In other embodiments processor 610 may be part of an already existing processor, such as a vehicle processor. For example, the processor 610 may be one core of a multi-core CPU already existing in a vehicle, such as in the vehicle IVI (In-Vehicle Infotainment) system, telematics box of the vehicle, domain controller or another processor associated with the vehicle.
Processor 610 may be locally embedded or remote, e.g., cloud-based.
Memory unit(s) 612 may include, for example, a random access memory (RAM), a dynamic RAM (DRAM), a flash memory, a volatile memory, a non-volatile memory, a cache memory, a buffer, a short term memory unit, a long term memory unit, or other suitable memory units or storage units.
According to some embodiments, image data may be stored in memory unit 612. Typically, memory unit 612 stores executable instructions that, when executed by the processor 610, facilitate performance of operations of the processor 610, as described herein.
In one embodiment the processor 610 determines visibility conditions in an image of the person 615 captured by camera 611, namely, conditions which enable detecting the eye of the person 615. Processor 610 may then control a device (such as camera 611, illumination source 613 or other devices) based on the visibility conditions.
In some embodiments the visibility conditions enable extracting biometric parameters from images captured by camera 611. Biometric parameters, which may be indicative of a person's state, include, inter alia, eye related parameters, such as one or more of pupil direction, pupil diameter, blink frequency, blink length and percentage of eyelid closure (PERCLOS).
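By way of illustration, blink frequency and PERCLOS can be derived from a per-frame eye-openness signal as sketched below; the openness signal, the frame rate and the closure thresholds are assumptions, and the upstream eye detection producing the signal is not shown.

```python
# Illustrative sketch: derive blink frequency and PERCLOS from a per-frame
# eye-openness signal (0 = fully closed, 1 = fully open). Thresholds are assumptions.
import numpy as np

def blink_and_perclos(openness, fps, closed_thr=0.2, perclos_thr=0.8):
    openness = np.asarray(openness, dtype=float)
    closed = openness < closed_thr
    # count closed -> open transitions as completed blinks
    blinks = int(np.sum(~closed[1:] & closed[:-1]))
    duration_s = len(openness) / fps
    blink_freq = blinks / duration_s if duration_s else 0.0
    # PERCLOS: fraction of time the eyelid covers at least ~80% of the eye
    perclos = float(np.mean(openness < (1.0 - perclos_thr)))
    return blink_freq, perclos
```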
In other embodiments the visibility conditions enable detecting direction of gaze of the eye of the person 615.
System 600 can make use of an existing computing system (e.g., as detailed above), even one which is not equipped with IR illumination and other hardware that might otherwise be required for eye detection and/or gaze direction tracking. Another advantage of system 600 is that processor 610 can run separately from the camera 611 and/or illumination source 613 and can thus be applied to an already existing camera/illumination system and/or allows flexibility in choosing the camera and/or illumination source and in locating them within the space 614.
System 600 may be used for gaming purposes, e.g., when tracking a person's gaze direction is desired, and processor 610 may control functions of the gaming device. System 600 may be used for medical purposes and may control a medical device. System 600 may be used for security purposes, e.g., when viewing a person's eye is required, and when communicating with a robot or personal assistant, e.g., when detecting a gaze direction of a person is useful. System 600 may be used in advertisements and for TVs and home appliances and in other applications where devices can be controlled based on a person's direction of gaze and/or based on biometric parameters of the person, e.g., eye related biometric parameters.
System 600 and the methods described above enable accurate face and eye tracking from a distance (e.g., ~40 cm to ~5 m) without requiring constant and high intensity IR illumination, thereby providing a cost-effective system with low power consumption, few heating issues and few safety issues for a user.
Filing Document: PCT/IB20/50342; Filing Date: 1/16/2020; Country: WO; Kind: 00

Number: 62793390; Date: Jan 2019; Country: US