The present disclosure generally relates to a method and apparatus for early detection of dynamic attentive states and for providing an inattentive warning. More specifically, the present disclosure relates to a method and apparatus for early detection of dynamic attentive states based on an operator's eye movements and surround features for providing inattentive warning.
Conventionally, attention allocation models based on saliency, effort, expectancy, and value have been used in selective attention research and have been applied mainly in aviation. The attention allocation of airplane pilots during flight-related tasks such as aviating, navigating, and landing has conventionally been studied using secondary tasks such as monitoring in-flight traffic displays and communicating with air traffic control centers.
Moreover, a variation of this approach has been tested in surface driving situations to analyze required attention levels for proper maneuvers while engaged in secondary in-vehicular tasks. Such conventional approaches describe selective attention models to predict the attention of an operator to static areas of interest (AOIs) in operation of the vehicle and secondary in-vehicle tasks.
The inventors discovered that these conventional approaches do not provide accurate predictions of the operator perception for complex environment events. Also, these conventional approaches are not capable of predicting how the operator would react to the occurrence of an unperceived event.
The present disclosure provides a method and apparatus for early detection of dynamic attentive state based on an operator's eye movements and surround features for providing an inattentive warning.
According to an embodiment of the present disclosure, there is provided a method and apparatus for determining an inattentive state of an operator of a vehicle and for providing information to the operator of the vehicle by obtaining, via a first camera, facial images of the operator of the vehicle, obtaining, via a second camera, images of an environment of the vehicle, determining one or more areas of interest in the environment of the vehicle based on the images of the environment of the vehicle, obtaining, from a relevance and priority database, relevance and priority values corresponding to the one or more areas of interest, determining a probability of attention of the operator of the vehicle to the one or more areas of interest based on the images of the environment of the vehicle and the relevance and priority values, determining an attention deficiency based on the determined probability of attention and the facial images, and providing, via a warning/guidance device, the information to the operator of the vehicle based on the determined attention deficiency.
The foregoing paragraphs have been provided by way of general introduction, and are not intended to limit the scope of the following claims. The described embodiments, together with further advantages, will be best understood by reference to the following detailed description taken in conjunction with the accompanying drawings.
A more complete appreciation of the embodiments described herein, and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:
According to one embodiment, there is provided a method and apparatus to predict the allocation of attention to multiple dynamic AOIs in the environment of an operator, thus providing an attention estimate for the external events.
According to one embodiment, a method and apparatus is provided to predict how the operator would react to the occurrence of an unexpected event in the environment once perceived without prior attention.
The present disclosure also describes the countermeasures for possible erratic actions as a result of such unexpected events.
According to one embodiment, a method is described to estimate the attention on multiple dynamic AOIs in the environment of the vehicle operation, learn normal/ideal operator scanning behavior for different AOIs, and predict inattentiveness by thresholding learned values against observed values.
According to one embodiment, a method is described to learn operator's reaction patterns to unexpected events and issue variable active warnings based on the predictions.
According to one embodiment, there is provided a method and apparatus that is capable of warning an operator in fail-to-look and look-but-fail-to-see situations.
Referring now to the drawings, wherein like reference numerals designate identical or corresponding parts throughout the several views,
In step S301, the operator's eye movements are detected and the operator's eye movement parameter values are determined based on the face images. The operator's eye movement parameter values may include the operator's gaze pointers to the scene and the dwell time on different areas of the scene. To compute the gaze pointer and dwell time, features in the eye region such as iris and pupil location are used. A gaze vector can be generated using parameters such as the location of the iris and the angle to the iris calculated with respect to the optical axis: a line is extended from the iris, at the computed angle from the optical axis, to the outside scene as observed by the forward roadway camera. The dwell time can then be derived from the rate of change of the angle to the optical axis.
The operator's face images are recorded using a camera. The camera may be mounted on the dashboard or may be included in the rear view mirror assembly or any other location inside the vehicle such that the camera can capture the face images of the driver.
According to one embodiment, the operator's eye movements are detected by detecting facial features, such as eye corners and upper and lower eyelid features, from the face image. Feature point extraction algorithms may be used to extract these features. Iris and pupil locations are then detected and extracted based on the detected eye corners and upper and lower eyelid features. The iris and/or pupil location coordinates are then mapped onto the external forward images to determine the gaze pointers in the external environment. Finally, gaze fixation dwell times are computed for the different regions that represent AOIs in the scene.
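The final part of step S301, accumulating dwell time per AOI, can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `AOI` bounding-box representation, the `dwell_times` function, and the constant frame period are all assumptions introduced here, and the gaze pointers are assumed to be already mapped into forward-image coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """Axis-aligned region of the forward image, in pixels (hypothetical representation)."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def dwell_times(gaze_points, aois, frame_period_s):
    """Accumulate gaze dwell time (seconds) per AOI.

    gaze_points: iterable of (x, y) gaze pointers, one per face-camera frame,
                 already mapped into forward-image coordinates (assumption).
    frame_period_s: time between consecutive frames, assumed constant.
    """
    totals = {aoi.name: 0.0 for aoi in aois}
    for x, y in gaze_points:
        for aoi in aois:
            if aoi.contains(x, y):
                # Each frame whose gaze pointer falls inside the AOI adds one frame period.
                totals[aoi.name] += frame_period_s
    return totals


aois = [AOI("traffic_light", 100, 0, 140, 60), AOI("pedestrian", 300, 200, 360, 320)]
gaze = [(120, 30), (125, 35), (310, 250), (500, 500)]
result = dwell_times(gaze, aois, frame_period_s=0.1)
```

In this example two gaze samples land on the traffic light, one on the pedestrian, and one outside every AOI, so the last sample contributes to no dwell time.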
In step S303, features are extracted in the environment recorded by cameras mounted on the vehicle, facing the forward or backward roadway and/or the periphery. These features are then used to define and segment AOIs in the scene, e.g., traffic lights, traffic signs, other vehicles on the road, pedestrians, cyclists, or animals. The linear distance from each AOI to the vehicle is also computed. For example, for AOI-3 115 shown in
where d is the longitudinal displacement and θ is the angle between a forward line-of-sight and the line-of-sight corresponding to AOI-3, as shown in
γ is later used in step S309 to compute the probability of attention to a given AOI.
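As a purely geometric illustration of how a linear distance could follow from d and θ, the sketch below assumes that d is the longitudinal leg of a right triangle and that the linear distance γ is its hypotenuse, so that γ = d / cos θ. This relation is an assumption made for the example, not a formula taken from the disclosure, and the function name `linear_distance` is hypothetical.

```python
import math


def linear_distance(d: float, theta_rad: float) -> float:
    """Straight-line distance to an AOI under the ASSUMED relation gamma = d / cos(theta).

    d: longitudinal displacement to the AOI (same unit as the result).
    theta_rad: angle between the forward line-of-sight and the line-of-sight to the AOI.
    """
    if not -math.pi / 2 < theta_rad < math.pi / 2:
        raise ValueError("theta must lie strictly between -90 and +90 degrees")
    return d / math.cos(theta_rad)


straight_ahead = linear_distance(20.0, 0.0)            # AOI directly ahead
off_axis = linear_distance(10.0, math.radians(60.0))   # AOI 60 degrees off the forward axis
```

Under this assumption an AOI directly ahead has γ equal to d, and γ grows as the AOI moves toward the periphery.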
In step S305, saliencies in each AOI are extracted and a saliency map of the environment is built from the images obtained with cameras facing the forward or backward roadway and/or the periphery. Image analysis algorithms may be applied to detect motion, color intensity, and/or texture of different objects to determine the saliency levels of that object. Steps S301, S303, and S305 can be independent and performed simultaneously by the controller.
In step S307, micro analysis of saliency variations is performed to detect events that occur within the AOI boundary, and the frequency of the detected events is recorded. According to one embodiment, pattern analysis algorithms are applied for each identified AOI to detect the events that occur within the AOI boundary. These events are segmented and their frequency of occurrence is computed.
As an example, frequency analysis of a traffic signal light is performed by using the detection results of step S303 to detect a traffic light box. Further segmentations are done to separate individual light positions. Blinking, solid state, or changing frequencies of these lights are then recorded.
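The recording of blinking frequencies for one segmented light position can be sketched as follows. The per-frame boolean representation of the light state and the function names are assumptions for illustration. An event is taken to be an off-to-on transition, which is one possible reading of "event" here.

```python
def count_onsets(states):
    """Count off->on transitions in a per-frame sequence of booleans.

    A light that is already lit in the first frame counts as one onset,
    so a solid light yields exactly one event.
    """
    onsets = 0
    previous = False
    for lit in states:
        if lit and not previous:
            onsets += 1
        previous = lit
    return onsets


def event_frequency(states, frame_period_s):
    """Onsets per second over the observed frames."""
    duration = len(states) * frame_period_s
    if duration <= 0:
        raise ValueError("need at least one frame and a positive frame period")
    return count_onsets(states) / duration


blinking = [False, True, True, False, True, False, False, True, False, False]
solid = [True] * 10
blink_hz = event_frequency(blinking, frame_period_s=0.1)
solid_hz = event_frequency(solid, frame_period_s=0.1)
```

A blinking light produces a higher event frequency than a solid one over the same window, which is the distinction the frequency record is meant to capture.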
In step S309, relevance and priority values stored in a database, and the saliency, frequency, and linear distance values computed in steps S303, S305, and S307, are used to compute a probability of attention to each AOI in the scene. According to one embodiment, a probability of attention to $AOI_i$ is determined by:
where the B, R, and P parameters indicate the bandwidth, relevance, and priority values for $AOI_i$, and $\gamma_{i,t}$ is the displacement parameter for $AOI_i$ at time t. Bandwidth is computed based on the frequency of events computed in step S307.
The bandwidth can be computed as a summation of frequency of occurrence of events in a given AOI. Thus, for a given sampling time T, the bandwidth B can be given as,
where $t_0$ is the start of the sampling time and n denotes the number of events (E) in the AOI. For example, in normal operation of a traffic light, alternating occurrences of each light event will be observed within the sampling time T. However, in a priority situation, one light event, most likely red or yellow, will blink frequently. In such a case, the higher frequency of blinks, compared with normal operation of a traffic light, corresponds to a higher bandwidth. In the case of an emergency or high-priority vehicle, such as a patrol car or an ambulance, multiple events corresponding to multiple lights may blink simultaneously, producing a higher bandwidth than in the situations described above. See
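The bandwidth computation described above can be sketched as follows. The sketch assumes that the frequency of each event type is its number of occurrences within the sampling window divided by T, and that B is the sum of those frequencies. The function name `bandwidth` and the `(timestamp, label)` event representation are hypothetical.

```python
from collections import Counter


def bandwidth(events, t0, T):
    """Sum of per-event-type frequencies within the window [t0, t0 + T).

    events: iterable of (timestamp_s, event_label) pairs for one AOI.
    Frequency of an event type is ASSUMED to be occurrences / T.
    """
    if T <= 0:
        raise ValueError("sampling time T must be positive")
    counts = Counter(label for ts, label in events if t0 <= ts < t0 + T)
    return sum(count / T for count in counts.values())


# Normal traffic light: each light event occurs once in a 10 s window.
normal = [(0.0, "green"), (4.0, "yellow"), (5.0, "red")]
# Priority situation: one light blinks once per second.
priority = [(float(i), "red") for i in range(10)]
# Emergency vehicle: two lights blink simultaneously, twice per second.
emergency = [(i * 0.5, color) for i in range(20) for color in ("red", "blue")]

b_normal = bandwidth(normal, t0=0.0, T=10.0)
b_priority = bandwidth(priority, t0=0.0, T=10.0)
b_emergency = bandwidth(emergency, t0=0.0, T=10.0)
```

With these illustrative inputs the bandwidth increases from the normal case to the priority case to the emergency case, matching the ordering described in the text.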
Relevance and priority values can be obtained from pre-computed datasets and stored in a database. Saliency values are obtained in step S305, and the visual displacement parameter $\gamma_{i,t}$ is obtained in step S303.
The database values of R and P may be pre-estimated for different traffic situations and different objects that correspond to real world AOIs. This may be done by, e.g., a survey of experienced operators who evaluate relevance and priority values for different objects in the scene and different traffic conditions. For example, in a given intersection scenario, expert operators evaluate relevance and priority values of real world AOIs such as other vehicles, different types of traffic lights, traffic signs, pedestrians, or animals. Median and standard deviation values for these objects are then computed and stored in the database.
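The aggregation of survey responses into database entries can be sketched with the Python standard library. The rating scale, the object names, and the use of the sample (rather than population) standard deviation are assumptions made for this example.

```python
from statistics import median, stdev


def summarize_ratings(ratings):
    """Reduce per-object survey scores to a median and a standard deviation.

    ratings: dict mapping an object name to a list of at least two numeric scores.
    stdev() is the SAMPLE standard deviation; the source does not say which form is used.
    """
    return {
        name: {"median": median(scores), "std": stdev(scores)}
        for name, scores in ratings.items()
    }


# Hypothetical relevance scores (0-10) from five operators for an intersection scenario.
relevance_ratings = {
    "traffic_light": [8, 9, 7, 9, 10],
    "pedestrian": [10, 10, 9, 10, 9],
}
summary = summarize_ratings(relevance_ratings)
```

The same reduction would be applied separately to the priority scores, and the resulting values stored per object and traffic situation.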
In step S311, attention deficiency level is computed based on a currently observed attention level and an ideally expected attention level. An operator's reaction to unexpected events is also predicted based on the degree of attention deficiency. According to one embodiment, attention deficiency level φ is:
$$\varphi = P\left(AOI_{t,i}^{ob}\right) - P\left(AOI_{t,i}^{id}\right) \qquad (2)$$
where $P(AOI_{t,i}^{ob})$ is the observed attention level to $AOI_i$ at time t, and $P(AOI_{t,i}^{id})$ is the ideal attention level to $AOI_i$ at time t derived for a similar traffic situation. The attention level to a given AOI has a positive correlation with eye gaze dwell time on that AOI. Therefore, according to one embodiment, the observed level of attention $P(AOI_{t,i}^{ob})$ to $AOI_i$ at time t is the average eye gaze dwell time computed for $AOI_i$ at time t.
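Equation (2) and the dwell-time-based observed attention level can be sketched as follows. The function names are hypothetical, the ideal attention level is taken as a given input, and both levels are assumed to be expressed on a common scale.

```python
from statistics import mean


def observed_attention(dwell_samples_s):
    """Average gaze dwell time on one AOI over the measurement window.

    An empty list is treated as no gaze on the AOI (assumption).
    """
    return mean(dwell_samples_s) if dwell_samples_s else 0.0


def attention_deficiency(observed, ideal):
    """Equation (2): observed minus ideal attention level.

    A negative result means less attention was paid than the ideal level.
    """
    return observed - ideal


p_observed = observed_attention([0.2, 0.4, 0.3])
phi = attention_deficiency(p_observed, ideal=0.5)
phi_no_gaze = attention_deficiency(observed_attention([]), ideal=0.5)
```

When the operator never looks at the AOI, the observed level is zero and the deficiency equals the negative of the ideal level.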
The ideal attention value for $AOI_i$ at time t, $P(AOI_{t,i}^{id})$, is
The attention level measurement time window is chosen to be longer than the sampling period of the parameters. For example, when the attention level is measured in 3-second epochs, the saliency $S_{i,t}$, bandwidth $B_{i,t}$, relevance $R_{i,t}$, priority $P_{i,t}$, and visual displacement $\gamma_{i,t}$ parameters may be sampled every 100 ms, and
In order to evaluate attention deficiency, φ is compared against a threshold value ξ:
φ<ξ
where ξ corresponds to a lower bound of attention. The threshold value ξ may be empirically determined for a control set of operators with different experiences to determine, e.g., the look-but-fail-to-see situation. In fail-to-look situations, gaze dwell time is zero and ξ may be set as a negative value.
In step S313, based on the environment severity level, an appropriate warning and/or guidance is issued to the operator. For example, when ξ is a negative value indicating the operator's failure to look at a critical AOI in the scene, audio-visual warnings may be issued or, depending on the crash criticality, pre-crash safety procedures may be deployed. As another example, when φ<ξ, guidance mechanisms such as visual highlighting of AOIs on, e.g., head-up display units may be issued.
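One possible decision rule for step S313 is sketched below. The mapping is an assumption made for illustration: a deficiency with zero dwell time is treated as a fail-to-look situation and receives an audio-visual warning, while a deficiency with non-zero dwell time is treated as look-but-fail-to-see and receives visual guidance. Pre-crash safety procedures and the environment severity level are not modeled, and the `Response` and `select_response` names are hypothetical.

```python
from enum import Enum


class Response(Enum):
    NONE = "none"
    GUIDANCE = "highlight AOI on head-up display"
    WARNING = "audio-visual warning"


def select_response(phi, xi, dwell_time_s):
    """Choose a response from the deficiency phi, the threshold xi, and the gaze dwell time."""
    if not phi < xi:
        return Response.NONE       # attention is at or above the lower bound
    if dwell_time_s == 0.0:
        return Response.WARNING    # fail-to-look: the AOI was never fixated
    return Response.GUIDANCE       # look-but-fail-to-see: fixated, but too briefly


fail_to_look = select_response(phi=-0.5, xi=-0.1, dwell_time_s=0.0)
look_but_fail = select_response(phi=-0.2, xi=-0.1, dwell_time_s=0.3)
attentive = select_response(phi=0.05, xi=-0.1, dwell_time_s=0.4)
```

Checking the threshold first means that an AOI with a negligible ideal attention level does not trigger a warning merely because the operator did not look at it.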
Next, a hardware description of the controller 205 according to exemplary embodiments is described with reference to
In
Further, the claimed advancements may be provided as a utility application, background daemon, or component of an operating system, or combination thereof, executing in conjunction with CPU 400 and an operating system such as Microsoft Windows 7, UNIX, Solaris, LINUX, Apple MAC-OS and other systems known to those skilled in the art.
CPU 400 may be a Xeon or Core processor from Intel of America or an Opteron processor from AMD of America, or may be other processor types that would be recognized by one of ordinary skill in the art. Alternatively, CPU 400 may be implemented on an FPGA, ASIC, PLD or using discrete logic circuits, as one of ordinary skill in the art would recognize. Further, CPU 400 may be implemented as multiple processors cooperatively working in parallel to perform the instructions of the inventive processes described above.
The controller 205 in
The controller 205 further includes a display controller 408, such as an NVIDIA GeForce GTX or Quadro graphics adaptor from NVIDIA Corporation of America for interfacing with display 410, such as a Hewlett Packard HPL2445w LCD monitor. A general purpose I/O interface 412 interfaces with a keyboard and/or mouse 414 as well as a touch screen panel 416 on or separate from display 410. General purpose I/O interface 412 also connects to a variety of peripherals 418 including printers and scanners, such as an OfficeJet or DeskJet from Hewlett Packard.
A sound controller 420 is also provided in the controller 205, such as Sound Blaster X-Fi Titanium from Creative, to interface with speakers/microphone 422 thereby providing sounds and/or music. The speakers/microphone 422 can also be used to accept dictated words as commands for controlling the controller 205 or for providing location and/or property information with respect to the target property.
The general purpose storage controller 424 connects the storage medium disk 404 with communication bus 426, which may be an ISA, EISA, VESA, PCI, or similar, for interconnecting all of the components of the controller 205. A description of the general features and functionality of the display 410, keyboard and/or mouse 414, as well as the display controller 408, storage controller 424, network controller 406, sound controller 420, and general purpose I/O interface 412 is omitted herein for brevity as these features are known.
A face camera controller 440 is provided in the controller 205 to interface with the face camera 201.
An environment camera controller 442 is provided in the controller 205 to interface with the environment camera 203.
A warning/guidance device controller 444 is provided in the controller 205 to interface with the warning/guidance device 209. Alternatively, display 410, speaker 422, and/or peripherals 418 may be used in place of or in addition to the warning/guidance device 209 to provide warning and/or guidance.
A relevance and priority database controller 446 is provided in the controller 205 to interface with the relevance and priority database 207. Alternatively, the relevance and priority database 207 may be included in disk 404 of the controller 205.
In the above description, any processes, descriptions or blocks in flowcharts should be understood as representing modules, segments or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process, and alternate implementations are included within the scope of the exemplary embodiments of the present advancements in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending upon the functionality involved, as would be understood by those skilled in the art.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel methods, apparatuses and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods, apparatuses and systems described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
This application is a continuation of and claims the benefit of priority under 35 U.S.C. §120 from U.S. application Ser. No. 14/447,752, filed Jul. 31, 2014 which is a continuation of U.S. application Ser. No. 13/750,137 (now U.S. Pat. No. 8,847,771), filed Jan. 25, 2013, the entire contents of which are incorporated herein by reference.
Publication Number | Date | Country |
---|---|---|
20160171322 A1 | Jun 2016 | US |

Relation | Application Number | Filing Date | Country |
---|---|---|---|
Parent | 14447752 | Jul 2014 | US |
Child | 15052211 | | US |
Parent | 13750137 | Jan 2013 | US |
Child | 14447752 | | US |