This disclosure relates generally to the field of video monitoring and, more particularly, to the field of video monitoring of the condition of a passenger cabin of a motor vehicle.
Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this application and are not admitted to the prior art by inclusion in this section.
Vehicles on public motorways are almost exclusively controlled by human operators. As technologies move towards autonomous driving, however, some vehicles on the public motorways will be autonomously controlled by advanced computer systems. Autonomous vehicles are capable of transporting human passengers and do not require a human operator. Instead, the computer system guides the vehicle to a destination selected by the passengers, for example.
A shared autonomous vehicle is an autonomous vehicle that is shared among multiple passengers and provides taxi-like services, for example. Since the shared vehicle is autonomous, a human operator is not present. In such a system, it is typically desirable to monitor the state and the condition of the shared vehicle and to monitor the passengers being transported in order to ensure that the vehicle is in a desired condition and to ensure that the passengers are comfortable.
Numerous attempts have been made to develop systems for monitoring the passengers, including face-tracking systems, eye-tracking systems, and systems that track and recognize gestures made by the passengers. Each of these systems attempts to ensure that the passengers are comfortable and are acting appropriately while being transported by the autonomous vehicle. Less attention, however, has been paid to sensing the interior environment within the autonomous vehicle. Consequently, improvements to systems and methods for in-vehicle monitoring would be beneficial.
According to an exemplary embodiment of the disclosure, a method for operating a vehicle including a vehicle sensing system includes generating a baseline image model of a cabin of the vehicle based on image data of the cabin of the vehicle generated by an imaging device of the vehicle sensing system, the baseline image model generated before a passenger event, and generating an event image model of the cabin of the vehicle based on image data of the cabin of the vehicle generated by the imaging device, the event image model generated after the passenger event. The method further includes identifying image deviations by comparing the event image model to the baseline image model with a controller of the vehicle sensing system, the image deviations corresponding to differences in the cabin of the vehicle from before the passenger event to after the passenger event, and operating the vehicle based on the identified image deviations.
According to another exemplary embodiment of the disclosure, a vehicle sensing system for a corresponding vehicle includes an imaging device, a memory, and a controller. The imaging device is configured to generate image data of a cabin of the vehicle. The memory is configured to store a baseline image model of the cabin of the vehicle that is generated prior to a passenger event. The controller is operably connected to the imaging device and the memory. The controller is configured to generate an event image model of the cabin of the vehicle based on the generated image data after the passenger event. The controller is also configured to identify image deviations by comparing the event image model to the baseline image model, and to operate the vehicle based on the identified image deviations. The image deviations correspond to differences in the cabin of the vehicle from before the passenger event to after the passenger event.
Example use case scenarios are illustrated in the figures and described below.
In one example, a user makes an electronic request for a ride with a shared autonomous vehicle. The passenger, after exiting the vehicle, forgets an item in the vehicle, such as a wallet. According to the disclosure, the vehicle sensing system detects the forgotten item and notifies the passenger via an electronic message, such as an email, text message, or voicemail. After receiving the notification, the passenger can choose to retrieve the forgotten item.
In another example scenario, when the vehicle sensing system detects that the vehicle seats have become dirty or damaged, the vehicle is automatically sent for maintenance at a service center before the next passenger event.
With these features, as described above, a superior user experience is delivered to the passengers of shared autonomous vehicles as well as shared user-operated vehicles (non-autonomous vehicles).
To achieve these goals, the vehicle sensing system detects events of interest, which are also referred to herein as deviations. In an exemplary embodiment, the detection of events occurs at several different levels. A first example event detected by the vehicle sensing system includes deviations between the car condition after a passenger event and the car condition before the passenger event. The passenger event is a use of the vehicle by a passenger or passengers. The vehicle sensing system identifies a region of the vehicle cabin where the deviations exist, but without any semantic information, such as what type of object has caused the deviation. Typically, lighting changes are not considered a deviation or an event of interest. A second example event detected by the vehicle sensing system includes information regarding a change in attributes of the vehicle. Exemplary attributes include a position of the seats and a position of other user-adjustable features of the vehicle. The vehicle sensing system also recognizes the causes of a deviation. If the cause of the deviation is the detection of an object, then the system identifies the type of object among a pre-defined set of objects including smartphones, wallets, keys, and the like. If the object cannot be identified, the object is classified as an unknown object. Moreover, the vehicle sensing system detects when a passenger has remained in the vehicle at the end of the passenger event as another exemplary deviation.
An exemplary workflow of the vehicle sensing system to achieve the abovementioned goals includes capturing vehicle interior image data before the passenger event with the vehicle in a baseline or clean condition, capturing vehicle interior image data after the passenger event, and comparing the pre-passenger-event vehicle interior image data to the post-passenger-event vehicle interior image data to detect deviations. Exemplary core technologies included in the process are video monitoring systems that perform background subtraction, change detection, image decomposition, and object recognition.
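As a minimal illustration of the compare step in this workflow (ignoring the HDR capture, de-lighting, and object recognition refinements described later), a per-pixel comparison of the pre-event and post-event images might look like the following Python sketch; the threshold values are arbitrary examples, not parameters specified by the disclosure.

```python
import numpy as np

def detect_deviation_mask(baseline_img, event_img, threshold=30):
    """Return a boolean mask of pixels that differ significantly between the
    pre-event (baseline) image and the post-event image.  Both inputs are
    H x W x 3 uint8 arrays of the cabin captured from the same camera pose."""
    diff = np.abs(baseline_img.astype(np.int16) - event_img.astype(np.int16))
    return diff.max(axis=2) > threshold

def has_deviations(mask, min_changed_pixels=50):
    """Report a deviation only if enough pixels changed, which suppresses
    sensor noise and very small lighting fluctuations."""
    return int(mask.sum()) >= min_changed_pixels
```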
The embodiments of the vehicle sensing system described herein provide improvements to technology for automated monitoring and analysis of the interior state of the cabin of a vehicle over a wide range of environmental lighting conditions, which is particularly beneficial for monitoring the interior of a vehicle that moves to different locations with uncontrolled environmental lighting.
For the purpose of promoting an understanding of the principles of the disclosure, reference will now be made to the embodiments illustrated in the drawings and described in the following written specification. It is understood that no limitation to the scope of the disclosure is thereby intended. It is further understood that this disclosure includes any alterations and modifications to the illustrated embodiments and includes further applications of the principles of the disclosure as would normally occur to one skilled in the art to which this disclosure pertains.
Aspects of the disclosure are disclosed in the accompanying description. Alternate embodiments of the disclosure and their equivalents may be devised without departing from the spirit or scope of the disclosure. It should be noted that any discussion herein regarding “one embodiment”, “an embodiment”, “an exemplary embodiment”, and the like indicates that the embodiment described may include a particular feature, structure, or characteristic, and that such particular feature, structure, or characteristic may not necessarily be included in every embodiment. In addition, references to the foregoing do not necessarily refer to the same embodiment. Finally, irrespective of whether it is explicitly described, one of ordinary skill in the art would readily appreciate that each of the particular features, structures, or characteristics of the given embodiments may be utilized in connection or combination with those of any other embodiment discussed herein.
For the purposes of the disclosure, the phrase “A and/or B” means (A), (B), or (A and B). For the purposes of the disclosure, the phrase “A, B, and/or C” means (A), (B), (C), (A and B), (A and C), (B and C), or (A, B and C).
The terms “comprising,” “including,” “having,” and the like, as used with respect to embodiments of the disclosure, are synonymous.
As shown in the figures, the vehicle 100 includes, among other components, a drivetrain 104, a seat controller 112, a seat 116, a transceiver 118, a memory 120, a vehicle sensing system 124, and a controller 128.
The drivetrain 104 is configured to generate a force for moving the vehicle 100. In the exemplary embodiment illustrated in the figures, the drivetrain 104 includes an electric motor 132 that is operably connected to at least one wheel 136 of the vehicle 100.
The vehicle 100 is an autonomously-controlled vehicle, and the rotational speed of the electric motor 132 is determined automatically by the controller 128 in response to a vehicle guidance program 140 stored in the memory 120. In another embodiment, the vehicle 100 is controlled by an operator and the rotational speed of the electric motor 132 is determined by the controller 128 in response to inputs from a human operator. In a further embodiment, the motor 132 is an internal combustion engine (ICE) that is either controlled by an operator or the vehicle guidance program 140. In yet another embodiment, vehicle 100 is a hybrid vehicle and the motor 132 includes an electric motor and an ICE that work together to rotate the wheel 136. Accordingly, the vehicle 100 is provided as any type of vehicle including an autonomous vehicle, an operator-controlled vehicle, an electric vehicle, an internal-combustion vehicle, and a hybrid vehicle.
With continued reference to the figures, the transceiver 118 is configured to wirelessly transmit and receive data between the vehicle 100 and remote parties, such as the passenger and a service center.
The memory 120 is an electronic storage device that is configured to store at least the vehicle guidance program 140, vehicle setting data 144, and program instruction data 148. The memory 120 is also referred to herein as a non-transient computer readable medium.
The seat 116 is configured to support occupants, passengers, users, drivers, and/or operators of the vehicle 100. The seat 116 includes a seat bottom 150 and a seat back 152. The seat controller 112 is configured to change a position of the seat bottom 150 and the seat back 152, so as to accommodate a passenger, for example. Moreover, in some embodiments, in response to a signal from the controller 128, the seat controller 112 resets the position of the seat 116 to a default position after the passenger has exited the vehicle 100 at the conclusion of a passenger event.
Additionally or alternatively, the seat controller 112 is configured to generate the vehicle setting data 144, which includes data corresponding to a current position of the seat 116. Specifically, the vehicle setting data 144 includes a front-to-back position of the seat bottom 150 and a tilt position of the seat back 152 relative to the seat bottom 150.
As shown in the figures, the vehicle 100 also includes the vehicle sensing system 124, which is configured to monitor the condition of the cabin 130 and is described in further detail below.
The controller 128 of the vehicle 100 is configured to execute the program instruction data 148 in order to operate the drivetrain 104, the seat controller 112, the transceiver 118, the memory 120, and the vehicle sensing system 124. The controller 128 is provided as at least one microcontroller and/or microprocessor. In autonomous or semi-autonomous embodiments of the vehicle 100, the controller 128 is configured to execute the vehicle guidance program 140 to autonomously guide the vehicle 100 from an initial location to a desired location using the roadway network. The desired location may be selected by an occupant of the vehicle 100 or by the controller 128. For example, the controller 128 may determine that the vehicle 100 should be moved to a service station, automotive dealer, car wash, or car detailer.
With reference to the figures, the vehicle sensing system 124 includes an imaging device 160, a memory 164, and a controller 168. The imaging device 160 is configured to generate image data 172 of the cabin 130 of the vehicle 100.
The memory 164 is an electronic storage device that is configured to store at least the image data 172, program instructions 176 for operating the vehicle sensing system 124, the notification data 178, a prior model pool 180, an event image model 184 of the cabin 130, and deviation data 190. The memory 164 is also referred to herein as a non-transient computer readable medium.
As shown in the figures, the image data 172 generated by the imaging device 160 corresponds to images of the cabin 130 and is stored in the memory 164.
The prior model pool 180 includes a plurality of baseline image models 192, which correspond to the cabin 130 in different conditions and configurations. The baseline image models 192 are electronic data models of the cabin 130 in a clean condition without any passengers or passenger belongings (i.e. items), such as smartphones, located in the cabin 130 (as shown in the figures).
The event image model data 184 is generated by the controller 168 and is based on image data 172 from the imaging device 160 after a passenger event. The event image model data 184, therefore, may include passengers and/or passenger belongings, such as smartphones. In at least one embodiment, the event image model data 184 is generated using a high dynamic range (HDR) process and includes at least one HDR image of the cabin 130. The at least one HDR image of the cabin 130 reduces the impact of environmental lighting on the generation of the event image model 184 of the cabin 130.
The deviation data 190 is generated by the controller 168 and is based on a comparison of a selected one of the baseline image models 192 and the event image model data 184 after a passenger event has occurred. The sensing system 124 identifies the deviation data 190 as differences between the baseline image model 192 and the event image model data 184. The identified differences of the deviation data 190 typically correspond to one or more passengers, one or more objects or personal belongings left behind by the passenger, damage to the cabin 130, and soiled areas of the cabin 130. The controller 168 processes the deviation data 190 with an object detection algorithm to identify the specific types of objects left behind by the passenger. In this way, the sensing system 124 is configured to identify that an object was left behind by the passenger and then identify that object as being a smartphone, for example. The image deviations of the deviation data 190 are also referred to herein as events of interest.
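Conceptually, the deviation data 190 can be represented as a list of records, one per detected change. The following dataclass is only an illustrative sketch; the field names and category labels are assumptions drawn from the examples in this disclosure, not a required storage format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Deviation:
    """One entry of deviation data: a region of the cabin that differs
    between the baseline image model and the event image model."""
    bbox: Tuple[int, int, int, int]      # (x, y, width, height) of the changed region
    area_px: int                         # number of changed pixels in the region
    object_label: Optional[str] = None   # e.g. "smartphone", "wallet", "keys",
                                         # "unknown_object", "soiled_area", "damage"
    confidence: float = 0.0              # classifier confidence for object_label
```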
The controller 168 of the vehicle sensing system 124 is configured to execute the program instruction data 176 in order to operate the sensing system 124. The controller 168 is provided as at least one microcontroller and/or microprocessor.
In operation, the vehicle 100 is configured to implement a method 200, which is illustrated by the flowchart in the figures.
In block 204 of the method 200, the vehicle 100 has transported the passenger to the desired location. Typically, when the passenger arrives at the desired location, the passenger removes all of the personal belongings, such as electronic devices, smartphones, wallets, sunglasses, keys, and the like, from the vehicle 100 to prevent theft or loss of the personal items. Some passengers, however, forget to remove their personal belongings from the cabin 130, as shown in the figures.
In the example of the figures, the passenger has exited the vehicle 100 and has left behind personal belongings in the cabin 130, such as a smartphone 308 and a wallet 312.
In block 208 of the method 200, the vehicle sensing system 124 generates the event image model data 184. The event image model data 184 is generated after the passenger exits the vehicle 100 at the conclusion of the passenger event and before a different passenger enters the vehicle 100 during a subsequent passenger event. In this example, the event image model data 184 is generated based on image data 172 of the cabin 130 in the state shown in the figures.
Next, in block 212 of the method 200, the vehicle sensing system 124 compares the event image model data 184 to a selected one of the baseline image models 192, which is selected from the prior model pool 180. The baseline image models 192 of the prior model pool 180 are generated by the sensing system 124 with the cabin 130 in a baseline state, as shown in the figures.
Each baseline image model 192 corresponds to a particular configuration of the vehicle 100 based on the vehicle setting data 144. For example, a first baseline image model 192 is generated with the seat backs 152 in an upright position, and a second baseline image model 192 is generated with the seat backs 152 in a reclined or tilted position. The vehicle setting data 144 is associated with the baseline image models 192 so that a particular baseline image model 192 can be located and selected based on the present configuration of the vehicle 100 according to the vehicle setting data 144.
In one embodiment, the method 200 includes selecting an optimal baseline image model 192 from the prior model pool 180 by comparing the vehicle setting data 144 associated with the event image model data 184 to the vehicle setting data 144 associated with each of the baseline image models 192. The baseline image model 192 having vehicle setting data 144 that corresponds most closely to the vehicle setting data 144 of the event image model data 184 is selected as the optimal baseline image model 192.
Next, in block 212 of the method 200, the sensing system 124 compares the event image model data 184 to the selected baseline image model 192 to generate the deviation data 190. By comparing the models 184, 192, the sensing system 124 identifies visual differences between the baseline image model 192 and the event image model data 184. These visual differences typically correspond to belongings left behind by the passenger and soiled regions of the cabin 130. The visual differences between the baseline image model 192 and the event image model 184 are referred to herein as “image deviations” or “deviations” because the visual differences are a deviation from the clean/empty state of the cabin 130 from before the passenger event to after the passenger event. The deviations are stored in the memory 164 as the deviation data 190.
If at block 216 the sensing system 124 does not identify any deviations in comparing the event image model 184 to the baseline image model 192, then at block 220 the vehicle 100 is returned to service and is ready for another passenger event. Accordingly, at block 220, the sensing system 124 has determined that the vehicle 100 is free from personal belongings of the passenger, is reasonably clean, and is in a condition suitable for the next passenger. The sensing system 124 has made these determinations without human involvement and has efficiently improved the user experience of the vehicle 100.
At block 224 of the method 200, the sensing system 124 has identified deviations (in block 216) and determines if the vehicle 100 is suitable for service. Specifically, the sensing system 124 uses an object recognition system to classify the deviations of the deviation data 190. For example, considering again the scenario of the figures, the sensing system 124 classifies the identified deviations as personal belongings, such as the smartphone 308 and the wallet 312, left behind by the passenger.
Next, at block 224 of the method 200, the sensing system 124 determines if, in spite of the identified deviations, the vehicle 100 is in a condition suitable to return to service. The sensing system 124 makes this determination by evaluating the type of deviations that have been identified. For example, if the sensing system 124 has determined that the deviations correspond to a cabin 130 that is unacceptably dirty or soiled, then the vehicle 100 is not returned to service. If, however, the sensing system 124 has determined that the deviations correspond to a cabin 130 that is only moderately dirty or soiled, then the vehicle 100 is returned to service for additional passenger events.
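The suitability determination at block 224 amounts to a policy over the classified deviations. A minimal sketch follows, reusing the illustrative Deviation record from above; the category names and the area threshold are hypothetical examples, and an actual system would tune this policy.

```python
def vehicle_suitable_for_service(deviations):
    """Return True if the vehicle can return to service despite the deviations."""
    for d in deviations:
        # Personal belongings must be retrieved before the next passenger event.
        if d.object_label in ("smartphone", "wallet", "keys", "unknown_object"):
            return False
        # Unacceptably dirty or damaged regions send the vehicle for maintenance;
        # small soiled areas are tolerated until the next scheduled cleaning.
        if d.object_label in ("soiled_area", "damage") and d.area_px > 5000:
            return False
    return True
```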
At block 228 of the method 200, the vehicle 100 returns to service and is available for further passenger events. Typically, the event image model data 184 is saved for a predetermined time period after the passenger event. Thus, in at least this manner, the vehicle 100 is operated based on the identified image deviations.
If, at block 224 of the method, the sensing system 124 determines that the vehicle 100 is unsuitable to return to service, then a different approach is taken. Specifically, in such an example, at block 232 the sensing system 124 generates the notification data 178. The notification data 178 is sent to the passenger and/or the service station. For example, the notification data 178 may inform the passenger that certain personal belongings were left behind in the cabin 130. The notification data 178 may identify the belongings that were left behind and include instructions for the passenger to retrieve the belongings. Moreover, in this example, the notification data 178 is sent to a service center to alert the service center that the vehicle 100 will be arriving shortly and is in need of cleaning and/or removal of personal belongings. The notification data 178 is sent from the vehicle 100 to the passenger and the service center using the transceiver 118 of the vehicle 100. Generating the notification data 178 corresponds to operating the vehicle 100 based on the identified image deviations.
At block 236 of the method 200, the vehicle 100 autonomously returns to a service center. In this example, the vehicle sensing system 124 has identified that the previous passenger(s) left behind personal belongings (shown in the figures) and/or that the cabin 130 requires cleaning, and the vehicle 100 is routed to the service center so that the belongings can be removed and the cabin 130 can be returned to the baseline condition.
The vehicle 100 and the sensing system 124 are an improvement over the prior art, because the sensing system 124 solves the problem of determining the condition of a shared autonomous vehicle 100 when the autonomous vehicle 100 is away from a base station or service center. In the past, a human inspector was required to determine when the vehicle was dirty or when personal belongings were left behind. The sensing system 124 automates this task and offers certain additional benefits. For example, because the sensing system 124 evaluates the condition of the cabin 130 after each passenger event, the sensing system 124 can identify a specific passenger that left behind the personal items or has soiled the cabin 130. Thus, the sensing system 124 can prevent loss of items of the passengers by returning the vehicle 100 to a service station to enable the passenger to retrieve the items. Moreover, the sensing system 124 can prevent losses to the owners of the vehicle 100 by identifying the passengers that have soiled and/or damaged the cabin 130.
Additional aspects of the vehicle 100 and the vehicle sensing system 124 are described below.
The neighbor search and prior synthesis module 612, which is an implementation of the prior retrieval module 260, is configured to select from the prior model pool 180 the baseline image model 192 that most closely corresponds to the current configuration of the vehicle 100 and to adapt the selected model to that configuration.
With continued reference to the figures, the de-lighting module 616 is configured to remove the effects of environmental lighting from the image data of the current scene so that the current scene can be compared to the selected baseline image model 192, as described in further detail below.
The deviation detection module 620, which is a portion of an implementation of the status sensing module 254, is configured to identify the regions of the cabin 130 in which significant deviations exist and to generate the deviation data 190.
In some embodiments, an offline-stage prior model generation module 608 is configured to construct the prior model pool 180. The prior model pool 180 is generated and stored in the memory 164 prior to placing the vehicle 100 into service. The prior model pool 180 includes a plurality of baseline image models 192 for the expected configurations of the seats 116 and other moveable objects within the cabin 130.
With continued reference to the figures, the image data 172 generated by the imaging device 160 is processed into at least one HDR image 630 of the cabin 130.
The HDR image 630 is particularly useful in a vehicle 100 that is photographed at different times of day and at different locations where the external lighting (e.g. sun, moon, streetlights, etc.) cannot be controlled. A typical scenario occurs when the vehicle 100 is in an outdoor environment and strong sunlight illuminates the passenger seat 116. With a typical auto-exposure setting of the imaging device 160, the captured image is over-exposed in the sunlit regions and under-exposed in the shaded regions, so that details of the cabin 130 are lost. The HDR image 630, in contrast, preserves detail across both the brightly lit and the shaded regions of the cabin 130.
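HDR capture is commonly implemented by fusing several bracketed exposures of the same scene. The sketch below uses OpenCV's Mertens exposure fusion as one plausible way to obtain an HDR-like image 630 of the cabin; the disclosure does not prescribe a particular HDR algorithm, so this choice is an assumption for illustration.

```python
import cv2
import numpy as np

def fuse_exposures(frames):
    """Fuse a list of differently exposed 8-bit BGR frames of the cabin into a
    single well-exposed image.  Mertens fusion needs no camera response
    calibration, which keeps the capture pipeline simple."""
    merge = cv2.createMergeMertens()
    fused = merge.process(frames)          # float32 result, roughly in [0, 1]
    return np.clip(fused * 255.0, 0, 255).astype(np.uint8)
```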
The operation of these modules is described in further detail below.
To generate the prior model pool 180, the system 124 captures the baseline image models 192 under multiple configurations of the movable parts of the vehicle 100. However, the number of different configurations is typically very large, and it is typically infeasible to capture all possible variations. In such a case, the sensing system 124 performs a sub-sampling for certain adjustable parts of the vehicle 100. For example, the prior model pool 180 is built with baseline image models 192 of the seat 116 in only a selected set of seat positions in order to reduce the dimensionality. The neighbor search and prior synthesis module 612 synthesizes the prior model pool 180 based on these samples. If some attributes of the deviation (e.g., size and location) are desired by the system 124, the relevant attribute information can be included in the prior model pool 180. For example, in the captured HDR images 630, regions for each passenger seat 116 can be marked and the size can be specified. Such information can be directly used in the deviation detection module 620 to provide the desired attribute information. The prior model pool 180 also includes a 3D model of the cabin 130, for example.
The neighbor search and prior synthesis module 612 of the sensing system 124 is implemented using at least one processor that searches the prior model pool 180 to identify the baseline image model 192 that most closely corresponds to the current configuration of the seats 116 and other elements within the cabin 130 based on the vehicle setting data 144. One embodiment of a neighbor search is a brute-force search process that compares the current configuration with the configurations in the prior model pool 180. The module 612 generates a distance metric via a weighted sum of the distances in each dimension of comparison. For example, if the seat position value of the vehicle setting data 144 is denoted as p and the tilt angle of the seat back 152 is denoted as \theta, the metric can be defined as w_p |p - p_c| + w_\theta |\theta - \theta_c|, where the subscript c denotes the current configuration and the weights w_p, w_\theta leverage the relative importance of each dimension.
In the case where the prior model pool 180 contains a large amount of data, a fast search algorithm can be used, such as a KD-tree. As an example, a KD-tree can be constructed from the prior model pool 180 using the data [w_p p, w_\theta \theta] for each stored configuration. Upon construction of the KD-tree, the module 612 performs a fast search of the KD-tree to identify the nearest neighbor (i.e. the selected baseline image model 192).
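A minimal sketch of the neighbor search using SciPy's KD-tree, assuming each stored configuration is reduced to the two weighted dimensions discussed above (seat position p and seat-back tilt θ); the weights and the sample configurations are illustrative, and additional adjustable components simply add columns.

```python
import numpy as np
from scipy.spatial import cKDTree

w_p, w_theta = 1.0, 0.5                        # illustrative importance weights

# One row per baseline image model in the prior model pool: [p, theta].
pool_configs = np.array([
    [0.0, 0.0],                                # seat fully forward, seat back upright
    [0.5, 0.0],
    [1.0, 0.0],
    [0.5, 0.35],                               # seat back reclined
])

tree = cKDTree(pool_configs * [w_p, w_theta])  # KD-tree over the weighted settings

def select_baseline_index(seat_pos, tilt):
    """Index of the baseline image model whose stored configuration is nearest
    to the current one.  p=1 selects the Minkowski 1-norm, i.e. the weighted
    sum of absolute differences w_p|p - p_c| + w_theta|theta - theta_c|."""
    _, idx = tree.query([w_p * seat_pos, w_theta * tilt], p=1)
    return int(idx)
```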
As previously mentioned, the prior model pool 180 typically contains data from only a subset of the possible configurations of the adjustable vehicle components. In this case, a prior synthesis step adapts the identified neighboring model (i.e. the selected baseline image model 192) in the model pool 180 to the current configuration of the vehicle 100. When the prior model pool 180 includes HDR images 630, image warping can be performed to achieve this goal. The basic assumption is that the configuration of the selected baseline image model 192 is close to the current configuration as determined from the vehicle setting data 144. Therefore, the difference between the selected baseline image model 192 and the synthesis target is small. To compensate for this small difference, interest points can be detected and matched between the selected baseline image model 192 (i.e. a neighbor image) and the configuration corresponding to the event image model data 184 (i.e. a current image). Based on that, the neighbor image can be warped towards the current image, e.g., based on radial basis functions. In this case, the warped image will be the output prior data from the module 612.
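A hedged sketch of the prior-synthesis step follows: the disclosure mentions interest-point matching followed by warping based on radial basis functions; the version below substitutes a RANSAC homography estimated from ORB matches, which is a simpler stand-in that is adequate when the configuration difference is small.

```python
import cv2
import numpy as np

def warp_neighbor_to_current(neighbor_img, current_img, min_matches=10):
    """Warp the selected baseline (neighbor) image toward the current image.
    ORB keypoints + RANSAC homography are used here as a simplified stand-in
    for the radial-basis-function warping mentioned in the disclosure."""
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(neighbor_img, None)
    kp2, des2 = orb.detectAndCompute(current_img, None)
    if des1 is None or des2 is None:
        return neighbor_img                          # nothing to match on

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    if len(matches) < min_matches:
        return neighbor_img

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return neighbor_img

    h, w = current_img.shape[:2]
    return cv2.warpPerspective(neighbor_img, H, (w, h))
```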
The inputs to the de-lighting module 616 include the reference image I_A of a clean scene (i.e. the selected baseline image model 192) and the HDR image I_B of the current scene (i.e. the event image model data 184). The goal of the module 616 is to perform an image-decomposition de-lighting process that decomposes I_B into two layers: a reflectance layer R_B that contains the scene appearance under the same lighting as I_A, and a shading layer S_B that contains only the lighting information. The product of these two layers recovers the original image based on the following equation:

I_B = S_B \cdot R_B
The embodiments described herein use the reference image I_A as a guide to develop an iterative procedure for the decomposition. The computation is done in the log scale of the image. The equations below denote the log-scaled images with the superscript L. The relationship among these images can be described using the following equations:

I_A^L = \log(I_A)
I_B^L = \log(I_B)
S_B^L = \log(S_B)
R_B^L = \log(R_B)
I_B^L = S_B^L + R_B^L
The de-lighting process performed by the module 616 proceeds as follows. The inputs are I_A^L, I_B^L, and M_0, where M_0 provides an initial estimate \hat{M} = M_0, and the process iterates the following two steps until \hat{M} converges to a final estimate: (1) guided decomposition through an optimization, with inputs I_A^L, I_B^L, and \hat{M}, and outputs \hat{R}_B^L and \hat{S}_B^L; and (2) segmentation to identify the regions where significant deviations between I_A and I_B exist, with inputs I_A^L and \hat{R}_B^L, and output \hat{M}. The outputs of the process are M = \hat{M}, R_B^L = \hat{R}_B^L, and S_B^L = \hat{S}_B^L.
In the process described above, M is a binary mask indicating the significant deviations between I_A and I_B. M_0 is an initialization of M, while \hat{M} is the intermediate estimate of M during the iterations. \hat{R}_B^L and \hat{S}_B^L are the intermediate estimates of R_B^L and S_B^L during the iterations. In more detail, the guided decomposition process minimizes the following energy function to solve for \hat{R}_B^L and \hat{S}_B^L:
E_D = E_g + \lambda_s E_s + \lambda_r E_r

under the constraint \hat{R}_B^L + \hat{S}_B^L = I_B^L.

The weights \lambda_s and \lambda_r leverage the relative importance of each individual term in the function. Each term is explained in more detail in the following paragraphs.
The term E_g is a gradient term that minimizes the difference between the gradient of \hat{R}_B^L and some reference gradient fields \{G_x, G_y\}, such as:

E_g = (\nabla_x \hat{R}_B^L - G_x)^T W_x^g (\nabla_x \hat{R}_B^L - G_x) + (\nabla_y \hat{R}_B^L - G_y)^T W_y^g (\nabla_y \hat{R}_B^L - G_y)

where \nabla_x and \nabla_y are gradient operators in the x and y directions, respectively, and W_x^g and W_y^g are per-pixel weights. In one implementation, the gradient field is the image gradient of I_B^L, assuming the gradients in I_B mostly originate from the reflectance layer. In this case, \{G_x, G_y\} are defined as

G_x = \nabla_x I_B^L; \quad G_y = \nabla_y I_B^L.
More sophisticated methods can be used to assign different weights to the pixels. For example, classifiers can be used to recognize if an edge should belong to the reflectance layer or the shading layer. Based on that, higher values can be used for edges classified as reflectance edges.
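As a small numerical illustration of the simple choice of reference gradient field, and of the optional down-weighting of edges that a classifier attributes to shading, the sketch below assumes the log HDR image is a single-channel numpy array; the classifier output is hypothetical.

```python
import numpy as np

def reference_gradients(I_B_log):
    """Reference gradient fields taken as the image gradients of the log HDR
    image, i.e. G_x = grad_x(I_B^L) and G_y = grad_y(I_B^L)."""
    G_y, G_x = np.gradient(I_B_log)        # np.gradient returns (d/drow, d/dcol)
    return G_x, G_y

def gradient_weights(shape, shading_edge_prob=None):
    """Per-pixel weights for the gradient term.  If a classifier supplies the
    probability that an edge belongs to the shading layer (a hypothetical
    refinement), those pixels are down-weighted so that reflectance edges
    dominate the reference gradient field."""
    w = np.ones(shape)
    if shading_edge_prob is not None:
        w = 1.0 - 0.9 * shading_edge_prob
    return w
```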
The term E_s is a shading smoothness term that enforces smoothness of the shading layer and depends on the choice of the shading model. Some embodiments of the shading model, along with the corresponding smoothness term definitions, are listed below.
If \hat{S}_B^L is a full model, i.e., each pixel has a three-channel (e.g., red, green, and blue) shading layer, \hat{S}_B^L = \{\hat{S}_{B,1}^L, \hat{S}_{B,2}^L, \hat{S}_{B,3}^L\}, then either one of the following two options can be used. First, the smoothness is enforced for each layer separately using the Laplacian operator \Delta. Therefore,

E_s = \sum_{i=1}^{3} \|\Delta \hat{S}_{B,i}^L\|_2^2.
Second, the smoothness is enforced both for each individual layer and across layers. In this case, the per-layer Laplacian terms are augmented with cross-channel terms E_{S,ij}(k,l), where (k, l) denotes a pair of neighboring pixels in the image space and

E_{S,ij}(k,l) = \|(\Delta \hat{S}_{B,i}^L(k) - \Delta \hat{S}_{B,j}^L(k)) - (\Delta \hat{S}_{B,i}^L(l) - \Delta \hat{S}_{B,j}^L(l))\|_2^2
The first part is the difference between two channels (i,j) for pixel k, while the second part is the difference between the same two channels for the neighboring pixel l. The term then measures the difference between these two quantities to enforce consistent lighting color between neighboring pixels.
If the shading contains only one channel, the system can use the Laplacian of the layer directly:

E_s = \|\Delta \hat{S}_B^L\|_2^2
A customized shading model can also be used. One example is

\hat{S}_B^L = \{\hat{S}_{B,1}^L, \delta_{B,2}, \delta_{B,3}\}

where the scalars \delta_{B,2} and \delta_{B,3} are constant across pixels. That is, for any pixel k, the customized shading produces \hat{S}_{B,i}^L(k) = \hat{S}_{B,1}^L(k) + \delta_{B,i}, i = 2, 3. With this model, the smoothness needs to be enforced only on the first channel. Therefore, this embodiment uses the same form of energy function:

E_s = \|\Delta \hat{S}_{B,1}^L\|_2^2
Additional embodiments use relaxed shading models that enable more variation in the second and the third layers. One embodiment defines:

\hat{S}_B^L = \{\hat{S}_{B,1}^L, \delta_{B,2}^r(p), \delta_{B,2}^c(q), \delta_{B,3}^r(p), \delta_{B,3}^c(q) \mid p = 1, \ldots, \#\text{rows}; \; q = 1, \ldots, \#\text{columns}\}

\hat{S}_{B,i}^L(k) = \hat{S}_{B,1}^L(k) + \delta_{B,i}^r(\text{row of } k) + \delta_{B,i}^c(\text{column of } k), \quad i = 2, 3
This model enables the second layer to include two sets of values: \delta_{B,2}^r(p) is constant for row p and \delta_{B,2}^c(q) is constant for column q, and similarly for the third layer. In other words, both the second and the third layers include n distinct values, where n = \#\text{rows} + \#\text{columns} (in the previous model, n = 1). With this model, the smoothness term includes, on top of the Laplacian for the first layer, terms that further enforce the values of nearby rows (and columns) of the second and the third layers to be similar. Other customized shading models (e.g., block-wise constant) can also be used, and similar smoothness energy functions can be derived.
The term E_r is a reference term that enforces the values in the target \hat{R}_B^L to be consistent with the reference inside the regions marked as having no significant deviation in the mask, i.e., where \hat{M}(k) = 0 for pixel k. Therefore, the term is defined as

E_r = \sum_{k : \hat{M}(k) = 0} \|\hat{R}_B^L(k) - I_A^L(k)\|_2^2.
As all individual terms E_g, E_s, E_r are quadratic functions of the unknowns, minimizing the total cost E_D can be done by solving a linear equation. Once the iterative procedure converges, the output of the image de-lighting module 616 is the final estimate of the reflectance layer R_B = \exp(R_B^L).
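The sketch below is a drastically simplified stand-in for the iterative procedure, intended only to show the alternation between decomposition and segmentation: instead of minimizing the quadratic energy E_D, the shading layer is approximated by a normalized Gaussian smoothing of the log-ratio over pixels currently marked as non-deviating, which loosely plays the role of the smoothness and reference terms. It assumes single-channel images with values in (0, 1].

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def delight(I_A, I_B, tau=0.3, sigma=15, n_iters=5):
    """Simplified guided de-lighting.  Returns (R_B, M): the de-lighted image
    R_B = exp(R_B^L) and the binary deviation mask M."""
    eps = 1e-4
    I_A_log = np.log(np.clip(I_A, eps, None))
    I_B_log = np.log(np.clip(I_B, eps, None))

    M = np.zeros(I_A_log.shape, dtype=bool)        # M0: start with no deviations
    R_log = I_B_log                                # initial estimate before de-lighting
    for _ in range(n_iters):
        valid = (~M).astype(float)
        ratio = I_B_log - I_A_log                  # lighting change plus deviations
        num = gaussian_filter(ratio * valid, sigma)
        den = gaussian_filter(valid, sigma) + eps
        S_log = num / den                          # smooth shading estimate S_B^L
        R_log = I_B_log - S_log                    # reflectance layer R_B^L

        M_new = np.abs(R_log - I_A_log) > tau      # segmentation step
        if np.array_equal(M_new, M):               # mask converged
            break
        M = M_new

    return np.exp(R_log), M
```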
Following the de-lighting process, given the de-lighted image R_B and the reference image I_A, the goal of the detection module 620 is to identify the regions where significant deviations exist and to generate the deviation data 190. This can be achieved in several ways, including but not limited to per-pixel detection, graph-based detection, and learning-based detection.
The per-pixel detection compares these two images (i.e. the selected baseline image model 192 and the event image model 184) at each individual pixel in a certain color space. In one example, both images are converted to CIELab color space and the detection is done via thresholding on the a-channel and b-channel for each pixel k as follows:
\hat{M}(k) = |I_{A,a}(k) - R_{B,a}(k)| > \tau_a \ \text{or} \ |I_{A,b}(k) - R_{B,b}(k)| > \tau_b

where I_{A,a} and I_{A,b} are the a-channel and b-channel of I_A in CIELab space (and similarly for R_{B,a} and R_{B,b}), and \tau_a and \tau_b are predefined thresholds.
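A minimal sketch of the per-pixel detection in CIELab, assuming 8-bit BGR inputs; the threshold values are arbitrary examples on OpenCV's 8-bit Lab scale rather than values specified by the disclosure.

```python
import cv2
import numpy as np

def per_pixel_deviation_mask(I_A_bgr, R_B_bgr, tau_a=8, tau_b=8):
    """Flag a pixel when its a-channel or b-channel differs from the reference
    by more than a threshold (chromaticity change, largely lighting-invariant
    after de-lighting)."""
    lab_A = cv2.cvtColor(I_A_bgr, cv2.COLOR_BGR2Lab).astype(np.int16)
    lab_B = cv2.cvtColor(R_B_bgr, cv2.COLOR_BGR2Lab).astype(np.int16)
    diff_a = np.abs(lab_A[..., 1] - lab_B[..., 1])
    diff_b = np.abs(lab_A[..., 2] - lab_B[..., 2])
    return (diff_a > tau_a) | (diff_b > tau_b)
```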
The graph-based detection takes both the per-pixel measurement and spatial smoothness into consideration. The problem is typically formulated as a Markov Random Field (MRF) and solved with Belief Propagation or Graph Cut. In one implementation, the probabilities that a pixel k is and is not in a region with significant deviation are defined via logistic functions of the weighted per-pixel color differences, where v is a parameter controlling the shape of the distribution function and \tau is a threshold value similar in role to \tau_a and \tau_b above. The weights w_a(k) and w_b(k) are used to leverage the importance of the a-channel versus the b-channel, and can be the same or different across pixels. An example plot of the logistic function is shown in the figures.
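The sketch below illustrates this idea with two labeled simplifications: the logistic unary is one plausible form of the per-pixel probability (the disclosure does not spell out the exact expression), and the MRF solved by belief propagation or graph cut is replaced by a cheap spatial median filter that stands in for the smoothness prior.

```python
import numpy as np
from scipy.ndimage import median_filter

def graph_like_deviation_mask(diff_a, diff_b, w_a=0.5, w_b=0.5, tau=8.0, v=0.5):
    """Per-pixel logistic probability of deviation followed by a crude spatial
    smoothing step.  'diff_a' and 'diff_b' are the absolute a/b channel
    differences computed as in the per-pixel detector above."""
    score = w_a * diff_a + w_b * diff_b                  # weighted per-pixel evidence
    p_dev = 1.0 / (1.0 + np.exp(-v * (score - tau)))     # assumed logistic form
    mask = p_dev > 0.5
    return median_filter(mask.astype(np.uint8), size=5).astype(bool)
```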
In the abovementioned methods, the color space is not limited to CIELab; other color spaces that attempt to separate illumination from chromaticity, such as YCbCr, can also be used. The output from the detection module is a refined binary mask M_D, along with the de-lighted image R_B from the previous module. In the case where certain attribute information is desired, as described above, the relevant attribute information from the prior model pool 180 is provided along with the mask M_D.
With the mask M_D defining the regions with significant deviations, the recognition module can be used to identify the reasons for these deviations. A non-exclusive list of typical reasons underlying a deviation includes objects left in the vehicle 100 that belong to a pre-defined set (e.g., smartphones 308, wallets 312, keys), objects not in the pre-defined set, a passenger who has not left the vehicle, a dirty seat, and damage to the vehicle interior, such as scratches.
State-of-the-art recognition engines can be trained using data that contain instances of these categories and applied to the regions of interest, which are cropped based on the mask M_D. The advantage here is that the recognition can be done on the de-lighted image R_B (for both training and testing). In this way, the recognition engines can be trained to focus on distinguishing the different causes without the need to accommodate strong lighting variations.
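A minimal sketch of the cropping-and-recognition step follows; the classify_crop callable is a hypothetical stand-in for a trained recognition engine (e.g., a CNN over smartphones, wallets, keys, soiled seats, damage, and an "unknown" class), which is not specified by the disclosure.

```python
import cv2
import numpy as np

def recognize_deviation_regions(R_B_bgr, M_D, classify_crop, min_area=200):
    """Crop each connected region of the deviation mask M_D out of the
    de-lighted image and hand it to a recognition engine."""
    n, labels, stats, _ = cv2.connectedComponentsWithStats(M_D.astype(np.uint8))
    results = []
    for i in range(1, n):                                # label 0 is background
        x, y, w, h, area = stats[i]
        if area < min_area:
            continue                                     # ignore tiny speckles
        crop = R_B_bgr[y:y + h, x:x + w]
        results.append(((int(x), int(y), int(w), int(h)), classify_crop(crop)))
    return results
```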
Embodiments within the scope of the disclosure may also include non-transitory computer-readable storage media or machine-readable medium for carrying or having computer-executable instructions or data structures stored thereon. Such non-transitory computer-readable storage media or machine-readable medium may be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such non-transitory computer-readable storage media or machine-readable medium can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions or data structures. Combinations of the above should also be included within the scope of the non-transitory computer-readable storage media or machine-readable medium.
Embodiments may also be practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked (either by hardwired links, wireless links, or by a combination thereof) through a communications network.
Computer-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. Computer-executable instructions also include program modules that are executed by computers in stand-alone or network environments. Generally, program modules include routines, programs, objects, components, and data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.
It will be appreciated that variants of the above-described and other features and functions, or alternatives thereof, may be desirably combined into many other different systems, applications, or methods. Various presently unforeseen or unanticipated alternatives, modifications, variations or improvements may be subsequently made by those skilled in the art that are also intended to be encompassed herein in the following embodiments.
This application is a 35 U.S.C. § 371 National Stage Application of PCT/EP2019/056392, filed on Mar. 14, 2019, which claims the benefit of priority of U.S. provisional application Ser. No. 62/649,624, filed on Mar. 29, 2018, the disclosures of which are incorporated herein by reference in their entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2019/056392 | 3/14/2019 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2019/185359 | 10/3/2019 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
7477758 | Piirainen et al. | Jan 2009 | B2 |
7676062 | Breed et al. | Mar 2010 | B2 |
10127795 | Hwang | Nov 2018 | B1 |
20040090525 | Eichmann | May 2004 | A1 |
20060008150 | Zhao et al. | Jan 2006 | A1 |
20060155442 | Luo | Jul 2006 | A1 |
20070086624 | Breed et al. | Apr 2007 | A1 |
20070135979 | Plante | Jun 2007 | A1 |
20100250052 | Ogino | Sep 2010 | A1 |
20160034780 | Duan et al. | Feb 2016 | A1 |
20160249191 | Avrahami | Aug 2016 | A1 |
20170200203 | Kingsbury | Jul 2017 | A1 |
20170291539 | Avery | Oct 2017 | A1 |
20170339385 | Usui et al. | Nov 2017 | A1 |
20180322342 | Clifford | Nov 2018 | A1 |
20180370496 | Sykula | Dec 2018 | A1 |
20190197325 | Reiley | Jun 2019 | A1 |
Number | Date | Country |
---|---|---|
10 2012 024 650 | Jun 2014 | DE |
10 2018 110 430 | Nov 2018 | DE |
2561062 | Oct 2018 | GB |
2563995 | Jan 2019 | GB |
2005-115911 | Apr 2005 | JP |
2006-338535 | Dec 2006 | JP |
Entry |
---|
International Search Report corresponding to PCT Application No. PCT/EP2019/056392, dated Jun. 13, 2019 (4 pages). |
Duchêne, S. et al., “Multiview Intrinsic Images of Outdoors Scenes with an Application to Relighting,” Association for Computing Machinery (ACM), ACM Transactions on Graphics, vol. 34, No. 5, Article 164, Oct. 2015 (16 pages). |
Wikipedia, “High-dynamic-range imaging,” retrieved from Internet Feb. 14, 2018, https://en.wikipedia.org/wiki/High-dynamic-range_imaging (10 pages). |
Dipert, B. et al., “Improved Vision Processors, Sensors Enable Proliferation of New and Enhanced ADAS Functions,” Edge AI + Vision Alliance, Jan. 29, 2014, available at https://www.edge-ai-vision.com/2014/01/improved-vision-processors-sensors-enable-proliferation-of-new-and-enhanced-adas-functions/ (14 pages). |
Dipert, B. et al., “Smart In-Vehicle Cameras Increase Driver and Passenger Safety,” Edge AI + Vision Alliance, Oct. 3, 2014, available at https://www.edge-ai-vision.com/2014/10/smart-in-vehicle-cameras-increase-driver-and-passenger-safety/ (13 pages). |
Mikolajczyk, K. et al., “A Performance Evaluation of Local Descriptors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, No. 10, Oct. 2005, pp. 1615-1630 (16 pages). |
Tappen, M. F. et al., “Recovering Intrinsic Images from a Single Image,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, No. 9, Sep. 2005, pp. 1459-1472 (14 pages). |
Boykov, Y. et al., “Fast Approximate Energy Minimization via Graph Cuts,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, No. 11, Nov. 2001, pp. 1-18 (18 pages). |
Lin, T.-Y. et al., “Feature Pyramid Networks for Object Detection,” IEEE Conference on Computer Vision and Pattern Recognition, 2017, CVPR Open Access (9 pages). |
Zhu, Y. et al., “CoupleNet: Coupling Global Structure with Local Parts for Object Detection,” IEEE European Conference on Computer Vision, 2017, CVPR Open Access (9 pages). |
Bouwmans, T., “Traditional and recent approaches in background modeling for foreground detection: An overview,” Computer Science Review, May 2014, vol. 11-12, pp. 31-66 (36 pages). |
Chen, T.-L. et al., “Image warping using radial basis functions,” Journal of Applied Statistics, 2014, vol. 41, No. 2, pp. 242-258 (17 pages). |
Radke, R. J. et al., “Image Change Detection Algorithms: A Systematic Survey,” IEEE Transactions on Image Processing, Mar. 2005, vol. 14, No. 3, pp. 294-307 (14 pages). |
Alcantarilla, S. et al., “Street-View Change Detection with Deconvolutional Networks,” Robotics: Science and Systems, 2016 (10 pages). |
Land, E. H. et al., “Lightness and Retinex Theory,” Journal of the Optical Society of America (JOSA), Jan. 1971, vol. 61, No. 1, pp. 1-11 (11 pages). |
Grosse, R. et al., “Ground truth dataset and baseline evaluations for intrinsic image algorithms,” IEEE International, 12th International Conference on Computer Vision (ICCV), 2009, pp. 2335-2342 (8 pages). |
Li, Y. et al., “Single Image Layer Separation using Relative Smoothness,” IEEE Conference on Computer Vision and Pattern Recognition, 2014, CVPR2014 Open Access (8 pages). |
Number | Date | Country | |
---|---|---|---|
20210034888 A1 | Feb 2021 | US |
Number | Date | Country | |
---|---|---|---|
62649624 | Mar 2018 | US |