Computer vision may be used by security systems for monitoring and securing environments. Videos of events occurring in a monitored environment may be recorded for later review.
In general, in one aspect, the invention relates to a method for analyzing a video captured by a security system, including obtaining the video of a monitored environment and detecting an occurrence of an event in the video of the monitored environment. Detecting the occurrence of the event includes identifying the presence of a foreground object in the video of the monitored environment and classifying the foreground object. The method further includes tagging the occurrence of the event in the video with the foreground object classification and generating an event history video from the video of the monitored environment, including resampling the video of the monitored environment, the resampling including applying a first event-specific frame drop rate to segments of the video of the monitored environment that include the foreground object, based on the tagging, and applying at least one other frame drop rate to other segments of the video of the monitored environment.
In general, in one aspect, the invention relates to a method for analyzing a video captured by a security system, including obtaining the video of a monitored environment and detecting an occurrence of an event in the video of the monitored environment. Detecting the occurrence of the event includes identifying the presence of a foreground object in the video of the monitored environment and classifying the foreground object. The method further includes generating an event history video from the video of the monitored environment. Generating the event history video includes generating a set of frames of the event history video. Each frame of the event history video includes the background region, each frame of the event history video corresponds to a time window of the video of the monitored environment, and in at least a frame of the set of frames, a color shift is applied to a portion of the pixels of the frame that are in a region of the frame in which the foreground object was present in the video of the monitored environment during the time window corresponding to the frame. The foreground object is not shown in the frame.
In general, in one aspect, the invention relates to a non-transitory computer readable medium including instructions that enable a system to obtain a video of a monitored environment and detect an occurrence of an event in the video of the monitored environment. Detecting the occurrence of the event includes identifying the presence of a foreground object in the video of the monitored environment and classifying the foreground object. The non-transitory computer readable medium further includes instructions that enable the system to tag the occurrence of the event in the video with the foreground object classification and generate an event history video from the video of the monitored environment including resampling the video of the monitored environment, the resampling including applying a first event-specific frame drop rate to segments of the video of the monitored environment that include the foreground object, based on the tagging, and applying at least one other frame drop rate to other segments of the video of the monitored environment.
In general, in one aspect, the invention relates to a non-transitory computer readable medium including instructions that enable a system to obtain a video of a monitored environment and detect an occurrence of an event in the video of the monitored environment. Detecting the occurrence of the event includes identifying the presence of a foreground object in the video of the monitored environment and classifying the foreground object. The instructions further enable the system to generate an event history video from the video of the monitored environment, including generating a set of frames of the event history video. Each frame of the event history video includes the background region, each frame of the event history video corresponds to a time window of the video of the monitored environment, and in at least a frame of the set of frames, a color shift is applied to a portion of the pixels of the frame that are in a region of the frame in which the foreground object was present in the video of the monitored environment during the time window corresponding to the frame. The foreground object is not shown in the frame.
Other aspects of the invention will be apparent from the following description and the appended claims.
Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. In the following detailed description of embodiments of the invention, numerous specific details are set forth in order to provide a more thorough understanding of the invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
In general, embodiments of the invention relate to a monitoring system used for securing an environment. A monitoring system may detect object movement in a monitored environment, may isolate the moving object(s) from the surrounding environment, and may classify the moving object(s). Based on the classification of the moving object(s) by a classification algorithm, the moving objects may be determined to be, for example, threatening, harmless, or unknown. Appropriate actions, such as calling the police, may subsequently be taken.
In one or more embodiments of the invention, the monitoring system generates event history videos of the monitored environment. An event history video may summarize events that have occurred in the monitored environment, for example, throughout a day. An event history video may include multiple segments with differing time scales. For example, segments that are deemed interesting, e.g., when a person is present in the monitored environment, may be played back in real-time or slightly accelerated. For segments with activity deemed less relevant, e.g., when a pet is present in the monitored environment, the playback speed may be further accelerated, whereas the playback speed may be highly accelerated when no activity at all is observed in the monitored environment. In one embodiment of the invention, the generation of an event history video may be initiated when the monitoring system is armed. An event history video may further be generated when the monitoring system is disarmed but active, i.e., when the monitoring system observes the monitored environment without taking actions such as, for example, triggering an alarm.
In one or more embodiments of the invention, the monitoring system (100) includes a camera system (102). The camera system may include a video camera (108) and a local computing device (110), and may further include a depth-sensing camera (104) if the monitored environment is captured and analyzed in three-dimensional space. The camera system (102) may be a portable unit that may be positioned such that the field of view of the video camera (108) covers an area of interest in the environment to be monitored. The camera system (102) may be placed, for example, on a shelf in a corner of a room to be monitored, thereby enabling the camera to monitor the space between the camera system (102) and a back wall of the room. Other locations of the camera system may be used without departing from the invention.
The video camera (108) may be capable of continuously capturing a two-dimensional video of the environment (150). The video camera may be rigidly connected to the other components of the camera system (102). The field of view and the orientation of the video camera may be selected to cover a portion of the monitored environment (150) similar (or substantially similar) to the portion of the monitored environment captured by the depth-sensing camera, if included in the monitoring system. The video camera may use, for example, an RGB or CMYG color CCD or CMOS sensor with a spatial resolution of, for example, 320×240 pixels, and a temporal resolution of 30 frames per second (fps). Those skilled in the art will appreciate that the invention is not limited to the aforementioned image sensor technologies and/or temporal and spatial resolutions. Further, the video camera's frame rate may vary, for example, depending on the lighting situation in the monitored environment.
In one embodiment of the invention, the depth-sensing camera (104) is a camera capable of reporting multiple depth values from the monitored environment (150). For example, the depth-sensing camera (104) may provide depth measurements for a set of 320×240 pixels (Quarter Video Graphics Array (QVGA) resolution) at a temporal resolution of 30 frames per second (fps). The depth-sensing camera (104) may be based on scanner-based or scannerless depth measurement techniques such as, for example, LIDAR, using time-of-flight measurements to determine a distance to an object in the field of view of the depth-sensing camera (104). In one embodiment of the invention, the depth-sensing camera (104) may further provide a 2D grayscale image, in addition to the depth measurements, thereby providing a complete 3D grayscale description of the monitored environment (150). Those skilled in the art will appreciate that the invention is not limited to the aforementioned depth-sensing technologies and/or temporal and spatial resolutions. For example, stereo cameras may be used rather than time-of-flight-based cameras.
In one embodiment of the invention, the volume of the monitored environment (150) is defined by the specifications of the video camera (108) and/or the depth-sensing camera (104). The video camera (108) may, for example, have a set field of view, and the depth-sensing camera (104) may, for example, have a limited minimum and/or maximum depth tracking distance in addition to a set field of view.
In one embodiment of the invention, the camera system (102) includes a local computing device (110). Any combination of mobile, desktop, server, embedded, or other types of hardware may be used to implement the local computing device. For example, the local computing device (110) may be a system on a chip (SOC), i.e., an integrated circuit (IC) that integrates all components of the local computing device (110) into a single chip. The SOC may include one or more processor cores, associated memory (e.g., random access memory (RAM), cache memory, flash memory, etc.), a network interface enabling connection to a network (e.g., a local area network (LAN), a wide area network (WAN) such as the Internet, a mobile network, or any other type of network) via a network interface connection (not shown), and interfaces to storage devices, input and output devices, etc. The local computing device (110) may further include one or more storage device(s) (e.g., a hard disk, an optical drive such as a compact disk (CD) drive or digital versatile disk (DVD) drive, a flash memory stick, etc.), and numerous other elements and functionalities. In one embodiment of the invention, the computing device includes an operating system (e.g., Linux) that may include functionality to execute the methods further described below. Those skilled in the art will appreciate that the invention is not limited to the aforementioned configuration of the local computing device (110). In one embodiment of the invention, the local computing device (110) may be integrated with the depth-sensing camera (104) and/or the video camera (108). Alternatively, the local computing device (110) may be detached from the depth-sensing camera (104) and/or the video camera (108), and may use wired and/or wireless connections to interface with them. In one embodiment of the invention, the local computing device (110) executes methods that include functionality to implement at least portions of the various methods described below.
Continuing with the discussion of the monitoring system (100), in one or more embodiments of the invention, the monitoring system includes a remote processing service (112). The remote processing service may be a computing system, remote from the camera system (102), that performs at least some of the processing, archiving, and classification tasks described below. The camera system (102) may communicate with the remote processing service (112) via a network (116).
In one or more embodiments of the invention, the monitoring system (100) includes one or more portable devices (114). A portable device (114) may be a device (e.g., a laptop, smart phone, tablet, etc.) enabling a user of the portable device (114) to interact with the camera system (102) and/or the remote processing service (112). The user may, for example, receive video streams from the camera system, configure, activate or deactivate the camera system, etc. In one embodiment of the invention, the user may employ a portable device to navigate, control and/or view event history videos and/or to configure the generation of event history videos, as described below.
The components of the monitoring system (100), i.e., the camera system(s) (102), the remote processing service (112), and the portable device(s) (114), may communicate using any combination of wired and/or wireless communication protocols. In one embodiment of the invention, the camera system(s) (102), the remote processing service (112), and the portable device(s) (114) communicate via a wide area network (e.g., over the Internet) and/or a local area network (e.g., an enterprise or home network). The communication between the components of the monitoring system (100) may include any combination of secured (e.g., encrypted) and non-secured (e.g., unencrypted) communication. The manner in which the components of the monitoring system (100) communicate may vary based on the implementation of the invention.
One skilled in the art will recognize that the monitoring system is not limited to the components described above.
The video of the monitored environment (260), in accordance with an embodiment of the invention, is obtained from the video camera (108) of the camera system (102) and may be transmitted via the network (116) to the remote processing service (112), where it may be archived in a video file, e.g., on a hard disk drive. The archiving may alternatively be performed locally by the local computing device (110). The video of the monitored environment may be archived, for example, using ring buffer-like storage with a capacity sufficient to store video data for the desired time span. To increase the amount of video data that may be stored, the video of the monitored environment may be resampled and/or compressed using video compression algorithms (e.g., MPEG-1, 2, or 4, etc.).
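By way of illustration, a ring buffer-like frame archive of the kind described above might be sketched as follows. This is a minimal sketch in Python; the class name, frame rate, and capacity chosen here are hypothetical, not part of the embodiments described above.

```python
from collections import deque

class VideoRingBuffer:
    """Minimal sketch of a ring buffer-like video archive: once the
    buffer is full, the oldest frames are silently discarded."""

    def __init__(self, fps=30, span_seconds=24 * 60 * 60):
        # Capacity sized for the desired time span, e.g., one day at 30 fps.
        self._frames = deque(maxlen=fps * span_seconds)

    def append(self, frame):
        # A deque with maxlen drops the oldest entry automatically when full.
        self._frames.append(frame)

    def frames(self):
        # Archived frames, oldest first.
        return list(self._frames)
```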
The event tags (270), in accordance with an embodiment of the invention, label occurrences of events in the video (260). The event tags may be generated based on an analysis of the video of the monitored environment (260), obtained from the video camera (108), or, if available, based on an analysis of depth data recordings from the depth-sensing camera (104). An event, as illustrated by the exemplary event tags (270), may be, for example, the detected presence of a classified foreground object, such as a person or a pet, in the monitored environment.
In one or more embodiments of the invention, the monitoring system (100) may be able to classify a variety of foreground objects. For example, the monitoring system may be able to distinguish between a human and a pet, based on the size and other characteristics of observed foreground objects. Further, the monitoring system may also be able to distinguish the owner from another person moving within the monitored environment. The distinction may be performed based on visual features and/or using other identifying features such as WiFi and/or Bluetooth signals of the owner's portable device (114). Accordingly, a variety of foreground object classifications that may be used for event tagging may exist.
The event tags (270) may be stored in volatile and/or non-volatile memory of the remote processing service (112). Alternatively, if a foreground object classification is performed locally by the local computing device, the event tags may be stored in volatile and/or non-volatile memory of the local computing device (110). The event tags (270) may be stored separately from the video of the monitored environment (260), for example, by identifying start frames or start times and end frames or end times of event occurrences in the video of the monitored environment (260), or by storing individual frame numbers of the video of the environment (260). Alternatively, the video of the monitored environment (260) itself, e.g., individual video frames or sets of frames, may be tagged with labels indicating the detected classes of foreground objects found in the frames of the video.
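As a sketch of the separate-storage variant, event tags might be kept as simple records of a classification and a start/end frame range. The EventTag structure, field names, and file name below are hypothetical illustrations, not part of the description above.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class EventTag:
    classification: str  # e.g., "person", "pet", "unknown"
    start_frame: int     # first frame of the event occurrence
    end_frame: int       # last frame of the event occurrence

def save_tags(tags, path):
    # Persist the tags separately from the archived video itself.
    with open(path, "w") as f:
        json.dump([asdict(t) for t in tags], f)

save_tags([EventTag("pet", 1200, 3000),
           EventTag("person", 5400, 9000)], "event_tags.json")
```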
In one or more embodiments of the invention, the event history video (280) is a video of the monitored environment, generated from the video (260). The event history video may enable a user to quickly assess events that have occurred. To increase the amount of information provided by the event history video (280) within the limited playback time of the event history video, variable temporal scaling may be applied when generating the event history video (280) from the video (260). For example, for periods during which no activity was observed in the monitored environment, the playback may be highly accelerated, e.g., hours of inactivity in the monitored environment may be displayed within a few seconds. For periods during which events of limited relevance (e.g., a pet being active in the monitored environment) were registered, the playback may be accelerated, although to a lesser degree. In contrast, for periods during which activity deemed relevant is occurring in the monitored environment, only a mild acceleration may be applied, thus enabling the owner to review these events.
As a result, the event history video (280), composed from the segments where variable playback acceleration was applied, is sufficiently short to be reviewed in a limited time, while still showing activities that have occurred with sufficient temporal resolution when necessary.
In one embodiment of the invention, the event history video (280) has a variable length, determined by the combination of events to be included in the event history video. More specifically, in a variable-length event history video, the length of the event history video is governed by factors including, for example, the length of the time interval for which an event history video is to be generated and the classes of events that were detected in that time interval. The details of generating variable-length event history videos are described below.
In an alternative embodiment of the invention, the event history video (280) has a fixed length that may be pre-specified by a user as part of the configuration of the monitoring system. The length of the event history video may be independent from the length of the time interval for which the event history video is generated and from the classes of events that are detected in that time interval. The degree of playback acceleration may be set such that the combination of the events results in an event history video with the desired length. The details of generating fixed-length event history videos are described below.
Those skilled in the art will recognize that the invention is not limited to the exemplary video (260), detected events (270), and event history video (280) described above.
While the various steps in the flowcharts are presented and described sequentially, one of ordinary skill will appreciate that some or all of these steps may be executed in different orders, may be combined or omitted, and some or all of the steps may be executed in parallel.
Software instructions in the form of computer readable program code to perform embodiments of the invention may be stored, in whole or in part, temporarily or permanently, on a non-transitory computer readable medium such as a CD, DVD, storage device, a diskette, a tape, flash memory, physical memory, or any other computer readable storage medium. Specifically, the software instructions may correspond to computer readable program code that, when executed by a processor(s), is configured to perform the methods described below.
In Step 300, a video of the monitored environment is obtained. The video may be obtained continuously using the monitoring system's video camera. The obtained video of the monitored environment may be processed by the local computing device. Processing of the video of the monitored environment may include resampling and compressing the video. Processing may further include forwarding the video of the monitored environment to the remote processing service. In one embodiment of the invention, obtaining the video includes obtaining three-dimensional (3D) depth data from the monitored environment. The 3D depth data may be obtained from the monitoring system's depth-sensing camera. The obtained 3D depth data, combined with the video of the monitored environment, may enable the reconstruction of a full 3D grayscale or color representation of the monitored environment.
In Step 302, the video of the monitored environment is stored. The video may be stored, for example, in a non-volatile storage, e.g. on a hard disk drive, of the local computing device and/or the remote processing service.
In Step 304, occurrences of events are detected. The detection may be performed based on the previously obtained video and/or 3D depth data of the monitored environment. The detection of an occurrence of an event, in accordance with an embodiment of the invention, includes a detection of one or more foreground objects in the video and/or depth data of the monitored environment. The detection may be based on the detection of movement in the video and/or in the depth data. Movement of clusters of pixels in the video and/or the depth data may indicate the movement of objects in the monitored environment. Based on the detection of movement, the monitoring system in accordance with an embodiment of the invention distinguishes foreground objects from the background of the monitored environment. Additional details regarding the distinction of foreground objects from the background of the monitored environment are provided in U.S. patent application Ser. No. 14/813,907 filed Jul. 30, 2015, the entire disclosure of which is hereby expressly incorporated by reference herein.
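One common way to detect moving clusters of pixels is background subtraction. The OpenCV-based sketch below is only illustrative and is not necessarily the method of the incorporated application; the file name and movement threshold are hypothetical.

```python
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
capture = cv2.VideoCapture("monitored_environment.mp4")  # hypothetical file

while True:
    ok, frame = capture.read()
    if not ok:
        break
    mask = subtractor.apply(frame)  # nonzero pixels mark moving regions
    # A sufficiently large cluster of moving pixels becomes a candidate
    # foreground object, to be isolated and classified in later steps.
    if cv2.countNonZero(mask) > 500:  # arbitrary movement threshold
        print("candidate foreground object detected in this frame")

capture.release()
```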
In one embodiment of the invention, the detection of event occurrences is performed in real-time or near-real time as the video and/or depth data of the monitored environment are received. Alternatively, the detection may be performed at a later time. In one embodiment of the invention, the detection is performed by the local computing device. Alternatively, the detection may be performed by the remote processing service.
In Step 306, the detected occurrences of events are classified. More specifically, the foreground objects, identified and isolated from the background in Step 304, are classified, in accordance with an embodiment of the invention. The classification may assign each foreground object to a particular class. Classes may include, but are not limited to, persons, pets, and unknown foreground objects. Classes may further distinguish between different persons, for example, between known and unknown persons. A known person may be a person that the monitoring system is capable of identifying. The identification may be performed based on one or more features associated with that person. These features may include, for example, appearance, size, and known behavioral patterns, including posture and gait. These features may have been learned by the monitoring system. Further, these features may include other distinguishable aspects, such as the presence of a uniquely identifiable signal, for example, the Bluetooth and/or WiFi signal of a portable device carried by the person. A known person may further be a person that the monitoring system may be able to distinguish from other persons, even though the monitoring system does not know the identity of the person. For example, a person wearing a blue sweater who repeatedly appears in the monitored environment may be distinguished from other persons appearing in the monitored environment. Those skilled in the art will recognize that the classification is not limited to a particular type of classes. Classes can be very broad (e.g., a distinction between a foreground object considered a threat and a foreground object considered benign, or a distinction between a person and a pet), or narrower (e.g., a distinction between different persons, or between dogs and cats). The classes of foreground objects to be used by the monitoring system for classification purposes may be specified by the user in a setup procedure of the monitoring system. This setup procedure may include providing the data necessary to enable a classification algorithm of the monitoring system to reliably perform the classification.
The classification algorithm may be any algorithm capable of distinguishing classes of foreground objects and may include, but is not limited to, linear classifiers, support vector machines, quadratic classifiers, kernel estimators, boosting algorithms, decision trees, deep learning algorithms, and neural networks. In one embodiment of the invention, the classification is performed by the local computing device, executing the classification algorithm.
Features considered by the classification algorithm may include any kind of characteristics that may be captured by the video camera and/or by the depth-sensing camera. These characteristics may include, for example, dimensions (e.g., size and volume of the detected foreground object), color and texture (e.g., for recognition of a person based on clothing), movement (e.g., for recognition of a person based on gait and posture), and/or any other visually perceivable characteristics (e.g., for performing a face recognition).
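By way of illustration, a support vector machine (one of the classifier families listed above) operating on such features might be sketched as follows. The feature vectors, labels, and numeric values are hypothetical training data, not data from the described system.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical feature vectors: [height (m), width (m), volume (m^3),
# mean hue, speed (m/s)]; real features could also cover texture or gait.
X_train = np.array([
    [1.80, 0.50, 0.20, 0.10, 1.2],  # person
    [1.70, 0.60, 0.22, 0.60, 1.0],  # person
    [0.40, 0.70, 0.05, 0.05, 0.8],  # pet
    [0.30, 0.60, 0.04, 0.04, 1.5],  # pet
])
y_train = ["person", "person", "pet", "pet"]

classifier = SVC()  # any of the listed classifier families could be used
classifier.fit(X_train, y_train)

# Classify a newly observed foreground object from its feature vector.
observed = np.array([[0.35, 0.65, 0.045, 0.05, 1.1]])
print(classifier.predict(observed))  # -> ['pet']
```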
In one embodiment of the invention, a classification of a detected event occurrence is performed in real-time or near-real time after the event occurrence has been detected. Alternatively, the classification may be performed at a later time. In one embodiment of the invention, the classification is performed by the local computing device. Alternatively, the classification may be performed by the remote processing service.
In Step 308, the video of the monitored environment is tagged with the classifications of the event occurrences, i.e., based on the classifications obtained for the detected foreground objects. Event occurrence tagging may be performed either by tagging individual frames or sets of frames in the video of the monitored environment itself, or alternatively by documenting the occurrence of events in a separate file. This documentation may be based on frame numbers or frame times of either individual frames, or sets of frames marked, for example, by a beginning and an end frame.
In Step 310, an event history video is generated from the video of the monitored environment. The event history video may be generated once the video of the monitored environment and the event tags are available. In another scenario, the event history video may be generated, for example, after the recording of the video of the monitored environment has stopped, or upon user request, e.g., when the user requests viewing of the event history video. The details of Step 310 are described below.
Turning to the generation of the event history video by resampling, in Step 400, event-specific frame drop rates are obtained for the tagged segments of the video of the monitored environment.
In one embodiment of the invention, fixed event-specific frame drop rates are used to generate the event history video. For example, only a single frame may be included in the event history video for each minute of inactivity found in the video of the monitored environment, ten frames may be included for each minute of activity tagged as being of limited relevance, and 120 frames may be included for each minute of activity tagged as being of high relevance. Accordingly, the resulting event history video may have a variable length. The length of the event history video may depend on, for example, the length of the time interval for which an event history video is to be generated and on the classes of events that were detected in that time interval. For example, the event history video may be very short if no activity at all was detected, whereas the event history video may be considerably longer if activities deemed relevant were detected during the time interval. The fixed event-specific frame drop rates may be user-configurable.
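A minimal sketch of such fixed-rate resampling follows, assuming a 30 fps source video and a per-frame event class derived from the tags (both assumptions for illustration only):

```python
# Frames kept per minute of source video, per event class (user-configurable).
KEEP_PER_MINUTE = {"inactivity": 1, "limited": 10, "high": 120}
SOURCE_FPS = 30  # assumed source frame rate

def resample(frames, class_per_frame):
    """Return the frames of a variable-length event history video.
    class_per_frame[i] is the event class tagged for frames[i]."""
    kept = []
    for i, frame in enumerate(frames):
        # Keep every stride-th frame; all others are dropped. (The stride
        # is applied per absolute frame index here, a simplification of
        # true per-segment resampling.)
        stride = (SOURCE_FPS * 60) // KEEP_PER_MINUTE[class_per_frame[i]]
        if i % stride == 0:
            kept.append(frame)
    return kept
```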
In an alternative embodiment of the invention, variable event-specific frame drop rates may be used to obtain a fixed-length event history video. For example, the user configuring the monitoring system may specify that the event history video is to be one minute long, regardless of the length of the time interval for which the event history video is to be generated, and regardless of the types of events detected during that time period. To obtain a fixed-length event history video, only the ratios of the frame drop rates between events of various significance, rather than absolute frame drop rates, may be specified. For example, a ratio of 1:10:120 (inactivity:limited relevance:high relevance) indicates that for each frame of “inactivity”, ten frames of event occurrences considered to be of limited relevance and 120 frames of event occurrences considered to be of high relevance are selected. The actual frame drop rates may then be adapted based on the desired length of the event history video and based on the characteristics of the video and detected event tags.
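The scaling itself might be computed as in the following sketch, assuming the 1:10:120 ratios above and per-class durations measured from the event tags; the function, argument names, and example durations are hypothetical.

```python
RATIOS = {"inactivity": 1, "limited": 10, "high": 120}

def keep_fractions(minutes_per_class, target_frames, source_fps=30):
    """Fraction of source frames to keep per class, scaled so that the
    event history video totals target_frames."""
    # Each class's frame budget is proportional to its ratio x duration.
    weights = {c: RATIOS[c] * m for c, m in minutes_per_class.items()}
    total = sum(weights.values())
    return {c: (target_frames * w / total) /
               (minutes_per_class[c] * 60 * source_fps)
            for c, w in weights.items()}

# Seven hours idle, 30 minutes of a pet, 10 minutes of a person,
# summarized as a one-minute video at 30 fps (1800 frames):
print(keep_fractions({"inactivity": 420, "limited": 30, "high": 10}, 1800))
```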
As previously discussed, both the classes of foreground objects and the corresponding frame drop rates may be user-configurable. Accordingly, a user may decide what time allotment various types of event occurrences receive in the event history video. For example, many users may choose to dedicate significant playback time to foreground objects that pose potential security risks and/or to unknown foreground objects. In contrast, a user who wishes to study his pet's behavior may configure the monitoring system such that significant playback time is dedicated to his pet.
In Step 402, the video of the monitored environment is resampled, based on the event-specific frame drop rates obtained in Step 400. In the resulting event history video, the duration of the various segments of the video of the monitored environment is modulated by the frame drop rates associated with the tags.
The resampling operation may be performed by the remote processing service, if part of the monitoring system. In monitoring systems that do not include a remote processing service, the resampling operation may be performed by the local computing device.
The event history video may be stored, for example, by the remote processing service, from where it may be accessible by portable devices.
Turning to the alternative method for generating an event history video, event-specific time window sizes are first obtained for the tagged segments of the video of the monitored environment.
In one embodiment of the invention, fixed event-specific time window sizes are used to generate the event history video. A fixed event-specific time window size may be determined based on the significance of the event. For example, a time window may be comparatively large if only a pet is present in the monitored environment during the time window, whereas a time window may be relatively short if a person is present in the monitored environment during the time window. Accordingly, the resulting event history video, generated in Step 452, may have a variable length. Specifically, the generated event history video may be longer if events in the monitored environment are considered to be more significant, whereas the event history video may be shorter if events in the monitored environment are considered to be less significant.
The fixed event-specific time window sizes may be user-configurable. Alternatively or additionally, the time windows may be scaled to obtain an event history video of a specified length, analogous to the scaling of the frame drop rates, previously described in Step 400.
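A sketch of such scaling follows, assuming one event history frame per time window; the base window sizes, class names, and example durations are hypothetical.

```python
# Base event-specific window sizes in minutes (hypothetical values).
BASE_WINDOW_MIN = {"none": 60, "pet": 10, "person": 1}

def scaled_window_sizes(minutes_per_class, target_frames):
    """Scale the base window sizes so that the total number of event
    history frames (one frame per window) matches target_frames."""
    frames = {c: m / BASE_WINDOW_MIN[c] for c, m in minutes_per_class.items()}
    scale = sum(frames.values()) / target_frames
    return {c: BASE_WINDOW_MIN[c] * scale for c in minutes_per_class}

# Seven hours empty, one hour of a pet, 30 minutes of a person,
# summarized in 120 event history frames:
print(scaled_window_sizes({"none": 420, "pet": 60, "person": 30}, 120))
```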
In Step 452, an event history video is generated based on events occurring in the video of the monitored environment, during the time windows. The event history video, in accordance with an embodiment of the invention, includes a stationary image of the background region (e.g. the entire background, captured by the camera system in the monitored environment). Foreground object markers that indicate the presence of foreground objects may be superimposed. In one embodiment of the invention, only foreground object markers, but not the corresponding foreground objects themselves, are shown in the event history video. The stationary image may be a frame of the video of the monitored environment, taken at a time when no foreground objects were present in the monitored environment. The stationary image may be displayed for the entire duration of the event history video. The foreground object markers may be superimposed over the stationary image. The superimposed foreground object markers may include color shifts or blurs of color, applied to the stationary image, that indicate the presence of foreground objects in the monitored environment. Different colors may be used to represent different foreground objects. For example, the presence of a person may be encoded by a red color shift, whereas the presence of a pet may be encoded by a green color shift. Further, the intensity of the discoloration may be modulated based on the duration that a foreground object was present.
The event history video, in accordance with an embodiment of the invention, includes a series of event history video frames. Each of these event history frames is generated based on events captured in the video of the monitored environment during a particular time window. A single event history video frame may be obtained from a segment of the video of the monitored environment by applying local color shifts to the stationary image. The intensity of a color shift may be determined separately for each pixel of the single event history video frame. The color shift may be light if the corresponding foreground object, in the segment of the video of the monitored environment, only briefly occupied the region of the pixel. The color shift may be more pronounced if the foreground object occupied the region of the pixel over a prolonged time. In the single event history video frame, the foreground object may thus appear as a colored cloud, where areas in which the foreground object was present during a prolonged time may have a pronounced color shift, whereas areas in which the foreground object was rarely present may have a less visible color shift. As previously noted, different colors may be used to represent different foreground objects or classes of foreground objects. Accordingly, differently colored clouds may coexist in a single event history video frame.
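A minimal sketch of generating one such frame is shown below, assuming per-source-frame boolean foreground masks for each object class; the masks, colors, and blend strength are illustrative assumptions.

```python
import numpy as np

# Per-class color shifts (BGR); e.g., red for persons, green for pets.
CLASS_COLOR = {"person": np.array([0.0, 0.0, 255.0]),
               "pet": np.array([0.0, 255.0, 0.0])}

def history_frame(background, masks_by_class):
    """background: HxWx3 uint8 stationary image. masks_by_class maps a
    class name to a list of HxW boolean masks, one per source frame in
    the time window. Returns a single event history video frame."""
    frame = background.astype(np.float32)
    for cls, masks in masks_by_class.items():
        # Fraction of the time window each pixel was occupied by the object.
        occupancy = np.mean(np.stack(masks).astype(np.float32), axis=0)
        # Blend toward the class color: brief presence yields a light
        # shift, prolonged presence a pronounced one (a "colored cloud").
        frame += occupancy[..., None] * (CLASS_COLOR[cls] - frame) * 0.8
    return np.clip(frame, 0, 255).astype(np.uint8)
```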
A set of consecutive event history video frames may be combined into an event history video. To obtain a smooth transition between the multiple event history video frames, subsequent event history video frames may be generated from overlapping time windows of the video of the monitored environment. For example, the first event history video frame may be generated from a time window that includes minutes 1-10 of the video of the monitored environment. The second frame may be generated from a time window that includes minutes 2-11 of the video of the monitored environment. The third frame may be generated from a time window that includes minutes 3-12 of the video of the monitored environment, etc. Those skilled in the art will appreciate that each time window may be of any configurable duration, and that the overlap between time windows may also be configurable. Further, cross fading may be applied to achieve a smoother transition between the event history video frames.
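The overlapping windows themselves might be enumerated as in this short sketch; the window size and step are configurable assumptions.

```python
def overlapping_windows(total_minutes, window=10, step=1):
    """Yield (start, end) minute ranges, e.g., 0-10, 1-11, 2-12, ..."""
    start = 0
    while start + window <= total_minutes:
        yield (start, start + window)
        start += step

print(list(overlapping_windows(13)))  # [(0, 10), (1, 11), (2, 12), (3, 13)]
```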
Consider a scenario in which the owner's dog is in the monitored environment. For most of the day, the owner's dog sleeps in one corner of the monitored environment, but occasionally the dog wakes up, goes to his food bowl in a different corner, and subsequently returns to his resting corner. Assume that a single event history frame is generated from a one-hour segment of the video of the monitored environment. Further assume that the dog is resting for 55 minutes before he moves to the food bowl and stays there for the remaining five minutes. In a single event history frame, generated for a time window during which the dog is resting, a strong green color shift in the area of the resting corner indicates that the dog is resting at that location during the captured time window. In a later event history video frame, a green color shift is visible at the location of the food bowl, indicating that the dog has moved to the food bowl. Assume now that multiple event history video frames are generated for consecutive ten-minute time windows. In the first five frames, an intense green color shift indicates that the dog is resting in the corner during each entire captured time window. In the last frame, however, during which the dog spent five minutes in his resting corner and five minutes at the food bowl, an intermediate green color shift in both regions indicates that the dog spent approximately the same time in his resting corner and at the food bowl. A very light green color shift between the resting corner and the food bowl indicates the path the dog took from the resting corner to the food bowl. If the event history video frames are played back in consecutive order, a viewer may see the green color shift, representing the dog, first resting in the resting corner, and then moving over to the food bowl. To make the dog's movement patterns show smoothly in the event history video, overlapping consecutive time windows may be used rather than the non-overlapping, consecutive ten-minute time windows described above. In the resulting video, the dog initially shows as a pronounced green color shift at the location of the dog's resting corner. Eventually, during a brief transition period, some frames show the green color shift in the resting corner fade and move toward the food bowl, where the color shift intensifies again, as the dog remains at the food bowl.
In Step 500, a summary of the event history video is displayed to a user. The summary may be a symbolic representation of the event history video that may be displayed along with other event history videos. For example, a list of event history videos may be displayed, e.g., of an entire week, in consecutive order. The symbolic representation may be a timeline that, depending on the available space for displaying the timeline, may include text labels of event occurrences or individual video frames picked to provide a brief summary of the event occurrences in the event history video. Alternatively, the symbolic representation may be an icon and/or a text label. The symbolic representation may also be a highly downsampled event history video that may show, in a limited number of frames, events that are deemed significant. These highly downsampled videos may serve as preview videos that play back repeatedly or even continuously. The resolution of such preview videos may be reduced, thus making them suitable for presentation along with multiple other preview videos on a single screen.
In Step 502, a content selection is obtained from the user. The user may, for example, select a particular event history video for playback. The user may further select a particular segment of an event history video for playback. The selection may be made using a variety of selection criteria. For example, the selection criterion may be time, e.g., the user may select the first ten minutes of an event history video for playback. In one embodiment of the invention, a user may select segments to be played back based on the classifications of event occurrences documented by the event history video. For example, the user may select only segments for playback that are tagged as including unknown or security-relevant foreground objects. In one embodiment of the invention, the foreground objects based on which a content selection may be performed include a particular person, a particular animal, e.g., a pet, or any other foreground object of potential interest to the user. Content selection may further be performed based on classes of foreground objects. For example, the foreground object class “person” or “animal” may be selected, regardless of the detected person or animal, respectively.
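Such classification-based selection over separately stored tags might be sketched as follows; the record layout mirrors the earlier tag sketch, and the values are hypothetical.

```python
tags = [
    {"classification": "pet", "start_frame": 1200, "end_frame": 3000},
    {"classification": "unknown", "start_frame": 5400, "end_frame": 9000},
]

def select_segments(tags, wanted):
    # Keep only the frame ranges whose class the user asked to review.
    return [(t["start_frame"], t["end_frame"])
            for t in tags if t["classification"] in wanted]

print(select_segments(tags, {"unknown", "person"}))  # [(5400, 9000)]
```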
In one embodiment of the invention, foreground objects displayed in an event history video are highlighted. A foreground object may be marked by, for example, a halo. The marking of the foreground object may include color coding, for example, to encode the relevance of the foreground object, which may be defined based on a perceived threat level or any other characteristic of the foreground object.
In one embodiment of the invention, the classification-based selection of segments from the event history video supports multi-camera monitoring systems. In such systems, the selection of a particular classification for playback may cause the multi-camera monitoring system to consider all occurrences of events of the specified classification, regardless of what camera system originally captured the event occurrences. Based on the detected event occurrences, an event history video that may include event occurrences captured by multiple camera systems may then be generated. The event history video may be generated either from existing single camera event history videos, or directly from the videos of the monitored environments and corresponding event tags, obtained from the camera systems of the monitoring system.
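A sketch of the cross-camera selection follows, assuming each camera system contributes (camera, start, end, class) segment tuples; the tuple layout and values are hypothetical.

```python
def merge_across_cameras(segments, classification):
    """Chronologically ordered segments of the given classification,
    regardless of which camera system captured them."""
    matches = [s for s in segments if s[3] == classification]
    return sorted(matches, key=lambda s: s[1])  # sort by start time

segments = [
    ("living_room", 10.0, 30.0, "contractor"),
    ("kitchen", 45.0, 60.0, "contractor"),
    ("hallway", 20.0, 25.0, "housekeeper"),
]
print(merge_across_cameras(segments, "contractor"))
```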
Consider, for example, a scenario in which a homeowner uses his multi-camera monitoring system to track the activity of a contractor while he is not at home. The contractor is authorized to perform work in the living room, but is not supposed to enter any other rooms. On the same day, the housekeeper, who is authorized to enter all rooms, is also present. Each of the rooms is equipped with a camera system in accordance with an embodiment of the invention. The multi-camera system recognizes the contractor (e.g., based on the color of his coat) and accordingly tags all segments of the videos, provided by the cameras of the monitoring system, in which the contractor appears, regardless of the contractor's location. Accordingly, the monitoring system is capable of tracking the contractor within the house. The monitoring system uses a separate classification for tracking the housekeeper. When the homeowner returns and reviews the event history video, the homeowner specifies the classification used for the contractor to review the contractor's activities within the house. Based on the selected classification, all footage that shows the contractor, regardless of which camera system in which room captured the activity, is played back to the homeowner. Because a separate classification is used for the housekeeper, the monitoring system may reliably identify the presence of the contractor within a monitored environment and may avoid confusion with the housekeeper. The homeowner may thus verify whether the contractor has complied with the instructions not to enter any rooms except for the living room.
In Step 504, the selected content is played back to the user. The user may control the playback and may, for example, modulate the playback speed and may skip and/or repeat sections of the selected content.
Embodiments of the invention may enable a monitoring system to generate event history summaries, i.e., video summaries of event occurrences detected in an environment secured by the monitoring system. An event history summary, in accordance with one or more embodiments of the invention, enables a user of the monitoring system to rapidly review and assess event occurrences. The user may specify the relevance of individual classes of event occurrences such that the created event history summary primarily displays event occurrences that are considered relevant, while putting less emphasis on event occurrences deemed less relevant or non-relevant. An event history video may have a pre-specified length, regardless of the time span for which the event history video is to be created. Thus, regardless of whether a video is generated for a period of a few hours only or for a period spanning multiple days, a video of the desired length may always be created. In one embodiment of the invention, the user has control over the playback of the video and may, for example, replay or skip segments of the video, and may further manipulate the playback speed as desired. In one embodiment of the invention, an event history video is generated from videos obtained from multiple camera systems monitoring multiple environments. Thus, occurrences of events may be included in the event history video, regardless of where the events occurred.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.