The invention relates generally to motion detection and more specifically to a surveillance system and method, for use in security systems or the like, in which a moving camera can be used to detect motion in an area.
Conventional security systems typically protect an enclosed area using switches at doors, windows, and other potential entry points. When a switch is activated, an alarm is sounded, a message is generated, or some other means of notifying the appropriate persons and/or discouraging the persons breaching security is activated. It is also known to use passive infrared (PIR) sensors, which sense heat differences caused by animate objects such as humans or animals, to detect the presence of persons in unauthorized areas. Other sensors used in surveillance and security systems include vibration sensors, radio frequency sensors, laser sensors and microwave sensors. Sensors often can be activated erroneously by power surges or large electromagnetic fields, such as occur when lightning is present. Such activation can, of course, trigger a false alarm.
To increase the reliability of security and surveillance systems, video cameras have been used to monitor premises. However, with camera surveillance, a constant communications channel must be maintained with the operator at the monitoring site. It is known to combine video camera surveillance with another sensing mechanism, a PIR sensor, for example, so that actuation of the video camera is initiated by activation of the other sensor and the operator's attention is focused by sounding an alarm or delivering a message. Even so, when monitoring continuous video, even for relatively short periods of time, the operator must maintain constant vigilance, and an operator's ability to pay attention to a video display generally diminishes rapidly to the point where the operator is essentially ineffective after several minutes. Accordingly, video surveillance is labor intensive, expensive, and not always effective.
More recently, video cameras have been used to monitor an area within a field of view, and the resulting image signal is processed to detect any motion in the field of view. U.S. Pat. No. 4,408,224 is exemplary of such systems in which a video camera monitors an area, such as a parking lot, and produces a video signal. The video signal is digitized and stored in a memory and is compared with a previous video signal that has been digitized and stored in a memory. If any difference between the two signals exceeds a threshold, an output is generated and fed to an alarm generation circuit. Various algorithms can be used to compare video signals with one another to determine if motion has occurred in the monitored area. For example, U.S. Pat. No. 6,069,655 discloses comparing video signals on a pixel by pixel basis, generating a difference signal between the two signals, and interpreting any non-zero pixel in the difference signal to be a possible movement. U.S. Pat. No. 4,257,063 discloses a video monitoring system in which a video line from a camera is compared to the same video line viewed at an earlier time to detect motion. U.S. Pat. No. 4,161,750 teaches that changes in the average value of a video line can be used to detect motion.
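By way of illustration only, the pixel by pixel comparison described above might be sketched as follows. This is a simplified sketch and not the implementation of any of the cited patents; the function names, the representation of a frame as a list of rows of grayscale values, and the default threshold of zero are assumptions made for the example.

```python
def difference_signal(prev, curr):
    """Per-pixel absolute difference of two equally sized grayscale frames.

    Each frame is assumed (for this sketch) to be a list of rows of integers.
    """
    return [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def exceeds_threshold(prev, curr, threshold=0):
    """True if any pixel of the difference signal is greater than threshold.

    With threshold=0, any non-zero pixel is treated as possible movement.
    """
    return any(d > threshold
               for row in difference_signal(prev, curr)
               for d in row)


# Hypothetical 2x3 frames: one pixel brightens between captures.
prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 10, 10], [10, 200, 10]]
possible_motion = exceeds_threshold(prev_frame, curr_frame)
```

Raising the threshold trades sensitivity for fewer false alarms caused by sensor noise.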
While the use of video cameras for detecting motion has solved many problems associated with surveillance, some limitations still exist. Specifically, a video camera can only monitor an area within its field of view. The field of view can be increased by locating the camera at a position far away from the area or by using wide angle optics. In either case, each pixel of the imager in the camera will correspond to a larger portion of the area as the field of view is increased. Therefore as the field of view is increased, resolution of the image signal decreases and the ability of the camera to accurately detect motion is reduced. To increase the area covered by a video camera surveillance system, it is well known to provide multiple video cameras. Of course, this increases the cost and complexity of the surveillance system. It is also known to utilize a moving camera to increase the field of view. For example, U.S. Pat. No. 5,473,364 discloses a surveillance system having moving cameras. However, the system disclosed in U.S. Pat. No. 5,473,364 requires complex algorithms, such as affine transforms, for adjusting images for camera movement. Accordingly, such systems are complex and require a great deal of processing power.
An object of the invention is to improve surveillance systems. To achieve this and other objects, a first aspect of the invention is an apparatus for detecting motion in an area. The apparatus comprises an imaging device, such as a camera, having a field of view that is smaller than the area, means for moving the field of view to vary the portion of the area that is covered by the field of view, means for storing a first set of image data captured by the imaging device when the field of view covers a first portion of the area and for storing a second set of image data captured by the imaging device when the field of view covers a second portion of the area, means for determining a fixed object image portion in an overlapping area, means for adjusting at least one of the first set of image data and the second set of image data based on the fixed object image portion to obtain two sets of adjusted image data, and means for comparing the two sets of adjusted image data to determine if any objects in the overlapping area have moved.
A second aspect of the invention is a method for detecting motion in an area of interest. The method comprises recording test image data of a portion of the area having a fixed object therein, selecting a portion of the test image data corresponding to the fixed object, storing the portion of the test image data as learned image data, recording first image data at a first field of view, changing the field of view to a second field of view including the fixed object, recording second image data at the second field of view, recognizing the fixed object in the first image data and the second image data, adjusting at least one of the first image data and the second image data for position based on the position of the fixed object in the first image data and the second image data, and comparing the first image data and the second image data after the adjusting step to determine if motion has occurred in an area encompassed by both the first field of view and the second field of view.
The invention is described through a preferred embodiment and the attached drawing in which:
Imaging section 22 and/or optics section 24 are coupled to panning mechanism 30, which comprises a motive device to move the field of view as desired by moving camera 20, imaging section 22, or optics section 24. For example, the motive device can be the output shaft of a transmission coupled to a motor to rotate camera 20 about an axis or move camera 20 linearly. Further, the motive device can be coupled to a mirror or other element of optics section 24 to change the field of view without the need to move imaging section 22. Panning mechanism 30 can be any device or combination of devices for moving the field of view of camera 20 across a desired area.
Processor 40 of the preferred embodiment can comprise a microprocessor based device, such as a general purpose programmable computer. For example, processor 40 can be embodied in a personal computer, a server, or a dedicated programmable device. Processor 40 includes storage device 42, determining module 44, adjusting module 46, comparing module 48, messaging layer 50, and user interface 52. The various components of processor 40 can be embodied as hardware and/or software, as will become apparent below. Such components are described as separate entities for clarity. However, the components need not be embodied in separate hardware and/or software, and the functionality thereof can be combined or further separated. For example, all of the modules can be embodied in a single executable program file of a control program running on processor 40.
Camera 20 generates a set of image data as an image signal based on the image in the field of view and communicates the signal to processor 40 for processing. As the field of view changes, by virtue of panning mechanism 30, the image signal changes accordingly.
Storage device 42 can include a Random Access Memory (RAM), a magnetic disk, such as a hard disk, or any other device capable of retaining image data. Image data corresponding to the image signal is stored in storage device 42. The image data can be updated periodically, such as every second, every minute, or the like. Because the field of view is changing, the image signal will change over time. Storage device 42 preferably is capable of storing at least two sets of image data at a time for reasons which will become apparent below.
Determining module 44 can include any algorithm or other logic for determining a static portion of an image corresponding to an image signal stored in storage device 42. For example, Principal Component Analysis (PCA) techniques can be used. PCA distributes image data of a multidimensional image space and converts the image data into feature space. The principal components, i.e., the eigenvectors which serve to characterize such space, are then used for processing. More specifically, the eigenvectors are defined respectively by the amount of change in pixel intensity corresponding to changes within the image group, and can thus be thought of as characteristic axes for explaining the image.
A large number of eigenvectors are required to accurately reproduce an image. However, if one only desires to express the characteristics of the outward appearance of an image, the image can be sufficiently expressed using a smaller number of eigenvectors to thereby reduce the required processing power. Known PCA techniques can be used to compare a “learned” image with a current image to recognize patterns in the present image that are similar or identical to the learned image. In the preferred embodiment, the learned image is a designated portion of a previous image signal taken by camera 20 as described in detail below.
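By way of illustration only, the use of a reduced number of eigenvectors might be sketched as follows. The sketch keeps a single eigenvector, found by power iteration on the covariance matrix of a small group of flattened image patches, and expresses a patch by its coefficient along that axis. All function names and the sample data are hypothetical, and power iteration is merely one convenient way to obtain a leading eigenvector; the source does not specify how the eigenvectors are computed.

```python
import math


def mean_vector(samples):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def covariance(samples, mean):
    """Covariance matrix of the samples about the given mean."""
    centered = [[x - m for x, m in zip(s, mean)] for s in samples]
    dim, n = len(mean), len(samples)
    return [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
            for i in range(dim)]


def leading_eigenvector(matrix, iterations=200):
    """Approximate the dominant eigenvector by power iteration.

    Assumes the uniform starting vector is not orthogonal to the
    dominant eigenvector, which holds for the sample data below.
    """
    dim = len(matrix)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        vec = [v / norm for v in nxt]
    return vec


def project(sample, mean, axis):
    """Coefficient of a sample along one characteristic axis."""
    return sum((x - m) * a for x, m, a in zip(sample, mean, axis))


# Hypothetical group of flattened 2x2 patches that vary only in the top row.
patches = [[10, 10, 0, 0], [20, 20, 0, 0], [30, 30, 0, 0]]
mean = mean_vector(patches)
axis = leading_eigenvector(covariance(patches, mean))
coefficient = project([30, 30, 0, 0], mean, axis)
```

Two patches can then be compared by the distance between their coefficients rather than pixel by pixel, which is where the reduction in processing comes from.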
The learned image can be obtained by directing camera 20 toward an area including a substantially fixed object, such as a tree, a sign, a building, or a portion of such an object. The resulting image can be displayed on a screen in user interface 52, such as a CRT display or the like. The operator can then designate the portion of the image representing the fixed object by selecting that portion of the image with a mouse pointer or other input device in a known manner. The portion of the image data representing the fixed object is then stored as a learned image. This learned image can be recognized in subsequent images by determining module 44, using PCA techniques for example, and the position of the learned image in the current image can be output to adjusting module 46.
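By way of illustration only, locating the learned image in a current image might be sketched as follows. For brevity the sketch substitutes an exhaustive sum of absolute differences search for the PCA techniques mentioned above; the function name and the sample data are hypothetical.

```python
def locate_learned_image(image, learned):
    """Return (row, col) of the top-left corner where `learned` best matches.

    Uses an exhaustive sum-of-absolute-differences search, a stand-in for
    PCA-based recognition chosen only to keep the sketch short.
    """
    image_rows, image_cols = len(image), len(image[0])
    rows, cols = len(learned), len(learned[0])
    best_position, best_score = None, None
    for r in range(image_rows - rows + 1):
        for c in range(image_cols - cols + 1):
            score = sum(abs(image[r + dr][c + dc] - learned[dr][dc])
                        for dr in range(rows) for dc in range(cols))
            if best_score is None or score < best_score:
                best_position, best_score = (r, c), score
    return best_position


# Hypothetical learned image (e.g. part of a sign) and a current image containing it.
learned_image = [[9, 8],
                 [7, 6]]
current_image = [[0, 0, 0, 0, 0],
                 [0, 0, 9, 8, 0],
                 [0, 0, 7, 6, 0],
                 [0, 0, 0, 0, 0]]
position = locate_learned_image(current_image, learned_image)
```

The returned position is the value that would be passed on to adjusting module 46.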
Alternatively, a software algorithm of determining module 44 can automatically determine a portion of an image representing a fixed object using any known image analysis technique. For example, determining module 44 can determine a fixed object image portion by comparing successive image data of a test field of view to determine a reference image portion having a fixed object therein, i.e., a portion where data does not change in successive views. The reference image portion can then be compared with portions of the first and second image data to determine which portion of the first and second image data has the fixed object therein. Many reference images can be taken over time to eliminate false fixed objects, such as cars, that may appear fixed but are moved later.
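By way of illustration only, finding a portion where data does not change in successive views might be sketched as follows. The function name, the tolerance parameter, and the sample frames are assumptions made for the example; a practical system would also need to reject featureless regions, which this sketch does not attempt.

```python
def static_mask(frames, tolerance=0):
    """Mark pixels whose value stays within `tolerance` across all frames.

    `frames` is a list of equally sized grayscale images of one test field
    of view, each a list of rows. Returns a same-sized grid of booleans.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    mask = []
    for r in range(rows):
        mask_row = []
        for c in range(cols):
            values = [frame[r][c] for frame in frames]
            mask_row.append(max(values) - min(values) <= tolerance)
        mask.append(mask_row)
    return mask


# Hypothetical successive views: only the top-right pixel changes.
test_frames = [
    [[5, 5, 1], [7, 7, 7]],
    [[5, 5, 2], [7, 7, 7]],
    [[5, 5, 3], [7, 7, 7]],
]
fixed = static_mask(test_frames)
```

Supplying more frames, captured over a longer period, is one way to exclude objects such as parked cars that are only temporarily stationary.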
Adjusting module 46 includes logic for adjusting images based on the determination of determining module 44. In particular, adjusting module 46 compares the position of the learned image in two sets of image data and offsets the image data of at least one set of image data to locate the learned image in the same place in each set of image data. This operation permits the adjusted image data to be compared notwithstanding the fact that the field of view is different for each set of image data.
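By way of illustration only, the offsetting operation might be sketched as follows. The sketch assumes two equally sized images and a purely translational difference between the fields of view, and it crops each image to the region common to both; the function name and sample data are hypothetical.

```python
def align_on_fixed_object(image_a, pos_a, image_b, pos_b):
    """Crop two equally sized images to their common region.

    pos_a and pos_b are the (row, col) positions of the learned image in
    each image. After cropping, the fixed object lies at the same place in
    both results. Returns ([], []) if the views do not overlap.
    """
    d_row = pos_a[0] - pos_b[0]
    d_col = pos_a[1] - pos_b[1]
    rows, cols = len(image_a), len(image_a[0])
    # Pixel (r, c) of image_a corresponds to (r - d_row, c - d_col) of image_b.
    r0, r1 = max(0, d_row), min(rows, rows + d_row)
    c0, c1 = max(0, d_col), min(cols, cols + d_col)
    if r1 <= r0 or c1 <= c0:
        return [], []
    crop_a = [row[c0:c1] for row in image_a[r0:r1]]
    crop_b = [row[c0 - d_col:c1 - d_col] for row in image_b[r0 - d_row:r1 - d_row]]
    return crop_a, crop_b


# Hypothetical views of one scene; the second view is panned two columns right.
view_a = [[1, 2, 3, 4], [11, 12, 13, 14]]
view_b = [[3, 4, 5, 6], [13, 14, 15, 16]]
# The fixed object (value 4) is at column 3 in view_a and column 1 in view_b.
adjusted_a, adjusted_b = align_on_fixed_object(view_a, (0, 3), view_b, (0, 1))
```

Because only an offset is applied, no rotation, scaling, or affine transform is needed in this sketch.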
The adjusted sets of image data are sent to comparing module 48 for comparison in a known manner to ascertain if an object in the area has moved, e.g., an animate object has entered the area of surveillance. Appropriate filters and other logic can be applied to the determination to reduce detection of motion caused by small animals, wind, or the like, in a known manner. In the case of motion detection, messaging layer 50 can send a message, or other signal, to annunciation device 60 which can include an audible alarm, an image display, a phone dialer, or the like, to notify the proper parties and provide the desired information thereto.
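By way of illustration only, one possible filter for suppressing small disturbances might be sketched as follows. The sketch reports motion only when the largest connected group of changed pixels reaches a minimum size; the function names, the 4-neighbor connectivity, and the default thresholds are assumptions, not values taken from the source.

```python
def largest_changed_region(image_a, image_b, pixel_threshold):
    """Size, in pixels, of the largest 4-connected region of changed pixels."""
    rows, cols = len(image_a), len(image_a[0])
    changed = [[abs(image_a[r][c] - image_b[r][c]) > pixel_threshold
                for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    largest = 0
    for r in range(rows):
        for c in range(cols):
            if not changed[r][c] or seen[r][c]:
                continue
            # Flood fill from this pixel to measure its connected region.
            stack, size = [(r, c)], 0
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and changed[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            largest = max(largest, size)
    return largest


def significant_motion(image_a, image_b, pixel_threshold=20, min_region_pixels=3):
    """True only if a sufficiently large changed region exists."""
    return largest_changed_region(image_a, image_b, pixel_threshold) >= min_region_pixels


# Hypothetical adjusted images: a one-pixel change versus a 2x2 change.
before = [[0] * 4 for _ in range(4)]
small_change = [row[:] for row in before]
small_change[0][0] = 255
large_change = [row[:] for row in before]
for y in (1, 2):
    for x in (1, 2):
        large_change[y][x] = 255
```

Here the single changed pixel is ignored while the 2x2 region is reported, which mirrors the idea of discounting small animals or wind-blown debris.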
In step 130, a surveillance image N of the area is recorded with camera 20 at a first field of view and image N is stored in storage device 42. In step 140, the field of view of camera 20 is changed by an incremental amount by panning mechanism 30, while still including the fixed object, and in step 150, surveillance image N+1 is recorded at the new field of view. In step 160, adjusting module 46 adjusts one or both of images N and N+1 for position based on the position of the fixed object recognized by determining module 44 in each image. In step 170, images N and N+1 are compared after adjustment by comparing module 48 to determine if motion has occurred in the area based on a known algorithm. If it is determined that motion has occurred, annunciation device 60 is activated to sound an alarm or take any appropriate action to notify the proper persons or entities that motion has been detected.
At this time, the mode of surveillance can be changed in step 200. For example, an operator may now be given control of panning mechanism 30 to selectively view portions of the area to ascertain the source of motion, or the operator may be presented with various displays automatically. If no motion is detected in step 170, N is set to N+1, i.e., image N+1 becomes image N, and surveillance continues in step 140 in the manner described above. This process can continue until panning mechanism 30 has taken the field of view of camera 20 to the edge of the area and can continue with panning mechanism 30 moving in a reverse direction back across the area.
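By way of illustration only, the loop of steps 130 through 170 might be sketched as follows. To keep the sketch short, each image is a single row of pixels, the fixed object is a pixel with a known marker value, and the comparison is a plain equality test; these simplifications, and all names used, are assumptions made for the example.

```python
def locate_marker(image, marker=9):
    """Position of the fixed object, here a pixel with a known marker value."""
    return image.index(marker)


def align_1d(image_a, pos_a, image_b, pos_b):
    """Crop two equal-length rows to the part of the scene both contain."""
    shift = pos_a - pos_b
    lo = max(0, shift)
    hi = min(len(image_a), len(image_a) + shift)
    if hi <= lo:
        return [], []
    return image_a[lo:hi], image_b[lo - shift:hi - shift]


def run_sweep(frames):
    """Return the index N at which motion is found between images N and N+1.

    `frames` stands in for images recorded at successive fields of view.
    Returns None if the sweep completes without detecting motion.
    """
    iterator = iter(frames)
    image_n = next(iterator)                      # step 130: record image N
    for n, image_next in enumerate(iterator):     # steps 140, 150: pan, record N+1
        adjusted_n, adjusted_next = align_1d(     # step 160: adjust for position
            image_n, locate_marker(image_n),
            image_next, locate_marker(image_next))
        if adjusted_n != adjusted_next:           # step 170: compare
            return n
        image_n = image_next                      # image N+1 becomes image N
    return None


# Hypothetical sweep across the scene 1 2 9 4 5 6, four pixels wide, one pixel per step.
quiet_sweep = [[1, 2, 9, 4], [2, 9, 4, 5], [9, 4, 5, 6]]
# Same sweep, but an object appears (5 becomes 50) before the last capture.
intruder_sweep = [[1, 2, 9, 4], [2, 9, 4, 5], [9, 4, 50, 6]]
```

In this sketch `run_sweep(quiet_sweep)` completes without a detection, while `run_sweep(intruder_sweep)` stops at the pair of images in which the change appears.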
Note that steps 100 through 120, i.e., the recording of the learned image, can be accomplished at the same time as step 130. In other words, the learned image can be captured directly out of the first or subsequent surveillance images. Also, the learned image can be captured again periodically to improve performance. In fact, the learned image can be of plural objects as long as each successive surveillance image includes at least one fixed object in common.
The logic and data manipulation of the invention can be accomplished by any device, such as a general purpose programmable computer or hardwired devices. The imaging device can be any type of sensor for capturing image data, such as a still camera, a video camera, an x-ray imager, an acoustic imager, an electromagnetic imager, or the like. The camera can sense visible light, infrared light, or any other radiation or characteristic. The panning mechanism can comprise any type of motors, transmissions, and the like and can be coupled to any appropriate element to change the field of view of the camera. Any type of comparison and adjustment algorithm can be used with the invention.
The invention has been described through a preferred embodiment. However, various modifications can be made without departing from the scope of the invention as defined by the appended claims and legal equivalents.