Ultra wide-angle cameras (e.g., cameras configured with one or more fisheye lenses, hemispheric dome cameras) are commonly used in security or surveillance applications. While these non-rectilinear (NR) cameras are useful for monitoring wide regions of the environment in which they are deployed, the captured video stream or images are severely distorted by the optical geometry. For example, when a fisheye camera is mounted on a ceiling, the vertical orientation of objects is relative throughout the image (i.e., oriented radially from the center of the image) rather than absolute (i.e., uniformly oriented along the vertical axis of the image). Furthermore, objects are severely distorted (e.g., compressed and warped) by the optical geometry near the edge of the field of view (FOV). One or more projection models may be used to transform the NR imagery into a rectilinear form that can be more easily interpreted by a user or detection models. A dynamic and computationally efficient (e.g., in terms of processor time/bandwidth, communication time/bandwidth) method of configuring the dewarping process for NR cameras is desirable.
In general, one or more embodiments of the invention relate to a method of processing a video stream from an NR camera. The method comprises: obtaining an activity score map that corresponds to a view of the NR camera; obtaining, from the NR camera, an NR image that includes the view of the NR camera; detecting motion in the NR image; generating an updated activity score map by incrementing the activity score map based on the detected motion in the NR image; performing clustering on the updated activity score map to identify a region of interest (ROI) in the NR image; generating dewarping information of the ROI based on a constraint of the NR camera (the dewarping information includes parameters to convert the ROI into a rectilinear output); and outputting the dewarping information of the ROI.
In general, one or more embodiments of the invention relate to a non-transitory computer readable medium (CRM) storing computer readable program code for processing a video stream from an NR camera. The computer readable program code causes a computer to: obtain an activity score map that corresponds to a view of the NR camera; obtain, from the NR camera, an NR image that includes the view of the NR camera; detect motion in the NR image; generate an updated activity score map by incrementing the activity score map based on the detected motion in the NR image; perform clustering on the updated activity score map to identify a region of interest (ROI) in the NR image; generate dewarping information of the ROI based on a constraint of the NR camera (the dewarping information includes parameters to convert the ROI into a rectilinear output); and output the dewarping information of the ROI.
In general, one or more embodiments of the invention relate to a system for processing a video stream from an NR camera. The system comprises a memory and a processor coupled to the memory. The processor is configured to: obtain an activity score map that corresponds to a view of the NR camera; obtain, from the NR camera, an NR image that includes the view of the NR camera; detect motion in the NR image; generate an updated activity score map by incrementing the activity score map based on the detected motion in the NR image; perform clustering on the updated activity score map to identify a region of interest (ROI) in the NR image; generate dewarping information of the ROI based on a constraint of the NR camera (the dewarping information includes parameters to convert the ROI into a rectilinear output); and output the dewarping information of the ROI.
Other aspects of the invention will be apparent from the following description and the appended claims.
Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. Like elements in the various figures are denoted by like reference numerals for consistency.
In the following detailed description of embodiments of the invention, numerous specific details are set forth in order to provide a more thorough understanding of the invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
Throughout the application, ordinal numbers (e.g., first, second, third) may be used as an adjective for an element (i.e., any noun in the application). The use of ordinal numbers is not to imply or create a particular ordering of the elements nor to limit any element to being only a single element unless expressly disclosed, such as by the use of the terms “before,” “after,” “single,” and other such terminology. Rather the use of ordinal numbers is to distinguish between the elements. By way of an example, a first element is distinct from a second element, and the first element may encompass more than one element and may succeed (or precede) the second element in an ordering of elements.
In
As shown in
Most detection models (e.g., shape recognition, facial recognition) are intended for processing undistorted imagery, such as rectilinear image 14. However, the rectilinear FOV 12 in the rectilinear image 14 has a relatively narrow range that is limited by the optical geometry of the rectilinear camera 10. Positioning the camera 10 farther from the surveillance environment 1 to monitor a larger area would result in a loss of detail. Therefore, it is common to use an NR camera, such as a wide-angle view camera (e.g., a hemispheric fisheye camera), to monitor a large surveillance environment 1.
In
As shown in
Because distortion in NR image 24 is largely unique to the exact lens configuration and orientation of the NR camera 20, most detection models would not be effective when presented with NR image 24. For example, machine learning (ML) algorithms are trained to identify objects based on training datasets built from vast databases including various images of an object. Because most photographers capture undistorted rectilinear images, the databases used to train ML algorithms are biased to rectilinear views of the object. As a result, the trained ML algorithm is likely to produce more erroneous results when presented with an NR image of the same object. For example, a real-time object recognition algorithm such as You Only Look Once (YOLO) could be applied to rectilinear image 14 with little to no preprocessing but would likely struggle to provide meaningful results if applied to NR image 24.
Therefore, individual regions of the NR image 24 must be processed (e.g., cropped, reoriented, transformed to remove distortion) to produce a rectilinear output 26. Various “dewarping” methods exist to transform an NR image 24 into a format in which captured objects look similar to those in rectilinear image 14. For example, an equirectangular projection model may be used to unwrap a circular fisheye image into a panoramic format, and a perspective projection model may be used to transform a region of the NR image 24 into a rectilinear format.
In general, embodiments of the invention provide a method, a non-transitory CRM, and a system for processing a video stream from an NR camera. One or more embodiments are directed to configuring dewarping information (e.g., size, resolution, boundaries, orientation information) that identifies one or more ROIs in an NR image for transformation into a rectilinear format. The dewarping information is determined based on one or more constraints (e.g., optical geometry, processing power, frame rate, communication bandwidth) of the NR camera. For example, the claimed method may configure one or more rectilinear outputs 26 (images or video streams) without losing valuable content in the NR image 24 and while conforming to standard video or data processing techniques (e.g., image format, resolution, aspect ratio, computational resource limits).
The system 100 has multiple components, and may include, for example, a buffer 102, a motion engine 104, a clustering engine 106, an ROI engine 108, and, optionally, a command engine 110. Each of these components is discussed in further detail below.
The buffer 102 of the system 100 is configured to store any number of the following: image(s) (X); constraint(s) (C); dewarping information (R); and activity score map(s) (M). Each of these items is described in further detail below.
Each image (X) may be an NR image 24 obtained from a video stream from an NR camera 20, a section of the NR image 24, or a rectilinear image (e.g., a rectilinear output image 26, a reference image, an image recognition template). In one or more embodiments, the buffer 102 may store a video recording or live video stream from the NR camera 20. The images (X) may be saved in the buffer 102 in any imaging format (e.g., file format, size, resolution, compression).
Each constraint (C) may be information related to the properties and/or configuration of the NR camera 20. For example, constraint information may include: an output resolution of the NR camera 20; an optical parameter of the NR camera 20 (e.g., focal lengths, lens distortion information); a frame rate of the NR camera 20; a communication bandwidth of the NR camera 20 or the surveillance system including the NR camera 20; a computational resource limit of the NR camera 20 or the surveillance system including the NR camera 20 (e.g., processing power, number of processing threads, video stream capacity (i.e., the number of streams, processing threads, or communication channels available for use)); control parameters of the NR camera 20 (e.g., motion capability, light mode capability (visible/infrared), processing modes); an orientation constraint of the image recognition algorithm utilized by the NR camera 20; a limitation of an ML algorithm utilized by the NR camera 20 (e.g., resolution/size limits of rectilinear output 26, required computational resources, processing thread allocation); and a number of outputs/ROIs. While the present description includes a limited number of examples, those having ordinary skill in the art will appreciate that alternative examples of constraints may be used without deviating from the gist of the invention.
Dewarping information (R) may be information that defines one or more ROIs in the NR image 24 for transformation into a rectilinear format. For example, coordinates within the NR image 24, dimensions, transformation parameters (e.g., projection type, coefficients for transformations), and/or labels (e.g., ROI identifiers, keywords) may be provided for each ROI in the dewarping information (R). In one or more embodiments, the dewarping information (R) may include weights or coefficients that are used to reconfigure the dewarping information (R) at a later time (e.g., when a stable change occurs in the surveillance environment 1).
Each activity score map (M) may be a representation of detected motion in the FOV 22 of the NR camera 20. For example, the activity score map (M) may be a 2D pixel map with the same dimensions as the original NR image 24, where each pixel corresponds to an activity score (i.e., a level of activity) at the corresponding location in the surveillance environment 1. In one or more embodiments, the activity score map (M) may be a monochromatic bitmap image where the pixel intensity indicates a number of times motion has been detected over a predetermined duration.
In one or more embodiments, the activity score map (M) may be binary (e.g., a single bit that determines whether a motion is detected or not), multibit (e.g., multiple bits to determine how strongly the pixel is emphasized), and/or multi-dimensional (e.g., multiple masks corresponding to different color channels, different spatial dimensions, different collaborators). While the present description focuses on a monochromatic bitmap, those having ordinary skill in the art will appreciate that alternative examples (e.g., conversion into vector format, multibit image for different types of motion (e.g., directionality, duration, repeatability), video map representing evolving motion patterns) may be used without deviating from the gist of the invention. The activity score map (M) may be saved in the buffer 102 in any format (e.g., file format, size, resolution, compression).
In one or more embodiments, the buffer 102 may include any other information required by the system 100 to execute the invention. For example, the buffer 102 may further include instructions for executing one or more transforms (H) (not shown). The transforms (H) may include any number of projection models or coordinate space transformations to convert an NR image into a rectilinear output. For example, as part of the initialization of the system 100, the buffer 102 may be configured with one or more pixel map transformations that map coordinates of each pixel in the NR image 24 to a corresponding pixel location in the rectilinear output. Therefore, values (e.g., RGB or intensity values) at each pixel on the NR image 24 can be copied to the rectilinear output. Furthermore, any appropriate image processing transformation (e.g., rotation, translation, scale, skew, cropping, or any appropriate image processing function) or combination of image processing transformations, such as a convolution of one or more transformations, may be included.
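As an illustration of such a pixel map transformation, the following is a minimal sketch (not the claimed implementation) that precomputes a coordinate map for an assumed equidistant fisheye model with a 180-degree image circle and applies it with array indexing; the focal length, view angles, function names, and image sizes are illustrative assumptions.

```python
import numpy as np

def build_fisheye_to_perspective_map(nr_size, out_size, fov_deg=60.0,
                                     yaw_deg=0.0, pitch_deg=30.0):
    """Precompute, for each pixel of a rectilinear output, the source pixel in an
    equidistant fisheye image (illustrative camera model only)."""
    h_nr, w_nr = nr_size
    h_out, w_out = out_size
    f_out = (w_out / 2.0) / np.tan(np.radians(fov_deg) / 2.0)

    # Ray direction for every output pixel (virtual perspective camera looks along +z).
    xs, ys = np.meshgrid(np.arange(w_out) - w_out / 2.0,
                         np.arange(h_out) - h_out / 2.0)
    rays = np.stack([xs, ys, np.full_like(xs, f_out)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)

    # Rotate the rays to aim the virtual perspective camera at the desired region.
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    r_pitch = np.array([[1, 0, 0],
                        [0, np.cos(pitch), -np.sin(pitch)],
                        [0, np.sin(pitch),  np.cos(pitch)]])
    r_yaw = np.array([[np.cos(yaw), 0, np.sin(yaw)],
                      [0, 1, 0],
                      [-np.sin(yaw), 0, np.cos(yaw)]])
    rays = rays @ (r_yaw @ r_pitch).T

    # Equidistant fisheye projection: radius from image center proportional to
    # the angle between the ray and the optical axis (assumed 180-degree FOV).
    theta = np.arccos(np.clip(rays[..., 2], -1.0, 1.0))
    phi = np.arctan2(rays[..., 1], rays[..., 0])
    r_max = min(h_nr, w_nr) / 2.0
    r = r_max * theta / (np.pi / 2.0)
    map_x = np.clip((w_nr / 2.0 + r * np.cos(phi)).astype(np.int32), 0, w_nr - 1)
    map_y = np.clip((h_nr / 2.0 + r * np.sin(phi)).astype(np.int32), 0, h_nr - 1)
    return map_x, map_y

def apply_pixel_map(nr_image, map_x, map_y):
    """Copy the value at each mapped NR pixel into the rectilinear output."""
    return nr_image[map_y, map_x]

# Example: dewarp a synthetic 960x960 fisheye frame into a 320x240 view.
nr_image = np.random.randint(0, 255, (960, 960, 3), dtype=np.uint8)
mx, my = build_fisheye_to_perspective_map((960, 960), (240, 320))
rectilinear_output = apply_pixel_map(nr_image, mx, my)
```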
The motion engine 104 of the system 100 is configured to detect motion in the NR image 24 obtained from the NR camera 20. In one or more embodiments, the motion engine 104 may compare the NR image 24 to one or more images (X) from the buffer 102 (e.g., a previous frame from the NR camera 20) and identify movement by differences in pixel values that exceed a predetermined threshold.
In one or more embodiments, the motion engine 104 may identify movement based on object analysis (e.g., detecting and analyzing foreground objects in NR image 24). For example, an object of interest may be identified based on an image segmentation technique (e.g., thresholding, edge detection, region extraction) and tracked over one or more images (X) in the buffer 102. Identified objects may be filtered based on one or more characteristics (e.g., size, shape, color, contrast) to provide more reliable motion detection (e.g., less sensitive to noise, changes in ambient light). Changes in measured object characteristics (e.g., change in size implies the object is moving toward/away from NR camera 20) may be recorded as detected motion in the NR image 24.
Any appropriate number or combination of difference detection algorithms may be used to detect motion in the NR image 24.
In one or more embodiments, the motion engine 104 accepts an NR image 24 as an input and outputs an activity score map (M) (e.g., generates a new map, updates a stored map).
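For illustration, the following is a minimal sketch of one such embodiment — frame differencing against the previous NR image with a fixed threshold — assuming 8-bit single-channel frames; the function name and threshold value are illustrative, not part of the claimed system.

```python
import numpy as np

def update_activity_score_map(activity_map, prev_frame, curr_frame, diff_threshold=25):
    """Increment the activity score map wherever the pixel difference between
    consecutive NR frames exceeds a threshold (simple frame differencing)."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    motion_mask = diff > diff_threshold
    # Saturating increment so the monochromatic bitmap stays within 8 bits.
    incremented = np.minimum(activity_map[motion_mask].astype(np.uint16) + 1, 255)
    activity_map[motion_mask] = incremented.astype(np.uint8)
    return activity_map

# Example with synthetic single-channel frames matching the NR image dimensions.
h, w = 960, 960
activity_map = np.zeros((h, w), dtype=np.uint8)
prev = np.random.randint(0, 255, (h, w), dtype=np.uint8)
curr = prev.copy()
curr[400:420, 400:420] = 255              # simulated moving object
activity_map = update_activity_score_map(activity_map, prev, curr)
```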
The clustering engine 106 of the system 100 is configured to identify high intensity regions within an activity score map (M). The clustering engine 106 may utilize one or more clustering algorithms (e.g., K-means clustering, centroid-based clustering, hierarchical clustering) to group collections of pixels (or areas) in the activity score map (M). The clustering engine 106 may define the center and/or boundaries of each cluster to define each corresponding ROI in the NR image 24.
In one or more embodiments, the clustering engine 106 accepts an activity score map (M) as an input and outputs a portion of the dewarping information (R) (e.g., generates new information, updates existing information).
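A minimal sketch of this clustering step, assuming scikit-learn is available, might group high-activity pixel coordinates with K-means and report each cluster's center and bounding box; the cluster count, intensity threshold, and function name are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_activity_map(activity_map, n_clusters=3, intensity_threshold=10):
    """Group high-activity pixels into clusters and return, for each cluster,
    its center, bounding box, and total activity in NR image coordinates."""
    ys, xs = np.nonzero(activity_map >= intensity_threshold)
    points = np.column_stack([xs, ys])
    if len(points) < n_clusters:
        return []
    kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(points)
    clusters = []
    for k in range(n_clusters):
        members = points[kmeans.labels_ == k]
        x0, y0 = members.min(axis=0)
        x1, y1 = members.max(axis=0)
        clusters.append({
            "center": kmeans.cluster_centers_[k],            # cluster center (x, y)
            "bbox": (int(x0), int(y0), int(x1), int(y1)),    # cluster boundaries
            "score": int(activity_map[members[:, 1], members[:, 0]].sum()),
        })
    return clusters
```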
In one or more embodiments, the clustering engine 106 may perform pattern recognition or any appropriate content analysis to identify a stable change in the activity score map (M). A stable change may be any change in a region that persists for a predetermined amount of time. For example, the stable change may be characterized by a change in an intensity level of a pixel between two different captured images (i.e., a value threshold) and/or a change in the number of pixels within a region that exceeds a predetermined threshold (i.e., a count threshold). The predetermined amount of time may be any appropriate value to distinguish stable changes from unwanted artifacts (e.g., camera obstruction, poor quality image, interrupted video feed). In one or more embodiments, the method of detecting the stable change (e.g., image recognition programs, predetermined threshold values, predetermined time intervals) may be dynamically updated.
The ROI engine 108 of the system 100 is configured to define, or refine the definition of, ROIs within the NR image 24. The ROI engine 108 may utilize one or more algorithms or transforms to define each ROI such that the NR image 24 can be reconfigured into one or more rectilinear outputs 26 (i.e., generates or updates dewarping information (R) for that ROI). Furthermore, the ROI engine 108 determines the dewarping information in view of the constraints (C) of the NR camera 20 (i.e., the camera itself and the system supporting the camera). For example, the ROI engine 108 may use a cluster location determined by the clustering engine 106 and a projection transform stored in the buffer 102 to define a region in the NR image 24 that can be converted into a rectilinear output 26. Based on the constraints (C), the region defined by the dewarping information may be limited in size, given a specific aspect ratio, etc. Furthermore, the ROI engine 108 may prioritize (e.g., rank, categorize, or otherwise differentiate) the regions in the NR image 24 when a limited number of ROIs are supported by the NR camera 20.
In one or more embodiments, the ROI engine 108 accepts one or more constraints (C) as an input and outputs a portion of the dewarping information (R) (e.g., generates new information, updates existing information).
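For illustration only, a sketch of how the ROI engine 108 might rank clusters and fit each ROI to constraint values such as a maximum ROI count and a required aspect ratio follows; the constraint keys and the cluster format (from the clustering sketch above) are assumptions rather than a defined interface.

```python
def generate_dewarping_info(clusters, constraints):
    """Select and shape ROIs from activity clusters subject to NR camera constraints.
    `clusters` follows the clustering sketch above; `constraints` is a dict with
    illustrative keys such as "max_rois" and "aspect_ratio"."""
    max_rois = constraints.get("max_rois", 3)
    aspect = constraints.get("aspect_ratio", 4 / 3)   # width / height of the output

    # Prioritize clusters by total activity and keep only what the camera supports.
    ranked = sorted(clusters, key=lambda c: c["score"], reverse=True)[:max_rois]

    dewarping_info = []
    for i, c in enumerate(ranked):
        x0, y0, x1, y1 = c["bbox"]
        w, h = max(x1 - x0, 1), max(y1 - y0, 1)
        # Grow the smaller dimension so the ROI matches the constrained aspect ratio.
        if w / h < aspect:
            w = int(h * aspect)
        else:
            h = int(w / aspect)
        cx, cy = c["center"]
        dewarping_info.append({
            "roi_id": i,
            "center": (float(cx), float(cy)),   # coordinates within the NR image
            "size": (w, h),                     # dimensions of the ROI
            "projection": "perspective",        # transformation parameters
        })
    return dewarping_info
```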
The command engine 110 of the system 100 is configured to execute one or more commands. For example, the command engine 110 may perform dewarping of the NR image 24 based on the dewarping information (R) when the system 100 is configured to both generate the dewarping information and provide the rectilinear output.
In one or more embodiments where the NR camera 20 is configured to accept commands (e.g., pan/tilt/zoom movements to reorient the FOV, frame rate changes, camera mode changes), the system 100 may provide feedback to the NR camera 20 to improve image acquisition. For example, if the motion engine 104 has difficulty identifying movement in a dark image at night, the command engine 110 may cause the NR camera 20 to switch to an infrared mode for better image contrast. In one or more embodiments, the command engine 110 may regulate the frame rate of the NR camera 20 to match the throughput rate of the dewarping process to prevent processing bottlenecks.
In one or more embodiments, the command engine 110 may send a signal to another system based on the analysis performed by the system 100 (e.g., notify security of a stable change in the activity score map).
Those having ordinary skill in the art will appreciate that various commands and/or signals may be used without deviating from the gist of the invention.
Although the system 100 is described with respect to functional components 102, 104, 106, 108, and 110, in other embodiments of the invention, the system 100 may have more or fewer functional components. In addition, each functional component 102, 104, 106, 108, and 110, may be omitted, utilized multiple times (e.g., in serial or parallel), or reordered based on aspects of any given application.
Each of the functional components of system 100 may be implemented in hardware (i.e., circuitry), software (e.g., instructions executed by hardware), or any combination thereof. The functions of each functional component may be shared or performed entirely by other functional components. In addition, each functional component may be executed by the same computing device or on different computing devices connected by a network of any size having wired and/or wireless segments.
By utilizing the above described engines, the system 100 can configure a method for processing imagery generated by an NR camera, as described in further detail below with respect to
NR image 200 is an overhead view of an office space captured through a fisheye camera. The office space includes a series of pathways A, B, C, D that intersect below the fisheye camera. Each pathway A, B, C, D leads to a different section of the office and therefore experiences different foot traffic patterns. Pathway A leads to a multi-function peripheral device 202 (e.g., scanner/copier/printer/fax machine) and to a window 210. Pathway B leads to a cubicle 208 and a meeting room 206. Pathway C leads to the exit 204. Pathway D leads to another area of the office space.
The FOV distortion due to the ultra wide-angle fisheye lens means the NR image 200 would not be directly useful for processing with AI/ML techniques. For example, identifying users of the multi-function peripheral device 202 would be difficult when the facial features are distorted by the fisheye projection. Therefore, the imagery from the NR camera may be processed in accordance with one or more embodiments of the invention to produce rectilinear outputs that can be input into said AI/ML techniques with better outcomes.
As shown in
In one or more embodiments, the ROIs 220 may be selected to include some or all of the identified clusters under a constraint (C) that each ROI must conform to a specified format (e.g., size, aspect ratio, orientation, acceptable amount of distortion based on the NR camera lens). This configuration constraint (C) may ensure that the processing of the NR image 200 is not bottlenecked.
For example, some fisheye cameras are equipped with on-camera dewarping capabilities that are limited by the processing power included in the camera (e.g., number of threads available to process ROIs, memory allocation for each ROI, input/output bandwidth or file size limits). Generating or communicating large rectilinear output files may overtax the system resulting in poor performance (e.g., delays, bottlenecking, lost frames). To avoid these problems, the NR camera may be configured to send raw NR images to a separate computing system (e.g., a server) with more computational resources to perform dewarping with fewer or different limitations (e.g., GPU limitations, AI/ML limitations, multithreading limitations). In either case, the ROIs 220 may be determined in accordance with one or more embodiments, and transmitted to the appropriate component of the surveillance system that performs the dewarping.
In one or more embodiments, the ROIs 220 may be selected to include some or all of the identified clusters under a constraint (C) that each ROI must conform to a surveillance condition (e.g., cycling views of the NR camera with a minimum number of views during a timeframe, prioritized regions of the NR camera FOV). This configuration constraint (C) may ensure that the processing of the NR image 200 is appropriately weighted based on the configuration of the NR camera (i.e., the needs of the user/supervisor that installed the NR camera).
In one or more embodiments, the ROIs 220 may be selected to include all of the identified clusters under a constraint (C) that a minimum amount of the original NR image must be included among the ROIs. This configuration constraint (C) may ensure that the full FOV of the NR camera is utilized (i.e., nearly the entire office space can be monitored).
For example, in
While the present description contains a limited number of examples, those having ordinary skill in the art will appreciate that alternative examples of constraints may be used without deviating from the gist of the invention.
As discussed above with respect to
In this case, the dewarping information (R) may simply include three angular regions of the FOV corresponding to ROIs 220A-C. In general, the dewarping information (R) for the NR camera may include coordinates of each ROI 220 within the FOV of the NR camera, dimensions of each ROI 220 (e.g., in the NR image and/or in the expected rectilinear output), transformation parameters (e.g., projection type, coefficients for a transformation), and/or labels (e.g., ROI identifiers, cluster identifiers, keywords).
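As a concrete illustration of this angular-region case, the following sketch records each of ROIs 220A-C as a sector of the fisheye FOV together with the fields listed above; the field names, angles, and output sizes are illustrative assumptions, not a defined format.

```python
from dataclasses import dataclass

@dataclass
class AngularROI:
    """One entry of the dewarping information (R) for an angular fisheye region."""
    roi_id: str                 # ROI / cluster identifier or keyword label
    start_angle_deg: float      # start of the angular sector in the NR image
    end_angle_deg: float        # end of the angular sector in the NR image
    out_width: int              # dimensions of the expected rectilinear output
    out_height: int
    projection: str             # transformation parameters, e.g. projection type

dewarping_info = [
    AngularROI("220A", 330.0, 30.0, 640, 360, "equirectangular"),
    AngularROI("220B", 30.0, 150.0, 640, 360, "equirectangular"),
    AngularROI("220C", 150.0, 270.0, 640, 360, "equirectangular"),
]
```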
In
In one or more embodiments, the rectilinear outputs 200A-C are not perfect perspective corrections of ROIs 220A-C. For example, when a constraint (C) exists to limit use of computational resources, a simpler transformation (e.g., an equirectangular projection model that “unwraps” the angular regions of the fisheye image) may be preferable to save on computational resources. In other words, the rectilinear output based on the dewarping information (R) may include an output image with some level of distortion. Therefore, a rectilinear output of one or more embodiments may be a two dimensional image with no distortion or an acceptable amount of distortion (e.g., defined by a constraint (C)).
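For illustration, a minimal sketch of such an “unwrap” samples one angular sector of a circular fisheye image along radius and angle into a panoramic strip using nearest-neighbor lookups to keep the computational cost low; the sector angles and output size are illustrative values.

```python
import numpy as np

def unwrap_fisheye_sector(nr_image, start_deg, end_deg, out_size):
    """Equirectangular-style unwrap of one angular sector of a circular fisheye
    image into a panoramic strip (nearest-neighbor sampling)."""
    h_nr, w_nr = nr_image.shape[:2]
    cx, cy = w_nr / 2.0, h_nr / 2.0
    r_max = min(cx, cy)
    h_out, w_out = out_size

    angles = np.radians(np.linspace(start_deg, end_deg, w_out))
    radii = np.linspace(r_max, 0.0, h_out)        # image-circle edge maps to strip top
    a, r = np.meshgrid(angles, radii)
    src_x = np.clip((cx + r * np.cos(a)).astype(np.int32), 0, w_nr - 1)
    src_y = np.clip((cy + r * np.sin(a)).astype(np.int32), 0, h_nr - 1)
    return nr_image[src_y, src_x]

# Example: unwrap one 120-degree sector of a synthetic fisheye frame.
nr_image = np.random.randint(0, 255, (960, 960, 3), dtype=np.uint8)
strip = unwrap_fisheye_sector(nr_image, 30.0, 150.0, (360, 640))
```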
In one or more embodiments, the constraint (C) may include a limitation on the computational resources (e.g., processing power and communication bandwidth requirements) utilized by the dewarping process. For example, in embodiments with dewarping performed by an onboard processor disposed in the NR camera (e.g., an IP camera, a network camera), there may be hardware limits (e.g., I/O, processor, memory limitations) or software limitations (e.g., limited parallel processing, fixed resolution requirements) to maintain uninterrupted operation of the NR camera. Therefore, as shown in
In
In any of the above embodiments, integrating the activity score map (M) into the dewarping process results in more efficient use of computational resources. Furthermore, by focusing on clusters in the activity score map (M) and limiting analysis based on one or more constraints (C), embodiments of the invention produce improved outcomes from detection models that use the rectilinear output of the integrated dewarping process.
At 810, the system 100 obtains an activity score map (M) that corresponds to a view of an NR camera. Obtaining may include initializing a new map (i.e., generating a new file in the buffer 102) or retrieving a stored map from the buffer 102.
At 820, the system 100 obtains, from the NR camera, an NR image that includes the view of the NR camera. Because the NR image includes the view of the NR camera that corresponds to the activity score map (M), at least a portion of the NR image corresponds to the activity score map (M).
Furthermore, the system 100 detects motion in the NR image. As discussed above, the motion engine 104 may detect motion in the NR image by one or more algorithms (e.g., pixel difference with respect to a reference image or previous NR image from the NR camera).
At 830, the system 100 generates an updated activity score map (M) by incrementing the activity score map (M) based on the detected motion in the NR image. Updating the activity score map (M) (i.e., repeating blocks 820, 830) may be repeated any number of times. For example, each pixel of the activity score map (M) could be defined as the average number of times per hour an activity of interest (e.g., a moving object) is detected in NR images acquired during H hours.
At 835, a determination is made as to whether or not the predetermined duration has been reached. The predetermined duration may be quantified as a time period, number of iterations, or any appropriate metric to develop enough data in the activity score map (M). When the determination at 835 is NO (i.e., more data is required for the activity score map (M)), the process returns to 820. When the determination at 835 is YES (i.e., generating/updating the activity score map (M) is complete), the process continues to 840.
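A compact sketch of blocks 810-835, assuming an iteration count as the predetermined duration, a stubbed frame source, and simple frame differencing as the motion detection, might look as follows; the helper names and values are illustrative.

```python
import numpy as np

def build_activity_score_map(get_nr_frame, duration_frames=10, diff_threshold=25):
    """Blocks 810-835: obtain/initialize the activity score map, then repeatedly
    obtain an NR image, detect motion, and increment the map until the
    predetermined duration (here, an iteration count) is reached."""
    prev = get_nr_frame().astype(np.int16)                     # block 820 (first NR image)
    activity_map = np.zeros(prev.shape[:2], dtype=np.uint16)   # block 810 (new map)
    for _ in range(duration_frames):                           # block 835 (duration check)
        curr = get_nr_frame().astype(np.int16)                 # block 820 (next NR image)
        # Motion detection: per-pixel difference (max over color channels).
        motion = np.abs(curr - prev).max(axis=-1) > diff_threshold
        activity_map[motion] += 1                              # block 830 (increment map)
        prev = curr
    return activity_map

# Example with a stubbed frame source producing synthetic RGB NR frames.
frames = iter(np.random.randint(0, 255, (11, 240, 240, 3), dtype=np.uint8))
activity_map = build_activity_score_map(lambda: next(frames), duration_frames=10)
```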
At 840, the system 100 performs clustering on the updated activity score map (M) to identify one or more ROIs in the NR image. As discussed above, the clustering engine 106 may utilize one or more clustering algorithms and define the center and/or boundaries of each cluster to define each corresponding ROI in the NR image.
At 850, the system 100 generates dewarping information (R) of one or more ROIs based on a constraint (C) of the NR camera. As discussed above, the ROI engine 108 may define each ROI such that the NR image can be reconfigured into one or more rectilinear outputs. In other words, the dewarping information (R) includes parameters (e.g., image coordinates, transformation settings/parameters) to convert each ROI into a rectilinear output.
In one or more embodiments, one of the constraints (C) of the NR camera may be based on a video stream capacity of a surveillance system that includes the NR camera. For example, if the surveillance system is only configured to process a predetermined number of images (e.g., one image from each camera installed in the system), the constraint (C) may limit generating the dewarping information (R) to match the capacity of the surveillance system.
In one or more embodiments, where a plurality of ROIs are identified in the NR image, a predetermined number of ROIs may be selected based on the video stream capacity of the surveillance system. Therefore, dewarping information (R) may be generated for each of the predetermined number of ROIs to match the capacity of the surveillance system. For example, in one or more embodiments where a single image is expected from the NR camera but a plurality of ROIs exist, the dewarping information for each of the predetermined number of ROIs includes instructions for combining rectilinear outputs of the respective ROIs into a single output image, as described above with respect to
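For illustration, a minimal sketch of combining several rectilinear ROI outputs into one output image — here by resizing to a common height and tiling horizontally, an assumed layout rather than a defined one — follows.

```python
import numpy as np

def compose_single_output(rectilinear_outputs, target_height=360):
    """Combine rectilinear outputs of several ROIs into one output image so the
    surveillance system receives a single stream from the NR camera."""
    tiles = []
    for img in rectilinear_outputs:
        h, w = img.shape[:2]
        # Nearest-neighbor resize to a common height (keeps the sketch dependency-free).
        scale = target_height / h
        ys = (np.arange(target_height) / scale).astype(np.int32)
        xs = (np.arange(int(w * scale)) / scale).astype(np.int32)
        tiles.append(img[ys][:, xs])
    return np.concatenate(tiles, axis=1)

# Example: three ROI outputs of different sizes combined into one frame.
outs = [np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)
        for h, w in [(360, 640), (240, 320), (480, 640)]]
single_frame = compose_single_output(outs)
```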
At 860, the system 100 outputs the dewarping information (R) of one or more ROIs. For example, in embodiments with dewarping performed onboard the NR camera or via a remote computing system, the system 100 sends the dewarping information to the NR camera or the computing system, respectively.
Optionally, at 870, the system 100 may proceed with dewarping the NR image (e.g., convert the ROI into the rectilinear output) and analyzing the rectilinear outputs. For example, a processor of the system 100 (e.g., installed in the NR camera or remote computing system) may use a projection model to convert the ROI into the rectilinear output based on the dewarping information. In one or more embodiments, the processor may transmit the rectilinear output. In one or more embodiments, the rectilinear output is input into an image recognition algorithm (e.g., any appropriate detection model or surveillance algorithm).
Optionally, at 880, the system 100 may send a command to the NR camera. As discussed above, the command engine 110 may provide feedback to the NR camera to improve image acquisition based on information learned during the dewarping process (e.g., based on an image quality parameter of the NR image or rectilinear output).
For example, optimal ROI configurations may change based on changes in the surveillance environment 1. The stored activity score map (M′) shown in
The change in traffic patterns may be accounted for by running method 900 in the background after an initial set of regions have been configured (e.g., by method 800). Whenever a significant change is detected from a previous activity score map (M), the system 100 can create a new set of ROIs and dewarping information. The method 900 may be applied during regular intervals (e.g., scheduled times), irregular intervals (e.g., asynchronous updates), on command (e.g., user intervention), or in response to any appropriate command of the system 100.
At 910, the system 100 obtains a stored activity score map (M′) that is different from the updated activity score map (M). The stored activity score map (M′) corresponds to stored dewarping information (R′).
Optionally, at 920, the system 100 may apply a filter (e.g., a smoothing filter) to the stored and updated activity score maps (M′) and (M) to obtain smoothed versions of the maps and reduce noise in later processing steps.
At 930, the system 100 computes a difference score (D) between the stored activity score map (M′) and the updated activity score map (M). In one or more embodiments, the difference score (D) may be a normalized difference (e.g., a mean squared difference) between pixel values in the stored activity score map (M′) and the updated activity score map (M).
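A minimal sketch of this normalized mean-squared difference, together with the threshold check of block 935, might look as follows; the function name and threshold value are illustrative.

```python
import numpy as np

def difference_score(stored_map, updated_map):
    """Mean squared difference between two activity score maps, normalized
    to [0, 1] by the maps' value range."""
    m_old = stored_map.astype(np.float64)
    m_new = updated_map.astype(np.float64)
    value_range = max(m_old.max(), m_new.max(), 1.0)
    return float(np.mean(((m_new - m_old) / value_range) ** 2))

# Block 935: reconfigure ROIs only when the maps differ significantly.
stored = np.random.randint(0, 50, (240, 240)).astype(np.uint8)
updated = np.random.randint(0, 50, (240, 240)).astype(np.uint8)
D = difference_score(stored, updated)
reconfigure_rois = D >= 0.05          # illustrative predetermined threshold
```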
In one or more embodiments, the difference score (D) may be weighted by additional information included in the dewarping information (R, R′) or additional channels of the activity score maps (M, M′) (e.g., frequency, periodicity, labels).
In one or more embodiments, the clustering engine 106 may calculate a difference score (D) based on a difference in clustering between the stored activity score map (M′) and the updated activity score map (M).
At 935, a determination is made as to whether or not the difference score (D) is greater than or equal to the predetermined threshold. When the determination at 935 is YES (i.e., significant change between the activity score maps (M, M′)), the process continues to 940. When the determination at 935 is NO (i.e., not enough change between the activity score maps (M, M′)), the process continues to 950.
At 940, the system 100 replaces the stored activity score map (M′) with the updated activity score map (M). ROIs are subsequently identified based on the updated activity score map (M).
At 950, the system 100 uses the stored dewarping information (R′) corresponding to the stored activity score map (M′). In other words, the previously used dewarping information (R′) is still considered applicable and using computational resources on new ROI determinations can be avoided.
Although methods 800, 900 have been described with respect to a limited number of examples, those skilled in the art, having benefit of this disclosure, will appreciate that various other embodiments may be devised without departing from the scope of the present disclosure. Furthermore, while the various blocks in
In one or more embodiments, the dewarping algorithm 1000 accepts an activity score map (M) and constraints (C) as inputs and outputs dewarping information (R). The dewarping algorithm 1000 may be executed using the entire system 100, a subcomponent of the system 100, an additional functional block, or any combination thereof. The dewarping algorithm 1000 may include a clustering algorithm 1010 and/or a machine learning model 1020. For example, the clustering engine 106 may use the clustering algorithm 1010 to identify clusters of activity in the activity score map (M) while the ROI engine 108 generates the dewarping information (R).
In one or more embodiments, the ROI engine 108 may use ML model 1020 to generate the dewarping information (R). The ML model 1020 is designed to accept cluster information and constraints (C) and output dewarping information (R) that conforms to the constraints of the NR camera.
As shown in
Each hidden layer 1022 includes one or more modelling nodes (i.e., neurons) that are interconnected to emulate the connection patterns of the human brain. Each neuron may combine data inputs with a set of network weights and biases for adjusting the data inputs. The network weights may amplify or reduce the value of a particular data input to alter the significance of each of the various data inputs for a task that is being modeled. A bias (e.g., adding a constant to a particular data input) shifts the activation function for an associated task being modeled. The activation function in turn determines whether and to what extent an output of one neuron affects other neurons (e.g., one neuron output may be a weight value for use as an input to another neuron or hidden layer). Through machine learning, the ML model 1020 may determine which data inputs should receive greater priority in determining one or more elements of the dewarping information.
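For illustration only, an untrained toy sketch of such a network — cluster features and constraints in, ROI parameters out — follows; the layer sizes, feature encoding, and output parameterization are assumptions rather than the claimed ML model 1020.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class DewarpingMLP:
    """Toy two-hidden-layer network: input is a vector encoding one cluster
    (center x/y, extent, activity score) plus constraints (max output width/height,
    aspect ratio); output is (center x, center y, width, height) of an ROI."""
    def __init__(self, n_in=7, n_hidden=16, n_out=4, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0, 0.1, (n_in, n_hidden)); self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0, 0.1, (n_hidden, n_hidden)); self.b2 = np.zeros(n_hidden)
        self.w3 = rng.normal(0, 0.1, (n_hidden, n_out)); self.b3 = np.zeros(n_out)

    def forward(self, x):
        h1 = relu(x @ self.w1 + self.b1)   # hidden layer 1022 (weights and biases)
        h2 = relu(h1 @ self.w2 + self.b2)  # hidden layer 1022
        return h2 @ self.w3 + self.b3      # ROI parameters (untrained output)

# Example input: cluster (cx, cy, extent, score) + constraints (max_w, max_h, aspect).
features = np.array([480.0, 300.0, 120.0, 0.8, 640.0, 360.0, 4 / 3])
roi_params = DewarpingMLP().forward(features)
```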
While
Embodiments of the invention may be implemented on virtually any type of computing system, regardless of the platform being used. For example, the computing system 1100 may be one or more mobile devices (e.g., laptop computer, smart phone, personal digital assistant, tablet computer, or other mobile device), desktop computers, servers, blades in a server chassis, or any other type of computing device(s) that includes at least the minimum processing power, memory, and input and output device(s) to perform one or more embodiments of the invention. For example, as shown in
Software instructions in the form of computer readable program code to perform embodiments of the invention may be stored, in whole or in part, temporarily or permanently, on a non-transitory computer readable medium such as a CD, DVD, storage device, a diskette, a tape, flash memory, physical memory, or any other computer readable storage medium. Specifically, the software instructions may correspond to computer readable program code that when executed by a processor(s), is configured to perform embodiments of the invention.
One or more of the embodiments of the invention may have one or more of the following improvements to image processing technologies: a method to automatically configure the regions by monitoring motion activities in the fisheye imagery for a predefined period of time; a method of generating a minimum set of output images to cover the areas with motion activities without losing valuable content in the original NR image; activity score metrics (e.g., number of activities-of-interest detected per time unit) that can be defined and adapted to suit surveillance applications; a method of configuring activity data and dewarping information in a coordinated manner to improve the utility of existing or planned camera setups; a method to adapt configured region(s) to a changing environment while the video processing system is in operation; a method of configuring an optimal set of ROIs for a traffic pattern and allowing downstream processing to detect and process multiple objects in an unaltered view (no composite images), which will likely lead to better outcomes; and a method of generating distortion-free, high-resolution, high-quality images for downstream AI modules. These advantages demonstrate that one or more embodiments of the invention are integrated into a practical application by improving resource consumption and reducing bandwidth requirements in the field of wide-angle surveillance systems.
Although the disclosure has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that various other embodiments may be devised without departing from the scope of the present invention. Accordingly, the scope of the invention should be limited only by the attached claims.