The present invention relates to an interactive video system including a camera for capturing user motion and a graphical display which is arranged to be altered in response to detection of user motion as captured by the camera, and more particularly the present invention relates to a user interface arranged to display a visual representation of the motion detected by the system to assist in calibrating the system in relation to a surrounding environment.
The use of interactive display surfaces is known in various forms for entertainment, promotion, education and the like. A typical interactive display surface generally comprises a graphical display, such as a video screen to display a graphical image or a surface onto which the graphical image may be projected for display to users within an adjacent environment, together with a system for detecting motion of the users within the adjacent environment. The motion detecting system typically relies on a suitable camera directed towards the adjacent environment and a motion detecting algorithm which analyzes the data captured by the camera to determine what type of motion has occurred. The graphical image can then be varied according to various characteristics of the detected motion. For example, an object displayed in the graphical image may be displaced or varied in size, color, or configuration according to the location or amount of motion detected.
Various examples of interactive display surfaces are described in U.S. Pat. Nos. 7,834,846, 7,259,747, and 7,809,167 all by Bell; U.S. Pat. No. 7,775,883 by Smoot et al; and U.S. Pat. No. 5,534,917 by MacDougall.
Known commercial interactive display surfaces are typically generated by systems which are configured for a dedicated environment due to the complexity of calibrating the system to the conditions of the environment, such as camera placement, video display placement, size of the environment, lighting conditions and the like. Calibration of the known systems to their environment therefore generally must be performed by programmers having considerable knowledge of the system. Installation of known systems is thus generally very costly and cannot be performed by persons who are not experts in the field.
According to one aspect of the invention there is provided an interactive video system comprising:
an output display area arranged to display a graphical image;
an image capturing device arranged to capture a video comprised of a sequence of frames;
a processing system comprising:
a user interface comprising:
The visual representation of the motion event map provides a tool which allows an average person to recognize the effect of various adjustments to the criteria of the motion detecting algorithm. The resulting feedback provided to a user calibrating the interactive video system to the surrounding environment allows users of various skill levels to set up the system easily using conventional computer equipment of relatively low cost. Accordingly, the interactive video system of the present invention is well suited to be set up in various environments which were previously unsuitable for prior art interactive video systems.
Preferably each frame is comprised of pixels and the motion detecting algorithm is arranged to:
compare the pixels of each frame to the pixels of the previous frame to generate a difference map indicating pixels which have changed; and
generate the motion event map using pixels which have changed as indicated by the difference map to define the identified motion events.
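By way of illustration only, the frame comparison described above may be sketched in Python as follows; the grayscale pixel format, the change threshold value, and all function and variable names are illustrative assumptions and do not form part of the invention.

```python
import numpy as np

def difference_map(frame: np.ndarray, previous: np.ndarray,
                   change_threshold: int = 25) -> np.ndarray:
    """Flag the pixels of a frame which have changed since the previous frame.

    Both frames are assumed to be two dimensional arrays of 8-bit
    grayscale pixels; the change threshold is an illustrative default.
    """
    # Absolute per-pixel difference between consecutive frames; the cast
    # to a wider signed type avoids unsigned wrap-around on subtraction.
    diff = np.abs(frame.astype(np.int16) - previous.astype(np.int16))
    # A pixel is marked as changed when its difference exceeds the threshold.
    return (diff > change_threshold).astype(np.uint8)
```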
In one embodiment, the prescribed criteria of the motion detecting algorithm includes a Gaussian blurring function arranged to be applied by the processing system to the difference map to produce the motion event map such that adjustment of the Gaussian blurring function through the user input affects a sensitivity of the motion detecting algorithm to motion.
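A minimal sketch of how such a blurring function might be applied to the difference map follows, assuming the SciPy gaussian_filter function; the sigma parameter and the re-thresholding cut-off are illustrative assumptions, chosen only to show that a larger blur merges neighbouring changed pixels and thereby reduces the sensitivity of the algorithm to small, isolated motions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blurred_event_map(diff_map: np.ndarray, sigma: float,
                      cutoff: float = 0.1) -> np.ndarray:
    """Blur the binary difference map and re-threshold it.

    A small sigma preserves small individual motions as separate events;
    a large sigma merges neighbouring changed regions into fewer, larger
    events, reducing the sensitivity of the system to isolated motion.
    """
    blurred = gaussian_filter(diff_map.astype(np.float32), sigma=sigma)
    return (blurred > cutoff).astype(np.uint8)
```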
Preferably the motion detecting algorithm is arranged to group adjacent pixels which have changed in the difference map to define the motion events such that each motion event represents a group of changed pixels.
The image generating algorithm may be arranged to alter the graphical image according to a size of the motion events, the location of the motion events, or both.
In preferred embodiments there is provided a primary display locating the output display area thereon and an auxiliary display separate from the primary display which locates the controller display area thereon. The auxiliary display may be arranged to visually represent the video of the image capturing device thereon separately from the motion event map.
Preferably the user interface includes a scene selection tool arranged to select one graphical image to be displayed on the output display area from a plurality of graphical images stored on an associated memory of the processing system.
Preferably the user interface includes a camera selection tool arranged to select one image capturing device to capture said video among a plurality of image capturing devices arranged to be associated with the processing system.
The motion detecting system is preferably operable in a first camera orientation mode in which a normal orientation of the video is used to generate the difference map and a second camera orientation mode in which a mirror image of the video relative to the normal orientation is used to generate the difference map. In this instance, the user interface preferably includes a camera orientation selection tool arranged to allow a user to select between the first and second camera orientation modes through the user interface.
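As an illustrative sketch only, the second camera orientation mode may be implemented as a horizontal flip of each frame prior to comparison; the use of np.fliplr is an assumption and not a requirement of the invention.

```python
import numpy as np

def orient_frame(frame: np.ndarray, mirror_mode: bool) -> np.ndarray:
    # In the second camera orientation mode the frame is mirrored
    # horizontally before the difference map is generated.
    return np.fliplr(frame) if mirror_mode else frame
```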
In one embodiment, the image capturing device comprises a web camera including an adjustable brightness control, the brightness control being visually represented on the user interface and being adjustable through the user input. Preferably an adjustable contrast control is also arranged to be visually represented on the user interface and adjustable through the user input.
When using a web camera, the user interface may also include a boundary control arranged to select a designated portion of the frames of the video to be compared by the motion detecting algorithm in which the boundary control is adjustable through said user input.
In an alternative embodiment, the image capturing device may comprise any suitable camera or combination of cameras, for example an infrared camera or a stereoscopic camera, which is arranged to capture a depth field such that each frame is comprised of pixels and each pixel represents a distance of a represented object in the surrounding environment from the image capturing device.
In this instance, the image capturing device is preferably arranged to only represent distance to represented objects which are within a prescribed range of depths. The prescribed criteria of the motion detecting algorithm preferably includes said prescribed range of depths such that the prescribed range of depths is adjustable through the user interface.
The prescribed criteria may also include a depth sensitivity threshold, wherein each frame is comprised of pixels, and wherein the motion detecting algorithm is arranged to:
compare the pixels of each frame to the pixels of the previous frame to generate a difference map indicating pixels which have changed by a distance which is greater than the depth sensitivity threshold; and
generate the motion event map using pixels which have changed as indicated by the difference map to define the identified motion events.
Preferably the depth sensitivity threshold is adjustable through the user interface.
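By way of illustration only, the depth-based comparison described above may be sketched as follows, assuming each frame is a two dimensional array of distances; the range limits, the sensitivity value, and all names are illustrative assumptions rather than features of the invention.

```python
import numpy as np

def depth_difference_map(frame: np.ndarray, previous: np.ndarray,
                         min_depth: float, max_depth: float,
                         depth_sensitivity: float) -> np.ndarray:
    """Flag pixels whose depth changed by more than the sensitivity threshold."""
    def clamp(depths: np.ndarray) -> np.ndarray:
        # Pixels outside the prescribed range of depths are assigned
        # zero depth before the frames are compared.
        in_range = (depths >= min_depth) & (depths <= max_depth)
        return np.where(in_range, depths, 0.0)

    delta = np.abs(clamp(frame) - clamp(previous))
    return (delta > depth_sensitivity).astype(np.uint8)
```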
The prescribed criteria of the motion detecting algorithm may also include a size threshold, wherein each frame is comprised of pixels, and wherein the motion detecting algorithm is arranged to:
compare the pixels of each frame to the pixels of the previous frame to generate a difference map indicating pixels which have changed;
group adjacent pixels which have changed in the difference map into respective groups of changed pixels;
discard groups of changed pixels which are smaller than the size threshold; and
generate the motion event map such that each motion event is defined by a respective one of the groups of changed pixels which is greater than the size threshold.
Preferably the size threshold is also adjustable through the user interface.
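The grouping and size-filtering steps may be sketched with connected-component labelling, for example using SciPy; the connectivity implied by the default of ndimage.label and the returned mask format are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def motion_events(diff_map: np.ndarray, size_threshold: int) -> list:
    """Group adjacent changed pixels and discard undersized groups.

    Returns one boolean mask per motion event, i.e. per group of
    changed pixels larger than the size threshold.
    """
    # Label each connected group of changed pixels with a distinct integer.
    labels, count = ndimage.label(diff_map)
    events = []
    for group in range(1, count + 1):
        mask = labels == group
        # Discard groups of changed pixels smaller than the size threshold.
        if mask.sum() > size_threshold:
            events.append(mask)
    return events
```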
According to a second aspect of the present invention there is provided an interactive video system comprising:
an output display area arranged to display a graphical image;
an image capturing device arranged to capture a video comprised of a sequence of frames in which each frame comprises a plurality of pixels;
a processing system comprising:
a user interface comprising:
The system may further include a primary display locating the output display area thereon and an auxiliary display separate from the primary display which locates the controller display area and the user input thereon such that the visual representation of the motion event map and a visual representation of the adjustable prescribed criteria are arranged to be displayed thereon.
Various embodiments of the invention will now be described in conjunction with the accompanying drawings in which:
FIGS. 2a through 2d are schematic representations of alternative configurations of the interactive video system.
In the drawings like characters of reference indicate corresponding parts in the different figures.
Referring to the accompanying figures there is illustrated an interactive video system generally indicated by reference numeral 10. Although various embodiments of the system are described and illustrated herein, the common features will first be described.
The system 10 generally comprises an output display area 12 such as a primary display surface in the form of a video screen, a screen onto which an image is projected, or any other surface, such as a wall or a floor, onto which an image can be projected from a projector 13. The output display area is generally located adjacent to a surrounding environment locating users 15 therein which interact with the graphical images being displayed on the output display area.
The system 10 further includes an image capturing device 14, typically in the form of a camera arranged to capture video images of the users in the environment adjacent the output display area on which the graphical image is displayed. In further instances, the image capturing device may be arranged to capture video of any moving objects within a target area. In either instance, the captured video comprises a sequence of frames 17 in which each frame is comprised of a two dimensional array of pixels.
The system 10 further includes a processing system 16, for example a personal computer or laptop, having a processor arranged to execute various algorithms stored in the associated memory. Among the algorithms is a motion detecting algorithm which receives the video from the image capturing device and compares adjacent frames of video in the sequence according to prescribed criteria in order to determine where within the two dimensional array and how much motion is occurring at any given time. The motion detecting algorithm detects motion for each frame relative to a previous frame in real time as the video is captured.
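A skeleton of such a real-time processing loop is sketched here with the OpenCV capture interface purely by way of example; the camera index, the grayscale conversion and the omitted event-map stage are illustrative assumptions.

```python
import cv2

def run_motion_detection(camera_index: int = 0) -> None:
    """Capture video and compare each frame to the previous one in real time."""
    capture = cv2.VideoCapture(camera_index)
    ok, previous = capture.read()
    if not ok:
        raise RuntimeError("no camera detected")
    previous = cv2.cvtColor(previous, cv2.COLOR_BGR2GRAY)
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Per-pixel comparison of the current frame against the previous
        # frame; the motion event map would be derived from this difference.
        diff = cv2.absdiff(gray, previous)
        # ... generate the motion event map and alter the graphical image ...
        previous = gray
    capture.release()
```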
The processing system also includes an image generating algorithm which produces the graphical image to be displayed on the output display area. More particularly, the image generating algorithm alters a graphical image being displayed in response to the amount (or size) and location of motion detected within the video frames.
The system further includes a graphical user interface displayed on a controller display area. The controller display area 18 is typically provided in the form of an auxiliary display separate from the primary display locating the output display area thereon, for example the monitor associated with the personal computer or laptop on which the algorithms of the present invention are executed. The user interface permits interaction with an operator of the system through a user input 20, typically in the form of input controls on the computer. The user interface allows the various criteria of the motion detecting algorithm to be visually represented on the controller display area such that the user can readily adjust the criteria through the user input and effectively adjust the sensitivity of the interactive video system 10 to motion for calibrating the system to the surrounding environment.
The camera 14, the projector 13 and the output display area 12 can be arranged in various alternative configurations relative to the surrounding environment, as represented schematically in FIGS. 2a through 2d. In one such configuration, the camera and the projector are located in proximity to one another and are commonly directed towards a surface defining the output display area 12, for example a wall or a floor.
Once the camera and output display area have been configured and connected to a suitable processing system, the operator can then further execute the present invention by following the flow chart represented in the accompanying figures, according to which the system initially attempts to detect a camera associated with the processing system and to display the video feed of the camera on the user interface.
If no camera is detected, the feed remains blank and the user cannot advance to load scenes of graphical images into the associated memory of the processing system. Once the camera is found and the system is in operation, the motion detecting algorithm begins comparing and calculating differences from frame to frame in the video feed and interprets the data as motion. The detected motion is visually represented as a motion event map on the user interface in which areas of change from the frame to frame analysis are identified to the user as motion events based on the current camera settings and the settings of other prescribed criteria of the motion detecting algorithm.
The detection and selection of a camera associated with the processing system is executed by a camera selection tool 22 which forms part of the algorithms of the processing system of the present invention to allow an operator to select one image capturing device to capture the video stream among a potential plurality of image capturing devices arranged to be associated with the system.
To display graphical images through the image generating algorithm, a user must first follow the steps of loading a scene.
A conventional file dialogue opens the scenes folder and an appropriate scene file is selected from a default location or elsewhere on the computer. Users can activate scenes in full screen mode on the current display or on a secondary display such as a projector or additional monitor by clicking and dragging the scene window to the desired location and clicking "full screen" or pressing "enter".
The system of the present invention then begins transmitting motion data to the scene loaded in the image generating algorithm. Each scene uses the motion data to affect different elements in the scene and to create different reactions. These can include triggering, following, avoidance and visibility as different techniques of altering the graphical image being displayed on the output display area.
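As a hypothetical illustration of the "triggering" reaction only, a scene element might test each incoming motion event rectangle for overlap with its own screen region; the rectangle format and the function names are assumptions and not the scene format of the invention.

```python
def rectangles_overlap(a: tuple, b: tuple) -> bool:
    """Test whether two (x, y, width, height) rectangles intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def element_triggered(element_region: tuple, motion_rectangles: list) -> bool:
    # A scene element is triggered when any motion event overlaps its region.
    return any(rectangles_overlap(element_region, r) for r in motion_rectangles)
```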
If a secondary monitor such as a secondary projector is used for the scene, the settings panel remains available on the auxiliary monitor of the computer executing the program. A new scene can be loaded by clicking the “load scene” button graphically represented on the user interface for choosing a new scene. When the control panel displayed on the user interface is closed, the program stops. Before quitting, the application saves all current settings of the prescribed criteria or any other adjustable features and loads the settings on the next restart.
Turning now to the operation of the motion detecting algorithm in greater detail.
When the motion detecting algorithm is operable in the first camera orientation mode, the normal orientation of the video is used to generate the difference map, whereas in the second camera orientation mode a mirror image of the video relative to the normal orientation is used instead.
Once the difference map has been generated in step B of the illustrated sequence, the Gaussian blurring function is applied to the difference map to produce the motion event map 27 in which areas of change 29 are identified.
Once areas of change 29 have been identified in the motion event map 27, the motion detecting algorithm defines each identified area of change 29 formed by a group of adjacent pixels as a motion event. The motion events are represented as respective rectangles 31 when input into the image generating algorithm which alters the graphical image displayed according to the motion events. More particularly, the motion events are used by the algorithm to define a size and location of the areas of motion within the environment captured by the camera. The image generating algorithm can then alter the graphical image displayed on the output display area according to either the amount of motion represented by the size of the identified areas of change or the location of the motion of the users as identified by the location of the identified areas of change within the two dimensional pixel array.
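By way of example only, the reduction of each identified area of change to a rectangle may be sketched as follows; the (x, y, width, height) format is an illustrative assumption.

```python
import numpy as np
from scipy import ndimage

def event_rectangles(event_map: np.ndarray) -> list:
    """Reduce each identified area of change to a bounding rectangle.

    Each rectangle is returned as (x, y, width, height), giving the
    image generating algorithm both the size and the location of the
    motion detected within the two dimensional pixel array.
    """
    labels, _count = ndimage.label(event_map)
    rectangles = []
    for rows, cols in ndimage.find_objects(labels):
        x, y = cols.start, rows.start
        rectangles.append((x, y, cols.stop - x, rows.stop - y))
    return rectangles
```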
The differences between steps C and D of the illustrated sequence demonstrate the effect of adjusting the Gaussian blurring function of the motion detecting algorithm on the resulting motion event map.
In the illustrated embodiment, where a user with extended fingers forms a fist by folding their fingers forwardly and inwardly towards the camera, the difference map initially identifies in Steps A and B several pixels which change as a result of the motion of each finger. According to a first setting of the Gaussian blurring function of the motion detecting algorithm, each finger produces a separate identified area of change 29 in the resulting motion event map.
According to a second setting of the Gaussian blurring function, the blurring is increased such that the areas of change resulting from the individual fingers are merged into a single identified area of change in the motion event map.
While the second setting is simpler and quicker to execute by the processing system due to a single identified area of change instead of several, this setting results in less sensitivity to smaller individual motions. The reduced sensitivity is advantageous when the smaller individual motions would otherwise be so numerous that the processing speed of the system is noticeably reduced. Depending upon the type of motion expected in the surrounding environment to which the image capturing device is directed, the user can adjust the prescribed criteria through the user interface visually represented on the controller display area.
In both settings, the resulting motion event map is visually represented on the controller display area such that the operator receives immediate feedback as to how each adjustment of the prescribed criteria affects the detection of motion.
Turning now more particularly to the first embodiment, in which the image capturing device comprises a web camera, the prescribed criteria which are adjustable through the user interface include a zoom or boundary function together with brightness and contrast controls.
The zoom function generally comprises a boundary control which is adjustable through the user interface such that only a selected designated portion of each frame of video may be used by the motion detecting algorithm for detecting motion. The boundary tool thus functions for cropping the frames of video to concentrate only on one portion of the target area versus another or versus the whole. This can also be accomplished simply by controlling a zoom function of the lens on the web camera to adjust the size and location of the video frames being captured and compared by the motion detecting algorithm.
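A minimal sketch of the cropping performed by the boundary control follows, assuming an illustrative (x, y, width, height) rectangle.

```python
import numpy as np

def crop_to_boundary(frame: np.ndarray, boundary: tuple) -> np.ndarray:
    # The boundary is an illustrative (x, y, width, height) rectangle
    # selecting the designated portion of the frame to be compared.
    x, y, w, h = boundary
    return frame[y:y + h, x:x + w]
```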
The brightness and contrast controls typically comprise existing adjustments associated with the web camera, but they are reproduced and visually represented on the controller display area with the other criteria to allow an operator to adjust these criteria commonly with the other adjustable criteria instead of requiring a separate interface for adjusting these aspects of the camera. Adjustment of any one of the above noted criteria will affect either the quality of the video frames captured by the camera or the manner in which calculations are performed in comparing adjacent frames by the motion detecting algorithm, such that each of the criteria settings has an effect on how the motion event map is generated, which in turn affects the sensitivity of the system to motion.
Turning now to the second embodiment, the image capturing device comprises an infrared camera arranged to capture a depth field such that each pixel of each frame represents a distance of a represented object in the surrounding environment from the image capturing device.
The infrared camera of the illustrated embodiment generally comprises an infrared light source 34, a lens 36, and a processing sub-system 38. The infrared light source effectively projects infrared light into the target area or surrounding environment adjacent the output display area, for example in a grid pattern. The lens 36 captures the infrared light reflected back from objects in the target environment such that the processing sub-system can analyze the captured array and define a three dimensional shape of objects within the target environment by studying how the grid pattern of projected infrared light is altered in the reflection captured by the lens 36.
The three dimensional data for each frame of video is presented as a two dimensional array of pixels in which each pixel represents a value among a range of values corresponding to a depth or distance from the lens of the corresponding object represented by that pixel.
The motion detecting algorithm according to the second embodiment likewise compares the pixels of each frame to the pixels of the previous frame to generate the difference map, but the comparison is performed on the depth values of the pixels rather than on intensity values.
The adjustable criteria used by the motion detecting algorithm to produce the difference map and the subsequent motion event map in this embodiment include a depth sensitivity threshold 40, a motion sensitivity threshold 42, a minimum depth threshold 44, and a maximum depth threshold 46. As in the previous embodiment, these criteria are visually represented on the user interface so as to be adjustable through the user input.
The minimum depth and maximum depth thresholds correspond to minimum and maximum distances from the camera lens which the camera measures as a depth to be recorded in the two dimensional depth fields of pixels defining the video frames 17 captured by the camera. These thresholds define a prescribed range of depths which is adjustable by the user through the user interface to set the boundaries within the surrounding environment locating the users therein where motion is being assessed. The processing sub-system 38 of the camera is thus arranged to only represent distance to objects from the surrounding environment in the frames which are within the prescribed range of depths.
In the second embodiment, the motion detecting algorithm builds each pixel of the difference map 25 as follows. Firstly, the algorithm considers whether the pixel of the relevant video frame and the corresponding pixel within the previous video frame are within the prescribed range of depths by applying the minimum and maximum depth thresholds. Pixels of the video frames outside of the prescribed range of depths are assigned zero depth when corresponding pixels are compared to assess if there is a difference between one frame and the previous frame.
Secondly, the algorithm considers whether the difference between the pixel of the relevant video frame and the corresponding pixel of the previous video frame exceeds the depth sensitivity threshold. When comparing the pixels of each frame to the pixels of the previous frame to generate the difference map, pixels which have changed by a distance which is greater than the depth sensitivity threshold are represented as motion indicating pixels 23 on the difference map, whereas pixels which have not changed by a distance greater than the depth sensitivity threshold are represented as pixels having no change and thus no motion. The depth sensitivity threshold 40 thus relates to the amount of difference in depth required between each pixel of one frame and the corresponding pixel of the previous frame in order to determine if motion or change has occurred at that pixel location.
Once the difference map 25 of a respective video frame has been generated by comparison to the previous video frame, the difference map is used to generate the motion event map 27 in which motion events 29 are defined. This is typically accomplished in two steps. Firstly, the pixels 23 indicating change are grouped together into respective groups of changed pixels, otherwise referred to as blobs 33, within a blob map 35. Secondly, the motion sensitivity threshold 42 of the motion detecting algorithm, which is also adjustable through the user interface, is applied. The motion sensitivity threshold 42 is effectively a size threshold such that only groups of pixels or blobs 33 which exceed the threshold are recorded as motion events 29 in the motion event map 27. Groups of pixels or blobs 33 which are smaller than the size threshold are effectively discarded and no longer considered as motion.
As in the previous embodiment, all of the adjustable criteria used in producing the motion event map are visually represented on the user interface so that the operator can clearly see what each criterion setting is within its respective scale of possible settings. In addition to having a visual representation of the current settings, the visual representation of the motion event map 27 on the user interface allows a user to immediately see the effect of changing each criterion in terms of how motion is detected. The system 10 is thus able to be readily calibrated by operators with minimal technical knowledge regardless of the environment where the interactive video system is to be set up and used.
Since various modifications can be made in my invention as hereinabove described, and many apparently widely different embodiments of same made within the spirit and scope of the claims without departing from such spirit and scope, it is intended that all matter contained in the accompanying specification shall be interpreted as illustrative only and not in a limiting sense.