Interactive directed light/sound system

Information

  • Patent Grant
  • 8199108
  • Patent Number
    8,199,108
  • Date Filed
    Monday, August 17, 2009
  • Date Issued
    Tuesday, June 12, 2012
Abstract
An interactive directed beam system is provided. In one implementation, the system includes a projector, a computer and a camera. The camera is configured to view and capture information in an interactive area. The captured information may take various forms, such as an image and/or audio data. The captured information is based on actions taken by an object, such as a person, within the interactive area. Such actions include, for example, natural movements of the person and interactions between the person and an image projected by the projector. The captured information from the camera is then sent to the computer for processing. The computer performs one or more processes to extract certain information, such as the relative location of the person within the interactive area, for use in controlling the projector. Based on the results generated by the processes, the computer directs the projector to adjust the projected image accordingly. The projected image can move anywhere within the confines of the interactive area.
Description
DESCRIPTION OF THE RELATED ART

The present invention is generally related to using human position and movement, as well as other visual cues, as input to an interactive system that reorients one or more directed light or sound sources and modifies their content in real time based on the input.


Detecting the position and movement of a human body is referred to as “motion capture.” With motion capture techniques, mathematical descriptions of a human performer's movements are input to a computer or other processing system. Natural body movements can be used as inputs to the computer to study athletic movement, capture data for later playback or simulation, enhance analysis for medical purposes, etc.


Although motion capture provides benefits and advantages, motion capture techniques tend to be complex. Some techniques require the human actor to wear special suits with high-visibility points at several locations. Other approaches use radio-frequency or other types of emitters, multiple sensors and detectors, blue-screens, extensive post-processing, etc. Techniques that rely on simple visible-light image capture are usually not accurate enough to provide well-defined and precise motion capture.


Some motion capture applications allow an actor, or user, to interact with images that are created and displayed by a computer system. For example, an actor may stand in front of a large video screen projection of several objects. The actor can move, or otherwise generate, modify, and manipulate, the objects by using body movements. Different effects based on an actor's movements can be computed by the processing system and displayed on the display screen. For example, the computer system can track a path of the actor in front of the display screen and render an approximation, or artistic interpretation, of the path onto the display screen. The images with which the actor interacts can be, e.g., on the floor, wall or other surface; suspended three-dimensionally in space; or displayed on one or more monitors, projection screens or other devices. Any type of display device or technology can be used to present images that a user can control or interact with.


In some applications, such as point-of-sale, retail advertising, promotions, arcade entertainment sites, etc., it is desirable to capture the motion of an untrained user (e.g., a person passing by) in a very unobtrusive way. Ideally, the user will not need special preparation or training and the system will not use unduly expensive equipment. Also, the method and system used to motion capture the actor should, preferably, be invisible or undetectable to the user. Many real-world applications must work in environments where there are complex and changing background and foreground objects, changing lighting conditions and other factors that can make motion capture difficult.


Light beams created by simple flashlights, slide projectors, and video projectors, which are designed to project light or an image onto a specific location, can have their light reoriented in real time using mobile mirrors placed in front of the beam. These mirrors are often controlled by one or more stepper motors, allowing precise, computer-controlled movements. Larger motorized mounts can be controlled in a similar manner to redirect the beam by moving the entire light-producing device.


Recent work in the audio domain has produced speakers that can direct a sound beam in the same way that a spotlight directs a light beam. These speakers work by emitting directed ultrasound pulses that disturb the air in a nonlinear way so as to create audible sound in a particular direction.


In the theater and stage environments, there has been a lot of work on finding automated ways for a spotlight to track a moving person on the stage. Current solutions include having an RF (radio frequency) transmitter on the person and then using several detectors to triangulate the person's position. However, these solutions generally require the person being tracked to wear some type of transmitting device.


Hence, it would be desirable to provide an improved interactive directed light/sound system that allows effects to be generated more accurately and in a less intrusive manner.


SUMMARY OF THE INVENTION

The present invention includes a system that allows easy and unencumbered real time interactions between people and reorientable directed light and/or sound systems (henceforth referred to as directed beams) using people's (or other objects') movement, position, and shape as input. The output of this system includes the real time physical reorienting of the directed beams and real time changes in the content projected by the directed beams.


Uses for such a device include, but are not limited to, spotlights that automatically stay on a person as s/he moves, virtual characters that move around in physical space and can interact with people in a variety of ways for informative, entertainment, or advertising purposes, interactive lighting effects for a variety of venues including concerts, dance clubs etc., interactive virtual objects for augmented-reality games, interactive information spaces, and audio instructions aimed at a single person as s/he moves around an enclosed environment.


In one embodiment, the system includes the following components: an image detection system, such as, a video camera, which produces image information; a computer which uses the image information produced by the image detection system, and possibly other information, as input to an application that controls the position and content of one or more directed beams; and one or more directed beams such as a video projector with a motorized mirror in front of it.


In another embodiment, the system includes an image detection system configured to capture information within an interactive area, a first application configured to process the information captured by the image detection system, and a second application configured to receive and use processed information from the first application to generate control information, the control information being used to control a directed beam projected at a location within the interactive area.


The interactive area includes an object. The object can be a person. The directed beam includes an image. The information captured by the image detection system includes information relating to actions taken by the object within the interactive area. The actions taken by the object include natural movements of the person, actions taken by the person with respect to the image, and interactions between the person and the image.


In one embodiment, the image detection system is designed to not suffer interference from the directed beams; for example, the image detection system may be an infrared video camera and may include its own infrared illumination.


The model of the interactive area created by the first application involves extracting information about the position and outline covered by the people or other mobile objects in the interactive area, collectively referred to as the “foreground”. The remaining locations in the interactive area are collectively referred to as “background”. This foreground/background classification may take the form of an image corresponding to the interactive area, with each pixel classified as foreground or background. This information is merged with information about the positions of each of the directed beams, allowing the second application to use this information to compute its output.


The directed beam may also include an audio stream, aimed toward a location within the interactive area. The audio stream may be aimed toward the location of a person, or it may be aimed at the same location as one of the projected images, giving the illusion that the image is making noises or speaking.


The second application is configured to generate the control information without the aid of information provided by a device affixed to the objects or people. The processed information generated by the first application includes background and foreground information. Based on this information, the second application generates image and audio output for the directed beams. This information is also used to determine the direction of the directed beams. Furthermore, the background and foreground information may be used to generate control information such that the location at which the directed light beam is projected is within the background of the interactive area.


In one application, the directed beam is used to provide a spotlight to follow the object within the interactive area. In another application, the image (and perhaps sound) of the directed beam represents an intelligent virtual entity.


Other features and advantages of the present invention may be realized by reference to the remaining portions of the specification, including the drawings and claims. Further features and advantages of the present invention, as well as the structure and operation of various embodiments of the present invention, are described in detail below with respect to the accompanying drawings, in which like reference numbers indicate identical or functionally similar elements.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a simplified schematic diagram illustrating an exemplary embodiment of the present invention.



FIG. 2 is a simplified schematic diagram illustrating an exemplary embodiment of aspects of the present invention.





DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The present invention in the form of one or more exemplary embodiments will now be described. FIG. 1 illustrates one exemplary embodiment of the present invention. This exemplary embodiment includes a system 10 having the following components: a co-located video projector 12, a mirror 14, a computer 16 and an image detection system, such as, a camera 18, mounted on a ceiling.


The camera 18 is configured to view and capture information in an interactive area 20. For example, as shown in FIG. 1, the camera 18 is able to capture information relating to a person or user 22 located within the interactive area 20. The captured information may take various forms, and may include audio data as well as image data. The captured information is based on actions taken by the person 22 within the interactive area 20. Such actions include, for example, natural movements of the person 22 and interactions between the person 22 and the projected image 24. It should be noted that information relating to the person 22 is not obtained via any type of transmitting or monitoring device being worn by the person 22. The captured information from the camera 18 is sent to the computer 16 for processing.


The computer 16 then performs one or more processes to extract certain information, such as, the relative location of the person 22 within the interactive area 20 for use in controlling the projector 12, the mirror 14 and the content of projected images. Based on the results generated by the processes, the computer 16 directs the projector 12 and the mirror 14 accordingly. The projector 12 projects an image onto the mirror 14 which, in turn, causes the image 24 to appear within the interactive area 20. For example, the computer 16 may direct the light of the projector 12 to cause the image 24 to appear to move within the interactive area 20 by adjusting the mirror 14. The adjustment may be performed by using a motorized mirror 14. The image 24 can move anywhere within the confines of the interactive area 20, and may leave the interactive area 20 in some cases.


The projected image 24 is a specific form of a directed beam. The directed beam includes any video or audio information, or a combination thereof. Optionally, the system 10 may include audio components 15 that are also controlled by the computer 16 using the results generated by the processes. For example, as shown in FIG. 1, in addition to causing the projected image 24 to appear, the computer 16 may also direct audio components 15 to generate directed sounds to the area around image 24. These audio components 15 may include a speaker system that can direct sound in a narrow beam, including parabolic speakers, and ultrasonic emitter systems such as the HyperSonic Sound (HSS) system made by American Technology Corporation. The audio beam 17 may be redirected by the computer 16 to a new location by use of a motorized mount 13 for the audio components 15.


Furthermore, it should be noted that while one directed beam 24 and one directed audio beam 17 are shown in FIG. 1, multiple directed beams 24 and 17 can be generated and controlled by the system 10.


The image detection system is used to obtain a view of the interactive area 20 and any objects (including, for example, a person) located therein. As mentioned above, in one implementation, the image detection system is a camera including, for example, a video camera, a stereo video camera, a cluster of video cameras, or a time-of-flight 3D (3-dimensional) camera system. In one implementation, the image detection system interacts with the computer 16 in real-time.


In some configurations, the directed beam produced by the projector 12 is a visible light beam. To prevent this beam from interfering with the image detection system, several methods can be employed. For example, if the image detection system is a single video camera, a stereo video camera, or a cluster of video cameras, the camera can operate at a wavelength that is not used by the projector 12 to generate the directed beam, such as infrared. Consequently, the camera is not affected by the directed beam. Various methods can be employed to improve the quality of the camera's image and decrease interference, including the illumination of the interactive area with infrared LEDs visible to the camera and the use of a narrow-bandpass infrared filter in front of the camera. This filter only passes light of the wavelengths emitted by the LEDs. Also, the quality of the image can be further improved by strobing the LEDs in time with the camera exposure, or strobing the LEDs on and off for alternate camera frames. Some implementations of these techniques are described in U.S. patent application Ser. No. 10/160,217, entitled “INTERACTIVE VIDEO DISPLAY SYSTEM” filed on May 28, 2002, and U.S. patent application Ser. No. 60/504,375, entitled “SELF-CONTAINED INTERACTIVE VIDEO DISPLAY SYSTEM” filed on Sep. 18, 2003.


As mentioned above, the computer 16 performs one or more processes to extract information from data provided by the image detection system. In one embodiment, these processes are designed to be modular. These processes can be implemented in the form of control logic using software or hardware or a combination of both, and may be performed by one or more computer processors.



FIG. 2 shows one exemplary embodiment of systems of the present invention. The systems may be implemented in the form of control logic using software, hardware, or any combination thereof. System 100 includes a computer vision system for processing camera information 106, a model of real and virtual objects in the interactive area 114, an application 126 that uses the information from the interactive area model 114 to create interactive output, and a rendering engine 128 and a directed beam controller 132 that update the content and position of the directed beams, respectively.


One or more video cameras 102 produce output images in real time, which are input into a calibration and merging system 104. Calibration and merging system 104 includes information about the physical position of the area viewed by each camera such that the multiple camera images may be stitched together. In one embodiment, the information may have been attained during an initial calibration process and may include (x,y) coordinates of the position on the interactive area's surface for each of the four corners of each image. Although four corners are described, it will be understood that any number of corners may be used. Stitching the camera images together may involve making affine, perspective, or other transformations of the images and compositing them into a single image in the interactive area's coordinate space. In the case of a single camera pointed perpendicular to the interactive area, this step may not be needed. The output of calibration and merging system 104 may be a single image of all camera images composited together in a seamless manner such that the coordinates of an object in the image correspond to a scaled version of the coordinates of an object in the interactive area.


The vision processing system 106 receives the image output of calibration and merging system 104. Vision processing system 106 analyzes this image of interactive area 20 to distinguish between objects that may move (such as people and objects that may be moved), collectively referred to as “foreground”, and the remaining static parts of the interactive area, collectively referred to as “background”. The location, movement, and actions of the foreground may eventually serve as input into the interactive application 126. This foreground/background classification may take the form of an image 108 corresponding to the interactive area, with each pixel classified as foreground 110 or background 112. Classification image 108 shows a top-down view of two people classified as foreground in the interactive area. Classifications that produce information on the shape and/or outline of each foreground object, such as the information in a classification image 108, provide the advantage of allowing more sophisticated interactions with the directed beams, such as physics simulations and gesture recognition. Vision processing system 106 may also or alternately produce and track the location of each foreground object in the interactive area. Alternatively, the vision processing system 106 may run on each camera image 102, and the calibration and merging system 104 may then take as input the vision images for each camera, and integrate them into a single, calibrated vision image.
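By way of illustration only, the per-pixel foreground/background classification described above can be sketched with a simple background-difference threshold. This is an assumed stand-in technique, not the method of the referenced applications; the function names, the stored `background` reference image, and the `threshold` value are hypothetical.

```python
import numpy as np

def classify_foreground(frame, background, threshold=30):
    """Return a boolean image: True = foreground, False = background.

    Assumption: `background` is a reference image of the empty interactive
    area, and a pixel is foreground if it differs from it by > threshold.
    """
    # Widen to int16 so subtracting uint8 pixels cannot wrap around.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

def foreground_centroid(mask):
    """Return the (x, y) centroid of the foreground pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())

# Synthetic example: an empty 6x8 scene with a bright 2x2 "person".
background = np.full((6, 8), 10, dtype=np.uint8)
frame = background.copy()
frame[2:4, 3:5] = 200

mask = classify_foreground(frame, background)
centroid = foreground_centroid(mask)
```

In this sketch the boolean `mask` plays the role of a classification image, and `centroid` is one simple form of location information that could be passed on to an application.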


The model 114 of the interactive area includes a data structure 116 that stores information about the physical location of real and virtual objects within the interactive area. This allows the position and outline of foreground objects 118 to be compared with the location of the images projected by the directed beams 120. The dots 121 at the four corners of the directed beam 120 are merely schematics representing the four corners of the projector's screen, with the top left corner represented by a white dot; these dots do not represent anything about the content of the actual projected image. Since each projected image changes size and shape as it moves around, it may be important to consider the positions of the corners of the projected image, not just the center. Data structure 116 allows for the direct comparison of data from the vision processing system 106 and data about the location and boundaries of the images created by the directed beams.


Information relative to projected image 122 is a coordinate transformation of the information in the model 114 including the vision foreground/background information. The transformation is designed to undistort the projected image's screen such that its corners substantially form a rectangle 123 with the same aspect ratio as the projected image. The result is a set of locations 124 in the coordinate space of the projected image including information about the location and shape of vision foreground information 125 and any other directed beams that may overlap the projected image. The transformed vision foreground information may take the form of an image, with each pixel classified as foreground or background. Alternately or in addition, information about the locations of foreground objects may be provided as a transformed coordinate value. The necessary coordinate transformations may be accomplished via an affine, perspective, or other such transformation as is known in the art. Many methods also exist for rendering a coordinate-transformed image onto another image, including rendering engines such as OpenGL.


Set of locations 124 allows direct analysis of the physical overlap between the undistorted content of the projected image and foreground objects in the interactive area. This simplifies the computation of interactions between these foreground objects and virtual objects in the projected image, as well as the generation of visual effects in the projected image based on the position or outline of the foreground objects. In addition, it allows the computation of overlap between the projected image and any content from another directed beam, allowing the creation of black masks that prevent the content of the two projected images from overlapping.
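As a non-limiting sketch of the black-mask computation, the following assumes that the four corners of the other directed beam have already been transformed into the pixel coordinate space of the projected image and that they form a convex quadrilateral. The function names and the pixel-center sampling are illustrative assumptions.

```python
import numpy as np

def inside_convex_quad(points, quad):
    """Boolean array: True where each (x, y) point lies inside the convex quad.

    quad is a 4x2 sequence of corners listed in a consistent winding order.
    """
    points = np.asarray(points, dtype=float)
    quad = np.asarray(quad, dtype=float)
    signs = []
    for i in range(4):
        a, b = quad[i], quad[(i + 1) % 4]
        edge = b - a
        rel = points - a
        # 2D cross product: its sign tells which side of edge a->b a point is on.
        signs.append(edge[0] * rel[:, 1] - edge[1] * rel[:, 0])
    signs = np.stack(signs)
    # Inside means on the same side of all four edges (either winding order).
    return np.all(signs >= 0, axis=0) | np.all(signs <= 0, axis=0)

def black_mask(width, height, other_quad):
    """Mask (True = render black) over a width x height pixel grid."""
    xs, ys = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    centers = np.column_stack([xs.ravel(), ys.ravel()])
    return inside_convex_quad(centers, other_quad).reshape(height, width)

# Hypothetical example: another beam covers x in [2, 6], y in [1, 3].
other_quad = [(2, 1), (6, 1), (6, 3), (2, 3)]
mask = black_mask(8, 4, other_quad)
```

Pixels where `mask` is true would be drawn black by the rendering engine so that the two projected images do not overlap.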


The application 126 that creates interactive effects includes any software, hardware, or any combination thereof that takes as input the model of interactive area 114 and/or the vision information relative to projected images 122 as well as other inputs, and outputs content and position for one or more directed beams. This application is open-ended, and can be programmed to have any desired output behavior given its inputs. Examples of output for application 126 include an interactive character, a virtual sport, or a set of visual effects or spotlights that follows users around. Depending on the design of the application, it may choose to use the vision data relative to the undistorted projected images 122, or it may choose to use the model of the overall interactive area 114 in determining its outputs.


The rendering engine 128 receives input image information from the application 126 about the content to be rendered for each directed beam. This image information may take the form of images, textures, points, vectors, polygons, and other data that may be supplied to a rendering engine. Also, more than one rendering engine may be present. The outputs of these rendering engines are images to the video projectors 130 or other display devices controlled by the application. These display devices receive video image inputs in any one of a variety of standard ways, including analog composite, s-video, 15-pin analog, and DVI.


The directed beam controllers 132 receive input from the application 126 about the desired new positions of the directed beams. This information may be provided in a variety of ways. Many of these ways involve specifying information in the coordinate space of the model 114 of the interactive area. For example, a direction and speed of motion for each beam may be specified, or a destination location and a time to arrive there. However, this information may not directly correspond to settings 134 for mirror 14 or the mount motors on the directed beam. Thus, a mapping 138 between information about the motor settings 134 and the physical position 136 in the interactive area of the screen may be used. The mapping may be established during a calibration process for each directed beam, and allow the translation of any physical position information within the interactive area into a new group of settings for the motors 140. These motors may be controlled by the controller 132 in a variety of ways, including the DMX protocol, a serial connection, and Ethernet. A variety of off-the-shelf motor control mechanisms using these protocols exist, including the I-Cue Intelligent Mirror by Rosco Laboratories. The I-Cue is a mirror connected to two motors that can control the pan and the tilt of the mirror. The motors can be controlled via the DMX protocol from a computer.
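For illustration, a request specified as a destination location and a time to arrive there can be expanded into intermediate positions in the coordinate space of the model, each of which would then be translated into motor settings through the mapping. The sketch below assumes straight-line motion at constant speed and a fixed controller update rate; the function name and parameters are hypothetical.

```python
def plan_waypoints(start, destination, arrival_time, update_rate_hz):
    """Linearly interpolated (x, y) positions, one per controller update.

    Assumption: constant-speed, straight-line motion in model coordinates.
    """
    steps = max(1, round(arrival_time * update_rate_hz))
    (x0, y0), (x1, y1) = start, destination
    return [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(1, steps + 1)
    ]

# Move from (0, 0) to (2, 1) in half a second at four updates per second.
waypoints = plan_waypoints((0.0, 0.0), (2.0, 1.0), arrival_time=0.5, update_rate_hz=4)
```

The last waypoint always equals the destination, so rounding of the step count does not leave the beam short of its target.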


In addition, as the position of the projected image changes, the projected image may need to be refocused. Unless an autofocus system is included in the projector, the focus settings for each position of the image may be found as part of the mapping 138. The focus information may then be sent to the projector 130. The information may be sent through a variety of methods, for example, a serial or Ethernet connection. Finally, the information about the new positions of the corners of the screen of each directed beam may be passed back to the model 114 of the interactive area in order to provide up-to-date information.


The system 100 may also incorporate one or more directed audio beams. The control systems are very similar to those of directed video beams, except that the application 126 would provide sound information to be played on the speakers of the directed audio beams as well as provide information to a directed beam controller 132 that has the correct mapping 138 for the audio beam.


In one embodiment, one process is designed to handle detection and/or identification of interesting features of the interactive area 20 viewed by the image detection system. Another process is designed to utilize the information produced by the previous process to generate an output that is used to control the directed beam. A number of illustrative, non-exhaustive examples of each process are described below.


For ease of reference, the components that are used to provide feature detection are referred to as the vision processing system 106. There are several things that can be sensed using the vision processing system 106. One such feature is whether each pixel in the image or scene of the interactive area 20 should be treated as foreground or background. Examples of the vision processing system are described in U.S. patent application Ser. No. 10/160,217, entitled “INTERACTIVE VIDEO DISPLAY SYSTEM” filed on May 28, 2002 and No. 60/514,024, entitled “METHOD AND SYSTEM FOR PROCESSING CAPTURED IMAGE INFORMATION IN AN INTERACTIVE VIDEO DISPLAY SYSTEM” filed on Oct. 24, 2003, which are adept at separating foreground and background using an image input provided by a single video camera. If a stereo camera is used to provide image input, then a stereopsis algorithm (either in hardware or in software) can be used to derive 3D information about the scene. This 3D information can be used to define or refine foreground/background distinctions. Using a time-of-flight camera, the 3D information can be obtained directly from the hardware.


Using the foreground/background distinctions, and optionally camera output and/or 3D information as well as other types of information, a variety of person tracking algorithms can be applied to maintain continuous knowledge of the present and past positions of objects in the interactive area 20.
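One simple, illustrative person tracking approach is greedy nearest-neighbor association of per-frame detections with existing tracks. This is only one of many possible algorithms and is not asserted to be the one used; the function name, the `max_distance` gate, and the track representation are assumptions.

```python
import math

def update_tracks(tracks, detections, max_distance=1.0, next_id=0):
    """Greedy nearest-neighbor association of detections with tracks.

    tracks: dict mapping track id -> list of (x, y) positions, newest last.
    detections: list of (x, y) foreground positions in the current frame.
    Returns the next unused track id.
    """
    unmatched = list(detections)
    for history in tracks.values():
        if not unmatched:
            break
        last = history[-1]
        nearest = min(unmatched, key=lambda p: math.dist(p, last))
        # Only extend the track if the nearest detection is plausibly close.
        if math.dist(nearest, last) <= max_distance:
            history.append(nearest)
            unmatched.remove(nearest)
    # Any detection left over starts a new track.
    for point in unmatched:
        tracks[next_id] = [point]
        next_id += 1
    return next_id

tracks = {}
next_id = update_tracks(tracks, [(0.0, 0.0), (5.0, 5.0)], next_id=0)
next_id = update_tracks(tracks, [(5.2, 5.1), (0.3, 0.1)], next_id=next_id)
```

Each track's list holds the past and present positions of one object. Greedy matching depends on track order and can swap identities when people cross paths, so a practical system would likely need a more robust association step.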


Data generated by the vision processing system 106, and optionally other types of inputs, are used to produce an output that is used to control the directed beam. Such control can be effected through orientation, motion or content information. In one embodiment, the output that is used to control the directed beam is generated by a computer application residing within the computer 16. The computer application can be implemented using software, hardware or a combination of both.


Data generated by the vision processing system include, but are not limited to, the foreground/background classification of the image of the interactive area 20, person tracking information, and 3D scene information about the interactive area 20. In addition, other types of inputs include sound, temperature, keyboard input, RF tags, communications with wireless devices etc.


An appropriate spatial translation system is used such that the information from the vision processing system and information about orientations of the directed beam can be mapped into a single model of physical space, allowing an application to align the physical location of its outputs with the physical location of its inputs. A variety of simple calibration schemes can be devised to allow this mapping to be created.


In one embodiment, the image produced by the camera 18 is considered the primary representation of the interactive area, and all other information is translated into the coordinates of this image. However, in other embodiments, the images produced by one or more cameras 18 are translated into a single model 114 of the physical space of the interactive area, which has a coordinate system based on the physical distances between locations in the interactive area. In this latter case, a calibration method may be employed to map each camera's image to the model of the interactive area. Several such methods may be used. For example, one such method involves placing reflective dots or other machine-recognizable patterns at known locations on the ground in the interactive area. These dots may be placed in a regular grid pattern in the interactive area. A human user then uses a computer input device to input the (x,y) coordinates in physical space of at least four dots visible in the camera's image. This procedure is repeated for each camera 18. The end result is a correspondence between points in each camera's image and points in physical space that allows transformation parameters to be computed in system 104, allowing the various camera images to be merged into a single image with a known correspondence to the physical space of the interactive area.
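As a hedged sketch of how transformation parameters might be computed from such dot correspondences, the following fits a perspective transform (homography) by least squares using the direct linear transform. It assumes a planar interactive area and at least four matched points that are not all collinear; the function names and sample coordinates are illustrative.

```python
import numpy as np

def fit_homography(image_pts, world_pts):
    """Least-squares 3x3 perspective transform from >= 4 correspondences."""
    image_pts = np.asarray(image_pts, dtype=float)
    world_pts = np.asarray(world_pts, dtype=float)
    if len(image_pts) < 4 or len(image_pts) != len(world_pts):
        raise ValueError("need at least four matched points")
    rows = []
    for (u, v), (x, y) in zip(image_pts, world_pts):
        # Two linear equations per correspondence (direct linear transform).
        rows.append([-u, -v, -1, 0, 0, 0, x * u, x * v, x])
        rows.append([0, 0, 0, -u, -v, -1, y * u, y * v, y])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    # Normalize; assumes the image origin does not map to infinity (h[2, 2] != 0).
    return h / h[2, 2]

def image_to_world(h, point):
    """Map an image pixel (u, v) to physical (x, y) coordinates."""
    u, v = point
    x, y, w = h @ np.array([u, v, 1.0])
    return x / w, y / w

# Hypothetical calibration: five dots seen in a 640x480 image of a 4 m x 3 m floor.
image_pts = [(0, 0), (640, 0), (640, 480), (0, 480), (320, 240)]
world_pts = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1.5)]
h = fit_homography(image_pts, world_pts)
x, y = image_to_world(h, (160, 120))
```

Repeating the fit for each camera yields one transform per camera, after which every camera image can be resampled into the shared coordinate space.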


The directed beams may be calibrated to this model 114 such that there is a known mapping between the settings of the directed beam and the beam's position within the interactive area. The directed beam parameters may be mapped to either coordinates of a camera's image or coordinates of the physical space of the interactive area. Several calibration methods may be employed. One such calibration scheme involves pointing each directed beam at a series of specific locations within the interactive area. For each location, a marker recognizable to the vision processing system such as a reflective dot is placed at the center of the directed beam. If the directed beam is a projector, additional information such as the position of the four corners of the screen may also be gathered with the aid of reflective dots. By doing this for a variety of locations throughout the interactive area, a reliable mapping can be devised. Using interpolation from nearby locations or curve fitting techniques as known in the art, the (x,y) coordinates of any location that was not observed during calibration can be translated into the pan and tilt settings that the directed beam would have at that location, along with any other information that may have been recorded, such as the positions of the corners of the screen for a projector. This ability to translate (x,y) coordinates in the interactive area into settings for a directed beam is what allows the system 100 to control the directed beams in the desired fashion.
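The interpolation step can be illustrated with inverse-distance weighting over the calibration samples. This is one assumed choice among the interpolation and curve fitting techniques mentioned above; the function name, the `power` exponent, and the sample values are hypothetical.

```python
import numpy as np

def interpolate_settings(samples, query, power=2.0):
    """Inverse-distance-weighted pan/tilt estimate at an (x, y) location.

    samples: list of ((x, y), (pan, tilt)) pairs recorded during calibration.
    """
    positions = np.array([p for p, _ in samples], dtype=float)
    settings = np.array([s for _, s in samples], dtype=float)
    dists = np.linalg.norm(positions - np.asarray(query, dtype=float), axis=1)
    exact = dists < 1e-9
    if exact.any():
        # The query coincides with a calibration point: reuse its settings.
        return tuple(settings[exact][0])
    # Closer calibration points get larger weights.
    weights = 1.0 / dists ** power
    return tuple(weights @ settings / weights.sum())

# Hypothetical calibration at the corners of a 1 m x 1 m patch of floor.
samples = [
    ((0.0, 0.0), (10.0, 20.0)),
    ((1.0, 0.0), (30.0, 20.0)),
    ((0.0, 1.0), (10.0, 40.0)),
    ((1.0, 1.0), (30.0, 40.0)),
]
pan, tilt = interpolate_settings(samples, (0.5, 0.5))
```

The same weighting could be applied to any other recorded quantities, such as the screen corner positions or focus settings, by appending them to each sample's settings tuple.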


The output of the processing software from the previous component is projected either visually or aurally using a system that can direct light or sound in a particular direction. The direction and content can change over time. Directed light display systems include, but are not limited to, simple spotlights, slide projectors, gobos, and video projectors. Directed audio systems include, but are not limited to, parabolic speakers and ultrasonic emitters that produce audible sound upon interaction with air.


The directed beam can be reoriented in several ways. In one implementation, the component generating the directed beam is on a mount with two or more degrees of freedom and is motorized and controlled by the computer 16. In another implementation, the path of the light beam is directed using a mirror or series of mirrors placed in front of the beam. This mirror or set of mirrors would also be motorized, with its orientation controllable in real time by the computer 16.


There are multiple possible physical configurations of the different components as described above. For example, the directed beam can be aimed either at a wall or the ground. In one implementation, the image detection system is positioned as close to the directed beam as possible to make the scene viewed by the image detection system the same scene as the one that the directed beam is projected onto. However, this need not be the case, especially if a 3D camera is used as the image detection system.


The following illustrates a number of examples or applications where the system 10 can be deployed. In one application, the system 10 can be used to produce a “follow spot” effect. The system 10 can direct a directed beam in the form of a spotlight to follow a person as s/he moves around within a defined area. Unlike a manually operated spotlight, this spotlight can automatically and accurately track the person with little or no human input at the control end.
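
A follow spot of this kind can be sketched as a small per-frame update. The sketch below assumes that the vision processing system reports a person's (x,y) position once per frame, or nothing when no person is found, and it applies exponential smoothing so that the spotlight does not jitter. The smoothing step, the class name `FollowSpot`, and the parameter `alpha` are assumptions of this sketch rather than features described in the disclosure.

```python
class FollowSpot:
    """Tracks a smoothed aim point for a spotlight that follows a person."""

    def __init__(self, alpha=0.5):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha      # 1.0 follows instantly; smaller values lag
        self.target = None      # current aim point in area coordinates

    def update(self, detected):
        """Advance one frame.

        detected: (x, y) of the person this frame, or None if not found.
        Returns the aim point, or None if no person has been seen yet.
        """
        if detected is None:
            return self.target  # hold the last aim point
        if self.target is None:
            self.target = (float(detected[0]), float(detected[1]))
        else:
            a = self.alpha
            self.target = (self.target[0] + a * (detected[0] - self.target[0]),
                           self.target[1] + a * (detected[1] - self.target[1]))
        return self.target


spot = FollowSpot(alpha=0.5)
first = spot.update((0, 0))    # the spot snaps to the first detection
second = spot.update((4, 2))   # then moves halfway toward each new one
held = spot.update(None)       # and holds position when tracking drops out
```

Each returned aim point would then be translated into pan and tilt settings using the calibration mapping before being sent to the motorized mount or mirror.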


By using a gobo or video projection system as a spotlight, the spotlight is able to change over time based on other input data, including, but not limited to, the position, shape, or movement of the person in the spotlight.


In an alternative application, the system 10 uses a directed beam in the form of a directed audio stream instead of a light beam. The system 10 can be used to deliver a specialized audio stream to a single person as s/he moves around within a defined area. This could be used to allow a person to hear music without disturbing others and without the encumbrance of headphones. It could also be used to deliver private instructions, information, advertisements, or warnings to one or more persons. This directed audio system could be augmented with a directed microphone pointed in the same direction, allowing two-way conversation. The conversant at the computer end could either be a human being or a voice recognition computer interface.


As mentioned above, in other applications, multiple directed beams can be deployed and such beams can take various forms including, for example, video, audio, and audio/video data.


In yet another application, the system 10 can be used to create an intelligent virtual entity whose image moves around in the physical world. This entity could be created with a projector including, for example, a slide projector or gobo and a video projector. In one implementation, a video projector is used allowing the entity's image to change over time in complex and myriad ways. The entity could take a variety of appearances, including, but not limited to, an abstract shape, a logo, a text message, or a static or animated virtual character.


In order for the virtual entity to move in a realistic way, it would be preferable for the virtual entity's image to avoid moving onto other objects, such as people or tables. The movement of the virtual entity's image can be controlled by the vision processing system as described above. For example, by classifying the image of the interactive area 20 to produce a background/foreground distinction, the system 10 can restrict the virtual entity to only move within background areas.
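
The background restriction can be sketched as a check applied to each proposed move. The sketch assumes that the vision processing system supplies a per-pixel foreground mask of the interactive area and that the virtual entity occupies a square footprint; the function names `footprint_is_clear` and `step_entity` and the parameter `half` are hypothetical.

```python
def footprint_is_clear(foreground, cx, cy, half):
    """Return True if the square footprint centred on (cx, cy) lies inside
    the image and covers only background pixels.

    foreground: list of rows of booleans, True where a pixel is foreground.
    half: half-width of the footprint in pixels.
    """
    rows, cols = len(foreground), len(foreground[0])
    if cx - half < 0 or cy - half < 0 or cx + half >= cols or cy + half >= rows:
        return False
    return not any(foreground[r][c]
                   for r in range(cy - half, cy + half + 1)
                   for c in range(cx - half, cx + half + 1))


def step_entity(foreground, pos, step, half=1):
    """Move the entity by step if the destination is clear; otherwise stay."""
    nx, ny = pos[0] + step[0], pos[1] + step[1]
    return (nx, ny) if footprint_is_clear(foreground, nx, ny, half) else pos


# A 5-row by 7-column mask with a foreground object (e.g. a table leg)
# occupying column 5.
mask = [[False] * 7 for _ in range(5)]
for row in mask:
    row[5] = True

moved = step_entity(mask, (1, 2), (1, 0))     # destination is clear
blocked = step_entity(mask, (3, 2), (1, 0))   # footprint would touch column 5
```

A fuller controller might try alternative steps when a move is blocked instead of leaving the entity in place; that choice is left open here.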


Furthermore, the virtual entity can interact with the person(s) in the interactive area 20 in a variety of ways. For example, the virtual entity could follow people around; the virtual entity could try to stand in front of people; the virtual entity could lead people to a specific location; the virtual entity could be pushed, pulled, or otherwise moved by person(s) approaching or touching its image.


In addition, the virtual entity could contain active regions that generate a reaction when a person moves a part of his/her body into one or more of those regions. The virtual entity could change appearance or behavior when a person interacts with its image. The specific portion of a virtual entity's image that the person is interacting with could be determined by the vision processing system and used to modify the virtual entity's appearance or behavior with further specificity. See, for example, U.S. patent application Ser. No. 10/160,217, entitled “INTERACTIVE VIDEO DISPLAY SYSTEM” filed on May 28, 2002, for an illustrative description of how a display can be altered based on a person's position and shape.
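
Active regions of this kind can be sketched as a hit test between foreground pixels and rectangles attached to the entity. The sketch assumes that each active region is an axis-aligned rectangle given relative to the entity's position and that the vision processing system supplies the coordinates of foreground pixels; the function name `triggered_regions` and the region names are hypothetical.

```python
def triggered_regions(regions, entity_pos, foreground_pixels):
    """Return the names of the active regions that contain a foreground pixel.

    regions: dict mapping a region name to (x0, y0, x1, y1), an inclusive
             rectangle expressed relative to the entity's position.
    entity_pos: (x, y) of the entity in image coordinates.
    foreground_pixels: iterable of (x, y) pixels classified as foreground.
    """
    ex, ey = entity_pos
    pixels = list(foreground_pixels)
    hits = []
    for name, (x0, y0, x1, y1) in regions.items():
        if any(ex + x0 <= px <= ex + x1 and ey + y0 <= py <= ey + y1
               for px, py in pixels):
            hits.append(name)
    return hits


# A virtual character at (10, 10) with two active regions.
regions = {"head": (-1, -3, 1, -2), "tail": (2, 0, 4, 1)}
hits = triggered_regions(regions, (10, 10), [(10, 7), (20, 20)])
```

The returned names could then select the reaction, for example a different animation when the head is touched than when the tail is touched.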


Optionally, the virtual entity can be augmented with directed speakers to allow it to broadcast sound in the immediate area around its image, as well as a directed microphone to allow two-way conversation between people near the image. The virtual entity's ability to speak and understand speech can be provided by either a human being or a voice recognition computer interface.


Uses for the “intelligent virtual entities” configurations are numerous. For example, whether an object or a character, these entities could provide entertainment value through their range of behaviors and appearances, and their ability to interact with people in a variety of ways. The virtual entities could take the form of virtual playmates that educate or entertain children or adults. The virtual entities could also be used to add atmospheric and/or aesthetic value to a particular location or environment. For example, a virtual scorpion could crawl around a desert-themed restaurant, adding to the ambiance. As another example, a virtual character could act as a greeter for a store, welcoming people as they come in. The virtual entities can also be used as advertising tools, promoting particular brands or products in an engaging and interactive way. The virtual entities could also take the form of informational tools, assistants who can move within the world and provide information to people. This informational exchange could take the form of a visual dialogue, in which the virtual entity presents a person with a visual menu of selectable options. In addition, the virtual entities could be used to direct people. For example, a virtual entity could keep people from entering a forbidden area by appearing in front of a person who is about to enter the area and giving the person a visual warning, an audio warning, or both. The virtual entities could be used in a variety of augmented reality games, for example as items, balls, players, or other things important to a game. This list of uses is not meant to be exhaustive, and the examples provided herein are for illustrative purposes only.


Accordingly, embodiments of the present invention are able to detect actions or interactions by an object when the projected image, the projection surface, or lighting conditions are dynamic. Also, system 100 is able to detect actions or the motion of objects, such as people, outside of the projected image. This is because the vision system is independent of the directed beam's image. The vision system can also capture the full outline of objects in the interactive area at all times.


Based on the disclosure and teachings provided herein, a person of ordinary skill in the art will appreciate other ways and/or methods to deploy embodiments in accordance with the present invention.


In an exemplary implementation, the present invention is implemented using software in the form of control logic, in either an integrated or a modular manner. Alternatively, hardware or a combination of software and hardware can also be used to implement the present invention. Based on the disclosure and teachings provided herein, a person of ordinary skill in the art will know of other ways and/or methods to implement the present invention.


It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference for all purposes in their entirety.

Claims
  • 1. An interactive audiovisual system comprising: an image detection computing system configured to determine a position of at least a portion of a person in an interactive area based on one or more images of the interactive area received from an imaging device, wherein the position is determined without the need for a tracking device associated with at least the portion of the person; an interactive media module configured to use the determined position to generate a video control signal and an audio control signal; a video beam controller configured to receive the video control signal and to adjust a position of a video image projected from a video projection device based on at least the video control signal; and an audio beam controller configured to receive the audio control signal and to adjust a direction of an audio signal emitted from an audio device based on at least the audio control signal.
  • 2. The interactive audiovisual system of claim 1, wherein the audio signal is directed towards the video image to create an illusion that the video image is making noises or speaking.
  • 3. The interactive audiovisual system of claim 2, wherein the video image comprises a representation of a person, an animated character, or an avatar.
  • 4. The interactive audiovisual system of claim 1, wherein the audio signal is an audio beam.
  • 5. The interactive audiovisual system of claim 1, wherein the images of the interactive area comprise indications of infrared light reflected by at least the portion of the person in the interactive area.
  • 6. The interactive audiovisual system of claim 5, wherein the video image comprises a substantially visible light projection.
  • 7. An interactive video system comprising: an image detection computing system configured to determine a position of an object in an interactive area based on one or more images of the object, wherein the position is determined without the need for a tracking device associated with the object; an interactive media module configured to generate a video control signal based on the determined position of the object; and a video device configured to receive the video control signal and to adjust a position of a video image projected from the video device based on at least the video control signal.
  • 8. The interactive video system of claim 7, wherein the image detection computing system includes one or more cameras.
  • 9. The interactive video system of claim 7, wherein the object is at least a portion of a person.
  • 10. The interactive video system of claim 7 wherein the position of the video image is adjusted to correspond to the determined position of the object.
  • 11. The interactive video system of claim 7, wherein the video image comprises one or more of a spotlight, a slide projection, a gobo, an abstract shape, a logo, a text message, or a virtual character.
  • 12. The interactive video system of claim 9, wherein the video image comprises a spotlight and wherein the video device adjusts the position of the spotlight to follow at least the portion of the person as the person moves within the interactive area.
  • 13. The interactive video system of claim 7, wherein the image detection system is further configured to generate a model of the interactive area that indicates positions of one or more predetermined points of the video image in the interactive area, and the position of the object is determined based at least in part on the position of the object in reference to one or more of the predetermined points.
  • 14. The interactive video system of claim 13, wherein the predetermined points include at least one corner of the video image in the interactive area.
  • 15. A method of providing interactive video effects comprising: receiving one or more images of an object within an interactive area; determining a position of the object in the interactive area based on the one or more images, wherein the determining is performed without receiving information from a tracking device associated with the object; generating a video control signal that indicates an updated position of a video image in the interactive area, wherein the updated position is determined based on at least the position of the object; and adjusting a position of a video beam controller that transmits the video image according to the video control signal, wherein at least some of the method is performed by a suitably configured computing system having one or more processors.
  • 16. The method of claim 15, wherein the updated position of the video image substantially corresponds to the position of the object.
  • 17. The method of claim 15, further comprising: generating an audio control signal that indicates an updated target of a directed audio signal, wherein the updated target is adjusted based on at least the position of the object; and adjusting a position of an audio beam controller in accordance with the audio control signal.
  • 18. The method of claim 17, wherein the updated target of the directed audio signal is substantially a position of a virtual object.
  • 19. A tangible computer-readable medium having stored thereon computer-executable instructions that, in response to execution by one or more computing devices, cause the one or more computing devices to perform operations comprising: determining a position of a user in an interactive area based on one or more images of the interactive area received from one or more cameras; determining an updated direction of a projected video image and an updated direction of a projected audio beam based on at least the position of the user; initiating adjustment of a direction of a video beam controller in accordance with the updated direction of the projected video image; and initiating adjustment of an audio beam controller in accordance with the updated direction of the projected audio beam.
  • 20. The tangible computer-readable medium of claim 19, wherein determining the position of the user comprises determining portions of the one or more images that are foreground and determining portions of the one or more images that are background.
  • 21. The tangible computer-readable medium of claim 19, wherein determining the updated direction of the projected video image and the updated direction of the projected audio beam comprises comparing the position of the user to a position of the video image in the interactive area.
RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 10/737,730, entitled “INTERACTIVE DIRECTED LIGHT/SOUND SYSTEM” filed on Dec. 15, 2003, which is a non-provisional of and claims priority under 35 U.S.C. §119(e) to U.S. Provisional Application No. 60/433,158, entitled “INTERACTIVE DIRECTED LIGHT/SOUND SYSTEM” filed on Dec. 13, 2002, the disclosures of which are hereby incorporated by reference in their entireties for all purposes. The present application is also related to U.S. patent application Ser. No. 10/160,217, entitled “INTERACTIVE VIDEO DISPLAY SYSTEM” filed on May 28, 2002; and U.S. patent application Ser. No. 60/504,375, entitled “SELF-CONTAINED INTERACTIVE VIDEO DISPLAY SYSTEM” filed on Sep. 18, 2003, the disclosures of which are hereby incorporated by reference in their entireties for all purposes.

Related Publications (1)
Number Date Country
20100026624 A1 Feb 2010 US
Provisional Applications (1)
Number Date Country
60433158 Dec 2002 US
Continuations (1)
Number Date Country
Parent 10737730 Dec 2003 US
Child 12542633 US