Mode switching for integrated gestural interaction and multi-user collaboration in immersive virtual reality environments

Information

  • Patent Grant
  • 12299207
  • Patent Number
    12,299,207
  • Date Filed
    Thursday, September 28, 2023
  • Date Issued
    Tuesday, May 13, 2025
Abstract
The technology disclosed relates to user interfaces for controlling augmented reality (AR) or virtual reality (VR) environments. Real and virtual objects can be seamlessly integrated to form an augmented reality by tracking motion of one or more real objects within view of a wearable sensor system. A user's desire to switch the AR/VR presentation on or off in order to interact with the surrounding real world, for example to drink some soda, can be addressed with a convenient mode switching gesture associated with switching between operational modes in a VR/AR enabled device.
Description
FIELD OF THE TECHNOLOGY DISCLOSED

The present disclosure relates generally to human machine interface and in particular to augmented reality for wearable devices and methods for object detection and tracking.


INCORPORATIONS

Materials incorporated by reference in this filing include the following:


“DETERMINING POSITIONAL INFORMATION FOR AN OBJECT IN SPACE”, U.S. Non-Prov. application Ser. No. 14/214,605, filed 14 Mar. 2014,


“RESOURCE-RESPONSIVE MOTION CAPTURE”, U.S. Non-Prov. application Ser. No. 14/214,569, filed on 14 Mar. 2014,


“PREDICTIVE INFORMATION FOR FREE-SPACE GESTURE CONTROL AND COMMUNICATION”, U.S. Prov. App. No. 61/873,758, filed on 4 Sep. 2013,


“VELOCITY FIELD INTERACTION FOR FREE SPACE GESTURE INTERFACE AND CONTROL”, U.S. Prov. App. No. 61/891,880, filed on 16 Oct. 2013,


“INTERACTIVE TRAINING RECOGNITION OF FREE SPACE GESTURES FOR INTERFACE AND CONTROL”, U.S. Prov. App. No. 61/872,538, filed on 30 Aug. 2013,


“DRIFT CANCELATION FOR PORTABLE OBJECT DETECTION AND TRACKING”, U.S. Prov. App. No. 61/938,635, filed on 11 Feb. 2014,


“IMPROVED SAFETY FOR WEARABLE VIRTUAL REALITY DEVICES VIA OBJECT DETECTION AND TRACKING”, U.S. Prov. App. No. 61/981,162, filed on 17 Apr. 2014,


“WEARABLE AUGMENTED REALITY DEVICES WITH OBJECT DETECTION AND TRACKING”, U.S. Prov. App. No. 62/001,044, filed on 20 May 2014,


“METHODS AND SYSTEMS FOR IDENTIFYING POSITION AND SHAPE OF OBJECTS IN THREE-DIMENSIONAL SPACE”, U.S. Prov. App. No. 61/587,554, filed 17 Jan. 2012,


“SYSTEMS AND METHODS FOR CAPTURING MOTION IN THREE-DIMENSIONAL SPACE”, U.S. Prov. App. No. 61/724,091, filed 8 Nov. 2012,


“NON-TACTILE INTERFACE SYSTEMS AND METHODS”, U.S. Prov. App. No. 61/816,487, filed 26 Apr. 2013,


“DYNAMIC USER INTERACTIONS FOR DISPLAY CONTROL”, U.S. Prov. App. No. 61/752,725, filed on 15 Jan. 2013,


“VEHICLE MOTION SENSORY CONTROL”, U.S. Prov. App. No. 62/005,981, filed 30 May 2014,


“SYSTEMS AND METHODS OF PROVIDING HAPTIC-LIKE FEEDBACK IN THREE-DIMENSIONAL (3D) SENSORY SPACE”, U.S. Prov. App. No. 61/937,410, filed 7 Feb. 2014,


“SYSTEMS AND METHODS OF INTERACTING WITH A VIRTUAL GRID IN A THREE-DIMENSIONAL (3D) SENSORY SPACE”, U.S. Prov. App. No. 62/007,885, filed 4 Jun. 2014,


“SYSTEMS AND METHODS OF GESTURAL INTERACTION IN A PERVASIVE COMPUTING ENVIRONMENT”, U.S. Prov. App. No. 62/003,298, filed 27 May 2014,


“MOTION CAPTURE USING CROSS-SECTIONS OF AN OBJECT”, U.S. application Ser. No. 13/414,485, filed on 7 Mar. 2012,


“USER INTERFACE FOR INTEGRATED GESTURAL INTERACTION AND MULTI-USER COLLABORATION IN IMMERSIVE VIRTUAL REALITY ENVIRONMENTS”, U.S. Prov. App. No. 62/094,015, filed 18 Dec. 2014, and


“SYSTEM AND METHODS FOR CAPTURING MOTION IN THREE-DIMENSIONAL SPACE”, U.S. application Ser. No. 13/742,953, filed 16 Jan. 2013.


BACKGROUND

The subject matter discussed in this section should not be assumed to be prior art merely as a result of its mention in this section. Similarly, a problem mentioned in this section or associated with the subject matter provided as background should not be assumed to have been previously recognized in the prior art. The subject matter in this section merely represents different approaches, which in and of themselves may also correspond to implementations of the claimed technology.


Conventional motion capture approaches rely on markers or sensors worn by the subject while executing activities and/or on the strategic placement of numerous bulky and/or complex equipment in specialized and rigid environments to capture subject movements. Unfortunately, such systems tend to be expensive to construct. In addition, markers or sensors worn by the subject can be cumbersome and interfere with the subject's natural movement. Further, systems involving large numbers of cameras tend not to operate in real time, due to the volume of data that needs to be analyzed and correlated. Such considerations have limited the deployment and use of motion capture technology.


Consequently, there is a need for providing the ability to view and/or interact with the real world when using virtual reality capable devices (e.g., wearable or otherwise having greater portability) by capturing the motion of objects in real time without fixed or difficult to configure sensors or markers.


INTRODUCTION

The technology disclosed relates to user interfaces for controlling augmented reality environments. In devices for viewing augmented reality (AR) or virtual reality (VR), real and virtual objects can be seamlessly integrated to form an augmented reality by tracking motion of one or more real objects within view of a wearable sensor system using one or more cameras. For example, user interface components (e.g., menus, icons, widgets, controls and the like) integrated within a virtual or augmented reality environment can be displayed arranged over the user's arm and/or hand. The sensor system is capable of recognizing the user's hand(s) and/or arm(s) (or other body parts); integrating computer generated menus, controls or other objects thereon; and displaying the real world view augmented by the integrated computer generated items to the user. Such displays enable the user to enjoy convenient access to controls for applications used in conjunction with the AR system presented in a “believable” manner (e.g., on the hand or arm).


Users' desire to switch the AR/VR presentation on or off to interact with the real world surrounding them, for example to drink some soda, can be addressed with a convenient mode switching gesture associated with switching between operational modes in a VR/AR enabled device. Example operational modes can include a mode in which a video stream provided by an application program implementing a virtual reality (VR) or augmented reality (AR) is displayed to the user via the VR/AR enabled device and a mode in which a live video stream of the real world surrounding the user captured with cameras is displayed to the user in a pass-through mode. The live video stream can be captured by cameras integrated into a wearable sensor system that can be affixed to a wearable VR/AR enabled device in one implementation. In one implementation, the wearable sensor system camera captures IR images for gesture detection and object virtualization into a VR presentation in VR mode as well as for conversion to visible images for display to the user in a pass-through mode. Other implementations include capturing input information about gestures using a set of IR (infrared) pixels while capturing live video information using a set of RGB (red, green, and blue) pixels of a wearable sensor system. This approach is compatible with a wide variety of AR/VR technologies including systems that capture image data along different portions of the electromagnetic spectrum, including the visible, near-IR, and IR bands; thereby compensating for deficiencies in lighting, contrast, and resolution in different environmental conditions.


In an implementation, a method of switching modes in a virtual reality device including motion tracking by a wearable sensor system is provided, that includes detecting a gesture associated with switching modes by capturing a video stream of a scene in a real world space proximate to a virtual reality device using at least one camera electronically coupled to a wearable sensor system. Selecting a display mode chosen from a first video stream including live video of the scene in real world space and a second video stream including an output of a virtual reality application is also part of the method. The method further includes switching display modes by displaying the video stream selected.


Implementation specifics vary, and different paradigms can be employed in selecting a display mode. For example, in one implementation, selecting includes toggling between the first video stream and the second video stream. In one example implementation, the display mode can be switched back when a second gesture associated with switching modes is detected. In another example, the display mode can be switched back when a reverse gesture in a direction opposite to a direction of the gesture associated with switching modes is detected. In a further example, display modes can be selected by gesture, e.g., a first video stream is selected whenever a first gesture is detected and a second video stream is selected whenever a second gesture is detected.
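
By way of illustration only, the toggling and select-by-gesture paradigms can be sketched as a small state holder. The class, enum, and gesture names below are hypothetical and do not appear in the disclosure; the sketch assumes a gesture recognizer has already reduced the captured video stream to a gesture label.

```python
from enum import Enum


class DisplayMode(Enum):
    LIVE_VIDEO = "first video stream (live pass-through)"
    VR_APP = "second video stream (VR application output)"


class ModeSwitcher:
    """Holds the current display mode (hypothetical helper, not from the source)."""

    def __init__(self, initial=DisplayMode.VR_APP):
        self.mode = initial

    def toggle(self):
        # Toggle paradigm: each mode-switch gesture flips between the two streams.
        if self.mode is DisplayMode.VR_APP:
            self.mode = DisplayMode.LIVE_VIDEO
        else:
            self.mode = DisplayMode.VR_APP
        return self.mode

    def select(self, gesture, bindings):
        # Select-by-gesture paradigm: each bound gesture always picks one mode.
        # Gestures with no binding leave the current mode unchanged.
        self.mode = bindings.get(gesture, self.mode)
        return self.mode
```

In the toggle paradigm, repeating the same gesture switches the display back; in the select paradigm, the mapping passed as `bindings` decides which gesture picks which stream.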


In various implementations, gestures employed for mode switching can include one or more of a swipe from top to bottom, a swipe from bottom to top, a swipe from left to right, a swipe from right to left, and/or a spiral. The swiping (or other) motion can be made in a zone proximate to the virtual reality device. For example, mode changing gestures can be made very close to the device (within 12 inches), while other (non-mode changing) gestures can be made at distances greater than 12 inches.
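
As a non-limiting sketch, the distinction between mode changing gestures and other gestures can be expressed as a distance test combined with a swipe direction test. The function names are hypothetical, the coordinate convention (x grows rightward, y grows downward) is an assumption, and spiral detection is omitted; only the 12 inch figure comes from the example above.

```python
MODE_ZONE_INCHES = 12.0  # zone size taken from the example in the text


def classify_swipe(start, end):
    """Return the dominant direction of a 2D swipe, or None for no movement.

    Assumes image-style coordinates: x grows rightward, y grows downward.
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    if dx == 0 and dy == 0:
        return None
    if abs(dx) >= abs(dy):
        return "left_to_right" if dx > 0 else "right_to_left"
    return "top_to_bottom" if dy > 0 else "bottom_to_top"


def interpret(start, end, distance_inches):
    """Label a swipe as a mode-switch candidate or as an ordinary gesture."""
    direction = classify_swipe(start, end)
    if direction is None:
        return ("none", None)
    kind = "mode_switch" if distance_inches <= MODE_ZONE_INCHES else "other"
    return (kind, direction)
```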


In one implementation, selecting a display mode can include switching between the video stream including image information gathered using a set of RGB pixels and a video stream including image information gathered using a set of IR pixels. In one implementation, the mode can be switched back when a second gesture associated with switching modes is detected. The second gesture can be the same gesture repeated, or a reverse gesture in the opposite direction.


In another implementation, a method of providing an augmented reality environment is provided, that includes capturing motion of a user body portion in the field of view of one or more cameras. Detecting a gesture indicating a command input to change a mode of operation from a virtual reality mode to a live video display mode or from a live video display mode to a virtual reality mode is also part of the method. Further, the method can also include changing a mode of operation of a wearable display device according to the command input. In some implementations the detecting can include sensing a swipe of a user hand in close proximity to the one or more cameras. In some implementations, the changing can include toggling the wearable display device between the virtual reality mode and the live video display mode. In some implementations, the method can further include returning the wearable display device to a previous mode upon detecting a second gesture.


In a yet further implementation, a system is provided that includes a sensory system including one or more optical sensors; a processing system to determine from information provided by the sensory system a position and/or motion of one or more objects of interest; and a wearable rendering subsystem to display a live video presentation of a scene or rendered representations of the one or more objects of interest in a virtual reality selectable by a detected gesture command input.


In one implementation, the technology disclosed relates to systems including motion sensors having one or more cameras employing a combination of RGB (red, green, and blue) and IR (infrared) pixels. One implementation includes capturing gross or coarse features and corresponding feature values of a real world space using RGB pixels and capturing fine or precise features and corresponding feature values of the real world space using IR pixels. Once captured, motion information of the wearable sensor system with respect to at least one feature of the scene is determined based on comparison between feature values detected at different time instances. For instance, if a feature of a real world space is an object at a given position in the real world space, then the feature value can be the three-dimensional (3D) co-ordinates of the position of the object in the real world space. If, between pairs of image frames or other image volumes, the value of the position co-ordinates changes, then this can be used to determine motion information of the wearable sensor system with respect to the object whose position changed between image frames.
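
The comparison of feature values at different time instances can be illustrated with a minimal sketch in which the feature value is a 3D position expressed in the sensor's own frame. The function names are hypothetical, and the second helper rests on the stated assumption that the observed object is stationary, so that any apparent change is attributed to the sensor.

```python
def relative_motion(p0, p1, t0, t1):
    """Displacement and mean velocity of a feature between two observations.

    p0 and p1 are (x, y, z) feature values observed at times t0 and t1.
    """
    if t1 <= t0:
        raise ValueError("t1 must be later than t0")
    displacement = tuple(b - a for a, b in zip(p0, p1))
    dt = t1 - t0
    velocity = tuple(d / dt for d in displacement)
    return displacement, velocity


def sensor_displacement_if_static(p0, p1):
    """Sensor displacement, assuming the feature itself did not move.

    A static feature appears to shift opposite to the sensor's own motion
    (translation only; rotation is ignored in this sketch).
    """
    return tuple(a - b for a, b in zip(p0, p1))
```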


In another example, a feature of a real world space is a wall in the real world space and the corresponding feature value is orientation of the wall as perceived by a viewer engaged with a wearable sensor system. In this example, if a change in the orientation of the wall is registered between successive image frames captured by a camera electronically coupled to the wearable sensor system, then this can indicate a change in the position of the wearable sensor system that views the wall.


According to one implementation, RGB pixels of a camera embedded in a wearable sensor system are used to identify an object in the real world space along with prominent or gross features of the object from an image or sequence of images such as object contour, shape, volumetric model, skeletal model, silhouettes, overall arrangement and/or structure of objects in a real world space. This can be achieved by measuring an average pixel intensity of a region or varying textures of regions, as described later in this application. Thus, RGB pixels allow for acquisition of a coarse estimate of the real world space and/or objects in the real world space.
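
One possible reading of measuring an average pixel intensity of a region is sketched below. The image layout (a list of rows of grayscale intensities), the tiling scheme, and the threshold are all assumptions made for illustration rather than details of the disclosed implementation.

```python
def region_mean_intensity(gray, top, left, height, width):
    """Mean intensity of a rectangular region of a grayscale image.

    `gray` is a list of rows of numeric intensities (hypothetical layout).
    """
    rows = gray[top:top + height]
    values = [v for row in rows for v in row[left:left + width]]
    if not values:
        raise ValueError("region is empty or outside the image")
    return sum(values) / len(values)


def coarse_object_mask(gray, block, threshold):
    """Mark each block x block tile whose mean intensity exceeds threshold."""
    mask = []
    for top in range(0, len(gray), block):
        mask.append([
            region_mean_intensity(gray, top, left, block, block) > threshold
            for left in range(0, len(gray[0]), block)
        ])
    return mask
```

The resulting mask is a coarse estimate only: it marks where bright regions lie, not the object contour itself.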


Further, data from the IR pixels can be used to capture fine or precise features of the real world space, which enhance the data extracted from RGB pixels. Examples of fine features include surface textures, edges, curvatures, and other faint features of the real world space and objects in the real world space. In one example, while RGB pixels capture a solid model of a hand, IR pixels are used to capture the vein and/or artery patterns or fingerprints of the hand.


Some other implementations can include capturing image data by using the RGB and IR pixels in different combinations and permutations. For example, one implementation can include simultaneously activating the RGB and IR pixels to perform a whole scale acquisition of image data, without distinguishing between coarse or detail features. Another implementation can include using the RGB and IR pixels intermittently. Yet another implementation can include activating the RGB and IR pixels according to a quadratic or Gaussian function. Some other implementations can include performing a first scan using the IR pixels followed by an RGB scan, and vice-versa.


The technology disclosed also relates to enabling multi-user collaboration and interaction in an immersive virtual environment. In particular, it relates to capturing different sceneries of a shared real world space from the perspective of multiple users. In one implementation, this is achieved by capturing video streams of the real world space using cameras embedded in wearable sensor systems engaged by the multiple users. Also, three-dimensional maps of the real world space are determined by extracting one or more feature values of the real world space from image frames captured using a combination of RGB and IR pixels of the respective cameras. Further, position, orientation, and/or velocity of the different users and/or their body portions are determined by calculating the motion information of their wearable sensor systems with respect to each other. This is achieved by comparing the respective three-dimensional maps of the real world space generated from the perspective of different users, according to one implementation.
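
A simplified sketch of comparing the respective three-dimensional maps follows. It assumes each map is a dictionary from a shared feature identifier to that feature's position in the user's own sensor frame, and that the two frames have the same axis orientation so that only a translation separates them. Both assumptions, and all names, are illustrative and are not drawn from the disclosure.

```python
def relative_user_offset(map_a, map_b):
    """Estimate user B's position relative to user A from shared features.

    Each map is {feature_id: (x, y, z)} in that user's own sensor frame.
    With aligned axes, a feature at world point F appears at F - A for
    user A and at F - B for user B, so (F - A) - (F - B) = B - A.
    """
    shared = sorted(map_a.keys() & map_b.keys())
    if not shared:
        raise ValueError("maps have no features in common")
    sums = [0.0, 0.0, 0.0]
    for fid in shared:
        for i in range(3):
            sums[i] += map_a[fid][i] - map_b[fid][i]
    # Averaging over shared features damps per-feature measurement noise.
    return tuple(s / len(shared) for s in sums)
```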


The technology disclosed further relates to sharing content between wearable sensor systems. In particular, it relates to capturing images and video streams from the perspective of a first user of a wearable sensor system and sending an augmented version of the captured images and video stream to a second user of the wearable sensor system. The augmented version can include corresponding content, with the same capture frame as the original version, but captured from a wider or more encompassing field of view than the original version. The augmented version can be further used to provide a panoramic experience to the second user of the first user's limited view.


In one implementation, the captured content is pre-processed before it is transmitted to a second user. Pre-processing includes enhancing the resolution or contrast of the content or augmenting it with additional graphics, annotations, or comments, according to one implementation. In other implementations, pre-processing includes reducing the resolution of the captured content before transmission.
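
Two of the pre-processing options mentioned above, reducing resolution and enhancing contrast, can be sketched as follows. The frame layout (rows of integer intensities), the decimation approach, and the linear contrast stretch are assumptions chosen for brevity; the disclosure does not specify particular algorithms.

```python
def reduce_resolution(frame, factor):
    """Keep every `factor`-th pixel in each direction (simple decimation)."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [row[::factor] for row in frame[::factor]]


def stretch_contrast(frame, out_max=255):
    """Linearly rescale integer intensities to span 0..out_max."""
    lo = min(min(row) for row in frame)
    hi = max(max(row) for row in frame)
    if hi == lo:
        # A flat frame has no contrast to stretch; return a copy unchanged.
        return [list(row) for row in frame]
    return [[(v - lo) * out_max // (hi - lo) for v in row] for row in frame]
```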


In one implementation, a wearable sensor system includes capabilities to autonomously create a three-dimensional (3D) map of an environment surrounding a user of a virtual reality device. The map can be advantageously employed to determine motion information of the wearable sensor system and/or another user in the environment. One method includes capturing a plurality of images. A flow can be determined from features identified in captured images. (For example, features in the images corresponding to objects in the real world can be detected. The features of the objects are correlated across multiple images to determine change, which can be represented as a flow.) Based at least in part upon that flow, a map of the environment can be created. The method also includes localizing a user in the environment using the map. Advantageously, processing time can be reduced when a user enters a previously visited portion of the environment, since the device need only scan for new or changed conditions (e.g., that might present hazards, opportunities or points of interest). In one implementation, once a map of the environment has been built, the map can be presented to a virtualizing (VR) system and the virtualizing system can use the map as constraint(s) upon which to construct its world. Accordingly, by employing such techniques, a VR system can enable collaboration between different users participating in collaborative experiences such as multi-user games and other shared space activities.
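
The flow and map update steps can be sketched under the assumption that features have already been detected and matched across images, so that each feature carries a stable identifier. The dictionary representation, the tolerance value, and the function names are hypothetical.

```python
def compute_flow(prev_features, curr_features):
    """Per-feature displacement for features visible in both images."""
    common = prev_features.keys() & curr_features.keys()
    return {
        fid: tuple(c - p for p, c in zip(prev_features[fid], curr_features[fid]))
        for fid in common
    }


def update_map(env_map, observed, tolerance=0.05):
    """Merge observations into the map in place.

    Returns the sorted ids of features that are new or that moved by more
    than `tolerance`, i.e. the only ones needing further attention when a
    previously visited portion of the environment is re-entered.
    """
    changed = []
    for fid, pos in observed.items():
        old = env_map.get(fid)
        if old is None or max(abs(a - b) for a, b in zip(old, pos)) > tolerance:
            env_map[fid] = pos
            changed.append(fid)
    return sorted(changed)
```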


Implementations of the technology disclosed include methods and systems that enable a user of a wearable (or portable) virtual reality capable device, using a sensor configured to capture motion and/or determining the path of an object based on imaging, acoustic or vibrational waves, to view and/or intuitively interact with the real world. Implementations can enable improved user experience, greater safety, greater functionality to users of virtual reality for machine control and/or machine communications applications using wearable (or portable) devices, e.g., head mounted devices (HMDs), wearable goggles, watch computers, smartphones, and so forth, or mobile devices, e.g., autonomous and semi-autonomous robots, factory floor material handling systems, autonomous mass-transit vehicles, automobiles (human or machine driven), and so forth, equipped with suitable sensors and processors employing optical, audio or vibrational detection.


In one implementation, a wearable sensor system includes capabilities to provide presentation output to a user of a virtual reality device. For example, a video stream including a sequence of images of a scene in the real world is captured using one or more cameras on a head mounted device (HMD) having a set of RGB pixels and a set of IR pixels. Information from the IR sensitive pixels is separated out for processing to recognize gestures. Information from the RGB sensitive pixels is provided to a presentation interface of the wearable device as a live video feed to a presentation output. The presentation output is displayed to a user of the wearable sensor system. One or more virtual objects can be integrated with the video stream images to form the presentation output. Accordingly, the device is enabled to provide any of gesture recognition, a real world presentation of real world objects via pass through video feed, and/or an augmented reality including virtual objects integrated with a real world view. In one implementation, a wearable sensor system includes capabilities to provide presentation output to a user. For example, in one implementation, the device captures a video stream including a sequence of images of a scene in the real world. The video stream images are integrated with virtual object(s) to form a presentation output. The presentation output is displayed to a user of the wearable sensor system. For example, video can be captured with one or more cameras on a head mounted device (HMD) having a set of RGB pixels and a set of IR pixels.


In one implementation, the ambient lighting conditions are determined and can be used to adjust display of output. For example, information from the set of RGB pixels is displayed in normal lighting conditions and information from the set of IR pixels in dark lighting conditions. Alternatively, or additionally, information from the set of IR pixels can be used to enhance the information from the set of RGB pixels for low-light conditions, or vice versa. Some implementations can receive from a user a selection indicating a preferred display chosen from one of color imagery from the RGB pixels and IR imagery from the IR pixels, or combinations thereof. Alternatively, or additionally, the device itself may dynamically switch between video information captured using RGB sensitive pixels and video information captured using IR sensitive pixels for display depending upon ambient conditions, user preferences, situational awareness, other factors, or combinations thereof.
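
A minimal sketch of selecting the display source from ambient conditions and user preference is given below. Estimating ambient light from the mean brightness of the RGB frame, the threshold value, and the function names are assumptions made for illustration; a device could equally use a dedicated ambient light sensor.

```python
def mean_luma(rgb_frame):
    """Average brightness of an RGB frame (rows of (r, g, b) tuples, 0-255)."""
    pixels = [p for row in rgb_frame for p in row]
    if not pixels:
        raise ValueError("frame is empty")
    # Rec. 601 luma weights for red, green, and blue.
    return sum(0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels) / len(pixels)


def choose_display_source(rgb_frame, user_preference=None, dark_threshold=40.0):
    """Pick "rgb" or "ir" imagery for display.

    An explicit user preference wins; otherwise IR imagery is chosen when
    the scene is darker than the (hypothetical) threshold.
    """
    if user_preference in ("rgb", "ir"):
        return user_preference
    return "ir" if mean_luma(rgb_frame) < dark_threshold else "rgb"
```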


In one implementation, information from the IR sensitive pixels is separated out for processing to recognize gestures, while the information from the RGB sensitive pixels is provided to an output as a live video feed, thereby conserving bandwidth for the gesture recognition processing. In gesture processing, features in the images corresponding to objects in the real world can be detected. The features of the objects are correlated across multiple images to determine change, which can be correlated to gesture motions. The gesture motions can be used to determine command information to a machine under control, application resident thereon or combinations thereof.
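
The separation of IR information from RGB information can be sketched as below, assuming, purely for illustration, that the sensor delivers each pixel as an (r, g, b, ir) tuple. Actual sensors may interleave RGB and IR pixels spatially instead; the layout and function name here are hypothetical.

```python
def split_channels(frame):
    """Split a combined frame into an RGB frame and an IR frame.

    `frame` is a list of rows of (r, g, b, ir) tuples. The RGB frame can be
    routed to the live video output while only the single-channel IR frame
    is passed to gesture recognition.
    """
    rgb = [[(r, g, b) for (r, g, b, _ir) in row] for row in frame]
    ir = [[ir_value for (_r, _g, _b, ir_value) in row] for row in frame]
    return rgb, ir
```

Passing only the IR frame to the gesture recognizer means it handles one value per pixel rather than four, which is one way the bandwidth saving described above could arise.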


In one implementation, motion sensors and/or other types of sensors are coupled to a motion-capture system to monitor motion of at least the sensor of the motion-capture system resulting from, for example, users' touch. Information from the motion sensors can be used to determine first and second positional information of the sensor with respect to a fixed point at first and second times. Difference information between the first and second positional information is determined. Movement information for the sensor with respect to the fixed point is computed based upon the difference information. The movement information for the sensor is applied to apparent environment information sensed by the sensor to remove motion of the sensor therefrom to yield actual environment information; which can be communicated. Control information can be communicated to a system configured to provide a virtual reality or augmented reality experience via a portable device and/or to systems controlling machinery or the like based upon motion capture information for an object moving in space derived from the sensor and adjusted to remove motion of the sensor itself. In some applications, a virtual device experience can be augmented by the addition of haptic, audio and/or visual projectors.


In an implementation, apparent environmental information is captured from positional information of an object portion at the first time and the second time using a sensor of the motion-capture system. Object portion movement information relative to the fixed point at the first time and the second time is computed based upon the difference information and the movement information for the sensor.
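
The removal of sensor motion from apparent object motion can be sketched for the translation-only case. The sketch assumes sensor positions are given relative to the fixed point, apparent object positions are given relative to the sensor, and sensor rotation is ignored; the function names are hypothetical.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def actual_object_motion(sensor_t0, sensor_t1, apparent_t0, apparent_t1):
    """Object movement relative to the fixed point between two times.

    With position = sensor + apparent (translation only), the object's
    actual movement is the apparent movement plus the sensor's movement.
    """
    sensor_move = _sub(sensor_t1, sensor_t0)        # difference information
    apparent_move = _sub(apparent_t1, apparent_t0)  # as sensed by the sensor
    return _add(apparent_move, sensor_move)
```

For a stationary object viewed by a moving sensor, the two terms cancel and the computed movement is zero, which is the intended effect of removing sensor motion.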


In further implementations, a path of the object is calculated by repeatedly determining movement information for the sensor, using the motion sensors, and the object portion, using the sensor, at successive times and analyzing a sequence of movement information to determine a path of the object portion with respect to the fixed point. Paths can be compared to templates to identify trajectories. Trajectories of body parts can be identified as gestures. Gestures can indicate command information to be communicated to a system. Some gestures communicate commands to change operational modes of a system (e.g., zoom in, zoom out, pan, show more detail, next display page, and so forth).


Advantageously, some implementations can enable improved user experience, greater safety and improved functionality for users of virtual reality wearable devices. Some implementations further provide gesture capability allowing the user to execute intuitive gestures involving virtualized contact with a virtual object. For example, a device can be provided a capability to distinguish motion of objects from motions of the device itself in order to facilitate proper gesture recognition. Some implementations can provide improved interfacing with a variety of portable or wearable machines (e.g., smart telephones, portable computing systems, including laptop, tablet computing devices, personal data assistants, special purpose visualization computing machinery, including heads up displays (HUDs) for use in aircraft or automobiles for example, wearable virtual and/or augmented reality systems, including Google Glass, and others, graphics processors, embedded microcontrollers, gaming consoles, or the like; wired or wirelessly coupled networks of one or more of the foregoing, and/or combinations thereof), obviating or reducing the need for contact-based input devices such as a mouse, joystick, touch pad, or touch screen. Some implementations can provide for improved interface with computing and/or other machinery than would be possible with heretofore known techniques. In some implementations, a richer human-machine interface experience can be provided.


Other aspects and advantages of the present technology can be seen on review of the drawings, the detailed description and the claims, which follow.





BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings, like reference characters generally refer to like parts throughout the different views. Also, the drawings are not necessarily to scale, with an emphasis instead generally being placed upon illustrating the principles of the technology disclosed. In the following description, various implementations of the technology disclosed are described with reference to the following drawings, in which:



FIG. 1 illustrates a system for capturing image and other sensory data according to an implementation of the technology disclosed.



FIG. 2 is a simplified block diagram of a computer system implementing image analysis suitable for supporting a virtual environment enabled apparatus according to an implementation of the technology disclosed.



FIG. 3A is a perspective view from the top of a sensor in accordance with the technology disclosed, with motion sensors along an edge surface thereof.



FIG. 3B is a perspective view from the bottom of a sensor in accordance with the technology disclosed, with motion sensors along the bottom surface thereof.



FIG. 3C is a perspective view from the top of a sensor in accordance with the technology disclosed, with detachable motion sensors configured for placement on a surface.



FIG. 4 illustrates apparent movement of objects from the perspective of the user of a virtual environment enabled apparatus in accordance with the technology disclosed.



FIG. 5 illustrates apparent movement of objects from the perspective of the user of a virtual environment enabled apparatus in accordance with the technology disclosed.



FIG. 6 is a flowchart showing a method of tracking motion of a wearable sensor system.



FIG. 7 shows a flowchart of one implementation of creating a multi-user interactive virtual environment using wearable sensor systems.



FIGS. 8A, 8A-1 and 8A-2 illustrate implementation of a user interface for providing an augmented reality experience.



FIG. 8B illustrates implementation of switching mode for selecting an actual reality presentation from an augmented reality experience.



FIG. 8C illustrates implementation of switching mode for selecting an augmented reality experience from an actual reality presentation.



FIG. 9 shows a flowchart of switching display modes in a virtual reality device including motion tracking by a wearable sensor system.



FIG. 10 shows a flowchart of providing an augmented reality environment.



FIG. 11 shows a flowchart of integrating a virtual object facilitating command input with a rendering of a user body portion into an augmented environment.



FIG. 12 shows a flowchart of sharing content between wearable sensor systems.





DESCRIPTION

Among other aspects, the technology described herein with reference to example implementations can provide capabilities to view and/or interact with the real world to the user of a wearable (or portable) device using a sensor or sensors configured to capture motion and/or determining the path of an object based on imaging, acoustic or vibrational waves. Implementations can enable improved user experience, greater safety, greater functionality to users of virtual reality for machine control and/or machine communications applications using wearable (or portable) devices, e.g., head mounted devices (HMDs), wearable goggles, watch computers, smartphones, and so forth, or mobile devices, e.g., autonomous and semi-autonomous robots, factory floor material handling systems, autonomous mass-transit vehicles, automobiles (human or machine driven), and so forth, equipped with suitable sensors and processors employing optical, audio or vibrational detection. In some implementations, projection techniques can supplement the sensory based tracking with presentation of virtual (or virtualized real) objects (visual, audio, haptic, and so forth) created by applications loadable to, or in cooperative implementation with, the HMD or other device to provide a user of the device with a personal virtual experience (e.g., a functional equivalent to a real experience).


Implementations include providing a “pass-through” in which live video is provided to the user of the virtual reality device, either alone or in conjunction with display of one or more virtual objects, enabling the user to perceive the real world directly. Accordingly, the user is enabled to see an actual desk environment as well as virtual applications or objects intermingled therewith. Gesture recognition and sensing enables implementations to provide the user with the ability to grasp or interact with real objects (e.g., the user's coke can) alongside the virtual (e.g., a virtual document floating above the surface of the user's actual desk). In some implementations, information from differing spectral sources is selectively used to drive one or another aspect of the experience. For example, information from IR sensitive sensors can be used to detect the user's hand motions and recognize gestures, while information from the visible light region can be used to drive the pass-through video presentation, creating a real world presentation of real and virtual objects. In a further example, combinations of image information from multiple sources can be used, with the system (or the user) selecting between IR imagery and visible light imagery based upon situational, conditional, environmental or other factors or combinations thereof. For example, the device can switch from visible light imaging to IR imaging when the ambient light conditions warrant. The user can have the ability to control the imaging source as well. In yet further examples, information from one type of sensor can be used to augment, correct, or corroborate information from another type of sensor. Information from IR sensors can be used to correct the display of imaging conducted from visible light sensitive sensors, and vice versa.
In low-light or other situations not conducive to optical imaging, where free-form gestures cannot be recognized optically with a sufficient degree of reliability, audio signals or vibrational waves can be detected and used to supply the direction and location of the object as further described herein.
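The source-selection behavior described above can be illustrated with a minimal sketch. The names `Source` and `select_source`, the two threshold constants, and the idea of a scalar optical confidence score are all assumptions made for illustration; the disclosure does not specify how ambient light or recognition reliability is quantified.

```python
from enum import Enum


class Source(Enum):
    VISIBLE = "visible"
    IR = "ir"
    ACOUSTIC = "acoustic"


# Hypothetical thresholds; the disclosure gives no numeric values.
LOW_LIGHT_LUX = 10.0
MIN_OPTICAL_CONFIDENCE = 0.5


def select_source(ambient_lux, optical_confidence, user_choice=None):
    """Pick the sensing source for tracking.

    A user selection, when present, overrides the automatic choice.
    Otherwise fall back to acoustic/vibrational sensing when optical
    recognition is unreliable, and to IR imaging in low ambient light.
    """
    if user_choice is not None:
        return user_choice
    if optical_confidence < MIN_OPTICAL_CONFIDENCE:
        return Source.ACOUSTIC
    if ambient_lux < LOW_LIGHT_LUX:
        return Source.IR
    return Source.VISIBLE
```

In this sketch the user override is checked first, reflecting the statement that the user can have the ability to control the imaging source as well.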


The technology disclosed can be applied to enhance user experience in immersive virtual reality environments using wearable sensor systems. Examples of systems, apparatus, and methods according to the disclosed implementations are described in a “wearable sensor systems” context. The examples of “wearable sensor systems” are being provided solely to add context and aid in the understanding of the disclosed implementations. In other instances, examples of gesture-based interactions in other contexts like automobiles, robots, or other machines can be applied to virtual games, virtual applications, virtual programs, virtual operating systems, etc. Other applications are possible, such that the following examples should not be taken as definitive or limiting either in scope, context, or setting. It will thus be apparent to one skilled in the art that implementations can be practiced in or outside the “wearable sensor systems” context.


As used herein, a given signal, event or value is “responsive to” a predecessor signal, event or value if the predecessor signal, event or value influenced the given signal, event or value. If there is an intervening processing element, step or time period, the given signal, event or value can still be “responsive to” the predecessor signal, event or value. If the intervening processing element or step combines more than one signal, event or value, the signal output of the processing element or step is considered “responsive to” each of the signal, event or value inputs. If the given signal, event or value is the same as the predecessor signal, event or value, this is merely a degenerate case in which the given signal, event or value is still considered to be “responsive to” the predecessor signal, event or value. “Responsiveness” or “dependency” or “basis” of a given signal, event or value upon another signal, event or value is defined similarly.


As used herein, the “identification” of an item of information does not necessarily require the direct specification of that item of information. Information can be “identified” in a field by simply referring to the actual information through one or more layers of indirection, or by identifying one or more items of different information which are together sufficient to determine the actual item of information. In addition, the term “specify” is used herein to mean the same as “identify.”


Refer first to FIG. 1, which illustrates a system 100 for capturing image data according to one implementation of the technology disclosed. System 100 is preferably coupled to a wearable device 101 that can be a personal head mounted device (HMD) having a goggle form factor such as shown in FIG. 1, a helmet form factor, or can be incorporated into or coupled with a watch, smartphone, or other type of portable device.


In various implementations, the system and method for capturing 3D motion of an object as described herein can be integrated with other applications, such as a head-mounted device or a mobile device. Referring again to FIG. 1, an HMD 101 can include an optical assembly that displays a surrounding environment or a virtual environment to the user; incorporation of the motion-capture system 100 in the HMD 101 allows the user to interactively control the displayed environment. For example, a virtual environment can include virtual objects that can be manipulated by the user's hand gestures, which are tracked by the motion-capture system 100. In one implementation, the motion-capture system 100 integrated with the HMD 101 detects a position and shape of the user's hand and projects it on the display of the head-mounted device 101 such that the user can see her gestures and interactively control the objects in the virtual environment. This can be applied in, for example, gaming or internet browsing.


System 100 includes any number of cameras 102, 104 coupled to sensory processing system 106. Cameras 102, 104 can be any type of camera, including cameras sensitive across the visible spectrum or with enhanced sensitivity to a confined wavelength band (e.g., the infrared (IR) or ultraviolet bands); more generally, the term “camera” herein refers to any device (or combination of devices) capable of capturing an image of an object and representing that image in the form of digital data. For example, line sensors or line cameras rather than conventional devices that capture a two-dimensional (2D) image can be employed. The term “light” is used generally to connote any electromagnetic radiation, which may or may not be within the visible spectrum, and may be broadband (e.g., white light) or narrowband (e.g., a single wavelength or narrow band of wavelengths).


Cameras 102, 104 are preferably capable of capturing video images (i.e., successive image frames at a constant rate of at least 15 frames per second), although no particular frame rate is required. The capabilities of cameras 102, 104 are not critical to the technology disclosed, and the cameras can vary as to frame rate, image resolution (e.g., pixels per image), color or intensity resolution (e.g., number of bits of intensity data per pixel), focal length of lenses, depth of field, etc. In general, for a particular application, any cameras capable of focusing on objects within a spatial volume of interest can be used. For instance, to capture motion of the hand of an otherwise stationary person, the volume of interest might be defined as a cube approximately one meter on a side.


As shown, cameras 102, 104 can be oriented toward portions of a region of interest 112 by motion of the device 101, in order to view a virtually rendered or virtually augmented view of the region of interest 112 that can include a variety of virtual objects 116 as well as an object of interest 114 (in this example, one or more hands) that moves within the region of interest 112. One or more sensors 108, 110 capture motions of the device 101. In some implementations, one or more light sources 115, 117 are arranged to illuminate the region of interest 112. In some implementations, one or more of the cameras 102, 104 are disposed opposite the motion to be detected, e.g., where the hand 114 is expected to move. This is an optimal location because the amount of information recorded about the hand is proportional to the number of pixels it occupies in the camera images, and the hand will occupy more pixels when the camera's angle with respect to the hand's “pointing direction” is as close to perpendicular as possible. Sensory processing system 106, which can be, e.g., a computer system, can control the operation of cameras 102, 104 to capture images of the region of interest 112 and sensors 108, 110 to capture motions of the device 101. Information from sensors 108, 110 can be applied to models of images taken by cameras 102, 104 to cancel out the effects of motions of the device 101, providing greater accuracy to the virtual experience rendered by device 101. Based on the captured images and motions of the device 101, sensory processing system 106 determines the position and/or motion of object 114.


For example, as an action in determining the motion of object 114, sensory processing system 106 can determine which pixels of various images captured by cameras 102, 104 contain portions of object 114. In some implementations, any pixel in an image can be classified as an “object” pixel or a “background” pixel depending on whether that pixel contains a portion of object 114 or not. Object pixels can thus be readily distinguished from background pixels based on brightness. Further, edges of the object can also be readily detected based on differences in brightness between adjacent pixels, allowing the position of the object within each image to be determined. In some implementations, the silhouettes of an object are extracted from one or more images of the object that reveal information about the object as seen from different vantage points. While silhouettes can be obtained using a number of different techniques, in some implementations, the silhouettes are obtained by using cameras to capture images of the object and analyzing the images to detect object edges. Correlating object positions between images from cameras 102, 104 and cancelling out captured motions of the device 101 from sensors 108, 110 allows sensory processing system 106 to determine the location in 3D space of object 114, and analyzing sequences of images allows sensory processing system 106 to reconstruct 3D motion of object 114 using conventional motion algorithms or other techniques. See, e.g., U.S. patent application Ser. No. 13/414,485, filed on Mar. 7, 2012 and Ser. No. 13/742,953, filed on Jan. 16, 2013, and U.S. Provisional Patent Application No. 61/724,091, filed on Nov. 8, 2012, which are hereby incorporated herein by reference in their entirety.
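As a rough illustration of the pixel classification and edge detection described above, the following sketch labels pixels by a brightness threshold and reports object pixels that border background along a row. It assumes the object appears brighter than the background (for example, because it is closer to the light sources); the function names, the threshold, and the row-wise notion of an edge are illustrative choices, not details taken from the disclosure.

```python
def classify_pixels(image, threshold):
    """Return a mask that is True for "object" pixels.

    `image` is a list of rows of brightness values. A pixel is treated
    as an object pixel when its brightness is at least `threshold`
    (an assumed criterion).
    """
    return [[pixel >= threshold for pixel in row] for row in image]


def find_edges(mask):
    """Return (row, col) positions of object pixels whose left or right
    neighbor is a background pixel or lies outside the image."""
    edges = []
    for r, row in enumerate(mask):
        for c, is_object in enumerate(row):
            if not is_object:
                continue
            left_is_background = c == 0 or not row[c - 1]
            right_is_background = c == len(row) - 1 or not row[c + 1]
            if left_is_background or right_is_background:
                edges.append((r, c))
    return edges


image = [
    [10, 200, 220, 210, 12],
    [11, 12, 230, 13, 9],
]
mask = classify_pixels(image, threshold=128)
edges = find_edges(mask)
```

Correlating the edge positions found in images from two cameras is what would then allow a 3D position to be estimated; that step is outside this sketch.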


Presentation interface 208 employs projection techniques in conjunction with the sensory based tracking in order to present virtual (or virtualized real) objects (visual, audio, haptic, and so forth) created by applications loadable to, or in cooperative implementation with, the device 101 to provide a user of the device with a personal virtual experience. Projection can include an image or other visual representation of an object.


One implementation uses motion sensors and/or other types of sensors coupled to a motion-capture system to monitor motions within a real environment. A virtual object integrated into an augmented rendering of a real environment can be projected to a user of a portable device 101. Motion information of a user body portion can be determined based at least in part upon sensory information received from imaging devices (e.g., cameras 102, 104) or acoustic or other sensory devices. Control information is communicated to a system based in part on a combination of the motion of the portable device 101 and the detected motion of the user determined from the sensory information received from imaging devices (e.g., cameras 102, 104) or acoustic or other sensory devices. The virtual device experience can be augmented in some implementations by the addition of haptic, audio and/or other sensory information projectors. For example, with reference to FIG. 8A, video presentation mechanism 208 can project an image of a virtual device (e.g., menu 801) superimposed upon a real world object rendered as a representation in the augmented reality, e.g., arm 803 being displayed to a user via live video feed, thereby creating a virtual device experience of interacting with a menu on the user's arm. Optional haptic projector 806 can project the feeling of “clicking” a button to the user's finger. Optional audio projector 802 can project the sound of a “click” in response to detecting the user depressing a button on the menu. Because it is a virtual reality world, the back side of hand 114 is projected to the user, so that the scene looks to the user as if the user is looking at the user's own hand(s).


A plurality of sensors 108, 110 are coupled to the sensory processing system 106 to capture motions of the device 101. Sensors 108, 110 can be any type of sensor useful for obtaining signals from various parameters of motion (acceleration, velocity, angular acceleration, angular velocity, position/locations); more generally, the term “motion detector” herein refers to any device (or combination of devices) capable of converting mechanical motion into an electrical signal. Such devices can include, alone or in various combinations, accelerometers, gyroscopes, and magnetometers, and are designed to sense motions through changes in orientation, magnetism or gravity. Many types of motion sensors exist and implementation alternatives vary widely.


The illustrated system 100 can include any of various other sensors not shown in FIG. 1 for clarity, alone or in various combinations, to enhance the virtual experience provided to the user of device 101. For example, in low-light situations where free-form gestures cannot be recognized optically with a sufficient degree of reliability, system 106 may switch to a touch mode in which touch gestures are recognized based on acoustic or vibrational sensors. Alternatively, system 106 may switch to the touch mode, or supplement image capture and processing with touch sensing, when signals from acoustic or vibrational sensors are sensed. In still another operational mode, a tap or touch gesture may act as a “wake up” signal to bring the image and audio analysis system 106 from a standby mode to an operational mode. For example, the system 106 may enter the standby mode if optical signals from the cameras 102, 104 are absent for longer than a threshold interval.
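The mode transitions described above (optical operation, touch mode, standby, and tap-to-wake) can be sketched as a small state machine. The `Mode` names, the `next_mode` function, the boolean inputs, and the timeout value are assumptions for illustration; the disclosure does not fix a threshold interval or specify how reliability is measured.

```python
from enum import Enum


class Mode(Enum):
    OPTICAL = "optical"
    TOUCH = "touch"
    STANDBY = "standby"


# Hypothetical threshold interval, in seconds.
STANDBY_TIMEOUT_S = 5.0


def next_mode(mode, optical_reliable, touch_detected, seconds_without_optical):
    """Return the next operating mode given the current sensor state."""
    if mode is Mode.STANDBY:
        # Only a tap or touch gesture acts as the "wake up" signal.
        if not touch_detected:
            return Mode.STANDBY
        return Mode.OPTICAL if optical_reliable else Mode.TOUCH
    # Enter standby when optical signals have been absent for too long
    # and nothing is being sensed acoustically either.
    if seconds_without_optical > STANDBY_TIMEOUT_S and not touch_detected:
        return Mode.STANDBY
    # Use touch sensing when optical recognition is unreliable or when
    # acoustic/vibrational signals are sensed.
    if not optical_reliable or touch_detected:
        return Mode.TOUCH
    return Mode.OPTICAL
```

This sketch treats touch mode as replacing optical recognition; the text also allows touch sensing to supplement image capture, which would need a combined mode not modeled here.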


It will be appreciated that the components shown in FIG. 1 are illustrative. In some implementations, it may be desirable to house the system 100 in a differently shaped enclosure or to integrate it within a larger component or assembly. Furthermore, the number and type of image sensors, motion detectors, illumination sources, and so forth are shown schematically for clarity, but neither the size nor the number is the same in all implementations.


Refer now to FIG. 2, which shows a simplified block diagram of a computer system 200 for implementing sensory processing system 106. Computer system 200 includes a processor 202, a memory 204, a motion detector and camera interface 206, a presentation interface 208, speaker(s) 209, microphone(s) 210, and a wireless interface 211. Memory 204 can be used to store instructions to be executed by processor 202 as well as input and/or output data associated with execution of the instructions. In particular, memory 204 contains instructions, conceptually illustrated as a group of modules described in greater detail below, that control the operation of processor 202 and its interaction with the other hardware components. An operating system directs the execution of low-level, basic system functions such as memory allocation, file management and operation of mass storage devices. The operating system may be or include a variety of operating systems such as Microsoft WINDOWS operating system, the Unix operating system, the Linux operating system, the Xenix operating system, the IBM AIX operating system, the Hewlett Packard UX operating system, the Novell NETWARE operating system, the Sun Microsystems SOLARIS operating system, the OS/2 operating system, the BeOS operating system, the MACINTOSH operating system, the APACHE operating system, an OPENACTION operating system, iOS, Android or other mobile operating systems, or another operating system or platform.


The computing environment may also include other removable/non-removable, volatile/nonvolatile computer storage media. For example, a hard disk drive may read or write to non-removable, nonvolatile magnetic media. A magnetic disk drive may read from or write to a removable, nonvolatile magnetic disk, and an optical disk drive may read from or write to a removable, nonvolatile optical disk such as a CD-ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The storage media are typically connected to the system bus through a removable or non-removable memory interface.


Processor 202 may be a general-purpose microprocessor, but depending on implementation can alternatively be a microcontroller, peripheral integrated circuit element, a CSIC (customer-specific integrated circuit), an ASIC (application-specific integrated circuit), a logic circuit, a digital signal processor, a programmable logic device such as an FPGA (field-programmable gate array), a PLD (programmable logic device), a PLA (programmable logic array), an RFID processor, smart chip, or any other device or arrangement of devices that is capable of implementing the actions of the processes of the technology disclosed.


Motion detector and camera interface 206 can include hardware and/or software that enables communication between computer system 200 and cameras 102, 104, as well as sensors 108, 110 (see FIG. 1). Thus, for example, motion detector and camera interface 206 can include one or more camera data ports 216, 218 and motion detector ports 217, 219 to which the cameras and motion detectors can be connected (via conventional plugs and jacks), as well as hardware and/or software signal processors to modify data signals received from the cameras and motion detectors (e.g., to reduce noise or reformat data) prior to providing the signals as inputs to a motion-capture (“mocap”) program 214 executing on processor 202. In some implementations, motion detector and camera interface 206 can also transmit signals to the cameras and sensors, e.g., to activate or deactivate them, to control camera settings (frame rate, image quality, sensitivity, etc.), to control sensor settings (calibration, sensitivity levels, etc.), or the like. Such signals can be transmitted, e.g., in response to control signals from processor 202, which may in turn be generated in response to user input or other detected events.


Instructions defining mocap program 214 are stored in memory 204, and these instructions, when executed, perform motion-capture analysis on images supplied from cameras and audio signals from sensors connected to motion detector and camera interface 206. In one implementation, mocap program 214 includes various modules, such as an object analysis module 222 and a path analysis module 224. Object analysis module 222 can analyze images (e.g., images captured via interface 206) to detect edges of an object therein and/or other information about the object's location. In some implementations, object analysis module 222 can also analyze audio signals (e.g., audio signals captured via interface 206) to localize the object by, for example, time difference of arrival, multilateration or the like. (“Multilateration is a navigation technique based on the measurement of the difference in distance to two or more stations at known locations that broadcast signals at known times.” See Wikipedia, at http://en.wikipedia.org/w/index.php?title=Multilateration&oldid=523281858, on Nov. 16, 2012, 06:07 UTC.) Path analysis module 224 can track and predict object movements in 3D based on information obtained via the cameras. Some implementations will include a Virtual Reality/Augmented Reality environment manager 226 that provides integration of virtual objects reflecting real objects (e.g., hand 114) as well as synthesized objects 116 for presentation to the user of device 101 via presentation interface 208 to provide a personal virtual experience. One or more applications 230 can be loaded into memory 204 (or otherwise made available to processor 202) to augment or customize functioning of device 101, thereby enabling the system 200 to function as a platform. Successive camera images are analyzed at the pixel level to extract object movements and velocities. Audio signals place the object on a known surface, and the strength and variation of the signals can be used to detect the object's presence. If both audio and image information is simultaneously available, both types of information can be analyzed and reconciled to produce a more detailed and/or accurate path analysis. A video feed integrator 229 provides integration of live video feed from the cameras 102, 104 and one or more virtual objects (e.g., 801 of FIG. 8A) using techniques like those described in commonly owned U.S. Provisional Patent Application 62/001,044, filed 20 May 2014. Video feed integrator 229 governs processing of video information from disparate types of cameras 102, 104. For example, information received from pixels sensitive to IR light and from pixels sensitive to visible light (e.g., RGB) can be separated by integrator 229 and processed differently. Image information from IR sensors can be used for gesture recognition, while image information from RGB sensors can be provided as a live video feed via presentation interface 208. Information from one type of sensor can be used to enhance, correct, and/or corroborate information from another type of sensor. Information from one type of sensor can be favored in some types of situational or environmental conditions (e.g., low light, fog, bright light, and so forth). The device can select between providing presentation output based upon one or the other types of image information, either automatically or by receiving a selection from the user. Integrator 229, in conjunction with VR/AR environment manager 226, controls the creation of the environment presented to the user via presentation interface 208.
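To make the audio localization idea concrete, here is a deliberately simplified, one-dimensional sketch of localization from an arrival-time difference. It assumes two sensors at known positions on a line, a tap occurring between them, and a known, constant wave speed in the surface; `locate_tap_1d` and its parameters are hypothetical names, and a practical system would use more sensors and full multilateration in two or three dimensions.

```python
def locate_tap_1d(x1, x2, dt, wave_speed):
    """Estimate where a tap occurred between two sensors on a line.

    x1, x2     -- sensor positions in meters, with x1 < x2
    dt         -- arrival time at sensor 1 minus arrival time at sensor 2 (s)
    wave_speed -- propagation speed of the wave in the medium (m/s)

    With the source at x between the sensors, t1 = (x - x1) / v and
    t2 = (x2 - x) / v, so dt = (2x - x1 - x2) / v, which gives
    x = (x1 + x2 + v * dt) / 2.
    """
    x = 0.5 * (x1 + x2 + wave_speed * dt)
    if not x1 <= x <= x2:
        raise ValueError("time difference is inconsistent with a source between the sensors")
    return x


# Example: sensors 1 m apart, tap at 0.25 m, speed of sound in air (343 m/s).
speed = 343.0
dt = 0.25 / speed - 0.75 / speed
estimate = locate_tap_1d(0.0, 1.0, dt, speed)
```

The position estimate depends only on the time difference, not on when the tap itself occurred, which is why no synchronized emission time is needed.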


Presentation interface 208, speakers 209, microphones 210, and wireless network interface 211 can be used to facilitate user interaction via device 101 with computer system 200. These components can be of generally conventional design or modified as desired to provide any type of user interaction. In some implementations, results of motion capture using motion detector and camera interface 206 and mocap program 214 can be interpreted as user input. For example, a user can perform hand gestures or motions across a surface that are analyzed using mocap program 214, and the results of this analysis can be interpreted as an instruction to some other program executing on processor 202 (e.g., a web browser, word processor, or other application). Thus, by way of illustration, a user might use upward or downward swiping gestures to “scroll” a webpage currently displayed to the user of device 101 via presentation interface 208, use rotating gestures to increase or decrease the volume of audio output from speakers 209, and so on. Path analysis module 224 may represent the detected path as a vector and extrapolate to predict the path, e.g., to improve rendering of action on device 101 by presentation interface 208 by anticipating movement.
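The path extrapolation mentioned above can be sketched with a constant-velocity model. This is one simple way to represent a path as a vector and extrapolate it, shown under the assumption that the two most recent tracked positions are enough; `predict_position` and the sample format are hypothetical, and the disclosure does not say which prediction model path analysis module 224 uses.

```python
def predict_position(samples, lookahead):
    """Linearly extrapolate a tracked 3D position.

    samples   -- list of (timestamp, (x, y, z)) tuples ordered by time
    lookahead -- how far beyond the last timestamp to predict, in seconds

    The velocity vector is estimated from the last two samples and
    assumed constant over the lookahead interval.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate velocity")
    (t0, p0), (t1, p1) = samples[-2], samples[-1]
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("timestamps must be strictly increasing")
    velocity = tuple((b - a) / dt for a, b in zip(p0, p1))
    return tuple(p + v * lookahead for p, v in zip(p1, velocity))


track = [(0.0, (0.0, 0.0, 0.0)), (0.5, (1.0, 2.0, -1.0))]
predicted = predict_position(track, lookahead=0.25)
```

A renderer could draw the hand at the predicted position to partially hide sensing and display latency, at the cost of overshoot when the hand changes direction.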


It will be appreciated that computer system 200 is illustrative and that variations and modifications are possible. Computer systems can be implemented in a variety of form factors, including server systems, desktop systems, laptop systems, tablets, smart phones or personal digital assistants, and so on. A particular implementation may include other functionality not described herein, e.g., wired and/or wireless network interfaces, media playing and/or recording capability, etc. In some implementations, one or more cameras and two or more microphones may be built into the computer rather than being supplied as separate components. Further, an image or audio analyzer can be implemented using only a subset of computer system components (e.g., as a processor executing program code, an ASIC, or a fixed-function digital signal processor, with suitable I/O interfaces to receive image data and output analysis results).


While computer system 200 is described herein with reference to particular blocks, it is to be understood that the blocks are defined for convenience of description and are not intended to imply a particular physical arrangement of component parts. Further, the blocks need not correspond to physically distinct components. To the extent that physically distinct components are used, connections between components (e.g., for data communication) can be wired and/or wireless as desired. Thus, for example, execution of object analysis module 222 by processor 202 can cause processor 202 to operate motion detector and camera interface 206 to capture images and/or audio signals of an object traveling across and in contact with a surface to detect its entrance by analyzing the image and/or audio data.



FIGS. 3A-3C illustrate three different configurations of a movable sensor system 300A, 300B, and 300C, with reference to example implementations packaged within a single housing as an integrated sensor. In all cases, sensor 300A, 300B, 300C includes a top surface 305, a bottom surface 307, and a side wall 310 spanning the top and bottom surfaces 305, 307. With reference also to FIG. 3A, the top surface 305 of sensor 300A contains a pair of windows 315 for admitting light to the cameras 102, 104, one of which is optically aligned with each of the windows 315. If the system includes light sources 115, 117, surface 305 may contain additional windows for passing light to the object(s) being tracked. In sensor 300A, motion sensors 108, 110 are located on the side wall 310. Desirably, the motion sensors are flush with the surface of side wall 310 so that the motion sensors are disposed to sense motions about a longitudinal axis of sensor 300A. Of course, the motion sensors can be recessed from side wall 310 internal to the device in order to accommodate sensor operation and placement within available packaging space so long as coupling with the external housing of sensor 300A remains adequate. In sensor 300B, motion sensors 108, 110 are located proximate to the bottom surface 307, once again in a flush or recessed configuration. The top surface of the sensor 300B (not shown in the figure for clarity's sake) contains camera windows 315 as shown in FIG. 3A. In FIG. 3C, motion sensors 108, 110 are external contact transducers that connect to sensor 300C via jacks 320. This configuration permits the motion sensors to be located away from the sensor 300C, e.g., if the motion sensors are desirably spaced further apart than the packaging of sensor 300C allows. In other implementations, movable sensor components of FIGS. 3A, 3B and 3C can be embedded in portable (e.g., head mounted devices (HMDs), wearable goggles, watch computers, smartphones, and so forth) or movable (e.g., autonomous robots, material transports, automobiles (human or machine driven)) devices.



FIG. 4 illustrates apparent movement of objects from the perspective of the user of a virtual environment enabled apparatus 400 in accordance with the technology. FIG. 4 shows two views of a user of a device 101 viewing a field of view 113 at two different times. As shown in block 401, at an initial time t0, user is viewing field of view 113a using device 101 in a particular initial position to view an area 113a. As shown in block 402, device 101 presents to user a display of the device field of view 113a that includes objects 114 (hands) in a particular pose. As shown in block 403, subsequently at time t1, the user has repositioned device 101. Accordingly, the apparent position of objects 114 in the field of view 113b shown in block 404 has changed from the apparent position of the objects 114 in field of view 113a. Even in the case where the hands 114 did not move in space, the user sees an apparent movement of the hands 114 due to the change in position of the device.


Now with reference to FIG. 5, an apparent movement of one or more moving objects from the perspective of the user of a virtual environment enabled apparatus 500 is illustrated. As shown by block 502, field of view 113a presented by device 101 at time t0 includes an object 114. At time t0, the position and orientation of tracked object 114 is known with respect to device reference frame 120a, again at time t0. As shown by block 404, at time t1, the position and orientation of both device reference frame 120b and tracked object 114 have changed. As shown by block 504, field of view 113b presented by device 101 at time t1 includes object 114 in a new apparent position. Because the device 101 has moved, the device reference frame 120 has moved from an original or starting device reference frame 120a to a current or final reference frame 120b as indicated by transformation T. It is noteworthy that the device 101 can rotate as well as translate. Implementations can provide sensing the position and rotation of reference frame 120b with respect to reference frame 120a and sensing the position and rotation of tracked object 114 with respect to 120b, at time t1. Implementations can determine the position and rotation of tracked object 114 with respect to 120a from the sensed position and rotation of reference frame 120b with respect to reference frame 120a and the sensed position and rotation of tracked object 114 with respect to 120b.


In an implementation, a transformation $R$ is determined that moves dashed line reference frame 120a to dotted line reference frame 120b, without intermediate conversion to an absolute or world frame of reference. Applying the reverse transformation $R^T$ makes the dotted line reference frame 120b lie on top of dashed line reference frame 120a. Then the tracked object 114 will be in the right place from the point of view of dashed line reference frame 120a. (It is noteworthy that $R^T$ is equivalent to $R^{-1}$ for our purposes.) In determining the motion of object 114, sensory processing system 106 can determine its location and direction by computationally analyzing images captured by cameras 102, 104 and motion information captured by sensors 108, 110. For example, an apparent position of any point on the object (in 3D space) at time $t = t_0$:

$$\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix},$$

can be converted to a real position of the point on the object at time $t = t_1$:

$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix}$$

using an affine transform

$$\begin{bmatrix} R_{ref} & T_{ref} \\ 0 & 1 \end{bmatrix}$$

from the frame of reference of the device. We refer to the combination of a rotation and translation, which are not generally commutative, as the affine transformation.


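The correct location at time $t = t_1$ of a point on the tracked object with respect to device reference frame 120a is given by an inverse affine transformation, e.g.,

$$\begin{bmatrix} R_{ref}^T & -R_{ref}^T * T_{ref} \\ 0 & 1 \end{bmatrix}$$

as provided for in equation (1):

$$\begin{bmatrix} R_{ref}^T & (-R_{ref}^T) * T_{ref} \\ 0 & 1 \end{bmatrix} * \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} \qquad (1)$$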







Where:

    • $R_{ref}^T$: Represents the rotation matrix part of an affine transform describing the rotation transformation from the device reference frame 120a to the device reference frame 120b.
    • $T_{ref}$: Represents translation of the device reference frame 120a to the device reference frame 120b.
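The inverse affine transformation of equation (1) can be sketched in plain Python as follows. The helper names (`affine`, `inverse_affine`, `transform_point`) and the list-of-lists matrix representation are illustrative assumptions; only the block structure, with the transposed rotation in the upper left and the negated, rotated translation in the upper right, comes from the text. The sketch relies on the rotation matrix being orthonormal, so that its transpose is its inverse.

```python
def transpose3(r):
    """Transpose a 3x3 matrix given as a list of rows."""
    return [[r[j][i] for j in range(3)] for i in range(3)]


def mat_vec3(r, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]


def affine(r, t):
    """Build the 4x4 homogeneous matrix [[R, T], [0, 1]]."""
    rows = [list(r[i]) + [t[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def inverse_affine(r_ref, t_ref):
    """Build [[R^T, -R^T * T], [0, 1]], the inverse of affine(r_ref, t_ref)."""
    r_t = transpose3(r_ref)
    minus_rt_t = [-c for c in mat_vec3(r_t, t_ref)]
    return affine(r_t, minus_rt_t)


def transform_point(m, p):
    """Apply a 4x4 homogeneous matrix to the 3D point p."""
    v = [p[0], p[1], p[2], 1.0]
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(3)]


# Example: device rotated 90 degrees about z and translated by (1, 2, 3).
r_ref = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t_ref = [1.0, 2.0, 3.0]
moved = transform_point(affine(r_ref, t_ref), [1.0, 0.0, 0.0])
restored = transform_point(inverse_affine(r_ref, t_ref), moved)
```

Applying the forward transform and then the inverse returns the original point, which is a quick consistency check on the block layout used in equation (1).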


One conventional approach to obtaining the rotation part $R$ of the affine transform is the axis-angle method, which builds $R$ from an axis unit vector $u = (u_x, u_y, u_z)$ and a rotation angle $\theta$. See Wikipedia, at <http://en.wikipedia.org/wiki/Rotation_matrix>, Rotation matrix from axis and angle, on Jan. 30, 2014, 20:12 UTC, by which the computations of equation (2) are at least in part inspired:

$$R = \begin{bmatrix}
\cos\theta + u_x^2(1-\cos\theta) & u_x u_y(1-\cos\theta) - u_z\sin\theta & u_x u_z(1-\cos\theta) + u_y\sin\theta \\
u_y u_x(1-\cos\theta) + u_z\sin\theta & \cos\theta + u_y^2(1-\cos\theta) & u_y u_z(1-\cos\theta) - u_x\sin\theta \\
u_z u_x(1-\cos\theta) - u_y\sin\theta & u_z u_y(1-\cos\theta) + u_x\sin\theta & \cos\theta + u_z^2(1-\cos\theta)
\end{bmatrix} \qquad (2)$$
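The axis-angle construction of equation (2) translates directly into code. The sketch below is a plain-Python rendering of that matrix; the function name `rotation_from_axis_angle` is hypothetical, and the normalization of the axis is an added convenience, since equation (2) assumes that $u$ is already a unit vector.

```python
import math


def rotation_from_axis_angle(axis, theta):
    """Return the 3x3 rotation matrix of equation (2).

    axis  -- (ux, uy, uz); normalized here so callers may pass any
             nonzero vector
    theta -- rotation angle in radians
    """
    ux, uy, uz = axis
    norm = math.sqrt(ux * ux + uy * uy + uz * uz)
    if norm == 0.0:
        raise ValueError("rotation axis must be nonzero")
    ux, uy, uz = ux / norm, uy / norm, uz / norm
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [c + ux * ux * k, ux * uy * k - uz * s, ux * uz * k + uy * s],
        [uy * ux * k + uz * s, c + uy * uy * k, uy * uz * k - ux * s],
        [uz * ux * k - uy * s, uz * uy * k + ux * s, c + uz * uz * k],
    ]


# A quarter turn about the z axis maps the x axis onto the y axis.
r = rotation_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
x_axis_rotated = [r[i][0] for i in range(3)]
```

Because the result is orthonormal, its transpose serves as the reverse rotation, which is the property the surrounding text relies on when it treats $R^T$ as equivalent to $R^{-1}$.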










$$R^{T} = \begin{bmatrix}
\cos\theta + u_x^2(1-\cos\theta) & u_y u_x(1-\cos\theta) + u_z\sin\theta & u_z u_x(1-\cos\theta) - u_y\sin\theta \\
u_x u_y(1-\cos\theta) - u_z\sin\theta & \cos\theta + u_y^2(1-\cos\theta) & u_z u_y(1-\cos\theta) + u_x\sin\theta \\
u_x u_z(1-\cos\theta) + u_y\sin\theta & u_y u_z(1-\cos\theta) - u_x\sin\theta & \cos\theta + u_z^2(1-\cos\theta)
\end{bmatrix}$$


$$-R^{T} = \begin{bmatrix}
-\cos\theta - u_x^2(1-\cos\theta) & -u_y u_x(1-\cos\theta) - u_z\sin\theta & -u_z u_x(1-\cos\theta) + u_y\sin\theta \\
-u_x u_y(1-\cos\theta) + u_z\sin\theta & -\cos\theta - u_y^2(1-\cos\theta) & -u_z u_y(1-\cos\theta) - u_x\sin\theta \\
-u_x u_z(1-\cos\theta) - u_y\sin\theta & -u_y u_z(1-\cos\theta) + u_x\sin\theta & -\cos\theta - u_z^2(1-\cos\theta)
\end{bmatrix}$$


$$T = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$$

is a vector representing a translation of the object with respect to origin of the coordinate system of the translated frame,


$$-R^{T} * T = \begin{bmatrix}
(-\cos\theta - u_x^2(1-\cos\theta))(a) + (-u_y u_x(1-\cos\theta) - u_z\sin\theta)(b) + (-u_z u_x(1-\cos\theta) + u_y\sin\theta)(c) \\
(-u_x u_y(1-\cos\theta) + u_z\sin\theta)(a) + (-\cos\theta - u_y^2(1-\cos\theta))(b) + (-u_z u_y(1-\cos\theta) - u_x\sin\theta)(c) \\
(-u_x u_z(1-\cos\theta) - u_y\sin\theta)(a) + (-u_y u_z(1-\cos\theta) + u_x\sin\theta)(b) + (-\cos\theta - u_z^2(1-\cos\theta))(c)
\end{bmatrix}$$
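
The axis-angle construction above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the function names are hypothetical, the axis is assumed to be a unit vector, and the example axis, angle, and translation values are made up.

```python
import math

def rotation_from_axis_angle(u, theta):
    """Build the 3x3 matrix R of equation (2) from unit axis u and angle theta."""
    ux, uy, uz = u
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [c + ux * ux * k,      ux * uy * k - uz * s, ux * uz * k + uy * s],
        [uy * ux * k + uz * s, c + uy * uy * k,      uy * uz * k - ux * s],
        [uz * ux * k - uy * s, uz * uy * k + ux * s, c + uz * uz * k],
    ]

def neg_rt_times_t(r, t):
    """Compute (-R^T) * T, the translation column of the inverse transform."""
    return [-sum(r[j][i] * t[j] for j in range(3)) for i in range(3)]

# Example: rotate 90 degrees about the z axis (axis chosen for illustration).
r = rotation_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
offset = neg_rt_times_t(r, [1.0, 2.0, 3.0])
```

Because R is orthonormal, multiplying R by its transpose should give the identity matrix, which is a convenient check that the signs of the sine terms are correct.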







In another example, an apparent orientation and position of the object at time t=t0: vector pair

$$\begin{bmatrix} R_{obj} & T_{obj} \\ 0 & 1 \end{bmatrix},$$

can be converted to a real orientation and position of the object at time t=t1:

$$\begin{bmatrix} R'_{obj} & T'_{obj} \\ 0 & 1 \end{bmatrix}$$

using an affine transform

$$\begin{bmatrix} R_{ref} & T_{ref} \\ 0 & 1 \end{bmatrix}.$$

The correct orientation and position of the tracked object with respect to device reference frame at time t=t0 (120a) is given by an inverse affine transformation, e.g.,

$$\begin{bmatrix} R_{ref}^{T} & -R_{ref}^{T} * T_{ref} \\ 0 & 1 \end{bmatrix}$$

as provided for in equation (3):

$$\begin{bmatrix} R_{ref}^{T} & -R_{ref}^{T} * T_{ref} \\ 0 & 1 \end{bmatrix} * \begin{bmatrix} R_{obj} & T_{obj} \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R'_{obj} & T'_{obj} \\ 0 & 1 \end{bmatrix} \qquad (3)$$

Where:

    • $R_{ref}^{T}$: Represents the rotation matrix part of an affine transform describing the rotation transformation from the device reference frame 120a to the device reference frame 120b.
    • $R_{obj}$: Represents a matrix describing the rotation at t0 of the object with respect to the device reference frame 120b.
    • $R'_{obj}$: Represents a matrix describing the rotation at t1 of the object with respect to the device reference frame 120a.
    • $T_{ref}$: Represents a vector translation of the device reference frame 120a to the device reference frame 120b.
    • $T_{obj}$: Represents a vector describing the position at t0 of the object with respect to the device reference frame 120b.
    • $T'_{obj}$: Represents a vector describing the position at t1 of the object with respect to the device reference frame 120a.
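
Equation (3) can be exercised with 4x4 homogeneous matrices. The sketch below is one possible rendering in standard-library Python; the helper names and the sample poses are assumptions for illustration and are not taken from the disclosure.

```python
def matmul4(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def make_pose(r, t):
    """Assemble the homogeneous matrix [R, T; 0, 1]."""
    return [r[0] + [t[0]], r[1] + [t[1]], r[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]

def invert_pose(pose):
    """Return [R^T, -R^T*T; 0, 1] for a rigid pose [R, T; 0, 1]."""
    r_t = [[pose[j][i] for j in range(3)] for i in range(3)]
    t = [pose[i][3] for i in range(3)]
    neg = [-sum(r_t[i][j] * t[j] for j in range(3)) for i in range(3)]
    return make_pose(r_t, neg)

# Hypothetical values: device rotated 90 degrees about z and moved 1 unit along x.
ref = make_pose([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
                [1.0, 0.0, 0.0])
# Apparent object pose: identity orientation at position (1, 2, 0).
obj = make_pose([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
                [1.0, 2.0, 0.0])

corrected = matmul4(invert_pose(ref), obj)   # left-hand side of equation (3)
```

The upper-left 3x3 block of `corrected` plays the role of R′obj and the last column plays the role of T′obj. Multiplying `ref` by `corrected` recovers `obj`, which confirms that the matrix built by `invert_pose` is the inverse of the reference transform.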


In a yet further example, an apparent orientation and position of the object at time t=t0: affine transform

$$\begin{bmatrix} R_{obj} & T_{obj} \\ 0 & 1 \end{bmatrix},$$

can be converted to a real orientation and position of the object at time t=t1:

$$\begin{bmatrix} R'_{obj} & T'_{obj} \\ 0 & 1 \end{bmatrix},$$

using an affine transform

$$\begin{bmatrix} R_{ref} & T_{ref} \\ 0 & 1 \end{bmatrix}.$$

Furthermore, the position and orientation of the initial reference frame with respect to a (typically) fixed reference point in space can be determined using an affine transform

$$\begin{bmatrix} R_{init} & T_{init} \\ 0 & 1 \end{bmatrix}.$$

The correct orientation and position of the tracked object with respect to device reference frame at time t=t0 (120a) is given by an inverse affine transformation, e.g.,

$$\begin{bmatrix} R_{init}^{T} & (-R_{init}^{T}) * T_{init} \\ 0 & 1 \end{bmatrix}$$

as provided for in equation (4):

$$\begin{bmatrix} R_{init}^{T} & (-R_{init}^{T}) * T_{init} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} R_{ref}^{T} & (-R_{ref}^{T}) * T_{ref} \\ 0 & 1 \end{bmatrix} * \begin{bmatrix} R_{obj} & T_{obj} \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R'_{obj} & T'_{obj} \\ 0 & 1 \end{bmatrix} \qquad (4)$$

Where:

    • $R_{init}^{T}$: Represents a rotation matrix part of an affine transform describing the rotation transformation at t0 from the world reference frame 119 to the device reference frame 120a.
    • $R_{ref}^{T}$: Represents the rotation matrix part of an affine transform describing the rotation transformation from the device reference frame 120a to the device reference frame 120b.
    • $R_{obj}$: Represents a matrix describing the rotation of the object at t0 with respect to the device reference frame 120b.
    • $R'_{obj}$: Represents a matrix describing the rotation of the object at t1 with respect to the device reference frame 120a.
    • $T_{init}$: Represents a vector translation at t0 of the world reference frame 119 to the device reference frame 120a.
    • $T_{ref}$: Represents a vector translation at t1 of the device reference frame 120a to the device reference frame 120b.
    • $T_{obj}$: Represents a vector describing the position at t0 of the object with respect to the device reference frame 120b.
    • $T'_{obj}$: Represents a vector describing the position at t1 of the object with respect to the device reference frame 120a.
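
As a worked check on equation (4), offered as an illustration rather than as part of the original derivation, the two inverse transforms can be multiplied out. Assuming that the initial and reference rotation matrices are orthonormal, the product of the two inverses is:

$$\begin{bmatrix} R_{init}^{T} & (-R_{init}^{T}) * T_{init} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} R_{ref}^{T} & (-R_{ref}^{T}) * T_{ref} \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R_{init}^{T} R_{ref}^{T} & -R_{init}^{T} \left( R_{ref}^{T} T_{ref} + T_{init} \right) \\ 0 & 1 \end{bmatrix}$$

Multiplying this result by the apparent object pose gives the two blocks of the right-hand side separately:

$$R'_{obj} = R_{init}^{T} R_{ref}^{T} R_{obj}, \qquad T'_{obj} = R_{init}^{T} R_{ref}^{T} T_{obj} - R_{init}^{T} \left( R_{ref}^{T} T_{ref} + T_{init} \right)$$

Setting the initial rotation to the identity and the initial translation to zero reduces both expressions to those of equation (3), as expected.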


In some implementations, the technology disclosed can build a world model with an absolute or world frame of reference. The world model can include representations of object portions (e.g. objects, edges of objects, prominent vertices) and potentially depth information when available from a depth sensor, depth camera or the like, within the viewpoint of the virtual or augmented reality head mounted sensor. The system can build the world model from image information captured by the cameras of the sensor. Points in 3D space determined from the stereo-image information are analyzed to obtain object portions. These points are not limited to a hand or other control object in a foreground; the points in 3D space can include stationary background points, especially edges. The model is populated with the object portions.


When the sensor moves (e.g., the wearer of a wearable headset turns her head), successive stereo-image information is analyzed for points in 3D space. Correspondences are made between two sets of points in 3D space chosen from the current view of the scene and the points in the world model to determine a relative motion of the object portions. The relative motion of the object portions reflects actual motion of the sensor. Differences in points are used to determine an inverse transformation, the matrix

$$\begin{bmatrix} R^{T} & -R^{T} * T \\ 0 & 1 \end{bmatrix}$$

between model position and new position of object portions. In this affine transform, $R^{T}$ describes the rotational portions of motions between camera and object coordinate systems, and T describes the translational portions thereof.


The system then applies an inverse transformation of the object corresponding to the actual transformation of the device (since the sensor, not the background object moves) to determine the translation and rotation of the camera. Of course, this method is most effective when background objects are not moving relative to the world frame (i.e., in free space).


The model can be updated whenever we detect new points not previously seen in the model. The new points are added to the model so that it continually grows.
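
One simple way to grow the model as described is to append only those observed points that are not already represented. The following sketch assumes that observed points have already been transformed into the world frame and that a distance tolerance decides whether a point is new. The function name, the tolerance value, and the sample coordinates are hypothetical.

```python
import math

def update_world_model(model, observed, tolerance=0.01):
    """Append observed 3D points that lie farther than `tolerance`
    from every point already stored in the model; return the count added."""
    added = 0
    for p in observed:
        if all(math.dist(p, q) > tolerance for q in model):
            model.append(p)
            added += 1
    return added

# Made-up world-frame points: the first observation duplicates a stored point.
model = [(0.0, 0.0, 1.0), (0.5, 0.0, 1.2)]
observed = [(0.0, 0.0, 1.001), (1.0, 0.3, 2.0)]
added = update_world_model(model, observed)
```

The linear scan over the model is adequate for a sketch; a spatial index would be a natural replacement once the model holds many points.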


Of course, embodiments can be created in which (1) device cameras are considered stationary and the world model is considered to move; or (2) the device cameras are considered to be moving and the world model is considered stationary.


The use of a world model described above does not require any gyroscopic, accelerometer or magnetometer sensors, since the cameras in a single unit (even the same cameras) can sense both the background objects and the control object. In any view where the system can recognize elements of the model, it can re-localize its position and orientation relative to the model without the drift associated with motion sensor data. In some embodiments, motion sensors can be used to seed the frame to frame transformation and therefore bring correspondences between the rendered virtual or augmented reality scenery closer to the sensed control object, making the result less ambiguous (i.e., the system would have an easier time determining what motion of the head had occurred to result in the change in view from that of the model). In a yet further embodiment, sensor data could be used to filter the solution above so that the motions appear to be smoother from frame to frame, while still remaining impervious to drift caused by relying upon motion sensors alone.


Virtual/Augmented Reality


Sensory processing system 106 includes a number of components for generating an immersive purely virtual and/or augmented environment. The first component is a camera such as cameras 102 or 104 or other video input to generate a digitized video image of the real world or user-interaction region. The camera can be any digital device that is dimensioned and configured to capture still or motion pictures of the real world and to convert those images to a digital stream of information that can be manipulated by a computer. For example, cameras 102 or 104 can be digital still cameras, digital video cameras, web cams, head-mounted displays, phone cameras, tablet personal computers, ultra-mobile personal computers, and the like.


The second component is a transparent, partially transparent, or semi-transparent user interface such as display 120 (embedded in a user computing device like a wearable goggle or a smartphone) that combines rendered 3D virtual imagery with a view of the real world, so that both are visible at the same time to a user. In some implementations, the rendered 3D virtual imagery can be projected using holographic, laser, stereoscopic, autostereoscopic, or volumetric 3D displays.


In one implementation, a virtual reality and/or augmented reality (AR) environment can be created by instantiation of a free-floating virtual modality in a real world physical space. In one implementation, computer-generated imagery, presented as free-floating virtual modality, can be rendered in front of a user as reflections using real-time rendering techniques such as orthographic or perspective projection, clipping, screen mapping, rasterizing and transformed into the field of view or current view space of a live camera embedded in a video projector, holographic projection system, smartphone, wearable goggle or other head mounted device (HMD), or heads up display (HUD). In some other implementations, transforming models into the current view space can be accomplished using sensor output from onboard sensors. For example, gyroscopes, magnetometers and other motion sensors can provide angular displacements, angular rates and magnetic readings with respect to a reference coordinate frame, and that data can be used by a real-time onboard rendering engine to generate 3D imagery. If the user physically moves a user computing device, resulting in a change of view of the embedded camera, the virtual modality and computer-generated imagery can be updated accordingly using the sensor data.


In some implementations, a virtual modality can include a variety of information from a variety of local or network information sources. Some examples of information include specifications, directions, recipes, data sheets, images, video clips, audio files, schemas, user interface elements, thumbnails, text, references or links, telephone numbers, blog or journal entries, notes, part numbers, dictionary definitions, catalog data, serial numbers, order forms, marketing or advertising and any other information that may be useful to a user. Some examples of information resources include local databases or cache memory, network databases, Websites, online technical libraries, other devices, or any other information resource that can be accessed by user computing devices either locally or remotely through a communication link.


Virtual items in a presentation output, rendered across an interface of a wearable sensor system, can include text, images, or references to other information (e.g., links). In one implementation, interactive virtual items can be displayed proximate to their corresponding real-world objects. In another implementation, interactive virtual items can describe or otherwise provide useful information about the objects to a user.


Projected AR allows users to simultaneously view the real world physical space and the interactive virtual items superimposed in the space. In one implementation, these interactive virtual items can be projected onto the real world physical space using micro-projectors embedded in a wearable goggle or other head mounted display (HMD) that cast a perspective view of stereoscopic 3D imagery onto the real world space. In such an implementation, a camera in between the micro-projectors can scan for infrared identification markers placed in the real world space. The camera can use these markers to precisely track the user's head position and orientation in the real world physical space, according to another implementation. Yet another implementation includes using retroreflectors in the real world physical space to prevent scattering of light emitted by the micro-projectors and to provision multi-user participation by maintaining distinct and private user views. In such an implementation, multiple users can simultaneously interact with the same virtual modality, such that they both view the same virtual objects and manipulations to virtual objects by one user are seen by the other user.


In other implementations, projected AR obviates the need for wearable hardware such as goggles and other hardware like displays to create an AR experience. In such implementations, a video projector, volumetric display device, holographic projector, and/or heads-up display can be used to create a “glasses-free” AR environment. In one implementation, such projectors can be electronically coupled to user computing devices such as smartphones or laptops and configured to produce and magnify virtual items that are perceived as being overlaid on the real world physical space.


The third component is the sensory processing system 106, which captures a series of sequentially temporal images of a region of interest. It further identifies any gestures performed in the region of interest and controls responsiveness of the rendered 3D virtual imagery to the performed gestures by updating the 3D virtual imagery based on the corresponding gestures.


Feature Matching


Motion information of a wearable sensor system or a user or body portion of the user can be determined with respect to a feature of a real world space that includes the wearable sensory system and/or the user. Some implementations include the features of a real world space being different real world products or objects in the real world space such as furniture (chairs, couches, tables, etc.), kitchen appliances (stoves, refrigerators, dishwashers, etc.), office appliances (copy machines, fax machines, computers), consumer and business electronic devices (telephones, scanners, etc.), furnishings (pictures, wall hangings, sculpture, knick knacks, plants), fixtures (chandeliers and the like), cabinetry, shelving, floor coverings (tile, wood, carpets, rugs), wall coverings, paint colors, surface textures, countertops (laminate, granite, synthetic countertops), electrical and telecommunication jacks, audio-visual equipment, speakers, hardware (hinges, locks, door pulls, door knobs, etc.), exterior siding, decking, windows, shutters, shingles, banisters, newels, hand rails, stair steps, landscaping plants (trees, shrubs, etc.), and the like, and qualities of all of these (e.g. color, texture, finish, etc.).


As discussed above, a combination of RGB and IR pixels can be used to respectively capture the gross and fine features of the real world space. Once captured, changes in feature values are detected by comparing pairs of frames of the captured video stream. In one implementation, subpixel refinement of the matches is used to determine the position of the wearable sensory system with respect to the analyzed feature. In another implementation, a feature in one image is matched to every feature within a fixed distance from it in the successive image, such that all features that are within a certain disparity limit from each other are treated as candidate matches. In other implementations, normalized correlation over a specified window can be used to evaluate the potential matches.


Some other implementations include copying each identified feature from a frame and storing the feature as a vector. Further, a scalar product of the identified feature vectors is calculated and a mutual consistency check is applied such that a feature with highest normalized correlation is considered to be determinative and changes in the feature values (position, orientation) of the feature are used to calculate motion information of the wearable sensory system. In other implementations, sum of absolute differences (SAD) can be used to identify the determinative feature in a real world space.
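
The matching steps described above can be sketched as follows, using only the Python standard library. The sketch assumes each feature is stored as a pixel position together with a flattened intensity patch; that representation, the function names, the disparity limit, and the sample values are all hypothetical. Normalized correlation selects the best candidate within the disparity limit, a mutual consistency check keeps only pairs that agree in both directions, and a sum of absolute differences (SAD) function is included as the alternative score mentioned in the text.

```python
import math

def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-length feature vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da, db = [x - ma for x in a], [y - mb for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def sad(a, b):
    """Sum of absolute differences; lower means more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))

def best_match(i, src, dst, limit):
    """Index of the dst feature most correlated with src[i] within `limit` pixels."""
    (x, y), vec = src[i]
    candidates = [j for j, ((u, v), _) in enumerate(dst)
                  if math.hypot(u - x, v - y) <= limit]
    if not candidates:
        return None
    return max(candidates, key=lambda j: normalized_correlation(vec, dst[j][1]))

def mutual_matches(prev, curr, limit=5.0):
    """Keep pairs (i, j) only when i -> j and j -> i agree."""
    pairs = []
    for i in range(len(prev)):
        j = best_match(i, prev, curr, limit)
        if j is not None and best_match(j, curr, prev, limit) == i:
            pairs.append((i, j))
    return pairs

# Each feature: ((x, y) pixel position, flattened intensity patch). Values are made up.
prev = [((10, 10), [1, 2, 3, 4]), ((40, 12), [9, 1, 9, 1])]
curr = [((12, 11), [2, 3, 4, 5]), ((41, 13), [8, 2, 8, 2])]
pairs = mutual_matches(prev, curr)
```

The change in position of each matched pair between the two frames is the kind of feature-value change from which motion of the wearable sensory system would then be estimated.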


Flowcharts



FIG. 6 is a flowchart showing a method 600 of tracking motion of a wearable sensor system. Flowchart 600 can be implemented at least partially with a computer or other data processing system, e.g., by one or more processors configured to receive or retrieve information, process the information, store results, and transmit the results. Other implementations may perform the actions in different orders and/or with different, fewer or additional actions than those illustrated in FIG. 6. Multiple actions can be combined in some implementations. For convenience, this flowchart is described with reference to the system that carries out a method. The system is not necessarily part of the method.


At action 610, a video stream of a scene of a real world space is captured using at least one camera electronically coupled to a wearable sensor system.


At action 620, one or more feature values of the scene are detected from a plurality of images of the video stream captured at times t0 and t1 using a set of RGB pixels and a set of IR pixels of the camera. In one implementation, the wearable sensor system has moved between t0 and t1.


At action 630, motion information of the wearable sensor system is determined with respect to at least one feature of the scene based on comparison between feature values detected at times t0 and t1.


At action 640, a presentation output is generated for display across an interface of the wearable sensor display based on information from the sets of RGB and IR pixels.


At action 650, responsiveness of the presentation output is automatically calibrated based on the determined motion information of the wearable sensor system with respect to the at least one feature of the scene. In one implementation, perceived field of view of the presentation output is proportionally adjusted responsive to the determined motion information of the wearable sensor system with respect to the at least one feature of the scene.


In yet another implementation, motion information of a body portion engaged with the wearable sensory system is determined based on the motion information of the wearable sensor system.


In some implementations, gross features of the real world space are extracted using RGB pixels that respectively capture red, green, and blue components of illumination in the scene.


In other implementations, fine features of the real world space are extracted using IR pixels that capture infrared components of illumination in the scene. In one implementation, fine features of the real world space include surface texture of the real world space. In another implementation, fine features of the real world space include edges of the real world space. In some other implementations, fine features of the real world space include curvatures of the real world space. In yet another implementation, fine features of the real world space include surface texture of objects in the real world space. In a further implementation, fine features of the real world space include edges of objects in the real world space.


In some implementations, fine features of the real world space include curvatures of objects in the real world space. In another implementation, a feature of the scene is an object in the real world space. In some other implementations, a feature value of the scene is orientation of the object. In yet another implementation, a feature value of the scene is position of the object. In a further implementation, a feature of the scene is an arrangement of a plurality of objects in the real world space. In other implementations, a feature value of the scene is position of the objects with respect to each other in the arrangement.


According to some implementations, comparison between feature values includes detecting a change in rotation between the images captured at times t0 and t1. According to other implementations, comparison between feature values includes detecting a change in translation between the images captured at times t0 and t1.


In yet other implementations, motion information of the wearable sensor system is determined with respect to at least one feature of the scene by matching features in images captured at time t0 with corresponding features in images captured at time t1. In one implementation, the matched features are within a threshold distance.


In another implementation, motion information of the wearable sensor system is determined with respect to at least one feature of the scene by calculating displacement between the images captured at times t0 and t1 based on at least one of RGB and IR pixel values.


In one implementation, the motion information includes position of the wearable sensor system. In another implementation, the motion information includes orientation of the wearable sensor system. In yet another implementation, the motion information includes velocity of the wearable sensor system. In a further implementation, the motion information includes acceleration of the wearable sensor system.
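
Velocity and acceleration can be derived from successive position estimates by finite differences. The sketch below is an assumption-laden illustration: the text names only times t0 and t1, whereas an acceleration estimate needs at least three samples, so a third equally spaced sample is assumed here. The function name and the sample values are hypothetical.

```python
def finite_difference_motion(p0, p1, p2, dt):
    """Estimate velocity and acceleration from three equally spaced positions."""
    # Backward difference at the most recent sample.
    velocity = [(b - a) / dt for a, b in zip(p1, p2)]
    # Second-order central difference across all three samples.
    acceleration = [(c - 2 * b + a) / (dt * dt) for a, b, c in zip(p0, p1, p2)]
    return velocity, acceleration

# Made-up headset positions (in meters) sampled 0.1 s apart.
velocity, acceleration = finite_difference_motion(
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.3, 0.0, 0.0), dt=0.1)
```

In practice such differences amplify measurement noise, so some smoothing of the position estimates before differencing would likely be needed.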


Some implementations include using monocular vision to capture the video stream. Other implementations include using stereoscopic vision to capture the video stream. Yet other implementations include using more than two cameras to capture the video stream.


In one implementation, the images captured at times t0 and t1 are successive image pairs. In another implementation, the images captured at times t0 and t1 are alternative image pairs. In yet another implementation, the images captured are right and left stereo images captured simultaneously.


This method and other implementations of the technology disclosed can include one or more of the following features and/or features described in connection with additional methods disclosed. Other implementations can include a non-transitory computer readable storage medium storing instructions executable by a processor to perform any of the methods described above. Yet another implementation can include a system including memory and one or more processors operable to execute instructions, stored in the memory, to perform any of the methods described above.



FIG. 7 shows a flowchart 700 of one implementation of creating a multi-user interactive virtual environment using wearable sensor systems. Flowchart 700 can be implemented at least partially with a computer or other data processing system, e.g., by one or more processors configured to receive or retrieve information, process the information, store results, and transmit the results. Other implementations may perform the actions in different orders and/or with different, fewer or additional actions than those illustrated in FIG. 7. Multiple actions can be combined in some implementations. For convenience, this flowchart is described with reference to the system that carries out a method. The system is not necessarily part of the method.


At action 710, a first video stream of a real world space is captured using at least one camera electronically coupled to a first wearable sensor system engaged by a first user.


At action 720, a second video stream of a real world space is captured using at least one camera electronically coupled to a second wearable sensor system engaged by a second user.


At action 730, respective three-dimensional maps of the real world space are generated using sets of RGB and IR pixels of the first and second cameras by extracting one or more feature values of the real world space from the first and second video streams. In one implementation, generating respective three-dimensional maps further includes determining a graph of features of the real world space based on the extracted feature values.


At action 740, motion information of the first and second wearable sensor systems is determined with respect to each other based on comparison between the respective three-dimensional maps of the real world space.


At action 750, responsiveness of the presentation outputs is automatically calibrated based on the determined motion information of the first and second wearable sensor systems with respect to each other. In some implementations, presentation outputs are generated for display across respective interfaces of the first and second wearable sensor systems based on information from the sets of RGB and IR pixels of the first and second cameras. In other implementations, respective perceived fields of view of the presentation outputs are proportionally adjusted responsive to the determined motion information of the first and second wearable sensor systems with respect to each other.
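
For action 740, one way to express the motion of the two wearable sensor systems with respect to each other is as a relative pose. The sketch below assumes that each device has already estimated its own rotation and translation in a common map frame by comparing the three-dimensional maps; that assumption, the function name, and the sample poses are hypothetical.

```python
def relative_pose(r1, t1, r2, t2):
    """Pose of device 2 expressed in device 1's frame: (R1^T R2, R1^T (T2 - T1))."""
    r1_t = [[r1[j][i] for j in range(3)] for i in range(3)]
    r_rel = [[sum(r1_t[i][k] * r2[k][j] for k in range(3)) for j in range(3)]
             for i in range(3)]
    d = [t2[i] - t1[i] for i in range(3)]
    t_rel = [sum(r1_t[i][k] * d[k] for k in range(3)) for i in range(3)]
    return r_rel, t_rel

# Made-up poses: device 1 is yawed 90 degrees at (1, 0, 0); device 2 is unrotated at (1, 2, 0).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
yaw90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
r_rel, t_rel = relative_pose(yaw90, [1.0, 0.0, 0.0], identity, [1.0, 2.0, 0.0])
```

Tracking how this relative pose changes over time gives the relative motion that the calibration step at action 750 would consume.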


Some other implementations include determining motion information of respective body portions of the first and second users based on the motion information of the first and second wearable sensor systems with respect to each other.


In some implementations, gross features of the real world space are extracted using RGB pixels that respectively capture red, green, and blue components of illumination in the scene.


In other implementations, fine features of the real world space are extracted using IR pixels that capture infrared components of illumination in the scene. In one implementation, fine features of the real world space include surface texture of the real world space. In another implementation, fine features of the real world space include edges of the real world space. In some other implementations, fine features of the real world space include curvatures of the real world space. In yet another implementation, fine features of the real world space include surface texture of objects in the real world space. In a further implementation, fine features of the real world space include edges of objects in the real world space.


In some implementations, fine features of the real world space include curvatures of objects in the real world space. In another implementation, a feature of the scene is an object in the real world space. In some other implementations, a feature value of the scene is orientation of the object. In yet another implementation, a feature value of the scene is position of the object. In a further implementation, a feature of the scene is an arrangement of a plurality of objects in the real world space. In other implementations, a feature value of the scene is position of the objects with respect to each other in the arrangement.


According to some implementations, comparison between feature values includes detecting a change in rotation between the images captured at times t0 and t1. According to other implementations, comparison between feature values includes detecting a change in translation between the images captured at times t0 and t1.


In yet other implementations, motion information of the wearable sensor system is determined with respect to at least one feature of the scene by matching features in images captured at time t0 with corresponding features in images captured at time t1. In one implementation, the matched features are within a threshold distance.


In another implementation, motion information of the wearable sensor system is determined with respect to at least one feature of the scene by calculating displacement between the images captured at times t0 and t1 based on at least one of RGB and IR pixel values.


In one implementation, the motion information includes position of the wearable sensor system. In another implementation, the motion information includes orientation of the wearable sensor system. In yet another implementation, the motion information includes velocity of the wearable sensor system. In a further implementation, the motion information includes acceleration of the wearable sensor system.


Some implementations include using monocular vision to capture the video stream. Other implementations include using stereoscopic vision to capture the video stream. Yet other implementations include using more than two cameras to capture the video stream.


In one implementation, the images captured at times t0 and t1 are successive image pairs. In another implementation, the images captured at times t0 and t1 are alternative image pairs. In yet another implementation, the images captured are right and left stereo images captured simultaneously.


This method and other implementations of the technology disclosed can include one or more of the following features and/or features described in connection with additional methods disclosed. Other implementations can include a non-transitory computer readable storage medium storing instructions executable by a processor to perform any of the methods described above. Yet another implementation can include a system including memory and one or more processors operable to execute instructions, stored in the memory, to perform any of the methods described above.


Some implementations can be applied to virtual reality or augmented reality applications. For example, FIG. 8A illustrates a system 800 for providing an augmented reality experience including augmented environment 113 according to one implementation of the technology disclosed. Augmented environment 113 can include one or more virtual objects 811, 813, one or more real objects 114, one or more virtualized renderings of real objects 803 reflected in augmented environment 113, or any combinations thereof as provided in accordance with application 821. In the example illustrated by FIG. 8A, application 821 is a “wearable planetarium” application including a variety of virtual constellations 811 and a variety of other virtual heavenly objects, such as virtual quasar 813. Of course, the type, quantity and nature of the virtual objects will vary depending upon the application under use.


System 800 includes a processing system 106 controlling a variety of sensors and projectors, such as for example one or more cameras 102, 104 (or other image sensors) and optionally some illumination sources 115, 117 comprising an imaging system. In other implementations, system 800 can include one or more sensors 808, 810 to, e.g., detect acceleration, temperature, humidity, water, supplied power, proximity, external motion, device motion, sound signals, ultrasound signals, light signals, fire, smoke, carbon monoxide, global-positioning-satellite (GPS) signals, or radio-frequency (RF), WiFi, or other electromagnetic signals or fields. Thus, for example, system 800 can include temperature sensor(s), humidity sensor(s), hazard-related sensor(s) or other environmental sensor(s), accelerometer(s), microphone(s), optical sensors up to and including camera(s) (e.g., charge-coupled-device or video cameras), active or passive radiation sensors, GPS receiver(s) or radio-frequency identification detector(s). While FIG. 8A illustrates an implementation with a motion sensory control device, many implementations can include multiple sensors. In some instances, system 800 includes one or more primary sensors and one or more secondary sensors. The primary sensor(s) can sense data central to the core operation of the device (e.g., interpreting gestures performed in the augmented environment 113). The secondary sensor(s) can sense other types of data (e.g., light, acceleration, or sound).


Motions through 3D space made by the one or more objects of interest, such as user hand 114, are captured by the cameras and any other sensors and tracked by the system 106. The sensor system 106 is capable of recognizing the user's hand(s) 114 and/or arm(s) (or other body parts); integrating computer generated menus 801, controls or other objects thereon; and displaying the result 803 integrated with the computer generated items 811, 813 in the augmented environment 113. Accordingly, menus 801 and other controls can be displayed arranged over the user's arm and/or hand 803. Thus, the user (e.g., owner of hand 114) is able to interact with virtual devices, e.g., menu 801, in the same environment as virtual objects 811. Such displays enable the user to enjoy convenient access to controls for applications used in conjunction with the AR system presented in a “believable” manner (e.g., on the hand or arm). The approach is compatible with a wide variety of AR/VR technologies, including systems that capture image data along different portions of the electromagnetic spectrum, including the visible, near-IR, and IR portions, thereby compensating for deficiencies in lighting, contrast, and resolution in different environmental conditions.


In operation, sensors are oriented toward a region of interest 112 that can include free space into which an augmented environment 113 is provided. Region 112 can include one or more objects of interest 114 (in this example, a pair of hands) which can be sensed, tracked and reflected into the augmented environment 113 by operation of cameras 102, 104, sensors 808, 810 and processing system 106. For example, in FIG. 8A and detail FIG. 8A-1, one hand 114 is moving along a path in relation to the other hand, which is represented by a virtualized rendering 803 of the user's hand and arm to which a virtual device, menu 801, is attached. As shown by FIG. 8A-1, menu 801 includes one or more status indicators and controls for controlling the augmented environment 113 and application 821, including a slider bar control 805. User hand 114 is moving slider bar control 805 to control the opacity of the constellations in the wearable planetarium application example. Now with reference to FIG. 8A-2, virtualized rendering 803 is again depicted; this time, however, the user has chosen to switch formats for menu 801. Selection of format of virtual objects affixed to rendered user body portions, such as menu 801, can be achieved under user control by any of gesture, motion, voice command, or any combination thereof. In the implementation illustrated, menu 801 format changes between formats illustrated by FIG. 8A-1 and FIG. 8A-2 when the user rotates their left arm. As shown by FIG. 8A-2, menu 801 includes one or more status indicators and controls for controlling the augmented environment 113 and application 821.


In some implementations, projection techniques can enhance the augmented reality experience provided to a user. Projection can include an image or other visual representation of an object. For example, optional projectors under control of system 106 can augment the virtual device experience 113, such as an optional audio projector 802 to provide for example audio feedback and/or an optional haptic projector 806 to provide for example haptic feedback to a user of virtual device experience 801. For example, optional haptic projector 806 can project the feeling of the click of the “virtual button” of the menu 801. In another example, optional audio projector 802 can project the “click” sound of the “virtual button” of the menu 801. Further information on projectors can be had with reference to “Visio-Tactile Projector” YouTube (https://www.youtube.com/watch?v=Bb0hNMxxewg) (accessed Jan. 15, 2014).


When embodied in a wearable device, system 800 can beneficially include mode switching gesture detection methods and systems as described herein and with reference to FIG. 8B, which illustrates an example of detecting a mode switching gesture 118 made by the VR/AR user and detected by system 106. As illustrated by FIG. 8B, a user presently in a first mode, the augmented reality 113, has swiped their left arm, represented as arm 803 in the augmented environment 113, from a first position A to a second position B in swiping motion 118 to indicate a desire to change to a different mode, such as the “pass-through” (e.g., “live” video) mode 119 of FIG. 8C. Accordingly, arm 803 becomes a de facto mode control switch. In one implementation, pure gestures, gestures in combination with voice recognition, and/or a virtual or real keyboard in combination with the gestures can be used to switch modes.
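By way of illustration only, the swipe from position A to position B can be reduced to a direction label once the tracked arm positions are available. The following is a minimal sketch under stated assumptions: the function name `classify_swipe`, the 2D position format, the `min_travel` threshold, and the convention that y increases upward are all hypothetical and are not taken from the disclosure.

```python
from typing import Optional, Sequence, Tuple

# Assumed format: (x, y) position of the tracked arm, y increasing upward.
Point = Tuple[float, float]


def classify_swipe(path: Sequence[Point], min_travel: float = 0.25) -> Optional[str]:
    """Label a tracked path as a 'left', 'right', 'up' or 'down' swipe.

    Compares the first tracked position (A) with the last (B). Returns None
    when the path is too short or the travel is below the hypothetical
    `min_travel` threshold, so small incidental motions are ignored.
    """
    if len(path) < 2:
        return None
    (x0, y0), (x1, y1) = path[0], path[-1]
    dx, dy = x1 - x0, y1 - y0
    if max(abs(dx), abs(dy)) < min_travel:
        return None
    # The dominant axis of travel decides the swipe direction.
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "up" if dy > 0 else "down"
```

A practical detector would likely also consider speed, hand pose, and the tracking confidence of each sample; this sketch considers only net displacement.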


In some implementations, a user can raise an arm, utter a verbal command, perform an optical command, or make different poses using hands and fingers (e.g., ‘one finger point’, ‘one finger click’, ‘two finger point’, ‘two finger click’, ‘prone one finger point’, ‘prone one finger click’, ‘prone two finger point’, ‘prone two finger click’, ‘medial one finger point’, ‘medial two finger point’) to indicate an intent to switch modes. In other implementations, a point and grasp gesture can be used, verbal commands can be used to select a mode, and/or eye movements can be used to indicate a selection.


In some other implementations, gestures or body movements can also be used to switch between modes. As illustrated by FIG. 8C, when the user is viewing the real scene presented as a “pass-through” presentation to the user's wearable headset, the VR/AR user performs a subsequent gesture 118, such as a downward or upward swipe of hand and/or fingers, to return the system 800 to a previous mode. In FIG. 8C, the image sensors 102, 104 provide live video of the real space scene in a pass-through mode. Accordingly, the user is able to see the environment surrounding herself, including her desk 816, and can grab a drink from her can of soda 817. When the user desires to return to the VR/AR presentation (e.g., planetarium in the example illustrated by FIGS. 8A-8B), the user can issue a second gesture command to switch the mode of the system 800 back to VR/AR mode. For instance, a swipe 118 of the user's arm 803 (or a finger flip up or down, spiral motion, etc.) can be used to return the VR/AR device back to virtual/augmented mode. Return gestures can be “reverse” gestures, e.g., a gesture in the opposite direction of the mode switching gesture (as illustrated by upward swipe 118 of FIG. 8C being in the opposite direction of the downward swipe 118 of FIG. 8B), or a second repeated instance of the mode switching gesture (e.g., a second downward swipe as illustrated in FIG. 8B would return the presentation of FIG. 8C back to the VR/AR mode of FIG. 8B). In another implementation, modes are “tied” to specific gestures, so that the user selects the mode specifically with the gesture, rather than simply toggling back and forth (e.g., a left to right swipe can put the system 800 into VR/AR mode irrespective of the present mode, while a right to left swipe can put the system 800 into pass-through mode irrespective of the present mode). This technique is useful in implementations having more than two modes, e.g., a pass-through live video mode, a completely virtual reality mode and an augmented reality mode.
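The difference between toggling and “tied” gestures can be sketched as a small state holder. This is an illustrative sketch only: the names `Mode`, `ModeSwitcher`, and `TIED_GESTURES`, and the gesture labels used as dictionary keys, are hypothetical and are not defined by the disclosure.

```python
from enum import Enum


class Mode(Enum):
    VIRTUAL = "virtual"
    AUGMENTED = "augmented"
    PASS_THROUGH = "pass_through"


# Hypothetical mapping for "tied" gestures: each gesture names its target mode.
TIED_GESTURES = {
    "swipe_left_to_right": Mode.VIRTUAL,
    "swipe_right_to_left": Mode.PASS_THROUGH,
}


class ModeSwitcher:
    """Tracks the current display mode and the mode to return to."""

    def __init__(self, initial: Mode = Mode.VIRTUAL) -> None:
        self.mode = initial
        self.previous = Mode.PASS_THROUGH

    def toggle(self) -> Mode:
        """Toggle-style switching: swap the current and previous modes."""
        self.mode, self.previous = self.previous, self.mode
        return self.mode

    def select(self, gesture: str) -> Mode:
        """Tied-style switching: the gesture picks the mode directly,
        irrespective of the present mode. Unknown gestures are ignored."""
        target = TIED_GESTURES.get(gesture)
        if target is not None and target is not self.mode:
            self.previous, self.mode = self.mode, target
        return self.mode
```

With only two modes, `toggle` suffices for both the repeated and the reverse return gestures; with three or more modes, the mapping in `select` avoids cycling through intermediate modes.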


In yet other implementations, the gestures can control changing to one or more of the different modes in environment 113 using a graphical display or other feedback device, a set of menu elements, selection elements, and pan and zoom capabilities. Navigation through the devices can be consistent from high-level selection of target device down to manipulation of individual selection elements.


In other implementations, one or more user-interface components, including or in addition to menu 801 in system 800 can be used to present information to a user via an audio speaker in addition to or instead of visual display. In one implementation, user-interface components such as menu 801 can receive information from the user through a touchscreen, buttons, scroll component (e.g., a movable or virtual ring component), microphone, and/or camera (e.g., to detect gestures).



FIG. 9 shows a flowchart 900 of switching display modes in a virtual reality device including motion tracking by a wearable sensor system. Flowchart 900 can be implemented at least partially with a computer or other data processing system, e.g., by one or more processors configured to receive or retrieve information, process the information, store results, and transmit the results. Other implementations may perform the actions in different orders and/or with different, fewer or additional actions than those illustrated in FIG. 9. Multiple actions can be combined in some implementations. For convenience, this flowchart is described with reference to the system that carries out a method. The system is not necessarily part of the method.


At action 910, a gesture associated with switching modes is detected by capturing a video stream of a scene in a real world space proximate to a virtual reality device using at least one camera electronically coupled to a wearable sensor system.


At action 920, a display mode is selected. The display mode is chosen from a first video stream including live video of the scene in real world space and a second video stream including an output of a virtual reality application.


At action 930, display modes of the virtual reality device are switched by displaying the video stream selected.



FIG. 10 shows a flowchart 1000 of providing an augmented reality environment. Flowchart 1000 can be implemented at least partially with a computer or other data processing system, e.g., by one or more processors configured to receive or retrieve information, process the information, store results, and transmit the results. Other implementations may perform the actions in different orders and/or with different, fewer or additional actions than those illustrated in FIG. 10. Multiple actions can be combined in some implementations. For convenience, this flowchart is described with reference to the system that carries out a method. The system is not necessarily part of the method.


At action 1010, motion of a user body portion in the field of view of one or more cameras is captured.


At action 1020, a gesture is detected that indicates a command input to change a mode of operation from a virtual reality mode to a live video display mode or from a live video display mode to a virtual reality mode.


At action 1030, a mode of operation of a wearable display device is changed according to the command input.



FIG. 11 shows a flowchart 1100 of integrating a virtual object facilitating command input with a rendering of a user body portion into an augmented environment. Flowchart 1100 can be implemented at least partially with a computer or other data processing system, e.g., by one or more processors configured to receive or retrieve information, process the information, store results, and transmit the results. Other implementations may perform the actions in different orders and/or with different, fewer or additional actions than those illustrated in FIG. 11. Multiple actions can be combined in some implementations. For convenience, this flowchart is described with reference to the system that carries out a method. The system is not necessarily part of the method.


At action 1110, a video stream of a real world space including at least a portion of a user's body is captured using at least one camera electronically coupled to a first wearable sensor system engaged by a first user.


At action 1120, position of the user body portion is identified from the video stream.


At action 1130, a rendered representation of the user body portion is determined.


At action 1140, a virtual object that facilitates a command input is sized to the rendered representation. In an implementation, the virtual object (e.g., menu) is sized to be proportionate to the user body portion (e.g., hand).


At action 1150, the virtual object that facilitates a command input is overlaid onto the rendered representation and presented to the user as an integrated portion of a virtual reality application. In one implementation, the virtual object is affixed at a base point at the wrist of the representation of a hand. Other body portions will have different points which are determined for the body portion using a database query, lookup table search, or other techniques for determining a correspondence between object identity and point at which a virtual device can be affixed to the object of interest.
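Actions 1140 and 1150 can be sketched as follows. This is a minimal illustration under stated assumptions: the `RenderedPart` type, the `ANCHOR_LANDMARK` lookup table, the landmark names, and the `fill` fraction are all hypothetical; the disclosure states only that the object is sized proportionately and affixed at a base point found by a lookup or similar technique.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class RenderedPart:
    kind: str                   # e.g. "hand" or "forearm"
    length: float               # length of the rendered representation
    landmarks: Dict[str, Vec3]  # named points on the rendered representation


# Hypothetical lookup table: body portion identity -> landmark used as base point.
ANCHOR_LANDMARK = {"hand": "wrist", "forearm": "elbow"}


def place_menu(part: RenderedPart, menu_length: float, fill: float = 0.8) -> Tuple[float, Vec3]:
    """Return (scale, anchor) for a menu overlaid on a rendered body portion.

    The scale makes the menu span the fraction `fill` of the body portion's
    length (action 1140); the anchor is the base point looked up for that
    kind of body portion (action 1150).
    """
    scale = fill * part.length / menu_length
    anchor = part.landmarks[ANCHOR_LANDMARK[part.kind]]
    return scale, anchor
```

For example, a menu authored at length 0.4 placed on a hand of rendered length 0.2 with `fill=0.5` would be scaled by 0.25 and affixed at the wrist landmark.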



FIG. 12 shows a flowchart 1200 of sharing content between wearable sensor systems. Flowchart 1200 can be implemented at least partially with a computer or other data processing system, e.g., by one or more processors configured to receive or retrieve information, process the information, store results, and transmit the results. Other implementations may perform the actions in different orders and/or with different, fewer or additional actions than those illustrated in FIG. 12. Multiple actions can be combined in some implementations. For convenience, this flowchart is described with reference to the system that carries out a method. The system is not necessarily part of the method.


At action 1210, a first video stream of a real world space is captured at time t0 using at least one camera electronically coupled to a first wearable sensor system engaged by a first user. In one implementation, the first video stream is captured at a field of view of the first user.


At action 1220, a second video stream of the real world space is captured at the time t0 using at least one camera electronically coupled to the first wearable sensor system. In one implementation, the second video stream is captured at a field of view of the camera.


At action 1230, a communication channel is established between the first wearable sensor system and a second wearable sensor system and the second video stream is transmitted to the second wearable sensor system.


In some implementations, the second video stream is preprocessed to enhance resolution, and the preprocessed second video stream is sent via the communication channel to the second wearable sensor system.


In other implementations, the second video stream is preprocessed to reduce bandwidth, and the preprocessed second video stream is sent via the communication channel to the second wearable sensor system.
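The disclosure does not specify how bandwidth is reduced. One simple possibility, shown here purely as an assumed example, is spatial decimation of each frame before transmission; the function name `downsample` and the list-of-rows frame format are hypothetical.

```python
from typing import List, Sequence


def downsample(frame: Sequence[Sequence[int]], factor: int = 2) -> List[List[int]]:
    """Keep every `factor`-th pixel in each direction.

    The frame is assumed to be a list of rows of pixel values. A factor of 2
    cuts the pixel count, and hence the raw data rate, to roughly a quarter.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [list(row[::factor]) for row in frame[::factor]]
```

Dropping frames or applying a standard video codec are other plausible choices; decimation is used here only because it is easy to show in a few lines.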


In one implementation, the field of view of the at least one camera substantially overlaps with the field of view of the user. In another implementation, the field of view of the at least one camera encompasses and exceeds the field of view of the user. In yet another implementation, the field of view of the at least one camera is narrower than, and falls within, the field of view of the user. In some other implementations, the field of view of the at least one camera is separate from and additional to the field of view of the user.


In one implementation, short-beam illumination elements are used to capture a narrow field of view. In some implementations, the short-beam illumination elements have a beam angle of approximately 60°. In another implementation, wide-beam illumination elements are used to capture a broad field of view. In some implementations, the wide-beam illumination elements have a beam angle of approximately 120°.


In some implementations, the second video stream is transmitted to the second sensor system in response to user selection.


Typically, a “wide beam” is about 120° wide and a narrow beam is approximately 60° wide, although these are representative figures only and can vary with the application; more generally, a wide beam can have a beam angle anywhere from >90° to 180°, and a narrow beam can have a beam angle anywhere from >0° to 90°. For example, the detection space can initially be lit with one or more wide-beam lighting elements with a collective field of view similar to that of the tracking device, e.g., a camera. Once the object's position is obtained, the wide-beam lighting element(s) can be turned off and one or more narrow-beam lighting elements, pointing in the direction of the object, activated. As the object moves, different ones of the narrow-beam lighting elements are activated. In many implementations, these directional lighting elements only need to be located in the center of the field of view of the camera; for example, in the case of hand tracking, people will not often try to interact with the camera from a wide angle and a large distance simultaneously.


If the tracked object is at a large angle to the camera (i.e., far to the side of the motion-tracking device), it is likely relatively close to the device. Accordingly, a low-power, wide-beam lighting element can be suitable in some implementations. As a result, the lighting array can include only one or a small number of wide-beam lighting elements close to the camera along with an equal or larger number of narrow-beam devices (e.g., collectively covering the center-field region of space in front of the camera—for example, within a 30° or 45° cone around the normal to the camera). Thus, it is possible to decrease or minimize the number of lighting elements required to illuminate a space in which motion is detected by using a small number of wide-beam elements and a larger (or equal) number of narrow-beam elements directed toward the center field.


It is also possible to cover a wide field of view with many narrow-beam LEDs pointing in different directions, according to other implementations. These can be operated so as to scan the monitored space in order to identify the elements actually spotlighting the object; only these are kept on and the others turned off. In some embodiments, the motion system computes a predicted trajectory of the tracked object, and this trajectory is used to anticipate which illumination elements should be activated as the object moves. The trajectory is revised, along with the illumination pattern, as new tracking information is obtained.
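The selection of narrow-beam elements, and the use of a predicted trajectory to anticipate which elements to activate, can be sketched in one angular dimension. This is a simplified illustration: the `NarrowBeam` type, the element names, the single-angle geometry, and the linear extrapolation are assumptions, not the prediction method of the disclosure.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class NarrowBeam:
    name: str
    direction_deg: float           # pointing direction, measured from the camera normal
    beam_angle_deg: float = 60.0   # full beam angle (representative narrow-beam figure)


def elements_covering(elements: Sequence[NarrowBeam], angle_deg: float) -> List[str]:
    """Names of the elements whose beam contains the given object angle."""
    return [
        e.name
        for e in elements
        if abs(angle_deg - e.direction_deg) <= e.beam_angle_deg / 2
    ]


def predict_angle(history: Sequence[float], steps_ahead: int = 1) -> float:
    """Linearly extrapolate the object's angle from its last two samples.

    `history` must contain at least one sample; with a single sample the
    object is assumed to stay where it is.
    """
    if len(history) < 2:
        return history[-1]
    return history[-1] + steps_ahead * (history[-1] - history[-2])
```

In use, the elements covering the current angle stay on, and those covering `predict_angle(history)` are switched on ahead of the object's arrival; the prediction is recomputed as each new tracking sample arrives.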


In some implementations, motion capture is achieved using an optical motion-capture system. In some implementations, object position tracking is supplemented by measuring a time difference of arrival (TDOA) of audio signals at the contact vibrational sensors and mapping surface locations that satisfy the TDOA, analyzing at least one image, captured by a camera of the optical motion-capture system, of the object in contact with the surface, and using the image analysis to select among the mapped TDOA surface locations as a surface location of the contact.
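The TDOA mapping and image-based selection described above can be sketched for two sensors on a planar surface. This is an illustrative sketch only: the function names, the candidate grid, the tolerance, and the assumption of a single uniform propagation speed in the surface are hypothetical simplifications.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float]  # location on the surface, in surface coordinates


def tdoa_candidates(
    grid: Iterable[Point],
    sensor_a: Point,
    sensor_b: Point,
    measured_tdoa: float,
    speed: float,
    tol: float,
) -> List[Point]:
    """Surface locations whose predicted TDOA matches the measurement.

    The predicted TDOA at a point is the difference of its distances to the
    two sensors divided by the (assumed uniform) propagation speed.
    """
    matches = []
    for p in grid:
        predicted = (math.dist(p, sensor_a) - math.dist(p, sensor_b)) / speed
        if abs(predicted - measured_tdoa) <= tol:
            matches.append(p)
    return matches


def select_contact(candidates: Sequence[Point], image_estimate: Point) -> Point:
    """Pick the TDOA-consistent location closest to the image-derived estimate."""
    return min(candidates, key=lambda p: math.dist(p, image_estimate))
```

With only two sensors, a single TDOA constrains the contact to a curve (one branch of a hyperbola) on the surface, which is why the image analysis is needed to select one location along it.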


Reference may be had to the following sources, incorporated herein by reference, for further information regarding computational techniques:

    • 1. Wikipedia, at http://en.wikipedia.org/wiki/Euclidean_group, on Nov. 4, 2013, 04:08 UTC;
    • 2. Wikipedia, at http://en.wikipedia.org/wiki/Affine_transformation, on Nov. 25, 2013, 11:01 UTC;
    • 3. Wikipedia, at http://en.wikipedia.org/wiki/Rotation_matrix, Rotation matrix from axis and angle, on Jan. 30, 2014, 20:12 UTC;
    • 4. Wikipedia, at http://en.wikipedia.org/wiki/Rotation_group_SO(3), Axis of rotation, on Jan. 21, 2014, 21:21 UTC;
    • 5. Wikipedia, at http://en.wikipedia.org/wiki/Transformation_matrix, Affine Transformations, on Jan. 28, 2014, 13:51 UTC;
    • 6. Wikipedia, at http://en.wikipedia.org/wiki/Axis%E2%80%93angle_representation, on Jan. 25, 2014, 03:26 UTC;
    • 7. Wikipedia, at http://en.wikipedia.org/wiki/Visual_odometry, on Jun. 26, 2014, 09:38 UTC; and
    • 8. Wikipedia, at http://en.wikipedia.org/wiki/Optical_flow, on Jun. 26, 2014, 09:38 UTC.


While the disclosed technology has been described with respect to specific implementations, one skilled in the art will recognize that numerous modifications are possible. The number, types and arrangement of cameras and sensors can be varied. The cameras' capabilities, including frame rate, spatial resolution, and intensity resolution, can also be varied as desired. The sensors' capabilities, including sensitivity levels and calibration, can also be varied as desired. Light sources are optional and can be operated in continuous or pulsed mode. The systems described herein provide images and audio signals to facilitate tracking movement of an object, and this information can be used for numerous purposes, of which position and/or motion detection is just one among many possibilities.


Threshold cutoffs and other specific criteria for distinguishing object from background can be adapted for particular hardware and particular environments. Frequency filters and other specific criteria for distinguishing visual or audio signals from background noise can be adapted for particular cameras or sensors and particular devices. In some implementations, the system can be automatically calibrated for a particular environment or application, e.g., by adjusting frequency filters, threshold criteria, and so on.


Any type of object can be the subject of motion capture using these techniques, and various aspects of the implementation can be optimized for a particular object. For example, the type and positions of cameras and/or other sensors can be selected based on the size of the object whose motion is to be captured, the space in which motion is to be captured, and/or the medium of the surface through which audio signals propagate. Analysis techniques in accordance with implementations of the technology disclosed can be implemented as algorithms in any suitable computer language and executed on programmable processors. Alternatively, some or all of the algorithms can be implemented in fixed-function logic circuits, and such circuits can be designed and fabricated using conventional or other tools.


Computer programs incorporating various features of the technology disclosed may be encoded on various computer readable storage media; suitable media include magnetic disk or tape, optical storage media such as compact disk (CD) or DVD (digital versatile disk), flash memory, and any other non-transitory medium capable of holding data in a computer-readable form. Computer-readable storage media encoded with the program code may be packaged with a compatible device or provided separately from other devices. In addition, program code may be encoded and transmitted via wired, optical, and/or wireless networks conforming to a variety of protocols, including the Internet, thereby allowing distribution, e.g., via Internet download.


The terms and expressions employed herein are used as terms and expressions of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding any equivalents of the features shown and described or portions thereof. In addition, having described certain implementations of the technology disclosed, it will be apparent to those of ordinary skill in the art that other implementations incorporating the concepts disclosed herein can be used without departing from the spirit and scope of the technology disclosed. Accordingly, the described implementations are to be considered in all respects as only illustrative and not restrictive.

Claims
  • 1. A method of switching display modes between a first mode in which a live video stream is displayed and a second mode in which a virtual reality video stream is displayed in a first virtual reality device including motion tracking by a first wearable sensor system, the method including: determining from a first video stream as captured by at least one camera from a scene in a real world space proximate to a first virtual reality device coupled to a first wearable sensor system that a motion of a user's hand captured in the first video stream indicates a user's hand gesture to the first wearable sensor system to switch display modes of the first virtual reality device; and responsive to the user's hand gesture detected in the first video stream as captured, switching the display modes of the first virtual reality device by displaying a video stream selected from a video stream including live video of the scene in real world space and a video stream including an output of a virtual reality application; whereby the display mode is switched between displaying the live video stream and displaying the video stream including an output of a virtual reality application by switching the display modes.
  • 2. The method of claim 1, wherein switching the display modes further includes toggling between the video stream including the live video of the scene in real world space and the video stream including the output of the virtual reality application.
  • 3. The method of claim 2, further including switching the display mode back when a second user's hand gesture associated with switching modes is detected.
  • 4. The method of claim 2, further including switching the display mode back when a reverse user's hand gesture in a direction opposite to a direction of the user's hand gesture associated with switching modes is detected.
  • 5. The method of claim 1, wherein switching the display modes further includes selecting the video stream including the live video of the scene in real world space when a first user's hand gesture is detected and selecting the video stream including an output of a virtual reality application when a second user's hand gesture is detected.
  • 6. The method of claim 1, wherein the user's hand gesture includes a swipe of the user's hand.
  • 7. The method of claim 1, further including identifying position of a user body portion from the video stream.
  • 8. The method of claim 7, further including determining a rendered representation of the user body portion.
  • 9. The method of claim 8, further including sizing a virtual object that facilitates a command input to the rendered representation.
  • 10. The method of claim 9, further including presenting the virtual object that facilitates the command input overlayed onto the rendered representation as an integrated portion of a virtual reality application.
  • 11. The method of claim 1, wherein the user's hand gesture includes a swiping motion made in a zone proximate to the first virtual reality device.
  • 12. The method of claim 1, wherein selecting a display mode further includes switching from a video stream including image information gathered using a set of IR pixels and a video stream including image information gathered using a set of RGB pixels.
  • 13. The method of claim 1, further including: capturing using at least one camera electronically coupled to a second wearable sensor system, a second video stream of the real world space at a time substantially contemporaneous to capture of the first video stream by the first wearable sensor system; and establishing a communication channel between the first wearable sensor system and the second wearable sensor system.
  • 14. The method of claim 1, wherein determining a motion indicates a user's hand gesture to the first wearable sensory system includes: sensing a swipe of a user's hand portion in close proximity to one or more cameras.
  • 15. The method of claim 14, wherein the switching display modes includes: toggling the first virtual reality device between a virtual reality mode and a live video display mode responsive to motion of the user's hand portion.
  • 16. The method of claim 1, further including: returning the first virtual reality device to a previous mode upon detecting a second gesture made by a user's hand portion.
  • 17. A non-transitory computer readable storage medium impressed with computer program instructions for switching display modes between a first mode in which a live video stream is displayed and a second mode in which a virtual reality video stream is displayed in a first virtual reality device including motion tracking by a first wearable sensor system, which instructions, when executed on a processor, implement a method comprising: determining from a first video stream as captured by at least one camera from a scene in a real world space proximate to a first virtual reality device coupled to a first wearable sensor system that a motion of a user's hand captured in the first video stream indicates a user's hand gesture to the first wearable sensor system to switch display modes of the first virtual reality device; and responsive to the user's hand gesture detected in the first video stream as captured, switching the display modes of the first virtual reality device by displaying a video stream selected from a video stream including live video of the scene in real world space and a video stream including an output of a virtual reality application; whereby the display mode is switched between displaying a live video stream and displaying a virtual video stream including an output of a virtual reality application by switching the display modes.
  • 18. The non-transitory computer readable storage medium of claim 17, further including instructions that when executed by one or more processors perform: capturing using at least one camera electronically coupled to a second wearable sensor system, a second video stream of the real world space at a time substantially contemporaneous to capture of the first video stream by the first wearable sensor system; and establishing a communication channel between the first wearable sensor system and the second wearable sensor system.
  • 19. A system including a first set of one or more processors coupled to memory storing computer instructions for switching display modes in a first virtual reality device including motion tracking by a first wearable sensor system, which computer instructions, when executed on the processors, implement: determining from a first video stream as captured by at least one camera from a scene in a real world space proximate to a first virtual reality device coupled to a first wearable sensor system that a motion of a user's hand captured in the first video stream indicates a user's hand gesture to the first wearable sensor system to switch display modes of the first virtual reality device; and responsive to the user's hand gesture detected in the first video stream as captured, switching the display modes of the first virtual reality device by displaying a video stream selected from a live video of the scene in real world space and a video stream including an output of a virtual reality application; whereby the display mode is switched between displaying a live video stream and displaying a virtual reality video stream by switching the display modes.
  • 20. The system of claim 19, further including a second set of one or more processors coupled to memory storing computer instructions to switch display modes in a second virtual reality device including motion tracking by a second wearable sensor system, which computer instructions, when executed on the processors, implement: capturing using at least one camera electronically coupled to the second wearable sensor system, a second video stream of the real world space at a time substantially contemporaneous to capture of the first video stream by the first wearable sensor system; and establishing a communication channel between the first wearable sensor system and the second wearable sensor system.
RELATED APPLICATION

This application is a continuation of U.S. patent application Ser. No. 17/345,919 entitled, “MODE SWITCHING FOR INTEGRATED GESTURAL INTERACTION AND MULTI-USER COLLABORATION IN IMMERSIVE VIRTUAL REALITY ENVIRONMENTS,” filed Jun. 11, 2021, which is a continuation of U.S. patent application Ser. No. 16/877,231 entitled, “MODE SWITCHING FOR INTEGRATED GESTURAL INTERACTION AND MULTI-USER COLLABORATION IN IMMERSIVE VIRTUAL REALITY ENVIRONMENTS,” filed May 18, 2020, which is a continuation of U.S. patent application Ser. No. 14/997,454, entitled, “MODE SWITCHING FOR INTEGRATED GESTURAL INTERACTION AND MULTI-USER COLLABORATION IN IMMERSIVE VIRTUAL REALITY ENVIRONMENTS,” filed Jan. 15, 2016, which claims the benefit of U.S. Provisional Patent Application No. 62/104,675, entitled, “MODE SWITCHING FOR INTEGRATED GESTURAL INTERACTION AND MULTI-USER COLLABORATION IN IMMERSIVE VIRTUAL REALITY ENVIRONMENTS,” filed on Jan. 16, 2015. The priority non-provisional and provisional applications are hereby incorporated by reference for all purposes.

US Referenced Citations (534)
Number Name Date Kind
2665041 Maffucci Jan 1954 A
4175862 DiMatteo et al. Nov 1979 A
4521831 Thayer Jun 1985 A
4734939 Copp Apr 1988 A
4876455 Sanderson et al. Oct 1989 A
4879659 Bowlin et al. Nov 1989 A
4893223 Arnold Jan 1990 A
5038258 Koch et al. Aug 1991 A
5134661 Reinsch Jul 1992 A
5282067 Liu Jan 1994 A
5434617 Bianchi Jul 1995 A
5454043 Freeman Sep 1995 A
5574511 Yang et al. Nov 1996 A
5581276 Cipolla et al. Dec 1996 A
5594469 Freeman et al. Jan 1997 A
5659475 Brown Aug 1997 A
5691737 Ito et al. Nov 1997 A
5742263 Wang et al. Apr 1998 A
5900849 Gallery May 1999 A
5900863 Numazaki May 1999 A
5940538 Spiegel et al. Aug 1999 A
6002808 Freeman Dec 1999 A
6031161 Baltenberger Feb 2000 A
6031661 Tanaami Feb 2000 A
6072494 Nguyen Jun 2000 A
6075895 Qiao et al. Jun 2000 A
6147678 Kumar et al. Nov 2000 A
6154558 Hsieh Nov 2000 A
6181343 Lyons Jan 2001 B1
6184326 Razavi et al. Feb 2001 B1
6184926 Khosravi et al. Feb 2001 B1
6195104 Lyons Feb 2001 B1
6204852 Kumar et al. Mar 2001 B1
6252598 Segen Jun 2001 B1
6263091 Jain et al. Jul 2001 B1
6346933 Lin Feb 2002 B1
6417970 Travers et al. Jul 2002 B1
6421453 Kanevsky et al. Jul 2002 B1
6463402 Bennett et al. Oct 2002 B1
6492986 Metaxas et al. Dec 2002 B1
6493041 Hanko et al. Dec 2002 B1
6498628 Iwamura Dec 2002 B2
6578203 Anderson, Jr. et al. Jun 2003 B1
6603867 Sugino et al. Aug 2003 B1
6661918 Gordon et al. Dec 2003 B1
6674877 Jojic et al. Jan 2004 B1
6702494 Dumler et al. Mar 2004 B2
6734911 Lyons May 2004 B1
6738424 Allmen et al. May 2004 B1
6771294 Pulli et al. Aug 2004 B1
6798628 Macbeth Sep 2004 B1
6804654 Kobylevsky et al. Oct 2004 B2
6804656 Rosenfeld et al. Oct 2004 B1
6814656 Rodriguez Nov 2004 B2
6819796 Hong et al. Nov 2004 B2
6901170 Terada et al. May 2005 B1
6919880 Morrison et al. Jul 2005 B2
6950534 Cohen et al. Sep 2005 B2
6993157 Que et al. Jan 2006 B1
7050606 Paul et al. May 2006 B2
7152024 Marschner et al. Dec 2006 B2
7213707 Hubbs et al. May 2007 B2
7215828 Luo May 2007 B2
7236611 Roberts et al. Jun 2007 B2
7244233 Krantz et al. Jul 2007 B2
7257237 Luck et al. Aug 2007 B1
7259873 Sikora et al. Aug 2007 B2
D552152 Almond Oct 2007 S
7289645 Yamamoto et al. Oct 2007 B2
7308112 Fujimura et al. Dec 2007 B2
7340077 Gokturk et al. Mar 2008 B2
7483049 Aman et al. Jan 2009 B2
7519223 Dehlin et al. Apr 2009 B2
7532206 Morrison et al. May 2009 B2
7536032 Bell May 2009 B2
7542586 Johnson Jun 2009 B2
7598942 Underkoffler et al. Oct 2009 B2
7606417 Steinberg et al. Oct 2009 B2
7646372 Marks et al. Jan 2010 B2
7656372 Sato et al. Feb 2010 B2
7665041 Wilson et al. Feb 2010 B2
7692625 Morrison et al. Apr 2010 B2
7743348 Robbins et al. Jun 2010 B2
D620514 Kim et al. Jul 2010 S
D627815 Oba Nov 2010 S
7831932 Josephsoon et al. Nov 2010 B2
7840031 Albertson et al. Nov 2010 B2
7861188 Josephsoon et al. Dec 2010 B2
7940885 Stanton et al. May 2011 B2
7948493 Klefenz et al. May 2011 B2
7961174 Markovic et al. Jun 2011 B1
7961934 Thrun et al. Jun 2011 B2
7971156 Albertson et al. Jun 2011 B2
7980885 Gattwinkel et al. Jul 2011 B2
8023698 Niwa et al. Sep 2011 B2
8035624 Bell et al. Oct 2011 B2
8045825 Shimoyama et al. Oct 2011 B2
8064704 Kim et al. Nov 2011 B2
8085339 Marks Dec 2011 B2
8086971 Radivojevic et al. Dec 2011 B2
8111239 Pryor et al. Feb 2012 B2
8112719 Hsu et al. Feb 2012 B2
8126985 Kandekar et al. Feb 2012 B1
D656175 Fong Mar 2012 S
8144233 Fukuyama Mar 2012 B2
8185176 Mangat et al. May 2012 B2
8213707 Li et al. Jul 2012 B2
8218858 Gu Jul 2012 B2
8229134 Duraiswami et al. Jul 2012 B2
8235529 Raffle Aug 2012 B1
8244233 Chang et al. Aug 2012 B2
8249345 Wu et al. Aug 2012 B2
8270669 Aichi et al. Sep 2012 B2
8289162 Mooring et al. Oct 2012 B2
8290208 Kurtz et al. Oct 2012 B2
8304727 Lee et al. Nov 2012 B2
8319832 Nagata et al. Nov 2012 B2
8363010 Nagata Jan 2013 B2
8375473 Celona et al. Feb 2013 B2
8395600 Kawashima et al. Mar 2013 B2
8396252 Dokor Mar 2013 B2
8432377 Newton Apr 2013 B2
8471848 Tschesnok Jun 2013 B2
8514221 King et al. Aug 2013 B2
D689539 Zaletel et al. Sep 2013 S
8542320 Berestov et al. Sep 2013 B2
8553037 Smith et al. Oct 2013 B2
8582809 Halimeh et al. Nov 2013 B2
8593417 Kawashima et al. Nov 2013 B2
8605202 Muijs et al. Dec 2013 B2
8631355 Murillo et al. Jan 2014 B2
8638989 Holz Jan 2014 B2
8659594 Kim et al. Feb 2014 B2
8659658 Vassigh et al. Feb 2014 B2
8661571 Teetzel et al. Mar 2014 B1
8693731 Holz et al. Apr 2014 B2
8738523 Sanchez et al. May 2014 B1
8744122 Salgian et al. Jun 2014 B2
8744645 Vaghefinazari et al. Jun 2014 B1
8751979 Socha Jun 2014 B1
8768022 Miga et al. Jul 2014 B2
8773512 Rafii Jul 2014 B1
8781171 King et al. Jul 2014 B2
8781234 Zhang et al. Jul 2014 B2
8817087 Weng et al. Aug 2014 B2
8824749 Leyvand et al. Sep 2014 B2
8842084 Andersson et al. Sep 2014 B2
8843857 Berkes et al. Sep 2014 B2
8872914 Gobush Oct 2014 B2
D717361 Nikaido Nov 2014 S
8878749 Wu et al. Nov 2014 B1
8891868 Ivanchenko Nov 2014 B1
8907982 Zontrop et al. Dec 2014 B2
8922590 Luckett, Jr. et al. Dec 2014 B1
8929609 Padovani et al. Jan 2015 B2
8930852 Chen et al. Jan 2015 B2
8942881 Hobbs et al. Jan 2015 B2
8954340 Sanchez et al. Feb 2015 B2
8957857 Lee et al. Feb 2015 B2
9014414 Katano et al. Apr 2015 B2
9056396 Linnell Jun 2015 B1
9070019 Holz Jun 2015 B2
9119670 Yang et al. Sep 2015 B2
9122354 Sharma Sep 2015 B2
9124778 Crabtree Sep 2015 B1
9170674 Forutanpour et al. Oct 2015 B2
9213185 Starner et al. Dec 2015 B1
9274742 Phillips Mar 2016 B2
D756443 Ju et al. May 2016 S
9459697 Bedikian et al. Oct 2016 B2
9600935 Cohen Mar 2017 B2
9741169 Holz Aug 2017 B1
9754167 Holz et al. Sep 2017 B1
9766855 Lapidot et al. Sep 2017 B2
9868449 Holz et al. Jan 2018 B1
10043320 Holz et al. Aug 2018 B2
10318100 Abercrombie et al. Jun 2019 B2
10475249 Holz et al. Nov 2019 B2
10600248 Holz Mar 2020 B2
11080937 Holz Aug 2021 B2
11538224 Holz et al. Dec 2022 B2
11561519 Gordon et al. Jan 2023 B2
11676349 Holz Jun 2023 B2
20010044858 Rekimoto Nov 2001 A1
20010052985 Ono Dec 2001 A1
20020008139 Albertelli Jan 2002 A1
20020008211 Kask Jan 2002 A1
20020041327 Hildreth et al. Apr 2002 A1
20020080094 Biocca et al. Jun 2002 A1
20020105484 Navab et al. Aug 2002 A1
20020107637 Okamura et al. Aug 2002 A1
20030053658 Pavlidis Mar 2003 A1
20030053659 Pavlidis et al. Mar 2003 A1
20030076996 Neumann et al. Apr 2003 A1
20030081141 Mazzapica May 2003 A1
20030123703 Pavlidis et al. Jul 2003 A1
20030152289 Luo Aug 2003 A1
20030202697 Simard et al. Oct 2003 A1
20040103111 Miller et al. May 2004 A1
20040125228 Dougherty Jul 2004 A1
20040125984 Ito et al. Jul 2004 A1
20040145809 Brenner Jul 2004 A1
20040155877 Hong et al. Aug 2004 A1
20040212725 Raskar Oct 2004 A1
20050007673 Chaoulov et al. Jan 2005 A1
20050063564 Yamamoto et al. Mar 2005 A1
20050068518 Baney et al. Mar 2005 A1
20050094019 Grosvenor et al. May 2005 A1
20050131607 Breed Jun 2005 A1
20050156888 Xie et al. Jul 2005 A1
20050168578 Gobush Aug 2005 A1
20050210105 Hirata et al. Sep 2005 A1
20050236558 Nabeshima et al. Oct 2005 A1
20050238201 Shamaie Oct 2005 A1
20060006235 Kurzweil et al. Jan 2006 A1
20060017807 Lee et al. Jan 2006 A1
20060028656 Venkatesh et al. Feb 2006 A1
20060029296 King et al. Feb 2006 A1
20060034545 Mattes et al. Feb 2006 A1
20060050979 Kawahara Mar 2006 A1
20060056056 Ahiska et al. Mar 2006 A1
20060072105 Wagner Apr 2006 A1
20060098899 King et al. May 2006 A1
20060204040 Freeman et al. Sep 2006 A1
20060210112 Cohen et al. Sep 2006 A1
20060262421 Matsumoto et al. Nov 2006 A1
20060273984 Wanda et al. Dec 2006 A1
20060290950 Platt et al. Dec 2006 A1
20070014466 Baldwin Jan 2007 A1
20070042346 Weller Feb 2007 A1
20070057764 Sato et al. Mar 2007 A1
20070086621 Aggarwal et al. Apr 2007 A1
20070130547 Boillot Jun 2007 A1
20070203904 Ren et al. Aug 2007 A1
20070206719 Suryanarayanan et al. Sep 2007 A1
20070230929 Niwa et al. Oct 2007 A1
20070238956 Haras et al. Oct 2007 A1
20070275755 Chae et al. Nov 2007 A1
20080002860 Super et al. Jan 2008 A1
20080013826 Hillis et al. Jan 2008 A1
20080019576 Senftner et al. Jan 2008 A1
20080019589 Yoon et al. Jan 2008 A1
20080030429 Hailpern et al. Feb 2008 A1
20080031492 Lanz Feb 2008 A1
20080056752 Denton et al. Mar 2008 A1
20080064954 Adams et al. Mar 2008 A1
20080106637 Nakao et al. May 2008 A1
20080106746 Shpunt et al. May 2008 A1
20080110994 Knowles et al. May 2008 A1
20080118091 Serfaty et al. May 2008 A1
20080126937 Pachet May 2008 A1
20080170776 Albertson et al. Jul 2008 A1
20080187175 Kim et al. Aug 2008 A1
20080244468 Nishihara et al. Oct 2008 A1
20080246759 Summers Oct 2008 A1
20080273764 Scholl Nov 2008 A1
20080278589 Thorn Nov 2008 A1
20080291160 Rabin Nov 2008 A1
20080304740 Sun et al. Dec 2008 A1
20080317292 Baker et al. Dec 2008 A1
20080319356 Cain et al. Dec 2008 A1
20090002489 Yang et al. Jan 2009 A1
20090038056 Bobbin et al. Feb 2009 A1
20090093307 Miyaki Apr 2009 A1
20090102840 Li Apr 2009 A1
20090103780 Nishihara et al. Apr 2009 A1
20090116742 Nishihara May 2009 A1
20090122146 Zalewski et al. May 2009 A1
20090153655 Ike et al. Jun 2009 A1
20090203993 Mangat et al. Aug 2009 A1
20090203994 Mangat et al. Aug 2009 A1
20090217211 Hildreth et al. Aug 2009 A1
20090239587 Negron et al. Sep 2009 A1
20090257623 Tang et al. Oct 2009 A1
20090274339 Cohen et al. Nov 2009 A9
20090278915 Kramer et al. Nov 2009 A1
20090309710 Kakinami Dec 2009 A1
20090315837 Geiger Dec 2009 A1
20090323121 Valkenburg et al. Dec 2009 A1
20100013832 Xiao et al. Jan 2010 A1
20100020078 Shpunt Jan 2010 A1
20100023015 Park Jan 2010 A1
20100026963 Faulstich Feb 2010 A1
20100027845 Kim et al. Feb 2010 A1
20100046842 Conwell Feb 2010 A1
20100053164 Imai et al. Mar 2010 A1
20100053209 Rauch et al. Mar 2010 A1
20100053612 Ou-Yang et al. Mar 2010 A1
20100058252 Ko Mar 2010 A1
20100066737 Liu Mar 2010 A1
20100066975 Rehnstrom Mar 2010 A1
20100091110 Hildreth Apr 2010 A1
20100095206 Kim Apr 2010 A1
20100118123 Freedman et al. May 2010 A1
20100121189 Ma et al. May 2010 A1
20100125815 Wang et al. May 2010 A1
20100127995 Rigazio et al. May 2010 A1
20100141762 Siann et al. Jun 2010 A1
20100153457 Grant Jun 2010 A1
20100158372 Kim et al. Jun 2010 A1
20100175172 Dempsey et al. Jul 2010 A1
20100177929 Kurtz et al. Jul 2010 A1
20100194863 Lopes et al. Aug 2010 A1
20100199229 Kipman et al. Aug 2010 A1
20100199230 Latta et al. Aug 2010 A1
20100199232 Mistry et al. Aug 2010 A1
20100201880 Iwamura Aug 2010 A1
20100208942 Porter et al. Aug 2010 A1
20100219934 Matsumoto Sep 2010 A1
20100222102 Rodriguez Sep 2010 A1
20100264833 Van Endert et al. Oct 2010 A1
20100277411 Yee et al. Nov 2010 A1
20100296698 Lien et al. Nov 2010 A1
20100302015 Kipman et al. Dec 2010 A1
20100302357 Hsu et al. Dec 2010 A1
20100303298 Marks et al. Dec 2010 A1
20100306712 Snook et al. Dec 2010 A1
20100309097 Raviv et al. Dec 2010 A1
20110007072 Khan et al. Jan 2011 A1
20110025818 Gallmeier et al. Feb 2011 A1
20110026765 Ivanich et al. Feb 2011 A1
20110043806 Guetta et al. Feb 2011 A1
20110057875 Shigeta et al. Mar 2011 A1
20110066984 Li Mar 2011 A1
20110080470 Kuno et al. Apr 2011 A1
20110080490 Clarkson et al. Apr 2011 A1
20110090252 Yoon et al. Apr 2011 A1
20110090407 Friedman Apr 2011 A1
20110093820 Zhang et al. Apr 2011 A1
20110107216 Bi May 2011 A1
20110115486 Frohlich et al. May 2011 A1
20110116684 Coffman et al. May 2011 A1
20110119640 Berkes et al. May 2011 A1
20110134112 Koh et al. Jun 2011 A1
20110145981 Teetzel et al. Jun 2011 A1
20110148875 Kim et al. Jun 2011 A1
20110157555 Mashitani Jun 2011 A1
20110169726 Holmdahl et al. Jul 2011 A1
20110173574 Clavin et al. Jul 2011 A1
20110176146 Alvarez Diez et al. Jul 2011 A1
20110181509 Rautiainen et al. Jul 2011 A1
20110193778 Lee et al. Aug 2011 A1
20110205151 Newton et al. Aug 2011 A1
20110213664 Osterhout et al. Sep 2011 A1
20110221672 Osterhout Sep 2011 A1
20110228978 Chen et al. Sep 2011 A1
20110234631 Kim et al. Sep 2011 A1
20110234840 Klefenz et al. Sep 2011 A1
20110239354 Celona et al. Oct 2011 A1
20110243451 Oyaizu Oct 2011 A1
20110251896 Impollonia et al. Oct 2011 A1
20110261178 Lo et al. Oct 2011 A1
20110267259 Tidemand et al. Nov 2011 A1
20110279397 Rimon et al. Nov 2011 A1
20110286676 El Dokor Nov 2011 A1
20110289455 Reville et al. Nov 2011 A1
20110289456 Reville et al. Nov 2011 A1
20110291925 Israel et al. Dec 2011 A1
20110291988 Bamji et al. Dec 2011 A1
20110296353 Ahmed et al. Dec 2011 A1
20110299737 Wang et al. Dec 2011 A1
20110304600 Yoshida Dec 2011 A1
20110304650 Campillo et al. Dec 2011 A1
20110310007 Margolis et al. Dec 2011 A1
20110310220 McEldowney Dec 2011 A1
20110314427 Sundararajan Dec 2011 A1
20120030637 Dey et al. Feb 2012 A1
20120038637 Marks Feb 2012 A1
20120050143 Border et al. Mar 2012 A1
20120050157 Latta et al. Mar 2012 A1
20120062444 Cok Mar 2012 A1
20120065499 Chono Mar 2012 A1
20120068914 Jacobsen et al. Mar 2012 A1
20120079420 Arriola Mar 2012 A1
20120086624 Thompson et al. Apr 2012 A1
20120086728 Mcardle et al. Apr 2012 A1
20120113316 Ueta et al. May 2012 A1
20120132682 Mongan et al. May 2012 A1
20120143361 Kurabayashi et al. Jun 2012 A1
20120159380 Kocienda et al. Jun 2012 A1
20120163675 Joo et al. Jun 2012 A1
20120170800 da Silva Frazao et al. Jul 2012 A1
20120194420 Osterhout et al. Aug 2012 A1
20120194517 Izadi et al. Aug 2012 A1
20120204133 Guendelman et al. Aug 2012 A1
20120223959 Lengeling Sep 2012 A1
20120236288 Stanley Sep 2012 A1
20120250936 Holmgren Oct 2012 A1
20120257797 Leyvand et al. Oct 2012 A1
20120270654 Padovani et al. Oct 2012 A1
20120274781 Shet et al. Nov 2012 A1
20120281873 Brown et al. Nov 2012 A1
20120281884 Whillock et al. Nov 2012 A1
20120293667 Baba et al. Nov 2012 A1
20120314030 Datta et al. Dec 2012 A1
20120320080 Giese et al. Dec 2012 A1
20130002649 Wu et al. Jan 2013 A1
20130014052 Frey et al. Jan 2013 A1
20130019204 Kotler et al. Jan 2013 A1
20130027300 Nakasu et al. Jan 2013 A1
20130038694 Nichani et al. Feb 2013 A1
20130044951 Cherng et al. Feb 2013 A1
20130050425 Im et al. Feb 2013 A1
20130080898 Lavian et al. Mar 2013 A1
20130086531 Sugita et al. Apr 2013 A1
20130097566 Berglund Apr 2013 A1
20130120319 Givon May 2013 A1
20130127980 Haddick et al. May 2013 A1
20130148852 Partis et al. Jun 2013 A1
20130156296 Dokor Jun 2013 A1
20130182079 Holz Jul 2013 A1
20130182897 Holz Jul 2013 A1
20130187952 Berkovich et al. Jul 2013 A1
20130191911 Dellinger et al. Jul 2013 A1
20130204457 King et al. Aug 2013 A1
20130208948 Berkovich et al. Aug 2013 A1
20130222308 Ma et al. Aug 2013 A1
20130222640 Baek et al. Aug 2013 A1
20130239059 Chen et al. Sep 2013 A1
20130241832 Rimon et al. Sep 2013 A1
20130249787 Morimoto Sep 2013 A1
20130252691 Alexopoulos Sep 2013 A1
20130257736 Hou et al. Oct 2013 A1
20130258140 Lipson et al. Oct 2013 A1
20130261871 Hobbs et al. Oct 2013 A1
20130271370 Wang et al. Oct 2013 A1
20130271397 MacDougall et al. Oct 2013 A1
20130278631 Border et al. Oct 2013 A1
20130293688 Benson et al. Nov 2013 A1
20130293723 Benson et al. Nov 2013 A1
20130300637 Smits et al. Nov 2013 A1
20130300831 Mavromatis et al. Nov 2013 A1
20130307856 Keane et al. Nov 2013 A1
20130307935 Rappel et al. Nov 2013 A1
20130321261 Nakasu Dec 2013 A1
20130321265 Bychkov et al. Dec 2013 A1
20130328867 Jung et al. Dec 2013 A1
20130328928 Yamagishi et al. Dec 2013 A1
20130335301 Wong et al. Dec 2013 A1
20130335573 Forutanpour et al. Dec 2013 A1
20140002365 Ackley et al. Jan 2014 A1
20140010441 Shamaie Jan 2014 A1
20140023247 Kuwahara et al. Jan 2014 A1
20140064566 Shreve et al. Mar 2014 A1
20140081521 Frojdh et al. Mar 2014 A1
20140085203 Kobayashi Mar 2014 A1
20140099019 Dokor Apr 2014 A1
20140111423 Park et al. Apr 2014 A1
20140125775 Holz May 2014 A1
20140125813 Holz May 2014 A1
20140132738 Ogura et al. May 2014 A1
20140134733 Wu et al. May 2014 A1
20140139425 Sakai May 2014 A1
20140139641 Holz May 2014 A1
20140152809 Jarvis Jun 2014 A1
20140157135 Lee et al. Jun 2014 A1
20140161311 Kim Jun 2014 A1
20140168062 Katz et al. Jun 2014 A1
20140176310 Kotlicki Jun 2014 A1
20140176420 Zhou et al. Jun 2014 A1
20140177913 Holz Jun 2014 A1
20140184493 Chen Jul 2014 A1
20140189579 Rimon et al. Jul 2014 A1
20140189938 Redpath et al. Jul 2014 A1
20140192024 Holz Jul 2014 A1
20140195988 Kramer et al. Jul 2014 A1
20140198024 Adzhigirey et al. Jul 2014 A1
20140201666 Bedikian et al. Jul 2014 A1
20140201689 Bedikian et al. Jul 2014 A1
20140210793 Eriksson Jul 2014 A1
20140222385 Muenster et al. Aug 2014 A1
20140223385 Ton et al. Aug 2014 A1
20140225826 Juni Aug 2014 A1
20140225918 Mittal Aug 2014 A1
20140240215 Tremblay et al. Aug 2014 A1
20140240225 Eilat Aug 2014 A1
20140248950 Tosas Bautista Sep 2014 A1
20140253512 Narikawa et al. Sep 2014 A1
20140253711 Balch et al. Sep 2014 A1
20140253785 Chan et al. Sep 2014 A1
20140267098 Na et al. Sep 2014 A1
20140282282 Holz Sep 2014 A1
20140285403 Kobayashi Sep 2014 A1
20140306874 Finocchio et al. Oct 2014 A1
20140306891 Latta et al. Oct 2014 A1
20140307920 Holz Oct 2014 A1
20140320274 De Schepper et al. Oct 2014 A1
20140344762 Grasset et al. Nov 2014 A1
20140361976 Osman et al. Dec 2014 A1
20140364209 Perry Dec 2014 A1
20140364212 Osman Dec 2014 A1
20140369558 Holz Dec 2014 A1
20140375547 Katz et al. Dec 2014 A1
20140375947 Park et al. Dec 2014 A1
20150003673 Fletcher Jan 2015 A1
20150009149 Gharib et al. Jan 2015 A1
20150016777 Abovitz et al. Jan 2015 A1
20150022447 Hare et al. Jan 2015 A1
20150024368 King, Jr. Jan 2015 A1
20150029091 Nakashima et al. Jan 2015 A1
20150040297 Vermillion Feb 2015 A1
20150054729 Minnen et al. Feb 2015 A1
20150084864 Geiss et al. Mar 2015 A1
20150092015 Stafford Apr 2015 A1
20150094142 Stafford Apr 2015 A1
20150097772 Starner Apr 2015 A1
20150115802 Kuti et al. Apr 2015 A1
20150116214 Grunnet-Jepsen et al. Apr 2015 A1
20150122855 Parsons et al. May 2015 A1
20150131859 Kim et al. May 2015 A1
20150138070 Iwatsu May 2015 A1
20150172539 Neglur Jun 2015 A1
20150193669 Gu et al. Jul 2015 A1
20150205358 Lyren Jul 2015 A1
20150205400 Hwang et al. Jul 2015 A1
20150206321 Scavezze et al. Jul 2015 A1
20150227795 Starner et al. Aug 2015 A1
20150234569 Hess Aug 2015 A1
20150253428 Holz Sep 2015 A1
20150258432 Stafford et al. Sep 2015 A1
20150261291 Mikhailov et al. Sep 2015 A1
20150304593 Sakai Oct 2015 A1
20150317831 Ebstyne et al. Nov 2015 A1
20150320189 Ju et al. Nov 2015 A1
20150323785 Fukata et al. Nov 2015 A1
20150323795 Alto et al. Nov 2015 A1
20150326762 Ju et al. Nov 2015 A1
20150351482 Gendron et al. Dec 2015 A1
20150366284 Dowling et al. Dec 2015 A1
20160025978 Mallinson Jan 2016 A1
20160044298 Holz et al. Feb 2016 A1
20160062573 Dascola et al. Mar 2016 A1
20160086046 Holz et al. Mar 2016 A1
20160093105 Rimon et al. Mar 2016 A1
Foreign Referenced Citations (69)
Number Date Country
1984236 Jun 2007 CN
101124534 Feb 2008 CN
201332447 Oct 2009 CN
101729808 Jun 2010 CN
101930610 Dec 2010 CN
101951474 Jan 2011 CN
102053702 May 2011 CN
201859393 Jun 2011 CN
102184014 Sep 2011 CN
102201121 Sep 2011 CN
102236412 Nov 2011 CN
102696057 Sep 2012 CN
103090862 May 2013 CN
4201934 Jul 1993 DE
10326035 Jan 2005 DE
102007015495 Oct 2007 DE
102007015497 Jan 2014 DE
0999542 May 2000 EP
1477924 Nov 2004 EP
1837665 Sep 2007 EP
2369443 Sep 2011 EP
2419433 Apr 2006 GB
2480140 Nov 2011 GB
2519418 Apr 2015 GB
H02236407 Sep 1990 JP
H08261721 Oct 1996 JP
H09259278 Oct 1997 JP
2000023038 Jan 2000 JP
2002133400 May 2002 JP
2003256814 Sep 2003 JP
2004246252 Sep 2004 JP
2006019526 Jan 2006 JP
2006259829 Sep 2006 JP
2007272596 Oct 2007 JP
2008227569 Sep 2008 JP
2009031939 Feb 2009 JP
2009037594 Feb 2009 JP
2010060548 Mar 2010 JP
2011010258 Jan 2011 JP
2011065652 Mar 2011 JP
2011107681 Jun 2011 JP
4906960 Mar 2012 JP
2012527145 Nov 2012 JP
101092909 Dec 2011 KR
20130079841 Jul 2013 KR
2422878 Jun 2011 RU
200844871 Nov 2008 TW
9426057 Nov 1994 WO
2004114220 Dec 2004 WO
2006020846 Feb 2006 WO
2006090197 Aug 2006 WO
2007137093 Nov 2007 WO
20090142797 Nov 2009 WO
2010007662 Jan 2010 WO
2010032268 Mar 2010 WO
2010076622 Jul 2010 WO
2010088035 Aug 2010 WO
2010138741 Dec 2010 WO
2011024193 Mar 2011 WO
2011036618 Mar 2011 WO
2011044680 Apr 2011 WO
2011045789 Apr 2011 WO
2011119154 Sep 2011 WO
2012027422 Mar 2012 WO
2013109608 Jul 2013 WO
2013109609 Jul 2013 WO
2014208087 Dec 2014 WO
2015026707 Feb 2015 WO
2015172118 Nov 2015 WO
Non-Patent Literature Citations (111)
Entry
“EigenSolver<_MatrixType> Class Template Reference,” Reference Eigen Values Module, retrieved from the internet: <http://eigen.tuxfamily.org/dox/classEigen_1_1EigenSolver.html> on Mar. 12, 2015, pp. 1-8.
“Hessian Matrix of the Image,” Matlab—Hessian Matrix of the Image—Stack Overflow, last edited Mar. 13, 2014, retrieved from the internet: <http://stackoverflow.com/questions/22378360/hessian-matrix-of-the-image> on Mar. 10, 2015, 3 pages.
“How Hessian Feature Detector Works?” Signal Processing Stack Exchange, last edited Oct. 2013, retrieved from the internet: <http://dsp.stackexchange.com/questions/10579/how-hessian-feature-detector-works> on Mar. 10, 2015, 3 pages.
“SVD Module,” Reference, Eigen: SVD Module, retrieved from the internet: <http://eigen.tuxfamily.org/dox/group_SVD_Module.html> on Mar. 12, 2015, 1 page.
Arthington, et al., “Cross-section Reconstruction During Uniaxial Loading,” Measurement Science and Technology, vol. 20, No. 7, Jun. 10, 2009, Retrieved from the Internet: http://iopscience.iop.org/0957-0233/20/7/075701, pp. 1-9.
Ballan et al., “Lecture Notes Computer Science: 12th European Conference on Computer Vision: Motion Capture of Hands in Action Using Discriminative Salient Points”, Oct. 7-13, 2012 [retrieved Jul. 14, 2016], Springer Berlin Heidelberg, vol. 7577, pp. 640-653. Retrieved from the Internet: <http://link.springer.com/chapter/10.1007/978-3-642-33783-3_46>.
Barat et al., “Feature Correspondences From Multiple Views of Coplanar Ellipses”, 2nd International Symposium on Visual Computing, Author Manuscript, 2006, 10 pages.
Bardinet, et al., “Fitting of iso-Surfaces Using Superquadrics and Free-Form Deformations” [on-line], Jun. 24-25, 1994 [retrieved Jan. 9, 2014], 1994 Proceedings of IEEE Workshop on Biomedical Image Analysis, Retrieved from the Internet: http://ieeexplore.IEEE.org/xpls/abs_all.jsp?arnumber=315882&tag=1, pp. 184-193.
Bhutami, R., “What are the Ways of Calculating 2x2 Hessian Matrix for 2D Image of Pixel at (x,y) Position?,” Quora, last updated May 2013, retrieved from the internet: <http://www.quora.com/What-are-the-ways-of-calculating-2-x-2-hessian-matrix-for-2d-image-of-pixel-at-x-y-position> on Mar. 10, 2015, 4 pages.
Butail, S., et al., “Three-Dimensional Reconstruction of the Fast-Start Swimming Kinematics of Densely Schooling Fish,” Journal of the Royal Society Interface, Jun. 3, 2011, retrieved from the Internet <http://www.ncbi.nlm.nih.gov/pubmed/21642367>, pp. 0, 1-12.
Cheikh et al., “Multipeople Tracking Across Multiple Cameras”, International Journal on New Computer Architectures and Their Applications (IJNCAA), vol. 2, No. 1, 2012, pp. 23-33.
Choras, M., et al., “Contactless Palmprint and Knuckle Biometrics for Mobile Devices”, Springerlink, Dec. 8, 2009, 13 pages.
Chung, et al., “Recovering LSHGCs and SHGCs from Stereo,” International Journal of Computer Vision, vol. 20, No. 1/2, Oct. 1996, pp. 43-58.
CN 2013800122765—Office Action dated Nov. 2, 2015, 17 pages.
CN 2013800122765—Response to First Office Action dated Nov. 2, 2015 filed May 14, 2016, 14 pages.
CN 2013800122765—Response to Second Office Action dated Jul. 27, 2016 filed Oct. 11, 2016, 3 pages.
CN 2013800122765—Second Office Action dated Jul. 27, 2016, 6 pages.
CN 201580041594.3—First Office Action dated Feb. 3, 2019, 29 pages.
Cui et al., “Applications of Evolutionary Computing: Vision-Based Hand Motion Capture Using Genetic Algorithm”, 2004 [retrieved Jul. 15, 2016], Springer Berlin Heidelberg, vol. 3005 of LNCS, pp. 289-300. Retrieved from the Internet: <http://link.springer.com/chapter/10.1007/978-3-540-24653-4_30>.
Cumani, A., et al., “Recovering the 3D Structure of Tubular Objects from Stereo Silhouettes,” Pattern Recognition, Elsevier, GB, vol. 30, No. 7, Jul. 1, 1997, 9 pages.
Davis et al., “Toward 3-D Gesture Recognition”, International Journal of Pattern Recognition and Artificial Intelligence, vol. 13, No. 03, 1999, pp. 381-393.
DE 11 2013 000 590.5—First Office Action dated Nov. 5, 2014, 7 pages.
DE 11 2013 000 590.5—Notice of Allowance dated Jan. 18, 2016, 8 pages.
DE 11 2013 000 590.5—Response to First Office Action dated Nov. 5, 2014 filed Apr. 24, 2015, 1 page.
DE 11 2013 000 590.5—Response to Second Office Action dated Apr. 29, 2015 filed Sep. 16, 2015, 11 pages.
DE 11 2013 000 590.5—Response to Third Office Action dated Sep. 28, 2015 filed Dec. 14, 2015, 64 pages.
DE 11 2013 000 590.5—Second Office Action dated Apr. 29, 2015, 7 pages.
DE 11 2013 000 590.5—Third Office Action dated Sep. 28, 2015, 4 pages.
Delamarre et al., “Finding Pose of Hand in Video Images: A Stereo-based Approach”, Apr. 14-16, 1998 [retrieved Jul. 15, 2016], Third IEEE Intern Conf on Auto Face and Gesture Recog, pp. 585-590. Retrieved from the Internet: <http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=671011&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D671011>.
Di Zenzo, S., et al., “Advances in Image Segmentation,” Image and Vision Computing, Elsevier, Guildford, GBN, vol. 1, No. 1, Copyright Butterworth & Co Ltd., Nov. 1, 1983, pp. 196-210.
Dombeck, D., et al., “Optical Recording of Action Potentials with Second-Harmonic Generation Microscopy,” The Journal of Neuroscience, Jan. 28, 2004, vol. 24(4): pp. 999-1003.
Forbes, K., et al., “Using Silhouette Consistency Constraints to Build 3D Models,” University of Cape Town, Copyright De Beers 2003, Retrieved from the internet: <http://www.dip.ee.uct.ac.za/~kforbes/Publications/Forbes2003Prasa.pdf> on Jun. 17, 2013, 6 pages.
Franco, Michael. “Wave Your Hand to Control Smart Devices—Even if it's Out of Sight.” Cnet (Feb. 28, 2014).
Fukui et al. “Multiple Object Tracking System with Three Level Continuous Processes” IEEE, 1992, pp. 19-27.
Genovese, Angelo, Contactless and less Constrained Palmprint Recognition, Mar. 18, 2014, 48 pages.
Gorce et al., “Model-Based 3D Hand Pose Estimation from Monocular Video”, Feb. 24, 2011 [retrieved Jul. 15, 2016], IEEE Transac Pattern Analysis and Machine Intell, vol. 33, Issue: 9, pp. 1793-1805, Retrieved from the Internet: <http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5719617&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D5719617>.
Grauman, K., et al., “Chapter 3—Local Features: Detection and Description,” Visual Object Recognition: Synthesis Lectures on Artificial Intelligence and Machine Learning, Apr. 2011, retrieved from the internet: <www.morganclaypool.com/doi/abs/10.2200/S00332Ed1V01Y201103A- M011> on Mar. 12, 2015, pp. 1, 23-39.
Guo et al., Featured Wand for 3D Interaction, Jul. 2-5, 2007 [retrieved Jul. 15, 2016], 2007 IEEE International Conference on Multimedia and Expo, pp. 2230-2233. Retrieved from the Internet: <http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=4285129&tag=1&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D4285129%26tag%3D1>.
Heikkila, J., “Accurate Camera Calibration and Feature Based 3-D Reconstruction from Monocular Image Sequences”, Infotech Oulu and Department of Electrical Engineering, University of Oulu, 1997, 126 pages.
Hladuvka, J., et al., “Exploiting Eigenvalues of the Hessian Matrix for Volume Decimation,” CiteSeerx, Copyright 2001, retrieved from the internet: <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.67.565> on Mar. 12, 2015, from Vienna University of Technology, 7 pages.
Interactive Gaming Smart Vending Machine. Silikron Smart Vending, URL: https://www.youtube.com/watch?v=tK17sXvzLtU, Dec. 12, 2013.
Jin et al., Vascular Tree Segmentation in Medical Images Using Hessian-Based Multiscale Filtering and Level Set Method, Hindawi, dated Jun. 28, 2013, 8 pages.
JP 2014-552391—First Office Action dated Dec. 9, 2014, 6 pages.
JP 2014-552391—Response to First Office Action dated Dec. 9, 2014 filed Jun. 8, 2016, 9 pages.
JP 2014-552391—Response to Second Office Action dated Jul. 7, 2015 filed Dec. 25, 2015, 4 pages.
JP 2014-552391—Second Office Action dated Jul. 7, 2015, 7 pages.
JP 2014-552391—Third Office Action dated Jan. 26, 2016, 5 pages.
Kanhangad, V., et al., “Combining 2D and 3D Hand Geometry Features for Biometric Verification”, IEEE 2009, 6 pages.
Kanhangad, V., et al., “A Unified Framework for Contactless Hand Verification,” IEEE Transactions on Information Forensics and Security, IEEE, Piscataway, NJ, US , vol. 6, No. 3, Sep. 1, 2011, pp. 1014-1027.
Kellogg, Bryce, Vamsi Talla, and Shyamnath Gollakota. “Bringing Gesture Recognition to All Devices,” NSDI'14: Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation, (Apr. 2, 2014), pp. 303-316. (Year: 2014).
Kim, et al., “Development of an Orthogonal Double-Image Processing Algorithm to Measure Bubble,” Department of Nuclear Engineering and Technology, Seoul National University Korea, vol. 39 No. 4, Published Jul. 6, 2007, pp. 313-326.
Kulesza, et al., “Arrangement of a Multi Stereo Visual Sensor System for a Human Activities Space,” Source: Stereo Vision, Book edited by: Dr. Asim Bhatti, ISBN 978-953-7619-22-0, Copyright Nov. 2008, I-Tech, Vienna, Austria, www.intechopen.com, pp. 153-173.
Matsuyama et al. “Real-Time Dynamic 3-D Object Shape Reconstruction and High-Fidelity Texture Mapping for 3-D Video,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, No. 3, Mar. 2004, pp. 357-369.
May, S., et al., “Robust 3D-Mapping with Time-of-Flight Cameras,” 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, Piscataway, NJ, USA, Oct. 10, 2009, pp. 1673-1678.
Melax et al., “Dynamics Based 3D Skeletal Hand Tracking”, May 29, 2013 [retrieved Jul. 14, 2016], Proceedings of Graphics Interface, 2013, pp. 63-70. Retrieved from the Internet: <http://dl.acm.org/citation.cfm?id=2532141>.
Mendez, et al., “Importance Masks for Revealing Occluded Objects in Augmented Reality,” Proceedings of the 16th ACM Symposium on Virtual Reality Software and Technology, 2 pages, ACM, 2009.
Oka et al., “Real-Time Fingertip Tracking and Gesture Recognition”, Nov./Dec. 2002 [retrieved Jul. 15, 2016], IEEE Computer Graphics and Applications, vol. 22, Issue: 6, pp. 64-71. Retrieved from the Internet: <http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1046630&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1046630>.
Olsson, K., et al., “Shape from Silhouette Scanner-Creating a Digital 3D Model of a Real Object by Analyzing Photos From Multiple Views,” University of Linkoping, Sweden, Copyright VCG 2001, Retrieved from the Internet: <http://liu.diva-portal.org/smash/get/diva2:18671/FULLTEXT01> on Jun. 17, 2013, 52 pages.
PCT/JP2008/062732, WO English Abstract with Japanese Publication of WO 2010/007662 A1, “Heat-Resistant Cushion material for Forming Press,” Jan. 21, 2010, Ichikawa Co Ltd, 35 pages.
PCT/US2013/021709—International Preliminary Report on Patentability dated Jul. 22, 2014, 22 pages (WO 2013/109608).
PCT/US2013/021709—International Search Report and Written Opinion dated Sep. 12, 2013, 22 pages.
PCT/US2013/021709, International Preliminary Report on Patentability and Written Opinion, Sep. 12, 2013, 22 pages (WO 2013/109608).
PCT/US2013/021713—International Preliminary Report on Patentability dated Jul. 22, 2014, 13 pages, (WO 2013/109609).
PCT/US2013/021713—International Search Report and Written Opinion dated Sep. 11, 2013, 7 pages.
PCT/US2013/069231—International Preliminary Report with Written Opinion dated May 12, 2015, 8 pages.
PCT/US2013/069231—International Search Report and Written Opinion mailed Mar. 13, 2014, 7 pages.
PCT/US2014/013012—International Search Report and Written Opinion dated May 14, 2014, published as WO 2014116991, 12 pages.
PCT/US2014/028265, Application (Determining Positional Information for an Object in Space), May 9, 2014, 117 pages.
PCT/US2014/028265, International Search Report and Written Opinion, Jan. 7, 2015, 15 pages.
PCT/US2015/032705—International Preliminary Report on Patentability dated Nov. 29, 2016, 11 pages.
PCT/US2015/032705—International Search Report dated Sep. 15, 2015, 3 pages.
PCT/US2015/032705—Written Opinion of the ISA, mailed Sep. 15, 2015, 10 pages.
Pedersini, et al., Accurate Surface Reconstruction from Apparent Contours, Sep. 5-8, 2000 European Signal Processing Conference EUSIPCO 2000, vol. 4, Retrieved from the Internet: http://home.deib.polimi.it/sarti/CV_and_publications.html, pp. 1-4.
Pointgrab LTD. “New Line of Acer All-in-One Devices Among the First Products to Feature PointGrab's Windows 8 Hand Gesture Control Solution,” Business Wire (English), (Nov. 13, 2012).
Rasmussen, Matthew K., “An Analytical Framework for the Preparation and Animation of a Virtual Mannequin for the Purpose of Mannequin-Clothing Interaction Modeling”, A Thesis Submitted in Partial Fulfillment of the Requirements for the Master of Science Degree in Civil and Environmental Engineering in the Graduate College of the University of Iowa, Dec. 2008, 98 pages.
Rudzki, M., “Vessel Detection Method Based on Eigenvalues of Hessian Matrix and its Applicability to Airway Tree Segmentation,” XI International PhD Workshop, OWD 2009, Silesian University of Technology, Oct. 17-20, 2009, 6 pages.
Schaar, R., VCNL4020 Vishay Semiconductors. Application Note [online]. Extended Detection Range with VCNL Family of Proximity Sensor Vishay Intertechnology, Inc, Doc No. 84225, Revised Oct. 25, 2013 [retrieved Mar. 4, 2014]. Retrieved from the Internet: <www.vishay.com>. 4 pages.
Schlattmann et al., “Markerless 4 gestures 6 DOF real-time visual tracking of the human hand with automatic Initialization”, 2007 [retrieved Jul. 15, 2016], Eurographics 2007, vol. 26, No. 3, 10 pages, Retrieved from the Internet: <http://cg.cs.uni-bonn.de/aigaion2root/attachments/schlattmann-2007-markerless.pdf>.
Shinodalab, “Visuo-Tactile Projector,” YouTube Video, published on May 14, 2013, retrieved from the internet: <http://www.youtube.com/watch?v=Bb0hNMxxewg> on Mar. 12, 2015, 2 pages.
Shlens, J., “A Tutorial on Principal Component Analysis,” Derivation, Discussion and Singular Value Decomposition, Version 1, Mar. 25, 2013, UCSD.edu, pp. 1-16.
Solanki, Utpal V. and Nilesh H. Desai. “Hand Gesture Based Remote Control for Home Appliances: Handmote,” 2011 World Congress on Information and Communication Technologies, Mumbai, (2011), p. 419-423.
Texas Instruments, “4-Channel, 12-Bit, 80-MSPS ADC,” VSP5324, Revised Nov. 2012, Texas Instruments Incorporated, 55 pages.
Texas Instruments, “QVGA 3D Time-of-Flight Sensor,” Product Overview: OPT 8140, Dec. 2013, Texas Instruments Incorporated, 10 pages.
Texas Instruments, “Time-of-Flight Controller (TFC),” Product Overview; OPT9220, Jan. 2014, Texas Instruments Incorporated, 43 pages.
VCNL4020 Vishay Semiconductors. Datasheet [online]. Vishay Intertechnology, Inc, Doc No. 83476, Rev. 1.3, Oct. 29, 2013 [retrieved Mar. 4, 2014]. Retrieved from the Internet: <www.vishay.com>. 16 pages.
Wang et al., “Tracking of Deformable Hand in Real Time as Continuous Input for Gesture-based Interaction”, Jan. 28, 2007 [retrieved Jul. 15, 2016], Proceedings of the 12th International Conference on Intelligent User Interfaces, pp. 235-242. Retrieved from the Internet: <http://dl.acm.org/citation.cfm?id=1216338>.
Wikipedia, “Affine Transformation,” Wikipedia-the Free Encyclopedia, last modified Mar. 5, 2015, retrieved from the Internet: <http://en.wikipedia.org/w/index.php?title=Affine_transformation&oldid=650023248> on Mar. 12, 2015, 8 pages.
Wikipedia, “Axis-angle Representation,” Wikipedia—the Free Encyclopedia, last modified Dec. 30, 2014, retrieved from the internet: <http://en.wikipedia.org/w/index.php?title=Axis-angle_representation&oldid=640273193> on Mar. 12, 2015, 5 pages.
Wikipedia, “Euclidean Group,” Wikipedia—the Free Encyclopedia, last modified Feb. 24, 2015, retrieved from the Internet: <http://en.wikipedia.org/w/index.php?title=Euclidean_group&oldid=648705193> on Mar. 12, 2015, 7 pages.
Wikipedia, “Multilateration,” Wikipedia—the Free Encyclopedia, Nov. 16, 2012, retrieved from the internet: <http://en.wikipedia.org/w/index.php?title=Multilateration&oldid=523281858> on Mar. 12, 2015, 10 pages.
Wikipedia, “Rotation Group SO(3),” Wikipedia—the Free Encyclopedia, last modified Feb. 20, 2015, retrieved from the internet: <http://en.wikipedia.org/w/index.php?title=Rotation_group_SO(3)&oldid=648012313> on Mar. 13, 2015, 17 pages.
Wikipedia, “Rotation Matrix,” Wikipedia-the Free Encyclopedia, last modified Mar. 11, 2015, retrieved from the Internet: <http://en.wikipedia.org/w/index.php?title=Rotation_matrix&oldid=650875954> on Mar. 12, 2015, 21 pages.
Wikipedia, “Transformation Matrix,” Wikipedia—the Free Encyclopedia, last modified Mar. 5, 2015, retrieved from the Internet: <http://en.wikipedia.org/w/index.php?title=Transformation_matrix&oldid=649936175> on Mar. 12, 2015, 8 pages.
Wong, Kie Yih Edward, et al., “Palmprint Identification Using Sobel Operator,” 10th International Conference on Control, automation, Robotics and Vision, Dec. 17-20, 2008, 4 pages.
Zhao et al., “Combining Marker-Based Mocap and RGB-D Camera for Acquiring High-Fidelity Hand Motion Data”, Jul. 29, 2012 [retrieved Jul. 15, 2016], Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp. 33-42, Retrieved from the Internet: <http://dl.acm.org/citation.cfm?id=2422363>.
Berman, S. and Stern, H., “Sensors for Gesture Recognition Systems,” in IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 42, No. 3, May 2012, pp. 277-290.
PCT/US2015/030040—International Search Report dated Aug. 13, 2015, 3 pages.
PCT/US2015/030040—Written Opinion of the International Searching Authority dated Aug. 13, 2015, 7 pages.
CN 2014204535364—Office Action dated Nov. 5, 2014, 1 page.
CN 2014204535364—Response to Nov. 5, Office Action filed Feb. 26, 2015, 8 pages.
CN 2014204535364—Notice of Allowance dated Apr. 17, 2015, 3 pages.
Rolland, et al., “A comparison of optical and video see-through head-mounted displays,” 1994, 15 pages.
Wikipedia, Euclidean Group, Wikimedia Foundation Inc., Jul. 15, 2021, 7 pages, originally accessed Nov. 4, 2013, <https://en.wikipedia.org/wiki/Euclidean_group>.
Wikipedia, Affine Transformation, Wikimedia Foundation Inc., Nov. 16, 2021, 8 pages, original access Nov. 25, 2013, <https://en.wikipedia.org/wiki/Affine_transformation>.
Wikipedia, Rotation Matrix, Wikimedia Foundation Inc., Jan. 3, 2022, 26 pages, original access Jan. 30, 2014, <https://en.wikipedia.org/wiki/Rotation_matrix>.
Wikipedia, Rotation Group SO(3), Wikimedia Foundation Inc., Oct. 19, 2018, 15 pages, accessed Jan. 21, 2014, <https://en.wikipedia.org/wiki/rotation_group_SO(3)>.
Wikipedia, Transformation Matrix, Wikimedia Foundation Inc., Dec. 29, 2021, 10 pages, original access Jan. 28, 2014, <https://en.wikipedia.org/wiki/Transformation_matrix>.
Wikipedia, Axis-angle Representation, Wikimedia Foundation Inc., Nov. 29, 2021, 6 pages, original access Jan. 25, 2014, <https://en.wikipedia.org/wiki/Axis%E2%80%93angle_representation>.
Devasena et al., “Controlling of Electronic Equipment Using Gesture Recognition”, International Journal of Engineering and Advanced Technology (IJEAT), vol. 3, Iss. 2, Dec. 2013, 4 pages.
Ko et al., “Gesture Recognition: Enabling Natural Interactions with Electronics”, Texas Instruments, https://www.edge-ai-vision.com/2012/04/gesture-recognition-enabling-natural-interactions-with-electronics/ (Apr. 13, 2012) (Year: 2012), in 17 pages.
Related Publications (1)
Number Date Country
20240045509 A1 Feb 2024 US
Provisional Applications (1)
Number Date Country
62104675 Jan 2015 US
Continuations (3)
Number Date Country
Parent 17345919 Jun 2021 US
Child 18374588 US
Parent 16877231 May 2020 US
Child 17345919 US
Parent 14997454 Jan 2016 US
Child 16877231 US