Tracking systems obtain data regarding the location and movement of a human or other subject in a physical space, and can use the data as an input to an application in a computing system. Some systems determine a skeletal model of a body, including joints of the skeleton, and can therefore be considered to be body joint tracking systems. Many applications are possible, such as for military, entertainment, sports and medical purposes. For instance, the motion of humans can be used to create an animated character or avatar. Optical systems, including those using visible and invisible, e.g., infrared, light, use cameras to detect the presence of a human in a field of view. However, there is a need to facilitate the development of a body joint tracking system by providing training data in the form of synthesized images.
A processor-implemented method, system and tangible computer readable storage are provided for generating proxy training data for human body tracking in a body joint tracking system.
In the development of a body joint tracking system, a depth camera is used to obtain a depth image of a person moving in a field of view of the camera. Various processing techniques are used to detect the person's body and recognize movements or poses which are performed by the person. This process can be considered to be a supervised machine learning algorithm. The process is supervised because the location and poses of the person are known. The goal is to have the body joint tracking system learn how to recognize the location and poses of the person. Various adjustments can be made to the learning algorithm, e.g., to filter out noise, to recognize different body types, and to distinguish the person's body from other objects which may be present in the field of view, such as furniture, walls and so forth. However, training the learning algorithm using a live person in a real world environment is inefficient and does not accurately represent a range of scenarios which a body joint tracking system will experience when it is deployed as a commercial product in thousands or even millions of users' homes.
To optimize the training of the learning algorithm, synthetic images can be generated as a substitute or proxy for images of a real person. The synthetic images can be used to augment or replace images of a real person. Further, the synthetic images can be provided in a way that is computationally efficient, while being realistic and providing a high degree of variability to simulate real-world conditions which a body joint tracking system will experience when it is deployed.
In one embodiment, a processor-implemented method for generating proxy training data for human body tracking is provided. The method includes a number of processor-implemented steps. The method includes accessing at least one motion capture sequence which identifies poses of an actor's body during a time period in which the actor performs a movement. For example, the sequences can be obtained in a motion capture studio by imaging an actor wearing a motion capture suit with markers, as the actor performs a series of prescribed movements. The method further includes performing retargeting to a number of different body types, and dissimilar pose selection, based on the at least one motion capture sequence, to provide a number of retargeted, dissimilar poses. The method further includes rendering each of the dissimilar poses according to a 3-D body model for a respective body type, to provide a respective depth image of the dissimilar pose, and to provide a respective classification image which identifies body parts of the dissimilar pose. A number of different 3-D body models are used, one for each body type. Further, the respective depth image and the respective classification image comprise pixel data which is usable by a machine learning algorithm for human body tracking.
In one approach, retargeting is performed before dissimilar pose selection, and in another approach, retargeting is performed after dissimilar pose selection. Optionally, noise is added to the depth images to provide a more realistic depth image which is similar to a depth image which will be seen by a depth camera in a real world environment. The noise can include noise which is caused by the presence of hair on a person, depth quantization noise, random noise, noise caused by edges of a person's body, noise caused by detection of very thin structures and noise caused by the camera geometry.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In the drawings, like-numbered elements correspond to one another.
a provides further details of performing retargeting and dissimilar pose detection (step 502 of
b depicts an algorithm for dissimilar pose selection.
c provides further details of performing retargeting and dissimilar pose detection (step 502 of
a depicts an alternative view of the process of
b depicts an alternative view of the process of
a depicts an example view of a first pose of an actor with markers in a motion capture studio (step 500 in
b depicts an example view of a second pose of an actor with markers in a motion capture studio (step 500 in
a depicts a rendering of a depth image of a 3-D body, of a first body type, with an overlay of the corresponding skeleton of
b depicts a rendering of a depth image of a 3-D body 1360, of a second body type, with an overlay of a corresponding skeleton.
a depicts an example depth image.
b depicts an example classification image corresponding to the depth image of
Techniques are provided for generating synthesized images for use by a machine learning algorithm of a body joint tracking system. A limited number of motion capture sequences are obtained from a motion capture studio. The motion capture sequences include poses or movements performed by an actor. These sequences are leveraged to provide an increased degree of variability by retargeting the sequences to a number of different body types. Efficiency is achieved by selecting dissimilar poses so that redundant poses or near redundant poses are not provided to the machine learning algorithm. Moreover, greater realism is achieved by adding a variety of types of noise which are expected to be seen in a real world deployment of the body joint tracking system. Other random variations can be introduced as well. For example, a degree of randomness can be added to the retargeting. The data provided to the learning algorithm includes labeled training data in the form of registered pairs of depth and classification images, along with pose data.
The techniques provided herein avoid providing an overwhelming amount of data to the training algorithm, while still covering a large range of poses and body types, including, e.g., independent movement of the upper and lower body. A single system can be provided which can handle a large range of poses and body types.
Features include sample selection based on distances between poses, generation of new samples by combining partial skeletons, generation of synthetic backgrounds by inserting 3-D models and generation of synthetic noisy images by perturbing the depth map.
Generally, the body joint tracking system 10 is used to recognize, analyze, and/or track a human target. The computing environment 12 can include a computer, a gaming system or console, or the like, as well as hardware components and/or software components to execute applications.
The depth camera system 20 may include a camera which is used to visually monitor one or more people, such as the user 8, such that gestures and/or movements performed by the user may be captured, analyzed, and tracked to perform one or more controls or actions within an application, such as animating an avatar or on-screen character or selecting a menu item in a user interface (UI).
The body joint tracking system 10 may be connected to an audiovisual device such as the display 196, e.g., a television, a monitor, a high-definition television (HDTV), or the like, or even a projection on a wall or other surface, that provides a visual and audio output to the user. An audio output can also be provided via a separate device. To drive the display, the computing environment 12 may include a video adapter such as a graphics card and/or an audio adapter such as a sound card that provides audiovisual signals associated with an application. The display 196 may be connected to the computing environment 12 via, for example, an S-Video cable, a coaxial cable, an HDMI cable, a DVI cable, a VGA cable, or the like.
The user 8 may be tracked using the depth camera system 20 such that the gestures and/or movements of the user are captured and used to animate an avatar or on-screen character and/or interpreted as input controls to the application being executed by the computing environment 12.
Some movements of the user 8 may be interpreted as controls that may correspond to actions other than controlling an avatar. For example, in one embodiment, the player may use movements to end, pause, or save a game, select a level, view high scores, communicate with a friend, and so forth. The player may use movements to select the game or other application from a main user interface, or to otherwise navigate a menu of options. Thus, a full range of motion of the user 8 may be available, used, and analyzed in any suitable manner to interact with an application.
The person can hold an object such as a prop when interacting with an application. In such embodiments, the movement of the person and the object may be used to control an application. For example, the motion of a player holding a racket may be tracked and used for controlling an on-screen racket in an application which simulates a tennis game. In another example embodiment, the motion of a player holding a toy weapon such as a plastic sword may be tracked and used for controlling a corresponding weapon in the virtual world of an application which provides a pirate ship.
The body joint tracking system 10 may further be used to interpret target movements as operating system and/or application controls that are outside the realm of games and other applications which are meant for entertainment and leisure. For example, virtually any controllable aspect of an operating system and/or application may be controlled by movements of the user 8.
The depth camera system 20 may include an image camera component 22, such as a depth camera that captures the depth image of a scene in a physical space. The depth image may include a two-dimensional (2-D) pixel area of the captured scene, where each pixel in the 2-D pixel area has an associated depth value which represents a linear distance from the image camera component 22.
The image camera component 22 may include an infrared (IR) light emitter 24, an infrared camera 26, and a red-green-blue (RGB) camera 28 that may be used to capture the depth image of a scene. A 3-D camera is formed by the combination of the infrared emitter 24 and the infrared camera 26. For example, in time-of-flight analysis, the IR light emitter 24 emits infrared light onto the physical space and the infrared camera 26 detects the backscattered light from the surface of one or more targets and objects in the physical space. In some embodiments, pulsed infrared light may be used such that the time between an outgoing light pulse and a corresponding incoming light pulse is measured and used to determine a physical distance from the depth camera system 20 to a particular location on the targets or objects in the physical space. The phase of the outgoing light wave may be compared to the phase of the incoming light wave to determine a phase shift. The phase shift may then be used to determine a physical distance from the depth camera system to a particular location on the targets or objects.
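The two time-of-flight relationships described above can be illustrated with a minimal sketch. The function names and the modulation frequency parameter are assumptions introduced for illustration; the description does not specify a modulation frequency or any particular implementation.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def distance_from_pulse(round_trip_seconds: float) -> float:
    """Pulsed measurement: the light travels to the target and back,
    so the one-way distance is half the round-trip path."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0


def distance_from_phase(phase_shift_rad: float, modulation_hz: float) -> float:
    """Phase measurement for light modulated at modulation_hz (an assumed
    parameter). A shift of 2*pi corresponds to one full modulation period
    of round-trip travel, so the result is unambiguous only up to
    c / (2 * modulation_hz)."""
    return SPEED_OF_LIGHT * phase_shift_rad / (4.0 * math.pi * modulation_hz)
```

Under these assumptions, a round trip of 20 nanoseconds corresponds to a target roughly 3 meters from the camera.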
A time-of-flight analysis may also be used to indirectly determine a physical distance from the depth camera system 20 to a particular location on the targets or objects by analyzing the intensity of the reflected beam of light over time via various techniques including, for example, shuttered light pulse imaging.
In another example embodiment, the depth camera system 20 may use a structured light to capture depth information. In such an analysis, patterned light (i.e., light displayed as a known pattern such as grid pattern or a stripe pattern) may be projected onto the scene via, for example, the IR light emitter 24. Upon striking the surface of one or more targets or objects in the scene, the pattern may become deformed in response. Such a deformation of the pattern may be captured by, for example, the infrared camera 26 and/or the RGB camera 28 and may then be analyzed to determine a physical distance from the depth camera system to a particular location on the targets or objects.
The depth camera system 20 may include two or more physically separated cameras that may view a scene from different angles to obtain visual stereo data that may be resolved to generate depth information.
The depth camera system 20 may further include a microphone 30 which includes, e.g., a transducer or sensor that receives and converts sound waves into an electrical signal. Additionally, the microphone 30 may be used to receive audio signals such as sounds that are provided by a person to control an application that is run by the computing environment 12. The audio signals can include vocal sounds of the person such as spoken words, whistling, shouts and other utterances as well as non-vocal sounds such as clapping hands or stomping feet.
The depth camera system 20 may include a processor 32 that is in communication with the image camera component 22. The processor 32 may include a standardized processor, a specialized processor, a microprocessor, or the like that may execute instructions including, for example, instructions for receiving a depth image; generating a grid of voxels based on the depth image; removing a background included in the grid of voxels to isolate one or more voxels associated with a human target; determining a location or position of one or more extremities of the isolated human target; adjusting a model based on the location or position of the one or more extremities, or any other suitable instruction, which will be described in more detail below.
The depth camera system 20 may further include a memory component 34 that may store instructions that are executed by the processor 32, as well as storing images or frames of images captured by the 3-D camera or RGB camera, or any other suitable information, images, or the like. According to an example embodiment, the memory component 34 may include random access memory (RAM), read only memory (ROM), cache, flash memory, a hard disk, or any other suitable tangible computer readable storage component. The memory component 34 may be a separate component in communication with the image camera component 22 and the processor 32 via a bus 21. According to another embodiment, the memory component 34 may be integrated into the processor 32 and/or the image camera component 22.
The depth camera system 20 may be in communication with the computing environment 12 via a communication link 36. The communication link 36 may be a wired and/or a wireless connection. According to one embodiment, the computing environment 12 may provide a clock signal to the depth camera system 20 via the communication link 36 that indicates when to capture image data from the physical space which is in the field of view of the depth camera system 20.
Additionally, the depth camera system 20 may provide the depth information and images captured by, for example, the 3-D camera 26 and/or the RGB camera 28, and/or a skeletal model that may be generated by the depth camera system 20 to the computing environment 12 via the communication link 36. The computing environment 12 may then use the model, depth information, and captured images to control an application. For example, as shown in
The data captured by the depth camera system 20 in the form of the skeletal model and movements associated with it may be compared to the gesture filters in the gesture library 190 to identify when a user (as represented by the skeletal model) has performed one or more specific movements. Those movements may be associated with various controls of an application.
The computing environment may also include a processor 192 for executing instructions which are stored in a memory 194 to provide audio-video output signals to the display device 196 and to achieve other functionality as described herein.
A graphics processing unit (GPU) 108 and a video encoder/video codec (coder/decoder) 114 form a video processing pipeline for high speed and high resolution graphics processing. Data is carried from the graphics processing unit 108 to the video encoder/video codec 114 via a bus. The video processing pipeline outputs data to an A/V (audio/video) port 140 for transmission to a television or other display. A memory controller 110 is connected to the GPU 108 to facilitate processor access to various types of memory 112, such as RAM (Random Access Memory).
The multimedia console 100 includes an I/O controller 120, a system management controller 122, an audio processing unit 123, a network interface 124, a first USB host controller 126, a second USB controller 128 and a front panel I/O subassembly 130 that are preferably implemented on a module 118. The USB controllers 126 and 128 serve as hosts for peripheral controllers 142(1)-142(2), a wireless adapter 148, and an external memory device 146 (e.g., flash memory, external CD/DVD ROM drive, removable media, etc.). The network interface (NW IF) 124 and/or wireless adapter 148 provide access to a network (e.g., the Internet, home network, etc.) and may be any of a wide variety of wired or wireless adapter components including an Ethernet card, a modem, a Bluetooth module, a cable modem, and the like.
System memory 143 is provided to store application data that is loaded during the boot process. A media drive 144 is provided and may comprise a DVD/CD drive, hard drive, or other removable media drive. The media drive 144 may be internal or external to the multimedia console 100. Application data may be accessed via the media drive 144 for execution, playback, etc. by the multimedia console 100. The media drive 144 is connected to the I/O controller 120 via a bus, such as a Serial ATA bus or other high speed connection.
The system management controller 122 provides a variety of service functions related to assuring availability of the multimedia console 100. The audio processing unit 123 and an audio codec 132 form a corresponding audio processing pipeline with high fidelity and stereo processing. Audio data is carried between the audio processing unit 123 and the audio codec 132 via a communication link. The audio processing pipeline outputs data to the A/V port 140 for reproduction by an external audio player or device having audio capabilities.
The front panel I/O subassembly 130 supports the functionality of the power button 150 and the eject button 152, as well as any LEDs (light emitting diodes) or other indicators exposed on the outer surface of the multimedia console 100. A system power supply module 136 provides power to the components of the multimedia console 100. A fan 138 cools the circuitry within the multimedia console 100.
The CPU 101, GPU 108, memory controller 110, and various other components within the multimedia console 100 are interconnected via one or more buses, including serial and parallel buses, a memory bus, a peripheral bus, and a processor or local bus using any of a variety of bus architectures.
When the multimedia console 100 is powered on, application data may be loaded from the system memory 143 into memory 112 and/or caches 102, 104 and executed on the CPU 101. The application may present a graphical user interface that provides a consistent user experience when navigating to different media types available on the multimedia console 100. In operation, applications and/or other media contained within the media drive 144 may be launched or played from the media drive 144 to provide additional functionalities to the multimedia console 100.
The multimedia console 100 may be operated as a standalone system by simply connecting the system to a television or other display. In this standalone mode, the multimedia console 100 allows one or more users to interact with the system, watch movies, or listen to music. However, with the integration of broadband connectivity made available through the network interface 124 or the wireless adapter 148, the multimedia console 100 may further be operated as a participant in a larger network community.
When the multimedia console 100 is powered on, a specified amount of hardware resources is reserved for system use by the multimedia console operating system. These resources may include a reservation of memory (e.g., 16 MB), CPU and GPU cycles (e.g., 5%), networking bandwidth (e.g., 8 kbps), etc. Because these resources are reserved at system boot time, the reserved resources do not exist from the application's view.
In particular, the memory reservation preferably is large enough to contain the launch kernel, concurrent system applications and drivers. The CPU reservation is preferably constant such that if the reserved CPU usage is not used by the system applications, an idle thread will consume any unused cycles.
With regard to the GPU reservation, lightweight messages generated by the system applications (e.g., popups) are displayed by using a GPU interrupt to schedule code to render a popup into an overlay. The amount of memory required for an overlay depends on the overlay area size and the overlay preferably scales with screen resolution. Where a full user interface is used by the concurrent system application, it is preferable to use a resolution independent of application resolution. A scaler may be used to set this resolution such that the need to change frequency and cause a TV resynch is eliminated.
After the multimedia console 100 boots and system resources are reserved, concurrent system applications execute to provide system functionalities. The system functionalities are encapsulated in a set of system applications that execute within the reserved system resources described above. The operating system kernel identifies threads that are system application threads versus gaming application threads. The system applications are preferably scheduled to run on the CPU 101 at predetermined times and intervals in order to provide a consistent system resource view to the application. The scheduling is to minimize cache disruption for the gaming application running on the console.
When a concurrent system application requires audio, audio processing is scheduled asynchronously to the gaming application due to time sensitivity. A multimedia console application manager (described below) controls the gaming application audio level (e.g., mute, attenuate) when system applications are active.
Input devices (e.g., controllers 142(1) and 142(2)) are shared by gaming applications and system applications. The input devices are not reserved resources, but are to be switched between system applications and the gaming application such that each will have a focus of the device. The application manager preferably controls the switching of the input stream without the gaming application's knowledge, and a driver maintains state information regarding focus switches. The console 100 may receive additional inputs from the depth camera system 20 of
In a body joint tracking system, the computing environment can be used to interpret one or more gestures or other movements and, in response, update a visual space on a display. The computing environment 220 comprises a computer 241, which typically includes a variety of tangible computer readable storage media. This can be any available media that can be accessed by computer 241 and includes both volatile and nonvolatile media, removable and non-removable media. The system memory 222 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 223 and random access memory (RAM) 260. A basic input/output system 224 (BIOS), containing the basic routines that help to transfer information between elements within computer 241, such as during start-up, is typically stored in ROM 223. RAM 260 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 259. A graphics interface 231 communicates with a GPU 229. By way of example, and not limitation,
The computer 241 may also include other removable/non-removable, volatile/nonvolatile computer storage media, e.g., a hard disk drive 238 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 239 that reads from or writes to a removable, nonvolatile magnetic disk 254, and an optical disk drive 240 that reads from or writes to a removable, nonvolatile optical disk 253 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile tangible computer readable storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 238 is typically connected to the system bus 221 through a non-removable memory interface such as interface 234, and magnetic disk drive 239 and optical disk drive 240 are typically connected to the system bus 221 by a removable memory interface, such as interface 235.
The drives and their associated computer storage media discussed above and depicted in
The computer 241 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 246. The remote computer 246 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 241, although only a memory storage device 247 has been depicted in
When used in a LAN networking environment, the computer 241 is connected to the LAN 245 through a network interface or adapter 237. When used in a WAN networking environment, the computer 241 typically includes a modem 250 or other means for establishing communications over the WAN 249, such as the Internet. The modem 250, which may be internal or external, may be connected to the system bus 221 via the user input interface 236, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 241, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation,
The computing environment can include tangible computer readable storage having computer readable software embodied thereon for programming at least one processor to perform a method for generating proxy training data for human body tracking as described herein. The tangible computer readable storage can include, e.g., one or more of components 222, 234, 235, 230, 253 and 254. Further, one or more processors of the computing environment can provide a processor-implemented method for generating proxy training data for human body tracking, comprising processor-implemented steps as described herein. A processor can include, e.g., one or more of components 229 and 259.
Step 502 includes performing retargeting and dissimilar pose detection. In one approach, retargeting is performed before dissimilar pose selection, and in another approach, retargeting is performed after dissimilar pose selection. Retargeting translates the marker positions which were obtained from the actor in the motion capture studio to skeletal models of different body types. A skeletal model of a given body type can be obtained by determining the locations of joints in the skeletal model based on the locations of the markers. For example, when one or more markers are positioned on the actor in a known location relative to the shoulder, the location of a joint which represents the shoulder can be determined from the marker positions.
A skeletal model or skeleton is a virtual configuration of 3-D joints or other points of the body connected by limbs or bones, such that the configuration of the skeleton may be represented by listing the positions of the 3-D points, or alternatively by enumerating the joint angles which relate individual bones to another bone in the skeleton. This relative positioning may relate each bone to its parent in a tree-structured decomposition of the skeleton. Additionally, shape parameters can be specified with the joint angles, for example specifying the bone lengths.
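The relationship between the two representations, i.e., joint angles plus bone lengths on the one hand and joint positions on the other, can be illustrated with a minimal sketch. For brevity the sketch is planar (2-D) rather than 3-D, and the `Bone` structure and `joint_positions` function are hypothetical names introduced here, not part of the described system.

```python
import math
from dataclasses import dataclass


@dataclass
class Bone:
    name: str
    parent: int    # index of the parent bone in the list, -1 for the root
    length: float  # shape parameter (bone length)
    angle: float   # joint angle relative to the parent bone, in radians


def joint_positions(bones, root=(0.0, 0.0)):
    """Convert joint angles and bone lengths into end-point positions.

    Bones must be listed so that each parent precedes its children,
    which mirrors a tree-structured decomposition of the skeleton.
    """
    positions, world_angles = [], []
    for bone in bones:
        if bone.parent < 0:
            start, base = root, 0.0
        else:
            start, base = positions[bone.parent], world_angles[bone.parent]
        theta = base + bone.angle  # accumulate rotation down the tree
        positions.append((start[0] + bone.length * math.cos(theta),
                          start[1] + bone.length * math.sin(theta)))
        world_angles.append(theta)
    return positions
```

Because each angle is stored relative to the parent bone, changing a bone length alters the resulting positions without altering the pose description itself.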
In the real world use of a body joint tracking system, the imaged users will have many different body types, including variations based on height, width, weight, posture, age, gender, hair type and amount of hair, clothing and so forth. Thus, using only the body type of the actor, or some other standard body type, to provide training data for a learning algorithm of a body joint tracking system would not provide sufficient variability. Retargeting to different body types provides increased variability without the need to obtain motion capture data from many different actors of different body types in a motion capture studio, thus saving costs and time.
Retargeting of motion capture data can involve representing the 3-D data from a motion capture sequence as the parameters of a predefined skeleton of a body type, particularly in the sense of translating the 3-D marker positions for each frame of the motion capture sequence, into a sequence of joint angles (one set of joint angles per frame) and shape parameters (one set of shape parameters per sequence). Sequences captured from the same actor will generally share shape parameters. The 3-D marker positions for each frame of the motion capture sequence can be provided as a sequence of coordinates, such as (x,y,z) coordinates. Similarly, the joint positions of the skeleton can be provided as another sequence of (x,y,z) coordinates. Retargeting to different skeletons and body types can be performed. As an example, 10-15 different body types can be used. The retargeting can also introduce further variations in the body types, such as slightly varied bone or limb lengths, to increase the degree of variability. Generally, a goal is to provide the highest amount of variability among the body poses, within boundaries that are based on a range of real-life human variability.
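As a minimal sketch of the second half of this process, the following assumes that the joint angles have already been recovered from the marker positions (that fitting step is not shown) and pairs each frame of joint angles with the shape parameters of several body types. The function names, the uniform scaling per body type and the 5% jitter are all illustrative assumptions.

```python
import random


def make_body_types(base_lengths, scales, jitter=0.05, seed=0):
    """Return one dict of bone lengths (shape parameters) per body type.

    Each length is the base length times the body-type scale, times a
    random factor in [1 - jitter, 1 + jitter] that slightly varies the
    bone lengths while keeping them within a bounded range.
    """
    rng = random.Random(seed)
    return [
        {bone: length * scale * rng.uniform(1.0 - jitter, 1.0 + jitter)
         for bone, length in base_lengths.items()}
        for scale in scales
    ]


def retarget(joint_angle_frames, body_types):
    """Pair every frame of joint angles with every body type."""
    return [
        {"body_type": index, "shape": shape, "angles": angles}
        for index, shape in enumerate(body_types)
        for angles in joint_angle_frames
    ]
```

With two frames and three body types, for instance, `retarget` yields six retargeted poses, one set of joint angles per frame applied to one set of shape parameters per body type.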
Dissimilar pose selection analyzes the set of all poses which are obtained from each retargeted motion capture sequence. With a frame rate of 30 frames per second, a motion capture sequence length of, e.g., 1-2 minutes, and retargeting to 15 different body types for each frame, it can be seen that the number of frames/poses can become voluminous. To improve efficiency, and avoid providing an excessive amount of data with high redundancy to the learning algorithm, a dissimilar pose selection process can be run using each of the frames, to obtain a specified, reduced number of dissimilar poses. The dissimilar pose selection process identifies frames which are a specified distance apart, according to a distance metric. Step 502 is discussed further below in connection with
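With the example figures given above, a single sequence yields 30 × 60 × 15 = 27,000 to 30 × 120 × 15 = 54,000 retargeted poses. One possible way to reduce such a set is a greedy pass which keeps a pose only if it is farther than a threshold from every pose already kept. The sketch below is an illustration under assumptions: the distance metric (largest Euclidean distance between corresponding joints) and the function names are chosen here for illustration.

```python
import math


def pose_distance(pose_a, pose_b):
    """Largest Euclidean distance between corresponding joints.

    Each pose is a sequence of (x, y, z) joint positions in the same order.
    """
    return max(math.dist(a, b) for a, b in zip(pose_a, pose_b))


def select_dissimilar(poses, threshold, max_poses=None):
    """Greedily keep poses that are more than `threshold` from all kept poses."""
    selected = []
    for pose in poses:
        if all(pose_distance(pose, kept) > threshold for kept in selected):
            selected.append(pose)
            if max_poses is not None and len(selected) >= max_poses:
                break  # the specified, reduced number has been reached
    return selected
```

A larger threshold yields fewer, more widely separated poses, so the threshold can be tuned until the desired number of poses remains.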
Step 504 includes performing rendering to provide depth images and classification images. Rendering refers to generating a synthetic image in pixel space. A depth image can be rendered from the perspective of a virtual camera which is in a specified position relative to a body which is represented by the depth image. Other factors such as a field of view of the virtual camera can also be specified in rendering the depth image. Essentially, the depth image simulates what a real depth camera will see in a real environment, by simulating a 3-D body and, optionally, scene elements such as a floor, walls, ceiling, furniture and other household objects, in a field of view. The depth image can have a pixel resolution similar to that of the real depth camera. Further, in the depth image, each pixel can identify a distance from the virtual camera to the 3-D body, a distance from the virtual camera to a 3-D scene object, or a background space, which is a pixel that represents neither a 3-D body nor a 3-D scene object.
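The idea of rendering from a virtual camera can be illustrated with a minimal sketch which projects 3-D points through a pinhole camera model and keeps the nearest depth per pixel. An actual renderer would rasterize the surfaces of the 3-D body model; the point-based approach, the function name and the use of infinity as a background marker are simplifying assumptions.

```python
import math

BACKGROUND = float("inf")  # pixel that sees neither the body nor a scene object


def render_depth(points, width, height, fov_deg):
    """Project (x, y, z) camera-space points into a depth image.

    z points away from the virtual camera and fov_deg is the horizontal
    field of view. The nearest point wins each pixel (a simple z-buffer).
    """
    focal = (width / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    depth = [[BACKGROUND] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            continue  # behind the virtual camera
        col = math.floor(width / 2.0 + focal * x / z)
        row = math.floor(height / 2.0 - focal * y / z)
        if 0 <= row < height and 0 <= col < width:
            depth[row][col] = min(depth[row][col], z)
    return depth
```

Pixels which no point projects onto keep the `BACKGROUND` value, corresponding to the background space mentioned above.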
A classification image or map identifies and labels the different body parts of the 3-D body, or the different 3-D scene elements. For example, each pixel can identify a number of the body part which is closest to the virtual camera, or a unique index of a 3-D scene object. In the learning algorithm for the body joint tracking system, the depth image is processed using settings such as filter settings and a corresponding classification map is generated in which the learning algorithm attempts to identify the body parts and scene elements. The classification map generated by the learning algorithm can be compared to the classification map provided with the depth image to determine how accurate the learning algorithm is. The classification map provided with the depth image essentially is a reference map which provides the correct answer, and the learning algorithm can repeatedly adjust its settings, e.g., train itself, when processing a depth image until it can duplicate the correct result as accurately as possible. Further, the processing of depth images and comparison to the associated classification map can be repeated for the numerous dissimilar frames which the learning algorithm receives as an input data set.
Once the learning algorithm has been optimized, the corresponding settings are recorded and the learning algorithm can be shipped with the body joint tracking system for use by the end user.
The rendering can also provide a text file with each depth image and classification map which describes the pose, such as in terms of joint coordinates of the skeletal model which was used to provide the 3-D body pose. Other data such as settings which were used in the motion capture studio can also be provided. Step 504 is discussed further below in connection with
Step 506 includes adding noise to some or all of the depth images. The amount and type of noise, and the selection of depth frames to add noise, can be randomized, in one approach. Step 506 is discussed further below in connection with
Step 600 includes, at a motion capture studio, capturing a sequence of frames as an actor performs movements over time. In one approach, a large variety of movements are performed which are expected to represent the user movements likely to be encountered when the user is engaged with different applications of a body joint tracking system. In another approach, the movements are specific to a particular application, such as a game. For instance, an interactive tennis game may have prescribed movements such as swinging a racket. Optionally, the actor can hold a prop in his or her hands during a movement.
Each sequence comprises successive frames, and each frame identifies the positions of markers on the actor's body. Each sequence can be performed based on a script. For instance, one script can specify specific arm and leg movements. One or more sequences can be obtained.
At step 602, for each frame in the sequence, a data set of 3-D coordinates of markers is provided. As mentioned, the exact location of each marker can be determined by triangulation using different cameras in the motion capture studio. Step 604 outputs one or more data sets of motion capture sequences.
In an example implementation, N motion capture sequences are captured, denoted $\{S_i \mid i = 1 \ldots N\}$. Each sequence comprises frames $S_i = \{F_{it} \mid t = 1 \ldots N_i\}$. Thus, $S_i$ represents a sequence or set of frames, and the object that it contains is $F_{it}$. Each $F_{it}$ is a vector of 3-D point positions or a vector of joint angles, and represents a frame (F) for sequence $S_i$ at time t. Each frame is represented by a set of M marker positions, so $F_{it}$ is represented by an M×3 matrix, with each row encoding the 3-D position of a marker. Note that N and each of the $N_i$ denote distinct variables. $N_i$ is the number of frames in sequence $S_i$.
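As an illustration only, the layout described above can be sketched in Python with plain nested lists. The type aliases and the helper names `check_frame` and `frame_counts` are hypothetical and do not come from the described system.

```python
from typing import List, Tuple

Marker = Tuple[float, float, float]  # one (x, y, z) marker position
Frame = List[Marker]                 # F_it: M rows, one per marker (an M x 3 matrix)
MocapSequence = List[Frame]          # S_i: N_i frames in time order


def check_frame(frame: Frame, num_markers: int) -> None:
    """Raise ValueError unless the frame is an M x 3 matrix."""
    if len(frame) != num_markers or any(len(row) != 3 for row in frame):
        raise ValueError("frame must have M rows of (x, y, z)")


def frame_counts(sequences: List[MocapSequence]) -> List[int]:
    """Return [N_1, ..., N_N], the number of frames in each sequence."""
    return [len(seq) for seq in sequences]


# Two toy sequences with M = 2 markers, so N = 2, N_1 = 3 and N_2 = 1.
M = 2
s1: MocapSequence = [[(0.0, 1.0, 2.0), (0.1, 1.5, 2.0)] for _ in range(3)]
s2: MocapSequence = [[(0.0, 1.0, 2.5), (0.1, 1.5, 2.5)]]
sequences = [s1, s2]
for seq in sequences:
    for frame in seq:
        check_frame(frame, M)
```

The toy data makes the distinction between N (the number of sequences) and each N_i (the number of frames in sequence i) concrete.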
a provides further details of performing retargeting and dissimilar pose detection (step 502 of
Step 708 translates the 3-D marker positions to joint positions based on the selected body type. The joint positions, which are part of a skeletal model of a body of the specified type, can be obtained based on the locations of the markers. For example, when one or more markers are positioned on the actor in a known location relative to the shoulder, the location of a joint which represents the shoulder can be determined from the marker positions. Further, the location of the joint in the body of the specified type can be determined based on a model of the body and a corresponding skeleton which fits that body model. Random variations, such as based on bone or limb length, can be added during step 708 as well.
In an example implementation, the input skeleton sequence is retargeted to one or more body shapes, numbered 1 . . . B, producing retargeted frames $\{F'_{itk} \mid k = 1 \ldots B\}$. $F'_{itk}$ is a frame in motion capture sequence $S_i$ at time t for body type k. The range of body shapes can be chosen to cover a large proportion of potential users of the system, and could include variations in: gender (male, female), age (adult, child), body type (specified weights such as 100 pounds, 150 pounds and 200 pounds; or fat, thin or average build), height (e.g., 5 feet, 5.5 feet, 6 feet), head hair type (mustache, beard, long/short hair), clothing (baggy, tight, skirt), and so forth. For instance, body type 1 may be a male, adult, weighing 150 pounds, 5.5 feet tall, with short hair and baggy clothing, body type 2 may be a male, adult, weighing 200 pounds, 6 feet tall, with long hair and tight clothing, and so forth.
This stage may optionally include adjustment of untracked joints, such as finger angles. The body model used for rendering has many more parameters (joint angles) than the input skeletons, so for most of those untracked joints one does not have information as to where they are. To fill in this information, one can set a default value (e.g., set the finger angles to correspond to an open hand). Alternatively, each rendered image can have these values set randomly, thus generating more variation in the renderings. One knows where those fingers are because one knows where the hand is (since it is a tracked joint) and one can use a kinematic model of the human skeleton and how fingers are related to hands, given finger orientations.
In one instantiation, fifteen base skeletons are used, with random variation in weight and/or bone or limb length. Thus B is effectively very large, but a random subset of all possible $F'_{itk}$ can be considered. See
The joint positions in the skeletal model can be identified by (x,y,z) coordinates, in a first joint location identification scheme. An example skeleton can have about forty joints. The data set can include a matrix for each frame, where each row represents a particular joint in a skeletal model, and there are three columns which represent the (x,y,z) position of the joint in a coordinate system: a first column for x, a second column for y and a third column for z. The joints can be identified, e.g., such that the left shoulder is joint #1, the right shoulder is joint #2 and so forth.
In a second joint location identification scheme, the skeletal model can be defined by specifying a starting point, along with a series of joint angles and shape parameters, such as bone or limb lengths. For instance, a joint can be a specified distance along a vector from a given point. In this case, the data set can include a matrix for each frame, where a first row represents a starting joint, and each additional row represents a neighboring joint in the skeletal model. In this case, there can be four columns. The first three columns represent the angles of a vector from the prior joint to the current joint, e.g., in each of the x-y, y-z, x-z planes, respectively. The fourth matrix column can provide the shape parameter, such as the distance from the prior joint to the current joint, e.g., a bone length. Other joint identification schemes may be used as well.
A translation between the joint location identification schemes can be made. For example, translation from the first to the second joint location identification scheme can involve subtracting two successive joint positions in 3-D space to obtain a vector between them, in terms of angles in each of the x-y, y-z, x-z planes, and a magnitude of the vector as the shape parameter. Translation from the second to the first joint location identification scheme can involve addition of the vector defined by the angles and magnitude between two successive joint positions.
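A rough Python sketch of such a round trip follows. It makes two simplifying assumptions that are not in the description: the joints form a single chain in which each joint's prior joint is the previous row, and each bone direction is stored as two spherical angles (azimuth and elevation) rather than three per-plane angles, because two angles plus a length invert unambiguously. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Bone = Tuple[float, float, float]  # (azimuth, elevation, length)


def positions_to_chain(joints: List[Vec3]) -> Tuple[Vec3, List[Bone]]:
    """First scheme -> second scheme: subtract successive joint positions."""
    bones: List[Bone] = []
    for (x0, y0, z0), (x1, y1, z1) in zip(joints, joints[1:]):
        dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
        length = math.sqrt(dx * dx + dy * dy + dz * dz)  # shape parameter
        azimuth = math.atan2(dz, dx)                     # angle within the x-z plane
        elevation = math.atan2(dy, math.hypot(dx, dz))   # angle above the x-z plane
        bones.append((azimuth, elevation, length))
    return joints[0], bones


def chain_to_positions(start: Vec3, bones: List[Bone]) -> List[Vec3]:
    """Second scheme -> first scheme: add each bone vector to the prior joint."""
    x, y, z = start
    joints = [start]
    for azimuth, elevation, length in bones:
        horizontal = length * math.cos(elevation)
        x += horizontal * math.cos(azimuth)
        y += length * math.sin(elevation)
        z += horizontal * math.sin(azimuth)
        joints.append((x, y, z))
    return joints
```

A full skeleton is a tree rather than a chain, so a practical version would also record which joint is the parent of each row.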
At decision step 712, if there is a next body type, the process at steps 706-710 is repeated for the current frame. If the current frame has been retargeted to all body types, then decision step 712 is false, and decision step 714 determines if there is another frame in the current sequence to process. If decision step 714 is true, then a new frame is selected at step 704 and the process at steps 706-710 is repeated for the new frame. If all frames in the current sequence have been retargeted, then decision step 714 is false and decision step 716 determines if there is another sequence to process. If there is another sequence to process, then a new sequence is selected at step 702 and the process of steps 704-710 is performed. When the last sequence has been processed, decision step 716 is false, and retargeting ends, at step 718. Step 720 outputs a data set of retargeted frames.
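The nested loop structure can be sketched as below. The `retarget` function is a hypothetical stand-in that merely scales a frame about its first joint, with optional random jitter standing in for limb-length variation; the real translation of marker positions to a body-specific skeleton is not reproduced here.

```python
import random
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Frame = List[Vec3]


def retarget(frame: Frame, scale: float, rng: random.Random, jitter: float = 0.0) -> Frame:
    """Hypothetical stand-in: scale the pose about its first joint."""
    s = scale * (1.0 + rng.uniform(-jitter, jitter))
    rx, ry, rz = frame[0]
    return [(rx + s * (x - rx), ry + s * (y - ry), rz + s * (z - rz)) for x, y, z in frame]


def retarget_all(sequences: List[List[Frame]], body_scales: List[float],
                 seed: int = 0, jitter: float = 0.0):
    """Return [((i, t, k), retargeted_frame), ...] for every sequence, frame and body type."""
    rng = random.Random(seed)
    out = []
    for i, seq in enumerate(sequences):              # next sequence (decision step 716)
        for t, frame in enumerate(seq):              # next frame (decision step 714)
            for k, scale in enumerate(body_scales):  # next body type (decision step 712)
                out.append(((i, t, k), retarget(frame, scale, rng, jitter)))
    return out
```

With B body types and N_i frames in sequence i, the output holds B × (N_1 + ... + N_N) retargeted frames, one per (i, t, k) tuple.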
Dissimilar pose selection begins at step 722. In the space of all possible poses, the dissimilar pose selection provides a sparse sampling. As a result, fewer frames are provided to the learning algorithm, so that its computational expense is reduced, but without losing a significant amount of quality. For example, the number of frames may be reduced by an order of magnitude.
In the approach depicted, dissimilar pose detection is performed using the joint locations in the skeletal models used in retargeting rather than positions of the markers from the motion capture sequences. Step 724 selects and removes a pose from the data set of retargeted frames, provided at step 720. Step 726 adds the pose to a new data set of selected dissimilar poses. Step 728 determines a distance between each selected dissimilar pose and all of the remaining poses in the data set of retargeted frames, which are the candidate poses. Step 730 optionally excludes the candidate poses which are not at least a threshold distance T away from any selected dissimilar pose. These candidate poses are deemed to be too similar to the selected dissimilar pose or poses. Step 732 determines which candidate pose has a largest minimum distance to any selected dissimilar pose. Thus, for each candidate pose, one can determine its distance from each of the selected dissimilar poses. One then takes a minimum of those distances for each candidate pose. Then, one determines which of the minimums is largest among all candidate poses. At step 734, the selected candidate pose is added to the data set of selected dissimilar poses and removed from the data set of retargeted frames. If there is a next frame to process from the data set of retargeted frames, at decision step 736, then processing continues at steps 728-734. In one approach, decision step 736 is false when a specified number D of frames have been provided in the data set of retargeted, dissimilar frames. The dissimilar pose detection ends at step 738. Step 740 outputs a data set of selected dissimilar frames. The dissimilar poses are a subset of poses in the retargeted motion capture sequences.
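The greedy "largest minimum distance" loop of steps 724-736 can be sketched in Python as follows. This is an illustrative reading rather than the actual implementation: the function name is hypothetical, the seed pose is assumed to be the first one in the data set (the description does not say which pose is taken first), and the distance function is passed in by the caller.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Pose = TypeVar("Pose")


def select_dissimilar(poses: Sequence[Pose],
                      distance: Callable[[Pose, Pose], float],
                      num_select: int,
                      threshold: Optional[float] = None) -> List[int]:
    """Return indices of up to num_select mutually dissimilar poses."""
    if not poses or num_select <= 0:
        return []
    remaining = list(range(len(poses)))
    selected = [remaining.pop(0)]  # steps 724/726: seed with the first pose (assumption)
    # min_dist[c] = distance from candidate c to its nearest selected pose (step 728)
    min_dist = {c: distance(poses[c], poses[selected[0]]) for c in remaining}
    while remaining and len(selected) < num_select:  # decision step 736
        if threshold is not None:  # optional step 730: drop near-duplicates
            remaining = [c for c in remaining if min_dist[c] >= threshold]
            if not remaining:
                break
        best = max(remaining, key=lambda c: min_dist[c])  # step 732
        remaining.remove(best)                            # step 734
        selected.append(best)
        for c in remaining:  # update nearest-selected distances incrementally
            d = distance(poses[c], poses[best])
            if d < min_dist[c]:
                min_dist[c] = d
    return selected
```

Keeping a running minimum per candidate avoids recomputing every candidate-to-selected distance on each pass, so each iteration costs one distance evaluation per remaining candidate.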
b depicts an algorithm for dissimilar pose selection. Associating each tuple (i, t, k) with an integer r, all retargeted motion capture frames are represented as the set $S' = \{F'_r \mid r = 1 \ldots R\}$, where $R = B \times \sum_i N_i = B \times (N_1 + \ldots + N_N)$. The goal of dissimilar pose selection is to choose a subset (of size D) of these frames, represented by the set of integers $P_I = \{r_1, \ldots, r_D\}$, such that the subset does not include pairs of frames which are similar, as defined by a similarity function φ, which may also be considered to be a distance metric. This function (examples are given below) maps a pair of frames from S′ to a non-negative real number which is low for similar frames, and high otherwise. For clarity, if the frames are represented by M×3 matrices G and H, then φ(G,H) returns the dissimilarity of G and H, with φ=0 for identical poses and larger φ denoting less similar poses. Pseudocode for the algorithm is in
The dissimilar pose detection algorithm uses the pose similarity function φ(G,H). Given matrix G, denote its M rows by $\{g_j\}_{j=1:M}$, and similarly denote the rows of H by $\{h_j\}_{j=1:M}$. A first possible definition of similarity is the maximum joint distance $\varphi(G, H) := \max_{j=1 \ldots M} \lVert g_j - h_j \rVert_2^2$.
The squared Euclidean distance is used. A second possible definition is the sum of squared distances: $\varphi(G, H) := \sum_{j=1}^{M} \lVert g_j - h_j \rVert_2^2$. A range of alternative distance functions can readily be defined. For example, in the dissimilar pose selection, different parts of the body can be weighted differently. For instance, the weights can indicate that it is more important to have dissimilar hands than feet, in which case the joint distances involving the hand are weighted more heavily.
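The distance functions discussed here can be sketched in Python as follows, with each pose given as a list of M (x, y, z) rows. The function names and the per-joint weight vector are illustrative assumptions; the weighted form is just one way to realize the body-part weighting mentioned above.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def squared_distance(g: Vec3, h: Vec3) -> float:
    """Squared Euclidean distance between two joint positions."""
    return sum((a - b) ** 2 for a, b in zip(g, h))


def max_joint_distance(G: List[Vec3], H: List[Vec3]) -> float:
    """Largest squared distance over corresponding joints."""
    return max(squared_distance(g, h) for g, h in zip(G, H))


def sum_squared_distance(G: List[Vec3], H: List[Vec3]) -> float:
    """Sum of squared distances over corresponding joints."""
    return sum(squared_distance(g, h) for g, h in zip(G, H))


def weighted_sum_squared(G: List[Vec3], H: List[Vec3], weights: Sequence[float]) -> float:
    """Sum of squared joint distances, with a per-joint weight (e.g. larger for hands)."""
    return sum(w * squared_distance(g, h) for w, g, h in zip(weights, G, H))
```

All three return 0 for identical poses and grow as the poses diverge, so any of them can serve as the distance function in a selection loop.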
The distance metric and/or the threshold can be adjusted to achieve a desired number of dissimilar poses and a corresponding desired sampling sparseness.
c provides further details of performing retargeting and dissimilar pose detection (step 502 of
Dissimilar pose selection begins at step 750. Step 752 selects and removes a pose from the data set of motion capture sequences, provided at step 606 of
The retargeting begins at step 770. Step 772 selects a frame from the data set of selected dissimilar poses from step 720. Step 704 selects a frame from the current sequence.
Step 774 selects a body type. Step 776 translates the 3-D marker positions to joint positions based on the selected body type, as discussed previously. Random variations, such as to limb length, can be added during step 776 as well. Step 778 adds the retargeted frame to a data set of retargeted, dissimilar frames.
At decision step 780, if there is a next body type, the process at steps 774-778 is repeated for the current frame. If the current frame has been retargeted to all body types, then decision step 780 is false, and decision step 782 determines if there is another frame in the data set of dissimilar frames to process. If decision step 782 is true, then a new frame is selected at step 772 and the process at steps 774-778 is repeated for the new frame. If all frames in the data set of dissimilar frames have been retargeted, then decision step 782 is false and retargeting ends, at step 784. Step 786 outputs a data set of retargeted, dissimilar frames.
Step 800 includes selecting a frame from the data set of retargeted, dissimilar frames.
Step 802 provides a 3-D body model corresponding to the selected frame, or more specifically, corresponding to the skeletal model and its pose in the selected frame. Step 804 applies a texture map to the 3-D body model, where the texture map includes different regions for different body parts, and each region is assigned an index or number. See
Step 806 selects a virtual camera position in the scene. The camera position for each rendered pair of depth image and classification image can be parameterized by the camera height above ground, the camera look direction (pan and tilt), or camera angle, and the camera field of view. The camera can also be aimed at the body position to ensure the person is in the field of view.
Step 808 selects a body position. Generally, the body model position can be parameterized by the (x,z) position on the virtual ground plane, and rotation about the y-axis, perpendicular to the ground plane. The z axis denotes an axis of the depth from the virtual camera, the y-axis extends vertically, and the x-axis denotes a left-to-right direction parallel to the ground plane and perpendicular to the y and z-axes. Note that not all possible combinations of camera position, scene geometry, and body model position need be rendered. For some training scenarios, random variations in one or more of the parameters will suffice. These parameters are also chosen to ensure the character lies at least partially in the field of view of the camera.
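One simple way to randomize the body placement while keeping the body in view is rejection sampling, sketched below. Everything specific here is an assumption made for illustration: the camera sits at the origin looking along +z, only the horizontal field of view is checked, the test is applied to the body's center point (a stricter test than "at least partially in view"), and the default ranges and 60 degree field of view are arbitrary.

```python
import math
import random
from typing import Tuple


def in_horizontal_fov(x: float, z: float, fov_deg: float) -> bool:
    """True if ground-plane point (x, z) lies inside the camera's horizontal field of view."""
    if z <= 0.0:  # behind or level with the camera
        return False
    return abs(math.degrees(math.atan2(x, z))) <= fov_deg / 2.0


def sample_body_placement(rng: random.Random,
                          fov_deg: float = 60.0,
                          x_range: Tuple[float, float] = (-2.0, 2.0),
                          z_range: Tuple[float, float] = (1.0, 4.0),
                          max_tries: int = 1000) -> Tuple[float, float, float]:
    """Return (x, z, yaw_deg): a ground-plane position and a rotation about the y-axis."""
    for _ in range(max_tries):
        x = rng.uniform(*x_range)
        z = rng.uniform(*z_range)
        if in_horizontal_fov(x, z, fov_deg):
            return x, z, rng.uniform(0.0, 360.0)
    raise RuntimeError("no placement inside the field of view was found")
```

A caller would typically seed the generator, e.g. `sample_body_placement(random.Random(0))`, so that a rendered data set can be regenerated.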
Optionally, 3-D scene elements, also referred to as scene geometry objects, can be provided at step 810. Step 812 provides a unique index for each scene geometry object. Regarding a choice of scene geometry, the rendered body model can be placed in a 3-D scene model. Possible scene models include: (1) an empty scene, where only the body model appears in the rendered frames, (2) ground plane, where a plane geometry element is placed in the model, and appears under the model in the rendered frame, and (3) a general scene, where 3-D elements such as walls, furniture such as a couch, chair, or coffee table, plants, pets, other household objects, or objects which may be handheld or worn by a user while interacting with a body joint tracking system, such as a tennis racket or basketball, may be obtained and positioned in the synthetic scene. Collision detection is used to ensure that the rendered body is positioned so as to avoid unrealistic intersections with the scene geometry. Objects may be placed in front of and/or behind the body.
As an example, the actor in the motion capture sequence may pose in a position as if sitting on a couch. In this case, a couch as a scene object can be inserted in the depth image with the body model which corresponds to this pose, so that the body model appears to be sitting on the couch. Thus, one can tailor the position of the couch or other scene element to the body pose. This provides a realistic depth image which can be used to train the learning algorithm for this common scenario. Different types (style/size) of couches could be used as well. Note that some of the scene elements can be tailored to the body pose, and/or provided commonly for all poses.
Based on these steps, a depth image comprising pixels is rendered. Specifically, a pixel which represents a 3-D body part is encoded with a distance from a virtual camera to the 3-D body (step 816). A pixel which represents a 3-D scene object is encoded with a distance from the virtual camera to the 3-D scene object (step 818). A pixel which represents a background region is encoded as a background pixel, e.g., using a designated color which contrasts with the pixels of the 3-D body and scene objects (step 820). A corresponding classification image can be rendered by encoding a pixel which represents a 3-D body part with a number of the body part which is closest to the virtual camera (step 822). A separate color can be assigned for each body part. In some cases, one part of the body (e.g., an arm) can be in front of another part (e.g., the torso), in which case the closer part is identified. At step 824, a pixel which represents a 3-D scene object can be encoded with a unique index of the object. As an example, the resolution of the output depth and classification images can be 640×480 pixels. The resolution should be similar to the resolution of the real world depth camera which is being trained. A text file of pose data can also be provided at step 814.
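The per-pixel encoding can be illustrated with a toy z-buffer in Python. This is a deliberately simplified sketch, not a renderer: surfaces are assumed to be axis-aligned rectangles at a constant depth, the background codes (0 for depth, -1 for class) are assumptions, and the label values are arbitrary.

```python
from typing import List, Tuple

BACKGROUND_DEPTH = 0   # assumed code for "no surface" in the depth image
BACKGROUND_CLASS = -1  # assumed code for "no surface" in the classification image

# (x0, y0, x1, y1, depth_mm, label); x1 and y1 are exclusive, depth_mm must be > 0
Patch = Tuple[int, int, int, int, int, int]


def render(width: int, height: int, patches: List[Patch]):
    """Return (depth, labels) images, keeping the nearest surface at each pixel."""
    depth = [[BACKGROUND_DEPTH] * width for _ in range(height)]
    labels = [[BACKGROUND_CLASS] * width for _ in range(height)]
    for x0, y0, x1, y1, d, label in patches:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                current = depth[y][x]
                if current == BACKGROUND_DEPTH or d < current:
                    depth[y][x] = d        # distance to body part or scene object
                    labels[y][x] = label   # body part number or scene object index
    return depth, labels


# Torso (label 1) at 2000 mm, an arm (label 2) in front of it at 1500 mm,
# and a scene object (label 100) behind both at 2500 mm.
depth_img, class_img = render(4, 3, [
    (0, 0, 3, 3, 2000, 1),
    (1, 1, 4, 2, 1500, 2),
    (0, 2, 4, 3, 2500, 100),
])
```

Where the arm overlaps the torso, both images record the arm, since it is the surface closest to the virtual camera.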
Other options include extending the set of poses by combining joint angles or positions from two or more original motion capture frames, such as taking the lower body joints from one frame and the upper body joints from another frame. Thus, the rendering could combine portions of two or more 3-D body models in different poses but with a common virtual camera position and common 3-D scene elements.
Regarding hair noise simulation (902), a particular characteristic of some cameras is that body hair (e.g. scalp hair, moustaches and beards—collectively “head hair”) may not be imaged reliably. To simulate the presence of head hair, a certain proportion of the rendered pixels which are classified as head hair can be randomly deleted, and replaced with the background label in the depth map, and the background class in the classification map. The deleted pixels can be those within a specified radius of some fraction of the hair pixels. For instance, referring to the classification image of
Regarding depth quantization noise (904), the rendered images contain high resolution depth information, higher than the depth resolution of the real world camera. To simulate the effect of the camera, this resolution can be quantized to one of several discrete levels, with a precision dependent on depth (less precision with greater depth). As one example, depths between 0 and 1 meters are quantized to a 5 mm increment; 1 and 2 meters to a 21 mm increment; and so on, with depths 3 to 3.5 meters quantized to a 65 mm increment.
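A small Python sketch of depth-dependent quantization is given below, working in integer millimeters. Only the three bands stated above are included; the increment for 2 to 3 meters is not given, so depths in that range (and beyond 3.5 meters) are passed through unchanged here. Rounding to the nearest multiple of the increment is also an assumption, as is the function name.

```python
from typing import List, Tuple

# (band_start_mm, band_end_mm, increment_mm); only the bands stated in the text.
STATED_BANDS: List[Tuple[int, int, int]] = [
    (0, 1000, 5),
    (1000, 2000, 21),
    (3000, 3500, 65),
]


def quantize_depth_mm(depth_mm: int, bands: List[Tuple[int, int, int]] = STATED_BANDS) -> int:
    """Snap a depth to the nearest multiple of its band's increment (assumed rounding rule)."""
    for start, end, step in bands:
        if start <= depth_mm < end:
            return int(round(depth_mm / step)) * step
    return depth_mm  # no increment known for this depth: leave unchanged
```

For example, a rendered depth of 1234 mm falls in the 21 mm band and becomes 59 × 21 = 1239 mm.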
Regarding the random noise (906), to simulate random noise in the camera, noise can be added to each non-background pixel in the rendered depth image. A non-background pixel is a pixel that represents the 3-D body (step 816) or a 3-D scene element (step 818). The noise can be drawn from a uniform distribution over [−N,N] where N is a preset noise level. The noise is added to the inverse depth, so that for pixel (i, j), with depth z, the new depth, with added noise, is given by $z' = 1 / \left( (1/z) + N \eta \right)$, where η is a draw from a uniform distribution in the range (−1 . . . 1).
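Read literally, the description amounts to perturbing the inverse depth 1/z by N·η and inverting again. A minimal Python sketch under that reading follows; the function name, the background code of 0 and the guard against a non-positive inverse depth are assumptions.

```python
import random
from typing import List


def add_inverse_depth_noise(depth: List[List[float]],
                            noise_level: float,
                            rng: random.Random,
                            background: float = 0.0) -> List[List[float]]:
    """Return a copy of the depth image with uniform noise added to 1/z."""
    noisy = []
    for row in depth:
        new_row = []
        for z in row:
            if z == background:  # background pixels are left untouched
                new_row.append(z)
                continue
            eta = rng.uniform(-1.0, 1.0)
            inverse = 1.0 / z + noise_level * eta
            # Keep the original depth if the perturbed inverse depth is not positive.
            new_row.append(1.0 / inverse if inverse > 0.0 else z)
        noisy.append(new_row)
    return noisy
```

Because the perturbation is applied to 1/z, the same noise level produces a larger change in depth for far pixels than for near ones.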
Regarding edge roughening (908), depth information is less reliable near large discontinuities in the depth image. A process called “edge roughening” is applied in order to simulate this effect. In one approach, points in the depth image whose neighboring pixels are significantly (e.g., 10 cm) further from the camera are marked as “edges.” Such edges are also marked as “East”/“West” etc. according to the direction of the farther pixel with respect to the center. Thus, a “West” edge represents a sharp decrease in depth as one travels from the left to the right of the image. For each such edge point, the value at the point is replaced at random by the depth of a neighboring pixel, chosen from the farther surface (the surface with greater depth). For example, if an edge is a sequence of edge points (corresponding to one per pixel along the edge), then a given edge will have a random subset of its pixels replaced. So one edge might have 2 pixels replaced, another edge might have no pixels replaced and so forth.
Thus, the noise is added by identifying first and second edges in one of the dissimilar poses having a depth discontinuity greater than a threshold, and replacing pixels between the first and second edges with background pixels.
Regarding hole cutting (910), another characteristic of the sensor in the real world depth camera which it may be desired to emulate is the property that “thin” structures (say narrower than 4 pixels in the image) may not be detected at all. Such structures can be detected by locating complementary (e.g. West, then East; or North, then South) edges within this distance, and replacing the intervening values with background.
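A one-row Python sketch of hole cutting is shown below. It handles only West/East edge pairs along a single image row (North/South pairs would be treated the same way along columns), treats background pixels as infinitely far away, and uses a background code of 0; these choices and the function name are assumptions made for illustration.

```python
import math
from typing import List


def cut_thin_structures_row(row: List[float], threshold: float,
                            max_width: int = 4, background: float = 0) -> List[float]:
    """Replace runs narrower than max_width pixels, bounded by West then East edges, with background."""
    n = len(row)

    def far(i: int) -> float:
        # Background is treated as infinitely far from the camera.
        return math.inf if row[i] == background else row[i]

    def is_west_edge(i: int) -> bool:
        # The pixel to the west is significantly farther than this pixel.
        return i > 0 and row[i] != background and far(i - 1) - row[i] > threshold

    def is_east_edge(i: int) -> bool:
        # The pixel to the east is significantly farther than this pixel.
        return i < n - 1 and row[i] != background and far(i + 1) - row[i] > threshold

    out = list(row)
    for a in range(n):
        if not is_west_edge(a):
            continue
        # Look for the complementary East edge within fewer than max_width pixels.
        for b in range(a, min(a + max_width - 1, n)):
            if is_east_edge(b):
                for i in range(a, b + 1):
                    out[i] = background
                break
    return out
```

Edges are detected on the original row and written to a copy, so cutting one structure does not create new edges that affect the rest of the scan.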
Regarding shadow casting (912), another sensor characteristic which can be modeled is the casting of shadows. Shadows occur when a nearer part of the scene occludes a farther part, preventing the depth camera from reporting an accurate depth. For example, the near part could be an arm in front of the other arm (e.g., making an “X” with the forearms). Shadows are associated with “West” edges whose depth discontinuities are greater than a threshold slope. The shadow casting process locates such edges in the depth image, finds the easternmost point which satisfies the slope threshold, and replaces pixels between the edge and the easternmost point with background. This example applies to a scenario in which shadows form only for vertical edges, because of the camera geometry, with the IR emitter left of the IR camera. The example can be modified for other camera geometries.
a depicts an alternative view of the process of
b depicts an alternative view of the process of
a depicts an example view of a first pose of an actor with markers in a motion capture studio (step 500 in
b depicts an example view of a second pose of an actor with markers in a motion capture studio (step 500 in
a depicts a rendering of a depth image of a 3-D body, of a first body type, with an overlay of the corresponding skeleton of
b depicts a rendering of a depth image of a 3-D body 1360, of a second body type, with an overlay of a corresponding skeleton. The body 1360 is a thinner, smaller body type than the body 1302 in
a depicts an example depth image. The depth image may include shades of grey to represent the depth, with a lighter color representing a greater depth from the virtual camera. These shades of grey can be calibrated to real physical units. For example, a grey value of 100 can mean 100 mm=0.1 m, and a grey value of 2500 (in a 16 bit per pixel depth image) can mean 2500 mm=2.5 m. The background can be given a grey value of 0.
b depicts an example classification image corresponding to the depth image of
The foregoing detailed description of the technology herein has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the technology to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The described embodiments were chosen to best explain the principles of the technology and its practical application to thereby enable others skilled in the art to best utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the technology be defined by the claims appended hereto.