Three-dimensional computer models of a real-world environment are useful in a wide variety of applications. For example, such models can be used in applications such as immersive gaming, augmented reality, architecture/planning, robotics, and engineering prototyping. Depth cameras (also known as z-cameras) can generate real-time depth maps of a real-world environment. Each pixel in these depth maps corresponds to a discrete distance measurement captured by the camera from a 3D point in the environment. This means that these cameras provide depth maps which are composed of an unordered set of points (known as a point cloud) at real-time rates.
In addition to creating the depth map representation of the real-world environment, it is useful to be able to perform a segmentation operation that differentiates individual objects in the environment. For example, a coffee cup placed on a table is a separate object to the table, but the depth map in isolation does not distinguish this as it is unable to differentiate between an object placed on the table and something that is part of the table itself.
Segmentation algorithms exist, such as those based on machine-learning classifiers or computer vision techniques. Such algorithms are able to differentiate certain objects in an image, and label the associated pixels accordingly. However, these algorithms can be computationally complex and hence demand substantial computational resources, especially to perform the segmentation on real-time depth maps.
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known segmentation techniques.
The following presents a simplified summary of the disclosure in order to provide a basic understanding to the reader. This summary is not an extensive overview of the disclosure and it does not identify key/critical elements of the invention or delineate the scope of the invention. Its sole purpose is to present some concepts disclosed herein in a simplified form as a prelude to the more detailed description that is presented later.
Moving object segmentation using depth images is described. In an example, a moving object is segmented from the background of a depth image of a scene received from a mobile depth camera. A previous depth image of the scene is retrieved, and compared to the current depth image using an iterative closest point algorithm. The iterative closest point algorithm includes a determination of a set of points that correspond between the current depth image and the previous depth image. During the determination of the set of points, one or more outlying points are detected that do not correspond between the two depth images, and the image elements at these outlying points are labeled as belonging to the moving object. In examples, the iterative closest point algorithm is executed as part of an algorithm for tracking the mobile depth camera, and hence the segmentation does not add substantial additional computational complexity.
Many of the attendant features will be more readily appreciated as the same becomes better understood by reference to the following detailed description considered in connection with the accompanying drawings.
The present description will be better understood from the following detailed description read in light of the accompanying drawings, wherein:
Like reference numerals are used to designate like parts in the accompanying drawings.
The detailed description provided below in connection with the appended drawings is intended as a description of the present examples and is not intended to represent the only forms in which the present example may be constructed or utilized. The description sets forth the functions of the example and the sequence of steps for constructing and operating the example. However, the same or equivalent functions and sequences may be accomplished by different examples.
Although the present examples are described and illustrated herein as being implemented in a computer gaming system, the system described is provided as an example and not a limitation. As those skilled in the art will appreciate, the present examples are suitable for application in a variety of different types of computing systems using 3D models.
A real-time camera tracking system 112 monitors the location and orientation of the camera in the room. The real-time camera tracking system 112 may be integral with the mobile depth camera 102 or may be at another location provided that it is able to receive communication from the mobile depth camera 102, either directly or indirectly. For example, the real-time camera tracking system 112 may be provided at a personal computer, dedicated computer game apparatus, or other computing device in the room and in wireless communication with the mobile depth camera 102. In other examples the real-time camera tracking system 112 may be elsewhere in the building or at another remote location in communication with the mobile depth camera 102 using a communications network of any suitable type.
A moving-object segmentation system 114 receives the depth images captured by the mobile depth camera 102, and performs a segmentation operation to identify objects in the depth images that are moving or being manipulated and which are independent of the relatively unchanging background. For example, the person 104 or cat 108 can move, and hence are segmented as foreground moving objects and differentiated from background objects such as the door and window. In addition, otherwise static objects can be manipulated. For example the person 100 can move the chair, which triggers the segmentation system to detect the movement and segment the chair from the depth image as a foreground object. This can, for example, allow objects to be segmented and included in the game-play of a gaming system.
The real-time camera tracking system 112 provides input to the moving-object segmentation system 114. The input from the real-time camera tracking system 112 enables the object segmentation to be performed in real-time, without demanding excessive computational resources, as described in more detail below.
Optionally, the mobile depth camera 102 is in communication with a dense 3D environment modeling system 110 (the environment in this case is the room). “Dense” in this example refers to a high degree of accuracy and resolution of the model. For example, images captured by the mobile depth camera 102 are used to form and build up a dense 3D model of the environment as the person moves about the room.
The real-time camera tracking system 112 provides input to the dense 3D modeling system, in order to allow individual depth images to be built up into an overall 3D model. The moving-object segmentation system 114 can also provide input to the dense 3D model, such that objects in the 3D model are segmented and labeled. The real-time camera tracking system 112 may also track the position of the camera in relation to the 3D model of the environment. The combination of camera tracking and 3D modeling is known as simultaneous localization and mapping (SLAM).
The outputs of the real-time camera tracking system 112, moving-object segmentation system 114, and dense 3D modeling system 110 may be used by a game system or other application, although that is not essential. For example, as mentioned, modeled and segmented real-world objects can be included in a gaming environment.
As a further example,
The depth information may be obtained using any suitable technique including, but not limited to, time of flight, structured light, and stereo images. The mobile environment capture device 300 may also comprise an emitter 304 arranged to illuminate the scene in such a manner that depth information may be ascertained by the depth camera 302.
For example, in the case that the depth camera 302 is an infra-red (IR) time-of-flight camera, the emitter 304 emits IR light onto the scene, and the depth camera 302 is arranged to detect backscattered light from the surface of one or more objects in the scene. In some examples, pulsed infrared light may be emitted from the emitter 304 such that the time between an outgoing light pulse and a corresponding incoming light pulse may be detected by the depth camera, measured, and used to determine a physical distance from the mobile environment capture device 300 to a location on objects in the scene. Additionally, in some examples, the phase of the outgoing light wave from the emitter 304 may be compared to the phase of the incoming light wave at the depth camera 302 to determine a phase shift. The phase shift may then be used to determine a physical distance from the mobile environment capture device 300 to a location on the objects by analyzing the intensity of the reflected beam of light over time via various techniques including, for example, shuttered light pulse imaging.
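The two time-of-flight measurements described above reduce to simple relationships between time (or phase) and distance. The following is a minimal sketch only; the function names and the modulation frequency parameter are assumptions introduced for illustration and do not appear in the description.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def distance_from_pulse(round_trip_seconds: float) -> float:
    """Distance from the delay between outgoing and incoming pulse.

    The light travels to the surface and back, hence the division by two.
    """
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

def distance_from_phase(phase_shift_rad: float, modulation_hz: float) -> float:
    """Distance from the phase shift of a continuously modulated wave.

    `modulation_hz` is an assumed parameter (the modulation frequency of the
    emitter). The result is unambiguous only within half a modulation
    wavelength.
    """
    return SPEED_OF_LIGHT * phase_shift_rad / (4.0 * math.pi * modulation_hz)
```

For instance, a round-trip delay of 20 nanoseconds corresponds to a distance of roughly three metres.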
In another example, the mobile environment capture device 300 can use structured light to capture depth information. In such a technique patterned light (e.g. light displayed as a known pattern such as spots, a grid or stripe pattern, which may also be time-varying) may be projected onto a scene using the emitter 304. Upon striking the surface of objects in the scene the pattern becomes deformed. Such a deformation of the pattern is captured by the depth camera 302 and analyzed to determine an absolute or relative distance from the depth camera 302 to the objects in the scene.
In another example, the depth camera 302 comprises a pair of stereo cameras such that visual stereo data is obtained and resolved to generate relative depth information. In this case the emitter 304 may be used to illuminate the scene or may be omitted.
In some examples, in addition to the depth camera 302, the mobile environment capture device 300 comprises a color video camera referred to as a red-green-blue (RGB) camera 306. The RGB camera 306 is arranged to capture sequences of images of the scene at visible light frequencies.
The mobile environment capture device 300 may comprise an orientation sensor 308 such as an inertial measurement unit (IMU), accelerometer, gyroscope, compass or other orientation or movement sensor 308. However, it is not essential to use an orientation or movement sensor. The mobile environment capture device 300 may comprise a location tracking device such as a GPS, although this is not essential.
Optionally, the mobile environment capture device also comprises one or more processors, a memory and a communications infrastructure as described in more detail below. The mobile environment capture device may be provided in a housing which is shaped and sized to be hand held by a user or worn by a user. In other examples the mobile environment capture device is sized and shaped to be incorporated or mounted on a vehicle, toy or other movable apparatus.
The mobile environment capture device 300 is connected to a real-time tracker 316. This connection may be a physical wired connection or may use wireless communications. In some examples the mobile environment capture device 300 is connected indirectly to the real-time tracker 316 over one or more communications networks such as the internet.
The real-time tracker 316 may be computer-implemented using a general purpose microprocessor controlling one or more graphics processing units (GPUs). It comprises a frame alignment engine 318 and optionally a loop closure engine 320 and a relocalization engine 322. The real-time tracker 316 takes depth image frames from the depth camera 302, and optionally also input from the mobile environment capture device 300, and optional map data 334. The real-time tracker 316 operates to place the depth image frames into spatial alignment in order to produce a real-time series 328 of six degree of freedom (6DOF) pose estimates of the depth camera 302. It may also produce transformation parameters for transforms between pairs of depth image frames. In some examples the real-time tracker operates on pairs of depth image frames 314 from the depth camera. In other examples, the real-time tracker 316 takes a single depth image 314 and aligns that with a dense 3D model 326 of the environment rather than with another depth image.
The frame alignment engine 318 of the real-time tracker is arranged to align pairs of depth image frames, or a depth image frame and an estimate of a depth image frame from the dense 3D model. It uses an iterative process which can be implemented using one or more graphics processing units in order that the frame alignment engine operates in real-time. The loop closure engine 320 is arranged to detect when the mobile environment capture device has moved in a loop so that the scene depicted in the current depth frame is at least partially overlapping with that of a previous depth frame. For example, this may occur when a user walks around the whole floor of the building in
The frame alignment engine 318 of the real-time tracker 316 provides an output to the depth image segmentation system 334. The depth image segmentation system 334 uses at least a portion of the algorithms of the real-time tracker 316 to detect when objects have moved within the depth image, and segments these objects in the depth image as foreground objects. The output of the depth image segmentation system 334 is a set of labeled image elements 336 for the depth image, indicating which image elements correspond to segmented foreground objects. The process for performing the segmentation using the results of the real-time camera tracking is described in more detail below with reference to
The real-time tracker 316 may also provide the camera pose as output to an optional dense 3D model generation system 324 which uses that information together with the depth image frames to form and store a dense 3D model 326 of the scene or environment in which the mobile environment capture device 300 is moving. For example, in the case of
The mobile environment capture device 300 may be used in conjunction with a game system 332 which is connected to a display device 330. For example, the game may be an FPS game, golf game, boxing game, motor car racing game or other type of computer game. The segmented depth images (or optionally the dense 3D model and segmentation information) may be provided to the game system 332, and aspects of the model incorporated into the game. For example, the 3D model can be used to determine the shape and location of objects in a room, which can be used with camera-based games to improve background removal or incorporated into the game itself (e.g. as in-game objects that the player can interact with). Data from the game system 332 such as the game state or metadata about the game may also be provided to the real-time tracker 316.
As mentioned, the processing performed by the real-time tracker 316 and/or the dense 3D model generation system 324 can, in one example, be executed remotely from the location of the mobile environment capture device 300. For example, the mobile environment capture device 300 can be connected to (or comprise) a computing device having relatively low processing power, and which streams the depth images over a communications network to a server. The server has relatively high processing power, and performs the computationally complex tasks of the real-time tracker 316 and/or the dense 3D model generation system 324. The server can return a rendered image of the dense reconstruction per-frame to provide an interactive experience to the user, and also return the final dense 3D reconstruction on completion of the model, for subsequent local use (e.g. in a game). Such an arrangement avoids the need for the user to possess a high-powered local computing device.
Reference is now made to
Firstly, a depth image of the scene to be segmented (as captured by the mobile depth camera) is received 400. If the segmentation operation is making use of the camera-tracking results, then this current depth image is received at the frame alignment engine 318. The current depth image is also known as the destination depth image. A previous depth image of at least a portion of the scene is then retrieved 402. The previous depth image is also known as the source depth image. In some examples, the previous depth image may be the preceding frame captured by the depth camera and stored at a storage device. In alternative examples, the previous depth image may be generated from the dense 3D model stored in graphics memory by requesting the previous depth image from the GPU. The GPU can generate the previous depth image from the dense 3D model stored on the graphics memory by determining the view from a virtual camera at the last known location and orientation of the real depth camera looking into the dense 3D model. The use of the dense 3D model provides a higher resolution previous depth image.
The capture pose (i.e. location and orientation) for the previous depth image is also retrieved 404. If the previous depth image is the preceding frame captured by the depth camera, then the capture pose is the pose of the depth camera at the time of capturing the preceding frame, as determined by the real-time camera tracker. Alternatively, if the previous depth image is generated from a dense 3D model, a rendering of a mesh model, or other 3D representation, then the capture pose is the location and orientation of the virtual camera.
Once the current depth image is received, and the previous depth image and capture pose are retrieved, then the frame alignment engine 318 executes an algorithm to determine a transformation between the previous and current depth images, and hence determine how the mobile depth camera has moved between the two depth images. From the capture pose of the previous depth image, this enables determination of the current depth camera pose. This transformation can be determined by the frame alignment engine 318 of the real-time tracker 316 using an iterative closest point (ICP) algorithm. Broadly speaking, the ICP algorithm is an iterative process that determines a set of corresponding points between the two depth images, determines a transformation that minimizes an error metric over the corresponding points, and repeats the process using the determined transformation until convergence is reached.
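The iterate-until-convergence structure of ICP outlined above can be sketched as follows. This is a deliberately simplified illustration, not the frame alignment engine itself: it is restricted to pure translation (so that the error-minimizing update has a closed form, the mean residual), it uses a brute-force nearest-point search, and all names are hypothetical. A full 6DOF implementation would also estimate rotation.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]

def nearest(p: Point, cloud: List[Point]) -> Point:
    """Brute-force closest point in `cloud` to `p` (squared Euclidean distance)."""
    return min(cloud, key=lambda q: sum((a - b) ** 2 for a, b in zip(p, q)))

def icp_translation(source: List[Point], destination: List[Point],
                    max_iters: int = 20, tol: float = 1e-12) -> Point:
    """Estimate the translation that aligns `source` with `destination`."""
    t = (0.0, 0.0, 0.0)
    for _ in range(max_iters):
        # 1. Apply the current transformation estimate to the source points.
        moved = [tuple(a + b for a, b in zip(p, t)) for p in source]
        # 2. Determine a set of corresponding points.
        pairs = [(m, nearest(m, destination)) for m in moved]
        # 3. Find the update that minimizes the summed squared error.
        #    For a pure translation this is the mean of the residuals.
        step = tuple(sum(q[i] - m[i] for m, q in pairs) / len(pairs)
                     for i in range(3))
        t = tuple(a + b for a, b in zip(t, step))
        # 4. Repeat until the update becomes negligible (convergence).
        if sum(s * s for s in step) < tol:
            break
    return t
```

In this sketch every source point is paired with some destination point; the compatibility tests described below are what allow poorly matching pairs to be set aside as outliers instead.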
When the ICP algorithm is determining the corresponding points between the two depth images at each iteration, points may be found that diverge substantially between the images. These are known as “outliers”. During the camera tracking operation, these outlying points may be ignored or rejected. However, for the segmentation operation these outlying points serve as useful input.
The portions of the ICP algorithm that are relevant to the determination of outliers are indicated by brace 406 in
An initial estimation of the transformation for aligning the current and previous depth images is applied 407. This initial estimate is formed in any suitable manner. For example, a motion path for the depth camera can be determined from previously determined camera pose estimates, and used with a predictive model to provide a coarse estimate of the pose at the current time (e.g. using constant velocity or acceleration predictive models). Additionally or alternatively, in further examples, one or more of the following sources of information may be used to form the initial estimate: game state, game metadata, map data, RGB camera output, orientation sensor output, and GPS data.
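As an illustration of a constant velocity predictive model, the sketch below extrapolates the camera position by assuming that the displacement between the two most recent pose estimates repeats. Only the translational part of the pose is handled, and the function name is hypothetical; extrapolating orientation would require composing rotations in the same manner.

```python
def predict_position(previous, current):
    """Constant-velocity extrapolation of camera position.

    `previous` and `current` are the (x, y, z) positions from the two most
    recent pose estimates. The last inter-frame displacement is assumed to
    repeat over the next frame interval.
    """
    return tuple(c + (c - p) for p, c in zip(previous, current))
```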
A set of source points (i.e. pixels) in the previous depth image are selected 408 for analysis. In some examples, this set of source points can be all the pixels in the previous depth image. In other examples, a representative subset of the pixels in the previous depth image can be selected to reduce the complexity. The operations indicated by box 410 are then performed for each of the source points. Note that because the same operations are being performed for each source point, a substantial amount of parallelism exists in the execution of the ICP algorithm. This enables the ICP algorithm to be efficiently executed on a GPU, as discussed below with reference to
The operations performed in box 410 are illustrated with reference to examples shown in
In
Two compatibility tests are then performed on the source and destination points to determine whether they are corresponding points or outliers. In the first compatibility test, a Euclidean distance is determined 414 between the source and destination points. This is illustrated in
The example of
If the Euclidean distance is found to be within the predefined distance threshold, then the second compatibility test is performed. To perform this test, the surface normals are calculated 419 at the source and destination points. For example, this is achieved for a given point by finding the four (or more) nearest neighbor points in the depth image and computing a surface patch which incorporates those neighbors and the point itself. A normal to that surface patch is then calculated at the location of the point.
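One simple way to obtain such a normal from four neighbors is sketched below. It is an assumption made for illustration: rather than fitting a general surface patch, it takes the cross product of the horizontal and vertical differences between the neighboring 3D points. The helper names are hypothetical, and the sign of the resulting normal depends on the chosen axis convention.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normal_from_neighbors(left, right, up, down):
    """Unit surface normal at a point, from its four neighboring 3D points.

    The two difference vectors span the local surface patch; their cross
    product is perpendicular to that patch.
    """
    n = _cross(_sub(right, left), _sub(down, up))
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("degenerate neighborhood: normal is undefined")
    return tuple(c / length for c in n)
```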
The angle between the surface normal of the source point and the destination point is then determined 420. In alternative examples, a different measure relating to the angle, such as the dot product, can be calculated.
The calculated angle (or other measure) is then compared 422 to a predefined angle threshold. If the angle is greater than the predefined angle threshold, then the destination point is considered to be an outlier, and labeled 418 as such. Conversely, if the angle is not greater than the predefined angle threshold, then the destination point is selected 424 as a corresponding point to the source point. In one example, the predefined angle threshold is 30 degrees, although any suitable value can be used.
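The two compatibility tests can be summarized in a short sketch. The function and parameter names are hypothetical, the normals are assumed to be unit length, and the distance threshold is left as a required argument because its value is application dependent; only the 30 degree default comes from the example above.

```python
import math

def is_corresponding(src_point, src_normal, dst_point, dst_normal,
                     max_distance, max_angle_deg=30.0):
    """Return True for a corresponding pair, False if the point is an outlier.

    Both points are assumed to be expressed in the same coordinate frame,
    and both normals are assumed to be unit vectors.
    """
    # First test: Euclidean distance between source and destination points.
    if math.dist(src_point, dst_point) > max_distance:
        return False
    # Second test: angle between the two surface normals.
    dot = sum(a * b for a, b in zip(src_normal, dst_normal))
    dot = max(-1.0, min(1.0, dot))  # guard acos against rounding error
    return math.degrees(math.acos(dot)) <= max_angle_deg
```

Because the check is independent for each source point, it maps naturally onto one thread per point in a parallel implementation.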
Referring to
Following the two compatibility tests outlined above, the ICP algorithm continues 426 to determine the camera tracking transform. For example, the ICP algorithm finds the transform that minimizes the error metric for the selected corresponding points, and then repeats the process of finding corresponding points given that transform until convergence is reached.
However, rather than discarding the outlying points, these are stored and provided as input to the object segmentation system. The outlying points indicate points in the depth image where something has changed since the previous depth image. Such a change can be caused by dynamic or non-rigid objects giving rise to different depth measurements. In one example, the object segmentation can simply segment 428 all the outlying points and label these as foreground objects in the depth image. This provides a very rapid segmentation of the depth image that uses very little additional computation over that performed for the camera tracking anyway.
Note that in the example described above with reference to
In some examples, further tests can be performed on the detected outliers prior to segmenting the image. For example, RGB images of the scene (as captured by RGB camera 306 at the same time as the depth image) can be used in a further comparison. Two RGB images corresponding to the current and previous depth images can be compared at the ICP outliers for RGB similarity. If this comparison indicates a difference in RGB value greater than a predefined value, then this is a further indication that an object has moved at this location, and can trigger segmentation of the image.
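A minimal sketch of this additional RGB test follows. The difference measure (sum of absolute per-channel differences), the image layout (rows of (r, g, b) tuples indexed as image[v][u]), and all names are assumptions chosen for illustration.

```python
def confirm_outliers_with_rgb(outliers, rgb_previous, rgb_current, min_difference):
    """Keep only the outliers whose color also changed between frames.

    `outliers` is an iterable of (u, v) pixel coordinates flagged by ICP.
    `min_difference` stands in for the predefined value mentioned above.
    """
    confirmed = []
    for u, v in outliers:
        before = rgb_previous[v][u]
        after = rgb_current[v][u]
        difference = sum(abs(a - b) for a, b in zip(before, after))
        if difference > min_difference:
            confirmed.append((u, v))
    return confirmed
```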
In other examples, further additional processing can be performed to segment the depth image, which utilizes the outlying points as input. For example, the outlying points can be used as input to known segmentation algorithms. Such segmentation algorithms include those based on computer vision techniques such as graph cut algorithms, object boundary detection algorithms (such as edge detection), GrabCut, or other morphological operations such as erode/dilate/median filter, and/or those based on machine learning, such as classifiers (e.g. decision trees or forests). Note that many known segmentation algorithms are configured to operate on RGB or grey-scale values in images. In this example, these can be modified to operate on the 3D volumetric representation (e.g. on the rate of change or curvature of the 3D surface).
The outlying points provide useful information to these algorithms, as they indicate an area of interest in which a change in the depth image has occurred. This allows these higher-level segmentation algorithms to focus on the region of the outliers, thereby saving computational complexity relative to analyzing the entire depth image. For example, the segmentation algorithms can analyze a predefined region surrounding the outlying points. In other examples, user input can also be used to enhance the segmentation by prompting a user to manually indicate whether an object detected by the outliers is an independent, separable object from the background.
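One way to define a predefined region surrounding the outlying points is a padded bounding box, as sketched below. The choice of a rectangular region, the margin parameter, and the function name are assumptions made for illustration; a higher-level segmentation algorithm would then be run only inside the returned rectangle.

```python
def region_of_interest(outliers, width, height, margin):
    """Bounding box around the outliers, padded by `margin` pixels.

    Returns (u_min, v_min, u_max, v_max), clamped to the image bounds,
    or None when there are no outliers to analyze.
    """
    outliers = list(outliers)
    if not outliers:
        return None
    us = [u for u, _ in outliers]
    vs = [v for _, v in outliers]
    return (max(min(us) - margin, 0),
            max(min(vs) - margin, 0),
            min(max(us) + margin, width - 1),
            min(max(vs) + margin, height - 1))
```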
In a further example, the outlying points can be used in combination with an RGB image of the scene (e.g. captured by RGB camera 306 at the same time as the depth image). For example, the corresponding location to the outlying points in the RGB image can be located, and a higher-level segmentation algorithm (such as GrabCut or edge detection) applied to the RGB image at the location of the outliers. This can be useful where the RGB image reveals more information about the object than the depth image outliers. For example, if an object is moving by rotation about an axis substantially parallel to the view of the depth camera, then outliers are detected around the edges of the object, but not at the centre of the object (as this has not moved between depth images). However, by analyzing the RGB image at the location of the outliers, the full extent of the object may be distinguished, and the centre of the object also correctly segmented.
In other examples, after the depth image has been segmented, it can be further processed to provide additional data to the system (e.g. gaming system) in which the depth images are used. For example, the segmentation of moving objects enables these moving objects to be tracked separately and independently of the background. This can be achieved by executing a further ICP algorithm on the portion of the depth image that was identified as a moving object by the segmentation system. In this way, the motion of the moving object can be separately tracked and monitored from the background.
In the case of non-rigid, deformable moving objects, the independent tracking of objects using an ICP algorithm becomes more complex as the shape of the object itself can change and hence the ICP cannot find a movement transformation for that object that matches this. In such cases, a piece-wise rigid body approximation can be used, such that the deformable object is represented by a plurality of rigid objects. Such a piece-wise rigid body approximation can be performed dynamically and in real-time, by starting with the assumption that all objects are rigid, and then breaking the object into piece-wise sections as this assumption breaks down (e.g. when the ICP algorithm fails to find a transformation). This can be expressed as a recursive version of the segmentation algorithm described above, in which the ICP algorithm is executed on all pixels, then executed again only on the segmented pixels (the outliers), then again only on the sub-segmented pixels, etc., until no more segmented pixels are found (or a limit is reached).
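The recursive scheme described above can be sketched as follows. The `align` argument is a hypothetical stand-in for one ICP run that returns the points it could explain with a single rigid motion (inliers) and those it could not (outliers). The `toy_align` function is not ICP; it is a toy substitute included only so that the sketch can be executed.

```python
from collections import Counter

def split_rigid_parts(points, align, limit=4):
    """Recursively peel off rigidly moving parts.

    Runs `align` on all points, then again only on the outliers, and so on,
    until no more outliers are found or `limit` levels have been processed.
    """
    if not points:
        return []
    if limit == 0:
        return [points]      # limit reached: keep the remainder as one part
    inliers, outliers = align(points)
    if not inliers or not outliers:
        return [points]      # nothing further to split off
    return [inliers] + split_rigid_parts(outliers, align, limit - 1)

def toy_align(displacements):
    """Toy substitute for ICP: each 'point' is a 1D displacement, and the
    dominant displacement plays the role of the recovered rigid motion."""
    dominant = Counter(displacements).most_common(1)[0][0]
    inliers = [d for d in displacements if d == dominant]
    outliers = [d for d in displacements if d != dominant]
    return inliers, outliers
```

With the toy aligner, a list of displacements containing three distinct motions is split into three groups, one per motion.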
As well as independently tracking a moving object in the depth image, an independent dense 3D model of the object can also be generated. This can be performed in a similar manner to the generation of the overall dense 3D model of the scene as outlined above. For example, a portion of graphics memory can be allocated to represent the object and arranged as a 3D volume. As the moving object is independently tracked, the depth image data about the object (as segmented) can be aligned with previous measurements of the object and combined in the volume. In this way, a more detailed 3D model of the moving object is constructed, which can be used, for example, when displaying a representation of the object on-screen or for object recognition purposes.
The creation of a dense 3D model of a segmented object can be performed in a number of ways. In a first example, after the outliers have been detected (as described above), the outlying points are integrated into a separate 3D volume in memory if they lie in front of a surface along the camera ray when compared to the previous depth image (which can itself be generated from the dense 3D model as noted above). This technique enables an ICP algorithm to track the segmented object once it has been sufficiently integrated into the separate 3D volume, and provides good performance for new objects that are added into the scene. In a second example, after the outliers have been detected (as described above), every voxel in the dense 3D model within the camera's frustum (i.e. not only the front faces which the camera sees) is projected into the current depth map. If a given voxel gets projected to an outlier and if the depth value of the outlier is larger than the depth value of the projected voxel, then the segmentation system knows that the voxel is part of a moved object (because it suddenly sees a depth point behind the voxel). This technique enables objects to be instantly segmented upon removal from the scene, and performs well for moved objects.
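The voxel test of the second example can be sketched as follows. The sketch assumes a pinhole camera model with hypothetical intrinsics (fx, fy, cx, cy), voxel centers already expressed in camera coordinates, and a depth map and outlier mask stored as rows indexed as image[v][u]. None of these details are specified in the description.

```python
def project(point, fx, fy, cx, cy):
    """Project a 3D point in camera coordinates to integer pixel coordinates."""
    x, y, z = point
    return int(round(fx * x / z + cx)), int(round(fy * y / z + cy))

def moved_voxels(voxels, depth_map, outlier_mask, intrinsics):
    """Return the voxels that project onto an outlier lying behind them.

    Seeing a measured depth that is larger than the voxel's own depth means
    the surface that used to occupy the voxel is no longer there.
    """
    fx, fy, cx, cy = intrinsics
    height, width = len(depth_map), len(depth_map[0])
    moved = []
    for voxel in voxels:
        depth = voxel[2]
        if depth <= 0:
            continue  # behind the camera, hence outside the frustum
        u, v = project(voxel, fx, fy, cx, cy)
        if not (0 <= u < width and 0 <= v < height):
            continue  # projects outside the image
        if outlier_mask[v][u] and depth_map[v][u] > depth:
            moved.append(voxel)
    return moved
```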
In further examples, the above two methods for generating a dense model of a segmented object can be combined or used selectively depending on the characteristics of the object (e.g. new or moved objects).
In further examples, the segmented objects from the depth images can be input to an object recognition system. Such an object recognition system can be in the form of a machine learning classifier, e.g. using trained decision trees. This can be utilized to automatically recognize and classify the objects in real-time. For example, with reference to
In some examples, the detected points can be classified not by what they represent (cat, chair, etc.) but by their behavior in the scene. For example, a coffee cup is a separate, independent rigid object relative to other objects in the scene. Conversely, a door is a separate movable object, but is attached to the wall, and hence is constrained in its movement. Higher level analysis of the points indicated by the ICP outliers can enable these points to be classified as, for example, independent, attached, or static.
Once a segmented object has been identified, this information can be used to control further aspects of the operation of the system. For example, certain objects can be arranged to directly interact within a computer game environment, whereas others can be deliberately ignored. The object identification can also be used to control the display of objects (e.g. on display device 330). For example, when an object has been identified, the low-resolution depth (or RGB) image of the object can be replaced with a high-resolution model of the object (or facets of the object).
In one example, the faces of users of the system can be identified, and substituted with high-resolution images of these users' faces, mapped onto the depth images. In a further example, these high-resolution images of the users' faces can be parameterized such that their facial expressions are under computer control. In this way, detailed and life-like avatars can be generated that accurately follow the movement of the users. Similarly, a 3D representation of a user's body can be generated from a parameterized model, and this 3D body representation can have facial expressions substituted in that are captured from the depth camera (and/or from the RGB camera).
In a further example, following segmentation the region of the dense 3D model which was previously “joined” to the now-moved object can be tagged (e.g. the location where a moved coffee cup base rested on the table). This tagging can indicate that this region in the model is a likely “border surface”. The tagging can comprise associating additional data to those points in the model that indicate a potential surface. By doing this, if some unknown object is placed on that area (or at least partially on that area) at some point in the future, then the system already knows that these are likely to be two detached objects (e.g. a table and an object touching that region) that are simply in contact with each other and are separable, because of the presence of the “border surface” between them.
The tagging of “border surfaces” in this way is useful in the case in which the camera frame rate is not sufficiently high to track a rapidly-moved object in the scene. In such cases the rapidly-moved object just “appears” in the model. A similar scenario is observed when an object is moved less than the above-mentioned predefined distance or angle threshold, in which case the movement may not be detected. However, if the object contacts a tagged border surface, then this can trigger a more detailed analysis/processing whereby the object can be assumed to be a detachable/separable object despite not meeting the basic distance or angle thresholds.
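By way of illustration only, the border-surface tagging described above can be sketched as follows. This is a minimal sketch and not the described implementation: it assumes the dense 3D model is represented as a set of integer voxel coordinates, and the function names (`tag_border_surface`, `is_likely_separable`), the six-neighbor contact test and the threshold value are all hypothetical choices made for the example.

```python
# Hypothetical voxel-set representation of the dense 3D model (an assumption
# made for this sketch; the description does not prescribe a data layout).
NEIGHBOR_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def neighbors(voxel):
    """Return the six face-adjacent voxel coordinates of a voxel."""
    x, y, z = voxel
    return [(x + dx, y + dy, z + dz) for dx, dy, dz in NEIGHBOR_OFFSETS]


def tag_border_surface(model_voxels, moved_voxels):
    """Return the voxels of the remaining model that were in contact with the
    now-moved object, i.e. the region to tag as a likely border surface."""
    moved = set(moved_voxels)
    remaining = set(model_voxels) - moved
    return {n for v in moved for n in neighbors(v) if n in remaining}


def is_likely_separable(object_voxels, border_tags, moved_distance, distance_threshold):
    """Treat an object as separable if it moved at least the threshold distance,
    or if it touches a previously tagged border surface."""
    if moved_distance >= distance_threshold:
        return True
    return any(n in border_tags for v in object_voxels for n in neighbors(v))


# A 4x4 table top at z = 0 with a two-voxel "cup" resting on it.
table = {(x, y, 0) for x in range(4) for y in range(4)}
cup = {(1, 1, 1), (1, 1, 2)}
border_tags = tag_border_surface(table | cup, cup)

# An object that later appears on the tagged region is assumed separable even
# though no movement above the (hypothetical) threshold was observed.
on_tagged_region = is_likely_separable({(1, 1, 1)}, border_tags, 0.0, 0.05)
elsewhere = is_likely_separable({(3, 3, 1)}, border_tags, 0.0, 0.05)
```

In this sketch only the table voxel directly beneath the cup is tagged, so an object placed on that location is flagged for more detailed analysis, whereas an object placed on an untagged part of the table is not.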
Reference is now made to
Computing-based device 700 comprises one or more processors 702 which may be microprocessors, controllers or any other suitable type of processors for processing computer-executable instructions to control the operation of the device in order to perform moving object image segmentation. In some examples, for example where a system on a chip architecture is used, the processors 702 may include one or more fixed function blocks (also referred to as accelerators) which implement a part of the modeling methods in hardware (rather than software or firmware).
The computing-based device 700 also comprises a graphics processing system 704, which communicates with the processors 702 via an interface 706, and comprises one or more graphics processing units 708, which are arranged to execute parallel, threaded operations in a fast and efficient manner. The graphics processing system 704 also comprises a graphics memory 710, which is arranged to enable fast parallel access from the graphics processing units 708. In examples, the graphics memory 710 can store the dense 3D models, and the graphics processing units 708 can perform the ICP algorithms/camera tracking and dense model generation operations described above.
The computing-based device 700 also comprises a communication interface 712 arranged to receive input from one or more devices, such as the mobile environment capture device (comprising the depth camera), and optionally one or more user input devices (e.g. a game controller, mouse, and/or keyboard). The communication interface 712 may also operate as a networking interface, which can be arranged to communicate with one or more communication networks (e.g. the internet).
A display interface 714 is also provided and arranged to provide output to a display system integral with or in communication with the computing-based device. The display system may provide a graphical user interface, or other user interface of any suitable type, although this is not essential.
The computer-executable instructions may be provided using any computer-readable media that is accessible by computing-based device 700. Computer-readable media may include, for example, computer storage media such as memory 716 and communications media. Computer storage media, such as memory 716, includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computing device. In contrast, communication media may embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave, or other transport mechanism. As defined herein, computer storage media does not include communication media. Although the computer storage media (memory 716) is shown within the computing-based device 700, it will be appreciated that the storage may be distributed or located remotely and accessed via a network or other communication link (e.g. using communication interface 712).
Platform software comprising an operating system 718 or any other suitable platform software may be provided at the computing-based device to enable application software 720 to be executed on the device. The memory 716 can store executable instructions to implement the functionality of a camera tracking engine 722 (e.g. arranged to track the location and orientation of the depth camera using ICP), an object segmentation engine 724 (e.g. arranged to segment moving objects from the depth image using the ICP outliers), and optionally a dense model generation engine 726 (e.g. arranged to generate a dense 3D model of the scene or objects in graphics memory). The memory 716 can also provide a data store 728, which can be used to provide storage for data used by the processors 702 when performing the segmentation techniques, such as for storing previously captured depth images and capture poses.
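By way of illustration only, the use of ICP outliers by an object segmentation engine such as engine 724 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the described implementation: it assumes each depth point has already been paired with a corresponding model point, that surface normals are unit length, and that a correspondence is an outlier when either a distance or an angle threshold is exceeded. The function names and the threshold values (`max_distance`, `max_angle_deg`) are hypothetical.

```python
import math


def is_outlier(point, normal, model_point, model_normal,
               max_distance=0.05, max_angle_deg=20.0):
    """Return True if a correspondence violates the (hypothetical) distance or
    normal-angle threshold. Normals are assumed to be unit length."""
    distance = math.dist(point, model_point)
    dot = sum(a * b for a, b in zip(normal, model_normal))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    angle = math.degrees(math.acos(dot))
    return distance > max_distance or angle > max_angle_deg


def segment_outliers(correspondences, **thresholds):
    """Return the indices of correspondences labeled as outliers; these are the
    candidate points belonging to a moving object."""
    return [i for i, c in enumerate(correspondences) if is_outlier(*c, **thresholds)]


up = (0.0, 0.0, 1.0)
sideways = (1.0, 0.0, 0.0)
correspondences = [
    ((0.0, 0.0, 1.00), up, (0.0, 0.0, 1.01), up),        # consistent with the model
    ((0.0, 0.0, 1.00), up, (0.0, 0.0, 1.50), up),        # too far from the model
    ((0.0, 0.0, 1.00), sideways, (0.0, 0.0, 1.00), up),  # normals disagree
]
moving_indices = segment_outliers(correspondences)
```

The indices returned in `moving_indices` identify the depth points that do not fit the tracked camera pose, which a segmentation engine could then group and label as a moving object.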
The term ‘computer’ is used herein to refer to any device with processing capability such that it can execute instructions. Those skilled in the art will realize that such processing capabilities are incorporated into many different devices and therefore the term ‘computer’ includes PCs, servers, mobile telephones, personal digital assistants and many other devices.
The methods described herein may be performed by software in machine readable form on a tangible storage medium, e.g. in the form of a computer program comprising computer program code means adapted to perform all the steps of any of the methods described herein when the program is run on a computer, and where the computer program may be embodied on a computer readable medium. Examples of tangible (or non-transitory) storage media include disks, thumb drives, memory, etc., and do not include propagated signals. The software can be suitable for execution on a parallel processor or a serial processor such that the method steps may be carried out in any suitable order, or simultaneously.
This acknowledges that software can be a valuable, separately tradable commodity. It is intended to encompass software, which runs on or controls “dumb” or standard hardware, to carry out the desired functions. It is also intended to encompass software which “describes” or defines the configuration of hardware, such as HDL (hardware description language) software, as is used for designing silicon chips, or for configuring universal programmable chips, to carry out desired functions.
Those skilled in the art will realize that storage devices utilized to store program instructions can be distributed across a network. For example, a remote computer may store an example of the process described as software. A local or terminal computer may access the remote computer and download a part or all of the software to run the program. Alternatively, the local computer may download pieces of the software as needed, or execute some software instructions at the local terminal and some at the remote computer (or computer network). Those skilled in the art will also realize that, by utilizing conventional techniques, all or a portion of the software instructions may be carried out by a dedicated circuit, such as a DSP, programmable logic array, or the like.
Any range or device value given herein may be extended or altered without losing the effect sought, as will be apparent to the skilled person.
It will be understood that the benefits and advantages described above may relate to one embodiment or may relate to several embodiments. The embodiments are not limited to those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages. It will further be understood that reference to ‘an’ item refers to one or more of those items.
The steps of the methods described herein may be carried out in any suitable order, or simultaneously where appropriate. Additionally, individual blocks may be deleted from any of the methods without departing from the spirit and scope of the subject matter described herein. Aspects of any of the examples described above may be combined with aspects of any of the other examples described to form further examples without losing the effect sought.
The term ‘comprising’ is used herein to mean including the method blocks or elements identified, but that such blocks or elements do not comprise an exclusive list and a method or apparatus may contain additional blocks or elements.
It will be understood that the above description of a preferred embodiment is given by way of example only and that various modifications may be made by those skilled in the art. The above specification, examples and data provide a complete description of the structure and use of exemplary embodiments of the invention. Although various embodiments of the invention have been described above with a certain degree of particularity, or with reference to one or more individual embodiments, those skilled in the art could make numerous alterations to the disclosed embodiments without departing from the spirit or scope of this invention.
Wilson, et al., “Combining Multiple Depth Cameras and Projectors for Interactions On, Above, and Between Surfaces”, retrieved on Nov. 29, 2010 at <<http://research.microsoft.com/en-us/um/people/awilson/publications/wilsonuist2010/Wilson%20UIST%202010%20LightSpace.pdf>>, ACM, Proceedings of Symposium on User Interface Software and Technology (UIST), New York, NY, Oct. 2010, pp. 273-282. |
Wurm, et al., “OctoMap: A Probabilistic, Flexible, and Compact 3D Map Representation for Robotic Systems”, Proceedings of Workshop on Best Practice in 3D Perception and Modeling for Mobile Manipulation (ICRA), Anchorage, Alaska, May 2010, 8 pages. |
Yu, et al., “Monocular Video Foreground/Background Segmentation by Tracking Spatial-Color Gaussian Mixture Models”, retrieved on Nov. 28, 2010 at <<http://research.microsoft.com/en-us/um/people/cohen/segmentation.pdf>>, IEEE, Proceedings of Workshop on Motion and Video Computing (WMVC), Feb. 2007, pp. 1-8. |
Zach, et al., “A Globally Optimal Algorithm for Robust TV-L1 Range Image Integration”, IEEE Proceedings of Intl Conference on Computer Vision (ICCV), 2007, pp. 1-8. |
Zhou, et al., “Data-Parallel Octrees for Surface Reconstruction”, IEEE Transactions on Visualization and Computer Graphics, vol. 17, No. 5, May 2011, pp. 669-681. |
Zhou, et al., “Highly Parallel Surface Reconstruction”, retrieved on Nov. 29, 2010 at <<http://research.microsoft.com/pubs/70569/tr-2008-53.pdf, Microsoft Corporation, Microsoft Research, Technical Report MSR-TR-2008-53, Apr. 2008, pp. 1-10. |
Number | Date | Country
---|---|---
20120195471 A1 | Aug 2012 | US