Systems and methods for generating and using visual datasets for training computer vision models

Information

  • Patent Grant
  • 12340538
  • Patent Number
    12,340,538
  • Date Filed
    Friday, June 25, 2021
  • Date Issued
    Tuesday, June 24, 2025
Abstract
A system for collecting data for training a computer vision model for shape estimation includes: an imaging system configured to capture one or more images; and a processing system including a processor and memory storing instructions that, when executed by the processor, cause the processor to: receive one or more input images from the imaging system; estimate a pose of an object depicted in the one or more images; render a shape estimate from a 3-D model of the object posed in accordance with the pose of the object; and generate a data point of a training dataset, the data point including one or more images based on the one or more input images and a label corresponding to the one or more images, the label including the shape estimate.
Description
FIELD

Aspects of embodiments of the present disclosure relate to systems and methods for generating and using visual datasets for training computer vision models including object pose detection models.


BACKGROUND

In machine learning or statistical learning, large datasets are commonly used to train models to perform predictions or estimations based on statistical patterns found in the datasets. In the case of supervised training, these datasets generally include data samples or data points represented by example inputs and their corresponding ground truth or “labels” (considering the models to operate as mathematical functions, the example inputs correspond to the independent variables and the labels correspond to the dependent variables).


For example, when applying machine learning in the particular field of computer vision, these datasets may include input images of a variety of different types of objects and corresponding labels such as textual descriptions of the types of objects depicted in the images and/or the locations of these objects within those images (e.g., as defined by bounding boxes or where each pixel is associated with a class of object depicted by that pixel). One example of such a dataset is ImageNet (see, e.g., J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, ImageNet: A Large-Scale Hierarchical Image Database. IEEE Computer Vision and Pattern Recognition (CVPR), 2009. and Olga Russakovsky*, Jia Deng*, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg and Li Fei-Fei. (*=equal contribution) ImageNet Large Scale Visual Recognition Challenge. IJCV, 2015.), which includes images associated with concepts described by multiple words or phrases (on average, about one thousand images for each of about one hundred thousand different concepts). These visual datasets have been useful in training a wide variety of machine learning models such as deep neural networks (e.g., convolutional neural networks) to perform tasks such as image classification and image segmentation. These trained machine learning models for computer vision have been applied in a variety of areas including autonomous vehicles, robotics for manufacturing and logistics processes, detection of abnormalities in medical imaging, and the like.


SUMMARY

Aspects of embodiments of the present disclosure relate to systems and methods for generating and using visual datasets for training computer vision models including object pose detection models.


According to one embodiment of the present disclosure, a system for collecting data for training a computer vision model for shape estimation includes: an imaging system configured to capture one or more images; and a processing system including a processor and memory storing instructions that, when executed by the processor, cause the processor to: receive one or more input images from the imaging system; estimate a pose of an object depicted in the one or more images; render a shape estimate from a 3-D model of the object posed in accordance with the pose of the object; and generate a data point of a training dataset, the data point including one or more images based on the one or more input images and a label corresponding to the one or more images, the label including the shape estimate.


The imaging system may include a polarization camera system, and the one or more input images may include one or more polarization images.


The one or more polarization images may include a plurality of spectral channels corresponding to different portions of an electromagnetic spectrum.


The shape estimate may include a surface normals map rendered from the 3-D model posed in accordance with the pose of the object.


The one or more images of the data point may include the one or more polarization images.


The one or more images of the data point may include one or more polarization signatures computed based on the one or more polarization images.


The one or more images of the data point may include one or more surface normals maps computed from the one or more polarization images.


The shape estimate may include a rendered depth map.


The imaging system may include a depth camera system, and the one or more images may include one or more depth maps.


The pose of the object may be estimated based on aligning a shape of the 3-D model with the one or more depth maps.


The processing system may be further configured to estimate the pose of the object using a computer vision model trained to compute shape estimates based on the one or more input images.


The processing system may be further configured to re-train the computer vision model using the training dataset including the data point.


According to one embodiment of the present disclosure, a method for collecting data for training a computer vision model for shape estimation includes: capturing one or more images of a scene using an imaging system; receiving, by a processing system including a processor and memory, the one or more input images from the imaging system; estimating, by the processing system, a pose of an object depicted in the one or more images; rendering, by the processing system, a shape estimate from a 3-D model of the object posed in accordance with the pose of the object; and generating, by the processing system, a data point of a training dataset, the data point including one or more images based on the one or more input images and a label corresponding to the one or more images, the label including the shape estimate.


The imaging system may include a polarization camera system, and the one or more input images may include one or more polarization images.


The one or more polarization images may include a plurality of spectral channels corresponding to different portions of an electromagnetic spectrum.


The shape estimate may include a surface normals map rendered from the 3-D model posed in accordance with the pose of the object.


The one or more images of the data point may include the one or more polarization images.


The one or more images of the data point may include one or more polarization signatures computed based on the one or more polarization images.


The one or more images of the data point may include one or more surface normals maps computed from the one or more polarization images.


The shape estimate may include a rendered depth map.


The imaging system may include a depth camera system, and the one or more images may include one or more depth maps.


The pose of the object may be estimated based on aligning a shape of the 3-D model with the one or more depth maps.


The method may further include estimating the pose of the object using a computer vision model trained to compute shape estimates based on the one or more input images.


The method may further include re-training the computer vision model using the training dataset including the data point.





BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, together with the specification, illustrate exemplary embodiments of the present invention, and, together with the description, serve to explain the principles of the present invention.



FIG. 1A is a schematic diagram depicting a pose estimation system according to one embodiment of the present disclosure.



FIG. 1B is a high-level depiction of the interaction of light with transparent objects and non-transparent (e.g., diffuse and/or reflective) objects.



FIG. 2A is a perspective view of a camera array according to one embodiment of the present disclosure.



FIG. 2B is a cross sectional view of a portion of a camera array according to one embodiment of the present disclosure.



FIG. 2C is a perspective view of a stereo camera array system according to one embodiment of the present disclosure.



FIG. 3 is a flowchart depicting a method for computing six-degree-of-freedom (6-DoF) poses of objects according to some embodiments of the present disclosure.



FIG. 4A is a flow diagram of a process for object level correspondence according to one embodiment.



FIG. 4B is a block diagram of an architecture for instance segmentation and mask generation according to one embodiment.



FIG. 4C is a more detailed flow diagram of a matching algorithm for identifying object-level correspondence for a particular object instance in a first segmentation mask according to one embodiment.



FIG. 5 is a flowchart depicting a method for computing a pose of an object based on dense correspondences according to some embodiments of the present disclosure.



FIG. 6 is a schematic depiction of a 3-D model, depicted in shaded form, posed in accordance with an initial pose estimate and overlaid onto an observed image of a scene, depicted in line drawing form.



FIG. 7A is a block diagram depicting a pipeline for refining an initial pose estimate using dense correspondences according to one embodiment of the present disclosure.



FIG. 7B is a schematic depiction of mappings between observed images and 3-D mesh models based on image-to-object correspondences computed in accordance with some embodiments of the present disclosure.



FIG. 8 is a flowchart depicting a method for generating datasets including images of known objects and corresponding shape estimates according to one embodiment of the present disclosure.



FIG. 9 is a schematic block diagram depicting training a computer vision model using a dataset according to some embodiments of the present disclosure.



FIG. 10 is a schematic block diagram depicting a computer vision model according to some embodiments of the present disclosure.



FIG. 11 is a block diagram of a shape estimator according to one embodiment of the present disclosure.



FIG. 12 is a flowchart of a method for re-training a computer vision model according to one embodiment of the present disclosure.





DETAILED DESCRIPTION

In the following detailed description, only certain exemplary embodiments of the present invention are shown and described, by way of illustration. As those skilled in the art would recognize, the invention may be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein.


Aspects of embodiments of the present disclosure relate to systems and methods for generating and using visual datasets for training computer vision models including object pose detection models and surface shape detection models. In some embodiments, these visual datasets include polarization raw frames such as images captured using a polarization camera (a camera that has a polarization filter in its optical path) and/or polarization features (e.g., Stokes vectors, degree of linear polarization (DOLP), and angle of linear polarization (AOLP)), which may be computed from polarization raw frames. These images may be associated with ground truth data relating to the shape of objects, such as clean (e.g., low noise or substantially free of noise) and high resolution surface normals maps (e.g., where each pixel or location in the surface normals map identifies the direction of the surface normals or orientation of the depicted surface as a vector in a particular coordinate system, such as a coordinate system defined with respect to the viewpoint) and depth maps (e.g., where each pixel or location in the depth map identifies the distance from the camera to the surface depicted at that pixel, where the depth map may also be interpreted as a point cloud of 3-D coordinates) and such as poses of 3-D models of objects.
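
As a hedged illustration of how polarization features may be computed from polarization raw frames, the following Python sketch derives the linear Stokes parameters, DOLP, and AOLP. It assumes, which the passage above does not state, that four raw frames were captured behind linear polarizers oriented at 0, 45, 90, and 135 degrees. The names `polarization_features` and `simulate_frames` and the `eps` guard are hypothetical and are used only for illustration.

```python
import numpy as np


def polarization_features(i0, i45, i90, i135, eps=1e-8):
    """Return (s0, s1, s2, dolp, aolp) from four polarization raw frames.

    Assumption (not from the text): the frames were captured behind linear
    polarizers oriented at 0, 45, 90, and 135 degrees.
    """
    i0, i45, i90, i135 = (
        np.asarray(a, dtype=np.float64) for a in (i0, i45, i90, i135)
    )
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # 0 vs. 90 degree difference
    s2 = i45 - i135                     # 45 vs. 135 degree difference
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, eps)
    aolp = 0.5 * np.arctan2(s2, s1)     # angle in radians
    return s0, s1, s2, dolp, aolp


def simulate_frames(intensity, dolp, aolp, shape=(2, 2)):
    """Synthesize ideal raw frames for a spatially uniform polarization state."""
    angles = np.deg2rad([0.0, 45.0, 90.0, 135.0])
    return [
        np.full(shape, 0.5 * intensity * (1.0 + dolp * np.cos(2.0 * (a - aolp))))
        for a in angles
    ]


# Round trip: features recovered from synthetic frames match the inputs.
frames = simulate_frames(intensity=2.0, dolp=0.6, aolp=np.deg2rad(30.0))
s0, s1, s2, dolp, aolp = polarization_features(*frames)
```

The resulting DOLP and AOLP arrays have the same height and width as the raw frames and so can be stacked with them as additional input channels of a data point.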


Polarization imaging provides information that would not be available to comparative cameras (e.g., imaging modalities that do not include polarization filters and that therefore do not capture information about the polarization of light). This information enables detecting the shape of reflective and transparent objects, determining the surface normals of objects using the Fresnel equations, and achieving robustness to specular reflections (e.g., glare). Accordingly, the use of scene polarization information, in the form of polarization images and/or polarization features (e.g., AOLP/DOLP), provides additional information that can be used by computer vision models to compute more accurate classifications of objects and detections of their locations, poses, and shapes.


Some embodiments of the present disclosure relate to datasets where each data sample includes images of a scene and corresponding ground truth surface normals maps and/or ground truth depth maps, point clouds, or 3-D models of the surfaces of one or more objects in the scene. The images of a scene may include images captured using one or more imaging modalities, including polarization, polarization features, color, infrared, thermal, depth maps (e.g., captured using passive stereo, active stereo with structured light, time of flight, and the like), and the like.


In some embodiments, these images also include surface normals maps computed from the images. In some cases, the surface normals maps are computed from a depth map captured using a depth camera system (e.g., by computing the slope or gradient between neighboring pixels of the captured depth map). In some cases, the surface normals maps are computed using closed form equations (e.g., the Fresnel equations) in accordance with shape-from-polarization (SfP) techniques.
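
The depth-gradient approach mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: it assumes an orthographic approximation with a uniform pixel pitch (`pixel_size`, a hypothetical parameter), a camera looking along the +z axis, and normals oriented to face the camera (negative z component). A practical system would also account for the camera intrinsics.

```python
import numpy as np


def normals_from_depth(depth, pixel_size=1.0):
    """Estimate a surface normals map from a depth map via finite differences.

    Assumptions (not from the text): orthographic projection, uniform pixel
    pitch, and depth increasing along +z away from the camera.
    """
    depth = np.asarray(depth, dtype=np.float64)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(depth, pixel_size)
    # For a surface z = f(x, y), the camera-facing normal is (f_x, f_y, -1).
    normals = np.stack([dz_dx, dz_dy, -np.ones_like(depth)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals


# Example: a plane whose depth grows by 0.5 units per pixel along x.
x = np.arange(5, dtype=np.float64)
plane = 2.0 + 0.5 * np.tile(x, (4, 1))
plane_normals = normals_from_depth(plane)
```

Because the normals are obtained by differencing neighboring pixels, any noise in the captured depth map is amplified in the output, which is consistent with the noise concerns discussed for captured data.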


Techniques for computing the shape of objects from polarization information include Polarized 3D (described in, for example, Kadambi, Achuta, et al. “Polarized 3D: High-quality depth sensing with polarization cues.” Proceedings of the IEEE International Conference on Computer Vision. 2015.) which provides deterministic techniques for computing the surface normal field for an object based on a captured polarization signature of an object (e.g., using a polarization camera system) and a coarse approximation of a depth map (e.g., computed using a depth camera system).


While depth maps and Polarized 3D provide routes to computing surface normals maps directly from captured images (e.g., captured depth maps and/or polarization raw frames), the resulting surface normals maps generally exhibit substantial noise or artifacts in accordance with the characteristics of the underlying sensing technique. For example, depth maps captured through stereo depth camera systems may exhibit errors or noise due to ambiguities due to lack of surface texture or ambiguous surface texture, and/or depth resolution limits due to sensor resolution and feature matching constraints, thereby resulting in errors or noise in the surface normals maps computed therefrom. Surface normals maps computed through the direct application of the Fresnel equations may produce artifacts from: ambiguities that arise when determining the azimuth angle of the surface normal; refractive distortions in estimating the zenith angle; non-uniformity in the polarized lighting from the environment; texture copy artifacts when an object has multiple different unique textures; and fronto-parallel surfaces that produce noise in zenith angle estimations when they are close to zero. Furthermore, as the paper by Kadambi et al. shows, obtaining accurate surface normals through polarization is an involved process that has to address the above ambiguities in the surface normal estimations, along with constraints on depth discontinuities among other aspects. Later work by Ba et al. (Ba, Yunhao, et al. “Deep Shape from Polarization.” ECCV. 2020.) followed up on Kadambi et al. by leveraging the physics of polarization through a deep learning network and training the network to learn the relationships between polarization signatures and the surface normal at the point of reflection while disambiguating the estimated normals in the process. 
This represents a significant improvement over the prior physics-based approach in that the trained network was able to resolve some of these ambiguities resulting in a reduced mean angular error (MAE) in the estimated surface normals. However, some problems remain when dealing with regions of high frequency, increased specularity, shadows and inter-reflections. In addition, the network of Ba is trained using a dataset of collected images in which the surface normals were computed from 3-D scans captured by a structured light 3-D scanner.


In contrast, the ground truth surface normals maps of datasets in accordance with some embodiments of the present disclosure provide clean (e.g., having low noise or being substantially free of noise) shape information of the objects in a scene (e.g., surface normals maps and/or depth maps) that accurately match the shapes of the objects depicted in the corresponding images. In some embodiments, these clean shape estimates are obtained by detecting the poses of known objects in a scene, aligning accurate 3-D models of those known objects based on the detected poses, and rendering the ground truth shape information based on the posed 3-D models. These approaches generally work due to the existence of accurate 3-D models representing the known objects. This is typically possible in the case where the objects are manufactured objects that are substantially uniform in shape and appearance, and where the 3-D model was created as part of the design process in designing the manufactured object and/or designing the manufacturing process for manufacturing the objects (e.g., when creating molds for injection molding or casting of the parts).


As such, datasets generated in accordance with embodiments of the present disclosure provide training data for training computer vision models to compute estimates or predictions of the shapes of objects, where the surface normals maps of the datasets exhibit lower noise and higher accuracy than comparative datasets based on observed or captured data, as opposed to synthetic datasets generated through computer simulations, such as by rendering synthetic images and synthetic surface normals maps of a virtual scene using a 3-D graphics engine. In some embodiments, these datasets are used to train computer vision models (e.g., trained statistical models) to generalize from the clean ground truth data in the training data set and thereby enable the prediction or estimation or inference of the shapes of unknown objects (e.g., objects for which the computer vision system does not have a 3-D model, such as may be the case when the objects are unique, have high variability in shape and appearance, are highly diverse, or where accurate 3-D models are otherwise not available to the computer vision system).


Some aspects of embodiments of the present disclosure relate to an integrated system, including imaging hardware and an integrated physics-based deep learning system, that estimates surface normals of known objects (e.g., in a manufacturing assembly line) with very high accuracy. In some embodiments, the imaging hardware implements a multi-view, multi-spectral, and multi-modal approach to image acquisition, and the physics-based deep learning system leverages this additional information to overcome many of the shortcomings of comparative approaches.


Additional aspects of embodiments relate to systems and methods for generating a corpus of data that provides correlations between the various signatures that are captured by imaging hardware (e.g., multi-view, multi-spectral, and multi-modal images) and the final computed six degree-of-freedom (6-DoF) pose, surface normals, and depth estimate of the object in a manner that can be used to train deep learning networks to correctly detect objects and estimate their poses based on the captured signatures.


To provide some context, FIG. 1A is a schematic diagram depicting a pose estimation system according to one embodiment of the present disclosure. As shown in FIG. 1A, a main camera 10 is arranged such that its field of view 12 captures an arrangement 20 of objects 22 resting on a support platform 2 in a scene 1. In the embodiment shown in FIG. 1A, the main camera 10 is located above the support platform (e.g., spaced apart from the objects 22 along the direction of gravity), but embodiments of the present disclosure are not limited thereto—for example, the main camera 10 can be arranged to have a downward angled view of the objects 22.


In some embodiments, one or more support cameras 30 are arranged at different poses or viewpoints around the scene containing the arrangement 20 of objects 22. Accordingly, each of the support cameras 30, e.g., first support camera 30a, second support camera 30b, and third support camera 30c, captures a different view of the objects 22 from a different viewpoint (e.g., a first viewpoint, a second viewpoint, and a third viewpoint, respectively) from one another and a different viewpoint from the main camera 10. The viewpoints may be distinguished from one another in that they have substantially different optical axes, such as optical axes that are not parallel (non-parallel) to one another or that are spaced apart by a large distance if they are parallel to one another.


While FIG. 1A shows three support cameras 30, embodiments of the present disclosure are not limited thereto and may include, for example, at least one support camera 30 and may include more than three support cameras 30. In some embodiments, no support cameras are used and only a single main camera 10 is used from a single viewpoint.


In addition, while the main camera 10 is depicted in FIG. 1A as a stereo camera, embodiments of the present disclosure are not limited thereto, and may be used with, for example, a monocular main camera. As used herein, a stereo camera will be referred to as capturing images from a single viewpoint, as the camera modules of a stereo camera generally have optical axes that are substantially parallel to one another (and may be rectified to synthetically produce such parallel optical axes) and are generally spaced apart along a relatively short baseline to generate a depth map using stereo from a single viewpoint.


A shape estimator 100 according to various embodiments of the present disclosure is configured to compute or estimate shapes and/or poses of the objects 22 based on information captured by the main camera 10 and the support cameras 30. According to various embodiments of the present disclosure, the shape estimator 100 is implemented using one or more processing circuits or electronic circuits configured to perform various operations as described in more detail below. Types of electronic circuits may include a central processing unit (CPU), a graphics processing unit (GPU), an artificial intelligence (AI) accelerator (e.g., a vector processor, which may include vector arithmetic logic units configured to efficiently perform operations common to neural networks, such as dot products and softmax), a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), a digital signal processor (DSP), or the like. For example, in some circumstances, aspects of embodiments of the present disclosure are implemented in program instructions that are stored in a non-volatile computer readable memory and that, when executed by the electronic circuit (e.g., a CPU, a GPU, an AI accelerator, or combinations thereof), perform the operations described herein to compute a processing output, such as an instance segmentation map and/or 6-DoF poses, from input images 18 (including, for example, polarization raw frames or the underlying images captured by polarization cameras or cameras with polarization filters in their optical paths). The operations performed by the shape estimator 100 may be performed by a single electronic circuit (e.g., a single CPU, a single GPU, or the like) or may be allocated between multiple electronic circuits (e.g., multiple GPUs or a CPU in conjunction with a GPU). 
The multiple electronic circuits may be local to one another (e.g., located on a same die, located within a same package, or located within a same embedded device or computer system) and/or may be remote from one another (e.g., in communication over a network such as a local personal area network such as Bluetooth®, over a local area network such as a local wired and/or wireless network, and/or over a wide area network such as the internet, such as in a case where some operations are performed locally and other operations are performed on a server hosted by a cloud computing service). One or more electronic circuits operating to implement the shape estimator 100 may be referred to herein as a computer or a computer system, which may include memory storing instructions that, when executed by the one or more electronic circuits, implement the systems and methods described herein.


In more detail, the main camera 10 and the support cameras 30 are configured to estimate the shapes and/or poses of objects 22 detected within their fields of view 12 (while FIG. 1A illustrates a field of view 12 for the main camera 10 using dashed lines, the fields of view of the support cameras 30 are not explicitly shown). In the embodiment shown in FIG. 1A, the objects 22 are depicted abstractly as simple three-dimensional solids such as spheres, rectangular prisms, and cylinders. However, embodiments of the present disclosure are not limited thereto and characterization of shape estimators may be performed using any arbitrary object for which a pose with respect to a camera can be clearly defined, including deformable objects mentioned above, such as flex circuits, bags or other pliable containers containing solids, liquids, and/or fluids, flexible tubing, and the like.


In particular, a “pose” refers to the position and orientation of an object with respect to a reference coordinate system. For example, a reference coordinate system may be defined with the main camera 10 at the origin, where the direction along the optical axis of the main camera 10 (e.g., a direction through the center of its field of view 12) is defined as the z-axis of the coordinate system, and the x and y axes are defined to be perpendicular to one another and perpendicular to the z-axis. (Embodiments of the present disclosure are not limited to this particular coordinate system, and a person having ordinary skill in the art would understand that poses can be mathematically transformed to equivalent representations in different coordinate systems.)


Each object 22 may also be associated with a corresponding coordinate system of its own, which is defined with respect to its particular shape. For example, a rectangular prism with sides of different lengths may have a canonical coordinate system defined where the x-axis is parallel to its shortest direction, the z-axis is parallel to its longest direction, the y-axis is orthogonal to the x-axis and z-axis, and the origin is located at the centroid of the object 22.


Generally, in a three-dimensional coordinate system, objects 22 have six degrees of freedom—rotation around three axes (e.g., rotation around x-, y-, and z-axes) and translation along the three axes (e.g., translation along x-, y-, and z-axes). For the sake of clarity, symmetries of the objects 22 will not be discussed in detail herein, but may be addressed, for example, by identifying multiple possible poses with respect to different symmetries (e.g., in the case of selecting the positive versus negative directions of the z-axis of a right rectangular prism), or by ignoring some rotational components of the pose (e.g., a right cylinder is rotationally symmetric around its axis).


In some embodiments, it is assumed that a three-dimensional (3-D) model or computer aided design (CAD) model representing a canonical or ideal version of each type of object 22 in the arrangement of objects 20 is available. For example, in some embodiments of the present disclosure, the objects 22 are individual instances of manufactured components that have a substantially uniform appearance from one component to the next. Examples of such manufactured components include screws, bolts, nuts, connectors, and springs, as well as specialty parts such as electronic circuit components (e.g., packaged integrated circuits, light emitting diodes, switches, resistors, and the like), laboratory supplies (e.g., test tubes, PCR tubes, bottles, caps, lids, pipette tips, sample plates, and the like), and manufactured parts (e.g., handles, switch caps, light bulbs, and the like). Accordingly, in these circumstances, a CAD model defining the ideal or canonical shape of any particular object 22 in the arrangement 20 may be used to define a coordinate system for the object (e.g., the coordinate system used in the representation of the CAD model).


Based on a reference coordinate system (or camera space, e.g., defined with respect to the pose estimation system) and an object coordinate system (or object space, e.g., defined with respect to one of the objects), the pose of the object may be considered to be a rigid transform (rotation and translation) from object space to camera space. The pose of object 1 in camera space 1 may be denoted as $P_{c_1}^{1}$, and the transform from object 1 space to camera space may be represented by the matrix:

$$
\begin{bmatrix}
R_{11} & R_{12} & R_{13} & T_1 \\
R_{21} & R_{22} & R_{23} & T_2 \\
R_{31} & R_{32} & R_{33} & T_3 \\
0 & 0 & 0 & 1
\end{bmatrix}
$$





where the rotation submatrix R:

$$
R =
\begin{bmatrix}
R_{11} & R_{12} & R_{13} \\
R_{21} & R_{22} & R_{23} \\
R_{31} & R_{32} & R_{33}
\end{bmatrix}
$$






represents rotations along the three axes from object space to camera space, and the translation submatrix T:

$$
T =
\begin{bmatrix}
T_1 \\
T_2 \\
T_3
\end{bmatrix}
$$






represents translations along the three axes from object space to camera space.
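
The rigid transform described above can be sketched in Python as follows. This is an illustrative sketch only: the helper names `make_pose` and `object_to_camera` are hypothetical, and points are assumed to be stored as rows of an N-by-3 array.

```python
import numpy as np


def make_pose(R, T):
    """Assemble a 4x4 pose matrix from a 3x3 rotation R and a translation T."""
    P = np.eye(4)
    P[:3, :3] = np.asarray(R, dtype=np.float64)
    P[:3, 3] = np.asarray(T, dtype=np.float64)
    return P


def object_to_camera(P, points_obj):
    """Map N x 3 points from object space to camera space using pose P."""
    pts = np.asarray(points_obj, dtype=np.float64)
    homogeneous = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return (P @ homogeneous.T).T[:, :3]


# Example: rotate 90 degrees about the z-axis, then shift 1 unit along z.
R_z90 = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
P_example = make_pose(R_z90, [0.0, 0.0, 1.0])
points_cam = object_to_camera(P_example, [[1.0, 0.0, 0.0]])
```

Using homogeneous coordinates lets the rotation and the translation be applied with a single matrix product, which is why the bottom row of the pose matrix is fixed at 0, 0, 0, 1.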


If two objects, Object A and Object B, are in the same camera C coordinate frame, then the notation $P_{CA}$ is used to indicate the pose of Object A with respect to camera C and $P_{CB}$ is used to indicate the pose of Object B with respect to camera C. For the sake of convenience, it is assumed herein that the poses of objects are represented based on the reference coordinate system, so the poses of objects A and B with respect to camera space C may be denoted $P_A$ and $P_B$, respectively.


If Object A and Object B are actually the same object, but measured during different pose estimation measurements, and a residual pose $P_{err}$ or $P_{AB}$ ($P_{AB} = P_{err}$) is used to indicate a transform from pose $P_A$ to pose $P_B$, then the following relationship should hold:

$$P_A P_{err} = P_B \tag{1}$$

and therefore

$$P_{err} = P_A^{-1} P_B \tag{2}$$
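
As a minimal sketch of Equation (2), the residual pose can be computed from two 4x4 pose matrices as shown below. The function names `invert_pose` and `residual_pose` are hypothetical, and the closed-form inverse assumes that each pose is a valid rigid transform (an orthonormal rotation plus a translation).

```python
import numpy as np


def invert_pose(P):
    """Invert a 4x4 rigid transform: the inverse rotation is R^T and the
    inverse translation is -R^T T."""
    R, T = P[:3, :3], P[:3, 3]
    P_inv = np.eye(4)
    P_inv[:3, :3] = R.T
    P_inv[:3, 3] = -R.T @ T
    return P_inv


def residual_pose(P_A, P_B):
    """Residual pose between two estimates, following Equation (2)."""
    return invert_pose(P_A) @ P_B


# Example: two estimates of the same object that differ by a small shift.
c, s = np.cos(0.3), np.sin(0.3)
P_A = np.eye(4)
P_A[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
P_A[:3, 3] = [1.0, 2.0, 3.0]
P_B = P_A.copy()
P_B[:3, 3] += [0.1, 0.0, 0.0]
P_err = residual_pose(P_A, P_B)
```

Composing the first pose with the residual recovers the second pose, which is the relationship stated in Equation (1).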


Ideally, assuming the object has not moved (e.g., translated or rotated) with respect to the main camera 10 between the measurements of pose estimates PA and PB, then PA and PB should both be the same, and Perr should be the identity matrix (e.g., indicating no error between the poses):






\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

In a similar manner, the pose of a particular object can be computed with respect to views from two different cameras. For example, images of Object A captured by a main camera C can be used to compute the pose PCA of Object A with respect to main camera C. Likewise, images of Object A captured by a first support camera S1 can be used to compute the pose PS1A of object A with respect to the support camera S1. If the relative poses of main camera C and support camera S1 are known, then the pose PS1A can be transformed to the coordinate system of the main camera C.
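
For illustration only, the change of coordinate system described above may be sketched in Python as a product of 4×4 matrices. The sketch assumes that the known relative pose is stored as a matrix, here given the hypothetical name P_C_S1, that maps support camera S1 space to main camera C space; the function names and numeric values are likewise assumptions.

```python
def matmul4(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def pose_in_main_camera(P_C_S1, P_S1_A):
    """Express a pose of Object A estimated by support camera S1 in main camera C space."""
    return matmul4(P_C_S1, P_S1_A)


# Assumed calibration: S1 sits 1.0 unit along the x-axis of C with the same orientation.
P_C_S1 = [[1.0, 0.0, 0.0, 1.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0, 1.0]]

# Assumed estimate: Object A is 2.0 units in front of S1 and is not rotated.
P_S1_A = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 2.0],
          [0.0, 0.0, 0.0, 1.0]]

P_C_A_from_S1 = pose_in_main_camera(P_C_S1, P_S1_A)
translation = [row[3] for row in P_C_A_from_S1[:3]]
```

Under these assumed values, the translation of Object A in main camera space is (1, 0, 2), which may then be compared against the pose computed directly from the images captured by the main camera C.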


Ideally, assuming that the known relative poses of main camera C and support camera S1 are accurate and the poses calculated based on the data captured by the two cameras are accurate, then PCA and PS1A (after PS1A is transformed to the coordinate system of the main camera C) should both be the same, and Perr should be the identity matrix (e.g., indicating no error between the poses):






\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

Differences Perr between the actual measured value as computed based on the estimates computed by the shape estimator 100 and the identity matrix may be considered to be errors:

Rerr=∥R(Perr)∥  (3)
Terr=∥T(Perr)∥  (4)

where Rerr is the rotation error and Terr is the translation error. The function R( ) converts Perr into an axis-angle where the magnitude is the rotation difference, and the function T( ) extracts the translation component of the pose matrix.


The axis-angle representation from rotation matrix R is given by:










\[ \operatorname{Tr}(R) = 1 + 2\cos\theta \tag{5} \]
\[ |\theta| = \arccos\left(\frac{\operatorname{Tr}(R) - 1}{2}\right) \tag{6} \]

where Tr( ) denotes the matrix trace (the sum of the diagonal elements of the matrix), and θ represents the angle of rotation.
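
By way of a hedged example, the residual pose of equation (2) and the error measures of equations (3) through (6) may be computed as in the following Python sketch. The function names and the example poses are hypothetical; the sketch assumes that both poses are rigid transforms stored as 4×4 nested lists, and it clamps the cosine to guard against floating-point round-off.

```python
import math


def matmul4(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(P):
    """Invert a rigid transform: inverse rotation is R^T, inverse translation is -R^T T."""
    Rt = [[P[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(Rt[i][k] * P[k][3] for k in range(3)) for i in range(3)]
    return [Rt[0] + [t[0]], Rt[1] + [t[1]], Rt[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]


def pose_errors(P_A, P_B):
    """Return (rotation error in radians, translation error) between two poses."""
    P_err = matmul4(invert_rigid(P_A), P_B)                  # equation (2)
    trace = P_err[0][0] + P_err[1][1] + P_err[2][2]          # Tr(R) of the residual
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))     # equation (5), clamped
    R_err = math.acos(cos_theta)                             # equation (6)
    T_err = math.sqrt(sum(P_err[i][3] ** 2 for i in range(3)))  # equation (4)
    return R_err, T_err


def pose_about_z(angle_rad, translation):
    """Hypothetical helper: pose with a rotation about the z-axis plus a translation."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    tx, ty, tz = translation
    return [[c, -s, 0.0, tx], [s, c, 0.0, ty], [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]


# Assumed example: the second estimate differs from the first by a 0.1 radian
# rotation and a small translation of (0.003, 0.004, 0).
P_A = pose_about_z(0.3, (0.1, 0.2, 0.5))
P_B = matmul4(P_A, pose_about_z(0.1, (0.003, 0.004, 0.0)))
R_err, T_err = pose_errors(P_A, P_B)
```

With these assumed inputs the rotation error is approximately 0.1 radians and the translation error is approximately 0.005 units, and both errors are approximately zero when the two pose estimates are identical.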


Some aspects of embodiments of the present disclosure relate to computing a high accuracy pose estimate of objects 22 in a scene based on a joint estimate of the poses of the objects across the main camera 10 and the support cameras 30, and/or the shapes of objects 22 in the scene 1 (e.g., the surface normals or slopes of the objects 22 in the scene and/or the 3-D coordinates of points on the surfaces of the objects), as described in more detail below.


Some aspects of embodiments of the present disclosure also relate to providing information to assist in the control of a robotic arm 24 having an end effector 26 that may be used to grasp and manipulate objects 22. The robotic arm 24, including its end effector 26, may be controlled by a robotic arm controller 28, which, in some embodiments, receives the six-degree-of-freedom poses and/or shapes of objects computed by the shape estimator 100, which may include 3-D models representing various objects 22 in the scene 1, where the 3-D models have configurations that estimate or approximate the configurations of their corresponding real-world objects, noting, for example, that the configuration of portions of the objects 22 that are occluded or otherwise not visible in the fields of view 12 of the main camera 10 and support cameras 30 may be difficult or impossible to estimate with high accuracy.


While the sensor system is generally referred to herein as including a shape estimator 100, embodiments of the present disclosure are not limited to computing shapes and poses (e.g., 6-DoF poses) of objects in a scene. Instead of or in addition to computing 6-DoF poses, the sensor system, including one or more cameras (e.g., main camera and/or support cameras) and processing circuits, may implement generalized vision systems that provide information to controller systems.


For example, a processing pipeline may include receiving images captured by sensor devices (e.g., main cameras 10 and support cameras 30) and outputting control commands for controlling a robot arm, where the processing pipeline is trained, in an end-to-end manner, based on training data that includes sensor data as input and commands for controlling the robot arm (e.g., a destination pose for the end effector 26 of the robotic arm 24) as the labels for the input training data.


As shown in FIG. 1A, the 6-DoF poses computed by the shape estimator 100 may be supplied to a renderer, which is configured to compute or render images of 3-D models posed in a virtual scene in accordance with the poses computed by the shape estimator 100, where the images are rendered from the viewpoints of virtual cameras that may correspond to the viewpoints of the main camera 10 and/or one or more support cameras 30 on the actual scene 1, such that the rendered images correspond to the estimated views of the objects 22 detected in the scene 1 and thereby provide estimates of the shape of the object, including in the case where the rendered images include rendered surface normals maps, as described in more detail below.


Sensing Hardware


In the embodiment shown in FIG. 1A, the pose estimation system includes a main camera 10 and one or more support cameras 30. In some embodiments of the present disclosure, the main camera 10 includes a stereo camera. Examples of stereo cameras include camera systems that have at least two monocular cameras spaced apart from each other along a baseline, where the monocular cameras have overlapping fields of view and optical axes that are substantially parallel to one another. While embodiments of the present disclosure will be presented herein in embodiments where the main camera 10 and the support cameras 30 are passive cameras (e.g., that are not connected to a dedicated light projector and that instead use ambient lighting or other light sources), embodiments of the present disclosure are not limited thereto and may also include circumstances where one or more active light projectors are included in the camera system, thereby forming an active camera system, where the active light projector may be configured to project structured light or a pattern onto the scene. The support cameras 30 may be stereo cameras, monocular cameras, or combinations thereof (e.g., some stereo support cameras and some monocular support cameras). In some embodiments, the main camera 10 and/or one or more support cameras 30 may include one or more time-of-flight depth camera systems.


The main camera 10 and the support cameras 30 may use the same imaging modalities or different imaging modalities, and each of the main camera 10 and support cameras 30 may capture images using one or more different imaging modalities. Examples of imaging modalities include monochrome, color, infrared, ultraviolet, thermal, polarization, and combinations thereof (e.g., polarized color, polarized infrared, unpolarized ultraviolet, etc.).


The interaction between light and transparent objects is rich and complex, but the material of an object determines its transparency under visible light. For many transparent household objects, the majority of visible light passes straight through and a small portion (~4% to ~8%, depending on the refractive index) is reflected. This is because light in the visible portion of the spectrum has insufficient energy to excite atoms in the transparent object. As a result, the texture (e.g., appearance) of objects behind the transparent object (or visible through the transparent object) dominates the appearance of the transparent object. For example, when looking at a transparent glass cup or tumbler on a table, the appearance of the objects on the other side of the tumbler (e.g., the surface of the table) generally dominates what is seen through the cup. This property leads to some difficulties when attempting to detect surface characteristics of transparent objects, such as glass windows and glossy, transparent layers of paint, based on intensity images alone.



FIG. 1B is a high-level depiction of the interaction of light with transparent objects and non-transparent (e.g., diffuse and/or reflective) objects. As shown in FIG. 1B, in some embodiments the main camera 10 includes a polarization camera 11 that captures polarization raw frames of a scene that includes a transparent object 41 in front of an opaque background object 42. A light ray 43 hitting the image sensor 14 of the polarization camera contains polarization information from both the transparent object 41 and the background object 42. The small fraction of reflected light 44 from the transparent object 41 is heavily polarized, and thus has a large impact on the polarization measurement, in contrast to the light 45 reflected off the background object 42 and passing through the transparent object 41.
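
The small reflected fraction discussed above is consistent with the standard Fresnel reflectance of a single air-to-dielectric interface at normal incidence, ((n − 1)/(n + 1))². The following Python sketch evaluates that expression as an illustration only; the function name and the example refractive indices are assumptions, and the sketch ignores oblique incidence, polarization dependence, and reflection from the back surface of the object.

```python
def normal_incidence_reflectance(n_object, n_ambient=1.0):
    """Fraction of light reflected at a single interface at normal incidence."""
    return ((n_object - n_ambient) / (n_object + n_ambient)) ** 2


# Assumed refractive indices: about 1.5 for common glass, about 1.8 for a denser glass.
reflectance_common_glass = normal_incidence_reflectance(1.5)
reflectance_dense_glass = normal_incidence_reflectance(1.8)

for n in (1.5, 1.8):
    print(f"n = {n}: {100 * normal_incidence_reflectance(n):.1f}% reflected")
```

Under these assumptions the reflected fraction is about 4% for a refractive index of 1.5 and about 8% for a refractive index of 1.8.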


Similarly, a light ray hitting the surface of an object may interact with the shape of the surface in various ways. For example, a surface with a glossy paint may behave substantially similarly to a transparent object in front of an opaque object as shown in FIG. 1B, where interactions between the light ray and a transparent or translucent layer (or clear coat layer) of the glossy paint causes the light reflecting off of the surface to be polarized based on the characteristics of the transparent or translucent layer (e.g., based on the thickness and surface normals of the layer), which are encoded in the light ray hitting the image sensor. Similarly, as discussed in more detail below with respect to shape from polarization (SfP) theory, variations in the shape of the surface (e.g., direction of the surface normals) may cause significant changes in the polarization of light reflected by the surface of the object. For example, smooth surfaces may generally exhibit the same polarization characteristics throughout, but a scratch or a dent in the surface changes the direction of the surface normals in those areas, and light hitting scratches or dents may be polarized, attenuated, or reflected in ways different than in other portions of the surface of the object. Models of the interactions between light and matter generally consider three fundamentals: geometry, lighting, and material. Geometry is based on the shape of the material. Lighting includes the direction and color of the lighting. Material can be parameterized by the refractive index or angular reflection/transmission of light. This angular reflection is known as a bi-directional reflectance distribution function (BRDF), although other functional forms may more accurately represent certain scenarios. For example, the bidirectional subsurface scattering distribution function (BSSRDF) would be more accurate in the context of materials that exhibit subsurface scattering (e.g. marble or wax).


A light ray 43 hitting the image sensor 14 of a polarization camera has three measurable components: the intensity of light (intensity image/I), the percentage or proportion of light that is linearly polarized (degree of linear polarization/DOLP/ρ), and the direction of that linear polarization (angle of linear polarization/AOLP/ϕ). These properties encode information about the surface curvature and material of the object being imaged, which can be used by the shape estimator 100 to detect transparent objects, as described in more detail below. In some embodiments, by using one or more polarization cameras, the shape estimator 100 can detect the shapes of optically challenging objects (e.g., that include surfaces made of materials having optically challenging properties such as transparency, reflectivity, or dark matte surfaces) based on similar polarization properties of light passing through translucent objects and/or light interacting with multipath inducing objects or by non-reflective objects (e.g., matte black objects).


In more detail, the polarization camera 11 may further include a polarizer or polarizing filter or polarization mask 16 placed in the optical path between the scene 1 and the image sensor 14. According to various embodiments of the present disclosure, the polarizer or polarization mask 16 is configured to enable the polarization camera 11 to capture images of the scene 1 with the polarizer set at various specified angles (e.g., at 45° rotations or at 60° rotations or at non-uniformly spaced rotations).


As one example, FIG. 1B depicts an embodiment where the polarization mask 16 is a polarization mosaic aligned with the pixel grid of the image sensor 14 in a manner similar to a red-green-blue (RGB) color filter (e.g., a Bayer filter) of a color camera. In a manner similar to how a color filter mosaic filters incoming light based on wavelength such that each pixel in the image sensor 14 receives light in a particular portion of the spectrum (e.g., red, green, or blue) in accordance with the pattern of color filters of the mosaic, a polarization mask 16 using a polarization mosaic filters light based on linear polarization such that different pixels receive light at different angles of linear polarization (e.g., at 0°, 45°, 90°, and 135°, or at 0°, 60°, and 120°). Accordingly, the polarization camera 11 using a polarization mask 16 such as that shown in FIG. 1B is capable of concurrently or simultaneously capturing light at four different linear polarizations. One example of a polarization camera is the Blackfly® S Polarization Camera produced by FLIR® Systems, Inc. of Wilsonville, Oregon.
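
As a minimal sketch of how a single mosaic capture may be separated into per-angle polarization raw frames, the following Python example slices a raw image according to a repeating 2×2 pattern of polarizer angles. The particular layout (90° and 45° on the first row, 135° and 0° on the second row), the function name, and the pixel values are assumptions for illustration; the actual layout depends on the image sensor used.

```python
# ASSUMED 2x2 superpixel layout: (row offset, column offset) -> polarizer angle in degrees.
ASSUMED_LAYOUT = {(0, 0): 90, (0, 1): 45, (1, 0): 135, (1, 1): 0}


def split_polarization_mosaic(raw, layout=ASSUMED_LAYOUT):
    """Split a mosaic image (a list of rows) into one quarter-resolution frame per angle."""
    frames = {}
    for (row_offset, col_offset), angle in layout.items():
        # Take every second row and every second column, starting at the offsets.
        frames[angle] = [row[col_offset::2] for row in raw[row_offset::2]]
    return frames


# Hypothetical 4x4 raw capture (two superpixels across, two superpixels down).
raw = [
    [10, 20, 11, 21],
    [30, 40, 31, 41],
    [12, 22, 13, 23],
    [32, 42, 33, 43],
]
frames = split_polarization_mosaic(raw)
```

Each resulting frame has half the width and half the height of the raw capture; a practical pipeline may additionally interpolate the missing samples in a manner analogous to demosaicing a color filter mosaic.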


While the above description relates to some possible implementations of a polarization camera using a polarization mosaic, embodiments of the present disclosure are not limited thereto and encompass other types of polarization cameras that are capable of capturing images at multiple different polarizations. For example, the polarization mask 16 may have fewer than four polarizations or more than four different polarizations, or may have polarizations at different angles than those stated above (e.g., at angles of polarization of 0°, 60°, and 120° or at angles of polarization of 0°, 30°, 60°, 90°, 120°, and 150°). As another example, the polarization mask 16 may be implemented using an electronically controlled polarization mask, such as an electro-optic modulator (e.g., which may include a liquid crystal layer), where the polarization angles of the individual pixels of the mask may be independently controlled, such that different portions of the image sensor 14 receive light having different polarizations. As another example, the electro-optic modulator may be configured to transmit light of different linear polarizations when capturing different frames, e.g., so that the camera captures images with the entirety of the polarization mask set, sequentially, to different linear polarizer angles (e.g., sequentially set to 0 degrees, 45 degrees, 90 degrees, and 135 degrees). As another example, the polarization mask 16 may include a polarizing filter that rotates mechanically, such that different polarization raw frames are captured by the polarization camera 11 with the polarizing filter mechanically rotated with respect to the lens 18 to transmit light at different angles of polarization to the image sensor 14. 
Furthermore, while the above examples relate to the use of a linear polarizing filter, embodiments of the present disclosure are not limited thereto and also include the use of polarization cameras that include circular polarizing filters (e.g., linear polarizing filters with a quarter wave plate). Accordingly, in various embodiments of the present disclosure, a polarization camera uses a polarizing filter to capture multiple polarization raw frames at different polarizations of light, such as different linear polarization angles and different circular polarizations (e.g., handedness).


As a result, the polarization camera 11 captures multiple input images (or polarization raw frames) of the scene including the surfaces of the objects 22. In some embodiments, each of the polarization raw frames corresponds to an image taken behind a polarization filter or polarizer at a different angle of polarization ϕpol (e.g., 0 degrees, 45 degrees, 90 degrees, or 135 degrees). Each of the polarization raw frames is captured from substantially the same pose with respect to the scene 1 (e.g., the images captured with the polarization filter at 0 degrees, 45 degrees, 90 degrees, or 135 degrees are all captured by a same polarization camera 11 located at a same location and orientation), as opposed to capturing the polarization raw frames from disparate locations and orientations with respect to the scene. The polarization camera 11 may be configured to detect light in a variety of different portions of the electromagnetic spectrum, such as the human-visible portion of the electromagnetic spectrum, red, green, and blue portions of the human-visible spectrum, as well as invisible portions of the electromagnetic spectrum such as infrared and ultraviolet.



FIG. 2A is a perspective view of a camera array 10′ according to one embodiment of the present disclosure. FIG. 2B is a cross sectional view of a portion of a camera array 10′ according to one embodiment of the present disclosure. Some aspects of embodiments of the present disclosure relate to a camera array in which multiple cameras (e.g., cameras having different imaging modalities and/or sensitivity to different spectra) are arranged adjacent to one another and in an array and may be controlled to capture images in a group (e.g., a single trigger may be used to control all of the cameras in the system to capture images concurrently or substantially simultaneously). In some embodiments, the individual cameras are arranged such that parallax shift between cameras is substantially negligible based on the designed operating distance of the camera system to objects 2 and 3 in the scene 1, where larger spacings between the cameras may be tolerated when the designed operating distance is large.



FIG. 2B shows a cross sectional view of two of the cameras or camera modules 10A′ and 10B′ of the camera array 10′ shown in FIG. 2A. As seen in FIG. 2B, each camera or camera module (10A′ and 10B′) includes a corresponding lens, a corresponding image sensor, and may include one or more corresponding filters. For example, in some embodiments, camera 10A′ is a visible light color camera that includes lens 12A′, image sensor 14A′, and color filter 16A′ (e.g., a Bayer filter). In the embodiment shown in FIG. 2B, the filter 16 is located behind the lens 12 (e.g., between the lens 12 and the image sensor 14), but embodiments of the present disclosure are not limited thereto. In some embodiments, the filter 16 is located in front of the lens 12, and in some embodiments, the filter 16 may include multiple separate components, where some components are located in front of the lens and other components are located behind the lens (e.g., a polarizing filter in front of the lens 12 and a color filter behind the lens 12). In some embodiments, camera 10B′ is a polarization camera that includes lens 12B′, image sensor 14B′, and polarizing filter 16B′ (a polarization camera may also include a visible light color filter or other filter for passing a particular portion of the electromagnetic spectrum, such as an infrared filter, ultraviolet filter, and the like). In some embodiments of the present disclosure, the image sensors of the four cameras 10A′, 10B′, 10C′, and 10D′ are monolithically formed on a same semiconductor die, and the four cameras are located in a same housing with separate apertures for the lenses 12 corresponding to the different image sensors. 
Similarly, the filters 16 may correspond to different portions of a single physical layer that has different optical filter functions (e.g., different linear polarizing angles or circular polarizers, color filters with corresponding spectral response functions, and the like) in different regions of the layer (corresponding to the different cameras). In some embodiments, a filter 16 of a polarization camera includes a polarization mask 16 similar to the Sony® IMX250MZR sensor, which includes a polarization mosaic aligned with the pixel grid of the image sensor 14 in a manner similar to a red-green-blue (RGB) color filter (e.g., a Bayer filter) of a color camera. In a manner similar to how a color filter mosaic filters incoming light based on wavelength such that each pixel in the image sensor 14 receives light in a particular portion of the spectrum (e.g., red, green, or blue) in accordance with the pattern of color filters of the mosaic, a polarization mask 16 using a polarization mosaic filters light based on linear polarization such that different pixels receive light at different angles of linear polarization (e.g., at 0°, 45°, 90°, and 135°, or at 0°, 60°, and 120°). Accordingly, a camera of the camera array 10′ may use a polarization mask 16 to concurrently or simultaneously capture light at four different linear polarizations.


In some embodiments, a demosaicing process is used to compute separate red, green, and blue channels from the raw data. In some embodiments of the present disclosure, each polarization camera may be used without a color filter or with filters used to transmit or selectively transmit various other portions of the electromagnetic spectrum, such as infrared light.


As noted above, embodiments of the present disclosure relate to multi-modal and/or multi-spectral camera arrays. Accordingly, in various embodiments of the present disclosure, the cameras within a particular camera array include cameras configured to perform imaging in a plurality of different modalities and/or to capture information in a plurality of different spectra.


As one example, in some embodiments, the first camera 10A′ is a visible light camera that is configured to capture color images in a visible portion of the electromagnetic spectrum, such as by including a Bayer color filter 16A′ (and, in some cases, a filter to block infrared light), and the second camera 10B′, third camera 10C′, and fourth camera 10D′ are polarization cameras having different polarization filters, such as filters having linear polarization angles of 0°, 60°, and 120°, respectively. The polarizing filters in the optical paths of each of the cameras in the array cause differently polarized light to reach the image sensors of the cameras. The individual polarization cameras in the camera array have optical axes that are substantially parallel to one another, are placed adjacent to one another, and have substantially the same field of view, such that the cameras in the camera array capture substantially the same view of a scene as the visible light camera 10A′, but with different polarizations. While the embodiment shown in FIG. 2A includes a 2×2 array of four cameras, three of which are polarization cameras, embodiments of the present disclosure are not limited thereto, and the camera array may include more than three polarization cameras, each having a polarizing filter with a different polarization state (e.g., a camera array may have four polarization cameras along with the visible light color camera 10A′, where the polarization cameras may have polarization filters with angles of linear polarization such as 0°, 45°, 90°, and 135°). In some embodiments, one or more of the cameras may include a circular polarizer.


As another example, one or more of the cameras in the camera array 10′ may operate in other imaging modalities and/or other imaging spectra, such as polarization, near infrared, far infrared, shortwave infrared (SWIR), longwave infrared (LWIR) or thermal, ultraviolet, and the like, by including appropriate filters 16 (e.g., filters that pass light having particular polarizations, near-infrared light, SWIR light, LWIR light, ultraviolet light, and the like) and/or image sensors 14 (e.g., image sensors optimized for particular wavelengths of electromagnetic radiation) for the particular modality and/or portion of the electromagnetic spectrum.


For example, in the embodiment of the camera array 10′ shown in FIG. 2A, four cameras 10A′, 10B′, 10C′, and 10D′ are arranged in a 2×2 grid to form a camera array, where the four cameras have substantially parallel optical axes. In addition, the optical axes of the camera modules of the camera array are arranged close together such that the camera modules capture images from substantially the same viewpoint with respect to the objects in the scene 1. One of skill in the art would understand that the acceptable spacing between the optical axes of the camera modules within an array in order to capture images of the scene from substantially the same viewpoint depends on the working distance to objects 22 in the scene, where longer working distances allow for larger spacing between the optical axes while shorter working distances may require closer or tighter spacing between the optical axes. The four cameras may be controlled together such that they capture images substantially simultaneously. In some embodiments, the four cameras are configured to capture images using the same exposure settings (e.g., same aperture, length of exposure, and gain or “ISO” settings). In some embodiments, the exposure settings for the different cameras can be controlled independently from one another (e.g., different settings for each camera), where the shape estimator 100 jointly or holistically sets the exposure settings for the cameras based on the current conditions of the scene 1 and the characteristics of the imaging modalities and spectral responses of the cameras 10A′, 10B′, 10C′, and 10D′ of the camera array 10′.


In some embodiments, the various individual cameras of the camera array are registered with one another by determining their relative poses (or relative positions and orientations) by capturing multiple images of a calibration target, such as a checkerboard pattern, an ArUco target (see, e.g., Garrido-Jurado, Sergio, et al. “Automatic generation and detection of highly reliable fiducial markers under occlusion.” Pattern Recognition 47.6 (2014): 390-402.) or a ChArUco target (see, e.g., An, Gwon Hwan, et al. “Charuco board-based omnidirectional camera calibration method.” Electronics 7.12 (2018): 421.). In particular, the process of calibrating the cameras using the targets may include computing intrinsic matrices characterizing the internal parameters of each camera (e.g., matrices characterizing the focal length, image sensor format, and principal point of the camera) and extrinsic matrices characterizing the pose of each camera with respect to world coordinates (e.g., matrices for performing transformations between camera coordinate space and world or scene coordinate space). Different cameras within a camera array may have image sensors with different sensor formats (e.g., aspect ratios) and/or different resolutions without limitation, and the computed intrinsic and extrinsic parameters of the individual cameras enable the shape estimator 100 to map different portions of the different images to a same coordinate space (where possible, such as where the fields of view overlap).
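
To illustrate how intrinsic and extrinsic matrices of the kind described above may be used together, the following Python sketch projects a point in world coordinates to pixel coordinates. The sketch assumes an ideal pinhole camera without lens distortion; the function name, the 3×4 world-to-camera layout of the extrinsic matrix, and the numeric values are hypothetical.

```python
def project_point(K, world_to_camera, point_world):
    """Project a 3-D world point to pixel coordinates with an ideal pinhole model."""
    xw = [point_world[0], point_world[1], point_world[2], 1.0]
    # Extrinsic step: world (scene) coordinates -> camera coordinates.
    xc = [sum(world_to_camera[i][j] * xw[j] for j in range(4)) for i in range(3)]
    if xc[2] <= 0.0:
        raise ValueError("point is not in front of the camera")
    # Intrinsic step: camera coordinates -> homogeneous pixel coordinates.
    p = [sum(K[i][j] * xc[j] for j in range(3)) for i in range(3)]
    return p[0] / p[2], p[1] / p[2]


# Assumed intrinsic matrix: focal length of 800 pixels, principal point at (320, 240).
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]

# Assumed extrinsic matrix [R | T]: the camera frame coincides with the world frame.
world_to_camera = [[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]]

u, v = project_point(K, world_to_camera, (0.1, -0.05, 2.0))
```

With these assumed parameters the point projects to approximately pixel (360, 220). Applying the same projection with the parameters of each camera in an array is one way that corresponding portions of different images may be mapped to a same coordinate space.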



FIG. 2C is a perspective view of a stereo camera array system 10 according to one embodiment of the present disclosure. For some applications, stereo vision techniques are used to capture multiple images of a scene from different perspectives. As noted above, in some embodiments of the present disclosure, individual cameras (or camera modules) within a camera array 10′ are placed adjacent to one another such that parallax shifts between the cameras are small or substantially negligible based on the designed operating distance of the camera system to the subjects being imaged (e.g., where the parallax shifts between cameras of a same array are less than a pixel for objects at the operating distance). In addition, as noted above, in some embodiments, differences in the poses of the individual cameras within a camera array 10′ are corrected through image registration based on the calibrations (e.g., computed intrinsic and extrinsic parameters) of the cameras such that the images are aligned to a same coordinate system for the viewpoint of the camera array.


In stereo camera array systems according to some embodiments, the camera arrays are spaced apart from one another such that parallax shifts between the viewpoints corresponding to the camera arrays are detectable for objects in the designed operating distance of the camera system. This enables the distances to various surfaces in a scene (the “depth”) to be detected in accordance with a disparity measure or a magnitude of a parallax shift (e.g., larger parallax shifts in the locations of corresponding portions of the images indicate that those corresponding portions are on surfaces that are closer to the camera system and smaller parallax shifts indicate that the corresponding portions are on surfaces that are farther away from the camera system). These techniques for computing depth based on parallax shifts are sometimes referred to as Depth from Stereo.
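
The inverse relationship between parallax shift and depth described above may be illustrated with the standard rectified-stereo relation Z = f·B/d, where f is the focal length in pixels, B is the baseline, and d is the disparity in pixels. The following Python sketch assumes ideal rectified pinhole cameras with parallel optical axes; the function names and the numeric values are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth (in meters) of a surface from its disparity between two rectified views."""
    if disparity_px <= 0.0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length_px * baseline_m / disparity_px


def disparity_from_depth(depth_m, focal_length_px, spacing_m):
    """Parallax shift (in pixels) between two cameras for a surface at a given depth."""
    return focal_length_px * spacing_m / depth_m


FOCAL_LENGTH_PX = 1000.0  # assumed focal length in pixels

# Across an assumed 0.1 m baseline, a larger disparity corresponds to a closer surface.
near_depth = depth_from_disparity(50.0, FOCAL_LENGTH_PX, 0.1)
far_depth = depth_from_disparity(25.0, FOCAL_LENGTH_PX, 0.1)

# Within one array, an assumed 0.01 m camera spacing at a 20 m operating distance
# produces a sub-pixel parallax shift.
within_array_shift = disparity_from_depth(20.0, FOCAL_LENGTH_PX, 0.01)
```

Under these assumptions, disparities of 50 pixels and 25 pixels correspond to depths of about 2 m and 4 m, respectively, while the shift between adjacent cameras of a same array is about half a pixel.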


Accordingly, FIG. 2C depicts a stereo camera array system 10 having a first camera array 10-1′ and a second camera array 10-2′ having substantially parallel optical axes and spaced apart along a baseline 10-B. In the embodiments shown in FIG. 2C, the first camera array 10-1′ includes cameras 10A′, 10B′, 10C′, and 10D′ arranged in a 2×2 array similar to that shown in FIG. 2A and FIG. 2B. Likewise, the second camera array 10-2′ includes cameras 10E′, 10F′, 10G′, and 10H′ arranged in a 2×2 array, and the overall stereo camera array system 10 includes eight individual cameras (e.g., eight separate image sensors behind eight separate lenses). In some embodiments of the present disclosure, corresponding cameras of the camera arrays 10-1′ and 10-2′ are of the same type or, in other words, configured to capture raw frames or images using substantially the same imaging modalities or in substantially the same spectra. In the specific embodiment shown in FIG. 2C, cameras 10A′ and 10E′ may be of a same first type, cameras 10B′ and 10F′ may be of a same second type, cameras 10C′ and 10G′ may be of a same third type, and cameras 10D′ and 10H′ may be of a same fourth type. For example, cameras 10A′ and 10E′ may both have linear polarizing filters at a same angle of 0°, cameras 10B′ and 10F′ may both have linear polarizing filters at a same angle of 45°, cameras 10C′ and 10G′ may both be viewpoint-independent cameras having no polarization filter (NF), such as near-infrared cameras, and cameras 10D′ and 10H′ may both have linear polarizing filters at a same angle of 90°. 
As another example, cameras 10A′ and 10E′ may both be viewpoint-independent cameras such as visible light cameras without polarization filters, cameras 10B′ and 10F′ may both be thermal cameras, cameras 10C′ and 10G′ may both have polarization masks with a mosaic pattern of polarization filters at different angles of polarization (e.g., a repeating pattern with polarization angles of 0°, 45°, 90°, and 135°), and cameras 10D′ and 10H′ may both be thermal (LWIR) cameras.


While some embodiments are described above wherein each array includes cameras of different types in a same arrangement, embodiments of the present disclosure are not limited thereto. For example, in some embodiments, the arrangements of cameras within a camera array are mirrored along an axis perpendicular to the baseline 10-B. For example, cameras 10A′ and 10F′ may be of a same first type, cameras 10B′ and 10E′ may be of a same second type, cameras 10C′ and 10H′ may be of a same third type, and cameras 10D′ and 10G′ may be of a same fourth type.


In a manner similar to that described for calibrating or registering cameras within a camera array, the various polarization camera arrays of a stereo camera array system may also be registered with one another by capturing multiple images of calibration targets and computing intrinsic and extrinsic parameters for the various camera arrays. The camera arrays of a stereo camera array system 10 may be rigidly attached to a common rigid support structure 10-S in order to keep their relative poses substantially fixed (e.g., to reduce the need for recalibration to recompute their extrinsic parameters). The baseline 10-B between camera arrays is configurable in the sense that the distance between the camera arrays may be tailored based on a desired or expected operating distance to objects in a scene—when the operating distance is large, the baseline 10-B or spacing between the camera arrays may be longer, whereas the baseline 10-B or spacing between the camera arrays may be shorter (thereby allowing a more compact stereo camera array system) when the operating distance is smaller.


As noted above with respect to FIG. 1B, a light ray 43 hitting the image sensor 14 of a polarization camera 10 has three measurable components: the intensity of light (intensity image/I), the percentage or proportion of light that is linearly polarized (degree of linear polarization/DOLP/ρ), and the direction of that linear polarization (angle of linear polarization/AOLP/ϕ).


Measuring intensity I, DOLP ρ, and AOLP ϕ at each pixel requires 3 or more polarization raw frames of a scene taken behind polarizing filters (or polarizers) at different angles, ϕpol (e.g., because there are three unknown values to be determined: intensity I, DOLP ρ, and AOLP ϕ). For example, a polarization camera such as those described above with respect to FIG. 1B captures polarization raw frames with four different polarization angles ϕpol, e.g., 0 degrees, 45 degrees, 90 degrees, and 135 degrees, thereby producing four polarization raw frames Iϕpol, denoted herein as I0, I45, I90, and I135, and a camera module in accordance with some embodiments of FIGS. 2A, 2B, and 2C may capture polarization raw frames at three different polarization angles ϕpol, e.g., 0 degrees, 60 degrees, and 120 degrees, thereby producing three polarization raw frames Iϕpol denoted herein as I0, I60, and I120.


The relationship between Iϕpol and intensity I, DOLP ρ, and AOLP ϕ at each pixel can be expressed as:

Iϕpol=I(1+ρ cos(2(ϕ−ϕpol)))  (7)


Accordingly, with four different polarization raw frames Iϕpol (I0, I45, I90, and I135), a system of four equations can be used to solve for the intensity I, DOLP ρ, and AOLP ϕ.
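
For the four-angle case, substituting ϕpol = 0°, 45°, 90°, and 135° into equation (7) gives I0 − I90 = 2Iρ cos(2ϕ) and I45 − I135 = 2Iρ sin(2ϕ), while the four frames average to I. The following per-pixel sketch applies those identities. The function names are hypothetical and the numeric values are illustrative only.

```python
import math
from typing import Tuple


def raw_frame(intensity: float, dolp: float, aolp: float, polarizer_angle: float) -> float:
    """Equation (7): value measured behind a linear polarizer at polarizer_angle (radians)."""
    return intensity * (1.0 + dolp * math.cos(2.0 * (aolp - polarizer_angle)))


def solve_polarization(i0: float, i45: float, i90: float, i135: float) -> Tuple[float, float, float]:
    """Recover (I, rho, phi) at one pixel from raw frames at 0, 45, 90 and 135 degrees."""
    intensity = (i0 + i45 + i90 + i135) / 4.0   # the cosine terms cancel in the sum
    if intensity == 0.0:
        return 0.0, 0.0, 0.0                     # no light: DOLP and AOLP are undefined
    c = i0 - i90                                 # equals 2 * I * rho * cos(2 * phi)
    s = i45 - i135                               # equals 2 * I * rho * sin(2 * phi)
    dolp = math.hypot(c, s) / (2.0 * intensity)
    aolp = (0.5 * math.atan2(s, c)) % math.pi    # AOLP is only defined modulo pi
    return intensity, dolp, aolp


# Round trip on synthetic values (assumed numbers, not taken from the disclosure).
frames = [raw_frame(2.0, 0.3, 0.4, math.radians(a)) for a in (0, 45, 90, 135)]
intensity, dolp, aolp = solve_polarization(*frames)
```

Because four equations constrain three unknowns, the system is overdetermined. For noisy measurements, this closed form coincides with the least-squares fit of equation (7) at these four angles.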


Shape from Polarization (SfP) theory (see, e.g., Gary A Atkinson and Edwin R Hancock. Recovery of surface orientation from diffuse polarization. IEEE transactions on image processing, 15(6):1653-1664, 2006.) states that the relationship between the refractive index (n), azimuth angle (θa) and zenith angle (θz) of the surface normal of an object and the ϕ and ρ components of the light ray coming from that object follow the following characteristics when diffuse reflection is dominant:

\[
\rho = \frac{\left(n - \frac{1}{n}\right)^{2} \sin^{2}(\theta_z)}{2 + 2n^{2} - \left(n + \frac{1}{n}\right)^{2} \sin^{2}\theta_z + 4\cos\theta_z\sqrt{n^{2} - \sin^{2}\theta_z}} \tag{8}
\]

\[
\phi = \theta_a \tag{9}
\]

and when the specular reflection is dominant:

\[
\rho = \frac{2\sin^{2}\theta_z \cos\theta_z \sqrt{n^{2} - \sin^{2}\theta_z}}{n^{2} - \sin^{2}\theta_z - n^{2}\sin^{2}\theta_z + 2\sin^{4}\theta_z} \tag{10}
\]

\[
\phi = \theta_a - \frac{\pi}{2} \tag{11}
\]

Note that in both cases ρ increases exponentially as θz increases and, if the refractive index is the same, specular reflection is much more polarized than diffuse reflection.
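
The forward models of equations (8) through (11) can be evaluated numerically to compare the two reflection regimes. The sketch below uses hypothetical function names, and the refractive index n = 1.5 and the 30° zenith angle are assumed values chosen only for illustration.

```python
import math


def dolp_diffuse(n: float, zenith: float) -> float:
    """Equation (8): DOLP when diffuse reflection is dominant (zenith in radians)."""
    s2 = math.sin(zenith) ** 2
    root = math.sqrt(n ** 2 - s2)
    numerator = (n - 1.0 / n) ** 2 * s2
    denominator = (2.0 + 2.0 * n ** 2 - (n + 1.0 / n) ** 2 * s2
                   + 4.0 * math.cos(zenith) * root)
    return numerator / denominator


def dolp_specular(n: float, zenith: float) -> float:
    """Equation (10): DOLP when specular reflection is dominant (zenith in radians)."""
    s2 = math.sin(zenith) ** 2
    root = math.sqrt(n ** 2 - s2)
    numerator = 2.0 * s2 * math.cos(zenith) * root
    denominator = n ** 2 - s2 - n ** 2 * s2 + 2.0 * s2 ** 2
    return numerator / denominator


def aolp_from_azimuth(azimuth: float, specular: bool = False) -> float:
    """Equations (9) and (11): AOLP equals the azimuth, shifted by pi/2 when specular."""
    return azimuth - math.pi / 2.0 if specular else azimuth


# Assumed example values: n = 1.5 and a zenith angle of 30 degrees.
zenith = math.radians(30.0)
rho_diffuse = dolp_diffuse(1.5, zenith)
rho_specular = dolp_specular(1.5, zenith)
```

For these assumed values the specular DOLP is substantially larger than the diffuse DOLP, consistent with the observation above that specular reflection is more strongly polarized for the same refractive index.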


Accordingly, some aspects of embodiments of the present disclosure relate to applying SfP theory to detect or measure the gradients of surfaces (e.g., the orientation of surfaces or their surface normals or directions perpendicular to the surfaces) based on the raw polarization frames of the objects, as captured by the polarization cameras among the main camera 10 and the support cameras 30. Computing these gradients produces a gradient map (or slope map or surface normals map) identifying the slope of the surface depicted at each pixel in the gradient map. These gradient maps can then be used when estimating the shape and/or pose of the object by supplying these gradient maps or surface normals maps to a trained computer vision model (e.g., a convolutional neural network) and/or by aligning a pre-existing 3-D model (e.g., CAD model) of the object with the measured surface normals (gradients or slopes) of the object based on the slopes of the surfaces of the 3-D model, as described in more detail below.


One example of an imaging system according to embodiments of the present disclosure includes a stereo pair of 2×2 camera arrays, in an arrangement similar to that shown in FIG. 2C. Each 2×2 camera array includes three color (RGB) cameras with polarization filters at different angles to capture a diverse range of polarization signatures of the scene in the spectral bands (red, green, and blue) and a fourth near-IR camera without a polarization filter to capture the scene in the near-IR spectral band. This stereo pair of 2×2 camera arrays may be combined with other cameras located at different viewpoints with respect to the scene, thereby providing a multi-view imaging system. The other cameras may also be similar stereo camera arrays (e.g., similar stereo pairs of 2×2 camera arrays) or monocular camera arrays (e.g., single camera arrays of closely-spaced camera modules), and the camera arrays, in the stereo or monocular case, may have different arrangements and numbers of camera modules in the array (e.g., a 3×2 arrangement of 6 camera modules), and where the camera modules may operate in different modalities (e.g., thermal, ultraviolet, depth from time of flight, polarization, and the like).


Pose Estimation of Known Objects Based on Captured Polarization Information


In some circumstances, the shape estimator 100 has access to 3-D models or computer aided design (CAD) models representing idealized or canonical versions of the objects 22 imaged by the imaging system. These circumstances generally correspond to conditions in which the objects 22 are standardized components that are produced in accordance with those 3-D models, and where each particular real-world instance of the object is substantially identical to each other instance and therefore can be accurately represented by its corresponding known 3-D model. The 3-D model may have been previously generated during the design of the standardized component (e.g., as part of the process of creating the molds) or may be generated through performing a 3-D scan of a part (e.g., using a laser 3-D scanner). Examples of these types of components include manufactured parts, which may be formed through injection molding (in the case of plastics) or casting (in the case of metals). Various surface treatments may be applied to the surfaces of the manufactured parts, which may cause the surfaces of the instances of the objects to have different appearances (e.g., metal parts may be plated, plastic parts may be metalized or coated in metals, various parts may be painted or dyed, and parts may be polished or roughened, and the like).


Examples of techniques for computing estimated poses of known objects for which a 3-D model is available are described in more detail in International Patent Application No. PCT/US21/15926, “Systems and Methods for Object Pose Detection and Measurement,” filed in the United States Patent and Trademark Office on Jan. 29, 2021, U.S. patent application Ser. No. 17/232,084 “Systems and Methods for Six-Degree of Freedom Pose Estimation of Deformable Objects,” filed in the United States Patent and Trademark Office on Apr. 15, 2021, and U.S. patent application Ser. No. 17/314,929, “System and Method for Using Computer Vision to Pick Up Small Objects,” filed in the United States Patent and Trademark Office on May 7, 2021, the entire disclosures of which are incorporated by reference herein.


Generally, some approaches for computing estimated poses of known objects for which a 3-D model is available include determining a class or type of the object (e.g., a known or expected object) and aligning that corresponding 3-D model of the object (e.g., a canonical or ideal version of the object based on known design specifications of the object and/or based on the combination of a collection of samples of the object) with the various views of the object, as captured from different viewpoints around the object. The surface normals of objects in a scene, as computed directly from the polarization information or polarization signatures of surfaces in the scene, provide additional features for properly aligning the 3-D model with the pose of the real-world object in the scene.



FIG. 3 is a flowchart depicting a method for computing six-degree-of-freedom (6-DoF) poses of objects according to some embodiments of the present disclosure.


In operation 310, the shape estimator 100 controls one or more cameras, such as the main camera 10 and the support cameras 30, to capture one or more images of the scene, which may be from multiple viewpoints in the case of multiple cameras. In embodiments using multiple cameras, the cameras are configured to capture images concurrently or substantially simultaneously. Each camera is arranged at a different pose with respect to the scene 1, such that each camera captures the scene from its corresponding different viewpoint. Accordingly, the collection of images captured by multiple cameras represents a collection of multi-viewpoint images of the scene 1. (In some embodiments, the images are captured from multiple viewpoints using one or more cameras, such as by moving the one or more cameras between different viewpoints while keeping the scene fixed, and/or rigidly transforming the scene between captures by the one or more cameras.) The one or more images of the scene may be referred to herein as being “consistent” in that they are all pictures of the same consistent scene but providing different views of the scene from different viewpoints and/or different imaging modalities. This consistency between the images of the scene may be achieved by capturing all of the images substantially simultaneously or concurrently or by requiring that none of the objects of interest in the scene that are depicted in the image have moved (e.g., translated or rotated) in the time between the capture of different images of the scene.


In some circumstances, one or more of the “cameras” are multi-modal cameras that capture multiple images from the same viewpoint, but having different modalities, such as different portions of the electromagnetic spectrum (e.g., red, green and blue portions of the visible light spectrum, near infrared light, far infrared light, ultraviolet light, etc.), different optical filters (e.g., linear polarization filters at different angles and/or circular polarization filters), and combinations thereof. Accordingly, a collection of multi-viewpoint images of a scene does not require that all images be captured from different viewpoints, but only that there are at least two images captured from different viewpoints. Such a collection of multi-viewpoint images therefore may include at least some images that are captured from the same viewpoint.


In the case of a sensing system using multi-viewpoint images or images of a scene from more than one viewpoint, in operation 330, the shape estimator 100 computes object-level correspondences on the multi-viewpoint images of the scene. More specifically, instances of one or more types of objects are identified in the multi-viewpoint images of the scene, and corresponding instances of objects are identified between the multi-viewpoint images. For example, a scene 1 may include two cubes and three spheres, and various of the multi-viewpoint images may depict some or all of these five objects. A process of instance segmentation identifies the pixels in each of the images that depict the five objects, in addition to labeling them separately based on the type or class of object (e.g., a classification as a “sphere” or a “cube”) as well as instance labels (e.g., assigning a unique label to each of the objects, such as numerical labels “1,” “2,” “3,” “4,” and “5”). Computing object-level correspondences between the multi-viewpoint images further relates to computing consistent labels between the different viewpoints (for example, such that the same cube is labeled “1” from each of the viewpoints). Accordingly, the shape estimator 100 generates collections of crops or patches of the multi-viewpoint images of the scene, where each collection of patches depicts the same instance from different viewpoints (cropped to the region containing the object and, in some cases, a small neighborhood or margin around the object).


In the case of a single image depicting a scene from a single viewpoint, in operation 330, the shape estimator 100 may merely compute a segmentation map, which similarly enables the generation of a crop or patch for each object instance detected in the image.
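
The generation of a crop or patch for each detected instance can be sketched as follows, assuming the segmentation map is a 2-D grid of integer instance labels with 0 denoting background. The data layout, the function names, and the one-pixel margin are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom and right are exclusive


def instance_boxes(seg_map: List[List[int]], margin: int = 1) -> Dict[int, Box]:
    """Bounding box of every instance label in a segmentation map (0 = background)."""
    height, width = len(seg_map), len(seg_map[0])
    boxes: Dict[int, Box] = {}
    for row, labels in enumerate(seg_map):
        for col, label in enumerate(labels):
            if label == 0:
                continue
            top, left, bottom, right = boxes.get(label, (row, col, row + 1, col + 1))
            boxes[label] = (min(top, row), min(left, col),
                            max(bottom, row + 1), max(right, col + 1))
    # Grow each box by the margin, clamped to the image bounds.
    return {label: (max(t - margin, 0), max(l - margin, 0),
                    min(b + margin, height), min(r + margin, width))
            for label, (t, l, b, r) in boxes.items()}


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Extract the patch of a row-major image covered by the box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


seg_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 0, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 0],
]
boxes = instance_boxes(seg_map, margin=1)
patch = crop(seg_map, boxes[2])
```

In the multi-viewpoint case, once the instance labels are consistent between viewpoints, the same label can be used to collect the patches of one object across all of the images.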


Systems and methods for computing object-level correspondences are described in International Patent Application No. PCT/US21/15926, titled “SYSTEMS AND METHODS FOR POSE DETECTION AND MEASUREMENT,” filed in the United States Patent and Trademark Office on Jan. 29, 2021, which, as noted above, is incorporated by reference herein in its entirety. For the sake of clarity, some techniques for computing object-level correspondences on images are described herein with reference to FIGS. 4A, 4B, and 4C.


In general terms, computing object-level correspondences reduces a search space for conducting image processing tasks such as, for example, pixel-level correspondence. In one embodiment, instance segmentation is performed to identify different instances of objects in images portraying a scene as viewed from different viewpoints, and instance segmentation maps/masks may be generated in response to the instance segmentation operation. The instance segmentation masks may then be employed for computing object level correspondences.


In one embodiment, object level correspondence allows the matching of a first instance of an object appearing in a first image that depicts a view of a scene from a first viewpoint, to a second instance of the same object appearing in a second image that depicts a view of a scene from a second viewpoint. Once object level correspondence is performed, the search space for performing, for example, pixel-level correspondence, may be limited to the regions of the image that correspond to the same object. Reducing the search space in this manner may result in faster processing of pixel-level correspondence and other similar tasks.



FIG. 4A is a flow diagram of a process for object level correspondence according to one embodiment. The process may be implemented by one or more processing circuits or electronic circuits that are components of the shape estimator 100. It should be understood that the sequence of steps of the process is not fixed, but can be modified, changed in order, performed differently, performed sequentially, concurrently, or simultaneously, or altered into any desired sequence, as recognized by a person of skill in the art. The process described with respect to FIG. 4A may be used, in some embodiments of the present disclosure, to compute object level correspondences in operation 330 of FIG. 3, but embodiments of the present disclosure are not limited thereto.


The process starts, and at block 400, the shape estimator 100 receives multi-view images from the main and support cameras 10, 30. A first image captured by one of the cameras may depict one or more objects in a scene from a first viewpoint, and a second image captured by a second camera may depict the one or more objects in the scene from a second viewpoint different from the first viewpoint. The images captured by the cameras may be, for example, polarized images and/or images that have not undergone any polarization filtering.


At block 402, the shape estimator 100 performs instance segmentation and mask generation based on the captured images. In this regard, the shape estimator 100 classifies various regions (e.g., pixels) of an image captured by a particular camera 10, 30 as belonging to particular classes of objects. Each of the different instances of the objects in the image may also be identified, and unique labels may be applied to each of the different instances of objects, such as by separately labeling each object in the image with a different identifier.


In one embodiment, segmentation masks delineating the various object instances are also generated. Each segmentation mask may be a 2-D image having the same dimensions as the input image, where the value of each pixel may correspond to a label (e.g., a particular instance of the object depicted by the pixel). A different segmentation mask may be generated for different images depicting different viewpoints of the objects of interest. For example, a first segmentation mask may be generated to depict object instances in a first image captured by a first camera, and a second segmentation mask may be generated to depict object instances in a second image captured by a second camera. A convolutional neural network such as, for example, Mask R-CNN, may be employed for generating the segmentation masks.


At block 404, the shape estimator 100 engages in object-level correspondence of the objects identified in the segmentation masks. In this regard, the shape estimator may invoke a matching algorithm to identify a segmented instance of a particular object in one image as corresponding (or matching) a segmented instance of the same object in another image. The matching algorithm may be constrained to search for matching object instances along an epipolar line through an object instance in one image to find a corresponding object instance in a different image. In one embodiment, the matching algorithm compares different features of the regions corresponding to the segmented object instances to estimate the object correspondence. The matching of object instances from one image to another may narrow a search space for other image processing tasks such as, for example, performing pixel level correspondence or keypoint correspondence. The search space may be narrowed to the identified regions of the images that are identified as corresponding to the same object.


At block 406, the shape estimator 100 generates an output based on the object-level correspondence. The output may be, for example, a measure of disparity or an estimated depth (e.g., distance from the cameras 10, 30) of the object based on the disparity between corresponding instances as depicted in the various images. In one embodiment, the output is a three-dimensional reconstruction of the configuration of the object and a 6-DoF pose of the object, as described in more detail below with respect to FIG. 3.



FIG. 4B is a block diagram of an architecture for instance segmentation and mask generation of step 402 according to one embodiment. Input images 410 captured by the various cameras 10, 30 are provided to a deep learning network 412 such as, for example, a CNN backbone. In the embodiments where the images include polarized images, the deep learning network may be implemented as a Polarized CNN backbone as described in PCT Patent Application No. PCT/US2020/048604, also filed as U.S. patent application Ser. No. 17/266,046, the content of which is incorporated herein by reference.


In one embodiment, the deep learning network 412 is configured to generate feature maps based on the input images 410, and employ a region proposal network (RPN) to propose regions of interest from the generated feature maps. The proposals by the CNN backbone may be provided to a box head 414 for performing classification and bounding box regression. In one embodiment, the classification outputs a class label 416 for each of the object instances in the input images 410, and the bounding box regression predicts bounding boxes 418 for the classified objects. In one embodiment, a different class label 416 is provided to each instance of an object.


The proposals by the CNN backbone may also be provided to a mask head 420 for generating instance segmentation masks. The mask head 420 may be implemented as a fully convolutional network (FCN). In one embodiment, the mask head 420 is configured to encode a binary mask for each of the object instances in the input images 410.



FIG. 4C is a more detailed flow diagram of a matching algorithm employed at step 404 (FIG. 4A) for identifying object-level correspondence for a particular object instance in a first segmentation mask according to one embodiment. The process may repeat for all object instances identified in the first segmentation mask. The sequence of steps of the process of FIG. 4C is not fixed, but can be modified, changed in order, performed differently, performed sequentially, concurrently, or simultaneously, or altered into any desired sequence, as recognized by a person of skill in the art.


At block 430, the matching algorithm identifies features of a first object instance in a first segmentation mask. The identified features for the first object instance may include a shape of the region of the object instance, a feature vector in the region, and/or keypoint predictions in the region. The shape of the region for the first object instance may be represented via a set of points sampled along the contours of the region. Where a feature vector in the region is used as the feature descriptor, the feature vector may be an average deep learning feature vector extracted via a convolutional neural network.


At block 432, the matching algorithm identifies an epipolar line through the first object instance in the first segmentation mask.


At block 434, the matching algorithm identifies one or more second object instances in a second segmentation mask that may correspond to the first object instance. A search for the second object instances may be constrained to the epipolar line between the first segmentation map and the second segmentation map that runs through the first object instance. In one embodiment, the matching algorithm searches approximately along the identified epipolar line to identify object instances in the second segmentation mask having a same class identifier as the first object instance. For example, if the first object instance belongs to a “dog” class, the matching algorithm evaluates object instances in the second segmentation mask that also belong to the “dog” class, and ignores objects that belong to a different class (e.g., a “cat” class).
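
The epipolar constraint of blocks 432 and 434 can be sketched as follows, assuming a known fundamental matrix F relating the two views (so that a point x in the first image maps to the line F·x in the second image) and assuming each object instance is summarized by the centroid of its mask. The function names, the centroid representation, and the pixel tolerance are hypothetical.

```python
import math
from typing import Hashable, List, Sequence, Tuple

Line = Tuple[float, float, float]  # (a, b, c) with a*x + b*y + c = 0


def epipolar_line(fundamental: Sequence[Sequence[float]], point: Tuple[float, float]) -> Line:
    """Epipolar line in the second image for a pixel (x, y) in the first image."""
    x, y = point
    a, b, c = (row[0] * x + row[1] * y + row[2] for row in fundamental)
    return a, b, c


def distance_to_line(line: Line, point: Tuple[float, float]) -> float:
    """Perpendicular pixel distance from a point to a line."""
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c) / math.hypot(a, b)


def candidates_near_line(line: Line,
                         instances: List[Tuple[int, Hashable, Tuple[float, float]]],
                         class_id: Hashable, max_distance: float) -> List[int]:
    """Ids of instances (id, class, centroid) of the given class that lie near the line."""
    return [inst_id for inst_id, cls, centroid in instances
            if cls == class_id and distance_to_line(line, centroid) <= max_distance]


# Assumed example: a rectified stereo pair, where epipolar lines are image rows.
F = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
line = epipolar_line(F, (120.0, 80.0))
second_view = [(1, "dog", (95.0, 82.0)), (2, "cat", (100.0, 80.0)), (3, "dog", (90.0, 140.0))]
matches = candidates_near_line(line, second_view, "dog", max_distance=5.0)
```

Here instance 2 is ignored because it belongs to a different class, and instance 3 is ignored because its centroid lies far from the epipolar line, which leaves instance 1 as the only candidate passed on to the feature comparison.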


At block 436, the matching algorithm identifies the features of the second object instances that belong to the same class. As with the first object instance, the features of a particular second object instance may include a shape of the region of the second object instance, a feature vector representing the region, and/or keypoint predictions in the region.


At block 438, the matching algorithm compares the features of the first object instance to the features of second object instances for determining a match. In one embodiment, the matching algorithm identifies a fit between the features of the first object instance and features of the second object instances for selecting a best fit. In one embodiment, the best fit may be identified via a matching function such as the Hungarian matching function. In one embodiment, the features of the object instances are represented as probability distributions, and the matching function attempts to find a match of the probability distributions that minimizes a Kullback-Leibler (KL) divergence.


At block 440, a determination is made as to whether a match has been found. If the answer is YES, an output is generated at block 442. The output may include, for example, information (e.g. object ID) of the second object instance that matched the first object instance.


If the answer is NO, an output may be generated indicating a match failure at block 444.
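
Blocks 438 through 444 can be sketched for the case in which the features are represented as discrete probability distributions. The sketch below selects, for one first object instance, the candidate with the smallest Kullback-Leibler divergence and reports a match failure when no candidate is close enough. It is a per-instance selection and not the Hungarian assignment mentioned above, which would jointly assign all instances. The function names and the divergence threshold are assumptions.

```python
import math
from typing import Dict, Optional, Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def best_match(first: Sequence[float], candidates: Dict[int, Sequence[float]],
               max_divergence: float) -> Optional[int]:
    """Id of the candidate with the smallest divergence, or None on match failure."""
    best_id, best_score = None, float("inf")
    for candidate_id, distribution in candidates.items():
        score = kl_divergence(first, distribution)
        if score < best_score:
            best_id, best_score = candidate_id, score
    return best_id if best_score <= max_divergence else None


# Assumed example: feature distributions of one first instance and two candidates.
first_features = [0.7, 0.2, 0.1]
second_features = {4: [0.1, 0.2, 0.7], 7: [0.65, 0.25, 0.1]}
matched_id = best_match(first_features, second_features, max_divergence=0.1)
failed = best_match(first_features, second_features, max_divergence=0.001)
```

A returned identifier corresponds to the output of block 442, and a returned `None` corresponds to the match failure of block 444.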


Accordingly, object level correspondences can be computed from the multi-viewpoint images. These object level correspondences may be used to extract corresponding crops or patches from the multi-viewpoint images, where each of these crops or patches depicts a single instance of an object, and collections of corresponding crops or patches depict the same instance of an object from multiple viewpoints.


In operation 350, the shape estimator 100 loads a 3-D model of the object based on the detected object type of the one or more objects detected in the scene (e.g., for each detected instance of a type of object). For example, in a circumstance where the collection of objects 22 includes a mixture of different types of flexible printed circuit boards, the process of computing object-level correspondences assigns both an instance identifier and a type (or classification) to each detected instance of a flexible printed circuit board (e.g., which of the different types of printed circuit boards). Therefore, a 3-D model of the object may then be loaded from a library based on the detected object type.


In operation 370, the shape estimator 100 aligns the corresponding 3-D model to the appearances of the object to be consistent with the appearance of the object as seen from the one or more viewpoints. In the case of deformable objects, the alignment process in operation 370 may also include deforming the 3-D model to match the estimated configuration of the actual object in the scene. This alignment of the 3-D model provides the 6-DoF pose of the object in a global coordinate system (e.g., a coordinate system based on the main camera 10 or based on the robot controller 28). Details of aspects of the present disclosure for performing the alignment of a 3-D model with the appearance of an object will be described in more detail below.


Generally, the methods described herein will make use of a 3-D model or computer-aided-design (CAD) model C of the object (e.g., as loaded in operation 350) and observed two-dimensional (2-D) image data I of the object (e.g., as captured by the cameras in operation 310 and with object-level corresponding patches of the images extracted therefrom in operation 330). In some embodiments, the output of the 6-DoF pose estimation technique (computed by the shape estimator 100) includes a mesh M and its 6-DoF pose in a global coordinate system (e.g., 3 dimensional translational and rotational coordinates in a coordinate system oriented with respect to a main camera 10) for each of the detected objects in the scene.


To align a 3-D model with the observed 6-DoF pose of an object in a scene, embodiments of the present disclosure generally attempt to find a pose of the 3-D model that causes its appearance, from one or more virtual cameras, to be consistent with the one or more observed images of the object captured by the cameras 10, 30. Generally, these approaches include detecting keypoints in the object level patches of the images, and transforming the pose of the 3-D model such that the locations of the keypoints in the 3-D model are consistent with the locations of the keypoints in the observed images. In circumstances where the images of the scene also include one or more depth maps, the 3-D model may also be aligned with the depth maps through a 3-D model alignment algorithm such as iterative closest point (ICP).
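
The keypoint consistency criterion can be sketched as a reprojection error: the keypoints of the 3-D model are transformed by a candidate pose, projected through a pinhole camera, and compared with the keypoints detected in the observed image. To stay short, the sketch below restricts the rotation to a single axis rather than a full three-axis rotation. The function names, camera parameters, and keypoint coordinates are assumed for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def rotate_z(point: Point3, angle: float) -> Point3:
    """Rotate a 3-D point about the z axis (a single-axis stand-in for a full rotation)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def project(point: Point3, focal: float, center: Point2) -> Point2:
    """Pinhole projection of a camera-frame point (z > 0) to pixel coordinates."""
    x, y, z = point
    return (focal * x / z + center[0], focal * y / z + center[1])


def pose_keypoints(model: Sequence[Point3], angle: float, translation: Point3,
                   focal: float, center: Point2) -> List[Point2]:
    """Project the model keypoints after applying the candidate pose."""
    posed = []
    for keypoint in model:
        x, y, z = rotate_z(keypoint, angle)
        posed.append(project((x + translation[0], y + translation[1], z + translation[2]),
                             focal, center))
    return posed


def reprojection_error(projected: Sequence[Point2], observed: Sequence[Point2]) -> float:
    """Mean squared pixel distance between projected and observed keypoints."""
    return sum((u - uo) ** 2 + (v - vo) ** 2
               for (u, v), (uo, vo) in zip(projected, observed)) / len(observed)


# Assumed example values: four model keypoints, a 1000-pixel focal length.
model = [(0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (-0.1, 0.0, 0.05), (0.0, -0.1, 0.05)]
focal, center = 1000.0, (320.0, 240.0)
observed = pose_keypoints(model, 0.3, (0.02, -0.01, 1.0), focal, center)  # "true" pose
error_at_truth = reprojection_error(observed, observed)
error_at_guess = reprojection_error(
    pose_keypoints(model, 0.0, (0.0, 0.0, 1.0), focal, center), observed)
```

A pose search would then adjust the candidate rotation and translation to reduce this error, which is zero only when the projected model keypoints coincide with the observed ones.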


In circumstances where the images include surface normals maps (e.g., computed from polarization signatures of the object based on shape-from-polarization, as described above), the pose of the 3-D model is further aligned with the observed surface normals. For example, in some embodiments, the correspondences between the locations of keypoints in the observed images and locations on the 3-D model are identified, and the directions of the surface normals at corresponding portions of the surface normals map are compared against corresponding directions of the surface normals on the 3-D model to compute an error that is used as part of an error function for aligning the pose of the 3-D model with the actual pose of the observed object. In some embodiments, the correspondences are computed based on identifying matching keypoints using a keypoint detector (e.g., a classical keypoint detector or a trained neural network based keypoint detector) and updating an estimated pose of the 3-D model to minimize differences or errors between the locations of the keypoints in the observed images and the locations of the keypoints in the 3-D model of the object, based on a render-and-compare approach (e.g., by using a differentiable rendering engine, where the differences or errors between detected keypoints and locations of keypoints in renderings of the 3-D model are propagated backward through the differentiable rendering engine to update the pose, see, e.g., Labbé, Yann, et al. “CosyPose: Consistent multi-view multi-object 6D pose estimation.” European Conference on Computer Vision. Springer, Cham, 2020.), or based on dense correspondences between surfaces of 3-D models and surfaces of objects, which may be computed as described in more detail below with respect to FIG. 5.
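
One possible form of the surface normal error term is the mean of one minus the cosine similarity between each observed normal and the corresponding normal of the posed 3-D model. The disclosure does not specify the form of the error, so both the cosine-based term and the weighted sum with a keypoint error shown below are assumptions, as are the function names.

```python
import math
from typing import Sequence, Tuple

Vector3 = Tuple[float, float, float]


def normalize(vector: Vector3) -> Vector3:
    """Scale a nonzero vector to unit length."""
    length = math.sqrt(sum(component * component for component in vector))
    return tuple(component / length for component in vector)


def normal_alignment_error(observed: Sequence[Vector3], model: Sequence[Vector3]) -> float:
    """Mean of (1 - cosine similarity) over corresponding normals; 0 when all agree."""
    total = 0.0
    for observed_normal, model_normal in zip(observed, model):
        o, m = normalize(observed_normal), normalize(model_normal)
        total += 1.0 - sum(a * b for a, b in zip(o, m))
    return total / len(observed)


def combined_error(keypoint_error: float, normal_error: float,
                   normal_weight: float = 1.0) -> float:
    """Assumed combination: keypoint term plus a weighted surface normal term."""
    return keypoint_error + normal_weight * normal_error


aligned = normal_alignment_error([(0.0, 0.0, 1.0)], [(0.0, 0.0, 2.0)])
orthogonal = normal_alignment_error([(0.0, 0.0, 1.0)], [(1.0, 0.0, 0.0)])
```

The term is insensitive to the length of the normals and depends only on their directions, so it penalizes orientation mismatch between the observed surface and the posed model.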


Some approaches to aligning 3-D models to their appearances in images relate to computing dense correspondences between surfaces of the object depicted in the one or more images of the scene and surfaces of the 3-D model by rendering images of the 3-D model in an initial (or current) estimated pose.



FIG. 5 is a flowchart depicting a method 500 for computing a pose of an object based on dense correspondences according to some embodiments of the present disclosure. For the sake of clarity, embodiments of the present disclosure will be described with respect to the estimation of the pose of one object in the scene. However, embodiments of the present disclosure are not limited thereto and include embodiments wherein the pose estimator 100 estimates the poses of multiple objects in the scene as depicted in the one or more images captured in operation 310 (e.g., where the poses of the multiple objects may be estimated in parallel or jointly in a combined process).


In operation 510, the pose estimator 100 computes an initial pose estimate of an object based on one or more images of the object, such as the image patches extracted in operation 330. The pose estimator 100 may also receive one or more 3-D models corresponding to the detected objects (e.g., as loaded in operation 350) where the 3-D model is posed (e.g., translated and rotated) based on the initial pose estimate. In some embodiments, the initial pose estimate is computed based on detecting keypoints in the one or more images of the object and using a Perspective-n-Point algorithm to match the detected keypoints with corresponding known locations of keypoints in the 3-D model. See, e.g., Zhao, Wanqing, et al. “Learning deep network for detecting 3D object keypoints and 6D poses.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020. and Lepetit, Vincent, Francesc Moreno-Noguer, and Pascal Fua. “EPnP: An accurate O(n) solution to the PnP problem.” International Journal of Computer Vision 81.2 (2009): 155. The keypoints may be detected using, for example, a classical keypoint detector (e.g., scale-invariant feature transform (SIFT), speeded up robust features (SURF), gradient location and orientation histogram (GLOH), histogram of oriented gradients (HOG), basis coefficients, Haar wavelet coefficients, and the like) or a trained deep learning keypoint detector such as a trained convolutional neural network using HRNet (Wang, Jingdong, et al. “Deep high-resolution representation learning for visual recognition.” IEEE transactions on pattern analysis and machine intelligence (2020).) with a differentiable spatial to numerical transform (DSNT) layer and Blind Perspective-n-Point (Campbell, Dylan, Liu Liu, and Stephen Gould. “Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization.” European Conference on Computer Vision. Springer, Cham, 2020.).


As another example, the initial pose estimate may be computed by capturing a depth image or depth map of the object (e.g., using a stereo depth camera or time of flight depth camera) and applying an iterative closest point (ICP) algorithm or a point pair feature matching algorithm (see, e.g., Drost, Bertram, et al. “Model globally, match locally: Efficient and robust 3D object recognition.” 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 2010.) to align the 3-D model to the shape of the object as it appears in the depth image. In some embodiments, the initial pose estimate is computed directly from a trained network (see, e.g., Xiang, Yu, et al. “PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes.” arXiv preprint arXiv:1711.00199 (2017).) and/or approaches such as a dense pose object detector (Zakharov, Sergey, Ivan Shugurov, and Slobodan Ilic. “DPOD: 6D Pose Object Detector and Refiner.” 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE Computer Society, 2019.)



FIG. 6 is a schematic depiction of a 3-D model, depicted in shaded form, posed in accordance with an initial pose estimate and overlaid onto an observed image of a scene, depicted in line drawing form. As shown in FIG. 6, there is an error between the observed object 602 and the rendering of the 3-D model 604 as posed based on the initial pose estimate, both in the form of rotation error and translation error. Accordingly, aspects of embodiments of the present disclosure relate to refining this initial pose estimate (whether performed using keypoint detection and a PnP algorithm or using a depth image and an ICP algorithm as discussed above, or through other techniques) as described in more detail below.



FIG. 7A is a block diagram depicting a pipeline 700 for refining an initial pose estimate using dense correspondences according to one embodiment of the present disclosure. In various embodiments, the pipeline 700 is implemented in whole or in part by the pose estimator 100 to compute refined pose estimates, or feature vectors in other representation spaces representing the location of the object, based on input images of the object.


Referring back to FIG. 5 and to FIG. 7A, in operation 530, the pose estimator 100 uses a renderer 710 (or rendering engine) to render an image 731 (e.g., a 2-D image) of the 3-D model 711 in its initial pose 712 from the viewpoint of a camera (e.g., as specified by extrinsic camera parameters) that captured an image of the object in the scene. In embodiments in which multiple consistent images of the object were captured from multiple viewpoints, the pose estimator 100 renders a separate image of the 3-D model in its initial estimated pose in the scene observed by the cameras from each of the separate viewpoints with respect to the object in the scene. The rendering may also be performed in accordance with camera intrinsic parameters (e.g., accounting for field of view and lens distortions of the camera or cameras used to capture the observed images of the object in the scene).


In some embodiments of the present disclosure, the rendered image of the object is a rendered surface normals map, where each pixel or point in the rendered surface normals map is a vector indicating the direction of the surface of the 3-D model depicted at that pixel or point (e.g., a vector perpendicular to the surface of the object at that pixel or point). In some cases, the normal vector at each pixel is encoded in the color channels of an image (e.g., in red, green, and blue color channels). In some embodiments, the pose estimator 100 renders the rendered surface normals map by computing a depth map from the perspective or viewpoint of the observing camera used to capture the observed image (e.g., using the Möller-Trumbore ray-triangle intersection algorithm as described in Möller, Tomas, and Ben Trumbore. “Fast, minimum storage ray-triangle intersection.” Journal of graphics tools 2.1 (1997): 21-28.). According to these embodiments, the depth map of the object is converted to a point cloud, and a rendered surface normals map is computed from the point cloud (e.g., by computing the slope between neighboring or adjacent points of the point cloud).
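As a non-limiting sketch of the depth-map route described above, the following Python code back-projects a depth map into a point cloud using pinhole intrinsics, estimates a normal at each pixel from the differences between neighboring points, and encodes the result in three color channels. The function names, the pinhole model without lens distortion, the choice to orient normals toward the camera, and the 8-bit encoding are assumptions made for illustration and are not taken from the disclosure.

```python
import numpy as np

def depth_to_points(depth, K):
    """Back-project an (H, W) depth map into an (H, W, 3) map of camera-frame points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

def normals_from_points(points):
    """Estimate unit normals from the slope between neighboring points."""
    du = np.gradient(points, axis=1)  # change along image columns
    dv = np.gradient(points, axis=0)  # change along image rows
    n = np.cross(du, dv)
    n = n / np.linalg.norm(n, axis=-1, keepdims=True)
    # Assumption: the camera looks along +z, so visible normals have z <= 0.
    n[n[..., 2] > 0] *= -1.0
    return n

def encode_normals_rgb(normals):
    """Map each normal component from [-1, 1] to an 8-bit color channel."""
    return np.round((normals + 1.0) * 0.5 * 255).astype(np.uint8)
```

For a flat surface facing the camera, every pixel of the resulting map holds the same normal, (0, 0, -1) under the orientation convention assumed here.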


In some embodiments of the present disclosure, the pose estimator 100 renders the rendered surface normals map directly from the 3-D model with a virtual camera placed at the perspective or viewpoint of the observing camera. This direct rendering may be performed by tracing rays directly from the virtual camera into a virtual scene containing the 3-D model in its initial estimated pose and computing the surface normal of the first surface that each ray intersects with (in particular, the surfaces of the 3-D model in the initial estimated pose that the rays intersect with).
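The ray-tracing step can be illustrated with the Möller-Trumbore ray-triangle test cited above. The following Python code is a minimal sketch, assuming the posed 3-D model is available as a list of triangles (each a tuple of three vertices in camera coordinates). The helper names and the brute-force loop over all triangles are illustrative assumptions; a practical renderer would typically use an acceleration structure.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: return (t, unit_normal) for a hit, else None."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    if t <= eps:
        return None  # intersection is behind the ray origin
    n = cross(e1, e2)
    length = dot(n, n) ** 0.5
    return t, tuple(c / length for c in n)

def first_hit(origin, direction, triangles):
    """Return (t, unit_normal) of the nearest triangle hit by the ray, or None."""
    best = None
    for v0, v1, v2 in triangles:
        hit = ray_triangle(origin, direction, v0, v1, v2)
        if hit is not None and (best is None or hit[0] < best[0]):
            best = hit
    return best
```

Calling `first_hit` once per pixel, with the ray direction derived from the camera intrinsic parameters, yields both a depth value (from `t`) and a surface normal for that pixel.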


While the rendered image 731 in the embodiments described above includes one or more rendered surface normals maps, embodiments of the present disclosure are not limited thereto and the renderer may be configured to generate different types of rendered 2-D images such as color (e.g., red, green, blue) images, monochrome images, and the like.


In operation 570, the pose estimator 100 computes dense image-to-object correspondences between the one or more images of the object and the 3-D model of the object. For example, the rendered image 731 of the object in the scene based on the initial estimated pose and observed image 732 of the object in the same scene (or multiple rendered images 731 and multiple observed images 732 from different viewpoints) are supplied to correspondence calculator 730, which computes dense correspondence features between the rendered image 731 and the observed image 732 (or the rendered images 731 and the corresponding observed images 732 of the object in the scene).


In various embodiments, the correspondence calculator 730 may use different techniques to compute dense correspondence features between the rendered image 731 and the observed image 732. In some embodiments, a disparity neural network is used to detect correspondences (see, e.g., Xu, Haofei, and Juyong Zhang. “AANet: Adaptive aggregation network for efficient stereo matching.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.), where the disparity neural network is modified to match pixels along the y-axis of the images (e.g., perpendicular to the usual direction of identifying correspondences by a disparity neural network) in addition to along the x-axis of the input images (as is traditional, where the input images are rectified to extend along the x-axis between stereo pairs of images). The modification may include flattening the output of the neural network before supplying the output to the loss function used to train the disparity neural network, such that the loss function accounts for and detects disparities along both the x-axis and the y-axis. In some embodiments, an optical flow neural network is trained and/or retrained to operate on the given types of input data (e.g., observed surface normals maps and observed images), where examples of optical flow neural networks are described in Dosovitskiy, Alexey, et al. “FlowNet: Learning optical flow with convolutional networks.” Proceedings of the IEEE international conference on computer vision. 2015.; Ilg, Eddy, et al. “FlowNet 2.0: Evolution of optical flow estimation with deep networks.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.; and Trabelsi, Ameni, et al. “A Pose Proposal and Refinement Network for Better 6D Object Pose Estimation.” Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2021. In some embodiments, classical techniques for computing dense correspondences are used, such as classical algorithms for computing optical flow (see, e.g., Horn and Schunck, referenced above) or classical techniques for computing disparity (e.g., block matching, but applied along both the x-axis and y-axis). Other embodiments of the present disclosure include modifications and/or retraining of existing neural network backbones to take two inputs (e.g., the observed image and the rendered image) to compute correspondences.


The observed image or observed images 732 supplied as input to the correspondence calculator 730 may be the same images that were used to compute the initial pose estimate or may be different images, such as images from different viewpoints from those used to compute the initial pose estimate, images captured in different modalities (e.g., polarization and/or different spectra), or images or feature maps computed based on captured or observed images (e.g., observed features in polarization representation spaces or observed surface normals computed from polarization features using shape-from-polarization techniques). Examples of types of images include color images (e.g., red, green, blue images) captured by color cameras, monochrome images (e.g., in the visible light, infrared, or ultraviolet portions of the spectrum), polarization raw frames (e.g., color or monochrome images captured through a polarization filter), polarization features in polarization representation spaces (e.g., angle of linear polarization (AOLP) and degree of linear polarization (DOLP)). As discussed in more detail above, shape from polarization (SfP) provides techniques for computing observed surface normals maps from captured or observed polarization raw frames.


Accordingly, the correspondence calculator 730 computes dense correspondences between the rendered image 731 and the observed image 732.


Through the rendering process, the pose estimator 100 also stores information associated with the rendered image 731 regarding the point in the 3-D model that is represented by each pixel in the rendered image. For example, when rendering the image using a ray tracing technique, each pixel of the rendered image corresponds to a location on the surface of the 3-D model (e.g., in uv coordinate space representing points on the surface of the 3-D model) as defined by a ray connecting the camera origin, the pixel, and the location on the surface of the 3-D model, as modified by any virtual optics system (e.g., as defined by camera intrinsic parameters). As such, the pose estimator 100 stores 2-D to 3-D correspondences between the 2-D rendered image 731 and the 3-D model in its initial pose.


Therefore, the correspondence calculator 730 further computes dense image-to-object correspondences 740 that map pixels in the observed image 732 to locations on the surface of the 3-D model 711. In more detail, as shown in FIG. 7B, the optical flow features computed by the correspondence calculator 730 provide a mapping from pixels in the observed image 732 to pixels in the rendered image 731 and the 2-D to 3-D mapping information from the rendering process provides mappings from pixels in the rendered image 731 to locations on the surface of the 3-D model 711. As a result, the dense image-to-object correspondences 740 provide 2-D to 3-D correspondences between every visible pixel in the observed image 732 and the predicted point it represents on the 3-D model 711 of the object.
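The composition of the two mappings described above can be sketched as follows in Python. The sketch assumes that the optical flow is stored as per-pixel (dx, dy) offsets from the observed image to the rendered image, that the renderer stored one 3-D surface point per rendered pixel, and that a Boolean mask marks rendered pixels that show the model. These array layouts, the nearest-pixel rounding, and the function name are illustrative assumptions.

```python
import numpy as np

def image_to_object(flow, rendered_xyz, rendered_mask):
    """Compose observed->rendered flow with the rendered pixel->3-D point map.

    flow:          (H, W, 2) offsets (dx, dy) from observed to rendered pixels.
    rendered_xyz:  (H, W, 3) model-surface point stored for each rendered pixel.
    rendered_mask: (H, W) bool, True where the rendered pixel shows the model.
    Returns (pixels, points): (N, 2) observed pixels as (x, y) and (N, 3) points.
    """
    h, w = rendered_mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Rendered-image pixel that each observed pixel corresponds to.
    tx = np.rint(xs + flow[..., 0]).astype(int)
    ty = np.rint(ys + flow[..., 1]).astype(int)
    inside = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)
    # Keep observed pixels whose target lies in the image and on the model.
    valid = np.zeros((h, w), dtype=bool)
    valid[inside] = rendered_mask[ty[inside], tx[inside]]
    pixels = np.stack([xs[valid], ys[valid]], axis=-1)
    points = rendered_xyz[ty[valid], tx[valid]]
    return pixels, points
```

The returned pairs play the role of the mapping from observed pixels to model-surface locations and could be passed to a PnP solver.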


In operation 590, the pose estimator 100 updates the estimated pose based on the dense image-to-object correspondences. For example, as shown in FIG. 7A, the dense image-to-object correspondences may be supplied to a Perspective-n-Point (PnP) algorithm to compute a refined pose estimate. In some embodiments, the PnP algorithm estimates the refined pose P by finding the pose P that minimizes the error function below:

$$\underset{P}{\arg\min} \sum_{x \in X} \left\lVert K P f(x) - x \right\rVert$$

where K is the camera intrinsic matrix of the camera used to capture the observed image of the object, P is a pose matrix representing the transformation between the object and the camera, $f: \mathbb{R}^2 \to \mathbb{R}^3$ is the dense image-to-object correspondences described above (computed in operation 570) mapping from pixel coordinates in the observed image to 3-D coordinates on the surface of the 3-D model, and X is the domain of f (e.g., across all of the pixels in the observed image of the object).
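As a non-limiting sketch, the error function above can be evaluated for a candidate pose as follows. The code assumes that P is a 3×4 matrix [R | t], that the projection K P f(x) is converted to pixel coordinates by dividing by its third (homogeneous) component before comparison with x, and that the correspondences are given as matched arrays of pixels and model points. These conventions and the function name are assumptions, since the expression above leaves them implicit.

```python
import numpy as np

def reprojection_error(K, P, pixels, model_points):
    """Sum over x of || pi(K P f(x)) - x ||, with pi the perspective divide.

    K: (3, 3) intrinsics; P: (3, 4) pose [R | t];
    pixels: (N, 2) observed pixel coordinates x;
    model_points: (N, 3) corresponding model-surface points f(x).
    """
    ones = np.ones((len(model_points), 1))
    homogeneous = np.hstack([model_points, ones])  # (N, 4)
    camera = homogeneous @ P.T                     # P f(x), camera frame
    projected = camera @ K.T                       # K P f(x), homogeneous pixels
    uv = projected[:, :2] / projected[:, 2:3]      # perspective divide
    return float(np.linalg.norm(uv - pixels, axis=1).sum())
```

A PnP solver searches for the pose P that makes this quantity as small as possible. The sketch only evaluates the objective and does not perform the minimization.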


Because the correspondence calculator 730 computes a large number of correspondences (e.g., dense correspondences) between the image and the 3-D model of the object, these correspondences can also be used to estimate the configuration of the deformable object using a PnP algorithm, thereby enabling the measurement of the configuration of deformable objects (e.g., bags holding loose items such as food, clothes, flexible printed circuit boards, and the like) by deforming the 3-D model to match the configuration of the object. In some embodiments, the deformation of the 3-D model to match the configuration of the deformable object in the images can be computed for every pixel coordinate x∈X (where X represents the collection of all pixels in the observed images) as:

$$\{\, P f(x) - \operatorname{proj}_{L(x)}(P f(x)) \mid x \in X \,\}$$

where L(x) represents a line of a projection of point x from the camera, P is a pose matrix representing the transformation between the object and the camera, $f: \mathbb{R}^2 \to \mathbb{R}^3$ is the dense image-to-object correspondences described above (computed in operation 570) mapping from pixel coordinates in the observed image to 3-D coordinates on the surface of the 3-D model, $\operatorname{proj}_{L(x)}(Pf(x))$ is the estimated depth of the object coordinate seen at point x from the camera along line L(x), and X is the domain of f (e.g., across all of the pixels in the observed image of the object). Accordingly, the above expression provides one estimate of the deformation of the object, e.g., the difference between the predicted location based on the current pose P and a 3-D model of the object (as represented by the term Pf(x)) and the actual observed location of the corresponding point in the observed image, as represented by the term $\operatorname{proj}_{L(x)}(Pf(x))$, where the difference represents the change in 3-D coordinates to be applied to make the shape of the 3-D model match up with the actual deformed shape or configuration of the observed object.
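One way to read the projection term is as an orthogonal projection of the posed model point onto the camera ray through pixel x. The following Python sketch computes the per-pixel differences under that reading, assuming a pinhole camera at the origin of the camera frame, a 3×4 pose matrix P, and a ray direction given by the inverse intrinsic matrix applied to the homogeneous pixel coordinates. These are interpretive assumptions rather than details stated in the disclosure.

```python
import numpy as np

def ray_projection_residuals(K, P, pixels, model_points):
    """Return P f(x) - proj_{L(x)}(P f(x)) for each correspondence, shape (N, 3)."""
    ones = np.ones((len(model_points), 1))
    camera = np.hstack([model_points, ones]) @ P.T   # P f(x)
    # Direction of the ray L(x) through each pixel (assumed pinhole model).
    dirs = np.hstack([pixels, ones]) @ np.linalg.inv(K).T
    # Orthogonal projection of each posed point onto its ray.
    scale = np.sum(camera * dirs, axis=1) / np.sum(dirs * dirs, axis=1)
    projected = scale[:, None] * dirs
    return camera - projected
```

Under this reading, each difference vector is perpendicular to its ray and is zero when the posed model point already lies on the ray through the pixel where it was observed.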


In some embodiments where a depth map D of the scene is available (e.g., by capturing a depth map of the scene using a depth camera such as a stereo camera) among the one or more observed images 732, the depth map is used to convert the pixel coordinates x to 3-D coordinates D(x) and therefore the deformation would be computed for each pixel x as:

$$\{\, P f(x) - D(x) \mid x \in X \,\}$$

Accordingly, the above expression provides one estimate of the deformation of the object, e.g., the difference between the predicted location based on the current pose P and a 3-D model of the object (as represented by the term Pf(x)) and the actual observed location of the corresponding point in the observed depth image D(x), where the difference represents the change in 3-D coordinates to be applied to make the shape of the 3-D model match up with the actual deformed shape or configuration of the observed object.
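The depth-based variant can be sketched in the same way. The code below assumes that the depth map stores z-depth along the optical axis, that D(x) is obtained by back-projecting pixel x through the inverse intrinsic matrix and scaling by that depth, and that pixel coordinates are integers indexing the depth map. All of these, and the function name, are assumptions for illustration.

```python
import numpy as np

def depth_residuals(K, P, pixels, model_points, depth):
    """Return P f(x) - D(x) for each correspondence, shape (N, 3)."""
    ones = np.ones((len(model_points), 1))
    camera = np.hstack([model_points, ones]) @ P.T   # P f(x)
    xs = pixels[:, 0].astype(int)
    ys = pixels[:, 1].astype(int)
    z = depth[ys, xs]                                 # observed z-depth at x
    rays = np.hstack([pixels, ones]) @ np.linalg.inv(K).T
    observed = rays * z[:, None]                      # D(x) as a 3-D point
    return camera - observed
```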


While FIG. 5 shows an embodiment where an updated pose of the 3-D model is computed once, in some embodiments the pose is iteratively refined by supplying the pose computed in operation 590 as the initial pose of the next iteration in operation 530 in order to further refine the estimated pose of the object for consistency with the observed image of the object.


In addition, while FIG. 5 depicts a circumstance in which the observed image of the object is captured from a single viewpoint, embodiments of the present disclosure are not limited thereto and may be applied in a multi-view environment where multiple cameras (e.g., a main camera 10 and support cameras 30) capture observed images of the object from multiple different viewpoints. In such embodiments, the multiple views (e.g., N different views) may be jointly used to compute a pose estimate that minimizes a combined error metric across the multiple views (e.g., errors computed by comparing the locations of keypoints in the observed images from each viewpoint with renderings from each viewpoint).


Generating Datasets of Images of Known Objects and Corresponding Shape Estimates


As noted above, in some embodiments, the shape estimator 100 includes a renderer 150 such as a 3-D rendering engine that is configured to compute shape estimates of the objects detected in the scene based on estimated poses of those objects.



FIG. 8 is a flowchart depicting a method for generating datasets including images of known objects and corresponding shape estimates according to one embodiment of the present disclosure.


In operation 810, the shape estimator controls one or more cameras to capture one or more images of a scene containing known objects. The images may be captured as described above with respect to operation 310 of FIG. 3, such as by controlling a main camera 10 and support cameras 30 to capture consistent images of a scene from one or more viewpoints. The images may include images in different modalities, such as images covering different parts of the electromagnetic spectrum (e.g., visible light of different colors, near infrared, thermal, ultraviolet, and the like), depth maps (e.g., captured using depth from active or passive stereo), polarization raw frames captured by cameras with different polarization filters (e.g., circular polarization filters or linear polarization filters at different angles), and combinations thereof.


In some embodiments, the images may also include polarization signatures or polarization signature maps computed from the raw images from the cameras, including Stokes vectors, degree of linear polarization (DOLP), and angle of linear polarization (AOLP) (for cases where there are three or more polarization raw frames captured with different polarization angles from the same viewpoint). In some cases, the images also include physics-based surface normals maps (e.g., Nx, Ny, and Nz for each pixel), where these normals maps may be computed from the polarization signatures based on shape from polarization techniques, as described above.
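As a non-limiting sketch, the polarization signatures named above can be computed from polarization raw frames as follows. The code assumes four frames captured through linear polarizers at 0°, 45°, 90°, and 135° (one common arrangement; the disclosure only requires three or more angles) and uses the standard linear Stokes relations. The function name and the small constant guarding against division by zero are illustrative.

```python
import numpy as np

def polarization_signatures(i0, i45, i90, i135, eps=1e-8):
    """Compute linear Stokes parameters, DOLP, and AOLP from four raw frames.

    Each input is a float array of intensities captured through a linear
    polarizer at the indicated angle (assumed: 0, 45, 90, and 135 degrees).
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)          # total intensity
    s1 = i0 - i90                               # horizontal vs. vertical
    s2 = i45 - i135                             # +45 vs. -45 degrees
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / (s0 + eps)
    aolp = 0.5 * np.arctan2(s2, s1)             # radians
    return s0, s1, s2, dolp, aolp
```

Fully linearly polarized light yields a DOLP near 1, while unpolarized light (equal intensity in all four frames) yields a DOLP of 0.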


In the case of a multi-viewpoint system, such as where multiple ones of the main camera 10 and the support cameras 30 include multi-modal camera systems (e.g., monocular multi-modal camera arrays and stereo multi-modal camera arrays), the above values can be estimated for multiple viewpoints. While the 3-D depth or 3-D coordinates of each point of the objects visible in the scene will be consistent (within expected noise tolerances) in the depth maps captured across the viewpoints of the multiple cameras, the DOLP and AOLP will vary depending on the viewpoint as well as the color of the object and as a result each of the normal maps will be different. For each viewpoint and each color channel, there is a corresponding set of surface normals N estimated from the physics of polarization by applying the Fresnel equation which will vary depending on the viewpoint due to a number of factors that include: material reflectivity or “albedo,” wavelength, and specular reflections/viewing direction.


Regarding material reflectivity or “albedo,” when the albedo of the material is low it has a significant impact on polarization. Umov's law states that the albedo and the degree of polarization are inversely proportional to one another. (For example, low albedo materials have a very high degree of polarization, while high albedo materials have a low degree of polarization.) In these cases of low albedo materials, it is likely that the surface normals estimated from different viewpoints are substantially similar (after accounting for the rigid body transformation between the two viewpoints), whereas the estimated surface normals based on physics may be very different for high albedo materials.


Regarding wavelength, the albedo is wavelength dependent for a whole range of colors other than pure black and pure white (e.g., the albedo of a black car and a white car are spectrally invariant over the visible wavelength range). As a result, the degree of polarization is stronger for certain colors and its corresponding signal-to-noise ratio (SNR) is higher. Therefore, the surface normals estimated from certain, low albedo wavelength channels will be more accurate than those from other channels with higher albedo.


Regarding specular reflections and viewing direction, specular reflections change with viewing direction as well as illumination direction. The brightness variations that result from specular reflection (and not material geometry) are sometimes referred to as “texture-copy artifacts.” In such situations, having a substantially different viewpoint will result in a change in brightness to one which is more consistent with the material and geometry of the object. In that case, the surface normals estimated from viewpoints where texture-copy artifacts are not visible on the surface of the object are likely to be more accurate than those from other viewpoints in which texture-copy artifacts do appear on the surface of the object.


In some embodiments, the observed images take the form of:

    • Ir1, Ig1, Ib1, Nr1d, Nr1s1, Nr1s2, Ng1d, Ng1s1, Ng1s2, Nb1d, Nb1s1, Nb1s2
      Ir2, Ig2, Ib2, Nr2d, Nr2s1, Nr2s2, Ng2d, Ng2s1, Ng2s2, Nb2d, Nb2s1, Nb2s2
      . . .
      Irn, Ign, Ibn, Nrnd, Nrns1, Nrns2, Ngnd, Ngns1, Ngns2, Nbnd, Nbns1, Nbns2

      where Iri, Igi, Ibi represent intensity images from viewpoint i among viewpoints 1 through n in r, g, and b spectral channels, and Nrid, Nris1, Nris2, Ngid, Ngis1, Ngis2, Nbid, Nbis1, Nbis2 are the surface normals estimated from diffuse and specular reflection models on the red, green, and blue spectral channel images using a physics-based approach that leverages the Fresnel equations (e.g., based on shape from polarization).


According to some embodiments, pose estimation and data generation systems of the present disclosure are deployed in factory conditions where the illumination conditions are not always known or may not be uniform (e.g., periodically changing illumination due to moving machinery may change the illumination conditions from one image to the next). Accordingly, applying photometric constraints is challenging given the varying illumination conditions. Imaging systems according to some embodiments of the present disclosure capture multi-channel polarization information to provide additional constraints for disambiguating (or reducing ambiguity in) the surface normals computed based on a computer vision model trained based on datasets generated in accordance with embodiments of the present disclosure. For example, as discussed in more detail below, the polarization images as well as the corresponding estimated normal maps may be supplied to train a computer vision model (e.g., a deep learning network) to choose how to combine these inputs effectively to form the desired output, such as shape estimates of objects depicted in the images (e.g., depth maps and surface normals maps) that have low noise compared to comparative approaches (e.g., using depth from stereo, depth from time of flight, and surface normals from shape from polarization using the Fresnel equations).


In operation 830, the shape estimator 100 computes pose estimates of the known objects depicted in the scene. These pose estimates may be computed from the one or more images based on pose estimation techniques such as those described above with respect to FIGS. 3, 4A, 4B, 4C, 5, 6, 7A, and 7B. In some example embodiments, the shape estimator 100 uses the multi-view light field capture of the scene (captured using a main camera 10 and multiple support cameras 30) to estimate depths of surfaces of objects using multi-view stereo correspondence, correlates the depths of these surfaces with the detected keypoints in the images and keypoints in the ground truth 3-D CAD models of the objects in the images, and jointly optimizes the depth errors over the multiple viewpoints and multiple imaging modalities to provide 6-DoF pose estimations.


In some embodiments, the joint optimization of depth errors includes computing surface shape estimates based on the input images using a shape estimation neural network, and including the differences between the shape estimates and the rendered shapes (e.g., comparing the estimated depth maps to rendered depth maps and/or comparing the estimated surface normals maps of the 3-D model in an initial estimated pose to the rendered surface normals maps of the 3-D model in the initial estimated pose to update the estimated pose of the 3-D model). This shape estimation neural network may be trained based on existing training data mapping input images and image signatures (e.g., polarization signatures) to ground truth smooth shapes. In some circumstances the differences or errors computed in accordance with the different factors are separately weighted, such as based on relative confidences output by the different factors. For example, the shape estimation neural network may include a confidence score as part of its output and, as another example, depth maps and surface normals maps computed using shape from polarization approaches may be weighted based on the level of noise present in the underlying images.


However, embodiments of the present disclosure are not limited thereto, and the poses of the objects may be estimated using different techniques, as discussed above.


In operation 850, the renderer 150 renders shape estimates of the objects in the scene based on 3-D models of the objects posed based on the estimated poses. These rendered shape estimates represent the “ground truth” or desired output labels associated with the captured images. Using the pose estimates computed in operation 830 and accurate, ground truth 3-D CAD models of the objects, the shape estimator 100 infers the shape of the objects, including surface normal maps, accurately enough to have them represent ground truth for the purpose of populating a dataset for training computer vision models, such as by adding a new data point to a collection of data points of the dataset in operation 870.


These rendered shape estimates may include images of a virtual scene with one or more 3-D models posed in accordance with the estimated poses of the corresponding objects depicted in the scene, and the images are rendered from the perspective of virtual cameras having intrinsic and extrinsic camera parameters matching those of the main camera 10 and/or support cameras 30, such that each rendered view corresponds to a view from the observed images 18. For example, rendering color images of a scene may provide estimates regarding the outline or silhouette of the object as viewed from a particular camera. As another example, rendering a depth map based on the 3-D model can provide a high resolution depth map or point cloud of the shape of the objects in the scene with substantially no noise compared to point clouds or depth maps generated from depth camera systems. As a third example, rendering a surface normals map of the surface normals directions of the surfaces of the 3-D model produces a higher resolution surface normals map with substantially no noise compared to surface normals maps computed directly from depth maps from depth camera systems or computed from polarization raw frames based on shape from polarization techniques. In various embodiments, these surface normals maps may be rendered by directly detecting the surface normals (e.g., the angles or slopes or orientations) of the surfaces of the posed 3-D model, or may be computed from the high resolution depth map rendered from the 3-D model (e.g., by computing the gradient between adjacent pixels of the depth map or adjacent points of the point cloud).


In operation 870, the shape estimator 100 adds a data point to a collection of data points, where the added data point includes the one or more observed images of the scene (as captured by the one or more cameras) and the corresponding shape estimates from the same viewpoint with respect to the object (e.g., as generated by the renderer). In some cases, each data point includes one or more images from a given viewpoint and one or more shape estimates (e.g., rendered images) rendered from the corresponding virtual viewpoint. In some embodiments, the shape estimator 100 determines whether or not to add a particular data point to the collection of data points in operation 870 based on whether the robotic arm 24 was able to pick an object depicted in the images, or otherwise perform a particular task, based on the estimated 6-DoF pose of the object as computed from the observed images of the scene associated with the data point. In other words, the ability of the robotic controller to control the robotic arm to pick the object may be included as a factor in validating the rendered “ground truth” surface normals of the object depicted in the images and therefore in determining whether or not the data point should be included in the generated dataset.
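The data point and the pick-based validation described above can be sketched as a small data structure. The class names, the dictionary keys, and the use of a single Boolean pick outcome as the only inclusion criterion are hypothetical choices made for illustration; the disclosure describes the pick outcome only as one possible factor.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class DataPoint:
    """One training sample: observed images and rendered shape labels per viewpoint."""
    observed_images: Dict[str, Any]  # e.g. {"view0/rgb": ..., "view0/dolp": ...}
    shape_labels: Dict[str, Any]     # e.g. {"view0/normals": ..., "view0/depth": ...}

@dataclass
class ShapeDataset:
    points: List[DataPoint] = field(default_factory=list)

    def maybe_add(self, observed_images, shape_labels, pick_succeeded):
        """Add the data point only if the pose was validated by a successful pick."""
        if not pick_succeeded:
            return False
        self.points.append(DataPoint(observed_images, shape_labels))
        return True
```

In this sketch, a failed pick is treated as evidence that the estimated pose, and therefore the rendered label, may be inaccurate, so the sample is left out of the dataset.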


These observed images are paired with their corresponding “ground truth” shape signatures generated by the renderer from the posed 3-D model. For example, ground truth surface normals maps generated by the renderer may be substantially similar to the surface normals Nrid, Nris1, Nris2, Ngid, Ngis1, Ngis2, Nbid, Nbis1, Nbis2 estimated from the captured polarization images, but will generally be smoother (e.g., have less noise or substantially no noise) than the observed images because the ground truth surface normals maps were generated from a virtual rendering environment in which the directions of the surface normals of the posed 3-D model are known.


As such, aspects of embodiments of the present disclosure provide systems and methods for generating datasets that include observed images of real-world objects and corresponding ground truth images or signatures representing the shapes of those real-world objects, where the ground truth images or signatures may include high resolution, low noise or substantially noise-free depth maps and surface normals maps of the objects.


As described above, the dataset may be generated, in part, by shape estimators 100 operating as pose estimators for estimating the poses of known objects in a manufacturing environment, where the computed poses are supplied to a controller 28 for controlling a robotic arm 24 to pick the objects based on their computed poses. The collection of data as part of an existing pose estimation process in accordance with some embodiments of the present disclosure generates a large number of data points relating to known objects under a variety of different conditions (e.g., appearing in different orientations, under varying lighting conditions, interacting with various other objects, and the like). Some aspects of embodiments of the present disclosure relate to aggregating data points collected from diverse environments (e.g., different shape estimators operating in different logistics facilities or on different manufacturing lines that are configured to compute the poses and/or shapes of different objects, such as different manufacturing lines that manufacture different products from different components, and where the different facilities may be operated by different entities). Accordingly, embodiments of the present disclosure provide systems and methods for generating large and diverse datasets for training computer vision models to perform computer vision tasks such as shape estimation (e.g., estimating the slope or surface normals map of an object).


Computer Vision Models Trained Based on Datasets Including Slope Data


Datasets generated in accordance with embodiments of the present disclosure may be applied to train computer vision models such as deep neural networks (e.g., convolutional neural networks) to compute images or signatures representing the shapes of objects that are depicted in one or more given input images. Considering the arrangement shown in FIG. 1A, a main camera 10 and support cameras 30 may capture images of a scene, such as polarization raw frames in different portions of the visible spectrum (e.g., red, green, and blue color channels of color images), infrared images, depth maps from stereo, and the like. In addition, shape from polarization may be used to compute surface normals maps from polarization signatures (e.g., AOLP and DOLP) based on the polarization raw frames in accordance with the Fresnel equations.


In more detail, these aspects of embodiments of the present disclosure relate to training a computer vision model such as a neural network to implement shape from polarization and/or multi-view/multi-spectral stereo given a set of input images. In particular, neural networks are capable of performing non-parametric functional approximations and therefore can be trained to compute the desired mapping between input space (e.g., captured multi-modal and multi-view images and/or signatures such as polarization signatures, noisy depth maps, and noisy surface normals maps) and the corresponding shape of the depicted objects (e.g., surface orientation in the form of surface normals maps and/or depth maps).



FIG. 9 is a schematic block diagram depicting training a computer vision model using a dataset according to some embodiments of the present disclosure. As shown in FIG. 9, a training dataset 910 including captured images and corresponding clean ground-truth shape data (e.g., corresponding surface normals maps) is supplied along with an untrained computer vision model 920 to a model training system 930 to compute a trained computer vision model 940 based on the training dataset 910. In particular, the training process may compute a plurality of parameters that configure the computer vision model to perform particular tasks. As a specific example, in the case of a deep convolutional neural network, the trained parameters may include weights and biases of connections between the neurons in various layers of the deep convolutional neural network. The untrained computer vision model 920 may have initial parameters that are set randomly or may be a pre-trained network that may have previously been trained to perform a different task or trained based on a different training dataset. Captured images 950 of a scene may then be supplied to the trained computer vision model 940 to compute shape estimates by the computer vision model 960. In comparison, a comparative shape estimator 970 that computes the shapes of objects directly from the images (e.g., without using a trained computer vision model) may use techniques such as depth from disparity and/or shape from polarization to compute shape estimates directly from the images 980 (these may be referred to as being computed using “classical” techniques as opposed to techniques based on machine learning or statistical learning). However, these classically computed shape estimates may be noisy or inconsistent.


As such, a neural network trained according to embodiments of the present disclosure would disambiguate between the different noisy or inconsistent surface normals N computed from the different views, modalities, and spectral information captured (e.g., where the surface normals maps N may differ in accordance with the noise or variability of the polarization signature due to differences in albedo, wavelength, and texture copy artifacts due to viewing direction, as well as other noise or ambiguity in the image capture process). Comparing the shape estimates from the model 960 against the shape estimates computed directly from the images 980 in a comparison module 990 shows that a properly-trained computer vision model 940 produces smoother and more accurate shape estimates than comparative techniques.
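One way such a comparison could be carried out, offered here only as a sketch, is to measure the mean angular error of each surface normals map against a ground-truth surface normals map. The metric, the function names, and the representation of a normals map as nested lists of (x, y, z) tuples are assumptions made for illustration and are not specified by the present disclosure.

```python
import math


def angular_error_deg(n_a, n_b):
    """Angle in degrees between two (not necessarily unit) 3-D normals."""
    dot = sum(a * b for a, b in zip(n_a, n_b))
    norm = math.sqrt(sum(a * a for a in n_a)) * math.sqrt(sum(b * b for b in n_b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def mean_angular_error_deg(normals_map, reference_map):
    """Mean per-pixel angular error between two normals maps.

    Each map is a list of rows, and each row is a list of (x, y, z) tuples.
    """
    errors = [
        angular_error_deg(n, ref)
        for row, ref_row in zip(normals_map, reference_map)
        for n, ref in zip(row, ref_row)
    ]
    return sum(errors) / len(errors)


# Hypothetical 1x2 maps: the second pixel of the "classical" map is off by 90 degrees.
ground_truth = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]]
from_model = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]]
from_classical = [[(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]]

model_error = mean_angular_error_deg(from_model, ground_truth)
classical_error = mean_angular_error_deg(from_classical, ground_truth)
```

A lower mean angular error indicates a shape estimate that is closer to the reference; other metrics (e.g., median error or the fraction of pixels within a threshold angle) could be substituted.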


In addition, given a dataset depicting a sufficiently diverse set of objects (e.g., generated based on images of known objects and their corresponding 3-D models), in some embodiments, the trained model is generalized to generate accurate and low-noise estimates of the shapes of arbitrary objects (e.g., objects that may not be depicted in the training dataset), thereby enabling the estimation of the shapes of unknown or novel objects (e.g., objects for which corresponding 3-D models may not be available to the shape estimator).


One example of a computer vision model uses a multi-view deep neural network, where the images captured from each viewpoint pass through their own polarization fusion backbone (see, e.g., International Patent Application No. PCT/US20/48604 filed Aug. 28, 2020, U.S. patent application Ser. No. 17/266,046, and Kalra, Agastya, et al. “Deep polarization cues for transparent object segmentation.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, the entire disclosures of which are incorporated by reference herein), with independent weights for each viewpoint-specific backbone. The features computed from the images captured from each viewpoint by the separate polarization fusion backbones are then used by the computer vision model to compute a set of multi-scale features (e.g., using ResNet, as described in He, Kaiming, et al. “Deep residual learning for image recognition.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. or using a Feature Pyramid Network, as described in Lin, Tsung-Yi, et al. “Feature pyramid networks for object detection.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.). The computer vision model may then compute correspondences between multi-scale features across the multiple viewpoints using a multi-view correlation search based on epipolar geometry (see, e.g., GCNet as described in Cao, Yue, et al. “Gcnet: Non-local networks meet squeeze-excitation networks and beyond.” Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. 2019., AANet as described in Xu, Haofei, and Juyong Zhang. “Aanet: Adaptive aggregation network for efficient stereo matching.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, GA-Net as described in Zhang, Feihu, et al. “GA-Net: Guided aggregation net for end-to-end stereo matching.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019., and the like). These correspondences provide 3-D information about the scene imaged by the multi-view imaging system including cameras at different viewpoints. This 3-D information can then be used to compute surface normals and 6-DoF poses of objects in the scene. Finally, in some embodiments, the surface normals are further refined using Polarized 3D in real-time, as described in Kadambi, Achuta, et al. “Polarized 3d: High-quality depth sensing with polarization cues.” Proceedings of the IEEE International Conference on Computer Vision. 2015.



FIG. 10 is a schematic block diagram depicting a computer vision model according to some embodiments of the present disclosure. As shown in FIG. 10, in some embodiments, the computer vision model 1040 includes a first neural network 1041 (e.g., using an architecture as described above) that is trained to compute shape estimates 1060 (e.g., surface normals maps) from irradiance images 1050 captured from a scene. The computer vision model 1040 in these embodiments further includes a second neural network 1042 (e.g., a convolutional neural network) or a shape-to-irradiances model 1042 that is trained to map a computed surface normals map (e.g., trained using the ground truth labels in the dataset) to synthesized image irradiances 1070 from polarization filtered cameras (e.g., estimates of the polarization raw frames that would give rise to the surface normals map). This second neural network 1042 may be trained by reversing the roles of the ground truth labels and the input data in the pairs of the training dataset. The difference between the actual input 1050 and the inversely predicted input 1070 may then be used by a comparison module 1080 to compute a confidence measure 1090 in the surface normals maps 1060 computed by the first neural network 1041, described above, by comparing the synthesized polarization raw frames 1070 with the observed polarization raw frames 1050 captured by the cameras. During training, the confidence measure 1090 may be used as a component of the training loss function for training the parameters of the computer vision model 1040 to increase the confidence score, and may also be used in deployment as an estimate of the confidence in the shape estimates computed by the first neural network 1041.
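The comparison between observed and synthesized polarization raw frames may be sketched as follows. The present disclosure does not specify the formula for the confidence measure; the mapping below (an exponential of the negative mean squared error, with a hypothetical `scale` parameter) and the flattened-list representation of the frames are assumptions chosen only to illustrate the idea that a smaller reconstruction difference yields a higher confidence.

```python
import math


def confidence_measure(observed, synthesized, scale=0.01):
    """Map the difference between observed and synthesized frames to (0, 1].

    `observed` and `synthesized` are equal-length flat sequences of pixel
    values. The exponential form and `scale` are illustrative assumptions.
    """
    if len(observed) != len(synthesized) or len(observed) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((o - s) ** 2 for o, s in zip(observed, synthesized)) / len(observed)
    return math.exp(-mse / scale)


def training_loss(shape_loss, confidence, weight=0.1):
    """Hypothetical combined loss: penalize low confidence during training."""
    return shape_loss + weight * (1.0 - confidence)


observed = [0.20, 0.40, 0.60, 0.80]
good_synthesis = [0.21, 0.39, 0.60, 0.80]
poor_synthesis = [0.50, 0.10, 0.90, 0.40]

high = confidence_measure(observed, good_synthesis)
low = confidence_measure(observed, poor_synthesis)
```

Under these assumptions, minimizing `training_loss` both reduces the shape error and increases the confidence score, and the same `confidence_measure` could be reported alongside each shape estimate in deployment.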


In some embodiments, the computer vision model computes estimates of the shapes of surfaces based on the polarized inverse rendering problem. In particular, the computer vision model is trained to take the AOLP, DOLP, and intensity images from each viewpoint as input (e.g., computed from polarization raw frames captured by the cameras) and to decompose each viewpoint into polarized lighting parameters, polarized material parameters (albedo and reflectance properties), and surface normals maps. In more detail, when generating the dataset, a differentiable rendering engine may be used in operation 850 to compute the polarized lighting parameters, polarized material parameters, and surface normals maps as a part of the shape estimates, such that the images rendered by the differentiable rendering engine match the appearance of the input images. In some embodiments, the model further maps the features into a material invariant polarization embedding space, thereby enabling a physics-based equation search that improves physics-based polarization reconstruction and also the accuracy of the computed surface normals maps.


Accordingly, aspects of embodiments of the present disclosure relate to systems and methods for generating datasets for training computer vision models, such as neural networks, to predict the shapes of objects, such as the surface normals of those objects, based on a set of input images captured by an imaging system, such as a multi-view and/or multi-modal imaging system. Aspects of embodiments of the present disclosure also relate to such computer vision models trained based on such datasets. Such computer vision models provide a very efficient means of determining poses and shapes of objects, which is of value in the case of automation and robotics, where short cycle times (fast computations) increase the throughput of such systems (e.g., short processing times for determining the poses of objects enable robotic arm systems to pick and manipulate those objects more quickly, thereby enabling more objects to be manipulated per unit of time).


As noted above, in some circumstances, systems and methods for generating such datasets may be deployed within an existing production context, such as within a factory or other manufacturing facility, within a logistics pipeline (e.g., warehouse), and/or other operating environment where the shape estimator is configured to detect the shapes and/or poses of known objects. Through the process of detecting the shapes and poses of the known objects using the known 3-D model, embodiments of the present disclosure collect input images and 6-DoF poses of objects, which are then used to populate the training dataset by automatically generating ground truth data or labels for the data based on the input images, the 6-DoF poses of objects, and the 3-D models of the objects, without requiring hand labeling of these ground truth data (e.g., without requiring direct human involvement in generating these ground truth labels).
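The automatic labeling described above may be sketched as follows. In this illustration, "rendering" is reduced to rotating a short list of model-space surface normals by the rotational part of the estimated pose; a practical renderer would also rasterize the 3-D model into the image plane. The names `DataPoint`, `render_normals`, and `generate_data_point` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]


def rotate(rotation: Mat3, v: Vec3) -> Vec3:
    """Apply a 3x3 rotation matrix (row-major) to a 3-D vector."""
    return tuple(sum(rotation[r][c] * v[c] for c in range(3)) for r in range(3))


def render_normals(model_normals: Sequence[Vec3], rotation: Mat3) -> Tuple[Vec3, ...]:
    """Stand-in for a renderer: pose the model normals with the estimated rotation."""
    return tuple(rotate(rotation, n) for n in model_normals)


@dataclass(frozen=True)
class DataPoint:
    images: tuple  # one or more input images (opaque placeholders here)
    label: tuple   # rendered shape estimate used as the ground-truth label


def generate_data_point(input_images, model_normals, rotation) -> DataPoint:
    """Build one training data point without any hand labeling."""
    return DataPoint(images=tuple(input_images),
                     label=render_normals(model_normals, rotation))


# Hypothetical pose: a 90-degree rotation about the z axis.
rot_z_90 = ((0.0, -1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
model_normals = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
point = generate_data_point(["frame_0", "frame_1"], model_normals, rot_z_90)
```

The label is derived entirely from the 3-D model and the estimated pose, which is why no human annotation is needed for each data point.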


This process produces a large dataset with, possibly, millions of different images collected through the deployment of such systems in environments such as factories for autonomous manufacturing of products. Accordingly, the collected dataset can be used to train and/or re-train computer vision systems to produce robust predictions of the shapes of a wide range of objects made of different materials, having different geometries (including unknown geometries), and under different illumination conditions, and therefore these systems can be quickly redeployed to new environments with little to no adaptation (e.g., retraining) required to achieve good performance, noting that additional data collected from the new environments may further improve performance.



FIG. 11 is a block diagram of a shape estimator according to one embodiment of the present disclosure. FIG. 12 is a flowchart of a method 1200 for re-training a computer vision model according to one embodiment of the present disclosure. As described above with respect to FIG. 1A, in some embodiments, in operation 1210, a shape estimator 100 receives input images 18 based on images captured by the imaging system, which may include a main camera 10 and, in the case of a multi-view imaging system, support cameras 30. These input images 18 may be multi-view images (from multiple viewpoints) and may be captured using multiple imaging modalities (with or without polarization filters, in different portions of the electromagnetic spectrum, depth maps generated from time of flight or stereo, polarization signatures generated from polarization raw frames captured by camera modules with polarization filters, and the like). These input images 18 are processed by a pose estimator 120 to compute a pose of a known object in the scene in operation 1230, where a 3-D model of the known object is available to the pose estimator 120. (The pose estimator 120 may also concurrently compute the poses of multiple known objects of the same type represented by the same 3-D model or of different types represented by different 3-D models in the scene.) In the process of computing the pose of the object in the scene, the pose estimator 120 may compute a shape estimate of the object detected in the images 18 and compute a pose of the object by aligning the 3-D model to match the estimated shape of the object. In some embodiments, the pose estimator 120 uses one or more trained computer vision models to compute features from the input images 18. For example, the pose estimator 120 may use a computer vision model trained on a dataset as described above to compute a shape estimate (e.g., an estimated surface normals map of the object). 
The pose estimator 120 may then compare the shape estimate against a rendered shape of the 3-D model (e.g., comparing an estimated surface normals map against a rendered surface normals map) in order to update the estimated pose to reduce a difference between the rendered shape and the shape estimate as a constraint or as an additional constraint when computing the estimated pose of the object (e.g., in addition to constraints from other factors such as keypoint matching across one or more views, dense correspondence matching between rendered images and the observed surfaces of the object, and the like).
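The pose update described above may be sketched as a simple search that reduces the difference between the rendered shape and the shape estimate. For illustration only, the pose is restricted to a single rotation angle about the z axis, the shape is a short list of unit surface normals, and the optimizer is a basic step-halving search; a practical pose estimator would optimize a full 6-DoF pose and would typically combine this term with the other constraints noted above. All function names are hypothetical.

```python
import math


def rotate_z(angle, normal):
    """Rotate a 3-D normal about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = normal
    return (c * x - s * y, s * x + c * y, z)


def shape_difference(angle, model_normals, estimated_normals):
    """Mean (1 - cosine similarity) between rendered and estimated unit normals."""
    total = 0.0
    for m, e in zip(model_normals, estimated_normals):
        rendered = rotate_z(angle, m)
        total += 1.0 - sum(a * b for a, b in zip(rendered, e))
    return total / len(model_normals)


def refine_pose(initial_angle, model_normals, estimated_normals,
                step=0.2, iterations=40):
    """Step-halving search: keep any neighbor that reduces the difference."""
    angle = initial_angle
    best = shape_difference(angle, model_normals, estimated_normals)
    for _ in range(iterations):
        improved = False
        for candidate in (angle - step, angle + step):
            diff = shape_difference(candidate, model_normals, estimated_normals)
            if diff < best:
                angle, best, improved = candidate, diff, True
        if not improved:
            step *= 0.5  # no neighbor is better, so search more finely
    return angle, best


# Hypothetical data: the "estimated" normals are the model rotated by 0.37 rad.
model_normals = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.6, 0.0, 0.8), (0.0, 0.0, 1.0)]
true_angle = 0.37
estimated_normals = [rotate_z(true_angle, n) for n in model_normals]

refined_angle, residual = refine_pose(0.0, model_normals, estimated_normals)
```

Starting from an initial pose of zero radians, the search moves toward the angle at which the rendered normals agree with the estimated normals, which mirrors the role of the shape difference as a constraint on the estimated pose.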


The computed pose and 3-D model of the object may then be output to a controller 28 for controlling an actuator, such as a robotic arm 24, to pick up objects detected in the input images 18. The pose and the 3-D model may also be supplied to a renderer 150 that is configured to render a final shape estimate of the object based on the 3-D model, and this shape estimate may also be supplied to the controller 28.


The computer vision model used by the pose estimator 120 to compute the shape estimate during pose estimation may also be retrained based on additional data collected from the environment in which the shape estimator 100 is operating. For example, the shape estimates generated by the renderer 150 may be combined with the observed images of the scene in operation 1250 to generate training data points where the data points include a set of one or more images and the corresponding rendered shape estimates (e.g., surface normals maps). These generated data points may then be supplied to a model trainer 170 to generate one or more data points for a dataset. The model trainer 170 may then periodically or continuously retrain the computer vision model in operation 1270 based on the additional training data (along with verifying that the updated model does not exhibit regressions or decreases in accuracy of the shape estimates). The retrained, updated computer vision model can then be installed or run by the pose estimator 120 for use in performing shape estimations as part of computing the poses of objects in received input images 18. In addition, the model trainer 170 may also receive training data (including data points of the same type as the input images 18 and ground truth shape estimates as labels) from other sources (e.g., other shape estimators deployed in other areas of the same facility or deployed in other facilities or from an external source of training data) to further update the computer vision models to improve performance. 
In some circumstances, the model trainer 170 is remote from the imaging system (e.g., remote from the main camera 10), such as a case where a centralized system receives training data points generated by one or more shape estimators 100, aggregates the received training data, and trains one or more computer vision models for deployment to a shape estimator 100 (e.g., one or more of the shape estimators from which it received the training data).
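The retraining step with a check for regressions may be sketched as follows. The scalar samples, the toy least-squares trainer `fit_scale`, and the mean absolute error metric are stand-ins chosen so that the example is self-contained; they are assumptions for illustration and do not represent the images, shape labels, or training procedure of an actual model trainer.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, float]      # (input, label): scalar stand-ins for (images, shape)
Model = Callable[[float], float]


def mean_abs_error(model: Model, samples: List[Sample]) -> float:
    """Average absolute prediction error over a set of samples."""
    return sum(abs(model(x) - y) for x, y in samples) / len(samples)


def fit_scale(samples: List[Sample]) -> Model:
    """Toy trainer: least-squares fit of y = w * x."""
    w = sum(x * y for x, y in samples) / sum(x * x for x, _ in samples)
    return lambda x: w * x


def retrain(current: Model, dataset: List[Sample], new_points: List[Sample],
            validation: List[Sample], train_fn=fit_scale) -> Model:
    """Aggregate new data points, retrain, and keep the update only if it
    does not regress on the held-out validation set."""
    dataset.extend(new_points)
    candidate = train_fn(dataset)
    if mean_abs_error(candidate, validation) <= mean_abs_error(current, validation):
        return candidate
    return current


dataset = [(1.0, 2.0)]
validation = [(3.0, 6.0)]
deployed = lambda x: 1.5 * x  # hypothetical currently deployed model

updated = retrain(deployed, dataset, [(2.0, 4.0)], validation)
```

Because the candidate model is only adopted when its validation error is no worse than that of the deployed model, a faulty batch of training data leaves the deployed model unchanged. The same function could be run by a centralized trainer that aggregates data points received from several shape estimators.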


Accordingly, shape estimators 100 in accordance with some embodiments of the present disclosure update internal computer vision models based on additional training data collected from their operational environments, thereby enabling the shape estimators 100 to continuously or periodically improve performance on the estimations of the poses and shapes of objects detected in the environment. This continuous improvement and domain adaptation is available even as the environment changes, either gradually (e.g., due to gradual changes in the types of objects presented to the system) or suddenly (e.g., deployment into a new environment with different types of objects and lighting conditions).


As such, datasets collected in accordance with aspects of embodiments of the present disclosure are useful in the training of computer vision models for performing shape estimation. The ImageNet dataset has over 14 million images that are hand-labeled to indicate what objects are pictured in those images along with providing bounding box labels for those objects in about one million of those images. The ImageNet dataset has had an enormous impact in improving object classification techniques over the years. Image datasets with multi-modal data (including, for example, polarization data) and corresponding ground truth labels indicating the shapes of the objects (e.g., the surface normals of surfaces depicted in the images) such as those described herein likewise enable efficient estimation of poses for new objects with new geometries and materials based on corresponding images such as their polarization and spectral signatures.


While the present invention has been described in connection with certain exemplary embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims, and equivalents thereof.

Claims
  • 1. A system for collecting data for a training dataset for training a machine learning model, the system comprising: an imaging system configured to capture one or more images; and a processing system comprising one or more processors and memory storing instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: for each of multiple iterations: receiving one or more respective input images of a respective scene from the imaging system; estimating a respective pose of a respective object in the respective scene from the one or more respective input images including aligning keypoints in the one or more respective input images with respective corresponding keypoints of a respective 3-D model of the respective object; rendering, using the respective 3-D model of the respective object posed in accordance with the respective estimated pose, a respective shape estimate comprising a surface normals map that associates each of a plurality of locations in a respective input image with surface normal data; generating a respective data point, the respective data point comprising a respective image of the respective scene and a respective label comprising the respective shape estimate of the respective object; and including the respective data point in the training dataset; training the machine learning model on the training dataset generated using the respective 3-D models of the respective objects to compute shape estimates based on one or more input images; obtaining one or more images of a scene; generating a shape estimate of an object in the scene from the one or more images using the machine learning model; updating an estimated pose of the object to reduce a difference between the shape estimate and a rendered shape of a 3-D model of the object; and providing the estimated pose to a controller for causing a robot to manipulate the object.
  • 2. The system of claim 1, wherein the imaging system comprises a polarization camera system, and wherein the one or more respective input images comprise one or more polarization images, wherein, for one or more of the multiple iterations, the operations further comprise generating a surface normals map from the one or more polarization images, and wherein aligning the keypoints in the one or more respective input images with respective corresponding keypoints of the respective 3-D model of the respective object comprises refining the respective estimated pose of the object according to differences between the surface normals map computed from the one or more polarization images and surface normals of the respective 3-D model of the object.
  • 3. The system of claim 2, wherein the one or more polarization images comprise a plurality of spectral channels corresponding to different portions of an electromagnetic spectrum.
  • 4. The system of claim 1, wherein the respective data point comprises one or more polarization images.
  • 5. The system of claim 1, wherein the respective data point comprises one or more polarization signatures computed based on one or more polarization images.
  • 6. The system of claim 1, wherein the respective data point comprises one or more surface normals maps computed from one or more polarization images.
  • 7. The system of claim 2, wherein the respective shape estimate comprises a rendered depth map.
  • 8. The system of claim 1, wherein the imaging system comprises a depth camera system, and wherein the respective data point comprises one or more depth maps.
  • 9. The system of claim 8, wherein the respective pose of the respective object is estimated based on aligning a shape of the respective 3-D model with the one or more depth maps.
  • 10. The system of claim 1, wherein for one or more of the multiple iterations, estimating the respective pose of the respective object further comprises using a previously trained machine learning model trained to compute shape estimates based on the one or more respective input images, and wherein training the machine learning model on the training dataset comprises re-training the machine learning model on the respective generated data points.
  • 11. The system of claim 1, wherein training the machine learning model on the training dataset comprises training the machine learning model on the respective generated data points.
  • 12. A method for collecting data for a training dataset for training a machine learning model, the method comprising: for each of multiple iterations: receiving, by a processing system comprising one or more processors and memory, one or more respective input images of a respective scene; estimating, by the processing system, a respective pose of a respective object in the respective scene from the one or more respective input images including aligning keypoints in the one or more respective input images with respective corresponding keypoints of a respective 3-D model of the respective object; rendering, by the processing system using the respective 3-D model of the respective object posed in accordance with the respective estimated pose, a respective shape estimate comprising a surface normals map that associates each of a plurality of locations in a respective input image with surface normal data; generating, by the processing system, a respective data point, the respective data point comprising a respective image of the respective scene and a respective label comprising the respective shape estimate of the respective object; and including the respective data point in the training dataset; training the machine learning model on the training dataset generated using the respective 3-D models of the respective objects to compute shape estimates based on one or more input images; obtaining one or more images of a scene; generating a shape estimate of an object in the scene from the one or more images using the machine learning model; updating an estimated pose of the object to reduce a difference between the shape estimate and a rendered shape of a 3-D model of the object; and providing the estimated pose to a controller for causing a robot to manipulate the object.
  • 13. The method of claim 12, wherein the one or more respective input images comprise one or more polarization images, and wherein, for one or more of the multiple iterations, the method further comprises generating a surface normals map from the one or more polarization images, wherein aligning the keypoints in the one or more respective input images with respective corresponding keypoints of the respective 3-D model of the respective object comprises refining the respective estimated pose of the respective object according to differences between the surface normals map computed from the one or more polarization images and surface normals of the respective 3-D model of the respective object.
  • 14. The method of claim 13, wherein the one or more polarization images comprise a plurality of spectral channels corresponding to different portions of an electromagnetic spectrum.
  • 15. The method of claim 12, wherein the respective data point comprises one or more polarization images.
  • 16. The method of claim 12, wherein the respective data point comprises one or more polarization signatures computed based on one or more polarization images.
  • 17. The method of claim 12, wherein the respective data point comprises one or more surface normals maps computed from one or more polarization images.
  • 18. The method of claim 13, wherein the respective shape estimate comprises a rendered depth map.
  • 19. The method of claim 12, wherein the respective data point comprises one or more depth maps.
  • 20. The method of claim 19, wherein the respective pose of the respective object is estimated based on aligning a shape of the respective 3-D model with the one or more depth maps.
  • 21. The method of claim 12, wherein, for one or more of the multiple iterations, estimating the respective pose of the respective object further comprises using a previously trained machine learning model trained to compute shape estimates based on the one or more respective input images, and wherein training the machine learning model on the training dataset comprises re-training the machine learning model on the respective generated data points.
  • 22. The method of claim 12, wherein training the machine learning model on the training dataset comprises training the machine learning model on the respective data points.
  • 23. One or more non-transitory computer storage media encoded with computer program instructions that when executed by one or more computers cause the one or more computers to perform operations for collecting data for a training dataset, comprising: for each of multiple iterations: receiving one or more respective input images of a respective scene; estimating a respective pose of a respective object in the respective scene from the one or more respective input images including aligning keypoints in the one or more respective input images with respective corresponding keypoints of a respective 3-D model of the respective object; rendering, using the respective 3-D model of the respective object posed in accordance with the respective estimated pose, a respective shape estimate comprising a surface normals map that associates each of a plurality of locations in a respective input image with surface normal data; generating a respective data point, the respective data point comprising a respective image of the respective scene and a respective label comprising the respective shape estimate of the respective object; and including the respective data point in the training dataset; training a machine learning model on the training dataset generated using the respective 3-D models of the respective objects to compute shape estimates based on one or more input images; obtaining one or more images of a scene; generating a shape estimate of an object in the scene from the one or more images using the machine learning model; updating an estimated pose of the object to reduce a difference between the shape estimate and a rendered shape of a 3-D model of the object; and providing the estimated pose to a controller for causing a robot to manipulate the object.
8244058 Intwala et al. Aug 2012 B1
8254668 Mashitani et al. Aug 2012 B2
8279325 Pitts et al. Oct 2012 B2
8280194 Wong et al. Oct 2012 B2
8284240 Saint-Pierre et al. Oct 2012 B2
8289409 Chang Oct 2012 B2
8289440 Pitts et al. Oct 2012 B2
8290358 Georgiev Oct 2012 B1
8294099 Blackwell, Jr. Oct 2012 B2
8294754 Jung et al. Oct 2012 B2
8300085 Yang et al. Oct 2012 B2
8305456 McMahon Nov 2012 B1
8315476 Georgiev et al. Nov 2012 B1
8345144 Georgiev et al. Jan 2013 B1
8360574 Ishak et al. Jan 2013 B2
8400555 Georgiev et al. Mar 2013 B1
8406562 Bassi et al. Mar 2013 B2
8411146 Twede Apr 2013 B2
8416282 Lablans Apr 2013 B2
8446492 Nakano et al. May 2013 B2
8456517 Spektor et al. Jun 2013 B2
8493496 Freedman et al. Jul 2013 B2
8514291 Chang Aug 2013 B2
8514491 Duparre Aug 2013 B2
8541730 Inuiya Sep 2013 B2
8542933 Venkataraman et al. Sep 2013 B2
8553093 Wong et al. Oct 2013 B2
8558929 Tredwell Oct 2013 B2
8559705 Ng Oct 2013 B2
8559756 Georgiev et al. Oct 2013 B2
8565547 Strandemar Oct 2013 B2
8576302 Yoshikawa Nov 2013 B2
8577183 Robinson Nov 2013 B2
8581995 Lin et al. Nov 2013 B2
8619082 Ciurea et al. Dec 2013 B1
8648918 Kauker et al. Feb 2014 B2
8648919 Mantzel et al. Feb 2014 B2
8655052 Spooner et al. Feb 2014 B2
8682107 Yoon et al. Mar 2014 B2
8687087 Pertsel et al. Apr 2014 B2
8692893 McMahon Apr 2014 B2
8754941 Sarwari et al. Jun 2014 B1
8773536 Zhang Jul 2014 B1
8780113 Ciurea et al. Jul 2014 B1
8787691 Takahashi et al. Jul 2014 B2
8792710 Keselman Jul 2014 B2
8804255 Duparre Aug 2014 B2
8823813 Mantzel et al. Sep 2014 B2
8830375 Ludwig Sep 2014 B2
8831367 Venkataraman et al. Sep 2014 B2
8831377 Pitts et al. Sep 2014 B2
8836793 Kriesel et al. Sep 2014 B1
8842201 Tajiri Sep 2014 B2
8854433 Rafii Oct 2014 B1
8854462 Herbin et al. Oct 2014 B2
8861089 Duparre Oct 2014 B2
8866912 Mullis Oct 2014 B2
8866920 Venkataraman et al. Oct 2014 B2
8866951 Keelan Oct 2014 B2
8878950 Lelescu et al. Nov 2014 B2
8885059 Venkataraman et al. Nov 2014 B1
8885922 Ito et al. Nov 2014 B2
8896594 Xiong et al. Nov 2014 B2
8896719 Venkataraman et al. Nov 2014 B1
8902321 Venkataraman et al. Dec 2014 B2
8928793 McMahon Jan 2015 B2
8977038 Tian et al. Mar 2015 B2
9001226 Ng et al. Apr 2015 B1
9019426 Han et al. Apr 2015 B2
9025894 Venkataraman et al. May 2015 B2
9025895 Venkataraman et al. May 2015 B2
9030528 Pesach et al. May 2015 B2
9031335 Venkataraman et al. May 2015 B2
9031342 Venkataraman May 2015 B2
9031343 Venkataraman May 2015 B2
9036928 Venkataraman May 2015 B2
9036931 Venkataraman et al. May 2015 B2
9041823 Venkataraman et al. May 2015 B2
9041824 Lelescu et al. May 2015 B2
9041829 Venkataraman et al. May 2015 B2
9042667 Venkataraman et al. May 2015 B2
9047684 Lelescu et al. Jun 2015 B2
9049367 Venkataraman et al. Jun 2015 B2
9055233 Venkataraman et al. Jun 2015 B2
9060120 Venkataraman et al. Jun 2015 B2
9060124 Venkataraman et al. Jun 2015 B2
9077893 Venkataraman et al. Jul 2015 B2
9094661 Venkataraman et al. Jul 2015 B2
9100586 McMahon et al. Aug 2015 B2
9100635 Duparre et al. Aug 2015 B2
9123117 Ciurea et al. Sep 2015 B2
9123118 Ciurea et al. Sep 2015 B2
9124815 Venkataraman et al. Sep 2015 B2
9124831 Mullis Sep 2015 B2
9124864 Mullis Sep 2015 B2
9128228 Duparre Sep 2015 B2
9129183 Venkataraman et al. Sep 2015 B2
9129377 Ciurea et al. Sep 2015 B2
9143711 McMahon Sep 2015 B2
9147254 Florian et al. Sep 2015 B2
9185276 Rodda et al. Nov 2015 B2
9188765 Venkataraman et al. Nov 2015 B2
9191580 Venkataraman et al. Nov 2015 B2
9197821 McMahon Nov 2015 B2
9210392 Nisenzon et al. Dec 2015 B2
9214013 Venkataraman et al. Dec 2015 B2
9235898 Venkataraman et al. Jan 2016 B2
9235900 Ciurea et al. Jan 2016 B2
9240049 Ciurea et al. Jan 2016 B2
9247117 Jacques Jan 2016 B2
9253380 Venkataraman et al. Feb 2016 B2
9253397 Lee et al. Feb 2016 B2
9256974 Hines Feb 2016 B1
9264592 Rodda et al. Feb 2016 B2
9264610 Duparre Feb 2016 B2
9361662 Lelescu et al. Jun 2016 B2
9374512 Venkataraman et al. Jun 2016 B2
9412206 McMahon et al. Aug 2016 B2
9413953 Maeda Aug 2016 B2
9426343 Rodda et al. Aug 2016 B2
9426361 Venkataraman et al. Aug 2016 B2
9438888 Venkataraman et al. Sep 2016 B2
9445003 Lelescu et al. Sep 2016 B1
9456134 Venkataraman et al. Sep 2016 B2
9456196 Kim et al. Sep 2016 B2
9462164 Venkataraman et al. Oct 2016 B2
9485496 Venkataraman et al. Nov 2016 B2
9497370 Venkataraman et al. Nov 2016 B2
9497429 Mullis et al. Nov 2016 B2
9516222 Duparre et al. Dec 2016 B2
9519972 Venkataraman et al. Dec 2016 B2
9521319 Rodda et al. Dec 2016 B2
9521416 McMahon et al. Dec 2016 B1
9536166 Venkataraman et al. Jan 2017 B2
9576369 Venkataraman et al. Feb 2017 B2
9578237 Duparre et al. Feb 2017 B2
9578259 Molina Feb 2017 B2
9602805 Venkataraman et al. Mar 2017 B2
9633442 Venkataraman et al. Apr 2017 B2
9635274 Lin et al. Apr 2017 B2
9638883 Duparre May 2017 B1
9661310 Deng et al. May 2017 B2
9706132 Nisenzon et al. Jul 2017 B2
9712759 Venkataraman et al. Jul 2017 B2
9729865 Kuo et al. Aug 2017 B1
9733486 Lelescu et al. Aug 2017 B2
9741118 Mullis Aug 2017 B2
9743051 Venkataraman et al. Aug 2017 B2
9749547 Venkataraman et al. Aug 2017 B2
9749568 McMahon Aug 2017 B2
9754422 McMahon et al. Sep 2017 B2
9766380 Duparre et al. Sep 2017 B2
9769365 Jannard Sep 2017 B1
9774789 Ciurea et al. Sep 2017 B2
9774831 Venkataraman et al. Sep 2017 B2
9787911 McMahon et al. Oct 2017 B2
9794476 Nayar et al. Oct 2017 B2
9800856 Venkataraman et al. Oct 2017 B2
9800859 Venkataraman et al. Oct 2017 B2
9807382 Duparre et al. Oct 2017 B2
9811753 Venkataraman et al. Nov 2017 B2
9813616 Lelescu et al. Nov 2017 B2
9813617 Venkataraman et al. Nov 2017 B2
9826212 Newton et al. Nov 2017 B2
9858673 Ciurea et al. Jan 2018 B2
9864921 Venkataraman et al. Jan 2018 B2
9866739 McMahon Jan 2018 B2
9888194 Duparre Feb 2018 B2
9892522 Smirnov et al. Feb 2018 B2
9898856 Yang et al. Feb 2018 B2
9917998 Venkataraman et al. Mar 2018 B2
9924092 Rodda et al. Mar 2018 B2
9936148 McMahon Apr 2018 B2
9942474 Venkataraman et al. Apr 2018 B2
9955070 Lelescu et al. Apr 2018 B2
9986224 Mullis May 2018 B2
10009538 Venkataraman et al. Jun 2018 B2
10019816 Venkataraman et al. Jul 2018 B2
10027901 Venkataraman et al. Jul 2018 B2
10089740 Srikanth et al. Oct 2018 B2
10091405 Molina Oct 2018 B2
10119808 Venkataraman et al. Nov 2018 B2
10122993 Venkataraman et al. Nov 2018 B2
10127682 Mullis Nov 2018 B2
10142560 Venkataraman et al. Nov 2018 B2
10182216 Mullis et al. Jan 2019 B2
10218889 McMahon Feb 2019 B2
10225543 Mullis Mar 2019 B2
10250871 Ciurea et al. Apr 2019 B2
10261219 Duparre et al. Apr 2019 B2
10275676 Venkataraman et al. Apr 2019 B2
10306120 Duparre May 2019 B2
10311649 McMahon et al. Jun 2019 B2
10334241 Duparre et al. Jun 2019 B2
10366472 Lelescu et al. Jul 2019 B2
10375302 Nayar et al. Aug 2019 B2
10375319 Venkataraman et al. Aug 2019 B2
10380752 Ciurea et al. Aug 2019 B2
10390005 Nisenzon et al. Aug 2019 B2
10412314 McMahon et al. Sep 2019 B2
10430682 Venkataraman et al. Oct 2019 B2
10455168 McMahon Oct 2019 B2
10455218 Venkataraman et al. Oct 2019 B2
10462362 Lelescu et al. Oct 2019 B2
10482618 Jain et al. Nov 2019 B2
10489683 Koh Nov 2019 B1
10540806 Yang et al. Jan 2020 B2
10542208 Lelescu et al. Jan 2020 B2
10547772 Molina Jan 2020 B2
10560684 Mullis Feb 2020 B2
10574905 Srikanth et al. Feb 2020 B2
10621779 Topiwala Apr 2020 B1
10638099 Mullis et al. Apr 2020 B2
10643383 Venkataraman May 2020 B2
10674138 Venkataraman et al. Jun 2020 B2
10679046 Black Jun 2020 B1
10694114 Venkataraman et al. Jun 2020 B2
10708492 Venkataraman et al. Jul 2020 B2
10735635 Duparre Aug 2020 B2
10742861 McMahon Aug 2020 B2
10767981 Venkataraman et al. Sep 2020 B2
10805589 Venkataraman et al. Oct 2020 B2
10818026 Jain et al. Oct 2020 B2
10839485 Lelescu et al. Nov 2020 B2
10909707 Ciurea et al. Feb 2021 B2
10944961 Ciurea et al. Mar 2021 B2
10958892 Mullis Mar 2021 B2
10984276 Venkataraman et al. Apr 2021 B2
11022725 Duparre et al. Jun 2021 B2
11024046 Venkataraman Jun 2021 B2
12138805 Sundermeyer Nov 2024 B2
12190536 Kuss Jan 2025 B2
20010005225 Clark et al. Jun 2001 A1
20010019621 Hanna et al. Sep 2001 A1
20010028038 Hamaguchi et al. Oct 2001 A1
20010038387 Tomooka et al. Nov 2001 A1
20020003669 Kedar et al. Jan 2002 A1
20020012056 Trevino et al. Jan 2002 A1
20020015536 Warren et al. Feb 2002 A1
20020027608 Johnson et al. Mar 2002 A1
20020028014 Ono Mar 2002 A1
20020039438 Mori et al. Apr 2002 A1
20020057845 Fossum et al. May 2002 A1
20020061131 Sawhney et al. May 2002 A1
20020063807 Margulis May 2002 A1
20020075450 Aratani et al. Jun 2002 A1
20020087403 Meyers et al. Jul 2002 A1
20020089596 Yasuo Jul 2002 A1
20020094027 Sato et al. Jul 2002 A1
20020101528 Lee et al. Aug 2002 A1
20020113867 Takigawa et al. Aug 2002 A1
20020113888 Sonoda et al. Aug 2002 A1
20020118113 Oku et al. Aug 2002 A1
20020120634 Min et al. Aug 2002 A1
20020122113 Foote Sep 2002 A1
20020163054 Suda Nov 2002 A1
20020167537 Trajkovic Nov 2002 A1
20020171666 Endo et al. Nov 2002 A1
20020177054 Saitoh et al. Nov 2002 A1
20020190991 Efran et al. Dec 2002 A1
20020195548 Dowski, Jr. et al. Dec 2002 A1
20030025227 Daniell Feb 2003 A1
20030026474 Yano Feb 2003 A1
20030086079 Barth et al. May 2003 A1
20030124763 Fan et al. Jul 2003 A1
20030140347 Varsa Jul 2003 A1
20030156189 Utsumi et al. Aug 2003 A1
20030179418 Wengender et al. Sep 2003 A1
20030188659 Merry et al. Oct 2003 A1
20030190072 Adkins et al. Oct 2003 A1
20030198377 Ng Oct 2003 A1
20030211405 Venkataraman Nov 2003 A1
20030231179 Suzuki Dec 2003 A1
20040003409 Berstis Jan 2004 A1
20040008271 Hagimori et al. Jan 2004 A1
20040012689 Tinnerino et al. Jan 2004 A1
20040027358 Nakao Feb 2004 A1
20040047274 Amanai Mar 2004 A1
20040050104 Ghosh et al. Mar 2004 A1
20040056966 Schechner et al. Mar 2004 A1
20040061787 Liu et al. Apr 2004 A1
20040066454 Otani et al. Apr 2004 A1
20040071367 Irani et al. Apr 2004 A1
20040075654 Hsiao et al. Apr 2004 A1
20040096119 Williams et al. May 2004 A1
20040100570 Shizukuishi May 2004 A1
20040105021 Hu Jun 2004 A1
20040114807 Lelescu et al. Jun 2004 A1
20040141659 Zhang Jul 2004 A1
20040151401 Sawhney et al. Aug 2004 A1
20040165090 Ning Aug 2004 A1
20040169617 Yelton et al. Sep 2004 A1
20040170340 Tipping et al. Sep 2004 A1
20040174439 Upton Sep 2004 A1
20040179008 Gordon et al. Sep 2004 A1
20040179834 Szajewski et al. Sep 2004 A1
20040196379 Chen et al. Oct 2004 A1
20040207600 Zhang et al. Oct 2004 A1
20040207836 Chhibber et al. Oct 2004 A1
20040212734 Macinnis et al. Oct 2004 A1
20040213449 Safaee-Rad et al. Oct 2004 A1
20040218809 Blake et al. Nov 2004 A1
20040234873 Venkataraman Nov 2004 A1
20040239782 Equitz et al. Dec 2004 A1
20040239885 Jaynes et al. Dec 2004 A1
20040240052 Minefuji et al. Dec 2004 A1
20040251509 Choi Dec 2004 A1
20040264806 Herley Dec 2004 A1
20050006477 Patel Jan 2005 A1
20050007461 Chou et al. Jan 2005 A1
20050009313 Suzuki et al. Jan 2005 A1
20050010621 Pinto et al. Jan 2005 A1
20050012035 Miller Jan 2005 A1
20050036778 DeMonte Feb 2005 A1
20050047678 Jones et al. Mar 2005 A1
20050048690 Yamamoto Mar 2005 A1
20050068436 Fraenkel et al. Mar 2005 A1
20050083531 Millerd et al. Apr 2005 A1
20050084179 Hanna et al. Apr 2005 A1
20050111705 Waupotitsch et al. May 2005 A1
20050117015 Cutler Jun 2005 A1
20050128509 Tokkonen et al. Jun 2005 A1
20050128595 Shimizu Jun 2005 A1
20050132098 Sonoda et al. Jun 2005 A1
20050134698 Schroeder et al. Jun 2005 A1
20050134699 Nagashima Jun 2005 A1
20050134712 Gruhlke et al. Jun 2005 A1
20050147277 Higaki et al. Jul 2005 A1
20050151759 Gonzalez-Banos et al. Jul 2005 A1
20050168924 Wu et al. Aug 2005 A1
20050175257 Kuroki Aug 2005 A1
20050185711 Pfister et al. Aug 2005 A1
20050203380 Sauer et al. Sep 2005 A1
20050205785 Hornback et al. Sep 2005 A1
20050219264 Shum et al. Oct 2005 A1
20050219363 Kohler et al. Oct 2005 A1
20050224843 Boemler Oct 2005 A1
20050225654 Feldman et al. Oct 2005 A1
20050265633 Piacentino et al. Dec 2005 A1
20050275946 Choo et al. Dec 2005 A1
20050286612 Takanashi Dec 2005 A1
20050286756 Hong et al. Dec 2005 A1
20060002635 Nestares et al. Jan 2006 A1
20060007331 Izumi et al. Jan 2006 A1
20060013318 Webb et al. Jan 2006 A1
20060018509 Miyoshi Jan 2006 A1
20060023197 Joel Feb 2006 A1
20060023314 Boettiger et al. Feb 2006 A1
20060028476 Sobel et al. Feb 2006 A1
20060029270 Berestov et al. Feb 2006 A1
20060029271 Miyoshi et al. Feb 2006 A1
20060033005 Jerdev et al. Feb 2006 A1
20060034003 Zalevsky Feb 2006 A1
20060034531 Poon et al. Feb 2006 A1
20060035415 Wood Feb 2006 A1
20060038891 Okutomi et al. Feb 2006 A1
20060039611 Rother et al. Feb 2006 A1
20060046204 Ono et al. Mar 2006 A1
20060049930 Zruya et al. Mar 2006 A1
20060050980 Kohashi et al. Mar 2006 A1
20060054780 Garrood et al. Mar 2006 A1
20060054782 Olsen et al. Mar 2006 A1
20060055811 Fritz et al. Mar 2006 A1
20060069478 Iwama Mar 2006 A1
20060072029 Miyatake et al. Apr 2006 A1
20060087747 Ohzawa et al. Apr 2006 A1
20060098888 Morishita May 2006 A1
20060103754 Wenstrand et al. May 2006 A1
20060119597 Oshino Jun 2006 A1
20060125936 Gruhlke et al. Jun 2006 A1
20060138322 Costello et al. Jun 2006 A1
20060139475 Esch et al. Jun 2006 A1
20060152803 Provitola Jul 2006 A1
20060153290 Watabe et al. Jul 2006 A1
20060157640 Perlman et al. Jul 2006 A1
20060159369 Young Jul 2006 A1
20060176566 Boettiger et al. Aug 2006 A1
20060187322 Janson, Jr. et al. Aug 2006 A1
20060187338 May et al. Aug 2006 A1
20060197937 Bamji et al. Sep 2006 A1
20060203100 Ajito et al. Sep 2006 A1
20060203113 Wada et al. Sep 2006 A1
20060210146 Gu Sep 2006 A1
20060210186 Berkner Sep 2006 A1
20060214085 Olsen et al. Sep 2006 A1
20060215924 Steinberg et al. Sep 2006 A1
20060221250 Rossbach et al. Oct 2006 A1
20060239549 Kelly et al. Oct 2006 A1
20060243889 Farnworth et al. Nov 2006 A1
20060251410 Trutna Nov 2006 A1
20060274174 Tewinkle Dec 2006 A1
20060278948 Yamaguchi et al. Dec 2006 A1
20060279648 Senba et al. Dec 2006 A1
20060289772 Johnson et al. Dec 2006 A1
20070002159 Olsen et al. Jan 2007 A1
20070008575 Yu et al. Jan 2007 A1
20070009150 Suwa Jan 2007 A1
20070024614 Tam et al. Feb 2007 A1
20070030356 Yea et al. Feb 2007 A1
20070035707 Margulis Feb 2007 A1
20070036427 Nakamura et al. Feb 2007 A1
20070040828 Zalevsky et al. Feb 2007 A1
20070040922 McKee et al. Feb 2007 A1
20070041391 Lin et al. Feb 2007 A1
20070052825 Cho Mar 2007 A1
20070083114 Yang et al. Apr 2007 A1
20070085917 Kobayashi Apr 2007 A1
20070092245 Bazakos et al. Apr 2007 A1
20070102622 Olsen et al. May 2007 A1
20070116447 Ye May 2007 A1
20070126898 Feldman et al. Jun 2007 A1
20070127831 Venkataraman Jun 2007 A1
20070139333 Sato et al. Jun 2007 A1
20070140685 Wu Jun 2007 A1
20070146503 Shiraki Jun 2007 A1
20070146511 Kinoshita et al. Jun 2007 A1
20070153335 Hosaka Jul 2007 A1
20070158427 Zhu et al. Jul 2007 A1
20070159541 Sparks et al. Jul 2007 A1
20070160310 Tanida et al. Jul 2007 A1
20070165931 Higaki Jul 2007 A1
20070166447 Ur-Rehman et al. Jul 2007 A1
20070171290 Kroger Jul 2007 A1
20070177004 Kolehmainen et al. Aug 2007 A1
20070182843 Shimamura et al. Aug 2007 A1
20070201859 Sarrat Aug 2007 A1
20070206241 Smith et al. Sep 2007 A1
20070211164 Olsen et al. Sep 2007 A1
20070216765 Wong et al. Sep 2007 A1
20070225600 Weibrecht et al. Sep 2007 A1
20070228256 Mentzer et al. Oct 2007 A1
20070236595 Pan et al. Oct 2007 A1
20070242141 Ciurea Oct 2007 A1
20070247517 Zhang et al. Oct 2007 A1
20070257184 Olsen et al. Nov 2007 A1
20070258006 Olsen et al. Nov 2007 A1
20070258706 Raskar et al. Nov 2007 A1
20070263113 Baek et al. Nov 2007 A1
20070263114 Gurevich et al. Nov 2007 A1
20070268374 Robinson Nov 2007 A1
20070291995 Rivera Dec 2007 A1
20070296721 Chang et al. Dec 2007 A1
20070296832 Ota et al. Dec 2007 A1
20070296835 Olsen et al. Dec 2007 A1
20070296846 Barman et al. Dec 2007 A1
20070296847 Chang et al. Dec 2007 A1
20070297696 Hamza et al. Dec 2007 A1
20080006859 Mionetto Jan 2008 A1
20080019611 Larkin et al. Jan 2008 A1
20080024683 Damera-Venkata et al. Jan 2008 A1
20080025649 Liu et al. Jan 2008 A1
20080030592 Border et al. Feb 2008 A1
20080030597 Olsen et al. Feb 2008 A1
20080043095 Vetro et al. Feb 2008 A1
20080043096 Vetro et al. Feb 2008 A1
20080044170 Yap et al. Feb 2008 A1
20080054518 Ra et al. Mar 2008 A1
20080056302 Erdal et al. Mar 2008 A1
20080062164 Bassi et al. Mar 2008 A1
20080079805 Takagi et al. Apr 2008 A1
20080080028 Bakin et al. Apr 2008 A1
20080084486 Enge et al. Apr 2008 A1
20080088793 Sverdrup et al. Apr 2008 A1
20080095523 Schilling-Benz et al. Apr 2008 A1
20080099804 Venezia et al. May 2008 A1
20080106620 Sawachi May 2008 A1
20080112059 Choi et al. May 2008 A1
20080112635 Kondo et al. May 2008 A1
20080117289 Schowengerdt et al. May 2008 A1
20080118241 TeKolste et al. May 2008 A1
20080131019 Ng Jun 2008 A1
20080131107 Ueno Jun 2008 A1
20080151097 Chen et al. Jun 2008 A1
20080152213 Medioni et al. Jun 2008 A1
20080152215 Horie et al. Jun 2008 A1
20080152296 Oh et al. Jun 2008 A1
20080156991 Hu et al. Jul 2008 A1
20080158259 Kempf et al. Jul 2008 A1
20080158375 Kakkori et al. Jul 2008 A1
20080158698 Chang et al. Jul 2008 A1
20080165257 Boettiger Jul 2008 A1
20080174670 Olsen et al. Jul 2008 A1
20080187305 Raskar et al. Aug 2008 A1
20080193026 Horie et al. Aug 2008 A1
20080208506 Kuwata Aug 2008 A1
20080211737 Kim et al. Sep 2008 A1
20080218610 Chapman et al. Sep 2008 A1
20080218611 Parulski et al. Sep 2008 A1
20080218612 Border et al. Sep 2008 A1
20080218613 Janson et al. Sep 2008 A1
20080219654 Border et al. Sep 2008 A1
20080239116 Smith Oct 2008 A1
20080240598 Hasegawa Oct 2008 A1
20080246866 Kinoshita et al. Oct 2008 A1
20080247638 Tanida et al. Oct 2008 A1
20080247653 Moussavi et al. Oct 2008 A1
20080272416 Yun Nov 2008 A1
20080273751 Yuan et al. Nov 2008 A1
20080278591 Barna et al. Nov 2008 A1
20080278610 Boettiger Nov 2008 A1
20080284880 Numata Nov 2008 A1
20080291295 Kato et al. Nov 2008 A1
20080298674 Baker et al. Dec 2008 A1
20080310501 Ward et al. Dec 2008 A1
20090027543 Kanehiro Jan 2009 A1
20090050946 Duparre et al. Feb 2009 A1
20090052743 Techmer Feb 2009 A1
20090060281 Tanida et al. Mar 2009 A1
20090066693 Carson Mar 2009 A1
20090079862 Subbotin Mar 2009 A1
20090086074 Li et al. Apr 2009 A1
20090091645 Trimeche et al. Apr 2009 A1
20090091806 Inuiya Apr 2009 A1
20090092363 Daum et al. Apr 2009 A1
20090096050 Park Apr 2009 A1
20090102956 Georgiev Apr 2009 A1
20090103792 Rahn et al. Apr 2009 A1
20090109306 Shan et al. Apr 2009 A1
20090127430 Hirasawa et al. May 2009 A1
20090128644 Camp, Jr. et al. May 2009 A1
20090128833 Yahav May 2009 A1
20090129667 Ho et al. May 2009 A1
20090140131 Utagawa Jun 2009 A1
20090141933 Wagg Jun 2009 A1
20090147919 Goto et al. Jun 2009 A1
20090152664 Klem et al. Jun 2009 A1
20090167922 Perlman et al. Jul 2009 A1
20090167923 Safaee-Rad et al. Jul 2009 A1
20090167934 Gupta Jul 2009 A1
20090175349 Ye et al. Jul 2009 A1
20090179142 Duparre et al. Jul 2009 A1
20090180021 Kikuchi et al. Jul 2009 A1
20090200622 Tai et al. Aug 2009 A1
20090201371 Matsuda et al. Aug 2009 A1
20090207235 Francini et al. Aug 2009 A1
20090219435 Yuan Sep 2009 A1
20090225203 Tanida et al. Sep 2009 A1
20090237520 Kaneko et al. Sep 2009 A1
20090245573 Saptharishi et al. Oct 2009 A1
20090245637 Barman et al. Oct 2009 A1
20090256947 Ciurea et al. Oct 2009 A1
20090263017 Tanbakuchi Oct 2009 A1
20090268192 Koenck et al. Oct 2009 A1
20090268970 Babacan et al. Oct 2009 A1
20090268983 Stone et al. Oct 2009 A1
20090273663 Yoshida Nov 2009 A1
20090274387 Jin Nov 2009 A1
20090279800 Uetani et al. Nov 2009 A1
20090284651 Srinivasan Nov 2009 A1
20090290811 Imai Nov 2009 A1
20090297056 Lelescu et al. Dec 2009 A1
20090302205 Olsen et al. Dec 2009 A9
20090317061 Jung et al. Dec 2009 A1
20090322876 Lee et al. Dec 2009 A1
20090323195 Hembree et al. Dec 2009 A1
20090323206 Oliver et al. Dec 2009 A1
20090324118 Maslov et al. Dec 2009 A1
20100002126 Wenstrand et al. Jan 2010 A1
20100002313 Duparre et al. Jan 2010 A1
20100002314 Duparre Jan 2010 A1
20100007714 Kim et al. Jan 2010 A1
20100013927 Nixon Jan 2010 A1
20100044815 Chang Feb 2010 A1
20100045809 Packard Feb 2010 A1
20100053342 Hwang et al. Mar 2010 A1
20100053347 Agarwala et al. Mar 2010 A1
20100053415 Yun Mar 2010 A1
20100053600 Tanida et al. Mar 2010 A1
20100060746 Olsen et al. Mar 2010 A9
20100073463 Momonoi et al. Mar 2010 A1
20100074532 Gordon et al. Mar 2010 A1
20100085351 Deb et al. Apr 2010 A1
20100085425 Tan Apr 2010 A1
20100086227 Sun et al. Apr 2010 A1
20100091389 Henriksen et al. Apr 2010 A1
20100097444 Lablans Apr 2010 A1
20100097491 Farina et al. Apr 2010 A1
20100103175 Okutomi et al. Apr 2010 A1
20100103259 Tanida et al. Apr 2010 A1
20100103308 Butterfield et al. Apr 2010 A1
20100111444 Coffman May 2010 A1
20100118127 Nam et al. May 2010 A1
20100128145 Pitts et al. May 2010 A1
20100129048 Pitts et al. May 2010 A1
20100133230 Henriksen et al. Jun 2010 A1
20100133418 Sargent et al. Jun 2010 A1
20100141802 Knight et al. Jun 2010 A1
20100142828 Chang et al. Jun 2010 A1
20100142839 Lakus-Becker Jun 2010 A1
20100157073 Kondo et al. Jun 2010 A1
20100165152 Lim Jul 2010 A1
20100166410 Chang Jul 2010 A1
20100171866 Brady et al. Jul 2010 A1
20100177411 Hegde et al. Jul 2010 A1
20100182406 Benitez Jul 2010 A1
20100194860 Mentz et al. Aug 2010 A1
20100194901 van Hoorebeke et al. Aug 2010 A1
20100195716 Klein Gunnewiek et al. Aug 2010 A1
20100201809 Oyama et al. Aug 2010 A1
20100201834 Maruyama et al. Aug 2010 A1
20100202054 Niederer Aug 2010 A1
20100202683 Robinson Aug 2010 A1
20100208100 Olsen et al. Aug 2010 A9
20100214423 Ogawa Aug 2010 A1
20100220212 Perlman et al. Sep 2010 A1
20100223237 Mishra et al. Sep 2010 A1
20100225740 Jung et al. Sep 2010 A1
20100231285 Boomer et al. Sep 2010 A1
20100238327 Griffith et al. Sep 2010 A1
20100244165 Lake et al. Sep 2010 A1
20100245684 Xiao et al. Sep 2010 A1
20100254627 Panahpour Tehrani et al. Oct 2010 A1
20100259610 Petersen Oct 2010 A1
20100265346 Iizuka Oct 2010 A1
20100265381 Yamamoto et al. Oct 2010 A1
20100265385 Knight et al. Oct 2010 A1
20100277629 Tanaka Nov 2010 A1
20100281070 Chan et al. Nov 2010 A1
20100289941 Ito et al. Nov 2010 A1
20100290483 Park et al. Nov 2010 A1
20100302423 Adams, Jr. et al. Dec 2010 A1
20100309292 Ho et al. Dec 2010 A1
20100309368 Choi et al. Dec 2010 A1
20100321595 Chiu Dec 2010 A1
20100321640 Yeh et al. Dec 2010 A1
20100329556 Mitarai et al. Dec 2010 A1
20100329582 Albu et al. Dec 2010 A1
20110001037 Tewinkle Jan 2011 A1
20110013006 Uzenbajakava et al. Jan 2011 A1
20110018973 Takayama Jan 2011 A1
20110019048 Raynor et al. Jan 2011 A1
20110019243 Constant, Jr. et al. Jan 2011 A1
20110031381 Tay et al. Feb 2011 A1
20110032341 Ignatov et al. Feb 2011 A1
20110032370 Ludwig Feb 2011 A1
20110033129 Robinson Feb 2011 A1
20110038536 Gong Feb 2011 A1
20110043604 Peleg et al. Feb 2011 A1
20110043613 Rohaly et al. Feb 2011 A1
20110043661 Podoleanu Feb 2011 A1
20110043665 Ogasahara Feb 2011 A1
20110043668 McKinnon et al. Feb 2011 A1
20110044502 Liu et al. Feb 2011 A1
20110051255 Lee et al. Mar 2011 A1
20110055729 Mason et al. Mar 2011 A1
20110064327 Dagher et al. Mar 2011 A1
20110069189 Venkataraman et al. Mar 2011 A1
20110080487 Venkataraman et al. Apr 2011 A1
20110084893 Lee et al. Apr 2011 A1
20110085028 Samadani et al. Apr 2011 A1
20110090217 Mashitani et al. Apr 2011 A1
20110102553 Corcoran et al. May 2011 A1
20110108708 Olsen et al. May 2011 A1
20110115886 Nguyen et al. May 2011 A1
20110121421 Charbon et al. May 2011 A1
20110122308 Duparre May 2011 A1
20110128393 Tavi et al. Jun 2011 A1
20110128412 Milnes et al. Jun 2011 A1
20110129165 Lim et al. Jun 2011 A1
20110141309 Nagashima et al. Jun 2011 A1
20110142138 Tian et al. Jun 2011 A1
20110149408 Haugholt et al. Jun 2011 A1
20110149409 Haugholt et al. Jun 2011 A1
20110150321 Cheong et al. Jun 2011 A1
20110153248 Gu et al. Jun 2011 A1
20110157321 Nakajima et al. Jun 2011 A1
20110157451 Chang Jun 2011 A1
20110169994 DiFrancesco et al. Jul 2011 A1
20110176020 Chang Jul 2011 A1
20110181797 Galstian et al. Jul 2011 A1
20110193944 Lian et al. Aug 2011 A1
20110199458 Hayasaka et al. Aug 2011 A1
20110200319 Kravitz et al. Aug 2011 A1
20110206291 Kashani et al. Aug 2011 A1
20110207074 Hall-Holt et al. Aug 2011 A1
20110211068 Yokota Sep 2011 A1
20110211077 Nayar et al. Sep 2011 A1
20110211824 Georgiev et al. Sep 2011 A1
20110221599 Högasten Sep 2011 A1
20110221658 Haddick et al. Sep 2011 A1
20110221939 Jerdev Sep 2011 A1
20110221950 Oostra et al. Sep 2011 A1
20110222757 Yeatman, Jr. et al. Sep 2011 A1
20110228142 Brueckner et al. Sep 2011 A1
20110228144 Tian et al. Sep 2011 A1
20110234825 Liu et al. Sep 2011 A1
20110234841 Akeley et al. Sep 2011 A1
20110241234 Duparre Oct 2011 A1
20110242342 Goma et al. Oct 2011 A1
20110242355 Goma et al. Oct 2011 A1
20110242356 Aleksic et al. Oct 2011 A1
20110243428 Das Gupta et al. Oct 2011 A1
20110255592 Sung et al. Oct 2011 A1
20110255745 Hodder et al. Oct 2011 A1
20110255786 Hunter et al. Oct 2011 A1
20110261993 Weiming et al. Oct 2011 A1
20110267264 Mccarthy et al. Nov 2011 A1
20110267348 Lin et al. Nov 2011 A1
20110273531 Ito et al. Nov 2011 A1
20110274175 Sumitomo Nov 2011 A1
20110274366 Tardif Nov 2011 A1
20110279705 Kuang et al. Nov 2011 A1
20110279721 McMahon Nov 2011 A1
20110285701 Chen et al. Nov 2011 A1
20110285866 Bhrugumalla et al. Nov 2011 A1
20110285910 Bamji et al. Nov 2011 A1
20110292216 Fergus et al. Dec 2011 A1
20110298898 Jung et al. Dec 2011 A1
20110298917 Yanagita Dec 2011 A1
20110300929 Tardif et al. Dec 2011 A1
20110310980 Mathew Dec 2011 A1
20110316968 Taguchi et al. Dec 2011 A1
20110317766 Lim et al. Dec 2011 A1
20120012748 Pain Jan 2012 A1
20120013748 Stanwood et al. Jan 2012 A1
20120014456 Martinez Bauza et al. Jan 2012 A1
20120019530 Baker Jan 2012 A1
20120019700 Gaber Jan 2012 A1
20120023456 Sun et al. Jan 2012 A1
20120026297 Sato Feb 2012 A1
20120026342 Yu et al. Feb 2012 A1
20120026366 Golan et al. Feb 2012 A1
20120026451 Nystrom Feb 2012 A1
20120026478 Chen et al. Feb 2012 A1
20120038745 Yu et al. Feb 2012 A1
20120039525 Tian et al. Feb 2012 A1
20120044249 Mashitani et al. Feb 2012 A1
20120044372 Côté et al. Feb 2012 A1
20120051624 Ando Mar 2012 A1
20120056982 Katz et al. Mar 2012 A1
20120057040 Park et al. Mar 2012 A1
20120062697 Treado et al. Mar 2012 A1
20120062702 Jiang et al. Mar 2012 A1
20120062756 Tian et al. Mar 2012 A1
20120069235 Imai Mar 2012 A1
20120081519 Goma et al. Apr 2012 A1
20120086803 Malzbender et al. Apr 2012 A1
20120105590 Fukumoto et al. May 2012 A1
20120105654 Kwatra et al. May 2012 A1
20120105691 Waqas et al. May 2012 A1
20120113232 Joblove May 2012 A1
20120113318 Galstian et al. May 2012 A1
20120113413 Miahczylowicz-Wolski et al. May 2012 A1
20120114224 Xu et al. May 2012 A1
20120114260 Takahashi et al. May 2012 A1
20120120264 Lee et al. May 2012 A1
20120127275 Von Zitzewitz et al. May 2012 A1
20120127284 Bar-Zeev et al. May 2012 A1
20120147139 Li et al. Jun 2012 A1
20120147205 Lelescu et al. Jun 2012 A1
20120153153 Chang et al. Jun 2012 A1
20120154551 Inoue Jun 2012 A1
20120155830 Sasaki et al. Jun 2012 A1
20120162374 Markas et al. Jun 2012 A1
20120163672 McKinnon Jun 2012 A1
20120163725 Fukuhara Jun 2012 A1
20120169433 Mullins et al. Jul 2012 A1
20120170134 Bolis et al. Jul 2012 A1
20120176479 Mayhew et al. Jul 2012 A1
20120176481 Lukk et al. Jul 2012 A1
20120188235 Wu et al. Jul 2012 A1
20120188341 Klein Gunnewiek et al. Jul 2012 A1
20120188389 Lin et al. Jul 2012 A1
20120188420 Black et al. Jul 2012 A1
20120188634 Kubala et al. Jul 2012 A1
20120198677 Duparre Aug 2012 A1
20120200669 Lai et al. Aug 2012 A1
20120200726 Bugnariu Aug 2012 A1
20120200734 Tang Aug 2012 A1
20120206582 DiCarlo et al. Aug 2012 A1
20120218455 Imai et al. Aug 2012 A1
20120219236 Ali et al. Aug 2012 A1
20120224083 Jovanovski et al. Sep 2012 A1
20120229602 Chen et al. Sep 2012 A1
20120229628 Ishiyama et al. Sep 2012 A1
20120237114 Park et al. Sep 2012 A1
20120249550 Akeley et al. Oct 2012 A1
20120249750 Izzat et al. Oct 2012 A1
20120249836 Ali et al. Oct 2012 A1
20120249853 Krolczyk et al. Oct 2012 A1
20120250990 Bocirnea Oct 2012 A1
20120262601 Choi et al. Oct 2012 A1
20120262607 Shimura et al. Oct 2012 A1
20120268574 Gidon et al. Oct 2012 A1
20120274626 Hsieh Nov 2012 A1
20120287291 McMahon Nov 2012 A1
20120290257 Hodge et al. Nov 2012 A1
20120293489 Chen et al. Nov 2012 A1
20120293624 Chen et al. Nov 2012 A1
20120293695 Tanaka Nov 2012 A1
20120307084 Mantzel Dec 2012 A1
20120307093 Miyoshi Dec 2012 A1
20120307099 Yahata Dec 2012 A1
20120314033 Lee et al. Dec 2012 A1
20120314937 Kim et al. Dec 2012 A1
20120327222 Ng et al. Dec 2012 A1
20130002828 Ding et al. Jan 2013 A1
20130002953 Noguchi et al. Jan 2013 A1
20130003184 Duparre Jan 2013 A1
20130010073 Do et al. Jan 2013 A1
20130016245 Yuba Jan 2013 A1
20130016885 Tsujimoto Jan 2013 A1
20130022111 Chen et al. Jan 2013 A1
20130027580 Olsen et al. Jan 2013 A1
20130033579 Wajs Feb 2013 A1
20130033585 Li et al. Feb 2013 A1
20130038696 Ding et al. Feb 2013 A1
20130047396 Au et al. Feb 2013 A1
20130050504 Safaee-Rad et al. Feb 2013 A1
20130050526 Keelan Feb 2013 A1
20130057710 McMahon Mar 2013 A1
20130070060 Chatterjee et al. Mar 2013 A1
20130076967 Brunner et al. Mar 2013 A1
20130077859 Stauder et al. Mar 2013 A1
20130077880 Venkataraman et al. Mar 2013 A1
20130077882 Venkataraman et al. Mar 2013 A1
20130083172 Baba Apr 2013 A1
20130088489 Schmeitz et al. Apr 2013 A1
20130088637 Duparre Apr 2013 A1
20130093842 Yahata Apr 2013 A1
20130100254 Morioka et al. Apr 2013 A1
20130107061 Kumar et al. May 2013 A1
20130113888 Koguchi May 2013 A1
20130113899 Morohoshi et al. May 2013 A1
20130113939 Strandemar May 2013 A1
20130120536 Song et al. May 2013 A1
20130120605 Georgiev et al. May 2013 A1
20130121559 Hu et al. May 2013 A1
20130127988 Wang et al. May 2013 A1
20130128049 Schofield et al. May 2013 A1
20130128068 Georgiev et al. May 2013 A1
20130128069 Georgiev et al. May 2013 A1
20130128087 Georgiev et al. May 2013 A1
20130128121 Agarwala et al. May 2013 A1
20130135315 Bares et al. May 2013 A1
20130135448 Nagumo et al. May 2013 A1
20130147979 McMahon et al. Jun 2013 A1
20130155050 Rastogi et al. Jun 2013 A1
20130162641 Zhang et al. Jun 2013 A1
20130169754 Aronsson et al. Jul 2013 A1
20130176394 Tian et al. Jul 2013 A1
20130208138 Li et al. Aug 2013 A1
20130215108 McMahon et al. Aug 2013 A1
20130215231 Hiramoto et al. Aug 2013 A1
20130216144 Robinson et al. Aug 2013 A1
20130222556 Shimada Aug 2013 A1
20130222656 Kaneko Aug 2013 A1
20130223759 Nishiyama Aug 2013 A1
20130229540 Farina et al. Sep 2013 A1
20130230237 Schlosser et al. Sep 2013 A1
20130250123 Zhang et al. Sep 2013 A1
20130250150 Malone et al. Sep 2013 A1
20130258067 Zhang et al. Oct 2013 A1
20130259317 Gaddy Oct 2013 A1
20130265459 Duparre et al. Oct 2013 A1
20130274596 Azizian et al. Oct 2013 A1
20130274923 By Oct 2013 A1
20130278631 Border et al. Oct 2013 A1
20130286236 Mankowski Oct 2013 A1
20130293760 Nisenzon et al. Nov 2013 A1
20130308197 Duparre Nov 2013 A1
20130321581 El-Ghoroury et al. Dec 2013 A1
20130321589 Kirk et al. Dec 2013 A1
20130335598 Gustavsson et al. Dec 2013 A1
20130342641 Morioka et al. Dec 2013 A1
20140002674 Duparre et al. Jan 2014 A1
20140002675 Duparre et al. Jan 2014 A1
20140009586 McNamer et al. Jan 2014 A1
20140013273 Ng Jan 2014 A1
20140037137 Broaddus et al. Feb 2014 A1
20140037140 Benhimane et al. Feb 2014 A1
20140043507 Wang et al. Feb 2014 A1
20140059462 Wernersson Feb 2014 A1
20140076336 Clayton et al. Mar 2014 A1
20140078333 Miao Mar 2014 A1
20140079336 Venkataraman et al. Mar 2014 A1
20140081454 Nuyujukian et al. Mar 2014 A1
20140085502 Lin et al. Mar 2014 A1
20140092281 Nisenzon et al. Apr 2014 A1
20140098266 Nayar et al. Apr 2014 A1
20140098267 Tian et al. Apr 2014 A1
20140104490 Hsieh et al. Apr 2014 A1
20140118493 Sali et al. May 2014 A1
20140118584 Lee et al. May 2014 A1
20140125760 Au et al. May 2014 A1
20140125771 Grossmann et al. May 2014 A1
20140132810 McMahon May 2014 A1
20140139642 Ni et al. May 2014 A1
20140139643 Hogasten et al. May 2014 A1
20140140626 Cho et al. May 2014 A1
20140146132 Bagnato et al. May 2014 A1
20140146201 Knight et al. May 2014 A1
20140176592 Wilburn et al. Jun 2014 A1
20140183258 DiMuro Jul 2014 A1
20140183334 Wang et al. Jul 2014 A1
20140186045 Poddar et al. Jul 2014 A1
20140192154 Jeong et al. Jul 2014 A1
20140192253 Laroia Jul 2014 A1
20140198188 Izawa Jul 2014 A1
20140204183 Lee et al. Jul 2014 A1
20140218546 McMahon Aug 2014 A1
20140232822 Venkataraman et al. Aug 2014 A1
20140240528 Venkataraman et al. Aug 2014 A1
20140240529 Venkataraman et al. Aug 2014 A1
20140253738 Mullis Sep 2014 A1
20140267243 Venkataraman et al. Sep 2014 A1
20140267286 Duparre Sep 2014 A1
20140267633 Venkataraman et al. Sep 2014 A1
20140267762 Mullis et al. Sep 2014 A1
20140267829 McMahon et al. Sep 2014 A1
20140267890 Lelescu et al. Sep 2014 A1
20140285675 Mullis Sep 2014 A1
20140300706 Song Oct 2014 A1
20140307058 Kirk et al. Oct 2014 A1
20140307063 Lee Oct 2014 A1
20140313315 Shoham et al. Oct 2014 A1
20140321712 Ciurea et al. Oct 2014 A1
20140333731 Venkataraman et al. Nov 2014 A1
20140333764 Venkataraman et al. Nov 2014 A1
20140333787 Venkataraman et al. Nov 2014 A1
20140340539 Venkataraman et al. Nov 2014 A1
20140347509 Venkataraman et al. Nov 2014 A1
20140347748 Duparre Nov 2014 A1
20140354773 Venkataraman et al. Dec 2014 A1
20140354843 Venkataraman et al. Dec 2014 A1
20140354844 Venkataraman et al. Dec 2014 A1
20140354853 Venkataraman et al. Dec 2014 A1
20140354854 Venkataraman et al. Dec 2014 A1
20140354855 Venkataraman et al. Dec 2014 A1
20140355870 Venkataraman et al. Dec 2014 A1
20140368662 Venkataraman et al. Dec 2014 A1
20140368683 Venkataraman et al. Dec 2014 A1
20140368684 Venkataraman et al. Dec 2014 A1
20140368685 Venkataraman et al. Dec 2014 A1
20140368686 Duparre Dec 2014 A1
20140369612 Venkataraman et al. Dec 2014 A1
20140369615 Venkataraman et al. Dec 2014 A1
20140376825 Venkataraman et al. Dec 2014 A1
20140376826 Venkataraman et al. Dec 2014 A1
20150002734 Lee Jan 2015 A1
20150003752 Venkataraman et al. Jan 2015 A1
20150003753 Venkataraman et al. Jan 2015 A1
20150009353 Venkataraman et al. Jan 2015 A1
20150009354 Venkataraman et al. Jan 2015 A1
20150009362 Venkataraman et al. Jan 2015 A1
20150015669 Venkataraman et al. Jan 2015 A1
20150035992 Mullis Feb 2015 A1
20150036014 Lelescu et al. Feb 2015 A1
20150036015 Lelescu et al. Feb 2015 A1
20150042766 Ciurea et al. Feb 2015 A1
20150042767 Ciurea et al. Feb 2015 A1
20150042814 Vaziri Feb 2015 A1
20150042833 Lelescu et al. Feb 2015 A1
20150049915 Ciurea et al. Feb 2015 A1
20150049916 Ciurea et al. Feb 2015 A1
20150049917 Ciurea et al. Feb 2015 A1
20150055884 Venkataraman et al. Feb 2015 A1
20150085073 Bruls et al. Mar 2015 A1
20150085174 Shabtay et al. Mar 2015 A1
20150091900 Yang et al. Apr 2015 A1
20150095235 Dua Apr 2015 A1
20150098079 Montgomery et al. Apr 2015 A1
20150104076 Hayasaka Apr 2015 A1
20150104101 Bryant et al. Apr 2015 A1
20150122411 Rodda et al. May 2015 A1
20150124059 Georgiev et al. May 2015 A1
20150124113 Rodda et al. May 2015 A1
20150124151 Rodda et al. May 2015 A1
20150138346 Venkataraman et al. May 2015 A1
20150146029 Venkataraman et al. May 2015 A1
20150146030 Venkataraman et al. May 2015 A1
20150161798 Venkataraman et al. Jun 2015 A1
20150199793 Venkataraman et al. Jul 2015 A1
20150199841 Venkataraman et al. Jul 2015 A1
20150207990 Ford et al. Jul 2015 A1
20150228081 Kim et al. Aug 2015 A1
20150235476 McMahon et al. Aug 2015 A1
20150237329 Venkataraman et al. Aug 2015 A1
20150243480 Yamada Aug 2015 A1
20150244927 Laroia et al. Aug 2015 A1
20150245013 Venkataraman et al. Aug 2015 A1
20150248744 Hayasaka et al. Sep 2015 A1
20150254868 Srikanth et al. Sep 2015 A1
20150264337 Venkataraman et al. Sep 2015 A1
20150288861 Duparre Oct 2015 A1
20150296137 Duparre et al. Oct 2015 A1
20150312455 Venkataraman et al. Oct 2015 A1
20150317638 Donaldson Nov 2015 A1
20150326852 Duparre et al. Nov 2015 A1
20150332468 Hayasaka et al. Nov 2015 A1
20150373261 Rodda et al. Dec 2015 A1
20160037097 Duparre Feb 2016 A1
20160042548 Du et al. Feb 2016 A1
20160044252 Molina Feb 2016 A1
20160044257 Venkataraman et al. Feb 2016 A1
20160057332 Ciurea et al. Feb 2016 A1
20160065934 Kaza et al. Mar 2016 A1
20160163051 Mullis Jun 2016 A1
20160165106 Duparre Jun 2016 A1
20160165134 Lelescu et al. Jun 2016 A1
20160165147 Nisenzon et al. Jun 2016 A1
20160165212 Mullis Jun 2016 A1
20160182786 Anderson et al. Jun 2016 A1
20160191768 Shin et al. Jun 2016 A1
20160195733 Lelescu et al. Jul 2016 A1
20160198096 McMahon et al. Jul 2016 A1
20160209654 Riccomini et al. Jul 2016 A1
20160210785 Balachandreswaran et al. Jul 2016 A1
20160227195 Venkataraman et al. Aug 2016 A1
20160249001 McMahon Aug 2016 A1
20160255333 Nisenzon et al. Sep 2016 A1
20160266284 Duparre et al. Sep 2016 A1
20160267486 Mitra et al. Sep 2016 A1
20160267665 Venkataraman et al. Sep 2016 A1
20160267672 Ciurea et al. Sep 2016 A1
20160269626 McMahon Sep 2016 A1
20160269627 McMahon Sep 2016 A1
20160269650 Venkataraman et al. Sep 2016 A1
20160269651 Venkataraman et al. Sep 2016 A1
20160269664 Duparre Sep 2016 A1
20160309084 Venkataraman et al. Oct 2016 A1
20160309134 Venkataraman et al. Oct 2016 A1
20160316140 Nayar et al. Oct 2016 A1
20160323578 Kaneko et al. Nov 2016 A1
20170004791 Aubineau et al. Jan 2017 A1
20170006233 Venkataraman et al. Jan 2017 A1
20170011405 Pandey Jan 2017 A1
20170048468 Pain et al. Feb 2017 A1
20170053382 Lelescu et al. Feb 2017 A1
20170054901 Venkataraman et al. Feb 2017 A1
20170070672 Rodda et al. Mar 2017 A1
20170070673 Lelescu et al. Mar 2017 A1
20170070753 Kaneko Mar 2017 A1
20170078568 Venkataraman et al. Mar 2017 A1
20170085845 Venkataraman et al. Mar 2017 A1
20170094243 Venkataraman et al. Mar 2017 A1
20170099465 Mullis et al. Apr 2017 A1
20170109742 Varadarajan Apr 2017 A1
20170142405 Shors et al. May 2017 A1
20170163862 Molina Jun 2017 A1
20170178363 Venkataraman et al. Jun 2017 A1
20170187933 Duparre Jun 2017 A1
20170188011 Panescu et al. Jun 2017 A1
20170244960 Ciurea et al. Aug 2017 A1
20170257562 Venkataraman et al. Sep 2017 A1
20170365104 McMahon et al. Dec 2017 A1
20180005244 Govindarajan et al. Jan 2018 A1
20180007284 Venkataraman et al. Jan 2018 A1
20180012411 Richey Jan 2018 A1
20180013945 Ciurea et al. Jan 2018 A1
20180024330 Laroia Jan 2018 A1
20180035057 McMahon et al. Feb 2018 A1
20180040135 Mullis Feb 2018 A1
20180048830 Venkataraman et al. Feb 2018 A1
20180048879 Venkataraman et al. Feb 2018 A1
20180050453 Peters Feb 2018 A1
20180081090 Duparre et al. Mar 2018 A1
20180097993 Nayar et al. Apr 2018 A1
20180109782 Duparre et al. Apr 2018 A1
20180124311 Lelescu et al. May 2018 A1
20180131852 McMahon May 2018 A1
20180139382 Venkataraman et al. May 2018 A1
20180189767 Bigioi Jul 2018 A1
20180197035 Venkataraman et al. Jul 2018 A1
20180211402 Ciurea et al. Jul 2018 A1
20180227511 McMahon Aug 2018 A1
20180240265 Yang et al. Aug 2018 A1
20180270473 Mullis Sep 2018 A1
20180286120 Fleishman et al. Oct 2018 A1
20180302554 Lelescu et al. Oct 2018 A1
20180330182 Venkataraman et al. Nov 2018 A1
20180376122 Park et al. Dec 2018 A1
20190012768 Tafazoli Bilandi et al. Jan 2019 A1
20190037116 Molina Jan 2019 A1
20190037150 Srikanth et al. Jan 2019 A1
20190043253 Lucas et al. Feb 2019 A1
20190057513 Jain et al. Feb 2019 A1
20190063905 Venkataraman et al. Feb 2019 A1
20190089947 Venkataraman et al. Mar 2019 A1
20190098209 Venkataraman et al. Mar 2019 A1
20190109998 Venkataraman et al. Apr 2019 A1
20190164341 Venkataraman May 2019 A1
20190174040 McMahon Jun 2019 A1
20190197735 Xiong et al. Jun 2019 A1
20190215496 Mullis et al. Jul 2019 A1
20190230348 Ciurea et al. Jul 2019 A1
20190235138 Duparre et al. Aug 2019 A1
20190243086 Rodda et al. Aug 2019 A1
20190244379 Venkataraman Aug 2019 A1
20190251744 Flagg Aug 2019 A1
20190268586 Mullis Aug 2019 A1
20190278983 Iqbal Sep 2019 A1
20190289176 Duparre Sep 2019 A1
20190347768 Lelescu et al. Nov 2019 A1
20190355150 Tremblay Nov 2019 A1
20190356863 Venkataraman et al. Nov 2019 A1
20190362515 Ciurea et al. Nov 2019 A1
20190364263 Jannard et al. Nov 2019 A1
20200026948 Venkataraman et al. Jan 2020 A1
20200151894 Jain et al. May 2020 A1
20200175759 Russell Jun 2020 A1
20200252597 Mullis Aug 2020 A1
20200302634 Pollefeys Sep 2020 A1
20200311956 Choi Oct 2020 A1
20200334905 Venkataraman Oct 2020 A1
20200389604 Venkataraman et al. Dec 2020 A1
20210042952 Jain et al. Feb 2021 A1
20210044790 Venkataraman et al. Feb 2021 A1
20210063141 Venkataraman et al. Mar 2021 A1
20210133927 Lelescu et al. May 2021 A1
20210150748 Ciurea et al. May 2021 A1
20210366153 Hoelscher Nov 2021 A1
20220072707 Fan Mar 2022 A1
20220101639 Shugurov Mar 2022 A1
20220277472 Birchfield Sep 2022 A1
20220375125 Taamazyan Nov 2022 A1
Foreign Referenced Citations (278)
Number Date Country
2488005 Apr 2002 CN
1619358 May 2005 CN
1669332 Sep 2005 CN
1727991 Feb 2006 CN
1839394 Sep 2006 CN
1985524 Jun 2007 CN
1992499 Jul 2007 CN
101010619 Aug 2007 CN
101046882 Oct 2007 CN
101064780 Oct 2007 CN
101102388 Jan 2008 CN
101147392 Mar 2008 CN
201043890 Apr 2008 CN
101212566 Jul 2008 CN
101312540 Nov 2008 CN
101427372 May 2009 CN
101551586 Oct 2009 CN
101593350 Dec 2009 CN
101606086 Dec 2009 CN
101785025 Jul 2010 CN
101883291 Nov 2010 CN
102037717 Apr 2011 CN
102164298 Aug 2011 CN
102184720 Sep 2011 CN
102375199 Mar 2012 CN
103004180 Mar 2013 CN
103765864 Apr 2014 CN
104081414 Oct 2014 CN
104508681 Apr 2015 CN
104662589 May 2015 CN
104685513 Jun 2015 CN
104685860 Jun 2015 CN
105409212 Mar 2016 CN
103765864 Jul 2017 CN
104081414 Aug 2017 CN
104662589 Aug 2017 CN
107077743 Aug 2017 CN
107230236 Oct 2017 CN
107346061 Nov 2017 CN
107404609 Nov 2017 CN
104685513 Apr 2018 CN
107924572 Apr 2018 CN
108307675 Jul 2018 CN
104335246 Sep 2018 CN
107404609 Feb 2020 CN
107346061 Apr 2020 CN
107230236 Dec 2020 CN
108307675 Dec 2020 CN
107077743 Mar 2021 CN
602011041799.1 Sep 2017 DE
0677821 Oct 1995 EP
0840502 May 1998 EP
1201407 May 2002 EP
1355274 Oct 2003 EP
1734766 Dec 2006 EP
1991145 Nov 2008 EP
1243945 Jan 2009 EP
2026563 Feb 2009 EP
2031592 Mar 2009 EP
2041454 Apr 2009 EP
2072785 Jun 2009 EP
2104334 Sep 2009 EP
2136345 Dec 2009 EP
2156244 Feb 2010 EP
2244484 Oct 2010 EP
0957642 Apr 2011 EP
2336816 Jun 2011 EP
2339532 Jun 2011 EP
2381418 Oct 2011 EP
2386554 Nov 2011 EP
2462477 Jun 2012 EP
2502115 Sep 2012 EP
2569935 Mar 2013 EP
2652678 Oct 2013 EP
2677066 Dec 2013 EP
2708019 Mar 2014 EP
2761534 Aug 2014 EP
2777245 Sep 2014 EP
2867718 May 2015 EP
2873028 May 2015 EP
2888698 Jul 2015 EP
2888720 Jul 2015 EP
2901671 Aug 2015 EP
2973476 Jan 2016 EP
3066690 Sep 2016 EP
2569935 Dec 2016 EP
3201877 Aug 2017 EP
2652678 Sep 2017 EP
3284061 Feb 2018 EP
3286914 Feb 2018 EP
3201877 Mar 2018 EP
2817955 Apr 2018 EP
3328048 May 2018 EP
3075140 Jun 2018 EP
3201877 Dec 2018 EP
3467776 Apr 2019 EP
2708019 Oct 2019 EP
3286914 Dec 2019 EP
2761534 Nov 2020 EP
2888720 Mar 2021 EP
3328048 Apr 2021 EP
2482022 Jan 2012 GB
2708CHENP2014 Aug 2015 IN
361194 Mar 2021 IN
59-025483 Feb 1984 JP
64-037177 Feb 1989 JP
02-285772 Nov 1990 JP
06129851 May 1994 JP
07-015457 Jan 1995 JP
H0756112 Mar 1995 JP
09171075 Jun 1997 JP
09181913 Jul 1997 JP
10253351 Sep 1998 JP
11142609 May 1999 JP
11223708 Aug 1999 JP
11325889 Nov 1999 JP
2000209503 Jul 2000 JP
2001008235 Jan 2001 JP
2001194114 Jul 2001 JP
2001264033 Sep 2001 JP
2001277260 Oct 2001 JP
2001337263 Dec 2001 JP
2002195910 Jul 2002 JP
2002205310 Jul 2002 JP
2002209226 Jul 2002 JP
2002250607 Sep 2002 JP
2002252338 Sep 2002 JP
2003094445 Apr 2003 JP
2003139910 May 2003 JP
2003163938 Jun 2003 JP
2003298920 Oct 2003 JP
2004221585 Aug 2004 JP
2005116022 Apr 2005 JP
2005181460 Jul 2005 JP
2005295381 Oct 2005 JP
2005303694 Oct 2005 JP
2005341569 Dec 2005 JP
2005354124 Dec 2005 JP
2006033228 Feb 2006 JP
2006033493 Feb 2006 JP
2006047944 Feb 2006 JP
2006258930 Sep 2006 JP
2007520107 Jul 2007 JP
2007259136 Oct 2007 JP
2008039852 Feb 2008 JP
2008055908 Mar 2008 JP
2008507874 Mar 2008 JP
2008172735 Jul 2008 JP
2008258885 Oct 2008 JP
2009064421 Mar 2009 JP
2009132010 Jun 2009 JP
2009300268 Dec 2009 JP
2010139288 Jun 2010 JP
2011017764 Jan 2011 JP
2011030184 Feb 2011 JP
2011109484 Jun 2011 JP
2011523538 Aug 2011 JP
2011203238 Oct 2011 JP
2012504805 Feb 2012 JP
2011052064 Mar 2013 JP
2013509022 Mar 2013 JP
2013526801 Jun 2013 JP
2014519741 Aug 2014 JP
2014521117 Aug 2014 JP
2014535191 Dec 2014 JP
2015022510 Feb 2015 JP
2015522178 Aug 2015 JP
2015534734 Dec 2015 JP
5848754 Jan 2016 JP
2016524125 Aug 2016 JP
6140709 May 2017 JP
2017163550 Sep 2017 JP
2017163587 Sep 2017 JP
2017531976 Oct 2017 JP
6546613 Jul 2019 JP
2019-220957 Dec 2019 JP
6630891 Dec 2019 JP
2020017999 Jan 2020 JP
6767543 Sep 2020 JP
6767558 Sep 2020 JP
1020050004239 Jan 2005 KR
100496875 Jun 2005 KR
1020110097647 Aug 2011 KR
20140045373 Apr 2014 KR
20170063827 Jun 2017 KR
101824672 Feb 2018 KR
101843994 Mar 2018 KR
101973822 Apr 2019 KR
10-2002165 Jul 2019 KR
10-2111181 May 2020 KR
191151 Jul 2013 SG
11201500910 Oct 2015 SG
200828994 Jul 2008 TW
200939739 Sep 2009 TW
201228382 Jul 2012 TW
I535292 May 2016 TW
1994020875 Sep 1994 WO
2005057922 Jun 2005 WO
2006039906 Apr 2006 WO
2006039906 Apr 2006 WO
2007013250 Feb 2007 WO
2007083579 Jul 2007 WO
2007134137 Nov 2007 WO
2008045198 Apr 2008 WO
2008050904 May 2008 WO
2008108271 Sep 2008 WO
2008108926 Sep 2008 WO
2008150817 Dec 2008 WO
2009073950 Jun 2009 WO
2009151903 Dec 2009 WO
2009157273 Dec 2009 WO
2010037512 Apr 2010 WO
2011008443 Jan 2011 WO
2011026527 Mar 2011 WO
2011046607 Apr 2011 WO
2011055655 May 2011 WO
2011063347 May 2011 WO
2011105814 Sep 2011 WO
2011116203 Sep 2011 WO
2011063347 Oct 2011 WO
2011121117 Oct 2011 WO
2011143501 Nov 2011 WO
2012057619 May 2012 WO
2012057620 May 2012 WO
2012057621 May 2012 WO
2012057622 May 2012 WO
2012057623 May 2012 WO
2012057620 Jun 2012 WO
2012074361 Jun 2012 WO
2012078126 Jun 2012 WO
2012082904 Jun 2012 WO
2012155119 Nov 2012 WO
2013003276 Jan 2013 WO
2013043751 Mar 2013 WO
2013043761 Mar 2013 WO
2013049699 Apr 2013 WO
2013055960 Apr 2013 WO
2013119706 Aug 2013 WO
2013126578 Aug 2013 WO
2013166215 Nov 2013 WO
2014004134 Jan 2014 WO
2014005123 Jan 2014 WO
2014031795 Feb 2014 WO
2014052974 Apr 2014 WO
2014032020 May 2014 WO
2014078443 May 2014 WO
2014130849 Aug 2014 WO
2014131038 Aug 2014 WO
2014133974 Sep 2014 WO
2014138695 Sep 2014 WO
2014138697 Sep 2014 WO
2014144157 Sep 2014 WO
2014145856 Sep 2014 WO
2014149403 Sep 2014 WO
2014149902 Sep 2014 WO
2014150856 Sep 2014 WO
2014153098 Sep 2014 WO
2014159721 Oct 2014 WO
2014159779 Oct 2014 WO
2014160142 Oct 2014 WO
2014164550 Oct 2014 WO
2014164909 Oct 2014 WO
2014165244 Oct 2014 WO
2014133974 Apr 2015 WO
2015048694 Apr 2015 WO
2015048906 Apr 2015 WO
2015070105 May 2015 WO
2015074078 May 2015 WO
2015081279 Jun 2015 WO
2015134996 Sep 2015 WO
2015183824 Dec 2015 WO
2016054089 Apr 2016 WO
2016172125 Oct 2016 WO
2016167814 Oct 2016 WO
2016172125 Apr 2017 WO
2018053181 Mar 2018 WO
2019038193 Feb 2019 WO
2021071992 Apr 2021 WO
Non-Patent Literature Citations (294)
Entry
US 8,957,977 B2, 02/2015, Venkataraman et al. (withdrawn)
Trabelsi A, Chaabane M, Blanchard N, Beveridge R. A pose proposal and refinement network for better 6d object pose estimation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Jan. 2021 (pp. 2382-2391). URL: https://wacv2021.thecvf.com/node/38 (Year: 2021).
Z. Zhang, S. Fidler, and R. Urtasun. Instance-level segmentation for autonomous driving with deep densely connected MRFs. In CVPR, 2016 (Year: 2016).
Kalra, Agastya et al. “Deep Polarization Cues for Transparent Object Segmentation.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, pp. 8602-8611.
An, Gwon Hwan, et al. “Charuco Board-Based Omnidirectional Camera Calibration Method.” Electronics 7.12 (2018): 421, 15 pages.
Atkinson, Gary A. et al. “Multi-view Surface Reconstruction using Polarization” Tenth IEEE International Conference on Computer Vision (ICCV'05) vol. 1, pp. 309-316 (2005).
Atkinson, Gary A. et al. “Recovery of Surface Orientation From Diffuse Polarization” IEEE Transactions on Image Processing, vol. 15, No. 6, Jun. 2006, pp. 1653-1664.
Ba, Yunhao et al.: Deep shape from polarization. arXiv preprint arXiv:1903.10210 (2019), 11 pages.
Baek, S.H., et al., “Simultaneous Acquisition of Polarimetric SVBRDF and Normals,” ACM Trans. Graph, vol. 37, No. 6, Article 268, (2018), 15 pages.
Cao, Yue, et al. “GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond.” Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. 2019, 10 pages.
Campbell, Dylan et al. “Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization.” European Conference on Computer Vision. Springer, Cham, 2020, 18 pages.
Deng, Jia et al. “ImageNet: A Large-Scale Hierarchical Image Database” IEEE Computer Vision and Pattern Recognition (CVPR), 2009, 8 pages.
Dosovitskiy, Alexey, et al. “FlowNet: Learning Optical Flow with Convolutional Networks.” Proceedings of the IEEE international conference on computer vision. 2015, 9 pages.
Drost, Bertram, et al. “Model globally, match locally: Efficient and robust 3D object recognition.” 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 2010.
Garrido-Jurado, Sergio, et al. “Automatic generation and detection of highly reliable fiducial markers under occlusion.” Pattern Recognition 47.6 (2014): 390-402.
He, Kaiming, et al. “Deep Residual Learning for Image Recognition.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016, pp. 770-778.
Horn, Berthold KP, and Brian G. Schunck. “Determining Optical Flow.” Artificial intelligence 17.1-3 (1981): 185-203.
Huynh, Cong Phuoc et al. “Shape and Refractive Index from Single-View Spectro-Polarimetric Images,” International Journal of Computer Vision, vol. 101, No. 1, pp. 64-94, Jun. 2013.
Ilg, Eddy, et al. “FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2017, pp. 2462-2470.
Jalal, Mona, et al. “SIDOD: A Synthetic Image Dataset for 3D Object Pose Recognition with Distractors.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019, 3 pages.
Kadambi, Achuta, et al. “Polarized 3D: High-Quality Depth Sensing with Polarization Cues.” Proceedings of the IEEE International Conference on Computer Vision. 2015, pp. 3370-3378.
Kondo, Yuhi, et al. “Accurate Polarimetric BRDF for Real Polarization Scene Rendering.” European Conference on Computer Vision. Springer, Cham, 2020, 17 pages.
Labbé, Yann, et al. “CosyPose: Consistent multi-view multi-object 6D pose estimation.” European Conference on Computer Vision. Springer, Cham, 2020, 41 pages.
Lepetit, Vincent, et al. “EPnP: An Accurate O(n) Solution to the PnP Problem.” International Journal of Computer Vision 81.2 (2009): 155, 13 pages.
Lin, Tsung-Yi, et al. “Feature Pyramid Networks for Object Detection.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2017, pp. 2117-2125.
Miyazaki, Daisuke et al. Polarization-based Inverse Rendering from a Single View. In: IEEE International Conference on Computer Vision. pp. 982-987 (2003).
Möller, Tomas, and Ben Trumbore. “Fast, minimum storage ray-triangle intersection.” Journal of graphics tools 2.1 (1997): 21-28.
Ngo Thanh, Trung, et al. “Shape and Light Directions from Shading and Polarization.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2015, 9 pages.
Ngo, Trung Thanh, Hajime Nagahara, and Rin-ichiro Taniguchi. “Surface Normals and Light Directions from Shading and Polarization.” IEEE Transactions on Pattern Analysis and Machine Intelligence (2021), 12 pages.
Russakovsky, Olga et al., ImageNet Large Scale Visual Recognition Challenge. IJCV, 2015, 43 pages.
Saman, Gule et al. “Refractive Index Estimation Using Photometric Stereo,” in IEEE International Conference on Image Processing (ICIP), 2011, pp. 1925-1928.
Smith, William A. R. et al. “Height-from-Polarisation with Unknown Lighting or Albedo,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, No. 12, Dec. 2019, pp. 2875-2888.
Thilak, Vimal et al. “Polarization-based index of refraction and reflection angle estimation for remote sensing applications” Applied Optics 46(30), 7527-7536 (2007).
Trabelsi, Ameni, et al. “A Pose Proposal and Refinement Network for Better 6D Object Pose Estimation.” Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2021, pp. 2382-2391.
Wang, Jingdong, et al. “Deep High-Resolution Representation Learning for Visual Recognition.” IEEE Transactions on Pattern Analysis and Machine Intelligence (2020), 23 pages.
Woodham, Robert, “Photometric Method for Determining Surface Orientation From Multiple Images,” Optical Engineering, vol. 19, No. 1, 1980, pp. 139-144.
Xiang, Yu, et al. “PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes.” arXiv preprint arXiv:1711.00199 (2017), 10 pages.
Xu, Haofei, et al. “AANet: Adaptive Aggregation Network for Efficient Stereo Matching.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, 10 pages.
Zakharov, Sergey, et al. “DPOD: 6D Pose Object Detector and Refiner.” 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE Computer Society, 2019, 10 pages.
Zhao, Wanqing, et al. “Learning deep network for detecting 3D object keypoints and 6D poses.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, 9 pages.
Zhang, Feihu, et al. “GA-Net: Guided Aggregation Net for End-to-end Stereo Matching.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, pp. 185-194.
Zhang, Yinda et al. “Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks” IEEE Conference on Computer Vision and Pattern Recognition, pp. 5287-5295 (2017).
Zou, Shihao, et al. “Polarization Human Shape and Pose Dataset.” arXiv preprint arXiv:2004.14899 (2020), 9 pages.
Ansari et al., “3-D Face Modeling Using Two Views and a Generic Face Model with Application to 3-D Face Recognition”, Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, Jul. 22, 2003, 9 pgs.
Aufderheide et al., “A MEMS-based Smart Sensor System for Estimation of Camera Pose for Computer Vision Applications”, Research and Innovation Conference 2011, Jul. 29, 2011, pp. 1-10.
Baker et al., “Limits on Super-Resolution and How to Break Them”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Sep. 2002, vol. 24, No. 9, pp. 1167-1183.
Banz et al., “Real-Time Semi-Global Matching Disparity Estimation on the GPU”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Sep. 2002, vol. 24, No. 9, pp. 1167-1183.
Barron et al., “Intrinsic Scene Properties from a Single RGB-D Image”, 2013 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 23-28, 2013, Portland, OR, USA, pp. 17-24.
Bennett et al., “Multispectral Bilateral Video Fusion”, Computer Graphics (ACM SIGGRAPH Proceedings), Jul. 25, 2006, published Jul. 30, 2006, 1 pg.
Bennett et al., “Multispectral Video Fusion”, Computer Graphics (ACM SIGGRAPH Proceedings), Jul. 25, 2006, published Jul. 30, 2006, 1 pg.
Berretti et al., “Face Recognition by Super-Resolved 3D Models from Consumer Depth Cameras”, IEEE Transactions on Information Forensics and Security, vol. 9, No. 9, Sep. 2014, pp. 1436-1448.
Bertalmio et al., “Image Inpainting”, Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, 2000, ACM Press/Addison-Wesley Publishing Co., pp. 417-424.
Bertero et al., “Super-resolution in computational imaging”, Micron, Jan. 1, 2003, vol. 34, Issues 6-7, 17 pgs.
Bishop et al., “Full-Resolution Depth Map Estimation from an Aliased Plenoptic Light Field”, ACCV Nov. 8, 2010, Part II, LNCS 6493, pp. 186-200.
Bishop et al., “Light Field Superresolution”, Computational Photography (ICCP), 2009 IEEE International Conference, Conference Date Apr. 16-17, published Jan. 26, 2009, 9 pgs.
Bishop et al., “The Light Field Camera: Extended Depth of Field, Aliasing, and Superresolution”, IEEE Transactions on Pattern Analysis and Machine Intelligence, May 2012, vol. 34, No. 5, published Aug. 18, 2011, pp. 972-986.
Blanz et al., “A Morphable Model for the Synthesis of 3D Faces”, In Proceedings of ACM SIGGRAPH 1999, Jul. 1, 1999, pp. 187-194.
Borman, “Topics in Multiframe Superresolution Restoration”, Thesis of Sean Borman, Apr. 2004, 282 pgs.
Borman et al., “Image Sequence Processing”, Dekker Encyclopedia of Optical Engineering, Oct. 14, 2002, 81 pgs.
Borman et al., “Linear models for multi-frame super-resolution restoration under non-affine registration and spatially varying PSF”, Proc. SPIE, May 21, 2004, vol. 5299, 12 pgs.
Borman et al., “Simultaneous Multi-Frame MAP Super-Resolution Video Enhancement Using Spatio-Temporal Priors”, Image Processing, 1999, ICIP 99 Proceedings, vol. 3, pp. 469-473.
Borman et al., “Super-Resolution from Image Sequences—A Review”, Circuits & Systems, 1998, pp. 374-378.
Borman et al., “Nonlinear Prediction Methods for Estimation of Clique Weighting Parameters in NonGaussian Image Models”, Proc. SPIE, Sep. 22, 1998, vol. 3459, 9 pgs.
Borman et al., “Block-Matching Sub-Pixel Motion Estimation from Noisy, Under-Sampled Frames—An Empirical Performance Evaluation”, Proc SPIE, Dec. 28, 1998, vol. 3653, 10 pgs.
Borman et al., “Image Resampling and Constraint Formulation for Multi-Frame Super-Resolution Restoration”, Proc SPIE, Dec. 28, 1998, vol. 3653, 10 pgs.
Bose et al., “Superresolution and Noise Filtering Using Moving Least Squares”, IEEE Transactions on Image Processing, Aug. 2006, vol. 15, Issue 8, published Jul. 17, 2006, pp. 2239-2248.
Boye et al., “Comparison of Subpixel Image Registration Algorithms”, Proc. of SPIE—IS&T Electronic Imaging, Feb. 3, 2009, vol. 7246, pp. 72460X-1-72460X-9; doi: 10.1117/12.810369.
Bruckner et al., “Thin wafer-level camera lenses inspired by insect compound eyes”, Optics Express, Nov. 22, 2010, vol. 18, No. 24, pp. 24379-24394.
Bruckner et al., “Artificial compound eye applying hyperacuity”, Optics Express, Dec. 11, 2006, vol. 14, No. 25, pp. 12076-12084.
Bruckner et al., “Driving microoptical imaging systems towards miniature camera applications”, Proc. SPIE, Micro-Optics, May 13, 2010, 11 pgs.
Bryan et al., “Perspective Distortion from Interpersonal Distance Is an Implicit Visual Cue for Social Judgments of Faces”, PLOS One, vol. 7, Issue 9, Sep. 26, 2012, e45301, doi:10.1371/journal.pone.0045301, 9 pgs.
Bulat et al., “How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks)”, arxiv.org, Cornell University Library, 201 Olin Library Cornell University Ithaca, NY 14853, Mar. 21, 2017.
Cai et al., “3D Deformable Face Tracking with a Commodity Depth Camera”, Proceedings of the European Conference on Computer Vision: Part III, Sep. 5-11, 2010, 14 pgs.
Capel, “Image Mosaicing and Super-resolution”, Retrieved on Nov. 10, 2012, Retrieved from the Internet at URL: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.226.2643&rep=rep1&type=pdf>, 2001, 269 pgs.
Caron et al., “Multiple camera types simultaneous stereo calibration”, 2011 IEEE International Conference on Robotics and Automation (ICRA), May 1, 2011, pp. 2933-2938.
Carroll et al., “Image Warps for Artistic Perspective Manipulation”, ACM Transactions on Graphics (TOG), vol. 29, No. 4, Jul. 26, 2010, Article No. 127, 9 pgs.
Chan et al., “Investigation of Computational Compound-Eye Imaging System with Super-Resolution Reconstruction”, IEEE, ICASSP, Jun. 19, 2006, pp. 1177-1180.
Chan et al., “Extending the Depth of Field in a Compound-Eye Imaging System with Super-Resolution Reconstruction”, Proceedings—International Conference on Pattern Recognition, Jan. 1, 2006, vol. 3, pp. 623-626.
Chan et al., “Super-resolution reconstruction in a computational compound-eye imaging system”, Multidim. Syst. Sign. Process, published online Feb. 23, 2007, vol. 18, pp. 83-101.
Chen et al., “Interactive deformation of light fields”, Symposium on Interactive 3D Graphics, 2005, pp. 139-146.
Chen et al., “KNN Matting”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Sep. 2013, vol. 35, No. 9, pp. 2175-2188.
Chen et al., “KNN matting”, 2012 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 16-21, 2012, Providence, RI, USA, pp. 869-876.
Chen et al., “Image Matting with Local and Nonlocal Smooth Priors” CVPR '13 Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 23, 2013, pp. 1902-1907.
Chen et al., “Human Face Modeling and Recognition Through Multi-View High Resolution Stereopsis”, IEEE Conference on Computer Vision and Pattern Recognition Workshop, Jun. 17-22, 2006, 6 pgs.
Collins et al., “An Active Camera System for Acquiring Multi-View Video”, IEEE 2002 International Conference on Image Processing, Date of Conference: Sep. 22-25, 2002, Rochester, NY, 4 pgs.
Cooper et al., “The perceptual basis of common photographic practice”, Journal of Vision, vol. 12, No. 5, Article 8, May 25, 2012, pp. 1-14.
Crabb et al., “Real-time foreground segmentation via range and color imaging”, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Anchorage, AK, USA, Jun. 23-28, 2008, pp. 1-5.
Dainese et al., “Accurate Depth-Map Estimation for 3D Face Modeling”, IEEE European Signal Processing Conference, Sep. 4-8, 2005, 4 pgs.
Debevec et al., “Recovering High Dynamic Range Radiance Maps from Photographs”, Computer Graphics (ACM SIGGRAPH Proceedings), Aug. 16, 1997, 10 pgs.
Do, Minh N. “Immersive Visual Communication with Depth”, Presented at Microsoft Research, Jun. 15, 2011, Retrieved from: http://minhdo.ece.illinois.edu/talks/ImmersiveComm.pdf, 42 pgs.
Do et al., “Immersive Visual Communication”, IEEE Signal Processing Magazine, vol. 28, Issue 1, Jan. 2011, DOI: 10.1109/MSP.2010.939075, Retrieved from: http://minhdo.ece.illinois.edu/publications/ImmerComm_SPM.pdf, pp. 58-66.
Dou et al., “End-to-end 3D face reconstruction with deep neural networks” arXiv:1704.05020v1, Apr. 17, 2017, 10 pgs.
Drouin et al., “Improving Border Localization of Multi-Baseline Stereo Using Border-Cut”, International Journal of Computer Vision, Jul. 5, 2006, vol. 83, Issue 3, 8 pgs.
Drouin et al., “Fast Multiple-Baseline Stereo with Occlusion”, Fifth International Conference on 3-D Digital Imaging and Modeling (3DIM'05), Ottawa, Ontario, Canada, Jun. 13-16, 2005, pp. 540-547.
Drouin et al., “Geo-Consistency for Wide Multi-Camera Stereo”, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), vol. 1, Jun. 20-25, 2005, pp. 351-358.
Drulea et al., “Motion Estimation Using the Correlation Transform”, IEEE Transactions on Image Processing, Aug. 2013, vol. 22, No. 8, pp. 3260-3270, first published May 14, 2013.
Duparre et al., “Microoptical artificial compound eyes—from design to experimental verification of two different concepts”, Proc. of SPIE, Optical Design and Engineering II, vol. 5962, Oct. 17, 2005, pp. 59622A-1-59622A-12.
Duparre et al., “Novel Optics/Micro-Optics for Miniature Imaging Systems”, Proc. of SPIE, Apr. 21, 2006, vol. 6196, pp. 619607-1-619607-15.
Duparre et al., “Micro-optical artificial compound eyes”, Bioinspiration & Biomimetics, Apr. 6, 2006, vol. 1, pp. R1-R16.
Duparre et al., “Artificial compound eye zoom camera”, Bioinspiration & Biomimetics, Nov. 21, 2008, vol. 3, pp. 1-6.
Duparre et al., “Artificial apposition compound eye fabricated by micro-optics technology”, Applied Optics, Aug. 1, 2004, vol. 43, No. 22, pp. 4303-4310.
Duparre et al., “Micro-optically fabricated artificial apposition compound eye”, Electronic Imaging—Science and Technology, Proc. SPIE 5301, Jan. 2004, pp. 25-33.
Duparre et al., “Chirped arrays of refractive ellipsoidal microlenses for aberration correction under oblique incidence”, Optics Express, Dec. 26, 2005, vol. 13, No. 26, pp. 10539-10551.
Duparre et al., “Artificial compound eyes—different concepts and their application to ultra flat image acquisition sensors”, MOEMS and Miniaturized Systems IV, Proc. SPIE 5346, Jan. 24, 2004, pp. 89-100.
Duparre et al., “Ultra-Thin Camera Based on Artificial Apposition Compound Eyes”, 10th Microoptics Conference, Sep. 1-3, 2004, 2 pgs.
Duparre et al., “Microoptical telescope compound eye”, Optics Express, Feb. 7, 2005, vol. 13, No. 3, pp. 889-903.
Duparre et al., “Theoretical analysis of an artificial superposition compound eye for application in ultra flat digital image acquisition devices”, Optical Systems Design, Proc. SPIE 5249, Sep. 2003, pp. 408-418.
Duparre et al., “Thin compound-eye camera”, Applied Optics, May 20, 2005, vol. 44, No. 15, pp. 2949-2956.
Duparre et al., “Microoptical Artificial Compound Eyes—Two Different Concepts for Compact Imaging Systems”, 11th Microoptics Conference, Oct. 30-Nov. 2, 2005, 2 pgs.
Eng et al., “Gaze correction for 3D tele-immersive communication system”, IVMSP Workshop, 2013 IEEE 11th. IEEE, Jun. 10, 2013.
Fanaswala, “Regularized Super-Resolution of Multi-View Images”, Retrieved on Nov. 10, 2012 (Nov. 10, 2012). Retrieved from the Internet at URL :<http://www.site.uottawa.ca/-edubois/theses/Fanaswala_thesis.pdf>, 2009, 163 pgs.
Fang et al., “Volume Morphing Methods for Landmark Based 3D Image Deformation”, SPIE vol. 2710, Proc. 1996 SPIE Intl Symposium on Medical Imaging, Newport Beach, CA, Feb. 10, 1996, pp. 404-415.
Fangmin et al., “3D Face Reconstruction Based on Convolutional Neural Network”, 2017 10th International Conference on Intelligent Computation Technology and Automation, Oct. 9-10, 2017, Changsha, China.
Farrell et al., “Resolution and Light Sensitivity Tradeoff with Pixel Size”, Proceedings of the SPIE Electronic Imaging 2006 Conference, Feb. 2, 2006, vol. 6069, 8 pgs.
Farsiu et al., “Advances and Challenges in Super-Resolution”, International Journal of Imaging Systems and Technology, Aug. 12, 2004, vol. 14, pp. 47-57.
Farsiu et al., “Fast and Robust Multiframe Super Resolution”, IEEE Transactions on Image Processing, Oct. 2004, published Sep. 3, 2004, vol. 13, No. 10, pp. 1327-1344.
Farsiu et al., “Multiframe Demosaicing and Super-Resolution of Color Images”, IEEE Transactions on Image Processing, Jan. 2006, vol. 15, No. 1, date of publication Dec. 12, 2005, pp. 141-159.
Fechteler et al., “Fast and High Resolution 3D Face Scanning”, IEEE International Conference on Image Processing, Sep. 16-Oct. 19, 2007, 4 pgs.
Fecker et al., “Depth Map Compression for Unstructured Lumigraph Rendering”, Proc. SPIE 6077, Proceedings Visual Communications and Image Processing 2006, Jan. 18, 2006, pp. 60770B-1-60770B-8.
Feris et al., “Multi-Flash Stereopsis: Depth Edge Preserving Stereo with Small Baseline Illumination”, IEEE Trans on PAMI, 2006, 31 pgs.
Fife et al., “A 3D Multi-Aperture Image Sensor Architecture”, Custom Integrated Circuits Conference, 2006, CICC '06, IEEE, pp. 281-284.
Fife et al., “A 3MPixel Multi-Aperture Image Sensor with 0.7Mu Pixels in 0.11Mu CMOS”, ISSCC 2008, Session 2, Image Sensors & Technology, 2008, pp. 48-50.
Fischer et al., “Optical System Design”, 2nd Edition, SPIE Press, Feb. 14, 2008, pp. 49-58.
Fischer et al., “Optical System Design”, 2nd Edition, SPIE Press, Feb. 14, 2008, pp. 191-198.
Garg et al., “Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue”, In European Conference on Computer Vision, Springer, Cham, Jul. 2016, 16 pgs.
Gastal et al., “Shared Sampling for Real-Time Alpha Matting”, Computer Graphics Forum, Eurographics 2010, vol. 29, Issue 2, May 2010, pp. 575-584.
Georgiev et al., “Light Field Camera Design for Integral View Photography”, Adobe Systems Incorporated, Adobe Technical Report, 2003, 13 pgs.
Georgiev et al., “Light-Field Capture by Multiplexing in the Frequency Domain”, Adobe Systems Incorporated, Adobe Technical Report, 2003, 13 pgs.
Godard et al., “Unsupervised Monocular Depth Estimation with Left-Right Consistency”, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, 14 pgs.
Goldman et al., “Video Object Annotation, Navigation, and Composition”, In Proceedings of UIST 2008, Oct. 19-22, 2008, Monterey CA, USA, pp. 3-12.
Goodfellow et al., “Generative Adversarial Nets”, In Advances in Neural Information Processing Systems, 2014, pp. 2672-2680.
Gortler et al., “The Lumigraph”, In Proceedings of SIGGRAPH 1996, published Aug. 1, 1996, pp. 43-54.
Gupta et al., “Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images”, 2013 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 23-28, 2013, Portland, OR, USA, pp. 564-571.
Hacohen et al., “Non-Rigid Dense Correspondence with Applications for Image Enhancement”, ACM Transactions on Graphics, vol. 30, No. 4, Aug. 7, 2011, 9 pgs.
Hamilton, “JPEG File Interchange Format, Version 1.02”, Sep. 1, 1992, 9 pgs.
Hardie, “A Fast Image Super-Resolution Algorithm Using an Adaptive Wiener Filter”, IEEE Transactions on Image Processing, Dec. 2007, published Nov. 19, 2007, vol. 16, No. 12, pp. 2953-2964.
Hasinoff et al., “Search-and-Replace Editing for Personal Photo Collections”, 2010 International Conference: Computational Photography (ICCP) Mar. 2010, pp. 1-8.
Hernandez et al., “Laser Scan Quality 3-D Face Modeling Using a Low-Cost Depth Camera”, 20th European Signal Processing Conference, Aug. 27-31, 2012, Bucharest, Romania, pp. 1995-1999.
Hernandez-Lopez et al., “Detecting objects using color and depth segmentation with Kinect sensor”, Procedia Technology, vol. 3, Jan. 1, 2012, pp. 196-204, XP055307680, ISSN: 2212-0173, DOI: 10.1016/j.protcy.2012.03.021.
Higo et al., “A Hand-held Photometric Stereo Camera for 3-D Modeling”, IEEE International Conference on Computer Vision, 2009, pp. 1234-1241.
Hirschmuller, “Accurate and Efficient Stereo Processing by Semi-Global Matching and Mutual Information”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, USA, Jun. 20-26, 2005, 8 pgs.
Hirschmuller et al., “Memory Efficient Semi-Global Matching”, ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. I-3, 2012, XXII ISPRS Congress, Aug. 25-Sep. 1, 2012, Melbourne, Australia, 6 pgs.
Holoeye Photonics AG, “Spatial Light Modulators”, Oct. 2, 2013, Brochure retrieved from https://web.archive.org/web/20131002061028/http://holoeye.com/wp-content/uploads/Spatial_Light_Modulators.pdf on Oct. 13, 2017, 4 pgs.
Holoeye Photonics AG, “Spatial Light Modulators”, Sep. 18, 2013, retrieved from https://web.archive.org/web/20130918113140/http://holoeye.com/spatial-light-modulators/ on Oct. 13, 2017, 4 pgs.
Holoeye Photonics AG, “LC 2012 Spatial Light Modulator (transmissive)”, Sep. 18, 2013, retrieved from https://web.archive.org/web/20130918151716/http://holoeye.com/spatial-light-modulators/lc-2012-spatial-light-modulator/ on Oct. 20, 2017, 3 pgs.
Horisaki et al., “Superposition Imaging for Three-Dimensionally Space-Invariant Point Spread Functions”, Applied Physics Express, Oct. 13, 2011, vol. 4, pp. 112501-1-112501-3.
Horisaki et al., “Irregular Lens Arrangement Design to Improve Imaging Performance of Compound-Eye Imaging Systems”, Applied Physics Express, Jan. 29, 2010, vol. 3, pp. 022501-1-022501-3.
Horn et al., “LightShop: Interactive Light Field Manipulation and Rendering”, In Proceedings of I3D, Jan. 1, 2007, pp. 121-128.
Hossain et al., “Inexpensive Construction of a 3D Face Model from Stereo Images”, IEEE International Conference on Computer and Information Technology, Dec. 27-29, 2007, 6 pgs.
Hu et al., “A Quantitative Evaluation of Confidence Measures for Stereo Vision”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, Issue 11, Nov. 2012, pp. 2121-2133.
Humenberger et al., “A Census-Based Stereo Vision Algorithm Using Modified Semi-Global Matching and Plane Fitting to Improve Matching Quality”, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, Jun. 13-18, 2010, San Francisco, CA, 8 pgs.
Isaksen et al., “Dynamically Reparameterized Light Fields”, In Proceedings of SIGGRAPH 2000, 2000, pp. 297-306.
Izadi et al., “KinectFusion: Real-time 3D Reconstruction and Interaction Using a Moving Depth Camera”, UIST'11, Oct. 16-19, 2011, Santa Barbara, CA, pp. 559-568.
Jackson et al., “Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression”, arXiv: 1703.07834v2, Sep. 8, 2017, 9 pgs.
Janoch et al., “A category-level 3-D object dataset: Putting the Kinect to work”, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Nov. 6-13, 2011, Barcelona, Spain, pp. 1168-1174.
Jarabo et al., “Efficient Propagation of Light Field Edits”, In Proceedings of SIACG 2011, 2011, pp. 75-80.
Jiang et al., “Panoramic 3D Reconstruction Using Rotational Stereo Camera with Simple Epipolar Constraints”, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), vol. 1, Jun. 17-22, 2006, New York, NY, USA, pp. 371-378.
Joshi, “Color Calibration for Arrays of Inexpensive Image Sensors”, Mitsubishi Electric Research Laboratories, Inc., TR2004-137, Dec. 2004, 6 pgs.
Joshi et al., “Synthetic Aperture Tracking: Tracking Through Occlusions”, ICCV IEEE 11th International Conference on Computer Vision; Publication [online]. Oct. 2007 [retrieved Jul. 28, 2014]. Retrieved from the Internet: <URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4409032&isnumber=4408819>, pp. 1-8.
Jourabloo, “Large-Pose Face Alignment via CNN-Based Dense 3D Model Fitting”, ICCV IEEE 11th International Conference on Computer Vision; Publication [online]. Oct. 2007 [retrieved Jul. 28, 2014]. Retrieved from the Internet: <URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4409032&isnumber=4408819>; pp. 1-8.
Kang et al., “Handling Occlusions in Dense Multi-view Stereo”, Computer Vision and Pattern Recognition, 2001, vol. 1, pp. I-103-I-110.
Keeton, “Memory-Driven Computing”, Hewlett Packard Enterprise Company, Oct. 20, 2016, 45 pgs.
Kim, “Scene Reconstruction from a Light Field”, Master Thesis, Sep. 1, 2010 (Sep. 1, 2010), pp. 1-72.
Kim et al., “Scene reconstruction from high spatio-angular resolution light fields”, ACM Transactions on Graphics (TOG)—SIGGRAPH 2013 Conference Proceedings, vol. 32 Issue 4, Article 73, Jul. 21, 2013, 11 pages.
Kitamura et al., “Reconstruction of a high-resolution image on a compound-eye image-capturing system”, Applied Optics, Mar. 10, 2004, vol. 43, No. 8, pp. 1719-1727.
Kittler et al., “3D Assisted Face Recognition: A Survey of 3D Imaging, Modelling, and Recognition Approaches”, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Jul. 2005, 7 pgs.
Konolige, Kurt “Projected Texture Stereo”, 2010 IEEE International Conference on Robotics and Automation, May 3-7, 2010, pp. 148-155.
Kotsia et al., “Facial Expression Recognition in Image Sequences Using Geometric Deformation Features and Support Vector Machines”, IEEE Transactions on Image Processing, Jan. 2007, vol. 16, No. 1, pp. 172-187.
Krishnamurthy et al., “Compression and Transmission of Depth Maps for Image-Based Rendering”, Image Processing, 2001, pp. 828-831.
Kubota et al., “Reconstructing Dense Light Field From Array of Multifocus Images for Novel View Synthesis”, IEEE Transactions on Image Processing, vol. 16, No. 1, Jan. 2007, pp. 269-279.
Kutulakos et al., “Occluding Contour Detection Using Affine Invariants and Purposive Viewpoint Control”, Computer Vision and Pattern Recognition, Proceedings CVPR 94, Seattle, Washington, Jun. 21-23, 1994, 8 pgs.
Lai et al., “A Large-Scale Hierarchical Multi-View RGB-D Object Dataset”, Proceedings—IEEE International Conference on Robotics and Automation, Conference Date May 9-13, 2011, 8 pgs., DOI: 10.1109/ICRA.2011.5980382.
Lane et al., “A Survey of Mobile Phone Sensing”, IEEE Communications Magazine, vol. 48, Issue 9, Sep. 2010, pp. 140-150.
Lao et al., “3D template matching for pose invariant face recognition using 3D facial model built with isoluminance line based stereo vision”, Proceedings 15th International Conference on Pattern Recognition, Sep. 3-7, 2000, Barcelona, Spain, pp. 911-916.
Lee, “NFC Hacking: The Easy Way”, Defcon Hacking Conference, 2012, 24 pgs.
Lee et al., “Electroactive Polymer Actuator for Lens-Drive Unit in Auto-Focus Compact Camera Module”, ETRI Journal, vol. 31, No. 6, Dec. 2009, pp. 695-702.
Lee et al., “Nonlocal matting”, CVPR 2011, Jun. 20-25, 2011, pp. 2193-2200.
Lee et al., “Automatic Upright Adjustment of Photographs”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pp. 877-884.
Lensvector, “How LensVector Autofocus Works”, 2010, printed Nov. 2, 2012 from http://www.lensvector.com/overview.html, 1 pg.
Levin et al., “A Closed Form Solution to Natural Image Matting”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006, vol. 1, pp. 61-68.
Levin et al., “Spectral Matting”, 2007 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 17-22, 2007, Minneapolis, MN, USA, pp. 1-8.
Levoy, “Light Fields and Computational Imaging”, IEEE Computer Society, Sep. 1, 2006, vol. 39, Issue No. 8, pp. 46-55.
Levoy et al., “Light Field Rendering”, Proc. ACM SIGGRAPH '96, 1996, pp. 1-12.
Li et al., “A Hybrid Camera for Motion Deblurring and Depth Map Super-Resolution”, Jun. 23-28, 2008, IEEE Conference on Computer Vision and Pattern Recognition, 8 pgs. Retrieved from www.eecis.udel.edu/˜jye/lab_research/08/deblur-feng.pdf on Feb. 5, 2014.
Li et al., “Fusing Images with Different Focuses Using Support Vector Machines”, IEEE Transactions on Neural Networks, vol. 15, No. 6, Nov. 8, 2004, pp. 1555-1561.
Lim, “Optimized Projection Pattern Supplementing Stereo Systems”, 2009 IEEE International Conference on Robotics and Automation, May 12-17, 2009, pp. 2823-2829.
Liu et al., “Virtual View Reconstruction Using Temporal Information”, 2012 IEEE International Conference on Multimedia and Expo, 2012, pp. 115-120.
Lo et al., “Stereoscopic 3D Copy & Paste”, ACM Transactions on Graphics, vol. 29, No. 6, Article 147, Dec. 2010, pp. 147:1-147:10.
Ma et al., “Constant Time Weighted Median Filtering for Stereo Matching and Beyond”, ICCV '13 Proceedings of the 2013 IEEE International Conference on Computer Vision, IEEE Computer Society, Washington DC, USA, Dec. 1-8, 2013, 8 pgs.
Martinez et al., “Simple Telemedicine for Developing Regions: Camera Phones and Paper-Based Microfluidic Devices for Real-Time, Off-Site Diagnosis”, Analytical Chemistry (American Chemical Society), vol. 80, No. 10, May 15, 2008, pp. 3699-3707.
McGuire et al., “Defocus video matting”, ACM Transactions on Graphics (TOG)—Proceedings of ACM SIGGRAPH 2005, vol. 24, Issue 3, Jul. 2005, pp. 567-576.
Medioni et al., “Face Modeling and Recognition in 3-D”, Proceedings of the IEEE International Workshop on Analysis and Modeling of Faces and Gestures, 2013, 2 pgs.
Merkle et al., “Adaptation and optimization of coding algorithms for mobile 3DTV”, Mobile3DTV Project No. 216503, Nov. 2008, 55 pgs.
Michael et al., “Real-time Stereo Vision: Optimizing Semi-Global Matching”, 2013 IEEE Intelligent Vehicles Symposium (IV), IEEE, Jun. 23-26, 2013, Australia, 6 pgs.
Milella et al., “3D reconstruction and classification of natural environments by an autonomous vehicle using multi-baseline stereo”, Intelligent Service Robotics, vol. 7, No. 2, Mar. 2, 2014, pp. 79-92.
Min et al., “Real-Time 3D Face Identification from a Depth Camera”, Proceedings of the IEEE International Conference on Pattern Recognition, Nov. 11-15, 2012, 4 pgs.
Mitra et al., “Light Field Denoising, Light Field Superresolution and Stereo Camera Based Refocussing using a GMM Light Field Patch Prior”, Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on Jun. 16-21, 2012, pp. 22-28.
Moreno-Noguer et al., “Active Refocusing of Images and Videos”, ACM Transactions on Graphics (TOG)—Proceedings of ACM SIGGRAPH 2007, vol. 26, Issue 3, Jul. 2007, 10 pgs.
Muehlebach, “Camera Auto Exposure Control for VSLAM Applications”, Studies on Mechatronics, Swiss Federal Institute of Technology Zurich, Autumn Term 2010 course, 67 pgs.
Nayar, “Computational Cameras: Redefining the Image”, IEEE Computer Society, Aug. 14, 2006, pp. 30-38.
Ng, “Digital Light Field Photography”, Thesis, Jul. 2006, 203 pgs.
Ng et al., “Super-Resolution Image Restoration from Blurred Low-Resolution Images”, Journal of Mathematical Imaging and Vision, 2005, vol. 23, pp. 367-378.
Ng et al., “Light Field Photography with a Hand-held Plenoptic Camera”, Stanford Tech Report CTSR 2005-02, Apr. 20, 2005, pp. 1-11.
Nguyen et al., “Image-Based Rendering with Depth Information Using the Propagation Algorithm”, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005, vol. 5, Mar. 23-23, 2005, pp. II-589-II-592.
Nguyen et al., “Error Analysis for Image-Based Rendering with Depth Information”, IEEE Transactions on Image Processing, vol. 18, Issue 4, Apr. 2009, pp. 703-716.
Nishihara, H.K. “PRISM: A Practical Real-Time Imaging Stereo Matcher”, Massachusetts Institute of Technology, A.I. Memo 780, May 1984, 32 pgs.
Nitta et al., “Image reconstruction for thin observation module by bound optics by using the iterative backprojection method”, Applied Optics, May 1, 2006, vol. 45, No. 13, pp. 2893-2900.
Nomura et al., “Scene Collages and Flexible Camera Arrays”, Proceedings of Eurographics Symposium on Rendering, Jun. 2007, 12 pgs.
Park et al., “Super-Resolution Image Reconstruction”, IEEE Signal Processing Magazine, May 2003, pp. 21-36.
Park et al., “Multispectral Imaging Using Multiplexed Illumination”, 2007 IEEE 11th International Conference on Computer Vision, Oct. 14-21, 2007, Rio de Janeiro, Brazil, pp. 1-8.
Park et al., “3D Face Reconstruction from Stereo Video”, First International Workshop on Video Processing for Security, Jun. 7-9, 2006, Quebec City, Canada, 2006, 8 pgs.
Parkkinen et al., “Characteristic Spectra of Munsell Colors”, Journal of the Optical Society of America A, vol. 6, Issue 2, Feb. 1989, pp. 318-322.
Perwass et al., “Single Lens 3D-Camera with Extended Depth-of-Field”, printed from www.raytrix.de, Jan. 22, 2012, 15 pgs.
Pham et al., “Robust Super-Resolution without Regularization”, Journal of Physics: Conference Series 124, Jul. 2008, pp. 1-19.
Philips 3D Solutions, “3D Interface Specifications, White Paper”, Feb. 15, 2008, 2005-2008 Philips Electronics Nederland B.V., Philips 3D Solutions retrieved from www.philips.com/3dsolutions, 29 pgs.
Polight, “Designing Imaging Products Using Reflowable Autofocus Lenses”, printed Nov. 2, 2012 from http://www.polight.no/tunable-polymer-autofocus-lens-html--11.html, 1 pg.
Pouydebasque et al., “Varifocal liquid lenses with integrated actuator, high focusing power and low operating voltage fabricated on 200 mm wafers”, Sensors and Actuators A: Physical, vol. 172, Issue 1, Dec. 2011, pp. 280-286.
Protter et al., “Generalizing the Nonlocal-Means to Super-Resolution Reconstruction”, IEEE Transactions on Image Processing, Dec. 2, 2008, vol. 18, No. 1, pp. 36-51.
Radtke et al., “Laser lithographic fabrication and characterization of a spherical artificial compound eye”, Optics Express, Mar. 19, 2007, vol. 15, No. 6, pp. 3067-3077.
Rajan et al., “Simultaneous Estimation of Super Resolved Scene and Depth Map from Low Resolution Defocused Observations”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, No. 9, Sep. 8, 2003, pp. 1-16.
Rander et al., “Virtualized Reality: Constructing Time-Varying Virtual Worlds from Real World Events”, Proc. of IEEE Visualization '97, Phoenix, Arizona, Oct. 19-24, 1997, pp. 277-283, 552.
Ranjan et al., “HyperFace: A Deep Multi-Task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition”, May 11, 2016 (May 11, 2016), pp. 1-16.
Rhemann et al., “Fast Cost-Volume Filtering for Visual Correspondence and Beyond”, IEEE Trans. Pattern Anal. Mach. Intell, 2013, vol. 35, No. 2, pp. 504-511.
Rhemann et al., “A perceptually motivated online benchmark for image matting”, 2009 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 20-25, 2009, Miami, FL, USA, pp. 1826-1833.
Robert et al., “Dense Depth Map Reconstruction: A Minimization and Regularization Approach which Preserves Discontinuities”, European Conference on Computer Vision (ECCV), 1996, pp. 439-451.
Robertson et al., “Dynamic Range Improvement Through Multiple Exposures”, In Proc. of the Int. Conf. on Image Processing, 1999, 5 pgs.
Robertson et al., “Estimation-theoretic approach to dynamic range enhancement using multiple exposures”, Journal of Electronic Imaging, Apr. 2003, vol. 12, No. 2, pp. 219-228.
Roy et al., “Non-Uniform Hierarchical Pyramid Stereo for Large Images”, Computer and Robot Vision, 2002, pp. 208-215.
Rusinkiewicz et al., “Real-Time 3D Model Acquisition”, ACM Transactions on Graphics (TOG), vol. 21, No. 3, Jul. 2002, pp. 438-446.
Saatci et al., “Cascaded Classification of Gender and Facial Expression using Active Appearance Models”, IEEE, FGR'06, 2006, 6 pgs.
Sauer et al., “Parallel Computation of Sequential Pixel Updates in Statistical Tomographic Reconstruction”, ICIP 1995 Proceedings of the 1995 International Conference on Image Processing, Date of Conference: Oct. 23-26, 1995, pp. 93-96.
Scharstein et al., “High-Accuracy Stereo Depth Maps Using Structured Light”, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), Jun. 2003, vol. 1, pp. 195-202.
Seitz et al., “Plenoptic Image Editing”, International Journal of Computer Vision 48, Conference Date Jan. 7, 1998, 29 pgs., DOI: 10.1109/ICCV.1998.710696 · Source: DBLP Conference: Computer Vision, Sixth International Conference.
Shechtman et al., “Increasing Space-Time Resolution in Video”, European Conference on Computer Vision, LNCS 2350, May 28-31, 2002, pp. 753-768.
Shotton et al., “Real-time human pose recognition in parts from single depth images”, CVPR 2011, Jun. 20-25, 2011, Colorado Springs, CO, USA, pp. 1297-1304.
Shum et al., “Pop-Up Light Field: An Interactive Image-Based Modeling and Rendering System”, Apr. 2004, ACM Transactions on Graphics, vol. 23, No. 2, pp. 143-162, Retrieved from http://131.107.65.14/en-us/um/people/jiansun/papers/PopupLightField_TOG.pdf on Feb. 5, 2014.
Shum et al., “A Review of Image-based Rendering Techniques”, Visual Communications and Image Processing 2000, May 2000, 12 pgs.
Sibbing et al., “Markerless reconstruction of dynamic facial expressions”, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshop: Kyoto, Japan, Sep. 27-Oct. 4, 2009, Institute of Electrical and Electronics Engineers, Piscataway, NJ, Sep. 27, 2009 (Sep. 27, 2009), pp. 1778-1785.
Silberman et al., “Indoor segmentation and support inference from RGBD images”, ECCV'12 Proceedings of the 12th European conference on Computer Vision, vol. Part V, Oct. 7-13, 2012, Florence, Italy, pp. 746-760.
Stober, “Stanford researchers developing 3-D camera with 12,616 lenses”, Stanford Report, Mar. 19, 2008, Retrieved from: http://news.stanford.edu/news/2008/march19/camera-031908.html, 5 pgs.
Stollberg et al., “The Gabor superlens as an alternative wafer-level camera approach inspired by superposition compound eyes of nocturnal insects”, Optics Express, Aug. 31, 2009, vol. 17, No. 18, pp. 15747-15759.
Sun et al., “Image Super-Resolution Using Gradient Profile Prior”, 2008 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 23-28, 2008, 8 pgs.; DOI: 10.1109/CVPR.2008.4587659.
Taguchi et al., “Rendering-Oriented Decoding for a Distributed Multiview Coding System Using a Coset Code”, Hindawi Publishing Corporation, EURASIP Journal on Image and Video Processing, vol. 2009, Article ID 251081, Online: Apr. 22, 2009, 12 pgs.
Takeda et al., “Super-resolution Without Explicit Subpixel Motion Estimation”, IEEE Transaction on Image Processing, Sep. 2009, vol. 18, No. 9, pp. 1958-1975.
Tallon et al., “Upsampling and Denoising of Depth Maps via Joint-Segmentation”, 20th European Signal Processing Conference, Aug. 27-31, 2012, 5 pgs.
Tanida et al., “Thin observation module by bound optics (TOMBO): concept and experimental verification”, Applied Optics, Apr. 10, 2001, vol. 40, No. 11, pp. 1806-1813.
Tanida et al., “Color imaging with an integrated compound imaging system”, Optics Express, Sep. 8, 2003, vol. 11, No. 18, pp. 2109-2117.
Tao et al., “Depth from Combining Defocus and Correspondence Using Light-Field Cameras”, ICCV '13 Proceedings of the 2013 IEEE International Conference on Computer Vision, Dec. 1, 2013, pp. 673-680.
Taylor, “Virtual camera movement: The way of the future?”, American Cinematographer, vol. 77, No. 9, Sep. 1996, pp. 93-100.
Tseng et al., “Automatic 3-D depth recovery from a single urban-scene image”, 2012 Visual Communications and Image Processing, Nov. 27-30, 2012, San Diego, CA, USA, pp. 1-6.
Uchida et al., “3D Face Recognition Using Passive Stereo Vision”, IEEE International Conference on Image Processing 2005, Sep. 14, 2005, 4 pgs.
Vaish et al., “Reconstructing Occluded Surfaces Using Synthetic Apertures: Stereo, Focus and Robust Measures”, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), vol. 2, Jun. 17-22, 2006, pp. 2331-2338.
Vaish et al., “Using Plane + Parallax for Calibrating Dense Camera Arrays”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004, 8 pgs.
Vaish et al., “Synthetic Aperture Focusing Using a Shear-Warp Factorization of the Viewing Transform”, IEEE Workshop on A3DISS, CVPR, 2005, 8 pgs.
Van Der Wal et al., “The Acadia Vision Processor”, Proceedings Fifth IEEE International Workshop on Computer Architectures for Machine Perception, Sep. 13, 2000, Padova, Italy, pp. 31-40.
Veilleux, “CCD Gain Lab: The Theory”, University of Maryland, College Park-Observational Astronomy (ASTR 310), Oct. 19, 2006, pp. 1-5 [online], [retrieved on May 13, 2014]. Retrieved from the Internet <URL: http://www.astro.umd.edu/˜veilleux/ASTR310/fall06/ccd_theory.pdf>, 5 pgs.
Venkataraman et al., “PiCam: An Ultra-Thin High Performance Monolithic Camera Array”, ACM Transactions on Graphics (TOG), ACM, US, vol. 32, No. 6, Nov. 1, 2013, pp. 1-13.
Vetro et al., “Coding Approaches for End-To-End 3D TV Systems”, Mitsubishi Electric Research Laboratories, Inc., TR2004-137, Dec. 2004, 6 pgs.
Viola et al., “Robust Real-time Object Detection”, Cambridge Research Laboratory, Technical Report Series, Compaq, CRL 2001/01, Feb. 2001, Printed from: http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-2001-1.pdf, 30 pgs.
Vuong et al., “A New Auto Exposure and Auto White-Balance Algorithm to Detect High Dynamic Range Conditions Using CMOS Technology”, Proceedings of the World Congress on Engineering and Computer Science 2008, WCECS 2008, Oct. 22-24, 2008, 5 pgs.
Wang, “Calculation of Image Position, Size and Orientation Using First Order Properties”, Dec. 29, 2010, OPTI521 Tutorial, 10 pgs.
Wang et al., “Soft scissors: an interactive tool for realtime high quality matting”, ACM Transactions on Graphics (TOG)—Proceedings of ACM SIGGRAPH 2007, vol. 26, Issue 3, Article 9, Jul. 2007, 6 pgs., published Aug. 5, 2007.
Wang et al., “Automatic Natural Video Matting with Depth”, 15th Pacific Conference on Computer Graphics and Applications, PG '07, Oct. 29-Nov. 2, 2007, Maui, HI, USA, pp. 469-472.
Wang et al., “Image and Video Matting: A Survey”, Foundations and Trends, Computer Graphics and Vision, vol. 3, No. 2, 2007, pp. 91-175.
Wang et al., “Facial Feature Point Detection: A Comprehensive Survey”, arXiv: 1410.1037v1, Oct. 4, 2014, 32 pgs.
Wetzstein et al., “Computational Plenoptic Imaging”, Computer Graphics Forum, 2011, vol. 30, No. 8, pp. 2397-2426.
Related Publications (1)
Number Date Country
20220414928 A1 Dec 2022 US