Systems and methods for virtual and augmented reality

Description

BACKGROUND OF THE INVENTION
1). Field of the Invention

This invention is related to connected mobile computing systems, methods, and configurations, and more specifically to mobile computing systems, methods, and configurations featuring at least one wearable component which may be utilized for virtual and/or augmented reality operation.

2). Discussion of Related Art

It is desirable that mixed reality, or augmented reality, near-eye displays be lightweight, low-cost, have a small form-factor, have a wide virtual image field of view, and be as transparent as possible. In addition, it is desirable to have configurations that present virtual image information in multiple focal planes (for example, two or more) in order to be practical for a wide variety of use-cases without exceeding an acceptable allowance for vergence-accommodation mismatch. Referring to FIG. 8, an augmented reality system is illustrated featuring a head-worn viewing component (2), a hand-held controller component (4), and an interconnected auxiliary computing or controller component (6) which may be configured to be worn as a belt pack or the like on the user. Each of these components may be operatively coupled (10, 12, 14, 16, 17, 18) to each other and to other connected resources (8) such as cloud computing or cloud storage resources via wired or wireless communication configurations, such as those specified by IEEE 802.11, Bluetooth®, and other connectivity standards and configurations. As described, for example, in U.S. patent application Ser. Nos. 14/555,585, 14/690,401, 14/331,218, 15/481,255, 62/627,155, 62/518,539, 16/229,532, 16/155,564, 15/413,284, 16/020,541, 62/702,322, 62/206,765, 15/597,694, 16/221,065, 15/968,673, 62/682,788, and 62/899,678, each of which is incorporated by reference herein in its entirety, various aspects of such components are described, such as various embodiments of the two depicted optical elements (20) through which the user may see the world around them along with visual components which may be produced by the associated system components, for an augmented reality experience. As illustrated in FIG. 8, such a system may also comprise various sensors configured to provide information pertaining to the environment around the user, including but not limited to various camera type sensors (such as monochrome, color/RGB, and/or thermal imaging components) (22, 24, 26), depth camera sensors (28), and/or sound sensors (30) such as microphones. There is a need for compact and persistently connected wearable computing systems and assemblies such as those described herein, which may be utilized to provide a user with the perception of rich augmented reality experiences.

SUMMARY OF THE INVENTION

This document describes certain aspects of what may be termed “the deep middle-end matcher”, a neural network configured to match two sets of local features by jointly finding correspondences and rejecting non-matchable points. Such a neural network configuration may be utilized in association with spatial computing resources such as those illustrated in FIG. 8, including but not limited to the camera and processing resources that comprise such spatial computing systems. Within the deep middle-end matcher type of configuration, assignments may be estimated by solving an optimal transport problem, whose costs are predicted by a graph neural network. We describe a flexible context aggregation mechanism based on attention, which enables the deep middle-end matcher configuration to reason about the underlying 3D scene and feature assignments jointly. Compared to traditional, hand-designed heuristics, our technique learns priors over geometric transformations and regularities of the 3D world through end-to-end training from images to correspondences. The deep middle-end matcher outperforms other learned approaches and sets a new state-of-the-art on the task of pose estimation in challenging real-world indoor and outdoor environments. These methods and configurations match in real-time on a modern graphical processing unit (“GPU”), and can be readily integrated into modern structure-from-motion (“SfM”) or simultaneous localization and mapping (“SLAM”) systems, all of which may be incorporated into systems such as that illustrated in FIG. 8.

The invention provides a computer system including a computer-readable medium, a processor connected to the computer-readable medium, and a set of instructions on the computer-readable medium. The set of instructions may include a deep middle-end matcher architecture that may include an attentional graph neural network having a keypoint encoder to map keypoint positions p and their visual descriptors d into a single vector, and alternating self- and cross-attention layers that, based on the vector, are repeated L times to create representations f; and an optimal matching layer that creates an M by N score matrix from the representations f and finds an optimal partial assignment based on the M by N score matrix.

The computer system may further include that, in the keypoint encoder, an initial representation ⁽⁰⁾x_i for each keypoint i combines visual appearance and location, with the respective keypoint position embedded into a high-dimensional vector with a Multilayer Perceptron (MLP) as follows:

⁽⁰⁾x_i=d_i+MLP(p_i)

The computer system may further include that the keypoint encoder allows the attentional graph neural network to reason about appearance and position jointly.

The computer system may further include that the keypoint encoder includes a multiplex graph neural network having a single complete graph with nodes that are the keypoints of two images.

The computer system may further include that the graph is a multiplex graph that has two types of undirected edges, namely intra-image edges (self edges, E_self) that connect keypoints i to all other keypoints within the same image and inter-image edges (cross edges, E_cross) that connect keypoints i to all keypoints in the other image, and that uses a message passing formulation to propagate information along both types of edges, such that the resulting multiplex graph neural network starts with a high-dimensional state for each node and computes at each layer an updated representation by simultaneously aggregating messages across all given edges for all nodes.

The computer system may further include that, if ^(l)x^A_i is the intermediate representation for element i in image A at layer l, the message m_E→i is the result of the aggregation from all keypoints {j:(i, j)∈E}, where E∈{E_self, E_cross}, and a residual message passing update for all i in A is:

^(l+1)x_i^A=^(l)x_i^A+MLP([^(l)x_i^A∥m_E→i]),

where [.∥.] denotes concatenation.

The computer system may further include that a fixed number of layers L with different parameters are chained and alternately aggregate along the self and cross edges such that, starting from l=1, E=E_self if l is odd and E=E_cross if l is even.

The computer system may further include that the alternating self- and cross-attention layers are computed with an attention mechanism that computes the message m_E→i and performs the aggregation, wherein the self edges are based on self-attention and the cross edges are based on cross-attention, wherein, for a representation of i, a query q_i retrieves values v_j of some elements based on their attributes, the keys k_j, and the message is computed as a weighted average of the values:

$m_{ℰ \to i} = \sum_{j : (i, j) \in ℰ} α_{ij} v_{j} .$

The computer system may further include that an attention mask α_ij is the Softmax over the key-query similarities:

$α_{ij} = {Softmax}_{j} (q_{i}^{⊤} k_{j}) .$

The computer system may further include that the respective key, query, and value are computed as linear projections of deep features of the graph neural network, with a query keypoint i being in an image Q and all source keypoints being in an image S, (Q, S)∈{A, B}², in the equation:

$q_{i} = W_{1} {}^{(l)}x_{i}^{Q} + b_{1}, \qquad \begin{bmatrix} k_{j} \\ v_{j} \end{bmatrix} = \begin{bmatrix} W_{2} \\ W_{3} \end{bmatrix} {}^{(l)}x_{j}^{S} + \begin{bmatrix} b_{2} \\ b_{3} \end{bmatrix} .$

The computer system may further include that final matching descriptors of the alternating self- and cross-attention layers are linear projections:

$f_{i}^{A} = W \cdot {}^{(L)}x_{i}^{A} + b, \quad \forall i \in A .$

The computer system may further include that the optimal matching layer expresses a pairwise score for a set as the similarity of matching descriptors:

$S_{i, j} = \langle f_{i}^{A}, f_{j}^{B} \rangle, \quad \forall (i, j) \in A \times B,$

where <., .> is the inner product. As opposed to learned visual descriptors, the matching descriptors are not normalized, and their magnitude can change per feature and during training to reflect the prediction confidence.

The computer system may further include that the optimal matching layer, for occlusion and visibility, suppresses occluded keypoints and augments each set of keypoints with a dustbin score so that unmatched keypoints are explicitly assigned to dustbin scores.

The computer system may further include that the score S is augmented to S̄ by appending a new row and column, the point-to-bin and bin-to-bin scores, filled with a single learnable parameter:

${\overline{S}}_{i, N + 1} = {\overline{S}}_{M + 1, j} = {\overline{S}}_{M + 1, N + 1} = α \in ℝ .$

The computer system may further include that the optimal matching layer finds the optimal partial assignment based on the M by N score matrix using the Sinkhorn algorithm for T iterations.

The computer system may further include that, after T iterations, the optimal matching layer drops the dustbin scores and recovers P=P̄_1:M,1:N, where

P1_N≤1_M and P^T1_M≤1_N

are the constraints on the original assignment and

P̄1_N+1=a and P̄^T1_M+1=b

are the constraints on the assignment augmented with the dustbin scores.

The invention also provides a computer-implemented method that may include mapping, with a keypoint encoder of an attentional graph neural network of a deep middle-end matcher architecture, keypoint positions p and their visual descriptors d into a single vector; executing, with alternating self- and cross-attention layers of the attentional graph neural network of the deep middle-end matcher architecture, based on the vector, repeated L times, to create representations f; and executing an optimal matching layer, of the attentional graph neural network of the deep middle-end matcher architecture, to create an M by N score matrix from the representations f and finding an optimal partial assignment based on the M by N score matrix.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is further described by way of example with reference to the accompanying drawings, wherein:

FIG. 1 is a representative sketch that illustrates feature matching with a deep middle-end matcher;

FIG. 2 shows correspondences estimated by the deep middle-end matcher for two difficult indoor image pairs;

FIG. 3 is a representative sketch that shows how we formulate the deep middle-end matcher as solving an optimization problem;

FIG. 4 shows images in which attention masks are depicted as rays;

FIG. 5 shows graphs of indoor and outdoor pose estimation;

FIG. 6 shows qualitative image matches;

FIG. 7 visualizes attention in self- and cross-attention masks at various layers and heads; and

FIG. 8 shows an augmented reality system.

DETAILED DESCRIPTION OF THE INVENTION

Finding correspondences between points in images is a vital step for computer vision tasks dealing with 3D reconstruction or visual localization, such as Simultaneous Localization and Mapping (SLAM) and Structure-from-Motion (SfM). These estimate the 3D structure and camera poses from such correspondences after matching local features, a process known as data association. Factors such as large viewpoint change, occlusion, blur, and lack of texture make 2D-to-2D data association particularly challenging.

In this description, we present a new way of thinking about the feature matching problem. Instead of learning better task-agnostic local features followed by simple matching heuristics and tricks, we propose to learn the matching process from pre-existing local features using a novel neural architecture called the deep middle-end matcher (DMEM). In the context of SLAM, which typically decomposes the problem into a visual feature detection front-end and a bundle adjustment or pose estimation back-end, our network lies directly in the middle: the deep middle-end matcher is a learnable middle-end. FIG. 1 illustrates feature matching with the deep middle-end matcher. Our approach establishes pointwise correspondences between challenging image pairs. It takes off-the-shelf local features as input and uses an attentional graph neural network to solve an assignment optimization problem. The deep middle-end matcher acts as a middle-end and handles partial point visibility and occlusion elegantly, producing a partial assignment matrix.

In this work, learning feature matching is viewed as finding the partial assignment between two sets of local features. We revisit the classical graph-based strategy of matching by solving a linear assignment problem, which, when relaxed to an optimal transport problem, can be solved differentiably [See references 50, 9, 31 below]. The cost function of this optimization is predicted by a Graph Neural Network (GNN). Inspired by the success of the Transformer [see reference 48 below], it uses self- (intra-image) and cross- (inter-image) attention to leverage both spatial relationships of the keypoints and their visual appearance. This formulation enforces the assignment structure of the predictions while enabling the cost to learn complex priors, elegantly handling occlusion and non-repeatable keypoints. Our method is trained end-to-end from images to correspondences—we learn priors for pose estimation from a large annotated dataset, enabling the deep middle-end matcher to reason about the 3D scene and the assignment. Our work can be applied to a variety of multiple-view geometry problems that require high-quality feature correspondences.

We show the superiority of the deep middle-end matcher compared to both handcrafted matchers and learned inlier classifiers. FIG. 2 shows correspondences estimated by the deep middle-end matcher for two difficult indoor image pairs. The deep middle-end matcher successfully estimates an accurate pose while other learned or handcrafted methods fail (correct correspondences in green). The proposed method brings the most substantial improvements when combined with SuperPoint [see reference 14 below], a deep front-end, thereby advancing the state-of-the-art on the tasks of homography estimation and indoor and outdoor pose estimation, and paving the way towards deep SLAM.

2. Related Work

Local feature matching is generally performed by i) detecting interest points, ii) computing visual descriptors, iii) matching these with a Nearest Neighbor (NN) search, iv) filtering incorrect matches, and finally v) estimating a geometric transformation. The classical pipeline developed in the 2000s is often based on SIFT [see reference 25 below], filters matches with Lowe's ratio test [see reference 25 below], the cross-check, and heuristics like neighborhood consensus [see references 46, 8, 5, 40 below], and finds a transformation with a robust solver like RANSAC [see references 17, 35 below].
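For comparison with the learned matcher described later, steps iii) and iv) of this classical pipeline can be sketched in a few lines. This is a minimal illustration and not the implementation of any cited work: the function name `nn_match`, the use of Euclidean distance, and the ratio value 0.8 are assumptions chosen for the example.

```python
import math

def nn_match(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbor matching followed by a ratio test and a cross-check.

    desc_a, desc_b: lists of descriptor vectors (lists of floats).
    ratio: hypothetical ratio-test threshold (an assumption of this sketch).
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    if not desc_a or not desc_b:
        return []

    def dist(u, v):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

    def ranked(query, candidates):
        # Candidate indices sorted by increasing distance, with the distances.
        order = sorted(range(len(candidates)), key=lambda k: dist(query, candidates[k]))
        return order, [dist(query, candidates[k]) for k in order]

    matches = []
    for i, d in enumerate(desc_a):
        order, dists = ranked(d, desc_b)
        j = order[0]
        # Ratio test: reject if the best match is not clearly better than the second best.
        if len(dists) > 1 and dists[0] >= ratio * dists[1]:
            continue
        # Cross-check: keep (i, j) only if i is also the nearest neighbor of j.
        back_order, _ = ranked(desc_b[j], desc_a)
        if back_order[0] == i:
            matches.append((i, j))
    return matches

desc_a = [[0.0, 0.0], [10.0, 10.0]]
desc_b = [[0.0, 1.0], [10.0, 9.0], [50.0, 50.0]]
print(nn_match(desc_a, desc_b))
```

Each keypoint is decided independently here, so nothing enforces a globally consistent assignment. That is the limitation the learned middle-end is meant to address.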

Recent works on deep learning for matching often focus on learning better sparse detectors and local descriptors [see references 14, 15, 29, 37, 54 below] from data using Convolutional Neural Networks (CNNs). To improve their discriminativeness, some works explicitly look at a wider context using regional features [see reference 26 below] or log-polar patches [see reference 16 below]. Other approaches learn to filter matches by classifying them into inliers and outliers [see references 27, 36, 6, 56 below]. These operate on sets of matches, still estimated by NN search, and thus ignore the assignment structure and discard visual information. Works that actually learn to match have so far focused on dense matching [see reference 38 below] or 3D point clouds [see reference 52 below], and still exhibit such limitations. In contrast, our learnable middle-end simultaneously performs context aggregation, matching, and filtering in a single end-to-end architecture.

Graph matching problems are usually formulated as quadratic assignment problems, which are NP-hard, requiring expensive, complex, and thus impractical solvers [see reference 24 below]. For local features, the computer vision literature of the 2000s [see references 4, 21, 45 below] uses handcrafted costs with many heuristics, making it complex and fragile. Caetano et al. [see reference 7 below] learn the cost of the optimization for a simpler linear assignment, but only use a shallow model, while our deep middle-end matcher learns a flexible cost using a neural network. Related to graph matching is the problem of optimal transport [see reference 50 below]—it is a generalized linear assignment with an efficient yet simple approximate solution, the Sinkhorn algorithm [see references 43, 9, 31 below].

Deep Learning for sets such as point clouds aims at designing permutation equivariant or invariant functions by aggregating information across elements. Some works treat all of them equally, through global pooling [see references 55, 32, 11 below] or instance normalization [see references 47, 27, 26 below], while others focus on a local neighborhood in coordinate or feature space [see references 33, 53 below]. Attention [see references 48, 51, 49, 20 below] can perform both global and data-dependent local aggregation by focusing on specific elements and attributes, and is thus more flexible. Our work uses the fact that it can be seen as a particular instance of a Message Passing Graph Neural Network [see references 18, 3 below] on a complete graph. By applying attention to multi-edge, or multiplex, graphs, similar to [see references 22, 57 below], the deep middle-end matcher can learn complex reasoning about the two sets of local features.

3. The Deep Middle-End Matcher Architecture

Motivation: In the image matching problem, some regularities of the world could be leveraged: the 3D world is largely smooth and sometimes planar, all correspondences for a given image pair derive from a single epipolar transform if the scene is static, and some poses are more likely than others. In addition, 2D keypoints are usually projections of salient 3D points, like corners or blobs, thus correspondences across images must adhere to certain physical constraints: i) a keypoint can have at most a single correspondence in the other image; and ii) some keypoints will be unmatched due to occlusion and failure of the detector. An effective model for feature matching should aim at finding all correspondences between reprojections of the same 3D points and identifying keypoints that have no matches. FIG. 3 shows how we formulate the deep middle-end matcher as solving an optimization problem, whose cost is predicted by a deep neural network. This alleviates the need for domain expertise and heuristics: we learn relevant priors directly from the data.

As shown in FIG. 3, the deep middle-end matcher is made up of two major components: the attentional graph neural network (Section 3.1), and the optimal matching layer (Section 3.2). The first component uses a keypoint encoder to map keypoint positions p and their visual descriptors d into a single vector, and then uses alternating self- and cross-attention layers (repeated L times) to create more powerful representations f. The optimal matching layer creates an M by N score matrix, augments it with dustbins, then finds the optimal partial assignment using the Sinkhorn algorithm (for T iterations).

Formulation: Consider two images A and B, each with a set of keypoint positions p and associated visual descriptors d; we refer to them jointly, (p, d), as the local features. Keypoints consist of x and y image coordinates as well as a detection confidence c, p_i:=(x, y, c)_i. Visual descriptors d_i∈R^D can be those extracted by a CNN like SuperPoint or traditional descriptors like SIFT. Images A and B have M and N local features and their sets of keypoint indices are A:={1, . . . , M} and B:={1, . . . , N}, respectively.

Partial Assignment: Constraints i) and ii) mean that correspondences derive from a partial assignment between the two sets of keypoints. For the integration into downstream tasks and better interpretability, each possible correspondence should have a confidence value. We consequently define a partial soft assignment matrix P∈[0, 1]^M×N as:

P1_N≤1_M and P^T1_M≤1_N. (1)

Our goal is the following: design a neural network that predicts the assignment P from two sets of local features.
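To make the constraints of Equation 1 concrete, the following sketch checks whether a given matrix is a valid partial soft assignment. The function name `is_partial_assignment` and the tolerance `tol` are hypothetical and only serve the illustration.

```python
def is_partial_assignment(P, tol=1e-9):
    """Check Equation 1: entries in [0, 1], every row sum <= 1, every column sum <= 1.

    P is an M x N matrix given as a list of rows; tol absorbs floating-point rounding.
    """
    if not P or not P[0]:
        return False
    entries_ok = all(0.0 <= p <= 1.0 for row in P for p in row)
    rows_ok = all(sum(row) <= 1.0 + tol for row in P)
    cols_ok = all(sum(col) <= 1.0 + tol for col in zip(*P))
    return entries_ok and rows_ok and cols_ok

# M = 2 keypoints in image A, N = 3 keypoints in image B (illustrative values).
P_valid = [[0.9, 0.0, 0.05],
           [0.0, 0.2, 0.10]]
# The first column sums to 1.5, so two keypoints of A would "share" one keypoint of B.
P_invalid = [[0.9, 0.0, 0.05],
             [0.6, 0.2, 0.10]]
print(is_partial_assignment(P_valid), is_partial_assignment(P_invalid))
```

Rows and columns may sum to less than 1. The missing mass corresponds to keypoints that remain unmatched.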

3.1. Attentional Graph Neural Network

The first major block of the deep middle-end matcher (see Section 3.1) is the attentional graph neural network whose job is the following: given initial local features, compute f_i∈R^D, the matching descriptors, by letting the features communicate with each other. Long-range feature communication is vital for robust matching and requires aggregation of information from within an image as well as across an image pair.

Intuitively, the distinctive information about a given keypoint depends on its visual appearance and its location, but also its spatial and visual relationship with respect to other co-visible keypoints, e.g. neighboring or salient ones. On the other hand, knowledge of keypoints in the second image can help to resolve ambiguities by comparing candidate matches or estimating the relative photometric or geometric transformation from global and unambiguous clues.

When asked to match a given ambiguous keypoint, humans look back-and-forth at both images: they sift for tentative matching keypoints, examine each of them, and look for contextual clues that help to disambiguate the true match from other self-similarities. This hints at an iterative process that can focus its attention on specific locations.

Keypoint Encoder: The initial representation ⁽⁰⁾x_i for each keypoint i combines visual appearance and location. We embed the keypoint position into a high-dimensional vector with a Multilayer Perceptron (MLP) as follows:

⁽⁰⁾x_i=d_i+MLP(p_i). (2)

The encoder allows the network to reason about both appearance and position jointly (this is especially powerful with an attention mechanism) and is an instance of the “positional encoder” introduced in the Transformer [see reference 48 below].
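A minimal sketch of Equation 2 in plain Python follows. The layer sizes, the ReLU non-linearity, the random initialization, and the helper names `make_mlp`, `mlp`, and `encode_keypoint` are assumptions made for the illustration. The text above specifies only that an MLP lifts p_i to the descriptor dimension before the sum.

```python
import random

def make_mlp(sizes, rng):
    """Randomly initialize an MLP as a list of (weights, biases) pairs, one per layer."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((W, b))
    return layers

def mlp(x, layers):
    """Apply the layers in order, with a ReLU after every layer except the last."""
    for idx, (W, b) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]
        if idx < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x

def encode_keypoint(d, p, layers):
    """Equation 2: initial representation = descriptor + MLP(position), p = (x, y, c)."""
    embedded = mlp(list(p), layers)
    assert len(embedded) == len(d), "MLP output must match the descriptor dimension"
    return [dv + ev for dv, ev in zip(d, embedded)]

rng = random.Random(0)
D = 8                                  # toy descriptor dimension for the sketch
layers = make_mlp([3, 16, D], rng)     # hypothetical layer sizes
d_i = [rng.gauss(0.0, 1.0) for _ in range(D)]
p_i = (0.25, 0.75, 0.9)                # (x, y, detection confidence)
x0_i = encode_keypoint(d_i, p_i, layers)
print(len(x0_i))
```

Because the position is added to the descriptor rather than concatenated, the representation keeps dimension D and every later layer sees appearance and position in the same vector.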

Multiplex Graph Neural Network: We consider a single complete graph whose nodes are the keypoints of both images. The graph has two types of undirected edges; it is a multiplex graph. Intra-image edges, or self edges, E_self, connect keypoints i to all other keypoints within the same image. Inter-image edges, or cross edges, E_cross, connect keypoints i to all keypoints in the other image. We use the message passing formulation [see references 18, 3 below] to propagate information along both types of edges. The resulting Multiplex Graph Neural Network starts with a high-dimensional state for each node and computes at each layer an updated representation by simultaneously aggregating messages across all given edges for all nodes.

Let ^(l)x^A_i be the intermediate representation for element i in image A at layer l. The message m_E→i is the result of the aggregation from all keypoints {j:(i, j)∈E}, where E∈{E_self, E_cross}. The residual message passing update for all i in A is:

^(l+1)x_i^A=^(l)x_i^A+MLP([^(l)x_i^A∥m_E→i]), (3)

where [.∥.] denotes concatenation. A similar update can be simultaneously performed for all keypoints in image B. A fixed number of layers L with different parameters are chained and alternately aggregate along the self and cross edges. As such, starting from l=1, E=E_self if l is odd and E=E_cross if l is even.
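The layer schedule and the residual update of Equation 3 can be sketched as follows. The names `edge_set_for_layer`, `residual_update`, and `toy_mlp` are hypothetical, and `toy_mlp` is a hand-written stand-in for a learned MLP so that the arithmetic can be followed by hand.

```python
def edge_set_for_layer(l):
    """Layers are numbered from l = 1: odd layers use self edges, even layers cross edges."""
    return "self" if l % 2 == 1 else "cross"

def residual_update(x_i, message, mlp_fn):
    """Equation 3: x_i <- x_i + MLP([x_i || m]), where || is vector concatenation."""
    delta = mlp_fn(x_i + message)      # '+' on Python lists concatenates them
    return [xv + dv for xv, dv in zip(x_i, delta)]

def toy_mlp(v):
    # Stand-in for a learned MLP (an assumption): maps the 2D-dimensional
    # concatenation back to D dimensions by halving the message part.
    D = len(v) // 2
    return [0.5 * m for m in v[D:]]

x_i = [1.0, 2.0]
m = [4.0, -2.0]
updated = residual_update(x_i, m, toy_mlp)
schedule = [edge_set_for_layer(l) for l in range(1, 5)]
print(updated, schedule)
```

The residual form means a layer only has to learn a correction to the current state, and the alternation lets information gathered within an image be compared across images in the next layer.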

Attentional Aggregation: An attention mechanism computes the message m_E→i and performs the aggregation. Self edges are based on self-attention [see reference 48 below] and cross edges are based on cross-attention. Akin to database retrieval, a representation of i, the query q_i, retrieves the values v_j of some elements based on their attributes, the keys k_j. We compute the message as a weighted average of the values:

$m_{ℰ \to i} = \sum_{j : (i, j) \in ℰ} α_{ij} v_{j}, \qquad (4)$

where the attention mask α_ij is the Softmax over the key-query similarities:

$α_{ij} = {Softmax}_{j} (q_{i}^{⊤} k_{j}) .$

The key, query, and value are computed as linear projections of deep features of the graph neural network. Considering that query keypoint i is in the image Q and all source keypoints are in image S, (Q, S)∈{A, B}², we can write:

$q_{i} = W_{1} {}^{(l)}x_{i}^{Q} + b_{1}, \qquad \begin{bmatrix} k_{j} \\ v_{j} \end{bmatrix} = \begin{bmatrix} W_{2} \\ W_{3} \end{bmatrix} {}^{(l)}x_{j}^{S} + \begin{bmatrix} b_{2} \\ b_{3} \end{bmatrix} . \qquad (5)$

Each layer l has its own projection parameters, and they are shared for all keypoints of both images. In practice, we improve the expressivity with multi-headed attention [see reference 48 below].
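The attentional aggregation of Equations 4 and 5 can be sketched for a single head and a single query keypoint as below. The identity projection weights and the helper names are assumptions for the illustration. Multi-head attention, which is used in practice, is not shown.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear(W, b, x):
    """Linear projection W x + b."""
    return [dot(row, x) + bias for row, bias in zip(W, b)]

def softmax(scores):
    top = max(scores)                  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_message(x_i, sources, params):
    """Single-head message for one query keypoint.

    x_i: representation of the query keypoint (in image Q).
    sources: representations of the source keypoints (in image S).
    params: ((W1, b1), (W2, b2), (W3, b3)) for query, key, and value.
    """
    (W1, b1), (W2, b2), (W3, b3) = params
    q = linear(W1, b1, x_i)
    keys = [linear(W2, b2, x_j) for x_j in sources]
    values = [linear(W3, b3, x_j) for x_j in sources]
    alpha = softmax([dot(q, k) for k in keys])
    dim = len(values[0])
    message = [sum(a * v[c] for a, v in zip(alpha, values)) for c in range(dim)]
    return message, alpha

identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
params = [(identity, zero_bias)] * 3   # hypothetical weights, for illustration only
x_query = [1.0, 0.0]
sources = [[3.0, 0.0], [0.0, 3.0]]
message, alpha = attention_message(x_query, sources, params)
print(alpha, message)
```

For a self-attention layer, `sources` would hold keypoints of the same image as the query; for a cross-attention layer, keypoints of the other image. The computation is otherwise identical.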

Our formulation provides maximum flexibility as the network can learn to focus on a subset of keypoints based on specific attributes. In FIG. 4, masks α_ij are shown as rays. Attentional aggregation builds a dynamic graph between keypoints. Self-attention (top) can attend anywhere in the same image, e.g., distinctive locations, and is thus not restricted to nearby locations. Cross-attention (bottom) attends to locations in the other image, such as potential matches that have a similar local appearance. The deep middle-end matcher can retrieve or attend based on both appearance and keypoint location as they are encoded in the representation x_i. This includes attending to a nearby keypoint and retrieving the relative positions of similar or salient keypoints. This enables representations of the geometric transformation and the assignment. The final matching descriptors are linear projections:

$f_{i}^{A} = W \cdot {}^{(L)}x_{i}^{A} + b, \quad \forall i \in A, \qquad (6)$

and similarly for keypoints in B.

3.2. Optimal Matching Layer

The second major block of the deep middle-end matcher (see Section 3.2) is the optimal matching layer, which produces a partial assignment matrix. As in the standard graph matching formulation, the assignment P can be obtained by computing a score matrix S∈R^M×N for all possible matches and maximizing the total score Σ_i,j S_i,j P_i,j under the constraints in Equation 1. This is equivalent to solving a linear assignment problem.

Score Prediction: Building a separate representation for all (M+1)×(N+1) potential matches would be prohibitive. We instead express the pairwise score as the similarity of matching descriptors:

$S_{i, j} = \langle f_{i}^{A}, f_{j}^{B} \rangle, \quad \forall (i, j) \in A \times B, \qquad (7)$

where <., .> is the inner product. As opposed to learned visual descriptors, the matching descriptors are not normalized, and their magnitude can change per feature and during training to reflect the prediction confidence.

Occlusion and Visibility: To let the network suppress occluded keypoints, we augment each set with a dustbin so that unmatched keypoints are explicitly assigned to it. This technique is common in graph matching, and dustbins have also been used by SuperPoint [see reference 14 below] to account for image cells that might not have a detection. We augment the score S to S̄ by appending a new row and column, the point-to-bin and bin-to-bin scores, filled with a single learnable parameter:

${\overline{S}}_{i, N + 1} = {\overline{S}}_{M + 1, j} = {\overline{S}}_{M + 1, N + 1} = α \in ℝ . \qquad (8)$

While keypoints in A will be assigned to a single keypoint in B or the dustbin, each dustbin has as many matches as there are keypoints in the other set: N and M for the dustbins in A and B, respectively. We denote as a=[1_M^T N]^T and b=[1_N^T M]^T the number of expected matches for each keypoint and dustbin in A and B. The augmented assignment P̄ now has the constraints:

P̄1_N+1=a and P̄^T1_M+1=b. (9)
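The score prediction and dustbin augmentation of Equations 7 to 9 amount to a few list operations, sketched below. The matching descriptors, the dustbin value 0.5, and the helper names are illustrative assumptions. In the trained network the dustbin score is a learned parameter.

```python
def score_matrix(f_a, f_b):
    """Equation 7: S[i][j] is the inner product of matching descriptors f_a[i] and f_b[j]."""
    return [[sum(x * y for x, y in zip(fa, fb)) for fb in f_b] for fa in f_a]

def augment_with_dustbins(S, alpha):
    """Equation 8: append one column and one row filled with the dustbin score alpha."""
    n_cols = len(S[0])
    S_bar = [row + [alpha] for row in S]
    S_bar.append([alpha] * (n_cols + 1))
    return S_bar

def expected_matches(M, N):
    """Marginals of Equation 9: one match per keypoint; N and M for the two dustbins."""
    a = [1] * M + [N]
    b = [1] * N + [M]
    return a, b

f_a = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]   # M = 3 toy matching descriptors
f_b = [[2.0, 0.0], [0.0, 1.0]]               # N = 2 toy matching descriptors
S = score_matrix(f_a, f_b)
S_bar = augment_with_dustbins(S, alpha=0.5)  # 0.5 is a hypothetical dustbin score
a, b = expected_matches(len(f_a), len(f_b))
print(S_bar, a, b)
```

Both marginals sum to M + N, so the augmented problem is a balanced transport problem even though the original partial assignment is not.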

Sinkhorn Algorithm: The solution of the above optimization problem corresponds to the optimal transport [see reference 31 below] between discrete distributions a and b with scores S̄. It can be approximately solved with the Sinkhorn algorithm [see references 43, 9 below], a differentiable version of the Hungarian algorithm [see reference 28 below], classically used for bipartite matching. It solves a regularized transport problem, naturally resulting in a soft assignment. This normalization amounts to iteratively performing alternating Softmax along rows and columns, and is thus easily parallelized on GPU. After T iterations, we drop the dustbins and recover P=P̄_1:M,1:N.
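The alternating row and column normalization can be sketched in log-space as follows. This is an illustration under stated assumptions rather than the trained implementation: the regularization strength is implicitly fixed to 1, the augmented scores are toy values, and the names `log_sinkhorn` and `drop_dustbins` are hypothetical.

```python
import math

def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_sinkhorn(S_bar, a, b, iters=100):
    """Alternately rescale rows and columns of exp(S_bar) so that they sum to a and b.

    All updates are done on logarithms, which avoids overflow for large scores.
    """
    rows, cols = len(S_bar), len(S_bar[0])
    log_a = [math.log(v) for v in a]
    log_b = [math.log(v) for v in b]
    u = [0.0] * rows                   # log row scaling factors
    v = [0.0] * cols                   # log column scaling factors
    for _ in range(iters):
        u = [log_a[i] - logsumexp([S_bar[i][j] + v[j] for j in range(cols)])
             for i in range(rows)]
        v = [log_b[j] - logsumexp([S_bar[i][j] + u[i] for i in range(rows)])
             for j in range(cols)]
    return [[math.exp(S_bar[i][j] + u[i] + v[j]) for j in range(cols)]
            for i in range(rows)]

def drop_dustbins(P_bar):
    """Remove the last row and column to recover the M x N partial assignment."""
    return [row[:-1] for row in P_bar[:-1]]

# Toy augmented scores for M = 3, N = 2; the last row and column are the dustbins.
S_bar = [[2.0, 0.0, 0.5],
         [0.0, 2.0, 0.5],
         [2.0, 1.0, 0.5],
         [0.5, 0.5, 0.5]]
a = [1, 1, 1, 2]
b = [1, 1, 3]
P_bar = log_sinkhorn(S_bar, a, b, iters=100)
P = drop_dustbins(P_bar)
print([[round(p, 3) for p in row] for row in P])
```

After the dustbins are dropped, every row and column of the result sums to less than 1. The remainder is the mass each keypoint sent to a dustbin.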

3.3. Loss

By design, both the graph neural network and the optimal matching layer are differentiable, which enables backpropagation from matches to visual descriptors. The deep middle-end matcher is trained in a supervised manner from ground truth matches ℳ={(i, j)}⊂A×B. These are estimated from ground truth relative transformations, using poses and depth maps or homographies. This also lets us label some keypoints ℐ⊆A and 𝒥⊆B as unmatched if they do not have any reprojection in their vicinity. Given the labels, we minimize the negative log-likelihood of the assignment P̄:

$\begin{aligned} Loss = & - \sum_{(i, j) \in ℳ} \log {\overline{P}}_{i, j} \\ & - \sum_{i \in ℐ} \log {\overline{P}}_{i, N + 1} - \sum_{j \in 𝒥} \log {\overline{P}}_{M + 1, j} . \qquad (10) \end{aligned}$

This supervision aims at simultaneously maximizing the precision and the recall of the matching.
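Equation 10 can be evaluated directly on an augmented assignment, as in the sketch below. The function name `matching_loss`, the 0-based indexing, and the numerical values are assumptions made for the illustration.

```python
import math

def matching_loss(P_bar, matches, unmatched_a, unmatched_b):
    """Negative log-likelihood of Equation 10, using 0-based indices.

    P_bar: (M+1) x (N+1) augmented assignment; the last row and column are the dustbins.
    matches: ground truth pairs (i, j).
    unmatched_a, unmatched_b: indices of keypoints labeled as unmatched in A and B.
    """
    dustbin_row = len(P_bar) - 1
    dustbin_col = len(P_bar[0]) - 1
    loss = 0.0
    for i, j in matches:
        loss -= math.log(P_bar[i][j])
    for i in unmatched_a:
        loss -= math.log(P_bar[i][dustbin_col])
    for j in unmatched_b:
        loss -= math.log(P_bar[dustbin_row][j])
    return loss

# Toy augmented assignment for M = 2, N = 2 (values are illustrative only).
P_bar = [[0.8, 0.1, 0.1],
         [0.1, 0.1, 0.8],
         [0.1, 0.8, 1.1]]
loss = matching_loss(P_bar, matches=[(0, 0)], unmatched_a=[1], unmatched_b=[1])
print(loss)
```

The first term rewards confident correct matches, while the two dustbin terms penalize the network for matching keypoints that should stay unmatched.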

3.4. Comparisons to Related Work

The deep middle-end matcher vs. inlier classifiers [see references 27, 56 below]: the deep middle-end matcher benefits from a strong inductive bias by being entirely permutation equivariant with respect to both images and local features. It additionally embeds the commonly-used mutual check constraint directly into the training: any match with probability P_i,j greater than 0.5 is necessarily mutually consistent.
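The mutual-consistency property can be checked directly from Equation 1. A short derivation, assuming only that the entries of P are non-negative and that its rows and columns each sum to at most 1:

$P_{i, j} > 0.5 \ \text{and} \ \sum_{k} P_{i, k} \leq 1 \ \Rightarrow \ \sum_{k \neq j} P_{i, k} < 0.5 \ \Rightarrow \ P_{i, k} < 0.5 < P_{i, j} \quad \forall k \neq j,$

$P_{i, j} > 0.5 \ \text{and} \ \sum_{k} P_{k, j} \leq 1 \ \Rightarrow \ \sum_{k \neq i} P_{k, j} < 0.5 \ \Rightarrow \ P_{k, j} < 0.5 < P_{i, j} \quad \forall k \neq i .$

Hence P_i,j is simultaneously the largest entry of row i and of column j, which is exactly the condition tested by the mutual check.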

The deep middle-end matcher vs. Instance Normalization [see reference 47 below]: Attention, as used by the deep middle-end matcher, is a more flexible and powerful context aggregation mechanism than instance normalization, which treats all keypoints equally and is used by previous work on feature matching [see references 27, 56, 26 below].

The deep middle-end matcher vs. ContextDesc [see reference 26 below]: the deep middle-end matcher can jointly reason about appearance and position while ContextDesc processes them separately. Additionally, ContextDesc is a front-end that requires a larger regional extractor and a loss for keypoint scoring. The deep middle-end matcher only needs local features, learned or handcrafted, and can thus be a simple drop-in replacement of existing matchers.

The deep middle-end matcher vs. Transformer [see reference 48 below]: the deep middle-end matcher borrows the self-attention from the Transformer, but embeds it into a graph neural network, and additionally introduces the cross-attention, which is symmetric. This simplifies the architecture and results in better feature reuse across layers.

4. Implementation Details

The deep middle-end matcher can be combined with any local feature detector and descriptor but works particularly well with SuperPoint [see reference 14 below], which produces repeatable and sparse keypoints, enabling very efficient matching. Visual descriptors are bilinearly sampled from the semi-dense feature map, which is differentiable. Both local feature extraction and subsequent “gluing” are directly performed on the GPU. At test time, in order to extract matches from the soft assignment, one can use a confidence threshold to retain some, or simply use all of them and their confidence in a subsequent step, such as weighted pose estimation.
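One possible way to extract discrete matches from the soft assignment at test time is sketched below. The function name `extract_matches`, the default threshold of 0.5, and the explicit mutual argmax check are assumptions of this illustration. The text above only states that a confidence threshold may be used.

```python
def extract_matches(P, threshold=0.5):
    """Keep (i, j, confidence) where P[i][j] is the maximum of its row and of its
    column and exceeds the threshold. P is the M x N assignment without dustbins."""
    matches = []
    if not P or not P[0]:
        return matches
    n_rows, n_cols = len(P), len(P[0])
    for i in range(n_rows):
        j = max(range(n_cols), key=lambda c: P[i][c])        # best column for row i
        i_back = max(range(n_rows), key=lambda r: P[r][j])   # best row for column j
        if i_back == i and P[i][j] > threshold:
            matches.append((i, j, P[i][j]))
    return matches

# Illustrative soft assignment for M = 3, N = 2.
P = [[0.90, 0.05],
     [0.05, 0.30],
     [0.00, 0.60]]
print(extract_matches(P))
```

The returned confidences can be passed on unchanged to a later stage such as weighted pose estimation.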

Architecture details: All intermediate representations (key, query, value, descriptors) have the same dimension D=256 as the SuperPoint descriptors. We use L=9 layers of alternating multi-head self- and cross-attention with 4 heads each, and perform T=100 Sinkhorn iterations in log-space for numerical stability. The model is implemented in PyTorch [see reference 30 below] and runs in real-time on a GPU: a forward pass takes on average 150 ms (7 FPS).
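The log-space Sinkhorn iterations mentioned above can be sketched in NumPy as follows. This is an illustrative sketch, not the PyTorch implementation: it normalizes a score matrix between two given marginals and leaves out any augmentation of the score matrix; the names `logsumexp` and `log_sinkhorn` and the toy inputs are assumptions.

```python
import numpy as np


def logsumexp(x, axis):
    # Stable log(sum(exp(x))) along one axis.
    m = x.max(axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.exp(x - m).sum(axis=axis))


def log_sinkhorn(scores, log_mu, log_nu, iterations=100):
    """Alternately normalize rows and columns of exp(scores) in log-space."""
    u = np.zeros_like(log_mu)
    v = np.zeros_like(log_nu)
    for _ in range(iterations):
        u = log_mu - logsumexp(scores + v[None, :], axis=1)  # match row marginals
        v = log_nu - logsumexp(scores + u[:, None], axis=0)  # match column marginals
    return scores + u[:, None] + v[None, :]


rng = np.random.default_rng(0)
scores = rng.uniform(-1.0, 1.0, size=(4, 6))  # toy M x N score matrix
log_mu = np.full(4, -np.log(4))               # uniform row marginals (assumption)
log_nu = np.full(6, -np.log(6))               # uniform column marginals (assumption)
assignment = np.exp(log_sinkhorn(scores, log_mu, log_nu))
```

Working with sums of logarithms instead of products of exponentials avoids overflow and underflow when scores are large in magnitude, which is the numerical-stability motivation stated above.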

Training details: To allow for data augmentation, SuperPoint detect and describe steps are performed on-the-fly as batches during training. A number of random keypoints are further added for efficient batching and increased robustness. More details are provided in Appendix A.

5. Experiments
5.1. Homography Estimation

We perform a large-scale homography estimation experiment using real images and synthetic homographies with both robust (RANSAC) and non-robust (DLT) estimators.

Dataset: We generate image pairs by sampling random homographies and applying random photometric distortions to real images, following a recipe similar to [see references 12, 14, 37, 36 below]. The underlying images come from the set of 1M distractor images in the Oxford and Paris dataset [see reference 34 below], split into training, validation, and test sets.

Baselines: We compare the deep middle-end matcher against several matchers applied to SuperPoint local features—the Nearest Neighbor (NN) matcher and various outlier rejectors: the mutual check (or cross-check), PointCN [see reference 27 below], and Order-Aware Network (OANet) [see reference 56 below]. All learned methods, including the deep middle-end matcher, are trained on ground-truth correspondences, found by projecting keypoints from one image to the other. We generate homographies and photometric distortions on-the-fly—an image pair is never seen twice during training.

Metrics: Match precision (P) and recall (R) are computed from the ground truth correspondences. Homography estimation is performed with both RANSAC and the Direct Linear Transformation [see reference 19 below] (DLT), which has a direct least-squares solution. We compute the mean reprojection error of the four corners of the image and report the area under the cumulative error curve (AUC) up to a value of 10 pixels.
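To make the non-robust estimator and the corner-error metric concrete, here is a minimal NumPy sketch of a least-squares DLT homography and the mean reprojection error of the four image corners. It is an illustration under assumptions: coordinate normalization is omitted, the corner convention (0 to width, 0 to height) is a choice made here, and the names `dlt_homography`, `warp`, and `mean_corner_error` are hypothetical.

```python
import numpy as np


def dlt_homography(src, dst):
    """Least-squares homography mapping src points to dst points (no outlier handling)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def warp(h, points):
    # Apply a homography to an array of 2D points.
    homogeneous = np.hstack([points, np.ones((len(points), 1))]) @ h.T
    return homogeneous[:, :2] / homogeneous[:, 2:3]


def mean_corner_error(h_est, h_gt, width, height):
    corners = np.array([[0, 0], [width, 0], [width, height], [0, height]], dtype=float)
    return np.linalg.norm(warp(h_est, corners) - warp(h_gt, corners), axis=1).mean()


# Toy check with a made-up ground-truth homography and noise-free correspondences.
h_gt = np.array([[1.05, 0.02, 12.0], [-0.03, 0.97, -8.0], [1e-4, -5e-5, 1.0]])
rng = np.random.default_rng(0)
src = rng.uniform([0, 0], [640, 480], size=(50, 2))
dst = warp(h_gt, src)
h_est = dlt_homography(src, dst)
error = mean_corner_error(h_est, h_gt, 640, 480)
```

With exact correspondences the recovered homography matches the ground truth up to numerical error; a single gross outlier in `dst` would bias this estimate, which is why DLT is only competitive when the matches are already clean.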

Results: The deep middle-end matcher is sufficiently expressive to master homographies, achieving 98% recall and high precision. Table 1 shows homography estimation results for the deep middle-end matcher and the baseline matchers, using both DLT and RANSAC. The deep middle-end matcher recovers almost all possible matches while suppressing most outliers. Because its correspondences are of high quality, the Direct Linear Transform (DLT), a least-squares solution with no robustness mechanism, outperforms RANSAC; a robust estimator is not required. Outlier rejection methods like PointCN and OANet cannot predict more correct matches than the NN matcher itself, because they rely heavily on the initial descriptors.

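TABLE 1

Homography estimation (DMEM denotes the deep middle-end matcher)

| Local features | Matcher | AUC (RANSAC) | AUC (DLT) | P | R |
|---|---|---|---|---|---|
| SuperPoint | NN | 39.47 | 0.00 | 21.7 | 65.4 |
| SuperPoint | NN + mutual | 42.45 | 0.24 | 43.8 | 56.5 |
| SuperPoint | NN + PointCN | 43.02 | 45.40 | 76.2 | 64.2 |
| SuperPoint | NN + OANet | 44.31 | 49.85 | 80.8 | 64.5 |
| SuperPoint | DMEM | 53.67 | 65.85 | 90.7 | 98.3 |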

5.2. Indoor Pose Estimation

Indoor image matching is very challenging due to the lack of texture, the abundance of self-similarities, the complex 3D geometry of scenes, and large viewpoint changes. As we show in the following, the deep middle-end matcher can effectively learn priors to overcome these challenges.

Dataset: We use ScanNet [see reference 10 below], a large-scale indoor dataset composed of monocular sequences with ground truth poses and depth images as well as well-defined training, validation, and test splits corresponding to different scenes. Past works select training and evaluation pairs based on time difference [see references 29, 13 below] or SfM co-visibility [see references 27, 56, 6 below], usually computed using SIFT. We argue that this limits the difficulty of the pairs, and instead select these based on an overlap score computed for all possible image pairs in a given sequence using only ground truth poses and depth. This results in significantly wider-baseline pairs, which corresponds to the current frontier for real-world indoor image matching. Discarding pairs with too small or too large overlap, we obtain 230M training pairs and sample 1500 test pairs. More details are provided in Appendix A.

Metrics: As in previous work [see references 27, 56, 6 below], we report the AUC of the pose error at the thresholds (5°, 10°, 20°), where the pose error is the maximum of the angular errors in rotation and translation. Relative pose is obtained from essential matrix estimation with RANSAC. We also report the match precision and the matching score [see references 14, 54 below], where a match is deemed correct based on its epipolar distance.
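The pose error defined above, the maximum of the angular errors in rotation and translation, can be sketched as follows. This NumPy example is an illustration only: the function names are hypothetical, and it does not handle the sign ambiguity of a translation direction recovered from an essential matrix.

```python
import numpy as np


def angle_deg(cosine):
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))


def rotation_error_deg(r_est, r_gt):
    # Angle of the relative rotation between estimate and ground truth.
    return angle_deg((np.trace(r_est.T @ r_gt) - 1.0) / 2.0)


def translation_error_deg(t_est, t_gt):
    # Angle between translation directions (scale is not observable).
    cosine = np.dot(t_est, t_gt) / (np.linalg.norm(t_est) * np.linalg.norm(t_gt))
    return angle_deg(cosine)


def pose_error_deg(r_est, t_est, r_gt, t_gt):
    return max(rotation_error_deg(r_est, r_gt), translation_error_deg(t_est, t_gt))


def rot_z(degrees):
    # Helper for the toy example: rotation about the z axis.
    a = np.radians(degrees)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a), np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])


t = np.array([1.0, 0.0, 0.0])
error_rot = pose_error_deg(rot_z(10.0), t, np.eye(3), t)
error_trans = pose_error_deg(np.eye(3), np.array([0.0, 1.0, 0.0]), np.eye(3), t)
```

Taking the maximum means a pair counts as accurate at a threshold only when both the rotation and the translation direction are within that threshold.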

Baselines: We evaluate the deep middle-end matcher and various baseline matchers using both root-normalized SIFT [see references 25, 2 below] and SuperPoint [see reference 14 below] features. The deep middle-end matcher is trained with correspondences and unmatched keypoints derived from ground-truth poses and depth. All baselines are based on the Nearest Neighbor (NN) matcher and potentially an outlier rejection method. In the “Handcrafted” category, we consider the simple cross-check (mutual), ratio test [see reference 25 below], descriptor distance threshold, and the more complex GMS [see reference 5 below]. Methods in the “Learned” category are PointCN [see reference 27 below], and its follow-ups OANet [see reference 56 below] and NG-RANSAC [see reference 6 below]. We retrain PointCN and OANet on ScanNet for both SuperPoint and SIFT with the classification loss using the above-defined correctness criterion and their respective regression losses. For NG-RANSAC, we use the original trained model. We do not include any graph matching methods as they are orders of magnitude too slow for the number of keypoints that we consider. We report other local features as reference: ORB [see reference 39 below] with GMS, D2-Net [see reference 15 below], and ContextDesc [see reference 26 below] using the publicly available trained models.

Results: The deep middle-end matcher enables significantly higher pose accuracy compared to both handcrafted and learned matchers. Table 2 shows wide-baseline indoor pose estimation on ScanNet. We report the AUC of the pose error, the matching score (MS), and the precision (P), all in percent. The deep middle-end matcher outperforms all handcrafted and learned matchers when applied to both SIFT and SuperPoint, and the gains are substantial in both cases. FIG. 5 shows indoor and outdoor pose estimation. The deep middle-end matcher significantly improves the pose accuracy over OANet, a state-of-the-art outlier rejection neural network. It has a significantly higher precision than other learned matchers, demonstrating its higher representation power. It also produces a larger number of correct matches, up to 10 times more than the ratio test when applied to SIFT, because it operates on the full set of possible matches rather than the limited set of nearest neighbors. SuperPoint and the deep middle-end matcher together achieve state-of-the-art results on indoor pose estimation. They complement each other well, since repeatable keypoints make it possible to estimate a larger number of correct matches even in very challenging situations (see FIG. 2).

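TABLE 2

Pose estimation

| Local features | Matcher | AUC @5° | AUC @10° | AUC @20° | P | MS |
|---|---|---|---|---|---|---|
| ORB | NN + GMS | 5.21 | 13.65 | 25.36 | 72.0 | 5.7 |
| D2-Net | NN + mutual | 5.25 | 14.53 | 27.96 | 46.7 | 12.0 |
| ContextDesc | NN + ratio test | 6.64 | 15.01 | 25.75 | 51.2 | 9.2 |
| SIFT | NN + ratio test | 5.83 | 13.06 | 22.47 | 40.3 | 1.0 |
| SIFT | NN + OANet | 5.77 | 13.17 | 23.93 | 38.0 | 4.3 |
| SIFT | NN + NG-RANSAC | 6.19 | 13.80 | 23.73 | 61.9 | 0.7 |
| SIFT | DMEM | 6.71 | 15.70 | 28.67 | 74.2 | 9.8 |
| SuperPoint | NN + mutual | 9.43 | 21.53 | 36.40 | 50.4 | 18.8 |
| SuperPoint | NN + distance + mutual | 9.82 | 22.42 | 36.83 | 63.9 | 14.6 |
| SuperPoint | NN + GMS | 8.39 | 18.96 | 31.56 | 50.3 | 19.0 |
| SuperPoint | NN + PointCN | 11.40 | 25.47 | 41.41 | 71.8 | 25.5 |
| SuperPoint | NN + OANet | 11.76 | 26.90 | 43.85 | 74.0 | 25.7 |
| SuperPoint | DMEM | 16.16 | 33.81 | 51.84 | 84.4 | 31.5 |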

FIG. 6 shows qualitative image matches. We compare the deep middle-end matcher to the Nearest Neighbor (NN) matcher with two outlier rejectors, handcrafted and learned, in three environments. The deep middle-end matcher consistently estimates more correct matches (green lines) and fewer mismatches (red lines), coping with repeated texture, large viewpoint, and illumination changes.

5.3. Outdoor Pose Estimation

As outdoor image sequences present their own set of challenges (e.g., lighting changes and occlusion), we train and evaluate the deep middle-end matcher for pose estimation in an outdoor setting. We use the same evaluation metrics and baseline methods as in the indoor pose estimation task.

Dataset: We evaluate on the PhotoTourism dataset, which is part of the CVPR'19 Image Matching Challenge [see reference 1 below]. It is a subset of the YFCC100M dataset [see reference 44 below] and has ground truth poses and sparse 3D models obtained from an off-the-shelf SfM tool [see references 29, 41, 42 below]. For training, we use the MegaDepth dataset [see reference 23 below], which also has clean depth maps computed with multi-view stereo. Scenes that are in the PhotoTourism test set are removed from the training set.

Results: Table 3 shows outdoor pose estimation on the PhotoTourism dataset. Matching SuperPoint and SIFT features with the deep middle-end matcher results in significantly higher pose accuracy (AUC), precision (P), and matching score (MS) than with handcrafted or other learned methods. The deep middle-end matcher outperforms all baselines, at all relative pose thresholds, when applied to both SuperPoint and SIFT. Most notably, the precision of the resulting matching is very high (84.9%), reinforcing the analogy that the deep middle-end matcher “glues” together local features.

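TABLE 3

Pose estimation

| Local features | Matcher | AUC @5° | AUC @10° | AUC @20° | P | MS |
|---|---|---|---|---|---|---|
| ContextDesc | NN + ratio test | 20.16 | 31.65 | 44.05 | 56.2 | 3.3 |
| SIFT | NN + ratio test | 15.19 | 24.72 | 35.30 | 43.4 | 1.7 |
| SIFT | NN + NG-RANSAC | 15.61 | 25.28 | 35.87 | 64.4 | 1.9 |
| SIFT | NN + OANet | 17.87 | 27.85 | 39.43 | 53.2 | 3.5 |
| SIFT | DMEM | 23.03 | 36.51 | 50.07 | 74.0 | 7.3 |
| SuperPoint | NN + mutual | 9.80 | 18.99 | 30.88 | 22.5 | 4.9 |
| SuperPoint | NN + GMS | 13.96 | 24.58 | 36.53 | 47.1 | 4.7 |
| SuperPoint | NN + OANet | 21.03 | 34.08 | 46.88 | 52.4 | 8.4 |
| SuperPoint | DMEM | 34.18 | 50.32 | 64.16 | 84.9 | 11.1 |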

5.4. Understanding the Deep Middle-End Matcher

Ablation Study: To evaluate our design decisions, we repeated the indoor ScanNet experiments, this time focusing on different variants of the deep middle-end matcher. Table 4 shows the ablation of the deep middle-end matcher on ScanNet with SuperPoint local features; differences with respect to the full model are shown. The ablation shows that all the deep middle-end matcher blocks are useful and bring substantial performance gains. While the optimal matching layer alone improves over the baseline Nearest Neighbor matcher, the GNN explains the majority of the gains brought by the deep middle-end matcher. Both cross-attention and positional encoding are critical for strong gluing, and a deeper network further improves precision.

TABLE 4

| Matcher | Variant | Pose AUC@20° | Match precision | Matching score |
|---|---|---|---|---|
| NN + mutual | | 36.40 | 59.4 | 18.8 |
| DMEM | No Graph Neural Net | 38.56 | 66.0 | 17.2 |
| DMEM | No cross-attention | 42.57 | 74.0 | 25.3 |
| DMEM | No positional encoding | 47.12 | 75.8 | 26.6 |
| DMEM | Smaller (3 layers) | 46.93 | 79.9 | 30.0 |
| DMEM | Full (9 layers) | 51.84 | 84.4 | 31.5 |

Visualizing Attention: An understanding of the proposed technique would not be complete without an attempt to visualize the deep middle-end matcher's attention patterns throughout matching. FIG. 7 visualizes the self- and cross-attention masks $\alpha_{ij}$ at various layers and heads. The extensive diversity of these patterns reflects the complexity of the learned behavior: the deep middle-end matcher learns a diversity of patterns and can focus on global or local context, self-similarities, distinctive features, and match candidates.

6. Conclusion

In this disclosure, we described what we refer to as “the deep middle-end matcher”, an attentional graph neural network for local feature matching inspired by the Transformer's success in NLP. We believe that the data association component of the 3D reconstruction pipeline has not received adequate attention from the research community, and powerful learning-based middle-ends are our solution. The deep middle-end matcher boosts the receptive field of local features and downplays features whose correspondences are missing, effectively performing the roles of both ContextDesc and inlier classification. Importantly, the inner workings of the deep middle-end matcher are learned entirely from real-world data. Our results on 2D-to-2D feature matching show significant improvement over the existing state-of-the-art.

Our description herein makes a strong case for the use of learnable middle-ends in the feature matching pipeline as a modern, deep learning-based, alternative to hand-designed heuristics. Some of our future work will focus on evaluating the deep middle-end matcher inside a complete 3D reconstruction pipeline.

Various example embodiments of the invention are described herein. Reference is made to these examples in a non-limiting sense. They are provided to illustrate more broadly applicable aspects of the invention. Various changes may be made to the invention described and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, process, process act(s) or step(s) to the objective(s), spirit or scope of the present invention. Further, as will be appreciated by those with skill in the art, each of the individual variations described and illustrated herein has discrete components and features which may be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present inventions. All such modifications are intended to be within the scope of claims associated with this disclosure.

The invention includes methods that may be performed using the subject devices. The methods may comprise the act of providing such a suitable device. Such provision may be performed by the end user. In other words, the “providing” act merely requires the end user obtain, access, approach, position, set-up, activate, power-up or otherwise act to provide the requisite device in the subject method. Methods recited herein may be carried out in any order of the recited events which is logically possible, as well as in the recited order of events.

Example aspects of the invention, together with details regarding material selection and manufacture have been set forth above. As for other details of the present invention, these may be appreciated in connection with the above-referenced patents and publications as well as generally known or appreciated by those with skill in the art. The same may hold true with respect to method-based aspects of the invention in terms of additional acts as commonly or logically employed.

In addition, though the invention has been described in reference to several examples optionally incorporating various features, the invention is not to be limited to that which is described or indicated as contemplated with respect to each variation of the invention. Various changes may be made to the invention described and equivalents (whether recited herein or not included for the sake of some brevity) may be substituted without departing from the true spirit and scope of the invention. In addition, where a range of values is provided, it is understood that every intervening value, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the invention.

Also, it is contemplated that any optional feature of the inventive variations described may be set forth and claimed independently, or in combination with any one or more of the features described herein. Reference to a singular item includes the possibility that there are plural of the same items present. More specifically, as used herein and in claims associated hereto, the singular forms “a,” “an,” “said,” and “the” include plural referents unless specifically stated otherwise. In other words, use of the articles allows for “at least one” of the subject item in the description above as well as claims associated with this disclosure. It is further noted that such claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation.

Without the use of such exclusive terminology, the term “comprising” in claims associated with this disclosure shall allow for the inclusion of any additional element—irrespective of whether a given number of elements are enumerated in such claims, or the addition of a feature could be regarded as transforming the nature of an element set forth in such claims. Except as specifically defined herein, all technical and scientific terms used herein are to be given as broad a commonly understood meaning as possible while maintaining claim validity.

The breadth of the present invention is not to be limited to the examples provided and/or the subject specification, but rather only by the scope of claim language associated with this disclosure.

7. Appendix A—Further Experimental Details

Homography Estimation:

The test set contains 1024 pairs of 640×480 images. Homographies are generated by applying random perspective, scaling, rotation, and translation to the original full-sized images, to avoid bordering artifacts. We evaluate with the 512 top-scoring keypoints detected by SuperPoint with a Non-Maximum Suppression (NMS) radius of 4 pixels. Correspondences are deemed correct if they have a reprojection error lower than 3 pixels. When estimating the homography with RANSAC, we use the OpenCV function findHomography with 3000 iterations and an inlier threshold of 3 pixels.

Indoor Pose Estimation:

The overlap score between two images A and B is the average ratio of pixels in A that are visible in B (and vice versa), after accounting for missing depth values and occlusion (by checking for consistency in the depth using relative errors). We train and evaluate with an overlap range of 0.4 to 0.8. For training, we sample 200 pairs per scene at each epoch, similarly to D2-Net [see reference 15 below]. The test set is generated by subsampling the sequences by 15 and subsequently sampling 15 pairs for each of the 300 sequences. We resize all ScanNet images and depth maps to VGA (640×480). We detect up to 1024 SuperPoint keypoints (using the publicly available trained model with an NMS radius of 4) and 2048 SIFT keypoints (using OpenCV's implementation). When computing the precision and matching score, we use an epipolar threshold of $5 \times 10^{-4}$. Poses are computed by first estimating the essential matrix with OpenCV's findEssentialMat and RANSAC with an inlier threshold of 1 pixel divided by the average focal length, followed by recoverPose. In contrast with previous works [see references 27, 56, 6 below], we compute a more accurate AUC using explicit integration rather than coarse histograms.
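One way to compute the AUC by explicit integration instead of histogram binning is sketched below. It is a hedged illustration: the function name `error_auc` is hypothetical, and the sketch assumes the cumulative curve is linearly interpolated between sorted errors, starting from zero recall at zero error.

```python
import numpy as np


def error_auc(errors, thresholds):
    """Area under the cumulative error curve, normalized by each threshold."""
    # Prepend the origin so the curve starts at (error=0, recall=0).
    errors = np.concatenate([[0.0], np.sort(np.asarray(errors, dtype=float))])
    recall = np.arange(len(errors)) / (len(errors) - 1)
    aucs = []
    for t in thresholds:
        last = np.searchsorted(errors, t)  # number of curve points below the threshold
        e = np.concatenate([errors[:last], [t]])
        r = np.concatenate([recall[:last], [recall[last - 1]]])
        # Trapezoidal integration of recall over error, up to the threshold.
        area = np.sum((e[1:] - e[:-1]) * (r[1:] + r[:-1]) / 2.0)
        aucs.append(area / t)
    return aucs


# Made-up pose errors in degrees, evaluated at the thresholds used in the text.
pose_errors = [1.0, 3.0, 8.0, 25.0]
auc_5, auc_10, auc_20 = error_auc(pose_errors, [5.0, 10.0, 20.0])
```

Integrating over the sorted errors uses every error value exactly, whereas a histogram rounds each error to the edge of its bin.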

Outdoor Pose Estimation:

For training on MegaDepth, the overlap score is the ratio of triangulated keypoints that are visible in the two images, as in D2-Net [see reference 15 below]. We sample pairs with an overlap score in [0.1, 0.7] at each epoch. For the evaluation on the PhotoTourism dataset, we use all 11 scenes and the overlap score computed by Ono et al. [see reference 29 below], with a selection range of [0.1, 0.4]. Images are resized so that their longest edge is smaller than 1600 pixels. We detect 2048 keypoints for both SIFT and SuperPoint (with an NMS radius of 3). Other evaluation parameters are identical to the ones used in the indoor evaluation.

Training of the Deep Middle-End Matcher:

For training on homography/indoor/outdoor data, we use the Adam optimizer with an initial constant learning rate of $10^{-4}$ for the first 200k/100k/50k iterations, followed by an exponential decay of 0.999998/0.999992/0.999992 until iteration 900k. When using SuperPoint features, we employ batches with 32/64/16 image pairs and a fixed number of 512/400/1024 keypoints per image. When using SIFT features, we use 1024 keypoints and 24 pairs. Because of the limited number of training scenes, the outdoor model is initialized with the homography model. Prior to the keypoint encoding, keypoints are normalized by the largest edge of the image.
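The learning-rate schedule described above can be written as a small function. This sketch makes two assumptions that the text does not spell out: the decay factor is applied once per iteration after the constant phase, and the default base rate of 1e-4 is our reading of the stated initial rate. The function name and defaults (homography setting) are hypothetical.

```python
def learning_rate(iteration, base_lr=1e-4, constant_iters=200_000,
                  decay=0.999998, last_iter=900_000):
    """Constant rate, then per-iteration exponential decay until last_iter."""
    # Number of decay steps taken so far (zero during the constant phase).
    steps = min(iteration, last_iter) - constant_iters
    return base_lr * decay ** max(steps, 0)


# Homography setting from the text; indoor would use constant_iters=100_000, decay=0.999992.
lr_start = learning_rate(0)
lr_end = learning_rate(900_000)
```

Under these assumptions the rate at iteration 900k is roughly a quarter of the initial rate for the homography setting.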

Ground truth correspondences M and unmatched sets I and J are generated by first computing the M×N reprojection matrix between all detected keypoints using the ground truth homography or pose and depth map. Correspondences are cells that have a reprojection error that is a minimum along both rows and columns, and that is lower than a given threshold: 3, 5, and 3 pixels for homographies, indoor, and outdoor matching respectively. For homographies, unmatched keypoints are simply the ones that do not appear in M. For indoor and outdoor matching, because of errors in the depth and the pose, unmatched keypoints must additionally have a minimum reprojection error larger than 15 and 5 pixels, respectively. This allows us to ignore labels for keypoints whose correspondences are ambiguous, while still providing some supervision through the Sinkhorn normalization.
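The labeling rule for indoor and outdoor matching can be sketched in NumPy as follows. This is an illustration under assumptions: the function name `ground_truth_labels` and the toy error matrix are hypothetical, and the reprojection errors are taken as given rather than computed from poses and depth.

```python
import numpy as np


def ground_truth_labels(reproj_error, match_threshold, unmatched_threshold):
    """Split keypoints into matches, unmatched keypoints, and ignored (ambiguous) ones."""
    rows = np.arange(reproj_error.shape[0])
    nearest_in_b = reproj_error.argmin(axis=1)  # minimum along each row
    nearest_in_a = reproj_error.argmin(axis=0)  # minimum along each column
    mutual = nearest_in_a[nearest_in_b] == rows
    close = reproj_error[rows, nearest_in_b] < match_threshold
    keep = mutual & close
    matches = np.stack([rows[keep], nearest_in_b[keep]], axis=1)
    # Unmatched: not in a match and far from every keypoint of the other image.
    far_a = np.where(reproj_error.min(axis=1) > unmatched_threshold)[0]
    far_b = np.where(reproj_error.min(axis=0) > unmatched_threshold)[0]
    unmatched_a = np.setdiff1d(far_a, matches[:, 0])
    unmatched_b = np.setdiff1d(far_b, matches[:, 1])
    return matches, unmatched_a, unmatched_b


# Toy 4 x 3 reprojection errors in pixels (made-up values), indoor thresholds 5 and 15.
errors = np.array([[1.0, 20.0, 30.0],
                   [25.0, 4.0, 40.0],
                   [50.0, 60.0, 10.0],
                   [100.0, 90.0, 80.0]])
matches, unmatched_a, unmatched_b = ground_truth_labels(errors, 5.0, 15.0)
```

In the toy example, keypoint 2 of the first image has a best error of 10 pixels: too large to be a match and too small to be confidently unmatched, so it receives no label.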

8. References

Each of the following references is incorporated by reference herein in its entirety, and referenced in the above description:

[1] Phototourism Challenge, CVPR 2019 Image Matching Workshop. https://image-matching-workshop.github.io. Accessed Nov. 8, 2019.
[2] Relja Arandjelović and Andrew Zisserman. Three things everyone should know to improve object retrieval. In CVPR, 2012.
[3] Peter W Battaglia, Jessica B Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, et al. Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261, 2018.
[4] Alexander C Berg, Tamara L Berg, and Jitendra Malik. Shape matching and object recognition using low distortion correspondences. In CVPR, 2005.
[5] JiaWang Bian, Wen-Yan Lin, Yasuyuki Matsushita, Sai-Kit Yeung, Tan-Dat Nguyen, and Ming-Ming Cheng. GMS: Grid-based motion statistics for fast, ultra-robust feature correspondence. In CVPR, 2017.
[6] Eric Brachmann and Carsten Rother. Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses. In ICCV, 2019.
[7] Tibério S Caetano, Julian J McAuley, Li Cheng, Quoc V Le, and Alex J Smola. Learning graph matching. IEEE TPAMI, 31(6):1048-1058, 2009.
[8] Jan Cech, Jiri Matas, and Michal Perdoch. Efficient sequential correspondence selection by cosegmentation. IEEE TPAMI, 32(9):1568-1581, 2010.
[9] Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In NIPS, 2013.
[10] Angela Dai, Angel X Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. Scannet: Richly-annotated 3d reconstructions of indoor scenes. In CVPR, 2017.
[11] Haowen Deng, Tolga Birdal, and Slobodan Ilic. Ppfnet: Global context aware local features for robust 3d point matching. In CVPR, 2018.
[12] Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabinovich. Deep image homography estimation. In RSS Workshop: Limits and Potentials of Deep Learning in Robotics, 2016.
[13] Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabinovich. Self-improving visual odometry. arXiv:1812.03245, 2018.
[14] Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabinovich. SuperPoint: Self-supervised interest point detection and description. In CVPR Workshop on Deep Learning for Visual SLAM, 2018.
[15] Mihai Dusmanu, Ignacio Rocco, Tomas Pajdla, Marc Pollefeys, Josef Sivic, Akihiko Torii, and Torsten Sattler. D2-net: A trainable cnn for joint detection and description of local features. In CVPR, 2019.
[16] Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, and Eduard Trulls. Beyond cartesian representations for local descriptors. In ICCV, 2019.
[17] Martin A Fischler and Robert C Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381-395, 1981.
[18] Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. Neural message passing for quantum chemistry. In ICML, 2017.
[19] Richard Hartley and Andrew Zisserman. Multiple view geometry in computer vision. Cambridge university press, 2003.
[20] Juho Lee, Yoonho Lee, Jungtaek Kim, Adam Kosiorek, Seungjin Choi, and Yee Whye Teh. Set transformer: A framework for attention-based permutation-invariant neural networks. In ICML, 2019.
[21] Marius Leordeanu and Martial Hebert. A spectral technique for correspondence problems using pairwise constraints. In ICCV, 2005.
[22] Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, and Pushmeet Kohli. Graph matching networks for learning the similarity of graph structured objects. In ICML, 2019.
[23] Zhengqi Li and Noah Snavely. Megadepth: Learning single-view depth prediction from internet photos. In CVPR, 2018.
[24] Eliane Maria Loiola, Nair Maria Maia de Abreu, Paulo Oswaldo Boaventura-Netto, Peter Hahn, and Tania Querido. A survey for the quadratic assignment problem. European journal of operational research, 176(2):657-690, 2007.
[25] David G Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110, 2004.
[26] Zixin Luo, Tianwei Shen, Lei Zhou, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, and Long Quan. Contextdesc: Local descriptor augmentation with cross-modality context. In CVPR, 2019.
[27] Kwang Moo Yi, Eduard Trulls, Yuki Ono, Vincent Lepetit, Mathieu Salzmann, and Pascal Fua. Learning to find good correspondences. In CVPR, 2018.
[28] James Munkres. Algorithms for the assignment and transportation problems. Journal of the society for industrial and applied mathematics, 5(1):32-38, 1957.
[29] Yuki Ono, Eduard Trulls, Pascal Fua, and Kwang Moo Yi. LF-Net: Learning local features from images. In NeurIPS, 2018.
[30] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in pytorch. In NIPS Workshops, 2017.
[31] Gabriel Peyré, Marco Cuturi, et al. Computational optimal transport. Foundations and Trends in Machine Learning, 11(5-6):355-607, 2019.
[32] Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. In CVPR, 2017.
[33] Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J Guibas. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. In NIPS, 2017.
[34] Filip Radenovic, Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, and Ondrej Chum. Revisiting oxford and paris: Large-scale image retrieval benchmarking. In CVPR, 2018.
[35] Rahul Raguram, Jan-Michael Frahm, and Marc Pollefeys. A comparative analysis of ransac techniques leading to adaptive real-time random sample consensus. In ECCV, 2008.
[36] René Ranftl and Vladlen Koltun. Deep fundamental matrix estimation. In ECCV, 2018.
[37] Jerome Revaud, Philippe Weinzaepfel, César De Souza, Noe Pion, Gabriela Csurka, Yohann Cabon, and Martin Humenberger. R2D2: Repeatable and reliable detector and descriptor. In NeurIPS, 2019.
[38] Ignacio Rocco, Mircea Cimpoi, Relja Arandjelovic, Akihiko Torii, Tomas Pajdla, and Josef Sivic. Neighbourhood consensus networks. In NeurIPS, 2018.
[39] Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary R Bradski. Orb: An efficient alternative to sift or surf. In ICCV, 2011.
[40] Torsten Sattler, Bastian Leibe, and Leif Kobbelt. Scramsac: Improving ransac's efficiency with a spatial consistency filter. In ICCV, 2009.
[41] Johannes Lutz Schonberger and Jan-Michael Frahm. Structure-from-motion revisited. In CVPR, 2016.
[42] Johannes Lutz Schonberger, Enliang Zheng, Marc Pollefeys, and Jan-Michael Frahm. Pixelwise view selection for unstructured multi-view stereo. In ECCV, 2016.
[43] Richard Sinkhorn and Paul Knopp. Concerning nonnegative matrices and doubly stochastic matrices. Pacific Journal of Mathematics, 1967.
[44] Bart Thomee, David A Shamma, Gerald Friedland, Benjamin Elizalde, Karl Ni, Douglas Poland, Damian Borth, and Li-Jia Li. Yfcc100m: The new data in multimedia research. Communications of the ACM, 59(2):64-73, 2016.
[45] Lorenzo Torresani, Vladimir Kolmogorov, and Carsten Rother. Feature correspondence via graph matching: Models and global optimization. In ECCV, 2008.
[46] Tinne Tuytelaars and Luc J Van Gool. Wide baseline stereo matching based on local, affinely invariant regions. In BMVC, 2000.
[47] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Instance normalization: The missing ingredient for fast stylization. arXiv:1607.08022, 2016.
[48] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NIPS, 2017.
[49] Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. In ICLR, 2018.
[50] Cédric Villani. Optimal transport: old and new, volume 338. Springer Science & Business Media, 2008.
[51] Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. Non-local neural networks. In CVPR, 2018.
[52] Yue Wang and Justin M Solomon. Deep Closest Point: Learning representations for point cloud registration. In ICCV, 2019.
[53] Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, and Justin M. Solomon. Dynamic Graph CNN for learning on point clouds. ACM Transactions on Graphics, 2019.
[54] Kwang Moo Yi, Eduard Trulls, Vincent Lepetit, and Pascal Fua. Lift: Learned invariant feature transform. In ECCV, 2016.
[55] Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan R Salakhutdinov, and Alexander J Smola. Deep sets. In NIPS, 2017.
[56] Jiahui Zhang, Dawei Sun, Zixin Luo, Anbang Yao, Lei Zhou, Tianwei Shen, Yurong Chen, Long Quan, and Hongen Liao. Learning two-view correspondences and geometry using order-aware network. In ICCV, 2019.
[57] Li Zhang, Xiangtai Li, Anurag Arnab, Kuiyuan Yang, Yunhai Tong, and Philip HS Torr. Dual graph convolutional network for semantic segmentation. In BMVC, 2019.

Claims

1. A computer system comprising: a computer-readable medium; a processor connected to the computer-readable medium; and a set of instructions on the computer-readable medium, including: a deep middle-end matcher architecture that includes: an attentional graph neural network having: a keypoint encoder to map keypoint positions p and their visual descriptors d into a single vector; and alternating self-attention and cross-attention layers that, based on the vector, are repeated L times to create representations f; and an optimal matching layer that creates an M by N score matrix from the representations f and finds an optimal partial assignment based on the M by N score matrix, where: p are keypoint positions, d are descriptors, L are a plurality of times, f are representations, M by N is a score matrix with a length M and a width N.
2. The computer system of claim 1, wherein, in the keypoint encoder, an initial representation ${}^{(0)}x_i$ for each keypoint combines visual appearance and location, with the respective keypoint position embedded into a high-dimensional vector with a Multilayer Perceptron as follows: ${}^{(0)}x_i = d_i + \mathrm{MLP}(p_i)$, where: i are keypoints, ${}^{(0)}x_i$ is an initial representation for the keypoints, $d_i$ are descriptors for the keypoints, $p_i$ are keypoint positions for the keypoints, and MLP is a Multilayer Perceptron for the keypoint positions.
3. The computer system of claim 2, wherein the keypoint encoder allows the attentional graph neural network to reason about appearance and position jointly.
4. The computer system of claim 1, wherein the keypoint encoder includes a multiplex graph neural network having a single complete graph with nodes that are the keypoints of two images.
5. The computer system of claim 4, wherein the graph is a multiplex graph that has two types of undirected edges, namely intra-image edges $E_{\text{self}}$ that connect keypoints i to all other keypoints within the same image and inter-image edges $E_{\text{cross}}$ that connect keypoints i to all keypoints in the other image and uses a message passing formulation to propagate information along both types of edges, such that the resulting multiplex graph neural network starts with a high-dimensional state for each node and computes at each layer an updated representation by simultaneously aggregating messages across all given edges for all nodes, where: $E_{\text{self}}$ are self edges, and $E_{\text{cross}}$ are cross edges.
6. The computer system of claim 5, wherein if ${}^{(l)}x_i^A$ is the intermediate representation for keypoint i in image A at layer l, the message $m_{E \to i}$ is the result of the aggregation from all keypoints $\{j : (i, j) \in E\}$, where $E \in \{E_{\text{self}}, E_{\text{cross}}\}$, and a residual message passing update for all i in A is: ${}^{(l+1)}x_i^A = {}^{(l)}x_i^A + \mathrm{MLP}\left(\left[{}^{(l)}x_i^A \,\|\, m_{E \to i}\right]\right)$, where: $[\cdot \,\|\, \cdot]$ denotes concatenation, i are keypoints, $E_{\text{self}}$ are self edges, $E_{\text{cross}}$ are cross edges, $m_E$ is a message, and MLP is a Multilayer Perceptron for the keypoint positions.
7. The computer system of claim 6, wherein a fixed number of layers with different parameters are chained and alternatively aggregate along the self and cross edges such that, starting from l=1, E=Eself if l is odd and E=Ecross if l is even.
8. The computer system of claim 6, wherein the alternating self- and cross-attention layers are computed with an attention mechanism that computes the message $m_{E \to i}$ and performs the aggregation, wherein the self edges are based on self-attention and the cross edges are based on cross-attention, wherein, for a representation of i, a query $q_i$ retrieves values $v_j$ of some elements based on their attributes, the keys $k_j$, and the message is computed as weighted average of the values: $m_{E \to i} = \sum_{j : (i, j) \in E} \alpha_{ij} v_j$.
9. The computer system of claim 8, wherein an attention mask $\alpha_{ij}$ is the Softmax over the key-query similarities: $\alpha_{ij} = \mathrm{Softmax}_j\left(q_i^\top k_j\right)$.
10. The computer system of claim 1, wherein final matching descriptors of the alternating self- and cross-attention layers are linear projections:
11. The computer system of claim 1, wherein the optimal matching layer expresses a pairwise score for a set as the similarity of matching descriptors:
12. The computer system of claim 11, wherein the optimal matching layer, for occlusion and visibility, suppresses occluded keypoints and augments each set of keypoints with a dustbin score so that unmatched keypoints are explicitly assigned to dustbin scores.
13. The computer system of claim 12, wherein the score S is augmented to $\bar{S}$ by appending a new row and column, the point-to-bin and bin-to-bin scores, filled with a single learnable parameter: $\bar{S}_{i,N+1} = \bar{S}_{M+1,j} = \bar{S}_{M+1,N+1} = \alpha \in \mathbb{R}$, where: N+1 and M+1 represent augmentations.
14. The computer system of claim 12, wherein the optimal matching layer finds the optimal partial assignment based on the M by N score matrix using the Sinkhorn algorithm for T iterations.
15. The computer system of claim 14, wherein after T iterations, the optimal matching layer drops the dustbin scores and recovers $P = \bar{P}_{1:M,1:N}$, where $\bar{P}\mathbf{1}_{N+1} = a$ and $P^\top \mathbf{1}_M \le \mathbf{1}_N$.
16. A computer-implemented method that includes: mapping, with a keypoint encoder of an attentional graph neural network of a deep middle-end matcher architecture, keypoint positions p and their visual descriptors d into a single vector; and executing, with alternating self- and cross-attention layers of an attentional graph neural network of the deep middle-end matcher architecture, based on the vector, for L repeated times, to create representations f; and executing an optimal matching layer, of the attentional graph neural network of the deep middle-end matcher architecture, to create an M by N score matrix from the representations f and finding an optimal partial assignment based on the M by N score matrix, where: p are keypoint positions, d are descriptors, L are a plurality of times, f are representations, M by N is a score matrix with a length M and a width N.
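
By way of illustration only, the pipeline recited in the claims above (keypoint encoding, alternating self- and cross-attention with residual updates, a score matrix, dustbin augmentation, and Sinkhorn iterations) may be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the claimed implementation. The function names (`mlp`, `attention_message`, `log_sinkhorn`, `match`, `random_params`), the two-layer ReLU perceptron, the random weights, the inner-product similarity, and the dustbin marginals (each keypoint carries mass 1, each dustbin carries the size of the other set) are all hypothetical choices made for the example. For brevity, self-attention here also lets a keypoint attend to itself.

```python
import numpy as np

def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with ReLU; a stand-in for the MLP in the claims."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2

def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    mx = x.max(axis=axis, keepdims=True)
    return (mx + np.log(np.exp(x - mx).sum(axis=axis, keepdims=True))).squeeze(axis)

def attention_message(x_query, x_source, wq, wk, wv):
    """Message m_{E->i}: Softmax_j(q_i^T k_j)-weighted average of the values v_j."""
    q, k, v = x_query @ wq, x_source @ wk, x_source @ wv
    logits = q @ k.T
    logits = logits - logits.max(axis=1, keepdims=True)  # stabilise the softmax
    alpha = np.exp(logits)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ v

def log_sinkhorn(scores, dustbin_score, iters):
    """Append a dustbin row and column, run Sinkhorn in log space, drop the dustbins."""
    m, n = scores.shape
    s_bar = np.full((m + 1, n + 1), float(dustbin_score))
    s_bar[:m, :n] = scores
    # Assumed target marginals: mass 1 per keypoint, N (resp. M) for each dustbin.
    log_a = np.log(np.concatenate([np.ones(m), [n]]))
    log_b = np.log(np.concatenate([np.ones(n), [m]]))
    u, v = np.zeros(m + 1), np.zeros(n + 1)
    for _ in range(iters):
        u = log_a - logsumexp(s_bar + v[None, :], axis=1)
        v = log_b - logsumexp(s_bar + u[:, None], axis=0)
    p_bar = np.exp(s_bar + u[:, None] + v[None, :])
    return p_bar[:m, :n]

def match(p_a, d_a, p_b, d_b, params, num_layers, dustbin_score, sinkhorn_iters):
    """Return an M x N soft partial assignment between two keypoint sets."""
    # Keypoint encoder: (0)x_i = d_i + MLP(p_i)
    x_a = d_a + mlp(p_a, *params["enc"])
    x_b = d_b + mlp(p_b, *params["enc"])
    for layer in range(1, num_layers + 1):
        wq, wk, wv, update = params["layers"][layer - 1]
        # Odd layers aggregate along self edges, even layers along cross edges.
        src_a, src_b = (x_a, x_b) if layer % 2 == 1 else (x_b, x_a)
        m_a = attention_message(x_a, src_a, wq, wk, wv)
        m_b = attention_message(x_b, src_b, wq, wk, wv)
        # Residual update: x <- x + MLP([x || m])
        x_a = x_a + mlp(np.concatenate([x_a, m_a], axis=1), *update)
        x_b = x_b + mlp(np.concatenate([x_b, m_b], axis=1), *update)
    w, b = params["proj"]
    f_a, f_b = x_a @ w + b, x_b @ w + b  # linear projections
    scores = f_a @ f_b.T                 # assumed similarity: inner product
    return log_sinkhorn(scores, dustbin_score, sinkhorn_iters)

def random_params(rng, dim, num_layers, hidden=16, scale=0.1):
    """Hypothetical untrained weights, only so that the sketch can be executed."""
    def lin(n_in, n_out):
        return scale * rng.standard_normal((n_in, n_out)), np.zeros(n_out)
    def two_layer(n_in, n_out):
        return (*lin(n_in, hidden), *lin(hidden, n_out))
    layers = [(lin(dim, dim)[0], lin(dim, dim)[0], lin(dim, dim)[0],
               two_layer(2 * dim, dim)) for _ in range(num_layers)]
    return {"enc": two_layer(2, dim), "layers": layers, "proj": lin(dim, dim)}
```

With M keypoints in the first image and N in the second, `match` returns an M by N array. Because the dustbin row and column absorb unmatched mass, each column of the result sums to at most one, and each row does so as the iterations converge. The log-space form of the Sinkhorn updates is chosen here only to avoid overflow when scores are large.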

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority from U.S. Provisional Patent Application No. 62/935,597, filed on Nov. 14, 2019, all of which is incorporated herein by reference in its entirety.

US Referenced Citations (518)

Number	Name	Date	Kind
4344092	Miller	Aug 1982	A
4652930	Crawford	Mar 1987	A
4810080	Grendol et al.	Mar 1989	A
4997268	Dauvergne	Mar 1991	A
5007727	Kahaney et al.	Apr 1991	A
5074295	Willis	Dec 1991	A
5240220	Elberbaum	Aug 1993	A
5251635	Dumoulin et al.	Oct 1993	A
5410763	Bolle	May 1995	A
5455625	Englander	Oct 1995	A
5495286	Adair	Feb 1996	A
5497463	Stein et al.	Mar 1996	A
5659701	Amit et al.	Aug 1997	A
5682255	Friesem et al.	Oct 1997	A
5689669	Lynch	Nov 1997	A
5689835	Chao	Nov 1997	A
5826092	Flannery	Oct 1998	A
5854872	Tai	Dec 1998	A
5864365	Sramek et al.	Jan 1999	A
5937202	Crosetto	Aug 1999	A
6002853	de Hond	Dec 1999	A
6012811	Chao et al.	Jan 2000	A
6016160	Coombs et al.	Jan 2000	A
6064749	Hirota et al.	May 2000	A
6076927	Owens	Jun 2000	A
6079982	Meader	Jun 2000	A
6117923	Amagai et al.	Sep 2000	A
6119147	Toomey et al.	Sep 2000	A
6124977	Takahashi	Sep 2000	A
6179619	Tanaka	Jan 2001	B1
6191809	Hori et al.	Feb 2001	B1
6219045	Leahy et al.	Apr 2001	B1
6243091	Berstis	Jun 2001	B1
6271843	Lection et al.	Aug 2001	B1
6362817	Powers et al.	Mar 2002	B1
6375369	Schneider et al.	Apr 2002	B1
6385735	Wilson	May 2002	B1
6396522	Vu	May 2002	B1
6414679	Miodonski et al.	Jul 2002	B1
6538655	Kubota	Mar 2003	B1
6541736	Huang et al.	Apr 2003	B1
6570563	Honda	May 2003	B1
6573903	Gantt	Jun 2003	B2
6590593	Robertson et al.	Jul 2003	B1
6621508	Shiraishi et al.	Sep 2003	B1
6690393	Heron et al.	Feb 2004	B2
6757068	Foxlin	Jun 2004	B2
6784901	Harvfey et al.	Aug 2004	B1
6961055	Doak	Nov 2005	B2
7046515	Wyatt	May 2006	B1
7051219	Hwang	May 2006	B2
7076674	Cervantes	Jul 2006	B2
7111290	Yates, Jr.	Sep 2006	B1
7119819	Robertson et al.	Oct 2006	B1
7219245	Raghuvanshi	May 2007	B1
7382288	Wilson	Jun 2008	B1
7414629	Santodomingo	Aug 2008	B2
7431453	Hogan	Oct 2008	B2
7467356	Gettman et al.	Dec 2008	B2
7542040	Templeman	Jun 2009	B2
7573640	Nivon et al.	Aug 2009	B2
7653877	Matsuda	Jan 2010	B2
7663625	Chartier et al.	Feb 2010	B2
7724980	Shenzhi	May 2010	B1
7746343	Charaniya et al.	Jun 2010	B1
7751662	Kleemann	Jul 2010	B2
7758185	Lewis	Jul 2010	B2
7788323	Greenstein et al.	Aug 2010	B2
7804507	Yang et al.	Sep 2010	B2
7814429	Buffet et al.	Oct 2010	B2
7817150	Reichard et al.	Oct 2010	B2
7844724	Van Wie et al.	Nov 2010	B2
8060759	Arnan et al.	Nov 2011	B1
8120851	Iwasa	Feb 2012	B2
8214660	Capps, Jr.	Jul 2012	B2
8246408	Elliot	Aug 2012	B2
8353594	Lewis	Jan 2013	B2
8360578	Nummela et al.	Jan 2013	B2
8408696	Hsieh	Apr 2013	B2
8508676	Silverstein et al.	Aug 2013	B2
8547638	Levola	Oct 2013	B2
8605764	Rothaar et al.	Oct 2013	B1
8619365	Harris et al.	Dec 2013	B2
8696113	Lewis	Apr 2014	B2
8698701	Margulis	Apr 2014	B2
8733927	Lewis	May 2014	B1
8736636	Kang	May 2014	B2
8759929	Shiozawa et al.	Jun 2014	B2
8793770	Lim	Jul 2014	B2
8823855	Hwang	Sep 2014	B2
8847988	Geisner et al.	Sep 2014	B2
8874673	Kim	Oct 2014	B2
9010929	Lewis	Apr 2015	B2
9015501	Gee	Apr 2015	B2
9086537	Iwasa et al.	Jul 2015	B2
9095437	Boyden et al.	Aug 2015	B2
9239473	Lewis	Jan 2016	B2
9244293	Lewis	Jan 2016	B2
9244533	Friend et al.	Jan 2016	B2
9383823	Geisner et al.	Jul 2016	B2
9489027	Ogletree	Nov 2016	B1
9519305	Wolfe	Dec 2016	B2
9581820	Robbins	Feb 2017	B2
9582060	Balatsos	Feb 2017	B2
9658473	Lewis	May 2017	B2
9671566	Abovitz et al.	Jun 2017	B2
9671615	Vallius et al.	Jun 2017	B1
9696795	Marcolina et al.	Jul 2017	B2
9798144	Sako et al.	Oct 2017	B2
9874664	Stevens et al.	Jan 2018	B2
9880441	Osterhout	Jan 2018	B1
9918058	Takahasi et al.	Mar 2018	B2
9955862	Freeman et al.	May 2018	B2
9978118	Ozgumer et al.	May 2018	B1
9996797	Holz et al.	Jun 2018	B1
10018844	Levola et al.	Jul 2018	B2
10082865	Raynal et al.	Sep 2018	B1
10151937	Lewis	Dec 2018	B2
10185147	Lewis	Jan 2019	B2
10218679	Jawahar	Feb 2019	B1
10241545	Richards et al.	Mar 2019	B1
10317680	Richards et al.	Jun 2019	B1
10436594	Belt et al.	Oct 2019	B2
10516853	Gibson et al.	Dec 2019	B1
10551879	Richards et al.	Feb 2020	B1
10578870	Kimmel	Mar 2020	B2
10698202	Kimmel et al.	Jun 2020	B2
10856107	Mycek et al.	Oct 2020	B2
10825424	Zhang	Nov 2020	B2
10987176	Poltaretskyi et al.	Apr 2021	B2
11190681	Brook et al.	Nov 2021	B1
11209656	Choi et al.	Dec 2021	B1
11236993	Hall et al.	Feb 2022	B1
20010010598	Aritake et al.	Aug 2001	A1
20010018667	Kim	Aug 2001	A1
20020007463	Fung	Jan 2002	A1
20020108064	Nunally	Feb 2002	A1
20020063913	Nakamura et al.	May 2002	A1
20020071050	Homberg	Jun 2002	A1
20020095463	Matsuda	Jul 2002	A1
20020113820	Robinson et al.	Aug 2002	A1
20020122648	Mule' et al.	Sep 2002	A1
20020140848	Cooper et al.	Oct 2002	A1
20030028816	Bacon	Feb 2003	A1
20030048456	Hill	Mar 2003	A1
20030067685	Niv	Apr 2003	A1
20030077458	Korenaga et al.	Apr 2003	A1
20030080976	Satoh et al.	May 2003	A1
20030115494	Cervantes	Jun 2003	A1
20030218614	Lavelle et al.	Nov 2003	A1
20030219992	Schaper	Nov 2003	A1
20030226047	Park	Dec 2003	A1
20040001533	Tran et al.	Jan 2004	A1
20040021600	Wittenberg	Feb 2004	A1
20040025069	Gary et al.	Feb 2004	A1
20040042377	Nikoloai et al.	Mar 2004	A1
20040073822	Greco	Apr 2004	A1
20040073825	Itoh	Apr 2004	A1
20040111248	Granny et al.	Jun 2004	A1
20040113887	Pair et al.	Jun 2004	A1
20040174496	Ji et al.	Sep 2004	A1
20040186902	Stewart	Sep 2004	A1
20040193441	Altieri	Sep 2004	A1
20040201857	Foxlin	Oct 2004	A1
20040238732	State et al.	Dec 2004	A1
20040240072	Schindler et al.	Dec 2004	A1
20040246391	Travis	Dec 2004	A1
20040268159	Aasheim et al.	Dec 2004	A1
20050001977	Zelman	Jan 2005	A1
20050034002	Flautner	Feb 2005	A1
20050093719	Okamoto et al.	May 2005	A1
20050128212	Edecker et al.	Jun 2005	A1
20050157159	Komiya et al.	Jul 2005	A1
20050177385	Hull	Aug 2005	A1
20050231599	Yamasaki	Oct 2005	A1
20050273792	Inohara et al.	Dec 2005	A1
20060013435	Rhoads	Jan 2006	A1
20060015821	Jacques Parker et al.	Jan 2006	A1
20060019723	Vorenkamp	Jan 2006	A1
20060038880	Starkweather et al.	Feb 2006	A1
20060050224	Smith	Mar 2006	A1
20060090092	Verhulst	Apr 2006	A1
20060126181	Levola	Jun 2006	A1
20060129852	Bonola	Jun 2006	A1
20060132914	Weiss et al.	Jun 2006	A1
20060179329	Terechko	Aug 2006	A1
20060221448	Nivon et al.	Oct 2006	A1
20060228073	Mukawa et al.	Oct 2006	A1
20060250322	Hall et al.	Nov 2006	A1
20060259621	Ranganathan	Nov 2006	A1
20060268220	Hogan	Nov 2006	A1
20070058248	Nguyen et al.	Mar 2007	A1
20070103836	Oh	May 2007	A1
20070124730	Pytel	May 2007	A1
20070159673	Freeman et al.	Jul 2007	A1
20070188837	Shimizu et al.	Aug 2007	A1
20070198886	Saito	Aug 2007	A1
20070204672	Huang et al.	Sep 2007	A1
20070213952	Cirelli	Sep 2007	A1
20070283247	Brenneman et al.	Dec 2007	A1
20080002259	Ishizawa et al.	Jan 2008	A1
20080002260	Arrouy et al.	Jan 2008	A1
20080030429	Hailpern	Feb 2008	A1
20080043334	Itzkovitch et al.	Feb 2008	A1
20080046773	Ham	Feb 2008	A1
20080063802	Maula et al.	Mar 2008	A1
20080068557	Menduni et al.	Mar 2008	A1
20080084533	Jannard et al.	Apr 2008	A1
20080125218	Collins	May 2008	A1
20080146942	Dala-Krishna	Jun 2008	A1
20080173036	Willaims	Jul 2008	A1
20080177506	Kim	Jul 2008	A1
20080183190	Adcox et al.	Jul 2008	A1
20080205838	Crippa et al.	Aug 2008	A1
20080215907	Wilson	Sep 2008	A1
20080225393	Rinko	Sep 2008	A1
20080235570	Sawada et al.	Sep 2008	A1
20080246693	Hailpern et al.	Oct 2008	A1
20080316768	Travis	Dec 2008	A1
20090076791	Rhoades et al.	Mar 2009	A1
20090091583	McCoy	Apr 2009	A1
20090153797	Allon et al.	Jun 2009	A1
20090224416	Laakkonen et al.	Sep 2009	A1
20090245730	Kleemann	Oct 2009	A1
20090287728	Martine et al.	Nov 2009	A1
20090300528	Stambaugh	Dec 2009	A1
20090310633	Ikegami	Dec 2009	A1
20100005326	Archer	Jan 2010	A1
20100019962	Fujita	Jan 2010	A1
20100056274	Uusitalo et al.	Mar 2010	A1
20100063854	Purvis et al.	Mar 2010	A1
20100070378	Trotman et al.	Mar 2010	A1
20100079841	Levola	Apr 2010	A1
20100115428	Shuping et al.	May 2010	A1
20100153934	Lachner	Jun 2010	A1
20100194632	Raento et al.	Aug 2010	A1
20100205541	Rappaport et al.	Aug 2010	A1
20100214284	Rieffel et al.	Aug 2010	A1
20100232016	Landa et al.	Sep 2010	A1
20100232031	Batchko et al.	Sep 2010	A1
20100244168	Shiozawa et al.	Sep 2010	A1
20100274567	Carlson et al.	Oct 2010	A1
20100274627	Carlson	Oct 2010	A1
20100277803	Pockett et al.	Nov 2010	A1
20100284085	Laakkonen	Nov 2010	A1
20100287485	Bertolami et al.	Nov 2010	A1
20100296163	Sarikko	Nov 2010	A1
20100309687	Sampsell et al.	Dec 2010	A1
20110010636	Hamilton, II et al.	Jan 2011	A1
20110021263	Anderson et al.	Jan 2011	A1
20110022870	Mcgrane	Jan 2011	A1
20110041083	Gabai et al.	Feb 2011	A1
20110050640	Lundback et al.	Mar 2011	A1
20110050655	Mukawa	Mar 2011	A1
20110064268	Cobb	Mar 2011	A1
20110122240	Becker	May 2011	A1
20110145617	Thomson et al.	Jun 2011	A1
20110170801	Lu et al.	Jul 2011	A1
20110218733	Hamza et al.	Sep 2011	A1
20110286735	Temblay	Nov 2011	A1
20110291969	Rashid et al.	Dec 2011	A1
20120011389	Driesen	Jan 2012	A1
20120050535	Densham et al.	Mar 2012	A1
20120075501	Oyagi et al.	Mar 2012	A1
20120081392	Arthur	Apr 2012	A1
20120089854	Breakstone	Apr 2012	A1
20120113235	Shintani	May 2012	A1
20120127062	Bar-Zeev et al.	May 2012	A1
20120154557	Perez et al.	Jun 2012	A1
20120215094	Rahimian et al.	Aug 2012	A1
20120218301	Miller	Aug 2012	A1
20120246506	Knight	Sep 2012	A1
20120249416	Maciocci et al.	Oct 2012	A1
20120249741	Maciocci et al.	Oct 2012	A1
20120260083	Andrews	Oct 2012	A1
20120307075	Margalitq	Dec 2012	A1
20120307362	Silverstein et al.	Dec 2012	A1
20120314959	White et al.	Dec 2012	A1
20120320460	Levola	Dec 2012	A1
20120326948	Crocco et al.	Dec 2012	A1
20130021486	Richardon	Jan 2013	A1
20130050258	Liu et al.	Feb 2013	A1
20130050642	Lewis et al.	Feb 2013	A1
20130050833	Lewis et al.	Feb 2013	A1
20130051730	Travers et al.	Feb 2013	A1
20130061240	Yan et al.	Mar 2013	A1
20130077049	Bohn	Mar 2013	A1
20130077170	Ukuda	Mar 2013	A1
20130094148	Sloane	Apr 2013	A1
20130129282	Li	May 2013	A1
20130162940	Kurtin et al.	Jun 2013	A1
20130169923	Schnoll et al.	Jul 2013	A1
20130205126	Kruglick	Aug 2013	A1
20130222386	Tannhauser et al.	Aug 2013	A1
20130268257	Hu	Oct 2013	A1
20130278633	Ahn et al.	Oct 2013	A1
20130314789	Saarikko et al.	Nov 2013	A1
20130318276	Dalal	Nov 2013	A1
20130336138	Venkatraman et al.	Dec 2013	A1
20130342564	Kinnebrew et al.	Dec 2013	A1
20130342570	Kinnebrew et al.	Dec 2013	A1
20130342571	Kinnebrew et al.	Dec 2013	A1
20130343408	Cook	Dec 2013	A1
20140002329	Nishimaki et al.	Jan 2014	A1
20140013098	Yeung	Jan 2014	A1
20140016821	Arth et al.	Jan 2014	A1
20140022819	Oh et al.	Jan 2014	A1
20140078023	Ikeda et al.	Mar 2014	A1
20140082526	Park et al.	Mar 2014	A1
20140119598	Ramachandran et al.	May 2014	A1
20140126769	Reitmayr et al.	May 2014	A1
20140140653	Brown et al.	May 2014	A1
20140149573	Tofighbakhsh et al.	May 2014	A1
20140168260	O'Brien et al.	Jun 2014	A1
20140244983	McDonald et al.	Aug 2014	A1
20140266987	Magyari	Sep 2014	A1
20140267419	Ballard et al.	Sep 2014	A1
20140274391	Stafford	Sep 2014	A1
20140282105	Nordstrom	Sep 2014	A1
20140292645	Tsurumi et al.	Oct 2014	A1
20140313228	Kasahara	Oct 2014	A1
20140340449	Plagemann et al.	Nov 2014	A1
20140359589	Kodsky et al.	Dec 2014	A1
20140375680	Ackerman et al.	Dec 2014	A1
20150005785	Olson	Jan 2015	A1
20150009099	Queen	Jan 2015	A1
20150015842	Chen	Jan 2015	A1
20150077312	Wang	Mar 2015	A1
20150097719	Balachandreswaran et al.	Apr 2015	A1
20150123966	Newman	May 2015	A1
20150130790	Vazquez, II et al.	May 2015	A1
20150134995	Park et al.	May 2015	A1
20150138248	Schrader	May 2015	A1
20150155939	Oshima et al.	Jun 2015	A1
20150168221	Mao et al.	Jun 2015	A1
20150205126	Schowengerdt	Jul 2015	A1
20150235427	Nobori et al.	Aug 2015	A1
20150235431	Schowengerdt	Aug 2015	A1
20150253651	Russell et al.	Sep 2015	A1
20150256484	Cameron	Sep 2015	A1
20150269784	Miyawaki et al.	Sep 2015	A1
20150294483	Wells et al.	Oct 2015	A1
20150301955	Yakovenko et al.	Oct 2015	A1
20150310657	Eden	Oct 2015	A1
20150338915	Publicover et al.	Nov 2015	A1
20150355481	Hilkes et al.	Dec 2015	A1
20160004102	Nisper et al.	Jan 2016	A1
20160015470	Border	Jan 2016	A1
20160027215	Burns et al.	Jan 2016	A1
20160033770	Fujimaki et al.	Feb 2016	A1
20160051217	Douglas et al.	Feb 2016	A1
20160077338	Robbins et al.	Mar 2016	A1
20160085285	Mangione-Smith	Mar 2016	A1
20160085300	Robbins et al.	Mar 2016	A1
20160091720	Stafford et al.	Mar 2016	A1
20160093099	Bridges	Mar 2016	A1
20160093269	Buckley et al.	Mar 2016	A1
20160103326	Kimura et al.	Apr 2016	A1
20160123745	Cotier et al.	May 2016	A1
20160139402	Lapstun	May 2016	A1
20160139411	Kang et al.	May 2016	A1
20160155273	Lyren et al.	Jun 2016	A1
20160180596	Gonzalez del Rosario	Jun 2016	A1
20160187654	Border et al.	Jun 2016	A1
20160191887	Casas	Jun 2016	A1
20160202496	Billetz et al.	Jul 2016	A1
20160217624	Finn et al.	Jul 2016	A1
20160266412	Yoshida	Sep 2016	A1
20160267708	Nistico et al.	Sep 2016	A1
20160274733	Hasegawa et al.	Sep 2016	A1
20160287337	Aram et al.	Oct 2016	A1
20160300388	Stafford et al.	Oct 2016	A1
20160321551	Priness et al.	Nov 2016	A1
20160327798	Xiao et al.	Nov 2016	A1
20160334279	Mittleman et al.	Nov 2016	A1
20160357255	Lindh et al.	Dec 2016	A1
20160370404	Quadrat et al.	Dec 2016	A1
20160370510	Thomas	Dec 2016	A1
20170038607	Camara	Feb 2017	A1
20170060225	Zha et al.	Mar 2017	A1
20170061696	Li et al.	Mar 2017	A1
20170064066	Das et al.	Mar 2017	A1
20170100664	Osterhout et al.	Apr 2017	A1
20170102544	Vallius et al.	Apr 2017	A1
20170115487	Travis	Apr 2017	A1
20170122725	Yeoh et al.	May 2017	A1
20170123526	Trail et al.	May 2017	A1
20170127295	Black et al.	May 2017	A1
20170131569	Aschwanden et al.	May 2017	A1
20170147066	Katz et al.	May 2017	A1
20170160518	Lanman et al.	Jun 2017	A1
20170161951	Fix et al.	Jun 2017	A1
20170185261	Perez et al.	Jun 2017	A1
20170192239	Nakamura et al.	Jul 2017	A1
20170201709	Igarashi et al.	Jul 2017	A1
20170205903	Miller et al.	Jul 2017	A1
20170206668	Poulos et al.	Jul 2017	A1
20170213388	Margolis et al.	Jul 2017	A1
20170214907	Lapstun	Jul 2017	A1
20170219841	Popovich et al.	Aug 2017	A1
20170232345	Rofougaran et al.	Aug 2017	A1
20170235126	DiDomenico	Aug 2017	A1
20170235129	Kamakura	Aug 2017	A1
20170235142	Wall et al.	Aug 2017	A1
20170235144	Piskunov et al.	Aug 2017	A1
20170235147	Kamakura	Aug 2017	A1
20170243403	Daniels et al.	Aug 2017	A1
20170246070	Osterhout et al.	Aug 2017	A1
20170254832	Ho et al.	Sep 2017	A1
20170256096	Faaborg et al.	Sep 2017	A1
20170258526	Lang	Sep 2017	A1
20170266529	Reikmoto	Sep 2017	A1
20170270712	Tyson et al.	Sep 2017	A1
20170281054	Stever et al.	Oct 2017	A1
20170287376	Bakar et al.	Oct 2017	A1
20170293141	Schowengerdt et al.	Oct 2017	A1
20170307886	Stenberg et al.	Oct 2017	A1
20170307891	Bucknor et al.	Oct 2017	A1
20170312032	Amanatullah et al.	Nov 2017	A1
20170322418	Liu et al.	Nov 2017	A1
20170322426	Tervo	Nov 2017	A1
20170329137	Tervo	Nov 2017	A1
20170332098	Rusanovskyy et al.	Nov 2017	A1
20170336636	Amitai et al.	Nov 2017	A1
20170357332	Balan et al.	Dec 2017	A1
20170363871	Vallius	Dec 2017	A1
20170371394	Chan	Dec 2017	A1
20170371661	Sparling	Dec 2017	A1
20180014266	Chen	Jan 2018	A1
20180024289	Fattal	Jan 2018	A1
20180044173	Netzer	Feb 2018	A1
20180052007	Teskey et al.	Feb 2018	A1
20180052501	Jones, Jr. et al.	Feb 2018	A1
20180059305	Popovich et al.	Mar 2018	A1
20180067779	Pillalamarri et al.	Mar 2018	A1
20180070855	Eichler	Mar 2018	A1
20180082480	White et al.	Mar 2018	A1
20180084245	Lapstun	Mar 2018	A1
20180088185	Woods et al.	Mar 2018	A1
20180102981	Kurtzman et al.	Apr 2018	A1
20180108179	Tomlin et al.	Apr 2018	A1
20180114298	Malaika et al.	Apr 2018	A1
20180129112	Osterhout	May 2018	A1
20180131907	Schmirler et al.	May 2018	A1
20180136466	Ko	May 2018	A1
20180144691	Choi et al.	May 2018	A1
20180150971	Adachi	May 2018	A1
20180151796	Akahane	May 2018	A1
20180172995	Lee et al.	Jun 2018	A1
20180188115	Hsu et al.	Jul 2018	A1
20180189568	Powderly et al.	Jul 2018	A1
20180190017	Mendez et al.	Jul 2018	A1
20180191990	Motoyama et al.	Jul 2018	A1
20180217395	Lin et al.	Aug 2018	A1
20180218545	Garcia et al.	Aug 2018	A1
20180250589	Cossairt et al.	Sep 2018	A1
20180260218	Gopal	Sep 2018	A1
20180284877	Klein	Oct 2018	A1
20180292654	Wall et al.	Oct 2018	A1
20180299678	Singer et al.	Oct 2018	A1
20180357472	Dreessen	Dec 2018	A1
20190005069	Filgueiras de Araujo et al.	Jan 2019	A1
20190011691	Peyman	Jan 2019	A1
20190056591	Tervo et al.	Feb 2019	A1
20190087015	Lam et al.	Mar 2019	A1
20190101758	Zhu et al.	Apr 2019	A1
20190107723	Lee et al.	Apr 2019	A1
20190137788	Suen	May 2019	A1
20190155034	Singer et al.	May 2019	A1
20190155439	Mukherjee et al.	May 2019	A1
20190158926	Kang et al.	May 2019	A1
20190162950	Lapstun	May 2019	A1
20190167095	Krueger	Jun 2019	A1
20190172216	Ninan et al.	Jun 2019	A1
20190178654	Hare	Jun 2019	A1
20190182415	Sivan	Jun 2019	A1
20190196690	Chong et al.	Jun 2019	A1
20190206116	Xu	Jul 2019	A1
20190219815	Price et al.	Jul 2019	A1
20190243123	Bohn	Aug 2019	A1
20190287270	Nakamura et al.	Sep 2019	A1
20190318502	He	Oct 2019	A1
20190318540	Piemonte et al.	Oct 2019	A1
20190321728	Imai et al.	Oct 2019	A1
20190347853	Chen et al.	Nov 2019	A1
20190380792	Poltaretskyi et al.	Dec 2019	A1
20190388182	Kumar	Dec 2019	A1
20200066045	Stahl et al.	Feb 2020	A1
20200098188	Bar-Zeev et al.	Mar 2020	A1
20200100057	Galon et al.	Mar 2020	A1
20200110928	Al Jazaery et al.	Apr 2020	A1
20200117267	Gibson et al.	Apr 2020	A1
20200117270	Gibson et al.	Apr 2020	A1
20200184217	Faulkner	Jun 2020	A1
20200184653	Faulker	Jun 2020	A1
20200202759	Ukai et al.	Jun 2020	A1
20200242848	Ambler et al.	Jul 2020	A1
20200309944	Thoresen et al.	Oct 2020	A1
20200356161	Wagner	Nov 2020	A1
20200368616	Delamont	Nov 2020	A1
20200391115	Leeper et al.	Dec 2020	A1
20200409528	Lee	Dec 2020	A1
20210008413	Asikainen et al.	Jan 2021	A1
20210033871	Jacoby et al.	Feb 2021	A1
20210041951	Gibson et al.	Feb 2021	A1
20210053820	Gurin et al.	Feb 2021	A1
20210093391	Poltaretskyi et al.	Apr 2021	A1
20210093410	Gaborit et al.	Apr 2021	A1
20210093414	Moore et al.	Apr 2021	A1
20210097886	Kuester et al.	Apr 2021	A1
20210124901	Liu	Apr 2021	A1
20210132380	Wieczorek	May 2021	A1
20210141225	Meynen et al.	May 2021	A1
20210142582	Jones et al.	May 2021	A1
20210158023	Fu	May 2021	A1
20210158627	Cossairt et al.	May 2021	A1
20210173480	Osterhout et al.	Jun 2021	A1
20220366598	Azimi et al.	Nov 2022	A1

Foreign Referenced Citations (117)

Number	Date	Country
100416340	Sep 2008	CN
101449270	Jun 2009	CN
103460255	Dec 2013	CN
104040410	Sep 2014	CN
104603675	May 2015	CN
105938426	Sep 2016	CN
106662754	May 2017	CN
107683497	Feb 2018	CN
109223121	Jan 2019	CN
105190427	Nov 2019	CN
0504930	Mar 1992	EP
0535402	Apr 1993	EP
0632360	Jan 1995	EP
1215522	Jun 2002	EP
1494110	Jan 2005	EP
1938141	Jul 2008	EP
1943556	Jul 2008	EP
2290428	Mar 2011	EP
2350774	Aug 2011	EP
1237067	Jan 2016	EP
3139245	Mar 2017	EP
3164776	May 2017	EP
3236211	Oct 2017	EP
2723240	Aug 2018	EP
2896986	Feb 2021	EP
2499635	Aug 2013	GB
2542853	Apr 2017	GB
938DEL2004	Jun 2006	IN
H03-036974	Apr 1991	JP
H10-333094	Dec 1998	JP
2002-529806	Sep 2002	JP
2003-029198	Jan 2003	JP
2003-141574	May 2003	JP
2003-228027	Aug 2003	JP
2003-329873	Nov 2003	JP
2005-303843	Oct 2005	JP
2007-012530	Jan 2007	JP
2007-86696	Apr 2007	JP
2007-273733	Oct 2007	JP
2008-257127	Oct 2008	JP
2009-090689	Apr 2009	JP
2009-244869	Oct 2009	JP
2010-014443	Jan 2010	JP
2010-139575	Jun 2010	JP
2011-033993	Feb 2011	JP
2011-257203	Dec 2011	JP
2011-530131	Dec 2011	JP
2012-015774	Jan 2012	JP
2012-235036	Nov 2012	JP
2013-525872	Jun 2013	JP
2013-206322	Oct 2013	JP
2014-500522	Jan 2014	JP
2014-192550	Oct 2014	JP
2015-191032	Nov 2015	JP
2016-502120	Jan 2016	JP
2016-85463	May 2016	JP
2016-516227	Jun 2016	JP
2016-126134	Jul 2016	JP
2017-015697	Jan 2017	JP
2017-153498	Sep 2017	JP
2017-531840	Oct 2017	JP
2017-535825	Nov 2017	JP
6232763	Nov 2017	JP
6333965	May 2018	JP
2005-0010775	Jan 2005	KR
10-2006-0059992	Jun 2006	KR
10-2011-0006408	Jan 2011	KR
10-1372623	Mar 2014	KR
10-2017-0017243	Feb 2017	KR
201219829	May 2012	TW
201803289	Jan 2018	TW
1991000565	Jan 1991	WO
2000030368	Jun 2000	WO
2002071315	Sep 2002	WO
2004095248	Nov 2004	WO
2006132614	Dec 2006	WO
2007041678	Apr 2007	WO
2007037089	May 2007	WO
2007085682	Aug 2007	WO
2007102144	Sep 2007	WO
2008148927	Dec 2008	WO
2009101238	Aug 2009	WO
2010015807	Feb 2010	WO
2014203440	Dec 2010	WO
2012030787	Mar 2012	WO
2013049012	Apr 2013	WO
2013062701	May 2013	WO
2013145536	Oct 2013	WO
2014033306	Mar 2014	WO
2015079610	Jun 2015	WO
2015143641	Oct 2015	WO
2015143641	Oct 2015	WO
2015194597	Dec 2015	WO
2016054092	Apr 2016	WO
2017004695	Jan 2017	WO
2017044761	Mar 2017	WO
2017049163	Mar 2017	WO
2017051595	Mar 2017	WO
2017120475	Jul 2017	WO
2017176861	Oct 2017	WO
2017203201	Nov 2017	WO
2017203201	Nov 2017	WO
2018008232	Jan 2018	WO
2018031261	Feb 2018	WO
2018022523	Feb 2018	WO
2018044537	Mar 2018	WO
2018039273	Mar 2018	WO
2018057564	Mar 2018	WO
2018085287	May 2018	WO
2018087408	May 2018	WO
2018097831	May 2018	WO
2018166921	Sep 2018	WO
2018166921	Sep 2018	WO
2018236587	Dec 2018	WO
2019040493	Feb 2019	WO
2019148154	Aug 2019	WO
2020010226	Jan 2020	WO

Non-Patent Literature Citations (270)

Entry
Communication according to Rule 164(1) EPC mailed on Feb. 23, 2022, European Patent Application No. 20753144.3, (11 pages).
Extended European Search Report mailed on Jun. 19, 2020, European Patent Application No. 20154750.2, (10 pages).
Extended European Search Report mailed on Mar. 22, 2022, European Patent Application No. 19843487.0, (14 pages).
Final Office Action mailed on Feb. 23, 2022, U.S. Appl. No. 16/748,193, (23 pages).
First Office Action mailed on Mar. 14, 2022 with English translation, Chinese Patent Application No. 201880079474.6, (11 pages).
Non Final Office Action mailed on Apr. 1, 2022, U.S. Appl. No. 17/256,961, (65 pages).
Non Final Office Action mailed on Apr. 11, 2022, U.S. Appl. No. 16/938,782, (52 pages).
Non Final Office Action mailed on Apr. 12, 2022, U.S. Appl. No. 17/262,991, (60 pages).
Non Final Office Action mailed on Mar. 31, 2022, U.S. Appl. No. 17/257,814, (60 pages).
Non Final Office Action mailed on Mar. 9, 2022, U.S. Appl. No. 16/870,676, (57 pages).
“Communication Pursuant to Article 94(3) EPC mailed on Apr. 25, 2022”, European Patent Application No. 18885707.2, (5 pages).
“Communication Pursuant to Article 94(3) EPC mailed on May 30, 2022”, European Patent Application No. 19768418.6, (6 pages).
“Extended European Search Report issued on Jul. 20, 2022”, European Patent Application No. 19885958.9, (9 pages).
“Extended European Search Report issued on Aug. 8, 2022”, European Patent Application No. 19898874.3, (8 pages).
“Extended European Search Report mailed on May 16, 2022”, European Patent Application No. 19871001.4, (9 pages).
“Extended European Search Report mailed on May 30, 2022”, European Patent Application No. 20753144.3, (10 pages).
“Final Office Action mailed on Jul. 13, 2022”, U.S. Appl. No. 17/262,991, (18 pages).
“First Examination Report Mailed on Jul. 27, 2022”, Chinese Patent Application No. 201980036675.2, (5 pages).
“First Examination Report Mailed on Jul. 28, 2022”, Indian Patent Application No. 202047024232, (6 pages).
“First Examination Report Mailed on May 13, 2022”, Indian Patent Application No. 202047026359, (8 pages).
“Non Final Office Action mailed on Jul. 26, 2022”, U.S. Appl. No. 17/098,059, (28 pages).
“Non Final Office Action mailed on May 10, 2022”, U.S. Appl. No. 17/140,921, (25 pages).
“Non Final Office Action mailed on May 17, 2022”, U.S. Appl. No. 16/748,193, (11 pages).
“Second Office Action mailed on Jul. 13, 2022 with English Translation”, Chinese Patent Application No. 201880079474.6, (10 pages).
Chittineni, C. , et al., “Single filters for combined image geometric manipulation and enhancement”, Proceedings of SPIE vol. 1903, Image and Video Processing, Apr. 8, 1993, San Jose, CA. (Year: 1993), pp. 111-121.
“Extended European Search Report issued on Aug. 24, 2022”, European Patent Application No. 20846338.0, (13 pages).
“Extended European Search Report issued on Sep. 8, 2022”, European Patent Application No. 20798769.4, (13 pages).
“FS_XR5G: Permanent document, v0.4.0”, Qualcomm Incorporated, 3GPP TSG-SA 4 Meeting 103 retrieved from the Internet: URL:http://www.3gpp.org/ftp/Meetings%5F3GP P%5FSYNC/SA4/Docs/S4%2DI90526%2Ezip [retrieved on Apr. 12, 2019], Apr. 12, 2019, (98 pages).
“Non Final Office Action mailed on Sep. 19, 2022”, U.S. Appl. No. 17/263,001, (14 pages).
“Second Office Action mailed on Jun. 20, 2022 with English Translation”, Chinese Patent Application No. 201880089255.6, (14 pages).
Anonymous, “Koi Pond: Top iPhone App Store Paid App”, https://web.archive.org/web/20080904061233/https://www.iphoneincanada.ca/reviews/koi-pond-top-iphone-app-store-paid-app/ [retrieved on Aug. 9, 2022], (2 pages).
“Extended European Search Report mailed on Nov. 3, 2022”, European Patent Application No. 20770244.0, (23 pages).
“First Examination Report Mailed on Dec. 8, 2022”, Australian Patent Application No. 2018392482, (3 pages).
“First Office Action mailed on Sep. 16, 2022 with English translation”, Chinese Patent Application No. 201980063642.7, (7 pages).
“Non Final Office Action mailed on Dec. 7, 2022”, U.S. Appl. No. 17/357,795, (63 pages).
“Notice of Reason for Rejection mailed on Oct. 28, 2022 with English translation”, Japanese Patent Application No. 2020-531452, (3 pages).
“Office Action mailed on Nov. 24, 2022 with English Translation”, Japanese Patent Application No. 2020-533730, (11 pages).
“Extended European Search Report issued on Dec. 14, 2022”, European Patent Application No. 20886547.7, (8 pages).
“Final Office Action mailed on Dec. 29, 2022”, U.S. Appl. No. 17/098,059, (32 pages).
“Non Final Office Action mailed on Feb. 3, 2023”, U.S. Appl. No. 17/429,100, (16 pages).
“Non Final Office Action mailed on Feb. 3, 2023”, U.S. Appl. No. 17/497,965, (32 pages).
“Non Final Office Action mailed on Jan. 24, 2023”, U.S. Appl. No. 17/497,940, (10 pages).
European Search Report mailed on Oct. 15, 2020, European Patent Application No. 20180623.9, (10 pages).
Extended European Search Report issued on Jan. 22, 2021, European Patent Application No. 18890390.0, (11 pages).
Extended European Search Report issued on Nov. 3, 2020, European Patent Application No. 18885707.2, (7 pages).
Extended European Search Report issued on Nov. 4, 2020, European Patent Application No. 20190980.1, (14 pages).
Final Office Action mailed on Nov. 24, 2020, U.S. Appl. No. 16/435,933, (44 pages).
International Search Report and Written Opinion mailed on Feb. 12, 2021, International Application No. PCT/US20/60555, (25 pages).
International Search Report and Written Opinion mailed on Feb. 2, 2021, International PCT Patent Application No. PCT/US20/60550, (9 pages).
International Search Report and Written Opinion mailed on Dec. 3, 2020, International Patent Application No. PCT/US20/43596, (25 pages).
Non Final Office Action mailed on Jan. 26, 2021, U.S. Appl. No. 16/928,313, (33 pages).
Non Final Office Action mailed on Jan. 27, 2021, U.S. Appl. No. 16/225,961, (15 pages).
Non Final Office Action mailed on Mar. 3, 2021, U.S. Appl. No. 16/427,337, (41 pages).
Altwaijry, et al., “Learning to Detect and Match Keypoints with Deep Architectures”, Proceedings of the British Machine Vision Conference (BMVC), BMVA Press, Sep. 2016, [retrieved on Jan. 8, 2021 (Jan. 8, 2021)] <URL: http://www.bmva.org/bmvc/2016/papers/paper049/index.html>, entire document, especially Abstract, pp. 1-6 and 9.
Butail, et al., “Putting the fish in the fish tank: Immersive VR for animal behavior experiments”, In: 2012 IEEE International Conference on Robotics and Automation. May 18, 2012 (May 18, 2012) Retrieved on Nov. 14, 2020 (Nov. 14, 2020) from <http://cdcl.umd.edu/papers/icra2012.pdf> entire document, (8 pages).
Lee, et al., “Self-Attention Graph Pooling”, Cornell University Library/Computer Science/Machine Learning, Apr. 17, 2019 [retrieved on Jan. 8, 2021] from the Internet <URL: https://arxiv.org/abs/1904.08082>, entire document.
Libovicky, et al., “Input Combination Strategies for Multi-Source Transformer Decoder”, Proceedings of the Third Conference on Machine Translation (WMT). vol. 1: Research Papers, Belgium, Brussels, Oct. 31-Nov. 1, 2018; retrieved on Jan. 8, 2021 (Jan. 8, 2021) from <URL: https://doi.org/10.18653/v1/W18-64026>, entire document, pp. 253-260.
Sarlin, et al., “SuperGlue: Learning Feature Matching with Graph Neural Networks”, Cornell University Library/Computer Science/Computer Vision and Pattern Recognition, Nov. 26, 2019 [retrieved on Jan. 8, 2021] from the Internet <URL: https://arxiv.org/abs/1911.11763>, entire document, especially.
Communication Pursuant to Article 94(3) EPC mailed on Jan. 4, 2022, European Patent Application No. 20154070.5, (8 pages).
Communication Pursuant to Article 94(3) EPC mailed on Oct. 21, 2021, European Patent Application No. 16207441.3, (4 pages).
Communication Pursuant to Rule 164(1) EPC mailed on Jul. 27, 2021, European Patent Application No. 19833664.6, (11 pages).
Extended European Search Report issued on Jun. 30, 2021, European Patent Application No. 19811971.1, (9 pages).
Extended European Search Report issued on Mar. 4, 2021, European Patent Application No. 19768418.6, (9 pages).
Extended European Search Report mailed on Jan. 28, 2022, European Patent Application No. 19815876.8, (9 pages).
Extended European Search Report mailed on Jan. 4, 2022, European Patent Application No. 19815085.6, (9 pages).
Extended European Search Report mailed on Jul. 16, 2021, European Patent Application No. 19810142.0, (14 pages).
Extended European Search Report mailed on Jul. 30, 2021, European Patent Application No. 19839970.1, (7 pages).
Extended European Search Report mailed on Oct. 27, 2021, European Patent Application No. 19833664.6, (10 pages).
Extended European Search Report mailed on Sep. 20, 2021, European Patent Application No. 19851373.1, (8 pages).
Extended European Search Report mailed on Sep. 28, 2021, European Patent Application No. 19845418.3, (13 pages).
Final Office Action mailed on Feb. 3, 2022, U.S. Appl. No. 16/864,721, (36 pages).
Final Office Action mailed on Jun. 15, 2021, U.S. Appl. No. 16/928,313, (42 pages).
Final Office Action mailed on Mar. 1, 2021, U.S. Appl. No. 16/214,575, (29 pages).
Final Office Action mailed on Mar. 19, 2021, U.S. Appl. No. 16/530,776, (25 pages).
Final Office Action mailed on Sep. 17, 2021, U.S. Appl. No. 16/938,782, (44 pages).
“Multi-core processor”, TechTarget, 2013, (1 page).
Non Final Office Action mailed on Aug. 4, 2021, U.S. Appl. No. 16/864,721, (51 pages).
Non Final Office Action mailed on Feb. 2, 2022, U.S. Appl. No. 16/783,866, (8 pages).
Non Final Office Action mailed on Jul. 9, 2021, U.S. Appl. No. 17/002,663, (43 pages).
Non Final Office Action mailed on Jul. 9, 2021, U.S. Appl. No. 16/833,093, (47 pages).
Non Final Office Action mailed on Jun. 10, 2021, U.S. Appl. No. 16/938,782, (40 pages).
Non Final Office Action mailed on Jun. 29, 2021, U.S. Appl. No. 16/698,588, (58 pages).
Non Final Office Action mailed on May 26, 2021, U.S. Appl. No. 16/214,575, (19 pages).
Non Final Office Action mailed on Sep. 20, 2021, U.S. Appl. No. 17/105,848, (56 pages).
Non Final Office Action mailed on Sep. 29, 2021, U.S. Appl. No. 16/748,193, (62 pages).
Giuseppe, Donato, et al., “Stereoscopic helmet mounted system for real time 3D environment reconstruction and indoor ego-motion estimation”, Proc. SPIE 6955, Head- and Helmet-Mounted Displays XIII: Design and Applications, 69550P.
Molchanov, Pavlo, et al., “Short-range FMCW monopulse radar for hand-gesture sensing”, 2015 IEEE Radar Conference (RadarCon) (2015), pp. 1491-1496.
Mrad, et al., “A framework for System Level Low Power Design Space Exploration”, 1991.
Sheng, Liu, et al., “Time-multiplexed dual-focal plane head-mounted display with a liquid lens”, Optics Letters, Optical Society of America, US, vol. 34, No. 11, Jun. 1, 2009 (Jun. 1, 2009), XP001524475, ISSN: 0146-9592, pp. 1642-1644.
“ARToolKit: Hardware”, https://web.archive.org/web/20051013062315/http://www.hitl.washington.edu:80/artoolkit/documentation/hardware.htm (downloaded Oct. 26, 2020), Oct. 13, 2005, (3 pages).
Communication Pursuant to Article 94(3) EPC mailed on Sep. 4, 2019, European Patent Application No. 10793707.0, (4 pages).
Examination Report mailed on Jun. 19, 2020, European Patent Application No. 20154750.2, (10 pages).
Extended European Search Report issued on May 20, 2020, European Patent Application No. 20154070.5, (7 pages).
Extended European Search Report mailed on Jun. 12, 2017, European Patent Application No. 16207441.3, (8 pages).
Final Office Action mailed on Aug. 10, 2020, U.S. Appl. No. 16/225,961, (13 pages).
Final Office Action mailed on Dec. 4, 2019, U.S. Appl. No. 15/564,517, (15 pages).
Final Office Action mailed on Feb. 19, 2020, U.S. Appl. No. 15/552,897, (17 pages).
International Search Report and Written Opinion mailed on Mar. 12, 2020, International PCT Patent Application No. PCT/US19/67919, (14 pages).
International Search Report and Written Opinion mailed on Aug. 15, 2019, International PCT Patent Application No. PCT/US19/33987, (20 pages).
International Search Report and Written Opinion mailed on Jun. 15, 2020, International PCT Patent Application No. PCT/US2020/017023, (13 pages).
International Search Report and Written Opinion mailed on Oct. 16, 2019, International PCT Patent Application No. PCT/US19/43097, (10 pages).
International Search Report and Written Opinion mailed on Oct. 16, 2019, International PCT Patent Application No. PCT/US19/36275, (10 pages).
International Search Report and Written Opinion mailed on Oct. 16, 2019, International PCT Patent Application No. PCT/US19/43099, (9 pages).
International Search Report and Written Opinion mailed on Jun. 17, 2016, International PCT Patent Application No. PCT/FI2016/050172, (9 pages).
International Search Report and Written Opinion mailed on Oct. 22, 2019, International PCT Patent Application No. PCT/US19/43751, (9 pages).
International Search Report and Written Opinion mailed on Dec. 23, 2019, International PCT Patent Application No. PCT/US19/44953, (11 pages).
International Search Report and Written Opinion mailed on May 23, 2019, International PCT Patent Application No. PCT/US18/66514, (17 pages).
International Search Report and Written Opinion mailed on Sep. 26, 2019, International PCT Patent Application No. PCT/US19/40544, (12 pages).
International Search Report and Written Opinion mailed on Aug. 27, 2019, International PCT Application No. PCT/US2019/035245, (8 pages).
International Search Report and Written Opinion mailed on Dec. 27, 2019, International Application No. PCT/US19/47746, (16 pages).
International Search Report and Written Opinion mailed on Sep. 30, 2019, International Patent Application No. PCT/US19/40324, (7 pages).
International Search Report and Written Opinion mailed on Sep. 4, 2020, International Patent Application No. PCT/US20/31036, (13 pages).
International Search Report and Written Opinion mailed on Jun. 5, 2020, International Patent Application No. PCT/US20/19871, (9 pages).
International Search Report and Written Opinion mailed on Aug. 8, 2019, International PCT Patent Application No. PCT/US2019/034763, (8 pages).
International Search Report and Written Opinion mailed on Oct. 8, 2019, International PCT Patent Application No. PCT/US19/41151, (7 pages).
International Search Report and Written Opinion mailed on Jan. 9, 2020, International Application No. PCT/US19/55185, (10 pages).
International Search Report and Written Opinion mailed on Feb. 28, 2019, International Patent Application No. PCT/US18/64686, (8 pages).
International Search Report and Written Opinion mailed on Feb. 7, 2020, International PCT Patent Application No. PCT/US2019/061265, (11 pages).
International Search Report and Written Opinion mailed on Jun. 11, 2019, International PCT Application No. PCT/US19/22620, (7 pages).
Invitation to Pay Additional Fees mailed Aug. 15, 2019, International PCT Patent Application No. PCT/US19/36275, (2 pages).
Invitation to Pay Additional Fees mailed Sep. 24, 2020, International Patent Application No. PCT/US2020/043596, (3 pages).
Invitation to Pay Additional Fees mailed on Oct. 22, 2019, International PCT Patent Application No. PCT/US19/47746, (2 pages).
Invitation to Pay Additional Fees mailed on Apr. 3, 2020, International Patent Application No. PCT/US20/17023, (2 pages).
Invitation to Pay Additional Fees mailed on Oct. 17, 2019, International PCT Patent Application No. PCT/US19/44953, (2 pages).
Non Final Office Action mailed on Aug. 21, 2019, U.S. Appl. No. 15/564,517, (14 pages).
Non Final Office Action mailed on Jul. 27, 2020, U.S. Appl. No. 16/435,933, (16 pages).
Non Final Office Action mailed on Jun. 17, 2020, U.S. Appl. No. 16/682,911, (22 pages).
Non Final Office Action mailed on Jun. 19, 2020, U.S. Appl. No. 16/225,961, (35 pages).
Non Final Office Action mailed on Nov. 19, 2019, U.S. Appl. No. 16/355,611, (31 pages).
Non Final Office Action mailed on Nov. 5, 2020, U.S. Appl. No. 16/530,776, (45 pages).
Non Final Office Action mailed on Oct. 22, 2019, U.S. Appl. No. 15/859,277, (15 pages).
Non Final Office Action mailed on Sep. 1, 2020, U.S. Appl. No. 16/214,575, (40 pages).
Notice of Allowance mailed on Mar. 25, 2020, U.S. Appl. No. 15/564,517, (11 pages).
Notice of Allowance mailed on Oct. 5, 2020, U.S. Appl. No. 16/682,911, (27 pages).
Notice of Reason of Refusal mailed on Sep. 11, 2020 with English translation, Japanese Patent Application No. 2019-140435, (6 pages).
“Phototourism Challenge”, CVPR 2019 Image Matching Workshop. https://image-matching-workshop.github.io, (16 pages).
Summons to attend oral proceedings pursuant to Rule 115(1) EPC mailed on Jul. 15, 2019, European Patent Application No. 15162521.7, (7 pages).
Aarik, J. et al., “Effect of crystal structure on optical properties of TiO2 films grown by atomic layer deposition”, Thin Solid Films; Publication [online]. May 19, 1998 [retrieved Feb. 19, 2020]. Retrieved from the Internet: <URL: https://www.sciencedirect.com/science/article/pii/S0040609097001351?via%3Dihub>; DOI: 10.1016/S0040-6090(97)00135-1; see entire document, (2 pages).
Arandjelović, Relja et al., “Three things everyone should know to improve object retrieval”, CVPR, 2012, (8 pages).
Azom, “Silica—Silicon Dioxide (SiO2)”, AZO Materials; Publication [Online]. Dec. 13, 2001 [retrieved Feb. 19, 2020]. Retrieved from the Internet: <URL: https://www.azom.com/article.aspx?ArticleID=1114>, (6 pages).
Azuma, Ronald T., “A Survey of Augmented Reality”, Presence: Teleoperators and Virtual Environments 6, 4 (Aug. 1997), 355-385; https://web.archive.org/web/20010604100006/http://www.cs.unc.edu/~azuma/ARpresence.pdf (downloaded Oct. 26, 2020).
Azuma, Ronald T. , “Predictive Tracking for Augmented Reality”, Department of Computer Science, Chapel Hill NC; TR95-007, Feb. 1995, 262 pages.
Battaglia, Peter W. et al., “Relational inductive biases, deep learning, and graph networks”, arXiv: 1806.01261, Oct. 17, 2018, pp. 1-40.
Berg, Alexander C et al., “Shape matching and object recognition using low distortion correspondences”, In CVPR, 2005, (8 pages).
Bian, Jiawang et al., “GMS: Grid-based motion statistics for fast, ultra-robust feature correspondence.”, In CVPR (Conference on Computer Vision and Pattern Recognition), 2017, (10 pages).
Bimber, Oliver et al., “Spatial Augmented Reality: Merging Real and Virtual Worlds”, https://web.media.mit.edu/~raskar/book/BimberRaskarAugmentedRealityBook.pdf; published by A K Peters/CRC Press (Jul. 31, 2005); eBook (3rd Edition, 2007), (393 pages).
Brachmann, Eric et al., “Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses”, In ICCV (International Conference on Computer Vision ), arXiv: 1905.04132v2 [cs.CV] Jul. 31, 2019, (17 pages).
Caetano, Tibério S et al., “Learning graph matching”, IEEE TPAMI, 31(6):1048-1058, 2009.
Cech, Jan et al., “Efficient sequential correspondence selection by cosegmentation”, IEEE TPAMI, 32(9):1568-1581, Sep. 2010.
Cuturi, Marco , “Sinkhorn distances: Lightspeed computation of optimal transport”, NIPS, 2013, (9 pages).
Dai, Angela et al., “ScanNet: Richly-annotated 3d reconstructions of indoor scenes”, In CVPR, arXiv:1702.04405v2 [cs.CV] Apr. 11, 2017, (22 pages).
Deng, Haowen et al., “PPFnet: Global context aware local features for robust 3d point matching”, In CVPR, arXiv:1802.02669v2 [cs.CV] Mar. 1, 2018, (12 pages).
Detone, Daniel et al., “Deep image homography estimation”, In RSS Workshop: Limits and Potentials of Deep Learning in Robotics, arXiv:1606.03798v1 [cs.CV] Jun. 13, 2016, (6 pages).
Detone, Daniel et al., “Self-improving visual odometry”, arXiv:1812.03245, Dec. 8, 2018, (9 pages).
Detone, Daniel et al., “SuperPoint: Self-supervised interest point detection and description”, In CVPR Workshop on Deep Learning for Visual SLAM, arXiv:1712.07629v4 [cs.CV] Apr. 19, 2018, (13 pages).
Dusmanu, Mihai et al., “D2-net: A trainable CNN for joint detection and description of local features”, CVPR, arXiv:1905.03561v1 [cs.CV] May 9, 2019, (16 pages).
Ebel, Patrick et al., “Beyond cartesian representations for local descriptors”, ICCV, arXiv:1908.05547v1 [cs.CV] Aug. 15, 2019, (11 pages).
Fischler, Martin A et al., “Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography”, Communications of the ACM, 24(6): 1981, pp. 381-395.
Gilmer, Justin et al., “Neural message passing for quantum chemistry”, In ICML, arXiv:1704.01212v2 [cs.LG] Jun. 12, 2017, (14 pages).
Goodfellow, “Titanium Dioxide—Titania (TiO2)”, AZO Materials; Publication [online]. Jan. 11, 2002 [retrieved Feb. 19, 2020]. Retrieved from the Internet: <URL: https://www.azom.com/article.aspx?ArticleID=1179>, (9 pages).
Hartley, Richard et al., “Multiple View Geometry in Computer Vision”, Cambridge University Press, 2003, pp. 1-673.
Jacob, Robert J. , “Eye Tracking in Advanced Interface Design”, Human-Computer Interaction Lab, Naval Research Laboratory, Washington, D.C., date unknown. 2003, pp. 1-50.
Lee, Juho et al., “Set transformer: A framework for attention-based permutation-invariant neural networks”, ICML, arXiv:1810.00825v3 [cs.LG] May 26, 2019, (17 pages).
Leordeanu, Marius et al., “A spectral technique for correspondence problems using pairwise constraints”, Proceedings of (ICCV) International Conference on Computer Vision, vol. 2, pp. 1482-1489, Oct. 2005, (8 pages).
Levola, T. , “Diffractive Optics for Virtual Reality Displays”, Journal of the SID Eurodisplay 14/05, 2005, XP008093627, chapters 2-3, Figures 2 and 10, pp. 467-475.
Levola, Tapani , “Invited Paper: Novel Diffractive Optical Components for Near to Eye Displays—Nokia Research Center”, SID 2006 Digest, 2006 SID International Symposium, Society for Information Display, vol. XXXVII, May 24, 2005, chapters 1-3, figures 1 and 3, pp. 64-67.
Li, Yujia et al., “Graph matching networks for learning the similarity of graph structured objects”, ICML, arXiv:1904.12787v2 [cs.LG] May 12, 2019, (18 pages).
Li, Zhengqi et al., “Megadepth: Learning single-view depth prediction from internet photos”, In CVPR, arXiv:1804.00607v4 [cs.CV] Nov. 28, 2018, (10 pages).
Loiola, Eliane M et al., “A survey for the quadratic assignment problem”, European journal of operational research, 176(2): 2007, pp. 657-690.
Lowe, David G. , “Distinctive image features from scale-invariant keypoints”, International Journal of Computer Vision, 60(2): 91-110, 2004, (28 pages).
Luo, Zixin et al., “ContextDesc: Local descriptor augmentation with cross-modality context”, CVPR, arXiv:1904.04084v1 [cs.CV] Apr. 8, 2019, (14 pages).
Memon, F. et al., “Synthesis, Characterization and Optical Constants of Silicon Oxycarbide”, EPJ Web of Conferences; Publication [online]. Mar. 23, 2017 [retrieved Feb. 19, 2020]. <URL: https://www.epj-conferences.org/articles/epjconf/pdf/2017/08/epjconf_nanop2017_00002.pdf>; DOI: 10.1051/epjconf/201713900002, (8 pages).
Munkres, James , “Algorithms for the assignment and transportation problems”, Journal of the Society for Industrial and Applied Mathematics, 5(1): 1957, pp. 32-38.
Ono, Yuki et al., “LF-Net: Learning local features from images”, 32nd Conference on Neural Information Processing Systems (NIPS 2018), arXiv:1805.09662v2 [cs.CV] Nov. 22, 2018, (13 pages).
Paszke, Adam et al., “Automatic differentiation in Pytorch”, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, (4 pages).
Peyré, Gabriel et al., “Computational Optimal Transport”, Foundations and Trends in Machine Learning, 11(5-6):355-607, 2019; arXiv:1803.00567v4 [stat.ML] Mar. 18, 2020, (209 pages).
Qi, Charles R. et al., “Pointnet++: Deep hierarchical feature learning on point sets in a metric space.”, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA., (10 pages).
Qi, Charles R. et al., “Pointnet: Deep Learning on Point Sets for 3D Classification and Segmentation”, CVPR, arXiv:1612.00593v2 [cs.CV] Apr. 10, 2017, (19 pages).
Radenović, Filip et al., “Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking”, CVPR, arXiv:1803.11285v1 [cs.CV] Mar. 29, 2018, (10 pages).
Raguram, Rahul et al., “A comparative analysis of RANSAC techniques leading to adaptive real-time random sample consensus”, Computer Vision—ECCV 2008, 10th European Conference on Computer Vision, Marseille, France, Oct. 12-18, 2008, Proceedings, Part I, (15 pages).
Ranftl, René et al., “Deep fundamental matrix estimation”, European Conference on Computer Vision (ECCV), 2018, (17 pages).
Revaud, Jerome et al., “R2D2: Repeatable and Reliable Detector and Descriptor”, In NeurIPS, arXiv:1906.06195v2 [cs.CV] Jun. 17, 2019, (12 pages).
Rocco, Ignacio et al., “Neighbourhood Consensus Networks”, 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada, arXiv:1810.10510v2 [cs.CV] Nov. 29, 2018, (20 pages).
Rublee, Ethan et al., “ORB: An efficient alternative to SIFT or SURF”, Proceedings of the IEEE International Conference on Computer Vision. 2564-2571. 2011; 10.1109/ICCV.2011.612654, (9 pages).
Sattler, Torsten et al., “SCRAMSAC: Improving RANSAC's efficiency with a spatial consistency filter”, ICCV, 2009: 2090-2097., (8 pages).
Schonberger, Johannes L. et al., “Pixelwise view selection for unstructured multi-view stereo”, Computer Vision—ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, Oct. 11-14, 2016, Proceedings, Part III, pp. 501-518, 2016.
Schonberger, Johannes L. et al., “Structure-from-motion revisited”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 4104-4113, (11 pages).
Sinkhorn, Richard et al., “Concerning nonnegative matrices and doubly stochastic matrices.”, Pacific Journal of Mathematics, 1967, pp. 343-348.
Spencer, T. et al., “Decomposition of poly(propylene carbonate) with UV sensitive iodonium salts”, Polymer Degradation and Stability; [online]. Dec. 24, 2010 [retrieved Feb. 19, 2020]. <URL: http://kohl.chbe.gatech.edu/sites/default/files/linked_files/publications/2011Decomposition%20of%20poly(propylene%20carbonate)%20with%20UV%20sensitive%20iodonium%20salts.pdf>; DOI: 10.1016/j.polymdegradstab.2010.12.003, (17 pages).
Tanriverdi, Vildan et al., “Interacting With Eye Movements in Virtual Environments”, Department of Electrical Engineering and Computer Science, Tufts University; Proceedings of the SIGCHI conference on Human Factors in Computing Systems, Apr. 2000, pp. 1-8.
Thomee, Bart et al., “YFCC100m: The new data in multimedia research”, Communications of the ACM, 59(2):64-73, 2016; arXiv:1503.01817v2 [cs.MM] Apr. 25, 2016, (8 pages).
Torresani, Lorenzo et al., “Feature correspondence via graph matching: Models and global optimization”, Computer Vision—ECCV 2008, 10th European Conference on Computer Vision, Marseille, France, Oct. 12-18, 2008, Proceedings, Part II, (15 pages).
Tuytelaars, Tinne et al., “Wide baseline stereo matching based on local, affinely invariant regions”, BMVC, 2000, pp. 1-14.
Ulyanov, Dmitry et al., “Instance normalization: The missing ingredient for fast stylization”, arXiv:1607.08022v3 [cs.CV] Nov. 6, 2017, (6 pages).
Vaswani, Ashish et al., “Attention is all you need”, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA; arXiv:1706.03762v5 [cs.CL] Dec. 6, 2017, (15 pages).
Veličković, Petar et al., “Graph attention networks”, ICLR, arXiv:1710.10903v3 [stat.ML] Feb. 4, 2018, (12 pages).
Villani, Cédric, “Optimal transport: old and new”, vol. 338. Springer Science & Business Media, Jun. 2008, pp. 1-998.
Wang, Xiaolong et al., “Non-local neural networks”, CVPR, arXiv:1711.07971v3 [cs.CV] Apr. 13, 2018, (10 pages).
Wang, Yue et al., “Deep Closest Point: Learning representations for point cloud registration”, ICCV, arXiv:1905.03304v1 [cs.CV] May 8, 2019, (10 pages).
Wang, Yue et al., “Dynamic Graph CNN for learning on point clouds”, ACM Transactions on Graphics, arXiv:1801.07829v2 [cs.CV] Jun. 11, 2019, (13 pages).
Weissel, et al., “Process cruise control: event-driven clock scaling for dynamic power management”, Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems. Oct. 11, 2002 (Oct. 11, 2002) Retrieved on May 16, 2020 (May 16, 2020) from <URL: https://dl.acm.org/doi/pdf/10.1145/581630.581668>, pp. 238-246.
Yi, Kwang M. et al., “Learning to find good correspondences”, CVPR, arXiv:1711.05971v2 [cs.CV] May 21, 2018, (13 pages).
Yi, Kwang Moo et al., “Lift: Learned invariant feature transform”, ECCV, arXiv:1603.09114v2 [cs.CV] Jul. 29, 2016, (16 pages).
Zaheer, Manzil et al., “Deep Sets”, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA; arXiv:1703.06114v3 [cs.LG] Apr. 14, 2018, (29 pages).
Zhang, Jiahui et al., “Learning two-view correspondences and geometry using order-aware network”, ICCV; arXiv:1908.04964v1 [cs.CV] Aug. 14, 2019, (11 pages).
Zhang, Li et al., “Dual graph convolutional network for semantic segmentation”, BMVC, 2019; arXiv:1909.06121v3 [cs.CV] Aug. 26, 2020, (18 pages).
“Communication Pursuant to Article 94(3) EPC mailed on Feb. 28, 2023”, European Patent Application No. 19845418.3, (6 pages).
“Communication Pursuant to Article 94(3) EPC mailed on Jul. 28, 2023”, European Patent Application No. 19843487.0, (15 pages).
“Communication Pursuant to Article 94(3) EPC mailed on May 23, 2023”, European Patent Application No. 18890390.0, (5 pages).
“Communication Pursuant to Rule 164(1) EPC mailed on Feb. 23, 2022”, European Patent Application No. 20753144.3, (11 pages).
“Decision of Rejection mailed on Jan. 5, 2023 with English translation”, Chinese Patent Application No. 201880079474.6, (10 pages).
“Extended European Search Report issued on Apr. 5, 2023”, European Patent Application No. 20888716.6, (11 pages).
“Final Office Action mailed on Dec. 1, 2023”, U.S. Appl. No. 17/357,795, (18 pages).
“Final Office Action mailed on Mar. 10, 2023”, U.S. Appl. No. 17/357,795, (15 pages).
“Final Office Action mailed on Sep. 8, 2023 with English translation”, Japanese Patent Application No. 2020-566620, (18 pages).
“First Examination Report mailed on Aug. 8, 2023”, Australian Patent Application No. 2018379105, (3 pages).
“First Office Action mailed Apr. 21, 2023 with English translation”, Japanese Patent Application No. 2021-509779, (26 pages).
“First Office Action mailed Jul. 4, 2023 with English translation”, Japanese Patent Application No. 2021-505669, (6 pages).
“First Office Action mailed on Apr. 13, 2023 with English Translation”, Japanese Patent Application No. 2020-567766, (7 pages).
“First Office Action mailed on Dec. 22, 2022 with English translation”, Chinese Patent Application No. 201980061450.2, (11 pages).
“First Office Action mailed on Jan. 24, 2023 with English translation”, Japanese Patent Application No. 2020-549034, (7 pages).
“First Office Action mailed on Jan. 30, 2023 with English translation”, Chinese Patent Application No. 201980082951.9, (5 pages).
“First Office Action mailed on Jun. 13, 2023 with English translation”, Japanese Patent Application No. 2020-567853, (7 pages).
“First Office Action mailed on Mar. 27, 2023 with English translation”, Japanese Patent Application No. 2020-566617, (6 pages).
“First Office Action mailed on Mar. 6, 2023 with English translation”, Korean Patent Application No. 10-2020-7019685, (7 pages).
“First Office Action mailed on May 26, 2023 with English translation”, Japanese Patent Application No. 2021-500607, (6 pages).
“First Office Action mailed on May 30, 2023 with English translation”, Japanese Patent Application No. 2021-519873, (8 pages).
“First Office Action mailed Sep. 29, 2023 with English translation”, Japanese Patent Application No. 2023-10887, (5 pages).
“Non Final Office Action mailed on Aug. 2, 2023”, U.S. Appl. No. 17/807,600, (25 pages).
“Non Final Office Action mailed on Jul. 20, 2023”, U.S. Appl. No. 17/650,188, (11 pages).
“Non Final Office Action mailed on Jun. 14, 2023”, U.S. Appl. No. 17/516,483, (10 pages).
“Non Final Office Action mailed on Mar. 1, 2023”, U.S. Appl. No. 18/046,739, (34 pages).
“Non Final Office Action mailed on May 11, 2023”, U.S. Appl. No. 17/822,279, (24 pages).
“Non Final Office Action mailed on Nov. 22, 2023”, U.S. Appl. No. 17/268,376, (8 pages).
“Non Final Office Action mailed on Nov. 3, 2023”, U.S. Appl. No. 17/416,248, (17 pages).
“Non Final Office Action mailed on Oct. 11, 2023”, U.S. Appl. No. 17/357,795, (14 pages).
“Non Final Office Action mailed on Oct. 24, 2023”, U.S. Appl. No. 17/259,020, (21 pages).
“Notice of Allowance mailed on Jul. 27, 2023 with English translation”, Korean Patent Application No. 10-2020-7019685, (4 pages).
“Office Action mailed on Apr. 13, 2023 with English translation”, Japanese Patent Application No. 2020-533730, (13 pages).
“Office Action mailed on Jul. 20, 2023 with English translation”, Japanese Patent Application No. 2021-505884, (6 pages).
“Office Action mailed on Jun. 8, 2023 with English translation”, Japanese Patent Application No. 2021-503762, (6 pages).
“Office Action mailed on Mar. 30, 2023 with English translation”, Japanese Patent Application No. 2020-566620, (10 pages).
“Office Action mailed on Nov. 7, 2023 with English translation”, Korean Patent Application No. 10-2023-7036734, (5 pages).
“Penultimate Office Action mailed on Oct. 19, 2023 with English translation”, Japanese Patent Application No. 2021-509779, (5 pages).
“Second Office Action mailed on May 2, 2023 with English Translation”, Japanese Patent Application No. 2020-549034, (6 pages).
“Second Office Action mailed on Sep. 25, 2023 with English translation”, Japanese Patent Application No. 2020-567853, (8 pages).
“Wikipedia Dioptre”, Jun. 22, 2018 (Jun. 22, 2018), XP093066995, Retrieved from the Internet: URL: https://en.wikipedia.org/w/index.php?title=Dioptre&direction=next&oldid=846451540 [retrieved on Jul. 25, 2023], (3 pages).
Li, Yujia, et al., “Graph Matching Networks for Learning the Similarity of Graph Structured Objects”, arxiv.org, Cornell University Library, 201 Olin Library Cornell University Ithaca, NY 14853, XP081268608, Apr. 29, 2019.
Luo, Zixin , et al., “ContextDesc: Local Descriptor Augmentation With Cross-Modality Context”, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, XP033686823, DOI: 10.1109/CVPR.2019.00263 [retrieved on Jan. 8, 2020], Jun. 15, 2019, pp. 2522-2531.
Zhang, Zen , et al., “Deep Graphical Feature Learning for the Feature Matching Problem”, 2019 IEEE/CVF International Conference on Computer Vision (ICCV), IEEE, XP033723985, DOI: 10.1109/ICCV.2019.00519 [retrieved on Feb. 24, 2020], Oct. 27, 2019, pp. 5086-5095.
“Communication Pursuant to Article 94(3) EPC mailed on Oct. 6, 2023”, European Patent Application No. 19851373.1, (6 pages).
“Extended European Search Report issued on Jan. 8, 2024”, European Patent Application No. 23195266.4, (8 pages).
“First Office Action mailed Dec. 12, 2023 with English translation”, Japanese Patent Application No. 2021-545712, (8 pages).
“First Office Action mailed on Dec. 11, 2023”, Chinese Patent Application No. 201980032005.3, (10 pages).
“Office Action mailed Nov. 21, 2023 with English Translation”, Japanese Patent Application No. 2021-535716, (15 pages).
“Office Action mailed on Dec. 14, 2023 with English translation”, Japanese Patent Application No. 2021-526564, (13 pages).
“Office Action mailed on Nov. 8, 2023 with English translation”, Chinese Patent Application No. 201980060018.1, (12 pages).
“First Office Action mailed Dec. 20, 2023 with English translation”, Chinese Patent Application No. 201980050600.X, (21 pages).
“First Office Action mailed Dec. 27, 2023 with English translation”, Chinese Patent Application No. 201980075942.7, (7 pages).
“First Office Action mailed on Dec. 25, 2023 with English translation”, Chinese Patent Application No. 2019800046303.8, (13 pages).
“Non Final Office Action mailed on Feb. 26, 2024”, U.S. Appl. No. 18/046,739, (48 pages).
“Communication Pursuant to Article 94(3) EPC mailed on Feb. 21, 2024”, European Patent Application No. 20770244.0, (8 pages).
“Communication Pursuant to Article 94(3) EPC mailed on Mar. 11, 2024”, European Patent Application No. 20798769.4, (12 pages).
“Extended European Search Report issued on Apr. 25, 2024”, European Patent Application No. 23208907.8, (9 pages).
“First Office Action mailed Mar. 1, 2024 with English translation”, Japanese Patent Application No. 2021-553297, (5 pages).
“Office Action mailed on Feb. 19, 2024 with English translation”, Korean Patent Application No. 10-2020-7020552, (18 pages).
“Office Action mailed on Feb. 26, 2024 with English translation”, Chinese Patent Application No. 201980069194.1, (11 pages).
“Office Action mailed on Mar. 6, 2024 with English translation”, Chinese Patent Application No. 201980053016.X, (7 pages).
“Non Final Office Action mailed on May 16, 2024”, U.S. Appl. No. 18/361,546, (11 pages).

Related Publications (1)

	Number	Date	Country
	20210150252 A1	May 2021	US

Provisional Applications (1)

	Number	Date	Country
	62935597	Nov 2019	US

Abstract