This invention relates generally to computer vision, and more particularly to processing a sequence of images online.
In many computer vision applications, images can be processed to detect objects, or to improve the quality of the input images by, e.g., background subtraction, removing or reducing unwanted artifacts, noise and occlusions. In image processing, principal component analysis (PCA) is commonly applied for dimensionality reduction. However, when the image data contains unintended artifacts, such as gross corruptions, occlusions or outliers, the conventional PCA can fail. To solve this problem, robust PCA (RPCA) models can be used.
An online recursive RPCA can separate data samples in an online mode, i.e., using only a previous estimate and newly acquired data. Unlike conventional RPCA methods, which first save all the data samples and then process them in batch, the online RPCA significantly reduces the required memory and improves computational efficiency and convergence.
For multidimensional data (tensors) of order greater than 2, it is common to embed the data into a vector space by vectorizing the data such that conventional matrix-based approaches can still be used. Although this vectorization process works well in most cases, it restricts the effectiveness of the tensor representation in extracting information from the multidimensional perspective.
Alternatively, tensor algebraic approaches exhibit significant advantages in preserving multidimensional information when dealing with high order data. However, it is very time-consuming for the tensor RPCA to operate in batch mode because all of the high dimensional data needs to be stored and processed.
Tensor robust principal component analysis (PCA) is used in many image processing applications, such as background subtraction, denoising, and outlier and object detection. The embodiments of the invention provide an online tensor robust PCA in which multi-dimensional data, representing a set of images in the form of tensors, are processed sequentially. The tensor PCA updates the tensors based on the previous estimate and newly acquired data.
Compared to the conventional tensor robust PCA operating in batch mode, the invention significantly reduces the required amount of memory and improves computational efficiency. In addition, the method is superior in convergence speed and performance compared to conventional batch mode approaches. For example, the performance is at least 10% better than for matrix-based online robust PCA methods according to a relative squared error, and the speed of convergence is at least three times faster than for the matrix-based online robust PCA methods.
To reduce memory and increase computational efficiency, we provide an online tensor RPCA algorithm, which extends an online matrix PCA method to high dimensional data (tensor). The online tensor RPCA is based in part on a tensor singular value decomposition (t-SVD) structure.
The key idea behind this tensor algebraic framework is to construct group rings along the tensor tubes. For example, a 2-D array is regarded as a vector of tubes, and a 3-D tensor as a matrix of tubes; such a tensor framework has been used in high dimensional data compression and completion. The embodiments extend the batch tensor RPCA problem to sequential data collection, which reduces the required memory and increases efficiency.
In the example application shown in
As shown in
The set of input images 101 are acquired 210 by the processor either directly or indirectly, e.g., an image can be acquired 106 by a camera or a video camera, or be obtained by other means or from other sources, e.g., a memory transfer, or wired or wireless communication. For the purpose of the processing described herein, each image is represented by the image tensor Zt.
For each image at time step t, the following steps are performed. Data samples in the image are projected 220 onto the tensor coefficients Rt and the sparse tensor Et using the previous spanning tensor basis Lt−1, where Rt(t, :, :) = →R denotes the coefficients corresponding to the spanning basis Lt−1, and Et(:, t, :) = →E, for t = 1, 2, . . . , T.
The spanning tensor basis is updated 225 using the previous basis Lt−1 as the starting point. The updated spanning tensor basis Lt is saved for the next image to be processed. A low rank tubal tensor Xt = Lt*Rt^T and the sparse tensor Et are updated 230 to produce a set of output images Xt 104 and a set of sparse images Et 105, so that Xt + Et = Zt.
Overview of Tensor Framework
We describe the tensor structure used by the embodiments of the invention, taking a third-order tensor as an example. Instead of vectorizing an image of size n1×n3 into a vector of dimension n1n3 as in conventional image processing, we consider each image as a vector of tubal scalars (tubes) normal to the image plane. All such vectors form a free module over a commutative ring with identity, and the free module behaves similarly to a vector space, in which the notions of basis and dimension are well-defined.
Notations and Definitions
For a third-order tensor A of size n1×n2×n3, A(i, j, k) denotes the (i, j, k)th element of A and A(i, j, :) denotes the (i, j)th tubal scalar. A(i, :, :) is the ith horizontal slice, A(:, j, :) is the jth lateral slice, and A(:, :, k), also written A^(k), is the kth frontal slice of A.
t-Product
Let →u, →v ∈ R^{1×1×n3} be two tubal scalars of length n3. The t-product →w = →u * →v is the circular convolution of the two tubes,

→w(i) = Σ_{k=0}^{n3−1} →u(k) →v((i − k) mod n3),

where i = 0, 1, . . . , n3−1.
Given two third-order tensors A ∈ R^{n1×n2×n3} and B ∈ R^{n2×n4×n3}, the t-product C = A * B ∈ R^{n1×n4×n3} is defined by

C(i, l, :) = Σ_{j=1}^{n2} A(i, j, :) * B(j, l, :),

where i = 1, 2, . . . , n1 and l = 1, 2, . . . , n4. This is consistent with matrix multiplication, with the t-product ‘*’ between tubal scalars taking the role of scalar multiplication.
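The t-product defined above can be checked numerically. Below is a minimal sketch (assuming NumPy; `t_product` is our own illustrative helper, not part of the disclosure) that exploits the fact that circular convolution along the tubes becomes slice-wise matrix multiplication after an FFT along the third dimension:

```python
import numpy as np

def t_product(A, B):
    """t-product C = A * B of third-order tensors, computed in the
    Fourier domain where tube-wise circular convolution becomes
    slice-wise matrix multiplication."""
    n1, n2, n3 = A.shape
    m2, n4, m3 = B.shape
    assert n2 == m2 and n3 == m3, "inner tensor dimensions must agree"
    Ah = np.fft.fft(A, axis=2)
    Bh = np.fft.fft(B, axis=2)
    Ch = np.empty((n1, n4, n3), dtype=complex)
    for k in range(n3):
        Ch[:, :, k] = Ah[:, :, k] @ Bh[:, :, k]
    return np.real(np.fft.ifft(Ch, axis=2))

# Sanity check against the definition: for 1x1xn3 tubal scalars the
# t-product is exactly the circular convolution of the two tubes.
u = np.array([1.0, 2.0, 3.0]).reshape(1, 1, 3)
v = np.array([4.0, 5.0, 6.0]).reshape(1, 1, 3)
w = t_product(u, v)
w_direct = np.array([sum(u[0, 0, k] * v[0, 0, (i - k) % 3] for k in range(3))
                     for i in range(3)])
assert np.allclose(w.ravel(), w_direct)
```

The final assertion confirms that the Fourier-domain implementation matches the circular-convolution definition of the t-product for tubal scalars.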
Commutative Ring
Under the defined multiplication (t-product) and the usual entry-wise addition, the set of n3-tuples (tubal scalars) forms a commutative ring R(n3) with identity →e, where →e(0) = 1 and →e(i) = 0 for i = 1, . . . , n3−1.
Free-Module over the Commutative Ring
Let M(n1, n3) denote the set of all lateral slices of size n1×1×n3, i.e., length-n1 vectors whose entries are tubal scalars. Under the t-product, M(n1, n3) forms a free module over the commutative ring R(n3) of tubal scalars.
Moreover, M(n1, n3) behaves similarly to a vector space: there exists a spanning basis {→B1, →B2, . . . , →Bn1} such that any element →X ∈ M(n1, n3) can be written as a t-linear combination

→X = Σ_{i=1}^{n1} →Bi * →ci,

where the coefficients →ci are tubal scalars in R(n3).
Tensor-PCA and Tensor Singular Value Decomposition (t-SVD)
Similar to the matrix PCA, which identifies a lower-dimensional subspace approximately containing the data, we consider a tensor PCA for high-order tensor data. We focus on third-order tensors. Suppose the 2-D data samples come from a lower dimensional free submodule of the free module of lateral slices; the goal is then to identify a spanning basis of this free submodule.
t-SVD
Given n2 2-D data samples X1, . . . , Xn2, each of size n1×n3, we arrange them as the lateral slices of a third-order tensor X ∈ R^{n1×n2×n3}.
As shown in
X = U*S*V^T, (4)

where U ∈ R^{n1×d×n3} and V ∈ R^{n2×d×n3} are orthogonal tensors, i.e., U^T*U = V^T*V = I, and
S ∈ R^{d×d×n3} is an f-diagonal tensor, i.e., each of its frontal slices is a diagonal matrix.
Based on the relation between the circular convolution and the discrete Fourier transform (DFT), we can determine the t-SVD via an SVD in the Fourier domain. Let {circumflex over (X)} be the DFT along the third dimension of tensor X represented by {circumflex over (X)}=fft(X, [ ], 3). Given SVD in the Fourier domain
[Û(:, :, k), Ŝ(:, :, k), V̂(:, :, k)]=SVD({circumflex over (X)}(:, :, k)), for k=1, . . . , n3, we can determine the t-SVD in Eqn. (4) by

U=ifft(Û, [ ], 3), S=ifft(Ŝ, [ ], 3), V=ifft(V̂, [ ], 3), (5)
where fft and ifft represent the fast Fourier transform and its inverse, respectively.
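The Fourier-domain recipe of Eqns. (4)-(5) can be sketched as follows (NumPy assumed; `t_svd`, `t_product`, and `t_transpose` are our own illustrative helper names, not from the disclosure):

```python
import numpy as np

def t_product(A, B):
    """t-product via slice-wise multiplication in the Fourier domain."""
    Ch = np.einsum('ijk,jlk->ilk', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
    return np.real(np.fft.ifft(Ch, axis=2))

def t_transpose(A):
    """Tensor transpose: transpose each frontal slice and reverse the
    order of frontal slices 2 through n3."""
    return np.concatenate((A[:, :, :1], A[:, :, 1:][:, :, ::-1]),
                          axis=2).transpose(1, 0, 2)

def t_svd(X):
    """t-SVD X = U * S * V^T via an SVD of every frontal slice of fft(X)."""
    n1, n2, n3 = X.shape
    d = min(n1, n2)
    Xh = np.fft.fft(X, axis=2)
    Uh = np.empty((n1, d, n3), dtype=complex)
    Sh = np.zeros((d, d, n3), dtype=complex)
    Vh = np.empty((n2, d, n3), dtype=complex)
    for k in range(n3):
        u, s, vh = np.linalg.svd(Xh[:, :, k], full_matrices=False)
        Uh[:, :, k], Sh[:, :, k], Vh[:, :, k] = u, np.diag(s), vh.conj().T
    # Eqn. (5): inverse FFT along the third dimension; for real X the
    # imaginary parts vanish up to numerical round-off.
    f = lambda M: np.real(np.fft.ifft(M, axis=2))
    return f(Uh), f(Sh), f(Vh)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 3))
U, S, V = t_svd(X)
X_rec = t_product(t_product(U, S), t_transpose(V))
assert np.allclose(X, X_rec)  # X = U * S * V^T
```

The final assertion reconstructs X through the t-product, verifying the factorization of Eqn. (4).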
Note that many properties of the matrix SVD are retained in the t-SVD, an important one being that the truncated t-SVD gives a provably optimal low-tubal-rank approximation for dimension reduction.
Online Tensor Robust PCA
Now we consider the problem of recovering a tensor lying in a low dimensional free submodule from sparsely corrupted data. Suppose we have a third-order tensor Z that can be decomposed as,
Z=X+E, (6)
where X is a tensor with low tensor tubal rank and E is a sparse tensor. The problem of recovering X and E separately, termed tensor RPCA, can be formulated as the optimization problem

min_{X, E} ∥X∥TNN + λ∥E∥1 subject to Z = X + E, (7)

where λ > 0 is a predetermined weighting factor, ∥X∥TNN denotes the tensor nuclear norm, defined as the summation of all the singular values of the tensor X in the t-SVD sense, and ∥E∥1 = Σ_{i,j,k} |E(i, j, k)|.
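Because the tensor nuclear norm is defined through the t-SVD, it can be computed as the sum of the matrix singular values of the frontal slices in the Fourier domain. A minimal sketch (NumPy assumed; `tensor_nuclear_norm` is our own helper, and note that some works scale this definition by 1/n3):

```python
import numpy as np

def tensor_nuclear_norm(X):
    """Sum of all singular values of X in the t-SVD sense, computed as
    the sum of the matrix singular values of every frontal slice of
    fft(X, axis=2)."""
    Xh = np.fft.fft(X, axis=2)
    return sum(np.linalg.svd(Xh[:, :, k], compute_uv=False).sum()
               for k in range(X.shape[2]))

# A tensor whose only nonzero frontal slice is diag(3, 2): the FFT along
# the tubes repeats that slice n3 times, so the TNN is n3 * (3 + 2).
X = np.zeros((2, 2, 4))
X[:, :, 0] = np.diag([3.0, 2.0])
tnn = tensor_nuclear_norm(X)
assert np.isclose(tnn, 4 * 5.0)
```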
Note that Eqn. (7) is equivalent to the following unconstrained problem,

min_{X, E} (1/2)∥Z − X − E∥F^2 + λ1∥X∥TNN + λ2∥E∥1, (8)

with λ1, λ2 > 0.
Now, we describe an implementation of tensor PCA that operates online. Suppose the 2-D data samples Z(:, i, :), i=1, 2, . . . , T representing the set of images 102 are acquired 210 sequentially. Our goal is to estimate the spanning basis (principal components) of X online as the images are received at the processor 100, and separate the sparse tensor concurrently. In order to proceed, we rely on the following lemma.
For a third-order tensor X ∈ R^{n1×n2×n3} with tubal rank at most r, the tensor nuclear norm admits the factorization characterization

∥X∥TNN = min_{L, R} (n3/2)(∥L∥F^2 + ∥R∥F^2) subject to X = L*R^T,

where L ∈ R^{n1×r×n3} and R ∈ R^{n2×r×n3}.
Using the above lemma, we re-write Eqn. (8) as

min_{L, R, E} (1/2)∥Z − L*R^T − E∥F^2 + (λ1 n3/2)(∥L∥F^2 + ∥R∥F^2) + λ2∥E∥1,

where L ∈ R^{n1×r×n3} is the spanning tensor basis, R ∈ R^{T×r×n3} contains the corresponding tensor coefficients, and r is an upper bound on the tubal rank of X.
For sequentially acquired data {→Z1, →Z2, . . . , →ZT}, where each sample →Zt ∈ R^{n1×1×n3} is a lateral slice of Z, the objective decomposes into a sum of per-sample losses, so the coefficients, the sparse component, and the spanning basis can be updated one sample at a time.
Input to Algorithm 1 includes the sequentially acquired data and the number of time rounds T. For simplicity, we use Â to denote fft(A, [ ], 3) for a tensor A ∈ R^{n1×n2×n3}, and define the block-diagonal matrix

Ā = blkdiag(Â) = diag(Â^(1), Â^(2), . . . , Â^(n3)),

where Â^(k) is the kth frontal slice of Â.
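This block-diagonal representation is convenient because the t-product becomes an ordinary matrix product of block-diagonal matrices. A short sketch illustrating this (NumPy assumed; `blkdiag_fft` and `t_product` are our own illustrative helpers):

```python
import numpy as np

def blkdiag_fft(A):
    """Block-diagonal matrix diag(Ahat^(1), ..., Ahat^(n3)) built from
    the frontal slices of the DFT of A along the third dimension."""
    Ah = np.fft.fft(A, axis=2)
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=complex)
    for k in range(n3):
        M[k * n1:(k + 1) * n1, k * n2:(k + 1) * n2] = Ah[:, :, k]
    return M

def t_product(A, B):
    Ch = np.einsum('ijk,jlk->ilk', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
    return np.real(np.fft.ifft(Ch, axis=2))

# The t-product corresponds to an ordinary matrix product of the
# block-diagonal Fourier-domain representations.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4, 5))
B = rng.standard_normal((4, 2, 5))
lhs = blkdiag_fft(t_product(A, B))
rhs = blkdiag_fft(A) @ blkdiag_fft(B)
assert np.allclose(lhs, rhs)
```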
One key idea of our online tensor RPCA algorithm is that, for each new image Zt, we minimize a loss function over Zt given the previous estimate of the spanning tensor basis Lt−1, to produce the optimal Rt and Et. Then, we use the latest estimated components to update 225 the spanning basis Lt by minimizing a cumulative loss.
Specifically, Rt and Et are optimized in step 3, with details given in Algorithm 3. In the data projection 220 step in Algorithm 2, Sλ[·] is the element-wise soft-thresholding operator defined by

Sλ[x] = sign(x) max(|x| − λ, 0).

To update 225 the spanning basis Lt, we have

Lt = argmin_L (1/2)tr[L^T*(At + λ1 I)*L] − tr[L^T*Bt],

where tr is the trace operator and I is the identity tensor.
Let At = At−1 + →Rt * →Rt^T and Bt = Bt−1 + (→Zt − →Et) * →Rt^T, where →Rt ∈ R^{r×1×n3} and →Et ∈ R^{n1×1×n3} are the coefficient and sparse components estimated at time t; At ∈ R^{r×r×n3} and Bt ∈ R^{n1×r×n3} accumulate the information from all samples seen so far.
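One round of the online update can be sketched as follows. This is a simplified illustration, not the disclosure's Algorithms 1-3: the alternating solver for →Rt and →Et, the fixed iteration count, and the closed-form slice-wise basis refresh are all our assumptions (NumPy assumed):

```python
import numpy as np

def soft_threshold(x, lam):
    """Element-wise soft-thresholding S_lam[x] = sign(x) * max(|x| - lam, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def online_step(Z_t, L, A, B, lam1, lam2, n_iter=50):
    """One round of the online update, sketched in the Fourier domain.

    Z_t : (n1, 1, n3) newly acquired lateral slice.
    L   : (n1, r, n3) current spanning basis.
    A, B: accumulation tensors of sizes (r, r, n3) and (n1, r, n3).
    """
    n1, r, n3 = L.shape
    Zh = np.fft.fft(Z_t, axis=2)
    Lh = np.fft.fft(L, axis=2)
    Rh = np.zeros((r, 1, n3), dtype=complex)
    E = np.zeros_like(Z_t)
    # Alternate between the coefficients R_t (regularized least squares
    # per Fourier slice) and the sparse part E_t (soft-thresholding in
    # the original domain).
    for _ in range(n_iter):
        Eh = np.fft.fft(E, axis=2)
        for k in range(n3):
            G = Lh[:, :, k]
            Rh[:, :, k] = np.linalg.solve(
                G.conj().T @ G + lam1 * np.eye(r),
                G.conj().T @ (Zh[:, :, k] - Eh[:, :, k]))
        LR = np.real(np.fft.ifft(np.einsum('ijk,jlk->ilk', Lh, Rh), axis=2))
        E = soft_threshold(Z_t - LR, lam2)
    R_t = np.real(np.fft.ifft(Rh, axis=2))
    # Accumulate A_t = A_{t-1} + R_t * R_t^T and
    # B_t = B_{t-1} + (Z_t - E_t) * R_t^T (t-products, done slice-wise
    # in the Fourier domain), then refresh the basis slice-by-slice.
    Ah = np.fft.fft(A, axis=2) + np.einsum('ijk,ljk->ilk', Rh, Rh.conj())
    Bh = np.fft.fft(B, axis=2) + np.einsum('ijk,ljk->ilk',
                                           np.fft.fft(Z_t - E, axis=2), Rh.conj())
    for k in range(n3):
        Lh[:, :, k] = Bh[:, :, k] @ np.linalg.inv(Ah[:, :, k] + lam1 * np.eye(r))
    f = lambda M: np.real(np.fft.ifft(M, axis=2))
    return f(Lh), R_t, E, f(Ah), f(Bh)

# Usage on synthetic data with illustrative sizes.
rng = np.random.default_rng(2)
n1, r, n3 = 6, 2, 4
L0 = rng.standard_normal((n1, r, n3))
Z_t = rng.standard_normal((n1, 1, n3))
A0 = np.zeros((r, r, n3))
B0 = np.zeros((n1, r, n3))
L1, R_t, E_t, A1, B1 = online_step(Z_t, L0, A0, B0, lam1=0.1, lam2=0.5)
```

Only Lt, At, and Bt are carried between time steps, which is what makes the method online: no previously acquired image has to be revisited.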
For the batch tensor robust PCA, all the data samples up to image T, i.e., every entry of {Zi}, i = 1, . . . , T, must be stored. Therefore, the memory requirement of the batch tensor robust PCA is n1n3T.
For the online tensor robust PCA, we only need to save Lt−1 ∈ R^{n1×r×n3}, At ∈ R^{r×r×n3}, and Bt ∈ R^{n1×r×n3} between time steps, so the memory requirement is (2n1r + r^2)n3, independent of T. Because the tubal rank r is typically much smaller than n1 and T, the memory saving is substantial for long image sequences.
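To make the comparison concrete, the following sketch counts stored entries in both modes; the sizes are illustrative assumptions (not from the disclosure), and the online count assumes only Lt−1, At, and Bt are kept between steps:

```python
# Illustrative sizes (assumptions): 144x192 images, T = 500 frames,
# tubal rank r = 10.
n1, n3, T, r = 144, 192, 500, 10

# Batch mode stores every entry of {Z_i}, i = 1, ..., T.
batch_entries = n1 * n3 * T

# Online mode keeps only L_{t-1} (n1 x r x n3), A_t (r x r x n3),
# and B_t (n1 x r x n3) between time steps.
online_entries = (2 * n1 * r + r * r) * n3

ratio = batch_entries / online_entries  # roughly 24x fewer entries here
```

Unlike the batch count, the online count does not grow with T, so the ratio improves as more images are processed.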
Other Image Processing Applications
The invention can also be used for other applications. In the case of background subtraction, also known as foreground detection, the foreground of an image is extracted for further processing, such as detection and recognition of objects, e.g., pedestrians and vehicles. Background subtraction can be used for detecting moving objects in a sequence of images (video), and provides important cues for numerous applications in computer vision, for example surveillance tracking or human pose estimation.
In the processing method according to embodiments of the invention, the output background images would be constructed from the low rank tubal tensor X, and the foreground images are constructed from the sparse tensors E.
In the case of noise reduction, the reduced noise images would be derived from the low rank tubal tensor X. The sparse tensor E, representing the noise, can essentially be discarded.
Although the invention has been described by way of examples of preferred embodiments, it is understood that various other adaptations and modifications can be made within the spirit and scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.
Prior Publication: US 20170076180 A1, Mar. 2017, US.