The cost of collecting information has been substantially reduced with the arrival of the Information Age. Sensing equipment, for example, has become increasingly inexpensive, accurate, and pervasive in society. Analyzing data from this sensing equipment may reveal new trends or relationships. The collected data, however, may be extremely large and contain billions of entries. Further, the collected data may be streaming data that continuously grows as new information is gathered. Applying conventional data analysis techniques to these extremely large data sets to identify new relationships requires immense computational and storage resources.
Aspects and examples are directed to various techniques for generating low-rank matrix approximations of large data sets. These low-rank matrix approximations are significantly smaller than the original data set and may be analyzed in a similar fashion to, for example, identify trends or locate features of interest in the data set. Thereby, low-rank approximations reduce the storage and/or computational requirements for analyzing these large data sets.
The low-rank matrix approximation techniques disclosed herein leverage the structural information revealed by performing singular value decomposition (SVD) on a given matrix to generate a compact representation of the input matrix that is a close approximation of the input. According to certain aspects, it is appreciated that performing SVD is a computationally expensive step that can be replaced by various incremental techniques that update a previously generated SVD with new information. As described in more detail below, employing SVD updates in place of the computationally expensive SVD operation reduces the computational complexity at minimal, if any, cost to the accuracy of the resulting low-rank approximation. Thereby, the low-rank approximation of an input matrix may be generated in less time with fewer computational resources.
According to one aspect, a system for generating a low-rank approximation of a matrix including a plurality of data entries is provided. The system includes a memory, at least one processor coupled to the memory, and a sketching component executable by the at least one processor. The sketching component may be configured to receive the matrix and at least one desired dimension of the low-rank approximation, identify at least a first set of right singular vectors and a first set of singular values of a subset of the matrix, reduce the subset by an amount of energy of a selected data entry of the subset based on the first set of right singular vectors and the first set of singular values, incorporate at least one new data entry from the matrix into the subset, update the first set of right singular vectors and the first set of singular values of the subset based on the at least one new data entry to produce an updated first set of right singular vectors and an updated first set of singular values, and generate the low-rank approximation of the matrix based on the updated first set of right singular vectors and the updated first set of singular values. It is appreciated that the sketching component may update the first set of right singular vectors and the first set of singular values by one or more singular value decomposition (SVD) update operations.
In some examples, the at least one desired dimension of the low-rank approximation includes at least one of a desired row size and a desired column size of the low-rank approximation. In some examples, the selected data entry of the subset is one of a data entry with a least amount of energy in the subset and a data entry with a median amount of energy in the subset. In some examples, the plurality of data entries in the matrix are representative of an image.
In some examples, the sketching component is configured to update the first set of right singular vectors and the first set of singular values by determining a singular value decomposition (SVD) of a combination of the first set of singular values and a projection of the new data entry onto the first set of right singular vectors to produce at least a second set of right singular vectors and a second set of singular values. The sketching component may be further configured to update the first set of right singular vectors and the first set of singular values based on at least the second set of singular values, the first set of right singular vectors, and the second set of right singular vectors.
In some examples, the sketching component is configured to update the first set of right singular vectors and the first set of singular values by projecting the new data entry onto the first set of right singular vectors and generating an orthonormal basis for an orthogonal component of the data entry to the first set of right singular vectors. The sketching component may be further configured to update the first set of right singular vectors and the first set of singular values by determining a singular value decomposition (SVD) of a combination of the first set of singular values, the projection of the new data entry onto the first set of right singular vectors, and the orthonormal basis to produce at least a second set of right singular vectors and a second set of singular values. In at least one example, the sketching component may be further configured to update the first set of right singular vectors and the first set of singular values based on at least the second set of singular values, the orthonormal basis, the first set of right singular vectors, and the second set of right singular vectors.
According to at least one aspect, a computer implemented method of generating a low-rank approximation of a matrix including a plurality of data entries is provided. The method includes receiving, by at least one processor, the matrix and at least one desired dimension of the low-rank approximation, the matrix containing a plurality of data entries, identifying, by the at least one processor, a first set of right singular vectors and a first set of singular values of a subset of the matrix, reducing, by the at least one processor, the subset by an amount of energy of a selected data entry of the subset based on the first set of right singular vectors and the first set of singular values, incorporating, by the at least one processor, at least one new data entry from the matrix into the subset, updating, by the at least one processor, the first set of right singular vectors and the first set of singular values based on the at least one new data entry to produce an updated first set of right singular vectors and an updated first set of singular values; and generating, by the at least one processor, the low-rank approximation of the matrix based on the updated first set of right singular vectors and the updated first set of singular values.
In some examples, updating the first set of right singular vectors and the first set of singular values includes determining a singular value decomposition (SVD) of a combination of the first set of singular values and a projection of the new data entry onto the first set of right singular vectors to generate at least a second set of right singular vectors and a second set of singular values. In these examples, updating the first set of right singular vectors and the first set of singular values may further include updating the first set of right singular vectors and the first set of singular values based on at least the second set of singular values, the first set of right singular vectors, and the second set of right singular vectors.
In some examples, updating the first set of right singular vectors and the first set of singular values includes projecting the new data entry onto the first set of right singular vectors and generating an orthonormal basis for an orthogonal component of the data entry to the first set of right singular vectors. In these examples, updating the first set of right singular vectors and the first set of singular values may further include determining a singular value decomposition (SVD) of a combination of the first set of singular values with the projection of the new data entry onto the first set of right singular vectors and the orthonormal basis to generate at least a second set of right singular vectors and a second set of singular values. In these examples, updating the first set of right singular vectors and the first set of singular values may further include updating the first set of right singular vectors and the first set of singular values based on at least the second set of singular values, the orthonormal basis, the first set of right singular vectors, and the second set of right singular vectors.
According to at least one aspect, a system for latent semantic indexing (LSI) is provided. The system for LSI includes a memory, at least one processor coupled to the memory, and a sketching component executable by the at least one processor. The sketching component may be configured to receive a term-document matrix defining a frequency of a plurality of terms in a plurality of documents and to generate a low-rank approximation matrix based on the term-document matrix. The LSI system may further include an approximated term-document generator executable by the at least one processor and configured to generate an approximated term-document matrix based on the low-rank approximation matrix. The LSI system may further include a scoring component configured to receive a query defining a search and apply the query to the approximated term-document matrix to generate at least one relevancy score.
In some examples, the approximated term-document generator is configured to generate the approximated term-document matrix at least in part by determining a product of the sketch matrix with a transposed version of the sketch matrix and determining a singular value decomposition (SVD) of the product to determine singular values, left singular vectors, and right singular vectors of the product. In these examples, the approximated term-document generator may be configured to generate the approximated term-document matrix at least in part by constructing the approximated term-document matrix from the sketch in addition to the singular values, the left singular vectors, and the right singular vectors of the product.
Still other aspects, examples, and advantages of these exemplary aspects and examples are discussed in detail below. Examples disclosed herein may be combined with other examples in any manner consistent with at least one of the principles disclosed herein, and references to “an example,” “some examples,” “an alternate example,” “various examples,” “one example” or the like are not necessarily mutually exclusive and are intended to indicate that a particular feature, structure, or characteristic described may be included in at least one example. The appearances of such terms herein are not necessarily all referring to the same example.
Various aspects of at least one example are discussed below with reference to the accompanying figures, which are not intended to be drawn to scale. The figures are included to provide illustration and a further understanding of the various aspects and examples, and are incorporated in and constitute a part of this specification, but are not intended as a definition of the limits of the invention. In the figures, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every figure.
Aspects and examples are directed to low-rank matrix approximations of large data sets. These low-rank matrix approximations are significantly smaller than the original input matrix and, thereby, reduce the storage and/or computational requirements for analyzing these large data sets. The low-rank matrix approximation techniques disclosed herein leverage the structural information revealed by performing singular value decomposition (SVD) on a given input matrix to generate a compact representation of the input matrix. Performing SVD, however, is a very computationally expensive operation. Replacing SVD operations with an incremental SVD update as described below reduces the computational complexity of determining a low-rank approximation of a matrix with only a minimal, if any, degradation of the accuracy of the resulting low-rank approximation.
It is to be appreciated that examples of the methods and apparatuses discussed herein are not limited in application to the details of construction and the arrangement of components set forth in the following description or illustrated in the accompanying drawings. The methods and apparatuses are capable of implementation in other examples and of being practiced or of being carried out in various ways. Examples of specific implementations are provided herein for illustrative purposes only and are not intended to be limiting. In particular, acts, elements, and features discussed in connection with any one or more examples are not intended to be excluded from a similar role in any other example. Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. Any references to examples or elements or acts of the systems and methods herein referred to in the singular may also embrace examples including a plurality of these elements, and any references in plural to any example or element or act herein may also embrace examples including only a single element. The use herein of “including,” “comprising,” “having,” “containing,” “involving,” and variations thereof is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. References to “or” and “and/or” may be construed as inclusive so that any terms described using “or” and “and/or” may indicate any of a single, more than one, and all of the described terms. Any references to front and back, left and right, top and bottom, upper and lower, and vertical and horizontal are intended for convenience of description, not to limit the present systems and methods or their components to any one positional or spatial orientation.
Singular value decomposition (SVD) reveals structural information about a matrix by reorganizing the data into subspaces that have the most variation. The SVD of a matrix A with n rows and m columns (e.g., an n by m matrix) decomposes the matrix A into three matrices including a U matrix, a Σ matrix, and a V matrix as illustrated below in equations (1)-(4).

A = UΣV^T (1)
U = [u_1 u_2 . . . u_r] (2)
Σ = diag(σ_1, σ_2, . . . , σ_r) (3)
V = [v_1 v_2 . . . v_r] (4)
The matrix U contains the left singular vectors ui of the matrix A. The matrix Σ contains singular values σi along the diagonal of the matrix Σ and zeros for all of the other values. The number of singular values σi in the matrix Σ is equal to the rank r of the matrix A. The rank of a matrix A is the dimension of the vector space generated (or spanned) by its columns, which is the same as the dimension of the space spanned by its rows. The matrix V contains the right singular vectors vi of the matrix A.
SVD offers an additional useful property due to the reorganization of the data discussed above. The best rank-k approximation of a matrix A (referred to as Ak) may be obtained directly from the SVD of A by taking the first k left singular vectors ui, the first k singular values σi, and the first k right singular vectors vi as illustrated below in equation (5).
A_k = U_kΣ_kV_k^T = Σ_(i=1)^(k) u_iσ_iv_i^T (5)
The rank-k approximation of A (Ak) is smaller than A and, thereby, reduces the storage requirements of storing the information in the matrix A. The rank-k approximation may also remove noise from the matrix A by removing the smallest singular values (e.g., σk+1 through σr) and the corresponding singular vectors (e.g., uk+1 through ur and vk+1 through vr), which are representative of the data with the least amount of variation.
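For illustration, equation (5) may be evaluated with a numerical SVD routine; the following numpy sketch (the function name is illustrative) truncates the decomposition to the first k components:

import numpy as np

def best_rank_k(A, k):
    # Best rank-k approximation A_k = U_k Sigma_k V_k^T per equation (5).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular values and their singular vectors.
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

A = np.random.rand(100, 40)
A2 = best_rank_k(A, 2)
print(np.linalg.matrix_rank(A2))  # prints 2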
Various methods may be performed to determine the SVD of a given matrix A including, for example, QR decomposition. These methods, however, are of great computational complexity. Given an n by m matrix A with n ≥ m, calculating the SVD of A has O(nm^2) time complexity. Performing SVD quickly becomes infeasible for large matrices because the SVD operation scales cubically with the dimensions of the input matrix. Accordingly, various methods may generate a low-rank approximation of A by performing one or more SVD operations on smaller matrices including, for example, matrix sketching as described in more detail below.
Matrix sketching generates a significantly smaller matrix B that approximates the original matrix A without performing SVD of the entire matrix A. Due to the smaller size of the sketch B, it is inherently low-rank. The sketch B may have the same number of rows as the input matrix A and a configurable number of columns l that may be altered to reduce (or increase) the rank of the resulting sketch B. Conversely, the sketch B may have the same number of columns as the input matrix A and a configurable number of rows l. The number of columns (or rows) l in B may be substantially smaller than the number of columns (or rows) in A.
The matrix sketching techniques disclosed herein have particularly desirable characteristics including, for example, being deterministic, yielding mergeable results, and being capable of processing streaming data. Deterministic techniques provide a consistent output given the same input. For example, the matrix sketching techniques described herein yield the same sketch given the same input matrix. The mergeable result provided by these matrix sketching techniques enables the data to be processed in a parallel fashion by, for example, splitting the data set into multiple subsections, processing the subsections independently, and merging the results into a final sketch of the entire input matrix. The ability to process streaming data enables the low-rank approximation to be updated with new information without having to review the entire original data set. For example, a new row or column may be added to the input matrix and the corresponding sketch may be updated based on only the new row or column and the previously generated sketch.
The matrix sketching process builds a sketch matrix B by filling the matrix with new data points (e.g., columns) from an input matrix A until the l columns of B are full, and then decrementing the B matrix (e.g., compacting) to make new space available for additional data points. Decrementing the matrix B does require an SVD operation; however, the SVD operation is performed on matrix B, which is substantially smaller than matrix A.
In act 102, the system determines whether there is new data to incorporate into the sketch B. If there is not any new data to add to the sketch B, the process 100A ends. Otherwise, the system proceeds to act 104 and determines whether the sketch B is full (e.g., the l columns in matrix B are full). If the matrix B is not full, the system proceeds to act 112 and adds the new data point to an available column in B. Otherwise, the system decrements the matrix B in act 114 to make a column available and adds the new data point to the available column in act 112.
The system may decrement the matrix B in act 114 by performing a series of steps as illustrated by acts 106, 108, and 110 in process 100A. In act 106, the system updates the SVD of matrix B to yield a U matrix, a Σ matrix, and a V matrix. As discussed above, performing SVD on a matrix reorganizes the information in the matrix. For example, multiplying the U matrix by the Σ matrix (e.g., to produce a UΣ matrix) may reorganize the columns such that the columns are in descending order from most energy to least energy. An SVD update operation is employed in act 106 to generate new U, Σ, V matrices that incorporate the most recently added data point based on the U, Σ, V matrices from the previous iteration and the latest data point, as described in more detail below.
In act 108, the system determines the norm squared, δ, of the last column in the reorganized matrix (i.e., the column with the least amount of energy) and proceeds to act 110 to subtract δ from all of the columns in matrix B. Thereby, the last column in the matrix B is zeroed and the values in the remaining columns are reduced by δ.
It is appreciated that various alterations may be made to process 100A to generate a sketch B of a matrix. For example, as illustrated by process 100B, the system may decrement the matrix B based on the energy of a median column rather than the column with the least energy, zeroing roughly half of the columns in B and making space available for multiple new data points per decrement.
The sketching processes described above yield low-rank approximations of a given matrix A at a lower computational cost than generating Ak by performing SVD on the entire matrix A (e.g., O(nm^2)). In addition, the computational complexity is further reduced relative to other matrix sketching processes by employing an SVD update to incorporate the new data entry in act 106 as opposed to performing SVD of the full matrix B. For illustration, a matrix sketching procedure employing a full SVD of matrix B in each iteration is illustrated below in Procedure 1, which receives an n by m matrix A and a desired number of rows l in the output sketch B.
Input: l, A ∈ ℝ^(n×m)
1: B_0 ← empty matrix ∈ ℝ^(l×m)
2: for i ∈ [n] do
3: Set B^+ ← B_(i−1) with the last row replaced with A_i
4: [U, Σ, V] ← SVD(B^+)
5: δ ← σ_l^2
6: Σ̌ ← √max(Σ^2 − I_l δ, 0)
7: B_i ← Σ̌V^T
8: end for
Return: B_n
As illustrated in Procedure 1, the system defines a sketch matrix B as an l by m matrix in step 1. The system proceeds through a loop in steps 2-8, incorporating data from A into the sketch B one entry (e.g., row) at a time. In step 3, the last row of matrix B (from the previous iteration) is replaced with A_i. Note that the last row of matrix B is already zeroed from the decrement process performed by steps 4-7 (for the second and any subsequent iterations). In the decrement process, the system determines the SVD of B (at the current iteration), determines the square of the norm of the last row, and subtracts the determined norm square value from all of the rows. The resulting time complexity of Procedure 1 is O(nml^2), which can be reduced by updating an existing SVD of matrix B in each iteration consistent with processes 100A and 100B as discussed in more detail below.
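For illustration, Procedure 1 may be rendered as the following numpy sketch; the function name and the row-at-a-time loop structure are assumptions of this example rather than requirements of the procedure:

import numpy as np

def sketch_procedure1(A, l):
    # Procedure 1: maintain an l x m sketch B of A, recomputing a full SVD
    # of B after each row of A is placed into B's zeroed last row.
    n, m = A.shape
    B = np.zeros((l, m))
    for i in range(n):
        B[-1, :] = A[i, :]                                # step 3
        U, s, Vt = np.linalg.svd(B, full_matrices=False)  # step 4: O(ml^2)
        delta = s[-1] ** 2                                # step 5: least energy
        s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))      # step 6: shrink
        B = np.diag(s) @ Vt                               # step 7: last row zeroed
    return B

Because the results are mergeable, two independently computed sketches B1 and B2 of different halves of A may, for example, be combined by stacking them and sketching the result (e.g., sketch_procedure1(np.vstack([B1, B2]), l)).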
SVD updating techniques may be applied to an existing SVD of a matrix to reflect various changes to the matrix including, for example, incorporating new rows or columns, editing various entries, and applying shifts to the entries. These SVD update operations may be less computationally intensive than determining a new SVD of the modified input matrix.
The sketching Procedure 1 described above computes a new SVD in each iteration with a revised B matrix incorporating one or more additional rows relative to the previously computed SVD. As discussed in more detail below, the SVD of the previous iteration may be updated by incorporating the new data into the previous SVD. The new information to incorporate into the matrix B may be represented as a row b, and the new SVD may be represented by equation (6) below.

SVD([B; b]) = U_bΣ_bV_b^T (6)
In at least some matrix sketching examples, only the Σb and Vb matrices are needed from the SVD of matrix B. Consequently, a tailored SVD update process 300 that forgoes computing the left singular vectors may be employed, as described below.
In act 302, the system projects the new data point b onto the right singular vectors V (an orthonormal basis of ℝ^m) of the previous SVD of B. The projection of b onto V yields q as illustrated below in equation (7).
q=bV (7)
In act 304, the system generates the new singular values and basis rotations. The system may generate the new singular values and basis rotations by appending q to the bottom of Σ (from the previous SVD of B) and determining the SVD of the resulting matrix as illustrated in equation (8) below.

SVD([Σ; q]) = ǓΣ̌V̌^T (8)
In act 306, the system updates the right singular vectors. The system may update the right singular vectors by determining a product between the V matrix from the previous SVD and the new V̌ matrix generated in act 304 above, as illustrated in equation (9) below.
V_rot = VV̌ (9)
In act 308, the system provides the updated SVD. The updated SVD may include, for example, an updated Σ matrix Σb and an updated V matrix Vb. The matrices Σb and Vb may be provided based on previously computed values as illustrated below in equations (10) and (11).
Σ_b = Σ̌ (10)
V_b = V_rot (11)
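As one possible illustration, acts 302-308 may be expressed in numpy as follows; carrying Σ as an l by m matrix alongside a full m by m basis V is a bookkeeping assumption of this sketch:

import numpy as np

def svd_update(Sigma, V, b):
    # Tailored SVD update (acts 302-308): fold the new row b into the
    # factorization B = Sigma @ V.T without computing left singular vectors.
    # Sigma: l x m with a zeroed last row; V: m x m orthonormal basis;
    # b: new data row of length m.
    q = b @ V                          # eq. (7): project b onto V
    M = Sigma.copy()
    M[-1, :] = q                       # eq. (8): place q in the zeroed last row
    _, s, Vt = np.linalg.svd(M)        # full SVD keeps Vt at m x m
    V_rot = V @ Vt.T                   # eq. (9): rotate the previous basis
    Sigma_b = np.zeros_like(Sigma)
    Sigma_b[np.arange(s.size), np.arange(s.size)] = s  # eq. (10): Sigma_b
    return Sigma_b, V_rot              # eq. (11): V_b = V_rot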
Incorporating the SVD update as described above into the matrix sketching process yields Procedure 2, illustrated below.
Input: l, A ∈ ℝ^(n×m)
1: Σ ← empty matrix ∈ ℝ^((l−1)×m)
2: V ← I_m
3: for i ∈ [n] do
4: Set q ← A_iV {A_i is the i-th row of A}
5: Set the last row of Σ to q
6: [Ǔ, Σ̌, V̌] ← SVD(Σ)
7: V ← VV̌
8: δ ← σ_l^2
9: Σ ← √max(Σ̌^2 − I_l δ, 0)
10: end for
Return: B = ΣV^T
As illustrated above, steps 3 and 4 in Procedure 1 of setting up and calculating the SVD of the modified B matrix are replaced with the incremental SVD update in steps 4-7 in Procedure 2. The high complexity steps in performing the SVD update in Procedure 2 are the SVD of the Σ matrix in step 6 (and act 304) and the matrix multiplication VV̌ in step 7 (and act 306). The SVD operation in step 6 is of O(ml^2) computational complexity because the matrix Σ in step 6 is an l by m matrix, similar to Procedure 1. The matrix multiplication operation in step 7 also has an O(ml^2) complexity, yielding an overall computational complexity of O(nml^2).
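Procedure 2 may accordingly be sketched in numpy as follows; for simplicity, this illustration keeps Σ as an l by m matrix whose last row is zeroed by each shrink, which is equivalent bookkeeping to steps 1 and 5 of the listing:

import numpy as np

def sketch_procedure2(A, l):
    # Procedure 2: Procedure 1 with each full SVD of B replaced by the
    # incremental update of process 300; B is carried as Sigma @ V.T.
    n, m = A.shape
    Sigma = np.zeros((l, m))
    V = np.eye(m)                                 # step 2: V <- I_m
    for i in range(n):
        Sigma[-1, :] = A[i, :] @ V                # steps 4-5: q = A_i V
        _, s, Vt = np.linalg.svd(Sigma)           # step 6: SVD of l x m Sigma
        V = V @ Vt.T                              # step 7: rotate right basis
        delta = s[-1] ** 2                        # step 8
        s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))  # step 9: shrink
        Sigma = np.zeros((l, m))
        Sigma[np.arange(s.size), np.arange(s.size)] = s
    return Sigma @ V.T                            # Return: B = Sigma V^T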
In at least one example, the SVD update step may be truncated to reduce computational complexity.
In act 402, the system projects the new data point b onto the right singular vectors V of the previous SVD to yield the projection g, as illustrated in equation (12) below.
g=bV (12)
In act 404, the system determines the component of the new data point b that is orthogonal to V. The orthogonal component may be represented as q and determined consistent with equation (13) below.

q = b^T − Vg^T (13)
In act 406, the system creates an orthonormal basis for the orthogonal component q as illustrated below in equations (14) and (15).

R_b = ∥q∥ (14)
Q = R_b^(−1)q (15)
In act 408, the system generates the new singular values and basis rotations. The system may determine the new singular values and basis rotations by combining the singular values, the norm R_b of the orthogonal component, and the projection g into a single matrix and determining the SVD of the resulting matrix as illustrated below in equation (16).

SVD([Σ 0; g R_b]) = ǓΣ̌V̌^T (16)
In act 410, the system updates the right singular vectors. The system may update the right singular vectors by determining a product between the right singular vectors from the new SVD (V̌) and a combination of Q and the right singular vectors from the previous SVD (V) as illustrated in equation (17) below.

V_rot = [V Q]V̌ (17)
In act 412, the system provides the updated SVD. The updated SVD may include, for example, an updated Σ matrix Σb and an updated V matrix Vb. The matrices Σb and Vb may be provided based on previously computed values as illustrated below in equations (18) and (19).
Σ_b = Σ̌ (18)
V_b = V_rot (19)
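The truncated update of acts 402-412 may be sketched in numpy as follows; the guard against a vanishing R_b (a new row already in the span of V) is an assumption of this illustration:

import numpy as np

def truncated_svd_update(s, V, b):
    # Truncated SVD update (acts 402-412) for one new row b.
    # s: current singular values, shape (l,); V: right singular vectors as
    # columns, shape (m, l); b: new data row, shape (m,).
    g = b @ V                                 # eq. (12): in-basis projection
    q = b - V @ g                             # eq. (13): component orthogonal to V
    Rb = np.linalg.norm(q)                    # eq. (14)
    Q = q / Rb if Rb > 1e-12 else np.zeros_like(q)  # eq. (15), guarding Rb ~ 0
    l = s.size
    K = np.zeros((l + 1, l + 1))              # eq. (16): broken arrowhead matrix
    K[np.arange(l), np.arange(l)] = s         # diagonal of old singular values
    K[l, :l] = g                              # last row: projection g ...
    K[l, l] = Rb                              # ... and the orthogonal energy R_b
    _, s_new, Vt = np.linalg.svd(K)           # small (l+1) x (l+1) SVD
    V_rot = np.hstack([V, Q[:, None]]) @ Vt.T # eq. (17): rotate [V Q]
    return s_new, V_rot                       # eqs. (18)-(19)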
Incorporating the truncated SVD update as described above into the matrix sketching process yields Procedure 3, illustrated below.
Input: l, A ∈ ℝ^(n×m)
1: Σ ← empty matrix ∈ ℝ^(l×l)
2: V ← first l columns of I_m
3: for i ∈ [n] do
4: Set g ← A_iV {A_i is the i-th row of A}
5: Set q ← A_i^T − Vg^T
6: Set R_b ← ∥q∥
7: Set Q ← R_b^(−1)q
8: [Ǔ, Σ̌, V̌] ← SVD([Σ 0; g R_b])
9: V ← [V Q]V̌
10: δ ← σ_l^2
11: Σ ← √max(Σ̌^2 − I_l δ, 0)
12: end for
Return: B = ΣV^T
As illustrated above, steps 3 and 4 in Procedure 1 of setting up and calculating the SVD of the modified B matrix are replaced with the incremental SVD update in steps 4-9 in Procedure 3. The incremental SVD update offers a performance improvement relative to Procedure 1 by performing an SVD operation on a smaller l by l matrix as opposed to the larger l by m matrix in step 4 of Procedure 1. In addition, the l by l matrix that is decomposed in step 8 may exhibit a broken arrowhead structure (e.g., matrices with non-zero values only along the diagonal and in the last row or rows) that enables the application of specialized SVD procedures that require less runtime. For example, the runtime complexity of determining the SVD of an l by l matrix without any unique structure may be O(l^3) while the runtime complexity of determining the SVD of an l by l matrix exhibiting a broken arrowhead structure may be O(l^2). Thereby, the rate-limiting step in Procedure 3 is the matrix multiplication in step 9, which may be easily parallelized, unlike the SVD operation in step 4 of Procedure 1. The performance improvement of Procedure 3 relative to Procedure 1 is illustrated by the comparisons discussed below.
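Combining the truncated update with the shrinking step gives the following illustrative rendering of Procedure 3, reusing the truncated_svd_update sketch above; the truncation back to l directions after the shrink is made explicit here:

import numpy as np

def sketch_procedure3(A, l):
    # Procedure 3: the sketch computed via the truncated SVD update, so each
    # iteration decomposes only a small (l+1) x (l+1) matrix.
    n, m = A.shape
    s = np.zeros(l)                   # step 1: Sigma carried as l values
    V = np.eye(m)[:, :l]              # step 2: first l columns of I_m
    for i in range(n):
        s_new, V_rot = truncated_svd_update(s, V, A[i, :])  # steps 4-9
        delta = s_new[l - 1] ** 2     # step 10: energy of the l-th direction
        s_new = np.sqrt(np.maximum(s_new ** 2 - delta, 0.0))  # step 11
        s, V = s_new[:l], V_rot[:, :l]  # drop the zeroed (l+1)-th direction
    return np.diag(s) @ V.T           # Return: B = Sigma V^T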
As discussed above, various matrix sketching techniques according to certain examples reduce the computational complexity of determining a low-rank approximation of a given matrix A.
As illustrated by the error comparisons, the additive error of each approach may be measured consistent with equation (20) below.

∥A^T A − B^T B∥ (20)
A second error measure, illustrated in equation (21) below, compares the energy captured by the sketch B with the energy of the input matrix A.

∥A∥_F^2 − ∥B∥_F^2 (21)
Similar to the additive error illustrated above, the SVD approach yielded the lowest error, followed by matrix sketching Procedures 1 and 3. Note that, again, substituting the SVD update step into Procedure 1 to form Procedure 3 yields the same relative error (even in the batch case).
The runtime comparisons confirm the performance improvement of Procedure 3 relative to Procedure 1.
Latent Semantic Indexing (LSI) is a modified approach to standard vector-space information retrieval. In both approaches, a set of m documents is represented by m individual n by 1 vectors in an n by m term-document matrix A. The elements of each vector represent the frequency of a specific word in that document. For example, element Ai,j is the frequency of word i in document j. The frequencies in the term-document matrix are often weighted locally, within a document, and/or globally, across all documents, to alter the importance of terms within or across documents. Using vector-space retrieval, a query is represented in the same fashion as a document, as a weighted n by 1 vector. A look-up of a query q is executed by mapping the query onto the row-space of the term-document matrix A to obtain relevancy scores w between the query and each document. The relationship of the vector of relevancy scores w to q and A is illustrated in equation (22) below.
w = q^T A (22)
The index of the highest score in w is the index of the document in A that most closely matches the query, and a full index-tracking sort of w returns the documents in order of relevancy to the query as determined by directly matching terms of the query and the documents.
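For illustration, the scoring and ranking described above reduce to a single matrix-vector product in numpy; the matrix and query below are toy stand-ins:

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5000, 200))       # toy n x m term-document matrix
q = np.zeros(5000)
q[[10, 42]] = 1.0                 # query: a weighted n x 1 term vector
w = q @ A                         # eq. (22): w = q^T A, one score per document
best = int(np.argmax(w))          # index of the document that best matches q
ranking = np.argsort(w)[::-1]     # index-tracking sort, most relevant first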
Vector-space retrieval has numerous drawbacks. It can often return inaccurate results due to synonymy and polysemy. Synonymy is the issue of concepts being described in different terms, resulting in queries not matching appropriate documents discussing the same concepts due to word choice. Polysemy is the problem of single words having multiple meanings. Such words can lead to documents being returned with high relevancy scores when in fact they share little to no conceptual content with the query. Vector-space retrieval also requires the persistent storage of the matrix A. As information retrieval is often performed on extremely large data sets, storing A is often undesirable.
LSI uses a rank-k approximation of A to try to overcome the issues of synonymy, polysemy, and storage. The concept behind LSI is that the matrix Ak will reveal some latent semantic structure that more accurately matches queries to documents based on shared concepts rather than words. The query matching process in LSI is nearly identical to vector-space retrieval, with the term-document matrix A replaced by its rank-k approximation Ak as illustrated in equations (23) and (24) below.
ŵ = q^T A_k (23)
A_k = U_kΣ_kV_k^T (24)
As discussed above, computing Ak by SVD methods may be prohibitively computationally expensive. The sketching techniques described above, however, may be employed to generate an approximation of Ak and reduce the complexity. Further, the sketching methods are streaming, enabling new data to easily be incorporated into a revised sketch of A. The sketch of A may be easily employed to construct an approximation of Ak on demand.
In act 702, the system receives a term-document matrix and a query to search in the term-document matrix. The system may receive the full term-document matrix and/or relevant updates to the term-document matrix that may be combined with a stored term-document matrix and/or employed by the sketch component to generate an updated sketch.
In act 704, the system generates a sketch of the term-document matrix. The system may employ any one or a combination of the techniques described in Procedures 1, 2, and 3 to generate the sketch.
In act 706, the system generates an approximation of the term-document matrix. The approximation of the term-document matrix may be, for example, a rank k approximation of the term-document matrix based on the previously generated sketch.
The system applies the query to the approximation of the term-document matrix to generate a relevancy score in act 708, and provides the relevancy score in act 710.
As described above, various sketching techniques may be incorporated into an LSI engine to reduce the computational complexity of generating an approximation of the term-document matrix. Procedure 4 below illustrates an example Procedure to convert a matrix A into a sketch and construct an approximation of Ak from the sketch.
Input: l, k, A ∈ ℝ^(n×m)
1: B ← sketch(A) where B ∈ ℝ^(l×m)
2: S ← BB^T
3: [U, Σ, V] ← SVD(S)
4: for i = 1, . . . , k do
5: σ̂_i ← √σ_i
6: v̂_i ← B^T u_i / σ̂_i
7: end for
8: Û_k ← AV̂_kΣ̂_k^+ {Σ̂_k^+ is the pseudoinverse of Σ̂_k}
Return: Û_k, Σ̂_k, V̂_k
Procedure 4 receives the term-document matrix A, a desired sketch dimension l, and a desired rank k of the approximation of Ak. In step 1, a sketch B is generated of the matrix A consistent with various techniques described above in Procedures 1, 2, or 3. Steps 2-7 generate the requisite right singular vectors v̂_i and singular values σ̂_i to form the V̂_k and Σ̂_k matrices for the approximation of Ak (Âk). In step 8, the matrix Û_k is determined based on A, the matrix V̂_k, and a pseudoinverse of the Σ̂_k matrix. The pseudoinverse of the Σ̂_k matrix may be formed by replacing every non-zero diagonal entry with its reciprocal and transposing the resulting matrix.
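For illustration, Procedure 4 may be sketched in numpy as follows, assuming k does not exceed the rank of B so that the pseudoinverse reduces to reciprocals of the retained singular values; an illustrative end-to-end query follows the function:

import numpy as np

def approx_rank_k(A, B, k):
    # Procedure 4: recover a rank-k approximation of A from a sketch B of A.
    # A: n x m term-document matrix; B: l x m sketch; assumes k <= rank(B).
    S = B @ B.T                                  # step 2: small l x l matrix
    U, sig, _ = np.linalg.svd(S)                 # step 3
    s_hat = np.sqrt(sig[:k])                     # step 5: singular values of B
    V_hat = (B.T @ U[:, :k]) / s_hat             # step 6: right singular vectors
    U_hat = A @ V_hat @ np.diag(1.0 / s_hat)     # step 8: U_k = A V_k Sigma_k^+
    return U_hat, np.diag(s_hat), V_hat

# End-to-end LSI query per equation (23), reusing a sketch routine above:
# B = sketch_procedure3(A, l)
# U_hat, S_hat, V_hat = approx_rank_k(A, B, k)
# w_hat = (q @ U_hat) @ S_hat @ V_hat.T   # approximated relevancy scores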
As discussed above, various matrix sketching techniques may be applied to LSI to reduce the computational complexity of determining the low-rank approximation of Ak to which a query is applied.
It is appreciated that the performance improvements achieved in LSI may be similarly achieved in other applications involving the receipt and processing of large data sets including, for example, image processing. These techniques improve the performance of any device or computer system constructed to perform these processing tasks by reducing the computational complexity to generate a compact representation of a given matrix A with minimal, if any, introduction of error relative to other methods and/or reducing the required memory footprint.
The low-rank approximation techniques described herein may be implemented on various special purpose computer systems designed to receive and/or process large data sets. These techniques improve the operation of the special purpose computer system by reducing the computational complexity of generating approximations of large data sets.
Various aspects and functions in accord with the present invention may be implemented as specialized hardware or software executing in one or more computer systems, including the computer system 902 described below.
The memory 912 may be used for storing programs and data during operation of the computer system 902. For example, the memory 912 may store one or more generated sketches of the received data. Thus, the memory 912 may be a relatively high performance, volatile, random access memory such as a dynamic random access memory (DRAM) or static random access memory (SRAM). However, the memory 912 may include any device for storing data, such as a disk drive or other non-volatile storage device, such as flash memory or phase-change memory (PCM).
Components of the computer system 902 may be coupled by an interconnection element such as the bus 914. The bus 914 may include one or more physical busses (for example, busses between components that are integrated within a same machine), and may include any communication coupling between system placements including specialized or standard computing bus technologies. Thus, the bus 914 enables communications (for example, data and instructions) to be exchanged between system components of the computer system 902.
Computer system 902 also includes one or more interfaces 916 such as input devices, output devices and combination input/output devices. The interface devices 916 may receive input, provide output, or both. For example, output devices may render information for external presentation. Input devices may accept information from external sources. The interface devices 916 allow the computer system 902 to exchange information and communicate with external entities, such as users and other systems.
Storage system 918 may include a computer-readable and computer-writeable nonvolatile storage medium in which instructions are stored that define a program to be executed by the processor. The instructions may be persistently stored as encoded signals, and the instructions may cause a processor to perform any of the functions described herein. A medium that can be used with various examples may include, for example, optical disk, magnetic disk or flash memory, among others. In operation, the processor 910 or some other controller may cause data to be read from the nonvolatile recording medium into another memory, such as the memory 912, that allows for faster access to the information by the processor 910 than does the storage medium included in the storage system 918. The memory may be located in the storage system 918 or in the memory 912. The processor 910 may manipulate the data within the memory 912, and then copy the data to the medium associated with the storage system 918 after processing is completed.
Various aspects and functions in accord with the present invention may be practiced on one or more computers having different architectures or components than those described above.
Having described above several aspects of at least one example, it is to be appreciated various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure and are intended to be within the scope of the invention. Accordingly, the foregoing description and drawings are by way of example only, and the scope of the invention should be determined from proper construction of the appended claims, and their equivalents.
This application claims priority under 35 U.S.C. §119(e) to U.S. Provisional Application Ser. No. 62/040,051 titled “FASTER STREAMING ALGORITHMS FOR DETERMINISTIC LOW-RANK MATRIX APPROXIMATIONS,” filed on Aug. 21, 2014, which is incorporated herein by reference in its entirety.