Financial management, brokering, and trading are highly competitive businesses in which information is highly important. Participants often wish to hide their current positions (which include holdings of financial instruments such as stocks, bonds, etc.) from their competitors. However, competitors may desire to know which financial instruments, and in what proportion, are in a given portfolio.
For a detailed description of illustrative embodiments of the invention, reference will now be made to the accompanying drawings in which:
Certain terms are used throughout the following description and claims to refer to particular system components. As one skilled in the art will appreciate, computer companies may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following discussion and in the claims, the terms “including” and “comprising” are used in an open-ended fashion, and thus should be interpreted to mean “including, but not limited to . . . .” Also, the term “couple” or “couples” is intended to mean either a logical, physical, indirect, direct, optical or wireless electrical connection. Thus, if, for example, a first device couples to a second device, that connection may be through a direct electrical connection, through an indirect electrical connection via other devices and connections, through an optical electrical connection, or through a wireless electrical connection.
The following discussion is directed to various embodiments of the invention. Although one or more of these embodiments may be preferred, the embodiments disclosed should not be interpreted, or otherwise used, as limiting the scope of the disclosure, including the claims. In addition, one skilled in the art will understand that the following description has broad application, and the discussion of any embodiment is meant only to be exemplary of that embodiment, and not intended to intimate that the scope of the disclosure, including the claims, is limited to that embodiment.
The CRSM 14 comprises volatile memory (e.g., random access memory), non-volatile storage (e.g., hard disk drive, Flash storage, compact disc read only memory, etc.), or combinations thereof. The CRSM 14 contains software 20 that is executed by the processor 12. The software 20 provides the processor 12, and thus system 10, with some or all of the functionality described herein. In some embodiments, a single processor performs all of the functionality described herein. In other embodiments, the software executes on a multi-processor system in which one processor may perform one or more of the steps described herein and one or more other processors execute one or more other steps.
A portfolio contains one or more securities. A security is a fungible, negotiable instrument representing financial value. Each security can be a stock, a bond or a financial instrument of any kind. In accordance with various embodiments, the constituent securities of the portfolio may be all of the same type (e.g., all stocks) or may be a mix of different security types (some stocks, some bonds, etc.). The term “portfolio” broadly refers to any kind of aggregation of securities. An example of a portfolio is a fund such as a mutual fund. However, the term portfolio can refer to aggregations of securities besides a fund.
The technique disclosed herein determines the securities that make up a target portfolio. The technique uses historical data. CRSM 14 in
The historical data also comprises the daily returns, over the same K days, of the universe of possible securities that might make up the portfolio. For example, if it is at least known that the constituent securities in the portfolio are stocks within the S&P 500, then the historical data comprises the daily returns over the K days of all 500 stocks that make up the S&P 500 index. It is not known which or how many of the S&P 500 stocks are in the portfolio in question; the software 20 computes a solution to a particular linear program to make that determination as described below. Each such daily return reflects the percentage increase or decrease in the value of an individual security.
Using the historical data of the daily returns of the portfolio in question as well as the daily returns of the universe of possible securities that could make up the portfolio in question, the software 20 computes the linear programming solution of:
min ∥w∥_L1 subject to r = Aw    (1)
The quantity r is a vector containing the historical daily returns of the value of the portfolio. A is a matrix containing the historical daily returns of each security in a set of all possible securities in said portfolio. “L1” denotes the L1 norm, that is, the sum of the absolute values of the elements of w, and is sometimes written as ℓ1. The process computes the vector w, which comprises weights, one weight value for each possible security in the portfolio. The solution of min ∥w∥_L1 includes the securities determined to be in the portfolio. There may be, and likely are, multiple w vectors that solve the equation r = Aw; of these, the linear program selects one having the smallest L1 norm.
Each element of w corresponds to one of the securities provided in matrix A. For example, in the case of the S&P 500 index, the matrix A contains historical data for all 500 stocks in the index. The w vector that solves the linear program above contains 500 values, one value for each stock. Each element in the vector w is a weight value that indicates what fraction of the portfolio each security represents. In at least some embodiments, vector w is normalized to 1, which means each weight value in vector w ranges from 0 to 1 inclusive. The chosen/computed vector w that solves equation (1) above contains some zero values and some non-zero values. A zero value means that the corresponding security is not in the portfolio in question. A non-zero value means that the corresponding security is in the portfolio in question and indicates how much (e.g., expressed as a fraction in at least some embodiments) of the portfolio rests with that particular security.
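The L1 minimization subject to r = Aw can be posed as a standard linear program by splitting w into non-negative parts, w = u − v. The following is a minimal sketch under stated assumptions, not the implementation of software 20: it assumes NumPy and SciPy are available, the function name `reconstruct_weights` is hypothetical, and the data are synthetic random numbers standing in for real daily returns.

```python
import numpy as np
from scipy.optimize import linprog


def reconstruct_weights(A, r):
    """Solve min ||w||_1 subject to A @ w = r as a linear program.

    w is split as w = u - v with u, v >= 0, so that the objective
    sum(u) + sum(v) equals ||w||_1 at the optimum and the problem is
    linear in the stacked variable (u, v).
    """
    K, N = A.shape
    cost = np.ones(2 * N)              # minimize sum(u) + sum(v)
    A_eq = np.hstack([A, -A])          # A @ u - A @ v = r
    # linprog's default bounds are (0, None), i.e. u >= 0 and v >= 0.
    result = linprog(cost, A_eq=A_eq, b_eq=r)
    if not result.success:
        raise RuntimeError(result.message)
    u, v = result.x[:N], result.x[N:]
    return u - v


# Synthetic data (illustrative only): K = 12 days, N = 30 candidate securities.
rng = np.random.default_rng(0)
K, N = 12, 30
A = rng.standard_normal((K, N))        # stand-in for daily returns, arbitrary units
w_true = np.zeros(N)
w_true[[3, 17]] = [0.6, 0.4]           # a hypothetical two-holding portfolio
r = A @ w_true                         # portfolio daily returns implied by w_true

w = reconstruct_weights(A, r)
```

The returned w always satisfies r = Aw to solver tolerance and has an L1 norm no larger than that of `w_true`. Whether its non-zero entries coincide with the true holdings depends on how sparse the portfolio is and on how many days of data are supplied.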
The linear program subject to the L1 minimization works when relatively few of the universe of securities are actually in the portfolio thereby resulting in relatively few non-zero values in the vector w (which specifies how much of each possible security is in the portfolio in question). A w vector with relatively few non-zero values is said to be “sparse.” Accordingly, the L1 minimization technique described herein works because the resulting w vector is sparse.
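Reading holdings off a computed w vector can be sketched as follows. This is an illustration only: the function name `holdings_from_weights`, the cutoff `tol`, and the ticker-like names are all hypothetical. Treating tiny values as zero is an assumption made because numerical solvers rarely return exact zeros, and rescaling the retained weights to sum to 1 is one assumed reading of "normalized to 1".

```python
def holdings_from_weights(w, names, tol=1e-6):
    """Return {security name: fraction} for entries of w above tol in magnitude.

    tol is an assumed numerical cutoff for treating a weight as zero.
    Retained weights are rescaled so that they sum to 1.
    """
    kept = {name: x for name, x in zip(names, w) if abs(x) > tol}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("retained weights sum to zero; cannot normalize")
    return {name: x / total for name, x in kept.items()}


# Hypothetical solver output for a five-security universe.
names = ["AAA", "BBB", "CCC", "DDD", "EEE"]
w = [0.0, 1.5, 1e-9, 0.0, 0.5]
holdings = holdings_from_weights(w, names)
# Fraction of the universe actually held; small values indicate a sparse w.
sparsity = len(holdings) / len(names)
```

Here two of the five candidate securities survive the cutoff, so the vector would be considered sparse relative to this (very small) universe.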
Further, the linear program provided above works with relatively few days of historical data compared to the number of securities represented in matrix A of historical data. If there are N securities represented in matrix A, then the above linear program works when K<<N.
Thus, the two conditions of interest are:
(a) w is sparse (i.e., contains relatively few non-zero entries relative to N); and
(b) K<<N.
Condition (a) expresses the fact that the portfolio is actively managed with relatively few holdings. If the portfolio has many holdings relative to the universe of possible securities that could be included in the portfolio, then an exact reconstruction may not be obtained although an adequate approximation is possible.
Condition (b) expresses the fact that it is desirable to determine the constituent elements of a portfolio with as little historical data as possible. Many portfolios are actively managed and thus frequently buy and sell securities. That being the case, requiring, say, a year's worth of historical data is of questionable value in analyzing a portfolio that will have sold off at least some of the securities it held a year ago. Being able to determine a portfolio's holdings using only the most recent data (e.g., 100 days' worth of data) is beneficial as such data is likely to be more relevant than data that is much older.
In some cases, there may be some measurement noise. Measurement noise may comprise a misestimation of the universe of potential securities (i.e., the columns of A). It may also comprise mild inaccuracies in the returns reported in the vector r or the matrix A. In such cases, a more error-tolerant optimization problem can be set up that closely approximates the basic setup when the errors are not too significant. At least two possibilities are available in such embodiments. In one embodiment, the following problem is solved: min ∥w∥_L1 subject to ∥r − Aw∥_L2 ≤ ε. A tolerance ε of error is chosen that one is willing to accept (measured according to the L2 norm), and the L1 norm of w is minimized consistent with that tolerance. In accordance with another embodiment, the linear program
min ∥r − Aw∥_L1
is solved. The w vector is determined that minimizes the L1 norm of the difference between the two vectors in the argument, which is the estimated error, and which is known. Both of these approaches are robust and error tolerant in the sense that they do not require that r = Aw.
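The second error-tolerant formulation, minimizing the L1 norm of r − Aw, can also be written as a standard linear program by introducing one auxiliary variable per day that bounds the absolute residual. The sketch below assumes NumPy and SciPy; the function name `min_l1_residual` is hypothetical and the data are synthetic. The toy dimensions (more days than securities) are chosen only so that the residual cannot be driven to zero, and do not reflect the K<<N setting discussed above.

```python
import numpy as np
from scipy.optimize import linprog


def min_l1_residual(A, r):
    """Solve min ||r - A @ w||_1 over w as a linear program.

    Auxiliary variables t (one per day) bound the residual from both sides,
    -t <= r - A @ w <= t, and the objective is sum(t).
    """
    K, N = A.shape
    cost = np.concatenate([np.zeros(N), np.ones(K)])
    I = np.eye(K)
    A_ub = np.vstack([np.hstack([A, -I]),      #  A @ w - t <= r
                      np.hstack([-A, -I])])    # -A @ w - t <= -r
    b_ub = np.concatenate([r, -r])
    bounds = [(None, None)] * N + [(0, None)] * K   # w free, t >= 0
    result = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    if not result.success:
        raise RuntimeError(result.message)
    return result.x[:N]


# Synthetic toy data (illustrative only).
rng = np.random.default_rng(1)
K, N = 8, 3
A = rng.standard_normal((K, N))
w_true = np.array([0.7, 0.0, 0.3])
r = A @ w_true
r[2] += 0.5                                    # one misreported daily return
w = min_l1_residual(A, r)
```

The first formulation, with the L2-norm constraint, is not a linear program in the strict sense and would typically need a solver for second-order cone problems, so it is not sketched here.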
At 54, the method comprises obtaining vector r. Obtaining the vector r may comprise retrieving the vector from memory such as CRSM 14 or from a site across a network. Further, obtaining the vector r may comprise computing the daily returns for the portfolio based on the daily closing price (e.g., net asset value, NAV) for the portfolio over the specified time period K.
At 56, the method comprises obtaining matrix A. Obtaining the matrix A may comprise retrieving the matrix from memory such as CRSM 14 or from a site across a network. Further, obtaining the matrix A may comprise computing the daily returns for the universe of securities possibly forming the portfolio in question based on the daily closing prices (e.g., net asset value, NAV) for such securities over the specified time period K.
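Computing daily returns from daily closing prices, for both the vector r and the matrix A, can be sketched as below. This is a minimal illustration that assumes simple (not logarithmic) returns expressed as fractions; the function name `daily_returns`, the ticker-like names, and all prices are hypothetical.

```python
def daily_returns(closing_prices):
    """Convert a sequence of daily closing prices into fractional daily returns.

    Element i of the result is the return from day i to day i + 1, so
    K + 1 prices yield K returns.
    """
    if len(closing_prices) < 2:
        raise ValueError("need at least two closing prices")
    return [
        (today - yesterday) / yesterday
        for yesterday, today in zip(closing_prices, closing_prices[1:])
    ]


# Hypothetical NAV series for the portfolio (illustrative numbers only).
nav = [100.0, 102.0, 99.96]
r = daily_returns(nav)  # one entry per day-over-day change

# Hypothetical closing prices for a two-security universe, one list per security.
universe_prices = {"AAA": [50.0, 55.0, 55.0], "BBB": [20.0, 19.0, 20.9]}
columns = [daily_returns(p) for p in universe_prices.values()]
# Row k of A holds day k's return for every security in the universe.
A = [list(row) for row in zip(*columns)]
```

With three closing prices per series, K is 2: r has two entries and A has two rows, one column per candidate security.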
At 58, the method comprises computing the linear programming solution of min ∥w∥_L1 subject to r = Aw.
The value K (number of days of historical data used in the calculations) can be determined in accordance with any of a variety of techniques. For instance, a user could specify K by trial and error based on what is generally known about the portfolio to be reconstructed, and empirically determine through simulation what the minimum sufficient K is.
The value K can be determined another way. Each row of the A matrix is a day's worth of new data. At the end of every day, a user appends a new row of historical data to matrix A (and the corresponding new daily return to vector r), and the L1 minimization technique described above is performed to compute a new w vector. So as each day passes, the matrix A grows by a row, K increments by 1, and a new w vector is determined. At some point, the matrix A will have enough rows (K) of data that the w vector will stop changing appreciably from day to day. Accordingly, a user can specify a threshold on ∥w(t+1)−w(t)∥_L2, and K is deemed large enough once this quantity falls below the threshold; the threshold thus serves as a “stopping criterion.” The “L2” notation represents the Euclidean distance between the two vectors w(t+1) and w(t).
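The stopping criterion described above can be sketched as a check over the sequence of w vectors produced by successive daily re-solves. The function names, the threshold value, and the weight vectors below are hypothetical and serve only to illustrate the comparison of ∥w(t+1)−w(t)∥_L2 against a user-specified threshold.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def days_until_stable(w_history, threshold):
    """Return the first index t for which the distance between w(t+1) and
    w(t) is below threshold, or None if the sequence never settles."""
    for t in range(len(w_history) - 1):
        if l2_distance(w_history[t + 1], w_history[t]) < threshold:
            return t
    return None


# Hypothetical weight vectors from successive daily re-solves.
w_history = [
    [0.9, 0.1, 0.0],
    [0.7, 0.2, 0.1],
    [0.6, 0.4, 0.0],
    [0.6, 0.4, 0.0],
]
t = days_until_stable(w_history, threshold=1e-3)
```

If the first vector in `w_history` was computed with some starting number of rows, the value of K at which the weights settled is that starting number plus the returned index.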
The immediately preceding process for determining the value K is further illustrated in
The above discussion is meant to be illustrative of the principles and various embodiments of the present invention. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.