There are many environments in which a large amount of data may be gathered. In many cases, the data that is gathered is considered noisy, and a smoothed version of the data is desirable. For example, such smoothing may be desired when analyzing financial data such as stock prices, returns, or trading volumes, or economic data such as gross domestic product or employment statistics. As another example, a computer system that utilizes touch or gestures to receive user input may gather many data points as the user interacts with the system. Various sources of noise may affect the data that is gathered. For example, such a computer system may include a digitizer to convert analog touch or gesture data to digital data. Depending on the type of digitizer used, the digitizer may introduce varying degrees of noise.
Exponential Moving Average (EMA) is a recursive function that takes a weighted average of all sampled data points using a constant smoothing factor, α, having a value between zero and one. The most recently sampled data point is multiplied by the constant, α, and each previously sampled data point is multiplied by α times a successive power of (1−α). Because (1−α) is less than one whenever α is greater than zero, these weights quickly drop to negligible fractions. In this way, the contribution of less recent data points decays to a negligible level quickly.
The value selected for α can significantly influence the results of data smoothing using EMA. When α is near one, the smoothed output is nearly the same as the raw input. When α is near zero, the smoothed output has high latency and responds weakly to changes in input trends.
Democratic alpha smoothing is described. As each sampled data value is received from an input data stream, a state of the input data stream is estimated. A total vote value is updated based on the estimated state of the input data stream, and a smoothing factor is calculated based on the updated total vote value. An iteration of an exponential moving average is calculated using the smoothing factor to produce a data value of a smoothed output data stream.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The same numbers are used throughout the drawings to reference like features and components.
The following discussion is directed to democratic alpha smoothing. Exponential Moving Average (EMA) is a recursive algorithm that takes a weighted average of all sampled data points. A weight, α, is multiplied by the most recently sampled data point, and α times successive powers of (1−α) is multiplied by less recently sampled data points. When α is near one, the smoothed output is nearly the same as the raw input. When α is near zero, the smoothed output has high latency and responds weakly to changes in input trends.
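The EMA algorithm is given as:
$$S_n = \alpha \cdot z_n + (1-\alpha) \cdot S_{n-1}$$
where S_n is the smoothed data point for iteration n, z_n is the most recently sampled data point, S_{n-1} is the previously smoothed data point, and α is the smoothing factor.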
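The recursion can be illustrated with a minimal Python sketch that uses a single fixed α. The function name `ema` and the step-shaped sample data are illustrative assumptions; seeding the recursion with the first sample follows the S(1)=z(1) initialization described for the example process.

```python
from typing import Iterable, List


def ema(samples: Iterable[float], alpha: float) -> List[float]:
    """Smooth samples with a fixed-alpha exponential moving average."""
    smoothed: List[float] = []
    for z in samples:
        if not smoothed:
            # Seed the recursion with the first sample: S(1) = z(1).
            smoothed.append(z)
        else:
            # S(n) = alpha * z(n) + (1 - alpha) * S(n-1)
            smoothed.append(alpha * z + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical input: a step from 0 to 10.
step = [0.0, 0.0, 10.0, 10.0, 10.0]
fast = ema(step, alpha=0.9)  # alpha near one: output follows the step closely
slow = ema(step, alpha=0.1)  # alpha near zero: output lags well behind the step
```

With these assumed inputs, the α=0.9 output reaches most of the step on the first sample after the jump, while the α=0.1 output moves only a small fraction of the way, which is the trade-off between fidelity and smoothness that a single constant α forces.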
Rather than relying on a single value of α, democratic alpha smoothing dynamically changes α at run-time based on whether high fidelity or high smoothness would produce more accuracy. In the described example implementation, upon each iteration of the EMA algorithm, a democratic alpha smoothing system estimates a state of the input data stream, and determines an α value to be used in calculating the exponential moving average. The democratic alpha smoothing system classifies the state of the input stream as either “trending,” meaning the data is generally rising or falling, or “flickering,” meaning the data is generally dancing about approximately the same value. If the system is in a trending state, a higher value of α will decrease latency. If the system is in a flickering state, a lower value of α will more effectively decrease the noise.
In the illustrated example, computing device 102 includes one or more processors 106, a digitizer 108, and memory 110. Example digitizer 108 converts analog data associated with the user's touch or gesture interaction with the interface 104 to digital data. Any of the hardware components of computing device 102 may introduce noise as data is gathered. For example, depending on the specific digitizer implementation, the digitizer may introduce varying levels of noise when converting the analog touch or gesture data to digital data.
An operating system 112 and one or more application programs 114 may be stored in memory 110 and executed on processor 106. Democratic alpha smoothing module 116 is also stored in memory 110, either as a stand-alone module or as a component of either operating system 112 or of an application 114, and is executed by the processor(s) 106. Democratic alpha smoothing module 116 applies a smoothing algorithm to the input data generated by the digitizer 108, lessening the effect of any noise that the digitizer may have introduced.
Although illustrated and described in the context of a single computing device, the functionality of the democratic alpha smoothing module described herein may alternatively be implemented in a client-server architecture, with the democratic alpha smoothing module residing at the client, residing at the server, or distributed across both the client and the server.
Alternatively, or in addition, the functionality of the democratic alpha smoothing module described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc.
Although illustrated in
Furthermore, although illustrated and described in the context of input data generated by a digitizer in response to touch or gesture input, democratic alpha smoothing as described herein may be applied to any type of data. Examples of other types of data include, but are not limited to, input data generated through natural user interface technologies such as speech recognition, touch and stylus recognition, gesture recognition, air gestures, head and eye tracking, voice and speech, vision, touch, hover, and machine intelligence. Democratic alpha smoothing may be applied in any context that may benefit from calculating an exponential moving average. Other examples of input data to which democratic alpha smoothing may be applied include, but are not limited to, financial data such as stock prices, returns, or trading volumes, and economic data such as gross domestic product or employment statistics.
In an example implementation, MAX_ALPHA defines a maximum value for α. As discussed above, according to the standard EMA algorithm, α lies in the range [0,1]. Applying the EMA algorithm with α equal to zero results in output data in which each data point is equal to the previous data point. Applying the EMA algorithm with α equal to one results in output data in which no smoothing occurs and each output data point is equal to the input data point. Accordingly, for democratic alpha smoothing, in which the value of α dynamically changes with each iteration of the EMA algorithm, a smaller range of α may be defined to prevent α from being equal to either zero or one. The upper end of this smaller range is defined using MAX_ALPHA. The lower end of this smaller range for values of α can be calculated based on MIN_VOTES, as is explained in further detail below.
MIN_VOTES and MAX_VOTES define a range of vote values that will affect the value of α. When a current vote total is equal to MIN_VOTES, α will have its smallest value. Similarly, when a current vote total is equal to MAX_VOTES, α is equal to MAX_ALPHA.
POWER is a customizable constant that determines a degree of linearity with which the value of α moves from its minimum value to MAX_ALPHA as the total vote value moves from MIN_VOTES to MAX_VOTES. If POWER equals one, the value of α moves from its minimum value to MAX_ALPHA linearly as the total vote value moves from MIN_VOTES to MAX_VOTES. If POWER is less than one, α increases quickly as the vote total increases from MIN_VOTES, and α increases more slowly as the vote total approaches MAX_VOTES. Conversely, if POWER is greater than one, α increases slowly as the vote total increases from MIN_VOTES, and α increases more quickly as the vote total approaches MAX_VOTES.
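As a worked illustration, the smallest value of α can be obtained by substituting MIN_VOTES into the calculation used by the dynamic alpha generator, in which the ratio of the vote total to MAX_VOTES is raised to POWER and multiplied by MAX_ALPHA. The numeric values here (MAX_ALPHA = 0.9, MIN_VOTES = 1, MAX_VOTES = 10) are illustrative assumptions, not values given in this description.

```latex
\begin{aligned}
\alpha_{\min} &= \mathrm{MAX\_ALPHA} \cdot
  \left(\frac{\mathrm{MIN\_VOTES}}{\mathrm{MAX\_VOTES}}\right)^{\mathrm{POWER}} \\
\mathrm{POWER} = 0.5: \quad \alpha_{\min} &= 0.9 \cdot 0.1^{0.5} \approx 0.285 \\
\mathrm{POWER} = 1: \quad \alpha_{\min} &= 0.9 \cdot 0.1 = 0.09 \\
\mathrm{POWER} = 2: \quad \alpha_{\min} &= 0.9 \cdot 0.1^{2} = 0.009
\end{aligned}
```

Under these assumed values, a POWER below one raises the floor on α, while a POWER above one lowers it.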
PENALTY defines a degree to which a flickering state affects the total vote value. For example, when it is determined that the input data stream is in a trending state, the total vote value may be increased by one, but when it is determined that the input data stream is in a flickering state, the total vote value may be decreased by two. In an example implementation, PENALTY may be a dynamic variable that is based, at least in part, on the current estimated state of the input data stream. For example, if it is estimated that the input data stream is consistently in a flickering state, PENALTY may be equal to one. However, if it is estimated that the input data stream has just switched from a trending state to a flickering state, PENALTY may be equal to a value greater than one.
BONUS defines a degree to which a trending state affects the total vote value. As with the PENALTY parameter, the BONUS parameter for increasing the total vote value may be static or dynamic. For example, if it is estimated that the input data stream is consistently trending, BONUS may be equal to one. However, if it is estimated that the input data stream has just moved from a flickering state to a trending state, BONUS may be equal to a value larger than one.
SIGN(n) indicates whether the difference between z(n) and S(n-1) is positive or negative.
VOTE(n) indicates the total number of votes, restricted to a range defined by MIN_VOTES and MAX_VOTES.
ALPHA(n), or α(n), indicates the value of α to be used in calculating the exponential moving average of z(n). ALPHA(n) is restricted to a range that is based on MIN_VOTES and MAX_VOTES. The largest value of ALPHA(n) is defined by MAX_ALPHA.
Dynamic alpha generator 306 utilizes the received VOTE(n), POWER, MAX_VOTES, and MAX_ALPHA to determine a new α (ALPHA(n)). In the illustrated example, parameter store 302 maintains and provides access to POWER, MAX_VOTES, and MAX_ALPHA. Dynamic alpha generator 306 outputs ALPHA(n) to the parameter store, and outputs ALPHA(n) along with the sampled data point z(n) to the exponential moving average module 308.
Exponential moving average module 308 utilizes the received sampled data point (z(n)), the previously smoothed data point (S(n-1)), and ALPHA(n) to calculate a new smoothed data point (S(n)). In the illustrated example, parameter store 302 maintains and provides access to S(n-1). Exponential moving average module 308 outputs S(n) to the parameter store and outputs S(n) as the resulting smoothed data point.
In
At block 604, democratic alpha smoothing module 116 receives an initial input value z(1). For example, in the computing environment illustrated in
At block 606, exponential moving average module 308 calculates an initial smoothed value S(1). In an example implementation, S(1)=z(1).
At block 608, the democratic alpha smoothing module 116 receives a next input value z(n). For example, as digitizer 108 generates a digital data stream, voting module 304 receives the next sampled data point of the digital data stream.
At block 610, voting module 304 estimates a state of the input data stream. In an example implementation, voting module 304 estimates the state of the input data stream based, at least in part, on a difference between the current sampled data point, z(n), and the previously calculated smoothed data point, S(n-1). By the nature of EMA smoothing, if the actual value is moving in a direction, the smoothed value will typically lag behind it. That is, if the actual value is increasing (i.e., “trending up”), the smoothed value will typically be below it; if the actual value is decreasing (i.e., “trending down”), the smoothed value will typically be above it. Therefore, if the sampled values are consistently above or consistently below the smoothed values, the input data stream is probably in a trending state. Similarly, if the actual values are not consistently above or consistently below the smoothed values, but rather, randomly alternate above and below the smoothed values, the input data stream is probably in a flickering state. Additional details regarding estimating the state of the input data stream are discussed below.
At block 612, voting module 304 calculates VOTE(n) based on the determined state of the input data stream. In an example implementation, voting module 304 calculates VOTE(n) by increasing the value of VOTE(n-1) if the input data stream is determined to be in a trending state, and decreasing the value of VOTE(n-1) if the input data stream is determined to be in a flickering state. Additional details regarding calculating VOTE(n) are discussed below.
At block 614, dynamic alpha generator 306 calculates ALPHA(n) based on VOTE(n). In an example implementation, if VOTE(n) is greater than VOTE(n-1), then ALPHA(n) will be greater than ALPHA(n-1). Similarly, if VOTE(n) is less than VOTE(n-1), ALPHA(n) will be less than ALPHA(n-1). Additional details regarding calculating ALPHA(n) are discussed below.
At block 616, exponential moving average module 308 calculates the next exponential moving average value, S(n), based on ALPHA(n), z(n), and S(n-1). As discussed above, the EMA module implements the EMA algorithm:
$$S_n = \alpha_n \cdot z_n + (1-\alpha_n) \cdot S_{n-1}$$
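The per-sample process of blocks 604 through 616 can be sketched end to end in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name, all constant values other than the 5% example threshold, the starting vote total, and the handling of the first comparison (when no earlier sign exists) and of a zero difference are choices made here for illustration.

```python
from typing import Iterable, Iterator

# Illustrative assumptions; only the 5% threshold is an example from the text.
MAX_ALPHA = 0.9
MIN_VOTES = 1
MAX_VOTES = 10
POWER = 1.0
BONUS = 1
PENALTY = 2
THRESHOLD_FRACTION = 0.05


def democratic_alpha_smooth(samples: Iterable[float]) -> Iterator[float]:
    """Yield smoothed values, recomputing alpha on every iteration."""
    it = iter(samples)
    try:
        s_prev = next(it)  # blocks 604-606: S(1) = z(1)
    except StopIteration:
        return
    yield s_prev
    vote = MIN_VOTES       # assumption: start with the strongest smoothing
    prev_sign = None       # assumption: no SIGN(n-1) exists yet
    for z in it:           # block 608: next input value z(n)
        diff = z - s_prev
        sign = diff > 0    # assumption: a zero difference counts as False
        # Block 610: trending only if the sign repeats and the gap
        # exceeds the threshold; otherwise flickering.
        trending = (
            prev_sign is not None
            and sign == prev_sign
            and abs(diff) > THRESHOLD_FRACTION * abs(s_prev)
        )
        # Block 612: update the vote total, then clamp it to its range.
        vote = vote + BONUS if trending else vote - PENALTY
        vote = max(MIN_VOTES, min(MAX_VOTES, vote))
        # Block 614: map the vote total to this iteration's alpha.
        alpha = MAX_ALPHA * (vote / MAX_VOTES) ** POWER
        # Block 616: one EMA iteration using alpha(n).
        s_prev = alpha * z + (1.0 - alpha) * s_prev
        yield s_prev
        prev_sign = sign
```

For example, `list(democratic_alpha_smooth([10.0, 10.2, 9.8, 10.2, 9.8]))` keeps the vote total at its minimum because the sign alternates on every sample, so the output stays close to 10 under these assumed constants.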
SIGN(n)=true if z(n)−S(n-1)>0
SIGN(n)=false if z(n)−S(n-1)<0.
At block 704, voting module 304 compares SIGN(n) to SIGN(n-1). If SIGN(n) and SIGN(n-1) are the same (i.e., both true or both false), then processing continues as described below with reference to block 708.
At block 706, when voting module 304 determines that SIGN(n) and SIGN(n-1) are not the same (i.e., one is true and the other is false), voting module 304 determines that the input data stream is in a state of “flickering.” As discussed above, when the smoothed values are consistently above or consistently below the input data values, the input data stream is likely trending in an increasing or decreasing direction. However, if the smoothed values fluctuate between being above and being below the input data values, the input data stream is likely flickering around approximately the same value. A positive difference between z(n) and S(n-1) followed by a negative difference between z(n) and S(n-1) (or a negative difference followed by a positive difference) therefore likely indicates a state of flickering.
When voting module 304 determines that SIGN(n) and SIGN(n-1) are the same (the “Yes” branch from block 704), at block 708, voting module 304 performs another calculation to determine whether or not, even though SIGN(n) and SIGN(n-1) are the same, the input data stream may be in a state of flickering. As discussed above, in a trending state, the smoothed data values are consistently above or consistently below the input data values. However, it is possible that the input data stream has entered a flickering state if the smoothed data values are sufficiently close to the input data values, even though they may not have moved far enough to change the value of SIGN(n) compared to the value of SIGN(n-1). Accordingly, at block 708, voting module 304 determines whether the magnitude of the difference between z(n) and S(n-1) is less than a threshold value. In an example implementation, the threshold value is equal to 5% of S(n-1). In alternate implementations, other threshold values may be used, including, for example, a constant value. If it is determined that the magnitude of the difference between z(n) and S(n-1) is less than the threshold value, then as described above with reference to block 706, voting module 304 estimates that the input data stream is in a flickering state.
At block 710, when it is determined (in block 704) that SIGN(n) equals SIGN(n-1) and it is determined (in block 708) that the difference between z(n) and S(n-1) is greater than the threshold value, voting module 304 estimates that the input data stream is in a trending state.
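The state estimate of blocks 704 through 710 can be sketched in Python as follows. The function names are hypothetical, and two behaviors are assumptions because the description does not define them: a difference of exactly zero is treated as a false sign, and the absence of a previous sign is treated as flickering.

```python
from typing import Optional


def sign_of(z: float, s_prev: float) -> bool:
    """SIGN(n): True when z(n) - S(n-1) is positive, False otherwise."""
    # Assumption: a difference of exactly zero is treated as False.
    return (z - s_prev) > 0


def is_trending(z: float, s_prev: float, prev_sign: Optional[bool],
                threshold_fraction: float = 0.05) -> bool:
    """Return True for a trending estimate, False for a flickering one."""
    # Blocks 704/706: a changed sign (or, by assumption, no previous
    # sign at all) is estimated as flickering.
    if prev_sign is None or sign_of(z, s_prev) != prev_sign:
        return False
    # Block 708: the example threshold is 5% of S(n-1).
    threshold = threshold_fraction * abs(s_prev)
    # Block 710: same sign and a gap above the threshold is trending.
    return abs(z - s_prev) > threshold
```

For instance, with S(n-1) = 10 and a previous sign of true, a sample of 12 is estimated as trending, while a sample of 10.2 falls inside the 5% threshold and is estimated as flickering.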
In an alternate implementation, rather than determining and comparing SIGN(n) and SIGN(n-1), a state of the input data stream may be estimated based on other criteria. For example, a threshold range around a previous value (as discussed with reference to block 708) may be the sole determination of whether the input data stream is trending or flickering. Other criteria may also be used to estimate the current state of the input data stream.
At block 806, voting module 304 compares VOTE(n) to MAX_VOTES. If VOTE(n) is greater than MAX_VOTES (the “Yes” branch from block 806), then at block 808, voting module 304 sets VOTE(n) equal to MAX_VOTES. The process ends at block 810.
If it is determined in block 802 that the input data stream is likely in a flickering state (the “No” branch from block 802), then at block 812, VOTE(n) is calculated by subtracting a value from VOTE(n-1). In an example implementation, voting module 304 calculates VOTE(n) by subtracting the PENALTY value from VOTE(n-1). Setting PENALTY to a value greater than one causes the value of α to decrease more quickly when it is determined that the input data stream is likely in a flickering state.
At block 814, voting module 304 compares VOTE(n) to MIN_VOTES. If VOTE(n) is less than MIN_VOTES (the “Yes” branch from block 814), then at block 816, VOTE(n) is set equal to MIN_VOTES. The process ends at block 810.
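The vote update and its range checks can be sketched in Python as follows. The function name `update_vote` and the default parameter values are illustrative assumptions, with PENALTY set to two to match the earlier example of a flickering state costing two votes.

```python
def update_vote(prev_vote: int, trending: bool, *, bonus: int = 1,
                penalty: int = 2, min_votes: int = 1,
                max_votes: int = 10) -> int:
    """Return VOTE(n) from VOTE(n-1), kept within [min_votes, max_votes]."""
    if trending:
        # Trending branch: add BONUS, then cap at MAX_VOTES (blocks 806-808).
        return min(prev_vote + bonus, max_votes)
    # Flickering branch: subtract PENALTY, then floor at MIN_VOTES
    # (blocks 812-816).
    return max(prev_vote - penalty, min_votes)
```

Because the result is clamped on every call, a long run of trending samples cannot push the vote total past MAX_VOTES, so a later flickering state begins lowering α immediately.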
At block 902, dynamic alpha generator 306 calculates a ratio of VOTE(n) to MAX_VOTES.
At block 904, dynamic alpha generator 306 raises the calculated vote ratio to a pre-defined POWER. If POWER equals one, then ALPHA(n) varies linearly from a minimum value to MAX_ALPHA as VOTE(n) varies from MIN_VOTES to MAX_VOTES. Alternatively, if POWER is equal to a value other than one, ALPHA(n) varies from a minimum value to MAX_ALPHA in a non-linear fashion. For example, if it is desired to quickly move out of a high smoothing phase, then an increase in votes should quickly cause α to increase and then taper off. Alternatively, it may be desired that votes should only minimally affect α in the early iterations, but then quickly move it toward MAX_ALPHA, indicating a preference to stay in a high-smoothing phase. Raising the vote ratio to a power other than one changes the behavior by which α varies from its minimum value to MAX_ALPHA.
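At block 906, dynamic alpha generator 306 calculates ALPHA(n) as the product of the vote ratio (raised to the pre-defined POWER) and MAX_ALPHA. This can be expressed mathematically as:
$$\mathrm{ALPHA}(n) = \left(\frac{\mathrm{VOTE}(n)}{\mathrm{MAX\_VOTES}}\right)^{\mathrm{POWER}} \cdot \mathrm{MAX\_ALPHA}$$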
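The calculation described in blocks 902 through 906 can be sketched in Python as follows. The function name `compute_alpha` and the values MAX_VOTES = 10 and MAX_ALPHA = 0.9 are illustrative assumptions; the loop simply prints α at a low, middle, and maximum vote total for three POWER settings.

```python
def compute_alpha(vote: float, max_votes: float, max_alpha: float,
                  power: float = 1.0) -> float:
    """ALPHA(n) = (VOTE(n) / MAX_VOTES) ** POWER * MAX_ALPHA."""
    ratio = vote / max_votes             # block 902: vote ratio
    return (ratio ** power) * max_alpha  # blocks 904-906


# Illustrative values only: MAX_VOTES = 10 and MAX_ALPHA = 0.9 are assumptions.
for power in (0.5, 1.0, 2.0):
    curve = [round(compute_alpha(v, 10, 0.9, power), 3) for v in (1, 5, 10)]
    print(power, curve)
```

At the same mid-range vote total, a POWER below one yields a larger α than the linear case and a POWER above one yields a smaller α, while all three settings meet at MAX_ALPHA when the vote total equals MAX_VOTES.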
Example Clauses
A: A method comprising: receiving a first portion of an input data stream; generating a first portion of an output data stream by calculating an exponential moving average based, at least in part, on the first portion of the input data stream and a smoothing factor; receiving a second portion of the input data stream; determining a state of the input data stream; in an event that the state of the input data stream is a first state, incrementing a vote value; in an event that the state of the input data stream is a second state, decrementing the vote value; updating a value of the smoothing factor based on the vote value; and generating a second portion of the output data stream having reduced noise relative to the second portion of the input data stream by calculating an exponential moving average based, at least in part, on the first portion of the output data stream, the second portion of the input data stream, and the smoothing factor.
B: A method as paragraph A recites, wherein determining the state of the input data stream comprises: determining whether values from the input data stream are trending upward, trending downward, or not trending upward or downward; in an event that the values from the input data stream are trending upward or trending downward, determining that the input data stream is in the first state; and in an event that the values from the input data stream are not trending upward or downward, determining that the input data stream is in the second state.
C: A method as paragraph B recites, wherein determining whether values from the input data stream are trending upward, trending downward, or not trending upward or downward comprises evaluating a difference between the second portion of the input data stream and the first portion of the output data stream.
D: A method as paragraph C recites, wherein evaluating the difference between the second portion of the input data stream and the first portion of the output data stream comprises: determining a first sign by determining whether the difference between the second portion of the input data stream and the first portion of the output data stream is positive or negative; determining a second sign by determining whether a difference between a previous portion of the input data stream and a previous portion of the output data stream was positive or negative; and in an event that the first sign and the second sign are different, determining that the values from the input data stream are staying approximately the same.
E: A method as paragraph D recites, wherein evaluating the difference between the second portion of the input data stream and the first portion of the output data stream further comprises: in an event that the first sign and the second sign are the same: determining whether the difference between the second portion of the input data stream and the first portion of the output data stream is within a threshold range of values; and in an event that the difference between the second portion of the input data stream and the first portion of the output data stream is within the threshold range of values, determining that the values from the input data stream are not trending upward or downward; and in an event that the first sign and the second sign are the same and the difference between the second portion of the input data stream and the first portion of the output data stream is not within the threshold range of values, determining that the values from the input data stream are trending upward or trending downward.
F: A method as any one of paragraphs A-E recite, wherein incrementing the vote value comprises: adding a vote increase constant value to the vote value to generate a new vote value; comparing the new vote value to a maximum vote value; and in an event that the new vote value is greater than the maximum vote value, setting the new vote value equal to the maximum vote value.
G: A method as any one of paragraphs A-F recite, wherein decrementing the vote value comprises: subtracting a vote decrease constant value from the vote value to generate a new vote value; comparing the new vote value to a minimum vote value; and in an event that the new vote value is less than the minimum vote value, setting the new vote value equal to the minimum vote value.
H: A method as either of paragraphs F or G recite, wherein the vote increase constant value is greater than one.
I: A method as any one of paragraphs F-H recite, wherein the vote decrease constant value is greater than one.
J: A method as any one of paragraphs A-I recite, wherein updating the value of the smoothing factor based on the vote value comprises: calculating a vote ratio by dividing the vote value by a maximum vote value; and calculating an updated smoothing factor as a product of the vote ratio and a maximum smoothing factor value.
K: A method as paragraph J recites, wherein updating the value of the smoothing factor based on the vote value further comprises raising the vote ratio to a pre-defined power prior to calculating the updated smoothing factor.
L: A computer-readable medium having computer-executable instructions thereon, the computer-executable instructions to configure a computer to perform a method as any one of paragraphs A-K recites.
M: A device comprising: one or more computer-readable media having computer-executable instructions thereon to configure a computer to perform a method as any one of paragraphs A-K recite, and a processing unit adapted to execute the instructions to perform the method as any one of paragraphs A-K recite.
N: A system comprising: one or more processors; a memory, communicatively coupled to the one or more processors; a voting module, stored in the memory and executed by the one or more processors to: determine a state of an input data stream; and increment or decrement a total vote value based on the state of the input data stream; a dynamic alpha generator, stored in the memory and executed by the one or more processors to calculate a value of a smoothing factor, alpha, based, at least in part, on the total vote value; and an exponential moving average module, stored in the memory and executed by the one or more processors to calculate an iteration of an exponential moving average based, at least in part, on the value of the smoothing factor, alpha.
O: A system as paragraph N recites, wherein: a first state indicates that values sampled from the input data stream are trending upward or trending downward; and a second state indicates that values sampled from the input data stream are remaining within a threshold range of values.
P: A system as paragraph O recites, wherein the threshold range of values is determined, at least in part, relative to a smoothed value calculated in a previous iteration of the exponential moving average.
Q: A system as either paragraph O or P recites, wherein: the voting module increments the total vote value when the input data stream is determined to be in the first state; and the voting module decrements the total vote value when the input data stream is determined to be in the second state.
R: A system as any one of paragraphs N-Q recites, wherein the voting module determines the state of the input data stream based, at least in part, on a calculated difference between a current sampled value from the input data stream and a smoothed value calculated in a previous iteration of the exponential moving average.
S: One or more computer-readable media comprising computer-executable instructions that, when executed, direct a computing system to: receive an input data stream comprising a plurality of sampled data values; and generate an output data stream comprising a plurality of smoothed data values having reduced noise relative to the input data stream by calculating multiple iterations of an exponential moving average algorithm, wherein: each iteration of the exponential moving average algorithm is calculated based on a current sampled data value, a previously calculated smoothed data value, and a smoothing factor; the smoothing factor is determined independently for each iteration of the exponential moving average algorithm; and for each iteration of the exponential moving average algorithm: a state of the input data stream is determined based, at least in part, on the current sampled data value; a total vote value is updated based on the state of the input data stream; and the smoothing factor for the current iteration of the exponential moving average algorithm is determined based, at least in part, on the total vote value.
T: One or more computer-readable media as paragraph S recites, wherein, the total vote value is restricted within a pre-defined range of vote values.
U: One or more computer-readable media as paragraph T recites, wherein, the smoothing factor is restricted within a pre-defined range of alpha values such that as the total vote value varies from a minimum vote value to a maximum vote value, the smoothing factor varies from a minimum alpha value to a maximum alpha value.
V: One or more computer-readable media as paragraph U recites, wherein, as the total vote value varies linearly from a minimum vote value to a maximum vote value, the smoothing factor varies non-linearly from a minimum alpha value to a maximum alpha value.
W: One or more computer-readable media as paragraph U recites, wherein, as the total vote value varies linearly from a minimum vote value to a maximum vote value, the smoothing factor varies linearly from a minimum alpha value to a maximum alpha value.
Although example democratic alpha smoothing has been described in language specific to structural features and/or methodological steps, it is to be understood that democratic alpha smoothing as defined in the appended claims is not necessarily limited to the specific features or steps described above. Rather, the specific features and steps described above are disclosed as examples of implementing the claims and other equivalent features and steps are intended to be within the scope of the claims.