Conventional methods for ascribing numerical indices characterizing particular areas of interest, such as the financial performance of a publicly traded company, are usually self-generated by the area of interest and reflect only a narrow, standardized set of internal metrics. These narrow, standardized sets of internal metrics often do not capture the true value of an entity within an area of interest, such as a company, as regarded by the set of all stakeholders or interested parties at large, usually external to the area of interest.
Methods and systems are provided for assessing and providing long-term indicators of sentiment. While it is beneficial to provide a technique whereby a numerical index, or plurality of indices, are generated to precisely reflect the aggregate sentiment of interested parties, stakeholders, experts and the like in regard to a particular area of interest and observation, additional benefit may be provided when assessing these aggregate sentiments in the context of similar areas of interest and/or over stretches of time. It is further apparent that there is a benefit to present informative items, related to an area of interest, in ways that provide context and long-term indicators of sentiment.
This invention relates to a method and system for the generation of a long-term numerical index that provides an enhanced metric reflecting sentiment associated with rapidly changing indications of sentiment. The numerical index may be indicative of a value of an entity in the area of interest. This invention is applicable in areas of interest such as evaluating the characteristics of corporate behavior and performance as traditionally and conventionally only characterized heretofore by standardized financial data and metrics. Furthermore, this invention is applicable in areas of interest that can be attributed by news articles consumable by an observant public, and where members of that public have varying degrees of expertise. The invention can be applicable to other areas of interest for polling audiences on certain characteristics of, for example and without limitation, a product, sports team, individual athlete, celebrity, company, news, or other areas.
In an aspect of the invention, a method of assessing aggregate sentiment over a plurality of time increments of a time period is provided. The method comprises assigning a maximum aggregation factor that is associated with a particular time period. The method also comprises receiving a plurality of time increments over the time period, wherein each time increment has a characteristic baseline incremental sentiment value (BISV) and incremental sentiment value (ISV). Additionally, the method comprises for each time increment, subtracting the BISV from the ISV to form a BISV/ISV difference value. The method also comprises normalizing the BISV/ISV difference value by dividing by the maximum possible difference, thereby determining a modulator. The method also comprises for each time increment, assigning a value to a recency of the particular time increment to a most recent incremental sentiment value update event, thereby determining a decay factor. The method comprises modulating the maximum aggregation factor associated with a particular time period by multiplying a determined modulator and a determined decay factor associated with each time increment within the evaluated time interval. The method also comprises applying the modulated maximum aggregation factor to aggregated sentiment values, thereby determining an aggregate sentiment value for each time increment over the time period.
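By way of illustration only, the modulation of the maximum aggregation factor described in this aspect might be sketched as follows. The function name, the boolean `updated` flags, and in particular the geometric form of the decay factor (`decay_base` raised to the number of increments since the most recent ISV update event) are assumptions made for the example; the text above does not specify how recency is converted into a decay factor, nor how the modulated factor is subsequently applied to the aggregated sentiment values, so that final step is omitted.

```python
def modulated_factors(isv, bisv, updated, max_factor, max_diff, decay_base=0.5):
    """Return one modulated aggregation factor per time increment.

    isv, bisv  : incremental and baseline incremental sentiment values
    updated    : True where an ISV update event occurred in that increment
    max_factor : maximum aggregation factor assigned to the time period
    max_diff   : maximum possible ISV - BISV difference (used to normalize)
    decay_base : ASSUMPTION, per-increment geometric decay since last update
    """
    factors = []
    since_update = None  # increments elapsed since the most recent update event
    for value, baseline, was_updated in zip(isv, bisv, updated):
        if was_updated:
            since_update = 0
        elif since_update is not None:
            since_update += 1
        # Modulator: normalized BISV/ISV difference value.
        modulator = (value - baseline) / max_diff
        # Decay factor: 1.0 at an update event, shrinking as the update ages.
        decay = 0.0 if since_update is None else decay_base ** since_update
        factors.append(max_factor * modulator * decay)
    return factors


factors = modulated_factors(
    isv=[70, 70, 40], bisv=[50, 50, 50], updated=[True, False, True],
    max_factor=2.0, max_diff=100,
)
```

In this sketch an increment with no prior update event receives a decay factor of zero, which is one possible convention among several.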
In another aspect of the invention, a method of assessing a momentum indicator of a time period is provided. The method comprises receiving a plurality of aggregate sentiment values, wherein each aggregate sentiment value of the plurality of aggregate sentiment values is associated with a time increment within the time period. The method also comprises calculating a curve that models a function of the plurality of aggregate sentiment values across the time period. Additionally, the method comprises determining a momentum indicator based upon characteristics of the curve that models a function of the plurality of aggregate sentiment values across the period of time.
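As one possible reading of this aspect, the curve may be taken to be a straight line fitted by ordinary least squares, with its slope serving as the momentum indicator. This choice of curve and characteristic is an assumption made for illustration; the text above does not limit the curve to a line or the indicator to a slope. The function name `momentum_indicator` is hypothetical.

```python
def momentum_indicator(values):
    """Slope of the least-squares line through (0, v0), (1, v1), ...

    values: aggregate sentiment values, one per equally spaced time increment.
    A positive result suggests rising sentiment, a negative one falling.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two aggregate sentiment values")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    return covariance / variance


rising = momentum_indicator([50, 52, 54, 56])
falling = momentum_indicator([60, 55, 50])
```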
In another aspect of the invention, a method of assessing composite sentiment over a plurality of time increments of a time period is provided. The method comprises assigning a half life parameter. The method also comprises obtaining a diminishing rate from the half life parameter. Additionally, the method comprises assigning a seasoning period. The method also comprises obtaining a first reported general sentiment score. The method also comprises obtaining a general sentiment score over a plurality of time increments over a time period. Further, the method comprises identifying, for each general sentiment score, a number of time periods that the general sentiment score remains unchanged. Additionally, the method comprises assigning a neutral general sentiment score. The method also comprises assigning an information decay factor. The method further comprises calculating a fade-adjusted general sentiment score at a given time based on (a) a general sentiment score at the given time, (b) a number of time periods that the general sentiment score has remained unchanged, (c) the assigned neutral general sentiment score, and (d) the assigned information decay factor. The method also comprises obtaining a seed long-term score by combining the plurality of fade-adjusted general sentiment scores present within the seasoning period. Additionally, the method comprises calculating a present long-term score by iteratively updating long-term scores associated with a time period between the time associated with the seed long-term score and the time associated with the most current long-term score, wherein said long-term scores are updated based on factors selected from the group consisting of a fade-adjusted general sentiment score, diminishing rate, a seed value of the long-term score, and the most recent previous long-term score.
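The steps of this aspect can be illustrated with a minimal sketch. None of the formulas below are given in the text above; they are assumptions chosen only to make the sequence of steps concrete: the diminishing rate is taken as 0.5 raised to 1/half-life, the fade-adjusted score pulls an unchanged score toward the neutral score geometrically, the seed is the plain mean over the seasoning period, and the iterative update is an exponential blend of the previous long-term score and the fade-adjusted score. All function and parameter names are hypothetical.

```python
def diminishing_rate(half_life):
    # ASSUMPTION: weight retained per time increment, halving every half_life.
    return 0.5 ** (1.0 / half_life)


def fade_adjusted(score, unchanged_periods, neutral, info_decay):
    # ASSUMPTION: an unchanged score fades toward neutral by info_decay per period.
    return neutral + (score - neutral) * info_decay ** unchanged_periods


def long_term_score(scores, half_life, seasoning, neutral=50.0, info_decay=0.9):
    """Return the present long-term score for a series of general sentiment scores."""
    if not 0 < seasoning <= len(scores):
        raise ValueError("seasoning period must fit within the score series")
    faded = []
    unchanged = 0
    for k, score in enumerate(scores):
        # Count how many periods the score has remained unchanged.
        unchanged = unchanged + 1 if k > 0 and score == scores[k - 1] else 0
        faded.append(fade_adjusted(score, unchanged, neutral, info_decay))
    # Seed long-term score: combine fade-adjusted scores in the seasoning period.
    current = sum(faded[:seasoning]) / seasoning
    rate = diminishing_rate(half_life)
    # Iteratively update from the seed to the most current time increment.
    for value in faded[seasoning:]:
        current = rate * current + (1.0 - rate) * value
    return current


present = long_term_score([60, 60, 60, 80], half_life=1, seasoning=2,
                          neutral=50.0, info_decay=0.5)
```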
In a further aspect of the invention, a method for assessing a volume-modulated composite sentiment over a plurality of time increments of a time period is provided. The method comprises assigning a half life parameter. The method also comprises obtaining a diminishing rate from the half life parameter. Additionally, the method comprises assigning a seasoning period. The method also comprises obtaining a first reported general sentiment score. The method also comprises obtaining a general sentiment score over a plurality of time increments over a time period. Further, the method comprises identifying, for each general sentiment score, a number of time periods that the general sentiment score remains unchanged. Additionally, the method comprises assigning a neutral general sentiment score. The method also comprises assigning an information decay factor. The method further comprises calculating a fade-adjusted general sentiment score at a given time based on (a) a general sentiment score at the given time, (b) a number of time periods that the general sentiment score has remained unchanged, (c) the assigned neutral general sentiment score, and (d) the assigned information decay factor. The method also comprises obtaining a seed long-term score by combining the plurality of fade-adjusted general sentiment scores present within the seasoning period. Additionally, the method comprises calculating a present long-term score by iteratively updating long-term scores associated with a time period between the time associated with the seed long-term score and the time associated with the most current long-term score, wherein said long-term scores are updated based on factors selected from the group consisting of a fade-adjusted general sentiment score, diminishing rate, a seed value of the long-term score, and the most recent previous long-term score. The method also comprises counting a volume of news events associated with a particular time increment. 
Additionally, the method comprises calculating an average per-time-increment volume associated with each time increment across a plurality of time increments within a time period. The method further comprises determining that a particular time increment within the plurality of time increments is associated with a relative volume spike in comparison to other time increments within the time period. Further, the method comprises calculating a maximum volume spike within the time period. Additionally, the method comprises assigning an attenuation factor that is configured to amplify a particular long-term score. The method also comprises modulating the calculated present long-term score based on the assigned attenuation factor.
Additional aspects and advantages of the present disclosure will become readily apparent to those skilled in this art from the following detailed description, wherein only exemplary embodiments of the present disclosure are shown and described, simply by way of illustration of the best mode contemplated for carrying out the present disclosure. As will be realized, the present disclosure is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:
While preferable embodiments of the invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention.
The invention provides systems and methods for assessing and providing long-term indicators of sentiment. Various aspects of the invention described herein may be applied to any of the particular applications set forth below. The invention may be applied as a standalone device, or as part of an integrated online valuation system. It shall be understood that different aspects of the invention can be appreciated individually, collectively, or in combination with each other.
Long-term indicators of sentiment may be generated by assessing a numerical sentiment index, or a plurality of sentiment indices, representing the aggregate sentiment of a collection of contributing observers. The contributing observers may retain a range of expertise or influence in an area of interest, and may review informative items relating to said area of interest arising from a source, or plurality of sources. Examples of sources may include newsfeeds, company filings, agency studies, government data, and analyst reports.
In various embodiments, sentiment that is generated may provide observers with feedback of the values of the sentiment index or indices associated with the area of interest, enabling further sentiment input by additional observers. The feedback provided to an observer may incorporate or aggregate values of the sentiment index or indices from other observers. This feedback looping process can then continue indefinitely and with updates at high temporal frequency.
Furthermore, in various embodiments, sentiment that is generated may provide observers with a flow of the latest informative items, most recently available from their sources, which can be contemplated for additional sentiment input. In some examples, sentiment that is generated may be designed to provide observers with precise numerical representations of the most current possible sentiment associated with an area of interest, in addition to a temporal history of such a numerical representation over arbitrary, selectable ranges of time.
Once sentiment has been generated, methods and systems described herein may be used to assess and provide long-term indicators of sentiment. The various functions and methods described herein are preferably embodied within software modules executed by one or more devices possessing general purpose computing capabilities, including, but not limited to, general purpose computers, mobile “smart” phones, tablet computers, or any device possessing a Von Neumann computer architecture. A preferable embodiment also includes computing devices presenting output on visual display units, with a further preference being those with input touch capabilities. In certain preferable cases, some of the various functions and methods described herein can be embodied within hardware, firmware, or a combination or sub-combination of software, hardware, and firmware. Further examples of device or hardware characteristics are described elsewhere herein.
As provided below,
In addition to employing summarization algorithms known in the art, to produce compact representations of the original informative items sufficient for ease of consumption by observers and contributors, an algorithm carrying out any or all the steps below can be alternatively employed to produce a compact representation:
Sort the weighted sentences by weight, highest to lowest.
Display to consumers the sentences from the sorted list to any desired depth (for example, the first five sentences), and interpret this result as a summarization of the source material.
The method of providing compact representations of the original information may be used by way of example only and is not limiting.
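The sorting and display steps listed above can be sketched as follows. The example assumes that each sentence has already been assigned a weight by an earlier step; how those weights are produced is not shown here, and the function name `summarize` and the sample weights are hypothetical.

```python
def summarize(weighted_sentences, depth=5):
    """Return the `depth` highest-weighted sentences, highest weight first.

    weighted_sentences: iterable of (sentence, weight) pairs.
    """
    ranked = sorted(weighted_sentences, key=lambda pair: pair[1], reverse=True)
    return [sentence for sentence, _ in ranked[:depth]]


summary = summarize(
    [("Revenue rose.", 0.4), ("The CEO resigned.", 0.9), ("Shares fell.", 0.7)],
    depth=2,
)
```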
Sentiment Acquisition Methods
A preferable embodiment of the invention provides capabilities for each observer or contributor 8 to efficiently inspect multiple informative items in an area of interest 5. A preferable mode of presenting a plurality of information items 5 may include augmenting conventional methods of presenting multiple information items simultaneously known in the art, such as computer display “windows”, “tiles”, and the like, with movement and content selection algorithms enabling rapid consumption and feedback acquisition. The multiple informational items simultaneously displayed may relate to a single entity or multiple entities.
A preferable embodiment of such algorithms driving the presentation of information items includes controlling the duration of time an item is presented in proportion to the amount of sentiment feedback upon it, relative to that of other information items being presented.
Similarly, a preferable embodiment of algorithms driving the presentation of information items includes controlling the proportion of display area occupied by each information item so that it is positively correlated with the proportion of sentiment feedback upon that item relative to that of other information items being presented.
Another preferable embodiment of a display control algorithm enables information item display duration and display proportion to be controlled by the incident reference counts upon each information item by other information items.
A further preferable embodiment of the information item display control algorithm displays information items in visual clusters as they relate to particular areas of interest.
An additional preferable embodiment of a display control algorithm combines the above techniques with preset weights of influence.
An additional preferable embodiment of the invention to acquire sentiment measurements may utilize sentiment values published and/or updated periodically with applicability over known durations of time. These values may then be mapped and scaled to be made mathematically comparable with the observer-driven sentiment metric ranges and further associated with timestamps distributed in a density over the same duration of time proportionate to the significance or relevance of the values in determining sentiment. The resulting sentiment output of this process can then be likened to equivalent observer-driven sentiment input metrics, suitable for processing identical to that for observer-driven sentiment input metrics. The timestamps may be reflective of when data is received (e.g., feedback from one or more users) or when data is calculated (e.g., calculation of a sentiment score or index). The timestamps may be collected with aid of a clock of a device or system.
In some examples, humans may be used to evaluate sentiment of content received from sources. In some examples, machines and/or processors may be used to evaluate sentiment of content received from sources. In some examples, machines and/or processors may be used to evaluate sentiment of content and humans may also be used to evaluate sentiment of content.
An additional preferable embodiment of the invention to acquire sentiment measurements employs natural language processing (NLP) algorithms known presently in the art which detect superlative (positive or negative) sentiment related to attributes of entities described in natural language, textual or audio. The algorithm may be steered, as known in the art, with keywords relating to the particular areas of interest. Ontological connections for different terms may be made. The sentiment output is then made mathematically comparable with the observer-driven sentiment metrics through known mathematical normalization and scaling techniques.
In examples, training the artificial intelligence (AI), associated with NLP, to detect sentiment in programmable categories may be an iterative process of successive refinement based upon setting inputs, observing results, and repeating until a satisfactory level of accuracy is accomplished. In some examples, the scope of a category may be defined, identifying subtopics it covers. Additionally, a calibration test set of article text may be built up. Text relevant to each subtopic that are representative of the target universe of text may be included in the calibration set. Each subtopic may have a few straightforward examples along with more oblique references. A reference might be oblique if it is only a brief mention or it uses less common vocabulary. Additionally, examples for edge cases may be collected, where the subtopic may be distinguished from similar but irrelevant subtopics.
Lexicons, collections of pertinent terms related to, or describing, topics or subtopics, may be defined that correspond to each subtopic. Tests may be run on the text examples, comprising the observation of the accuracy in automatically extracting a topic from raw text given a trial lexicon, and the performance of each lexicon may be evaluated. If the results are acceptable, then thresholds may be set to where relevant oblique references are counted but irrelevant references do not count. When the performance is ambiguous, an evaluation may be performed as to whether the error is consistent. Then either the subtopic may be split, or a problematic edge case may become its own subtopic. The subtopics may then be redefined, more examples may be collected to address the new subtopics, and lexicons may be edited. If the subtopics are already well-defined with enough examples, then some lexicons may be used as filters for other lexicons. Filters may use lexicons to implement boolean logic operations such as “AND” or “NOT.”
For example, a topic such as “worker treatment and rights” may include fair pay, occupational safety, and non-abusive treatment of employees. After an initial round of trying to detect all these topics with one signal, it may be found that the signal regularly misses slave labor and worker abuse. As a result, the signal may be split into two signals, “worker treatment” and “worker abuse.” From testing, a couple of problematic edge cases may also be identified. In some examples, events involving the employees of suppliers may not be included. Additionally, senior management may be excluded from the definition of “worker.” The way that the workers are referred to may not differ much whether they are the workers of suppliers or a company's own workers, so it may not be easy to narrow the signals themselves to exclude suppliers. It may be easier to detect whether the article is primarily about the supply chain or suppliers in general, and if it is, to disregard worker signals. This is an example of a “NOT” filter on “worker treatment” and “worker abuse.” The final formula on the signals in this example is as follows.
(“worker treatment” OR “worker abuse”) NOT “supply chain”
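The formula above can be illustrated with a small sketch in which each signal fires when the text contains enough terms from its lexicon. The lexicon contents, the substring-counting detection rule, and the threshold of one hit are all assumptions made for the example; an actual embodiment would rely on the NLP techniques and calibrated thresholds discussed above.

```python
# Hypothetical lexicons; the terms are illustrative only.
LEXICONS = {
    "worker treatment": ["fair pay", "wages", "occupational safety"],
    "worker abuse": ["forced labor", "slave labor", "worker abuse"],
    "supply chain": ["supplier", "supply chain"],
}


def signal_fires(text, lexicon, threshold=1):
    """True when at least `threshold` lexicon term occurrences appear in text."""
    lowered = text.lower()
    hits = sum(lowered.count(term) for term in lexicon)
    return hits >= threshold


def worker_signal(text):
    """("worker treatment" OR "worker abuse") NOT "supply chain"."""
    treatment = signal_fires(text, LEXICONS["worker treatment"])
    abuse = signal_fires(text, LEXICONS["worker abuse"])
    supply = signal_fires(text, LEXICONS["supply chain"])
    return (treatment or abuse) and not supply


own_workers = worker_signal("The company raised wages and improved occupational safety.")
supplier_story = worker_signal("A supplier was accused of forced labor in its supply chain.")
```

Here the supplier article matches the “worker abuse” lexicon but is disregarded because the “supply chain” filter also fires.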
Score Interpretation Methods
In reference to
An additional preferable embodiment of the sentiment score interpreter 2, delivers capabilities to algorithmically generate, in preparation for use by the sentiment index aggregator 3, additional numerical sentiment scores correlated with the known sentiment of the author of an information item being examined by any or all observers and contributors.
An additional preferable embodiment of the sentiment score interpreter 2, delivers capabilities to algorithmically generate, in preparation for use by the sentiment index aggregator 3, additional numerical sentiment scores generated by applying known automated sentiment scoring algorithms to textual feedback items, such as “blog comments”, associated with each informative item being examined by any or all observers and contributors.
An additional preferable embodiment of the sentiment score interpreter 2, delivers capabilities to algorithmically generate, in preparation for use by the sentiment index aggregator 3, additional numerical sentiment scores generated by applying known automated sentiment scoring algorithms to “social media” content relative to the area of interest associated with each informative item being examined by any or all observers and contributors. A skilled artisan can appreciate the use of “social media” to obtain sentiment information.
Sentiment Index Generation Methods
With reference to
In a preferable embodiment, all areas of interest can be represented and maintained as a collection of computational data resident in the storage subsystems of a computing device known in the art. A skilled artisan can then appreciate the process of computationally examining each area of interest sequentially 13 and the capability to repeat the examination of the sequence an arbitrary number of times 12, preferably indefinite. A preferable embodiment further allows for the insertion or deletion of unique areas of interest into the collection.
In a preferable embodiment, all sentiment score types related to an area of interest can be represented and maintained as a collection of computational data resident in the storage subsystems of a computing device known in the art. A skilled artisan can then appreciate the process of computationally examining each sentiment score type sequentially 15 and the capability to repeat the examination of the sequence an arbitrary number of times 14. In some instances, the examination may be repeated until a pre-condition is met. In some instances, the examination may be repeated indefinitely. A preferable embodiment may further allow for the insertion or deletion of unique sentiment score types into the collection, corresponding to a given area of interest.
In a preferable embodiment of the invention, for a sentiment score type under examination, as determined by the sentiment score type examination selection process 15, within an area of interest under examination, as determined by the area of interest examination selection process 13, the current numerical value for the sentiment score is acquired from the sentiment score interpreter 2, in reference back to
A particular contributing observer 8 that provides a sentiment score can be labeled u for this preferable method description. Similarly, a particular informative item in an area of interest 5 can be labeled i for this preferable method description. Additionally, the time mark generated in step 11 can be labeled tui for this preferable method description. For this preferable method description, the sentiment score value provided by the contributor u, through the sentiment score interpreter 2, associated with a particular informative item i, at a particular time t can be labeled R(t)(u)(i). This preferable method description applies to a particular sentiment score type in a particular area of interest; the skilled artisan can appreciate that it can be applied to each sentiment score type within each area of interest with no change to the method itself. R(t)(u)(i) can be considered as a function of three variables, continuous in time t, and discrete in both u and i. R may be a sentiment score given by an observer (e.g., may be one of a plurality of dimension values). A skilled artisan can appreciate these mathematical interpretations. The value of the function at any time t is the sentiment score provided by observer u on informative item i. The function is defined as what is known in the art as a “step” function, with the value of the sentiment score set at the most recently updated time tui. This value persists until the next update time tui. For all time prior to the first update time tui, the function is not defined mathematically. For this preferable method description, the sentiment index value can be labeled S(t), the computation of which is the objective of step 17.
In this preferable embodiment, S(t) is computed by ranging over all u and all i, multiplying each value of R(t)(u)(i) found by a weight associated with the particular observer u and particular information item i, summing these products together and then dividing the completed sum by the sum of all the weights. The skilled artisan can appreciate that the weights can be pre-recorded in digital storage media associated with a computing device and extracted for this calculation. In a preferable embodiment of this invention, the weights can be pre-correlated with the significance of the observer and the significance of the information item.
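The weighted average described above might be sketched as follows, treating R(t)(u)(i) as a step function that holds the most recent score at or before time t. The data layout (dictionaries keyed by an (observer, item) pair) and the function names are assumptions for the example. The sketch also assumes that pairs for which R is not yet defined at time t are left out of both the numerator and the sum of weights, which is one reasonable reading of the text.

```python
def latest_score(updates, t):
    """Step function: most recent score at or before t, or None if undefined.

    updates: list of (time_mark, score) pairs sorted by time_mark.
    """
    current = None
    for time_mark, score in updates:
        if time_mark > t:
            break
        current = score
    return current


def sentiment_index(ratings, weights, t):
    """Weighted average S(t) over all (observer, item) pairs defined at time t."""
    numerator = 0.0
    weight_sum = 0.0
    for key, updates in ratings.items():
        score = latest_score(updates, t)
        if score is None:
            continue  # R is undefined before the pair's first update
        numerator += weights[key] * score
        weight_sum += weights[key]
    return numerator / weight_sum if weight_sum else None


ratings = {
    ("u1", "i1"): [(1.0, 80), (3.0, 60)],
    ("u2", "i1"): [(2.0, 40)],
}
weights = {("u1", "i1"): 3.0, ("u2", "i1"): 1.0}
index_now = sentiment_index(ratings, weights, t=2.5)
```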
A further preferable embodiment generates a summary sentiment index by mathematically combining a plurality of sentiment indices related to an area of interest 4 applying a mathematical function that maps multiple scalar values into a single scalar value. A preferable embodiment of such a function is an arithmetic mean. A further preferable embodiment of such a function is a weighted arithmetic mean, with weights set correlated to the significance of a particular contributing sentiment index to the overall summary thusly computed. A preferable embodiment in selecting the plurality of sentiment indices related to an area of interest for summarization would be those indices corresponding to areas of interest subordinate to a particular major area of interest. Examples of this arrangement include scenarios where the major area of interest represents a publicly traded corporation and the subordinate areas of interest represent facets of corporate governance and behavior, such as leadership, employee relations, innovation, supplier or “ecosystem” relations, environmental stewardship, and customer relations.
An alternative embodiment for generating sentiment indices that unifies and weighs the various inputs is described below:
As an additional preferred embodiment, the general sentiment score for an area of interest, categorical or overall, is intended to reflect a continuous quantitative sentiment index, updated frequently, reflecting behavior in the areas of interest, categorical or overall. The objective of the index is to provide an indication of the movement of the sentiment over time regarding the categorical or overall area of interest, with more emphasis on the present. In addition, when creating an index of combined categories, the resultant index should be influenced more greatly by categories having more input, versus equal influence of each respective category index.
Inputs to the process of determining and updating the General Sentiment Score are raw sentiment rating values (referred to above as vote values) in the categorical areas, derived from assessing textual news content items, as they appear in time, using technology such as Natural Language Processing (NLP) or via human votes or ratings. This phase of the process, occurring prior to the scoring methods described herein, contemplates applying natural language analysis to each news content item to assess whether the subject matter relates to the area of interest and the categories of interest, and to quantify the intensity and polarity (positive or negative) of the sentiment. These quantifications are then used as input to the scoring methods described herein.
A visualization of this resulting from its implementation is shown in
For a particular category (or overall), the General Sentiment Score is produced by applying a running sum average formulation to a continual time stream of sentiment ratings related to the category (or any category, in the case of overall), weighted by a freshness factor that varies over time within the running sum averaging technique employed. Freshness allows for more recent news to be weighted more significantly. The value of the freshness parameter, and the formulation of its mathematical application, are chosen to reflect the ephemeral nature of news events, fading in importance over time. For rating data in the past, the formula can be applied retrospectively over all data points and then prospectively applied going forward in an incremental fashion. The mathematical formulation is detailed below, with rationale for the formulating structural and parametric choices made, along with descriptive introductions to each mathematical line of text.
First, the ratings are sifted to the resolution of category to obtain refined input related to the category:
$v(m,n) \equiv$ sentiment rating value in the $n$th category at the $m$th time stamp $t_m$ (measured in whole and fractional days), for a particular area of interest (such as a company), $\forall\, m, n$
The above equation assigns a symbol to the sentiment rating and labels it for the category and time. This is the core set of sentiment values to be used in deriving the scores. These sentiment values are produced by natural language processing, upstream of this phase.
N≡number of categories
The above equation assigns a symbol to the number of categories, to be used in subsequent computation. The “freshness” effect is accomplished by applying a function that diminishes the numerical significance of the sentiment rating as it enters into the numerical summarizing (averaging) process. A preferred choice for the decay rate is two weeks, while the model is sufficiently general to select another choice for that setting, should it be determined that the significance of news has a different time constant in a different context:
$f(t) \equiv e^{A(T-t)} \equiv$ freshness factor at time $t$
The above equation assigns a symbol and the exponentially functional computation to the factor used to capture the freshness of the content input assessed for sentiment. The settings below describe the parameters used in this construct:
T≡reference date selected as an arbitrary constant in the past or future
As the factors grow, this reference date T can be pushed into the future to uniformly rescale.
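The uniform rescaling noted above can be checked numerically: moving the reference date multiplies every freshness factor by the same constant, so any weighted average built from those factors is unchanged. In the sketch below the value of A is an assumption. It is taken to be negative, so that newer ratings receive larger factors, with a magnitude of ln(2)/14 per day, which reads the two-week setting as a half-life. The function names are hypothetical.

```python
import math

# ASSUMPTION: negative A so that f(t) grows with t; two weeks read as a half-life.
A = -math.log(2) / 14.0


def freshness(t, reference, a=A):
    """Freshness factor f(t) = exp(a * (reference - t)), with t in days."""
    return math.exp(a * (reference - t))


def weighted_average(times, values, reference, a=A):
    """Freshness-weighted average of rating values observed at the given times."""
    weights = [freshness(t, reference, a) for t in times]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


times = [0.0, 7.0, 14.0]
values = [40.0, 60.0, 80.0]
average_past_reference = weighted_average(times, values, reference=0.0)
average_future_reference = weighted_average(times, values, reference=1000.0)
```

Under these assumptions a rating that is fourteen days newer carries twice the weight, and the two averages agree regardless of where the reference date is placed.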
These equations below establish the symbology used in assimilating the freshness factor into the averaging process by defining a weight when a rating value exists to which to apply the freshness factor:
The computation is then carried out as a running sum average, which has the added implicit effect of naturally adjusting for the volume of inputs entering the summary (averaging) computation over time:
The computation is carried out at the categorical level, and to support the running sum averaging process, a numerator and a denominator are first generated:
$S_{m,n} \equiv \sum_{k=1}^{m} w(k,n)\,v(k,n) \equiv$ numerator sum for the $n$th category for scoring at time $t_m$
The above equation computes the numerator used in the averaging process as a weighted sum, using the weight factors w(m, n) described above applied to the sentiment rating values v(m, n) introduced above.
$W_{m,n} \equiv \sum_{k=1}^{m} w(k,n) \equiv$ denominator sum for the $n$th category for scoring at time $t_m$
The above equation computes the denominator used in the averaging process as a sum of the weight factors w(m, n) described above.
Similarly, the computation of numerator and denominator is carried out at the overall aggregated level:
$S_{m} \equiv \sum_{n=1}^{N} \sum_{k=1}^{m} w(k,n)\,v(k,n) \equiv$ numerator for aggregate scoring at time $t_m$
$W_{m} \equiv \sum_{n=1}^{N} \sum_{k=1}^{m} w(k,n) \equiv$ denominator for aggregate scoring at time $t_m$
The corresponding scores are then computed as the ratios of the numerators to denominators generated above:
$P_{m,n}(t) \equiv S_{m,n} / W_{m,n} \equiv$ General Sentiment Score for the $n$th category at any time $t_{m+1} > t \ge t_m$
$P_{m}(t) \equiv S_{m} / W_{m} \equiv$ General Sentiment Score at any time $t_{m+1} > t \ge t_m$
The above equations then compute the respective weighted sum averages at arbitrary points in time by dividing the denominators into the numerators, as defined above.
Computation of the General Sentiment Score begins with the first appearing sentiment value in the category for the area of interest, such as a company, at time t1, with that first nontrivial sentiment value being v(1, n), corresponding to the first scoreable (rate-able) news event in the category for the area of interest. For all computations, however, the very first entry into the running sum process is v(0, n), a neutral value (typically 50 on a 0 to 100 score scale) that serves as the neutrality origin of all scores and is entered into the sum contemporaneously with the first nontrivial sentiment value. Put another way, all scoring calculations are seeded with the neutral value at the same time the first nontrivial sentiment value arrives. This provides an initial damping effect that keeps the scoring system from “jumping to an initial conclusion” given just a single initial sentiment value.
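The seeded running sum average described above can be illustrated with a minimal Python sketch. The numeric value of the decay rate A is not given in this passage, so the sketch assumes that the two-week setting is a half-life, giving A = ln(1/2)/14 per day; the reference date T = 0 and the function names are likewise illustrative assumptions rather than part of the described method.

```python
import math

HALF_LIFE_DAYS = 14.0                  # assumption: "two weeks" read as a half-life
A = math.log(0.5) / HALF_LIFE_DAYS     # assumption: negative rate, so newer ratings weigh more
T_REF = 0.0                            # arbitrary reference date T, in days
NEUTRAL = 50.0                         # neutral seed value on a 0 to 100 scale


def freshness(t_days):
    """Freshness factor f(t) = exp(A * (T - t)); grows with t because A < 0."""
    return math.exp(A * (T_REF - t_days))


def general_sentiment_score(ratings):
    """Weighted running sum average S / W for one category.

    ratings: list of (t_days, value) pairs. The sums are seeded with the
    neutral value at the time of the earliest rating.
    """
    if not ratings:
        return NEUTRAL
    t_first = min(t for t, _ in ratings)
    seed_weight = freshness(t_first)
    numerator = seed_weight * NEUTRAL
    denominator = seed_weight
    for t, value in ratings:
        weight = freshness(t)
        numerator += weight * value
        denominator += weight
    return numerator / denominator
```

With a single rating of 100 at t = 0, the seed pulls the score to the midpoint of 50 and 100, which shows the damping effect of the neutral seed.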
The partial summing can be done in bulk or incrementally, as the above numerators and denominators can be held separately and updated as new information (sentiment ratings) arrives over time. The incremental updates themselves can also be performed at any time after the new events are collected, not necessarily immediately, allowing for “bucketing” of events and semi-batch processing and updates of the numerators, denominators, and resultant General Sentiment Scores. An example of this process would be to “bucket” scoreable event inputs over the course of an hour and then update the running sums (numerators and denominators for each category and overall) at the end of the hour.
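The bucketed, incremental update described above might be sketched as follows in Python. The class name `RunningScore`, its methods, and the decay rate (a two-week half-life) are assumptions made for illustration, and the neutral seed is omitted to keep the sketch short.

```python
import math
from collections import defaultdict

A = math.log(0.5) / 14.0   # assumed decay rate per day (two-week half-life)
T_REF = 0.0                # assumed reference date T, in days


class RunningScore:
    """Keeps numerators and denominators per category and overall (hypothetical helper)."""

    def __init__(self):
        self.s = defaultdict(float)   # per-category numerators
        self.w = defaultdict(float)   # per-category denominators
        self.s_all = 0.0              # overall numerator
        self.w_all = 0.0              # overall denominator
        self.pending = []             # bucket of events not yet applied

    def collect(self, t_days, category, value):
        """Store an event in the bucket without touching the running sums."""
        self.pending.append((t_days, category, value))

    def flush(self):
        """Apply every bucketed event to the running sums, then empty the bucket."""
        for t, category, value in self.pending:
            weight = math.exp(A * (T_REF - t))
            self.s[category] += weight * value
            self.w[category] += weight
            self.s_all += weight * value
            self.w_all += weight
        self.pending.clear()

    def score(self, category=None):
        """Return S / W for a category, or overall when category is None."""
        if category is None:
            s, w = self.s_all, self.w_all
        else:
            s, w = self.s[category], self.w[category]
        return s / w if w else None
```

Calling `flush()` once per hour reproduces the hourly bucketing example; scores returned between flushes reflect only the events applied so far.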
The consumer of any of the scores can be given the ability to combine various categories in a custom manner to produce a combined custom sentiment score and presentation. As discussed above, the overall General Sentiment Score, which is an implicit combination of all category scores, is computed by performing the running sum average over all input sentiment rating values, maintaining a separate numerator and denominator independent of category. In this way, a natural volume weighting occurs: when more sentiment inputs arrive for one category than for another, the category with more arrivals has more influence on the overall General Sentiment Score. The same effect is desired for custom combined categories, yet it is impractical to maintain separate running sum numerators and denominators for all possible combinations of categories. Instead, the denominators used in calculating the General Sentiment Score of the categories being combined, as explained above, can be employed to provide a similar volume-weighted effect. In particular, these denominators can be applied as weights in a weighted average of any of the score types, not necessarily limited to the General Sentiment Score, across the selected categories for an area of interest to arrive at the custom combined category rendition of the score.
Below is the description in mathematical terms, along with descriptive introductions to each mathematical line of text:
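$\text{combined score}_j(t) \equiv \dfrac{\sum_{k=1}^{C} W_{j,k}(t)\, IS_{j,k}(t)}{\sum_{k=1}^{C} W_{j,k}(t)}$
at time t for the subset C of categories selected for the jth area of interest, such as a company.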
The above equation defines the technique for computing the General Sentiment Score corresponding to an arbitrarily combined set of categories. It is a weighted average (weighted sum divided by sum of weights) using the per-category denominators Wj,k(t), as defined above, and using the parameters symbolically defined in the equations below:
C≡number of categories selected
$IS_{j,k}(t) \equiv$ score known at time $t$ for the $j$th company in the $k$th category within the subset C of categories selected
$W_{j,k}(t) \equiv$ running sum General Sentiment Score denominator known at time $t$ for the $j$th company in the $k$th category within the subset C of categories selected
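As an illustration of the custom combination, the following Python sketch computes a weighted average of per-category scores using the per-category running sum denominators as weights. The function name and the dictionary-based inputs are assumptions chosen for readability.

```python
def combined_score(scores, denominators, selected):
    """Volume-weighted average of category scores for one area of interest.

    scores:       {category: score known at time t}
    denominators: {category: running sum denominator W known at time t}
    selected:     iterable of the categories chosen for the combination
    """
    selected = list(selected)
    total_weight = sum(denominators[k] for k in selected)
    if total_weight == 0:
        raise ValueError("selected categories have no accumulated weight")
    weighted_sum = sum(denominators[k] * scores[k] for k in selected)
    return weighted_sum / total_weight
```

A category with a larger denominator (more, or fresher, inputs) pulls the combined score toward its own score, mirroring the volume weighting of the overall score.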
A preferable embodiment of the invention enables the consumer of sentiment indices, generated within the capabilities of the invention, to additionally consume information characterizing the correlation of the generated sentiment indices with known, published indices in the area of interest. A skilled artisan can appreciate the use of known mathematical correlation techniques for determining correlation metrics between the sentiment indices generated by embodiments of the invention and known indices characterizing the area of interest.
A further preferable embodiment of the invention teaches the correlation of sentiment indices, in areas of interest relating to corporate behavior, with rapidly changing conventional financial indicators including, but not limited to, stock price, related derivative indicators, and other rapidly changing known financial indicators.
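One known correlation technique that could serve here is the Pearson correlation coefficient. The sketch below is an assumed choice for illustration; the text does not prescribe a particular correlation metric. It takes two equally long, time-aligned series, such as a sentiment index and a stock price sampled at the same times.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two time-aligned series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

Values near 1 or -1 indicate a strong linear relationship between the two series, and values near 0 indicate little linear relationship.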
A preferable embodiment of the invention enables the consumer of sentiment indices, generated within the capabilities of the invention, to additionally consume information articulating the collective behavior of, and relationships among, the constituents within groups of areas of interest. Information collected for various groups may be compared and/or differentiated. In some embodiments, information may be displayed comparing sentiment index data gathered from different groups.
To indicate aggregate behavior of the indices corresponding to constituents of a collection of areas of interest, a preferable embodiment of the invention enables the consumer to view a display of, and/or obtain a report of, statistics computed across the collection, including, but not limited to, mean, median, and standard deviation. Such statistics may be individualized for different groups or areas of interest.
To indicate behavior of the indices corresponding to constituents of a collection of areas of interest, relative to other constituents within the same collection, a preferable embodiment of the invention enables the consumer to view a display of, and/or obtain a report of, comparative metrics of the index corresponding to each constituent relative to those of other constituents, selected groups of constituents, or relative to aggregate statistics across the collection.
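The aggregate statistics and constituent comparisons described above can be sketched with the Python standard library. The choice of the sample standard deviation and of a z-score as the comparative metric are assumptions for illustration; other comparative metrics would fit the description equally well.

```python
import statistics


def collection_summary(indices):
    """Aggregate statistics plus each constituent's position relative to them.

    indices: {constituent name: sentiment index value}
    """
    values = list(indices.values())
    mean = statistics.mean(values)
    median = statistics.median(values)
    # Sample standard deviation needs at least two values.
    stdev = statistics.stdev(values) if len(values) > 1 else 0.0
    aggregate = {"mean": mean, "median": median, "stdev": stdev}
    relative = {
        name: {
            "diff_from_mean": value - mean,
            "z_score": (value - mean) / stdev if stdev else 0.0,
        }
        for name, value in indices.items()
    }
    return aggregate, relative
```

The same function can be applied to any subgroup of constituents to obtain group-level statistics.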
A preferable embodiment of the invention computes comparative metrics among indices of constituents of a collection of areas of interest, relative to other constituents within the same collection, by applying the technique known in the art as “Data Envelopment Analysis” or “DEA.” Such techniques may be applied such that the “outputs” in the known DEA technique are the sentiment indices and the “inputs” can be any quantitative indicators known or hypothesized to have a causal relationship with the sentiment indices of the areas of interest within the collection. The consumer can then view, or obtain reports containing, the standard statistics generated by the DEA technique to assess the behavior of the indices of the peer constituents within the collection relative to one another.
In some specific applications of the invention, the areas of interest may be economic entities such as corporations and the sentiment indices may relate to measures in domains including, yet not limited to, anti-competitive behavior; business model; data security & privacy; leadership/governance; product innovation/integrity; environmental responsibility that includes environmental atmosphere, environmental land, and environmental water; human capital topics such as employee responsibility/workplace; marketing practices; political influence; product integrity & innovation; social responsibility/impact; supply chain; sustainable energy use & production; and/or custom categories such as economic sustainability.
Anti-competitive behavior may focus on firms' use of anti-competitive practices to prevent or restrict competition. This may include, but is not limited to, predatory pricing, transfer pricing, price fixing, geographic monopolies and dividing territories, dumping, exclusive dealing, and bid rigging. Business model may focus on firms' development of strategies to create and deliver value in the short-term and/or the long-term, minimize or mitigate systemic risks and negative externalities as relevant, and avoid controversial business practices. Data security & privacy may focus on firms' data security practices and policies, as well as on their privacy policies and practices related to customer data.
Environmental atmosphere may focus on all environmental impacts on the atmosphere at the local and/or global levels, such as greenhouse gases, climate change, mercury, and/or other emissions. Environmental land may focus on environmental impacts on land, such as biodiversity, deforestation, solid waste disposal, soil pollution, land degradation, and rehabilitation. Environmental water may focus on environmental impacts on water resources, such as waste water, water pollution, aquatic biodiversity, and water efficiency.
Governance may focus on the relationship of a firm's top management and board to its stockowners and key stakeholders. Considerations may include ownership structure, voting and proxy processes, board structure and tenure, ethical business practices, and executive compensation arrangements. Governance may exclude dividend reporting. Human capital may focus on the treatment of both unionized and non-union employees according to generally accepted international fair labor standards. Relevant issues may include employee retention, education and training, health and safety, compensation and benefits, as well as diversity and mentoring programs. Marketing practices may focus on information accuracy and completeness, transparent labeling, appropriate marketing channels, and the incorporation of social and environmental considerations as appropriate.
Political influence may focus on firms' lobbying practices and attempts at regulatory capture, as well as undue influence to the degree that these activities may undermine the ability of the political structure and governmental agencies to serve the public interest. Product integrity & innovation may focus on the quality and innovativeness of products and services, as well as the research and development of products in the pipeline. Product integrity & innovation may also include the management of packaging and disposal over the product's life cycle. Social impact may focus on recognized international human rights standards, impact on relationships with relevant communities and key stakeholders, as well as philanthropy and charity.
Supply chain may focus on firms' logistical organization and coordination with their suppliers, including social and environmental conditions and impacts. Supply chain may also include adherence to supply chain labor standards, sourcing of controversial raw materials, and adherence to or development of industry best practices. Sustainable energy use & production may focus on firms' use and production of sustainable energy forms, including those that minimize negative externalities, such as wind and solar power. It may also include how efficiently firms use energy inputs. Custom categories may be used to create data categories and weighting systems according to user specifications.
In such an application, comparative metrics may be computed among indices of constituents of a collection of areas of interest, relative to other constituents within the same collection, by applying the DEA technique, which sets the “outputs” as the sentiment indices, while the “inputs” can be any quantitative indicators known or hypothesized to have a causal relationship with the sentiment indices of the areas of interest within the collection, including, but not limited to, standard economic and financial metrics related to the economic entity, such as return on assets (ROA), return on investment (ROI), and economic value added (EVA). The consumer can then view, or obtain reports containing, the standard statistics generated by the DEA technique to assess the level of “efficiency” with which economic inputs were deployed to achieve the sentiment levels corresponding to the sentiment domains described above.
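General DEA solves a linear program for each constituent and is beyond a short example. As a deliberately simplified sketch, the special case of one input and one output under constant returns to scale reduces to comparing each constituent's output-to-input ratio with the best ratio in the collection. The function name and the restriction to a single positive input are assumptions made for illustration.

```python
def dea_efficiency_single(inputs, outputs):
    """Relative efficiency for the one-input, one-output special case of DEA.

    inputs:  {constituent: positive input quantity (e.g. an economic metric)}
    outputs: {constituent: sentiment index value}
    Returns {constituent: efficiency in (0, 1]}, where 1.0 marks the frontier.
    """
    if any(x <= 0 for x in inputs.values()):
        raise ValueError("this simplified case requires positive inputs")
    ratios = {name: outputs[name] / inputs[name] for name in inputs}
    best = max(ratios.values())
    if best <= 0:
        raise ValueError("at least one constituent needs a positive output")
    return {name: ratio / best for name, ratio in ratios.items()}
```

A constituent with efficiency 1.0 lies on the efficient frontier; lower values indicate that a peer achieved more sentiment per unit of input.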
A preferable embodiment of the invention enables the consumer of sentiment indices, generated within the capabilities of the invention, to additionally consume information articulating the behavior of the indices over time as described below.
To depict aggregate temporal behavior of the index over selectable windows of time, a preferable embodiment of the invention enables the consumer to view a curve representing the moving average of the index over time. A skilled artisan can appreciate the use of known mathematical techniques for computing the simple moving average, the cumulative moving average, the weighted moving average, and the exponential moving average. Any or all these are applicable in displaying moving average behavior of a sentiment index to a consumer in conjunction with the temporal behavior of the sentiment index itself.
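Two of the moving averages named above can be sketched in a few lines of Python. The function names, the fixed sample window, and the smoothing parameter `alpha` are illustrative assumptions; samples are assumed to be evenly spaced in time.

```python
def simple_moving_average(values, window):
    """Mean of the last `window` samples, emitted once the window is full."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]   # drop the sample leaving the window
        if i >= window - 1:
            averages.append(running / window)
    return averages


def exponential_moving_average(values, alpha):
    """EMA with smoothing factor alpha in (0, 1]; the first sample seeds it."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    averages = []
    ema = None
    for value in values:
        ema = value if ema is None else alpha * value + (1 - alpha) * ema
        averages.append(ema)
    return averages
```

Either curve can be plotted alongside the raw index so the consumer sees both the smoothed behavior and the underlying values.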
Aggregate Statistics and Constituent Peer Comparisons over Time
To depict temporal behavior of collections of indices over selectable windows of time, a preferable embodiment of the invention enables the consumer to view curves representing any or all aggregate statistics and constituent peer comparisons as functions of time. Graphical representations may show peer-to-peer comparisons, peer-to-groups of peer comparisons, groups of peers-to-groups of peers comparisons, peer-to-entire aggregation comparisons, or groups of peers-to-entire aggregation comparisons.
To further depict aggregate temporal behavior of the index over selectable windows of time, a preferable embodiment of the invention enables the consumer to view a curve representing a mathematically fit trend. A skilled artisan can appreciate the use of known mathematical techniques for computing polynomial fit curves of selectable degree, periodic fit curves, and exponential fit curves. Any or all these are applicable in displaying trending behavior of a sentiment index to a consumer in conjunction with the temporal behavior of the sentiment index itself.
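As a minimal illustration of a fitted trend, the sketch below performs an ordinary least squares fit of a degree-one polynomial (a straight line) and reports the coefficient of determination as one common goodness-of-fit measure. Restricting the fit to degree one and the function names are assumptions for brevity; higher degrees, periodic fits, and exponential fits follow the same pattern.

```python
def linear_fit(ts, ys):
    """Least squares line y = slope * t + intercept through the samples."""
    n = len(ts)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (t, y) samples")
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    s_tt = sum((t - mean_t) ** 2 for t in ts)
    if s_tt == 0:
        raise ValueError("all time stamps are identical")
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) / s_tt
    intercept = mean_y - slope * mean_t
    return slope, intercept


def r_squared(ts, ys, slope, intercept):
    """Coefficient of determination of the fitted line on the same samples."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * t + intercept)) ** 2 for t, y in zip(ts, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot if ss_tot else 1.0
```

The fitted line can be drawn over the index curve for the selected window, with the slope indicating the direction and rate of the trend.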
To further inform temporal behavior of the index, or any derivative function of time of an index or indices, over selectable windows of time, a preferable embodiment of the invention enables the consumer to view, or receive remotely, alerts indicating index changes within fixed, moving, or dynamically expandable windows of time triggered by fixed, moving, or dynamically expandable thresholds, keyed from the start of the time window, from the most recent time the threshold was exceeded, or any combination thereof. Such alerts may be delivered to the consumer by any known route (e.g., email, text message, pop-up, phone call, or a mobile application). The consumer may define how the consumer wishes to receive the alert. The consumer may define which alerts the consumer wishes to receive, and/or thresholds for providing alerts. The consumer may define the time window, such as a start and/or end time for the time window.
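The simplest variant described above, a fixed window with a fixed threshold keyed from the start of the window, might look like the following Python sketch. The function name, the use of an absolute change, and the inclusive comparison are assumptions for illustration.

```python
def first_alert_time(series, start, end, threshold):
    """Time of the first sample whose change from the window start reaches the threshold.

    series: list of (t, value) pairs sorted by time
    Returns None when no sample in [start, end] triggers an alert.
    """
    window = [(t, v) for t, v in series if start <= t <= end]
    if not window:
        return None
    baseline = window[0][1]            # index value at the start of the window
    for t, value in window[1:]:
        if abs(value - baseline) >= threshold:
            return t
    return None
```

Moving windows or thresholds keyed from the most recent exceedance can be built by calling the same function repeatedly with an updated `start`.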
For a given trend as described above, to provide an indication that the trend will continue into the future with its current parameters, enabling predictability, an embodiment of the invention enables the consumer to obtain a figure of merit indicating the confidence that the trend will continue. Such an indicator may make use of metrics known in the art as goodness of fit. A confidence figure can be computed as follows:
In alternate implementations, other numerical values may be provided for A, B, C, and/or D.
In a further refinement of this metric, within an alternative embodiment of the invention, a predictive period of time, dt, may be selected by the consumer, in addition to a prior fit period of time T. A trend calculation can then be performed as described above for a selected fit type to generate the fit parameters that can then extend the curve beyond the fit period T by the selected predictive period dt. Error calculations may then be performed between the predicted curve and the actual data over the interval dt and the confidence figure may be computed for that range, rather than the fit range as described above.
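The out-of-sample check described above can be sketched as follows: fit on the period T, extend the fitted curve over the predictive period dt, and measure the error against the actual data in that interval. The sketch assumes a straight-line fit and uses root mean square error as the error measure; how the error is mapped to the confidence figure is not specified here and is therefore left out.

```python
import math


def holdout_rmse(series, fit_end, predict_end):
    """RMSE of a line fitted on t <= fit_end, evaluated on fit_end < t <= predict_end.

    series: list of (t, value) pairs
    """
    fit = [(t, y) for t, y in series if t <= fit_end]
    held_out = [(t, y) for t, y in series if fit_end < t <= predict_end]
    if len(fit) < 2 or not held_out:
        raise ValueError("need two or more fit samples and one held-out sample")
    n = len(fit)
    mean_t = sum(t for t, _ in fit) / n
    mean_y = sum(y for _, y in fit) / n
    s_tt = sum((t - mean_t) ** 2 for t, _ in fit)
    if s_tt == 0:
        raise ValueError("fit samples share a single time stamp")
    slope = sum((t - mean_t) * (y - mean_y) for t, y in fit) / s_tt
    intercept = mean_y - slope * mean_t
    squared_errors = [(y - (slope * t + intercept)) ** 2 for t, y in held_out]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A small error over the predictive period suggests that the trend fitted on the earlier period continued to hold.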
To provide the ability to forecast an index characterizing the area of interest, a correlation calculation between the sentiment index and the index characterizing the area of interest can be performed and extrapolated to estimate a forecasted value of the index characterizing the area of interest. A skilled artisan can appreciate the use of known mathematical techniques for computing correlated trends that can be extrapolated into the future to obtain estimates of future values, at chosen durations into the future, of one or all of the correlated variables. A preferable embodiment of conducting such a calculation is the use of neural network algorithms, using time sequences of multiple indices to train the network and then applying the trained network to forecast future values of the indices.
A further preferable embodiment of the invention teaches the forecasting temporal correlation of sentiment indices. In some examples, real-time sustainability data may be of a comparable nature to stock price performance. Additionally, real-time sustainability data may be an ideal leading indicator of associated stock price performances or other frequent financial measures due to the high-frequency nature of the real-time sustainability data. In some examples, the forecasting temporal correlation of sentiment indices may be used in areas of interest relating to corporate behavior, with rapidly changing conventional financial indicators including, but not limited to, stock price, related derivative indicators, index volatility, company volatility, and other rapidly changing known financial indicators.
To provide an assessment of the crowd strength data quality of a particular sentiment index, an embodiment of the invention enables the consumer to query a metric indicating the concentration of observers of various observer classes convolved with the recentness or “freshness” of the observer sentiment. One or more of the following steps may be implemented to compute such a metric:
To refine the value of the freshness decay rate, an algorithm may be employed that may sample the pool of observation data to characterize a canonical rate of change as follows:
To reflect the cumulative effects of sentiment over time, a consumer may query a metric indicating the sustainability of the sentiment level over extended periods of time. A preferable embodiment of the invention may implement the following to compute such a metric:
To provide an indication that a trend may be changing, or if a trend is deviating from a trend of another index associated with an entity, a consumer may obtain alerts when these triggers are detected. A preferable embodiment of calculating the conditions for such triggers is as follows:
To provide an assessment of the time series volatility of a particular sentiment index, an embodiment of the invention enables the consumer to query a metric indicating a relative magnitude of index variability over time. An embodiment of the invention can include one or more of the following steps to compute a volatility metric:
Another embodiment of the invention can include one or more of the following steps to compute a volatility metric:
A third embodiment of the invention can include one or more of the following steps to compute a volatility metric:
To provide an assessment of the relationship of a time series volatility of a particular sentiment index and a published time series indicating volatility obtained by means outside the scope of this invention, yet of additional interest to observers, an embodiment of the invention may enable the consumer to query correlation metrics indicating a strength of relationships between the volatility metrics computed by the invention and external indices of interest. Correlations of this kind can be obtained using statistical correlation methods known in the art and providing the results of such analyses to the consumer. An embodiment of the invention can correlate stock price action beta metrics with volatility indices computed by the invention.
A machine interface may be provided through which sentiment feedback information, including indices, metrics, statistics, instrumentation, and/or alerts regarding an entity, may be consumed programmatically over standard computer/machine communication media, connections, and/or networks. The entity may be a company, corporation, partnership, venture, individual, organization, or business. In a preferable embodiment, the machine interface can further modify the mathematical presentation of the sentiment feedback information, including, but not limited to, applying filters and/or numerical weights related to entity information sources, entity categories, and/or aggregate collections of entities.
In an additional preferred embodiment, a machine interface may be provided through which areas of interest, entities, categories, and/or entity information items and sources can be specified from which sentiment feedback information and all derivative outputs described within this invention can be produced.
A user interface may be provided through which observer feedback may be solicited regarding an entity. The observer may also be able to view a score indicative of the value of the entity. The entity may be a company, corporation, partnership, venture, individual, organization, or business. In one example, the entity may be a publicly traded company. Alternatively, the entity may be a private company. The score may be a numerical value representative of the value of the company. Value may refer to crowd-based sentiment, performance, financial value, or any other index.
In some implementations, entity articles may be displayed on a user interface subject to observer preferences, the significance of the article, or related entity. The entity articles may be provided by the entity, or may be about the entity.
Presentation variations on a user interface may relate to the speed/cycle of an update, size of display area dedicated to the information (e.g., tile size), highlighting, and/or other visual cues.
In some embodiments, the user interface may also include a region 320, 340 through which the observer may select the option to provide feedback. The feedback region may be implemented as a widget, may be displayed on a browser or application, or may be implemented in any other fashion. In some instances, the feedback region may be presented as a button, pop-up, drop-down menu, pane, or any other user interactive region.
Information about the entity 310, 330 and the region through which the observer may provide feedback 320, 340 may be simultaneously displayed. The user may provide feedback about the displayed entity via the region.
The feedback region 420, 460 may include a general query 430, 470. The general query may relate to the value of the entity. For example, the general query may ask how the entity is performing overall. Entity performance can be determined according to different categories or metrics. One or more specific queries 440, 480 may also be displayed. The specific queries may relate to one or more different categories or metrics relating to the general query. For example, if the general query asks how an entity is performing, the specific queries may relate to different areas or categories of how the entity is doing. For example, the specific categories may include, yet may not be limited to, leadership/governance, product innovation/integrity, environmental responsibility, employee responsibility/workplace, social responsibility/impact, and/or economic sustainability. In some instances, five distinct categories may be provided. In alternative embodiments, one, two, three, four, five, six, seven, eight, nine, ten, or more categories may be provided in order to assess entity value or performance.
In some instances, the feedback region 420, 460 may include a visual representation 442 of each category for the specific queries 440, 480. For example, the visual representation may be an icon or picture (or tool tip or helper text) representative of categories, such as leadership, innovation, environmental responsibility, employee responsibility, social responsibility, and/or economic sustainability. Such a visual representation may convey a broader idea of the specific category.
One or more interactive tools may be provided through which the observer may provide feedback. For example, as shown in
The interactive tool may permit the observer to easily and simply provide feedback. For example, the observer may provide feedback without having to type in any letters, words, or numbers. The observer may drag a visual indicator into a desired position, or click or touch a desired option. In an alternative to a slider bar, one or more options may be provided that the user may select. Such tools may make it easier to quickly allow an individual to express his or her opinion. An individual may express an opinion with a single click, touch, or drag.
In some instances, category values 446, 486 may be displayed in the feedback region. For example, each category may have a category value reflecting a numerical value for each category. The numerical value may correspond to the placement of the slider on the slider bar 444 or circular bar 484. For example, moving a slider along a linear slider bar 444 to the right may increase the numerical value, and moving the slider to the left may decrease the numerical value. The category value 446 may be provided in the same row or column as the linear slider bar and may be adjacent to the slider bar. In another example, moving a slider about a loop in a clockwise direction relative to a top position or other starting position in a circular bar 484 may increase the numerical value, and moving the slider closer to the starting position may decrease the numerical value. The category value 486 may be positioned within the loop and/or may be circumscribed by the circular bar.
In one example, the numerical value for each category may fall between 0 and 100. The numerical value may be adjacent to the slider bar or within a circular bar. In one example, an entity, such as a company, may receive numerical scores for categories such as leadership, employee responsibility, anti-competitive behavior, business model, data security, data privacy, environment, corporate governance, human capital, marketing practices, political influence, product integrity, product innovation, social impact, supply chain, sustainable energy use, and sustainable energy production.
In some instances, the placement of the slider on the slider bar may also be associated with a color scheme, representing emotional attachment to the related category. For example, the color scheme may reach from red representing disagreement to green representing agreement. In some instances, red (or another selected color) may correspond to a lower numerical value while green (or another selected color) may correspond to a higher numerical value. A gradient of colors between the selected colors may be provided corresponding to slider position along the slider bar and/or numerical value scale.
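The mapping from slider value to color might be implemented along the lines of the Python sketch below, which interpolates linearly between pure red and pure green in RGB space. The specific endpoint colors, the linear interpolation, and the hexadecimal output format are assumptions for illustration.

```python
def slider_color(value, low=0.0, high=100.0):
    """Hex color on a red-to-green gradient for a slider value in [low, high]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    # Clamp to the scale so out-of-range values map to the endpoint colors.
    fraction = min(max((value - low) / (high - low), 0.0), 1.0)
    red = round(255 * (1.0 - fraction))
    green = round(255 * fraction)
    return "#{:02x}{:02x}00".format(red, green)
```

Intermediate slider positions yield blended colors, giving the gradient between the two selected endpoint colors.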
In some instances, a default value may be provided on the gradient feedback tool 444, 484. For example, if the user does not provide any feedback, the value may default to midway on a slider bar or circular bar. The numerical category scores 446, 486 may correspondingly have a default value. For example, the numerical category score may default to 50 out of 100, or 5 out of 10, or any other value.
In some embodiments a feedback region 420, 460 may have an expanded form and a contracted form. For example, when the observer selects an option to provide feedback for the entity, the region may expand to display the various categories for which the observer may provide feedback. The feedback region may remain in the same user interface that simultaneously displays the information about the entity 410, 450.
In some embodiments, the entity value score may be calculated using any of the systems and methods described elsewhere herein. In one example, the entity value score may incorporate category scores from one, two or more categories. For example, the entity value score may be calculated based on scores for: leadership, employee responsibility, anti-competitive behavior, business model, data security, data privacy, environment, corporate governance, human capital, marketing practices, political influence, product integrity, product innovation, social impact, supply chain, sustainable energy use, and sustainable energy production. The categories may be ESG categories. In some instances six or fewer, or five or fewer categories may be provided. In other instances, higher counts of categories may be provided. The overall entity value score may be an average of the various category scores.
In some implementations, the overall entity value score may be a weighted average of the various category scores. For example, category score A may have a weight of 5, category score B may have a weight of 2, category score C may have a weight of 2, and category score D may have a weight of 1. The overall entity value score may then be [5×(average category score A)+2×(average category score B)+2×(average category score C)+(average category score D)]/10, where 10 is the sum of the weights, so that the result stays on the same scale as the category scores. The weights may be selected based on one or more different characteristics (e.g., sector, company focus, industry, current buzz, or other areas). For example, category A may be deemed to be more relevant in certain industries, and may receive a higher weight. In another example, category A may be deemed to relate to a topic that has been receiving a large amount of press attention recently, and may receive a higher weight. The weights may be determined by an observer or an administrator, or may be automatically generated with aid of a processor. The weights may be established in accordance with an algorithm with aid of the processor.
The various category scores may include scores inputted by the observer that is viewing the overall entity value score. The various category scores may incorporate scores inputted by other observers than the observer viewing the entity value score. The category scores may be updated in real-time, or with a high level of frequency. The overall entity value score may also be updated in real-time or with a high level of frequency. For example, the various scores may be updated every millisecond, every few milliseconds, every second, every few seconds, every half minute, every minute, every few minutes, every half hour, or every hour. The scores may be reflective of crowd-based sentiment and may be gathered from multiple observers. Multiple observers may provide feedback via a feedback region of their respective user interfaces. In some instances, the feedback from each of the observers may be weighted equally. Alternatively, observers with different backgrounds or qualifications may have their feedback weighted differently. For example, observers who are experts in a particular field may have their feedback relating to that field weighted higher than observers who are not experts.
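The two levels of weighting described above, observer weights within a category and category weights across the entity, can be sketched in Python as follows. The function names and the data layout are assumptions for illustration; both levels are normalized by the sum of the weights so that results stay on the rating scale.

```python
def weighted_mean(pairs):
    """Weighted mean of (value, weight) pairs, normalized by the total weight."""
    total_weight = sum(weight for _, weight in pairs)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(value * weight for value, weight in pairs) / total_weight


def entity_value_score(feedback, category_weights):
    """Category scores from weighted observer ratings, then an overall score.

    feedback:         {category: [(rating, observer_weight), ...]}
    category_weights: {category: weight of that category in the overall score}
    """
    category_scores = {
        category: weighted_mean(ratings) for category, ratings in feedback.items()
    }
    overall = weighted_mean(
        [(score, category_weights[category]) for category, score in category_scores.items()]
    )
    return category_scores, overall
```

Setting every observer weight to 1 reproduces equal weighting of observers, while a larger weight for experts in a field raises their influence on that category's score.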
In some embodiments, in addition to the numerical score 530, 560, the feedback region may have additional visual indicators of the entity true value. For example, if the entity score is in the higher range, a particular color may be displayed. If the entity is in a lower range, a different color may be displayed. Such visual indicators may make it easy for an observer to determine with a glance the overall determined value for the entity.
In some embodiments, a confidence 570 and/or quality 580 for the numerical score 560 may be provided. The confidence and/or quality may be calculated using any of the techniques described elsewhere herein. Factors, such as moving averages, trends, trend confidence, observer concentration, freshness, long term sentiment, and/or other factors may be considered. Temporal aspects may be considered in determining the confidence and/or quality of the numerical score. For example, changes over time, or the recentness of data may be considered. A confidence value 570 may be indicative of a confidence that a trend will continue. A higher numerical confidence value may correlate to a greater confidence that the trend will continue. A quality value 580 may be indicative of a concentration and/or freshness of observer input. A higher numerical quality value may correlate to greater concentration and/or freshness of observer input.
The entity name 610 may be presented on the user interface. The entity's overall value score 620 may be displayed as a numerical value. In some instances, a stock market index value 630 for the entity may be displayed.
Information about the entity may be displayed over a window of time. A time selection option 640 may be provided through which an observer may be able to select a window of time from a plurality of options. For example, the windows of time may include 1 day, 5 days, 1 month, 6 months, or 1 year. The value and/or index information may be updated to reflect the selected time window.
The displays may accommodate differing scales of heterogeneous quantities, which may enable an observer to visually correlate relationships. For example, a stock price may be displayed simultaneously with a total and/or category score.
The user interface may also display various category scores 650 for the entity. For example, numerical values for different categories, such as leadership, employee responsibility, anti-competitive behavior, business model, data security, data privacy, environment, corporate governance, human capital, marketing practices, political influence, product integrity, product innovation, social impact, supply chain, sustainable energy use, and sustainable energy production may be displayed. The various category scores may be used in calculating the entity's overall value score 620. In some instances, an observer may be able to select a category score to receive additional information about the category or the entity's performance within the category.
In some embodiments, an observer, administrator, or other user may be able to specify which categories to use to specify the overall value score. The overall value score may be personalized to an individual user's needs or desires. For example, if a user does not believe that an innovation score should be a factor of the overall value score, then the user can have the overall value score calculated without factoring in innovation. The user may select one or more categories from a predetermined list of categories. Alternatively, a user may be able to submit a category of the user's own. The categories may be dynamically updated or customized. The user may or may not specify any weighting of the categories in generating the overall value score.
Additional information 660 about the entity may be displayed on the user interface. The additional information may include a summary of the entity, milestones, or information about management of the entity.
In some instances, articles 670 about the entity or comments relating to the entity may be displayed. The articles may include visual information, a title of the article, the source of the article, and various feedback information.
A website 900 may be displayed on a user interface with aid of a browser. A visual representation of the browser extension tool 910 may be provided on the browser environment. Selecting the browser extension tool may provide an option for a user to log in. An authentication interface 920 may be provided for a user to provide the user's identifier (e.g., email, username) and/or password. Alternatively, a user may be pre-logged in, or may not need to be authenticated to access the browser extension tool.
In some instances, the feedback region 1020 may overlie a website 1000. In some instances, the website may provide content about an entity. The feedback region may include queries about the entity and/or entity performance. The queries in the feedback region may relate to the content of the website, which may be about the entity, or any other types of content as described elsewhere herein.
A live update region 1210 may be displayed. The live update region may be on the left hand side, right hand side, top portion, or bottom portion of the user interface. The live update region may be updated periodically or in real time. The live updates may include information about various companies. For example, the overall value score 1220 of the company may be displayed. Changes to the overall value score of the company may be displayed. The changes may be displayed as numerical score changes 1222 and/or relative percent changes 1224. A visual indicator may be provided whether the changes are positive or negative. The information may scroll through and may be indicative of changes within a given period of time, such as those described elsewhere herein. The changes may reflect real-time changes and/or values.
Other information relating to the companies may be displayed. For example, the appearance of new articles 1230 may be provided. Comments 1240 by other users or individuals to the articles or relating to the company may also be provided. The appearance of the new information may be updated in real time.
The live update region 1210 may be provided so that newer information is provided on top or in the front, and older information scrolls downwards or toward the back. As new information is provided, the new information may displace the older information, which may move further down or backwards.
The voting widget 1310 may show the company name 1320. One or more categories 1330a, 1330b, 1330c for evaluation may be provided. Examples of such categories may include, but are not limited to, leadership, employee responsibility, anti-competitive behavior, business model, data security, data privacy, environment, corporate governance, human capital, marketing practices, political influence, product integrity, product innovation, social impact, supply chain, sustainable energy use, and sustainable energy production. When a user has already rated a company in a particular category 1330a, the user's category score 1340a for the company may be displayed. When a user is in the process of rating a company in a particular category 1330b, the user's category score 1340b may be displayed once the user has entered a value. Optionally, a default value may be provided. An expanded view may be provided which may include information or criteria for the user to consider when rating the company. When a user has not yet rated a company in a particular category 1330c, no category score 1340c may be presented. In some instances, a question mark or similar information indicating the category has not yet been rated may be provided.
When a user is rating a company category, a gradient tool, such as a circular bar 1340b may be provided. The user may slide a slider along the circular bar, or any other type of gradient tool. The numerical value may be updated to reflect the position of the slider along the gradient tool. In some examples, arrows 1342 or similar tools may be provided through which the user may manipulate the numerical value directly.
When the user has entered the user's feedback for the various categories, the overall score for the company provided by the user may be shown or displayed. This overall score may be considered in conjunction with overall scores provided by other users to provide a crowd-based sentiment index.
The voting widget may show the company name 1420. The voting widget may show an overall score for the company 1430. In some embodiments, a confidence 1440 and/or quality value 1450 may also be provided. The overall score may include a double gradient indicator. For example, a double ring voting circle may be shown. An outer ring 1432 may show a current score provided by the user and an inner ring 1434 may show an existing value (e.g., overall value from the combined feedback of other users), or vice versa. The numerical value 1460 displayed for the overall score may be reflective of the current score provided by the outer ring, or the existing value provided by the inner ring. Optionally, a comparison value 1465, such as a percent change, may be displayed. The percent change may be for the current score relative to the existing value.
The voting widget may show one or more categories 1470. Each of the categories may be representative of a dimension along which the company may be evaluated in determining the overall score. The dimensions may be ESG categories. The overall score may be an ESG rating for the company. The categories may show a score for each of the categories. In some embodiments, each of the category scores may be a double gradient indicator. For example, a double ring may be provided showing the current score for each category as compared to the existing score for the category. Numerical values may also be displayed, which may be reflective of the current category score or the existing category score. A user may be able to manipulate the ring that shows the current score without being able to manipulate the existing score. In some instances, a user may be able to manipulate a slider on an outer ring without being able to manipulate data on an inner ring. The double ring, or double gradient indicator, may advantageously provide a simple visual interface through which a user may view how the user's scoring of the company compares to existing scores for the company.
In some embodiments, the ticker display may show a company name 1510, as well as an overall value score 1520 for the company. The overall value score may be a numerical value. In some instances, the numerical value may fall between 0 and 100 or between any other two numbers. Optionally, changes 1530 in the overall value score may be displayed. The changes in the overall value score may be a numerical change over a period of time. In some examples, the period of time may be since the previous day. Other examples of time periods may include years, 1 year, quarters, months, 1 month, weeks, 1 week, days, 1 day, hours, 1 hour, 30 minutes, 10 minutes, or 1 minute. The relative changes 1540 in the overall value score may also be displayed. The relative change may be displayed as a percentage value. The percentage change may be the difference between the current overall value and the previous overall value divided by the previous overall value (or alternatively divided by the current overall value). The previous overall value may be the overall value score at the previous period of time.
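The numerical and relative changes described above can be sketched as follows. The function names and the `relative_to_current` flag are hypothetical; the flag covers the alternative of dividing by the current overall value rather than the previous one.

```python
def score_change(current: float, previous: float) -> float:
    """Numerical change in the overall value score over the chosen period."""
    return current - previous


def percent_change(current: float, previous: float,
                   relative_to_current: bool = False) -> float:
    """Relative change as a percentage.

    By default the difference is divided by the previous overall value;
    relative_to_current=True divides by the current overall value instead.
    """
    denominator = current if relative_to_current else previous
    if denominator == 0:
        raise ValueError("cannot compute a relative change against zero")
    return 100.0 * (current - previous) / denominator


change = score_change(66.0, 60.0)
relative = percent_change(66.0, 60.0)
```

The sign of either result indicates whether a positive or negative value change has occurred.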
The changes 1530 and/or relative changes 1540 in the overall score may show whether a positive or negative value change has occurred.
The ticker display may be shown as part of a website or other environment. The ticker may include the company names and related information scrolling. The information may scroll across horizontally or vertically. For instance, an entity name and overall value score for multiple entities may scroll in a linear fashion.
In some examples, data is aggregated from one or more sources, such as Sources illustrated in
Data may be provided to a Data Server, such as the Data Server seen in
Incoming data content may be identified and/or categorized based on data type. Additionally, each data point may be contextualized so as to identify, extract, and categorize relevant sustainability content. In examples, content may be classified according to one of a set number of categories. In some examples, data may overlap between two or more categories. In examples, analytics may be provided on particular topics identified as relevant to a particular user by creating custom categories.
Additionally, both structured and unstructured data points may be normalized within each category. In some examples, each data point may be naturally weighted within the system according to its timeliness, frequency, and intensity through a running sum-based average. In some examples, custom materiality lenses can be developed to weight data points to varying degrees according to sustainability topic, sector, and/or data source.
The Analytics Server of the technical architecture overview may include a Calculations component, an Aggregation component, and an Event Detection component. In some examples, sustainability performance analytics may be generated. In particular, a dynamic scorecard may be generated for each monitored company. The analytics may be updated in real-time so as to display sustainability trends. In examples, each data point may be scored independently. Additionally, each data point may provide the basis for trends that can be displayed either as an aggregated “overall” performance view and/or by a particular category chosen by a user. In some examples, data behind the analytics may be transparent. In some examples, users may have access to the underlying content used to inform a score in the generated analytics.
Once analytics have been performed, data may be augmented with additional platform tools. This is seen in
One or more devices 710a, 710b, 710c may be in communication with one or more servers 720 of the system over a network 730.
One or more users may be capable of interacting with the system via a device 710a, 710b, 710c. In some embodiments, the user may be an observer or contributor that may provide feedback relating to an entity, such as a company. The user may be an individual viewing information about the entity, such as a value for the company. In some instances, the user may be an investor or broker.
The device may be a computer 710a, server, laptop, or mobile device (e.g., tablet 710c, smartphone 710b, cell phone, personal digital assistant) or any other type of device. The device may be a desktop device, laptop device, or a handheld device. The device may be a networked device. Any combination of devices may communicate within the system. The device may have a memory, processor, and/or display. The memory may be capable of storing persistent and/or transient data. One or more databases may be employed. Persistent and/or transient data may be stored in the cloud. Non-transitory computer readable media containing code, logic, or instructions for one or more steps described herein may be stored in memory. The processor may be capable of carrying out one or more steps described herein. For example, the processor may be capable of carrying out one or more steps in accordance with the non-transitory computer readable media.
A display may show data and/or permit user interaction. For example, the display may include a screen, such as a touchscreen, through which the user may be able to view content, such as a user interface for providing information about an entity or soliciting feedback about the entity. The user may be able to view a browser or application on the display. The browser or application may provide access to information relating to an entity. The user may be able to view entity information via the display. The display may be capable of displaying images (e.g., still or video), or text. The display may be a visual display that shows the user interfaces as described elsewhere herein. The display may emit or reflect light. The device may be capable of providing audio content.
The device may receive user input via any user input device. Examples of user input devices may include, but are not limited to, mouse, keyboard, joystick, trackball, touchpad, touchscreen, microphone, camera, motion sensor, optical sensor, or infrared sensor. A user may provide an input via a tactile interface. For instance, the user may touch or move an object in order to provide input. In other instances, the user may provide input verbally (e.g., speaking or humming) or via gesture or facial recognition.
The device may include a clock or other time-keeping device on-board. The time-keeping device may be capable of detecting times at which user inputs are made. In some instances, the device may generate a timestamp associated with the user inputs that may be useful for calculating one or more scores as described elsewhere herein. The timestamps may be associated with user feedback and useful for determining feedback to include in specified timeframes.
The device 710a, 710b, 710c may be capable of communicating with a server 720. The device may have a communication unit that may permit communications with external devices. Any description of a server may apply to one or more servers and/or databases which may store and/or access content and/or analysis of content. The server may be able to store and/or access crowd-based sentiment relating to one or more entities. The one or more servers may include a memory and/or programmable processor.
A plurality of devices may communicate with the one or more servers. Such communications may be serial and/or simultaneous. For example, many individuals may participate in viewing information about an entity and/or providing feedback relating to an entity. The individuals may be able to interact with one another or may be isolated from one another. In some embodiments, a first individual on a first device 710a may provide feedback relating to an entity, which may affect the entity scores which may be viewed by the first individual and a second individual on a second device 710b. In some embodiments, both the first individual and the second individual may provide feedback about an entity which may be used as at least part of the basis of the entity score calculations which may be viewed by the first individual and/or second individual.
The server may store information about entities. For example, feedback received relating to various entities may be stored. Entity scores relating to various categories/metrics or overall entity scores may be stored in memory accessible by the server. Information about users may also be stored. For example, information such as the user's name, contact information (e.g., physical address, email address, telephone number, instant messaging handle), educational information, work information, experience or expertise in one or more category or areas of interest, or other information may be stored.
The programmable processor of the server may execute one or more steps as provided herein. Any actions or steps described herein may be performed with the aid of a programmable processor. Human intervention may not be required in automated steps. The programmable processor may be useful for calculating and/or updating entity scores. The server may also include memory comprising non-transitory computer readable media with code, logic, or instructions for executing one or more of the steps provided herein. For example, the server(s) may be utilized to calculate scores for entities based on feedback provided by users. The server may permit a user to provide feedback via a user interface, such as a widget.
The device 710a, 710b, 710c may communicate with the server 720 via a network 730, such as a wide area network (e.g., the Internet), a local area network, or telecommunications network (e.g., cellular phone network or data network). Communication may also be intermediated by a third party.
In one example, a user may be interacting with the server via an application or website. For example, a browser may be displayed on the user's device. For example, the user may be viewing a user interface for entity information via the user's device.
Aspects of the systems and methods provided herein, such as the devices 710a, 710b, 710c or the server 720, can be embodied in programming. Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links or the like, also may be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.
Hence, a machine readable medium, such as computer-executable code, may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings. Volatile storage media include dynamic memory, such as main memory of such a computer platform. Tangible transmission media include coaxial cables; copper wire and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.
An input/output device 830 may be provided. In one example, a user interactive device, such as those described elsewhere herein may be provided. A user may interact with the device via the input/output device. A user may be able to provide feedback about an entity using the user interactive device.
In some embodiments, the computing device may include a display 840. The display may include a screen. The screen may or may not be a touch-sensitive screen. In some instances, the display may be a capacitive or resistive touch display, or a head-mountable display. The display may show a user interface, such as a graphical user interface (GUI), such as those described elsewhere herein. A user may be able to view information about an entity, such as overall value score for the entity or category scores for the entity through the user interface. In some instances the user interface may be a web-based user interface. In some instances, the user interface may be implemented as a mobile application.
A communication interface 860 may also be provided for a device. For example, a device may communicate with another device. The device may communicate directly with another device or over a network. In some instances, the device may communicate with a server over a network. The communication interface may permit the device to communicate with external devices.
As described above,
The invention includes methods and systems for determining a particular long-term indicator metric that is intended to be accretive in nature. In some examples, the long-term indicator metric may be positively or negatively accretive in nature depending on the performance of a firm that is being evaluated. In some examples, metrics may be determined that are at least partially derivative based upon a continuous time series of sentiment scores.
The development of a long-term indicator metric may be used to show steady growth, or lack thereof, of a more rapidly varying, underlying sentiment function of time. The long-term indicator metric may be used to recognize value over longer terms in extra-financial areas, yet in a way that is different from conventional summary ratings. These longer-term indicators are based on underlying continuous series and are much more precise and consistent than other methods. For example, if we look at the accumulation of “value” over time (i.e. sustained better than a baseline score series), we can accumulate them into this “integral”-like index, called an aggregation of Incremental Sentiment Value. In some examples, an aggregation of Incremental Sentiment Value may be analogous to compounded annual growth with respect to financial aspects.
The aggregation of Incremental Sentiment Value may emulate compounded return on an asset (both positive and negative relative to a baseline). The calibration bounds are 100% “return” if an Incremental Sentiment Value of 100 is held for one year and proportionately negative if an Incremental Sentiment Value of 0 is held for a year. The maximum attainable daily return rate can thus be computed as follows:
$T \equiv$ measurement period $= 365$ time units (days)
$R \equiv$ highest objective cumulative return per measurement period $= 100\%$
$r_{Max} \equiv$ unit time return rate needed to achieve $R$ at the close of the measurement period
$(1 + r_{Max})^{T} = 1 + R \;\Rightarrow\; 1 + r_{Max} = (R + 1)^{1/T}$
$\Rightarrow\; r_{Max} = (R + 1)^{1/T} - 1 = 0.001900837677$
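The calibration above can be checked numerically with a short sketch. The function name is hypothetical; the default arguments are the values of R (100%, expressed as 1.0) and T (365 days) stated in the text.

```python
def max_unit_return_rate(cumulative_return: float = 1.0,
                         period_units: int = 365) -> float:
    """Per-unit rate that compounds to cumulative_return over period_units.

    Solves (1 + r) ** T = 1 + R for r.
    """
    if period_units <= 0:
        raise ValueError("the measurement period must be positive")
    return (1.0 + cumulative_return) ** (1.0 / period_units) - 1.0


r_max = max_unit_return_rate()
# Compounding r_max daily for one year recovers the 100% cumulative return.
compounded = (1.0 + r_max) ** 365 - 1.0
```

The same function accepts a different measurement period or return target if the calibration bounds are changed.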
The return rate is applied, positively or negatively, proportionately to the value of the Incremental Sentiment Value relative to a baseline. The rate is applied in a decreasing manner from the time of the score change until the next score change where the process repeats. The decreasing ramp is the same as the freshness function used in computing the underlying Incremental Sentiment Value. An analogy of this concept is the sustaining of a musical note, with the note initializing at the time of a scoring event and then tailing off over time until total quiet or until another note occurs. Thus:
$f(\Delta t) \equiv e^{A \Delta t} \equiv$ freshness factor at time $\Delta t$ following a scoring event
$\lambda \equiv$ information currency half-life [default 14 days]
$A \equiv$ information decay factor
$I_{Max} =$ maximum possible Incremental Sentiment Value $= 100$
$I_{0}(t) \equiv$ baseline Incremental Sentiment Value at time $t$ (default constant neutral value of 50; otherwise a baseline benchmark series)
$I_{min} \equiv$ minimum possible Incremental Sentiment Value $= 0$
$t_{n} \equiv$ time of the $n$th scoring event
$I(t) \equiv$ Incremental Sentiment Value at time $t$
$\Delta t \equiv$ time unit increment $= 1$ day
$N(t_0, t) \equiv$ number of scoring events between times $t_0$ and $t$
$K(t_n, t) \equiv$ number of time units after scoring event time $t_n$ until the next scoring event or until time $t$
$\Delta t_{k} \equiv$ $k$th time unit (day) from most recent scoring event $= k \Delta t$, $k = 0 \ldots K(t_n, t)$, with $\Delta t_{0} = 0$
$C_{Max} =$ maximum possible aggregation of Incremental Sentiment Value per measurement period $= 100$
C(t0,t)=CMax{Πn=1N(t
When t advances by the increment Δt, the aggregation of Incremental Sentiment Value is updated recursively as follows:
Of most interest will be the change in aggregation of Incremental Sentiment Value over a duration [ts,t]:
$\Delta C(t_s, t) \equiv C(t_0, t) - C(t_0, t_s)$
which can be shown as a graph along the duration of points $(t, \Delta C(t_s, t))$, $\forall t \in [t_s, t_f]$
Based on the aggregation of Incremental Sentiment Value, Sentiment Momentum characterizes sustained performance over an interval with a single number, with further property objectives of being intuitive in the financial domain.
For a given category or overall for a given company, the mathematical embodiment is as described below:
Given:
{C(t0,t)}t=t
Least-squares fit the following line to the series:
C′(t)=mt+b, where m may be taken as the Sentiment Momentum.
Additionally, software implementation of the above can be straightforwardly performed using standard linear regression libraries.
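As a minimal sketch of the line fit described above, the slope m can be obtained with the closed-form ordinary least-squares formulas. This is used here in place of a regression library only to keep the example self-contained. The function name and the sample series are hypothetical.

```python
from typing import Sequence, Tuple


def sentiment_momentum(times: Sequence[float],
                       values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of C'(t) = m*t + b; returns (m, b).

    times  : sample times t within the interval of interest
    values : aggregation values C(t0, t) at those times
    """
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (time, value) pairs")
    mean_t = sum(times) / n
    mean_c = sum(values) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    if s_tt == 0:
        raise ValueError("times must not all be identical")
    s_tc = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, values))
    m = s_tc / s_tt          # slope, taken as the Sentiment Momentum
    b = mean_c - m * mean_t  # intercept
    return m, b


# Made-up daily series that rises by 2 per day from a value of 1.
m, b = sentiment_momentum([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0])
```

A positive slope indicates sustained accumulation over the interval, and a negative slope indicates sustained decline.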
Additional preferred embodiments of long term sentiment scores and aggregates follow:
The Long Term Score is intended to be accretive in nature (either positive or negative, depending on sustained sentiment performance of the firm), similar to the Incremental Sentiment Value, and is a quantity built upon the continuous General Sentiment Score time series. The Long Term Score is designed to show steady growth, or lack thereof, of a more rapidly varying, underlying sentiment function of time. The general intent is the recognition of value over longer terms in these extra-financial areas, yet in a way that is different from conventional summary ratings. Because they are based on underlying continuous series, these longer-term indicators are much more precise and consistent than other methods. For example, if we look at the accumulation of “value” over time (i.e. sustained better than a baseline score series), we can accumulate them into this “integral”-like index, called Long-Term Score, showing what is analogous to compounded annual growth on the financial side.
A visualization of Long Term Score series, relative to its more rapidly varying General Sentiment Score series is shown in
For each particular area of interest, such as a company, for each particular category, given inputs as described below:
The longer-range fade period for the Long Term Score is chosen to provide significance fading to half its impact over six months to provide sufficient movement in an annual period, yet diminishing more volatile effects in the signal. This model is sufficiently general, however, to accommodate a different choice of half-life:
T≡half-life period=182 time units (days)=6 months
The above equation establishes the symbol used to represent the half-life timing parameter used in the method below.
$r \equiv$ unit time diminishing rate $= 1 - (1/2)^{1/T} \approx 0.004$
The above equation establishes the symbol, and the functional derivation based on T, defined above, used for the rate parameter in the exponential averaging process described below.
To mitigate the effect of General Sentiment Scores that are updated infrequently, such that prior updates would linger too long in the Long Term Score smoothing process, each score is faded toward neutrality while awaiting the next General Sentiment Score, which diminishes the impact of “stale” scores in the computation. Also, to prevent the first General Sentiment Score input to the Long Term Score calculation from having undue, disproportionate significance, a seasoning period is chosen in which the General Sentiment Scores are initially preprocessed and averaged to generate a seed value representative of the General Sentiment Score values that occurred during the seasoning period:
Δt≡seasoning period=14 time units (days)
The above equation establishes the symbol used to represent the time interval during which prior inputs are collected for setting an initial input to the averaging process for the method.
$t_0$ ≡ time of first reported General Sentiment Score
P(t)≡General Sentiment Score at time t
L(t)≡Duration of time, measured in days, at time t that P(t) has not changed—“lingered” at its current value read at time t
$P_0$ ≡ neutral General Sentiment Score = 50
A≡information decay factor≅−0.05/day as derived above in defining General Sentiment Score
The above equations establish symbols, as described therein, to be used in the computations for the method.
P′(t) ≡ fade-adjusted General Sentiment Score at time t: $P'(t) = P_0 + e^{A\,L(t)}\,(P(t) - P_0)$
The above equation establishes the symbol and functionally derives the fade adjustment for the General Sentiment Score to be input to the averaging process using an exponential decay with the parameters as described above. The Long-Term Score for that entity for that category is computed recursively as described below:
As discussed earlier, the Long Term Score is first seeded by an average of the fade-adjusted General Sentiment Scores collected during the seasoning period and then recursively updated per time increment (typically daily) forward after that point in time:
The above equation establishes the symbol and functional derivation of the initial value used in the averaging process as the arithmetic average (sum divided by count) of the fade-adjusted scores within the seasoning period.
l(t) ≡ Long Term Score at time t, for $t > t_0 + \Delta t$: $l(t) = r\,P'(t) + (1 - r)\,l(t-1)$
The above equation computes the objective quantity of Long Term Score using an exponentially weighted moving average, expressed as a recursive function, seeded by the initial value defined immediately above and carried out by multiplying the rate r, defined above, by the fade-adjusted score at a time t, and adding to that one minus the rate multiplied by the value of the Long Term Score at the prior time increment, t−1. This process is repeated stepping through time increments.
In some examples, times may be measured in days, in hours, or in weeks. In some examples, times may be measured in a number of time-based increments that are associated with seconds, minutes, hours, days, weeks, a fraction of a time-based increment, and/or a multiple of a time-based increment, in addition to other examples. Computation of the Long Term Score begins, as noted, by being offset by the seasoning period past the first newsworthy event in the category for the company. Prior to that point, if a representative Long Term Score is needed, then neutrality, such as 50 in a 0 to 100 scale, is set as the value.
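The seeding and recursion described above may be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the input is assumed to be one fade-adjusted score per day starting at the first reported score, the seed is assumed to be the mean of the scores from the first day through the end of the seasoning period inclusive, and the function name `long_term_score` is hypothetical.

```python
from statistics import fmean


def long_term_score(faded, half_life=182, seasoning=14, neutral=50.0):
    """Return one Long Term Score per day.

    faded[i] is the fade-adjusted General Sentiment Score i days after
    the first reported score (an assumed daily input layout).
    """
    r = 1.0 - 0.5 ** (1.0 / half_life)
    out = []
    for i, p in enumerate(faded):
        if i < seasoning:
            # Before the seasoning period has elapsed, report neutrality.
            out.append(neutral)
        elif i == seasoning:
            # Assumed seed: arithmetic average over the seasoning period.
            out.append(fmean(faded[: seasoning + 1]))
        else:
            # Recursive exponentially weighted moving average.
            out.append(r * p + (1.0 - r) * out[-1])
    return out


# A neutral seasoning period followed by one half-life at a score of 100.
series = long_term_score([50.0] * 15 + [100.0] * 182)
print(round(series[-1], 6))
```

In this example the seed is 50, and after one half-life of sustained scores at 100 the Long Term Score has closed half of the gap, which is the behavior the half-life parameter is chosen to produce.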
The Volume-Modulated Long Term Score is a modification to the Long Term Score as described above wherein news event sentiment rating volume contemporaneous with a General Sentiment Score change is applied to highlight the change commensurate with the level of volume. The technique contemplates applying the averaging process in place multiple times proportional to the relative volume of General Sentiment Score triggering events occurring during the time increment.
The Volume-Modulated Long Term Score applies the Exponentially-Weighted Moving Average (EWMA), as detailed above, repetitively at a point in time by a number of iterations relatively proportionate to the volume level at that point in time.
The mathematical description follows, along with descriptive introductions to each mathematical line of text. For each particular area of interest, such as a company, for each particular category, given inputs as described below:
The longer-range fade period for the Long Term Score is chosen to provide significance fading to half its impact over six months to provide sufficient movement in an annual period, yet diminishing more volatile effects in the signal. This model is sufficiently general, however, to accommodate a different choice of half-life:
T≡half-life period=182 time units (days)=6 months
r ≡ unit time diminishing rate = $1 - (1/2)^{1/T}$ ≅ 0.004
Δt≡seasoning period=14 time units (days)
$t_0$ ≡ time of first reported General Sentiment Score
The above symbols and parameters are established identically to those established in the description of Long Term Score above.
The volume tracking parameters are derived here, to later be used in multiplicatively applying the averaging process in place to amplify the effect based on the volume in the time increment. Volume tracking is determined in a relativized way, setting the multiplier as a function of time determined by the level of relative volume over time:
V(t) ≡ volume (count) of news events, each producing a sentiment rating, in the category at time (date) t
The above equation establishes the symbol for the volume of news events on the date, or, in general, time increment, denoted by t. The volume is the count of news events occurring in that denoted interval.
The above equation establishes the symbology and functional derivation of the average daily volume up to time t. It is computed by summing the volume over all time increments up to t and then dividing by the elapsed time from a set origin t0 up to t.
The above equation establishes the symbol and derivation for the time function representing the magnitude of volume, relative to the average per unit time known up to time t. This time-mapped quantity is defined as “Relative Volume Spike”.
$S_{max}(t) \equiv \max_{\tau = t_0}^{t} S(\tau)$, with S(τ) denoting the Relative Volume Spike at time τ
The above equation establishes the symbol and functional determination of the largest known Relative Volume Spike up to time t from a known time origin t0.
u≡User-Selectable Attenuation Factor
The above equation sets the symbol for a factor that the user of the method can select to attenuate the volume-driven amplification of the Long-Term Score signal.
$K_{max}$ ≡ Maximum EWMA Iteration Amplifier
The above equation sets the symbol for the maximum number of times the amplification repetitions can occur at any fixed time.
The above equation sets the symbol and describes the functional derivation of the number of times the amplification repetitions that will be applied at time t. It is the minimum of the maximum allowable number of repetitions and the integer nearest above the ratio of the Relative Volume Spike divided by the Maximum Relative Volume Spike known at time t.
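The volume tracking quantities described above may be sketched as follows. Several details are assumptions made only for illustration: the elapsed time is taken as the number of days counted so far, and the ratio of the Relative Volume Spike to its running maximum is assumed to be scaled by the maximum iteration count and divided by the attenuation factor u before rounding up. The function name `ewma_iterations` and the default values of `k_max` and `u` are hypothetical.

```python
import math


def ewma_iterations(volumes, k_max=5, u=1.0):
    """Return the assumed number of amplification repetitions per day.

    volumes[i] is the count of news events i days after the time origin.
    """
    ks = []
    total = 0.0
    s_max = 0.0
    for i, v in enumerate(volumes):
        total += v
        avg = total / (i + 1)                  # average daily volume to date
        spike = v / avg if avg > 0 else 0.0    # Relative Volume Spike
        s_max = max(s_max, spike)              # largest spike known so far
        ratio = spike / s_max if s_max > 0 else 0.0
        # Assumed scaling: ratio * k_max / u, rounded up, capped at k_max.
        ks.append(min(k_max, math.ceil(k_max * ratio / u)))
    return ks


print(ewma_iterations([4, 4, 4, 0, 8]))
```

Under these assumptions a day with no news events calls for no extra repetitions, while a day whose volume matches the largest relative spike seen so far calls for the maximum number.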
To also mitigate the effect of General Sentiment Scores being updated with a lower frequency such that the effect of prior updates linger too long into the Long Term Score smoothing process, their effect is faded to neutrality while awaiting the next General Sentiment Score to appear in order to diminish the impact of “stale” scores in the computation. Also, to mitigate the effect of the first value of General Sentiment Score input to the Long Term Score calculation to have undue, disproportionate significance, a seasoning period is chosen in which the General Sentiment Scores are initially preprocessed and averaged to generate a seed value representative of the General Sentiment Score values having occurred during the seasoning period:
Δt≡seasoning period=14 time units (days)
P(t)≡General Sentiment Score at time t
L(t)≡Duration of time, measured in days, at time t that P(t) has not changed—“lingered” at its current value read at time t
$P_0$ ≡ neutral General Sentiment Score, typically 50 in a 0 to 100 scale
A≡information decay factor≅−0.05/day as derived above in defining General Sentiment Score
P′(t) ≡ fade-adjusted General Sentiment Score at time t: $P'(t) = P_0 + e^{A\,L(t)}\,(P(t) - P_0)$
The above equations set and derive parameters as described identically above for Long Term Score.
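As a non-limiting illustration, the fade adjustment may be sketched as follows, using the neutral score of 50 and the decay factor of −0.05 per day given above. The function and constant names are hypothetical.

```python
import math

NEUTRAL = 50.0   # P_0, neutral General Sentiment Score
DECAY = -0.05    # A, information decay factor per day


def fade_adjusted(score, lingered_days, neutral=NEUTRAL, decay=DECAY):
    """Fade a score toward neutrality the longer it has gone unchanged."""
    return neutral + math.exp(decay * lingered_days) * (score - neutral)


print(fade_adjusted(80.0, 0))                       # a fresh score is unchanged
print(round(fade_adjusted(80.0, math.log(2) / 0.05), 6))
```

With these parameters, the deviation of a stale score from neutrality halves roughly every 14 days (ln 2 / 0.05 ≈ 13.9 days).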
Compute the Volume-Modulated Long-Term Score for that entity for that category recursively as described below:
As discussed earlier, the Volume-Modulated Long Term Score is first seeded by an average of the fade-adjusted General Sentiment Scores collected during the seasoning period and then recursively updated per time increment (typically daily) forward after that point in time:
The above equation establishes the symbol and functional derivation of the initial value used in the averaging process as the arithmetic average (sum divided by count) of the fade-adjusted scores within the seasoning period.
To then compute the Volume-Modulated Long Term Score at a point in time following the seasoning period, the averaging process is iteratively applied in proportion to the relative volume signal set as above described as an additional function of time:
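I′(t) ≡ Volume-Modulated Long-Term Score at time t, for $t > t_0 + \Delta t$: first set $I' = r\,P'(t) + (1 - r)\,I'(t-1)$, then apply the in-place update $I' \leftarrow r\,P'(t) + (1 - r)\,I'$ a further K(t) times, this repetition being denoted by the Iteral operator; the resulting value of I′ is I′(t).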
The above equation computes the objective quantity of Volume-Modulated Long Term Score using an exponentially weighted moving average, expressed as a recursive function, seeded by the initial value defined immediately above and carried out by multiplying the rate r, defined above, by the fade-adjusted score at a time t, and adding to that one minus the rate multiplied by the value of the Long Term Score at the prior time increment, t−1. In addition, if, at time t, there is a call for K(t) amplification repetitions to be carried out also at time t, then the repetitions are carried out before advancing to the next time increment. This process is repeated stepping through time increments.
Time is measured in days. Computation of the Volume-Modulated Long Term Score begins, as noted, offset by the seasoning period past the first newsworthy event in the category for the company. Prior to that point, if a representative Long Term Score is needed, then neutrality, such as 50 in a 0 to 100 scale, is set as the value.
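A single time increment of the volume-modulated update may be sketched as follows. The sketch assumes that the base update is applied once and is then repeated in place K(t) further times, as described above. The function name `volume_modulated_step` is hypothetical, and a rate of 0.5 is used in the example only to make the arithmetic easy to follow.

```python
def volume_modulated_step(prev, faded, k, r):
    """Advance the Volume-Modulated Long Term Score by one time increment.

    prev  : score at the prior time increment
    faded : fade-adjusted General Sentiment Score at the current time
    k     : number of extra in-place repetitions called for at this time
    r     : unit time diminishing rate
    """
    score = r * faded + (1.0 - r) * prev          # base update
    for _ in range(k):
        score = r * faded + (1.0 - r) * score     # in-place repetition
    return score


for k in range(3):
    print(k, volume_modulated_step(50.0, 100.0, k, r=0.5))
```

Each additional repetition moves the score further toward the current fade-adjusted input, so a high-volume day has the same effect as several ordinary days carrying the same input.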
Given any score series as described above, the Relative Trend Score characterizes sustained performance over an interval with a single number. This representation complements the Long Term scores, which are functions of time, by providing a compact single value that represents a selected interval of time and characterizes the trend of the function of time over that selected interval. The Relative Trend Score is computed as follows:
For a given score series for a given area of interest, such as a company, the mathematical embodiment is as described below, along with descriptive introductions to each mathematical line of text.
Over the selected time interval, a number corresponding to the slope of the time function is computed, and relativized to the collection of slopes computed over a particular universe of other areas of interest (typically companies):
$[t_s, t_f]$ ≡ time interval of interest
The above equation establishes the symbols used to denote the time interval over which the method for deriving the Relative Trend Score is to be applied.
N ≡ number of areas of interest (such as companies) in the comparison universe
The above equation sets the symbol representing the number of areas of interest present in a comparative set or “universe”, against the constituents of which a subject area of interest will be compared.
I(t) ≡ Score at time t
The above equation sets the symbol for a score at time t. The score can be any of the scores established as General Sentiment Score, Long-Term Score, or Volume-Modulated Long-Term Score. In addition, the method described herein for deriving Relative Trend Score can be applied to any score as a function of time in addition to those specified above.
s ≡ linear slope = $[I(t_f) - I(t_s)]/(t_f - t_s)$
The above equation sets the symbol and functional derivation of the slope of the scoring function over the time interval of interest. This is accomplished by subtracting the earliest score value from the latest and dividing by the time elapsed between the two.
If s = 0, then $M[t_s, t_f]$ ≡ Relative Trend Score for the interval is assigned the neutral value, typically 50 in a 0 to 100 scale.
Otherwise, continue the computation thusly:
$S_{Max} \equiv \max\{|s_i|\}_{i=1}^{N}$ = maximum absolute value raw slope over all areas of interest in the comparison universe for the given category (also known as the Universal Maximum Slope (UMS))
The above equation establishes the symbol for the largest slope found over the universe of areas of interest.
The Relative Trend Score is set to be within a 0 to 100 range, with 50 being neutral, and to mitigate the skewing effects of outliers, a logarithm is utilized, with appropriate perturbations to avoid the mathematically singular effects of the logarithm function:
α≡slope scaling amplifier=1000
ε≡small maximum slope perturbation=0.0001
The above conditional equation, together with the definitional equations immediately above it, perturbs the UMS if it falls below the threshold.
The above equation defines a conditional function that limits the amplified linear slope |αs| to 1 or greater.
$c_{Max}$ ≡ clip limit = $\log_{10}(l(s))$
The above equation establishes the symbol and functional derivation for a clip limit used in deriving the Relative Trend Score. It is computed by using the base 10 logarithm of the amplified linear slope.
The above equation computes the objective Relative Trend Score as the ratio of the base 10 logarithm of the amplified linear slope to the maximum of such logarithms over the universe. The sign of this ratio is then set based on the sign, sgn(s), of the linear slope. The result is then normalized into a scale ranging from 0 to $M_{Max}$, with zero mapped to the midpoint, $M_{Max}/2$.
If $M[t_s,t_f] > M_{Max}$ then reset $M[t_s,t_f] = M_{Max}$
If $M[t_s,t_f] < 0$ then reset $M[t_s,t_f] = 0$
The above conditional equations limit the range of the resultant Relative Trend Score calculations to the scale bounds.
If Score data does not yet exist for the area of interest, Relative Trend Score is neutral, typically 50 in a 0 to 100 scale. For efficiency, if a set of Relative Trend Scores is desired over a common time interval, $S_{Max}$ can be computed once and then re-used.
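One possible reading of the Relative Trend Score computation is sketched below. It is an illustration under assumptions rather than the definitive formula: the perturbation is assumed to raise the Universal Maximum Slope to at least ε, the clip limit is assumed to be the logarithm of the limited, amplified Universal Maximum Slope, and a neutral result is assumed when that logarithm is zero. The function name `relative_trend_score` and its argument layout are hypothetical.

```python
import math


def relative_trend_score(first, last, t_s, t_f, universe_slopes,
                         alpha=1000.0, eps=1e-4, m_max=100.0):
    """Return a trend score in [0, m_max], with m_max / 2 as neutral.

    universe_slopes holds the raw slopes of all areas of interest in the
    comparison universe (assumed to include the subject itself).
    """
    neutral = m_max / 2.0
    s = (last - first) / (t_f - t_s)              # linear slope
    if s == 0:
        return neutral
    # Assumed perturbation rule: never let the maximum slope drop below eps.
    s_max = max(max(abs(x) for x in universe_slopes), eps)

    def limited(x):
        # Amplified slope magnitude, limited to 1 or greater.
        return max(1.0, abs(alpha * x))

    c_max = math.log10(limited(s_max))            # assumed clip limit
    if c_max == 0:
        return neutral                            # assumed guard
    ratio = math.log10(limited(s)) / c_max
    score = neutral * (1.0 + math.copysign(1.0, s) * ratio)
    return min(m_max, max(0.0, score))            # clip to the scale bounds


print(relative_trend_score(50.0, 60.0, 0, 10, [1.0, -0.5, 0.01]))
```

When a set of scores is needed over a common interval, the maximum slope can be computed once and passed to each call, in keeping with the efficiency note above.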
In addition to presenting the Relative Trend Score as a numerical value alone, a visual depiction of its relative direction adds emphasis. Examples are shown in
The additional visualization is in the form of a compass, indicating “at a glance” highs and lows of the Relative Trend Score. The technique contemplates fitting the range of the scores into a circular dial, using the appropriate trigonometric mapping as detailed below:
The mathematics for dynamically computing the visual elements of this “Relative Trend Score Compass” are detailed thusly, along with descriptive introductions to each mathematical line of text:
The Relative Trend Score Compass visual elements are computed as follows:
m ≡ Relative Trend Score (as computed above) for a particular area of interest
M≡maximum Relative Trend Score over all areas of interest in a particular universe
L≡length of graphical needle as desired in the rendering
ϵ≡perturbation from verticality(typically 0.1)
The above equations set the symbols for parameters, as described therein, to be used in setting the properties of the Relative Trend Score Compass as described below:
Set the origin of the compass arrow at (0,0), and set the tip at these coordinates:
The above conditional equation sets the horizontal extent, x, of the needle on the compass dial. This is done using trigonometry and scaling by the needle length, L, such that the horizontal movement of the dial angle is fit proportionately (score relative to maximum) into the two positive quadrants of the compass and is limited to the projection of 45 degrees. This keeps the intuitive sense communicated by the compass forward moving.
The above equation sets the vertical extent, y, of the needle on the compass dial. This is done using trigonometry and scaling by the needle length, L, such that the vertical extent is within the two right quadrants and with the angle fit proportionately (score relative to maximum) into the angular swing within those quadrants.
The two cases in the x coordinate distinguish two situations. When the absolute angle from the horizontal is above 45 degrees, the desired behavior is that of a constant-length hand on a clock or compass needle. When the absolute angle from the horizontal is below 45 degrees, the length of the arrow is decreased so that it never exceeds the projection of the 45-degree arrow onto the horizontal. The reason for this is to eliminate perceptions that, although the angle is lower, the length appears greater, erroneously suggesting better progress.
The ϵ is a small perturbation in the arrow angle to provide the effect of the arrow never going singularly vertical, which would be unintuitive, as time would be implied to be standing still with infinite progress.
Note also that only the maximum Relative Trend Score is used to scale the angle of the arrow, rather than the maximum absolute value. Should there be a negative Relative Trend Score with absolute value greater than the maximum positive Relative Trend Score, then the arrow depicting that case would point backwards, and that perception is acceptable and actually informative.
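The needle geometry described above may be sketched as follows. The exact trigonometric mapping is an assumption made for illustration: the angle from the horizontal is taken as the score relative to the maximum, scaled into a quarter turn less the perturbation ϵ, the x coordinate is held at the 45-degree projection whenever the absolute angle is below 45 degrees, and m is treated as a signed value. The function name `compass_tip` is hypothetical.

```python
import math


def compass_tip(m, m_top, length=1.0, eps=0.1):
    """Return the (x, y) tip of a needle whose origin is at (0, 0).

    m      : signed Relative Trend Score for the area of interest (assumed)
    m_top  : maximum Relative Trend Score over the universe
    length : needle length L
    eps    : perturbation keeping the needle off exact vertical
    """
    theta = (m / m_top) * (math.pi / 2.0 - eps)   # assumed angle mapping
    if abs(theta) >= math.pi / 4.0:
        x = length * math.cos(theta)              # constant-length needle
    else:
        x = length * math.cos(math.pi / 4.0)      # hold the 45-degree projection
    y = length * math.sin(theta)
    return x, y


for score in (100.0, 25.0, 0.0, -200.0):
    x, y = compass_tip(score, 100.0)
    print(score, round(x, 3), round(y, 3))
```

Under this reading the needle for the maximum score points almost, but not exactly, straight up, and a negative score whose magnitude exceeds the maximum yields a negative x, so the needle points backwards as noted above.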
With respect to aggregate scoring, an aggregate is any collection of areas of interest, such as in the case where areas of interest are companies, a benchmark, industry, sector, portfolio, or watchlist. For more intuitive understanding by the consumer, and ease of implementation, the score assigned to an aggregate, given the scores of its constituents, as computed above in any of the forms taught above, is defined, for a particular category (or overall), as the straight arithmetic average of the respective scores of the constituents. In cases where the constituents have not yet reported a score, due to no input news events having occurred, a neutral score, typically 50 in a 0 to 100 scale, is then used as the entry into the average.
In mathematical terms for a particular category (or overall) considered within a particular aggregate, the aggregate score is computed by contemplating an average, described below, along with descriptive introductions to each mathematical line of text:
J≡number of areas of interest, such as companies, in the aggregate
The above equation sets the symbol for the number of areas of interest in the aggregate of interest.
∀j∈{1 . . . J}
The above equation sets the symbol for each of the J scores within the aggregate to be used to compute the score for the aggregate.
The above equation delivers the objective Aggregate Score for the category (or overall) by an arithmetic average (sum of scores divided by count of scores) over the constituents of the aggregate.
In cases where an area of interest had not yet received any input sentiment values, that company would have had a neutral score set, typically 50 in a 0 to 100 scale, which would have then just naturally been averaged in as above.
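The aggregate scoring described above may be sketched as follows, as a minimal illustration. A constituent that has not yet reported a score is assumed to be represented by `None`, and the function name `aggregate_score` is hypothetical.

```python
def aggregate_score(scores, neutral=50.0):
    """Arithmetic average of constituent scores.

    scores holds one entry per constituent of the aggregate; None marks
    a constituent with no score yet, which is averaged in as neutral.
    """
    filled = [neutral if s is None else s for s in scores]
    return sum(filled) / len(filled)


print(aggregate_score([60.0, None, 70.0]))   # the missing score counts as 50
```

The same function can be evaluated at each point in time to produce the historical series of the aggregate.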
The Aggregate Score is presented as a current numerical value and as a historical graph over a user-selected timeline, as shown in
For custom combined category scores of aggregates, the approach is a straightforward arithmetic average as shown in
Formalizing mathematically, along with descriptive introductions to each mathematical line of text:
The above equation computes the Aggregate Score for a collection of custom categories in the same manner as the per-category case described above, using an arithmetic average (score sum divided by score count) and the parameters defined by the equations below:
C≡number of categories selected
J≡number of areas of interest, such as companies, in the aggregate
$IS_{j,c}(t)$ ≡ Custom Category Score at time t for the subset C of categories selected for the jth area of interest, such as a company
Score Rankings within Aggregates
Within aggregates, it is useful to present the relative performance of the entities (companies). Often of interest is the ability to stack rank and identify relative performance bands. These stratifications are computed as described below, along with descriptive introductions to each mathematical line of text:
For a particular category (or overall):
for a particular aggregate:
$n_m$ ≡ number of score data points for the mth area of interest, such as a company, in the aggregate
N ≡ number of areas of interest in the aggregate with $n_m > 0$
for a particular selected time range:
$IS_m$ ≡ score nearest the end of the time range, for the mth area of interest in the aggregate
The above equations set the symbology and definitions of the various parameters as described to be used in computing rankings and percentiles as described below:
$k_m^{(<)}$ ≡ stack parameter below = number of areas of interest, over the N companies in the aggregate, with scores less than that of the mth area of interest in the aggregate
$k_m^{(=)}$ ≡ stack parameter equal = number of areas of interest, over the N companies in the aggregate, with scores equal to that of the mth area of interest in the aggregate
$R_m \equiv N - k_m^{(<)}$ ≡ ranking, over the N areas of interest in the aggregate, of the mth area of interest within it
$Q_m$ ≡ percentile, within the N areas of interest in the aggregate, of the mth area of interest
The above equation maps the percentile into a zero to 99 range by taking the stack parameter below as a proportion of the total number of countable areas of interest, adjusted by half the stack parameter equal. This ratio is then applied to the percentile range of 99. The equation is formulated to handle the situation in which all values in the aggregate are equal, yielding a 50th percentile for all, and the situation in which there are few items in the aggregate, so as not to falsely over-reward. In examples where $n_m = 0$, $R_m = Q_m$ = N/A.
Rankings and percentiles are presented as single numbers pertaining to an area of interest, relative to the aggregate (such as the industry classification of a company). In addition, the areas of interest can be stack listed per their ranks or percentiles within the aggregate.
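The ranking and percentile computation may be sketched as follows. The ranking follows the expression given above; the percentile expression is an assumed reading of the description (the stack parameter below plus half the stack parameter equal, divided by N and scaled to 99), with the stack parameter equal assumed to count the area of interest itself. The function name `rank_and_percentile` is hypothetical, and `None` stands in for an area of interest with no score data points.

```python
def rank_and_percentile(scores):
    """Return a (rank, percentile) pair per area of interest.

    scores holds the score nearest the end of the time range for each
    area of interest, or None when the area has no data points.
    """
    valid = [s for s in scores if s is not None]
    n = len(valid)                     # N: areas with at least one data point
    results = []
    for s in scores:
        if s is None:
            results.append((None, None))               # rank and percentile N/A
            continue
        below = sum(1 for v in valid if v < s)         # stack parameter below
        equal = sum(1 for v in valid if v == s)        # stack parameter equal
        rank = n - below
        percentile = 99.0 * (below + equal / 2.0) / n  # assumed form
        results.append((rank, percentile))
    return results


print(rank_and_percentile([90.0, 80.0, None, 70.0]))
```

When every score in the aggregate is equal, this reading gives each area of interest a percentile of 49.5, which is in line with the 50th percentile behavior described above.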
The present disclosure provides computer control systems that are programmed to implement methods of the disclosure.
The computer system 2701 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 2705, which can be a single core or multi core processor, or a plurality of processors for parallel processing. The computer system 2701 also includes memory or memory location 2710 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 2715 (e.g., hard disk), communication interface 2720 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 2725, such as cache, other memory, data storage and/or electronic display adapters. The memory 2710, storage unit 2715, interface 2720 and peripheral devices 2725 are in communication with the CPU 2705 through a communication bus (solid lines), such as a motherboard. The storage unit 2715 can be a data storage unit (or data repository) for storing data. The computer system 2701 can be operatively coupled to a computer network (“network”) 2730 with the aid of the communication interface 2720. The network 2730 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet. The network 2730 in some cases is a telecommunication and/or data network. The network 2730 can include one or more computer servers, which can enable distributed computing, such as cloud computing. The network 2730, in some cases with the aid of the computer system 2701, can implement a peer-to-peer network, which may enable devices coupled to the computer system 2701 to behave as a client or a server.
The CPU 2705 can execute a sequence of machine-readable instructions, which can be embodied in a program or software. The instructions may be stored in a memory location, such as the memory 2710. The instructions can be directed to the CPU 2705, which can subsequently program or otherwise configure the CPU 2705 to implement methods of the present disclosure. Examples of operations performed by the CPU 2705 can include fetch, decode, execute, and writeback.
The CPU 2705 can be part of a circuit, such as an integrated circuit. One or more other components of the system 2701 can be included in the circuit. In some cases, the circuit is an application specific integrated circuit (ASIC).
The storage unit 2715 can store files, such as drivers, libraries and saved programs. The storage unit 2715 can store user data, e.g., user preferences and user programs. The computer system 2701 in some cases can include one or more additional data storage units that are external to the computer system 2701, such as located on a remote server that is in communication with the computer system 2701 through an intranet or the Internet.
The computer system 2701 can communicate with one or more remote computer systems through the network 2730. For instance, the computer system 2701 can communicate with a remote computer system of a user. Examples of remote computer systems include personal computers (e.g., portable PC), slate or tablet PC's (e.g., Apple® iPad, Samsung® Galaxy Tab), telephones, Smart phones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants. The user can access the computer system 2701 via the network 2730.
Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 2701, such as, for example, on the memory 2710 or electronic storage unit 2715. The machine executable or machine readable code can be provided in the form of software. During use, the code can be executed by the processor 2705. In some cases, the code can be retrieved from the storage unit 2715 and stored on the memory 2710 for ready access by the processor 2705. In some situations, the electronic storage unit 2715 can be precluded, and machine-executable instructions are stored on memory 2710.
The code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code, or can be compiled during runtime. The code can be supplied in a programming language that can be selected to enable the code to execute in a pre-compiled or as-compiled fashion.
Aspects of the systems and methods provided herein, such as the computer system 2701, can be embodied in programming. Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links or the like, also may be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.
Hence, a machine readable medium, such as computer-executable code, may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings. Volatile storage media include dynamic memory, such as main memory of such a computer platform. Tangible transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.
The computer system 2701 can include or be in communication with an electronic display 2735 that comprises a user interface (UI) 2740 for providing, for example, charts that depict successive levels of summary performance information. Examples of UI's include, without limitation, a graphical user interface (GUI) and web-based user interface.
Methods and systems of the present disclosure can be implemented by way of one or more algorithms. An algorithm can be implemented by way of software upon execution by the central processing unit 2705. The algorithm can, for example, assess long-term indicators of sentiment.
While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. It is not intended that the invention be limited by the specific examples provided within the specification. While the invention has been described with reference to the aforementioned specification, the descriptions and illustrations of the embodiments herein are not meant to be construed in a limiting sense. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. Furthermore, it shall be understood that all aspects of the invention are not limited to the specific depictions, configurations or relative proportions set forth herein which depend upon a variety of conditions and variables. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is therefore contemplated that the invention shall also cover any such alternatives, modifications, variations or equivalents. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.
It should be understood from the foregoing that, while particular implementations have been illustrated and described, various modifications can be made thereto and are contemplated herein. It is also not intended that the invention be limited by the specific examples provided within the specification. While the invention has been described with reference to the aforementioned specification, the descriptions and illustrations of the preferable embodiments herein are not meant to be construed in a limiting sense. Furthermore, it shall be understood that all aspects of the invention are not limited to the specific depictions, configurations or relative proportions set forth herein which depend upon a variety of conditions and variables. Various modifications in form and detail of the embodiments of the invention will be apparent to a person skilled in the art. It is therefore contemplated that the invention shall also cover any such modifications, variations and equivalents.
This application is a national stage entry of PCT/US2018/013906, entitled METHODS OF ASSESSING LONG-TERM INDICATORS OF SENTIMENT, and filed on Jan. 16, 2018, which claims priority to and the benefit of U.S. Provisional Application No. 62/446,312, filed Jan. 13, 2017, each of which is incorporated herein by reference in its entirety for any purpose.
Number | Date | Country
---|---|---
62446312 | Jan 2017 | US

 | Number | Date | Country
---|---|---|---
Parent | 16470212 | Jun 2019 | US
Child | 17702356 | | US