METHOD AND APPARATUS FOR RISK MINING

Description

FIELD OF THE INVENTION

This invention generally relates to the area of risk management. More specifically, this invention relates to automating risk identification using information mined from information sources.

BACKGROUND OF THE INVENTION

Organizations operate in risky environments. Competitors may threaten their markets; regulations may threaten margins and business models; customer sentiment may shift and threaten demand; and suppliers may go out of business and threaten supply, etc. Risk management is thus a central part of operations and strategy for any prudent organization.

Currently, various risk alerts with respect to entities and activities are common. However, such risk alerts occur after the fact. While alerts as to the actual occurrence of an event which puts an entity or topic/concern at risk are important, the mining of potential risks is believed to be very useful in decision making with respect to such an entity or issue. In order to perform a meaningful risk assessment, it is often necessary to compile not only sufficient information, but also information of the proper type, in order to formulate a judgment as to whether the information constitutes a risk. Without the ability to access and assimilate a variety of different information sources, and particularly a sufficient number and type of information sources, the identification, assessment and communication of potential risks are significantly hampered. Currently, the gathering of risk-related information is performed manually and lacks defined criteria and processes for mining meaningful risks to provide a clear picture of the risk landscape.

One possibly related area is research on correlations between stock prices or stock price volatility (a proxy for risk) and published documents. The first step in the risk management cycle, i.e. risk identification, however, has received little or no attention. In other words, prior techniques are cumbersome, inefficient for identifying risk, and lacking in accuracy. In particular, prior techniques require manual or operator intervention and analysis to access documents that may impact risk before alerting an analyst. Thus, the state of the art is incapable of dealing with risk unanticipated by a risk analyst.

SUMMARY OF THE INVENTION

The present invention recognizes the difficulties analysts currently have in anticipating risks and seeks to overcome these difficulties. The present invention provides a method to accurately and efficiently identify potential risks associated with various entities and activities, and includes various advantages and benefits as described further herein.

The present invention avoids the problems of the prior art by mining risk-indicating patterns from textual databases that can then be used to activate alerts, thus informing users, such as analysts, that a risk may or is about to materialize. In particular, the present invention is directed towards automatically mining risk from different sources, thereby allowing an analyst to review many more information sources than possible with techniques of the prior art.

In one aspect of the present invention a computer implemented method for mining risks is provided. The method includes providing a set of risk-indicating patterns on a computing device; querying a corpus using the computing device to identify a set of potential risks by using a risk-identification-algorithm based, at least in part, on the set of risk-indicating patterns associated with the corpus; comparing the set of potential risks with the risk-indicating patterns to obtain a set of prerequisite risks; generating a signal representative of the set of prerequisite risks; and storing the signal representative of the set of prerequisite risks in an electronic memory. Prior to this mining, a corpus of textual data is first searched with the computing device containing the risk-identification-algorithm for instances of a set of risk-indicative seed patterns provided to create a risk database, which is done by a risk miner function. The corpus may include any searchable source of information. Generally such sources are digital and accessible through computerized searching. For example, the corpus may include, but is not limited to, news, financial information, blogs, web pages, event streams, protocol files, status updates on social network services, emails, short message services, instant chat messages, Twitter tweets, and/or combinations thereof. Rather than alert a user after a risk factor has in fact occurred, a risk alerter function may pass warning notifications to a user directly, thereby avoiding the shortcomings of the prior art.

In another aspect of the present invention a computing device or system may include an electronic memory; and a risk-identification-algorithm based, at least in part, on the set of risk-indicating patterns associated with a corpus stored in the electronic memory.

These and other features of the invention will be more fully understood from the following description of specific embodiments of the invention taken together with the accompanying drawings. Like reference symbols in the various drawings indicate like elements.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings:

FIG. 1 is a depiction of a prerequisite of an event forming a risk according to the present invention;

FIG. 2 is a schematic of a device for mining risks according to the present invention;

FIG. 3 is a schematic of the method for mining risks according to the present invention;

FIG. 4 depicts an embodiment of risk clustering according to the present invention;

FIG. 5 depicts another embodiment of risk clustering according to the present invention;

FIGS. 6-13 are risk mining examples according to the present invention; and

FIG. 14 is a schematic of the system for mining and alerting risks according to the present invention.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 illustrates how a risk materializes over time. Initially, a risk, P=>Q, is extracted from a large textual database at an initial time, where Q stands for a high-impact event and P stands for a prerequisite of Q which is causally or statistically connected to Q and precedes Q in time. The implication symbol “=>” captures the causality and/or enablement relation holding between P and Q (e.g., P causes Q, or P is likely to enable Q). The implication symbol “=>” is not meant to be a material implication. Later, at time t_j, P might happen, which in turn may lead to Q occurring at time t_k. The present invention solves the problem of obtaining risks P=>Q automatically from text and describes how P=>Q and P may be used to alert a user that Q may be imminent. As used herein, the term risk, which may be positive or negative, refers to an event involving uncertainty unless the event has occurred, which may result from a factor, thing, element, or course. In particular, as used herein, the term risk refers to a prerequisite for an event, where the prerequisite is causally or statistically connected to the event and precedes the event in time. As used herein, the term prerequisite refers to a statement or an indication relating to a particular subject. In particular, the term prerequisite refers to a statement or an indication relating to a particular event, either directly or through the mining techniques of the present invention.
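By way of non-limiting illustration, the life cycle of a risk P=>Q described above may be sketched as a small record whose status depends on whether evidence for P or for Q has been observed. The class names, field names and example strings below are hypothetical and are used only for illustration; the status labels follow the terms “imminent” and “materialized” used elsewhere in this description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RiskStatus(Enum):
    EXTRACTED = "extracted"        # P=>Q has been mined; neither P nor Q observed
    IMMINENT = "imminent"          # evidence for the prerequisite P was observed
    MATERIALIZED = "materialized"  # evidence for the event Q was observed


@dataclass
class Risk:
    prerequisite: str                # P
    event: str                       # Q
    p_seen_at: Optional[int] = None  # time t_j at which P was observed, if any
    q_seen_at: Optional[int] = None  # time t_k at which Q was observed, if any

    def status(self) -> RiskStatus:
        # Q takes precedence: once the event has occurred the risk has materialized.
        if self.q_seen_at is not None:
            return RiskStatus.MATERIALIZED
        if self.p_seen_at is not None:
            return RiskStatus.IMMINENT
        return RiskStatus.EXTRACTED


# Hypothetical example risk.
risk = Risk(prerequisite="supplier misses payments", event="supply interruption")
initial = risk.status()
risk.p_seen_at = 5   # P observed at t_j
after_p = risk.status()
risk.q_seen_at = 9   # Q observed at t_k
after_q = risk.status()
```

In this sketch the status is derived from the recorded observation times rather than stored, so that the record cannot report a materialized risk without a recorded observation of Q.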

FIGS. 2 and 3 illustrate the overall process of the present invention. As depicted in FIG. 2, a corpus 110, for example a set(s) of textual feed(s), is mined for risk through use of a computing device 120. As used herein, the term corpus and its variants refer to a set or sets of data, in particular digital data including textual data. The corpus 110 may include, but is not limited to, news; financial information, including but not limited to stock price data and its standard deviation (volatility); governmental and regulatory reports, including but not limited to government agency reports, regulatory filings such as tax filings, medical filings, legal filings, Food and Drug Administration (FDA) filings, and Securities and Exchange Commission (SEC) filings; private entity publications, including but not limited to annual reports, newsletters, advertising and press releases; blogs; web pages; event streams; protocol files; status updates on social network services; emails; Short Message Services (SMS); instant chat messages; Twitter tweets; and/or combinations thereof. The computing device 120 surveys corpus 110 to extract risk-indicating patterns and to seed the risk-identification-algorithm 140 with risk-indicative seed patterns for subsequent risk mining by an analyst or user. The computing device 120 may further include an interface 170, such as a keyboard, for querying the computing device 120, and a display 160 for displaying results from the computing device 120.

The computing device 120 may also be used to alert users 130 through a computer interface (not shown) of risks, including but not limited to imminent risks, i.e., risks that are likely to occur, including, but not limited to, risks likely to occur in the near future or within a defined time period. Typically, the users 130 are alerted via a computing device (not shown). The present invention, however, is not so limited, and any device having a visual display or even a voice communication may suitably be used. As used herein, the term “computing device” refers to a device that computes, especially a programmable electronic machine that performs high-speed mathematical or logical operations or that assembles, stores, correlates, or otherwise processes information. Examples include, without limitation, mainframe computers, personal computers and handheld devices. Before mining the corpus 110 for risk, the present invention utilizes the computing device 120 to extract risk-indicating patterns from a corpus or corpora of textual data. As used herein, risk-indicating patterns are patterns developed through the techniques of the present invention which relate possible prerequisites to possible events.

As depicted in FIG. 3, the computing device 120 contains a risk-identification-algorithm 140. With the computing device 120 containing the risk-identification-algorithm 140, a corpus 210 of textual data is searched by a risk miner 220 for instances of a set of provided risk-indicative seed patterns in order to create a risk database. The corpus 210 may include, but is not limited to, news; financial information, including but not limited to stock price data and its standard deviation (volatility); governmental and regulatory reports, including but not limited to government agency reports, regulatory filings such as tax filings, medical filings, legal filings, Food and Drug Administration (FDA) filings, and Securities and Exchange Commission (SEC) filings; private entity publications, including but not limited to annual reports, newsletters, advertising and press releases; blogs; web pages; event streams; protocol files; status updates on social network services; emails; Short Message Services (SMS); instant chat messages; Twitter tweets; and/or combinations thereof. The corpus 210 may be the same as corpus 110 or may be different.

In one embodiment of the invention, trigger keywords (e.g. “risk”, “threat”) are used to generate the risk database. In another embodiment, regular expressions (e.g. “(“may”)? pose(s)? (a)? threat(s)? to”) are used to generate the risk database. Candidate risk sentences or sentence sequences are created, and new patterns are generalized by running a named entity tagger or a Part of Speech (POS) tagger and chunker over them (entities can be described by proper nouns or noun phrases (NPs), and not just by named entities), and by substituting entities with per-class placeholders (e.g. “J.P. Morgan”=>“<COMPANY>”). These generated patterns can be used for re-processing the corpus, after some human review in one embodiment of the present invention, or automatically in another embodiment. The extracted sentences or sentence sequences are then both validated (i.e., checked as to whether or not they are really risk-indicating sentences) and parsed into risks of the form P=>Q (i.e. finding out which text spans correspond to the precondition “P”, which parts express the implication “=>”, and which parts express the high-impact event “Q”), using features including, but not limited to, the following:

- a set of terms with significant statistical association with the term “risk” (in one embodiment of this invention, statistical programs, such as Pointwise Mutual Information (PMI) and Log Likelihood, or rules, including but not limited to rules obtained by Hearst pattern induction, may be used to determine the set of terms);
- a set of binary gazetteer features, where a feature fires if the text matches an entry in a gazetteer, i.e., a set of risk-indicative terms (“threat”, “bankruptcy”, “risk”, . . . ) compiled by human experts or extracted from hand-labelled training data;
- a set of indicators of speculative language;
- instances of future time reference;
- occurrences of conditionals; and/or
- occurrences of causality markers.
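By way of non-limiting illustration, several of the features listed above may be computed as binary indicators over the words of a candidate sentence. The gazetteer terms below are the examples given above; the remaining word lists, the function name and the example sentences are assumptions made only for this sketch and are far smaller than any list a practical system would use.

```python
import re

# Gazetteer terms taken from the examples in the description.
GAZETTEER = {"threat", "bankruptcy", "risk"}
# The following word lists are illustrative assumptions, not part of the disclosure.
SPECULATIVE = {"may", "might", "could", "possibly"}
FUTURE = {"will", "shall", "upcoming"}
CONDITIONAL = {"if", "unless"}
CAUSAL = {"because", "cause", "causes", "therefore"}


def extract_features(sentence):
    """Return one binary feature per feature family for a single sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return {
        "gazetteer": bool(words & GAZETTEER),
        "speculative": bool(words & SPECULATIVE),
        "future": bool(words & FUTURE),
        "conditional": bool(words & CONDITIONAL),
        "causal": bool(words & CAUSAL),
    }


risky = extract_features("If the supplier fails, it could cause bankruptcy.")
neutral = extract_features("The meeting was held on Tuesday.")
```

A feature dictionary of this kind could serve as input to a validation step or a machine-learning based classifier; the choice of classifier is left open here.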

In one embodiment of the present invention, a variant of surrogate machine-learning (i.e., technology for machine learning tasks by examples) may be used to create training data for a machine-learning based classifier that extracts risk-indicative sentences. One useful technique is described by Sriharsha Veeramachaneni and Ravi Kumar Kondadadi in “Surrogate Learning—From Feature Independence to Semi-Supervised Classification”, Proceedings of the NAACL HLT Workshop on Semi-supervised Learning for Natural Language Processing, pages 10-18, Boulder, Colo., June 2009. Association for Computational Linguistics (ACL), the contents of which is incorporated herein by reference.

A risk type classifier 230 classifies each risk pattern by risk type (“RT”), according to a pre-defined taxonomy of risk types. In one embodiment of the present invention, this taxonomy may use, but is not limited to, the following classes:

- Political: Government policy, public opinion, change in ideology, dogma, legislation, disorder (war, terrorism, riots);
- Environmental: Contaminated land or pollution liability, nuisance (e.g. noise), permissions, public opinion, internal/corporate policy, environmental law or regulations or practice or ‘impact’ requirements;
- Planning: Permission requirements, policy and practice, land use, socio-economic impact, public opinion;
- Market: Demand (forecasts), competition, obsolescence, customer satisfaction, fashion;
- Economic: Treasury policy, taxation, cost inflation, interest rates, exchange rates;
- Financial: Bankruptcy, margins, insurance, risk share;
- Natural: Unforeseen ground conditions, weather, earthquake, fire, explosion, archaeological discovery;
- Project: Definition, procurement strategy, performance requirements, standards, leadership, organization (maturity, commitment, competence and experience), planning and quality control, program, labor and resources, communications and culture;
- Technical: Design adequacy, operational efficiency, reliability;
- Regulatory: Changes by regulator;
- Human: Error, incompetence, ignorance, tiredness, communication ability, culture, work in the dark or at night;
- Criminal: Lack of security, vandalism, theft, fraud, corruption;
- Safety: Regulations, hazardous substances, collisions, collapse, flooding, fire, explosion; and/or
- Legal: Changes in legislations, treaties.
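By way of non-limiting illustration, a very simple risk type classifier may score a piece of text against keywords drawn from the taxonomy and return the best-scoring class. Only four of the classes listed above are shown, the keyword sets are abbreviated, and the keyword-overlap scoring rule and the function name are assumptions of this sketch rather than the classifier 230 itself.

```python
import re

# A small subset of the taxonomy above; keywords are taken from the class descriptions.
TAXONOMY = {
    "Political": {"war", "terrorism", "riots", "legislation"},
    "Financial": {"bankruptcy", "margins", "insurance"},
    "Natural": {"weather", "earthquake", "fire", "explosion"},
    "Criminal": {"vandalism", "theft", "fraud", "corruption"},
}


def classify(text):
    """Return the risk type whose keywords overlap most with the text, or None."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {risk_type: len(words & keywords) for risk_type, keywords in TAXONOMY.items()}
    best = max(scores, key=scores.get)  # ties resolve to the first class listed
    return best if scores[best] > 0 else None


natural = classify("An earthquake may halt production")
criminal = classify("Fraud and theft threaten margins")
unknown = classify("Hello world")
```

Because some terms (e.g. fire, explosion) appear under more than one class in the full taxonomy, a practical classifier would need a tie-breaking or multi-label strategy that this sketch does not attempt.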

A risk clusterer 240 groups all risks in the database by similarity, but without imposing a pre-defined taxonomy (i.e., the grouping is data driven). In one embodiment, Hearst pattern induction may be used. Hearst pattern induction was first mentioned in Hearst, Marti, “WordNet: An Electronic Lexical Database and Some of its Applications”, (Christiane Fellbaum (Ed.)), MIT Press 1998, the contents of which are incorporated herein by reference. In another embodiment of the present invention, a number k is chosen by the system developer, and the k-means clustering method may be used. Further details of k-means clustering are described by Hastie, Trevor, Robert Tibshirani and Jerome Friedman, “The Elements of Statistical Learning: Data Mining, Inference, and Prediction”, Second Edition, Springer (2009), the contents of which are incorporated herein by reference. In such a case, the risks are grouped into a number, i.e. k, of categories and then classified by choosing the cluster with the highest similarity to a cluster of interest. In another embodiment of the present invention, hierarchical clustering is used. Alternatively, or in addition, both k-means clustering and hierarchical clustering may be used.
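By way of non-limiting illustration, k-means clustering with a developer-chosen k may be sketched in a few lines. The sketch assumes that each risk has already been converted into a numeric feature vector; the two-dimensional vectors, the deterministic choice of the first k points as initial centroids, and the function names are assumptions made for readability only.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_cluster(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda c: squared_distance(point, centroids[c]))


def kmeans(points, k, iterations=10):
    # Assumption: the first k points serve as initial centroids (deterministic).
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [nearest_cluster(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Hypothetical feature vectors for four risks.
vectors = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
assignments, centroids = kmeans(vectors, k=2)
new_risk_cluster = nearest_cluster((4.8, 5.0), centroids)
```

Placing a new vector in the cluster with the nearest centroid is one possible reading of the similarity-based choice described above; other similarity measures could equally be substituted.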

FIG. 4 depicts one embodiment of the risk clusterer 240 according to the present invention. At step 310, a text corpus is provided. At step 320, the text corpus is tokenized into a set of sentences. At step 330, all instances of a risk, which is indicated by “*”, are extracted from the tokenized text. At step 340, a taxonomy of risks is constructed into a tree by organizing all fillers matching the risk, i.e. “*”. At step 350, Hearst pattern induction may be used to induce the risk taxonomy. Further, an NP chunker may be used to find the boundaries of interest.
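By way of non-limiting illustration, steps 320 to 350 may be approximated as follows. The sketch applies a single hand-written Hearst-style “such as” pattern rather than inducing patterns, and it uses regular expressions in place of an NP chunker; the example text, the pattern and the function names are assumptions of this sketch.

```python
import re

# Hand-written Hearst-style pattern: "<filler> risks such as <hyponyms>".
HEARST = re.compile(r"(\w+ risks) such as (.+?)[.!?]?$")


def split_sentences(text):
    """Step 320 (simplified): split the corpus into sentences."""
    return re.split(r"(?<=[.!?])\s+", text.strip())


def build_taxonomy(text):
    """Steps 330-350 (simplified): organize matched fillers into a two-level tree."""
    tree = {"risks": {}}
    for sentence in split_sentences(text):
        match = HEARST.search(sentence)
        if match:
            parent, hyponyms = match.group(1), match.group(2)
            children = re.split(r",\s*|\s+and\s+|\s+or\s+", hyponyms)
            tree["risks"].setdefault(parent, []).extend(children)
    return tree


corpus = (
    "Companies face legal risks such as regulatory changes and patent disputes. "
    "Firms also watch market risks such as falling demand."
)
taxonomy = build_taxonomy(corpus)
```

The resulting tree places each “* risks” filler under the root and each enumerated item under its filler, which mirrors the tree construction of step 340 in a much reduced form.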

FIG. 5 depicts another embodiment of the risk clusterer 240 according to the present invention. In this embodiment, a risk taxonomy is created from, for example, risks 450, legal risks 460 and legal changes 470. Risks 450, such as those that may be associated with legal changes 470, are seeded, as indicated by 410. Legal risks 460, such as legal changes 470, are mined by the computing device 120, as indicated by 420. Risks 450 are also mined for legal risks 460, as indicated by 430. In such a manner there is feedback for the legal risks 460 based on the risks 450 and the legal changes 470. The mining of the risks 450 and the legal risks 460 may include mining with the word risk or an equivalent thereto. The mining of the legal changes 470 does not necessarily include the word risk. Advantageously, the taxonomy resulting from this process contains risk-indicative phrases that do not necessarily contain the word “risk” itself. Such a taxonomy may be used in the risk-mining patterns in addition to its use for risk-type classification.

A risk alerter 250 performs a similarity matching operation between the risks in the database and likely instances of P or Q in a textual feed 110. If evidence for P is found, the risk P=>Q is “imminent”. If evidence for Q is found, the risk P=>Q has materialized. In one embodiment of the present invention, the risk alerter 250 passes warning notifications to a user 130 directly.
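By way of non-limiting illustration, the similarity matching performed by the risk alerter 250 may be sketched with a Jaccard word-overlap measure. The choice of Jaccard similarity, the threshold of 0.5, the function names and the example risk are assumptions of this sketch; any other similarity measure could be used in their place.

```python
import re


def tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two strings, in the range 0..1."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def alert(feed_item, risk_db, threshold=0.5):
    """Compare one feed item with every stored risk (P, Q)."""
    alerts = []
    for p, q in risk_db:
        if jaccard(feed_item, q) >= threshold:
            alerts.append(("materialized", p, q))  # evidence for Q
        elif jaccard(feed_item, p) >= threshold:
            alerts.append(("imminent", p, q))      # evidence for P
    return alerts


# Hypothetical risk database with a single risk P=>Q.
risk_db = [("lithium supply falls", "battery prices rise")]
imminent = alert("Lithium supply falls sharply", risk_db)
materialized = alert("Battery prices rise again", risk_db)
nothing = alert("Weather is mild", risk_db)
```

Evidence for Q is checked before evidence for P so that a feed item reporting the event itself is not downgraded to a mere warning.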

As a result, when inspecting the risk database, the user 130 (e.g. a risk analyst) can take immediate action before the risk materializes and can increase the priority of the management of imminent risks (“P!, . . . , P!, P!, P!, . . . P! . . . ”) in the textual feed and of materialized risks (“Q!”) as events unfold, without even having to read the textual feeds.

In one embodiment of the present invention, the output of the risk alerter 250 is connected to the input of a risk routing unit (not shown in FIGS. 2-3), which notifies an analyst whose profile matches the risk type RT. For example, an analyst may want to know about environmental risks. The risk alerter 250 would alert the analyst about an environmental risk when a prerequisite of a possible environmental event is mined. For example, the analyst may be alerted to an environmental risk of global warming when industrial activity increases in a particular country or region.
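By way of non-limiting illustration, a risk routing unit may be sketched as a lookup from risk type RT to the analysts whose profiles list that type. The analyst names, the profile structure and the function name are hypothetical.

```python
# Hypothetical analyst profiles: each analyst subscribes to a set of risk types.
ANALYST_PROFILES = {
    "alice": {"Environmental", "Natural"},
    "bob": {"Financial", "Market"},
}


def route(risk_type, profiles=ANALYST_PROFILES):
    """Return the analysts whose profile matches the given risk type."""
    return sorted(name for name, types in profiles.items() if risk_type in types)


recipients = route("Environmental")
no_recipients = route("Legal")
```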

In one embodiment of the present invention, a set of risk descriptions extracted from a corpus defined as the set of all past Securities and Exchange Commission (“SEC”) filings is matched to the risks extracted from the textual feed. The method proposes one risk description or a ranked list of alternative risk descriptions for inclusion in draft SEC filings for the company operating the system, in order to ensure compliance with SEC business risk disclosure duties.

The present invention may use a variety of methods for risk identification. For example, as depicted in FIG. 6, risk mining may include baseline monitoring of regular patterns over surface strings and named entity tags; identification of words frequently associated with risk using clustering information theory; and/or risk-indicative sentence clustering. Alternatively or in addition to, technology for machine learning of tasks by example may be used. The risk identification includes the querying of a corpus or corpora for risk indicating patterns. The query result may match all, substantially all or some of the risk indicating patterns. The number of occurrences or particular risk indicating patterns may also be used in the risk mining techniques of the present invention.
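By way of non-limiting illustration, baseline matching of a regular pattern over surface strings, followed by substitution of entities with per-class placeholders, may be sketched as follows. The regular expression is a Python rendering of the “pose a threat to” example given earlier in this description; the entity table stands in for a named entity tagger, and its entries, the example sentences and the function names are assumptions of this sketch.

```python
import re

# Python rendering of the example pattern "(may)? pose(s)? (a)? threat(s)? to".
SEED_PATTERN = re.compile(r"\b(?:may\s+)?poses?\s+(?:an?\s+)?threats?\s+to\b", re.IGNORECASE)

# Hypothetical stand-in for a named entity tagger: surface string -> class placeholder.
ENTITY_CLASSES = {"J.P. Morgan": "<COMPANY>", "Bolivia": "<COUNTRY>"}


def find_candidates(sentences):
    """Keep only the sentences that match the seed pattern."""
    return [s for s in sentences if SEED_PATTERN.search(s)]


def generalize(sentence):
    """Replace known entities with their per-class placeholder."""
    for name, placeholder in ENTITY_CLASSES.items():
        sentence = sentence.replace(name, placeholder)
    return sentence


sentences = [
    "Rising rates may pose a threat to J.P. Morgan.",
    "The meeting was held on Tuesday.",
    "Export limits pose threats to buyers in Bolivia.",
]
candidates = find_candidates(sentences)
patterns = [generalize(s) for s in candidates]
```

The generalized sentences can then be treated as new candidate patterns for re-processing a corpus, with or without human review.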

FIGS. 7 and 8 illustrate examples of risk mining according to the present invention. In Example 1 of FIG. 7, the corpus, including the listed news article, is mined for the term “cholesterol” as P or a prerequisite of Q or an event. The event Q is further classified by a holder “diabetics” and a target “amputation risk”. The Risk Type RT is health and has a positive polarity as being beneficial to health. For purposes of the present invention, the term risk not only refers to negative or harmful events, but also may refer to positive or beneficial results. In other words, a risk may have a positive impact and/or a negative impact. In Example 2 of FIG. 8, the corpus, including the listed news article, is mined for the phrase “North Korea launch” as P or a prerequisite of Q or an event. The event Q is further classified by a holder “North Korea” and a target “more than condemnation” U.S.”. The Risk Type RT is political and has a negative polarity as being harmful to world politics. Moreover, such negative and/or positive polarities may also be weighted for the degree of the risk. In such a case, it may be beneficial to alert the user 130 to a very harmful or very beneficial risk to a greater degree than for a less consequential risk.

FIG. 9 illustrates another example of risk mining according to the present invention. In Example 3, the news article is mined. As background, demand for the metal lithium is increasing with limited supplies being available. Much of the metal is obtained from Bolivia, which at the time of this article has a government which may be viewed by some not to be friendly to capitalistic governments or businesses. The article is mined for a variety of potential words, sequences of words, and/or partial phrases to query the article for prerequisite P of events Q which may lead to risk, as indicated by the underlined words and/or sequences. The risk types present in the article include supply-demand risk and political risk.

FIG. 10 illustrates another example of risk mining according to the present invention. In Example 4, a corpus is mined for a pattern having specific tokens, i.e., “if” and “then”. The mining extracts sequences beginning with or having these tokens. The length of the sequence is not limited to any particular length or number of words, but is determined by the tokens. The sequences are stored in registers, for example in the computing device 120. The use of patterns, however, such as, but not limited to, those shown in FIG. 13, may be more precise than using a keyword-based ranked retrieval.
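By way of non-limiting illustration, mining with the tokens “if” and “then” may be sketched with a regular expression whose captured spans are delimited only by those tokens, so that the span length is not fixed in advance. The example sentence and the function name are assumptions of this sketch.

```python
import re

# The span between "if" and "then" is taken as P; the span after "then" as Q.
IF_THEN = re.compile(r"\bif\b(.+?)\bthen\b(.+)", re.IGNORECASE)


def mine_if_then(sentence):
    """Return (P, Q) for an if/then sentence, or None when the pattern is absent."""
    match = IF_THEN.search(sentence)
    if not match:
        return None
    prerequisite = match.group(1).strip(" ,.")
    event = match.group(2).strip(" ,.")
    return prerequisite, event


mined = mine_if_then("If interest rates rise, then housing demand may fall.")
skipped = mine_if_then("Housing demand was stable last year.")
```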

FIG. 11 illustrates another example of risk mining according to the present invention. In Example 5, a corpus is mined according to syntax or the grammatical structure of sentences or phrases. In this example, normal PENN Treebank classes or tags, or slightly modified PENN tags, are used. Further details of the Penn Treebank may be found at http://www.cis.upenn.edu/~treebank/ (PENN Treebank homepage), the contents of which are incorporated herein by reference, or by contacting Linguistic Data Consortium, University of Pennsylvania, 3600 Market Street, Suite 810, Philadelphia, Pa. 18104. For languages other than English, corresponding tagsets have been established and are known to one of ordinary skill in the art. In this example the tag “PRP” refers to a personal pronoun, i.e., “we” in the example sentence. The tag “VBP” refers to a non-third person singular present tense verb, i.e. “expect” in the example sentence. The tag “TO” simply refers to the word “to” in the example sentence. The “VB” tag refers to a base form verb, i.e. “be” in the example sentence. The “RB” tag refers to an adverb, i.e., “negatively” in the example sentence. The “IN” tag refers to a preposition or subordinating conjunction, i.e. “by” in the example sentence. Some of the common PENN Treebank word P.O.S. tags include, but are not limited to, CC: Coordinating conjunction; CD: Cardinal number; DT: Determiner; EX: Existential there; FW: Foreign word; IN: Preposition or subordinating conjunction; JJ: Adjective; JJR: Adjective, comparative; JJS: Adjective, superlative; LS: List item marker; MD: Modal; NN: Noun, singular or mass; NNS: Noun, plural; NNP: Proper noun, singular; NNPS: Proper noun, plural; PDT: Predeterminer; POS: Possessive ending; PRP: Personal pronoun; PRP$: Possessive pronoun (prolog version PRP-S); RB: Adverb; RBR: Adverb, comparative; RBS: Adverb, superlative; RP: Particle; SYM: Symbol; TO: to; UH: Interjection; VB: Verb, base form; VBD: Verb, past tense; VBG: Verb, gerund or present participle; VBN: Verb, past participle; VBP: Verb, non-3rd person singular present; VBZ: Verb, 3rd person singular present; WDT: Wh-determiner; WP: Wh-pronoun; WP$: Possessive wh-pronoun (prolog version WP-S); and WRB: Wh-adverb.
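By way of non-limiting illustration, matching a sequence of PENN Treebank tags against a tagged sentence may be sketched as a sliding comparison over the tag list. The sentence below is hand-tagged and hypothetical; it is chosen only to be consistent with the tags discussed above and is not asserted to be the sentence of FIG. 11. In practice the tags would come from a POS tagger.

```python
# Hand-tagged, hypothetical sentence as (word, PENN Treebank tag) pairs.
TAGGED = [
    ("We", "PRP"), ("expect", "VBP"), ("to", "TO"), ("be", "VB"),
    ("negatively", "RB"), ("affected", "VBN"), ("by", "IN"),
    ("higher", "JJR"), ("costs", "NNS"),
]

# Tag sequence to search for.
PATTERN = ["PRP", "VBP", "TO", "VB", "RB", "VBN", "IN"]


def find_tag_sequence(tagged, pattern):
    """Return the start index of the first occurrence of the tag pattern, or -1."""
    tags = [tag for _, tag in tagged]
    for start in range(len(tags) - len(pattern) + 1):
        if tags[start:start + len(pattern)] == pattern:
            return start
    return -1


start = find_tag_sequence(TAGGED, PATTERN)
# The words following the matched tag sequence form the candidate span of interest.
remainder = [word for word, _ in TAGGED[start + len(PATTERN):]] if start >= 0 else []
```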

In FIG. 12, Example 6 illustrates another mining sequence or algorithm based on PENN treebank tags. Thus, as shown in FIGS. 11 and 12, the mining techniques of the present invention may analyze the same sentence under different criteria to obtain risks or prerequisites for risks.

In FIG. 13, risk mining according to the present invention is accomplished by a sequence of binary grammatical dependency relationships between words, including placeholders.
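By way of non-limiting illustration, a pattern made of binary grammatical dependency relationships with placeholders may be matched against a parsed sentence as follows. The dependency triples are written by hand in place of a parser, the relation labels and the placeholder names are assumptions, and the matcher is greedy (it does not backtrack), which suffices for this small example.

```python
def is_placeholder(term):
    return term.startswith("<") and term.endswith(">")


def unify(pattern_term, value, bindings):
    """Match a pattern term against a word, binding placeholders consistently."""
    if is_placeholder(pattern_term):
        if pattern_term in bindings:
            return bindings[pattern_term] == value
        bindings[pattern_term] = value
        return True
    return pattern_term == value


def match(pattern, dependencies):
    """Return placeholder bindings if every pattern relation is found, else None."""
    bindings = {}
    for rel, head, dep in pattern:
        for d_rel, d_head, d_dep in dependencies:
            trial = dict(bindings)  # work on a copy so failed attempts leave no trace
            if rel == d_rel and unify(head, d_head, trial) and unify(dep, d_dep, trial):
                bindings = trial
                break
        else:
            return None
    return bindings


# Hand-written (relation, head, dependent) triples for "Regulation threatens margins".
dependencies = [("nsubj", "threatens", "Regulation"), ("dobj", "threatens", "margins")]
pattern = [("nsubj", "threatens", "<P>"), ("dobj", "threatens", "<Q>")]
bindings = match(pattern, dependencies)
```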

The above-described examples and techniques for mining risks may be used individually or in any combination. The present invention, however, is not limited to these specific examples, and other patterns or techniques may be used with the present invention. The mined patterns from these examples and/or from the techniques of the present invention may be ranked according to ranking algorithms, such as but not limited to statistical language models (LMs), graph-based algorithms (such as PageRank or HITS), ranking SVMs, or other suitable methods.

In one aspect of the present invention a computer implemented method for mining risks is provided. The method includes providing a set of risk-indicating patterns on a computing device 120; querying a corpus 110 using the computing device 120 to identify a set of potential risks by using a risk-identification-algorithm 140 based, at least in part, on the set of risk-indicating patterns associated with the corpus 110; comparing the set of potential risks with the risk-indicating patterns to obtain a set of prerequisite risks; generating a signal representative of the set of prerequisite risks; and storing the signal representative of the set of prerequisite risks in an electronic memory 150. The method may further include determining an imminent risk from the prerequisite risks, the imminent risk being determined using the risk-identification-algorithm 140, the imminent risk being associated with at least one risk from the set of prerequisite risks; generating a signal representative of the imminent risk; and storing the signal representative of the imminent risk in the electronic memory 150. Still further, the method may include, after storing the signal representative of the set of prerequisite risks, determining a materialized risk, the materialized risk being determined using the risk-identification-algorithm 140, the materialized risk being associated with the set of risks; generating a signal representative of the materialized risk; and storing the signal representative of the materialized risk in the electronic memory 150. Moreover, the method may still further include, after storing the signal representative of the imminent risk, determining a materialized risk, the materialized risk being determined using the risk-identification-algorithm 140, the materialized risk being associated with the imminent risk; generating a signal representative of the materialized risk; and storing the signal representative of the materialized risk in the electronic memory 150.

Desirably, the corpus 110 is digital. The corpus 110 may include, but is not limited to, news; financial information, including but not limited to stock price data and its standard deviation (volatility); governmental and regulatory reports, including but not limited to government agency reports, regulatory filings such as tax filings, medical filings, legal filings, Food and Drug Administration (FDA) filings, and Securities and Exchange Commission (SEC) filings; private entity publications, including but not limited to annual reports, newsletters, advertising and press releases; blogs; web pages; event streams; protocol files; status updates on social network services; emails; Short Message Services (SMS); instant chat messages; Twitter tweets; and/or combinations thereof.

The risk-identification-algorithm 140 may be based upon various factors and/or criteria. For example, the risk-identification-algorithm 140 may be based upon, but is not limited to, a set of terms statistically associated with risk; a temporal factor; a set of customized criteria; and combinations thereof. The set of customized criteria may include and/or take into account, for example, an industry criterion, a geographic criterion, a monetary criterion, a political criterion, a severity criterion, an urgency criterion, a subject matter criterion, a topic criterion, a set of named entities, and combinations thereof.
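By way of non-limiting illustration, the statistical association between a term and the word “risk” may be measured with Pointwise Mutual Information, PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), where the probabilities are estimated here as the fraction of sentences containing the term(s). The four-sentence corpus, the sentence-level counting and the function name are assumptions of this sketch.

```python
import math

# Hypothetical toy corpus; each sentence is treated as one observation.
SENTENCES = [
    "bankruptcy risk grows",
    "bankruptcy risk looms",
    "weather was mild",
    "risk of weather delays",
]


def pmi(term, anchor, sentences=SENTENCES):
    """Sentence-level PMI between a term and an anchor word such as 'risk'."""
    bags = [set(s.split()) for s in sentences]
    n = len(bags)
    p_term = sum(term in b for b in bags) / n
    p_anchor = sum(anchor in b for b in bags) / n
    p_joint = sum(term in b and anchor in b for b in bags) / n
    if p_joint == 0:
        return float("-inf")  # the two words never co-occur
    return math.log2(p_joint / (p_term * p_anchor))


pmi_bankruptcy = pmi("bankruptcy", "risk")  # log2((2/4) / ((2/4) * (3/4))) = log2(4/3)
pmi_weather = pmi("weather", "risk")        # log2((1/4) / ((2/4) * (3/4))) = log2(2/3)
```

A positive value indicates that the term co-occurs with “risk” more often than chance would predict, so terms could be retained for the set when their PMI exceeds a chosen threshold.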

In one aspect of the present invention, the risk-identification-algorithm 140 may be based upon a set of source ratings. As used herein, the phrase “source ratings” refers to the rating of sources, for example, but not limited to, relevance, reliability, etc. The set of source ratings may have a one to one correspondence with a set of sources. The set of sources may serve as a source of information on which the corpus 110, 210 is based. The set of source ratings may be modified based upon an imminent risk, a materialized risk, and combinations thereof.
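By way of non-limiting illustration, a set of source ratings in one to one correspondence with a set of sources may be kept as a mapping and adjusted when a risk reported by a source does or does not materialize. The update rule below (moving the rating a fixed fraction of the way toward 1 or 0) is purely an assumption of this sketch, as are the source names, the starting ratings and the function name; the description does not prescribe any particular rule.

```python
def update_rating(ratings, source, confirmed, rate=0.2):
    """Move a source's rating toward 1.0 if its risk was confirmed, else toward 0.0."""
    target = 1.0 if confirmed else 0.0
    ratings[source] += rate * (target - ratings[source])
    return ratings[source]


# Hypothetical sources with neutral starting ratings.
ratings = {"newswire": 0.5, "blog": 0.5}
raised = update_rating(ratings, "newswire", confirmed=True)   # risk materialized
lowered = update_rating(ratings, "blog", confirmed=False)     # risk did not materialize
```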

The method of the present invention may further include transmitting the signal representative of the set of prerequisite risks, transmitting the signal representative of the imminent risk, transmitting the signal representative of the materialized risk, and combinations thereof. Moreover, the present invention may further include providing a web-based risk alerting service using at least one of the signal representative of the set of risks, the signal representative of the imminent risk, the signal representative of the materialized risk, and combinations thereof.

In another aspect of the present invention a computing device 120, as depicted in FIG. 2, may include an electronic memory 150; and a risk-identification-algorithm 140 based, at least in part, on the set of risk-indicating patterns associated with a corpus stored in the electronic memory 150. A processor (not shown) may be used to run the algorithm 140 on the computing device 120. The computing device 120 may include a computer interface 170, which is depicted as, but is not limited to, a keyboard, for querying the risk-identification-algorithm 140. The computing device 120 may include a display 160 for receiving a signal from the electronic memory 150 and for displaying risk alerts from the risk-identification-algorithm 140.

In another aspect of the present invention, a computer system 500, as depicted in FIG. 14, is provided for alerting a user of risks. The system 500 may include a computing device 520 having an electronic memory 550 and a risk-identification-algorithm 540 based, at least in part, on the set of risk-indicating patterns 580 associated with a corpus 110 stored in the electronic memory 550. A processor (not shown) may be used to run the algorithm 540 on the computer device 520. The system 500 may further include a user interface 580 for querying the risk-identification-algorithm 540 and for receiving a signal from the electronic memory 550 of the computing device 520 for alerting a user of risks. The user interface 580 may include, but is not limited to, a computer, a television, a portable media device, and/or a web-enabled device, such as a cellular phone, a personal data assistant, and the like.

While the invention has been described by reference to certain preferred embodiments, it should be understood that numerous changes could be made within the spirit and scope of the inventive concept described. Accordingly, it is intended that the invention not be limited to the disclosed embodiments, but that it have the full scope permitted by the language of the following claims.

Claims

1. A computer implemented method comprising: providing a set of risk-indicating patterns on a computing device; querying a corpus using the computing device to identify a set of potential risks by using a risk-identification-algorithm based, at least in part, on the set of risk-indicating patterns associated with the corpus; comparing the set of potential risks with the risk-indicating patterns to obtain a set of prerequisite risks; generating a signal representative of the set of prerequisite risks; and storing the signal representative of the set of prerequisite risks in an electronic memory.
2. The method of claim 1 further comprising: determining an imminent risk from the prerequisite risks, the imminent risk being determined using the risk-identification-algorithm, the imminent risk being associated with at least one risk from the set of prerequisite risks; generating a signal representative of the imminent risk; and storing the signal representative of the imminent risk in the electronic memory.
3. The method of claim 1 further comprising: after storing the signal representative of the set of prerequisite risks, determining a materialized risk, the materialized risk being determined using the risk-identification-algorithm, the materialized risk being associated with the set of risks; generating a signal representative of the materialized risk; and storing the signal representative of the materialized risk in the electronic memory.
4. The method of claim 2 further comprising: after storing the signal representative of the imminent risk, determining a materialized risk, the materialized risk being determined using the risk-identification-algorithm, the materialized risk being associated with the imminent risk; generating a signal representative of the materialized risk; and storing the signal representative of the materialized risk in the electronic memory.
5. The method of claim 1 wherein the corpus is digital.
6. The method of claim 5 wherein the corpus comprises news.
7. The method of claim 5 wherein the corpus comprises financial information.
8. The method of claim 5 wherein the corpus comprises blogs.
9. The method of claim 5 wherein the corpus comprises event streams.
10. The method of claim 5 wherein the corpus comprises protocol files.
11. The method of claim 5 wherein the corpus comprises status updates on social network services.
12. The method of claim 5 wherein the corpus comprises emails.
13. The method of claim 5 wherein the corpus comprises Short Message Service (SMS).
14. The method of claim 5 wherein the corpus comprises instant chat messages.
15. The method of claim 5 wherein the corpus comprises Twitter tweets.
16. The method of claim 1 wherein the risk-identification-algorithm is based upon a set of terms statistically associated with risk.
17. The method of claim 1 wherein the risk-identification-algorithm is based upon a temporal factor.
18. The method of claim 1 wherein the risk-identification-algorithm is based upon a set of customized criteria.
19. The method of claim 18 wherein the set of customized criteria comprises an industry criterion.
20. The method of claim 18 wherein the set of customized criteria comprises a geographic criterion.
21. The method of claim 18 wherein the set of customized criteria comprises a monetary criterion.
22. The method of claim 18 wherein the set of customized criteria comprises a political criterion.
23. The method of claim 18 wherein the set of customized criteria takes into account a severity criterion.
24. The method of claim 18 wherein the set of customized criteria takes into account an urgency criterion.
25. The method of claim 18 wherein the set of customized criteria takes into account a topic criterion.
26. The method of claim 18 wherein the set of customized criteria takes into account a set of named entities.
27. The method of claim 4 wherein the risk-identification-algorithm is based upon a set of source ratings, the set of source ratings having a one to one correspondence with a set of sources, the set of sources serving as a source of information on which the corpus is based.
28. The method of claim 27 further comprising modifying the set of source ratings based upon the imminent risk.
29. The method of claim 27 further comprising modifying the set of source ratings based upon the materialized risk.
30. The method of claim 1 further comprising transmitting the signal representative of the set of prerequisite risks.
31. The method of claim 2 further comprising transmitting the signal representative of the imminent risk.
32. The method of claim 4 further comprising transmitting the signal representative of the materialized risk.
33. The method of claim 4 further comprising providing a web-based risk alerting service using at least one of the signal representative of the set of risks, the signal representative of the imminent risk, and the signal representative of the materialized risk.
34. The method of claim 1 wherein the set of risk-indicating patterns is obtained from past Securities and Exchange Commission (SEC) filings.
35. A computing device comprising: an electronic memory; and a risk-identification-algorithm based, at least in part, on the set of risk-indicating patterns associated with a corpus stored in the electronic memory.
36. The computing device of claim 35, further comprising a computer interface for querying the risk-identification-algorithm.
37. The computing device of claim 35, further comprising a display for receiving a signal from the electronic memory and for displaying risk alerts from the risk-identification-algorithm.
38. A risk alerting system comprising: a computing device comprising an electronic memory and a risk-identification-algorithm based, at least in part, on the set of risk-indicating patterns associated with a corpus stored in the electronic memory; a user interface for querying the risk-identification-algorithm and for receiving a signal from the electronic memory of the computing device for alerting a user of risks.
39. The method of claim 5, wherein the corpus comprises legal information.
40. The method of claim 39, wherein the legal information includes bankruptcy and default filings.
