The embodiments discussed relate to computer implemented big data (large and complex data set), information and knowledge processing.
The ability to extract knowledge from large and complex collections of digital data and information, as well as the utilization of that information and knowledge, needs to be improved.
According to an aspect of an embodiment, a computer system including at least one computer is configured to generate specification concept graphs of nodes spec1, spec2, . . . , specm including concept nodes and relation nodes according to at least one of a plurality of digitized data from user input from a plurality of computerized data sources d1, d2, . . . , dl forming a first set of evidences U; generate concept graphs of nodes cα1, cα2, . . . , cαn including concept nodes and relation nodes for a corresponding obtained plurality of IKs α1, α2, . . . , αn forming a second set of evidences U; select a subset of concept graphs of nodes cαi
These and other embodiments, together with other aspects and advantages which will be subsequently apparent, reside in the details of construction and operation as more fully hereinafter described and claimed, reference being had to the accompanying drawings forming a part hereof, wherein like numerals refer to like parts throughout.
The embodiments described in the attached description entitled “Augmented Exploration for Big Data and Beyond” can be implemented as an apparatus (a machine) that includes processing hardware configured, for example, by way of software executed by the processing hardware and/or by hardware logic circuitry, to perform the described features, functions, operations, and/or benefits.
Big Data refers to large and complex data sets, for example, data from at least two or more dynamic (i.e., modified in real-time in response to events, or updating) heterogeneous data sources (i.e., different domain and/or similar domain but different sources) which are stored in and retrievable from computer readable storage media. According to an aspect of an embodiment, data can be text and/or image documents, data queried from a database, or a combination thereof. The embodiments provide predictive analytics, user behavior analytics, or certain other advanced data analytics to extract value from data. More particularly, Big Data refers to the problem whereby the processing of large, complex, dynamic, and noisy data overwhelms current state-of-the-art data processing software, thus limiting software applications to achieving only limited and/or simplistic tasks. Advanced tasks involving more complex data analytics, such as semantics extraction or complex structure elicitation, are in the purview of Big Data and in the search for innovative solutions, algorithms, and software application development.
According to an aspect of an embodiment, the data sources are interconnected computing devices or objects in heterogeneous domains (e.g., computers, vehicles, buildings, persons, animals, plants), as embedded with electronics, software, and sensors to transceive data over data networks including the Internet (Internet of Things (IoT)). According to another aspect of an embodiment, the data sources are Website content, including archived Website content.
Section 1. Augmented Exploration (A-Exploration)
The embodiments described provide a new and innovative approach—Augmented Exploration or A-Exploration—for working with Big Data, for example, to realize the “The White House Big Data Initiative” and more by utilizing Big Data's own power and exploiting its various unique properties. A-Exploration is a Comprehensive Computational Framework for Continuously Uncovering, Reasoning and Managing Big Data, Information and Knowledge. In addition to the built-in next generation real-time search engine, A-Exploration has the capabilities to continuously uncover, track, understand, analyze, manage, and/or utilize any desired information and knowledge, as well as oversee, regulate and/or supervise the development, progression and/or evolution of that information and knowledge.
A-Exploration supplements, extends, expands and integrates, among other things, the following related technologies:
The integrations and extensions of the above technologies, combined with new technologies described below:
enable and empower A-Exploration to create and establish diverse methods, functionalities, and capabilities to simplify and solve virtually all problems associated with Big Data, Information and Knowledge, including but not limited to:
Depending on the needs, other capabilities of A-Exploration may have to be supplemented, suspended, combined and/or adjusted.
The solutions and results mentioned above require not only the extraction of enormous amounts of up-to-the-minute information, but also instantaneous analysis of the information obtained. Thus, new paradigms and novel approaches are necessary. One new paradigm includes not only the next generation scalable dynamic computational framework for large-scale information and knowledge retrieval in real-time, but also the capability to immediately and seamlessly analyze the partial and piecemeal information and knowledge acquired. Moreover, there is a need for a deep understanding of the different information and knowledge retrieved to perform the other required operations. Depending on the applications, A-Exploration can provide the necessary solution, or simply provide the necessary information, knowledge and analyses to assist the decision makers in discovering the appropriate solutions and directions.
An example of a Big Data Research and Development Initiative is concerned with:
A new and innovative process—Augmented Exploration or A-Exploration—is described for working with Big Data to realize the above initiative and more by utilizing Big Data's own power and exploiting its various unique properties. A-Exploration is a Comprehensive Computational Framework for Continuously Uncovering, Reasoning and Managing Big Data, Information and Knowledge. In addition to the built-in next generation real-time search engine, A-Exploration has the capabilities to continuously uncover, track, understand, analyze, manage, and/or utilize any desired information and knowledge, as well as oversee, regulate and/or supervise the development, progression and/or evolution of that information and knowledge. This is one of the most challenging and all-encompassing aspects of Big Data, and is now possible due to the advances in information and communication infrastructures and information technology applications, such as mobile communications, cloud computing, automation of knowledge work, the Internet of Things, etc.
A-Exploration allows organizations and policy makers to address and mitigate considerable challenges to capture the full benefit and potential of Big Data and Beyond. It contains many sequential and parallel phases and/or sub-systems required to control and handle its various operations. Individual phases, such as extracting, uncovering and understanding, have been used in other areas and examined in detail using various approaches and tools, especially when the information is static and need NOT be acquired in real-time. However, overseeing and providing a possible roadmap for the management, utilization and projection of the outcomes and results obtained are regularly and primarily performed by humans. Comprehensive formulations and tools to automate and integrate the latter are not currently available. Since enormous amounts of information are extracted and made available to the human in the loop, information overload is clearly a significant problem. Last, but not least, there exists no inclusive framework, at present, which considers all aspects of what is required, especially from a computational point of view, which makes automation not viable.
A-Exploration is vital and essential in practically any field that deals with information and/or knowledge, which covers virtually all areas in this Big Data era. It involves every area discussed in James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela Hung Byers, “Big data: The next frontier for innovation, competition, and productivity,” Technical report, McKinsey Global Institute, 2011, including health care, public sector administration, retail, manufacturing, and personal location data; and many other specific applications, such as dealing with unfolding stories, knowledge/information sharing and discovery, and personal assistance, to name just a few. As a matter of fact, A-Exploration provides a general platform, as well as diverse and compound methods and techniques, to fulfill the goal of the above White House Big Data Research and Development Initiatives and more.
To solve the problems related to the different applications of A-Exploration, an important initial phase requires the harnessing and extracting of relevant information from vast amounts of dynamic heterogeneous sources quickly and under the pressure of limited resources. Unfortunately, most standard information retrieval techniques, see discussion by Amit Singhal, “Modern information retrieval: A brief overview,” Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 24(4):35-43, 2001, do not have the necessary capabilities and/or may not be optimized for the specific requirements needed.
The representation of the data fragments (also referred to as ‘data nuggets’) extracted is another potential problem. Most widely used representations, such as at least one or any combination of keywords, relationships among the data fragments (see
There are many available techniques for story understanding, see discussion in Erik T. Mueller, “Story Understanding Resources,” <retrieved from Internet: http://xenia.media.mit.edu/˜mueller/storyund/storyres.html>. However, virtually all were intended for static events and stories, and are not suitable for dynamic and real-time information and knowledge. Moreover, in A-Exploration, what may be needed is deep understanding of the various scenarios and relationships, as well as possible consequences of the information and knowledge, depending on the applications.
One of the most difficult problems to overcome is how to manage, utilize and supervise the knowledge and insights obtained from the other phases to resolve the possible outcomes of the situations and to project into the future. Moreover, with sufficient information and knowledge of the situations, one may be able to help shape, influence, manipulate and/or direct how they will evolve/progress in the future. Furthermore, it may help and facilitate the discovery/invention of new and improved methods for solving the problem at hand. Currently, existing frameworks might not provide the users with the desired solutions, or even assist the intended users in making appropriate decisions, or provide guidance to change future statuses and/or courses in the desired/preferred directions.
Since typically everything might be performed under limited resources, allocating the resources, especially time and computing power, is another prime consideration. Besides the simple allocation strategies, such as random and round robin methods, other more sophisticated strategies can also be used in A-Exploration.
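As a non-limiting illustration, a round-robin allocation strategy, one of the simple strategies mentioned above, can be sketched as follows (the resource and task names are hypothetical):

```python
import itertools

def round_robin(resources, tasks):
    """Assign each task to a resource in cyclic order (round robin)."""
    assignment = {}
    cycle = itertools.cycle(resources)
    for task in tasks:
        assignment[task] = next(cycle)
    return assignment

# Example: three compute nodes, five retrieval tasks.
alloc = round_robin(["node1", "node2", "node3"],
                    ["t1", "t2", "t3", "t4", "t5"])
# t4 wraps around to node1, t5 to node2.
```

More sophisticated strategies would replace the cyclic iterator with, for example, a priority queue keyed on a relevance measure.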
Section 2. Information and Knowledge
Information and knowledge (IK) come in many different flavors, and have many different properties. Moreover, IKs may be static or dynamic. Dynamic IKs may evolve, expand and/or progress, and may be subject to modification and/or alteration. They may also be combined or integrated with other IKs. The desirability or significance of specific IKs may depend on the applications. Except for proven or well-known IKs, in general, the importance of an IK may depend on the following characteristics:
In LRIRE 102, the web, external databases, and other information and knowledge (IK) repositories serve as sources (102a, 102b, . . . , 102n) of inputs. The inputs include structured data (e.g. forms, spreadsheets, website content, databases, etc.) and free-form data (e.g. unstructured text, live news feeds, etc.), or a combination (e.g. meta-data tags on html documents containing free text). Inputs are retrieved and organized based on any one or combination of two factors: (i) target query or domain specification (information retrieval), and (ii) specific user interests and preferences (User Modeling (102u)) as specified by a user and/or by a computing device. Output is the organized information and knowledge (IK) according to the factors (i) and/or (ii).
At 102x, Concept Node Graph Generator—Each piece of IK from each IK source (102a, . . . 102n) is translated into a concept graph representing the semantics of the IK.
At 102y, Knowledge Fragmenter—The collection of concept graphs are used to construct knowledge fragments. Knowledge fragments identify value relationships between concepts found in the concept graphs.
At 104, Proprietary Hypothesis Plug-Ins—Information and knowledge fragments that are proprietary to the user or organization, which drive hypothesis formation and shape the reasoning and analysis of A-Exploration, could be injected into the information/knowledge bases in 106.
At 106, Knowledge-Base Injector—The fragments are added into existing (or possibly empty) data/knowledge bases of varying forms (e.g. Relational DBs (106b), BKBs (106a), AKBs (106c), etc.). Injection translates the fragments into the appropriate database format. Each information/knowledge base can be directly accessed from 108 and 110.
At 108, Knowledge Completer—All/some IK is extracted from the information/knowledge bases in 106. Unification processes (with Tags (108a—
At 110, Knowledge Augmenter—Inspects information/knowledge bases and generates new knowledge through the Projector/Forecaster (110a
At 112, Augmented Analyzer—Augmented analysis can produce new or existing IK, such as through the Deep Comprehender (112a), the Explorer of Alternative Outcome (112b), and the Missing Link Hypothesizer (112d), and can detect unexpected or emergent phenomena, such as through the Emergence Detector (112c), as well as other innovative technologies unique to Augmented Exploration. The output provides new augmented analysis which is also injected into the information/knowledge bases and to the Augmented Supervisor (100) for evolving stories and continuous analyses.
At operation 2, select a subset of concept graphs of nodes cαi
At operation 3, the at least one computer is further configured to create or add into at least one among a plurality of knowledge-bases (KBs) for the corresponding knowledge fragments obtained by creating objects in the form ω=E→A from the concept fragments; determining relationship constraints κ in the form of set relations among a plurality of subsets of evidences E for a plurality of the objects ω. At operation 4, an authorized proprietary hypothesis secured plug-in may provide other knowledge fragments.
At operation 5, any one or combination of knowledge completion functions of unification, including unification with tags and sans inconsistencies, is performed based upon the created objects ω so that a validity (v) and a plausibility (p) based upon atomic propositions among the rules A are computed for each object ω=E→A. Other knowledge augmenting functions based upon the created objects ω include, at operation 7, generating new knowledge through the Projector/Forecaster (110a—
The plurality of data sources 100 (100a . . . n) store information and knowledge (IK) which are dynamic (i.e., modified in real-time in response to events, or updating) heterogeneous data (i.e., different domain and/or similar domain but different sources) in the form of text and/or image. In the case of ‘knowledge,’ the information includes meta data or semantic data indicative of or representing target knowledge. The concept graph generator 102 generates concept node graphs.
To facilitate continuous uncovering and tracking, the desired IKs are normally represented using concept graphs. In a non-limiting example,
A concept graph (CG) is a directed acyclic graph that includes concept nodes and relation nodes. Consistency between two concept graphs, q and d, can be determined by a consistency measure:
cons(q,d)=n/(2*N)+m/(2*M), (1)
where n, m are the number of concept and relation nodes of q matched in d, respectively, and N, M are the total number of concept and relation nodes in q, respectively. If a labeled concept node c occurs in both q and d, then c is called a match. Two labeled relation nodes are said to be matched if and only if at least one of their concept parents and one of their concept children are matched and they share the same relation label. In comparing two CGs that differ significantly in size (for example, a CG representing an entire document and another CG representing a user's query), the number of concept/entity nodes and relation nodes in the larger CG, instead of the total number of nodes in both CGs, is used.
Equation 1 could be modified whenever necessary to provide more efficient measures to prioritize the resource allocations. For example, an inconsistency measure could be defined as follows:
incons(q,d)=ñ/(2*N)+m̃/(2*M), (2)
where ñ, m̃ are the number of concept and relation nodes of q with no match in d, respectively, and N, M are the total number of concept and relation nodes in q, respectively. The priority measure is then given by:
prty(q,d)=cons(q,d)−incons(q,d). (3)
By working strictly with labeled graphs, instead of general graph isomorphism, problems associated with computational complexity are avoided.
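The measures of Equations (1)-(3) can be sketched as follows, under the simplifying assumption that a relation is represented as a (parent, label, child) triple and matches only an identical triple in d (the example graphs are illustrative only):

```python
def cons(q_concepts, q_relations, d_concepts, d_relations):
    """Consistency measure, Eq. (1): fraction of q's concept and
    relation nodes matched in d."""
    n = len(q_concepts & d_concepts)        # matched concept nodes
    m = len(q_relations & d_relations)      # matched relation triples
    N, M = len(q_concepts), len(q_relations)
    return n / (2 * N) + m / (2 * M)

def incons(q_concepts, q_relations, d_concepts, d_relations):
    """Inconsistency measure, Eq. (2): unmatched fractions."""
    n_t = len(q_concepts - d_concepts)      # unmatched concept nodes
    m_t = len(q_relations - d_relations)    # unmatched relation triples
    N, M = len(q_concepts), len(q_relations)
    return n_t / (2 * N) + m_t / (2 * M)

def prty(q, d):
    """Priority measure, Eq. (3)."""
    return cons(*q, *d) - incons(*q, *d)

q = ({"dog", "cat", "bone"},
     {("dog", "chases", "cat"), ("dog", "eats", "bone")})
d = ({"dog", "cat"}, {("dog", "chases", "cat")})
# cons = 2/6 + 1/4, incons = 1/6 + 1/4, prty = 1/6
```

With the identical-triple simplification cons and incons sum to one, so prty reduces to 2·cons − 1; the full node-matching rule of the text would relax that.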
It is assumed that each IK has at least two time-stamps: IK time-stamp, i.e., when the IK first appeared (initial or first version), and a representation time stamp, i.e., when the IK was last represented or modified (updated, second (new or subsequent) versions) in the system, also referred to as data version events. The latter is needed in connection with allocation strategies. For the sake of quick referencing, in addition to the representation time-stamp, a copy of the IK time-stamp is also available in the representation.
If the IKs come from the same or affiliated sources, then all those IKs are combined and all contradictions resolved in favor of the later or latest versions.
Let the combined IK α consist of IKs α1, α2, . . . , αn from the same or affiliated sources with IK time-stamps t1, t2, . . . , tn, respectively, where t1≤t2≤ . . . ≤tn. Then we shall say that α has IK time-stamp tn. At time t, the diversity D(α, t) of α is the average of all e^(t−ti), i=1, 2, . . . , n.
Any IK which is not a part of a combined IK is a stand-alone IK. For a stand-alone IK α with IK time-stamp t1, at time t, the diversity D(α, t) and the volume V(α, t) of α are both equal to e^(t−t1).
The e^(t−ti)
Let α1 and α2 be two IKs where α1 and α2 need not be distinct. The consistency measure m(α1, α2, t) between α1 and α2 at time t is given by:
m(α1,α2,t)=cons(r1,r2)×e^(t−min(t1,t2)),
where r1 and r2 are the latest representation of α1 and α2, respectively, and t1 and t2 are IK time-stamps of α1 and α2, respectively.
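A minimal sketch of this time-stamped consistency measure, taking the structural consistency value cons(r1, r2) of the two latest representations as a precomputed input:

```python
import math

def m(cons_r1_r2, t, t1, t2):
    """Time-stamped consistency measure between two IKs: structural
    consistency of their latest representations, weighted by the
    exponential factor of the equation above."""
    return cons_r1_r2 * math.exp(t - min(t1, t2))

# When t equals both IK time-stamps, the factor is e^0 = 1,
# so the measure reduces to the structural consistency itself.
value = m(0.5, 10.0, 10.0, 10.0)  # -> 0.5
```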
Let α be a (combined or stand-alone) IK.
The diversity and volume of an IK also measure the timeliness and up-to-the minute changes of the IK. Thus, diversity and volume can be used to measure the importance and/or the significance of the IKs as they progress or evolve.
One of the disadvantages of the diversity and volume discussed above is that it is biased against new IKs. Fortunately, this could be partially mitigated by the proper selection of the time unit so that new and recent IKs will automatically be assigned a proportionally larger diversity and volume.
For any IK α, there could be a lot of extraneous noise associated with α. To eliminate that noise and to cut down on the required computations, instead of including all other IKs, the diversity and volume are determined only by those other IKs whose consistency measures with α are above a certain threshold. We shall refer to these as the truncated diversity and volume.
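A sketch of the truncated diversity and volume, under the assumptions that each contributing IK carries the exponential weight e^(t−ti) from the earlier definitions, that diversity averages the weights, and that volume sums them:

```python
import math

def truncated_diversity_volume(t, related, threshold):
    """Truncated diversity and volume of an IK alpha at time t.
    `related` is a list of (ik_time_stamp, consistency_with_alpha)
    pairs; only IKs whose consistency measure exceeds the threshold
    contribute. Returns (diversity, volume)."""
    weights = [math.exp(t - ti) for ti, c in related if c > threshold]
    if not weights:
        return 0.0, 0.0
    return sum(weights) / len(weights), sum(weights)

# Only the first related IK passes the 0.5 consistency threshold.
D, V = truncated_diversity_volume(
    0.0, [(0.0, 0.9), (0.0, 0.1)], threshold=0.5)
```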
Since it is unlikely that one will be interested in all the IKs put forward by someone somewhere, it is advantageous to restrict the scope of the search. One such method is to use “watchwords” or “watch structures” to eliminate most of the unwanted IKs. (Watch structures are semantic structures consisting of watchwords, e.g., the query used in information retrieval. They can further sharpen the search for the desired IKs.) However, such screening could potentially overlook key IKs which may seem, or initially be, unrelated to the desired IKs. To alleviate the situation, one could widen the collection of watchwords or watch structures, as well as add watchwords or watch structures which may not be related but have a history of affecting and/or being affected by the IK in question. Historical semantic nets could be constructed in this regard based on historical connections among different ideas, concepts and processes.
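A minimal watchword screen might look as follows (the IK identifiers and texts are illustrative only; a watch-structure screen would match semantic structures rather than substrings):

```python
def watchword_screen(iks, watchwords):
    """Keep only the IKs whose text mentions at least one watchword."""
    return {ik_id: text for ik_id, text in iks.items()
            if any(w.lower() in text.lower() for w in watchwords)}

kept = watchword_screen(
    {"ik1": "Crude oil prices surged today.",
     "ik2": "Local bakery wins award."},
    ["oil", "energy"])
# kept retains only "ik1"
```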
According to the embodiments, metrics (Diversity and Volume) are provided to uncover and identify the desired IKs. However, to understand the significance and importance of these IKs, richer structural representations of the IKs are needed. Therefore, in the described framework, concept graphs are transformed into knowledge fragments, e.g., Bayesian Knowledge Fragments (BKF), Compound Bayesian Knowledge Fragments (CBKF), Augmented Knowledge Fragments (AKF), etc.
Section 3. Large-Scale Real-Time Information Retrieval and Extraction (LRIRE)
As mentioned above, one of the major challenges that need to be addressed is the harnessing and extracting of relevant knowledge and insights from vast amounts of dynamic heterogeneous sources quickly and under the pressure of limited resources. To meet this challenge, the capabilities offered by Large-scale Real-time Information Retrieval and Extraction (LRIRE) could be integrated with the other components of A-Exploration. However, the embodiments are not limited to LRIRE; other related computational frameworks, for example, Anytime Anywhere Dynamic Retrieval (A2DR), can be utilized in connection with A-Exploration described herein. LRIRE is an example of one of the next generation scalable dynamic computational frameworks for large-scale information and knowledge retrieval in real-time, which is described in an article by Eugene Santos, Jr., Eunice E. Santos, Hien Nguyen, Long Pan, and John Korah, “A Large-scale Distributed Framework For Information Retrieval In Large Dynamic Search Spaces,” Applied Intelligence, 35:375-398, 2011.
The LRIRE incorporates various state-of-the-art successful technologies available for large-scale data and information retrieval. It is built by supplementing, extending, expanding and integrating the technologies initiated in I-FGM (Information Foraging, Gathering and Matching), see discussion by Eugene Santos, Jr., Eunice E. Santos, Hien Nguyen, Long Pan, and John Korah, “A Largescale Distributed Framework For Information Retrieval In Large Dynamic Search Spaces,” Applied Intelligence, 35:375-398, 2011.
LRIRE's goal is to focus on the rapid retrieval of relevant information/documents from a dynamic information space and to represent the retrieved items in the most appropriate forms for future processing. With LRIRE, the information/documents in the search spaces are selected and processed incrementally using an anytime-anywhere intelligent resource allocation strategy, and the results are provided to the users in real-time. Moreover, LRIRE is capable of exploiting user modeling (UM) to sharpen the inquiry in an attempt to dynamically capture the target behavior of the users.
In LRIRE, the partial processing paradigm is enhanced by using highly efficient anytime-anywhere algorithms, which produce relevancy approximations proportional to the computational resources used. Anytime-anywhere algorithms were designed for both text and image documents: “Anytime refers to providing results at any given time and refining the results through time. Anywhere refers to incorporating new information wherever it happens and propagating it through the whole network.”
LRIRE is an intelligent framework capable of incrementally and distributively gathering, processing, and matching information fragments in large heterogeneous dynamic search spaces, using anytime-anywhere technologies, as well as converting from one representation to another when needed. The primary purpose of LRIRE is to assist the users in finding the pertinent information quickly and effectively, and to store it using the desired representations. Initially, the graphical structure of document graphs is ideal for anytime-anywhere processing as the overhead for adding new nodes and relations is minimal.
LRIRE is capable of handling retrieval of multiple data types, including unstructured texts (typical format for most databases/sources), images, signals, etc.
In LRIRE, the information acquired is initially represented as document graphs (DG). A DG is essentially a concept graph (CG). However, other representations, such as knowledge fragments, will be superimposed subsequently to make possible the necessary operations and enhance the performance of other aspects of LRIRE.
Multiple queries can proceed in parallel in LRIRE, but in this case, it is advisable to identify the queries.
LRIRE uses various common representations of information for heterogeneous data types. Thus, LRIRE provides a seamless integration of text, image and signals through various unifying semantic representation of contents.
Section 4. Bayesian Knowledge Base (BKB) and Compound Bayesian Knowledge Base (CBKB)
As mentioned above, the information and knowledge available in LRIRE are initially represented using CGs. To equip the accumulated knowledge with the ability to reason, the CGs may be transformed into knowledge fragments, e.g., Bayesian Knowledge Fragments (BKFs), Compound Bayesian Knowledge Fragments (CBKFs) and Augmented Knowledge Fragments (AKFs), if needed. BKFs, CBKFs and AKFs are subsets of Bayesian Knowledge Bases (BKBs), Compound Bayesian Knowledge Bases (CBKBs) and Augmented Knowledge Bases (AKBs), respectively.
A Bayesian Knowledge Base (BKB), see discussion by Eugene Santos, Jr. and Eugene S. Santos, “A Framework For Building Knowledge-bases Under Uncertainty,” Journal of Experimental and Theoretical Artificial Intelligence, 11:265-286, 1999, is a highly flexible, intuitive, and mathematically sound representation model of knowledge containing uncertainties.
To expand the usefulness of BKB in support of A-Exploration, a new object is developed and employed which is referred to as a Compound Bayesian Knowledge Base (CBKB). In a CBKB, each S-node may be assigned multiple values. Each value represents a certain aspect of the rule and is therefore capable of more accurately reflecting the different types of uncertainties involved.
For any given CBKB, depending on the needs, one can consider either just one specific value in the S-node, or a combination of a certain specific group of values in the S-node. In this case, the resulting structure is similar to a BKB. However, it may or may not be an actual BKB, since it may or may not satisfy the “exclusivity” constraint of BKB. As a matter of fact, many BKB-like objects could be derived from a single CBKB.
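One possible realization of a multi-valued S-node is sketched below; the aspect names and the product combination rule are hypothetical, chosen only to show how projecting onto one value, or combining a group of values, yields a BKB-like weight:

```python
class SNode:
    """Support node of a hypothetical CBKB: carries one value per
    uncertainty aspect instead of a single probability."""
    def __init__(self, values):
        self.values = dict(values)  # e.g. {"reliability": 0.9, ...}

    def project(self, aspect):
        # Collapse to a single BKB-like weight for one aspect.
        return self.values[aspect]

    def combine(self, aspects):
        # One possible combination rule: product over a chosen group.
        result = 1.0
        for a in aspects:
            result *= self.values[a]
        return result

s = SNode({"reliability": 0.9, "recency": 0.4})
```

Different projections of the same S-node give rise to the different BKB-like objects mentioned above.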
The fusion technology available in BKB is expanded for use in CBKB. This is essential in connection with the LRIRE cited above, which forms a major component of A-Exploration. This allows the representation of the information fragments obtained by LRIRE at different instances and with diverse queries to be combined to form a joint CBKB.
The properties and structures of CBKBs are further enriched by encoding and distinguishing the nodes in a CBKB according to the types of relations given in the concept or feature graphs. Besides the relation ‘is-a’, other relations exist between two concepts, such as ‘is-located’, ‘is-colored’, etc. For instance, using the relation ‘is-located’, we can view the node representing a location as a ‘location’ node.
Section 5. Augmented Knowledge Base (AKB) and Augmented Reasoning (AR). Augmented Knowledge Base (AKB) and Augmented Reasoning (AR) are discussed in U.S. Pat. No. 9,275,333, the content of which is incorporated herein by reference.
The following notations, definitions and results are described:
An Augmented Knowledge Base (AKB) (over ℒ and U) is a collection of objects of the form (E, A), also denoted by E→A, where A∈ℒ and E is the body of evidence that supports the proposition or rule A. E is a subset of some universal set U, and ℒ is a collection of propositions or rules A of a first order logic.
Various reasoning schemes can be used with AKBs, including Augmented Reasoning (AR), which is a new reasoning scheme, based on the Constraint Stochastic Independence Method, for determining the validity and/or plausibility of the body of evidence that supports any simple or composite knowledge obtainable from the rules/knowledge that is contained in the AKB, i.e., ℒ.
As a matter of fact, AKBs encompass most existing knowledge bases including their respective reasoning schemes; e.g., probabilistic logic, Dempster-Shafer theory, Bayesian Networks, Bayesian Knowledge Bases, etc. may be viewed or reformulated as special cases of AKBs. Moreover, AKBs and AR have pure probabilistic semantics and are therefore not subject to any anomalies found in most of the existing knowledge bases and reasoning schemes. AKBs and AR are not only capable of solving virtually all the problems mentioned above, they can provide additional capabilities related to A-Exploration, e.g., inductive inferences with uncertainties and/or incompleteness, extraction of new knowledge, finding the missing link, augmented relational databases, augmented deductive databases, augmented inductive databases, etc.
The A in (E→A) is a rule in κ, which is similar to a proposition or rule in a traditional knowledge base. On the other hand, the E in (E→A) is a set, which represents a body of evidence, and is not a rule.
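A minimal sketch of an AKB object, representing the rule A as an opaque string and the evidence E as a finite subset of a toy universal set U (a full system would use a logic AST for A and a measure over U):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AKBObject:
    """An AKB object (E, A), also written E -> A: a body of evidence E
    (a subset of the universal set U) supporting a rule A."""
    evidence: frozenset   # E, a subset of U
    rule: str             # A, kept opaque here

U = frozenset(range(10))  # toy universal set of evidence points
obj = AKBObject(frozenset({1, 2, 3}), "bird(x) -> flies(x)")
assert obj.evidence <= U  # E must be a subset of U
```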
Let κ be an AKB.
Let κ be an AKB. Ẽκ is the smallest collection of subsets of U that contains all the sets in Eκ and is closed under the complement, union, and intersection operators.
Measures (probabilistic or not) are usually associated with collections of evidence to specify their strength. Moreover, they are usually extended to cover Ẽκ. In the case where the measure is probabilistic, it can be interpreted as the probability that L is true.
Let κ be an AKB. κ̂ is defined recursively as follows:
κ̂ extends κ so that AKBs can deal with composite objects ω, associated with combinations of sets of evidences E in the knowledge base and combinations of rules in the knowledge base. Therefore, the embodiments utilize both composite evidences and composite rules to establish support for a target rule. Since κ is finite, so is κ̂. Members of κ̂ are referred to as composite objects.
∪ and ∩ denote ‘set union’ and ‘set intersection’, respectively, while ∨ and ∧ denote ‘logical or’ and ‘logical and’, respectively.
Let κ be an AKB.
If ω is a composite object, l(ω) denotes the composite set of evidences associated with ω, and r(ω) denotes the composite rule associated with ω. This enables κ̂ to extract the rule portion and the associated set of evidences portion from ω, where ω represents a composite object of a plurality of E→A.
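Under one plausible reading (the exact composition rules follow U.S. Pat. No. 9,275,333), conjunction of composite objects intersects their evidence sets and conjoins their rules, while disjunction unites and disjoins; l(ω) and r(ω) then simply project the two portions:

```python
def conj(omega1, omega2):
    """Composite conjunction: evidence sets intersect, rules conjoin
    (assumed reading of the composition in kappa-hat)."""
    e1, r1 = omega1
    e2, r2 = omega2
    return (e1 & e2, f"({r1}) and ({r2})")

def disj(omega1, omega2):
    """Composite disjunction: evidence sets unite, rules disjoin."""
    e1, r1 = omega1
    e2, r2 = omega2
    return (e1 | e2, f"({r1}) or ({r2})")

def l(omega):  # evidence portion of a composite object
    return omega[0]

def r(omega):  # rule portion of a composite object
    return omega[1]

w = conj((frozenset({1, 2}), "A"), (frozenset({2, 3}), "B"))
# l(w) == frozenset({2}); r(w) == "(A) and (B)"
```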
Let κ be an AKB, G⊂U and L∈ℒ. G→κL if and only if there exists ω∈κ̂ such that l(ω)=G and r(ω)⇒L.
Let κ be an AKB and L∈ℒ. Then Σκ(L)={ω∈κ̂ | r(ω)⇒L}, and σκ(L)=∪ω∈Σκ(L) l(ω).
σκ(L) is the support of L with respect to κ, while σ̄κ(L) is the plausibility of L with respect to κ.
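The support σκ(L) can be sketched as the union of the evidence portions of all composite objects whose rule portion entails L; entailment is reduced here to plain equality for illustration, and the contents of κ̂ are hypothetical:

```python
def support(khat, entails, L):
    """sigma_kappa(L): union of the evidence sets of every composite
    object whose rule portion entails L. `entails(r, L)` stands in
    for a real entailment test."""
    sigma = frozenset()
    for evidence, rule in khat:
        if entails(rule, L):
            sigma = sigma | evidence
    return sigma

khat = [(frozenset({1, 2}), "A"),
        (frozenset({3}), "A"),
        (frozenset({4}), "B")]
sup_A = support(khat, lambda r, L: r == L, "A")
# sup_A == frozenset({1, 2, 3})
```

Given a measure over the evidence universe, the validity of L would then be the measure of this union.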
Various algorithms for determining σκ(L), some polynomial and some non-polynomial, may be provided.
Let κ1 and κ2 be AKBs. κ1 and κ2 are equivalent if and only if for all L∈ℒ, σκ1(L)=σκ2(L).
Let κ be an AKB. κ is consistent if and only if G=Ø, subject to κ, whenever G→κF.
In general, consistency imposes certain conditions that the E's must satisfy.
Let κ be an AKB, G⊂U and L∈L.
if and only if there exists L0∈L such that L⇒L0 and G→κL0.
Let κ be an AKB and L∈L. Φκ(L)={ω∈{circumflex over (κ)}|L⇒r(ω)}.
Let κ be an AKB and L∈L. Then φκ(L)=∩ω∈Φκ(L)l(ω).
Let κ be an AKB.
Let κ be an AKB and L∈L. φκ(L)=σκ(L) and φκ(L)=σκ(L).
Let κ1 and κ2 be AKBs. κ1 and κ2 are i-equivalent if and only if for all L∈L, φκ1(L)=φκ2(L).
Let κ be an AKB. κ is i-consistent if and only if G=U, subject to κ, whenever G→κT.
An AKB κ is disjunctive if and only if, for every (E→A)∈κ, A is a conjunction of disjunctions of atomic propositions.
It is assumed that the collection of atomic propositions involved is the smallest collection that satisfies the above definition.
Let κ be an AKB. κ is irreducible if and only if κ satisfies the following conditions:
Let κ be an AKB. If ω∈{circumflex over (κ)}, then ρ(ω) is the collection of all atomic propositions that occur in r(ω). Moreover, let ω1, ω2∈{circumflex over (κ)}. If both are atomic disjunctions over κ, and there exists an atomic proposition P such that P occurs in ω1 and P′ occurs in ω2, then ω1⋄ω2 merges ω1 and ω2, with a single occurrence of P and of P′ removed.
Let La be the collection of all atomic propositions in L.
It can be shown that ⊂⊂⊂ and ⊂⊂⊂.
Let S⊂La. S is simple if and only if for every P∈La, not both P and P′ are in S.
Let L∈L. L is ∨-simple if and only if, for some simple S⊂La, L is the disjunction of the propositions in S.
Let L∈L. L is ∧-simple if and only if, for some simple S⊂La, L is the conjunction of the propositions in S.
In what follows, without loss of generality, we shall assume that L is ∨-simple whenever L is expressed as a disjunction of atomic propositions, and L is ∧-simple whenever L is expressed as a conjunction of atomic propositions.
Disjunctive AKBs are introduced in U.S. Pat. No. 9,275,333, the content of which is incorporated herein by reference, and an algorithm to transform any AKB into a disjunctive AKB was given there. If κ is an AKB, then κD denotes the corresponding disjunctive AKB.
Let κ be an AKB.
Algorithm (Disjunctive AKB) Given an AKB κ. Return κ0.
Let κ be a disjunctive AKB.
Let Ω=Ø⊂{circumflex over (κ)}. Then ∧ω∈Ωω=(U→T) and ∨ω∈Ωω=(Ø→F). Let κ be an irreducible AKB, ω1, ω2∈κ and P∈κ. P is complemented with respect to ω1 and ω2 if and only if either P∈ρ(ω1) and P′∈ρ(ω2), or P′∈ρ(ω1) and P∈ρ(ω2).
Let κ be a disjunctive AKB, ω1, ω2 ∈κ and P∈κ. P is complemented with respect to ω1 and ω2 if and only if either P∈ρ(ω1) and P′∈ρ(ω2), or P′∈ρ(ω1) and P∈ρ(ω2).
Let κ be a disjunctive AKB and ω1, ω2 ∈κ.
The ⋄ operator defined above is a powerful unification operator which is capable of handling uncertain knowledge in a natural manner. Therefore, it is a true generalization of the traditional unification operators.
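A minimal sketch of the ⋄ operation on a disjunctive AKB, with clauses represented as frozensets of literals (a trailing apostrophe marks the negated atom) and evidence as frozensets; intersecting the evidence sets reflects the conjunction of the two bodies of evidence. Names and representation are illustrative assumptions, not the patent's implementation.

```python
def neg(lit):
    # P <-> P' (a trailing apostrophe marks the negated atom)
    return lit[:-1] if lit.endswith("'") else lit + "'"

def diamond(w1, w2, P):
    """omega1 <> omega2 on the complemented atom P: intersect the
    evidence sets and remove one occurrence of P and of P'."""
    (E1, D1), (E2, D2) = w1, w2
    assert P in D1 and neg(P) in D2
    return (E1 & E2, (D1 - {P}) | (D2 - {neg(P)}))

w1 = (frozenset({1, 2}), frozenset({"A", "B"}))   # E1 -> A v B
w2 = (frozenset({2, 3}), frozenset({"A'", "C"}))  # E2 -> A' v C
E, D = diamond(w1, w2, "A")
print(sorted(E), sorted(D))   # [2] ['B', 'C']
```

The result (E1∩E2 → B∨C) carries uncertainty through the evidence-set intersection, which is what distinguishes ⋄ from purely logical resolution.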
Section 6. New Unification Algorithms and New Applications of Uncertainties
In what follows, it is assumed that κ is an AKB. AKB is further described in U.S. Pat. No. 9,275,333.
However, most of the time, instead of dealing with κ itself, we shall be dealing with κD. κD is the disjunctive knowledge base associated with κ. Nevertheless, it is essential that κ be kept intact, especially if inductive reasoning is part of the solutions.
The unification algorithm below is different from related unification algorithms, including the unification algorithms discussed in U.S. Pat. No. 9,275,333, because, according to an aspect of an embodiment, it is determined whether any atomic proposition A is selectable; if so, the two sets Q(A) and Q(A′) are constructed, and the elements in Q(A) are combined with the elements in Q(A′) using the ⋄ operation. The results are added into the knowledge base while the original elements in Q(A) and Q(A′) are discarded.
Algorithm 6.1. (Unification).
According to an aspect of an embodiment, unification refers to determining of the validity (in the case of deductive reasoning) of a target rule L.
More specifically, given an AKB κ and L∈L, output δκ(L), the collection of all l(ω) where ω∈{circumflex over (κ)} and r(ω)=F, as follows:
The above process may be modified for use as an Anytime-Anywhere algorithm. It does not require that κL be determined completely in advance. It may be stopped anytime by executing Step 16, and continued at the place where it was stopped when more time is allocated. In addition, the algorithm can accept any addition to κL while continuing the unification process. Finally, other heuristics may be used instead of or in addition to Step 6 given in the algorithm to improve the results, Anytime-Anywhere or otherwise.
Given an AKB κ and L∈L, σκ(L)=∪G∈δκ(L)G.
In Algorithm 6.1 above, let L=F. Then we obtain the value of σκ(F). Thus, the algorithm can determine whether κ is consistent or not. As a matter of fact, if E=Ø for every E∈δκ(F), then κ is consistent; otherwise, κ is not consistent.
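The consistency test can be sketched as a brute-force saturation: repeatedly combine clauses on complemented atoms and collect the evidence sets of any empty clauses produced (these form δκ(F)). This is a simplified, hypothetical stand-in for Algorithm 6.1, without its selection heuristics.

```python
def neg(lit):
    return lit[:-1] if lit.endswith("'") else lit + "'"

def delta_F(kb):
    """Evidence sets supporting F: saturate the clause set under the
    diamond combination and collect evidence of empty clauses."""
    clauses, out, changed = set(kb), set(), True
    while changed:
        changed = False
        for (E1, D1) in list(clauses):
            for (E2, D2) in list(clauses):
                for P in D1:
                    if neg(P) in D2:
                        new = (E1 & E2, (D1 - {P}) | (D2 - {neg(P)}))
                        if not new[1]:
                            if new[0] not in out:
                                out.add(new[0]); changed = True
                        elif new not in clauses:
                            clauses.add(new); changed = True
    return out

kb = {(frozenset({1, 2}), frozenset({"A"})),    # E1 -> A
      (frozenset({2, 3}), frozenset({"A'"}))}   # E2 -> A'
print(delta_F(kb) == {frozenset({2})})  # True: nonempty evidence
                                        # supports F, so kappa is
                                        # NOT consistent
```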
Since σκ(F)⊂σκ(L) for every L∈L, the inconsistencies in κ will affect the unification process, including Algorithm 7.1. In the next section, we shall provide a new algorithm to perform unification while ignoring inconsistencies.
Due to the dual nature of deductive reasoning and inductive reasoning, the above unification method is applicable to both. In the latter case, the AKB κ must be kept intact for constructing its inductive counterpart, whose disjunctive form cannot be constructed directly from κD. The unification method can be used with either validity or plausibility measures.
Algorithm 6.1 is modified to introduce tags η to identify the sources of the results. In other words, the tag η(ω) indicates how the result ω was obtained within the AKB. For example, if ω1=(E1→A′∨B) and ω2=(E2→A) are in the knowledge base, then the tag for ω=(E1∩E2→B) corresponds to ω1 and ω2. According to an embodiment, information indicating a tag is associated with ω.
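A provenance sketch of the tagging idea: each derived object carries a tag naming the two objects it was combined from, mirroring the role of η(ω). The representation and names are hypothetical.

```python
def neg(lit):
    return lit[:-1] if lit.endswith("'") else lit + "'"

def combine_tagged(w1, w2, P):
    # Combine two tagged objects on complemented atom P; the derived
    # object's tag records its two parents.
    (E1, D1, t1), (E2, D2, t2) = w1, w2
    return (E1 & E2, (D1 - {P}) | (D2 - {neg(P)}), (t1, t2))

w1 = (frozenset({1, 2}), frozenset({"A'", "B"}), "w1")  # E1 -> A' v B
w2 = (frozenset({2, 3}), frozenset({"A"}), "w2")        # E2 -> A
E, D, tag = combine_tagged(w1, w2, "A'")
print(sorted(E), sorted(D), tag)  # [2] ['B'] ('w1', 'w2')
```

Nesting the parent tags in derived results yields a full derivation tree for any final answer.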
Algorithm 6.2. (Unification with Tag).
Given AKBs κ1, κ2, . . . , κn and L∈L, output δκ(L):
Observe that although Algorithm 6.2 is similar to Algorithm 6.1, tags are created only in the former.
Algorithm 6.3. (Unification sans Inconsistencies).
Given an AKB κ and L∈L, output δκ(L):
Let κ be a consistent AKB and L∈. Then σκ(L)=∪ω∈δi
The decomposition of any ω∈{circumflex over (κ)} which will be needed in subsequent sections is examined.
Let κ be an AKB, ω∈{circumflex over (κ)} and ωi∈κ for i=1, 2, . . . , n.
Let κ be an AKB and ω∈{circumflex over (κ)}. Let ωi∈κ for i=1, 2, . . . , n.
Let κ be an AKB. Every ω∈{circumflex over (κ)} can be put in minimal disjunctive-conjunctive form, as well as in minimal conjunctive-disjunctive form.
In view of the above Proposition, for simplicity and uniformity, unless otherwise stated, we shall assume that all ω∈{circumflex over (κ)} are in minimal disjunctive-conjunctive forms. Of course, we could have chosen minimal conjunctive-disjunctive forms.
Let κ be an AKB, and let ω1, ω2∈{circumflex over (κ)} be expressed in minimal disjunctive-conjunctive forms.
The same notations apply to minimal conjunctive-disjunctive forms.
The following Decomposition Rules are discussed in U.S. Pat. No. 9,275,333, the content of which is incorporated herein by reference.
Let κ be an AKB and L1, L2∈L.
Below, new extensions are provided of these rules:
Let κ be an AKB, and L1, L2∈L.
Let κ be an AKB and ω∈{circumflex over (κ)}. If ω is expressed in DNF over κ, where ω=ω1∨ω2∨ . . . ∨ωn, then {grave over (ω)} is obtained from ω by removing all conjunctions ωi from ω where r(ωi)≡F and l(ωi)≠Ø. {grave over (ω)} is called the purified ω. Moreover, {grave over (κ)}={{grave over (ω)}|ω∈{circumflex over (κ)}}.
Let κ be an AKB and L∈L. Then {grave over (σ)}κ(L)=∪ω∈δi
Since σκ(L)={grave over (σ)}κ(L) when κ is consistent, Algorithm 6.3 can be used in this case to compute σκ(L). Observe that Algorithm 6.3 is a faster unification algorithm than Algorithm 6.1.
Let κ be an AKB and L∈L.
From the definition, it follows that {grave over (Σ)}κ(L)⊂Σκ(L) and {grave over (Φ)}κ(L)⊂Φκ(L).
Let κ be an AKB. κ is d-consistent if and only if {grave over (σ)}κ(L)=σκ(L) for every L∈L.
The decomposition rules for {grave over (σ)}κ and {grave over (φ)}κ are given below: Let κ be an AKB and L1, L2∈L.
Section 7. Augmented Projection/Forecasting and Augmented Abduction
Deductive reasoning and inductive reasoning have been used to determine the validity and/or plausibility of any given proposition. In many applications, however, the desired proposition is not known in advance. Instead, one is given some target propositions and must determine which propositions can be deduced or induced from the AKB using these target propositions.
Projection and forecasting were used in the Subsection on Information and Knowledge for the management and supervision of the evolution of the IKs, and abduction was used for Analysis and Deep Understanding of IKs. We shall consider projection/forecasting and abduction in more general settings. We shall refer to the latter as Augmented Projection/Forecasting and Augmented Abduction—these form the basis for the Knowledge Augmenter (KA).
First, we shall introduce some new concepts and results. Moreover, we shall use them to design and develop various solutions for solving the Projection/Forecasting and Abduction problems, under different circumstances.
Let κ be an AKB, and L, L1, L2∈L.
DC and IC may be viewed as projection/forecasting of the given proposition using deductive and inductive reasoning, respectively. In other words, DC is deductive projection/forecasting while IC is inductive projection/forecasting. On the other hand, DA and IA may be viewed as abduction of the given proposition using deductive and inductive reasoning, respectively. In other words, DA is deductive abduction while IA is inductive abduction. Existing abductive reasoning discussed by Eugene Charniak and Drew McDermott, “Introduction to Artificial Intelligence,” Addison-Wesley, 1985, corresponds to deductive abduction given above.
Let κ be an AKB and L∈.
Let κ be an AKB and L∈.
We concentrate upon DC and DA. All the results obtained for DC and DA can then be transformed into IC and IA, respectively, by replacing κ with its inductive counterpart.
In general, construction of dCκ(L) and dAκ(L) can be done by adding U→L′ and U→L, respectively, and then performing unification.
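The add-and-unify construction can be sketched by reusing a saturation-style unification: to gather support for a target L, add (U→L′) to the clause set and collect the evidence sets that reach the empty clause. The helper below is a simplified, hypothetical stand-in for the full Algorithms 7.1/7.2.

```python
def neg(lit):
    return lit[:-1] if lit.endswith("'") else lit + "'"

def support_via_refutation(kb, target, U):
    """Add (U -> target') and saturate; evidence sets reaching the
    empty clause constitute the support gathered for target."""
    clauses = set(kb) | {(U, frozenset({neg(target)}))}
    out, changed = set(), True
    while changed:
        changed = False
        for (E1, D1) in list(clauses):
            for (E2, D2) in list(clauses):
                for P in D1:
                    if neg(P) in D2:
                        new = (E1 & E2, (D1 - {P}) | (D2 - {neg(P)}))
                        if not new[1]:
                            if new[0] not in out:
                                out.add(new[0]); changed = True
                        elif new not in clauses:
                            clauses.add(new); changed = True
    return out

U = frozenset({1, 2, 3})
kb = {(frozenset({1, 2}), frozenset({"A"})),   # E1 -> A
      (U, frozenset({"A'", "B"}))}             # U  -> A' v B (A => B)
print(support_via_refutation(kb, "B", U))      # {frozenset({1, 2})}
```

Here the evidence for A flows through the rule A⇒B to become evidence supporting the target B.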
Algorithm 7.1. Projection/Forecasting (Construction of dCκ(P))
Given an AKB κ and an atomic proposition P∈L, output Σ0.
Algorithm 7.2. Abduction (Construction of dAκ(P))
Given an AKB κ and an atomic proposition P∈L, output Σa0.
Let κ be an AKB and L∈L. By expressing L as a disjunction of conjunctions or a conjunction of disjunctions, the Algorithms and results given above provide a complete solution for the construction of dCκ(L) and dAκ(L), i.e., projection/forecasting and abduction, respectively.
If κ is endowed with a probabilistic measure m together with its extension {tilde over (m)}, then dCκ(L) and/or dAκ(L) can be sorted in order of {tilde over (m)}(ω) for ω in dCκ(L) and/or dAκ(L). In this manner, the “best” or “most probable” projection/forecasting or abduction can be found at the top of the corresponding sorted lists. The remaining members of the sorted lists provide additional alternatives for projection/forecasting or abduction. A threshold for projection/forecasting and/or abduction can be enforced by removing all members of the sorted lists that fall below a specified value.
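The sorting and thresholding step can be sketched as follows, using a uniform probability measure over U as the extension m̃; the measure and threshold value are illustrative assumptions.

```python
U = frozenset(range(10))

def m(E):
    # uniform measure: probability mass proportional to |E|
    return len(E) / len(U)

candidates = [frozenset({1, 2, 3}), frozenset({4}), frozenset(range(6))]
ranked = sorted(candidates, key=m, reverse=True)
print([m(E) for E in ranked])              # [0.6, 0.3, 0.1]
best = ranked[0]                           # "most probable" alternative
kept = [E for E in ranked if m(E) >= 0.3]  # enforce a threshold
print(len(kept))                           # 2
```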
iCκ(L) and iAκ(L) can also be defined and constructed in a similar manner using the inductive counterpart of κ.
This Section shows how projection/forecasting and/or abduction can be realized given any propositions using either deductive or inductive reasoning based on the validity measures. All the above concepts and results can be carried over if plausibility measures are used. Moreover, if the AKB is consistent, then the use of plausibility measures could provide a wider range of projection/forecasting and/or abduction.
Given L∈L, general methods for computing dCκ(L), dAκ(L), iCκ(L) and iAκ(L) are shown above. In other words, they provide a general solution for the projection/forecasting as well as the abduction problems.
Now consider L∈L. If we are interested in projection/forecasting of the possible sources of L, we can first determine dAκ(L), and then compute dCκ(dAκ(L)). The result represents the projection/forecasting of the possible sources which gave rise to L.
Let κ be an AKB and L∈L. Then dCκ(L)⊂dCκ(dCκ(L)).
The dCκ(dCκ(L)) given above may be viewed as a second order projection/forecasting with respect to L. Obviously, we can continue in this fashion to obtain higher order projections/forecasting.
If we replace dC by dA, then we have higher order abductions if necessary. These are done deductively and with validity measures. In the same manner, we may consider higher order projections/forecasting and abductions done deductively and/or inductively with either the validity measures and/or plausibility measures. These could provide much wider ranges of results compared to first-order projections/forecasting and abductions alone, which were discussed earlier in this section.
Section 8. A-Exploration
A-Exploration is an overall framework to accommodate and deal with the efforts and challenges mentioned above. It is intended to provide the capabilities to organize and understand, including the ramification of the IKs captured in real-time, as well as to manage, utilize and oversee these IKs to serve the intended users. Management of the IKs may include providing the necessary measures, whenever possible, to guide or redirect the courses of actions of the IKs, for the betterment of the users.
A. Introduction
As stated above, an important component of A-Exploration is the facilities to transform the representation of information fragments acquired by LRIRE into more robust and flexible representations—knowledge fragments. The various transformations, carried out in A-Exploration, are determined in such a way that it can optimally accomplish its tasks; e.g., transformation of CGs into CBKFs, CGs into AKFs, CBKFs into AKFs, AKFs into CBKFs, etc.
The transformation from CGs into CBKFs requires that we have the necessary knowledge about the relationships involving concepts and features. The requisite knowledge is normally available in semantic networks, WordNet, etc. Although not necessary, due to the nature of CBKB, the addition of probabilities will make the results of the transformation more precise. These probabilities could be specified as part of the semantic networks, etc., to represent the strength of the relationships. If no probabilities are specified, we may assume that they are 1. In any case, the probabilities given have to be adjusted when they are used in the CBKB to guarantee that they satisfy the requirements of the CBKB. The process of transforming CGs into CBKFs, including modification of the probabilities, can be easily accomplished and automated.
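A sketch of the probability-handling step, with hypothetical field names and data: edges lacking a specified strength default to 1, and the strengths are then adjusted so they satisfy the CBKB's requirements (renormalization per source concept is one plausible adjustment, assumed here for illustration).

```python
# (source concept, relation, target concept, optional strength)
edges = [
    ("drug_x", "treats", "illness_y", 0.8),
    ("drug_x", "binds", "receptor_z", None),  # unspecified -> assume 1
]

# Step 1: default missing strengths to 1.
filled = [(s, r, t, 1.0 if p is None else p) for (s, r, t, p) in edges]

# Step 2: adjust so the strengths satisfy the CBKB requirements
# (here: renormalize so they sum to 1 for the source concept).
total = sum(p for *_, p in filled)
cbkf = [(s, r, t, p / total) for (s, r, t, p) in filled]

print([round(p, 3) for *_, p in cbkf])  # [0.444, 0.556]
```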
The main advantage of knowledge fragments, such as, CBKFs and AKFs, is the fact that they are parts of knowledge bases, and therefore are amenable to reasoning. This allows the possibility of pursuing and engaging in the various functions listed above, including the understanding and projecting the directions of further progression of the IKs, to foresee how they may influence other subjects and/or areas. Ways of accomplishing the functionalities listed above are discussed in more details below.
Methods and algorithms, introduced and/or presented in CBKBs and AKBs, are modified and tailored to analyze and understand the IKs. In particular, identifying the pertinent inference graphs in a CBKB could provide the means to narrow down the analysis and supply the vehicle to explore deeper understanding of the IKs.
For CBKBs and AKBs, full analyses of the IKs can be performed deductively and/or abductively; while for AKBs, we can also perform the analyses inductively. These allow the presentations of a wider range of possibilities for understanding, comprehending and appreciating the ongoing evolution or progression of the IKs.
B. Comprehension, Analysis and Deep Understanding of IKs
Analyses of the IKs require deep understanding of many aspects of the IKs. It is backward looking and entails finding the best explanations for the various scenarios of the IKs. In other words, abduction is the key to analyses. Since both CBKBs and AKBs permit abduction, they can provide the instruments to analyze and better understand the IKs. This forms the basis for the Deep Comprehender subcomponent of the Augmented Analyzer (AA).
Abductive reasoning can be used in both CBKBs and AKBs with deductive reasoning. This permits a better understanding of the IKs and can provide the explanations of the possible courses/paths of how the IKs had evolved or progressed. Moreover, for AKB, abduction can be coupled not only with deduction (that is what existing abduction is all about), but can also be coupled with induction, and thereby allowing awareness and comprehension of suitable/fitting developments, generalizations and/or expansions of the desired IKs. For both CBKBs and AKBs, they can provide the following functionalities for A-Exploration:
Since the power of abduction is the understanding and explanation of the IKs, resolutions of items above, which are related to Item 2 in the list of the functionalities of A-Exploration, can be achieved. In general, by exploiting and customizing (deductive or inductive) abduction, it may be possible to uncover the likely solutions.
C. Manage and Supervise the Evolution of the IKs
When employing only deductive inferencing, the results provided by the above procedures will be referred to as deductive prediction, projection, and/or forecasting. The same methods are equally applicable to inductive inference which is available for AKBs. Inductive prediction, projection and/or forecasting present much larger opportunities to produce more diverse results, which may be more informative and valuable to the question at hand.
Both CBKBs and AKBs have the capabilities of exploring for the desired/ideal alternative outcomes by managing and supervising the evolution through user inputs, taken into consideration through the user specifications spec1, . . . , specm. Thus, they will permit A-Exploration to offer solutions to various functionalities such as those given below:
Moreover, we can also customize and/or adapt various methods and algorithms, available for CBKBs and/or AKBs, to predict, project and/or forecast future directions of the IKs. This forms the basis for the Explorer of Alternative Outcomes subcomponent of the Augmented Analyzer (AA).
With CBKBs, predicting, projecting and/or forecasting based on knowledge of established or proven IKs (complete or partial) can be done using existing mechanisms available in any CBKBs. Since prediction, projection and/or forecasting are forward looking, the usual inference mechanisms can be exploited for that purpose. In CBKBs, this means:
If more than one inference graph is available, the results can be sorted according to the values of the inference graphs. In this case, different scenarios and their possible outcomes may be offered to the intended users. When user input is taken into consideration, this forms the basis for the Augmented Supervisor (AS). The user input can be menu-based inputs and/or term-based inputs and/or natural language-based inputs and/or database inputs.
D. Proprietary Hypothesis Plug-Ins (PHP)
Most of the basic information and knowledge involved in the creation of the knowledge bases, such as AKB, CBKB, etc., are available in the public domain. Additional up-to-date information can be obtained using the LRIRE given in Section 4. This information and knowledge make up the bulk of the desired AKB, CBKB and other knowledge bases.
The requisite knowledge may also be available as proprietary hypothesis plug-ins, which allows the construction of proprietary knowledge bases, including AKFs, CBKFs, etc. A specific example of a proprietary hypothesis plug-in is the relation between a process and its efficacy, or a drug and how it is connected to certain illnesses.
Proprietary information (including patents and other intellectual properties, private notes and communications, etc.) forms the backbone of most businesses or enterprises, and virtually all companies maintain certain proprietary information and knowledge. These can be used as proprietary hypothesis plug-ins. A collection of hypothesis plug-ins may be kept in proprietarily constructed semantic networks, AKBs, CBKBs, etc. Part or all of this collection may be made available under controlled, permitted or authorized limited access when constructing the desired AKBs, CBKBs, etc. For large collections, LRIRE, or some simplified form of LRIRE, can be used to automate the selection of the relevant information and/or knowledge.
The availability of proprietary hypothesis plug-ins provides the owners with additional insights and knowledge not known to the outside world. It could offer the owners more opportunities to explore other possible outcomes to their exclusive advantages.
The available plug-ins could supply the missing portions in our exploration of the desired/ideal alternative outcomes. Note that proprietary hypothesis plug-ins need not be comprised of only proven or established proprietary knowledge. They may contain interim/preliminary results, conjectures, suppositions, provisional experimental outcomes, etc. As stated above, when using AKBs, CBKBs, etc. to house the proprietary hypothesis plug-ins, the unproven items can be signified by specifying lower probabilities and/or reliabilities.
Clearly, this collection of hypothesis plug-ins can grow as more proprietary information and knowledge, including intellectual properties, is accumulated. It could become one of the most valuable resources of the company.
E. Wrapping-Up
CBKBs and AKBs have the capabilities of inducing and/or exploring additional desired/ideal alternative outcomes for the IKs. This can be achieved by augmenting the knowledge bases with temporary, non-permanent and/or transitory knowledge fragments added to the CBKBs or AKBs. The optional knowledge could consist of hypotheses generated using interim/partial/provisional results, conjectures, unproven or not completely proven outcomes, or simply guesswork. To maintain the integrity of the knowledge bases, the added knowledge should be associated with lower probabilities and/or reliabilities. Alternatively, the non-permanent knowledge may be held in separate knowledge bases. In any case, any unsubstantiated hypotheses or temporary items not deemed feasible or useful should be removed promptly from the knowledge bases. In cases where there are multiple alternative outcomes, they can be sorted so the intended users can select the desired options.
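The augment-and-prune policy can be sketched as follows; the discount factor and removal threshold are illustrative assumptions, not values from the description.

```python
kb = {"established_rule": 0.95}     # name -> reliability

def add_hypothesis(kb, name, reliability, discount=0.5):
    # Unproven, non-permanent knowledge enters at reduced reliability
    # to preserve the integrity of the knowledge base.
    kb[name] = reliability * discount

def prune(kb, floor):
    # Remove temporary items no longer deemed feasible or useful.
    return {k: v for k, v in kb.items() if v >= floor}

add_hypothesis(kb, "conjecture_x", 0.8)
print(kb["conjecture_x"])        # 0.4
print(sorted(prune(kb, 0.5)))    # ['established_rule']
```

Holding the hypotheses in a separate dictionary instead would implement the alternative of separate knowledge bases mentioned above.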
Section 11 below, on Conjectures and Scientific Discoveries, explores desired/ideal alternative outcomes in connection with finding the “missing links” in scientific discoveries.
Due to the richness of the structures of CBKBs and AKBs, depending on the problem at hand, one could initiate and/or institute novel approaches and/or techniques to accomplish its goals.
The potential that CBKBs and AKBs are capable of handling and managing the above requirements and conditions is the main reason we have chosen CBKBs and/or AKBs to play a central role in A-Exploration.
Both CBKBs and AKBs can be used in A-Exploration to achieve the intended goals, especially if we are interested in deductive reasoning. CBKBs are more visual and allow the users to picture the possible scenarios and outcomes, while AKBs, though more powerful, are more abstractly logical. Theoretically, anything that can be accomplished using CBKBs can be accomplished using AKBs, and more. Indeed, depending on the problems and/or IKs involved, it may be advantageous to formulate the problems using either or both CBKBs and AKBs, and to allow switching from one formulation to the other. With the type of information/knowledge considered in A-Exploration, it is not difficult to transform and switch from one formulation to the other.
It is possible to have multiple A-Exploration systems, each having its own objectives. These A-Exploration systems may then be used to build larger systems in hierarchical and/or other fashions, depending on their needs. Clearly, these systems can cross-pollinate to enhance their overall effects. Various subsystems of different A-Exploration systems may be combined to optimize their functions, e.g. A2DR.
Section 10. Emergent Events and Behaviors—subcomponent Emergence Detector of Augmented Analyzer (AA)
In this Section, we shall examine emergent behaviors; see the discussion by Timothy O'Connor, Hong Yu Wong, and Edward N. Zalta, “Emergent Properties,” The Stanford Encyclopedia of Philosophy, 2012. They arise in a complex system mainly due to the interactions among the subsystems. Emergent behaviors are behaviors of the complex system that are not manifested by the individual subsystems. When dealing with complex systems, emergent behaviors are generally unexpected and hard to predict. In this invention, we develop a formal characterization of emergent behavior and provide a means to quickly verify whether a behavior is emergent or not, especially with respect to AKBs. However, we shall first introduce a new object—the consistent event—and make use of a new method for unification without inconsistencies.
Consistent AKBs are defined as follows: Let κ be an AKB. κ is consistent if and only if G=Ø, subject to κ, whenever G→κF.
A new object, the κ-consistent event, introduced herewith, is essential in dealing with emergent behaviors, etc.: Let κ be an AKB and ω∈{circumflex over (κ)}. ω is a κ-consistent event if and only if for every ω0∈{circumflex over (κ)}, if r(ω0)=r(ω), then l(ω0)=l(ω).
Let κ be an AKB and ω∈{circumflex over (κ)}. Let ω be expressed in minimal disjunctive-conjunctive form ω1∨ω2∨ . . . ∨ωn. Then {grave over (ω)} is obtained from ω by removing all conjunctions ωi from ω where r(ωi)≡F and l(ωi)≠Ø. Moreover, {grave over (κ)}={{grave over (ω)}|ω∈{circumflex over (κ)}}.
Let κ be an AKB and ω∈{circumflex over (κ)}. ω is a κ-consistent event if and only if ω∈{grave over (κ)}.
The objects can also be associated with consistent events. In this case, we have created many new objects, and to distinguish them from the old ones, we shall represent them by adding an accent to these new objects, such as {grave over (σ)} for σ, etc.
For consistent events, the decomposition rules become: Let κ be an AKB and L1, L2∈L.
Let κ be an AKB. κ is consistent if and only if {grave over (σ)}κ(L)=σκ(L) for every L∈L.
Let κ be an AKB and L∈. Then {grave over (σ)}κ(L)=∪ω∈δ
Since {grave over (σ)}κ(L)=σκ(L) when κ is consistent, Algorithm 7.3 can be used in this case to compute σκ(L). Observe that Algorithm 7.3 is a faster unification algorithm than Algorithm 7.1.
Observe that in Algorithm 6.3, although inconsistencies may exist in κ and may not be removed, they were completely ignored in the unification process. However, since the inconsistencies may permeate through the entire system, they may create many interesting properties and complications.
In the above discussions, we introduced deductive reasoning and inductive reasoning, and used them to determine the validity and/or plausibility of any given proposition. However, the validity and plausibility of a given proposition may be compromised due to inconsistencies. Although there are related algorithms for transforming any AKB κ into a consistent AKB, we showed in the previous section how to deal with inconsistencies directly, without having to determine them in advance or remove them first, i.e., by considering {grave over (κ)} instead of κ.
The related discussions involving AKBs dealt primarily with consistent AKBs. However, inconsistencies are an integral part of any AKBs involved with emergent behaviors. Various methods could be considered for removing the inconsistencies and constructing consistent AKBs to take their places. In this invention, we show how to deal with inconsistencies directly, without having to determine them in advance or remove them first. To avoid the complexities introduced by inconsistencies, we shall use {grave over (σ)} instead of σ, etc.
In the rest of the section, we shall assume that n>0 and κ1, κ2, . . . , κn are AKBs. Moreover, κ=∪i=1nκi, τ=∪i=1n{circumflex over (κ)}i, and for L∈L, Σ(L)=∪i=1nΣκi(L).
Clearly, τ⊂{circumflex over (κ)} and {grave over (Σ)}(L)⊂{grave over (Σ)}κ(L). These differences provide a first step in the study of emergent events and behaviors. Moreover, due to the interaction among the various parts, inconsistencies invariably occur when the parts are combined into the whole.
Let L∈L and ω∈{circumflex over (κ)}.
Let L∈L and ω∈{circumflex over (κ)}. If L is a (d-valid, d-plausible, i-valid, i-plausible) κ-emergent behavior, then ({grave over (σ)}κ(L)≠Ø, {grave over (φ)}κ(L)≠Ø, {grave over (φ)}κ(L)≠Ø).
Let κ1={E1→B} and κ2={E2→(B′∨C)}. If (E1∩E2)≠Ø, then ((E1∩E2)→C) is a d-valid κ-emergent event, and C is a d-valid κ-emergent behavior.
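The example can be checked mechanically: with clauses as evidence/literal-set pairs, neither κ1 nor κ2 alone yields C, but one combination step over their union does, which is what makes C κ-emergent. This sketch is illustrative; names and representation are hypothetical.

```python
def neg(lit):
    return lit[:-1] if lit.endswith("'") else lit + "'"

def one_step(kb):
    # All objects obtainable from kb by at most one combination step.
    out = set(kb)
    for (E1, D1) in kb:
        for (E2, D2) in kb:
            for P in D1:
                if neg(P) in D2:
                    out.add((E1 & E2, (D1 - {P}) | (D2 - {neg(P)})))
    return out

def derives_C(kb):
    # C is derived with nonempty supporting evidence
    return any(D == frozenset({"C"}) and E for (E, D) in one_step(kb))

E1, E2 = frozenset({1, 2}), frozenset({2, 3})
k1 = {(E1, frozenset({"B"}))}            # E1 -> B
k2 = {(E2, frozenset({"B'", "C"}))}      # E2 -> B' v C  (i.e., B => C)
print(derives_C(k1), derives_C(k2), derives_C(k1 | k2))  # False False True
```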
Let L∈L. L is a (d-valid, d-plausible, i-valid, i-plausible) κ-non-emergent behavior if and only if L is NOT a (d-valid, d-plausible, i-valid, i-plausible) κ-emergent behavior.
In view of one of the algorithms shown above, we may assume that the κi, as well as κ, are all disjunctive. Moreover, given L∈L, L can be expressed in atomic CNF form. Thus, we can use these to determine whether L is a d-valid κ-emergent behavior or not. More precisely:
Algorithm 10.1. (Algorithm for determining whether L is a κ-emergent behavior or not) Given AKBs κi, i=1, 2, . . . , n, κ=∪i=1nκi and L∈L, determine whether L is a κ-emergent behavior or not.
Step 3 in the above algorithm requires the values of {grave over (σ)}κ(P) and {grave over (α)}κ
The cases involving d-plausible, i-valid, and i-plausible can be processed in similar manners.
Section 11. Conjectures and Scientific Discoveries—Missing Link Hypothesizer Subcomponent of Augmented Analyzer (AA)
Conjectures, see the discussion by Karl Popper, “Conjectures and Refutations: The Growth of Scientific Knowledge,” Routledge, 1963, play a very important role in scientific and other discoveries. Conjectures may be derived from experiences, educated guesses, findings from similar or related problems, preliminary experimental outcomes, etc. In general, they provide the missing links to clues in the discoveries. However, for most discoveries, there might be many clues or conjectures, and it may be expensive to pursue the clues in general and/or to pursue the clues individually (as a matter of fact, it might be very costly even to pursue just a single clue). We shall show how AKBs, and similarly structured knowledge bases and systems, such as CBKBs, can be used to simplify and accelerate the search for the missing links.
Let κ be an AKB and m a κ-measure.
We shall occasionally refer to λ alone as the conjectures.
In our discussion of conjectured AKBs, we are usually not concerned with the measures. Thus, we shall view conjectured AKBs as ordinary AKBs. When the measures become part of the discussions, then the measures will be stated explicitly.
The central elements for finding the missing links are either the sets Σ or the sets Φ. In the first case, deductive inference is used, while in the second case, inductive inference is used. In either case, the measures need not be specified.
Since σ and φ are derived solely from Σ and Φ, respectively, they are not affected by the measures either.
Because of the flexibility of the AKB, we can have conjectures (E1→L) and (E2→L), both in λ. The two essentially refer to the same conjecture L. The difference between them is the set associated with L, i.e., E1 and E2. This allows us to specify different constraints involving E1 and E2.
Let κ be an AKB and λ a collection of conjectures over κ. Let L∈ be the knowledge we want to assert. If σκ∪λ(L) does not contain any element in λ, then none of the conjectures will help in the establishment of L. In other words, the missing links are still out of reach, especially if the extended measure {tilde over (m)}(σκ∪λ(L)) is smaller than desired. Modifications or additions of the conjectures are therefore indicated.
Similar conclusions follow if we replace σ in the above by Σ, φ, or Φ.
If some of the conjectures appear in σκ∪λ(L), then it may be useful to examine Σκ∪λ(L) and/or Φκ∪λ(L) closely. We shall concentrate on Σκ∪λ(L) for the time being.
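As a toy illustration (the set-of-identifiers representation below is an assumption for exposition, not the embodiments' own data structures), the minimal supports for L can be modeled as sets of rule identifiers, and checking whether any conjecture in λ participates in establishing L reduces to an intersection:

```python
def conjectures_in_support(support_sets, conjecture_ids):
    """Return the conjectures of lambda that appear in at least one
    minimal support set for L. An empty result means no conjecture
    helps establish L, so the conjectures should be modified or extended."""
    used = set()
    for s in support_sets:
        used |= set(s) & set(conjecture_ids)
    return used
```

A nonempty result identifies exactly which conjectures are worth closer examination.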
Let κ be an AKB and ω∈{circumflex over (κ)}.
Let κ be an AKB, and for all i≦n, μi ∈κ.
Let κ be an AKB. γκ is the collection of all F-minimal conjunctions over κ and γκ is the collection of all T-minimal disjunctions over κ.
In the rest of the Section, we shall let π=κ∪λ where κ is an AKB and λ is a collection of conjectures over κ.
Let μ∈γλ. Then
Let L∈, κL=κ∪{(U→L′)}, πL=κL ∪λ, and μ∈γλ.
Let L∈. If the conjecture μ is L-inconsequential over κ, then the establishment of the conjecture μ will not improve the validity of L over κ. Therefore, for simplicity, μ will be eliminated from λ. Moreover, some of the μ∈γλ may be L-inconsequential; thus, we shall define
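A minimal sketch of this pruning step, assuming a numeric validity function over knowledge bases (the function name and the set representation are hypothetical):

```python
def prune_inconsequential(kb, conjectures, validity):
    """Drop every conjecture whose establishment does not raise the
    validity of the target knowledge L over the knowledge base kb;
    keep the rest for further examination."""
    base = validity(kb)
    return [mu for mu in conjectures if validity(kb | {mu}) > base]
```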
If m is a κ-measure and {tilde over (m)} an extension of m, then the elements in
The discussions so far were restricted to using the validity of L. Parallel results can be derived using plausibility of L. In this case, we let κL=κ∪{(U→L)}, and πL=κL∪λ.
In the above discussions, only deductive reasoning was used. However, the missing links can also be found using inductive inference by applying the results given above to ω.
Some example benefits according to the described embodiments include:
A new and innovative process—Augmented Exploration, or A-Exploration—for working with Big Data by utilizing Big Data's own power and exploiting its various unique properties. A-Exploration has the capabilities to continuously uncover, track, understand, analyze, manage, and/or utilize any desired information and knowledge, as well as to oversee, regulate, and/or supervise the development, progression, and/or evolution of this information and knowledge.
The embodiments permit organizations and policy makers to employ A-Exploration to address and mitigate, in real time, considerable challenges in capturing the full benefit and potential of Big Data and Beyond. A-Exploration comprises many sequential and parallel phases and/or sub-systems required to control and handle its various operations.
The embodiments automate oversight and provide a possible roadmap for the management, utilization, and projection of the outcomes and results obtained via A-Exploration.
The embodiments provide real-time capabilities for handling health care, public sector administration, retail, manufacturing, personal location data, unfolding stories, knowledge/information sharing and discovery, and personal assistance, as well as any field that deals with information and/or knowledge—which covers virtually all areas in this Big Data era.
The embodiments enable and empower A-Exploration through the extensions, expansions, integrations and supplementations to create and establish diverse methods and functionalities to simplify and solve virtually all problems associated with Big Data, Information and Knowledge, including but not limited to:
The embodiments provide a new unification process for use with Augmented Knowledge Bases (AKBs) which incorporates Anytime-Anywhere methodologies and allows heuristics to be employed to speed up the process and improve the results. This new process can also be extended to include tags to identify the sources of the results.
The embodiments provide a method for purifying an object ω in the knowledge base by removing all target inconsistencies (as determined by application criteria) contained in r(ω), comprising transforming the object into disjunctive normal form and then removing all conjunctions in r(ω) which are equivalent to FALSE. The result will be referred to as a purified object.
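The purification step can be sketched directly: with a disjunctive normal form represented as a list of conjunctions, each a set of literals (negation written with a leading "~" is an assumed encoding), a conjunction is equivalent to FALSE exactly when it contains both a literal and its negation:

```python
def purify(dnf):
    """Remove from a DNF (list of conjunctions, each a set of literals)
    every conjunction equivalent to FALSE, i.e., every conjunction
    containing both a literal and its negation."""
    def is_false(conj):
        return any("~" + lit in conj for lit in conj if not lit.startswith("~"))
    return [conj for conj in dnf if not is_false(conj)]
```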
The embodiments provide constructing purified validity and purified plausibility by using purified objects from the knowledge base.
The embodiments provide a new unification method for any AKBs, consistent or otherwise, where inconsistencies are essentially ignored in the unification process, and inconsistencies are handled directly without having to determine them in advance or remove them first.
The embodiments provide the purified validity and purified plausibility of L, which may be determined by repeated applications of the decomposition rules including:
The embodiments provide analyses and deep understanding of many aspects of the information/knowledge, comprising, among other things, finding the best explanations for the various scenarios and possible courses/paths of how the information/knowledge evolved and/or progressed, using deductive and/or inductive reasoning.
According to the embodiments, A-Exploration includes the following functionalities:
The embodiments may be utilized to manage and supervise the evolution of the many aspects of the information/knowledge, comprising deductive and/or inductive prediction, projection, and/or forecasting; to project, predict, and forecast the effects and consequences of the various actions and activities; to instigate measures to supervise and regulate the information and knowledge; and to initiate credible actions to mitigate and/or redirect the plausible effects and outcomes of the information and knowledge.
The embodiments provide the capabilities of exploring for the desired/ideal alternative outcomes. This may consist, among other things, of augmenting the CBKBs or AKBs with temporary, non-permanent, and/or transitory knowledge fragments, e.g., Hypotheses Plug-Ins.
The embodiments can be implemented by having multiple A-Exploration systems, each having its own objectives. These A-Exploration systems can be used to build larger systems in hierarchical and/or other fashions, depending on their needs; they can cross-pollinate to enhance their overall effects; and the various subsystems of different A-Exploration systems may be combined to optimize their functions.
The embodiments provide creating the building blocks needed to perform projection/forecasting of possible outcomes for any given hypothesis/situation, comprising setting up the deductive consequent and inductive consequent.
The embodiments provide for projecting/forecasting the possible outcomes of a given hypothesis or situation, using deductive validity measures, in terms of Augmented Knowledge Bases or other similarly structured constructs, comprising: methods for determining the projection/forecasting with respect to any atomic proposition; and methods for combining and merging the projections/forecasting of atomic propositions to construct the possible outcomes of projection/forecasting for a given proposition.
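As an illustrative sketch only: per-atom projection measures might be combined recursively over the structure of a compound proposition. The min/max combination rule below is an assumption chosen for illustration; it is not the embodiments' actual decomposition rules, which are defined elsewhere.

```python
def project(prop, atom_measure):
    """prop is an atom name or a nested tuple ('and', p, q) / ('or', p, q);
    atom_measure maps each atomic proposition to its projected measure.
    Combines per-atom results: min for conjunction, max for disjunction
    (an assumed combination rule for illustration)."""
    if isinstance(prop, str):
        return atom_measure[prop]
    op, p, q = prop
    a, b = project(p, atom_measure), project(q, atom_measure)
    return min(a, b) if op == "and" else max(a, b)
```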
According to the embodiments, the projection/forecasting uses deductive plausibility measures, inductive validity measures, and/or inductive plausibility measures.
According to the embodiments, the possible projections/forecasting are ranked.
The embodiments provide for creating the building blocks needed to perform abduction, or determine the best explanations of a given observation or situation, comprising setting up the deductive antecedent and inductive antecedent.
The embodiments provide for using abduction to determine the best explanations of a given observation or situation, using deductive validity measures, in terms of Augmented Knowledge Bases or other similarly structured constructs, comprising: methods for determining abduction with respect to any atomic proposition; and methods for combining and merging the abductions of atomic propositions to construct the possible explanations of given observations/propositions.
According to the embodiments, the abduction uses deductive plausibility measures, inductive validity measures and/or inductive plausibility measures.
According to the embodiments, the possible abductions are ranked.
According to the embodiments, the inverses of κ and are used to perform inductive reasoning.
The embodiments provide for speeding up the projections/forecasting and/or abductions by storing the results for each atomic proposition for faster retrieval.
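A minimal memoization sketch of this speed-up (the class and method names are assumptions):

```python
class AtomicResultCache:
    """Cache the projection/forecasting or abduction result computed
    for each atomic proposition, so that repeated queries become
    fast retrievals instead of recomputations."""
    def __init__(self, compute):
        self._compute = compute  # expensive per-atom computation
        self._store = {}
    def result(self, atom):
        if atom not in self._store:
            self._store[atom] = self._compute(atom)
        return self._store[atom]
```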
The embodiments provide for extending and expanding the projections/forecasting and/or abductions to higher-order projections/forecasting and abductions.
The embodiments determine and handle emergent events/behaviors of complex systems, including collections of AKBs.
The embodiments promote scientific and other discoveries, comprising locating and unearthing the missing links between the current situations and the eventual desired solutions.
The embodiments simplify and accelerate the search for the missing links in scientific and other discoveries, comprising: introduction and formulation of conjectures, which may consist of information and/or knowledge derived from experiences, educated guesses, findings from similar or related problems, preliminary experimental outcomes, etc.; expression of these conjectures in terms identifiable by the selected knowledge base to permit them to be part of the reasoning process in the knowledge base; establishment of the conjectured knowledge base consisting of the given knowledge base and the conjectures; employment of the reasoning mechanism of the resulting knowledge base to determine which conjectures are consequential, and ranking of the consequential conjectures to prioritize the conjecture(s) to be examined first; and elimination of the inconsequential conjectures to improve the search for the missing links.
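The steps above can be sketched end to end. The scoring function below is a hypothetical stand-in for the knowledge base's reasoning mechanism, and the set representation is assumed for illustration:

```python
def search_missing_links(kb, conjectures, consequence_score):
    """Form the conjectured knowledge base (kb plus each conjecture),
    keep only consequential conjectures (positive score), and rank them
    so the most promising missing links are examined first."""
    scored = [(consequence_score(kb | {c}, c), c) for c in conjectures]
    consequential = [(s, c) for s, c in scored if s > 0]
    consequential.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in consequential]
```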
An apparatus, comprising a computer readable storage medium configured to support managing required data objects, and a hardware processor to execute the necessary methods and procedures.
According to an aspect of the embodiments of the invention, any combinations of one or more of the described features, functions, operations, and/or benefits can be provided. The word (prefix or suffix article) “a” refers to one or more unless specifically indicated or determined to refer to a single item. The word (prefix or suffix article) “each” refers to one or more unless specifically indicated or determined to refer to all items. A combination can be any one of or a plurality. The expression “at least one of” a list of item(s) refers to one or any combination of the listed item(s). The expression “all” refers to an approximate, about, substantial amount or quantity up to and including “all” amounts or quantities.
A computing apparatus, such as (in a non-limiting example) any computer or computer processor, that includes processing hardware and/or software implemented on the processing hardware to transmit and receive (communicate (network) with other computing apparatuses), store and retrieve from computer readable storage media, process and/or output data. According to an aspect of an embodiment, the described features, functions, operations, and/or benefits can be implemented by and/or use processing hardware and/or software executed by processing hardware. For example, a computing apparatus as illustrated in
In addition, an apparatus can include one or more apparatuses in computer network communication with each other or other apparatuses and the embodiments relate to augmented exploration for big data involving one or more apparatuses, for example, data or information involving local area network (LAN) and/or Intranet based computing, cloud computing in case of Internet based computing, Internet of Things (IoT) (network of physical objects—computer readable storage media (e.g., databases, knowledge bases), devices (e.g., appliances, cameras, mobile phones), vehicles, buildings, and other items, embedded with electronics, software, sensors that generate, collect, search (query), process, and/or analyze data, with network connectivity to exchange the data), online websites. In addition, a computer processor can refer to one or more computer processors in one or more apparatuses or any combinations of one or more computer processors and/or apparatuses. An aspect of an embodiment relates to causing and/or configuring one or more apparatuses and/or computer processors to execute the described operations. The results produced can be output to an output device, for example, displayed on the display or by way of audio/sound. An apparatus or device refers to a physical machine that performs operations by way of electronics, mechanical processes, for example, electromechanical devices, sensors, a computer (physical computing hardware or machinery) that implement or execute instructions, for example, execute instructions by way of software, which is code executed by computing hardware including a programmable chip (chipset, computer processor, electronic component), and/or implement instructions by way of computing hardware (e.g., in circuitry, electronic components in integrated circuits, etc.)—collectively referred to as hardware processor(s), to achieve the functions or operations being described. 
The functions of embodiments described can be implemented in a type of apparatus that can execute instructions or code.
More particularly, programming or configuring or causing an apparatus or device, for example, a computer, to execute the described functions of embodiments of the invention creates a new machine where in case of a computer a general purpose computer in effect becomes a special purpose computer once it is programmed or configured or caused to perform particular functions of the embodiments of the invention pursuant to instructions from program software. According to an aspect of an embodiment, configuring an apparatus, device, computer processor, refers to such apparatus, device or computer processor programmed or controlled by software to execute the described functions.
A program/software implementing the embodiments may be recorded on a computer-readable storage media, e.g., a non-transitory or persistent computer-readable storage media. Examples of the non-transitory computer-readable media include a magnetic recording apparatus, an optical disk, a magneto-optical disk, and/or volatile and/or non-volatile semiconductor memory (for example, RAM, ROM, etc.). Examples of the magnetic recording apparatus include a hard disk device (HDD), a flexible disk (FD), and a magnetic tape (MT). Examples of the optical disk include a DVD (Digital Versatile Disc), DVD-ROM, DVD-RAM (DVD-Random Access Memory), BD (Blu-ray Disc), a CD-ROM (Compact Disc-Read Only Memory), and a CD-R (Recordable)/RW. The program/software implementing the embodiments may be transmitted over a transmission communication path, e.g., a wire and/or a wireless network implemented via hardware. An example of communication media via which the program/software may be sent includes, for example, a carrier-wave signal.
The many features and advantages of the embodiments are apparent from the detailed specification and, thus, it is intended by the appended claims to cover all such features and advantages of the embodiments that fall within the true spirit and scope thereof. Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the inventive embodiments to the exact construction and operation illustrated and described, and accordingly all suitable modifications and equivalents may be resorted to, falling within the scope thereof.
This application is based upon and claims priority benefit to prior U.S. Provisional Patent Application No. 62/331,642 filed on May 4, 2016 in the US Patent and Trademark Office, the entire contents of which are incorporated herein by reference.