Traditional Chinese Medicine (TCM) is enshrined in the local law of the Hong Kong SAR. For this reason computer-aided clinical TCM practice has become a quest for many people. One of these quests is to retrieve herbs with respect to their temperament and curative effects.
Ontology can be used to organize TCM practice. Ontology is a data model that represents a set of concepts within a domain and the relationships between those concepts. Ontology is used to reason about the objects within that domain.
For example, for a query Q(x, y), the retrieved result should be a conclusion reached by inference from the two actual parameters x and y. The process of reaching a conclusion by inference is called parsing, and the piece of software or computational logic used to reach this conclusion is referred to as a parser. The combination of “query+semantic net+ontology” is the basis of a telemedicine system. Telemedicine refers to administering medicine or medical information over a network, such as the Internet, that supports wireless and wireline communication. For example, a telemedicine environment may be made up of many mobile and/or stationary clinics that collaborate wirelessly. Each clinic includes a pharmacy and a clinical telemedicine diagnosis/prescription system that can be operated by a physician. A physician can treat patients locally by using the clinical telemedicine diagnosis/prescription system.
TCM is highlighted here as an illustrative example of a domain that can be represented and accessed via an ontological information retrieval system. The subject invention can also be applied to other domains.
The present disclosure relates to an ontological information retrieval system utilizing a three-layer architecture. According to one embodiment of the invention, an ontological information retrieval system is provided that represents an ontological layer in an annotated form, represents the annotated form of the ontological layer as a document object model (DOM) tree for parsing the data, and utilizes a graphical user interface (GUI) to represent the DOM tree for human understanding and manipulation. Other human interfaces to the DOM tree can be used with the subject invention, as will be apparent to one skilled in the art.
In accordance with the present invention, a DOM tree containing attributes and their associations is provided for establishing a semantic network to parse the ontological data. A query can be mapped into a semantic and the DOM searched to find instances of that semantic.
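The mapping from a query to instances in the DOM tree can be sketched as follows. This is a minimal illustration in Python rather than the parser of any particular embodiment; the element names (illness, symptom), the attribute names (name, category, value), and the symptom values are hypothetical placeholders and are not taken from the appendices.

```python
from xml.dom import minidom

# Hypothetical annotation: tag names, attribute names, and symptom values
# are illustrative placeholders, not clinical data from the appendices.
SAMPLE_XML = """
<illnesses>
  <illness name="common cold">
    <symptom category="chills and fever" value="aversion to cold"/>
    <symptom category="cough" value="cough with thin sputum"/>
  </illness>
  <illness name="liver yang headache">
    <symptom category="head and body" value="distending headache"/>
    <symptom category="sleep" value="insomnia"/>
  </illness>
</illnesses>
""".strip()


def find_instances(dom, query):
    """Return (illness, category, value) for each symptom node in the query.

    `query` is a set of (category, value) pairs, i.e. the actual
    parameters supplied by the user.
    """
    hits = []
    for illness in dom.getElementsByTagName("illness"):
        for symptom in illness.getElementsByTagName("symptom"):
            key = (symptom.getAttribute("category"),
                   symptom.getAttribute("value"))
            if key in query:
                hits.append((illness.getAttribute("name"),) + key)
    return hits


dom = minidom.parseString(SAMPLE_XML)
query = {("cough", "cough with thin sputum"), ("sleep", "insomnia")}
matches = find_instances(dom, query)
for illness_name, category, value in matches:
    print(f"{illness_name}: {category} = {value}")
```

The returned tuples identify which nodes of the tree a user interface could then highlight.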
A specific embodiment of the subject ontological information retrieval system can be utilized for computer-aided clinical TCM practice. In one implementation, a user can input a query with symptoms determined from a patient, and the system's parser can find instances of the symptoms in the DOM tree. The instances can be communicated to the user by, for example, highlighting the instances of the symptoms in the DOM tree displayed to the user.
A relevance index (RI) can be further provided for evaluating a diagnosis by comparing the symptoms determined from a patient with the expected symptoms of the diagnosed illness and returning a value based on the number of matched symptoms.
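As a rough sketch of one way such a value could be computed, the Python snippet below assumes the RI is the fraction of an illness's expected symptoms that were actually observed. That ratio form is an assumption made for illustration (the disclosure states only that the value is based on the number of matched symptoms), and the symptom identifiers are hypothetical.

```python
def relevance_index(observed, expected):
    """Fraction of the expected symptoms that appear among the observed ones.

    Assumed formula for illustration: matched count / expected count.
    """
    expected = set(expected)
    if not expected:
        return 0.0
    return len(set(observed) & expected) / len(expected)


# Hypothetical illness with 10 expected symptoms, of which 2 were observed.
expected = {f"symptom-{i}" for i in range(1, 11)}
observed = {"symptom-1", "symptom-2"}
ri = relevance_index(observed, expected)
print(ri)
```

A low value here signals that few of the expected symptoms were confirmed, which a physician might take as a prompt to question the patient further.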
A frequency index (FI) can be further provided for evaluating a diagnosis by comparing the symptoms determined from a patient with the expected symptoms of the diagnosed illness, with additional weighting for the major symptoms of the illness. The FI takes into consideration the importance of a symptom, which can include categories such as major criteria and minor criteria of an illness.
An ontological information retrieval system is provided. The subject ontological information retrieval system can utilize a three-layer architecture for transitive mapping.
For a perfectly mapped system, the three layers are transitive. That is, when an element in the query layer 30 is related to an element in the semantic net layer 20, and the element in the semantic net layer 20 is related to an element in the ontology layer 10, then the element in the query layer 30 is related to the element in the ontology layer 10.
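The transitive relationship can be illustrated by composing two mappings. In this Python sketch the layer contents and the identifier strings (such as "gui:cough") are hypothetical; only the composition itself reflects the property described above.

```python
def compose(query_to_net, net_to_ontology):
    """Relate each query element to the ontology elements reachable
    through the semantic net (composition of the two relations)."""
    related = {}
    for query_element, net_elements in query_to_net.items():
        targets = set()
        for net_element in net_elements:
            targets |= net_to_ontology.get(net_element, set())
        related[query_element] = targets
    return related


# Hypothetical identifiers for elements of the three layers.
query_to_net = {
    "gui:cough": {"dom:symptom[cough]"},
    "gui:sleep": {"dom:symptom[sleep]"},
}
net_to_ontology = {
    "dom:symptom[cough]": {"tcm:cough"},
    "dom:symptom[sleep]": {"tcm:sleep"},
}
query_to_ontology = compose(query_to_net, net_to_ontology)
print(query_to_ontology)
```

In a perfectly mapped system every query element ends up related to at least one ontology element; an empty set in the result would indicate a gap in the mapping.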
The subject ontological information retrieval system can be applied to a telemedicine system. In such an embodiment, the ontological information can relate to, for example, TCM. Accordingly, the ontological layer 10 can include available TCM formal information obtained from the classics and treatises on the subject (also referred to as TCM vocabulary). The representation of this information can be provided in annotated form by using metadata such as XML. The ontological layer 10 is represented with a DOM (semantic net 20) configured in accordance with an embodiment of the present invention, and the query layer 30 is provided in the form of a graphical user interface (GUI).
Aspects of the invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention may be practiced with a variety of computer-system configurations, including multiprocessor systems, microprocessor-based or programmable-consumer electronics, minicomputers, mainframe computers, and the like. Any number of computer-systems and computer networks are acceptable for use with the present invention. In addition, computer systems, servers, work stations, and other machines may be connected to one another across a communication medium including, for example, a network or networks.
In accordance with the present disclosure, computer-readable media include both volatile and nonvolatile media, removable and nonremovable media, and contemplate media readable by a database, a switch, and various other network devices. By way of example, and not limitation, computer-readable media comprise media implemented in any method or technology for storing information. Examples of stored information include computer-useable instructions, data structures, program modules, and other data representations. Media examples include, but are not limited to, information-delivery media, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD), holographic media or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage, and other magnetic storage devices. These technologies can store data momentarily, temporarily, or permanently.
The invention may be practiced in distributed-computing environments where tasks are performed by remote-processing devices that are linked through a communications network. In a distributed-computing environment, program modules may be located in both local and remote computer-storage media including memory storage devices. The computer-useable instructions form an interface to allow a computer to react according to a source of input. The instructions cooperate with other code segments to initiate a variety of tasks in response to data received in conjunction with the source of the received data.
The present invention may be practiced in a network environment such as a communications network. Such networks are widely used to connect various types of network elements, such as routers, servers, gateways, and so forth. Further, the invention may be practiced in a multi-network environment having various, connected public and/or private networks.
Communication between network elements may be wireless or wireline (wired). As will be appreciated by those skilled in the art, communication networks may take several different forms and may use several different communication protocols. The present invention is not limited by the forms and communication protocols described herein.
In accordance with certain embodiments of the present invention, a system including one or more processors, memory, a display, and an input device is provided for retrieving ontological information and providing that information to a user by using the three-layer architecture as described with respect to
For a telemedicine application, the semantic net 20 is the machine processable form of the TCM ontological layer and the GUI for the query system 30, which abstracts the semantic net, is utilized for human understanding and manipulation. The symptoms that are keyed-in via the GUI are captured as actual parameters for the query to be implicitly (user-transparently) constructed by the GUI system as input to the parser. The parsing mechanism draws the logical conclusion from the DOM tree (e.g., the corresponding illness for the query). The ontological layer 10 defines the bounds of the diagnosis/prescription operation. The ontological layer is the vocabulary and the operation standard of the system.
For embodiments utilizing XML for the ontological layer, the parser can be established using a software language such as VB.net (Visual Basic for the Internet) and compiled into machine readable code.
A GUI of a sample parser according to one embodiment is shown in
In yet a further embodiment, a relevance index (RI) can be incorporated to enable a user to evaluate the results. For example, the RI can be calculated based on frequency (i.e., the number of matched symptoms).
As another embodiment, a frequency index (FI) can be used to improve the RI calculation by incorporating weighting factors. For example, for each disease type, the symptoms can be categorized and weighted.
Main Symptoms:
The FI score gives the largest ratio or weight to major symptoms because of their importance. In contrast, the RI score is based only on frequency. The FI score can be advantageous in certain situations: when the score is based only on frequency, a disease with more matched symptoms that are minor or in the pulse category would appear to be a better match, and a disease with fewer matches overall, but with the highest score in the main symptoms, may be inadvertently missed.
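The difference between the two scores can be shown with a small Python sketch. The category names (main, minor, pulse) follow the discussion above, but the weights, the illness names, and the symptom identifiers are assumptions chosen only to make the effect visible; they are not values from this disclosure.

```python
# Assumed weights: main symptoms count three times as much as the others.
WEIGHTS = {"main": 3, "minor": 1, "pulse": 1}


def match_count(observed, illness):
    """Frequency-only score: number of matched symptoms in any category."""
    return sum(len(observed & symptoms) for symptoms in illness.values())


def frequency_index(observed, illness, weights=WEIGHTS):
    """Weighted score: each match counts according to its category."""
    return sum(weights[category] * len(observed & symptoms)
               for category, symptoms in illness.items())


# Hypothetical illnesses and symptom identifiers.
illnesses = {
    "illness A": {"main": {"m1", "m2"}, "minor": {"x1", "x2"}, "pulse": {"p1"}},
    "illness B": {"main": {"m3", "m4"}, "minor": {"x3", "x4", "x5"}, "pulse": {"p2"}},
}
observed = {"m1", "m2", "x3", "x4", "x5"}

best_by_count = max(illnesses, key=lambda n: match_count(observed, illnesses[n]))
best_by_fi = max(illnesses, key=lambda n: frequency_index(observed, illnesses[n]))
print(best_by_count, best_by_fi)
```

With these inputs, illness B has more matches overall, yet illness A matches both of its main symptoms and therefore ranks first under the weighted score.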
Following are examples that illustrate procedures for practicing and understanding the invention. These examples should not be construed as limiting.
Appendix A shows a sample disease, the common cold, annotated with an XML tree. The general structure for the XML annotation of TCM follows the following framework.
An example of the XML annotation for 38 illnesses is shown in Appendix C, as disclosed in U.S. provisional application Ser. No. 61/229,545, filed Jul. 29, 2009, which is incorporated herein by reference in its entirety.
The structure shown can be used to represent ontological information for TCM. But other structures may be used and other domains may be represented and accessed using an ontological information retrieval system.
According to one embodiment, an ontological information retrieval system is implemented to identify all the symptoms (query attributes) with respect to the “10 questions”. In particular, the list of 21 identifications is as follows: chills and fever, head and body, fecal, urine, diet, thoracoabdominal, sweat, hearing/vision, cough, sputum, pain (location, form), sleep, complexion, nose, lips, throat/pharynx, vomit, mental status, menses, vaginal discharge, tongue, surface or tongue.
These 21 basic symptoms are tabulated in Tables 1A to 1D of Appendix B. Table 1E provides a summary of the symptoms identified based on the “10 questions”.
An XML annotation was created for 38 chosen illnesses from some established TCM classics. The XML annotation of these 38 illnesses is shown in Appendix C, as disclosed in U.S. provisional application Ser. No. 61/229,545, filed Jul. 29, 2009, which is incorporated herein by reference in its entirety. When the XML annotation is input to the parsing mechanism such as shown in
The XML annotation in Appendix C for the 38 illnesses includes the 21 symptoms as their attributes. Together they form the subsumption hierarchy that lets symptoms associate with illnesses.
When the RI is incorporated, the system user, such as a physician, can evaluate the diagnosis. For example, the physician may have obtained only two symptoms from the “10 questions,” while the classical information shows that there could be 10 symptoms altogether. The RI then serves as the score for the quality of the diagnostic process.
To reduce search time in embodiments where the sample parser program matches symptoms by loading the data and then searching the data, the loading of the data (such as a Display Disease XML) can be separated from the searching so that subsequent searches can utilize the same loaded data.
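One way to separate loading from searching is to parse the annotation once into an in-memory index and reuse that index for every query. The Python sketch below assumes a hypothetical XML layout (illness and symptom elements with name and value attributes) and placeholder symptom values; the class name DiseaseIndex is likewise invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation; tag names, attributes, and values are placeholders.
SAMPLE_XML = """
<illnesses>
  <illness name="illness A">
    <symptom value="s1"/>
    <symptom value="s2"/>
  </illness>
  <illness name="illness B">
    <symptom value="s2"/>
    <symptom value="s3"/>
  </illness>
</illnesses>
""".strip()


class DiseaseIndex:
    """Parses the annotation once; every search reuses the loaded data."""

    def __init__(self, xml_text):
        root = ET.fromstring(xml_text)  # loading step, performed only here
        self.symptoms_by_illness = {
            illness.get("name"): {s.get("value") for s in illness.iter("symptom")}
            for illness in root.iter("illness")
        }

    def search(self, observed):
        """Return {illness name: sorted matched symptoms} without re-parsing."""
        observed = set(observed)
        return {
            name: sorted(symptoms & observed)
            for name, symptoms in self.symptoms_by_illness.items()
            if symptoms & observed
        }


index = DiseaseIndex(SAMPLE_XML)       # load once
first = index.search(["s1"])           # search
second = index.search(["s2", "s3"])    # search again on the same loaded data
print(first, second)
```

Because the parsed data lives on the object, the cost of reading and parsing the annotation is paid a single time regardless of how many searches follow.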
The physician can select the symptom attributes by clicking the combo boxes shown in the GUI. The symptom attributes are extracted based on the TCM vocabulary and Tables 1A to 1D of Appendix B. After the symptom attributes are selected, the program matches the attributes with the XML annotation of the 38 illnesses once the search button is clicked, as shown in
Since some symptoms of different diseases may be the same, the relevance index of each illness is calculated. The relevance of the matched attributes can be measured for the diagnostic process (basic: frequency).
In one embodiment, a 2D array can be used to store the matched symptoms and disease names, and the number of matched symptoms can be calculated to determine their scores. The disease names and scores are passed to another 2D array and then sorted.
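The two-array approach can be sketched in Python with nested lists standing in for the 2D arrays. The illness names, the symptom identifiers, and the function name rank_illnesses are hypothetical, and the score is simply the count of matched symptoms.

```python
def rank_illnesses(observed, illnesses):
    """Return [[illness name, score], ...] sorted by descending score."""
    observed = set(observed)
    # First table: each row holds an illness name and its matched symptoms.
    matched_rows = [
        [name, sorted(observed & set(symptoms))]
        for name, symptoms in illnesses.items()
    ]
    # Second table: each row holds an illness name and its score.
    # Illnesses with no matched symptoms are left out.
    score_rows = [[name, len(matched)] for name, matched in matched_rows if matched]
    score_rows.sort(key=lambda row: row[1], reverse=True)
    return score_rows


# Hypothetical data for illustration.
illnesses = {
    "illness A": ["s1", "s2"],
    "illness B": ["s2", "s3", "s4"],
    "illness C": ["s9"],
}
ranking = rank_illnesses(["s2", "s3", "s4"], illnesses)
for name, score in ranking:
    print(name, score)
```

Python's sort is stable, so illnesses with equal scores keep the order in which they appear in the annotation.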
In another embodiment, a DataTable, which is a VB.net object for storing data in a table format, can be used. The data stored in the table format can then be placed into the data grid. For example, the VB.net DataTable object can store the names of the illnesses that have matched symptoms, along with the RI, which is calculated from the number of matched symptoms.
The result shows that the patient is more likely to catch Yang edema—wind edema than a Liver yang headache. The sorted index is for physician's reference.
The UMLS (Unified Medical Language System) is a medical ontology for allopathic applications and is intrinsically suitable for textual mining. It aims to resolve the difference in terminologies among different incompatible medical systems. The semantic groups in level 1 represent the different domains of query (e.g., TCM diagnosis). Level 2 is the semantic net to formally give one unique answer to a specifically formulated query. Level 3 is the ontological infrastructure for the “global allopathic view,” which is described by Jackei H. K. Wong in “A Concise Survey by PhraPharm on Data Mining Methods,” (2008), which is incorporated by reference herein in its entirety.
In addition to text mining, automatic semantic aliasing support can be included in the evolution of the ontology as described by Jackei H. K. Wong et al. in “Real-Time Enterprise Ontology Evolution to Aid Effective Clinical Telemedicine with Text Mining and Automatic Semantic Aliasing Support,” Proceedings of the OTM (Nov. 9-14, 2008), Vol. 5332 Lecture Notes in Computer Science; (2008), which is incorporated by reference herein in its entirety.
For example, the TCM ontology was built based on all the canonical texts. A physician extracts a list of symptoms for a patient with a rigid diagnostic procedure. This list of symptoms is then matched with those extracted from canonical texts in the form of descriptors for different diseases. The different matches would have varying relevance. A relevance (index) of 0.7 (70%) to Cough, for example, indicates that the patient's sickness has a 70% likelihood of belonging to the Cough context. That is, it could be treated with recipes for Cough. The remaining 30% difference could then mean one of the following:
All patents, patent applications, provisional applications, and publications referred to or cited herein are incorporated by reference in their entirety, including all figures and tables, to the extent they are not inconsistent with the explicit teachings of this specification.
It should be understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application. In addition, any elements or limitations of any invention or embodiment thereof disclosed herein can be combined with any and/or all other elements or limitations (individually or in any combination) or any other invention or embodiment thereof disclosed herein, and all such combinations are contemplated with the scope of the invention without limitation thereto.
This application is a National Stage Application of International Application Number PCT/IB2010/002237, filed Jul. 29, 2010; which claims the benefit of U.S. provisional application Ser. No. 61/229,545, filed Jul. 29, 2009, which are incorporated herein by reference in their entirety.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB2010/002237 | 7/29/2010 | WO | 00 | 12/22/2011 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2011/013007 | 2/3/2011 | WO | A |
Number | Date | Country | |
---|---|---|---|
20120124051 A1 | May 2012 | US |
Number | Date | Country | |
---|---|---|---|
61229545 | Jul 2009 | US |