The present invention relates to a technique of searching a case database for similar case data.
Medical documents and medical images are becoming digital along with the recent popularization of medical information systems including a hospital information system (HIS) and picture archiving and communication system (PACS). Medical images (e.g., X-ray, CT, and MRI images), which used to be developed on film and viewed on a film viewer, are now digitized. Digitized medical images (digital images) are stored in the PACS and, if necessary, read out from it and displayed on the monitor of a terminal. Medical documents such as medical records are also being digitized. The medical record of a patient can be read out from the HIS and displayed on the monitor of a terminal. An image diagnostician in a digital environment can receive an image diagnosis request form as a digital message. He can read out, from the PACS, medical image data obtained by imaging a patient, and display it on the image diagnosis monitor of a terminal. If necessary, the image diagnostician can read out the medical record of the patient from the HIS and display it on another monitor.
When interpreting a medical image to make an image diagnosis, a doctor sometimes hesitates to decide on a diagnosis name if a morbid portion in the image under diagnosis has an unfamiliar image feature, or if there are a plurality of morbid portions having similar image features. In this case, the doctor may ask another, more experienced doctor for advice, or refer to documents such as medical books and read the description of an image feature regarding a suspected disease name. Alternatively, he may examine photo-attached medical documents to search for a photo similar to the morbid portion captured in the image under diagnosis, and read the disease name corresponding to that photo for reference. However, the doctor may not always have an advisory doctor available. Even if the doctor examines documents, he may not be able to locate a photo similar to the morbid portion or a description of its image feature. To address this, apparatuses for searching for similar cases have been proposed recently. The basic idea of such a search apparatus is to support a diagnosis by searching case data accumulated in the past based on some criterion and presenting the result to a doctor.
For example, patent reference 1 discloses a technique of accumulating image data diagnosed in the past in a database in correspondence with diagnosis information including findings and a disease name. Patent reference 1 also discloses a technique of, when findings related to an image to be newly diagnosed are input, searching for past diagnosis information including similar findings and displaying corresponding image data and a disease name. Patent reference 2 discloses a technique of detecting a reference case in which an image diagnosis result and definite diagnosis result are different (case in which an image diagnosis is wrong), and registering it in a reference case database. Further, patent reference 2 discloses a reference case search method capable of referring to a necessary reference case image by designating identification information later.
According to the technique described in patent reference 1, both image data and a disease name are obtained as a similar case search result. However, the similarity between image features is not always guaranteed because the search is based on the similarity between texts. In addition, since only the disease names of case data having similar findings are obtained, case data with different disease names may not always be obtained. The technique described in patent reference 2 can call a doctor's attention to a false diagnosis, but cannot always present case data from which the doctor can infer a correct diagnosis name during image interpretation. When searching past case data for a given case, a plurality of case data with different definite diagnosis results may not be obtained, which may make it difficult for the doctor to reach a decision.
Patent Reference 1: Japanese Patent Laid-Open No. 6-292656
Patent Reference 2: Japanese Patent Laid-Open No. 5-101122
It is an object of the present invention to provide a technique capable of extracting a plurality of case data with different definite diagnosis results when searching for past case data for a given case.
To solve the above-described problems, a data search apparatus according to the present invention comprises the following arrangement. That is, a data search apparatus which extracts data of at least one definite case from a case database that stores a plurality of definite case data including medical image data and definite diagnosis information corresponding to the medical image data comprises an input acceptance unit for accepting input of case data including at least medical image data, a derivation unit for deriving a similarity between each of the plurality of definite case data stored in the case database and the case data input from the input acceptance unit, a classification unit for classifying the plurality of definite case data stored in the case database into a plurality of diagnosis groups based on definite diagnosis information included in each of the plurality of definite case data, and an extraction unit for extracting, based on the similarity derived by the derivation unit, at least a predetermined number of definite case data from each of the plurality of diagnosis groups.
To solve the above-described problems, a data search apparatus control method according to the present invention comprises the following steps. That is, a method of controlling a data search apparatus which extracts data of at least one definite case from a case database that stores a plurality of definite case data including medical image data and definite diagnosis information corresponding to the medical image data comprises an input acceptance step of accepting input of case data including at least medical image data, a derivation step of deriving a similarity between each of the plurality of definite case data stored in the case database and the case data input in the input acceptance step, a classification step of classifying the plurality of definite case data stored in the case database into a plurality of diagnosis groups based on definite diagnosis information included in each of the plurality of definite case data, and an extraction step of extracting, based on the similarity derived in the derivation step, at least a predetermined number of definite case data from each of the plurality of diagnosis groups.
To solve the above-described problems, a data search system according to the present invention comprises the following arrangement. That is, a data search system including a case database which stores a plurality of definite case data including medical image data and definite diagnosis information corresponding to the medical image data, and a data search apparatus which accesses the case database to extract data of at least one definite case, comprises an input acceptance unit for accepting input of case data including at least medical image data, a derivation unit for deriving a similarity between each of the plurality of definite case data stored in the case database and the case data input from the input acceptance unit, a classification unit for classifying the plurality of definite case data stored in the case database into a plurality of diagnosis groups based on definite diagnosis information included in each of the plurality of definite case data, and an extraction unit for extracting, based on the similarity derived by the derivation unit, at least a predetermined number of definite case data from each of the plurality of diagnosis groups.
The present invention can provide a technique capable of extracting a plurality of case data with different definite diagnosis results when searching for past case data for a given case.
Other features and advantages of the present invention will become apparent from the following description of exemplary embodiments with reference to the accompanying drawings. Note that the same reference numerals denote the same or similar parts throughout the accompanying drawings.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
Preferred embodiments of the present invention will now be described in detail with reference to the drawings. It should be noted that the following embodiments are merely examples and do not limit the scope of the present invention.
A similar case search apparatus in a medical data search system will be exemplified as the first embodiment of a data search apparatus according to the present invention.
<Apparatus Arrangement>
The CPU 100 mainly controls the operation of each component of the similar case search apparatus 1. The main memory 101 stores a control program to be executed by the CPU 100, and provides a work area when the CPU 100 executes a program. The magnetic disk 102 stores an operating system (OS), device drivers for peripheral devices, various kinds of application software including programs for performing similar case search processing and the like (to be described later), and work data generated or used by these software programs. The display memory 103 temporarily stores display data for the monitor 104. The monitor 104 is, for example, a CRT monitor or liquid crystal monitor, and displays an image based on data from the display memory 103. The mouse 105 and keyboard 106 receive pointing inputs, text inputs, and the like from the user. The shared bus 107 connects these components so that they can communicate with each other.
In the first embodiment, the similar case search apparatus 1 can read out, via a LAN 5, case data from the case database 2, image data from the medical image database 3, and medical record data from the medical record database 4. The case database 2 functions as a case data archiving unit for archiving a plurality of case data (definite case data) including medical image data and definite diagnosis information corresponding to the medical image data. An existing PACS is usable as the medical image database 3. An electronic medical record system, which is a subsystem of an existing HIS, is available as the medical record database 4. It is also possible to connect external storage devices, for example, an FDD, an HDD, a CD drive, a DVD drive, an MO drive, and a ZIP drive to the similar case search apparatus 1 and read definite case data, image data, and medical record data from these drives.
Note that medical images include a scout X-ray image (roentgenogram), X-ray CT (Computed Tomography) image, MRI (Magnetic Resonance Imaging) image, PET (Positron Emission Tomography) image, SPECT (Single Photon Emission Computed Tomography) image, and ultrasonic image.
The medical record describes personal information (e.g., name, birth date, age, and sex) of a patient, clinical information (e.g., various test values, chief complaint, past history, and treatment history), reference information to patient's image data stored in the medical image database 3, and finding information of a doctor in charge. After making a diagnosis, a definite diagnosis name is described in the medical record.
Case data archived in the case database 2 is created by copying or referring to some of definite diagnosis name-attached medical record data archived in the medical record database 4 and image data archived in the medical image database 3.
<Data Structure>
The components of case data have the following meanings. A “case data ID (DID)” is an identifier for uniquely identifying data of a case. DIDs are assigned as sequential numbers in the order in which case data were added. A “definite diagnosis name” is obtained by copying the definite diagnosis name described in medical record data. The “definite diagnosis name” need not always be a character string and may use a standard diagnosis code (a numerical value uniquely corresponding to a definite diagnosis name). A “diagnosis group ID (GID)” is an identifier for uniquely identifying a diagnosis group. A diagnosis group is a set of definite diagnosis names which need not be distinguished in image diagnosis. For example, pulmonary diseases include lung cancer, pneumonia, and tuberculosis. These diseases require different medical treatments and need to be discriminated even in image diagnosis. In contrast, lung adenocarcinoma, squamous cell carcinoma, small cell lung cancer, and the like are more detailed diagnoses within lung cancer. It is difficult and unnecessary to distinguish these diseases in image diagnosis, so they are classified into the same diagnosis group as lung cancer. Deciding a diagnosis group requires medical knowledge about image diagnosis.
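The diagnosis group assignment described above amounts to a many-to-one lookup from definite diagnosis names to GIDs. The following is a minimal illustrative sketch; the disease names and GID values are assumptions for the example, not taken from the actual correspondence table.

```python
# Hypothetical lookup table from definite diagnosis name to diagnosis
# group ID (GID). The detailed lung-cancer subtypes share the lung-cancer
# group, since they need not be distinguished in image diagnosis.
DIAGNOSIS_GROUPS = {
    "lung cancer": 1,
    "lung adenocarcinoma": 1,
    "squamous cell carcinoma": 1,
    "small cell lung cancer": 1,
    "pneumonia": 2,
    "tuberculosis": 3,
}

def diagnosis_group_id(definite_diagnosis_name: str) -> int:
    """Return the diagnosis group ID (GID) for a definite diagnosis name."""
    return DIAGNOSIS_GROUPS[definite_diagnosis_name]
```

In practice a standard diagnosis code, rather than a character string, would likely serve as the lookup key.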
In the first embodiment, the correspondence table exemplified in
Referring again to the case data table 900, “reference information to medical record data” is reference information for reading out medical record data corresponding to case data from the medical record database 4. “Reference information to medical record data” is stored instead of copying medical record data itself into case data. The case data table can be downsized, saving storage capacity.
An “imaging date” and “image type” can be read out from header information of medical record data or image data. A “target organ” is information representing an organ containing the region of interest of an image (to be described later). A doctor inputs this information when creating case data. The “target organ” can also be automatically input by automatically identifying an organ using the most advanced computer image processing technique.
“Reference information to image data” is reference information for reading out image data corresponding to case data from the medical image database 3. “Reference information to image data” is stored instead of copying image data itself into case data. The case data table can be downsized, saving storage capacity.
A “slice number of interest” is information necessary when the type of medical image is made up of a plurality of slices, like a CT, MRI, or PET image. The “slice number of interest” indicates the number of the slice image containing the most concerning region (region of interest) in image diagnosis. “Coordinate information (X0, Y0, X1, Y1) of the region of interest” is information representing the X-Y coordinate range containing the region of interest in the slice image indicated by the “slice number of interest”. In general, coordinate information is expressed as position information of pixels in an orthogonal coordinate system in which the upper left corner of an image is set as the origin, the rightward direction serves as the X-axis direction, and the downward direction serves as the Y-axis direction. The coordinate information (X0, Y0, X1, Y1) represents both the coordinates (X0, Y0) of the upper left corner of the region of interest and the coordinates (X1, Y1) of its lower right corner.
The region of interest is obtained as follows. First, image data corresponding to case data is read out from the medical image database 3 using the “reference information to image data”. Then, a slice image designated by the “slice number of interest” is selected. Finally, image data is extracted from a range designated by the “coordinate information (X0, Y0, X1, Y1) of the region of interest”, thereby obtaining image data of the region of interest.
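The slice selection and cropping steps above can be sketched as follows. This is an illustrative sketch that assumes slice images are plain nested lists of pixel values, slice numbers are 1-based, and (X0, Y0, X1, Y1) are inclusive corner coordinates; the function name is hypothetical.

```python
def extract_region_of_interest(slices, slice_number, x0, y0, x1, y1):
    """Return the region of interest from a stack of slice images.

    `slices` is a list of 2-D row-major pixel arrays (lists of rows).
    The slice number is assumed 1-based, and (x0, y0)/(x1, y1) are the
    upper-left/lower-right corners in a coordinate system whose origin
    is the upper-left pixel, X increasing rightward and Y downward.
    """
    slice_image = slices[slice_number - 1]            # slice of interest
    # Crop rows y0..y1 and, within each, columns x0..x1 (both inclusive).
    return [row[x0:x1 + 1] for row in slice_image[y0:y1 + 1]]
```

In a real system the slice stack would come from the medical image database 3 via the “reference information to image data” rather than being held in memory as nested lists.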
“Image feature information F of the region of interest” is information representing the feature of image data of the region of interest. F is multi-dimensional information (vector information) formed from a plurality of image feature amounts f1, f2, f3, . . . . Examples of the image feature amounts are as follows:
Needless to say, various other image feature amounts can be calculated.
To calculate an image feature amount concerning a morbid portion, the range (boundary) of the morbid portion needs to be specified in advance. General methods of specifying the range of a morbid portion are a manual extraction method, in which a doctor designates the boundary of the morbid portion while viewing the image, and an automatic extraction method using an image processing technique. In the embodiment, either manual extraction or automatic extraction is available. The combination of image feature amounts expressing F is important for calculating the similarity of image data. Generally, a larger number of image feature amounts can express the feature of image data in more detail, but similarity calculation then takes longer. Hence, F is normally defined as a combination of ten to several tens of image feature amounts whose information is weakly correlated.
A case data table 1000 is another example of the case data table having components different from those of the case data table 900. Note that a “case data ID (DID)”, “definite diagnosis name”, and “diagnosis group ID (GID)” are the same as those in the case data table 900.
“Predetermined clinical information C” is necessary clinical information selectively copied from medical record data archived in the medical record database 4. C is multi-dimensional information (vector information) formed from pieces of clinical information c1, c2, c3, . . . . Examples of the pieces of clinical information are various test values (e.g., physical examination value, blood test value, and test values regarding a specific disease such as a cancer marker and inflammatory marker), a past history, and a treatment history. A combination of pieces of clinical information expressing C is important for calculating the similarity of clinical information. Deciding a proper C greatly depends mainly on an organ to be diagnosed and the type of disease.
An “imaging date”, “image type”, and “target organ” are the same as those in the case data table 900. “Image data I of the region of interest” is a copy of image data in the region of interest in the slice image of interest selected from image data archived in the medical image database 3. I is multi-dimensional information (vector information) formed from pieces of pixel information i1, i2, i3, . . . as many as pixels falling within the region of interest. “Image feature information F of the region of interest” is the same as that in the case data table 900.
A main difference between the case data tables 900 and 1000 is whether the clinical information C and the image data I are stored indirectly by reference (the case data table 900) or directly (the case data table 1000). When the case database 2 has a sufficiently large capacity, it is preferable to store all data directly in the case data table, as exemplified by the case data table 1000. This is because data archived in one database can be read out by a single readout process, whereas readout of data archived in a plurality of databases requires a plurality of readout processes, which complicates the processing procedures and prolongs the processing time.
Referring to
In
<Operation of Apparatus>
Control of the similar case search apparatus 1 by the controller 10 will be explained with reference to the flowcharts of
The execution status and execution result of a program executed by the CPU 100 are displayed on the monitor 104 as a result of the function of the OS and display program separately executed by the CPU 100. The case database 2 is assumed to archive the case data table 1000 exemplified in
In step S310, the CPU 100 accepts input of indefinite case data D0 in response to a command input from a user (doctor). More specifically, the CPU 100 reads the indefinite case data D0 into the main memory 101 from the medical image database 3 or a medical imaging apparatus (not shown) via the shared bus 107 and LAN 5. The CPU 100 may instead read the indefinite case data D0 into the main memory 101 from the magnetic disk 102 or an external storage device (not shown) via the shared bus 107. In the following description, the indefinite case data D0 includes only information on image data for descriptive convenience. That is, the indefinite case data D0 includes the imaging date, image type, target organ, image data I0 of the region of interest, and image feature information F0 of the region of interest, but does not include predetermined clinical information C0. Similar case search processing is therefore almost the same as similar image search processing. Note that the indefinite case data D0 may include the predetermined clinical information C0 obtained from various clinical test results and the like. Basic processing procedures are the same whether or not the indefinite case data D0 includes the predetermined clinical information C0, except for whether or not to use C0 in similarity calculation.
In step S320, the CPU 100 decides similar case search conditions in accordance with the command input from the doctor. The similar case search conditions are used to limit the case data to undergo similar case search. More specifically, only case data whose “image type” and “target organ” components match those of the indefinite case data D0 are subjected to similar case search. This is because, when these components differ from those of the indefinite case data D0, the image feature information F of the region of interest is often greatly different. It therefore improves working efficiency to exclude such case data from the search targets from the beginning. It is preferable that the decided similar case search conditions can be flexibly changed in accordance with a command input from a doctor, in preparation for similar case search among case data different in “image type” and/or “target organ”.
In the following processing example, the “image type” of the indefinite case data D0 is a “contrast-enhanced CT image” and the “target organ” is the “lung”. That is, a processing example upon receiving a command to set the “contrast-enhanced CT image” as the “image type” and the “lung” as the “target organ” as similar case search conditions will be explained.
In step S330, the CPU 100 creates a search case data table exemplified in
The search case data table creation method will be described in detail. The CPU 100 reads case data meeting similar case search conditions from the case database 2 via the shared bus 107 and LAN 5. As described in step S320, case data are limited to those whose “image type” and “target organ” are a contrast-enhanced CT image and lung, respectively, as the similar case search conditions in the embodiment. In
In the first embodiment, as a notation representing row data in an arbitrary table, when a value (an ID in general) written at the start of a row (first column) is a value X, all the row data are denoted by X. In other words, X = {X, . . . }. In the example of
In step S340, the CPU 100 selects top similar case data T1, T2, . . . , Tm from the search case data table exemplified in
A “top similar case data ID (TID)” is an identifier for uniquely identifying top similar case data. After the end of selecting top similar case data in step S340, sequential numbers are assigned as TIDs in order from the first row. A “second case data ID (D′ID)”, “diagnosis group ID (GID)”, and “similarity R” are the same as those described with reference to
In step S410, the CPU 100 creates a top similar case data table exemplified in
In step S420, the CPU 100 checks a value N representing the total number of case data (number of rows of the search case data table) in the search case data table exemplified in
In step S430, the CPU 100 reads out case data D′n of the nth row from the search case data table exemplified in
In step S440, the CPU 100 calculates a similarity Rn between the indefinite case data D0 read in step S310 and the case data D′n read out in step S430. The CPU 100 stores the similarity Rn by writing it in the “similarity R” column of the nth row in the search case data table stored in the main memory 101. As the method of calculating the similarity Rn, an arbitrary calculation method can be defined as long as it uses information included in both the indefinite case data D0 and case data D′n. In the example of
where F0={f01, f02, f03, . . . } and Fn={fn1, fn2, fn3, . . . }
Equation (1) can be geometrically interpreted as the reciprocal of the Euclidean distance between the F0 and Fn vectors. The similarity Rn should take a larger value for a shorter distance between the vectors and thus is defined as the reciprocal of the distance between the vectors. To reduce the calculation amount, a difference R′n may be calculated based on equation (2), in place of the similarity Rn. To further reduce the calculation amount, a difference R″n may be calculated based on equation (3). When the difference R′n or R″n is calculated instead of the similarity Rn, a determination method in step S450 is also changed, which will be described later. A determination method in step S535 of
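Assuming equation (1) is the reciprocal of the Euclidean distance between F0 and Fn, equation (2) the distance itself, and equation (3) the squared distance, the three measures can be sketched as follows; the function names are illustrative.

```python
import math

def similarity(f0, fn):
    """Similarity R per equation (1): the reciprocal of the Euclidean
    distance between feature vectors F0 and Fn, so a shorter distance
    (a more similar case) gives a larger value. Identical vectors would
    divide by zero; real code needs a guard for that edge case."""
    return 1.0 / math.sqrt(sum((a - b) ** 2 for a, b in zip(f0, fn)))

def difference(f0, fn):
    """Difference R' per equation (2): the Euclidean distance itself,
    avoiding the division; smaller values now mean more similar cases."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f0, fn)))

def squared_difference(f0, fn):
    """Difference R'' per equation (3): the squared distance, further
    avoiding the square root; smaller values mean more similar cases."""
    return sum((a - b) ** 2 for a, b in zip(f0, fn))
```

Because R′ and R″ invert the ordering (smaller is better), any comparison such as the one in step S450 must flip direction when they are used, exactly as the text describes.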
In step S450, the CPU 100 compares the similarity Rn calculated in step S440 with the similarity R of top similar case data Tm (T3 in the example of
When the difference R′n or R″n is calculated in place of the similarity Rn in step S440, the determination method in step S450 is changed as follows. If the value R′n or R″n is smaller than the value R′ or R″ of Tm, top similar case data must be replaced, and the process advances to step S460. If the value R′n or R″n is greater than or equal to the value R′ or R″ of Tm, no top similar case data need be replaced, and the process advances to step S480.
In step S460, the CPU 100 overwrites the row Tm (T3 in the example of
In step S470, the CPU 100 sorts all the rows (from T1 to Tm) of the top similar case data table in descending order of the “similarity R” value.
In step S480, the CPU 100 increments the index variable n (by one).
In step S490, the CPU 100 compares the index variable n with the number N of rows of the search case data table. If the value n is larger than the value N, all the case data in the search case data table have already been read, and the processing in step S340 ends. If the value n is less than or equal to the value N, not all the case data in the search case data table have been read yet, and the process returns to step S430 and continues. As described above, the contents of the top similar case data table (
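The selection loop of steps S410 to S490 can be sketched as follows. The tuple layout of the search case rows and the function name are assumptions for illustration; `similarity` is any function playing the role of the similarity R.

```python
def select_top_similar(search_cases, d0_features, m, similarity):
    """Keep the m case rows most similar to the indefinite case data D0.

    `search_cases` holds (case_id, group_id, features) tuples. The
    returned rows are sorted in descending order of similarity, so the
    final row always holds the smallest similarity in the table.
    """
    top = []  # rows T1..Tm as (similarity R, case_id, group_id)
    for case_id, group_id, features in search_cases:   # steps S430-S440
        r = similarity(d0_features, features)
        if len(top) < m:                     # table not yet full: append
            top.append((r, case_id, group_id))
        elif r > top[-1][0]:                 # step S450: beats final row?
            top[-1] = (r, case_id, group_id) # step S460: overwrite Tm
        else:
            continue                         # step S480: next case
        top.sort(key=lambda row: row[0], reverse=True)  # step S470
    return top
```

With the difference R′ or R″ in place of R, the comparison and sort direction would both flip, matching the changed determination described for step S450.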
In step S350, the CPU 100 checks top similarity diagnosis group IDs and their related group IDs, and decides a combination of each top similarity diagnosis group ID and its related group IDs as a search target group ID. Processing procedures at this time will be described in detail with reference to
The CPU 100 checks values on all rows in the “diagnosis group ID (GID)” column of the top similar case data table exemplified in
In the examples of
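Step S350 can be sketched as follows, with a hypothetical mapping from each diagnosis group ID to its related group IDs standing in for the correspondence table.

```python
def decide_search_target_groups(top_similar_rows, related_groups):
    """Combine top similarity diagnosis group IDs with their related
    group IDs into the set of search target group IDs (step S350).

    `top_similar_rows` holds (similarity, case_id, group_id) rows from
    the top similar case data table; `related_groups` maps each GID to
    an iterable of its related GIDs.
    """
    targets = set()
    for _, _, gid in top_similar_rows:
        targets.add(gid)                             # top similarity group
        targets.update(related_groups.get(gid, ()))  # its related groups
    return sorted(targets)
```

Using a set naturally removes duplicates when several top similar cases fall in the same diagnosis group or share related groups.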
In step S360, the CPU 100 decides the lower and upper limits of the selection number of similar case data for each search target group ID. That is, the CPU 100 sets an extraction criterion for each group.
How to decide the selection number (lower limit, upper limit) will be explained with reference to
By applying these rules, only the first value suffices to be decided in advance. When the predetermined value is set to be changeable in accordance with a command input from a doctor, the number of similar cases displayed as similar case search results can be changed. In addition to this decision method, the selection number (lower limit, upper limit) can be decided in various ways. A preferable decision method changes depending on the preference of a user, that is, doctor, the window size for displaying similar case search results, and the like. It is also possible to prepare in advance a plurality of selection number (lower limit, upper limit) decision methods and change the selection number (lower limit, upper limit) decision method based on a command input from a doctor.
In the first embodiment, the lower and upper limits of the selection number of similar case data are decided, but both of them need not always be decided. For example, only one selection number may be decided for each search target group ID, instead of flexibly setting the selection number of similar case data. In this case, deciding each selection number means setting the lower and upper limits of the selection number to be equal to each other. Processing procedures when deciding each selection number fall within those when deciding the lower and upper limits of the selection number.
In step S370, the CPU 100 selects similar case data for each search target group ID. Detailed processing procedures in step S370 will be described with reference to
In step S510, the CPU 100 checks the value of a “search target group ID” on the final row of the correspondence table exemplified in
In step S515, the CPU 100 creates search target group-specific similar case data tables exemplified in
The CPU 100 processes the respective rows in
In step S520, the CPU 100 checks the total number N of case data (number of rows of the search case data table) in the search case data table exemplified in
In step S525, the CPU 100 reads out case data D′n of the nth row from the search case data table exemplified in
In step S530, the CPU 100 compares the value of the diagnosis group ID (GID) in the case data D′n read out in step S525 with a value Gk to be described below. If the two values are equal to each other as a result of the comparison, the process advances to step S535. If the two values are different from each other as a result of the comparison, the process advances to step S560.
How to obtain the value Gk will be explained in detail with reference to the tables shown in
When step S530 is executed for the first time, case data D′1 on the first row in
In step S535, the CPU 100 compares two “similarity R” values. One “similarity R” value is the value Rn of the “similarity R” in the case data D′n read out in step S525. The other “similarity R” value is the value (to be simply referred to as an R value of GTm for Gk) of the “similarity R” on a final row GTm of a similar case data table for Gk exemplified in
In step S540, the CPU 100 overwrites the final row GTm of the similar case data table for Gk exemplified in
In step S545, the CPU 100 sorts all the rows (from GT1 to GTm) of the similar case data table for Gk in descending order of the “similarity R”. As a result, the “similarity R” of GTm is the smallest value in the similar case data table for Gk.
In step S550, the CPU 100 increments the index variable n by one. In step S555, the CPU 100 compares the index variable n with the value N (number of rows of the search case data table exemplified in
In step S560, the CPU 100 increments the index variable k by one. In step S565, the CPU 100 compares the index variable k with the value Gmax (value of the “search target group ID” on the final row of the correspondence table exemplified in
By the processing in step S370 described with reference to
In the processing procedures of the step S370 described with reference to
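The single-scan variant suggested above (filling every search target group's table while reading the search case data table only once) can be sketched as follows; the tuple layout and function names are assumptions for illustration.

```python
def select_per_group(search_cases, d0_features, target_gids, m, similarity):
    """For each search target group, keep up to m case rows with the
    highest similarity R to the indefinite case data D0 (step S370).

    `search_cases` holds (case_id, group_id, features) tuples. The result
    maps each target GID to its rows sorted by descending similarity.
    """
    tables = {gid: [] for gid in target_gids}   # group-specific tables
    for case_id, gid, features in search_cases:
        if gid not in tables:                   # step S530: not a target
            continue
        r = similarity(d0_features, features)
        rows = tables[gid]
        if len(rows) < m:                       # table not yet full
            rows.append((r, case_id))
        elif r > rows[-1][0]:                   # step S535: beats GTm?
            rows[-1] = (r, case_id)             # step S540: overwrite GTm
        else:
            continue
        rows.sort(key=lambda row: row[0], reverse=True)  # step S545
    return tables
```

Compared with re-reading the search case data table once per group, this pass trades a dictionary lookup per row for N − 1 fewer table scans.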
In step S380, the CPU 100 classifies similar case data into respective diagnosis groups and displays them by referring to the contents of diagnosis group-specific similar case data tables created in step S370. Processing procedures when reading out similar case data for each search target group by the CPU 100 will be described in detail with reference to the examples of
The CPU 100 reads out values in the “search target group ID” of the correspondence table exemplified in
Then, the CPU 100 reads out values in the “case data ID (DID)” of the similar case data table for G3 in
When D9 is read out from the case data table 1000, a “definite diagnosis name”, “predetermined clinical information C”, and “image data I of the region of interest” in D9 are extracted, obtaining the first definite diagnosis name-attached similar case data for G3. Other definite diagnosis name-attached similar case data can be obtained by the same procedure.
When D9 is read out from the case data table 900, a “definite diagnosis name” can be directly extracted, but predetermined clinical information and image data of the region of interest need to be read out from the medical record database 4 and medical image database 3, respectively. To extract predetermined clinical information, “reference information to medical record data” in D9 read out from the case data table 900 is extracted. Then, medical record data referred to by the reference information is read out from the medical record database 4. Predetermined clinical information is extracted from the medical record data. To extract image data of the region of interest, “reference information to image data” in D9 read out from the case data table 900 is extracted. Then, image data referred to by the reference information is read out from the medical image database 3. Further, a “slice number of interest” and “coordinate information (X0, Y0, X1, Y1) of the region of interest” in D9 read out from the case data table 900 are extracted. By using these pieces of information, the slice number of interest and the region of interest in the image data read out from the medical image database 3 are specified, obtaining image data of the region of interest.
Consequently, in the examples of
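The indirect readout from the case data table 900 can be sketched as follows. All database accessors, key names, and the sample values below are hypothetical stand-ins chosen for illustration; only the overall flow (resolve the reference to the medical record, resolve the reference to the image data, then crop with the slice number and ROI coordinates) follows the description above.

```python
def build_attached_case(record, medical_record_db, medical_image_db):
    # The definite diagnosis name is stored directly in the case data record.
    diagnosis = record["definite_diagnosis_name"]
    # Clinical information is resolved indirectly via the stored reference.
    med_record = medical_record_db[record["ref_to_medical_record"]]
    clinical_info = med_record["predetermined_clinical_info"]
    # Image data of the region of interest: read the referenced image,
    # then crop it using the slice number of interest and ROI coordinates.
    volume = medical_image_db[record["ref_to_image_data"]]
    z = record["slice_number_of_interest"]
    x0, y0, x1, y1 = record["roi"]
    roi_image = [row[x0:x1 + 1] for row in volume[z][y0:y1 + 1]]
    return {"diagnosis": diagnosis,
            "clinical_info": clinical_info,
            "roi_image": roi_image}

# toy stand-ins for the medical record database 4 and medical image database 3
medical_records = {"MR9": {"predetermined_clinical_info": "age 63, smoker"}}
images = {"IMG9": [[[0, 1, 2], [3, 4, 5], [6, 7, 8]]]}  # one 3x3 slice
record = {"definite_diagnosis_name": "lung cancer",
          "ref_to_medical_record": "MR9",
          "ref_to_image_data": "IMG9",
          "slice_number_of_interest": 0,
          "roi": (1, 0, 2, 1)}  # (X0, Y0, X1, Y1)
attached = build_attached_case(record, medical_records, images)
```

The table-1000 case is simpler: the clinical information and ROI image are stored inline, so the two reference resolutions are skipped.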
When the number of definite diagnosis name-attached similar case data must be reduced, for example because the window for displaying similar case search results is small, the selection number of similar case data for each search target group (= diagnosis group) is decreased. At this time, the selection number for each search target group (= diagnosis group) can be decreased down to the lower limit (1) by referring to the lower limit of the selection number of similar case data exemplified in
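One possible way to shrink the per-group selection counts while respecting each group's lower limit is sketched below. The shrink order (largest group first) and the concrete limits are assumptions for this example, not requirements of the embodiment.

```python
def shrink_counts(counts, lower_limits, budget):
    """Reduce per-group selection counts one at a time (largest group
    first) until their total fits in `budget`, never going below each
    group's lower limit."""
    counts = dict(counts)
    while sum(counts.values()) > budget:
        # groups that can still shrink without violating their lower limit
        candidates = [g for g in counts if counts[g] > lower_limits[g]]
        if not candidates:
            break                        # every group is at its lower limit
        g = max(candidates, key=lambda x: counts[x])
        counts[g] -= 1
    return counts

# hypothetical display budget of 5 result slots across three groups
limits = {"G1": 1, "G2": 1, "G3": 1}
shrunk = shrink_counts({"G1": 3, "G2": 3, "G3": 2}, limits, budget=5)
```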
As described above, the similar case search apparatus according to the first embodiment can extract a plurality of definite case data having different diagnosis results from the case database 2 for input indefinite case data. Based on the diagnosis results of the extracted definite case data, a user (doctor) can examine a plurality of diagnosis results which may correspond to the input case data.
The second embodiment will explain a technique of extracting a wider variety of definite case data than in the first embodiment. The apparatus arrangement is the same as that in the first embodiment, and a description thereof will not be repeated. The processing procedures described with reference to the flowcharts of
Processing procedures in S370 according to the second embodiment will be explained with reference to the flowcharts of
Processing in step S510 is the same as that in the first embodiment. Processing in step S515 is almost the same as that in the first embodiment except that a search target group-specific similar case data table exemplified in
In step S515, a CPU 100 creates search target group-specific similar case data tables exemplified in
Processes in steps S520 to S535 and steps S550 to S565 are the same as those in the first embodiment, and a description thereof will not be repeated.
The processing in the second embodiment is greatly different from that in the first embodiment in steps S540 and S545 of
In step S610, the CPU 100 checks the number m of rows of the similar case data table for Gk exemplified in
In step S620, the CPU 100 reads out case data GTi of the ith row from the similar case data table for Gk exemplified in
In step S630, the CPU 100 calculates a similarity GkRi between case data D′n read out in step S525 of
where Fn = {fn1, fn2, fn3, …} and Fi = {fi1, fi2, fi3, …}
As described in step S440 of
In step S640, the CPU 100 compares the similarity GkRi calculated in step S630 with a predetermined threshold. The predetermined threshold is used to determine whether two case data belonging to the same diagnosis group are very similar to each other. If the similarity GkRi is equal to or higher than the predetermined threshold (the case data D′n and GTi are very similar), the process advances to step S650. If the similarity GkRi is lower than the predetermined threshold (the case data D′n and GTi are not so similar), the process advances to step S660.
When the difference GkR′i or GkR″i is calculated in place of the similarity GkRi in step S630, the determination method in step S640 is changed as follows. If the difference GkR′i or GkR″i is smaller than a predetermined threshold, the process advances to step S650. If the difference GkR′i or GkR″i is greater than or equal to the predetermined threshold, the process advances to step S660.
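The text defers the exact formula for GkRi to step S440. As one plausible instantiation, the difference between the feature vectors Fn and Fi can be taken as their Euclidean distance and the similarity as its reciprocal; both choices, and the threshold value, are assumptions made here purely for illustration.

```python
import math

def feature_difference(fn, fi):
    # Euclidean distance between two feature vectors (assumed metric).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fn, fi)))

def feature_similarity(fn, fi):
    # Reciprocal mapping: identical vectors give similarity 1.0,
    # and similarity decreases as the difference grows.
    return 1.0 / (1.0 + feature_difference(fn, fi))

def is_very_similar(fn, fi, threshold=0.5):
    # Step S640: the two case data count as "very similar" when the
    # similarity reaches the threshold (equivalently, when the
    # difference is small enough).
    return feature_similarity(fn, fi) >= threshold
```

With this formulation the two branches of step S640 are mirror images: comparing the similarity against a threshold from above is the same test as comparing the difference against a corresponding threshold from below.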
In step S650, the CPU 100 increments the “overlapping count” of the case data GTi by one, and writes it in the “overlapping count” column of the ith row in the similar case data table for Gk exemplified in
In step S660, the CPU 100 increments the index variable i by one. In step S670, the CPU 100 compares the index variable i with the value m checked in step S610. If i is greater than m, the process advances to step S680. If i is less than or equal to m, the process returns to step S620.
In step S680, the CPU 100 overwrites the final row GTm (GT4 in the example of the similar case data table for G2 in
In step S690, the CPU 100 sorts all the rows (from GT1 to GTm) of the similar case data table for Gk in descending order of the “similarity R” value. Thereafter, the processing in
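The second embodiment's handling of one candidate case (steps S610 to S690) can be sketched as follows: a candidate that is very similar to an existing row of the same diagnosis group only increments that row's overlapping count instead of entering the table, which is what widens the variety of the retained cases. The similarity function, threshold, and data layout below are placeholders.

```python
def insert_with_overlap(table, candidate, similarity, threshold, m):
    """table: list of dicts with keys 'features', 'R', 'overlap',
    kept sorted in descending order of 'R'."""
    for row in table:                                   # S620-S670: scan GT1..GTm
        if similarity(candidate["features"], row["features"]) >= threshold:
            row["overlap"] += 1                         # S650: count the overlap
            return                                      # candidate itself is dropped
    if len(table) < m:                                  # table not yet full
        table.append(dict(candidate, overlap=0))
    elif candidate["R"] > table[-1]["R"]:
        table[-1] = dict(candidate, overlap=0)          # S680: overwrite final row
    table.sort(key=lambda r: r["R"], reverse=True)      # S690: descending sort

# toy similarity: 1.0 for identical feature vectors, 0.0 otherwise
same = lambda a, b: 1.0 if a == b else 0.0
table = []
insert_with_overlap(table, {"features": (1, 2), "R": 0.9}, same, 0.5, m=3)
insert_with_overlap(table, {"features": (1, 2), "R": 0.8}, same, 0.5, m=3)
```

In the example, the second candidate duplicates the first row's features, so only the overlapping count grows while the table keeps a single, representative entry.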
As described above, the similar case search apparatus according to the second embodiment can extract a plurality of definite case data having different diagnosis results from the case database 2 for input indefinite case data. In particular, the similar case search apparatus according to the second embodiment can extract a wider range (more varied kinds) of definite case data in comparison with the first embodiment. By displaying the “overlapping count”, the apparatus can also indicate to the user how strongly each extracted case data is related to the input case data.
The present invention is also achieved by executing the following processing. More specifically, software (program) for implementing the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media. The computer (or the CPU or MPU) of the system or apparatus reads out and executes the program.
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2008-246599, filed Sep. 25, 2008, which is hereby incorporated by reference herein in its entirety.
This application is a CONTINUATION of PCT application No. PCT/JP2009/003459, filed on Jul. 23, 2009, which claims priority from Japanese Patent Application No. 2008-246599, filed on Sep. 25, 2008, the disclosures of which are hereby incorporated by reference herein in their entirety.