Claims
- 1. A method for converting information from at least one raw database into a distilled database, the raw database including a plurality of records, each of the plurality of records including a data field, each data field including a data element, the method comprising the steps of:
converting a non-numeric data field in the raw database to a numeric vector; comparing said vector with a distilled matrix to determine whether said vector is included in said distilled matrix; including said vector in said distilled matrix if said vector is not included in said distilled matrix; and forming the distilled database using said distilled matrix.
- 2. The method of claim 1, further comprising the steps of:
maintaining information with said vector indicative of its origin in the raw database.
- 3. The method of claim 1, further comprising the steps of:
including said vector in a reference database; and identifying an appropriate position for said vector in said reference database.
- 4. The method of claim 3, wherein said step of identifying an appropriate position for said vector comprises the step of locating another vector similar to said vector.
- 5. The method of claim 4, wherein said step of locating another vector similar to said vector comprises the step of numerically comparing said vector with said another vector.
- 6. The method of claim 3, further comprising the step of locating a first vector in said reference database that is similar to a second vector in said reference database.
- 7. The method of claim 6, wherein said step of locating a first vector comprises the step of locating said first vector in said reference database that is identifiable as said second vector in said reference database.
- 8. The method of claim 7, wherein said step of locating said first vector comprises the step of locating said first vector in said reference database that is a duplicate of said second vector in said reference database.
- 9. The method of claim 6, further comprising the step of forming a distilled vector from said first vector and said second vector that includes the best information from said first vector and said second vector.
- 10. The method of claim 9, wherein said step of comparing said vector with a distilled matrix comprises the step of comparing said distilled vector with said distilled matrix to determine whether said distilled vector is included in said distilled matrix.
- 11. The method of claim 3, further comprising the step of locating a first vector in said reference database that is dissimilar to every other vector in said reference database.
- 12. The method of claim 11, further comprising the step of forming a distilled vector from said first vector.
- 13. The method of claim 12, wherein said step of comparing said vector with a distilled matrix comprises the step of comparing said distilled vector with said distilled matrix to determine whether said distilled vector is included in said distilled matrix.
- 14. The method of claim 1, wherein said step of converting the data field comprises the steps of:
selecting an appropriate number system with a radix at least equal to a number of possible values of a data element in said data field; representing said data element as a digit in the number system; and storing said digit in said vector.
- 15. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises the step of determining a dot product between said vector and a vector in said distilled matrix.
- 16. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises performing an eigenvector analysis.
- 17. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises performing a pattern recognition analysis.
- 18. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises the step of determining a dot product between said vector and a vector in said distilled matrix.
- 19. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises the step of determining a cross product between said vector and a vector in said distilled matrix.
- 20. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises the step of determining a difference between said vector and a vector in said distilled matrix.
- 21. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises the step of determining a sum of said vector and a vector in said distilled matrix.
- 22. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises the step of determining a determinant of said distilled matrix.
- 23. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises the step of determining a magnitude of said vector.
- 24. The method of claim 1, wherein said step of comparing said vector with a distilled matrix comprises the step of determining a direction of said vector.
- 25. A method for converting information from a raw database into a distilled database, the raw database including a plurality of records, each of the plurality of records including a data field, the data field including a plurality of data elements, the method comprising:
converting the plurality of data elements in at least one non-numeric data field of one of the plurality of records in the raw database to a numeric value; forming a vector including said numeric value, said vector representative of said one of the plurality of records in the raw database; comparing said vector with a distilled matrix to determine whether said vector is included in said distilled matrix, said comparing using said numeric value; including said vector in said distilled matrix if said vector is not included in said distilled matrix; and forming the distilled database using said distilled matrix.
- 26. The method of claim 25, wherein said converting the plurality of data elements in at least one non-numeric data field of one of the plurality of records comprises:
representing each of the plurality of data elements as a digit in a number system, said number system having a radix at least equal to a number of possible values of a data element in said non-numeric data field, said digits collectively forming said numeric value in said number system.
- 27. The method of claim 25, further comprising:
maintaining information with said vector indicative of its origin in the raw database.
- 28. The method of claim 25, further comprising:
including said vector in a reference database; and identifying an appropriate position for said vector in said reference database.
- 29. The method of claim 28, wherein said identifying an appropriate position for said vector comprises locating another vector similar to said vector.
- 30. The method of claim 29, wherein said locating another vector similar to said vector comprises numerically comparing said vector with said another vector.
- 31. The method of claim 28, further comprising locating a first vector in said reference database that is similar to a second vector in said reference database.
- 32. The method of claim 31, wherein said locating a first vector comprises locating said first vector in said reference database that is identifiable as said second vector in said reference database.
- 33. The method of claim 32, wherein said locating said first vector comprises locating said first vector in said reference database that is a duplicate of said second vector in said reference database.
- 34. The method of claim 31, further comprising forming a distilled vector from said first vector and said second vector that includes the best information from said first vector and said second vector.
- 35. The method of claim 34, wherein said comparing said vector with a distilled matrix comprises comparing said distilled vector with said distilled matrix to determine whether said distilled vector is included in said distilled matrix.
- 36. The method of claim 28, further comprising locating a first vector in said reference database that is dissimilar to every other vector in said reference database.
- 37. The method of claim 36, further comprising forming a distilled vector from said first vector.
- 38. The method of claim 37, wherein said comparing said vector with a distilled matrix comprises comparing said distilled vector with said distilled matrix to determine whether said distilled vector is included in said distilled matrix.
- 39. A distilled database comprising:
a plurality of vectors formed from a plurality of records included in a raw database, each of said plurality of records having a plurality of data fields including at least one non-numeric data field, wherein each of said plurality of vectors includes a numeric value represented in a first number system that was converted from a value of said at least one non-numeric data field.
- 40. The distilled database of claim 39, wherein said numeric value retains semantic significance of its corresponding value in said at least one non-numeric data field.
- 41. A method for converting information from a raw database into a distilled database, the raw database including a plurality of records, each of the plurality of records including a non-numeric data field having a plurality of data elements, the method comprising:
converting a value of the non-numeric data field of one of the plurality of records in the raw database to a numeric value represented in a first number system, said first number system having a radix at least equal to a number of possible values of each of the plurality of data elements, said numeric value retaining semantic significance with respect to said value of the non-numeric data field; forming a vector including said numeric value, said vector representative of said one of the plurality of records in the raw database; comparing said vector with the distilled database to determine whether said vector is included in the distilled database, said comparing using said numeric value; and including said vector in the distilled database if said vector is not included in the distilled database.
- 42. A method for converting information from at least one raw database into a distilled database, the raw database including a plurality of records, each of the plurality of records including a data field, each data field including a data element, the method comprising the steps of:
converting a non-numeric data field in the raw database to a numeric vector; comparing said vector with a distilled matrix to determine whether said vector is included in said distilled matrix; including said vector in said distilled matrix if said vector is not included in said distilled matrix; and forming the distilled database using said distilled matrix.
- 43. A method for converting information from a raw database into a distilled database, the raw database including a plurality of records, each of the plurality of records including a data field, the data field including a plurality of data elements, the method comprising:
converting the plurality of data elements in at least one non-numeric data field of one of the plurality of records in the raw database to a numeric value; forming a vector including said numeric value, said vector representative of said one of the plurality of records in the raw database; comparing said vector with a distilled matrix to determine whether said vector is included in said distilled matrix, said comparing using said numeric value; including said vector in said distilled matrix if said vector is not included in said distilled matrix; and forming the distilled database using said distilled matrix.
- 44. A distilled database comprising:
a plurality of vectors formed from a plurality of records included in a raw database, each of said plurality of records having a plurality of data fields including at least one non-numeric data field, wherein each of said plurality of vectors includes a numeric value represented in a first number system that was converted from a value of said at least one non-numeric data field, said first number system having a radix at least equal to a number of possible values of each of the plurality of data elements.
- 45. A method for generating a distilled database from a plurality of records, each of the plurality of records including at least one non-numeric data field, the non-numeric data field including a plurality of data elements, the method comprising:
determining a numeric value for the non-numeric data field from one of the plurality of records, said numeric value having a representation in a number system having a radix greater than or equal to a number of possible values of each of the plurality of data elements in the non-numeric data field; forming a vector from said one of the plurality of records, said vector including said numeric value; and comparing said vector with a vector in the distilled database.
- 46. The method of claim 45, wherein said determining a numeric value from the non-numeric data field comprises determining a single numeric value for the non-numeric data field.
- 47. The method of claim 45, wherein said forming a vector from said one of the plurality of records comprises forming a vector from said one of the plurality of records, said vector including said numeric value instead of the non-numeric field.
- 48. A method for generating a database from a plurality of records, each of the plurality of records including at least one non-numeric data field, the non-numeric data field including a plurality of data elements, the plurality of data elements representing informational content, the method comprising:
representing the content represented by the plurality of data elements of the non-numeric data field as a numeric value; forming a vector from said one of the plurality of records, said vector including said numeric value; and including said vector in the database.
- 49. The method of claim 48, wherein said representing the content represented by the plurality of data elements of the non-numeric data field as a numeric value comprises representing the content as a numeric value that maintains semantic significance of the non-numeric data field.
- 50. The method of claim 48, wherein said representing the content represented by the plurality of data elements of the non-numeric data field as a numeric value comprises representing the content as a numeric value included in a number system having a radix greater than or equal to a number of possible values of each of the plurality of data elements.
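
As an informal illustration only, and not part of the claims, the steps recited in claims 1, 2, 14 and 26 might be sketched as follows. The sketch assumes a 27-symbol alphabet (space plus the letters A to Z), a fixed field width of 8, and exact equality (an all-zero difference, as in claim 20) as the comparison. The names `ALPHABET`, `field_to_number`, `record_to_vector`, `is_included` and `distill` are hypothetical and do not come from the source.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed alphabet: space plus 26 letters, so each data element (character)
# has 27 possible values and the radix is at least that large.
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"
RADIX = len(ALPHABET)


def field_to_number(text: str, width: int = 8) -> int:
    """Represent each character as one base-RADIX digit of a single integer."""
    padded = text.upper().ljust(width)[:width]  # pad with spaces (digit 0) or truncate
    value = 0
    for ch in padded:
        # str.index raises ValueError for characters outside the assumed alphabet.
        value = value * RADIX + ALPHABET.index(ch)
    return value


def record_to_vector(record: Dict[str, str], fields: Sequence[str]) -> Tuple[int, ...]:
    """Form a numeric vector representative of one record."""
    return tuple(field_to_number(record[name]) for name in fields)


def is_included(vector: Tuple[int, ...], matrix: List[Tuple[int, ...]]) -> bool:
    """Treat a vector as included when its difference from some row is all zeros."""
    return any(all(a - b == 0 for a, b in zip(vector, row)) for row in matrix)


def distill(records: Sequence[Dict[str, str]], fields: Sequence[str]):
    """Build the distilled matrix and keep, per row, the raw record indices it came from."""
    matrix: List[Tuple[int, ...]] = []
    origins: List[List[int]] = []
    for index, record in enumerate(records):
        vector = record_to_vector(record, fields)
        if is_included(vector, matrix):
            origins[matrix.index(vector)].append(index)
        else:
            matrix.append(vector)
            origins.append([index])
    return matrix, origins


raw = [
    {"name": "Ann", "city": "Oslo"},
    {"name": "ANN", "city": "oslo"},  # same content, different letter case
    {"name": "Bob", "city": "Rome"},
]
matrix, origins = distill(raw, ["name", "city"])
print(len(matrix), origins)  # 2 [[0, 1], [2]]
```

Because the leftmost character becomes the most significant digit, values that begin with the same characters encode to numerically close integers, which is one possible reading of the "semantic significance" recited in claims 40 and 41.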
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation application of application Ser. No. 09/357,301, filed on Jul. 20, 1999, the entire content of which is hereby incorporated by reference.
Continuations (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09357301 | Jul 1999 | US |
| Child | 10198935 | Jul 2002 | US |