Claims
- 1. A data mining system, comprising:
a parallel reading and displaying module for reading and displaying data in different formats, said data containing data items with features; a parallel object identifying module for identifying data items; a parallel feature extracting module for extracting relevant features for each data item; a parallel pattern recognition algorithms module for pattern recognition; and a storage module to store the features for each data item as it is extracted.
- 2. The data mining system of claim 1, including a parallel linking module for linking said parallel object identifying module, said parallel feature extracting module, said parallel pattern recognition algorithms module, and said storage module as necessary.
- 3. The data mining system of claim 1, including a parallel dimension reduction module for dimension reduction which reduces the number of features for said data item.
- 4. The data mining system of claim 1, wherein said storage module is a database.
- 5. The data mining system of claim 1, including a parallel sampling module for sampling said data to reduce the number of said data items.
- 6. The data mining system of claim 1, including a parallel multiresolution analysis module for performing a reversible transformation of said data into a coarser resolution.
- 7. The data mining system of claim 1, including a parallel noise removing module for removing noise from said data.
- 8. The data mining system of claim 1, including a parallel data fusion module for data fusion.
- 9. A parallel object-oriented data mining system, comprising:
a parallel object-oriented reading and displaying module for reading and displaying data in different formats, said data containing data items with features; a parallel object-oriented identifying module for identifying data items; a parallel object-oriented feature extracting module for extracting relevant features for each data item; a parallel object-oriented pattern recognition algorithms module for pattern recognition; and a storage module to store the features for each data item as it is extracted.
- 10. The data mining system of claim 9, including a parallel object-oriented linking module for linking said parallel object-oriented identifying module, said parallel object-oriented feature extracting module, said parallel object-oriented pattern recognition algorithms module, and said storage module as necessary.
- 11. The data mining system of claim 9, including a parallel object-oriented dimension reduction module for dimension reduction which reduces the number of features for said data item.
- 12. The data mining system of claim 9, including a parallel object-oriented sampling module for sampling said data to reduce the number of said data items.
- 13. The data mining system of claim 9, including a parallel object-oriented multiresolution analysis module for performing a reversible transformation of said data into a coarser resolution.
- 14. The data mining system of claim 9, including a parallel object-oriented noise removing module for removing noise from said data.
- 15. The data mining system of claim 9, including a parallel object-oriented data fusion module for data fusion.
- 16. The data mining system of claim 9 wherein said storage module is a database.
- 17. The data mining system of claim 16 including a parallel object-oriented linking module for linking said parallel object-oriented identifying module, said parallel object-oriented feature extracting module, said parallel object-oriented pattern recognition algorithms module, and said database, wherein said linking module utilizes a scripting language.
- 18. A parallel object-oriented data mining system, comprising:
a parallel object-oriented reading and displaying module for reading and displaying data in different formats, said data containing data items with features; a parallel object-oriented sampling module for sampling said data to reduce the number of data items; a parallel object-oriented multiresolution analysis module for performing a reversible transformation of said data into a coarser resolution; a parallel object-oriented noise removing module for removing noise from said data; a parallel object-oriented data fusion module for data fusion; a parallel object-oriented object identifying module for identifying data items; a parallel object-oriented feature extracting module to extract relevant features for each of said data items; a parallel object-oriented dimension reduction module for dimension reduction which reduces the number of features for a data item; a parallel object-oriented pattern recognition algorithms module for pattern recognition; and a database to store the features for each data item as it is extracted, wherein the appropriate modules are linked as necessary using a scripting language.
- 19. A data mining system for science, engineering, business and other applications, comprising:
a parallel object-oriented reading, writing, and displaying module for reading, writing, and displaying engineering, business and other data in different formats, said data containing data items from different sensors at different times under different conditions; a parallel object-oriented sampling module for sampling said data and reducing the number of data items; a parallel object-oriented multiresolution analysis module for multiresolution analysis to perform a reversible transformation of said data into a coarser resolution using multi-resolution techniques; a parallel object-oriented noise removal module for removing noise from said data; a parallel object-oriented data fusion module for data fusion if said data is obtained from different sensors at different times under different conditions at different resolutions; a parallel object-oriented object identifying module for identifying data items in the fused, denoised, sampled, multi-resolution data; a parallel object-oriented feature extracting module for extracting relevant features for each item from said fused, denoised, sampled, multi-resolution data; a parallel object-oriented dimension reduction module for dimension reduction which reduces the number of features for said data item; a parallel object-oriented pattern recognition module using pattern recognition algorithms selected from the group consisting of decision trees, neural networks, k-nearest neighbor, k-means, and evolutionary algorithms; and a database to store the features for each data item as it is extracted, after the number of features has been reduced, and as the data set grows in size, enabling easy access to subsets of data; wherein all the appropriate modules are linked as necessary using a scripting language such as Python to provide a solution for data mining.
- 20. The data mining system of claim 19 wherein said parallel object-oriented multiresolution analysis module for multiresolution analysis to perform a reversible transformation of said data into a coarser resolution uses multi-resolution techniques such as wavelets.
- 21. The data mining system of claim 19 wherein said parallel object-oriented noise removal module for removing noise from said data uses techniques selected from the group consisting of wavelet-based denoising, spatial filters, and techniques based on partial differential equations.
- 22. The data mining system of claim 19 wherein said multi-resolution techniques are wavelets.
- 23. The data mining system of claim 18 wherein said denoising techniques are wavelet-based.
- 24. The data mining system of claim 18 wherein said denoising techniques are spatial filters.
- 25. The data mining system of claim 18 wherein said denoising techniques are techniques based on partial differential equations.
- 26. A method of data mining, comprising the steps of:
reading and displaying data files, said data files containing objects having relevant features; identifying said objects in said data files; extracting relevant features for each of said objects; and recognizing patterns among said objects based upon said features.
- 27. The method of data mining of claim 26 including the step of sampling said data and reducing the number of said data items.
- 28. The method of data mining of claim 26 including the step of conducting multiresolution analysis to perform a reversible transformation of said data into a coarser resolution.
- 29. The method of data mining of claim 26 including the step of removing noise from said data.
- 30. The method of data mining of claim 26 including the step of conducting data fusion of said data.
- 31. The method of data mining of claim 26 including the step of conducting dimension reduction which reduces the number of features for one or more of said data items.
- 32. The method of data mining of claim 26 including the steps of sampling said data and reducing the number of said data items, conducting multiresolution analysis to perform a reversible transformation of said data into a coarser resolution, removing noise from said data, conducting data fusion of said data, and conducting dimension reduction which reduces the number of features for one or more of said data items.
- 33. A method of data mining, comprising the steps of:
reading and displaying data files using a parallel object-oriented reading and displaying module, said data files containing objects having relevant features; identifying said objects in said data files using a parallel object-oriented object identifying module; extracting relevant features for each of said objects using a parallel object-oriented feature extracting module; and recognizing patterns among said objects based on said features using a parallel object-oriented pattern recognizing module.
- 34. A method of data mining, comprising the steps of:
reading, writing, and displaying a number of data files; sampling said data files and reducing the number of said data files; conducting multi-resolution analysis to perform a reversible transformation into a coarser resolution of said data files; removing noise from said data files; implementing data fusion of said data files; identifying objects in said data files; extracting relevant features for each of said objects; normalizing said features of said objects; reducing the dimension or number of said features of said objects; recognizing patterns among said objects using said features; displaying said data files and said objects and capturing feedback from scientists for validation; storing said features for each of said objects, after they have been extracted in said extracting step, reduced in number in said reducing step, used for pattern recognition in said recognizing patterns step, and displayed in said displaying step; and linking said foregoing steps.
- 35. A method of data mining, comprising the steps of:
reading, writing, and displaying scientific, engineering, business and other data in different formats using a parallel object-oriented reading, writing, and displaying module, said data containing data items; sampling said data and reducing the number of said data items using a parallel object-oriented sampling module; conducting multiresolution analysis to perform a reversible transformation of said data into a coarser resolution using a parallel object-oriented multiresolution module; removing noise from said data using a parallel object-oriented removing noise module; conducting data fusion using a parallel object-oriented data fusion module when said data is obtained from different sensors at different times under different conditions at different resolutions; identifying objects or data items in said data and extracting relevant features for each of said data items using a parallel object-oriented identifying objects module; conducting dimension reduction which reduces the number of features for one or more of said data items using a parallel object-oriented conducting dimension reduction module; implementing pattern recognition algorithms using a parallel object-oriented implementing pattern recognition algorithms module; using a database to store said features for each of said data items extracted after the number of said features has been reduced, and as said data items grow in size, enabling easy access to subsets of said data; and linking appropriate foregoing parallel object-oriented modules as necessary using a scripting language.
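By way of illustration only, the following is a minimal sketch of how stages like those recited above might be linked from a Python script. It is not the claimed implementation. Every function name here is hypothetical, the code is serial rather than parallel, a single-level Haar averaging step stands in for the reversible multiresolution transformation, a hard threshold on detail coefficients stands in for wavelet-based noise removal, a 1-nearest-neighbor rule stands in for the pattern recognition algorithms, and a plain dictionary stands in for the database. Data fusion and dimension reduction are omitted for brevity.

```python
import random
from statistics import mean


def sample(items, fraction, seed=0):
    """Sampling stage: keep a random subset to reduce the number of data items."""
    rng = random.Random(seed)
    k = max(1, int(len(items) * fraction))
    return rng.sample(items, k)


def haar_forward(signal):
    """Multiresolution stage: one level of an unnormalized Haar transform.

    Assumes an even-length signal. Returns (coarse, detail); keeping both
    makes the transformation reversible.
    """
    pairs = list(zip(signal[0::2], signal[1::2]))
    coarse = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return coarse, detail


def haar_inverse(coarse, detail):
    """Rebuild the finer-resolution signal from coarse and detail parts."""
    out = []
    for c, d in zip(coarse, detail):
        out.extend([c + d, c - d])
    return out


def denoise(signal, threshold):
    """Noise removal stage: zero out small detail coefficients, then invert."""
    coarse, detail = haar_forward(signal)
    kept = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_inverse(coarse, kept)


def extract_features(signal):
    """Feature extraction stage: two illustrative features per data item."""
    return (mean(signal), max(signal) - min(signal))


def nearest_neighbor(features, labeled):
    """Pattern recognition stage: label of the closest labeled feature vector."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(labeled, key=lambda pair: dist2(features, pair[0]))[1]


def run_pipeline(items, labeled, store, threshold=0.1):
    """Linking script: chain the stages and store features as they are extracted."""
    for name, signal in items:
        clean = denoise(signal, threshold)
        feats = extract_features(clean)
        store[name] = feats  # dictionary standing in for the database
        yield name, nearest_neighbor(feats, labeled)


if __name__ == "__main__":
    items = [("a", [0.0, 0.05, 0.0, 0.0]), ("b", [10.0, 10.0, 10.0, 10.05])]
    labeled = [((0.0, 0.0), "low"), ((10.0, 0.0), "high")]
    store = {}
    for name, label in run_pipeline(items, labeled, store):
        print(name, label, store[name])
```

Because each stage in this sketch is an ordinary function, the script can drop, reorder, or swap stages, which is one way to read "linked as necessary." A parallel version would have to distribute the data items across processors, which the sketch does not attempt.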
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] Related subject matter is disclosed and claimed in the following commonly owned, copending U.S. Patent Applications: “PARALLEL OBJECT-ORIENTED, DENOISING SYSTEM USING WAVELET MULTIRESOLUTION ANALYSIS,” by Chandrika Kamath, Chuck H. Baldwin, Imola K. Fodor, and Nu Ai Tang, patent application number 09/xxxxxx, filed xxxxxxx, 2001, and “PARALLEL OBJECT-ORIENTED DECISION TREE SYSTEM,” by Chandrika Kamath and Erick Cantu-Paz, patent application number 09/xxxxxx, filed xxxxxxx, 2001, which are hereby incorporated by reference in their entirety.
Government Interests
[0002] The United States Government has rights in this invention pursuant to Contract No. W-7405-ENG-48 between the United States Department of Energy and the University of California for the operation of Lawrence Livermore National Laboratory.