The present invention generally relates to systems and methods for data retrieval, predictive analysis, and interactive visualization for discovering candidate compounds having certain attributes. More specifically, the present invention relates to an artificial intelligence assisted computer-implemented system and method for evaluating candidate molecules for use in candidate compound discovery, interacting with the compound in a virtual or augmented reality environment, and predicting clinical success.
Compound development and discovery involve a massive amount of testing and analysis. Generally, a plurality of samples must be tested under a variety of conditions. A new compound must undergo numerous tests and trials before it is approved for use for its intended purpose, whether the aim is the discovery of new or unique drugs, nutraceutical compounds, chemicals, plastics, metals, alloys, and the like. In healthcare, primary research directed toward developing a new understanding of natural substances or physiological processes that produce desirable effects involves massive research and development costs. Estimates of the research and development cost of a new drug range widely, from $43.4 million to $4.2 billion.
With technological advancements and a better understanding of biological systems, it is easier to predict new features of chemical, biological or metallic entities with the help of various application software, hardware, and supercomputers. This is known as testing in silico. In silico testing attempts to develop new compounds from those currently made by identifying similar components and applying behaviors known from previous compounds to new discovery. A component and its parts may be identified, recombined with other components, and tested in silico for desired effects.
Today, various approaches are employed to identify biological, chemical, or physical attributes of compounds discovered and developed by applying large-scale computing technology. However, these approaches do not produce industry-agnostic results efficiently or effectively, are onerous for the user, and are inaccurate as to certain features of a chemical molecule within a particular time, particularly because they do not efficiently utilize the available information. Indeed, one of the most extensive challenges of testing candidate compounds and molecules is comparing information from disparate sources in different formats. In addition, using current systems, it is not possible to present consistent, accessible data across a sizeable collection of data sources that comply with a typical industry data model having normalized or standardized parameters in data elements. A further drawback of current systems is that they do not capture the consistent relationships between sub-atomic particles and the candidate molecules and compounds to ensure accuracy of predictive models.
Further, many metals (e.g., sodium, potassium, magnesium, calcium, iron, zinc, copper, manganese, chromium, molybdenum and selenium) are required for normal biological functions in humans. Disorders of metal homeostasis and of metal bioavailability, or toxicity caused by metal excess, are responsible for a large number of human diseases. Metals are also extensively used in medicine as therapeutic and/or diagnostic agents. Metals such as arsenic, gold and iron have been used to treat a variety of human diseases. Nowadays, an ever-increasing number of metal-based drugs are available. These drugs contain a broad spectrum of metals that are able to target proteins and/or DNA, many of which are not among those essential for humans. As an example, metal-containing compounds targeting DNA or proteins are currently in use, or designed to be used, as therapeutics against cancer, arthritis, parasitic and other diseases, and the available information about their mechanism of action at a molecular level is often provided by X-ray studies. However, currently, when presented with these situations in the industry, it is difficult and unduly expensive to identify and test metals for therapeutics.
In light of the above-mentioned problems, there is a need for a system and method that allows a user to evaluate candidate molecules for use in candidate compound discovery, interact with the compound in a virtual or augmented reality environment, and predict clinical success in a user-friendly way.
The present invention generally discloses a system and method for gathering, assembling, sorting, and retrieving data. Further, the present invention relates to a computing system for evaluating candidate molecules for use in candidate compound discovery, applying a multitude of artificial intelligence (AI) models, and generating interactive virtual environments at atomic resolution that allow a user to test candidate molecules with candidate compounds. The AI implemented system is configured to gather and receive information relating to, for example, physical, chemical, electrical, genetic, atomic, sub-atomic and biological dimensions of molecules, metals, and compounds using disparate types of information, such as all literature in both structured and unstructured formats, and is configured to identify the similarities and dissimilarities in their composition as well as in their atomic and sub-atomic behavior.
In embodiments, the system and method use a plurality of artificial intelligence (AI) models and virtual reality (VR) or augmented reality (AR) for compound discovery. AR offers several advantages over traditional visualization tools (2D computer projections, 3D computer projections, and 3D printed plastic models), including but not limited to, allowing researchers to interact with the molecules in ways that simulate their natural environment. For example, using the system and method, a researcher (or user) may simulate an enzyme in the presence of its ligand and view the structure, allowing the researcher/user to examine the steric changes that occur within the molecule upon ligand binding. Additionally, the system and method may use VR/AR to understand how a known drug may interact with the molecule and output potential clinical attributes of the candidate compound. Furthermore, researchers (or users) may effectively view and interact with molecules on an atomic and sub-atomic scale. In sum, VR/AR coupled with the system described herein allows researchers to rationally design new drug candidates and test these in silico for binding and fit before investing in drug development and manufacturing. In effect, the system can automatically test many thousands of molecules in silico, then allow a user to interact with a plurality of suggested molecules and candidate compounds in VR/AR space to understand critical interactions, eventually leading into in vivo testing.
In embodiments, a discovery system or computer-implemented system that may be integrated with or in communication with a plurality of embedded AI models (e.g., neural networks, RNN, CNN, NLP) in a networked or virtual networked environment is presented. In one embodiment, the system is configured to retrieve standardized data of compounds and molecular information of compounds down to the atomic and sub-atomic level, compare the known physical, chemical, electrical, genetic, atomic, sub-atomic and biological characteristics of molecules, and identify potential new combinations of molecules based on similarities in their composition and chemical and atomic characteristics, based in part on known actions of similar compounds and elements. In one embodiment, the software application outputs results based on a plurality of AI-aided user filters. The comparison is made using the system in order to reduce the time researchers spend performing research activities. In one embodiment, the system is installed with application software, mobile application, or web-based application.
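By way of non-limiting illustration, the similarity comparison described above may be sketched as follows. The specification does not name a particular similarity measure; the Tanimoto (Jaccard) coefficient over sets of feature labels is used here purely as an assumed stand-in, and the function names, feature labels, and compound names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity: shared features divided by all features."""
    union = a | b
    if not union:
        return 0.0  # two empty descriptions carry no evidence of similarity
    return len(a & b) / len(union)


def rank_similar(
    query: FrozenSet[str],
    library: Dict[str, FrozenSet[str]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k library entries most similar to the query."""
    scored = [(name, tanimoto(query, features)) for name, features in library.items()]
    # Highest similarity first; ties are broken alphabetically by name.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Hypothetical feature sets standing in for standardized compound data.
LIBRARY = {
    "compound_a": frozenset({"hydroxyl", "aromatic_ring", "carboxyl"}),
    "compound_b": frozenset({"amine", "aromatic_ring"}),
    "compound_c": frozenset({"thiol"}),
}
QUERY = frozenset({"hydroxyl", "aromatic_ring"})
TOP_MATCHES = rank_similar(QUERY, LIBRARY, top_k=2)
```

In this sketch, a set-based measure is chosen only because it is simple to inspect; an embodiment could equally rely on learned embeddings produced by the AI models.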
In one embodiment, the system after analysis is configured to use a combination of AI models to predict at least one or a plurality of optimized molecules which could be added to a compound to achieve a desired clinical attribute and view the comparative analysis in drug discovery in a three-dimensional interactive environment using VR or AR. The user can visualize the compound's molecular structure and make changes based on hand gestures or other types of user inputs (e.g., brain waves). In one embodiment, the system comprises a computing device having a processor and a memory in communication with the processor. The memory stores a set of instructions or software modules executable by the processor. In one embodiment, the software modules may be application software, mobile application, or web-based application residing on a server or as part of a virtual network. In one embodiment, the system further comprises a database management system. In one embodiment, the database management system comprises one or more databases in communication with the computing device configured to store a plurality of standardized data of various compositions, compounds, metals, molecules, atoms and sub-atomic particles. In one embodiment, the data management system includes one or more modules configured to analyze, predict, and recommend specific molecules or elements for various desired compounds to give the effect required by the user (e.g., certain attributes or clinical attributes). The efficacy prediction may be based on similar compositions sharing typical biological receptors, enzymatic pathways, chemical structures, and the like.
In one embodiment, the system generates a visual representation configured to compare the known physical, chemical, genetic, stereoisomerism, empirical formula, Valence Shell Electron Pair Repulsion (VSEPR), and biological dimensions of molecules, and in a second layer, a visual representation of characteristics of metals that may be added. It is configured to identify the similarities and behavior in the respective molecular compositions, thereby providing a comparative visualization of analyzed results, having identified similar composition and similarities in their behavior concerning the existing composition.
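By way of non-limiting illustration, one of the dimensions listed above, the VSEPR geometry, may be derived from a simple lookup before it is rendered. The following sketch assumes the geometry is keyed by the number of electron domains around a central atom and the number of lone pairs; the table reflects the standard VSEPR assignments, while the function name and interface are hypothetical.

```python
from typing import Dict, Tuple

# (electron domains around the central atom, lone pairs) -> molecular geometry
VSEPR_GEOMETRY: Dict[Tuple[int, int], str] = {
    (2, 0): "linear",
    (3, 0): "trigonal planar",
    (3, 1): "bent",
    (4, 0): "tetrahedral",
    (4, 1): "trigonal pyramidal",
    (4, 2): "bent",
    (5, 0): "trigonal bipyramidal",
    (5, 1): "seesaw",
    (5, 2): "T-shaped",
    (5, 3): "linear",
    (6, 0): "octahedral",
    (6, 1): "square pyramidal",
    (6, 2): "square planar",
}


def vsepr_geometry(electron_domains: int, lone_pairs: int) -> str:
    """Look up the molecular geometry predicted by VSEPR theory."""
    try:
        return VSEPR_GEOMETRY[(electron_domains, lone_pairs)]
    except KeyError:
        raise ValueError(
            f"no VSEPR entry for {electron_domains} domains, {lone_pairs} lone pairs"
        ) from None
```

For example, water (four electron domains, two lone pairs) maps to a bent geometry, and methane (four domains, no lone pairs) maps to a tetrahedral geometry.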
The present system's visualization layer (interface) will present data in a graphical format where hundreds of parameters can be viewed simultaneously. A detailed drill-down inspection can be performed at any parameter for candidates being investigated through the use of VR and/or AR overlays, which allow researchers to quickly view actual data (not just a graphical representation) and save the findings of interest in a working file for later review. The system's ability to quickly investigate all the parameters required for review in a highly accelerated fashion is novel to the industry and, as a result of this efficiency, use of the program is anticipated to accelerate traditional research time by two orders of magnitude (up to 100× faster than traditional efforts).
In one embodiment, the comparison is performed on any level of scale, in vivo, in vitro, and in silico simultaneously. The results of the comparison are self-checked for accuracy using available data repositories. The comparison is made using the system in order to reduce the time researchers spend performing research activities. In one embodiment, the system is installed with application software, mobile application, and/or web-based application.
In one embodiment, the software with artificial intelligence will provide those same characteristics to the pharmaceutical explorer, suggesting paths of chemical and/or structural modifications, while also suggesting targets, which will significantly improve the efficiency of drug discovery and development. In one embodiment, the system utilizes the fast-processing capabilities of graphical processing units (GPUs) and supercomputers together with algorithms for similarity detection. In one embodiment, the algorithms are trained using machine learning to identify high potential similarity matches accurately as well as dissimilarities, and the results are displayed to the users using 3D, augmented reality visualizations.
In one embodiment, the software and artificial intelligence are used to predict features of new chemical or biological entities. The artificial intelligence coupled with predictive analytics evolves from predicting solubility, toxicity, and antigenicity, to services incorporating prediction of efficacy based on similar compounds and entities sharing common biological receptors, enzymatic pathways, or chemical structures. It provides value to the very crucial “go/no go” decision making and makes actual engineering modification of the molecules, which could improve both efficacy and limit or reduce toxicity, including antigenicity in the case of protein-based drugs such as antibodies.
In one embodiment, one or more modules include a data platform system (DPS), an artificial intelligence system (AIS), and a molecule and compound editor system (MCEM). In one embodiment, the DPS is integrated with a user interface (UI) configured to store at least one of the standardized and normalized data related to compounds (drugs and chemicals), which provides administration and functions for real-time monitoring of users and their data environments. In one embodiment, the DPS supports data in the form of texts, graphical data, functions, equations, formulas, 2D or 3D figures, and models corresponding to atomic, chemical, physics, biological, mathematical representations, and abstracts for storing and accessing data.
In one embodiment, the artificial intelligence module (AIM) is configured to provide a multitude of AI modules having a plurality of AI methodologies and techniques to compare the candidate compound with at least one of the standardized and normalized data related to similar compounds through at least one of the AI modules of the AIS. The AIM outputs the results of the AI modules based on a plurality of inputs. In one embodiment, each AI module includes natural language processing (NLP), neural networks (NN), machine learning (ML), a recurrent neural network (RNN), and a convolutional neural network (CNN).
In one embodiment, the molecule and compound editor system (MCEM) is configured to analyze the predicted results upon user direction by utilizing analysis software of known compounds and/or metals. In one embodiment, the AIM is configured to return potential hit molecules that may be candidate molecules and/or metals for a requested candidate compound upon a request function for analysis in the MCEM. The analyzed results from the AIS and MCEM are stored in the DPS, and their process logistics are recorded in the FTDS for further processing. In one embodiment, the comparative visualization displays results in a graphical format to users using three-dimensional and/or augmented reality (AR) visualizations simultaneously.
In one embodiment, the software with artificial intelligence (AI) provides those same characteristics to the compound (e.g., pharmaceutical or chemical) or metal explorer, suggesting paths of chemical and structural modifications and suggesting targets, which will significantly improve the efficiency of compound manufacturing and/or drug discovery and development. In one embodiment, the system utilizes the fast-processing capabilities of hardware such as dedicated AI-configured GPUs, supercomputers, or quantum computing together with algorithms for similarity detection. In one embodiment, the algorithms are trained using machine learning to accurately identify high potential similarity matches, and the results are displayed to users using virtual and augmented reality visualizations.
In one embodiment, the software and artificial intelligence are used to predict features of new chemical or biological entities, in some cases having metals incorporated therein. Artificial intelligence coupled with predictive analytics has evolved from predicting solubility, toxicity, and antigenicity, to services incorporating prediction of efficacy based on similar compounds and entities sharing common biological or organic receptors, enzymatic pathways, or chemical structures and metallic properties. It provides value to the very crucial “go/no go” decision making as well as to the actual engineering modification of the molecules, which could improve efficacy and limit or reduce toxicity, including antigenicity in the case of protein-based drugs such as antibodies or, in the case of chemicals, improve efficacy and safety.
An exemplary output of the software is a modified receiver operating characteristic (ROC) curve; the ordinate shall be a value generated from a regression combining available evidence for similar metals and would include a rating of how similar the innovator metal is to the existing metal(s). The abscissa would indicate the predictor of success, from 0 to 100. An additional characteristic would consider the number of tests and the data available for each comparison. Metals with only chemical resistance data would carry a higher risk, flagged as a caution alongside the predictor, compared, for example, with metals having chemical resistance, conductivity, density, and past use case data.
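By way of non-limiting illustration, the caution attached to the predictor may be sketched as a function of how many kinds of evidence back a given comparison. The four evidence labels follow the example above; the dataclass, the coverage thresholds, and the three caution levels are assumptions introduced only for this sketch and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Evidence kinds named in the example above.
ALL_EVIDENCE = frozenset(
    {"chemical_resistance", "conductivity", "density", "past_use_cases"}
)


@dataclass(frozen=True)
class MetalComparison:
    predictor: float          # predictor of success, intended range 0 to 100
    evidence: FrozenSet[str]  # kinds of data available for this comparison


def clamp_predictor(raw: float) -> float:
    """Keep the predictor of success inside the 0 to 100 range."""
    return min(100.0, max(0.0, raw))


def risk_caution(comparison: MetalComparison) -> str:
    """Return a caution level; thinner evidence means higher risk (assumed thresholds)."""
    coverage = len(comparison.evidence & ALL_EVIDENCE) / len(ALL_EVIDENCE)
    if coverage >= 1.0:
        return "low"
    if coverage > 0.25:
        return "moderate"
    return "high"


sparse = MetalComparison(predictor=72.0, evidence=frozenset({"chemical_resistance"}))
full = MetalComparison(predictor=72.0, evidence=ALL_EVIDENCE)
```

Here two comparisons with the same predictor value receive different cautions, which mirrors the statement that a metal supported only by chemical resistance data carries a higher risk.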
A computing system for evaluating candidate molecules for use in candidate compound discovery, the computer system comprising a non-transitory computer-readable memory, and a processor configured to execute instructions stored on the non-transitory computer-readable memory which, when executed, cause the processor to receive a type of standardized data from at least a database, wherein the standardized data is for compounds, metals, molecules, atoms, sub-atomic particles or any combination thereof, and wherein the standardized data types comprise numerical data sets, images, graphs, text, or any combination thereof, receive a user query, wherein the user query comprises a request for a desired attribute for the candidate compound, process each of the standardized data types with one or more trained machine learning models, wherein the one or more machine learning models are configured to recommend at least one of a plurality of the candidate molecules to be bonded with the candidate compound based on the user query, generate an interactive environment comprising a graphical representation of the candidate molecule and candidate compound based on the user query, and, based on a received signal from a user, alter a configuration of the candidate compound to allow the user, using virtual reality (VR), augmented reality (AR), or both, to interact with the candidate compound and the plurality of candidate molecules, wherein based on a received additional signal from the user, the computer system can further alter the candidate compound with another of the plurality of molecules, and generate clinical characteristics of the candidate compound based on the further alteration by the user.
In embodiments, a computer-implemented method for candidate compound discovery, comprising executing on a processor the steps of receiving a type of standardized data from at least a database, wherein the standardized data is for compounds, metals, molecules, atoms, sub-atomic particles or any combination thereof, and wherein the standardized data types comprise numerical data sets, images, graphs, text, or any combination thereof, receiving a user query, wherein the user query comprises a request for a desired attribute for the candidate compound, processing each of the standardized data types with one or more trained machine learning models, wherein the one or more machine learning models are configured to recommend at least one of a plurality of the candidate molecules to be bonded with the candidate compound based on the user query, generating an interactive environment comprising a graphical representation of the candidate molecule and candidate compound based on the user query, and, based on a received signal from a user, altering a configuration of the candidate compound to allow the user, using virtual reality (VR), augmented reality (AR), or both, to interact with the candidate compound and the plurality of candidate molecules, wherein based on a received additional signal from the user, the computer system can further alter the candidate compound with another of the plurality of molecules, and generate clinical characteristics of the candidate compound based on the further alteration by the user.
In embodiments, a non-transitory computer-readable medium having stored thereon program instructions that, upon execution by a computing device, cause the computing device to perform operations comprising receiving a type of standardized data from at least a database, wherein the standardized data is for compounds, metals, molecules, atoms, sub-atomic particles or any combination thereof, and wherein the standardized data types comprise numerical data sets, images, graphs, text, or any combination thereof, receiving a user query, wherein the user query comprises a request for a desired attribute for the candidate compound, processing each of the standardized data types with one or more trained machine learning models, wherein the one or more machine learning models are configured to recommend at least one of a plurality of the candidate molecules to be bonded with the candidate compound based on the user query, generating an interactive environment comprising a graphical representation of the candidate molecule and candidate compound based on the user query, and, based on a received signal from a user, altering a configuration of the candidate compound to allow the user, using virtual reality (VR), augmented reality (AR), or both, to interact with the candidate compound and the plurality of candidate molecules, wherein based on a received additional signal from the user, the computer system may further alter the candidate compound with another of the plurality of molecules and generate clinical characteristics of the candidate compound based on the further alteration by the user.
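By way of non-limiting illustration, the sequence of steps recited above (receive a query, recommend candidate molecules, alter the candidate compound on a user signal, and alter it again on an additional signal) may be sketched as follows. The trained machine learning models are replaced here by a hypothetical scoring callable backed by a toy table; every name, score, and molecule label is an assumption made for the sketch, and the VR/AR rendering and clinical characteristic generation are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class CandidateCompound:
    name: str
    molecules: Tuple[str, ...] = ()


# (candidate molecule, desired attribute) -> score; stands in for a trained model.
Scorer = Callable[[str, str], float]


def recommend(
    candidates: List[str], desired_attribute: str, scorer: Scorer, top_k: int = 2
) -> List[str]:
    """Rank candidate molecules for the queried attribute and keep the best top_k."""
    ranked = sorted(
        candidates, key=lambda m: scorer(m, desired_attribute), reverse=True
    )
    return ranked[:top_k]


def alter(compound: CandidateCompound, molecule: str) -> CandidateCompound:
    """Apply one user signal: return a new compound with the molecule added."""
    return CandidateCompound(compound.name, compound.molecules + (molecule,))


# Hypothetical scores used only to make the sketch executable.
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("mol_x", "affinity"): 0.9,
    ("mol_y", "affinity"): 0.4,
    ("mol_z", "affinity"): 0.7,
}


def toy_scorer(molecule: str, attribute: str) -> float:
    return TOY_SCORES.get((molecule, attribute), 0.0)


picks = recommend(["mol_x", "mol_y", "mol_z"], "affinity", toy_scorer)
compound = alter(CandidateCompound("compound_1"), picks[0])  # first user signal
compound = alter(compound, picks[1])                         # additional user signal
```

The compound is modeled as an immutable value so that each user signal yields a new configuration, which would allow an interactive session to step backward through earlier configurations.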
Advantageously, the systems and methods transform different types of standardized data into data usable to make predictions based on various comparisons. Standardized data comes in the form of numerical data, images, graphs, and literature, and by converting these data sets using various AI models, the efficacy of predictive molecules for use is significantly increased.
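By way of non-limiting illustration, routing each standardized data type to its own converter may be sketched with a small registry. The encoders below are deliberately trivial stand-ins for the AI models (for example, an NLP model for literature); the registry, the decorator, and the feature definitions are hypothetical.

```python
from typing import Any, Callable, Dict, List

Encoder = Callable[[Any], List[float]]
ENCODERS: Dict[str, Encoder] = {}


def register(data_type: str) -> Callable[[Encoder], Encoder]:
    """Decorator that records which encoder handles a given data type."""
    def wrap(fn: Encoder) -> Encoder:
        ENCODERS[data_type] = fn
        return fn
    return wrap


@register("numerical")
def encode_numerical(values: List[float]) -> List[float]:
    # Numerical data sets pass through as floats.
    return [float(v) for v in values]


@register("text")
def encode_text(text: str) -> List[float]:
    # Stand-in for an NLP model: word count and distinct (case-folded) word count.
    words = text.split()
    return [float(len(words)), float(len({w.lower() for w in words}))]


def to_features(data_type: str, payload: Any) -> List[float]:
    """Convert one item of standardized data into a numeric feature vector."""
    if data_type not in ENCODERS:
        raise ValueError(f"no encoder registered for data type {data_type!r}")
    return ENCODERS[data_type](payload)
```

Encoders for images and graphs would be registered in the same way; they are left out because any realistic version depends on a specific model.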
Moreover, the time and financial constraints in compound development are significantly decreased. The system is configured to utilize empirical data on features such as how certain molecules may impact the body, for example, one's body temperature. The system is configured to predict effective metals to use in therapeutics in a second layer by parsing characteristics such as, but not limited to, crystalline structure and chemical and physical properties, drawn from predictive software, empirical data, and published data, to develop a multidimensional modeling system that improves the accuracy of the model and, in turn, increases success in designing new metals and suggesting new applications for those metals.
Advantageously, the system removes any subjective input in favor of more objective appraisals when applied to the selection process and “go/no go” decisions in candidate compound discovery.
Advantageously, the platform provides the ability for quick, practical working sessions. The system will have the ability to provide cognitive comparison tools by optimizing interfaces so that a researcher can quickly analyze the data. The system will have an immersive set of tools that work in conjunction with each other for comparison, applying and alternating filters for viewing data, saving/recalling work sessions notes and findings, and executing detailed AI analysis across known and new molecules that are being developed.
The interface may be intuitive and may utilize gestures (whether through a mouse, hand controllers, key presses, or other means), which will become second nature to the researcher. The ability to pivot quickly is paramount to the design effort for the system. The need to identify source data, collect information, and prepare it before analyzing it is eliminated. The data will be available for the researcher to use, and he or she will be able to concentrate on the research at hand.
Advantageously, the system is configured to ingest and transform data from disparate sources and makes a continual effort to collect, assemble and parameterize the data in a consistent and consumable manner. The resultant data repository is traceable to source through logging of source and transformation techniques, so that researchers can rely on it across a large body of information.
Advantageously, using scalable architectures in elastic cloud data centers together with high-performance graphical processing units (GPUs), the system distributes computing workload according to demand across multiple computing processors. Due to the enormity of data, the system provides data processing at various levels such as real-time highly performant (elastic), real-time fixed capacity, and off-peak lower-cost processing, such as overnight or with lower-performing/low-cost CPU allocation.
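By way of non-limiting illustration, the traceability to source described above may be sketched by carrying the source name and an ordered log of transformations with each value. The record layout, the function names, and the source label are hypothetical; the example conversion is the ordinary Celsius to Kelvin offset.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class TracedValue:
    value: float
    unit: str
    source: str                            # where the raw value was obtained
    transformations: Tuple[str, ...] = ()  # ordered log of applied steps


def apply_transform(
    record: TracedValue, name: str, fn: Callable[[float], float], new_unit: str
) -> TracedValue:
    """Apply a transformation and append its name to the provenance log."""
    return TracedValue(
        value=fn(record.value),
        unit=new_unit,
        source=record.source,
        transformations=record.transformations + (name,),
    )


raw = TracedValue(value=0.0, unit="degC", source="hypothetical_source_a")
kelvin = apply_transform(raw, "celsius_to_kelvin", lambda c: c + 273.15, "K")
```

Because the original record is never mutated, both the raw and the transformed value remain available, and the log on the transformed value shows how it was derived.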
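By way of non-limiting illustration, assigning a job to one of the three processing levels named above may be sketched as follows. The decision rule, the parameter names, and the use of GPU-seconds as the capacity measure are assumptions made for the sketch; the specification does not prescribe how jobs are assigned.

```python
from enum import Enum


class Tier(Enum):
    REALTIME_ELASTIC = "real-time highly performant (elastic)"
    REALTIME_FIXED = "real-time fixed capacity"
    OFF_PEAK = "off-peak lower-cost"


def choose_tier(
    interactive: bool, gpu_seconds: float, fixed_capacity_free: float
) -> Tier:
    """Pick a processing level for a job (assumed rule, for illustration only)."""
    if not interactive:
        # Nobody is waiting on the result, so defer to cheaper off-peak capacity.
        return Tier.OFF_PEAK
    if gpu_seconds <= fixed_capacity_free:
        # The job fits in capacity that is already provisioned.
        return Tier.REALTIME_FIXED
    # Interactive and too large for the fixed pool: scale out elastically.
    return Tier.REALTIME_ELASTIC
```

Under this rule, a bulk screening run submitted without a waiting user would be deferred, while an interactive VR/AR session would be served from fixed capacity when available and from elastic capacity otherwise.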
Advantageously, the system is able to provide clear views of compounds, molecules and sub-atomic particles in an interactive environment at a biological level of 10⁻⁶.
Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
The foregoing summary, as well as the following detailed description of the invention, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, exemplary constructions of the invention are shown in the drawings. However, the invention is not limited to the specific methods and structures disclosed herein. The description of a method step or a structure referenced by a numeral in a drawing is applicable to the description of that method step or structure shown by that same numeral in any subsequent drawing herein.
The present invention is best understood by reference to the detailed figures and description set forth herein.
It is understood that the present subject matter may be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein. Rather, these embodiments are provided so that this subject matter will be thorough and complete and will convey the disclosure to those skilled in the art. Indeed, the subject matter is intended to cover alternatives, modifications, and equivalents of these embodiments, which are included within the scope and spirit of the subject matter as defined by the appended claims and their equivalents. Furthermore, in the detailed description of the present subject matter, numerous specific details are set forth in order to provide a thorough understanding of the present subject matter. However, it will be clear to those of ordinary skill in the art that the present subject matter may be practiced without such specific details.
Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable instruction execution apparatus, create a mechanism for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The non-transitory computer-readable media includes all types of computer-readable media, including magnetic storage media, optical storage media, and solid-state storage media and specifically excludes signals. It should be understood that the software can be installed in and sold with the device. Alternatively, the software can be obtained and loaded into the device, including obtaining the software via a disc medium or from any manner of network or distribution system, including, for example, from a server owned by the software creator or from a server not owned but used by the software creator. The software can be stored on a server for distribution over the Internet, for example.
Computer-readable storage media (medium) exclude (excludes) propagated signals per se, can be accessed by a computer and/or processor(s), and include volatile and non-volatile internal and/or external media that is removable and/or non-removable. For the computer, the various types of storage media accommodate the storage of data in any suitable digital format. It should be appreciated by those skilled in the art that other types of computer readable medium can be employed such as zip drives, solid state drives, magnetic tape, flash memory cards, flash drives, cartridges, and the like, for storing computer executable instructions for performing the novel methods (acts) of the disclosed architecture.
The computing system includes data stores, which maintain a database according to known database management systems (DBMS). The data stores may include a hard disk drive, a magnetic disk drive, an optical disk drive, or another type of computer readable media which can store data accessible by the processor, such as magnetic cassettes, flash memory cards, digital versatile disks, cartridges, random access memory (RAM) and read only memory (ROM). The data stores may be connected to the computing system bus by a drive interface and the data stores provide nonvolatile storage of computer readable instructions, data structures, program modules and other data for the computing system.
The terminology used herein is for the purpose of describing particular aspects only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The description of the present disclosure has been presented for purposes of illustration and description but is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. The aspects of the disclosure herein were chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure with various modifications as are suited to the particular use contemplated.
For purposes of this document, each process associated with the disclosed technology may be performed continuously and by one or more computing devices. Each step in a process may be performed by the same or different computing devices as those used in other steps, and each step need not necessarily be performed by a single computing device.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
As used herein, “modules” refers to computer program logic implemented in hardware, firmware, or software. In some examples, modules can be stored on a storage device, loaded into the memory and executed by a processor, or be part of a virtual network. For example, modules can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on a tangible computer storage medium for execution by one or more processors.
As used herein, “standardized data” refers to publicly trusted and available data relating to molecules, compounds, atomic elements, subatomic particles, and the like. The data may include, but is not limited to, information relating to bond angles, electron configurations, melting points, toxicity, efficacy, and physical, chemical, genetic, anatomical, and biological dimensions, and the like.
As used herein, “images” refers to non-numerical data such as molecular images.
As used herein, “desired clinical attribute” or “desired attribute” refers to a request by a user to the system for a characteristic of a molecule or compound (e.g., affinity) or a molecule or compound that may result in treating a disease or condition.
As used herein, “candidate compound” refers to a compound that is being built by the user to treat a disease or condition.
As used herein, “candidate molecule” refers to molecules that may bond to the compound to provide a desired clinical attribute and may also refer to atomic elements that could be added to a candidate compound.
Referring to
In one embodiment, system 100 further comprises a database management module 102 in communication with one or more databases (104, 106, 108 and 116). The databases (104, 106, 108 and 116) are configured to store standardized data and information on various compounds, metals, molecules, atoms, and subatomic particles (e.g., all known compositions). The database management module 102 is configured, in exemplary embodiments, to scrape predetermined information hubs that are peer-reviewed and trusted sources for the information, including, but not limited to, chemical characterizations, physical attributes, mathematical calculations, formulas, and values. In embodiments, the system is configured to leverage tabulated data, written/text literature, graphical representations, equations, figures, and parameters. In embodiments, the information includes, but is not limited to, details of physical, chemical, genetic, atomical, and biological dimensions of molecules, isolated from each other in an organizable manner and incorporating biological receptors, enzymatic pathways, or chemical structures. This data is received at the databases for use and transformation by the artificial intelligence module (AIM) 112.
In one embodiment, the AIM 112 is in communication with the data platform module 110, and is configured to provide a plurality of AI models, methodologies, and techniques to transform the standardized data into data sets usable by the system for the purpose of molecule recommendation. Further, AIM 112 is configured to recommend at least one of a plurality of candidate molecules to the user based on a user query and desired attribute. In this way, the AIM 112 works in parallel not only to transform the data in parts, but also to make predictions and recommendations to a user.
Referring still to
Referring now to
The data is then routed to a data parser 210, which is configured to tag 210a, index 210b, and sort 210c the standardized data sets in preparation for processing by the AIS 112. The processed data results are displayed in visualization module 216, which includes the interactive environment 230, graphical representations 232, and drop-down menus 234 for the users to interact with.
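By way of non-limiting illustration only, the tagging, indexing, and sorting performed by a data parser may be sketched as follows. The record fields, the tagging rule, the function names, and the placeholder payload values are hypothetical and are not taken from the disclosed system.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One standardized data record (hypothetical shape)."""
    name: str
    kind: str              # e.g. "numerical", "image", "text"
    payload: object        # placeholder content, not real measurements
    tags: set = field(default_factory=set)


def tag(record):
    # Tagging step: attach the data kind as a tag (illustrative rule only).
    record.tags.add(record.kind)
    return record


def build_index(records):
    # Indexing step: map each tag to the names of the records carrying it.
    index = {}
    for rec in records:
        for t in sorted(rec.tags):
            index.setdefault(t, []).append(rec.name)
    return index


def parse(records):
    # Tag every record, sort by (kind, name), then index the sorted records.
    tagged = [tag(r) for r in records]
    ordered = sorted(tagged, key=lambda r: (r.kind, r.name))
    return ordered, build_index(ordered)


records = [
    Record("mol_b_melting_point", "numerical", 1.0),
    Record("mol_a_structure", "image", b"placeholder"),
    Record("mol_a_abstract", "text", "placeholder text"),
    Record("mol_a_melting_point", "numerical", 2.0),
]
ordered, index = parse(records)
```

In this sketch the index allows all records of a given type to be retrieved by tag without scanning the full collection.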
Referring now to
VR module 314 is configured to execute based on the user query and AR module 312 is configured to execute based on the user query. The environment generator 320 is configured to generate graphical representations of compounds and molecules in atomic resolution while allowing alteration module 324 to receive signals from the user to move the molecule or compound around the UI. In this way the user 306 can interact with the molecule and the compounds to view bond angles or sub-atomic particles to ascertain which molecule may be employed based on the desired clinical attribute, which are sorted at UI layer 326 for viewing in either the client display device 304 or the AR/VR apparatus 350.
The AIM 112 and the VAI module 316 are configured to receive inputs from the MCEM 114 and generate an AI-assisted first user interface that, in operation and subsequent to the user input or query, loads a menu on the user interface with recommended molecules based on the user input and orders the recommended molecules in the menu based on predictive success parameters derived using trained data sets. The AR/VR apparatus 350 worn by the user allows the user to view the interactive environment with an AR/VR headset.
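As a minimal, non-limiting sketch, ordering a menu of recommended molecules by a predictive success parameter may look like the following. The candidate names, the `predicted_success` field, and the scores are hypothetical placeholders; in practice the scores would come from trained models rather than being hard-coded.

```python
def build_menu(candidates, top_n=3):
    """Return candidate names ordered by predicted success, highest first."""
    ranked = sorted(candidates, key=lambda c: c["predicted_success"], reverse=True)
    return [c["name"] for c in ranked[:top_n]]


# Placeholder scores for illustration only.
candidates = [
    {"name": "molecule_A", "predicted_success": 0.42},
    {"name": "molecule_B", "predicted_success": 0.87},
    {"name": "molecule_C", "predicted_success": 0.65},
]
menu = build_menu(candidates)
```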
In operation, the MCEM 114 is configured to display the result of the analyzed data for candidate molecule and compound bio-functionality. In one embodiment, the displayed result comprises details of identified similarities and is configured to deliver the corresponding values.
Referring to
The comparison module 422 is configured for comparing the user request data, via filters, with the existing data present in database 102. In one embodiment, the comparison module 422 utilizes a plurality of AI models to perform the metric analysis techniques and optimize classifications. Then, in operation, the visualization module 114 displays the identified compounds or molecules in an interactive environment on the client-side display 304.
Referring now to
In one embodiment, system 100 provides all top-level administration, logistics, and direction for the entire software system. In one embodiment, the system 100 provides a custom user interface (UI) based upon the user's criteria through a selection table that can be customized depending on the top-level selection, for example, compound/drug discovery, compound evaluation, etc. In one embodiment, system 100 provides the security procedures and steps required to conform with any and all security requirements. In one embodiment, security is established both between users and against external access. In one embodiment, system 100 is the only portal through which users can access the system. In one embodiment, all user inputs are processed by the system 100, which directs data to other modules while monitoring the other systems. In one embodiment, system 100 also includes the ability to verify that the user-selected function was processed and completed independently using Telco software. Further, system 100 includes AI to provide suggestions at all steps that are not part of the AIS.
Referring to
In operation, an integrator or integrator service 602, an index or index services 604, and a file system service 606 are configured to tag, index, and assign a value to each of the data sets based on the user query. These modules are in communication with the AIS 112 and are configured to transform standardized data sets based on the chosen artificial neural network for rapid retrieval. The AIS 112 can be used to assist the index or index services 604 in tagging and indexing each of the data sets for retrieval.
Further, the DPM 110 allows multiple customer databases 610 with individual security protocols to isolate each database to support the service level agreement (SLA) for each customer. The DPM 110 service provides a real-time monitoring system showing the users and their data environments with safeguards and alarms to signal any issues with the system and/or users.
Referring now to
In operation, if the standardized data is a numerical data set, a neural net is deployed or embedded. If the standardized data set is an image, a convolutional neural network 704 is deployed. If the standardized data set is a graph, a multilayer perceptron model 720 is deployed, and if the standardized data is text, natural language processing 708 with neural nets is deployed. Other models such as RNN 706, NLP 708, transformer 710, LSTM 712, RBF 714, user defined model 716, DNN 718, MLP 720, FN 722, or AI 724n+1 may be executed by the configurator. As such, the configurator 702 is loaded with or in communication with a plurality of AI models and is configured to automatically choose the most optimized model based on a standardized data set type.
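For illustration only, the selection of a model family by data set type may be sketched as a simple lookup. The mapping keys, the model labels, and the `choose_model` function are hypothetical names introduced for this example; the sketch returns a label rather than instantiating an actual network.

```python
# Data set type -> model family, following the pairing described above.
MODEL_BY_DATA_TYPE = {
    "numerical": "neural_net",
    "image": "cnn",    # convolutional neural network
    "graph": "mlp",    # multilayer perceptron
    "text": "nlp",     # natural language processing with neural nets
}


def choose_model(data_type, user_defined=None):
    """Pick a model label for a data set type; a user-defined model wins."""
    if user_defined is not None:
        return user_defined
    try:
        return MODEL_BY_DATA_TYPE[data_type]
    except KeyError:
        raise ValueError(f"no model registered for data type {data_type!r}")


selected = [choose_model(t) for t in ("numerical", "image", "graph", "text")]
```

A table-driven dispatch of this kind keeps the choice of model separate from the models themselves, so additional model families can be registered without changing the selection logic.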
Once the data subset has been executed via the modules, the system is configured to send the output to pooling module 730 based upon classification and the desired result sent to output 740 for analysis and further processing as described with relation to
Referring now to
In an optional embodiment, the database 800 provides value to a “go/no go” decision on making an actual elemental or engineering modification of the therapeutic, which can improve efficacy to reach the ideal therapeutic needed, and is thus in communication with system 100 and visual display 216 for the user.
In one embodiment, the output of the software would be a modified receiver operating characteristic (ROC) curve; the ordinate shall be a value generated from a regression combining available evidence for similar metals and would include a rating of how similar the innovator metal is to the existing metals. The abscissa may indicate the predictor of success, e.g., from 0 to 100. An additional characteristic may be considered to be the number of tests and data available for each comparison. For example, metals with only chemical resistance data would have a higher associated risk, flagged as a caution with the predictor, compared, for example, with metals having chemical resistance, conductivity, density, and past use-case data.
In optional embodiments, a similar conductivity between extant and proposed use of the metal may indicate convergence and a probability of success, i.e., the closer the two values, the lower the risk and the higher the efficacy in its use as it relates to the desired clinical attribute. Likewise, the more features and data that are added, the greater the probability of success, the lower the risk, and the higher the efficacy.
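The relationship described above, in which closer property values and more available data both reduce risk, can be illustrated with a simple sketch. This is not the regression of the disclosed embodiments: the closeness formula, the averaging, the property names, and the numeric values are all assumptions chosen only to make the idea concrete.

```python
def closeness(extant, proposed):
    """Return a value in [0, 1]; 1.0 means the two values are identical."""
    scale = max(abs(extant), abs(proposed))
    if scale == 0:
        return 1.0
    return 1.0 - min(abs(extant - proposed) / scale, 1.0)


def predictor(extant_props, proposed_props, all_properties):
    """Return (score on a 0-100 scale, fraction of properties with data)."""
    shared = [p for p in all_properties
              if p in extant_props and p in proposed_props]
    if not shared:
        return 0.0, 0.0
    mean_closeness = sum(
        closeness(extant_props[p], proposed_props[p]) for p in shared
    ) / len(shared)
    coverage = len(shared) / len(all_properties)
    return 100.0 * mean_closeness, coverage


# Placeholder property values for illustration only.
all_properties = ["chemical_resistance", "conductivity", "density"]
extant = {"conductivity": 50.0, "density": 8.0}
proposed = {"conductivity": 45.0, "density": 8.0}
score, coverage = predictor(extant, proposed, all_properties)
```

Here a low coverage value plays the role of the caution described above: a high score computed from few properties would be treated as carrying more risk than the same score computed from many.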
Referring now to
In operation, a user 910 or developer may input a request, in some embodiments, a request for data gathering, though this may also occur automatically. Subsequent to the input, a scalable Domain Name System (DNS) 902 routes end users to certain applications and connects user requests to infrastructure running in the virtual networked environment, such as load balancers or buckets (e.g., README file), and can also be used to route users to infrastructure outside of the virtual environment. The DNS is further configured to create a hosted zone to facilitate either creation of new DNS records or the migration of existing DNS records.
A web application firewall (WAF) 904 is in communication with the DNS 902 and is configured to monitor the HTTP(S) requests that are forwarded to a Cloud distribution or virtual network, an API Gateway REST API, an Application Load Balancer, or an AppSync GraphQL API. The WAF 904 enables control over access to content based on specified conditions, such as the IP addresses that requests originate from or the values of query strings. Based on these conditions, the service associated with a protected resource responds to requests either with the requested content or with an HTTP 403 status code (Forbidden). The data is then routed to a Virtual Private Cloud 906 that is configured to launch certain resources (or modules) into the virtual network.
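A minimal sketch of condition-based access control of this kind is given below, assuming two example conditions: the originating IP address and the values of query strings. The `waf_decision` function, its parameters, and the blocked values are hypothetical and do not represent the rule syntax of any particular firewall product. The addresses are taken from ranges reserved for documentation.

```python
import ipaddress


def waf_decision(source_ip, query, blocked_networks, blocked_query_values):
    """Return 403 if any condition matches, otherwise 200."""
    addr = ipaddress.ip_address(source_ip)
    # Condition 1: the request originates from a blocked network.
    if any(addr in ipaddress.ip_network(net) for net in blocked_networks):
        return 403
    # Condition 2: a query-string value is on the blocked list.
    if any(value in blocked_query_values for value in query.values()):
        return 403
    return 200


blocked_networks = ["203.0.113.0/24"]
blocked_query_values = {"DROP TABLE"}

allowed = waf_decision("198.51.100.7", {"q": "aspirin"},
                       blocked_networks, blocked_query_values)
denied = waf_decision("203.0.113.9", {"q": "aspirin"},
                      blocked_networks, blocked_query_values)
```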
The data is then routed to an application load balancer function 930 on the application layer, the seventh layer of the Open Systems Interconnection (OSI) model. After the load balancer 930 receives a request, it evaluates listener rules in priority order to determine which rule to apply, and then selects a target from the target group for the rule action. The AIS 112 is configured to provide its own listener rules to route requests to different target groups based on the standardized data and application traffic.
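The rule evaluation described above can be sketched as follows. This is an illustrative model only: the path-prefix conditions, the round-robin target selection, the class and field names, and the target names are assumptions introduced for the example and are not drawn from the disclosed system or from any specific load balancer API.

```python
import itertools


class LoadBalancer:
    """Evaluate listener rules in priority order, then pick a target."""

    def __init__(self, rules, target_groups, default_group):
        # A lower priority number is evaluated first.
        self.rules = sorted(rules, key=lambda r: r["priority"])
        self.default_group = default_group
        # Round-robin iterator per target group (assumed selection policy).
        self._cycles = {name: itertools.cycle(targets)
                        for name, targets in target_groups.items()}

    def pick_group(self, path):
        for rule in self.rules:
            if path.startswith(rule["path_prefix"]):
                return rule["target_group"]
        return self.default_group

    def route(self, path):
        group = self.pick_group(path)
        return group, next(self._cycles[group])


rules = [
    {"priority": 20, "path_prefix": "/api", "target_group": "aggregator"},
    {"priority": 10, "path_prefix": "/api/visualize", "target_group": "visualization"},
]
target_groups = {
    "visualization": ["viz-1", "viz-2"],
    "aggregator": ["agg-1"],
    "web": ["web-1"],
}
balancer = LoadBalancer(rules, target_groups, default_group="web")
```

Because the more specific `/api/visualize` rule has the lower priority number, it is matched before the broader `/api` rule even though it is listed second.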
In operation, there are multiple availability zones 914 and 944 and the load balancer may route certain data to either availability zone. The availability zone 914 comprises a public tier 916, a private tier 922, and a database tier 932. The public tier 916 comprises a gateway 918 and an open VPN 920. The gateway 918 is in communication with code deploy 942 and the open VPN 920 is in communication with the user machine 910. The public tier 916 is configured as a public subnet that can send outbound traffic directly to the internet.
The private tier 922 comprises data aggregator 928 and data visualization module 924. The data aggregator 928 is configured to execute machine learning functions, via AIS 112, configured to scrape and collect data from trusted web sources. Data visualization 924 allows the user to explore and visualize data. It is also configured to allow the user to choose a web server (e.g., Gunicorn®, Nginx®, Apache®), metadata database engine (MySQL, Postgres, MariaDB, etc.), message queue (Redis, RabbitMQ, SQS, etc.), results backend (S3, Redis, Memcached, etc.), and caching layer (Memcached, Redis, etc.).
The data visualization module 924 and data aggregator 928 are in communication with the database tier 932. The database tier comprises database 934 configured as a fully managed database service built for the cloud that can build and run graph applications and is optimized for storing billions of relationships and querying the graph with millisecond latency.
Availability zone 944 is in communication with the load balancer and comprises public tier 948, private tier 956, and database tier 960. The public tier 948 comprises gateway 950, which is configured to act in the same way as gateway 918. Private tier 952 comprises data aggregator 954 and data visualization module 956, each of which is configured to act in the same way as data aggregator 928 and data visualization module 924, respectively. Lastly, the database tier 958 comprises database 960, which operates in a similar fashion to database 934.
The virtual network 908 is in communication with source 910, which is created with the appropriate source material for the build and deployment stages for data gathering. The source 936 is configured to communicate with the code repositories comprising GitHub 936, Bitbucket 938, CodeCommit 940, or GitLab 942. Code Pipeline is configured to create a CloudFormation change set after Code Build finishes the build stage and later executes that change set with or without manual approval. Code Build is configured with an appropriate buildspec.yml file to build and package the code that needs to be deployed. The build step is triggered by Code Pipeline after source changes and before deployment begins. In the case of serverless, Code Build is configured to build the code (if needed), create a CloudFormation template package, and pass it along to Code Pipeline to create a CloudFormation change set and then execute the change set. Code Deploy 942 is configured to automate code deployments to the cloud. Deployment groups are created for each code base and configured to deploy specific code updates to their corresponding computer architectures.
Config logs 964 are configured as KMS-encrypted buckets to store logs from Config 962. The bucket is configured with a dedicated bucket policy allowing only Config 962 access to read/write logs. A Lifecycle Policy is configured to delete logs after some period of time, usually 30 days. DNS Queries 966 is configured as a protective firewall, and CloudTrail is deployed in all regions to log, continuously monitor, and retain account(s) activity related to API actions across the infrastructure. Threat detection 972 continuously monitors for malicious activity and unauthorized behavior to protect account(s), workloads, and data.
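As a simplified illustration of a retention rule of this kind, the following sketch identifies log objects older than a retention period. It models the policy in plain Python rather than through any cloud provider's lifecycle API; the object keys, timestamps, and the `expired_keys` function are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expired_keys(objects, now, retention_days=30):
    """Return the keys of objects created before the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    return [key for key, created in objects.items() if created < cutoff]


# Placeholder log objects: key -> creation time.
log_objects = {
    "config/2023-01-15.log": datetime(2023, 1, 15, tzinfo=timezone.utc),
    "config/2023-02-20.log": datetime(2023, 2, 20, tzinfo=timezone.utc),
}
now = datetime(2023, 3, 1, tzinfo=timezone.utc)
to_delete = expired_keys(log_objects, now)
```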
Fleet 1008 is deployed and is configured to launch multiple instance types across multiple availability zones with a single API. ECR 1112 comprises repositories that are created to house container images in the primary operating region. Batch 1114 is a fully managed batch computing service that plans, schedules, and runs containerized batch or ML workloads across the full range of the virtual network. Cloud object storage 1116 is in communication with the computer fleet for data transmission after job 1118 is submitted.
Phase 1204 receives data sets from the MCEM 114 and AIS 112 at a rendering module 1212. Using the rendering module 1212 together with image builder 1214, engine 1216, streaming protocol 1218, and database 1220, an interactive AR/VR environment is generated that comprises graphical representations of the candidate molecules in atomic resolution based on a user query. The rendering module is configured with 3-D rendering software and may be configured to use the image builder 1214, which provides a 2-D and 3-D platform to create scenes. The engine 1216 is configured as a real-time 3-D creation engine for photo-real visuals and immersive experiences and is configured to build artifacts that will be fully managed on a non-persistent application streaming service 1218.
This data can then be accessed at phase 1206 and viewed via client device 1222 using workflow filters 1224 and visual output 1226. Based on a received signal, which may be AI assisted, the user may alter a configuration of the candidate compound, interact with the candidate compound, and replace a molecule in the candidate compound. The system is then configured to automatically provide clinical attribute data and predictive scores as to whether that molecule will meet the needs of the user for a specific drug, for example. A user interface generates clinical characteristics of the candidate compound as altered by the user in near real time.
Now with reference to
Referring now to
Referring now to
Referring now to
Referring now to
In exemplary embodiments, this approach to scoring is objective and appreciably better than human judgment in terms of accuracy and efficiency. It removes any subjective input in favor of more objective appraisals when applied to the selection process and go/no go decisions in metal development for all industry sizes.
Advantageously, the present invention makes it possible to make decisions early, saving time and expenses. Further, the data supports management decisions with more circumspection. The system compares the known physical, chemical, genetic, and biological dimensions of molecules and metals and identifies similarities in their composition and behavior. The system performs these comparisons on any level of scale, in vivo, in vitro, and in silico simultaneously. In addition, the system self-checks for the accuracy of results using available data repositories. Further, the system uses machine learning to train the algorithms to accurately identify the high potential similarity matches and display results using 3D, virtual, or augmented reality visualizations.
Preferred embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. It should be understood that the illustrated embodiments are exemplary only and should not be taken as limiting the scope of the invention.
The preceding description comprises illustrative embodiments of the present invention. Having thus described exemplary embodiments of the present invention, it should be noted by those skilled in the art that the within disclosures are exemplary only and that various other alternatives, adaptations, and modifications may be made within the scope of the present invention. Merely listing or numbering the steps of a method in an order does not constitute any limitation on the order of the steps of that method. Many modifications and other embodiments of the invention will come to mind to one skilled in the art to which this invention pertains, having the benefit of the teachings presented in the preceding descriptions. Although specific terms may be employed herein, they are used only in a generic and descriptive sense and not for purposes of limitation. Accordingly, the present invention is not limited to the specific embodiments illustrated herein.
The present utility patent application is a United States National Stage application filed under 35 U.S.C. § 371 of International Patent Application No. PCT/US23/11398, filed on Jan. 24, 2023, entitled System And Method For Predictive Candidate Compound Discovery, which claims the priority benefit of U.S. provisional patent application Ser. No. 63/302,418, filed on Jan. 24, 2022, entitled System And Method For Predictive Candidate Compound Discovery, the entirety of each of which is incorporated herein by reference for all purposes.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2023/011398 | 1/24/2023 | WO |
Number | Date | Country | |
---|---|---|---|
63302418 | Jan 2022 | US |