This application incorporates by reference the sequence listing submitted as ASCII text filed via EFS-Web on Apr. 6, 2012. The Sequence Listing is provided as a file entitled “IVSC.027A Sequence Listing”, created on Apr. 6, 2012 and which is approximately 44.9 kilobytes.
Traditionally, diagnosis of disease has relied on morphological examination and symptom presentation. However, under this approach, no diagnosis occurs until the disease has progressed to the point of physical manifestation. For many diseases, early detection can lead to early treatment, which, in turn, can significantly improve recovery and survival rates. Further, detection of a susceptibility to a disease prior to the appearance of symptoms can enable changes in lifestyle, which can minimize severity or even prevent the disease from ever manifesting. Thus, over the past several years, there has been considerable interest in the role of biological markers in the prevention, early detection, diagnosis and treatment of disease.
Generally described, a biomarker is any substance or characteristic that may be objectively measured and used as an indicator of a biological state, normal biologic processes, pathogenic processes or pharmacologic responses to a therapeutic intervention. One example of a biomarker of a pathogenic process is the presence of characterized mutations in the fms-related tyrosine kinase 3 (FLT3) gene. FLT3 mutations are among the most frequent somatic alterations in acute myeloid leukemia (AML), occurring in approximately one quarter of patients. The presence of a FLT3 mutation is indicative of poor prognosis.
While FLT3 mutation is an instance where detection of a single biomarker is indicative of a disease state, given the complex interaction of human biochemistry, the interaction of multiple markers often has a bearing on the presence or absence of disease, disease predisposition, or response to therapeutic intervention. In most cases, it is the constellation of biomarkers that acts as an overall indicator of present and future biological states, normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention. Further, inter-relationships between biomarkers are such that the relevance or impact of one detected biomarker might be altered based on the presence or status of one or more other biomarkers.
In some cases, biomarkers are subject to restricted use by various rights holders. For example, a particular biomarker or use of that biomarker may be subject to intellectual property rights, such as patent rights. Utilization of these restricted use biomarkers may require permission of the relevant rights holders, which often further requires payment of a licensing fee or royalty. However, mere identification of the rights holder can prove difficult for a practitioner who wants to utilize biomarker analysis. This is further complicated in cases where multiple biomarkers together indicate a biological state. Often, these multiple biomarkers are subject to different rights or different rights holders, complicating their use. These issues can discourage the use and development of biomarkers.
Equally or more problematic for the physician, and with a very real impact on the care of patients, is the difficulty of staying current with information on the clinical efficacy of biomarkers. It is harder still to understand how the ever-increasing number of individual biomarkers, coupled with other biological or data inputs and various additional lifestyle metrics, are together likely to affect the patient, both in risk assessment and in disease intervention and treatment.
The foregoing aspects and many of the attendant advantages of this disclosure will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
The embodiments described herein relate to systems and methods for managing biomarkers, relationships between biomarkers, biomarker datasets and biological data, and the identification and payment of licensing fees to rights holders. The methods and systems described herein promote the use and development of biomarkers and biomarker dataset relationships by simplifying the identification and compensation of rights holders by practitioners. Such systems and methods may benefit all stakeholders.
This is becoming more important as affordable whole genome sequencing becomes a reality. Now, as opposed to targeted detection using monoclonal antibodies, PCR, or hybridization assays, it is possible to sequence an individual's entire genome, or relevant portions thereof, and simply detect, for example, the presence or absence of a sequence of interest within that data set. Alternatively, one can sequence potentially mutated cells (such as cancer cells, including lymphoid cancers and solid tumors) to ascertain whether certain mutations are present. Although the patient or care provider may be in possession of up to an entire genome sequence, the detection of biomarkers of importance within that sequence may implicate intellectual property rights of many different parties with many different licensing policies, making it difficult or impossible for individuals to navigate the complex intellectual property landscape, and difficult or impossible for intellectual property owners to collect applicable usage royalties. Hence, disclosed herein is a system, apparatus, and method for aggregating intellectual property rights under a single, simplified licensing and content delivery system that facilitates compliance with intellectual property rights and provides legal, curated results to end users.
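The detection step described above, checking a sequence data set for the presence or absence of a sequence of interest, can be illustrated with a minimal sketch. It assumes exact substring matching on either DNA strand; the function names, marker names, and sequences are hypothetical and chosen only for illustration, and a practical system would likely use alignment-based tools instead.

```python
from typing import Dict

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def detect_sequences(sample: str, sequences_of_interest: Dict[str, str]) -> Dict[str, bool]:
    """Report, for each named sequence, whether it occurs on either strand of the sample.

    Hypothetical helper: exact matching only, no tolerance for mismatches or gaps.
    """
    sample = sample.upper()
    results = {}
    for name, motif in sequences_of_interest.items():
        motif = motif.upper()
        results[name] = motif in sample or reverse_complement(motif) in sample
    return results


# Invented example data, not real biomarker sequences.
sample_read = "TTGACCATGGATCCAAGT"
markers = {"marker_a": "ATGGATCC", "marker_b": "GGGGCCCC", "marker_c": "ACTTGG"}
presence = detect_sequences(sample_read, markers)
```

Here `marker_c` is found only through its reverse complement, which is why both strands are checked.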
In the description that follows, a number of terms are used extensively:
As described herein, a biomarker is any substance that may be objectively measured and used as an indicator of a physiological state or likelihood of change of that physiological state. The physiological state may be a normal biologic process, a pathogenic process, a response to exercise, a response to a pharmacologic intervention, or a response to another therapeutic intervention. Examples of a biomarker include, but are not restricted to, DNA, RNA, proteins, peptides, carbohydrates, lipids, or fragments thereof, metabolites, and other small molecules. Examples of nucleic acid-based biomarkers include, but are not limited to, gene mutations, polymorphisms and quantitative gene expression analysis. In some embodiments, metabolic products may serve as biomarkers. In some embodiments, antigens and antibodies may serve as biomarkers. In several embodiments, a nucleotide polymorphism may serve as a biomarker. In some embodiments, the detected level of a protein may serve as a biomarker. The term “licensed biomarker” refers to any biomarker subject to restricted use by a rights holder.
A “biomarker data set” is an electronic or data representation of biological data from a biological sample or from a patient or individual. In one exemplary embodiment, it comprises a DNA sequence, an RNA sequence, a protein or peptide sequence, exome, transcriptome, antibody (including autoantibody) profile, metabolome, epigenome, proteome or one or more measured biometric values from a patient. The biomarker data set could be represented by one-, two-, or three-dimensional images or points, or by information derived from any of the techniques employed to analyze biological systems, but in most instances, it will comprise data in a computer or storage medium, or data being transmitted in any manner, whether in electronic, optical, sonic, electromagnetic, wave-form, or any other form.
As used herein, “nucleic acid” or “nucleic acid molecule” refers to polynucleotides, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), oligonucleotides, fragments generated by the polymerase chain reaction (PCR), and fragments generated by any of ligation, scission, endonuclease action, and exonuclease action. Nucleic acid molecules can be composed of monomers that are naturally-occurring nucleotides (such as DNA and RNA), or analogs of naturally-occurring nucleotides (e.g., enantiomeric forms of naturally-occurring nucleotides), or a combination of both. Nucleic acids can be either single stranded or double stranded.
An “isolated nucleic acid molecule” is a nucleic acid molecule that is not integrated in the genomic DNA of an organism. For example, a DNA molecule that encodes a growth factor that has been separated from the genomic DNA of a cell is an isolated DNA molecule. Another example of an isolated nucleic acid molecule is a chemically-synthesized nucleic acid molecule that is not integrated in the genome of an organism. A nucleic acid molecule that has been isolated from a particular species is smaller than the complete DNA molecule of a chromosome from that species.
“Complementary DNA (cDNA)” is a single-stranded DNA molecule that is formed from an mRNA template by the enzyme reverse transcriptase. Typically, a primer complementary to portions of mRNA is employed for the initiation of reverse transcription. Those skilled in the art also use the term “cDNA” to refer to a double-stranded DNA molecule consisting of such a single-stranded DNA molecule and its complementary DNA strand. The term “cDNA” also refers to a clone of a cDNA molecule synthesized from an RNA template.
A “polypeptide” is a polymer of amino acid residues joined by peptide bonds, whether produced naturally or synthetically. Polypeptides of less than about 10 amino acid residues are commonly referred to as “peptides.”
A “protein” is a macromolecule comprising one or more polypeptide chains. A protein may also comprise non-peptide components, such as carbohydrate groups. Carbohydrates and other non-peptide substituents may be added to a protein by the cell in which the protein is produced, and will vary with the type of cell. Proteins are referred to herein in terms of their amino acid backbone structures; substituents such as carbohydrate groups are generally not specified, but may be present nonetheless.
As used herein, the terms “patient” and “subject” refer to a biological system from which a biological sample or biological data can be collected or to which a therapeutic agent can be administered. A patient can refer to a human patient or a non-human patient. Patients can include those that are healthy and those having a disease, such as cancer. Patients having a disease can include patients that have been diagnosed with the disease, patients that exhibit a set of symptoms associated with the disease, and patients that are progressing towards or are at risk of developing the disease.
As used herein, the term “biological sample” refers to a biological material that can be collected from a patient and used in connection with diagnosis or monitoring of biological states. Biological samples can include clinical samples, including body fluid samples, such as body cavity fluids, urinary fluids, cerebrospinal fluids, blood, and other liquid samples of biological origin; and tissue samples, such as biopsy samples, primary tumor samples, and other solid samples of biological origin. Biological samples can also include those that are manipulated in some way after their collection, such as by treatment with reagents, culturing, solubilization, enrichment for certain biological constituents, cultures or cells derived therefrom, and the progeny thereof.
As used herein, the term “biological state” refers to a condition associated with a patient or associated with a biological sample collected from the patient. A biological state can refer to a healthy state, which corresponds to a normal condition in the substantial absence of a disease, or a disease state, which corresponds to an abnormal or harmful condition associated with a disease.
As used herein, the terms “biological data” and “biological sample data” refer to any information associated with a patient or associated with a biological sample collected from the patient. Biological data can include whole or partial genome sequence, exome, transcriptome, antibody (including autoantibody) profile; metabolome; epigenome; and proteome data. Biological data can also include gender; age; weight; geographic location; family history; personal history; race and ethnicity; drug use (therapeutic and recreational); alcohol use; tobacco use; physical activity; diet; blood pressure, heart rate, metabolite levels, blood sugar levels, blood oxygen saturation levels, cholesterol level and other biometric or physiological data.
The term “user” as used herein is not limited and may be any person or entity that interacts with the licensed biomarker management environment. Examples of users include, but are not limited to, patients corresponding to the biological samples or biological data, health care providers, researchers, health care organizations, research organizations, laboratories, biomarker rights holders, pharmaceutical companies and corporations, intermediary service providers, biologics companies and corporations, universities, licensees, etc.
Generally described, the present disclosure is directed to managing biomarkers subject to restricted use. Specifically, aspects of the disclosure will be described with regard to the management and processing of biomarker data sets, biological data that contain biomarkers, and other biological data. Although various aspects of the disclosure will be described with regard to illustrative examples and embodiments, one skilled in the art will appreciate that the disclosed embodiments and examples should not be construed as limiting.
The licensed biomarker management environment 100 can also include a license management provider 102 in communication with the one or more computing devices 114 via the network 116. The license management provider illustrated in
The license management provider 102 can further include a biomarker data store 108 in communication with the biological data processing server 106 for storing information regarding individual biomarkers or groups of biomarkers. The biological data processing server 106 may utilize biomarker information stored within biomarker data store 108 to identify corresponding biomarkers in submitted biological data.
Additionally, the license management provider 102 can include a license and payment server 104 in communication with the biological data processing server 106. The license and payment server 104 may utilize biomarker information to determine licensing information associated with those biomarkers. In some embodiments, the biomarker information utilized may be received from the biological data processing server 106, such as where specific biological data is processed to determine existence of one or more biomarkers in the biological data. In other embodiments, the license and payment server 104 may receive biomarker information from other sources, such as from computing devices 114 over a communications network 116. The license and payment server 104 may then determine licensing information for biomarkers described within the received biomarker information. This determination process is further described with respect to
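As a rough illustration of how licensing information might be determined from received biomarker information, the following sketch groups license records by rights holder. The table contents, identifiers, fee amounts, and the `licensing_info` function are all hypothetical; the disclosure does not specify a data model, so this is one possible arrangement rather than the described implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical license records: identifiers, holders, and fees are invented.
LICENSE_TABLE = {
    "BM-001": {"rights_holder": "Holder A", "fee": 25.00},
    "BM-002": {"rights_holder": "Holder B", "fee": 40.00},
    "BM-003": {"rights_holder": "Holder A", "fee": 10.00},
}


def licensing_info(biomarker_ids: Iterable[str]) -> Dict[str, List[dict]]:
    """Group license records for the given biomarkers by rights holder.

    Biomarkers with no record in the table are listed under the key
    "unlicensed" with a zero fee (an assumption made for this sketch).
    """
    grouped = defaultdict(list)
    for marker_id in biomarker_ids:
        record = LICENSE_TABLE.get(marker_id)
        if record is None:
            grouped["unlicensed"].append({"biomarker": marker_id, "fee": 0.0})
        else:
            grouped[record["rights_holder"]].append(
                {"biomarker": marker_id, "fee": record["fee"]}
            )
    return dict(grouped)


info = licensing_info(["BM-001", "BM-003", "BM-999"])
```

Grouping by rights holder keeps the per-holder totals easy to compute when several detected biomarkers belong to the same party.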
In some embodiments, the license management provider 102 may further include a data store (not shown) for storing submitted biological data or information regarding the biological data for future use. For example, the license management provider 102 may enable a customer to submit biological data, for example a genome sequence, exome sequence, metabolome, proteome, etc., for future testing in addition to or exclusive of current testing. In some embodiments, the license management provider 102 may enable a user to create an account and associate submitted biological data with the account. The account may further be associated with additional user information (e.g., payment information, personal information, health care provider contact information, etc.). As such, though illustrative embodiments are described herein with respect to submission and analysis of biological data, in some embodiments, previously submitted biological data may be utilized for analysis.
In some embodiments, the license management provider 102 can include a diagnosis and treatment information processing server 105 in communication with the biological data processing server 106. The diagnosis and treatment information processing server 105 may utilize biomarker data information to identify diagnosis and treatment recommendations; perform molecular risk assessments; calculate statistical correlations; identify predictive, diagnostic, prognostic, staging and pharmacodynamic biomarkers and generate other relevant information associated with a biomarker data set or biomarker data sets. In some embodiments, the diagnosis and treatment information processing server 105 may utilize biomarker data information in combination with other biological data information to identify diagnosis and treatment recommendations; perform molecular risk assessments; calculate statistical correlations; identify predictive, diagnostic, prognostic, staging and pharmacodynamic biomarkers; monitor physiologic condition; calculate health status; determine need for medical intervention and generate other relevant information associated with the biological data profile. In some embodiments, the biomarker data information and/or other biological data information may be received from the biological data processing server 106, such as where biological data is processed to determine existence of one or more biomarkers in the biological data. In other embodiments, the diagnosis and treatment information processing server 105 may receive biomarker data information and/or other biological data information from other sources, such as from computing devices 114 over a communications network 116.
The diagnosis and treatment information processing server 105 may then identify diagnosis and treatment recommendations; perform molecular risk assessments; monitor physiologic condition; calculate health status; determine need for medical intervention and generate other relevant information based on the biomarkers described and/or other biological data information within the received biological data set to generate a customized diagnosis and treatment report.
The license management provider 102 can further include a diagnosis and treatment information data store 107 in communication with the diagnosis and treatment information processing server 105 for storing information regarding diagnosis and treatment recommendations, biomarker data set information, biological data set information (biomarker data set information and other biological data set information), and other relevant information associated with the presence of individual biomarkers, groups of biomarkers and/or other biological data information. The diagnosis and treatment information processing server 105 may utilize diagnosis and treatment recommendation information stored within treatment information data store 107 to identify corresponding diagnosis and treatment recommendation information in submitted biomarker information and/or other biological data information.
Following analysis of the biological data for the presence of one or more of the applicable subset of biomarkers, a list of biomarkers detected in the biological data (biomarker data information) and/or other biological data information can be transmitted to the diagnosis and treatment information processing server 105. A list of biomarkers detected in the biological data (biomarker data information) and/or other biological data information can be transmitted to the diagnosis and treatment information processing server 105 prior to, simultaneous with or after processing and confirmation of payment information. In one embodiment, a list of biomarkers detected in the biological data (biomarker data information) and/or other biological data information can be transmitted to the diagnosis and treatment information processing server 105 after the license and payment server 104 processes the submitted payment information and confirms payment to the biological data processing server 106. The diagnosis and treatment information processing server 105 can utilize the detected biomarker information and/or other biological data information to determine a molecular risk assessment, customized diagnosis and treatment recommendations and other relevant information applicable to the detected biomarkers. The diagnosis and treatment information processing server 105 may apply data on effectiveness of a biomarker or set of biomarkers for the diagnosis, prognosis, and risk assessment for a particular physiological condition generated in independent studies and/or by analysis of biological information data sets to determine a molecular risk assessment, customized diagnosis and treatment recommendations, or other relevant information applicable to the detected biomarker data information. 
Customized diagnosis and treatment recommendations may correspond, for example, with prognosis information; with disease diagnosis information; with disease staging information; with molecular risk assessment information; with pharmaceutical treatment information; with response to clinical intervention; with recommendation of additional biomarker test information; with specialist referral information; with support group referral information; with clinical study participation information; with dietary treatment information or with other diagnosis and treatment recommendations information.
After diagnosis and treatment recommendation information applicable to the detected biomarkers has been determined and formatted, this information can be returned to the biological data processing server 106, which formats detailed results. Utilizing the diagnosis and treatment recommendations information, the biological data processing server 106 can provide a customized diagnosis and treatment recommendation report to the computing device 114 after license agreement and fee payment are received. In some embodiments, the customized diagnosis and treatment recommendation may correspond to a description of the biomarkers detected and the disease, condition or other physiological state to which the biomarkers are linked. In some embodiments, the customized diagnosis and treatment recommendation information may correspond to an assessment of the probability of having developed or developing a particular disease associated with the detected biomarkers and/or other biological data information. In other embodiments, the summary of customized diagnosis and treatment recommendation information may correspond to pharmaceutical treatment and dosing information. In some embodiments, upon receipt of license agreement and payment, the diagnosis and treatment recommendations information and payment confirmation are provided to the user of the client computing device 114. In some embodiments, a summary of detected biomarkers is provided such that the user of the client computing device 114 may select one or more of the detected biomarkers and submit payment information corresponding to the required licensing fees to receive further diagnosis and treatment recommendation information regarding the selected biomarkers.
In some embodiments, the payment confirmation and diagnosis and treatment recommendation information are provided such that the user of the client computing device 114 may select one or more additional assays, biomarkers or sets of biomarkers recommended for further analysis of the biological data based on the profile of the analyzed biomarker(s) by selecting a corresponding checkbox and submitting the selection by use of submission button 610. In some embodiments, this may require submission of payment information corresponding to the required licensing fees. Alternatively, the user may elect not to undertake further analysis by selecting a cancellation button.
Following receipt of the detected biomarker data information and/or other biological data information, the diagnosis and treatment information processing server 105 can determine or apply appropriate statistical calculations with which to analyze correlations between the biomarker data information and likelihood of having developed or developing a disease or condition; responding to a particular treatment regime; or having a particular physiological state. Such determination may be made, for example, based on a statistically weighted combination of biomarker data information and other biological data information. The applicable subset or subsets of diagnosis and treatment recommendation information may correspond to a specifically detected biomarker, to specifically detected groups or subgroups of biomarkers, to combinations of specifically detected biomarkers and other biological data information, to a designated range of confidence intervals for a correlation, or to any other selection criteria. The diagnosis and treatment information processing server 105 then requests the applicable subsets of diagnosis and treatment information from the diagnosis and treatment information datastore 107, which are then returned to the diagnosis and treatment information processing server 105 by the datastore 107. The diagnosis and treatment information processing server 105 then formats the diagnosis and treatment information and provides the diagnosis and treatment information to the biological data processing server 106. As will be appreciated by one skilled in the art, the returned diagnosis and treatment information may, in various embodiments, correspond to general information, a customized diagnosis and treatment report, a report that a correlation with a disease or condition was detected without disclosing the disease or condition, or other diagnosis and treatment information.
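One way to picture a statistically weighted combination of biomarker data and other biological data is a weighted sum mapped to a value between 0 and 1 with the logistic function. The disclosure does not name a particular model, so the logistic form, the feature names, the weights, and the intercept below are all assumptions made purely for illustration.

```python
import math
from typing import Dict


def weighted_risk_score(
    features: Dict[str, float],
    weights: Dict[str, float],
    intercept: float = 0.0,
) -> float:
    """Combine feature values using per-feature weights and return a value in (0, 1).

    Assumed model for this sketch: a logistic function of a weighted sum.
    Features without a weight are ignored.
    """
    linear = intercept + sum(
        weights[name] * value for name, value in features.items() if name in weights
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Invented weights: one biomarker indicator and one non-biomarker input.
example_weights = {"marker_x_present": 1.2, "age_over_60": 0.4}

score_with_marker = weighted_risk_score(
    {"marker_x_present": 1.0, "age_over_60": 1.0}, example_weights, intercept=-1.6
)
score_without_marker = weighted_risk_score(
    {"marker_x_present": 0.0, "age_over_60": 1.0}, example_weights, intercept=-1.6
)
```

In practice the weights would come from fitted statistical models or independent studies; the values above have no clinical meaning.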
In some embodiments, following receipt of the detected biomarker data information and/or other biological data information, the diagnosis and treatment information processing server 105 can retrieve biological information data sets or biomarker information data sets from the diagnosis and treatment information datastore 107 to determine or apply appropriate statistical calculations with which to analyze the detected biomarker information and/or other biological data information in comparison to other biological information data sets retrieved from the diagnosis and treatment information datastore. The diagnosis and treatment information processing server 105 compares biological information data sets and applies statistical analysis to identify statistically significant correlations between individual biomarkers, groups and subgroups of biomarkers, other biological characteristics (for example, gender, age, race and ethnicity, weight, activity levels, drug use, medical history, family history, etc.), or any combination thereof and a physiological state. Based on statistically significant correlations, the diagnosis and treatment information processing server 105 can perform molecular risk assessments; identify predictive, diagnostic, prognostic, staging and pharmacodynamic biomarkers and generate other correlative data. The diagnosis and treatment information processing server 105 may also apply parameters for sensitivity (e.g. >=0.9) and/or specificity (e.g. >=0.9). The diagnosis and treatment information processing server 105 may apply statistical analysis to identify individual biomarkers, groups and subgroups of biomarkers with positive predictive value or negative predictive value for a particular physiological state, for example response to a pharmaceutical treatment regime. 
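The sensitivity, specificity, and predictive value parameters mentioned above follow from standard confusion-matrix definitions. The sketch below computes them from counts of true and false positives and negatives and checks them against the example thresholds of 0.9; the function names and the counts used are hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, PPV, and NPV from confusion-matrix counts.

    A metric whose denominator is zero is returned as NaN.
    """
    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


def meets_thresholds(
    metrics: Dict[str, float],
    min_sensitivity: float = 0.9,
    min_specificity: float = 0.9,
) -> bool:
    """Return True when both sensitivity and specificity reach their minimums."""
    return (
        metrics["sensitivity"] >= min_sensitivity
        and metrics["specificity"] >= min_specificity
    )


# Invented counts for a hypothetical biomarker evaluated on 200 samples.
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
acceptable = meets_thresholds(metrics)
```

Because comparisons with NaN are false, a biomarker with an undefined sensitivity or specificity does not pass the threshold check.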
The results of the statistical analysis undertaken by the diagnosis and treatment information processing server 105 are formatted, and this information can be returned to the biological data processing server 106, which formats detailed results. Upon receipt of license agreement and payment, the detailed results and payment confirmation are provided to the user of the client computing device 114. In some embodiments, detailed results on the statistical analysis of a biomarker are provided to the rights holder.
In some embodiments, the diagnosis and treatment information processing server 105 may submit the detected biomarker information and/or other biological data information received from the biological data processing server 106 to the diagnosis and treatment information datastore 107. In some embodiments, the biological data processing server 106 may submit biomarker data information and/or other biological data information directly to the diagnosis and treatment information datastore 107 (interaction not shown).
Though identifying diagnosis and treatment recommendation information for the biomarker data set is discussed with regard to the diagnosis and treatment information processing server 105, one skilled in the art will appreciate that in some embodiments the biological data processing server 106 can perform this function.
With reference to
In some embodiments, the license management provider 102 may act to negotiate licenses between one or both of the submitting user (as licensee) and the rights holder (as licensor). Negotiation of such licenses may include, for example, consideration of the size of each entity (e.g., income level, number of employees, etc.), status of each entity (e.g., individual or legal entity, for-profit, non-profit, or educational entity), or other criteria. In some embodiments, licenses may be sought for a collection of intellectual property rights (e.g., one or more patents, trade secrets, etc.). As such, negotiation of licenses may include consideration of the collection of rights sought to be licensed.
In other embodiments, the license management provider 102 may act as an intermediary that holds a license from one or more rights holders, and that offers sublicenses to users. Sublicenses granted to users may include terms additional or alternative to those of the licenses held by the license management provider 102 (or an operator thereof). For example, in some embodiments, the license management provider 102 may provide sublicenses to intellectual property rights at a lower cost than specified in the original license (e.g., for advertising or marketing purposes), or may provide combinations of sublicenses for fixed costs independent of the cost paid for the initial license by the license management provider 102. One skilled in the art will appreciate that terms of sublicenses may include any of the considerations discussed above with respect to licenses, as well as alternative or additional considerations.
One example of a user interface for such biological data submission is shown in
Following receipt of the biological data, the biological data processing server 106 can determine or apply an applicable subset or subsets of biomarkers with which to analyze the biological data. Such determination may be made, for example, based on information provided by the computing device 114. The applicable subset or subsets of biomarkers may correspond to specifically selected biomarkers, to biomarkers associated with certain diseases, to biomarkers having certain licensing characteristics, or to any other selection of biomarker subsets. The biological data processing server 106 then requests the applicable subsets of biomarkers from the biomarker datastore 108, which are then returned to the biological data processing server 106 by the datastore 108. As will be appreciated by one skilled in the art, the returned biomarkers may, in various embodiments, correspond to biomarker data or other biomarker information that facilitates analysis of biological data for the applicable subset of biomarkers.
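The subset selection described above, by specific biomarkers, by associated disease, or by licensing characteristics, can be sketched as a simple filter over a catalog. The `BiomarkerRecord` type, the catalog entries, and the `select_subset` function are hypothetical stand-ins for whatever the biomarker datastore 108 actually holds.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set


@dataclass(frozen=True)
class BiomarkerRecord:
    """Hypothetical catalog entry; field names are assumptions for this sketch."""
    marker_id: str
    diseases: FrozenSet[str]
    licensed: bool


# Invented catalog contents.
CATALOG = [
    BiomarkerRecord("BM-001", frozenset({"disease_x"}), licensed=True),
    BiomarkerRecord("BM-002", frozenset({"disease_x", "disease_y"}), licensed=False),
    BiomarkerRecord("BM-003", frozenset({"disease_y"}), licensed=True),
]


def select_subset(
    catalog: Iterable[BiomarkerRecord],
    marker_ids: Optional[Set[str]] = None,
    disease: Optional[str] = None,
    licensed: Optional[bool] = None,
) -> List[BiomarkerRecord]:
    """Return catalog entries matching every criterion that is not None."""
    selected = []
    for record in catalog:
        if marker_ids is not None and record.marker_id not in marker_ids:
            continue
        if disease is not None and disease not in record.diseases:
            continue
        if licensed is not None and record.licensed != licensed:
            continue
        selected.append(record)
    return selected


disease_x_markers = select_subset(CATALOG, disease="disease_x")
licensed_disease_y = select_subset(CATALOG, disease="disease_y", licensed=True)
```

Leaving a criterion as `None` means it is not applied, so calling `select_subset(CATALOG)` with no criteria returns the whole catalog.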
With continued reference to
With continued reference to
In another embodiment, the user selects particular assays or biomarkers or sets of biomarkers and pays prior to performance of the analysis or prior to identification of the results (detected biomarkers). In this instance, the royalty model may provide for payment of a royalty, regardless of whether the result of the analysis is positive (biomarker present) or negative (biomarker absent).
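The difference between paying a royalty regardless of the result and paying only for detected biomarkers can be shown with a short sketch. The `royalties_due` function, the identifiers, and the fee amounts are hypothetical, and real royalty terms would be set by the applicable license agreements.

```python
from typing import Dict, Iterable, Set


def royalties_due(
    selected: Iterable[str],
    detected: Set[str],
    fees: Dict[str, float],
    pay_regardless_of_result: bool,
) -> Dict[str, float]:
    """Return the fee owed for each selected biomarker under the chosen model.

    If pay_regardless_of_result is True, every selected biomarker incurs its
    fee whether the analysis result is positive or negative. Otherwise only
    selected biomarkers that were actually detected incur a fee.
    """
    due = {}
    for marker_id in selected:
        if pay_regardless_of_result or marker_id in detected:
            due[marker_id] = fees[marker_id]
    return due


# Invented fees and results.
fees = {"BM-001": 25.0, "BM-002": 40.0}
selected = ["BM-001", "BM-002"]
detected = {"BM-001"}

prepaid = royalties_due(selected, detected, fees, pay_regardless_of_result=True)
on_detection = royalties_due(selected, detected, fees, pay_regardless_of_result=False)
```

With these inputs the prepaid model charges for both selected biomarkers, while the detection-based model charges only for the one that was found.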
One example of a web-based user interface for such selection of detected biomarkers is user interface 600 shown in
A second example of a web-based user interface for selection of detected biomarkers is user interface 650 shown in
With reference now to
Though formatting of detailed results for the selected biomarkers is discussed before submission, processing, and confirmation of payment information, one skilled in the art will appreciate that the processes may be accomplished simultaneously or in any order while still falling within the scope of the present disclosure.
Once detailed results are formatted and payment is confirmed, the biological data processing server 106 may transmit the detailed results and confirmation of payment to the computing device 114 or another computing device of the user's designation. One example of a web-based interface for receiving such detailed results is user interface 700 shown in
In some embodiments, any of the interactions described above with respect to
With reference now to
One example of a user interface for such a biomarker database is web-based user interface 800 shown in
With continued reference to
In some embodiments, the user of the computing device 114 may wish to enter into a license agreement relating to one or more of the detected biomarkers. The computing device 114 may then request a license for the selected biomarkers from the license and payment server 104. The license and payment server 104 may then grant the user of the computing device 114 a license associated with the detected biomarkers. Such a license agreement may define fees associated with use of the detected biomarkers or may further define amounts to be paid to rights holders of the licensed biomarkers covered under the license agreement. In still other embodiments, the license agreement may specify additional licensing terms. Such terms may include provisions that new biomarkers detected during use associated with the licensed biomarkers are the intellectual property of the operator of the license and payment server 104, the rights holder of a licensed biomarker or biomarkers, or another entity.
One example of a user interface for displaying such transmitted information is the web-based user interface 900 shown in
With reference now to
Although the process described above involves the user of a computing device 114 in order to interface with either a biological data processing server 106 or a license and payment server 104, one skilled in the art will appreciate that the process may be carried out via various modes of interaction. In some embodiments, for instance, a laboratory 110 of
It will be appreciated by those skilled in the art and others that all of the functions described in this disclosure may be embodied in software executed by one or more processors of the disclosed components and mobile communication devices. The software may be persistently stored in any type of non-volatile storage.
Conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.
Any process descriptions, elements, or blocks in the flow diagrams described herein and/or depicted in the attached figures should be understood as potentially representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process. Alternate implementations are included within the scope of the embodiments described herein in which elements or functions may be deleted, executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art. Further, any process descriptions, elements, or blocks described herein may be implemented or executed by individual systems or devices, or by multiple systems or devices acting collectively or conjointly. It will further be appreciated that the data and/or components described above may be stored on a computer-readable medium and loaded into memory of the computing device using a drive mechanism associated with a computer-readable medium storing the computer-executable components, such as a CD-ROM, DVD-ROM, or network interface. Further, the components and/or data can be included in a single device or distributed in any manner. Accordingly, general purpose computing devices may be configured to implement the processes, algorithms and methodology of the present disclosure with the processing and/or execution of the various data and/or components described above.
While the foregoing written description enables one of ordinary skill to make and use what is considered presently to be the best mode thereof, those skilled in the art will understand and appreciate the existence of variations, combinations, and equivalents of the specific embodiment, method, and examples herein. The present embodiments should therefore not be limited by the above described embodiment, method, and examples, but by all embodiments and methods within the scope and spirit of the present embodiments.
The following Example is presented for the purpose of illustration and should not be construed as limiting.
Analysis of Somatic or Acquired Biomarkers: Screening for AML Panel Mutations Using Next Generation Sequencing
Next generation sequencing is a tool that can be used to determine the mutation status of DNA isolated from subjects, such as subjects diagnosed with acute myeloid leukemia (AML). The methodology enables the analysis of multiple individuals in parallel and can be completed in a relatively short time. One of the most commonly used AML biomarker panels utilizes characterized FLT3-ITD, FLT3-TKD, and NPM1 mutations as biomarkers.
Mutations of the fms-related tyrosine kinase 3 (FLT3) are among the most common mutations in acute myeloid leukemia, occurring in approximately ¼ of patients. There are two major types of FLT3 mutations: internal tandem duplication (ITD) or length mutations (LM) that map primarily within the juxtamembrane region of FLT3 (15-20% of AML patients), and point mutations in the kinase domain that most frequently involve aspartic acid 835 (D835 mutations) but have also been found less frequently in several other sites (5-10% of AML patients).
NPM1 nucleophosmin mutations are among the most prevalent mutations in karyotype normal AML (25-35% of AML patients). In the absence of FLT3-ITD mutations, NPM1 mutations portend a more favorable outcome for patients with AML. There is some evidence to suggest that NPM1 mutations provide a protective or favorable benefit even in patients with FLT3-ITD mutations.
FLT3-ITD Analysis: How to work a sample for FLT3 mutation status through the 454 Genome Sequencer: an example of next generation sequencing potential.
Primer Design: The primers used for sample amplification include 3 distinct regions: a 19 base pair fusion primer for sequencing, a 10 base pair Multiplex Identifier (MID) adaptor to differentiate individual samples, and the FLT3 primer sequence. The fusion primer segment is specific to the sequencing chemistry and has two iterations, A and B, which identify forward and reverse sequence reads during data analysis. The MID adaptor functions as a barcode, enabling multiplexing of sample processing as well as sample classification of the final data output in the GS Amplicon Variant Analyzer Software. Examples of MID adaptors are disclosed at SEQ ID NOs: 201-250.
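The three-region primer structure described above can be sketched as a simple concatenation with length checks. The 5′ to 3′ ordering (fusion primer, then MID adaptor, then gene-specific sequence) is assumed from the order in which the regions are listed, and every sequence below is a placeholder invented for illustration; none is taken from SEQ ID NOs: 201-250 or from the tables.

```python
FUSION_LENGTH = 19  # fusion primer length stated in the text
MID_LENGTH = 10     # MID adaptor length stated in the text
BASES = set("ACGT")


def build_primer(fusion, mid, gene_specific):
    """Concatenate fusion primer, MID adaptor, and gene-specific primer.

    The fusion -> MID -> gene-specific order is an assumption.
    """
    parts = {"fusion": fusion, "mid": mid, "gene_specific": gene_specific}
    for label, seq in parts.items():
        if not seq or set(seq) - BASES:
            raise ValueError(f"{label} must be a non-empty string of A, C, G, T")
    if len(fusion) != FUSION_LENGTH:
        raise ValueError(f"fusion primer must be {FUSION_LENGTH} bases")
    if len(mid) != MID_LENGTH:
        raise ValueError(f"MID adaptor must be {MID_LENGTH} bases")
    return fusion + mid + gene_specific


# Placeholder sequences for illustration only (not real primer sequences).
FUSION_A = "ACGTACGTACGTACGTACG"
MID_EXAMPLE = "AAAAACCCCC"
GENE_FWD = "GGGGTTTTGGGGTTTTGGGG"

forward_primer = build_primer(FUSION_A, MID_EXAMPLE, GENE_FWD)
```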
More specifically, the forward and reverse FLT3 ITD primers are designed as follows:
Tables 1 and 2 show the sequences of forward and reverse primers, respectively, which may be used to sequence ITD mutations in the FLT3 gene. Tables 3 and 4 show the sequences of forward and reverse primers, respectively, which may be used to sequence D835 mutations in the FLT3 gene.
Master Mix Preparation: Once primers are designed, master mixes are prepared for the amplification of sample DNA. Master mixes can be prepared in bulk, stored at −20 °C, and used for multiple batches of sample processing.
Although one master mix targets one gene of interest, multiple master mixes can be prepared to determine the mutation status of multiple genes of interest. A patient sample can be screened for multiple genetic markers by initial amplification with more than one master mix. If the master mixes are designed with the same MID combination, the data analysis software will combine the resulting mutation status data into one patient sample profile. This initial PCR amplification can also be multiplexed.
The master mix may be designed such that the combination of MID sequences can be used to identify each patient sample. In a batch of patient samples processed at the same time, each sample can be amplified by a master mix that is unique among the samples sequenced in the same area of the Roche 454 PicoTiter sequencing plate. In some embodiments, each master mix uses the same MID sequence in both the forward and reverse primers. In other embodiments, a unique combination of forward and reverse MID sequences may be used. Despite variability in primer sequences between the master mixes, the buffer conditions and the MgCl2 and dNTP concentrations remain the same across all master mixes prepared.
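The two MID assignment schemes described above (the same MID in both primers, or a unique forward and reverse combination) can be sketched as follows. The `assign_mid_pairs` function and the MID labels are hypothetical; the sketch only illustrates that n MIDs yield n identifiers in the first scheme and n × n identifiers in the second.

```python
from itertools import product


def assign_mid_pairs(sample_ids, mids, same_mid=False):
    """Map each sample to a unique (forward MID, reverse MID) pair."""
    if same_mid:
        # Same MID in both the forward and the reverse primer.
        pairs = [(mid, mid) for mid in mids]
    else:
        # Any forward/reverse combination may serve as an identifier.
        pairs = list(product(mids, repeat=2))
    if len(sample_ids) > len(pairs):
        raise ValueError("not enough MID combinations for this batch")
    return dict(zip(sample_ids, pairs))


mids = ["MID1", "MID2", "MID3"]              # hypothetical MID labels
samples = [f"sample-{i}" for i in range(5)]  # hypothetical sample IDs
assignment = assign_mid_pairs(samples, mids)
```

With three MIDs, the combination scheme accommodates up to nine samples in one plate region, whereas the same-MID scheme accommodates three.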
The following table, Table 5, is an example of master mix designs, which allows for individual sample data analysis after multiplexing.
Sample DNA Extraction: Genomic DNA is extracted from patient blood or bone marrow using either the manual QIAamp DNA Blood Mini Kit or the automated QIAcube. Extracted sample genomic DNA is brought to a final concentration of 50 ng/µL.
Sample Amplification: For a group of patient samples that will be processed at the same time, each sample is assigned to a master mix. As described above, the master mix assignments define the forward and reverse MID identifiers for de-multiplexing the resulting data. Patient genomic DNA is amplified with the associated master mixes using common thermocycling parameters, allowing multiple samples to be amplified simultaneously.
Sample Purification: After the patient DNA regions of interest have been amplified, the resulting DNA fragments are purified by magnetic beads using the Agencourt® AMPure® XP protocol.
Sample Quantification and Pooling: Purified sample DNA is quantified by NanoDrop technology and diluted to a concentration of 10⁹ molecules per microliter. Based on the PicoTiter Plate layout, described below, the samples are pooled in an equal volume ratio to prepare for the Roche 454 sequencing preparation protocols.
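The conversion from a measured mass concentration to a molecule concentration, and the resulting dilution factor, can be sketched as below. The sketch assumes double-stranded DNA with an average mass of approximately 660 g/mol per base pair, which is a common approximation rather than a value given herein. The target concentration is passed in as a parameter, and the 10 ng/µL reading, 400 bp amplicon length, and 1e9 target used in the usage lines are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole
AVG_BP_MASS = 660.0       # g/mol per base pair (approximation, an assumption)


def molecules_per_ul(conc_ng_per_ul, amplicon_length_bp):
    """Convert ng/uL of double-stranded DNA to molecules/uL."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    moles_per_ul = grams_per_ul / (amplicon_length_bp * AVG_BP_MASS)
    return moles_per_ul * AVOGADRO


def dilution_factor(conc_ng_per_ul, amplicon_length_bp, target_molecules_per_ul):
    """Return the fold dilution needed to reach the target concentration."""
    current = molecules_per_ul(conc_ng_per_ul, amplicon_length_bp)
    if current < target_molecules_per_ul:
        raise ValueError("sample is already below the target concentration")
    return current / target_molecules_per_ul


# Hypothetical reading: 10 ng/uL of a 400 bp amplicon, assumed target of 1e9.
current = molecules_per_ul(10.0, 400)
fold = dilution_factor(10.0, 400, 1e9)
```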
PicoTiter Plate Device Layout: The PicoTiter Plate Device used for Roche 454 sequencing can be prepared with 1 section for sample beads or divided into 2 or 4 sections. The sections can be utilized to separate the analysis for multiple genes of interest or to repeat master mix and MID combinations. The patient sample master mix and MID combination assignment cannot be duplicated within a section, yet it can be duplicated across sections. However, the PicoTiter Plate can accommodate the greatest number of sequence reads without the presence of section dividers.
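The layout rule stated above (a MID combination may repeat across sections but not within one) lends itself to a simple check before pooling. The `validate_layout` function and the layout data structure are hypothetical and serve only to illustrate the constraint.

```python
from collections import Counter

ALLOWED_SECTION_COUNTS = {1, 2, 4}  # section counts named in the text


def validate_layout(layout):
    """Return {section: [duplicated MID pairs]} for a proposed plate layout.

    `layout` maps a section index to the list of (forward MID, reverse MID)
    pairs assigned to the samples loaded in that section.
    """
    if len(layout) not in ALLOWED_SECTION_COUNTS:
        raise ValueError("plate must have 1, 2, or 4 sections")
    problems = {}
    for section, pairs in layout.items():
        duplicates = sorted(p for p, n in Counter(pairs).items() if n > 1)
        if duplicates:
            problems[section] = duplicates
    return problems


# The same pair in sections 0 and 1 is acceptable; twice in section 1 is not.
layout = {
    0: [("MID1", "MID1"), ("MID2", "MID2")],
    1: [("MID1", "MID1"), ("MID3", "MID3"), ("MID3", "MID3")],
}
problems = validate_layout(layout)
```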
Sequencing Preparation: The samples undergo emulsion PCR, a technology that utilizes the Fusion A and Fusion B segments of the sequencing primers. The amplicon is bound to beads, amplified, and washed. Forward and reverse beads are prepared separately. The beads are then pooled and loaded onto the plate. Alternatively, a whole genome sequence of lymphoid cancer cells can be prepared by standard methodology.
Sequence Data Analysis: The resulting sequence data is analyzed using GS Amplicon Variant Analyzer software on a Linux operating system. A new project file is created and known reference sequences for the genetic loci of interest are imported. For each genetic marker tested, the expected amplicon is defined in the software by importing the gene specific portion of the forward and reverse primer sequences. The patient sample IDs or accession numbers are imported. Expected mutations, insertions, and deletions are entered into the software, although the final alignment analysis is able to detect deviations from the reference sequence without preliminary programming. The 10 base pair MID sequences are imported.
Once all of the above-listed elements have been imported into the software, the multiplexer is defined to sort the sequence reads and assign those reads to the individual patients based on the MID sequence combinations. The resulting data is presented in a variant (or mutation) table as well as graphically. The variant table includes a summary of the number of forward and reverse sequence reads collected for each sample and target sequence. The mutation status of each clinical sample can be determined from this table, since it also presents the number and percentage of sequence reads showing any predefined or newly detected variation from the reference sequence.
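The sorting of reads by MID and the tallying of variant reads can be illustrated with a toy sketch. It is not the GS Amplicon Variant Analyzer algorithm: it assumes each read begins with a 10-base MID, assigns reads by that leading MID only, and counts a read as a variant when its remainder is not identical to the reference, whereas the actual software performs an alignment. All sequences and sample names are invented.

```python
from collections import defaultdict

MID_LENGTH = 10  # MID length stated in the text


def variant_table(reads, mid_to_sample, reference):
    """Tally total and variant reads per sample (exact-match toy model)."""
    counts = defaultdict(lambda: {"total": 0, "variant": 0})
    for read in reads:
        sample = mid_to_sample.get(read[:MID_LENGTH])
        if sample is None:
            continue  # unrecognized barcode: read is left unassigned
        counts[sample]["total"] += 1
        if read[MID_LENGTH:] != reference:
            counts[sample]["variant"] += 1
    return {
        sample: {**c, "percent_variant": 100.0 * c["variant"] / c["total"]}
        for sample, c in counts.items()
    }


# Invented toy data for illustration.
reference = "ACGTAC"
mid_to_sample = {"AAAAACCCCC": "patient-1", "GGGGGTTTTT": "patient-2"}
reads = (
    ["AAAAACCCCC" + "ACGTAC"] * 3
    + ["AAAAACCCCC" + "ACGTTC"]
    + ["GGGGGTTTTT" + "ACGTAC"] * 2
    + ["CCCCCCCCCC" + "ACGTAC"]
)
table = variant_table(reads, mid_to_sample, reference)
```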
It should be emphasized that many variations and modifications may be made to the above-described embodiments, the elements of which are to be understood as being among other acceptable examples. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.
Next generation sequencing enables the analysis of the complete genome, a subset of the genome and/or all or a subset of the protein and RNA coding regions (exome) from multiple individuals in parallel. There are several different platforms that can be used to generate biological data in a relatively short time. These platforms include, but are not limited to, the Illumina Genome Analyzer, Roche 454 Sequencer, Applied Biosystems SOLiD and Ion Torrent Personal Genome Machine. The basic process is similar for each of these instruments. As a non-limiting example, the Roche 454 process is described in more detail.
Genomic DNA is sheared so that the majority of the DNA fragments are less than 200 bp in length. Oligonucleotides are then ligated onto the 5′ end of the sheared genomic DNA. The ligated oligonucleotides act as templates for the primers used in the sequencing reactions. The ligated DNA is then bound to magnetic beads so that a single DNA molecule is bound to each bead. The beads are then emulsified with PCR amplification reagents so that a single bead is contained within a bubble of PCR reagents. The emulsion is broken and the beads washed. Beads without DNA are removed from the reaction and the beads with bound DNA are loaded onto plates with wells the size of beads so that one bead is in one well. The plate is then loaded onto the sequencing instrument and the sequencing reaction performed and the results detected according to the instrument protocol.
Each nucleotide base of the genome or exome may be sequenced multiple times, for example, over 20 times, to ensure accuracy. The sequencing results of the DNA fragments are assembled into a complete or partial sequence of the genome or exome of the sample by software that performs algorithms to align overlapping sequences.
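The notion of sequencing each base multiple times can be made concrete by computing per-base depth from read placements. The following sketch assumes reads have already been placed on a reference as (start, length) intervals with zero-based starts; the function names are hypothetical, and the threshold of 20 is taken from the example figure above and supplied as a default parameter.

```python
def per_base_depth(reference_length, alignments):
    """Return the number of reads covering each reference position."""
    diff = [0] * (reference_length + 1)
    for start, length in alignments:
        if start < 0 or start >= reference_length or length <= 0:
            continue  # ignore placements outside the reference
        end = min(start + length, reference_length)
        diff[start] += 1
        diff[end] -= 1
    depth = []
    running = 0
    for i in range(reference_length):
        running += diff[i]
        depth.append(running)
    return depth


def low_coverage_positions(depth, min_depth=20):
    """List positions whose depth falls below the chosen threshold."""
    return [i for i, d in enumerate(depth) if d < min_depth]


# Toy placements on a 5-base reference.
depth = per_base_depth(5, [(0, 3), (1, 3), (2, 10)])
shallow = low_coverage_positions(depth, min_depth=2)
```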
A whole or partial genome or exome sequence (biological data) is obtained for an individual and is transmitted, for example, on data storage media for analysis. The biological data is analyzed by comparing the sample genome or exome to a database of licensed biomarkers. In some embodiments, the sample genome is compared to previously assembled genomes or exomes and polymorphisms are detected and then compared to a database of licensed biomarkers.
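A minimal sketch of the comparison step follows. It detects single-base substitutions between equal-length reference and sample sequences and looks each one up in a dictionary standing in for a database of licensed biomarkers. The locus name, database keys, and biomarker label are hypothetical, and insertions, deletions, and alignment are deliberately omitted.

```python
def find_substitutions(reference, sample):
    """Return (position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("this sketch handles equal-length sequences only")
    return [
        (i, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


def match_licensed(locus, substitutions, licensed_db):
    """Return the licensed biomarkers matching the detected substitutions."""
    hits = []
    for position, ref_base, alt_base in substitutions:
        name = licensed_db.get((locus, position, ref_base, alt_base))
        if name is not None:
            hits.append(name)
    return hits


# Hypothetical database entry and toy sequences.
licensed_db = {("LOCUS1", 3, "T", "A"): "hypothetical-marker-1"}
substitutions = find_substitutions("ACGTACGT", "ACGAACGT")
hits = match_licensed("LOCUS1", substitutions, licensed_db)
```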
Individual biomarker databases are becoming more and more accessible through cloud and wireless healthcare connections and resources. It is possible to link, in real-time, individual biomarker databases with wireless healthcare resources, including real-time biometric readings, which enables the establishment of dynamic, real-time computational inter-relationships and notifications to the patient, healthcare provider, and other interventional authorities or parties. Further, during epidemiological emergencies, biometric readings from individuals comprising a population can be used to alert healthcare authorities to locations that might be populated by persons affected by pathogens or pathogenic agents, based upon individual responses to pathogens (e.g., a spike in temperature), before individuals are aware of their infection and before the trend might otherwise be identified.
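One way such a location alert might be derived from biometric readings is sketched below. The `flag_locations` function, the per-person baseline, the 1.0 degree spike threshold, and the minimum of three affected persons are all assumptions chosen for illustration and are not values specified herein.

```python
from collections import defaultdict


def flag_locations(readings, baselines, spike_delta=1.0, min_people=3):
    """Return locations where enough distinct people show a temperature spike.

    `readings` is an iterable of (person, location, temperature) tuples and
    `baselines` maps each person to a usual temperature in the same units.
    """
    spiking = defaultdict(set)
    for person, location, temperature in readings:
        baseline = baselines.get(person)
        if baseline is not None and temperature - baseline >= spike_delta:
            spiking[location].add(person)
    return sorted(loc for loc, people in spiking.items() if len(people) >= min_people)


# Invented readings: three people spike in "north", one in "south".
baselines = {"p1": 37.0, "p2": 37.0, "p3": 36.5, "p4": 37.0}
readings = [
    ("p1", "north", 38.5),
    ("p2", "north", 38.5),
    ("p3", "north", 38.0),
    ("p1", "north", 39.0),  # repeat reading: the person is counted once
    ("p4", "south", 38.5),
]
alerts = flag_locations(readings, baselines)
```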
The present application claims the benefit of priority to U.S. Provisional Patent Application No. 61/473,716, filed Apr. 8, 2011, the entire disclosure of which is incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
61473716 | Apr 2011 | US