1. Technical Field
This disclosure generally relates to the fields of molecular biology, microbiology, bioinformatics, and biophysics and, more particularly, to systems, devices, and methods for analyzing hybridization of target molecules to probes on substrate-bound oligonucleotide, peptide, or protein arrays.
2. Description of the Related Art
Nucleic acid diagnostic testing has become a major focus for the fields of genomics, pharmacogenomics, proteomics, and genetic medicine, to name a few. Assay platforms capable of detecting the presence of genes, differential gene expression levels, and genetic variations constitute active areas of development. For example, deoxyribonucleic acid (DNA) arrays can simultaneously analyze the expression of hundreds of genes and permit systematic approaches to biological discovery.
DNA sequences in solution or in a semi-constrained solution (such as a micro-array) form duplexes with other available sequences based on, for example, the properties of the individual duplexes, the temperature of the solution, the relative concentrations of the DNA sequences, and the presence of other factors (e.g., salt concentration). Much of the computational research surrounding DNA is involved with finding similarities between sequences, especially in the face of mismatches, and insertions and deletions of one or more bases. Nearly all computational genetic approaches in the existing state of the art, however, treat the text-based identity of the bases making up the sequences as the only information necessary to determine the level of match or mismatch.
Nucleic acid diagnostic tests often employ strategies based on the hybridization principles of genetic material to DNA or RNA probes. These probes are generally designed in silico with the intent that they bind specifically with their perfectly matched targets. In practice, however, probes often bind to target sequences that are similar to their corresponding complementary target sequences. This cross-hybridization effect often skews the observed data from the expected data by signaling the presence of multiple sequences other than the expected target sequence. Cross-hybridization further complicates the data analysis by presenting numerous statistical problems, including the normalization of the data. Accordingly, there is a need to minimize cross-hybridization effects, as well as a need to better quantify cross-hybridization effects.
Often the sequence of nucleotides in DNA, or the sequence of amino acids in a protein or peptide, is represented as a text string indicative of the nucleotides or amino acids making up the sequence. For example, the sequence of nucleotides in DNA is often represented as a text string based on a four-letter alphabet (A, C, G, T) that symbolically codes for the corresponding nucleotides (adenine, cytosine, guanine, and thymine, respectively). Accordingly, much of sequence analysis, such as homology and similarity searches, protein functional analysis, motif searches, protein structure analysis, and the like, often involves text-based search technologies and algorithms, as well as sequence alignment representations that compare the text of a sequence of interest to the text of other sequences.
In sequence alignment representations, sequences are written in rows arranged so that aligned residues appear in successive columns. Many of the available design routines rely on text similarity alignment routines to find, or generate and filter candidate probe sets. One problem with text-based search technologies and algorithms is that they fail to account for many of the secondary and tertiary structure effects associated with many macromolecules (e.g., nucleic acids, proteins, genomes, and the like). Another problem with text-based technologies and algorithms is that they take far too long to reliably compare a probe to a long genomic sequence.
A number of routines have been written to speed up text-based search algorithms. For example, most commonly used search queries employ the Basic Local Alignment Search Tool (BLAST) that looks for sequence homologies between a query sequence and selected genome sequences. Alignments are approximated by a search algorithm fashioned after the “seed” and “expand” Smith-Waterman method that identifies regions of local sequence text similarity and reports the likelihood that the match is the result of random chance.
BLAST has found primary utility in text-based recognition of patterns of sequence similarity used as indicators of evolutionary connectivity. BLAST is also commonly employed to deduce likelihood of duplex formation based on relative sequence homologies between probes and targets determined in text-based searches. But, as previously noted, text-based search technologies and algorithms like BLAST fail to account for some of the duplex interactions formed by probes and targets.
Another approach to speed up text-based search algorithms employs field programmable gate arrays (FPGAs) that distribute text-based comparison algorithms across hundreds or thousands of discrete processing elements for rapid parallel execution of text-based searches. But the FPGAs are designed to perform text-based searching and are therefore limited by the same problems that ultimately limit BLAST.
TIMELOGIC® biocomputing solutions has developed DECYPHERBLAST™, a search engine using FPGA technology that parallelizes the BLAST search algorithm and has demonstrated improvements in both speed and performance at reduced costs. A shortcoming of this approach, however, is that genomic sequence searches are implemented using text-based approaches. Accordingly, probes designed using this search engine still suffer from cross-hybridization problems due to interactions with other sequences having dissimilar, non-homologous motifs, which are often unaccounted for in text-based technologies and algorithms.
The present disclosure is directed to overcoming one or more of the shortcomings set forth above, and providing further related advantages.
The letter code or text representation of a DNA sequence (e.g., A, T, G, C) is one of the most basic representations and contains important information regarding the protein sequences encoded by DNA (e.g., codons). Unfortunately, the text representation of DNA does not provide much insight regarding the distribution of thermodynamic stability encoded in a DNA sequence. For example, the influence of “non-natural” configurations, such as mismatch hybrids containing tandem mismatches or misalignments between two strands, results in contributions that are lost in text-based homology searches, but that might have an important influence on actual results (generation of cross-hybridization and false positives). Furthermore, sequence dependent thermodynamic stability may encode for physical, chemical, and functional characteristics of duplex DNA that are often unaccounted for in text-based homology searches. Approaches that account for and/or quantify, for example, cross-hybridization effects or the influence of “non-natural” configurations using thermodynamics may be better predictors of true behavior than those approaches relying on text representations of DNA.
In one aspect, the present disclosure is directed to a data processing system for analyzing a biological sample. The system includes a computer-readable memory medium and a controller.
The computer-readable memory medium comprises thermodynamic data configured as a data structure for use in analyzing biological samples. In some embodiments, the data structure comprises a thermodynamic data section having: thermodynamic data representative of dangling ends of two or more bases; thermodynamic data representative of unpaired single strands of two or more bases adjacent to a Watson-Crick (w/c) base pairing; thermodynamic data representative of unpaired single strands of one or more bases adjacent to a non-Watson-Crick base pairing; thermodynamic data representative of tandem base pair mismatches of two or more bases; thermodynamic data representative of length-dependent terminal mismatches of nucleic acid base; thermodynamic data representative of terminal base pair mismatches; or combinations thereof.
In some embodiments, the controller is configured to compare an input associated with the biological sample to the thermodynamic data, and to generate a response based on the comparison. In some embodiments, the input associated with the biological sample comprises at least one of an output generated from a detected image of the biological sample applied to an array, gene expression data, nucleic acid sequence data, an n-dimensional expression profile vector of the biological sample, a genome of an organism, or combinations thereof.
In another aspect, the present disclosure is directed to a method in a computer system for analyzing nucleic acid probes. The method includes determining a first free energy value indicative of a duplex of a first nucleic acid probe and a first target nucleic acid sequence. The method may include determining a first minimum free energy value indicative of a lowest free energy value associated with a formation of each of one or more duplexes formed by the first nucleic acid probe and at least a second target nucleic acid sequence.
The method may further include determining a second minimum free energy value indicative of a lowest free energy value associated with the formation of each of one or more duplexes formed by the first nucleic acid probe and at least a second nucleic acid probe. The method may further include determining a difference between the determined first free energy value, and a minimum of the first minimum free energy value and the second minimum free energy value. In some embodiments, the method may further include comparing the determined difference to a target value.
In another aspect, the present disclosure is directed to a method in a computer system for determining the presence or absence of a target nucleic acid sequence in a sample. The method includes determining a first free energy contribution parameter for a comparison of a first nucleic acid probe base sequence to a first plurality of target bases of a target sequence.
The method may include comparing the first free energy contribution parameter to a target value. In some embodiments, the method may further include generating a response based on the comparison to the target value.
In another aspect, the present disclosure is directed to a computer-readable memory medium containing instructions for controlling a computer processor to store in a data repository a data structure representing a comparison of a first plurality of nucleic acids with at least a second plurality of nucleic acids, by: determining one or more duplex interactions formed between the first plurality of nucleic acids and the at least second plurality of nucleic acids; and storing sets of thermodynamic values indicative of each of the one or more duplex interactions formed between the first plurality of nucleic acids and the at least second plurality of nucleic acids. In some embodiments, the duplex interactions are selected from dangling ends of two or more bases, unpaired single strands of two or more bases adjacent to a Watson-Crick base pairing, unpaired single strands of one or more bases adjacent to a non-Watson-Crick base pairing, tandem base pair mismatches of two or more bases, length-dependent terminal mismatches of nucleic acid base, terminal base pair mismatches, Watson-Crick base pairings, single base pairings of mismatched doublets, initial binding processes, or combinations thereof.
In another aspect, the present disclosure is directed to a computer readable storage medium storing instructions that, when executed on a computer, execute a method for determining thermodynamic characteristics of nucleic acid sequences. The method includes retrieving from storage one or more thermodynamic parameters associated with a binding comparison of a first nucleic acid base sequence to a first region of at least a second nucleic acid base sequence. The method may further include retrieving from storage one or more thermodynamic parameters associated with a binding comparison of the first nucleic acid base sequence to a second region of the at least second nucleic acid base sequence, the second region different from the first region by at least one nucleic acid base position along a nucleic acid sequence of the second nucleic acid base sequence.
In some embodiments, the one or more thermodynamic parameters comprise at least one of a dangling end of two or more bases thermodynamic parameter, an unpaired single strand of two or more bases adjacent to a Watson-Crick base pairing thermodynamic parameter, an unpaired single strand of one or more bases adjacent to a non-Watson-Crick base pairing thermodynamic parameter, a tandem base pair mismatch of two or more bases thermodynamic parameter, a length-dependent terminal mismatch of nucleic acid base thermodynamic parameter, and a terminal base pair mismatch thermodynamic parameter.
In another aspect, the present disclosure is directed to a computing device for evaluating thermodynamic properties of a nucleic acid probe and a target nucleic acid sequence. The device includes an integrated circuit, an input device, and a processor. In some embodiments, the integrated circuit includes a plurality of logic components. In some embodiments, the input device is coupled to the integrated circuit and is operable to provide data indicative of one or more thermodynamic characteristics of a comparison of individual base pair binding events associated with a nucleic acid probe and at least a first region of a nucleic acid sequence.
In some embodiments, the processor is coupled to the integrated circuit and is operable to analyze an output of one or more of the plurality of logic components and to determine a thermodynamic free energy of the comparison of the individual base pair binding events associated with the nucleic acid probe and the at least first region of the nucleic acid sequence.
In yet another aspect, the present disclosure is directed to a method for analyzing a genomic sequence. The method includes identifying a genetic region in the genomic sequence characterized by at least one nucleic acid sequence. The method may include providing a first probe and at least a second probe, the first and the at least second probes provided based on a free energy gap characteristic indicative of a binding affinity for the at least one nucleic acid sequence. The method may further include detecting whether a binding event between the first and the at least second probes and the at least one nucleic acid sequence has occurred.
In the drawings, identical reference numbers identify similar elements or acts. The sizes and relative positions of elements in the drawings are not necessarily drawn to scale. For example, the shapes of various elements and angles are not drawn to scale, and some of these elements are arbitrarily enlarged and positioned to improve drawing legibility. Further, the particular shapes of the elements, as drawn, are not intended to convey any information regarding the actual shape of the particular elements, and have been solely selected for ease of recognition in the drawings.
In the following description, certain specific details are included to provide a thorough understanding of various disclosed embodiments. One skilled in the relevant art, however, will recognize that embodiments may be practiced without one or more of these specific details, or with other methods, components, materials, etc. In other instances, well-known structures associated with computing systems, including processors, memories, and/or buses, have not been shown or described in detail to avoid unnecessarily obscuring descriptions of the embodiments.
Unless the context requires otherwise, throughout the specification and claims which follow, the word “comprise” and variations thereof, such as, “comprises” and “comprising” are to be construed in an open, inclusive sense, that is as “including, but not limited to.”
Reference throughout this specification to “one embodiment,” or “an embodiment,” or “in another embodiment,” or “in some embodiments” means that a particular referent feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrases “in one embodiment,” or “in an embodiment,” or “in another embodiment,” or “in some embodiments” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
It should be noted that, as used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to a computing device including a “controller” includes a single controller, or two or more controllers. It should also be noted that the term “or” is generally employed in its sense including “and/or” unless the content clearly dictates otherwise.
The computing system 10 may further include one or more memories that store instructions and/or data, for example, random access memory (RAM) 18, read-only memory (ROM) 20, or the like, coupled to the controller 12 by one or more instruction, data, and/or power buses 22. The computing system 10 may further include a computer-readable media drive or memory slot 24, and one or more input/output components 26 such as, for example, a graphical user interface, a display, a keyboard, a keypad, a trackball, a joystick, a touch-screen, a mouse, a switch, a dial, or the like, or any other peripheral device. The computing system 10 may further include one or more databases 28.
The computer-readable media drive or memory slot 24 may be configured to accept computer-readable memory media. In some embodiments, a program for causing the computing system 10 to execute any of the disclosed methods can be stored on a computer-readable recording medium. Examples of computer-readable memory media include CD-R, CD-ROM, DVD, data signal embodied in a carrier wave, flash memory, floppy disk, hard drive, magnetic tape, magneto-optic disk, MINIDISC, non-volatile memory card, EEPROM, optical disk, optical storage, RAM, ROM, system memory, web server, or the like.
In some embodiments, the computing system 10 is configured to compare an input associated with the biological sample to a database 28 of stored reference values, and to generate a response based in part on the comparison. In some embodiments, the computing system 10 is provided for analyzing hybridization of target molecules to probes on substrate-bound nucleic acid, peptide, or protein arrays. In some embodiments, the computing system 10 comprises a data processing system for analyzing a biological sample.
In some embodiments, the computing system 10 may include computer-readable memory media in the form of one or more logic devices (e.g., programmable logic devices, complex programmable logic devices, field-programmable gate arrays, application specific integrated circuits, and the like) comprising one or more look-up tables.
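One way such a look-up table might be organized is sketched below, under stated assumptions. Each base is packed into two bits so that a doublet of adjacent bases forms a 4-bit index into a 16-entry table of fixed-point values, the kind of addressing a logic device handles naturally. The encoding, the names `BASE_CODE`, `doublet_index`, `LUT`, and `lut_sum`, and every stored value are hypothetical placeholders chosen for illustration; they are not taken from the disclosure and are not measured parameters.

```python
# Hypothetical 2-bit encoding of the four DNA bases.
BASE_CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def doublet_index(b1, b2):
    """Pack two bases into a 4-bit table address (0..15)."""
    return (BASE_CODE[b1] << 2) | BASE_CODE[b2]


# 16-entry table of fixed-point values in hundredths of kcal/mol.
# PLACEHOLDER numbers: doublets with more G/C bases are simply
# assigned a more negative (more stable) value.
LUT = [0] * 16
for first in "ACGT":
    for second in "ACGT":
        gc_count = (first in "GC") + (second in "GC")
        LUT[doublet_index(first, second)] = (-100, -150, -200)[gc_count]


def lut_sum(seq):
    """Sum the table entries for every adjacent doublet in seq."""
    return sum(LUT[doublet_index(a, b)] for a, b in zip(seq, seq[1:]))


if __name__ == "__main__":
    print(doublet_index("G", "T"))  # table address of the doublet GT
    print(lut_sum("GCAT"))          # GC + CA + AT, in hundredths of kcal/mol
```

Integer fixed-point entries are used here because they map directly onto table memory in programmable logic; a software-only implementation could just as well store floating-point values.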
In some embodiments, one or more of the disclosed methods can be implemented using a memory medium in which executable instructions or software for realizing the functions, or implementing one or more of the instructions of the various disclosed embodiments, have been stored and are supplied to the computing system 10 or a component of the computing system 10 such as, for example, a microprocessor unit, a central processing unit, or the like. For example, in some embodiments, the computing system 10, or a component thereof, reads and executes executable instructions stored in a memory medium. In some embodiments, the executable instructions themselves, once read from the memory medium, realize the various functions of one or more of the disclosed embodiments. The computing system 10 is also suitable for implementing one or more of the disclosed methods and/or instructions associated with one or more of the embodiments comprising computer-readable media.
In some embodiments, a computer-readable memory medium includes instructions for controlling a computer processor to store in a data repository a data structure with data representing a comparison of a first plurality of nucleic acids with at least a second plurality of nucleic acids. In some embodiments, the instructions include determining one or more duplex interactions formed between the first plurality of nucleic acids and the at least second plurality of nucleic acids. In some embodiments, the instructions include instructions associated with storing sets of thermodynamic values indicative of each of the one or more duplex interactions formed between the first plurality of nucleic acids and the at least second plurality of nucleic acids.
In some embodiments, the duplex interactions are selected from dangling ends of two or more bases, unpaired single strands of two or more bases adjacent to a Watson-Crick base pairing, unpaired single strands of one or more bases adjacent to a non-Watson-Crick base pairing, tandem base pair mismatches of two or more bases, length-dependent terminal mismatches of nucleic acid base, terminal base pair mismatches, Watson-Crick base pairings, single base pairings of mismatched doublets, initial binding processes, or combinations thereof.
The computing system 10 may further include a probe-target analysis component 30 including a probe generator component 32 and a multiplex hybridization component 34.
The probe-target analysis component 30 is operable to, for example, thermodynamically compare sequences of pairs of DNA strands, determine the sequence dependent thermodynamic stability for each alignment of the strands, compare stabilities of different duplexes at each alignment with those of the desired perfect match duplexes, and find those pairs of strands likely to cross-hybridize. The probe-target analysis component 30 uses thermodynamic-based screening of probes and targets, rather than text-based screening, for determining cross-hybridization propensity.
As previously noted, most commercially available probe design strategies rely on text-based similarity alignment routines to identify and filter candidate probe sequences. In some embodiments, the probe-target analysis component 30 is operable to search, compare, and select sets of probe sequences based on thermodynamic parameters representative of the various duplex interactions. For example, the probe-target analysis component 30 is operable to search and/or compare probes based on, for example, thermodynamic characteristics associated with the probes, and to select sets of probes whose individual members differ in one or more thermodynamic characteristics from one another. Simplicity of the probe-target analysis component 30 defines its elegance and thereby enables machine programmability.
In some embodiments, the probe-target analysis component 30 is configured to provide optimal sets of probe sequences designed to bind to specific target sequences according to one or more of the following desired characteristics: (1) probes bind specifically to defined target sequences; (2) probes do not bind targets other than the desired ones; and (3) probes do not bind any other probes. Accordingly, optimal sets of DNA probe sequences for specific targets may be generated using any of the aforementioned desired characteristics.
For example, given a first nucleic acid probe (α) and a first target nucleic acid sequence (α′), and a second nucleic acid probe (β) and a second target nucleic acid sequence (β′), characteristics of the set {α, β} can be determined by comparing the thermodynamics of every pair of sequences, α and β, in the set as follows: (1) the free energy (ΔG) of the perfect match duplex formed from α with its target (α′); (2) the minimum ΔG over all duplexes (at every possible alignment) formed between α and β's target (β′); and (3) the minimum ΔG over all duplexes formed from α and β. Generally, (1) will have a value much less (i.e., be more stable) than either (2) or (3).
A basic measure of the fitness of the set can be obtained by taking the difference between the maximum of all calculated values of (1) and the minimum value of all the (2) and (3) values. This difference is generally referred to as the energy “gap” between desired duplexes (each probe in a perfect match with its target) and undesired cross-hybrids. In some embodiments, the goal is to make this gap as large as possible. By searching sequences based on thermodynamic differences, rather than their text identity or mere sequence homology, the probe-target analysis component 30 is operable to find probe sequences that are highly specific for their desired targets and have the lowest probability of cross-hybridization.
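The gap calculation described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `energy_gap`, the dictionary layout, and all ΔG values are hypothetical, and the sign convention (most stable cross-hybrid minus least stable perfect match, so that a larger positive number is better) is an assumption about how the "difference" is taken.

```python
def energy_gap(perfect_dg, cross_dg):
    """Return (gap, limiting_probe, limiting_cross_hybrid).

    perfect_dg: probe name -> dG of its perfect-match duplex, item (1).
    cross_dg:   pair label -> minimum dG over all alignments of an
                undesired duplex, items (2) and (3).
    All values in kcal/mol; more negative means more stable.
    """
    # Least stable desired duplex: the maximum of all (1) values.
    limiting_probe = max(perfect_dg, key=perfect_dg.get)
    # Most stable undesired duplex: the minimum of all (2) and (3) values.
    limiting_cross = min(cross_dg, key=cross_dg.get)
    gap = cross_dg[limiting_cross] - perfect_dg[limiting_probe]
    return gap, limiting_probe, limiting_cross


if __name__ == "__main__":
    # Hypothetical values for the set {alpha, beta}, illustration only.
    perfect = {"alpha": -21.0, "beta": -19.5}
    cross = {
        ("alpha", "beta'"): -8.0,   # probe alpha with beta's target
        ("beta", "alpha'"): -5.5,   # probe beta with alpha's target
        ("alpha", "beta"): -6.0,    # probe with probe
    }
    print(energy_gap(perfect, cross))
```

Returning the limiting probe and cross-hybrid alongside the gap makes it easier to see which member of a set would need to be replaced to widen the gap.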
In some embodiments, the probe-target analysis component 30 is operable to identify sequences that fall below a target binding threshold value. These sequences are deemed unacceptable, eliminated, and replaced. Generated sets are then compared to the “best set so far.” If the most recent set is better, sequences within it replace the current set and become the “best set so far” to be compared against other sets. In some embodiments, this iterative procedure continues until a set that satisfies a target energy gap (e.g., that maximizes the energy gap) is obtained. The method also allows consideration of additional constraints on the generated sequences. For example, a target G-C percentage, and thereby a range of thermodynamic stability of the sequence sets, can be specified. Lexical rules can also be imposed (e.g., not allowing certain sequence patterns, such as CCC or GGG). Thermodynamic constraints can also be imposed (e.g., probe:target complexes should have a melting temperature (Tm) over 20° C.). Also, probes can be designed while considering the potential interactions with other sequences in the set. Generated sequences should not form a lower ΔG (i.e., more stable) duplex complex with any of these other sequences (e.g., from the Human Genome). Constraints can be applied, for example, at the time initial sequences are generated or as replacement sequences are generated.
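The iterative procedure and constraints described above might be sketched as follows. This is a simplified illustration under several assumptions: the G-C bounds (40% to 60%), the random single-member replacement strategy, and all function names are hypothetical, and the scoring function is supplied by the caller. The `toy_score` used in the demonstration is only a stand-in so the sketch can run; the text calls for a thermodynamic energy gap in its place.

```python
import random

FORBIDDEN = ("CCC", "GGG")  # lexical rule mentioned in the text


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def acceptable(seq, gc_min=0.4, gc_max=0.6):
    """Apply the lexical and G-C constraints (bounds are assumed values)."""
    if any(pattern in seq for pattern in FORBIDDEN):
        return False
    return gc_min <= gc_fraction(seq) <= gc_max


def random_probe(length, rng):
    """Draw random sequences until one satisfies the constraints."""
    while True:
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        if acceptable(seq):
            return seq


def search(score, set_size=4, length=20, rounds=200, seed=0):
    """Keep the best-scoring set seen so far; higher score is better."""
    rng = random.Random(seed)
    best = [random_probe(length, rng) for _ in range(set_size)]
    best_score = score(best)
    for _ in range(rounds):
        candidate = list(best)
        # Replace one member with a fresh constraint-satisfying probe.
        candidate[rng.randrange(set_size)] = random_probe(length, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def toy_score(probes):
    """STAND-IN ONLY: smallest pairwise Hamming distance in the set."""
    return min(
        sum(a != b for a, b in zip(p, q))
        for i, p in enumerate(probes)
        for q in probes[i + 1:]
    )


if __name__ == "__main__":
    probes, value = search(toy_score, set_size=3, length=12, rounds=50)
    print(probes, value)
```

Applying the constraints inside `random_probe` corresponds to imposing them as initial and replacement sequences are generated, rather than filtering a finished set afterwards.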
Duplex interactions between nucleic acid probes and targets are generally sequence dependent. Every nucleic acid probe strand present in a multiplex reaction binds, with finite propensity, to nucleic acid targets other than the perfect match complementary sequence target. The extent of binding between two single strands depends on the sequence dependent free-energy of the duplex that they form. The thermodynamics of, for example, short duplex DNAs can be determined (e.g., calculated) using, for example, the nearest neighbor (n-n) model.
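As a rough illustration of the nearest-neighbor idea, the sketch below sums one term per pair of adjacent bases along a perfectly matched duplex and adds an initiation term. The structure (a sum over doublets plus initiation) follows the general form of the model; the numerical values do not. The function `toy_stack_dg`, the constant `TOY_INITIATION_DG`, and every number they return are hypothetical placeholders, not measured parameters, and a real calculation would substitute a published parameter table.

```python
VALID_BASES = set("ACGT")

TOY_INITIATION_DG = 1.0  # PLACEHOLDER initiation term, kcal/mol


def toy_stack_dg(doublet):
    """PLACEHOLDER stacking term for one doublet, kcal/mol.

    More G/C bases in the doublet gives a more negative value.
    """
    gc_count = sum(base in "GC" for base in doublet)
    return (-1.0, -1.5, -2.0)[gc_count]


def duplex_dg(seq, stack_dg=toy_stack_dg, init_dg=TOY_INITIATION_DG):
    """Sum nearest-neighbor terms for seq paired with its perfect complement."""
    seq = seq.upper()
    if len(seq) < 2 or not set(seq) <= VALID_BASES:
        raise ValueError("need at least two bases drawn from A, C, G, T")
    total = init_dg
    for i in range(len(seq) - 1):
        total += stack_dg(seq[i:i + 2])  # one term per adjacent pair
    return total


if __name__ == "__main__":
    print(duplex_dg("GCAT"))  # doublets GC, CA, AT plus initiation
```

Passing `stack_dg` and `init_dg` as arguments keeps the summation separate from the parameter source, so the placeholder rule can be swapped for a table lookup without changing `duplex_dg`.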
Simulations have shown that cross-hybridization (targets binding to probes non-specifically) can have significant effects on hybridization reactions and their interpretation. Accordingly, probes designed with forethought to minimize cross-hybridization may produce more accurate hybridization tests. Minimizing cross-hybridization may involve, in some cases, searching sequences based on thermodynamic differences, rather than their text identity or mere sequence homology. Accordingly, a need exists for the ability to quickly and thermodynamically scan probes against the genome so assays can be designed to minimize cross-hybridization based on thermodynamic rules instead of text homology. Platforms needing high throughput and reliable probes such as, for example, DNA microarrays, real time PCR, and flow cytometry may benefit from a thermodynamic scanning tool capable of setting the scale for minimizing cross-hybridization with undesired regions.
In some embodiments, the computing system 10 takes the form of a computing device for evaluating thermodynamic properties of a nucleic acid probe and a target nucleic acid sequence. The computing device may include an integrated circuit, an input device 26, and a controller 12 (e.g., a processor, and the like).
The integrated circuit may include a plurality of logic components. The input device 26 may be coupled to the integrated circuit and may be operable to provide data indicative of one or more thermodynamic characteristics of a comparison of individual base pair binding events associated with a nucleic acid probe and at least a first region of a nucleic acid sequence.
In some embodiments, the processor is coupled to the integrated circuit, and is operable to analyze an output of one or more of the plurality of logic components and to determine a thermodynamic free energy of the comparison of the individual base pair binding events associated with the nucleic acid probe and the at least first region of the nucleic acid sequence.
In some embodiments, the integrated circuit comprises an application specific integrated circuit 14 having a plurality of predefined logic components. In some embodiments, the integrated circuit comprises a field programmable gate array 16 having a plurality of programmable logic components.
In some embodiments, the computing system 10 takes the form of a data processing system for analyzing a biological sample. For example, in some embodiments, the computing system 10 comprises a computer-readable memory medium comprising thermodynamic data configured as a data structure for use in analyzing biological samples.
The data structure may comprise a thermodynamic data section including thermodynamic data representative of dangling ends of two or more bases. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of unpaired single strands of two or more bases adjacent to a Watson-Crick base pairing. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of unpaired single strands of one or more bases adjacent to a non-Watson-Crick base pairing. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of tandem base pair mismatches of two or more bases. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of length-dependent terminal mismatches of nucleic acid bases. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of terminal base pair mismatches.
In some embodiments, the thermodynamic data section may further comprise thermodynamic data representative of dangling ends of a single nucleic acid base, thermodynamic data representative of Watson-Crick base pairings, thermodynamic data representative of single base pairings of mismatched doublets, thermodynamic data representative of initial binding processes, or combinations thereof.
In some embodiments, the thermodynamic data comprises nearest-neighbor free energy values, nearest-neighbor enthalpy values, or nearest-neighbor entropy values, or combinations thereof. In some embodiments, the thermodynamic data comprises binding affinity data indicative of a nucleic acid base sequence binding affinity to a target, and stability data indicative of a thermodynamic stability of a nucleic acid base sequence bound to the target, or combinations thereof. In some embodiments, the thermodynamic data comprises salt concentration-dependent thermodynamic data, buffer concentration-dependent thermodynamic data, sample concentration-dependent thermodynamic data, temperature-dependent thermodynamic data, or combinations thereof.
In some other embodiments, the thermodynamic data section may include any combinations of the disclosed thermodynamic data.
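One possible in-memory layout for such a thermodynamic data section is sketched below, purely as an illustration. The class names, field names, key notation (for example `"GC/CG"`), and the stored numbers are all hypothetical. The only relation relied on is the standard identity ΔG = ΔH − TΔS, which lets temperature-dependent free energies be derived from stored enthalpy and entropy values.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ThermoEntry:
    dh: float  # enthalpy, kcal/mol
    ds: float  # entropy, cal/(mol*K)

    def dg(self, temp_kelvin: float = 310.15) -> float:
        """Free energy in kcal/mol from dG = dH - T*dS."""
        return self.dh - temp_kelvin * self.ds / 1000.0


@dataclass
class ThermodynamicDataSection:
    """One table per category of duplex interaction named in the text."""
    watson_crick: Dict[str, ThermoEntry] = field(default_factory=dict)
    dangling_ends: Dict[str, ThermoEntry] = field(default_factory=dict)
    unpaired_strands: Dict[str, ThermoEntry] = field(default_factory=dict)
    tandem_mismatches: Dict[str, ThermoEntry] = field(default_factory=dict)
    terminal_mismatches: Dict[str, ThermoEntry] = field(default_factory=dict)
    initiation: Dict[str, ThermoEntry] = field(default_factory=dict)

    def lookup(self, category: str, key: str) -> ThermoEntry:
        """Fetch the entry stored under `key` in the named category."""
        return getattr(self, category)[key]


if __name__ == "__main__":
    section = ThermodynamicDataSection()
    # PLACEHOLDER values, illustration only.
    section.watson_crick["GC/CG"] = ThermoEntry(dh=-10.0, ds=-25.0)
    entry = section.lookup("watson_crick", "GC/CG")
    print(entry.dg(300.0))
```

Storing enthalpy and entropy rather than a single free energy is a design choice made here so that one table can serve several assay temperatures; salt or concentration dependence would need additional fields.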
In some embodiments, the computing system 10 includes a controller 12 configured to compare an input associated with the biological sample to the thermodynamic data, and to generate a response based on the comparison.
In some embodiments, the controller 12 is configured to compare the input associated with the biological sample to the thermodynamic data, and to generate at least one of a comparison plot, comparison data, an indication of a level of gene expression, an indication of a presence or absence of one or more nucleic acid sequences, or an indication of an L-length-mer composition of a target DNA fragment based on the comparison.
Examples of inputs associated with the biological sample include at least one of an output generated from a detected image of the biological sample applied to an array, gene expression data, nucleic acid sequence data, an n-dimensional expression profile vector of the biological sample, a genome of an organism, or combinations thereof.
Two sequences may have multiple different sequence alignments in which a duplex of the two can form.
The term “sequence alignment” generally refers to a way of arranging or comparing the primary sequences of DNAs, RNAs, or proteins to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that residues with identical or similar characters are aligned in successive columns.
In protein sequence alignment or comparison, the degree of similarity between amino acids occupying a particular position in the sequence can be interpreted as a rough measure of how conserved a particular region or sequence motif is among lineages. The absence of substitutions, or the presence of only very conservative substitutions (that is, the substitution of amino acids whose side chains have similar biochemical properties) in a particular region of the sequence, may suggest that this region has structural or functional importance. Although DNA and RNA nucleotide bases are more similar to each other than to amino acids, the conservation of base pairing can indicate a similar functional or structural role.
For example, two 24-base oligomers may have as many as 47 different sequence alignments in which a duplex of the two can form. Each of these duplexes will have an associated energy of formation. One approach for assessing the thermodynamic parameters associated with duplex interactions formed between, for example, a first plurality of nucleic acids and at least a second plurality of nucleic acids employs the nearest-neighbor thermodynamic model.
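The count of 47 can be checked with a minimal sketch. It assumes that an alignment is any ungapped relative offset at which the two strands overlap by at least one base; the function name is hypothetical and does not appear in the disclosure.

```python
def count_ungapped_alignments(len_a: int, len_b: int) -> int:
    """Count the relative offsets at which two strands overlap by >= 1 base.

    Assumption: no gaps are allowed, so each alignment is fully described
    by how far one strand is slid along the other.
    """
    if len_a < 1 or len_b < 1:
        raise ValueError("sequence lengths must be positive")
    # Sliding one strand past the other gives len_a + len_b - 1 positions.
    return len_a + len_b - 1


print(count_ungapped_alignments(24, 24))  # 47
```

Under this assumption the number of candidate duplexes grows linearly with the combined length of the two strands.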
Based on the nearest-neighbor thermodynamic model, the energy of duplex formation is determined by the bases of one sequence, taken in paired bases, along with the paired bases of the mating sequence. Accordingly, the thermodynamic stability of two stranded complexes 100 is determined from the sum 106, 122 of n-n interactions over all n-n doublets in the duplex. An n-n doublet is comprised of two “base pair” units. A doublet can be, for example, a Watson-Crick hydrogen bonded base pair 110, a single 112 or double mismatch base pair 126, or the like. Thermodynamic stability of both is sequence dependent. Thus, each n-n doublet can be comprised of two Watson-Crick base pairs 110. An n-n doublet can contain one Watson-Crick base pair and one mismatch base pair (a single base pair mismatch) 116, 112. An n-n doublet can also be comprised of two mismatch base pairs, in a so-called tandem mismatch 126.
The nearest-neighbor thermodynamic model approach may include, for example, determining thermodynamic data representative of: dangling ends of a single nucleic acid base 108, 118; Watson-Crick base pairings 110, 116; single base pairings of mismatched doublets 112, 114; initial binding processes 120; unpaired single strands of two or more bases adjacent to a Watson-Crick base pairing 124, 128; tandem base pair mismatches of two or more bases 126; dangling ends of two or more bases; single strands of one or more bases adjacent to a non-Watson-Crick base pairing; terminal base pair mismatches; length-dependent terminal mismatches of nucleic acid bases; or combinations thereof.
Traditionally, the nearest-neighbor (n-n) model generally assumes that the stability of a duplex DNA depends on the identity and orientation of neighboring base pairs. Any Watson-Crick DNA duplex structure will have ten possible n-n interactions. These interactions are:
The stability of a DNA duplex may be predicted from its primary sequence if the relative stability (ΔG°) of each DNA n-n interaction is known. It is these n-n parameters, when cast in the same format, that are in general agreement amongst the various laboratories. In practice, however, there are many other duplex interactions not accounted for by the n-n model, such as those disclosed herein, that should also be considered in the thermodynamic description of duplex DNA.
The total free energy change for forming the DNA helix from its individual strands is given by:
ΔG°(total) = Σ_i n_i ΔG°(i) + ΔG°(init w/term GC) + ΔG°(init w/term AT) + ΔG°(sym)
where ΔG°(i) are the standard free energy changes for the ten possible Watson-Crick n-n's, n_i is the number of occurrences of each nearest neighbor, i, and ΔG°(sym) equals +0.43 kcal/mol if the duplex is self-complementary and zero if it is not self-complementary. To account for differences between duplexes with terminal AT versus terminal GC pairs, two initiation parameters are introduced.
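The summation above may be sketched as follows for a perfectly matched duplex. The sketch makes two assumptions that are not stated in the text: each terminal base pair contributes one initiation term according to whether it is a GC or an AT pair, and a doublet and its reverse complement name the same n-n interaction. All function names are hypothetical, and the numeric parameters in the usage lines are placeholders rather than measured values.

```python
from typing import Dict

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def doublet_key(doublet: str) -> str:
    """Canonical name for a Watson-Crick doublet.

    A doublet and its reverse complement describe the same stacked pair,
    which reduces the 16 possible doublets to ten n-n interactions.
    """
    return min(doublet, reverse_complement(doublet))


def total_free_energy(seq: str, nn_dg: Dict[str, float],
                      init_gc: float, init_at: float,
                      sym: float = 0.43) -> float:
    """Sum n-n, initiation, and symmetry terms (kcal/mol) for one strand
    paired with its perfect complement."""
    total = 0.0
    for i in range(len(seq) - 1):
        total += nn_dg[doublet_key(seq[i:i + 2])]
    # Assumption: one initiation term per terminal base pair.
    for end_base in (seq[0], seq[-1]):
        total += init_gc if end_base in "GC" else init_at
    # Symmetry term applies only to self-complementary duplexes.
    if seq == reverse_complement(seq):
        total += sym
    return total


# Placeholder parameters (NOT measured values): every n-n term is -1.0.
TEN_KEYS = sorted({doublet_key(a + b) for a in "ACGT" for b in "ACGT"})
placeholder_nn = {key: -1.0 for key in TEN_KEYS}
print(len(TEN_KEYS))
print(total_free_energy("AGCG", placeholder_nn, init_gc=0.9, init_at=1.1))
```

A real calculation would replace the placeholder dictionary with a published or experimentally determined parameter set at the temperature of interest.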
Some probe design strategies may also apply several empirical factors that make certain “corrections” to the calculated thermodynamics. One example is a parabolic n-n model, in which n-n ΔG values are weighted by an upward-opening parabolic function that is centered at the middle of the duplex and increases toward the ends, so that n-n doublets become less stable (have higher ΔG values) as they approach the ends.
Although some nearest-neighbor parameters for single base pair mismatches for various possible nearest-neighbor combinations are known, there are no known parameter sets for tandem mismatches.
In some embodiments, the thermodynamic transition parameters, ΔH, ΔS, and ΔG, used in kinetic and equilibrium model calculations, may be determined from sequence-dependent thermodynamic parameters. See e.g., Benight et al., “Statistical Thermodynamics and Kinetics of DNA Multiplex Hybridization Reactions” Biophys J., 91(11), pp. 4133-4153 (2006).
Consider, for example, the hybrid duplex formed by sequences 5′-AGCGATGA-3′- and -3′-CAATAATT-5′ and its decomposition into nearest-neighbor components of the enthalpy, ΔH (mismatches are underlined):
This duplex contains eight nearest-neighbor interactions, including single-base 5′ dangling ends. The nearest-neighbor dependent parameters for the appropriate sequences and interactions are summarized in the following Tables 2-4.
In some embodiments, initiation factors may be assigned values depending on the particular identities of the end base pairs. Values for the initiation thermodynamic parameters associated with the duplex formed by the 5′-AGCGATGA-3′- and -3′-CAATAATT-5′ sequences are as follows:
The formulas for total free energy include:
ΔG=ΔH−TΔS (eq. 4);
Tm=ΔH/ΔS (eq. 5); and
ΔG=ΔH(1−T/Tm) (eq. 6).
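The three relations can be checked against one another with a short sketch. The function names and the numeric inputs are hypothetical, units are assumed to be kcal/mol for ΔH, kcal/(mol·K) for ΔS, and kelvin for T, and eq. 5 is used exactly as written in the text.

```python
def free_energy(dh: float, ds: float, temperature: float) -> float:
    """eq. 4: dG = dH - T*dS."""
    return dh - temperature * ds


def melting_temperature(dh: float, ds: float) -> float:
    """eq. 5: Tm = dH / dS."""
    return dh / ds


def free_energy_from_tm(dh: float, temperature: float, tm: float) -> float:
    """eq. 6: dG = dH * (1 - T/Tm)."""
    return dh * (1.0 - temperature / tm)


# Hypothetical duplex values, chosen only to exercise the formulas.
dh, ds, temperature = -60.0, -0.170, 310.15
tm = melting_temperature(dh, ds)
print(tm)
print(free_energy(dh, ds, temperature))
print(free_energy_from_tm(dh, temperature, tm))
```

Substituting eq. 5 into eq. 6 gives ΔH − TΔS, so the last two printed values agree to within floating-point rounding.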
In some embodiments, tandem mismatches are evaluated in terms of n-n contributions. In this approach tandem mismatch (mm) base pairs are assigned a ΔG value relative to the corresponding Watson-Crick base pair doublet values. See e.g., Benight et al., “Statistical Thermodynamics and Kinetics of DNA Multiplex Hybridization Reactions” Biophys J., 91(11), pp. 4133-4153 (2006). For example, the free-energy of a mismatch base pair doublet in a tandem mismatch complex can be assigned according to
ΔGmm=κΔGPM=κ(ΔHPM−TΔSPM) (eq. 7),
where ΔGPM, ΔHPM, and ΔSPM are the free energy, enthalpy, and entropy, respectively, for melting a hydrogen-bonded Watson-Crick base pair doublet. The factor κ is introduced as a means of scaling values of thermodynamic parameters of mismatch base pairs in tandem mismatches as a relative fraction of the stability of Watson-Crick perfect matches. The factor κ may be a single factor or one or more matrices of factors. In some embodiments, tandem mismatches can either be assumed to make a minimal contribution, κ=0, or be assigned a κ value greater than zero (0) and less than or equal to one (1) (e.g., κ=0.5). Although consideration of tandem mismatches in this manner is clearly an oversimplified generalization, it provides a convenient means of universally weighting non-Watson-Crick tandem mismatch pair interactions differently than Watson-Crick base pairs, and of discerning potential effects of tandem mismatch stability on multiplex hybridization.
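A minimal sketch of eq. 7 with a single scalar κ follows. The function name and the numeric inputs are hypothetical, and the matrix-valued form of κ mentioned above is not covered.

```python
def tandem_mismatch_free_energy(kappa: float, dh_pm: float, ds_pm: float,
                                temperature: float) -> float:
    """eq. 7: dG_mm = kappa * (dH_PM - T * dS_PM).

    kappa is restricted to the range [0, 1] described in the text.
    """
    if not 0.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between 0 and 1")
    return kappa * (dh_pm - temperature * ds_pm)


# Hypothetical perfect-match doublet values, for illustration only.
dh_pm, ds_pm, temperature = -8.0, -0.02, 300.0
print(tandem_mismatch_free_energy(0.0, dh_pm, ds_pm, temperature))
print(tandem_mismatch_free_energy(0.5, dh_pm, ds_pm, temperature))
print(tandem_mismatch_free_energy(1.0, dh_pm, ds_pm, temperature))
```

Setting κ=1 recovers the perfect-match value, and κ=0 removes the tandem mismatch contribution entirely.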
Examples of sequence dependent values of tandem mismatch thermodynamic parameters (κ) are summarized in Table 5.
The tandem mismatch values in Table 5 are grouped according to their purine (R) and pyrimidine (Y) composition. As suggested by the values of κ, contributions of tandem mismatches to duplex stability are much larger than presently assumed.
In some embodiments, nearest-neighbor thermodynamic parameters, tandem mismatch contributions, as well as other thermodynamic parameters associated with duplex binding may be determined experimentally using, for example, differential scanning calorimetry (DSC) techniques, UV-melting analysis, thermal denaturation techniques, optical absorbance versus temperature measurements, or the like.
For example, DNA duplex melting transitions may be evaluated by measurements of DSC melting curves using, for example, a Nano-II differential scanning calorimeter (Calorimetry Sciences Corp., Provo, Utah). In some embodiments, DSC data is collected as the change in excess heat capacity ΔCp versus temperature T. Heating rates may vary from about 15° C./hr to about 90° C./hr. The average buffer base line determined from multiple (usually more than three) scans of the buffer alone, is subtracted from these curves. The resulting base line corrected curve is then normalized to total DNA concentration and the calorimetric transition enthalpy ΔHcal and entropy ΔScal are determined from the normalized, base line corrected ΔCp vs. T curve.
In some embodiments, at least three forward and reverse ΔCp versus T scans are made per experiment. For short DNA melting curves, it is generally assumed that ΔCp (Tinitial)−ΔCp (Tfinal)=0. This assumption has been generally validated by the few attempts to evaluate any excess ΔCp in melting reactions, and it has been found that the contribution and the associated temperature dependence of thermodynamic parameters is very small.
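The data reduction described above may be sketched as follows. The sketch assumes the standard calorimetric relations ΔHcal = ∫ΔCp dT and ΔScal = ∫(ΔCp/T) dT, temperatures in kelvin, and simple trapezoidal integration; the function names and the sample numbers are hypothetical and are not instrument output.

```python
from typing import List, Sequence, Tuple


def trapezoid(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Trapezoidal-rule integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def calorimetric_parameters(temps_k: Sequence[float],
                            sample_cp: Sequence[float],
                            buffer_scans: Sequence[Sequence[float]],
                            dna_conc: float) -> Tuple[float, float]:
    """Return (dH_cal, dS_cal) from one sample scan and several buffer scans."""
    n_scans = len(buffer_scans)
    # Average the buffer-only scans point by point to obtain the baseline.
    baseline: List[float] = [
        sum(scan[i] for scan in buffer_scans) / n_scans
        for i in range(len(temps_k))
    ]
    # Subtract the baseline, then normalize to total DNA concentration.
    excess = [(s - b) / dna_conc for s, b in zip(sample_cp, baseline)]
    dh_cal = trapezoid(temps_k, excess)
    ds_cal = trapezoid(temps_k, [c / t for c, t in zip(excess, temps_k)])
    return dh_cal, ds_cal


# Synthetic three-point curve, for illustration only.
temps = [300.0, 310.0, 320.0]
sample = [1.0, 3.0, 1.0]
buffers = [[1.0, 1.0, 1.0]] * 3
print(calorimetric_parameters(temps, sample, buffers, dna_conc=1.0))
```

Because the excess heat capacity returns to zero at both ends of the synthetic curve, the sketch is consistent with the assumption that ΔCp (Tinitial)−ΔCp (Tfinal)=0.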
In some embodiments, thermodynamic parameters are evaluated by DSC. DSC offers some advantages over, for example, optical absorbance versus temperature measurements. These include: (1) model independent parameter evaluation; and (2) no need to measure concentration dependence of the melting transition temperature, Tm. Because DSC melting experiments are conducted at relatively higher strand concentrations than absorbance melting experiments, and higher strand concentrations lead to more duplex formation, melting experiments can be conducted on shorter duplexes at lower salt concentration.
One factor in probe design strategies is the quantitative determination of the propensity for intramolecular hairpin formation in probe and target strands. Known routines primarily rely on versions of an RNA and DNA folding package known as M-FOLD (developed by Dr. Michael Zuker of the Institute for Biomedical Computing, Washington University School of Medicine).
Some embodiments of the disclosed approaches of comparing and selecting probes based on the largest differences in ΔG of desired versus undesired hybridizations eliminate potential hairpin forming sequences, since two strands capable of forming hairpins are also self-complementary. Their sequence could also promote bi-molecular duplex formation instead of an internal single strand loop comprised of tandem mismatches. These are apparently effectively filtered by the probe-target analysis component 30, and in preliminary testing it has been found that the probe-target analysis component 30 is also an effective “filter” of self-complementary sequences that might be expected to have the strongest probability of hairpin formation. Partitioning of DNA sequence dependent contributions to thermodynamic stability into n-n components is the only known higher order representation of DNA that is not text-based. The n-n model is also ideally suited for an electronic circuit designed to make calculations and comparisons between the thermodynamics of sequences in a repetitive manner, using a database of n-n parameters.
When determining whether or not a particular probe sequence will bind with a set of large target sequences (e.g., a genome), as well as where it will bind, the energy of the duplex at each alignment of the probe with each of the targets must be accounted for. For example, given a probe length of 24 bases, and a genome to be examined having on the order of 6 billion bases, over 600 billion arithmetic operations must be performed to determine all the low energy alignment points. Along with these arithmetic operations, a large number of control and data flow operations are also required.
The extent of computations means that it takes a relatively long time (on the order of an hour or more), for a general purpose computer to make this determination, and thus such computations may become a rate limiting step.
Integrated circuitry offers tremendous computation speed by allowing parallelization of repetitive calculations. Using the n-n model thermodynamic parameters for calculating duplex stability results in fast thermodynamic scans of long DNA sequences.
In some embodiments, aligning a first nucleic acid probe base with a plurality of target bases includes shifting the first nucleic acid probe base sequence by at least one base in comparison to the plurality of target bases of the target sequence to define a second plurality of target bases, and determining the free energy contribution parameter for the comparison of the first nucleic acid probe base sequences with the second plurality of target bases.
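The shift-and-evaluate procedure may be sketched as a sliding window over the target. The sketch assumes that the target is written 3′ to 5′ so that position i of the probe pairs with position i of each window, and it uses a placeholder doublet energy of −1.0 for two stacked Watson-Crick pairs and 0.0 otherwise; the function names and the energy values are hypothetical.

```python
from typing import List

WATSON_CRICK = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def doublet_energy(probe_doublet: str, target_doublet: str) -> float:
    """Placeholder n-n term: -1.0 if both stacked pairs are Watson-Crick.

    A real implementation would look up sequence-dependent parameters,
    including mismatch and dangling-end terms.
    """
    paired = all((p, t) in WATSON_CRICK
                 for p, t in zip(probe_doublet, target_doublet))
    return -1.0 if paired else 0.0


def scan_target(probe: str, target: str) -> List[float]:
    """Free energy contribution at every one-base shift of probe along target."""
    n = len(probe)
    energies = []
    for offset in range(len(target) - n + 1):
        window = target[offset:offset + n]  # shifting by one base per step
        dg = sum(doublet_energy(probe[i:i + 2], window[i:i + 2])
                 for i in range(n - 1))
        energies.append(dg)
    return energies


print(scan_target("ACG", "TTGCA"))  # [0.0, -2.0, 0.0]
```

The lowest value in the returned list marks the offset at which the probe is predicted to bind most stably under the placeholder parameters.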
The computing system 10 may include at least one memory interface component including one or more sets of shift registers 206 interconnected in series or in parallel, or combinations thereof. In some embodiments, at least one shift register 202b, 204b of the one or more sets of shift registers 206 may be configured to receive a clock signal having a shift frequency. In some embodiments, the at least one shift register is capable of shifting data loaded into the shift register to a next one of the shift registers in the set 206 according to the shift frequency. In some embodiments, thermodynamic data from a computer-readable memory medium is loaded into a corresponding shift register in the sets of shift registers 206 and the loaded thermodynamic data is shifted from the shift register to a next one of the shift registers in the set according to the clock signal, such that the shift register maintains its shift frequency during any loading of the thermodynamic data.
The values addressed correspond to n-n parameters for ΔH and ΔS. All values must be added to give a single ΔS and ΔH for a given alignment, used to calculate the ΔG for that alignment (ΔG=ΔH−TΔS). The 16×16 Ram Blocks 208, 210 shown in
In some embodiments, the computing system 10 may simultaneously address all n-n elements that are stored in pairs of RAM Blocks 208, 210. As previously noted, an n-n doublet 258, 260 is comprised of two “base pair” units. In some embodiments, there is one RAM Block 208, 210 per base pair. Accessed values may be sent into a pipeline for calculation. This approach may significantly increase the computation speed of a comparison of a first plurality of nucleic acids with at least a second plurality of nucleic acids.
At 272, the individual n-n elements are sent simultaneously to the pipeline 270. With each clock cycle, elements are added by adders 274a, 274b and may be buffered in registers 276a, 276b. A multiplier 278 may multiply a value representing the entropy (ΔS) by a value representing the temperature (T), which may be stored in a register 280. Resulting values may be buffered in registers 282a, 282b, before being combined by adder 284. The adder 284 adds the enthalpy (ΔH) and the negated product (−TΔS), producing the free energy (ΔG=ΔH−TΔS). A comparator 286 compares the calculated ΔG value to a value that represents a reference free energy ΔGref, which may be stored in a register 288. The comparison indicates, for example, whether the probe of interest poses a threat for cross-hybridization at that alignment.
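The data flow through the pipeline may be modeled in software as a sequence of stages. The sketch is a behavioral model only: the assignment of the two adders to ΔH and ΔS and the direction of the comparator test (flagging alignments at least as stable as the reference) are assumptions, and the function name and numeric inputs are hypothetical.

```python
from typing import Sequence, Tuple


def pipeline_alignment(dh_terms: Sequence[float], ds_terms: Sequence[float],
                       temperature: float, dg_ref: float) -> Tuple[float, bool]:
    """Model one alignment passing through the add, multiply, add, compare stages."""
    dh_total = sum(dh_terms)          # adder stage for the dH n-n elements
    ds_total = sum(ds_terms)          # adder stage for the dS n-n elements
    t_ds = temperature * ds_total     # multiplier stage: T * dS
    dg = dh_total - t_ds              # final adder stage: dH + (-T * dS)
    # Comparator stage. Assumption: an alignment is flagged when its dG is
    # at or below (at least as stable as) the reference value.
    flagged = dg <= dg_ref
    return dg, flagged


# Hypothetical n-n elements for a three-base-pair alignment.
dg, flagged = pipeline_alignment([-8.0, -7.0], [-0.02, -0.02],
                                 temperature=300.0, dg_ref=-2.0)
print(dg, flagged)
```

In hardware, each stage would operate on a different alignment during the same clock cycle, which is what allows one result to emerge per cycle once the pipeline is full.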
Referring to
In some embodiments, the shift register structure 206 may include a first set of shift registers 202a having a first plurality of shift registers 202b interconnected in series. In some embodiments, at least one of the first plurality of registers 202b is configured to receive a clock signal having a shift frequency. In some embodiments, the first set of shift registers 202a is configured to shift thermodynamic data associated with the first nucleic acid sequence 202 loaded into at least one shift register in the first set of shift registers 202a to a next shift register in the first set of shift registers 202a according to, for example, the shift frequency.
The shift register structure 206 may further include a second set of shift registers 204a having a second plurality of shift registers 204b interconnected in, for example, series. The second set of shift registers may include one or more shift registers loaded with thermodynamic data associated with the second nucleic acid sequence 204.
In some embodiments, the shift register structure is configured to generate a comparison of thermodynamic data associated with the first nucleic acid sequence 202 loaded in one or more shift registers in the first set of shift registers 202a and thermodynamic data associated with the second nucleic acid sequence 204 loaded in one or more shift registers in the second set of shift registers 204a.
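The interaction of the two register sets may be modeled behaviorally as follows. The sketch assumes that data for the first sequence is clocked through a chain whose length equals that of the second sequence, that the second set is loaded once and held, and that the registers carry plain symbols rather than encoded thermodynamic data; the function name and the comparison callback are hypothetical.

```python
from collections import deque
from typing import Callable, List, Sequence


def stream_compare(first_seq: Sequence[str], second_seq: Sequence[str],
                   compare: Callable[[List[str], List[str]], float]) -> List[float]:
    """Shift first_seq through a register chain and compare on each clock tick."""
    held_regs = list(second_seq)                  # second set: loaded once
    shift_regs: deque = deque(maxlen=len(held_regs))  # first set: shifts per tick
    results = []
    for item in first_seq:                        # one loop pass = one clock tick
        shift_regs.append(item)                   # oldest entry drops off the end
        if len(shift_regs) == len(held_regs):     # chain is full: compare all taps
            results.append(compare(list(shift_regs), held_regs))
    return results


def count_matches(a: List[str], b: List[str]) -> float:
    """Toy comparison: number of positions holding equal symbols."""
    return float(sum(x == y for x, y in zip(a, b)))


print(stream_compare("ABCD", "BC", count_matches))  # [0.0, 2.0, 0.0]
```

A bounded deque is used here because appending to a full deque discards the oldest element, which mirrors data leaving the last register in a chain.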
An estimate on the enormous enhancements in speed that might be realized can be made with the following “back of the envelope” calculation. Bear in mind, however, that the following represents the optimum “theoretical” speed enhancement that can be obtained. What is actually obtained will, of course, depend on the functioning logic device circuitry. The algorithm makes thermodynamic comparisons serially and thus must compare all doublets in a probe-target duplex alignment before shifting the window by a base and making the same computation for the new probe-target duplex alignment. Thus, for a 17 base probe (n) scanned against a strand of the genome six billion base pairs in length (m), the algorithm must make (there are 16 n-n doublets formed in a 17 base pair duplex),
On a standard 3 GHz 1.6 Pentium the probe-target analysis component 30 can compare 600,000 bases per second (r). Thus a single 17 base probe can be scanned against the genome in, for example,
Compare this to the disclosed systems and methods that make calculations in parallel and therefore make all comparisons for a single probe at once before shifting over by a base. The same number of comparisons has to be made; however, an FPGA, for example, uses its hardware logic gates and pipeline to effectively reduce the number of comparisons from 16 to 1 comparison per window cycle. Thus the same 17 base probe can be scanned against the same genome by making
Low end FPGAs process at 100 MHz, therefore the time for a scan of this 17 base probe against the genome is
State of the art FPGAs process at 500 MHz which would allow scans five times faster. In this case the genomic scan would take 12 seconds to scan a 20-mer probe against a six billion base pair genome.
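The arithmetic behind this estimate can be reproduced from the figures stated in the text. The sketch assumes that the number of windows is approximately the genome length m, that the serial rate r counts one doublet comparison per base compared, and that the parallel design completes one window per clock cycle; the function names are hypothetical, and the serial scan time is a reconstruction from those stated figures rather than a value quoted from the disclosure.

```python
def serial_comparisons(probe_len: int, genome_len: int) -> int:
    """Doublet comparisons for a serial scan: (n - 1) doublets per window,
    with the number of windows approximated by the genome length m."""
    return (probe_len - 1) * genome_len


def scan_seconds(comparisons: float, rate_per_second: float) -> float:
    return comparisons / rate_per_second


n, m = 17, 6_000_000_000
serial = serial_comparisons(n, m)            # 16 doublets x 6e9 windows
print(serial)                                # 96000000000
print(scan_seconds(serial, 600_000))         # 160000.0 s, roughly 44 hours
print(scan_seconds(m, 100e6))                # 60.0 s at 100 MHz
print(scan_seconds(m, 500e6))                # 12.0 s at 500 MHz
```

Under these assumptions the parallel scan time depends only on the genome length and the clock rate, not on the probe length, which is consistent with the 12 second figure given above.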
The results illustrated in
At 402, the method 400 includes determining a first free energy value indicative of a duplex of a first nucleic acid probe and a first target nucleic acid sequence. In some embodiments, free energy values may be determined using, for example, sequence-dependent thermodynamic parameters. In some other embodiments, free energy values may be determined using, for example, one or more nearest neighbor (n-n) modeling approaches.
In some embodiments, the free energy values may be retrieved from a data structure comprising a thermodynamic data section including thermodynamic data representative of dangling ends of two or more bases. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of unpaired single strands of two or more bases adjacent to a Watson-Crick base pairing. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of unpaired single strands of one or more bases adjacent to a non-Watson-Crick base pairing. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of tandem base pair mismatches of two or more bases. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of length-dependent terminal mismatches of nucleic acid bases. In some embodiments, the thermodynamic data section may further include thermodynamic data representative of terminal base pair mismatches.
At 404, the method 400 includes determining a first minimum free energy value indicative of a lowest free energy value associated with a formation of each of one or more duplexes formed by the first nucleic acid probe and at least a second target nucleic acid sequence. In some embodiments, determining the first free energy value comprises retrieving from storage a free energy contribution parameter in parallel for one or more of the comparisons of the first or the at least second nucleic acid probe base sequence, to the first or the second plurality of target bases.
At 406, the method 400 includes determining a second minimum free energy value indicative of a lowest free energy value associated with a formation of each of one or more duplexes formed by the first nucleic acid probe and at least a second nucleic acid probe.
At 408, the method 400 includes determining a difference between the determined first free energy value, and a minimum of the first minimum free energy value and the second minimum free energy value.
At 410, the method 400 includes comparing the determined difference to a target value. In some embodiments, comparing the determined difference to a target value comprises comparing the determined difference to a target minimum free energy value, a target maximum energy gap value, a target difference of free energy value, or combinations thereof.
At 412, the method 400 may further include randomly generating a sequence of the first nucleic acid probe and a sequence of the at least second nucleic acid probe prior to determining the first free energy value.
At 414, the method 400 may further include generating a sequence of the first nucleic acid probe and a sequence of the at least second nucleic acid probe using a pseudo-random sequence generator prior to determining the first free energy value.
At 416, the method 400 may further include selecting a set of at least two nucleic acid probes based on whether the determined difference meets or exceeds the target value.
At 418, the method 400 may further include selecting a set of at least two nucleic acid probes based on at least one criterion selected from a compositional constraint, a lexical constraint, and a thermodynamic constraint.
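The acts of method 400 may be sketched as follows. The sketch assumes that the difference is taken as the lowest competing free energy minus the free energy of the intended duplex, so that a larger positive gap means better discrimination, and that free energies have already been computed elsewhere; the function names and numeric values are hypothetical.

```python
import random
from typing import Sequence


def random_probe(length: int, rng: random.Random) -> str:
    """Acts 412/414: pseudo-randomly generate a candidate probe sequence."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def free_energy_gap(dg_intended: float,
                    dg_other_targets: Sequence[float],
                    dg_other_probes: Sequence[float]) -> float:
    """Acts 404-408: gap between the intended duplex and its best competitor."""
    min_target = min(dg_other_targets)   # act 404: lowest probe/off-target dG
    min_probe = min(dg_other_probes)     # act 406: lowest probe/probe dG
    competitor = min(min_target, min_probe)
    # Assumed sign convention: positive when the intended duplex is more stable.
    return competitor - dg_intended


def meets_target(gap: float, target_gap: float) -> bool:
    """Acts 410/416: keep the probe if the gap meets or exceeds the target."""
    return gap >= target_gap


rng = random.Random(0)                   # seeded for reproducibility
print(random_probe(17, rng))
gap = free_energy_gap(-20.0, [-12.0, -9.0], [-5.0, -14.0])
print(gap, meets_target(gap, 5.0))
```

In a full design loop, candidate probes would be generated, scored against all targets and all other probes, and retained only when the gap criterion and any compositional, lexical, or thermodynamic constraints are satisfied.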
At 452, the method 450 includes determining a first free energy contribution parameter for a comparison of a first nucleic acid probe base sequence to a first plurality of target bases of a target sequence.
At 454, the method 450 includes comparing the first free energy contribution parameter to a target value.
At 456, the method 450 includes generating a response based on the comparison to the target value. In some embodiments, generating a response based on the comparison includes generating the response based on a comparison of the first free energy contribution parameter to a target value indicative of the presence of the target nucleic acid sequence or a closely homologous sequence. In some embodiments, generating a response based on the comparison includes having a controller 12 compare the first free energy contribution parameter to the target value, and to generate at least one of a comparison plot, comparison data, an indication of a level of gene expression, an indication of a presence or absence of one or more nucleic acid sequences, or an indication of an L-length-mer composition of a target DNA fragment based on the comparison.
At 458, the method 450 may further include determining a second free energy contribution parameter for a comparison of at least a second nucleic acid probe base sequence to the first plurality of target bases of the target sequence.
At 460, the method 450 may further include comparing the at least second contribution parameter to the target value.
At 462, the method 450 may further include generating a response based on the comparison to the target value.
At 464, the method 450 may further include determining a third free energy contribution parameter for a comparison of the first nucleic acid probe base sequence to a second plurality of target bases of a target sequence.
In some embodiments, determining the third free energy contribution parameter comprises shifting the first nucleic acid probe base sequence by at least one base in comparison to the first plurality of target bases of the target sequence to define the second plurality of target bases, and determining the third free energy contribution parameter for the comparison of the first nucleic acid probe base sequences with the second plurality of target bases.
At 466, the method 450 may further include comparing the third free energy contribution parameter to the target value.
At 468, the method 450 may further include generating a response based on the comparison to the target value.
At 470, the method 450 may further include providing a signal indicative of when the first free energy parameter is less than a target threshold amount.
At 502, the method 500 includes identifying a genetic region in the genomic sequence characterized by at least one nucleic acid sequence.
At 504, the method 500 includes providing a first probe and at least a second probe. The first and the at least second probes may be provided based on a free energy gap characteristic indicative of a binding affinity for the at least one nucleic acid sequence.
At 506, the method 500 includes detecting whether a binding event between the first and the at least second probes and the at least one nucleic acid sequence has occurred.
In some embodiments, at least one computer readable storage medium stores instructions that, when executed on a computer, execute the method 550 for determining the thermodynamic characteristics of nucleic acid sequences.
At 552, the method 550 includes retrieving from storage one or more thermodynamic parameters associated with a binding comparison of a first nucleic acid base sequence to a first region of at least a second nucleic acid base sequence. In some embodiments, retrieving from storage one or more thermodynamic parameters comprises retrieving from storage at least one value indicative of a nearest-neighbor free energy parameter, a nearest-neighbor enthalpy parameter, or a nearest-neighbor entropy parameter.
At 554, the method 550 may further include retrieving from storage one or more thermodynamic parameters associated with a binding comparison of the first nucleic acid base sequence to a second region of the at least second nucleic acid base sequence, the second region different from the first region by at least one nucleic acid base position along a nucleic acid sequence of the second nucleic acid base sequence.
The one or more thermodynamic parameters may comprise at least one of a dangling end of two or more bases thermodynamic parameter, an unpaired single strand of two or more bases adjacent to a Watson-Crick base pairing thermodynamic parameter, a tandem base pair mismatch of two or more bases thermodynamic parameter, a length-dependent terminal mismatch of nucleic acid base thermodynamic parameter, and a terminal base pair mismatch thermodynamic parameter.
At 556, the method 550 may further include generating a binding profile for the first nucleic acid base sequence based on the comparison of the first nucleic acid base sequence to the first region, or the comparison of the first nucleic acid base sequence to the second region.
At 558, the method 550 may further include generating a thermodynamic stability profile for the first nucleic acid base sequence based on the comparison of the first nucleic acid base sequence to the first region, or the comparison of the first nucleic acid base sequence to the second region. Referring to
The above description of illustrated embodiments, including what is described in the Abstract, is not intended to be exhaustive or to limit the embodiments to the precise forms disclosed. Although specific embodiments and examples are described herein for illustrative purposes, various equivalent modifications can be made without departing from the spirit and scope of the disclosure, as will be recognized by those skilled in the relevant art. The teachings provided herein of the various embodiments can be applied to systems, devices, and methods for analyzing biological samples, analyzing biological molecules (e.g., oligonucleotides, peptides, proteins, or the like), nucleic acid probes, evaluating thermodynamic properties of nucleic acid sequences, or the like, not necessarily the exemplary systems, devices, and methods for analyzing biological samples, analyzing biological molecules (e.g., oligonucleotides, peptides, proteins, or the like), nucleic acid probes, evaluating thermodynamic properties of nucleic acid sequences, or the like generally described above.
For instance, the foregoing detailed description has set forth various embodiments of the devices and/or processes via the use of block diagrams, schematics, and examples. Insofar as such block diagrams, schematics, and examples contain one or more functions and/or operations, it will be understood by those skilled in the art that each function and/or operation within such block diagrams, flowcharts, or examples can be implemented, individually and/or collectively, by a wide range of hardware, software, firmware, or virtually any combination thereof. In one embodiment, the present subject matter may be implemented via Application Specific Integrated Circuits (ASICs). However, those skilled in the art will recognize that the embodiments disclosed herein, in whole or in part, can be equivalently implemented in standard integrated circuits, as one or more computer programs running on one or more computers (e.g., as one or more programs running on one or more computer systems), as one or more programs running on one or more controllers (e.g., microcontrollers), as one or more programs running on one or more processors (e.g., microprocessors), as firmware, or as virtually any combination thereof, and that designing the circuitry and/or writing the code for the software and/or firmware would be well within the skill of one of ordinary skill in the art in light of this disclosure.
In addition, those skilled in the art will appreciate that the mechanisms taught herein are capable of being distributed as a program product in a variety of forms, and that an illustrative embodiment applies equally regardless of the particular type of signal bearing media used to actually carry out the distribution. Examples of signal bearing media include, but are not limited to, the following: recordable type media such as floppy disks, hard disk drives, CD ROMs, digital tape, and computer memory; and transmission type media such as digital and analog communication links using TDM or IP based communication links (e.g., packet links).
The various embodiments described above can be combined to provide further embodiments. To the extent that they are not inconsistent with the specific teachings and definitions herein, all of the U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet, including but not limited to U.S. Provisional Patent Application No. 60/884,161 filed Jan. 9, 2007; and U.S. Provisional Patent Application No. 60/947,597 filed Jul. 2, 2007, are incorporated herein by reference, in their entirety. Aspects of the embodiments can be modified, if necessary, to employ, for example, systems, circuits, and concepts of the various patents, applications, and publications to provide yet further embodiments.
These and other changes can be made to the embodiments in light of the above-detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims, but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled. Accordingly, the claims are not limited by the disclosure.
This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application No. 60/884,161 filed Jan. 9, 2007 and U.S. Provisional Patent Application No. 60/947,597 filed Jul. 2, 2007.
| Number | Date | Country |
|---|---|---|
| 60884161 | Jan 2007 | US |
| 60947597 | Jul 2007 | US |