The file named “VONL022US-corrected-8-18-2022.txt” contains a computer readable form of the Sequence Listing and was created on Aug. 18, 2022. The file is 56,268 bytes (measured in MS-Windows®), is filed contemporaneously along with this application by electronic submission (using the United States Patent Office EFS-Web filing system), and is incorporated herein by reference in its entirety.
The invention relates to the fields of immunology, immunogenetics and clinical diagnostics. In particular, it relates to means and methods for assessing clonal immunoglobulin (IG)/T cell receptor (TR) gene rearrangements in a clinical, diagnostic and/or research setting.
Specific antigen recognition by cells of the adaptive immune system (B cells, T cells) is mediated through receptors (immunoglobulin, IG, and T cell receptor, TR) that are uniquely formed during immune development in bone marrow and thymus, respectively. Through recombination of IG/TR loci a diverse (polyclonal) repertoire of unique IG/TR receptors is created. In certain autoimmune diseases this repertoire is skewed (oligoclonal), whereas in lymphoid malignancies receptors are largely identical (monoclonal)1-7. IG/TR rearrangements thus form unique genetic biomarkers (molecular signatures) for studying immune cells for clinical, diagnostic and research applications8-11. Classically, methods for immunogenetic analysis mostly concern fragment analysis and Sanger-based sequencing. The introduction of next-generation sequencing (NGS) makes deeper analysis of IG/TR rearrangements possible, with impact on the main immunogenetic applications: clonality assessment, minimal residual disease (MRD) detection, repertoire analysis12-29.
Identification and assessment of clonal IG/TR gene rearrangements is a widely used tool for the diagnosis and follow-up of lymphoid malignancies30-35. NGS of IG/TR gene rearrangements is gaining popularity in clinical laboratories, as it avoids laborious design of patient-specific real-time quantitative (RQ)-PCR assays and provides the capability to sequence multiple rearrangements and rearrangement types within a single sequencing run. Hence, several methods have already been described for high-throughput profiling of IG/TR rearrangements at diagnosis and follow-up in acute lymphoblastic leukemia (ALL), chronic lymphocytic leukemia (CLL) and other lymphoid malignancies.16,17,23,24,36,37
Potential applications for IG/TR NGS are identification of clonal IG/TR markers in diagnostic samples for subsequent analysis of minimal residual disease (MRD), but also the actual MRD analysis itself, in different lymphoid malignancies (mainly ALL, CLL, follicular lymphoma, mantle cell lymphoma, multiple myeloma). In addition, it can be applied for clonality diagnostics in the diagnostic process of lymphoproliferative disorders.
It has been found that NGS assays, especially those based on amplicons, pose major challenges. For example, multiple primers need to anneal under the same reaction conditions, while many technical variables may be introduced by sample preparation, library construction, sequencing and bioinformatics, potentially leading to inaccurate results38. Unfortunately, standardization and validation in a scientifically controlled multicentre setting are still lacking. Particularly in a clinical context, strategies for standardisation of laboratory protocols and quality control (QC) of each component of an NGS assay are highly sought after.
Reference standards are essential for the evaluation of wet-lab and in silico NGS processes to ensure the analytical validity of test results prior to implementation of an NGS technology into clinical practice29,39,40. Reference DNA materials should be stable sources of rearrangements that can be sequenced and used for measuring qualitative and quantitative properties. However, the present inventors recognized that previously published standards have a limited scope and utility, since they (1) do not cover all relevant IG/TR loci, (2) do not report on the quality of the sequencing run or the performance of samples and primers, and/or (3) are synthetic constructs that may not reflect the complexity of native genomic DNA23,41,42.
Therefore, they aimed at providing improved types of quality controls that can be readily integrated in existing systems for immunoprofiling IG/TR sequence data, in particular ARResT/Interrogate43, an interactive web-based computational platform that can process and annotate large amounts of immunogenetic data, calculate several relevant statistics including on QC, and present results in the form of multiple interconnected visualizations. More specifically, they sought to provide a composition that can be directly added to a sample to undergo concurrent library preparation and sequencing, acting as an in-tube qualitative and quantitative standard that is subjected to the same technical downstream variables as the accompanying samples. Furthermore, they aimed at providing a composition that allows uniform assessment of performance biases or unusual amplification shifts across all types of IG/TR gene rearrangements in a sequencing run by tracking primer usage and comparing it with stored reference profiles.
To that end, the present inventors of the EuroClonality-NGS Working Group joined forces to develop, standardize and validate in vitro assays and bioinformatics for IG/TR NGS applications. This resulted in the provision of two novel types of quality controls: a central in-tube quality/quantification control (cIT-QC) based on human B and T cell lines with well-defined IG/TR gene rearrangements, and a central polytarget quality control (cPT-QC) based on a standardised mixture of lymphoid specimens representing a full repertoire of IG/TR genes.
Accordingly, in a first aspect the invention provides a composition (herein also referred to as “central in-tube quality/quantification control” (cIT-QC)) comprising a mixture of genomic DNA isolated from a set of nine cultured cell lines, said set comprising the B cell lines ALL/MIK (B cell precursor ALL), Raji (Burkitt lymphoma), REH (B cell precursor ALL), TMM (CML-BC/EBV+B-LCL), TOM-1 (B cell precursor ALL), WSU-NHL (B cell lymphoma); and three T cell lines: JB6 (ALCL), Karpas299 (ALCL) and MOLT-13 (T-ALL), or wherein one or more cell lines of said set is replaced with one or more other cell line(s) comprising the same immunoglobulin (IG)/T cell receptor (TR) gene rearrangements, i.e. the same IG/TR rearrangement profile.
In a second aspect, the invention provides a composition (herein also referred to as “central polytarget quality control” (cPT-QC)) consisting of essentially equimolar amounts of genomic DNA isolated from healthy human thymus, healthy human tonsil and healthy human peripheral blood mononuclear cells.
Compositions according to the present invention are not known or suggested in the art. US2018/208984 A1 relates to a method for detecting IG/TR rearrangements using next-generation sequencing using a set of primers. A set of plasmids comprising known alleles, including TCR sequences of the T cell lines JB6, Karpas299 and MOLT-13, is used as a control. However, the control sample of US2018/208984 using cDNA prevents the inclusion of incomplete TR and IG rearrangements, because they are not transcribed into mRNA molecules. Such incomplete TR and IG rearrangements are explicit targets in the present invention, as they are complementary targets for clonality detection and MRD assessment. For example, unlike a control composition of the invention, US2018/208984 does not allow for the identification and quantification of the rearrangements TRB D-J, TRD V-D, TRD D-D, TRD D-J, IGK-Kde and IGH D-J.
Beccutti et al. (BMC Bioinformatics, Vol. 18, no. 1, 2017, pages 1-12) relates to a method for detecting IG/TR rearrangements using NGS. DNA isolated from buffy coat (comprising peripheral blood mononuclear cells) is used as a control. Beccutti et al. is silent about the use of DNA from additional tissue sources, let alone suggesting the inclusion of tonsil and thymus genomic DNA in a polytarget quality control composition in order to include essential rearrangements that are not found in PBMCs.
A cIT-QC composition as provided herein has a number of unique and advantageous properties. First, with the selected set of only nine cell lines featuring a total of 46 rearrangements, it represents as few cell lines as possible, while covering each target by at least three different rearrangements, hence allowing for detecting ALL cells harbouring not only lineage-associated but also cross-lineage rearrangements. Second, the rearrangements are unambiguously detectable with Sanger sequencing and/or amplicon-based NGS. Third, the variable regions of the IGH gene rearrangements are unmutated, which avoids issues with primer annealing. Table 1 presents the full list of the 46 rearrangements.
With the use of genomic DNA, a composition of the invention explicitly avoids the usage of plasmids, which are known to pose a serious risk of contaminating PCR assays. Additionally, genomic DNA was chosen to optimally represent the patient samples for which the assay is intended, and which also comprise genomic DNA.
Genomic DNA is readily isolated from the cell lines using established extraction protocols known in the art. In one embodiment, the DNA is obtained using a phenol-chloroform extraction protocol, followed by ethanol precipitation and elution in Tris ethylenediaminetetra-acetic acid (TE) buffer. The composition is suitably in a dry (e.g. lyophilized) form to be reconstituted with a liquid prior to use.
In a preferred aspect, a composition comprises a mixture of about equal amounts of genomic DNA isolated from the selected set of cultured cell lines. For example, the cIT-QC composition is formulated to provide a test sample with the DNA of at least 40 cell copies, preferably at least 50 copies of each cell line. Whereas there is no maximum number of cell copies to be represented in the control sample, very high amounts of genomic DNA may consume to a considerable extent the sequencing power in the assay. In one embodiment, the cIT-QC composition contains about an equal number of cell line DNA copies of the selected set of cultured cell lines and is (formulated to be) reconstituted to a solution that contains the genomic DNA of 20-50 cell copies of each cell line per reaction.
The B and T cell lines for use in a cIT-QC composition provided herein can be obtained from any suitable (commercial) source. For example, the Raji cell line is DSMZ ACC 319, the REH cell line is DSMZ ACC 22, the TMM cell line is DSMZ ACC 95, the TOM-1 cell line is DSMZ ACC 578, the WSU-NHL cell line is DSMZ ACC 58, the Karpas299 cell line is DSMZ ACC 31 and/or the MOLT-13 cell line is DSMZ ACC 436. It is of course also possible to replace one or more of the cell lines shown in Table 1 with another cell line having good growth characteristics that contains (or is provided with) the same rearrangements. Hence, also encompassed is a composition comprising a mixture of genomic DNA isolated from a set of cultured cell lines which together cover the profile with 46 rearrangement types shown in Table 1. In particular, the invention provides a composition comprising a mixture of genomic DNA isolated from a set of nine cultured cell lines, said set comprising the B cell lines ALL/MIK (B cell precursor ALL), Raji (Burkitt lymphoma), REH (B cell precursor ALL), TMM (CML-BC/EBV+B-LCL), TOM-1 (B cell precursor ALL), WSU-NHL (B cell lymphoma) and the T cell lines JB6 (ALCL), Karpas299 (ALCL) and MOLT-13 (T-ALL), or wherein one or more cell lines of said set is replaced with one or more other (i.e. distinct from the nine cell lines recited), cell line(s) comprising the same immunoglobulin (IG)/T cell receptor (TR) gene rearrangements. Such an “equivalent” cell type comprises at least the same gene rearrangements depicted in Table 1, or IG/TR rearrangements of the same type, i.e. different CDR3. In a preferred aspect, the composition comprises genomic DNA isolated from the B cell lines ALL/MIK, REH and TOM-1, each comprising cross-lineage TR rearrangements.
In a preferred embodiment, the composition consists of a mixture of, preferably in about equal amounts, genomic DNA isolated from the B cell lines ALL/MIK, Raji, REH, TMM, TOM-1, WSU-NHL and the T cell lines JB6, Karpas299 and MOLT-13.
In a further aspect, the invention provides a cPT-QC composition consisting of essentially equimolar amounts of genomic DNA isolated from healthy human thymus, healthy human tonsil and healthy human peripheral blood mononuclear cells (PB-MNC). In other words, it consists of an equimolar mixture of ⅓ thymus, ⅓ tonsil, ⅓ PB-MNC DNA. As used herein, the term “healthy” refers to tissue obtained from a human subject that is known or presumed not to suffer from an underlying malignant immunological disease or disorder. In one aspect, thymus is obtained from young children through removal due to physical impossibility to reach the heart for surgery. It is preferred that, for each tissue, the genomic DNA is obtained from a number of different human individuals. For example, for each tissue the DNA of 3 to 10 human subjects is used. Since this composition represents a “standardised lymphoid specimen”, it is suitably used as a separate sample to be processed alongside test samples, and it is preferably formulated to provide essentially the same amount of DNA as a regular sample that is tested. This typically ranges from 50 to 200 ng, preferably. In a specific aspect, the composition is dried, e.g. lyophilized.
The expression “in about equal amounts” or “in essentially equal amounts” as used herein reflects the aim that each of the cell lines/lymphoid tissue samples is equally represented in the mixture of genomic DNA.
The invention also provides a diagnostic kit comprising a (first) container comprising a “central in-tube quality/quantification control” (cIT-QC) composition of the invention and/or a (second) container comprising a “central polytarget quality control” (cPT-QC) composition as herein disclosed. In one embodiment, the kit comprises at least a cIT-QC composition as herein disclosed. In another embodiment, the kit comprises at least a cPT-QC composition of the present invention. The cPT-QC composition may be packaged together with one or more further useful quality control composition(s). For example, the further control composition may comprise a mixture of genomic DNA isolated from a set of cultured cell lines which together cover the profile with 46 rearrangement types shown in Table 1, such that both quality controls can be used to monitor the assay performance when assessing clonal IG/TR gene rearrangements.
Preferably, the kit comprises both the cIT-QC and the cPT-QC compositions as herein disclosed. The kit may advantageously further comprise one or more reagents for detecting IG/TR gene rearrangements, such as a set of primers for amplicon-based NGS of IG/TR gene rearrangements. In a specific embodiment, the diagnostic kit comprises, in addition to one or both QC composition(s) provided herein, one or more primer sets for detecting one or more of the IG/TR gene rearrangements selected from the group consisting of IGH-VJ, IGH-DJ, IGK-VJ-Kde, TRB-VJ, TRB-DJ, TRD and TRG. Particularly preferred primers, e.g. for use in combination with the QC compositions, are those that have been optimized for NGS-based detection, such as the primers shown in
The invention also provides a set of primers for amplicon-based next-generation sequencing (NGS) of IG/TR gene rearrangements, comprising two or more of the primers selected from the primers shown in
In a specific aspect, a primer of the invention comprises a forward or reverse M13 sequence. In one embodiment, a primer sequence of
The provision of the novel QC compositions of the invention has important implications for the quality control and quantitation strategies. As is demonstrated herein below, the cPT-QC composition is a valuable tool to monitor reproducibility of results and to identify primer perturbations and other deviations in the wet lab protocol, as they introduce detectable changes to the sequencing profile. The addition of the cPT-QC to each sequencing run makes it possible to check the primer and assay performance after sequencing. Accidental deviations in the concentrations of single primers within the multiplexed IG/TR primer sets can be detected, performance failures of single primers can be traced and consequences for the IG/TR analysis can be estimated by analysis of the cPT-QC data.
Additionally, the advantages and diverse utility of the cIT-QC are shown. In contrast to plasmids or synthetic reference templates, cIT-QC cell lines are particularly well suited to be used as control because they are sources of large quantities of genomic DNA and are commercially available. cIT-QC rearrangements represent ⅔ of the amplifiable rearrangement types over all eight primer sets, and thus offer an opportunity to highlight a number of issues, most obviously over-/under-amplification, but also bioinformatic misidentification. Additionally, cIT-QC rearrangements can replace buffy coat DNA for PCR stability without influencing the patient immune repertoire (since cIT-QC rearrangements are bioinformatically identified and by default excluded from the results).
The cIT-QC enables the conversion from reads to cells, which is of utmost importance for clinical use. Diagnostic material being analysed for MRD marker identification can show abundances of particular clonotypes that do not reflect the clonal composition of the sample. For example, if the diagnostic sample is highly infiltrated by a lymphoid malignancy that does not harbour a targetable rearrangement, the (few) residual lymphoid cells would generate the whole spectrum of detectable rearrangements; in such situations minor accompanying physiological B or T cell clones could be misassigned as clones with leukemic markers.
In addition to its use in marker identification, and as exemplarily shown for B and T cell depletion in aplastic follow-up samples, the cIT-QC is of utmost relevance for MRD quantification in samples on or after treatment, in particular if B or T cell directed therapy, which minimises the background of polyclonal gene rearrangements, was applied. If the relative tumor burden is calculated by the ratio of leukemia-specific reads to all annotated reads without any normalisation, the quotient reflects the marker frequency only among cells carrying a particular type of rearrangement (e.g. IG rearrangements in the total pool of B cells present) and might thus heavily overestimate the actual tumor load44.
Still further, the QC protocols can be readily embedded in ARResT/Interrogate, which informs users with reports and messages and allows them e.g. to include the QC-failed samples back into the analysis. The logic behind this is that the flag “fail” is an alarm that pre-defined QC criteria were not met, but it does not necessarily indicate that the data are fully corrupt. However, flagged data should always be used with caution, and dependent on the application or question.
The invention therefore also relates to the use of a composition according to the invention or a kit as described herein above in an assay for detecting IG/TR gene rearrangements. A person skilled in the art will recognize and appreciate the diverse range of applications. Only by way of example, the assay is a clinical diagnostic assay, preferably an assay for detecting clonality, identifying MRD markers and/or MRD monitoring and/or analyzing the (clonal) immune repertoire in a lymphoid malignancy.
A further embodiment relates to an in vitro method for detecting IG/TR gene rearrangements in at least one biological sample using NGS, comprising the conventional steps of sample preparation, PCR and/or library construction, sequencing and bioinformatics analysis, but characterized in that at least one of the QC compositions is used. For example, at least one biological sample is spiked with a cIT-QC composition, e.g. in an amount to provide the DNA of at least 40 cell copies of each cell line. Such use as in-tube qualitative and quantitative control enables the conversion from reads to cell correlates, which is of utmost importance for clinical use.
Alternatively or additionally, a cPT-QC composition is run as a separate sample in parallel to the at least one biological “test” sample(s), therewith serving as external control to check the primer and assay performance after sequencing.
Typically, the at least one biological sample is a clinically relevant sample. In one aspect, it is a sample for detection of clonality to support or exclude the diagnosis of malignant lymphoproliferation. In another aspect, it is a sample taken for MRD marker identification or for MRD monitoring analysis or for (clonal) immune repertoire analysis.
A method provided herein can be performed using standard means and protocols known in the art. In one embodiment, at least part of the method is performed using microfluidics technology. For example, the steps of sample preparation, PCR, library construction and/or sequencing are performed in a microfluidics device comprising one or more prestored reagents. Particularly preferred for use in a method of the invention is a centrifugal-microfluidic disk system (also known in the art as “centrifugal microfluidic biochip” or “centrifugal micro-fluidic biodisk”), which is a type of lab-on-a-chip technology that can be used to integrate processes such as separating, mixing, reacting and detecting molecules of nano-size on a single platform, including a compact disk or DVD. A centrifugal microfluidic structure contains various typical units, including valves, volume metering, mixing and flow switching. These types of units can make up structures that can be used in a variety of ways. Before the molecules react with the reagents, they should be prepared for the reactions. The most typical preparation step is separation by centrifugal force. In the case of blood, for example, the sedimentation of blood cells from plasma can be achieved by rotating the biodisk for some time. After separation, all molecular diagnostic assays require a step of cell/viral lysis in order to release genomic and proteomic material for downstream processing. Typical lysis methods include chemical and physical methods. The chemical lysis method, which is the simplest, uses chemical detergents or enzymes to break down membranes. Physical lysis can be achieved by using a bead-beating system on a disk. Lysis occurs due to collisions and shearing between the beads and the cells and through friction shearing along the lysis chamber walls.
In one aspect, the disk comprises pre-stored reagents for automated and integrated DNA extraction, PCR and/or library generation. See for example the review by Tang et al45.
Exemplary disks for use in a method of the invention include those having one or more of the specific features as disclosed in patent applications in the name of Hahn Schickard, such as WO2013/124258, WO2014/198703, WO2015/189280, WO2015/051950 and WO2017/191032.
In a method of the invention, the step of bioinformatic analysis advantageously comprises the use of a web-based, interactive application. For example, bioinformatic analysis comprises the use of a purpose-built bioinformatic application (such as ARResT/Interrogate, or equivalent) for the pre-processing of raw data, primer sequence analysis, immunogenetic annotation, post-processing of results, analysis and use of the cIT-QC (including for marker quantification), analysis and use of the cPT-QC (including for comparison to pre-analyzed stored reference datasets), reporting of/access to/visualization of results.
Herewith, the invention demonstrates the applicability of two reference/QC standards, which allow standardised analysis of IG/TR NGS data (e.g. using the NGS primer sets herein disclosed) with high reproducibility, accuracy and precision in marker identification. With ARResT/Interrogate, a complete in silico solution accompanying the in vitro assays is provided, which enables an analysis of IG/TR sequences including all quality criteria and quantification concepts necessary for valid marker identification in lymphoid malignancies.
5A-1) Schematic diagrams of IGH-VJ and IGH-DJ rearrangements. The relative position of the VH family primers, DH family primers and consensus JH primers is given according to their most 5′nucleotide upstream (−) or downstream (+) of the involved RSS.
5A-2) Histograms showing junction nucleotide lengths of complete IGH rearrangements (IGH-VJ tube) in a BCP-ALL patient, cPT-QC, BC, thymus, and tonsil. Bars are coloured according to the V-J genes combination.
5A-3) Histograms showing junction nucleotide lengths of incomplete IGH rearrangements (IGH-DJ tube) in a BCP-ALL patient, cPT-QC, BC, thymus, and tonsil. Bars are coloured according to the D-J genes combination.
5B-1) Schematic diagrams of IGK-VJ rearrangement and the two types of Kde rearrangements (V-Kde and intronRSS-Kde). The relative position of the VK, JK, Kde, and intronRSS (INTR) primers is given according to their most 5′nucleotide upstream (−) or downstream (+) of the involved RSS.
5B-2) Histograms showing junction nucleotide lengths of IGK-VJ and IGK-V-Kde rearrangements (IGK-VJ-Kde tube) in a B-ALL patient, cPT-QC, BC, thymus, and tonsil. Bars are coloured according to the V-J-Kde genes combination.
5B-3) Histograms showing junction nucleotide lengths of intron-Kde rearrangements (intron-Kde tube) in a BCP-ALL patient, cPT-QC, BC, thymus, and tonsil.
5C-1) Schematic diagrams of TRB-VJ rearrangement and DJ rearrangement. The relative position of the TRB V family primers, TRB D primers and the TRB J primers is given according to their most 5′nucleotide upstream (−) or downstream (+) of the involved RSS.
5C-2) Histograms showing junction nucleotide lengths of complete TRB rearrangements (TRB-VJ tube) in a T-ALL patient, cPT-QC, BC, thymus, and tonsil. Bars are coloured according to the V-J genes combination.
5C-3) Histograms showing junction nucleotide lengths of incomplete TRB rearrangements (TRB-DJ tube) in a T-ALL patient, cPT-QC, BC, thymus, and tonsil. Bars are coloured according to the D-J genes combination.
5D-1) Schematic diagrams of TRG V-J rearrangement and the relative position of the TRG V and TRG J primers. The relative position of the TRG V primers and the TRG J primers is given according to their most 5′nucleotide upstream (−) or downstream (+) of the involved RSS.
5D-2) Histograms showing junction nucleotide lengths of TRG rearrangements (TRG tube) in a T-ALL patient, cPT-QC, BC, thymus, and tonsil. Bars are coloured according to the V-J genes combination.
5E-1) Schematic diagram of VD-JD, DD-JD, DD-DD, and VD-DD, VD-JA29 rearrangements, showing the positioning of VD, JD, DD, and JA29 primers, all combined in a single tube. The relative position of the Vd, Dd, and Jd primers is indicated according to their most 5′nucleotide upstream (−) or downstream (+) of the involved RSS.
5E-2) Histograms showing junction nucleotide lengths of TRD rearrangements (TRD tube) in a T-ALL patient, cPT-QC, BC, thymus, and tonsil. Bars are coloured according to the V-D-J genes combination.
In total, 59 human B (n=30) and T (n=29) lymphoid cell lines were obtained from the American Type Culture Collection (ATCC; www.lgcpromochem-atcc.com, Manassas, VA, USA) and the German Collection of Microorganisms and Cell Cultures GmbH (DSMZ; www.dsmz.de, Braunschweig, Germany), or were derived from internal cell line banks. DNA from cultured cell lines was isolated using a phenol-chloroform extraction protocol, followed by ethanol precipitation and elution in Tris ethylenediaminetetra-acetic acid (TE) buffer. Alternatively, DNA was isolated with the GenElute Mammalian Genomic DNA Miniprep Kit (Sigma-Aldrich, St. Louis, MO, USA) according to manufacturer's protocol.
Each of the 59 cell lines was screened for clonal IG/TR gene rearrangements using the aforementioned EuroClonality-NGS assay with 100 ng of DNA (quantified with Qubit 3.0, Thermo Fisher Scientific) from each cell line, without addition of buffy coat (BC). Paired-end sequencing (2×250 bp) was performed on an Illumina MiSeq (Illumina, San Diego, CA, USA) with a final concentration of 7 pM per library aiming for at least 2000 reads per sample. To avoid low-complexity library issues 10% PhiX control was added to each sequencing run.
Additional methods were used to verify the NGS-amplicon-identified cell line rearrangements:
IG/TR rearrangement profiles of all cell lines, as obtained with the different methods, were compared.
Verification of Cell Line-Specific Gene Rearrangements from Human B and T Cell Lines Via ddPCR
For cases with discrepant results between the three methods, IG/TR allele-specific PCR assays were designed for digital droplet PCR (ddPCR) (QX200™ Droplet Digital™ PCR System, Bio-Rad) to verify the respective rearrangement. Absolute quantification of IG/TR gene rearrangements by ddPCR was performed using two different gDNA amounts (50 ng, 100 ng). Each experiment included a polyclonal buffy coat (BC) control and a no-template control.
Allele-specific primers for clonal IG/TR rearrangements and probes for quantification were synthesized by Sigma Aldrich. All primers were cleaned by desalting, while hydrolysis probes containing a 5′-FAM/3′-TAMRA reporter dye were cleaned by HPLC. All oligonucleotides were resuspended in TE buffer at a total strand concentration Ct=100 μM and stored at −20° C. before use.
ddPCR reactions were prepared in a volume of 20 μL using 10 μL of 2× ddPCR SuperMix (Bio-Rad Laboratories, Hercules, CA), testing two different amounts of cell line gDNA (50 ng/500 ng) quantified beforehand with the Qubit dsDNA High Sensitivity Assay Kit (Thermo Fisher Scientific, Waltham, MA), forward primer (FP) and reverse primer (RP), each at a final concentration of 300 nmol/L, and FAM-labelled probes (100 nmol/L). Droplets were generated by the QX200 droplet generator (Bio-Rad) using 20 μL of the reaction mixture and 70 μL of the droplet generation oil for probes (Bio-Rad), loaded into the appropriate wells of a DG8 cartridge (Bio-Rad). About 45 μL of the droplet-oil mixture (12,000-20,000 droplets) were transferred to a 96-well plate (Bio-Rad) and loaded on a DNA Engine Dyad Peltier Thermal Cycler with the following amplification protocol: 95° C. for 10 min, followed by 40 cycles of denaturation at 94° C. for 30 s, annealing at 60° C. for 1 min and extension at 60° C. for 1 min. PCR products were loaded into the QX200 droplet reader and analysed by QuantaSoft Version 1.2 (Bio-Rad Laboratories).
Initially, quantification of DNA of selected B- and T-cell lines was done by Qubit dsDNA High Sensitivity Assay Kit (Thermo Fisher Scientific, Waltham, MA). Quantitative values were checked again by ddPCR-based quantification of the albumin housekeeping gene using 50-200 ng DNA/cell line in order to precisely determine the number of cells per μl of DNA. Primers and probe for albumin quantitation were synthesized by Sigma Aldrich. ddPCR was carried out according to the protocol described above, in duplicates for each cell line. After completion of the PCR, samples were analyzed in the Droplet Reader in terms of number of copies of cell lines per 20 μl reaction volume. Based on the values from the ddPCR, the cell line DNA was diluted in TE buffer down to 400 copies/μl. Thereafter, another ddPCR quantification was performed to check the dilution of each cell line DNA again. Two different volumes of the diluted cell line solution (0.5 μl DNA [200 copies] and 2 μl DNA [800 copies]) were used as input amount. With suitable quantitative values, cell line DNAs were further diluted and mixed with each other leading to 40 copies of each cell line being present in 2 μl of the DNA mixture. This mixture was added to each sample as cIT-QC and subjected to simultaneous library preparation prior to sequencing.
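The dilution arithmetic described above can be illustrated with a minimal sketch. It assumes nine cell line stocks at 400 copies/μl each and a target of 40 copies of each cell line per 2 μl of mixture, as stated in the text. The function name, the dictionary keys and the 1000 μl batch volume are hypothetical and chosen only for illustration.

```python
def cit_qc_mix_plan(stock_copies_per_ul=400.0, n_cell_lines=9,
                    target_copies_per_line=40.0, spike_volume_ul=2.0,
                    final_volume_ul=1000.0):
    """Volumes needed to pool equal-copy stocks into one cIT-QC mixture.

    final_volume_ul is an illustrative batch size, not a value from the text.
    """
    # Concentration each cell line must have in the final mixture (copies/ul).
    target_conc = target_copies_per_line / spike_volume_ul
    # C1 * V1 = C2 * V2, solved for the stock volume V1 of each cell line.
    stock_volume_each = target_conc * final_volume_ul / stock_copies_per_ul
    # Remaining volume is made up with TE buffer.
    buffer_volume = final_volume_ul - n_cell_lines * stock_volume_each
    if buffer_volume < 0:
        raise ValueError("stocks are too dilute for the requested target")
    return {
        "target_copies_per_ul_per_line": target_conc,
        "stock_volume_each_ul": stock_volume_each,
        "te_buffer_ul": buffer_volume,
        "total_copies_per_spike": n_cell_lines * target_copies_per_line,
    }


plan = cit_qc_mix_plan()
for key, value in plan.items():
    print(f"{key}: {value}")
```

Under these assumptions each cell line ends up at 20 copies/μl, so a 2 μl spike carries 40 copies per cell line and 360 cell copies in total.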
Implementation of the cIT-QC
Bioinformatically, cIT-QC reads are identified using an immunogenetic annotation-based approach that is extremely fast while allowing for variations in sequence, avoiding compute-intensive and potentially inaccurate alignment-based approaches. In ARResT/Interrogate, the term ‘spike-ins’ is also used to refer to the cIT-QC.
Regarding QC, identification of at least one read per cIT-QC rearrangement and of at least as many total cIT-QC reads as total cIT-QC cells used is required; otherwise the sample is tagged as “QC-failed” (see below for how this is used in ARResT/Interrogate). The quantification factor, calculated by dividing total cIT-QC cells by total cIT-QC reads, is stored and applied in any case, thus still allowing the user to analyse the sample.
Quantification is based on applying the quantification factor to convert the read counts of a clonotype to cell counts, and then calculating the relative abundance against the total input cells.
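The QC rule and the read-to-cell conversion described above can be sketched in a few lines. This is a minimal illustration and not the ARResT/Interrogate implementation: the function names, the rearrangement labels and all read counts are hypothetical, and the handling of a sample with zero cIT-QC reads (no factor can be computed) is an assumption.

```python
def cit_qc_check(reads_per_rearrangement, total_cit_qc_cells):
    """Apply the two QC conditions and compute the quantification factor.

    reads_per_rearrangement: dict mapping each expected cIT-QC rearrangement
    to its read count (0 if not found).
    Returns (passed, factor), where factor = cIT-QC cells / cIT-QC reads.
    """
    total_reads = sum(reads_per_rearrangement.values())
    every_rearrangement_seen = all(n >= 1 for n in reads_per_rearrangement.values())
    passed = every_rearrangement_seen and total_reads >= total_cit_qc_cells
    # Assumption: with zero cIT-QC reads no factor exists, so None is returned.
    factor = total_cit_qc_cells / total_reads if total_reads else None
    return passed, factor


def clonotype_abundance(clonotype_reads, factor, total_input_cells):
    """Convert clonotype reads to cells, then to a fraction of input cells."""
    return clonotype_reads * factor / total_input_cells


# Hypothetical library: three cIT-QC rearrangements, 360 spiked cell equivalents.
passed, factor = cit_qc_check({"r1": 600, "r2": 900, "r3": 300}, 360)
abundance = clonotype_abundance(30000, factor, 15000)
```

The factor is returned even when the check fails, mirroring the statement that it is stored and applied in any case.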
A cPT-QC composition was prepared that consists of genomic DNA isolated from healthy human thymus, healthy human tonsil and healthy human peripheral blood mononuclear cells (DNA amounts mixed in a ratio 1:1:1). To that end, a (semi-)automated genomic DNA extraction was performed on cell suspensions obtained after dissecting and mincing tissues or Ficoll density blood separation.
The cPT-QC composition is suitably subjected to NGS library preparation alongside the investigated samples. For the EuroClonality-NGS assay, this involves one cPT-QC sample per run, amplified in eight tubes.
Implementation of the cPT-QC
Primers are bioinformatically identified in the reads of each of the eight cPT-QC tubes of the run and their abundances compared to stored cPT-QC reference results using the test of proportions.
Stored reference results are the output of ARResT/Interrogate from the analysis of a cPT-QC sample. These results should be confirmed through replicate runs over time in each lab to account for technical variability. The results and not the raw data are stored to ensure that the bioinformatic analysis is not compromised inadvertently by the user; this means that the results are updated with every major release of ARResT/Interrogate to ensure compatibility with new runs.
Issues with abundances of particular primers or a specific primer set are used to tag the corresponding cPT-QC samples plus all user samples of the same primer set as “QC-failed”.
As reproducibility is important for a QC of this type, replicate runs of the cPT-QC were performed. Relative abundances of 5′ primers were compared employing the test of proportions.
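A comparison of primer abundances by a test of proportions might look like the sketch below. The text does not specify which variant of the test is used, so a two-sided pooled two-proportion z-test is assumed here; the function names, primer labels and read counts are hypothetical, and the threshold is passed in explicitly.

```python
import math


def two_proportion_p_value(count_a, total_a, count_b, total_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    p_a = count_a / total_a
    p_b = count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 1.0  # both proportions are 0 or both are 1: no difference
    z = (p_a - p_b) / se
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)); underflows to 0.0 for huge |z|
    return math.erfc(abs(z) / math.sqrt(2))


def flag_primers(run_counts, reference_counts, threshold):
    """Return primers whose relative abundance differs from the reference."""
    run_total = sum(run_counts.values())
    ref_total = sum(reference_counts.values())
    flagged = []
    for primer in sorted(reference_counts):
        p = two_proportion_p_value(run_counts.get(primer, 0), run_total,
                                   reference_counts[primer], ref_total)
        if p < threshold:
            flagged.append(primer)
    return flagged


# Hypothetical primer set in which primer "P1" was left out of the run.
reference = {"P1": 50000, "P2": 50000}
run = {"P1": 0, "P2": 100000}
flagged = flag_primers(run, reference, threshold=1e-200)
```

In this toy case the remaining primer is flagged as well, because its relative abundance rises when the other primer is missing.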
To assess the usability of the cPT-QC to detect problems with primer performance, artificial perturbations of primer concentrations were created to simulate omitting a primer during pipetting or pipetting the wrong primer concentration.
First, the 5′ primer usage was analysed in a cPT-QC sample and two primers of differing abundances were selected from each primer set (the intron-Kde primer set, which has only two primers, was skipped): IGH-VJ-FR1-M-1, IGH-V-FR1-O-1; IGH-D-B-1, IGH-D-E-1; IGK-V-G-1, IGK-V-I-1; TRB-V-AD-1, TRB-V-G-1; TRB-D-A-1, TRB-D-B-1; TRG-V-F-1, TRG-V-E-1; TRD-D-A-1, TRD-V-B-1. These primers were perturbed by fully excluding them from the primer pool, by reducing their concentration to 10%, or by increasing it to 200%. Relative abundances of 5′ primers were compared between these perturbed sets and cPT-QC employing the test of proportions.
A dataset was created to evaluate and showcase the aforementioned concepts and functionalities, which consists of the following samples:
The diagnostic samples and the cPT-QC were run with all primer sets, while the aplastic follow-up samples were only run with the corresponding primer sets, i.e. the IG sets for samples with B cell aplasia, and the TR sets for samples with T cell aplasia (as depicted in
Primer Performance Assessment Using the cPT-QC
The tests of proportions of 5′ primer relative abundances, applied to the cPT-QC, BC, their replicates, and to the libraries with primer perturbations, showed that there is a clear difference in p-values between sets of un-perturbed and perturbed primers. In other words, the p-values of the differences in abundance of the perturbed primers are noticeably lower. Table 2 presents a simplified view of the results, focusing on the abundances of perturbed primers plus at least one other un-perturbed primer per primer set either to show their normal behavior or discuss their abnormal behavior.
Table 2. Percentage abundances of 5′ primers across all primer sets. Top group of primers were perturbed; bottom group is a selection of primers that were left un-perturbed: one per primer set selected alphabetically, plus two examples where the primer behavior is of interest to the discussion (see text). Results are shown: in cPT-QC replicates (third column); against samples where primers were excluded (“0%”, fourth column), reduced to 10% (fifth column), increased to 200% (sixth column). Changes in percentages (indicated separately and as +/−) that led to the test of proportions.
At a p-value threshold of 1e-200, none of the primers are flagged in the cPT-QC, which highlights the reproducibility of the assay, while all the perturbed primers are flagged in the perturbed scenarios. In fact, the lowest p-value in the normal samples is 7.86e-142 for primer TRD-V-A-1 (Table 2), compared to multiple zero values in the perturbed comparisons (with a few exceptions, mainly for the 200% perturbation). Significant changes in abundance were also visible in other table cells, most likely because those primers were indirectly affected by perturbations of other primers: a primer can “take over” when an initially abundant primer is excluded, such as IGH-V-FR1-D-1 when IGH-VJ-FR1-M-1 is perturbed either way, especially since these primers amplify partially overlapping lists of genes.
Information on the in silico quality control based on both the cPT-QC and cIT-QC is available in ARResT/Interrogate, with “QC-failed” samples excluded by default to warn and prevent the user from their unintended use. However, the user is notified and has the option to include them back in the analysis.
Generic quality control is also performed on samples, specifically to check for low number of raw reads and low percentage of reads with an identified junction. Such samples are also tagged as “QC-failed”.
Abundances of lymphocyte subpopulations are frequently not available for samples of patients with lymphoid malignancies. Furthermore, as IG/TR NGS only reflects the relative representation of the rearrangements, it was important to establish a calibrator that allows sequencing reads to be normalised to the number of input cells. This is particularly important for tubes that exclusively cover rearrangements present only in a minority of lymphoid cells (especially the TRD and intron-Kde tubes). TRD genes are not rearranged in normal B cells and are deleted in most TRαβ cells. Therefore, oligoclonal TCRγδ T cells might give rise to dominant clonotypes in the TRD NGS assay, in particular as the normal TCRγδ T cell repertoire is strikingly skewed during childhood. Here the cIT-QC-based abundance correction is of utmost importance to avoid misassignment of (minor) clonal TRD rearrangements from minor TCRγδ cell populations as leukemic rearrangements that would then serve as markers in further MRD analysis.
Analysis of the test dataset showed the utility of the cIT-QC in marker identification and quantification. Without the cIT-QC, both diagnostic and aplastic samples seem to be oligoclonal if simply based on the number of reads (
On the other hand, and as expected, in the diagnostic samples cIT-QC sequences constitute a minority. With the cIT-QC, the abundance of a given rearrangement can therefore be determined much more accurately and recalculated to cell abundances.
Additionally, five experienced EuroMRD ALL reference laboratories performed IG/TR NGS in 50 diagnostic ALL samples, and compared results with those generated through routine IG/TR Sanger sequencing. A cPT-QC composition was used to monitor primer performance, and a cIT-QC composition was spiked into each sample as a library-specific quality control and calibrator. NGS identified 259 (average 5.2/sample, range 0-14) clonal sequences vs. Sanger-sequencing 248 (average 5.0/sample, range 0-14). The overall concordance between Sanger and NGS, including negative libraries, was 78%.
This example describes the development and design of an IG/TR assay, including bioinformatics, and its validation for MRD marker identification in acute lymphoblastic leukemia (ALL). Five EuroMRD ALL MRD reference laboratories performed IG/TR NGS in 50 diagnostic ALL samples, and compared results with those generated through routine IG/TR marker screening and Sanger sequencing. A cPT-QC composition was used to monitor primer performance, and a cIT-QC composition was spiked into each sample as a library-specific quality control and calibrator. The overall workflow of the validation study is shown in
With the objective of developing a universal amplicon-based NGS approach for IG/TR sequence analysis at the DNA level, applicable in all lymphoid malignancies, assays for multiple IG/TR loci were designed for: IG heavy (IGH), IG kappa (IGK), TR beta (TRB), TR gamma (TRG), and TR delta (TRD), including complete and incomplete rearrangements whenever applicable. IG lambda (IGL) was excluded due to its limited complementarity to other IG loci and its reduced diversity. TR alpha (TRA) was excluded due to its high complexity, hampering a reasonable multiplex PCR approach at the DNA level.
The IGH locus is rearranged in two steps (
The TRB locus also features a two-step process with initial formation of incomplete TRB-DJ rearrangements followed by complete TRB-VJ rearrangements. Incomplete and complete TRB rearrangements were designed to be detected in two separate multiplex PCR reactions (FIG. 5C). As TRG locus rearrangements are one-step VJ recombinations involving a limited number of TRGV and TRGJ genes, a single multiplex assay could be developed (
Primers were designed to be gene-specific, but in case of allelic variants, degenerate primers were designed to facilitate multiplexing. For the same reason, single mismatches in the middle or at the 5′-end of the primer were accepted. Table 3 shows the primer sequences comprising nucleotide sequences of
GTAAAACGACGGCCAGTTCGCTTCTCACCTGAATGCCC
GTAAAACGACGGCCAGTCTCAGTTGAAAGGCCTGATGGA
GTAAAACGACGGCCAGTGGAAGCATCCCTGATCGATTCT
GTAAAACGACGGCCAGTTCAGCTAAGTGCCTCCCAAATT
GTAAAACGACGGCCAGTAGTTCCAAATCGCTTCTCACCT
GTAAAACGACGGCCAGTTTCCCTAATCGATTCTCAGGGC
GTAAAACGACGGCCAGTTACAACTGCCAAAGGAGAGGTC
GTAAAACGACGGCCAGTTAAAGGAGAAGTCCCGAATGGC
GTAAAACGACGGCCAGTGGAGAAGTTCCCAATGGCTACA
GTAAAACGACGGCCAGTATAAAGGAGAAGTCCCCGATGG
GTAAAACGACGGCCAGTCTCTAGATGATTCGGGGATGCC
GTAAAACGACGGCCAGTTGAAGCAGACACCCCTGATAAC
GTAAAACGACGGCCAGTTGAGCGATTTTTAGCCCAATGC
GTAAAACGACGGCCAGTACAAAGGAGAGATCTCTGATGGA
TAATACGACTCACTATAGGGCTACAACTGTGAGTCTGGTGCC
TAATACGACTCACTATAGGGCTACAACGGTTAACCTGGTCC
TAATACGACTCACTATAGGGTACAACAGTGAGCCAACTTCCC
TAATACGACTCACTATAGGGCAAGACAGAGAGCTGGGTTCC
TAATACGACTCACTATAGGGCTAGGATGGAGAGTCGAGTCCC
TAATACGACTCACTATAGGGCTGTCACAGTGAGCCTGGTC
TAATACGACTCACTATAGGGCCTTCTTACCTAGCACGGTGAG
TAATACGACTCACTATAGGGTTACCCAGTACGGTCAGCCTAG
TAATACGACTCACTATAGGGCTTACCGAGCACTGTCAGCC
TAATACGACTCACTATAGGGCTTACCCAGCACTGAGAGCC
TAATACGACTCACTATAGGGTCACCGAGCACCAGGAGCC
TAATACGACTCACTATAGGGGAATCTCACCTGTGACCGTGAG
GTAAAACGACGGCCAGTGGAAACTTCCCTGGTCGATTC
GTAAAACGACGGCCAGTCAACGATCGGTTCTTTGCAGTC
GTAAAACGACGGCCAGTTAAATCAGGGCTGCTCAGTGAT
GTAAAACGACGGCCAGTCAGTGATCGGTTCTCTGCAGAG
GTAAAACGACGGCCAGTCTTGAACGATTCTCCGCACAAC
GTAAAACGACGGCCAGTCCGAGGATCGATTCTCAGCTAA
GTAAAACGACGGCCAGTGCCAAAGGAACGATTTTCTGCT
GTAAAACGACGGCCAGTAGGGAGATGTTCCTGAAGGGTA
GTAAAACGACGGCCAGTCCTGAGGGGTACAGTGTCTCTA
GTAAAACGACGGCCAGTCAGAATCTCTCAGCCTCCAGAC
GTAAAACGACGGCCAGTACTTCCCTGATCGATTCTCAGC
GTAAAACGACGGCCAGTCTCAGGTCACCAGTTCCCTAAC
GTAAAACGACGGCCAGTCCTAGATTTTCAGGTCGCCAGT
GTAAAACGACGGCCAGTCTCAACTAGACAAATCGGGGCT
GTAAAACGACGGCCAGTATCGATTTTCTGCAGAGAGGCT
GTAAAACGACGGCCAGTCGGTATGCCCAACAATCGATTC
GTAAAACGACGGCCAGTCTGAAGGGTACAGCGTCTCTC
GTAAAACGACGGCCAGTTCCTCTGAGTCAACAGTCTCCA
GTAAAACGACGGCCAGTCTGAGGCCACATATGAGAGTGG
TAATACGACTCACTATAGGGGAAAACTCACCCAGCACGGTC
TAATACGACTCACTATAGGGTCACCCAGCACGGTCAGCC
GTAAAACGACGGCCAGTGATTCTCAGGTCTCCAGTTCCC
GTAAAACGACGGCCAGTTACCACTGGCAAAGGAGAAGTC
GTAAAACGACGGCCAGTCAAAGGAGAAGTCTCAGATGGC
GTAAAACGACGGCCAGTTTTCTCATCAACCATGCAAGCC
GTAAAACGACGGCCAGTGGAGATGCACAAGAAGCGATTC
TAATACGACTCACTATAGGGCTACAACTGTGAGTCTGGTGCC
TAATACGACTCACTATAGGGCTACAACGGTTAACCTGGTCC
TAATACGACTCACTATAGGGTACAACAGTGAGCCAACTTCCC
TAATACGACTCACTATAGGGCAAGACAGAGAGCTGGGTTCC
TAATACGACTCACTATAGGGCTAGGATGGAGAGTCGAGTCCC
TAATACGACTCACTATAGGGCTGTCACAGTGAGCCTGGTC
TAATACGACTCACTATAGGGCCTTCTTACCTAGCACGGTGAG
TAATACGACTCACTATAGGGTTACCCAGTACGGTCAGCCTAG
TAATACGACTCACTATAGGGCTTACCGAGCACTGTCAGCC
TAATACGACTCACTATAGGGCTTACCCAGCACTGAGAGCC
TAATACGACTCACTATAGGGTCACCGAGCACCAGGAGCC
TAATACGACTCACTATAGGGGAATCTCACCTGTGACCGTGAG
GTAAAACGACGGCCAGTCCTCCACTCCCCTCAAAGGA
GTAAAACGACGGCCAGTCAGACTAACCTCTGCCACCTG
TAATACGACTCACTATAGGGGAAAACTCACCCAGCACGGTC
TAATACGACTCACTATAGGGTCACCCAGCACGGTCAGCC
GTAAAACGACGGCCAGTCAAGCATGAGGAGGAGCTGGAAATTG
GTAAAACGACGGCCAGTACGTCTACATCCACTCTCACC
GTAAAACGACGGCCAGTGCACAAGGAACAACTTGAGATTG
GTAAAACGACGGCCAGTTGGAAGCACAAGGAAGAACTTGAGAA
GTAAAACGACGGCCAGTGCACAGGGAAGAGCCTTAAATT
GTAAAACGACGGCCAGTCAGGAGGTGGAGCTGGATATT
GTAAAACGACGGCCAGTCTCTCACTTCAATCCTTACCATCAA
GTAAAACGACGGCCAGTGCTCACACTTCCACTTCCACTTTGAAAATAAAGT
TAATACGACTCACTATAGGGAGTGTTGTTCCACTGCCAAAG
TAATACGACTCACTATAGGGGTTCCGGGACCAAATACCTTG
TAATACGACTCACTATAGGGGAGCTTAGTCCCTTCAGCAAATA
TAATACGACTCACTATAGGGCCTAGTCCCTTTTGCAAACG
GTAAAACGACGGCCAGTGAATGCAAAAAGTGGTCGCTATTC
GTAAAACGACGGCCAGTTGCAAAGAACCTGGCTGTACT
GTAAAACGACGGCCAGTTGCAGATTTTACTCAAGGACGG
GTAAAACGACGGCCAGTGCAAAATGCAACAGAAGGTCG
GTAAAACGACGGCCAGTGATAAAAATGAAGATGGAAGATTCACTGT
GTAAAACGACGGCCAGTCTCCTTCAATAAAAGTGCCAAGC
GTAAAACGACGGCCAGTATTGAAAAGAAGTCAGGAAGACTAAGT
GTAAAACGACGGCCAGTTCCAGAAAGCAGCCAAATCC
GTAAAACGACGGCCAGTAGGGGTATTGTGGATGGCAG
TAATACGACTCACTATAGGGTTCCACAGTCACACGGGT
TAATACGACTCACTATAGGGGGTTCCACGATGAGTTGTGTT
TAATACGACTCACTATAGGGCACGAAGAGTTTGATGCCAGT
TAATACGACTCACTATAGGGGTTGTTGTACCTCCAGATAGGTT
TAATACGACTCACTATAGGGTGGCTAGAAACACTTACTTGCA
TAATACGACTCACTATAGGGCCCAGGGAAATGGCACTTTTG
GTAAAACGACGGCCAGTGATTCYGAACAGCCCCGAGTCA
GTAAAACGACGGCCAGTGATTTTGTGGGGGYTCGTGTC
GTAAAACGACGGCCAGTGTTTGRRGTGAGGTCTGTGTCA
GTAAAACGACGGCCAGTGTTTRGRRTGAGGTCTGTGTCACT
GTAAAACGACGGCCAGTCTTTTTGTGAAGGSCCCTCCTR
GTAAAACGACGGCCAGTGTTATTGTCAGGSGRTGTCAGAC
GTAAAACGACGGCCAGTGTTATTGTCAGGGGGTGYCAGRC
GTAAAACGACGGCCAGTGTTTCTGAAGSTGTCTGTRTCAC
TAATACGACTCACTATAGGGCTTACCTGAGGAGACGGTGACC
GTAAAACGACGGCCAGTGCAGTCTGGAGCAGAGGTGAAAA
GTAAAACGACGGCCAGTGAGGTGCAGCTGTTGGAGTC
GTAAAACGACGGCCAGTCAGTGGGGCGCAGGACTGTT
GTAAAACGACGGCCAGTCCAGGACTGGTGAAGCCTCC
GTAAAACGACGGCCAGTCCTCAGTGAAGGTTTCCTGCAAGG
GTAAAACGACGGCCAGTAAACCCACAGAGACCCTCACGCTGAC
GTAAAACGACGGCCAGTCTGGGGGGTCCCTGAGACTCTCCTG
GTAAAACGACGGCCAGTCTTCACAGACCCTGTCCCTCACCTG
GTAAAACGACGGCCAGTTCGCAGACCCTCTCACTCACCTGTG
TAATACGACTCACTATAGGGCTTACCTGAGGAGACGGTGACC
TAATACGACTCACTATAGGGCTCACCTGAGGAGACGGTGACC
GTAAAACGACGGCCAGTCTGGGGCTGAGGTGAAGAAG
GTAAAACGACGGCCAGTTCACCTTGAAGGAGTCTGGTCC
GTAAAACGACGGCCAGTAGGTGCAGCTGGTGGAGTC
GTAAAACGACGGCCAGTCCAGGACTGGTGAAGCCTTC
GTAAAACGACGGCCAGTGTACAGCTGCAGCAGTCAGG
GTAAAACGACGGCCAGTGCTGGTGCAATCTGGGTCTG
GTAAAACGACGGCCAGTAAGTGGGGTCCCATCAAGGTTCAG
GTAAAACGACGGCCAGTAGTCCCATCTCGGTTCAGTGGCAG
GTAAAACGACGGCCAGTGAAACAGGGGTCCCATCAAGGTTC
GTAAAACGACGGCCAGTTCCCAGACAGATTCAGTGGCAGTG
GTAAAACGACGGCCAGTCTGGAGTGCCAGATAGGTTCAGTG
GTAAAACGACGGCCAGTCCCTGGAGTCCCAGACAGGTTCAG
GTAAAACGACGGCCAGTGCATCCCAGCCAGGTTCAGTG
GTAAAACGACGGCCAGTGTCCCTGACCGATTCAGTGGCA
GTAAAACGACGGCCAGTAATCCCACCTCGATTCAGTGGC
GTAAAACGACGGCCAGTCTCAGGGGTCCCCTCGAGGTT
GTAAAACGACGGCCAGTAGACACTGGGGTCCCAGCCA
TAATACGACTCACTATAGGGGCAGCTGCAGACTCATGAGGAG
TAATACGACTCACTATAGGGACGTTTGATCTCCACCTTGGTCCC
TAATACGACTCACTATAGGGACGTTTGATATCCACTTTGGTCCC
TAATACGACTCACTATAGGGACGTTTAATCTCCAGTCGTGTCCC
GTAAAACGACGGCCAGTGAGTGGCTTTGGTGGCCATGC
TAATACGACTCACTATAGGGCAGCTGCAGACTCATGAGGAG
Primer331, Primer Digital (PrimerDigital Ltd, Helsinki, Finland), MFEprimer-2.032 and Oligo (Molecular Biology Insights, Inc., Colorado, USA) were used for checking primer specificity and multiplexing. The following primer design criteria were applied for all loci: primer melting temperature 57-63° C.; comparable size of final amplicon; primer length 20-24 nt; avoidance of primer dimers; minimal distance of the 3′ primer end to the junctional region of, preferably, >10-15 bp to avoid false negativity for rearrangements with larger nucleotide deletions from the germline sequence; avoidance of regions with known single nucleotide polymorphisms to allow identical primer annealing for all alleles of the respective V, D or J genes; targeting of, preferably, all V, D and J genes known to be rearranged plus the intronRSS and Kde regions for IGK.
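As an illustration of how the length criterion could be checked against the listed primers, the sketch below splits a tailed primer into its tail and gene-specific part and reports the length and degeneracy of the latter. It is a hedged example, not the design workflow used: the two tail sequences are read off the common prefixes of the listed primers, labelling the first as the universal tail is an inference, and the function names are hypothetical. The check only reports values and makes no claim that every listed primer falls within the 20-24 nt range.

```python
UNIVERSAL_TAIL = "GTAAAACGACGGCCAGT"   # common prefix of one group of listed primers
T7_TAIL = "TAATACGACTCACTATAGGG"       # common prefix of the other group

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def split_tail(primer):
    """Return (tail name, gene-specific part); tail name is None if untailed."""
    for name, tail in (("universal", UNIVERSAL_TAIL), ("T7", T7_TAIL)):
        if primer.startswith(tail):
            return name, primer[len(tail):]
    return None, primer


def degeneracy(seq):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in seq:
        n *= len(IUPAC[base])
    return n


def length_ok(seq, shortest=20, longest=24):
    """Check the 20-24 nt length criterion on the gene-specific part."""
    return shortest <= len(seq) <= longest


def describe(primer):
    tail, specific = split_tail(primer)
    return {"tail": tail, "length": len(specific),
            "length_ok": length_ok(specific), "degeneracy": degeneracy(specific)}


# Two primers copied from the listing above.
plain = describe("GTAAAACGACGGCCAGTTCGCTTCTCACCTGAATGCCC")
degenerate = describe("GTAAAACGACGGCCAGTGTTTGRRGTGAGGTCTGTGTCA")
```

Melting temperature is deliberately left out, since the text does not state which formula or tool settings were used to compute it.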
Following in silico design, primers were first tested in monoplex and multiplex reactions using primary patient samples or cell lines with defined rearrangements. In occasional cases where no such samples were available, healthy tonsil or mononuclear DNA samples were employed. Oligoclonal template pools were then created from mixtures of rearranged cell lines and diagnostic samples with defined rearrangements covering many different V, D and/or J genes. Alternatively, for some loci, plasmid pools were produced, covering as many different rearrangements as possible. These multi-target pools allowed fine-tuning of reaction conditions and/or primer concentrations to assess comparable amplification efficiencies. This iterative process of testing also led to a reduction of primers if these appeared redundant. Further multicentre testing was performed with a limited number of monoclonal and poly/oligoclonal samples and on different sequencing platforms, which allowed assessment of robustness of the primer mixes and protocols.
Since the assays were designed to be platform-independent, a two-step PCR was employed, which makes it possible to switch the sequencing adaptors and to reduce the total number of primers even if a large number of barcodes is necessary. Also, maximal amplicon lengths were defined with respect to the possible maximal sequencing read lengths of current sequencers. PCR conditions were optimized with the aim to find optimal conditions common for all reactions, thus allowing for parallel library preparation. Various numbers of PCR cycles in 1st and 2nd PCR, different polymerases and several library purification methods were tested and compared.
Although this study was exclusively performed on the Illumina MiSeq, the applicability of the same PCR panel on the IonTorrent instrument (Thermo Fisher Scientific) was tested in a single-centre setting, and a one-step Illumina MiSeq PCR approach was also tested in a single-centre setting.
Five experienced EuroClonality-NGS laboratories tested the robustness and applicability of the optimized assays for IG/TR marker identification in ALL in comparison to standard techniques. All laboratories (Bristol/London, Paris, Monza, Prague and Kiel) are members of the EuroMRD consortium and reference laboratories for ALL MRD analysis. Each of the participating laboratories performed NGS-based IG/TR MRD marker identification in 10 patients with B- or T-lineage ALL. A central standard operating procedure was strictly followed by all laboratories. The study was executed using the Illumina MiSeq (2×250 bp v2 kit). NGS analyses were performed fully in parallel to conventional PCR plus Sanger sequencing of clonal products following standard guidelines11. For a part of the cases with unexplained discrepant results between the two methods, allele-specific PCR assays (either for digital droplet PCR or real-time quantitative PCR) were designed to clarify if the respective clonal rearrangement represented the leukemic bulk. EuroMRD guidelines were used to design and interpret allele-specific PCR assays33,34.
Based on the results of the testing and validation phases, the final IG/TR primer mixes consist of eight tubes with 92 forward and 30 reverse primers (15 of the latter being used in pairs of different tubes). Primer positions and sequences are presented in
Quality control of robust amplification, library preparation and sequencing is of utmost importance for these complex assays. Different primers need to work under the same reaction conditions, while additional variability can be introduced by sample characteristics and sequencing. Primer performance has to be monitored longitudinally, and for the exact estimation of clonal abundance it is important to correct for the number of sequencing reads per input molecule.
To address these issues, two types of quality control compositions were included: (i) the cIT-QC of Example 1 was spiked into each tube as library control and calibrator, and (ii) the cPT-QC of Example 2 was run in parallel to monitor general primer performance and sequencing.
Primers were tailed with universal and T7-linker sequences, and divided over eight tubes (IGH-VJ, IGH-DJ, IGK-VJ-Kde, intron-Kde, TRB-VJ, TRB-DJ, TRG, TRD). The PCR protocol is summarized in Table 4. Sequencing libraries were prepared via a two-step PCR, each using a final reaction volume of 50 μl with 100 ng diagnostic DNA and 10 ng of polyclonal DNA. For the cIT-QC, genomic DNA of 40 cell equivalents of each of the 9 different cell lines was spiked into all samples. MgCl2 was intended to be used at a final concentration of 1.5 mM, but needed optimization for some tubes. Therefore, master-mixes for the 1st PCR were tube-specific, but the temperature profile was uniform for all tubes.
After the 1st round of PCR, gel electrophoresis was performed to check for the successful amplification of all targets. For TRB, gel extraction of the specific PCR products was performed prior to the 2nd PCR.
All first-round PCR products except those for TRB were diluted 1:50 unless amplicons were very weak. The TRB PCR products and PCR products with weak amplicons were used undiluted. Master-mixes for the 2nd PCR and the temperature profiles were identical for all tubes (Table 4). Primers for the 2nd PCR contained sequencing adaptors and sequencing indexes (barcodes). A unique combination of forward and reverse indexes was used for each library. Three μl of undiluted TRB PCR products and 1 μl of 1:50-diluted IGH, IGK, TRG, and TRD PCR products were amplified in the 2nd PCR.
Following 2nd PCR, products from all samples of a run were pooled in equimolar ratios into 8 tube-wise subpools and purified by gel-extraction (see Table 5 for the amplicon lengths). Finally, the subpools were pooled equimolarly into one final pool. Sequencing was performed on Illumina MiSeq sequencers, using 2×250 bp v2 chemistry with a final concentration of 7 pM for the amplicon library and 10% PhiX control added to avoid low-complexity library issues.
ARResT/Interrogate was the main bioinformatics platform used in this study, along with Vidjil47 and IMGT48 resources for specific aspects of this work. Demultiplexing was performed accepting no mismatches. Reads were annotated with EuroClonality-NGS primer sequences (to trim non-amplicon sequence, and for the cPT-QC-based quality control), paired-end joined, dereplicated, immunogenetically annotated48, and classified into rearrangement types (complete and incomplete, and other special types like intron-Kde rearrangements), or “junction classes”. Reads with no rearrangement were excluded from the total read count used for relative abundances.
cIT-QC sequences described above were identified in the data through their immunogenetic annotation. Their counts served both as ‘in-tube’ control and for normalization per primer set: total cIT-QC cells are divided by total cIT-QC reads, the resulting factor is used to convert rearrangement reads to cells, and those cells are divided by total input cells (15,000 in this example). Identified IG/TR sequences were defined as index sequences if their abundance after cIT-QC normalisation exceeded 5%. ARResT/Interrogate can track the DNJ 3′ stem of a junction, the sequence remaining stable during IGH or TRB clonal evolution in case of V-replacement or ongoing V to DJ rearrangements. The stem consists of the last ≤3 nt of D (or of the NDN if no D is identifiable), any and all of N2 nucleotides, and the J nucleotides of the junction. This stem is available as a separate immunogenetic feature across all samples and is thus able to link other features, e.g. clonotypes.
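The construction of the DNJ stem and its use to link clonotypes can be sketched as below. This is a hypothetical illustration and not the ARResT/Interrogate code: it assumes that the junction has already been split into annotated segments, that at most three nucleotides are taken from the 3′ end of D (fewer if D is shorter), and that the NDN region is used in place of D when no D is identifiable. All names and example sequences are invented.

```python
from collections import defaultdict


def dnj_stem(d_nt, n2_nt, j_nt, ndn_nt="", d_tail=3):
    """Build the stem: up to d_tail nt from the end of D (or NDN), then N2, then J."""
    anchor = d_nt if d_nt else ndn_nt
    start = max(len(anchor) - d_tail, 0)
    return anchor[start:] + n2_nt + j_nt


def link_by_stem(clonotypes):
    """Group clonotype names that share the same stem.

    clonotypes: dict mapping a clonotype name to a dict with the keys
    "d", "n2", "j" and optionally "ndn" (annotated junction segments).
    """
    groups = defaultdict(list)
    for name, seg in clonotypes.items():
        stem = dnj_stem(seg["d"], seg["n2"], seg["j"], seg.get("ndn", ""))
        groups[stem].append(name)
    return dict(groups)


# Invented example: two clonotypes that differ upstream of the stem
# (not modelled here) but share the end of D, N2 and J.
example = {
    "dominant": {"d": "GGTATAGCAGC", "n2": "TT", "j": "ACTACTTTGACTAC"},
    "evolved": {"d": "TAGCAGC", "n2": "TT", "j": "ACTACTTTGACTAC"},
    "unrelated": {"d": "", "ndn": "CCGTA", "n2": "", "j": "CTGG"},
}
linked = link_by_stem(example)
```

Grouping by stem rather than by full junction is what allows clonotypes with differing upstream sequence to be treated as related.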
Next, fifty ALL diagnostic samples (29 BCP-ALL and 21 T-ALL) were analysed for the multicentre validation study. Each of the five participating laboratories received preconfigured 96-well plates containing the different multiplexed NGS primer combinations per target (
In summary, 96 libraries were generated per lab (total of 480 libraries), and sequenced with a total output of 47M reads (~9.2M/lab). Centralised analysis was performed with ARResT/Interrogate43 using IMGT germline sequences48; further analyses and verifications were performed with Vidjil47 and IMGT/V-QUEST48.
Overall, 311 clonal IG/TR rearrangements (clonotypes) were identified, with a mean of 5.2 (0-14)/sample by NGS (a 5% threshold was applied for NGS after cIT-QC-based normalization) vs. 5.0 (0-14)/sample by Sanger, while 217 (45%) libraries demonstrated no clonotypes above threshold by either method. A total of 196/311 (63%) clonotypes were fully concordant between NGS and Sanger (
Conversely, 52 clonal IG/TR rearrangements were only detected by Sanger when the 5% NGS threshold was applied: for 5 sequences (1 TRG, 2 TRB-VJ, and 2 IGH-DJ) the relevant primer was not present in the NGS primer set, and in 12 cases no explanation was found for the discrepancy. However, in the majority of discordant cases (35/52) the Sanger-identified sequences (7 TRD, 8 TRB-VJ, 6 TRG, 4 TRB-DJ, 2 IGK-VJ-Kde, 5 IGH-VJ, 3 IGH-DJ) were also detectable by NGS, but with an abundance below 5%. In 36/39 q/ddPCR-evaluated cases the rearrangement was confirmed by ASO-PCR, including all low NGS-positive sequences; in 14/36 cases this was at a subclonal level. Overall concordance between Sanger and NGS, including negative libraries, was 78%.
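The reported counts can be cross-checked with a short tally. The sketch assumes that every clonotype falls into exactly one of three groups (found by both methods, by NGS only, or by Sanger only); the NGS-only count is derived under that assumption and is not stated in the text. The function name is hypothetical.

```python
def tally(total, concordant, sanger_only, n_samples, n_libraries, negative_libraries):
    """Derive per-method totals and percentages from the reported counts."""
    ngs_only = total - concordant - sanger_only
    ngs_total = concordant + ngs_only
    sanger_total = concordant + sanger_only
    return {
        "ngs_only": ngs_only,
        "ngs_total": ngs_total,
        "sanger_total": sanger_total,
        "ngs_per_sample": round(ngs_total / n_samples, 1),
        "sanger_per_sample": round(sanger_total / n_samples, 1),
        "concordant_pct": round(100 * concordant / total),
        "negative_library_pct": round(100 * negative_libraries / n_libraries),
    }


# Counts taken from the text: 311 clonotypes, 196 concordant, 52 Sanger-only,
# 50 samples, 480 libraries, 217 libraries without clonotypes above threshold.
result = tally(311, 196, 52, 50, 480, 217)

# Breakdown of the 52 Sanger-only rearrangements as listed in the text.
sanger_only_breakdown = 5 + 12 + 35
below_threshold_by_locus = 7 + 8 + 6 + 4 + 2 + 5 + 3
```

Under this assumption the derived totals agree with the 259 NGS and 248 Sanger clonal sequences given in the summary of this example.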
In 12/29 B-lineage ALL samples the evolution of the dominant clonal IGH sequence was identified employing ARResT/Interrogate. The evolved clonotypes shared the DNJ stem with the dominant one, but the VND part of the rearrangement differed.
The assay performance was also analysed by standardized evaluation of QC samples (cIT-QC and cPT-QC). This showed a remarkably high intra- and inter-lab consistency without statistically significant differences between the five labs.
During the process of multicentre validation, suitable modifications of the SOP were tested in particular laboratories as parallel actions.
One-step versus two-step PCR: The EuroClonality-NGS working group decided to use two-step PCR to enable switching of sequencing adaptors and to limit the total number of required primer batches even if a large number of barcodes is necessary. As first round PCR products are not barcoded, identification of contamination phenomena is hampered in this approach. Therefore, a one-step PCR was tested in a single center (Paris) as an alternative for laboratories that are able to maintain higher numbers of different primer batches. The one-step approach reduces the risk of contamination and thus favours use of NGS not only for marker identification, but also for MRD assessment. The standard operating procedures are shown in supplementary information.
Bead extraction: In our single target evaluation and validation phase, gel extraction of the specific TRB amplicons turned out to lead to more specific libraries compared to bead extraction. However, gel extraction is not used in all laboratories; therefore, in a later phase of the study, bead purification of all libraries was also tested. Optimization of the purification processes led to comparable ratios of specific reads irrespective of the type of library purification.
Withdrawal of addition of polyclonal DNA to reaction mix: Polyclonal DNA was added to each reaction in order to prevent excessive primer dimer formation in samples lacking particular rearrangements. The addition of polyclonal DNA, however, alters the composition of polyclonal background of the samples and hampers the analysis of the immune repertoire. We therefore performed testing on 4 samples with B and 4 samples with T cell aplasia and showed that addition of cIT-QC is sufficient to prevent the excessive formation of unspecific PCR products.
Number | Date | Country | Kind |
---|---|---|---|
19163837.8 | Mar 2019 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/NL2020/050181 | 3/18/2020 | WO |