Autism spectrum disorder (ASD) is a childhood-onset neurodevelopmental disorder characterized by cognitive, motor, and sensory deficits. ASD has a strong genetic component, with risk contribution from hundreds of genes. Individuals bearing mutations in the same gene can also present differing severity of clinical manifestations, likely reflecting a modulatory effect of the overall genomic context in which risk genes act. The shared developmental effects that cause this large and heterogeneous collection of genes to converge on the phenotypic features of ASD, whether they act through shared or distinct pathways, and how they may be influenced by genetic and epigenetic context, remain poorly understood.
These questions have been difficult to address in part due to the lack of experimental models of human brain development, a process that largely occurs in utero and is therefore poorly accessible for study. The emergence of human brain organoids represents a major advance in modeling the cellular complexity of the developing human brain in vitro. In combination with single-cell profiling methods, these models have the potential to enable the unbiased identification of the cell types, gene programs, developmental events, and molecular mechanisms that are affected by ASD risk-associated genes, across a diversity of human genomic contexts.
Genetic risk for autism spectrum disorder (ASD) has been associated with hundreds of genes spanning a wide range of biological functions. The phenotypic alterations in the human brain resulting from mutations in ASD risk genes remain unclear, and the level at which these alterations converge on shared disease pathology is poorly understood. Furthermore, the phenotypic manifestation of mutations in disease-associated genes varies across individuals. As described herein, reproducible organoid models of the human cerebral cortex were leveraged to identify cell type-specific developmental abnormalities associated with haploinsufficiency in ASD risk genes, SUV420H1 (KMT5B), ARID1B, PTEN, and CHD8, in multiple cell lines from different donors. Comprehensive single-cell RNA-seq (scRNA-seq) was performed, and proteomic analysis was performed on individual organoids sampled at different developmental stages to identify phenotypic convergence in the effects of these genes.
Within a defined period of early cortical development, the mutations demonstrate asynchronous development of two of the main cortical neuronal lineages, GABAergic neurons and deep layer excitatory projection neurons, but these defects occurred through largely distinct molecular pathways. Although this effect is consistent across cell lines, these phenotypes are influenced to different degrees by the broader genomic context. Calcium imaging in intact organoids shows that these early-stage developmental changes may be related to later abnormal circuit activity, consistent with differential accessibility by single-cell ATAC-seq (scATAC-seq) of regions near genes associated with synaptic function. This work uncovers cell type-specific neurodevelopmental abnormalities shared across ASD risk genes that are finely modulated by human genomic context, revealing convergence in the neurobiological basis of how different risk genes contribute to ASD pathology.
Disclosed herein are methods of screening for pathological mechanisms in neurodevelopmental disorders. The methods may include introducing one or more genetic variations to an organoid; and profiling the mutant organoids using single-cell RNA sequencing, thereby identifying phenotypic alterations as a result of the genetic variation.
In some embodiments, the genetic variation causes haploinsufficiency in a gene. In some embodiments, the gene is selected from the group consisting of SUV420H1, PTEN, ARID1B and CHD8, and in certain embodiments is selected from the group consisting of SUV420H1, ARID1B and CHD8.
In some embodiments, the neurodevelopmental disorder is autism spectrum disorder (ASD).
In some embodiments, the mutant organoids are screened one month, three months, and/or six months after introduction of the genetic variation. In some embodiments, the mutant organoids are further profiled using immunohistochemistry, whole-proteome analysis, single cell Assay for Transposase-Accessible Chromatin sequencing (ATAC-seq), and/or electrophysiological analysis.
In some embodiments, the phenotypic alternation comprises asynchronous development of a cortical neuronal lineage, e.g., asynchronous development of deep layer projection neuron lineage. In some embodiments the phenotypic alteration comprises premature expansion of GABAergic neuron lineage.
Also disclosed herein are methods of screening a therapeutic agent. The methods of screening the therapeutic agent may include contacting an organoid having one or more genetic variations with one or more candidate agents; and testing effects of the one or more candidate agents on asynchronous development of a cortical neuronal lineage.
In some embodiments, the asynchronous development of a cortical neuronal lineage comprises asynchronous development of deep layer projection neuron lineage, e.g., the asynchronous development of a cortical neuronal lineage comprises premature expansion of GABAergic neuron lineage. In some embodiments, the genetic variation causes haploinsufficiency in a gene. In some embodiments, the gene is selected from the group consisting of SUV420H1, ARID1B and CHD8.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.
As described herein, reproducible organoid models of the cerebral cortex were combined with single cell RNA-seq, single cell ATAC-seq, proteomics, and electrophysiological analysis to investigate the role of ASD risk genes in human cortical development, across multiple human cell lines. The focus was on chromatin modifiers expressed at early stages of brain development, and genes that have emerged repeatedly as top hits in studies of ASD genetic risk were selected: SUV420H1, PTEN, ARID1B, and CHD8. All of the genes are associated with severe neurodevelopmental abnormalities in patients, including shared phenotypes of high frequency of macrocephaly, suggesting some degree of convergence in effect. However, their broad functions and widespread expression make it difficult to predict how their mutation leads to disease and whether shared mechanisms may play a role. It is shown herein that mutations in these genes converge on a shared phenotype of asynchronous neurogenesis that selectively affects specific classes of cortical neurons. Notably, although each mutation caused similar developmental abnormalities, they acted through divergent molecular mechanisms. Moreover, while these phenotypes were broadly consistent across cell lines from different donors, it was demonstrated that the degree of expressivity varies depending on the risk gene and for different aspects of the phenotype, highlighting the nuanced interactions of risk genes and genomic contexts that produce the phenotypic manifestation of ASD.
Described herein are methods of screening for pathological mechanisms in neurodevelopmental and/or neuropsychiatric disorders. The neurodevelopmental and/or neuropsychiatric disorder or condition may be any disorder affecting synaptic function, neuronal network activity and/or stimulation. A neurodevelopmental and/or neuropsychiatric disease or condition may be a disease or condition involving neural loss mediated or characterized at least partially by at least one of deterioration of neural stem cells and/or progenitor cells.
“Neuropsychiatric disorder” encompasses mental disorders attributable to diseases of the nervous system. Non-limiting examples of neuropsychiatric disorders include addictions, childhood developmental disorders, eating disorders, degenerative diseases, mood disorders, neurotic disorders, psychosis, sleep disorders, depression, obsessive-compulsive disorder, schizophrenia, visual hallucination, auditory hallucination, eating disorder, bipolar disorder, epilepsy, autism spectrum disorder (ASD), and amyotrophic lateral sclerosis (ALS).
“Neurodevelopmental disorder” encompasses mental disorders attributable to impairments of the growth and development of the brain and/or nervous system. Non-limiting examples of neurodevelopmental disorders include attention deficit hyperactivity disorder (ADHD), developmental language disorder (DLD), communication or speech disorder, autism spectrum disorder (ASD), intellectual disabilities, motor disorders (e.g., developmental coordination disorder, stereotypic movement disorder, tic disorder), neurogenetic disorder (e.g., Fragile X syndrome, Down syndrome, Rett syndrome, hypogonadotropic hypogonadal syndromes), learning disorders (e.g., dyslexia or dyscalculia), traumatic brain injury and/or congenital brain injury, and fetal alcohol spectrum disorder.
In some embodiments, the methods of screening for pathological mechanisms in neurodevelopmental and/or neuropsychiatric disorders comprise introducing one or more genetic variations to an organoid. In some embodiments, the organoid is a reproducible organoid. In some embodiments, the organoid is a cortical organoid. The organoid may be derived from cells of any suitable species. In some embodiments, the organoid is derived from mammalian cells. In some embodiments, the organoid is derived from human or rodent (e.g., mouse) cells (e.g., human or rodent stem cell or pluripotent cell). In some embodiments, the organoid is derived from human cells. In some embodiments, the organoid is derived from cells (e.g., human cells or rodent cells, human or rodent stem cells or progenitor cells) comprising a mutation associated with a neurological disease or condition, e.g., a neurodevelopmental or neuropsychiatric disorder or condition.
Stem cells and progenitor cells used to derive the organoids described herein can be any cells derived from any kind of tissue (for example embryonic tissue such as fetal or pre-fetal tissue, or adult tissue), which stem cells have the characteristic of being capable under appropriate conditions of producing progeny of different cell types, e.g. derivatives of all of at least one of the 3 germinal layers (endoderm, mesoderm, and ectoderm). These cell types may be provided in the form of an established cell line, or they may be obtained directly from primary embryonic tissue and used immediately for differentiation. Included are cells listed in the NIH Human Embryonic Stem Cell Registry, e.g. hESBGN-01, hESBGN-02, hESBGN-03, hESBGN-04 (BresaGen, Inc.); HES-1, HES-2, HES-3, HES-4, HES-5, HES-6 (ES Cell International); Miz-hES1 (MizMedi Hospital-Seoul National University); HSF-1, HSF-6 (University of California at San Francisco); and H1, H7, H9, H13, H14 (Wisconsin Alumni Research Foundation (WiCell Research Institute)).
In some embodiments, the stem cells can be isolated from tissue including solid tissue. In some embodiments, the tissue is skin, fat tissue (e.g. adipose tissue), muscle tissue, heart or cardiac tissue. In other embodiments, the tissue is for example but not limited to, umbilical cord blood, placenta, bone marrow, or chondral.
Stem cells of interest also include embryonic cells of various types, exemplified by human embryonic stem (hES) cells, described by Thomson et al. (1998) Science 282:1145; embryonic stem cells from other primates, such as Rhesus stem cells (Thomson et al. (1995) Proc. Natl. Acad. Sci. USA 92:7844); marmoset stem cells (Thomson et al. (1996) Biol. Reprod. 55:254); and human embryonic germ (hEG) cells (Shambloft et al., Proc. Natl. Acad. Sci. USA 95:13726, 1998). Also of interest are lineage committed stem cells, such as mesodermal stem cells and other early cardiogenic cells (see Reyes et al. (2001) Blood 98:2615-2625; Eisenberg & Bader (1996) Circ Res. 78(2):205-16; etc.) The stem cells may be obtained from any mammalian species, e.g. human, equine, bovine, porcine, canine, feline, rodent, e.g. mice, rats, hamster, primate, etc.
Embryonic stem (ES) cells are considered to be undifferentiated when they have not committed to a specific differentiation lineage. Such cells display morphological characteristics that distinguish them from differentiated cells of embryo or adult origin. Undifferentiated ES cells are easily recognized by those skilled in the art, and typically appear in the two dimensions of a microscopic view in colonies of cells with high nuclear/cytoplasmic ratios and prominent nucleoli. Undifferentiated ES cells express genes that may be used as markers to detect the presence of undifferentiated cells, and whose polypeptide products may be used as markers for negative selection. For example, see U.S. application Ser. No. 2003/0224411 A1; Bhattacharya (2004) Blood 103(8):2956-64; and Thomson (1998), supra., each herein incorporated by reference. Human ES cell lines express cell surface markers that characterize undifferentiated nonhuman primate ES and human EC cells, including stage-specific embryonic antigen (SSEA)-3, SSEA-4, TRA-1-60, TRA-1-81, and alkaline phosphatase. The globo-series glycolipid GL7, which carries the SSEA-4 epitope, is formed by the addition of sialic acid to the globo-series glycolipid GbS, which carries the SSEA-3 epitope. Thus, GL7 reacts with antibodies to both SSEA-3 and SSEA-4. The undifferentiated human ES cell lines did not stain for SSEA-1, but differentiated cells stained strongly for SSEA-I. Methods for proliferating hES cells in the undifferentiated form are described in WO 99/20741, WO 01/51616, and WO 03/020920.
A mixture of cells from a suitable source of endothelial, muscle, and/or neural stem cells can be harvested from a mammalian donor by methods known in the art. A suitable source is the hematopoietic microenvironment. For example, circulating peripheral blood, preferably mobilized (i.e., recruited), may be removed from a subject. Alternatively, bone marrow may be obtained from a mammal, such as a human patient, undergoing an autologous transplant. In some embodiments, stem cells can be obtained from the subject's adipose tissue, for example using the CELUTION™ SYSTEM from Cytori, as disclosed in U.S. Pat. Nos. 7,390,484 and 7,429,488 which are incorporated herein in their entirety by reference.
In some embodiments, the stem cells can be reprogrammed stem cells, such as stem cells derived from somatic or differentiated cells. In such an embodiment, the de-differentiated stem cells can be for example, but not limited to, neoplastic cells, tumor cells and cancer cells or alternatively induced reprogrammed cells such as induced pluripotent stem cells or iPS cells.
In some embodiments, the organoid is derived from PGP1 (Personal Genome Project 1) iPSC (induced pluripotent stem cells); HUES66 hESC (human embryonic stem cells); H1 hESC (also known as WA01); GM08330 iPSC (also known as GM8330-8); Mito 210 iPSC; Mito 294 iPSC. Additional examples of reproducible organoids that may be used herein are described in WO 2020/243657, the entire contents of which is incorporated herein by reference.
In some embodiments, one or more genetic variations are introduced to an organoid. In some embodiments, the stem cell from which the organoid is derived is altered. The alteration to the stem cell may occur in any manner which is available to the skilled artisan, for example, utilizing a TALEN, ZFN, or a CRISPR/Cas system. Such CRISPR/Cas systems can employ a variety of Cas proteins (Haft et al. PLoS Comput Biol. 2005; 1(6)e60). In some embodiments, the CRISPR/Cas system is a CRISPR type I system. In some embodiments, the CRISPR/Cas system is a CRISPR type II system. In some embodiments, the CRISPR/Cas system is a CRISPR type V system. It should be understood that although examples of methods utilizing CRISPR/Cas (e.g., Cas9 and Cpf1) and TALEN are described in detail herein, the invention is not limited to the use of these methods/systems.
In some embodiments, the alteration to the stem cell is a modification or cleavage of a target sequence. In other embodiments, the alteration is an indel. As used herein, “indel” refers to a mutation resulting from an insertion, deletion, or a combination thereof. As will be appreciated by those skilled in the art, an indel in a coding region of a genomic sequence will result in a frameshift mutation, unless the length of the indel is a multiple of three. In some embodiments, the alteration is a point mutation. As used herein, “point mutation” refers to a substitution that replaces one of the nucleotides. A CRISPR/Cas system can be used to induce an indel of any length or a point mutation in a target sequence.
In some embodiments, the alteration, e.g., an indel, is made to a gene, e.g., a gene associated with a neurodevelopmental or neuropsychiatric disorder. For example, an alteration, e.g., indel, may target a protein domain that is mutated in a subject suffering from a neurodevelopmental or neuropsychiatric disorder. In one embodiment, the indel is a protein-truncating indel. In some embodiments, the gene is an autism spectrum disorder (ASD) risk gene. In some embodiments, the gene is expressed early in brain development. In some embodiments, the gene is a chromatin modifier. In some embodiments, the gene is selected from the group consisting of SUV420H1 (KMT5B), ARID1B, PTEN, and CHD8. In some embodiments, the gene is selected from the group consisting of SUV420H1, ARID1B, and CHD8. In one embodiment, the gene is SUV420H1. In one embodiment, the gene is ARID1B. In one embodiment, the gene is CHD8. In one embodiment, the gene is PTEN.
In some embodiments, an organoid is derived from the altered stem cell, e.g., the stem cell comprising the indel. In some aspects, one or more genetic variations (e.g., a mutation or alteration) is introduced into an organoid via the altered stem cell. In some aspects, an indel alteration targeting a protein domain is introduced into the organoid. In some embodiments, the genetic variation introduced to the organoid causes haploinsufficiency in a gene. In some embodiments, the one or more genetic variations introduced to the organoid causes haploinsufficiency in SUV420H1, ARID1B, PTEN, and/or CHD8. In some embodiments, the one or more genetic variations introduced to the organoid cause haploinsufficiency in SUV420H1, ARID1B, and/or CHD8. In some embodiments, the organoids comprise haploinsufficient mutations in SUV420H1, ARID1B, and/or CHD8.
In some embodiments, the methods further comprise profiling the mutant organoids using single-cell RNA sequencing. In some embodiments, the mutant organoids are further profiled using one or more of immunohistochemistry, whole proteome analysis, single cell assay for transposase-accessible chromatin sequencing (ATAC-seq), and electrophysiological analysis. The organoids may additionally be assessed for changes in expression of genes and/or gene products over a period of time by any method known in the art including immunohistochemistry, immunofluorescence, flow cytometry, polymerase chain reaction (PCR), quantitative PCR, real-time PCR, gene expression array, mRNA sequencing, high-throughput sequencing, Western blot, Northern blot, and ELISA.
The mutant organoids may be screened or profiled at one or more time points, e.g., after introduction of a genetic variation. In some embodiments, the mutant organoids are screened at least 1, 5, 10, 15, 20, 25, 30, 40, 60, 90, or 120 days after introduction of the genetic variation. In some embodiments, the mutant organoids are screened at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months after introduction of the genetic variation. In one embodiment, the mutant organoids are screened at least 1, 3, and/or 6 months after introduction of the genetic variation.
In some embodiments, the profiling of the mutant organoids can be used to identify phenotypic alterations that occur as a result of the genetic variation. In some embodiments, the phenotypic alteration comprises the accelerated development of cortical neurons. In some embodiments, the phenotypic alteration comprises the overproduction of cortical neuronal lineages, e.g., neuronal overgrowth. In some embodiments, the phenotypic alteration comprises asynchronous cortical neurogenesis. In one embodiment, the phenotypic alteration comprises asynchronous development of deep layer projection neuron lineage. In one embodiment, the phenotypic alteration comprises premature expansion of GABAergic neuron lineage. In one embodiment, the phenotypic alteration comprises an imbalance of production of cortical excitatory neurons. In some embodiments, early stage developmental changes in the organoid (e.g., the accelerated development of cortical neurons) are related to later abnormal circuit activity of the organoid.
In some embodiments, the mutant organoids exhibit differential expression of one or more genes or gene products associated with synaptic transmission, neuronal maturation, neuronal connectivity, neuronal processes (e.g., axon guidance, synaptogenesis), synapse formation, cytoskeleton remodeling, and/or neuronal development. The differential expression of the proteins may vary based on the genetic variation introduced into the organoid, as well as based on the underlying stem cell from which the organoid is derived. In some embodiments, the mutant organoids exhibit reduced spontaneous neuronal activity and/or delayed neuronal differentiation, e.g., as compared to a control organoid. In some embodiments, the mutant organoids exhibit differentially expressed proteins associated with neuronal maturation and neuronal function. In some embodiments, the mutant organoids exhibit differential expression of one or more genes or gene products selected from the group consisting of CRH, GAD2, SLC32A1, POU3F2, ETV1, GAD1, RBFOX1, JAG1, ZFHX4, FBN2, ZNF385D, SHD, RCAN2, GPC3, ZNF439, APITD1, GNG3, ARPC1B, DDIT3, C14orfl42, NKAIN3, ODC1, CD99, MYO10, TXNDC9, NETO2, TCEAL7, TMEM163, PNO1, PHF19, GCSH, HYAL2, C1QBP, TMEM14C, NPM3, TRIAP1, MRPL3, TJP1, PSMD12, EHBP1, C7orf50, GTF3C6, WIZ, BACH2, GLCCI1, NEDD4L, RAB3A, IER5, GPM6A, CHODL, GUCY1B3, ZNF37A, DPP10, CCSAP, NMRAL1, GSE1, CPM, IER3, INSIG1, AUTS2, NELL2, TUBB4A, GPR85, MYT1L, CALCOCO1, CACNA1A, RAP1GAP, TYW3, EFNA1, KIAA1456, STRA6, COL6A2, BCL11B, SNX32, TPM2, SEMA3A, IGFB5, CRYZ, ABTB1, ISLR2, RBM24, SYN3, SYNDIG1, ZEB2, NRSN1, MEF2C, PALMD, KCNG1, GRIK3, FAM50B, SLA, and CARTPT. In some embodiments, the mutant organoids exhibit differential expression of one or more genes or gene products selected from the group consisting of APITD1, C14orfl42, ODC1, TXNDC9, PNO1, GCSH, HYAL2, C1QBP, TMEM14C, NPM3, TRIAP1, MRPL3, TJP1, PSMD12, C7orf50, GTF3C6, GLCCI1, NEDD4L, RAB3A, GUCY1B3, ZNF37A, CCSAP, INSIG1, TUBB4A, GPR85, CALCOCO1, CACNA1A, RAP1GAP, KIAA1456, BCL11B, SNX32, SEMA3A, ISLR2, SYN3, ZEB2, NRSN1, MEF2C, GRIK3, FAM50B, SLA, and CARTPT. In some embodiments, the mutant organoids exhibit upregulated expression of one or more genes or gene products selected from the group consisting of GLCCI1, NEDD4L, RAB3A, GUCY1B3, ZNF37A, CCSAP, INSIG1, TUBB4A, GPR85, CALCOCO1, CACNA1A, RAP1GAP, KIAA1456, BCL11B, SNX32, SEMA3A, ISLR2, SYN3, ZEB2, NRSN1, MEF2C, GRIK3, FAM50B, SLA, and CARTPT.
Also described herein are methods of screening test agents to identify candidate therapeutic agents. In some aspects, organoids (e.g., mutant organoids) described herein or generated as described by the methods of the invention are used. In some embodiments, organoids are generated from cells derived from a subject having a neurodevelopmental or neuropsychological disease or disorder (e.g., autism spectrum disorder (ASD)).
In some embodiments, a method of screening for a candidate therapeutic agent comprises contacting an organoid having one or more genetic variations as described herein with a test agent, and testing the effect of the one or more candidate agents on the organoid. For example, testing the effect of the one or more candidate agents on asynchronous development of a cortical neuronal lineage (e.g., asynchronous development of deep layer projection neuron lineage and/or premature expansion of GABAergic neuron lineage).
In some embodiments, the effect of the one or more candidate agents on the differential expression of one or more genes or gene products is assessed. Differential expression may be assessed for one or more genes or gene products selected from the group consisting of CRH, GAD2, SLC32A1, POU3F2, ETV1, GAD1, RBFOX1, JAG1, ZFHX4, FBN2, ZNF385D, SHD, RCAN2, GPC3, ZNF439, APITD1, GNG3, ARPC1B, DDIT3, C14orfl42, NKAIN3, ODC1, CD99, MYO10, TXNDC9, NETO2, TCEAL7, TMEM163, PNO1, PHF19, GCSH, HYAL2, C1QBP, TMEM14C, NPM3, TRIAP1, MRPL3, TJP1, PSMD12, EHBP1, C7orf50, GTF3C6, WIZ, BACH2, GLCCI1, NEDD4L, RAB3A, IER5, GPM6A, CHODL, GUCY1B3, ZNF37A, DPP10, CCSAP, NMRAL1, GSE1, CPM, IER3, INSIG1, AUTS2, NELL2, TUBB4A, GPR85, MYT1L, CALCOCO1, CACNA1A, RAP1GAP, TYW3, EFNA1, KIAA1456, STRA6, COL6A2, BCL11B, SNX32, TPM2, SEMA3A, IGFB5, CRYZ, ABTB1, ISLR2, RBM24, SYN3, SYNDIG1, ZEB2, NRSN1, MEF2C, PALMD, KCNG1, GRIK3, FAM50B, SLA, and CARTPT. In some embodiments, differential expression may be assessed for one or more genes or gene products selected from the group consisting of APITD1, C14orfl42, ODC1, TXNDC9, PNO1, GCSH, HYAL2, C1QBP, TMEM14C, NPM3, TRIAP1, MRPL3, TJP1, PSMD12, C7orf50, GTF3C6, GLCCI1, NEDD4L, RAB3A, GUCY1B3, ZNF37A, CCSAP, INSIG1, TUBB4A, GPR85, CALCOCO1, CACNA1A, RAP1GAP, KIAA1456, BCL11B, SNX32, SEMA3A, ISLR2, SYN3, ZEB2, NRSN1, MEF2C, GRIK3, FAM50B, SLA, and CARTPT. In one embodiment, downregulation of one or more genes or gene products selected from the group consisting of GLCCI1, NEDD4L, RAB3A, GUCY1B3, ZNF37A, CCSAP, INSIG1, TUBB4A, GPR85, CALCOCO1, CACNA1A, RAP1GAP, KIAA1456, BCL11B, SNX32, SEMA3A, ISLR2, SYN3, ZEB2, NRSN1, MEF2C, GRIK3, FAM50B, SLA, and CARTPT is assessed.
The term “agent” as used herein means any compound or substance such as, but not limited to, a small molecule, nucleic acid, polypeptide, peptide, drug, ion, etc. An “agent” can be any chemical, entity or moiety, including without limitation synthetic and naturally-occurring proteinaceous and non-proteinaceous entities. In some embodiments, an agent is nucleic acid, nucleic acid analogues, proteins, antibodies, peptides, aptamers, oligomer of nucleic acids, amino acids, or carbohydrates including without limitation proteins, oligonucleotides, ribozymes, DNAzymes, glycoproteins, siRNAs, lipoproteins, aptamers, and modifications and combinations thereof etc. In certain embodiments, agents are small molecules having a chemical moiety. For example, chemical moieties include unsubstituted or substituted alkyl, aromatic, or heterocyclyl moieties including macrolides, leptomycins and related natural products or analogues thereof. Compounds can be known to have a desired activity and/or property, or can be selected from a library of diverse compounds.
As used herein, the term “contacting” (i.e., contacting the organoid with an agent) is intended to include incubating the agent and organoid together in vitro. It is understood that the organoid contacted with an agent can also be simultaneously or subsequently contacted with another agent. In some embodiments, the organoid is contacted with at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, or at least ten agents.
In some embodiments, one or more parameters of the organoid are measured in response to contact with the test agent. In certain aspects, the one or more parameters include: the accelerated development of cortical neurons, asynchronous cortical neurogenesis, asynchronous development of deep layer projection neuron lineage, and/or premature expansion of GABAergic neuron lineage. In one aspect, differential gene or gene product expression is measured in response to contact with the test agent. In certain aspects, the one or more parameters include: spontaneous neuronal activity and/or delayed neuronal differentiation.
In some embodiments, the control organoid is an organoid not contacted with the test agent. In some embodiments, the test organoid is derived from a subject having a neurodevelopmental or neuropsychological disorder and the control organoid is from a subject without the neurodevelopmental or neuropsychological disorder. Thus, in certain embodiments, if contact of the test organoid with the test agent results in the test organoid exhibiting a property more similar to the control organoid, then the agent can be identified as a beneficial agent for the neurodevelopmental or neuropsychological disorder.
In some embodiments, a high throughput screen (HTS) is performed. A high throughput screen can utilize the highly reproducible organoids described herein. High throughput screens often involve testing large numbers of compounds with high efficiency, e.g., in parallel. Often such screening is performed in multiwell plates other vessels in which multiple physically separated cavities or depressions are present in a substrate. High throughput screens often involve use of automation, e.g., for liquid handling, imaging, data acquisition and processing, etc. Certain general principles and techniques that may be applied in embodiments of a HTS of the present invention are described in Macarrön R & Hertzberg R P. Design and implementation of high-throughput screening assays. Methods Mol Biol., 565:1-32, 2009 and/or An WF & Tolliday N J., Introduction: cell-based assays for high-throughput screening. Methods Mol Biol. 486:1-12, 2009, and/or references in either of these. Useful methods are also disclosed in High Throughput Screening: Methods and Protocols (Methods in Molecular Biology) by William P. Janzen (2002) and High-Throughput Screening in Drug Discovery (Methods and Principles in Medicinal Chemistry) (2006) by Jorg Hvser.
The articles “a”, “an” and “the” as used herein, unless clearly indicated to the contrary, should be understood to include the plural referents. Claims or descriptions that include “or” between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process unless indicated to the contrary or otherwise evident from the context. It should it be understood that, in general, where the invention, or aspects of the invention, is/are referred to as comprising particular elements, features, etc., certain embodiments of the invention or aspects of the invention consist, or consist essentially of, such elements, features, etc. For purposes of simplicity those embodiments have not in every case been specifically set forth in haec verba herein. It should also be understood that any embodiment of the invention, e.g., any embodiment found within the prior art, can be explicitly excluded from the claims, regardless of whether the specific exclusion is recited in the specification.
As used herein the term “comprising” or “comprises” is used in reference to compositions, methods, and respective component(s) thereof, that are essential to the invention, yet open to the inclusion of unspecified elements, whether essential or not. The term “consisting essentially of” refers to those elements required for a given embodiment. The term permits the presence of additional elements that do not materially affect the basic and novel or functional characteristic(s) of that embodiment of the invention. The term “consisting of” refers to compositions, methods, and respective components thereof as described herein, which are exclusive of any element not recited in that description of the embodiment.
Where ranges are given herein, the invention includes embodiments in which the endpoints are included, embodiments in which both endpoints are excluded, and embodiments in which one endpoint is included and the other is excluded. It should be assumed that both endpoints are included unless indicated otherwise. Furthermore, it is to be understood that unless otherwise indicated or otherwise evident from the context and understanding of one of skill in the art, values that are expressed as ranges can assume any specific value or subrange within the stated ranges in different embodiments of the invention, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise. It is also understood that where a series of numerical values is stated herein, the invention includes embodiments that relate analogously to any intervening value or range defined by any two values in the series, and that the lowest value may be taken as a minimum and the greatest value may be taken as a maximum. Numerical values, as used herein, include values expressed as percentages. For any embodiment of the invention in which a numerical value is prefaced by “about” or “approximately”, the invention includes an embodiment in which the exact value is recited. For any embodiment of the invention in which a numerical value is not prefaced by “about” or “approximately”, the invention includes an embodiment in which the value is prefaced by “about” or “approximately”. “Approximately” or “about” generally includes numbers that fall within a range of 1% or in some embodiments 5% of a number in either direction (greater than or less than the number) unless otherwise stated or otherwise evident from the context (except where such number would impermissibly exceed 100% of a possible value).
Furthermore, it is to be understood that the invention encompasses all variations, combinations, and permutations in which one or more limitations, elements, clauses, descriptive terms, etc., from one or more of the listed claims is introduced into another claim dependent on the same base claim (or, as relevant, any other claim) unless otherwise indicated or unless it would be evident to one of ordinary skill in the art that a contradiction or inconsistency would arise. Where elements are presented as lists, e.g., in Markush group or similar format, it is to be understood that each subgroup of the elements is also disclosed, and any element(s) can be removed from the group.
Certain claims are presented in dependent form for the sake of convenience, but any dependent claim may be rewritten in independent format to include the limitations of the independent claim and any other claim(s) on which such claim depends, and such rewritten claim is to be considered equivalent in all respects to the dependent claim (either amended or unamended) prior to being rewritten in independent format. It should also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one act, the order of the acts of the method is not necessarily limited to the order in which the acts of the method are recited, but the invention includes embodiments in which the order is so limited. It is contemplated that all aspects described above are applicable to all different embodiments of the invention. It is also contemplated that any of the above embodiments can be freely combined with one or more other such embodiments whenever appropriate.
All patents, patent applications, and other publications (e.g., scientific articles, books, websites, and databases) mentioned herein are incorporated by reference in their entirety. In case of a conflict between the specification and any of the incorporated references, the specification (including any amendments thereof, which may be based on an incorporated reference), shall control. Standard art-accepted meanings of terms are used herein unless indicated otherwise. Standard abbreviations for various terms are used herein.
The above discussed, and many other features and attendant advantages of the present inventions will become better understood by reference to the following detailed description of the invention.
In order to determine whether human brain organoids could be used as models to understand the roles of ASD risk genes in human brain development, it was first sought to verify expression of a large set of known risk genes in this cellular system across a developmental timeline. A large dataset of scRNA-seq data was generated from individual organoids produced from two different human induced pluripotent stem cell (iPSC) lines, GM08330 and Mito210, using a recently-developed cortical organoid model (
Individual organoids were profiled by immunohistochemistry and scRNA-seq across three stages of in vitro development: one month, when organoids contain multiple classes of progenitors and neurogenesis is beginning; three months, when progenitor diversity increases, including the appearance of outer radial glia, and multiple subtypes of projection neurons emerge; and six months, when astroglia and interneurons are present (n=3 single organoids per timepoint and line;
The cell type-specific expression of ASD risk genes in this organoid model was then examined using a set of 102 genes identified as contributing to ASD risk from the largest-to-date ASD exome sequencing study8. The expression of these genes was first evaluated as a set in each cell type of the combined scRNA-seq datasets (
In order to begin to understand the molecular effects of ASD risk genes and whether they show convergent effects in human brain development, three genes were chosen for further study. The above panel of ASD risk genes8 was started with, and the focus was on those that were: a) more strongly associated with ASD than intellectual disability/developmental delay and other neurodevelopmental abnormalities8,21, b) predominantly found as de novo protein-truncating variants, which are more likely to be associated with increased ASD liability48, and c) expressed early in brain development, and thus likely to reflect early developmental events that can be modeled in organoids8.
Based on these criteria, SUV420H1, CHD8, and PTEN were chosen, which span different degrees of ASD risk, with CHD8 having the highest risk association8. In the scRNA-seq dataset, each of these three genes was expressed across multiple ages and cell types, precluding a priori prediction of cell type-specific effects (
To investigate the effects induced by SUV420H1 haploinsufficiency on human cortical development, a human iPSC line bearing a protein-truncating frameshift in the N-domain of the SUV420H1 protein (see Methods) was generated and validated, and cortical organoids were generated from isogenic control and SUV420H1 heterozygous mutant cell lines (
The analysis was first focused on one-month organoids. Transcriptional profiling was performed by scRNA-seq of 57,941 single cells from individual control and mutant organoids at 28 d.i.v. (n=3 single organoids per genotype;
To investigate the effect of SUV420H1 haploinsufficiency on protein expression at one month in culture, whole-proteome analysis was performed on single organoids from SUV420H1 mutant and control lines, from a separate differentiation (n=3 single organoids per genotype), using isobaric tandem mass tag (TMT) multiplexed quantitative mass spectroscopy (MS). 233 proteins determined to be significantly differentially expressed were identified (FDR<0.1, moderated t-test; >4,000 proteins detected per sample,
In order to determine whether phenotypic abnormalities induced by SUV420H1 heterozygosity persisted in more mature developmental stages, scRNA-seq was performed on two later timepoints, 35 d.i.v. and three months, using additional batches of organoids from two additional differentiations (
These data indicate that despite widespread SUV420H1 expression (
To understand whether mutations in other ASD risk genes also result in altered production of neurons of the cortical plate, the molecular phenotype of PTEN haploinsufficiency was examined in the organoid model. A PTEN mutant iPSC line was created with a protein-truncating frameshift mutation in the phosphatase domain, a region found mutated in patients (32, see Methods), and generated mutant and isogenic control organoids (
To investigate the cellular composition of PTEN mutant and isogenic control organoids, scRNA-seq and proteomic analysis was performed on individual organoids at one month in culture (
To better understand the developmental relevance of these changes, whole proteome analysis was performed of PTEN mutant and isogenic control organoids at a later timepoint, 70 d.i.v. (n=3 single organoids per genotype). While there were no differences in protein abundance between mutant and controls, when the proteins that were differentially expressed between ages (35 vs 70 d.i.v.) in each genotype were examined, the mutant organoids showed significantly smaller expression changes between ages than controls (p=1.6e−13, Wilcoxin rank test,
scRNA-seq profiling was therefore performed of 42,871 cells from PTEN mutant and control organoids at three months in culture (n=3 separate organoids per genotype). Mutant organoids at this stage showed a small increase in cycling progenitors and oRG (
Although affecting a different population of excitatory neurons then the SUV420H1 mutation, both SUV420H1 and PTEN converged on a share phenotype of imbalance of production of cortical excitatory neurons.
In order to determine if alterations in the speed of neurogenesis of cortical neurons was shared by a third mutation, a human ESC line heterozygote was obtained for a protein-truncating frameshift in the helicase-C domain of the CHD8 gene (see Methods), and generated mutant and isogenic control cortical organoids (
Whole-proteome MS was performed on one-month CHD8 organoids (n=3 for each of mutant and control), and significant differential expression of 35 proteins was found between genotypes. The interneuron fate regulator DLX634 was found among the most upregulated proteins in the mutant (
These results prompted the performance of scRNA-seq analysis on six CHD8+/− and control organoids cultured for 3.5 months, a stage when interneurons begin to appear (67,024 cells, n=3 single organoids per genotype, 109 d.i.v.). At this stage, the analysis revealed a dramatically increased number of cycling interneuron progenitors, interneuron precursors, and immature interneurons in mutant organoids (n=3 single organoids per genotype; cycling interneuron progenitors: p=0.031, interneuron precursors: p=0.0012, immature interneurons: p=0.079, logistic mixed models;
This experiment was repeated using an independent batch of organoids from a separate differentiation, profiling 35,789 single cells of control and mutant organoids collected at 3.5 months. An even more dramatic increase in the numbers of immature interneurons was observed, as well as in the numbers of interneuron precursors and cycling interneuron progenitors (n=3 single organoids per genotype; immature interneurons: p=8.1e−06, interneuron precursors: p=1.9e−05, cycling interneuron progenitors: p=7.2e−05, logistic mixed models;
Altogether, the data indicate that despite the widespread expression and broad functions of each of the three ASD risk genes analyzed here, all converge on overproduction of cortical neuronal lineages. Interestingly, each mutation affects different subsets of neuronal populations. These findings indicate that ASD pathology in these three genes may converge on a broad phenotype (neuronal overgrowth), rather than on a shared cell type or molecular pathway.
In order to determine if similarly altered developmental progression was observed between the three mutations, it was next sought to identify where each mutant population of neurons was located along the organoid developmental trajectory. Pseudotime trajectories were calculated for each scRNA-seq dataset using Monocle335, after downsampling to have an equal number of control and mutant cells. For all three mutants, the pseudotemporal ordering of cell types approximated that of in vivo human development, following a trajectory that progressed from progenitors to neuronal cell types (
Strikingly, for each of the three genes, mutant organoids showed an increased distribution of cells toward the end point of these developmental trajectories, i.e., at a more advanced stage of differentiation, compared to control (
In order to explore molecular differences that may underlie these phenotypes, gene expression regulation was examined across the partitions of interest using a co-expression network analysis. Using Weighted Correlation Network Analysis (WGCNA,36), modules of genes with highly correlated expression across cells in each dataset were found. Full lists of modules and their associated genes can be found in
Collectively, these results demonstrate that mutations in SUV420H1, PTEN, and CHD8 each lead to an acceleration of the developmental trajectory of specific neuronal subtypes in human cortical organoids.
Finally, it was sought to determine whether molecular phenotypes of these three genes converged at the level of specific biological processes or protein networks. First, differential gene regulation changes across the three mutations were compared. Reads from the cells in each selected partition of interest were summed within each organoid, and DESeq2 was used to calculate differentially expressed genes (DEGs) between mutant and control in each dataset (a technique which explicitly controls for sample-to-sample variation). No notable overlap of GO terms for genes downregulated in any of the three mutations was observed (
Lastly, an integrated protein interaction network was built using the DEPs across mutations, and tested the hypothesis of a convergence of altered networks. It was found that DEPs from all three mutations populated multiple sub-clusters of the network enriched for terms including “synapse”, “axon extension”, and “calcium signaling” (p<0.01 in all cases;
Collectively, these data show that SUV420H1, CHD8, and PTEN heterozygous organoids converge on a phenotype of accelerated neurogenesis and overexpression of synaptic genes, and show cell type-specific effects on individual neuronal populations.
The process by which mutations in ASD-related genes converge on the neurobiology of this multifaceted disorder remains unclear. Studies have been made difficult by the complexity of ASD genetics, limited experimental access to the human brain, and the lack of experimental models for human brain development. By leveraging reproducible human cortical organoids and single cell genomics, it was found that mutations in three ASD risk genes, SUV420H1, PTEN, and CHD8 converge on a shared phenotype of accelerated early cortical neurogenesis, and define three neuronal classes of the cortical microcircuit as the specific cell types affected. These findings are interesting in light of prior work indicating that ASD genetics affects neurogenesis37-39, and it underscores the value of organoids to identify both the developmental processes affected and the target cell types involved.
The finding that different ASD risk genes converge over accelerated neuronal production but diverge one level of complexity below it, with each mutation affecting distinct neuronal types, suggests that the shared clinical pathology of these genes may derive from high-order processes of neuronal differentiation and circuit wiring, as opposed to convergent use of the same cellular and molecular mechanisms. Indeed, for the three ASD risk genes analyzed minimal sharing of molecular pathways was found. These results inform a framework for therapeutic approaches that might be better tailored to the modulation of common, dysfunctional circuit properties instead of common molecular pathways.
Excitatory/inhibitory imbalance of the cortical microcircuit is a major hypothesis for the etiology of ASD40-45. The imbalance could derive from a change in the numerical proportions of cell types within the circuit, incorrect wiring, or altered properties of the neurons and glial cells involved, among others. The increased number of neurons that were observed in mutant organoids points at an early defect in circuit development whereby a small number of neuron classes develop asynchronously with the remaining cell types of the circuit. Studies in vivo have shown that this kind of relatively subtle developmental alterations, even if resolved later in development, can result in disproportionally large functional effects on the mature circuit46-48. Future work to increase the fidelity of circuit and tissue architecture in organoids will be crucial to investigate the implications of ASD risk alterations and potential interventions on circuit physiology and function.
Reproducible brain organoid models, combined with high-throughput single cell genomic methods, are suitable systems for unbiased identification of pathogenic mechanisms in neurodevelopmental disorders. The work paves the way for larger efforts to multiplex perturbation and analysis of large spectra of ASD risk genes49 in organoids to understand whether synergistic mechanisms are at play across this vast number of variants and possibly across disease manifestations.
The experiments were not randomized. Investigators were not blinded during experiments and analysis of results.
The HUES66 human ES cell line and the edited CHD8+/− line (also known as HUES66 AC2—this clone has a heterozygous 13 nucleotide deletion which resulted in a stop codon at amino acid 1248 [CHD8 gRNA: 5′-TTCTTACTGTGTACCCGGGC-3′ (TGG)] (SEQ ID NO: 1)) was from the laboratory of F. Zhang (MIT/BROAD Institute); the Mito210 human iPSC line was from the laboratory of B. Cohen (McLean Hospital); the GM08330 human iPSC line (also known as GM8330-8) was from the laboratory of M. Talkowski (MGH Hospital). Cell lines were cultured in feeder-free conditions on 1% Geltrex (Gibco) coated cell culture dishes (Corning) using either mTESR1 medium (StemCell Technologies) or mTESR+ medium (StemCell Technologies) with 100 U/mL penicillin and 100 μg/ml streptomycin, at 37° C. in 5% CO2. PSC cultures and colonies were dissociated with the Gentle Cell Dissociation Reagent (StemCell Technologies) or ReLeSR (StemCell Technologies). All human PSC cultures were maintained below passage 50, were negative for mycoplasma, and karyotypically normal (analysis performed by WiCell Research Institute). The HUES66 line was authenticated using STR analysis completed by GlobalStem (2008). The Mito210 line was authenticated by genotyping analysis (Fluidigm FPV5 chip) performed by the Broad Institute Genomics Platform (2017). The GM08330 line was not authenticated. All experiments involving human cells were approved by the Harvard University IRB and ESCRO committees.
The CRISPR guides for SUV420H1 and PTEN were designed using the Benchling CRISPR Guide Design Tool (Benchling Biology Software (2017). Retrieved from https://benchling.com). The guide was designed to maximize on-target efficiency and minimize off-target sites in intragenic regions51,52. For SUV420H1, a guide was designed to target the beginning of the first exon to create a protein truncation in all known protein coding transcripts (SUV420H1 gRNA: 5′-CAAGAACCAAACTGGTTGCT-3′ (AGG) (SEQ ID NO: 2)). The PTEN guide was chosen to induce a stop codon in the catalytic phosphatase domain (PTEN gRNA: 5′-CCAAATTTAATTGCAGAGGT-3′ (AGG) (SEQ ID NO: 3)).
Mutations in SUV420H1 and PTEN were induced in the Mito210 iPSC line. For SUV420H1, Nanoblades were generated as previously described53. Parental iPSC in mTeSR1 were plated in 12-well plates pre-coated with Geltrex. When cells reached about 80% confluency, 300 μl of mTeSR1 with nanoblades (10 μl) and 4 μg/ml of Polybrene were added. Cells were incubated for 6 hours at 37° C., after which fresh mTeSR1 was added. For selection of edited clones, cells were enzymatically detached and plated onto Geltrex and mTeSR1 in a ratio of ˜5,000 cells per 60 mm dish. 10 μM of Y-27632 was added to increase single cell survival after splitting. When the colonies started to appear, each clone was manually collected and split into a single well of a 96-well plate coated with Geltrex and mTeSR1. During colony picking, some cells were reserved for DNA extraction and clonal screening.
The PTEN mutation was generated by the Broad Institute Stem Cell Facility. The transfections were performed with the NEON transfection system (using settings of 1050V, 30 ms, 2 pulses and 2.5×10{circumflex over ( )}5 cells); locus specific guides plus Cas9 (EnGen Cas9 NLS from NEB) were used; cells were cultured either in mTeSR or StemFlex (Gibco) media on Geltrex coated plates.
Insertions/deletions in individual clones were screened via PCR amplification using primers flanking the guide and a second round of amplification using unique bar codes (Kappa HiFi Master Mix, Roche) with an annealing temperature of 60° C. and sequenced using Miseq (Illumina). For SUV420H1 the clone chosen for this study had a heterozygous 10 nucleotide deletion which resulted in a stop codon at amino acid 90 of the full-length transcript, resulting in protein truncation in all known protein coding transcripts. For PTEN the clone chosen had a heterozygous 1 nucleotide insertion which resulted in a stop codon at amino acid 91 in the phosphatase domain of the full-length transcript.
Dorsal forebrain-patterned organoids were generated as previously described13,54.
Samples were fixed in 4% paraformaldehyde (PFA) (Electron Microscopy Services). Samples were washed with 1× phosphate buffered saline (PBS) (Gibco), cryoprotected in a 30% sucrose solution, embedded in optimum cutting temperature (OCT) compound (Tissue Tek) and cryosectioned at 12-18 μm thickness. Sections were washed with 0.1% Tween-20 (Sigma) in PBS, blocked for 1 hour at room temperature (RT) with 6% donkey serum (Sigma)+0.3% Triton-X-100 (Sigma) in PBS and incubated with primary antibodies O/N diluted with 2.5% donkey serum+0.1% Triton-X-100 in PBS. After washing, sections were incubated at RT with secondary antibodies diluted in the same solution as with primary antibodies for 2 hours, washed, and incubated with DAPI staining (1:10,000 in PBS+0.1% TWEEN-20) for 15 minutes to visualize cell nuclei.
Organoids were processed using the SHIELD protocol55. Briefly, organoids were fixed for 30 minutes in 4% PFA at RT. The tissues were then treated with 3% (wt/v) polyglycerol-3-polyglycidyl ether (P3PE) for 48 h in ice cold 0.1M phosphate buffer (pH7.2) at 4° C. then transferred to 0.3% P3PE in 0.1M sodium carbonate (pH10) for 24 h at 37° C. Samples were rinsed and cleared in 0.2M SDS in 50 μmM phosphate buffer (pH 7.3) for 48 h at 55° C. Organoids were stained with Syto16 (ThermoFischer #S7578), anti-SOX2 (R&D #AF2018) and anti-TBR1 (CST #45664S) using the SmartLabel system (Lifecanvas). Tissues were washed extensively for 24 h in PBS+0.1% Triton-X-100 and antibodies were fixed to the tissue using a 4% PFA solution overnight at RT. Tissues were refractive index-matched in PROTOS solution (RI=1.519) and imaged using a SmartSPIM axially-swept light-sheet microscope (LifeCanvas Technologies). 3D image datasets were acquired using a 15×0.4 NA objective (ASI-Special Optics, #54-10-12).
Images of organoids in culture were taken with an EVOS FL microscope (Invitrogen), Lionheart™ FX Automated Microscope (BioTek Instruments), or with a Zeiss Axio Imager.Z2. Immunofluorescence images were acquired with the latter two and analyzed with the Gen5 (BioTek Instruments) or Zen Blue (Zeiss) image processing software, respectively. ImageJ56 was used to quantify organoid area. Area values were obtained by manually drawing a continuous line around the perimeter of the organoid, and the software counts pixels in the indicated area. Measurements were plotted as a ratio to the average value for control organoids of each differentiation batch.
iPSC were washed with 1× phosphate buffered saline (PBS) (Gibco) and protein were extracted incubating with N-PER™ Neuronal Protein Extraction Reagent (ThermoFisher) supplemented with protease (cOmplete™ Mini Protease Inhibitor Cocktail, Roche) and phosphatase inhibitor (PhosSTOP, Sigma) for 15 minutes on ice. After 5-7 minutes of lysis, organoids were dissociated using a pipette tip. Lysates were transferred to microcentrifuge tubes and centrifuged for 10 minutes at 13,500 rpm at 4° C. The supernatants were transferred to new tubes and stored at −20° C. Protein concentration was quantified using the Pierce™ BCA Protein Assay Kit (ThermoFisher). Proteins were separated on a NuPAGE™ 4-12%, Bis-Tris Gel (Invitrogen) or Mini-PROTEAN 4-15% Gels (Biorad) and transferred onto PVDF membrane via the iBlot 2 Dry Blotting System (ThermoFisher) or traditional dry transfer system. Blots were then blocked in 5% nonfat dry milk (Bio-Rad) for 1 hour at RT and incubated with primary antibodies with gentle agitation on a shaker overnight. Afterward, the blots were washed 3× for 10 minutes in TBST and incubated at RT with secondary horseradish peroxidase-conjugated antibodies for 1 hour. The blots were washed in TBST again and briefly incubated in SuperSignal™ West Femto Maximum Sensitivity Substrate (ThermoFisher). The blots were immediately imaged on the ChemiDoc XRS+ System (Bio-Rad).
For SUV420H1 and PTEN 35 d.i.v., four mutant and four control organoids were used. For CHD8 35 d.i.v., and for PTEN 70 d.i.v., three mutant and three control organoids were used. Cells were placed into microTUBE-15 (Covaris) microtubes with TPP buffer (Covaris) and lysed using a Covaris S220 Focused-ultrasonicator instrument with 125 W power over 180 s at 10% max peak power. Upon precipitation with chloroform/methanol, extracted proteins were weighed and digested according to FASP protocol (100 μg for PTEN and CHD8; 150 g for SUV420H1). Briefly, the 10K filter was washed with 100 μL of triethylammonium bicarbonate (TEAB). Each sample was added, spun, and the supernatant discarded. Then, 100 μL of Tris(2-carboxyethyl) phosphine (TCEP) 37 was added for one hour, spun, and the supernatant discarded. While shielding from light, 100 μL IAcNH2 was added for 1 hour followed by spinning and discarding the supernatant. Next, 150 μL 50 μmM TEAB +Sequencing Grade Trypsin (Promega) was added and left overnight at 38° C., upon which the samples were spun and supernatant collected. Lastly, 50 μL 50 μmM TEAB was added to the samples, followed by spinning and supernatant collection. The samples were then transferred to HPLC.
The TMT (Tandem Mass Tag) Label Reagents were equilibrated to RT and resuspended in anhydrous acetonitrile or ethanol (for the 0.8 mg vials 41 μL were added, for the 5 mg vials 256 μL were added). The reagent was dissolved for 5 minutes with occasional vortexing. 41 μL of the TMT Label Reagent (the equivalent of 0.8 mg) was added to each 100-150 g sample. The reaction was incubated for 1 hour at RT. 8 μL of 5% hydroxylamine was added to the sample and incubated for 15 minutes to quench the reaction. Samples were combined, dry in speedvac (Eppendorf, Germany) and stored at −80° C.
Before submission to LC-MS/MS (Liquid Chromatography with tandem mass spectrometry), each experiment sample was separated on a Hi-pH column (Thermo, CA) according to the vendor's instructions. After separation into 40 fractions, each fraction was submitted for a single LC-MS/MS experiment performed on a Lumos Tribrid (Thermo, CA) equipped with 3000 Ultima Dual nanoHPLC pump (Thermo, CA). Peptides were separated onto a 150 μm inner diameter microcapillary trapping column packed first with approximately 3 cm of C18 Reprosil resin (5 μm, 100 Å, Dr. Maisch GmbH, Germany) followed by PharmaFluidics (Belgium) micropack analytical column 50 cm. Separation was achieved through applying a gradient from 5-27% ACN in 0.1% formic acid over 90 min at 200 nl min−1. Electrospray ionization was enabled through applying a voltage of 1.8 kV using a home-made electrode junction at the end of the microcapillary column and sprayed from stainless-steel tips (PepSep, Denmark). The Lumos Orbitrap was operated in data-dependent mode for the MS methods. The MS survey scan was performed in the Orbitrap in the range of 400-1,800 m/z at a resolution of 6×104, followed by the selection of the twenty most intense ions (TOP20) for CTD-MS2 fragmentation in the Ton trap using a precursor isolation width window of 2 m/z, AGC setting of 10,000, and a maximum ion accumulation of 50 ms. Singly-charged ion species were not subjected to CID fragmentation. Normalized collision energy was set to 35 V and an activation time of 10 ms. Ions in a 10 ppm m/z window around ions selected for MS2 were excluded from further selection for fragmentation for 90 s. The same TOP20 ions were subjected to HCD MS2 event in Orbitrap part of the instrument. The fragment ion isolation width was set to 0.8 m/z, AGC was set to 50,000, the maximum ion time was 150 μms, normalized collision energy was set to 34V and an activation time of 1 ms for each HCD MS2 scan.
Raw data were submitted for analysis in Proteome Discoverer 2.4 (Thermo Scientific) software. Assignment of MS/MS spectra was performed using the Sequest HT algorithm by searching the data against a protein sequence database including all entries from the Human Uniprot database (SwissProt 19,768 2019) and other known contaminants such as human keratins and common lab contaminants. Sequest HT searches were performed using a 10 ppm precursor ion tolerance and requiring each peptides N-/C termini to adhere with Trypsin protease specificity, while allowing up to two missed cleavages. 16-plex TMTpro tags on peptide N termini and lysine residues (+304.207 Da) was set as static modifications while methionine oxidation (+15.99492 Da) was set as variable modification. A MS2 spectra assignment false discovery rate (FDR) of 1% on protein level was achieved by applying the target-decoy database search. Filtering was performed using a Percolator (64 bit version,57). For quantification, a 0.02 m/z window centered on the theoretical m/z value of each of the six reporter ions and the intensity of the signal closest to the theoretical m/z value was recorded. Reporter ion intensities were exported in the result file of Proteome Discoverer 2.4 search engine as Excel tables. The total signal intensity across all peptides quantified was summed for each TMT channel, and all intensity values were normalized to account for potentially uneven TMT labeling and/or sample handling variance for each labeled channel.
Potential contaminants were filtered out and proteins supported by at least two unique peptides were used for further analysis. Proteins that were missing were kept in at most one sample per condition. Data were transformed and normalized using variance stabilizing normalization using the DEP package of Bioconductor58. To perform statistical analysis, data were imputed for missing values using random draws from a Gaussian distribution with width 0.3 and a mean that is down-shifted from the sample mean by 1.8. To detect statistically significant differential protein abundance between conditions a moderated t-test was performed using the LIMMA package of Bioconductor59, employing an FDR threshold of 0.1. GSEA was performed using the GSEA software60. Gene Ontology (GO) and KEGG pathway annotation were utilized to perform functional annotation of the significantly regulated proteins. GO terms and KEGG pathways with FDR q-values <0.05 were considered statistically significant. For PTEN datasets, correlation between mutant effect (PTEN+/− vs control at 35 d.i.v.) and time effect (35 d.i.v. vs. 70 d.i.v. in control organoids) was calculated using Pearson correlation. Between PTEN 35 d.i.v. and 70 d.i.v., changes in protein levels in mutant and control organoids were compared to one another with a signed paired Wilcoxin rank test, using stat_compare_means from the ggpubr R package (available at rpkgs.datanovia.com/ggpubr/).
To build protein interaction networks, the prize-collecting Steiner forest algorithm was used61,62 using all three sets of DEPs as terminals, with the absolute value of their log fold change as prizes. This algorithm optimizes the network to include high-confidence protein interactions between protein nodes with large prizes. The PCSF R package v0.99.163 was used to calculate networks, with the STRING database as a background protein-protein interactome64, using parameters r=0.1, w=0.8, b=15, and mu=0.01. As is default in that package, the network was subclustered using the edge betweenness clustering algorithm from the igraph package, and functional enrichment is performed on each cluster using the ENRICHR API. The Cytoscape software version 3.6.1 was used for network visualization65. To assess relationships between the three sets of differential proteins, a PPI-weighted gene distance50, was calculated between each pair of protein sets. A background distribution was calculated by drawing size-matched random lists of proteins from all detected proteins in each dataset and calculating the pMM between these sets. This was repeated 1000 times, and a p value was calculated by evaluating the amount of times randomized pMMs were lower than the value calculated using differential proteins. DEPs: differentially expressed proteins.
Individual brain organoids were dissociated into a single-cell suspension using the Worthington Papain Dissociation System kit (Worthington Biochemical). A detailed description of the dissociation protocol is available at Protocol Exchange, with adaptations depending on age and size66. Dissociated cells were resuspended in ice-cold PBS containing 0.04% BSA (Sigma, PN-B8667), counted them with Countess II (ThermoFisher), and then adjusted the volume to the final concentration of 1000 cells/μl. Approximately 17,400, 9600 or 8000 cells were used per channel (to give an estimated recovery of 10,000, 6000 or 5000 cells per channel), loaded them onto a Chromium™ Single Cell 3′ Chip (10× Genomics, PN-120236), and processed them through the Chromium Controller to generate single cell GEMs (Gel Beads in Emulsion). Single-cell RNA-seq libraries with the Chromium™ Single Cell 3′ Library & Gel Bead Kit v3 (10× Genomics, PN-1000075), with the exception of a few libraries in the earlier experiments that were prepared with a v2 kit (10× Genomics, PN-120236). Libraries from different samples were pooled based on molar concentrations and sequenced on a NextSeq 500 or NovaSeq S1 instrument (Illumina) with 28 bases for read 1 (26 bases for v2 libraries), 55 bases for read 2 (57 bases for v2 libraries) and 8 bases for Index 1. If necessary, after the first round of sequencing, libraries were re-pooled based on the actual number of cells in each and re-sequenced with goal of producing an equal number of reads per cell for each sample and to reach a sequencing saturation of at least 20% for v3 libraries (50% for v2 libraries).
Reads from RNA-seq were aligned to the GRCh38 human reference genome and the cell-by-gene count matrices were produced with the Cell Ranger pipeline (10× Genomics)67. Cellranger version 2.0.1 was used for GM08330 experiments and for HUES66 3.5 months batch I, while version 3.0.2 was used for all other experiments. Default parameters were used, except for the ‘-cells’ argument. Data was analyzed using the Seurat R package v3.1.568 using R v3.6. Cells expressing a minimum of 500 genes were kept, and UMI counts were normalized for each cell by the total expression, multiplied by 106, and log-transformed. Variable genes were found using the “mean.var.plot” method, and the ScaleData function was used to regress out variation due to differences in total UMIs per cell. Principal component analysis (PCA) was performed on the scaled data for the variable genes, and top principal components were chosen based on Seurat's ElbowPlots (at least 15 PCs were used in all cases). Cells were clustered in PCA space using Seurat's FindNeighbors on top principal components, followed by FindClusters with resolution=1.0. Variation in the cells was visualized by t-SNE (t-distributed stochastic neighbor embedding) on the top principal components.
In the case of GM08330 one month organoids, cells were demultiplexed using genotype clustering from cells from a different experiment that were sequenced in the same lane. To demultiplex, SNPs were called from CellRanger BAM files with the cellSNP tool v0.1.5, and then the vireo function was used with default parameters and n_donor-2, from the cardelino R library v0.4.069 to assign cells to each genotype.
In three cases, one out of six organoids were excluded from the analysis as outliers. See the Statistics & Reproducibility section for details.
For each dataset, upregulated genes in each cluster were identified using the VeniceMarker tool from the Signac package v0.0.7 from BioTuring (available at github.com/bioturing/signac). Cell types were assigned to each cluster by looking at the top most significant upregulated genes. In a few cases, clusters were further subclustered to assign identities at higher resolution.
To assess gene expression of ASD risk genes in GM08330 and Mito210 control organoids across timepoints, datasets from one, three, and six months were merged using Seurat, then batch corrected using Harmony v1.0 with default parameters70. Since the one month data is dominated by cell cycle signal, the ScaleData function was used to regress out variation due to both total UMI count per cell and to cell cycle stage differences, calculated using Seurat's CellCycleScore. Variation was visualized using t-SNE on the first 30 Harmony dimensions. Expression of the 102 ASD-linked genes identified in the Satterstrom et. al. study8, compared to control gene lists with similar average expression, was evaluated using Seurat's AddModuleScore function. The distribution of module scores between broad cell types was compared using Seurat's violin plot.
To compare cell type proportions between control and mutant organoids, count tables were prepared from dataset metadata. For each cell type present in a dataset, a column was created that indicated whether a cell belonged to that cell type or not. Using the R package lme4 v1.1-2371, the glmer function was used to estimate a mixed-effect logistic regression model with this binary cell type column as output, the control or mutant state of the cell as a fixed predictor, and the organoid that the cell belonged to as a random intercept. In order to calculate the significance of the contribution of control-versus-mutant state to the model, another model was fit without that predictor, and the anova function was used to compare the two model fits. P values for each cell type were then adjusted for multiple hypothesis testing using the Benjamini-Hochberg correction.
Pseudotime trajectories were created for each dataset using the Monocle3 v. 0.2.0 software package35. Briefly, a CellDataSet was created from the raw counts and metadata in the corresponding Seurat object, and then cells were subsetted to contain an equal amount from control and mutant. Data was processed as default in Monocle3. A starting point for the trajectory was chosen manually by finding an endpoint of the tree located in the earliest developmental cell type (generally, cycling progenitors). Where the cells were split into more than one partition, the starting point was chosen within the partition containing the mature cell types of interest (PN for one month SUV40H1 organoids, IN for 3.5 month CHD8 organoids), and then a new UMAP was calculated using just these cells. Results were then visualized on the resulting UMAP, and by plotting the density of pseudotime values assigned to control and mutant cells.
In order to learn patterns of coordinated gene regulation across the cells, WGCNA36 was applied to each dataset. Where cells were split into partitions in the above pseudotime analysis, only cells belonging to the partition of interest were used. For the PTEN three months dataset, cells were split into two partitions based on the branching of the pseudotime into two subtypes of projection neurons: one excluding callosal projection neurons, and the other excluding corticofugal neurons. Normalized gene expression data was further filtered to remove outlying genes, mitochondrial and ribosomal genes. Outliers were identified by setting the upper (>9) and lower (<0.15) thresholds to the average normalized expression per gene. After processing, blockwiseModules function from the WGCNA v1.69 library was performed in R with the parameters networkType=“signed”, minModuleSize=4, corType=“Bicor”, maxPOutliers=0.1, deepSplit=3,trapErrors=T, and randomSeed=59069. Other than power, remaining parameters were left as the default setting. The power parameter that determines the resolution of gene module output was chosen independently for each dataset. To pick an adequate power, the pickSoftThreshold function from WGCNA was used to test values from 1 to 30. Final resolution was determined by studying the output modules and choosing the resolution that captured most variation in the fewest total number of modules—this resulted in a power of 18 for SUV40H1 28 d.i.v., 3 for SUV40H1 35 d.i.v., and 12 for both partitions of PTEN 3 months and HUES66 3.5 months.
In order to calculate significance of differential expression of modules in the data, Seurat objects were downsampled to have an equal number of cells per organoid, and then the AddModuleScore function was used to assign a score to each cell. For each module, linear mixed-effect models were fit to the data, with the modules scores as the output, the organoid the cell belongs to as a random intercept, and with or without the control-versus-mutant state as a predictor. The anova function was used to compare the models and evaluate the use of control-versus-mutant as a predictor, and p values were then adjusted across modules using the Benjamini-Hochberg correction.
Differential genes between control and mutant organoids were assessed using a method to control for variability organoid-to-organoid. Datasets were subsetted to the cells from the “partition of interest” in the above pseudotime analysis, in order to isolate the cells predicted to be in the developmental trajectory of the affected neurons for each mutant. Reads were then summed across cells in each organoid. Genes with less than 10 total reads were excluded, and then DESeq272 was used to calculate DEGs, with each organoid as a sample. The clusterProfiler73 R package was used to find enriched biological processes in these gene sets, with the enrichGO function and the compareCluster function to highlight processes the gene sets might have in common. Redundant GO terms were removed with the simplify function, with cutoff=0.7. DEGs: differentially expressed genes.
For organoid analysis, SUV420H1+/− organoids: n=42 18 d.i.v. control organoids, 42 18 d.i.v. mutant organoids, 47 27-d.i.v. control organoids, and 46 27-d.i.v. mutant organoids, from 4 differentiation batches. For PTEN+/− organoids: n=114 18-d.i.v. control organoids, 77 18-d.i.v. mutant organoids, 39 35-d.i.v. control organoids, and 34 35-d.i.v. mutant organoids, from 3 differentiation batches. For CHD8+/− organoids: n=19 16 d.i.v. control organoids, 19 16 d.i.v. mutant organoids, 16 25 d.i.v. control organoids, and 17 25 d.i.v. mutant organoids, from 2 differentiation batches.
For proteomic analysis, four mutant and four control organoids were used in the case of SUV420H1 and PTEN 35 d.i.v. For CHD8 35 d.i.v., and for PTEN 70 d.i.v., three mutant and three control organoids were used.
Finally, in each scRNA-seq dataset, three organoids per genotype were initially sequenced. In three cases, one out of six organoids were excluded from the analysis as outliers. In PTEN+/− 35 d.i.v. batch I, a control organoid was excluded because not enough cells could be recovered from that sequencing lane, and in the second batch, a control organoid was excluded because it clustered entirely separately from the other 5 organoids, indicating a potential differentiation issue. Finally, in CHD8+/+ 3.5 month batch II, a mutant organoid was excluded due to mostly containing inhibitor-lineage cells, with very few projection neuron cells. Although an increase in interneuron-lineage cells was seen in all mutant organoids in this experiment, this extreme example was excluded to be conservative. This leaves us with a total of 57 organoids considered in downstream analysis, with a total of 416,051 cells.
This Example both re-presents certain data from Example 1 and provides additional data.
To experimentally investigate whether mutations in different ASD risk genes converge on shared molecular, cellular, or developmental phenotypes, cortical organoids, an in vitro model of human cortical development, was leveraged using a robust protocol that was previously developed14. This model has been validated to undergo reproducible generation of cortical cellular diversity, along consistent developmental trajectories, in each individual organoid.
First, organoids were generated from two different human induced pluripotent stem cell (iPSC) lines, GM08330 and Mito210 (Methods,
Next, 102 genes identified as contributing to ASD risk from the largest-to-date ASD exome sequencing study6 were taken and their expression was verified in these organoids across this developmental timeline (
Organoids with Haploinsufficient Mutations in SUV420H1, ARID1B, and CHD8 Recapitulate Brain Size Abnormalities Observed in Patients
To begin to understand the phenotypic manifestation of ASD risk genes and whether they show convergent effects on human cortical development, three genes were chosen for further study. The 102 ASD risk genes6 were filtered for genes that were a) significantly implicated in risk for ASD (familywise error rate ≤0.05)6, b) predominantly found as de novo protein-truncating variants, which are more likely to be associated with increased ASD liability6,22, and c) expressed early in brain development6, and thus likely to cause developmental abnormalities that can be modeled in organoids. Finally, genes were selected from a single functional category, focusing on chromatin modifiers, a class of proteins often affected in ASD and likely to influence brain development.
From the candidates meeting these criteria, SUV420H1, ARID1B, and CHD8 were focused on. SUV420H1 is a histone methyltransferase23,24 whose function in brain development is largely unknown but has been linked to neuroectodermal differentiation25 and regulation of adult neural stem and progenitor cell pools26. ARID1B is a core component of the BAF complex27, which regulates cell cycle progression and acquisition of neuronal specific traits28. CHD8 is an ATP-dependent chromatin-remodeling factor associated with regulation of cell cycle and neuronal differentiation29-31. All three were expressed in the single-cell organoid atlas across multiple timepoints and cell types, precluding a priori prediction of cell type-specific effects (
For each of the three genes, heterozygous protein-truncating indel mutations were engineered using a CRISPR/Cas9 strategy, in each case targeting a protein domain that is mutated in patients (
For SUV420H1, a protein-truncating frameshift was introduced in the N-terminal domain, a region affected in patients24, in three parental lines from different donors (Mito294, PGP1, and Mito210; Methods and
Notably, for all three risk genes there was substantial variation in endogenous levels of each protein, as well as in the degree of reduction in the mutants, as determined by Western blot of undifferentiated stem cells from each mutant and isogenic control line. In each case, different genomic contexts (parental lines) showed substantial difference in endogenous expression of the risk proteins, such that in a line with high endogenous expression in the control, the level of the protein in the corresponding mutant line, even though significantly reduced, may resemble the level in another control line with lower endogenous expression (e.g., compare Mito210 and PGP1 for SUV420H1;
Cortical organoids were produced from each isogenic control and heterozygous mutant line using the previously-established protocol14. Organoids from all lines initiated appropriate neural development (
As all three genes are linked to macrocephaly and/or microcephaly in patients, organoid size was quantified for each mutant and control line (see
For ARID1B, haploinsufficient organoids showed an increase in size in the Mito210 background at both two weeks and one month (adjusted p<8.8×10−16 at 18 d.i.v. and adjusted p=1.2×10−12 at 35 d.i.v., t-test;
For CHD8, at one month (24-25 d.i.v.), mutant organoids were significantly larger compared to controls in two out of the four lines tested, across multiple batches (adjusted p=1.8×10−4 in HUES66, p=1.2×10−6 in Mito294, t-test;
These data indicate that human cortical organoids bearing haploinsufficient mutations in SUV420H1, ARID1B, and CHD8 replicate size defects related to the macrocephalic and microcephalic abnormalities seen in patients carrying these mutations, suggesting that organoids can capture clinically-relevant features of ASD pathology and justifying further investigation using this model.
In order to investigate developmental abnormalities in the SUV420H1 mutant, single-cell RNA-seq was performed on mutant and control organoids at different stages of in vitro development. Striking developmental defects were found in specific populations of cortical neurons in mutant organoids. Early stages of mutant and control organoids were profiled for Mito294 SUV420H1 (30,733 single cells at 35 d.i.v.), PGP1 SUV420H1 (37,510 single cells at 35 d.i.v.), and Mito210 SUV420H1 (57,941 single cells at 28 d.i.v., and 33,313 single cells at 35 d.i.v., from two separate differentiations). At this early time point, mutant organoids showed a consistent presence of GABAergic neurons in all the genomic contexts profiled (
Despite the consistency of this phenotype across lines and differentiations, there were noticeable differences in phenotypic severity (expressivity) across genomic contexts. Specifically, the Mito294 SUV420H1 line showed the most dramatic increase in GABAergic neurons, with over 50% of the cells in all mutant organoids belonging to the GABAergic lineage, and <5% belonging to the excitatory projection neuron lineage (n=3 organoids per genotype, adjusted p=0.002, logistic mixed models;
Whether the increase in GABAergic neurons induced by SUV420H1 haploinsufficiency persisted at later stages of development was then investigated. Organoids from the two lines that showed the greatest difference in phenotypic severity (Mito294 and Mito210) were profiled. At three months (89 d.i.v.), scRNA-seq of the Mito294 SUV420H1 mutant organoids, which showed the most dramatic phenotype at one month (
It was then sought to examine possible changes in other cell types. Due to the pronounced overgrowth of the GABAergic lineage in the Mito294 SUV420H1 mutant organoids, most other cell types had reduced proportions (
In the PGP1 background, although the GABAergic phenotype was consistently seen, an increase in the number of deep layer projection neurons in the PGP1 SUV420H1 mutant organoids was not seen at one month (35 d.i.v.,
To better understand the mechanisms behind the increased number of neurons in the SUV420H1 mutants, the developmental dynamics of the affected cell types were examined along a developmental continuum of cells within a specific lineage. As the very low numbers of GABAergic neurons in the control lines at one month precluded accurate computation of developmental trajectories, the deep layer projection neuron phenotype was focused on, and pseudotime trajectories were calculated using Monocle336. As pseudotime analysis results in multiple developmental trajectories leading to different end states (for example, astrocytes or projection neurons), the portion of the trajectory corresponding to the development of the affected cell types (the “partition of interest”) was first identified. Initial Monocle analysis of the combined 35 d.i.v. Mito210 SUV420H1 mutant and control organoids (batch II) split the cells into two major partitions, one containing progenitors traversing the cell cycle, and another containing cycling progenitors maturing towards deep layer projection neurons (
It was then sought to explore possible mechanisms for the premature expression of maturation-associated genes in this mutant. SUV420H1 is a Histone-Lysine N-Methyltransferase responsible for catalyzing the methylation of H4K20me1 to H4K20me2 and H4K20me3, ultimately leading to chromatin compaction and transcriptional silencing24. Therefore, changes in chromatin accessibility were examine in these mutants.
Single-cell Assay for Transposase Accessible Chromatin was performed using sequencing (scATAC-seq) on Mito210 SUV420H1 organoids at both one and three months (84,696 single nuclei at 31 d.i.v., n=3 single organoids per genotype; 23,669 single nuclei at 93 d.i.v., n=3 single organoids per genotype). Cell types in both datasets were annotated by projecting scRNA-seq annotations. Co-embedding scATAC-seq and scRNA-seq data showed that chromatin accessibility capture most of the cell types identified by gene expression, and further confirmed the overall consistency between expression and chromatin accessibility profiles (
At one month (31 d.i.v.), a modest number of regions were differentially accessible (adjusted p<0.1, logistic regression) between mutant and control for most cell types (range 6-34), except for apical radial glia cells, the most prevalent (and thus most powered) cell type at this time point, which had 515 differentially accessible regions (DARs). Because most of the significant DARs overlapped across cell types, all cells were combined and DARs were recalculated to identify a final set of 414 DARs. The genes nearest (within 10 Kb) to DARs with increased accessibility in mutant organoids were enriched for GO terms associated with synaptic transmission and neuronal maturation, whereas genes nearest to DARs with reduced accessibility were enriched for negative regulation of neuronal maturation, connectivity, and development (
At a later developmental stage, three months in vitro (93 d.i.v.), only 43 significant DARs (adjusted p<0.1) were detected across all cells in both mutant and control organoids, suggesting that many of the differences detected at one month resolve as development progresses, in agreement with the scRNA-seq data. Of interest, although differences in neuron type proportions resolve in this genomic context (Mito210 SUV420H1, scRNA-seq analysis,
Next, motif enrichment in the top 400 DARs was examined at each time point (31 d.i.v. and 93 d.i.v., ranked by adjusted p value). Regions with increased accessibility in the mutant were enriched for Transcription factor (TF) binding sites for regulators of neurogenesis and patterning of the developing nervous system, including EN1, DLX2, LHX8, LHX1, DLX1, GSX2, EN2, PDX1, DLX5, HOXA1 at one month, and NEUROD1, NEUROG2, HAND2, OLIG2, PTF1A, PB0193.1, TCFE2A, TCF4, BHLHA15 at three months (
Altogether, the results show that in SUV420H1 mutant organoids, both GABAergic neurons and deep layer projection neurons undergo accelerated asynchronous development relative to isogenic control lines, in a defined temporal window during early corticogenesis. Although both phenotypes were present in multiple human pluripotent stem cell lines, the data indicates that genomic context can differentially modulate phenotypic abnormalities affecting different cell types.
It was hypothesized that the developmental abnormalities in the GABAergic and deep layer projection neuron populations seen in the SUV420H1 mutant organoids may affect cortical local circuits composed of these neuron classes, a possibility supported by the changes in the expression and accessibility of synapse-associated genes.
To test this hypothesis, calcium imaging analysis was performed in intact preparations of SUV420H1 control and mutant organoids. Spontaneous neuronal activity of organoids from the cell line that displayed an intermediate severity of phenotype was analyzed (PGP1 SUV420H1,
Notably, after blockade of excitatory synapses with NBQX, only control organoids showed residual uncorrelated activity, while mutant organoids displayed no calcium transients under the same conditions (
Moreover, comparing the frequency (p=0.032, t-test,
As SUV420H1 haploinsufficiency altered the rate of production of GABAergic neurons and immature deep layer projection cells, next it was investigated whether similar changes in the rate of production of neuronal classes were observed with mutations in other ASD risk genes. Individual Mito210 ARID1B mutant and isogenic control organoids were profiled at one month (35 d.i.v.) by scRNA-seq to compare changes in their cellular composition (
While control organoids had few or no GABAergic lineage cells at this age, the Mito210 ARID1B mutant organoids showed a consistent increase in the proportions of GABAergic neurons and their progenitors (n=3 single organoids per genotype, GABAergic neurons: adjusted p=0.0057 in batch I and adjusted p=0.0076 in batch II; GABAergic neuron progenitors: adjusted p=0.0004 in batch I and adjusted p=0.013 in batch II; cycling GABAergic neuron progenitors: p=0.0004 in batch I and p=0.0001 in batch II; logistic mixed models;
At three months, scRNA-seq of Mito210 ARID1B mutant and control organoids (90 d.i.v., 33,762 total single cells; n=3 individual organoids per genotype) also showed an increase in GABAergic lineage cells (GABAergic interneurons) in two of three mutant organoids (vs. 0 of 3 controls; adjusted p=0.19, logistic mixed models;
As was found for SUV420H1, these phenotypes were affected by the genomic context in which the mutation occurs. Control and ARID1B mutant organoids were generated in the Mito294 parental line, and 50,081 cells were profiled at 35 d.i.v. (n=3 per genotype). Consistent with the Mito210 ARID1B phenotype, there was a decreased number of newborn deep layer projection neurons in the mutant organoids compared to control (p=0.025, logistic mixed models;
Pseudotime analysis of the developmental trajectory corresponding to the deep layer projection neuron lineage (the neurons altered in both genomic contexts) supported a model of delayed differentiation (
In sum, like the SUV420H1 mutants, ARID1B mutant organoids exhibit both a phenotype of premature expansion of the GABAergic neuron lineage, and asynchronous development of the deep layer projection neuron lineage. Notably, as in the SUV420H1 mutants, these two phenotypes were differentially modulated in distinct parental lines in the ARID1B mutant organoids, further supporting the finding that genomic context modulates the expressivity of developmental defects in a phenotype-specific manner.
To further explore the hypothesis of convergent phenotypes among ASD risk genes, mutant and control HUES66 CHD8 organoids were profiled at 3.5 months (109 d.i.v., 67,024 single cells, n=3 single organoids per genotype;
Notably, scRNA-seq on the HUES66 CHD8 organoids after six months in culture showed that the phenotypic abnormalities induced by CHD8 haploinsufficiency persisted (190 d.i.v., n=3 individual organoids per genotype, 39,285 cells), with a consistent increase in GABAergic interneurons in mutant vs. control organoids (adjusted p=0.002, logistic mixed models;
This increase in GABAergic populations is consistent with two recent reports that CHD8 haploinsufficiency affects expression of GABAergic interneuron marker genes, using in vitro models derived from two additional human genomic contexts, a control iPSC line and a hESC line, engineered to carry mutations in CHD843,44. As both SUV420H1 and ARID1B had shown significant variation in expressivity between lines, it was sought to determine if there were likewise genomic contexts that could modulate the phenotypic manifestation of CHD8. Therefore, mutant and control CHD8 organoids generated from four different parental lines spanning different basal levels of CHD8 protein expression were compared (
Trajectory analysis on the 3.5 months HUES66 CHD8 control and mutant organoids supported a model where the increase in GABAergic interneurons is due to accelerated differentiation of GABAergic neurons in the mutant. The partition containing the GABAergic lineage was extracted, progressing from radial glia to GABAergic interneurons (
Thus, similarly to SUV420H1 and ARID1B mutants, CHD8 haploinsufficiency leads to accelerated development of the GABAergic lineage, which in the case of CHD8 leads to a persistent increase in the proportion of these cell types through cortical development. For all three of these risk genes, this phenotype can be observed in multiple parental lines, but, notably, different donor lines vary in their degree of phenotypic expressivity.
The three ASD risk genes investigated, SUV420H1, ARID1B, and CHD8, may converge on asynchronous development of cortical neuronal lineages by acting through common molecular pathways, or by using distinct mechanisms. To investigate these possibilities, gene expression changes across the three ASD risk genes was compared. Cell lines that showed a strong phenotype (Mito210 SUV420H1, Mito210 ARID1B, and HUES66 CHD8) at one timepoint (one month, 35 d.i.v.) were focused on. First, DESeq245 (using a model which explicitly controls for sample-to-sample variation46) was applied to calculate DEGs between mutant and control in each of the following datasets: pseudo-bulk from Mito210 SUV420H1 (batch II), pseudo-bulk from Mito210 ARID1B (batch II), and bulk RNA-seq from HUES66 CHD8 (batch I and II combined), all collected at 35 d.i.v. 1,689 DEGs in SUV420H1, 1,241 DEGs in ARID1B, and 4,864 DEGs in CHD8 were found; of these, only 92 were shared in all three mutants, 41 of which were changed in the same direction for all mutations (
The single-cell data was leveraged to compare cell-type-specific changes in expression in each mutant. DEGs were calculated between mutant and control for each cell type present at one or three months in each dataset, and the overlap was quantified between each of these gene sets within matching time points (SUV420H1 Mito210 35 d.i.v. versus ARID1B Mito210 35 d.i.v. batch II, and SUV420H1 Mito210 92 d.i.v. versus ARID1B Mito210 90 d.i.v. versus CHD8 HUES66 109 d.i.v.). While similar cell types within the same mutation often exhibited highly-overlapping DEG sets, DEGs caused by different mutations rarely overlapped, even for identical or closely-related cell types (
Similar findings were obtained at the level of protein expression, as assessed by whole-proteome mass spectrometry of mutant and control single organoids. Isobaric tandem mass tag (TMT) multiplexed quantitative mass spectroscopy was performed on Mito210 SUV420H1 (n=4 single organoids per genotype), Mito210 ARID1B (n=4-5 single organoids per genotype), and HUES66 CHD8 (n=3 single organoids per genotype) organoids collected at one month (35 d.i.v.). 233 proteins were identified that were significantly differentially expressed for SUV420H1 (≥4,000 proteins detected per sample, FDR <0.1,
It was then sought to evaluate whether the affected proteins in the three mutants were predicted to interact with one another, or with shared target proteins. The top 50 DEPs (ranked by adjusted p value) for each of the three mutations were used as input to create a network of interacting proteins50,51, followed by clustering to identify subnetworks (Methods). These subnetworks were enriched for terms including “nervous system”, “vesicle-mediated transport”, “chromatin modification” and “metabolism” (adjusted p≤0.1 in all cases;
Altogether, the results suggest that despite the widespread expression and broad functions of these three risk genes, SUV420H1, ARID1B, and CHD8, all cause phenotypes of asynchronous development of specific human cortical neuronal lineages (GABAergic cells and deep layer projection neurons), but do not affect the same genes and proteins. Moreover, these phenotypic abnormalities can be modified by the genomic context in which each mutation acts, affecting expressivity in a gene- and phenotype-specific manner.
The process by which mutations in ASD risk genes converge on the neurobiology of this multifaceted disorder remains unclear. Progress in this field has been difficult because of the complexity of ASD genetics, limited experimental access to the human brain, and the lack of experimental models for human brain development11. By leveraging reproducible human cortical organoids, proteomics, and single cell genomics (scATAC-seq and scRNA-seq), it was found that mutations in three ASD risk genes, SUV420H1, ARID1B, and CHD8 converge on a shared phenotype of asynchronous cortical neurogenesis, and define two main neuronal classes of the local cortical circuit (GABAergic neurons and deep layer projection neurons) as the specific cell populations affected. This is intriguing, as prior studies of postmortem ASD brains53 and human PSC lines derived from idiopathic ASD patients54-56 have pointed at dysregulation of GABAergic neuron development. In addition, studies in mouse models and human neural cultures have linked several ASD risk genes to abnormalities in neuronal development,31,57 including upregulation of GABAergic neuron marker genes in CHD8 mutant cultures44. The new findings indicate that despite widespread expression of ASD risk genes in multiple cell types and developmental stages, their resulting developmental abnormalities are specific for selected populations of cells. Identification of GABAergic neurons and deep layer projection neurons as the cells affected by mutation in these genes now provides a focus for investigation of disease mechanisms affecting these exact cell types. Notably, the development of these neurons is affected through largely non-overlapping molecular targets in each mutation. This encourages future mechanistic investigation focused on shared developmental processes rather than shared molecular pathways, a hypothesis that can now be tested across many more ASD mutations.
It is shown that cell lines from different donors affect the same cell types, but differ in the severity of the individual phenotypes. This is consistent with the known phenotypic variability associated with ASD risk gene mutations in human populations7,8,13,19 and highlights the importance of studying risk genes across multiple human genomic contexts. Notably, the data indicates that different genomic contexts differentially modulate expressivity based on both the risk gene mutated and the nature of the specific phenotype caused by each mutation. For example, the Mito294 cell line shows a pronounced phenotype in the SUV420H1 mutation, but is resilient to the same phenotype caused by mutation in ARID1B. Furthermore, the effects of the broader genetic context are highly nuanced with relation to the phenotype; in SUV420H1 mutant, it was found that although the Mito210 line showed a more severe phenotype for the deep layer projection neurons than the PGP1 line, the latter showed a more severe phenotype for the GABAergic neurons. One possible mechanism underlying this variability may be idiosyncratic differences in the baseline expression and function of each risk gene between individuals (
The finding that different ASD risk genes converge on a phenotype of asynchronous neuronal development but mostly diverge one level of complexity below it (i.e., at the level of molecular targets), suggests that the shared clinical pathology of these genes may derive from high-order processes of neuronal differentiation and circuit wiring, as opposed to convergent use of the same molecular effectors. These results suggest the value of investigating therapeutic approaches aimed at the modulation of shared dysfunctional circuit properties in addition to shared molecular pathways.
Excitatory/inhibitory imbalance of the cortical microcircuit is a major hypothesis for the etiology of ASD58-60. Such an imbalance could result from a change in the numerical proportions of cell types within the circuit, incorrect wiring, and/or altered properties of the neurons and glial cells involved, among others. The altered overall proportion of neurons that were observed in mutant organoids points to an early defect in circuit development whereby a small number of neuron classes develop asynchronously with the remaining cell types of the circuit. Studies in vivo suggest that these types of relatively subtle developmental alterations, even if resolved later in development, could result in disproportionally large functional effects on the mature circuit61-63. This hypothesis is supported by the observation of changes in spontaneous activity in SUV420H1 mutant organoids at four months in the calcium imaging analysis. As organoid models become more sophisticated in reproducing wiring maps of the endogenous brain, incorporating additional cell types (e.g., microglia, pericytes, endothelium), and developing stereotypical structure11, they will empower investigation of the long-term effects of these early developmental abnormalities on circuit physiology and function.
Reproducible brain organoid models, combined with high-throughput single cell genomic methods, are suitable systems for unbiased identification of pathogenic mechanisms in neurodevelopmental disorders. This work paves the way for future efforts to multiplex perturbation and analysis of large panels of ASD risk genes in organoids to understand whether convergent mechanisms are at play across the multitudinous implicated genes, and for a deeper analysis of the effect of different genomic contexts on ASD outcome.
The HUES66 CHD8 parental hESC line64 and CHD8 mutant line (HUES66 AC2), a clone which has a heterozygous 13 nucleotide deletion, resulting in a stop codon at amino acid 1248 (CHD8 gRNA: 5′-TTCTTACTGTGTACCCGGGC-3′ (TGG) (SEQ ID NO: 1)), were kindly provided by N. Sanjana, X. Shi, J. Pan, and F. Zhang (Broad Institute of MIT and Harvard). The psychiatric control Mito210 and Mito294 parental iPSC lines were provided by B. Cohen (McLean Hospital); the parental PGP1 iPSC line by G. Church (Harvard University)65; the GM08330 iPSC line (a.k.a. GM8330-8) by M. Talkowski (MGH) and was originally from Coriell Institute and the H1 parental hESC line (a.k.a. WA01) was purchased from WiCell. Cell lines were cultured as previously described14,66. Among these cell lines iPSC lines from individuals with no known history of ASD or other psychiatric condition (Mito210 and Mito294 confirmed by structured psychiatric interview, PGP1 with publicly available records) were included. All human pluripotent stem cell lines were maintained below passage 50, were negative for mycoplasma (assayed with MycoAlert PLUS Mycoplasma Detection Kit, Lonza), and karyotypically normal (G-banded karyotype test performed by WiCell Research Institute). The HUES66 and PGP1 lines were authenticated using STR analysis completed by GlobalStem (in 2008) and TRIPath (in 2018), respectively. The Mito210 and Mito294 lines were authenticated by genotyping analysis (Fluidigm FPV5 chip) performed by the Broad Institute Genomics Platform (in 2017). The H1 and GM08330 lines were authenticated using STR analysis completed by WiCell (in 2021). In Mito294 ARID1B control line a CNV smaller than 0.5 Mb on Ch19 was detected via SNP array analysis. The GM08330 parental line and modified lines have all an interstitial duplication in the long (q) arm of chromosome 20. All experiments involving human cells were performed according to ISSCR 2021 guidelines67, and approved by the Harvard University IRB and ESCRO committees.
The CRISPR guides for SUV420H1 and ARID1B were designed using the Benchling CRISPR Guide Design Tool (Benchling Biology Software, 2017). The guides were designed to maximize on-target efficiency and minimize off-target sites in intragenic regions68,69. For SUV420H1, a guide was designed to target the N-terminal domain to create a protein truncation early in the translated protein in all known protein coding transcripts (SUV420H1 gRNA: 5′-CAAGAACCAAACTGGTTGCT-3′ (AGG) (SEQ ID NO: 2)). The ARID1B guide was chosen to induce a stop codon immediately before the ARID domain (ARID1B gRNA: 5′-CTCTAGCCTGATGAACACGC-3′ (AGG) (SEQ ID NO: 4)). For CHD8, all the mutant lines were generated using the same gRNA previously used for the generation of the HUES66 AC2 (CHD8 gRNA: 5′-TTCTTACTGTGTACCCGGGC-3′ (TGG) (SEQ ID NO: 1)).
Mutations in SUV420H1 were introduced in the Mito210, Mito294 and PGP1 iPSC lines. For the Mito210 and Mito294 SUV420H1 mutant lines, nanoblades generated as previously described70 were mixed with 300 μl of mTeSR1 and 4 μg/ml of polybrene and added to 80% confluent cells. For selection of the edited clones, cells were enzymatically detached and plated at a ratio of ˜5,000 cells per 60 mm dish with 10 μM of ROCK inhibitor (Y-27632, Millipore-Sigma) to increase single-cell survival. When the colonies started to appear, each clone was manually collected and transferred into a single well of a 96-well plate. During colony picking, some cells were reserved for DNA extraction and clonal screening. The PGP1 SUV420H1 mutant line was generated in collaboration with the Harvard Stem Cell Institute (HSCI) iPS Core Facility. Briefly, parental cells were transfected with the Neon system (1,000v, 1,100v or 1,200v, 30 ms, 1 pulse). For 100,000 cells 6.25 pmol TrueCut™ Cas9 Protein v2 (Thermo Fisher Cat: A36496) and 12.5 pmol of sgRNA (Synthego) were used. Post transfection, the pools of cells were harvested to test knock-out efficiency. The best pool was then selected for low density plating (600 to 2,000 cells per 10 cm dish). A week later, colonies were picked into one 96 well plate. Clones were screened by PCR and Sanger sequencing. Heterozygous clones were expanded and the genotypes were re-confirmed post expansion.
Mito210 and Mito294 ARID1B edited lines were generated by the Broad Institute Stem Cell Facility. The guide RNA and Cas9 (EnGen Cas9 NLS from New England Biolabs) were transfected by using the NEON transfection system (Thermo Fisher Scientific, 1050 V, 30 ms, 2 pulses and 2.5×105 cells).
Mutations in CHD8 were introduced in the Mito210 and Mito294 lines using the Amaxa 4D-Nucleofector® (Lonza), using the protocol optimized for pluripotent stem cell lines. Parental cell lines were transfected with gRNA-CHD8-Cas92APuro and immediately plated in mTeSR1 for 24 hours. Selection of transfected cells was done by adding 0.25-0.5 μg/ml of puromycin after 48 hours of transfection, for two days. Selection of the edited clones was performed according to the protocol described for the Mito210 and Mito294 SUV420H1 clones. The H1 and GM08330 CHD8 mutant lines were generated in collaboration with the HSCI iPS Core Facility following the protocol used to generate the PGP1 SUV420H1 mutant line.
Insertions/deletions in individual clones were screened via PCR amplification using primers flanking the guide. For more details about Insertions/deletions see
Cortical organoids were generated as previously described14,66. Embryoid bodies were formed in the same pluripotent media in which they were maintained for 1-2 days in order to better enable the formation of embryoid bodies from each line.
Samples were prepared as previously described14. Cryosection thickness varied from 14-18 μm.
Organoids were processed using the SHIELD protocol71. Briefly, organoids were fixed for 30 minutes in 4% paraformaldehyde (PFA) at room temperature (RT) and were then treated with 3% (wt/v) polyglycerol-3-polyglycidyl ether (P3PE) for 48 hours in ice cold 0.1 M phosphate buffer (pH 7.2) at 4° C. then transferred to 0.3% P3PE in 0.1 M sodium carbonate (pH 10) for 24 hours at 37° C. Samples were rinsed and cleared in 0.2 M SDS in 50 μmM phosphate buffered saline (PBS, pH 7.3) for 48 hours at 55° C. Organoids were stained with Syto16 (Thermo Fisher Scientific #S7578) and anti-SOX2 using the SmartLabel system (Lifecanvas). Tissues were washed extensively for 24 hours in PBS+0.1% Triton-X-100 and antibodies were fixed to the tissue using a 4% PFA solution overnight at RT. Tissues were refractive index-matched in PROTOS solution (RI=1.519) and imaged using a SmartSPIM axially-swept light-sheet microscope (LifeCanvas Technologies). 3D image datasets were acquired using a 15×0.4 NA objective (ASI-Special Optics, #54-10-12). Optical sections from whole-organoid datasets are shown in
Microscopy and organoid size analysis
Images of organoids in culture were taken with an EVOS FL microscope (Invitrogen), Lionheart™ FX Automated Microscope (BioTek Instruments), or with an Axio Imager.Z2 (Zeiss). Immunofluorescence images were acquired with the latter two and analyzed with the Gen5 (BioTek Instruments) or Zen Blue (Zeiss) image processing software. ImageJ72 was used to quantify organoid size. Area values were obtained by tracing individual organoids on ImageJ software which measured area pixels. Measurements were plotted as a ratio to the average value for control organoids of each experimental batch.
Proteins were extracted from iPSC using N-PER™ Neuronal Protein Extraction Reagent (Thermo Fisher Scientific) supplemented with protease (cOmplete™ Mini Protease Inhibitor Cocktail, Roche) and phosphatase inhibitor (PhosSTOP, Sigma). Lysates were centrifuged for 10 minutes at 13,500 rpm at 4° C. Protein concentration was quantified using the Pierce™ BCA Protein Assay Kit (Thermo Fisher Scientific). 15-20 μg of protein lysates were separated on a NuPAGE™ 4-12%, Bis-Tris Gel (Invitrogen) or Mini-PROTEAN 4-15% Gels (Bio-Rad) and transferred onto PVDF membrane. Blots were blocked with 5% nonfat dry milk (Bio-Rad) and incubated with primary antibodies overnight. Blots were then washed and incubated at room temperature (RT) with secondary horseradish peroxidase-conjugated antibodies (Abcam) for 1 hour. Blots were developed using SuperSignal™ West Femto Maximum Sensitivity Substrate (Thermo Fisher Scientific) or ECL™ Prime Western Blotting System (Millipore), and a ChemiDoc System (Bio-Rad). Densitometry band quantification was done using Fiji software73 v2.0 and normalized on housekeeping genes (GAPDH or P-actin). The bands used for quantification are marked with an asterisk in
Organoids were transduced with pAAV-CAG-SomaGCaMP6f2 (Addgene, #158757) by pipetting 0.2 μl of stock virus to 500 μl of Cortical Differentiation Medium IV (CDMIV, without matrigel) in a 24 well containing a single organoid. On the next day, each organoid was transferred to a 6-well plate filled with 2 ml of fresh medium. On the third day after transduction, organoids were transferred to low attachment 10-cm plates and on the seventh day, medium was switched to BrainPhys (5790 STEMCELL Technologies) supplemented with 1% N2 (17502-048 Thermo Fisher), 1% B27 (17504044 Thermo Fisher), GDNF (20 ng/ml, Cat. No. 78139 STEMCELL Technologies), BDNF (20 ng/ml, 450-02 Peprotech), cAMP (1 mM, 100-0244 Stemcell Technologies), Ascorbic acid (200 nM, Cat. No. 72132 STEMCELL Technologies), and laminin (1 μg/ml, 23017015 Life Technologies). Organoids were cultured in BrainPhys for at least 2 weeks before imaging.
Brain organoids were randomly selected and transferred to a recording chamber containing BrainPhys. Imaging was performed using a confocal scanner (CSU-W1, Andor confocal unit attached on an inverted microscope [Ti-Eclipse, Nikon]), while the organoids were kept at 37° C. using a heating platform and a controller (TC-324C, Warner Instruments). The use of a 10× objective (Plan Apo λ, 10×/0.45) resulted in a field of view of 1.3×1.3 mm2 and a pixel size of 0.6 μm. Imaging took place in fast-time-lapse mode, with an exposure time of 100 ms, resulting in an acquisition rate of 8 frames/sec. Spontaneous activity was recorded in three different z-planes, for at least 22 minutes of baseline activity in total (with no pharmacology treatment).
Stock solutions of 2,3-dioxo-6-nitro-1,2,3,4-tetrahydrobenzo[f]quinoxaline-7-sulfonamide disodium salt (NBQX disodium salt, Abcam; 100 mM) and Tetrodotoxin citrate (TTX, Abcam; 10 mM) were prepared in ddH2O. Bath application of NBQX (antagonist of AMPA/kainate glutamate receptors) and TTX (voltage-gated sodium-channel antagonist) was applied to achieve a final bath concentration of 20 μM and 2 μM, respectively.
Data were converted from nd2 format to tiff, and automated motion correction and cell segmentation were performed using Suite2p74, followed by manual curation of segmented cells (the spatial footprint and temporal characteristics of each candidate cell was examined, as well as manually adding neurons with clear cell body morphology (see
Analysis was done using in-house MATLAB scripts. Raw calcium signals for each cell, F(t), were converted to represent changes from baseline level, ΔF/F(t) defined as (F(t) −Fo(t))/Fo(t). The time varying baseline fluorescence, Fo(t), for each cell was a smoothed fluorescence trace obtained after applying a 10-second-order median filter centered at t. Calcium events elicited by action potentials were detected based on a threshold value given by their peak amplitude (5 times the standard deviation of the noise value) and their first time derivative (2.5 times the standard deviation of the noise value).
The analysis of network bursting was performed based on the population-averaged calcium signal along all segmented cells. A peak in the population signal was considered a network burst if it met the following criteria: (1) the peak amplitude was greater than 10 times the standard deviation of the noise value, (2) a set of bursting cells composed of at least 20% of total cells were active during that population calcium transient, and (3) a cell was considered part of the set of bursting cells only if it participated in at least 50% of the network bursts. Under these criteria, 89.3%±14% (range from 60.5% to 100%) and 95.5%±6.8% (range from 77.6% to 100%) of segmented cells participated in network bursting in control and mutant organoids, respectively.
The peaks of the network bursts were used to measure the inter-spike interval (ISI), and the burst frequency was obtained from the average ISI. The burst half-width was also measured from the population-averaged calcium signal by calculating the width of the transient at 50% of the burst peak amplitude.
For analyses of the synchronicity, the ΔF/F(t) signal was used to calculate the cross-correlation between all pairs of cells at zero lag (
Extracellular neurophysiological signals were recorded using a Maxwell Biosystems CMOS-HD-MEA system75 (MaxOne System, MaxWell Biosystems AG, Switzerland). MaxOne chip contains 26,400 platinum electrodes in a sensing area of 3.85×2.10 mm2 with 17.5 m center-to-center pitch, 3265 electrodes/mm2 density, and 1024 configurable low noise readout channels (2.4 μVr·m·s. in the 300 Hz-10 kHz band) with a sampling rate of 20 kHz/s at 10-bit resolution. Acute recordings were performed at room temperature, with the intact organoid in fresh BrainPhys.
For the recordings, MaxLab Live Software (v.20.1.6. MaxWell Biosystems AG, Switzerland) was used. Briefly, spontaneous activity of neurons was measured using the Activity Scan Assay where the whole chip area was scanned with a sparse recording (30 s/configuration, seven configurations). Active neurons were automatically identified, based on the firing rate and spike amplitude obtained from the Activity Scan. Based on the activity of the neurons, the most active electrodes were routed for the creation of the network configuration based on units of 4×3 electrodes each, with 1024 recording electrodes in total (
For spike detection, the software used a finite impulse response bandpass filter between 300-3000 Hz to pre-process the raw data (
When extracting the waveform of the electrodes in a single unit (set of neighboring 4×3 electrodes;
All descriptive statistics and statistical tests were performed in Matlab (Version 9.5, R2018b, The MathWorks, Inc.), using the Statistics Toolbox (version 11.4, R2018b, The MathWorks, Inc.). The Lilliefors test was used to test for normality of data distributions. All datasets met the assumptions of the applied statistical tests. When comparing groups, the equality of the variance was tested at the 5% significance level by a two-tailed squared-ranks test. All statistical tests applied to the electrophysiological data were two-tailed, with a 5% significance level.
For SUV420H1, 4 mutant and 4 control organoids, for CHD8, 3 mutant and 3 control organoids and for ARID1B, 5 mutant and 4 control organoids were used. Cells were placed into microTUBE-15 (Covaris) microtubes with TPP buffer (truXTRAC Protein Extraction Buffer TP, Covaris SKU: 520103) and lysed using a S220 Focused-ultrasonicator instrument (Covaris) with 125 W power over 180 seconds at 10% max peak power. Upon precipitation with chloroform/methanol, extracted proteins were weighed and digested according to the FASP protocol76,77 (100 μg for ARID1B and CHD8; 150 μg for SUV420H1). Briefly, the 10K filter was washed with 100 μl of 50 μmM triethylammonium bicarbonate (TEAB). Each sample was added, centrifuged, and the supernatant discarded. Then, 100 μl of 20 mM Tris (2-carboxyethyl) phosphine (TCEP) at 37° C. was added for one hour, centrifuged, and the supernatant discarded. While shielding from light, 100 μl of 10 mM IAcNH2 was added for 1 hour followed by spinning and discarding the supernatant. Next, 150 μl of 50 μmM TEAB+3 μg of Sequencing Grade Trypsin (Promega) was added to each sample and left overnight at 38° C. The samples were then centrifuged and the supernatants collected. Finally, 50 μl of 50 μmM TEAB was added to the samples, followed by spinning and supernatant collection. The samples were then transferred to HPLC.
TMT (Tandem Mass Tag) Label Reagents (TMTPro, ThermoFisher Scientific, 16plex Label Reagent Set Catalog number: A44521) were equilibrated to RT and resuspended in anhydrous acetonitrile or ethanol (for the 0.8 and 5 mg vials, 41 μl and 256 μl were added, respectively). The reagent was dissolved for 5 minutes with occasional vortexing. TMT Label Reagent (41 μl, equivalent to 0.8 mg) was added to each 100-150 μg sample. The reaction was incubated for one hour at RT. The reaction was quenched after adding 8 μl of 5% hydroxylamine to the sample and incubating for 15 minutes. Samples were combined, dried in a speedvac (Eppendorf) and stored at −80° C.
Before submission to Liquid Chromatography with tandem mass spectrometry (LC-MS/MS), each experiment sample was separated on a Hi-pH column (Thermo Fisher Scientific) according to the vendor's instructions. After separation into 40 (20 for the ARID1B experiment) fractions, each fraction was submitted for a single LC-MS/MS experiment performed on a Lumos Tribrid (Thermo Fisher Scientific) equipped with 3000 Ultima Dual nanoHPLC pump (Thermo Fisher Scientific). Peptides were separated onto a 150 μm inner diameter microcapillary trapping column packed first with approximately 3 cm of C18 Reprosil resin (5 μm, 100 Å, Dr. Maisch GmbH) followed by PharmaFluidics micropack analytical 50 cm column. Separation was achieved by applying a gradient from 5-27% acetonitrile (ACN) in 0.1% formic acid over 90 minutes at 200 nl per minute. Electrospray ionization was enabled by applying a voltage of 1.8 kV using a home-made electrode junction at the end of the microcapillary column and sprayed from stainless-steel tips (PepSep). The Lumos Orbitrap was operated in data-dependent mode for the MS methods. The MS survey scan was performed in the Orbitrap in the range of 400-1,800 m/z at a resolution of 6×104, followed by the selection of the 20 most intense ions (TOP20) for CID-MS2 fragmentation in the Ion trap using a precursor isolation width window of 2 m/z, AGC setting of 10,000, and a maximum ion accumulation of 50 μms. Singly-charged ion species were not subjected to CID fragmentation. Normalized collision energy was set to 35 V and an activation time of 10 ms. Ions in a 10 ppm m/z window around ions selected for MS2 were excluded from further selection for fragmentation for 90 seconds. The same TOP20 ions were subjected to HCD MS2 events in the Orbitrap part of the instrument. The fragment ion isolation width was set to 0.8 m/z, AGC was set to 50,000, the maximum ion time was 150 μms, normalized collision energy was set to 34 V and an activation time of 1 ms for each HCD MS2 scan.
Raw data were submitted for analysis in Proteome Discoverer 2.4 software (Thermo Fisher Scientific). Assignment of MS/MS spectra was performed using the Sequest HT algorithm by searching the data against a protein sequence database including all entries from the Human Uniprot database78,79 and other known contaminants such as human keratins and common lab contaminants. Sequest HT searches were performed using a 10 ppm precursor ion tolerance and requiring each peptides N-/C termini to adhere with Trypsin protease specificity, while allowing up to two missed cleavages. 16-plex TMTpro tags on peptide N termini and lysine residues (+304.207 Da) was set as static modifications while methionine oxidation (+15.99492 Da) was set as variable modification. A MS2 spectra assignment false discovery rate (FDR) of 1% on protein level was achieved by applying the target-decoy database search. Filtering was performed using a Percolator (64 bit version)80. For quantification, a 0.02 m/z window centered on the theoretical m/z value of each of the 6 reporter ions and the intensity of the signal closest to the theoretical m/z value was recorded. Reporter ion intensities were exported in the result file of Proteome Discoverer 2.4 search engine as Excel tables. The total signal intensity across all peptides quantified was summed for each TMT channel, and all intensity values were normalized to account for potentially uneven TMT labeling and/or sample handling variance for each labeled channel.
Potential contaminants were filtered out and proteins supported by at least two unique peptides for the SUV420H1 and CHD8 experiment and at least one for the ARID1B experiment were used for further analysis. Proteins that were missing were kept in at most one sample per condition. Data were transformed and normalized using variance stabilizing normalization using the DEP package of Bioconductor81. To perform statistical analysis, data were imputed for missing values using random draws from a Gaussian distribution with 0.3 width and a mean that was down-shifted from the sample mean by 1.8. To detect statistically significant differential protein abundance between conditions, a moderated t-test was performed using the LIMMA package of Bioconductor82, employing an FDR threshold of 0.1. GSEA was performed using the GSEA software83. GO and KEGG pathway annotation were utilized to perform functional annotation of the significantly regulated proteins. GO terms and KEGG pathways with FDR q-values <0.05 were considered statistically significant.
To build protein interaction networks, the prize-collecting Steiner forest algorithm was used50,84 using the top 50 DEPs (ranked by adjusted p value) from each mutation as terminals, with the absolute value of their log fold change as prizes. This algorithm optimizes the network to include high-confidence protein interactions between protein nodes with large prizes. The PCSF R package v0.99.185 was used to calculate networks, with the STRING database as a background protein-protein interactome5, using parameters n=10, r=0.1, w=2, b=40, and mu=0.01. As by default in that package, the network was subclustered using the edge-betweenness clustering algorithm from the igraph package, and functional enrichment was performed on each cluster using the ENRICHR API. Cytoscape software version 3.8.2 was used for network visualization86. To assess relationships between the three sets of differential proteins, a PPI-weighted gene distance52 was calculated between each pair of protein sets. A background distribution was calculated by drawing size-matched random lists of proteins from all detected proteins in each dataset and calculating the pMM between these sets. This was repeated 1000 times, and an empiracal p-value was calculated by evaluating the number of times randomized pMMs were lower than the value calculated using DEPs.
Dissociation of Brain Organoids and scRNA-Seq
Organoids were dissociated as previously described66,87. Volumes of reagents were scaled down 25× for one month old organoids. Cells were loaded onto either a Chromium™ Single Cell B or G Chip (10× Genomics, PN-1000153, PN-1000120), and processed through the Chromium Controller to generate single cell GEMs (Gel Beads in Emulsion). scRNA-seq libraries were generated with the Chromium™ Single Cell 3′ Library & Gel Bead Kit v3 or v3.1 (10× Genomics, PN-1000075, PN-1000121), with the exception of a few libraries in the earlier experiments that were prepared with a v2 kit (10× Genomics, PN-120237). Libraries were pooled from different samples based on molar concentrations and sequenced them on a NextSeq 500 or NovaSeq instrument (Illumina) with 28 bases for read 1 (26 bases for v2 libraries), 55 bases for read 2 (57 bases for v2 libraries) and 8 bases for Index 1. If necessary, after the first round of sequencing, libraries were re-pooled based on the actual number of cells in each and re-sequenced with the goal of producing approximately equal number of reads per cell for each sample.
scRNA-Seq Data Analysis
Reads from scRNA-seq were aligned to the GRCh38 human reference genome and the cell-by-gene count matrices were produced with the Cell Ranger pipeline (10× Genomics)88. Cell Ranger version 2.0.1 was used for experiments using the GM08330 control “single cell map” and for HUES66 CHD8 mutant and control organoids at 3.5 months, batch I, while version 3.0.2 was used for all other experiments. Default parameters were used, except for the ‘-cells’ argument. Data was analyzed using the Seurat R package v3.1.589 using R v3.6. Cells expressing a minimum of 500 genes were kept, and UMI counts were normalized for each cell by the total expression, multiplied by 106, and log-transformed. Variable genes were found using the “mean.var.plot” method, and the ScaleData function was used to regress out variation due to differences in total UMIs per cell. Principal component analysis (PCA) was performed on the scaled data for the variable genes, and top principal components were chosen based on Seurat's ElbowPlots (at least 15 PCs were used in all cases). Cells were clustered in PCA space using Seurat's FindNeighbors on top principal components, followed by FindClusters with resolution=1.0 (briefly, a 20-nearest neighbor graph was constructed and modularity optimization using the Louvain algorithm was performed to identify clusters). Variation in the cells was visualized by t-distributed stochastic neighbor embedding (t-SNE) on the top principal components.
In the case of the GM08330 one month organoids (single-cell map), cells were demultiplexed using genotype clustering from cells from a different experiment that were sequenced in the same lane. To demultiplex, SNPs were called from CellRanger BAM files with the cellSNP tool v0.1.5, and then the vireo function was used with default parameters and n_donor=2, from the cardelino R library v0.4.090.91 to assign cells to each genotype.
In two cases, one organoid was excluded from the analysis as outliers. See the Statistics & Reproducibility section for details.
For each dataset, upregulated genes in each cluster were identified using the VeniceMarker tool from the Signac package v0.0.7 from BioTuring (available at github.com/bioturing/signac). Cell types were assigned to each cluster by looking at the top most significant upregulated genes. In a few cases, clusters were further subclustered to assign identities at higher resolution. At one month, the excitatory projection neurons included a gradient of immature neurons, which were split into two clusters: the cluster representing the earlier developmental stage was labeled “newborn deep layer projection neurons” and the cluster representing the later stage was labeled “immature deep layer projection neurons”. At three months and beyond, excitatory projection neuron clusters could be identified as deep layer corticofugal neurons and upper-layer callosal projection neurons. For the GABAergic populations, one month organoids included neurons expressing broad markers of GABAergic identity (labelled as “GABAergic neurons”), progenitor cells expressing markers of GABAergic lineage identity (“GABAergic Neuron Progenitors”), and progenitor cells with high expression of cell cycle markers in addition to the progenitor identity markers (“Cycling GABAergic Neuron Progenitors”). At three months and beyond, GABAergic neurons expressed more specific markers of cortical interneurons (hence labelled “GABAergic Interneurons”), and GABAergic lineage progenitors at these ages were divided into “GABAergic Interneuron Progenitors” and “Cycling GABAergic Interneuron Progenitors”, based on level of expression of cell cycle markers.
To assess gene expression of ASD risk genes in GM08330 and Mito210 control organoids across timepoints, datasets from one, three, and six months were merged using Seurat v3.1.5, then batch corrected using Harmony v1.0 with default parameters92. Since the one month data are dominated by cell cycle signal, the ScaleData function was used to regress out variation due to both total UMI count per cell and to cell cycle stage differences, calculated using Seurat's CellCycleScore. Variation was visualized using t-SNE on the first 30 Harmony dimensions. Broad cell types were assigned as above, and mutual information was calculated between cell type assignments and individual organoids with the mpmi R package93. Expression of the 102 ASD risk genes identified in the Satterstrom et. al. study6 was evaluated using Seurat's AddModuleScore function using default parameters. This function calculates the average expression level per cell of the set of genes (based on log-normalized, unscaled data), and then subtracts the average expression of a randomly-selected expression-matched control set of genes. A resulting score above zero indicates that the ASD risk gene set is expressed more highly in that cell than would be expected, given the average expression of the gene set across the dataset.
To compare cell type proportions between control and mutant organoids, for each cell type present in a dataset, the glmer function from the R package lme4 v1.1-2394 was used to estimate a mixed-effect logistic regression model95. The output was a binary indicator of whether cells belong to this cell type, the control or mutant state of the cell was a fixed predictor, and the organoid that the cell belonged to was a random intercept. Another model was fit without the control-versus-mutant predictor, and the anova function was used to compare the two model fits. P-values for each cell type were then adjusted for multiple hypothesis testing using the Benjamini-Hochberg correction.
Nuclei from one month and three month organoids were extracted with two types of procedures according to their size differences. For the one month organoids, the nuclei were extracted following a protocol provided by 10× Genomics96 to minimize material loss, while a sucrose-based nucleus isolation protocol97 was used for the three month organoids to better remove debris. Single-nucleus ATAC-Seq libraries were prepared with the Chromium™ Single Cell ATAC Library & Gel Bead v1 Kit (10× Genomics, PN-1000110) and around 15,300 nuclei per channel were loaded to give an estimated recovery of 10,000 nuclei per channel. Libraries from different samples were pooled based on molar concentrations and sequenced with 1% PhiX spike-in on a NextSeq 500 instrument (Illumina) with 33 bases each for read 1 and read 2, 8 bases for Index 1 and 16 bases for Index 2.
Reads from scATAC-seq were aligned to the GRCh38 human reference genome and the cell-by-peak count matrices were produced with the Cell Ranger ATAC pipeline v2.0.0 (10× Genomics) with default parameters. Data were analyzed using the Signac R package v1.2.198 using R v4.0. Annotations from the EnsDb.Hsapiens.v86 package99 were added to the object. After consideration of QC metrics recommended in that package, cells with 1500-20,000 fragments in peak regions, at least 35% of reads in peaks, a nucleosome signal of less than 4, and a TSS Enrichment score of greater than 2 were retained for further analysis. Latent semantic indexing (LSI) was performed to reduce data dimensionality (counts were normalized using term frequency inverse document frequency, all features were set as top features, and singular value decomposition (SVD) was performed). The top LSI component was discarded as it correlated strongly with sequencing depth, and components 2-30 were used for downstream analysis. Cells were clustered using Seurat's FindNeighbors, followed by FindClusters with the SLM algorithm (a 20-nearest neighbor graph was constructed and modularity optimization using the smart local moving algorithm was performed to identify clusters). Variation in the cells was visualized by UMAP (Uniform Manifold Approximation and Projection) on the top LSI components.
ScATAC-seq data were integrated with scRNA-seq data from the corresponding Mito210 dataset for each timepoint, using Seurat's TransferData to predict cell type labels for the ATAC profiles. Concurrently, differentially accessible (DA) peaks per cluster were called using FindMarkers with the logistic regression framework with the number of fragments in peak regions as a latent variable. These DA peaks were mapped to the closest genes. Top genes per cluster were used to confirm and refine cluster cell type assignments from those based on transferring RNA labels.
DA peaks between control and SUV420H1 mutant organoids were calculated per cell type, using the same method as above. It was noticed that most cell types had very few significantly differential regions, and that these were almost entirely overlapping regions in all cell types. Therefore, differentially accessible regions were calculated using all cells together to improve power. Differentially accessible regions were visualized using Signac's CoveragePlot function with default parameters.
To find transcription factor motifs enriched in differentially accessible regions, the top 400 up- and down-regulated peaks for each time point differentially accessible peaks were supplied to the HOMER software v4.11.1100, using a 300 bp fragment size and masking repeats. In the case of upregulated regions in three month mutant organoids, only 341 regions were supplied, since that was the total number of regions with log FC>0.1 and p>0.1. The top 5 de novo motifs per cell type found by HOMER with a p value <=10−10 are reported, along with all TFs who's known binding sites match that motif with a score >=0.59.
Pseudotime analysis was performed using the Monocle3 v. 0.2.0 software package36 with default parameters. The cells were first subset to contain an equal amount from control and mutant. A starting point for the trajectory was chosen manually by finding an endpoint of the tree located in the earliest developmental cell type (generally, cycling progenitors). Where the cells were split into more than one partition, the starting point was chosen within the partition of interest, and a new UMAP was calculated using just these cells. To test whether mutant trajectories were accelerated compared to control, a one-sided Kolmogorov-Smirnov test was applied comparing the distribution of psuedotime values of control vs. mutant cells, using the stats R package.
In order to learn patterns of coordinated gene regulation across the cells, WGCNA37 was applied to each dataset. Where cells were split into partitions in the above pseudotime analysis, only cells belonging to the partition of interest were used. Normalized gene expression data was further filtered to remove outlying genes, mitochondrial and ribosomal genes. Outliers were identified by setting the upper (>9) and lower (<0.15) thresholds to the average normalized expression per gene. After processing, blockwiseModules function from the WGCNA v1.69 library was performed in R with the parameters networkType=“signed”, minModuleSize=4, corType=“Bicor”, maxPOutliers=0.1, deepSplit=3,trapErrors=T, and randomSeed=59069. Other than power, remaining parameters were left as the default setting. To pick an adequate power for each dataset, the pickSoftThreshold function from WGCNA was used to test values from 1 to 30. Final resolution was determined by choosing the resolution that captured most variation in the fewest total number of modules—this resulted in a power of 3 for SUV420H1 35 d.i.v., and 9 for ARID1B 35 d.i.v. and 12 for CHD8 109 d.i.v.
To calculate differential expression of modules, Seurat objects were downsampled to have an equal number of cells per organoid, and then the AddModuleScore function was used, using gene lists from WGCNA results. For each module, linear mixed-effect models were fit to the data, with the modules scores as the output, the organoid the cell belongs to as a random intercept, and with or without the control-versus-mutant state as a predictor. The anova function was used to compare the models, and p-values were then adjusted across modules using the Benjamini-Hochberg correction.
Differentially expressed genes between control and mutant organoids were assessed after datasets were subset to the cells from the partition of interest in the above pseudotime analysis, to the cells from each individual cell type, or not subset at all for pseudobulk analysis. Reads were then summed across cells in each organoid. Genes with less than 10 total reads were excluded, and DESeq245 was used to calculate DEGs, with each organoid as a sample46. The clusterProfiler101 R package was used to find enriched biological processes in these gene sets, with the enrichGO function and the compareCluster function to highlight processes the gene sets might have in common.
For organoid size analysis, see
For the proteomic analysis, four mutant and four control organoids were used for SUV420H1. Three mutant and three control and five mutant and four control organoids were used for CHD8 and ARID1B, respectively
For scATAC-seq, three SUV420H1 mutant and three control organoids were used for each of the one month and three month time points.
Finally, in each scRNA-seq dataset, three individual organoids per genotype were profiled. In two cases, one organoid was excluded from the analysis as an outlier: in PGP1 SUV420H1 organoids at one month, a mutant organoid was excluded due to very low average nUMI and nGene in that sequencing lane, and in the HUES66 CHD8 organoids at 3.5 months batch II, a mutant organoid was excluded because it mostly contained interneuron lineage cells, with very few projection neuron cells. Although an increase in interneuron-lineage cells was seen in all mutant organoids, this organoid was excluded to be conservative. This left a total of 124 single organoids that passed quality control and were considered in downstream analysis, with a total of 710,085 cells across both scATAC-seq and scRNA-seq.
This application claims the benefit of U.S. Provisional Application No. 63/108,246, filed on Oct. 30, 2020, the contents of which are hereby incorporated by reference in their entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/057632 | 11/1/2021 | WO |
Number | Date | Country | |
---|---|---|---|
63108246 | Oct 2020 | US |