| Literature DB >> 31211786 |
Claire Bomkamp1,2, Shreejoy J Tripathy1,3, Carolina Bengtsson Gonzales4, Jens Hjerling-Leffler4, Ann Marie Craig1,2, Paul Pavlidis1,3.
Abstract
In order to further our understanding of how gene expression contributes to key functional properties of neurons, we combined publicly accessible gene expression, electrophysiology, and morphology measurements to identify cross-cell type correlations between these data modalities. Building on our previous work using a similar approach, we distinguished between correlations which were "class-driven," meaning those that could be explained by differences between excitatory and inhibitory cell classes, and those that reflected graded phenotypic differences within classes. Taking cell class identity into account increased the degree to which our results replicated in an independent dataset as well as their correspondence with known modes of ion channel function based on the literature. We also found a smaller set of genes whose relationships to electrophysiological or morphological properties appear to be specific to either excitatory or inhibitory cell types. Next, using data from PatchSeq experiments, allowing simultaneous single-cell characterization of gene expression and electrophysiology, we found that some of the gene-property correlations observed across cell types were further predictive of within-cell type heterogeneity. In summary, we have identified a number of relationships between gene expression, electrophysiology, and morphology that provide testable hypotheses for future studies.Entities:
Mesh:
Year: 2019 PMID: 31211786 PMCID: PMC6599125 DOI: 10.1371/journal.pcbi.1007113
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1Methods for modeling relationships between gene expression and electrophysiological or morphological properties with respect to cell class (A) Schematic for defining cell types from single-cell transcriptomic or electrophysiological and morphological data. We divided cells into types based on Cre-driver expression as well as cortical layer and excitatory/inhibitory identity (left). Right panel shows summarization of cellular features by cell type for a hypothetical gene and property, where each point in the scatter plot represents each cell type’s mean gene expression (x-axis) and the mean value of an electrophysiological or morphological property (y-axis). (B) A hypothetical class-driven relationship between a gene and an electrophysiological or morphological property, in which neither cell class (excitatory or inhibitory) shows a relationship between gene expression and the property (solid lines), but an overall relationship appears because of systematic cross-class differences in both data modalities (dashed line). For B-D, small points represent individual cells and larger circles or diamonds represent cell type averages. (C) A hypothetical example of a non-class-driven relationship, where the gene-property relationship appears within each major cell class (solid lines), but would be obscured if modeled in a class-independent manner (dashed line). (D) A hypothetical example of a gene-property relationship exhibiting an interaction with cell class. Here, expression of the gene is positively correlated with the property in excitatory cell types but negatively correlated in inhibitory types (solid lines).
Fig 2Different sets of genes are associated with electrophysiological and morphological properties after correcting for cell class.
(A) Number of genes significantly associated with each property in the class-conditional model at various levels of significance (only properties with significant genes in this model are shown). Darkness of the bar represents the significance level of each group of genes. Venn diagrams to the left indicate the extent of overlap (pink; middle) between the gene sets identified by the class-independent (gold; left) and class-conditional (teal; right) models, where the area of each segment is proportional to the significant gene count at a threshold of FDR = 0.1. Venn diagrams for different properties are not to scale with one another. Percentages next to principal components (PCs) indicate the percent variability explained by that component. See S2 Table for descriptions of electrophysiological and morphological properties analyzed here, as well as gene counts for all properties. (B) Comparison of model-based slopes from the class-independent and class-conditional models. Each point represents a single gene’s relationship with the electrophysiological property AHP amplitude and is colored according to whether the relationship is significant in one or both models (at FDR = 0.1). Example genes in C-E are indicated. For clarity of visualization, only a random subset of genes (2% total) are shown to mitigate over-plotting. Dashed line indicates identity. (C-E) Examples of genes showing significant associations with AHP amplitude that are class-driven (C; significant in class-independent model only), non-class-driven (D; significant in class-conditional model only), or non-class-driven but significant by either model (E). Solid lines indicate linear fits within excitatory or inhibitory cell classes only and dashed line indicates a linear fit including all cell types. Gene expression is quantified as counts per million (CPM).
Fig 3Identification of divergent gene-property relationships in excitatory versus inhibitory cell classes (A) Number of genes showing a significant interaction effect between gene and class for each property. Darkness of the bar represents the significance level of each group of genes. Venn diagrams to the left indicate the extent of overlap (pink; middle) between the class-conditional (teal; left) and interaction (purple; right) models, where the area of each segment is proportional to the significant gene count at a threshold of FDR = 0.1. Venn diagrams for different properties are not to scale with one another. Percentages next to principal components (PCs) indicate the percentage of variability explained by that component. (B-C) Slope values within excitatory cell types (x axis) plotted against the slope values for the same set of genes in inhibitory cell types (y axis). Each point represents a single gene’s relationship to the morphological property maximum branch order (B) or electrophysiological property AHP amplitude (C), and is colored according to its significance in one or both models (see inset legend). Example gene-property relationships highlighted in D-E are marked in panel C. For clarity of visualization, only a random 2% subset of the total number of genes are plotted. Dashed lines indicate positive and negative unity lines. (D) Example of a gene with a significant interaction term which is not significant in the class-conditional model. For D and E, solid lines indicate linear fits including only excitatory or only inhibitory cell types, and dashed line indicates a linear fit including all cell types. (E) Example of a gene which is significant in both the class-conditional and interaction models.
Fig 4Modeling gene/electrophysiology relationships using the class-conditional model is more predictive than the class-independent model of model slopes in an independent dataset containing non-projecting cell types only (A) Aggregate gene-property relationship consistency between AIBS and NeuroExpresso/NeuroElectro (NE) datasets. Error bars indicate a 95% confidence interval, and asterisk indicates a significant (p < 0.05) difference in the consistency metric between the class-independent and class-conditional models, calculated using 100 bootstrap resamples of the original values (indicated only for properties where both values are positive). (B) Direct comparison of gene-property relationships between the AIBS and NE datasets. Each point represents the relationship between a single gene and the property AP half-width. The model slope from the AIBS dataset is plotted on the x axis (with the class-independent model (ind) slopes in gold, and the class-conditional model (cond) slopes in teal), and that for the same set of genes in the NE dataset on the y axis. For clarity of visualization only 10% of the total number of genes are plotted. Lines indicate a linear fit for each set of points. The correlation within each set of points is used as a measure of cross-dataset consistency (plotted for all properties in panel A). (C-D) Example of a gene showing consistent results between the NE dataset and the AIBS dataset using the class-conditional model, but not the class-independent model. C shows the relationship within the AIBS dataset, and D shows the same gene and property in the NE dataset. Solid lines indicate a linear fit including only types belonging to one cell class, and dashed line indicates a linear fit including all cell types.
Description of PatchSeq datasets re-analyzed in this study.
Depending on the dataset, RNA amplification was performed using variations on single-cell-tagged reverse transcription (STRT) [31] or Switching Mechanism At the end of the 5’-end of the RNA Transcript (SMART) [32]. The Bengtsson Gonzales dataset reflects a novel dataset reported here for the first time.
| Dataset | Description | RNA amplification | Number of cells | Accession |
|---|---|---|---|---|
| Cadwell [ | Cortical layer 1 interneurons | Smart-seq2 | 57 | E-MTAB-4092 |
| Fuzik [ | Cortical layer 1/2 interneurons and pyramidal cells | STRT-C1 (with unique molecule identifiers) | 80 | GSE70844 |
| Földy [ | Hippocampal CA1 and subiculum pyramidal cells and regular- and fast-spiking interneurons | SMARTer | 93 | GSE75386 |
| Muñoz-Manchado [ | Striatum interneurons | STRT-C1 (with unique molecule identifiers) | 99 | GSE119248 |
| Bengtsson Gonzales | Hippocampal CA1 Pvalb-Cre interneurons | STRT-C1 (with unique molecule identifiers) | 19 | GSE130950 |
Fig 5Assessing gene-property relationships within cell subclasses using PatchSeq (A) Number of genes associated with each electrophysiological property based on a joint cross-laboratory analysis of 5 PatchSeq datasets. Genes shown are significant at FDR = 0.1, based on a mixed-effects regression model, treating gene expression as a fixed effect and dataset identity and cell type as random effects. Bar color denotes overlap of PatchSeq based gene-property relationships with analogous relationships from the AIBS class-conditional model analysis. Note that analysis of gene-property relationships in the PatchSeq datasets are independent from those in the AIBS cell types analysis. (B, C) Examples of genes showing significant associations with electrophysiological features in the class-conditional analysis of the AIBS dataset (left-most panel) and the mixed-effects analysis of the PatchSeq datasets (other panels). Dataset name and cell type is shown in the subpanel title and solid lines indicate linear fits within cell classes (AIBS) or fits within each PatchSeq dataset and cell type, after weighting cells by transcriptome-quality (see Methods). Based on differences in mRNA quantification, x-axis units for AIBS, Cadwell, and Földy datasets are log2 (CPM+1), and for Bengtsson Gonzales, Muñoz, and Fuzik datasets are log2 normalized molecule counts (normalized to 2000 unique molecules per cell).
Fig 6Accounting for cell class changes the interpretation of the relationship between potassium channel expression and after-hyperpolarization amplitude (A) Schematic view of an action potential trace, with the dashed line representing the AHP amplitude value. (B-D) Examples of voltage-gated potassium channel genes significantly associated with AHP amplitude in the class-independent model (B), the class-conditional model (C), or both (D) at a threshold of FDR = 0.01. Solid lines indicate a linear fit including only excitatory or only inhibitory cell types, and dashed line indicates a linear fit including all cell types. (E) Comparison of class-independent and class-conditional approaches for detecting associations between voltage-gated potassium channels and AHP amplitude. Each point indicates a single gene, and x and y axes are the slopes from the class-independent and class-conditional models, respectively. Labeled points are the example genes shown in B-D. Dashed line indicates identity.
Fig 7Examples of experimentally supported or otherwise potentially interesting genes (A-D) Examples of genes showing statistically-significant gene-property relationships in the class-conditional model (at FDR = 0.1) that also have experimental support for their causal regulation of the property in the literature. Solid lines indicate linear fits including only excitatory or only inhibitory cell types, and dashed line indicates a linear fit including all cell types (also applies to F-I). (E) Heatmap showing a subset of the most significant genes for each property in the class-conditional model, sorted along both axes by similarity. Dendrogram represents cross-property similarity between the significance levels for the genes shown here; properties appearing closely linked in the dendrogram are those which are strongly associated with the same genes in our analysis. For each property, up to 3 top genes were chosen that were significant (at FDR = 0.1) in the class-conditional model, and also non-significant (at FDR = 0.2) in both the class-independent and interaction models for the same property. In addition, genes marked by asterisks are shown here based on their known function based on the literature in addition to at least one significant result in the class-conditional model, shown as scatterplots in A-D and F-I. Light grey indicates a non-significant result in the class-conditional model (at FDR = 0.1). (F-I) Examples of under-studied but plausibly causal genes showing significant results in the class-conditional model (see text).
Top correlated genes for each electrophysiological property.
Genes marked with asterisks are significantly associated (at FDR = 0.1) with the indicated property in the class-conditional model, and selected based on their reported function in the literature. All other genes are significant (at FDR = 0.1) in the class-conditional model and non-significant (at FDR = 0.2) in both the class-independent and interaction models for the indicated property. “Direction” indicates the direction of the model slope; for example, high expression of Daam1 in a cell type predicts a low value of AP half-width and vice versa.
| Gene | Gene Name | q | Direction | |
|---|---|---|---|---|
| Electrophysiology PC3 (9%) | RasGEF domain family, member 1B | 0.029 | + | |
| Electrophysiology PC3 (9%) | G-protein coupled receptor 88 | 0.036 | + | |
| Electrophysiology PC3 (9%) | copine IV | 0.041 | + | |
| Electrophysiology PC2 (18%) | AT rich interactive domain 5A (MRF1-like) | 0.00088 | + | |
| Electrophysiology PC2 (18%) | brain abundant, membrane attached signal protein 1 | 0.0026 | - | |
| Electrophysiology PC2 (18%) | phosphatidylinositol 3-kinase, regulatory subunit, polypeptide 1 (p85 alpha) | 0.0026 | - | |
| Electrophysiology PC1 (47%) | insulin-like growth factor 1 | 9.4 * 10−11 | - | |
| Electrophysiology PC1 (47%) | inositol 1,4,5-trisphosphate receptor 1 | 1.3 * 10−10 | + | |
| Electrophysiology PC1 (47%) | rho/rac guanine nucleotide exchange factor (GEF) 2 | 1.4 * 10−8 | + | |
| Rheobase | distal-less homeobox 1 | 9.5 * 10−7 | - | |
| Rheobase | distal-less homeobox 2 | 0.00052 | - | |
| Rheobase | solute carrier family 6 (neurotransmitter transporter, GABA), member 1 | 0.00085 | + | |
| AP Threshold | AT rich interactive domain 5A (MRF1-like) | 0.013 | + | |
| AP Threshold | potassium voltage-gated channel, subfamily F, member 1 | 0.013 | - | |
| AP Threshold | tubulin, alpha 8 | 0.019 | + | |
| AP Half-width | keratin 1 | 2.4 * 10−7 | + | |
| AP Half-width | N-terminal EF-hand calcium binding protein 2 | 1.6 * 10−6 | + | |
| AP Half-width | epoxide hydrolase 4 | 2.2 * 10−6 | - | |
| AP Amplitude | inositol 1,4,5-trisphosphate receptor 1 | 2.9 * 10−6 | - | |
| AP Amplitude | RAS-related C3 botulinum substrate 3 | 4.5 * 10−5 | + | |
| AP Amplitude | ArfGAP with coiled-coil, ankyrin repeat and PH domains 2 | 5.3 * 10−5 | - | |
| AHP Amplitude | insulin-like growth factor 1 | 6.7 * 10−9 | - | |
| AHP Amplitude | prospero homeobox 1 | 4.8 * 10−7 | - | |
| AHP Amplitude | sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3C | 6.1 * 10−7 | - | |
| Capacitance | expressed sequence AW551984 | 4.6 * 10−7 | + | |
| Capacitance | leucine rich repeat containing 4C | 0.00000049 | - | |
| Capacitance | oxytocin receptor | 0.00000081 | + | |
| Time Constant Tau | CUGBP, Elav-like family member 6 | 1.2 * 10−6 | + | |
| Time Constant Tau | proline rich 5 (renal) | 3.7 * 10−6 | + | |
| Time Constant Tau | family with sequence similarity 81, member A | 4.3 * 10−6 | - | |
| Input Resistance | cortexin 1 | 7.7 * 10−6 | + | |
| Input Resistance | ectodermal-neural cortex 1 | 7.7 * 10−5 | + | |
| Input Resistance | RIKEN cDNA A330050F15 gene | 0.00025 | - | |
| Resting Membrane Potential | EGF-like domain 7 | 0.012 | + | |
| Resting Membrane Potential | EH domain binding protein 1-like 1 | 0.012 | + | |
| Resting Membrane Potential | transgelin 3 | 0.013 | + | |
| Sag | tubulin, alpha 8 | 0.064 | - | |
| Sag | potassium voltage-gated channel, subfamily F, member 1 | 0.064 | + | |
| Average Interspike Interval | insulin-like growth factor 1 | 2.8 * 10−7 | + | |
| Average Interspike Interval | rho/rac guanine nucleotide exchange factor (GEF) 2 | 3.97 * 10−6 | - | |
| Average Interspike Interval | dendrin | 1.1 * 10−5 | - | |
| Max Firing Frequency | insulin-like growth factor 1 | 4.6 * 10−12 | - | |
| Max Firing Frequency | inositol 1,4,5-trisphosphate receptor 1 | 1.9 * 10−9 | + | |
| Max Firing Frequency | rho/rac guanine nucleotide exchange factor (GEF) 2 | 0.000000017 | + | |
| Input-Output Curve Slope | insulin-like growth factor 1 | 1.6 * 10−13 | - | |
| Input-Output Curve Slope | inositol 1,4,5-trisphosphate receptor 1 | 3.9 * 10−10 | + | |
| Input-Output Curve Slope | synaptotagmin-like 2 | 0.000000034 | + | |
| Adaptation Ratio | SOX2 overlapping transcript (non-protein coding) | 0.00021 | - | |
| Adaptation Ratio | insulin-like growth factor 1 | 0.00023 | - | |
| Adaptation Ratio | distal-less homeobox 1 | 0.0003 | - | |
| Branchiness | methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 2-like | 0.023 | - | |
| Branchiness | interferon induced transmembrane protein 10 | 0.024 | - | |
| Branchiness | lysosomal-associated membrane protein family, member 5 | 0.024 | + | |
| Max Branch Order | mannoside acetylglucosaminyltransferase 5 | 0.017 | - | |
| Max Firing Frequency | potassium voltage-gated channel, shaker-related subfamily, member 1 | 0.00018 | + | |
| AP Half-width | sodium channel, voltage-gated, type I, beta | 0.00079 | - | |
| Branchiness | leucine-rich repeat kinase 2 | 0.048 | + | |
| Max Firing Frequency | potassium voltage-gated channel, shaker-related subfamily, beta member 2 | 1.8 * 10−5 | + | |
| AHP Amplitude | RAB33A, member RAS oncogene family | 0.004 | + | |
| AHP Amplitude | mediator complex subunit 23 | 0.057 | + | |
| AP Half-width | nephronophthisis 4 (juvenile) homolog (human) | 0.036 | - | |
| AP Half-width | dishevelled associated activator of morphogenesis 1 | 0.096 | - |