
CDS: a fold-change based statistical test for concomitant identification of distinctness and similarity in gene expression analysis.

Nicolas Tchitchek, José Felipe Golib Dzib, Brice Targat, Sebastian Noth, Arndt Benecke, Annick Lesne.

Abstract

The problem of identifying differential activity such as in gene expression is a major challenge in biostatistics and bioinformatics. Equally important, though much less frequently studied, is the question of similar activity from one biological condition to another. The fold-change, or ratio, is usually considered a relevant criterion for stating difference and similarity between measurements. Importantly, no statistical method for concomitant evaluation of similarity and distinctness currently exists for biological applications. Modern microarray, digital PCR (dPCR), and Next-Generation Sequencing (NGS) technologies frequently provide a means of coefficient of variation estimation for individual measurements. Using fold-change, and by making the assumption that measurements are normally distributed with known variances, we designed a novel statistical test that allows us to detect concomitantly, and thus within the same formalism, differentially and similarly expressed genes (http://cds.ihes.fr). Given two sets of gene measurements in different biological conditions, the probabilities of making type I and type II errors in stating that a gene is differentially or similarly expressed from one condition to the other can be calculated. Furthermore, a confidence interval for the fold-change can be delineated. Finally, we demonstrate that the assumption of normality can be relaxed to consider arbitrary distributions numerically. The Concomitant evaluation of Distinctness and Similarity (CDS) statistical test correctly estimates similarities and differences between measurements of gene expression. The implementation, being time and memory efficient, allows the use of the CDS test in high-throughput data analysis such as microarray, dPCR, and NGS experiments.
Importantly, the CDS test can be applied to the comparison of single measurements (N=1) provided the variance (or coefficient of variation) of the signals is known, making CDS a valuable tool also in biomedical analysis where typically a single measurement per subject is available.

Year: 2012 PMID: 22917185 PMCID: PMC5054499 DOI: 10.1016/j.gpb.2012.06.002

Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691

Introduction

The problem of identifying differentially expressed genes has been widely studied [1]. Considering two different biological conditions, each composed of one or several gene expression measurements, one aims to decide which genes are differentially expressed from one condition to the other. RNA quantification, as used in transcriptome analysis, will serve here as a representative instance of any type of high-throughput quantification of cellular components such as DNA, RNA, proteins, or metabolites, as the underlying problem of identifying statistically significant changes remains similar regardless of the nature of the experiment. Therefore, all of what follows applies equally to proteome or other measurements. For the sake of simplicity, we will only discuss the case of gene expression investigations. The first attempts to tackle the question of differential quantities did not involve statistics: genes whose expression levels differed by more than an arbitrary cut-off fold-change value were considered to be differentially expressed [2], [3]. Although the identification of statistically differentially expressed genes has been widely covered [1], the identification of similarly expressed genes has been far less studied. This is surprising for several reasons. (i) Statistical measures of similarity are an important tool for establishing reproducibility and thus for tracking technical and biological variation. (ii) In relative quantification, such as microarray experiments, where no absolute numbers of, e.g., transcripts are established, a procedure defining what is considered similar, or unchanged, expression would in turn also provide a sound basis for defining what is to be considered different. (iii) Finally, especially in the case of biomedical studies on human subjects and patients, the question of genes with conserved expression across different biological conditions is of similar importance to that of change [4].
When reasoning in a statistical manner, assumptions can generally be made that gene expression measurements are normally distributed. The simplest statistical method for detecting differentially expressed genes is the two-sample t-test [5]. The two-sample t-test allows us to formulate statements concerning the difference between the means of two normally distributed variables with the assumption that the variances are unknown. On the other hand, the two-sample z-test allows us to formulate statements concerning the difference between the means of two normally distributed variables with the assumption that the variances are known. However as this assumption can only be made with a large sample of independent records or with additional information about the variances, the two-sample t-test is more often used in the identification of differentially expressed genes. Different variants of the two-sample t-test can be classified in two groups: (i) methods such as the two-sample t-test with relative thresholds [6] carrying out local adjustments to account for biologically meaningful differences, and the Significance Analysis of Microarrays method [7] that uses a gene-specific correction; and (ii) jointly global and local methods such as the B-statistic [8] and the regularized two-sample t-test [9]. In addition to simple fold-change or t-test-like methods, another approach is to consider the statistical properties of the ratio of means of the two biological conditions sampled. Based on the previous work [10], Chapman [11] proposed for the first time a statistical test in this direction. Recent methods (e.g., [12], [13]) extended this approach by considering confidence intervals for the statistic of the ratio of the two means used in hypothesis testing. When comparing different methods for differential expression detection, among the desirable characteristics that a method should have are reproducibility and control of type I and type II errors. 
Not all of the existing methods necessarily combine both characteristics [14]. Another way of comparing different methods is to measure their false positive and false negative rates [15]. Assume two sets of gene expression measurements obtained from two different biological conditions (Figure 1). By initially assuming that the gene measurements are normally distributed with known variances, we represent the fold-change as the tangent of θ in Figure 1A. With two biological conditions, we can expect different scenarios. If both biological conditions have a small variance within biological replicates and show differential expression, then methods should detect the gene as statistically significantly differentially expressed (Figure 1B). Ideally, the same metric would also detect similar expression across biological conditions (Figure 1C) when they present small variability. However, when variability is high, methods should indicate statistical significance for neither similarity nor difference (Figure 1D).
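The two-sample z-test with known variances discussed above can be sketched in a few lines. This is only an illustration of the classical test on the difference of means, not of the CDS test; it assumes that σ1 and σ2 denote standard deviations, and the function name `two_sample_z_test` is hypothetical.

```python
import math

def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2):
    """Two-sided z-test on the difference of two means with known sigmas.

    Assumption: sigma1 and sigma2 are standard deviations (not variances).
    """
    # Standard error of the difference between the two sample means.
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = (mean1 - mean2) / se
    # Two-sided p-value, P(|Z| >= |z|) for a standard normal Z.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p = two_sample_z_test(10.0, 8.0, 1.0, 1.0, 4, 4)
print(f"z = {z:.3f}, p = {p:.4f}")
```

Note that this test only makes a statement about the difference between the means; it says nothing about the fold-change or about similarity.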

Figure 1

Graphical representation of the problem and the encountered scenarios. A. Expression signals of a single gene in two different biological conditions, with normal distributions having the parameters x1 and x2 (mean values) and σ1 and σ2 (variances). The fold-change criterion defining difference or similarity is represented by a conic section defined by the parameter θ. The problem is to determine the value of (x1, x2) given the values of the estimators (x̂1, x̂2). B. Potential scenario for the statistical test for differential expression and low variability. C. Potential scenario of low variability and similarly expressed genes. D. Potential scenario for the statistical test for no statistical significance and high variability.

We describe here a statistical test, CDS for Concomitant identification of Distinctness and Similarity, which allows: (i) obtaining statements on the fold-change rather than on the difference between the mean expression levels; (ii) providing an estimate of the variance together with the signal; (iii) obtaining bounds on the fold-change, both in case of differentially expressed genes and similarly expressed genes. CDS can thereby be used for single measurements of biological conditions (N = 1), provided an estimate of the variance is available.

Statistical approach

Test formulation

Let X be a random variable following a given distribution D with unknown parameter x, and let x̂ be an estimator of the parameter x obtained from a sample of independent observations of X. Let H0 be a null hypothesis and H an alternative hypothesis, and let R0 and R be two regions (we use the term region as a synonym of set) such that:

$$H_0 : x \in R_0, \qquad H : x \in R.$$

Let R̂0 be the rejection region of H0, such that H0 is rejected if and only if (iff) x̂ ∈ R̂0, and let R̂ be the rejection region of H, such that H is rejected iff x̂ ∈ R̂. The probability of type I error, which is the probability of rejecting the null hypothesis H0 when it is actually true, is then defined by:

$$P(\hat{x} \in \hat{R}_0 \mid x \in R_0).$$

The probability of type II error, which is the probability of rejecting the alternative hypothesis H when it is actually true, is then defined by:

$$P(\hat{x} \in \hat{R} \mid x \in R).$$

For any regions A and B, we have:

$$P(\hat{x} \in A \mid x \in B) \le \sup_{x \in B} P_x(\hat{x} \in A).$$

In plain words, P_x(x̂ ∈ A) is the probability that the estimated value x̂ belongs to A knowing that the actual value of the parameter is x. Controlling the suprema of P_x(x̂ ∈ R̂0) over R0 and of P_x(x̂ ∈ R̂) over R is hence equivalent to controlling the probabilities of making type I and type II errors in the worst cases. Let Q0 and Q be these two probabilities:

$$Q_0 = \sup_{x \in R_0} P_x(\hat{x} \in \hat{R}_0) \tag{1}$$

$$Q = \sup_{x \in R} P_x(\hat{x} \in \hat{R}) \tag{2}$$

The above definitions can be exploited in three different ways. First, given regions R0 and R defined by a null hypothesis H0 and an alternative hypothesis H, and given the value of the estimator x̂, which defines the rejection regions R̂0 and R̂, the probabilities of making type I and II errors can be calculated (more precisely, upper bounded) using Eqs. (1) and (2). Second, given the value of the estimator x̂ defining the rejection regions R̂0 and R̂, and given a confidence level α, a confidence interval for x can be obtained by delimiting regions R0 and R such that Q0 and Q reach the level prescribed by α. Then, it will be stated with a confidence level α that x belongs to the complement of R0 ∪ R. Third, given regions R0 and R defined by a null hypothesis H0 and an alternative hypothesis H, and given ε, a maximal tolerance for the probability of making type I and type II errors, rejection regions R̂0 and R̂ can be delimited such that Q0 ≤ ε and Q ≤ ε. Then, H0 will be rejected iff x̂ ∈ R̂0, and H will be rejected iff x̂ ∈ R̂, with at most a probability ε of making an error.
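As a minimal one-dimensional illustration of the worst-case bound (this example is not taken from the article), assume X is normal with unknown mean x and known standard deviation σ, that x̂ is the mean of n observations, that the null region is the half-line of means at or below a hypothetical threshold x0, and that the rejection region is the half-line of estimates at or above a hypothetical cut-off c. The rejection probability then grows with x, so its supremum over the null region is reached on the boundary x = x0. The sketch below checks this on a finite grid.

```python
import math

def normal_sf(t):
    """Survival function of the standard normal distribution, P(Z >= t)."""
    return 0.5 * math.erfc(t / math.sqrt(2))

def prob_reject(x, c, sigma, n):
    """P_x(x_hat >= c) when x_hat ~ N(x, sigma^2 / n)."""
    return normal_sf((c - x) * math.sqrt(n) / sigma)

def worst_case_q0(x0, c, sigma, n, span=10.0, steps=1001):
    """Grid approximation of the supremum of P_x(x_hat >= c) over x <= x0.

    The grid covers [x0 - span, x0]; span and steps are illustrative choices.
    """
    grid = [x0 - span * i / (steps - 1) for i in range(steps)]
    return max(prob_reject(x, c, sigma, n) for x in grid)

q0 = worst_case_q0(x0=0.0, c=1.0, sigma=1.0, n=4)
boundary_value = prob_reject(0.0, 1.0, 1.0, 4)
print(q0, boundary_value)  # the grid maximum coincides with the boundary value
```

In this toy setting the worst case reduces to a single normal tail probability; in the two-dimensional fold-change setting the same idea applies, but the supremum has to be searched along boundary lines.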

Formulation of fold-change statements

Let X1 and X2 be two random variables following normal distributions with means x1 and x2 and parameters σ1 and σ2, respectively. Let s1 be a sample from X1 of size n1 and empirical mean x̂1, and s2 be a sample from X2 of size n2 and empirical mean x̂2. Furthermore, assume that σ1 and σ2 are known (we will discuss this aspect later in detail). Consider the samples s1 and s2 as two sets of expression measurements of a specific gene of interest in two different biological conditions. Formulating statistical statements about the fold-change between the means x1 and x2 using the statistical approach described above requires adequately defining the regions R0, R, R̂0, and R̂. To formulate fold-change statements between the means x1 and x2 of the two normal distributions, these regions have to be defined using conic sections C(θ), where θ is an angle on each side of the first diagonal. Moreover, the means x1 and x2 must be controlled to avoid a negative contribution of the distributions to the fold-change. As only positive values of the means have to be taken into account, regions R0 and R must be curtailed from zero, and regions R̂0 and R̂ must likewise be curtailed from lower bounds on the two estimates. We will henceforth simplify the notation and write x̂1 (resp. x̂2) for the value of the estimator of the mean x1 (resp. x2), as computed from the data sample. Figure 2 illustrates the definition of regions R0 (Figure 2A), R (Figure 2B), R̂0 (Figure 2C), and R̂ (Figure 2D) with arbitrary parameters.

Figure 2

Representation of the different regions R0, R, R̂0, and R̂. A. Region R0 shown in blue. B. Region R shown in blue. C. Region R̂0 shown in red. D. Region R̂ shown in red.

Probabilities Q0 and Q described in Eqs. (1) and (2) with the above defined regions are then given by:

$$Q_0 = \sup_{(x_1, x_2) \in R_0} P_{x_1, x_2}\big((\hat{x}_1, \hat{x}_2) \in \hat{R}_0\big) \tag{3}$$

$$Q = \sup_{(x_1, x_2) \in R} P_{x_1, x_2}\big((\hat{x}_1, \hat{x}_2) \in \hat{R}\big) \tag{4}$$

where the estimators x̂1 and x̂2 follow the sampling distributions determined by x1, x2, σ1, σ2, n1, and n2. As explained above, these definitions can be exploited in three different ways. First, given two angles θ0 and θ that are relevant to assess the similarity and the distinctness between x1 and x2, and given x̂1 and x̂2 defining the rejection regions R̂0 and R̂, the probabilities of making type I and II errors can be calculated using Eqs. (3) and (4). To formulate fold-change statements, the angles θ0 and θ are derived from two fold-change values that are relevant to assess the similarity and the distinctness between x1 and x2. Q0 (resp. Q) will then give the probability of making an error when stating that a gene is differentially (resp. similarly) expressed. Second, given x̂1 and x̂2 (computed from the data samples) defining the rejection regions R̂0 and R̂, and given a confidence level α, a confidence interval for the fold-change can be obtained by delimiting regions R0(θ0) and R(θ) such that Q0 and Q reach the level prescribed by α. Then, it will be stated with a confidence level α that (x1, x2) belongs to the complement of R0(θ0) ∪ R(θ), which corresponds to stating that the angle of (x1, x2) lies between θ0 and θ. Denoting by f the fold-change between x1 and x2, this is equivalent to stating with a confidence level α that f lies between the two fold-change values corresponding to θ0 and θ. Third, given two angles θ0 and θ (i.e., fold-changes, see above) that are relevant to assess the similarity and the distinctness between x1 and x2, and given ε, a maximal tolerance for the probability of making type I and type II errors, rejection regions R̂0 and R̂ can be delimited. However, as those regions are defined by three parameters, their delimitation is more complicated than for the regions R0 and R. Also, as they are not essential for our question, we will not focus here on their delimitation.
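The correspondence between angles and fold-changes can be made concrete with a small sketch. It rests on an assumed convention that is not spelled out in the text above: the fold-change x2/x1 is the tangent of the direction of (x1, x2) measured from the x1 axis, and θ is measured from the first diagonal. Under that convention, a cone of half-angle θ around the diagonal contains exactly the positive pairs whose fold-change lies between 1/f and f, with f = tan(θ + π/4). All function names are hypothetical.

```python
import math

def fold_change_to_angle(f):
    """Angle (radians) from the first diagonal for a fold-change f >= 1."""
    return math.atan(f) - math.pi / 4

def angle_to_fold_change(theta):
    """Fold-change of a direction lying at angle theta above the first diagonal."""
    return math.tan(theta + math.pi / 4)

def angle_from_diagonal(x1, x2):
    """Absolute angle between the direction of (x1, x2) and the first diagonal."""
    return abs(math.atan2(x2, x1) - math.pi / 4)

def in_cone(x1, x2, theta):
    """True if (x1, x2), both positive, lies within theta of the first diagonal."""
    return x1 > 0 and x2 > 0 and angle_from_diagonal(x1, x2) <= theta

theta = fold_change_to_angle(2.0)   # cone matching a two-fold change
print(in_cone(1.0, 1.5, theta))     # fold-change 1.5: inside the cone
print(in_cone(1.0, 3.0, theta))     # fold-change 3: outside the cone
print(in_cone(3.0, 1.0, theta))     # fold-change 1/3: outside, by symmetry
```

The symmetry follows from tan(π/4 − θ) = 1 / tan(π/4 + θ), so the same angle θ bounds both up- and down-regulation by the same factor.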

Test behavior and biological application

Test behavior

Let us have three different situations as displayed in Figure 3 represented as bar charts: a gene showing statistically significantly differential expressions (Figure 3A), another gene whose expression does not differ statistically significantly from one to another condition (Figure 3B), and finally a situation where a given gene cannot be said to be statistically significantly differentially nor similarly expressed due to the variability of its expression levels (Figure 3C).

Figure 3

Test behavior validation. In silico simulations using normally distributed data with parameters x1, x2, σ1, σ2 capture the 3 different situations shown in A–C. A. Case of a differentially expressed gene having a significant Q0 value but a high Q value. B. The opposite case, in which the gene is statistically similarly expressed. C. Case where neither Q0 nor Q displays statistical significance. Our method is tested on a real biological dataset (panels D and E), showing the correct behavior. D. Values of Q0 change directly as a function of the fold-change parameter. As we increase the parameter, the values of Q0 are higher. This means that the more we increase the parameter, the fewer Q0 values we have for a given bin of the Q0 histogram. E. Values of Q change inversely as a function of the fold-change parameter. As we increase the parameter, the values of Q are lower. This means that the more we increase the parameter, the more Q values we have for a given bin of the Q histogram.

As previously explained, Q0 is the probability of making an error in stating that a certain gene is differentially expressed between the two biological conditions. Lower values (close to zero) of Q0 indicate then dissimilarities in terms of gene expression as in Figure 3A (Q0 = 0.01) as opposed to the cases presented in Figure 3B and C. Similarly, Q is the probability of making an error in stating that a certain gene is similarly expressed between the two biological conditions. The situation displayed in Figure 3B (Q= 0.03) can be considered as statistically significant as opposed to the cases presented in Figure 3A and C. Moreover, values such as the ones in the example of Figure 3C are associated neither with similarity nor distinctness from a statistical point of view. In summary, our examples suggest three typical situations when comparing the expression levels of a certain gene between two different biological conditions that our statistical test can detect.

Biological application

In order to illustrate the behavior of our statistical test in a biological application, we use a dataset coming from transcriptome microarray studies of adrenal cancer [16], [17], [18], [19]. This dataset is composed of 3 different biological conditions: (i) adrenal cortex carcinoma (ACC), 33 samples; (ii) adrenal cortex adenoma (ACA), 22 samples; and (iii) normal adrenal cortex (NAC), which serves as control, 10 samples [18]. The insulin-like growth factor (IGF) signaling system was identified as being one of the most dominantly altered in ACC, in the form of greatly increased expression of IGF2 [17]. In a subsequent study [18], 10 genes associated with the cancer phenotype were identified. Steroid signaling is associated with ACA, since the activation of this pathway is needed for the production of different hormones. We estimate the evolution of the values of Q0 and Q as we vary their respective fold-change parameters. For example, considering the differences between ACC and NAC (i.e., the malignancy profile), we computed several subtraction profiles as displayed for Q0 (Figure 3D) and Q (Figure 3E). As expected, Q0 becomes more restrictive as its fold-change parameter increases and, conversely, Q becomes more permissive as its fold-change parameter increases. The results obtained here for differentially expressed genes are presented in Figure 4. A summary of the number of differentially expressed genes is displayed in a Venn diagram (Figure 4A). The advantage of our method is that we can extend our scope by looking at cases other than simply differential expression amongst the different biological conditions. For instance, we can consider similar expression in one of the comparisons (Figure 4B) or even in two of them (Figure 4C). Each of these possibilities gives us different insights. Among the 114 genes detected as differentially expressed, the collagen type I – alpha 1 gene (COL1A1) has been identified as present in the adrenal cancer malignancy (Figure 4D).
In particular, we detected the gene encoding secreted phosphoprotein 1 (SPP1), which is present in carcinoma and control samples with little variation while displaying a large variability in the adenoma condition. This can be explained by the fact that this dataset includes adenomas that produce different hormones, all synthesized from cholesterol (steroids), which contributes to the variability of this gene (Figure 4E). We can predict that the gene encoding interleukin-1 alpha (IL-1a) is similarly expressed among carcinoma samples but is not relevant as a malignancy marker, since its expression is not statistically consistent with the other two biological conditions (Figure 4F).

Figure 4

Experimental validation of our CDS statistical test. Venn diagrams of the differentially expressed genes when comparing 3 different biological conditions are shown in panels A–C. A. Comparing differential expression across the three comparisons is the usual case. With the CDS method we can capture more cases, for instance those shown in panels B and C. B. Comparing differential expression in two subtractions and similarity in one subtraction. C. Comparing differential expression in one subtraction and similarity in two subtractions. Panels D–F. Examples of genes detected using both Q0 and Q values obtained with our method. D. Difference in the three biological conditions. E. Similarity between two biological conditions. F. Similarity in two comparisons and difference among the three biological conditions.

Variance estimation

The CDS statistical test described here is based on the assumption of known variances of the signals. This assumption is reasonable in cases where the technology itself provides direct estimates of the variance, as is the case for dPCR and certain NGS applications using recall chemistry. Furthermore, modern microarray platforms provide coefficient of variation estimates, which can be used as proxies for variance [20]. Another important case is the frequently encountered scenario of biomedical investigations where a large number of individual measurements are available (e.g., a single recording per patient or subject). Computing the biological variation from the entire cohort of samples then allows us to compare individual measurements with each other using the CDS statistical test.
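Both routes to a variance estimate can be sketched briefly. The first uses the definition of the coefficient of variation (standard deviation divided by mean); the second pools a cohort of single measurements. The function names and the numbers are illustrative only, and how a platform reports its coefficient of variation (as a fraction or a percentage) is an assumption to check against its documentation.

```python
import statistics

def sigma_from_cv(signal, cv):
    """Standard deviation implied by a coefficient of variation given as a fraction."""
    return abs(signal) * cv

def sigma_from_cohort(values):
    """Sample standard deviation of one gene across a cohort of single measurements."""
    return statistics.stdev(values)

# A signal of 200 units reported with a 10% coefficient of variation.
sigma_single = sigma_from_cv(200.0, 0.10)

# One measurement per subject for the same gene (made-up values).
sigma_cohort = sigma_from_cohort([8.0, 10.0, 12.0])
print(sigma_single, sigma_cohort)
```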

Multiple testing

The CDS statistical test can and should be combined with false discovery rate or similar corrections when applied to many genes in series. We have successfully used both FDR and pFDR methods [21], [22]. Note that the data presented here were not subjected to multiple testing correction, as they only serve to demonstrate the applicability of the CDS method.
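As an illustration of such a correction, the sketch below implements the Benjamini-Hochberg step-up adjustment. The text above does not say which FDR procedure was used, so this choice is an assumption, as is the idea of feeding it a list of per-gene Q0 (or Q) values; the input values are made up.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted values, returned in the input order."""
    m = len(p_values)
    # Indices of the inputs sorted from smallest to largest value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(a, 4) for a in adjusted])
```

Genes whose adjusted value falls below the chosen threshold would then be retained.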

Conclusion

The CDS statistical test is suitable for quantitatively checking statements, typically to determine confidence intervals, about the fold-change between the means of two normally distributed variables, under the assumption that the variances are known. Applied to the identification of differentially and similarly expressed genes in the context of microarray measurements, this statistical test correctly identified genes of interest in benchmark situations and also gave confidence intervals of the fold-change. Moreover, this statistical test can be used for any -omics data as long as the similarity or distinctness between two signals is measured by the fold-change and the required assumptions are fulfilled. Even though, in the present case, gene expression measurements have been assumed to follow normal distributions with known variances, the principle of the test remains valid for other distributions and can be implemented numerically. Indeed, Monte Carlo simulations can be performed to estimate the probabilities Q0 and Q when explicit forms cannot be obtained easily. Finally, when the variances of the normal distributions are not assumed to be known but have to be estimated from the samples, Student t-distributions can be used instead of the normal distributions.
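A Monte Carlo estimate of this kind can be sketched as follows. The code estimates, for one fixed pair of true means, the probability that a pair of estimates falls outside a fold-change cone; the supremum over a region would still require repeating this over candidate means. The gamma-distributed estimators, the exclusion of non-positive estimates, and all names are assumptions made for the sake of the example.

```python
import math
import random

def outside_cone(theta):
    """Region test: positive estimates farther than theta from the first diagonal."""
    def test(a, b):
        if a <= 0 or b <= 0:
            return False  # assumption: non-positive estimates are excluded
        return abs(math.atan2(b, a) - math.pi / 4) > theta
    return test

def monte_carlo_probability(draw_x1_hat, draw_x2_hat, in_region, draws=20000, seed=0):
    """Fraction of simulated estimate pairs that fall in the region."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    hits = sum(in_region(draw_x1_hat(rng), draw_x2_hat(rng)) for _ in range(draws))
    return hits / draws

theta = math.atan(2.0) - math.pi / 4  # cone matching a two-fold change

def gamma_mean_10(rng):
    return rng.gammavariate(100.0, 0.1)  # mean 10, standard deviation 1

def gamma_mean_40(rng):
    return rng.gammavariate(400.0, 0.1)  # mean 40, standard deviation 2

p_same = monte_carlo_probability(gamma_mean_10, gamma_mean_10, outside_cone(theta))
p_diff = monte_carlo_probability(gamma_mean_10, gamma_mean_40, outside_cone(theta))
print(p_same, p_diff)
```

With equal true means almost no simulated pair shows more than a two-fold change, whereas a four-fold difference in true means puts almost every pair outside the cone.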

Methods

Explicit forms of probabilities

The explicit forms of the probabilities appearing in Eqs. (3) and (4) have been obtained by applying affine transformations to the bivariate normal distribution in order to make the integration region rectangular and thus easily computable using the standard bivariate normal complementary cumulative distribution function. These explicit forms are given in Supplementary materials (http://cds.ihes.fr) in Eqs. (5) and (6).

Type I and type II risks upper bounds computation

It is notable that the suprema in Eqs. (3) and (4) are reached on the boundaries of regions R0 and R, that is, on the boundary lines of R0 for Q0 and on the boundary lines of R for Q. Although these probabilities are mathematically defined by Eqs. (3) and (4), their computation called for a numerical estimation given the complexity of the explicit form of their first and second derivatives. We therefore use numerical methods to obtain the maximum values of the probabilities, considering a finite number of instances of the probability distribution functions (as opposed to the exact functions from the mathematical definition) and evaluating them over a finite interval in the parameter space (as opposed to the infinite interval assumed in the mathematical definition).
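The grid evaluation along a boundary line can be sketched under simplifying assumptions that are not taken from the article: the rejection region is reduced to a single half-plane of estimates whose ratio is at least a hypothetical slope g, the estimators are independent normals with standard errors s1 and s2, and the null boundary is the line x2 = f0 · x1 restricted to a hypothetical interval [x_min, x_max].

```python
import math

def normal_cdf(t):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-t / math.sqrt(2))

def prob_ratio_exceeds(x1, x2, s1, s2, g):
    """P(x2_hat >= g * x1_hat) for independent x1_hat ~ N(x1, s1^2), x2_hat ~ N(x2, s2^2)."""
    # x2_hat - g * x1_hat is normal with mean x2 - g * x1 and variance s2^2 + (g * s1)^2.
    return normal_cdf((x2 - g * x1) / math.sqrt(s2 ** 2 + (g * s1) ** 2))

def sup_on_boundary(f0, g, s1, s2, x_min, x_max, steps=200):
    """Grid maximum of the rejection probability along the line x2 = f0 * x1."""
    best = 0.0
    for i in range(steps + 1):
        x1 = x_min + (x_max - x_min) * i / steps
        best = max(best, prob_ratio_exceeds(x1, f0 * x1, s1, s2, g))
    return best

q0_bound = sup_on_boundary(f0=1.5, g=3.0, s1=1.0, s2=1.0, x_min=5.0, x_max=50.0)
print(q0_bound)
```

In this simplified setting the maximum sits at the smallest mean on the grid, which shows why the result depends on how the regions are bounded away from zero.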

Confidence interval computation

As Q0 increases with θ0 (resp. Q increases with θ), this delineation can be done by performing a binary search on the angle θ0 (resp. θ) over its range until the target value of Q0 (resp. Q) is reached.
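The binary search itself can be sketched generically. The function below assumes a probability that is non-decreasing in the angle and an angle range of [0, π/4]; both the range and the toy probability used in the example are stand-ins, since a real run would plug in the numerically computed Q0 or Q.

```python
import math

def bisect_angle(q, target, lo=0.0, hi=math.pi / 4, tol=1e-9):
    """Find the angle in [lo, hi] where a non-decreasing q first reaches target."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q(mid) < target:
            lo = mid  # target not reached yet: search larger angles
        else:
            hi = mid  # target reached: search smaller angles
    return (lo + hi) / 2

def toy_q(theta):
    """Hypothetical stand-in for Q0(theta0): linear from 0 to 1 over [0, pi/4]."""
    return theta / (math.pi / 4)

theta_star = bisect_angle(toy_q, target=0.05)
print(theta_star, math.tan(theta_star + math.pi / 4))
```

The second printed value converts the angle back into a fold-change bound, assuming the angle is measured from the first diagonal.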

Implementation

This statistical test has been implemented in Java, and Q0 and Q as well as the confidence intervals can be computed for a set of 30,000 values in a few minutes. This computational speed makes it conceivable to use the test for the analysis of NGS data. It is worth noting that the nature of the distributions to be used for the analysis of NGS data deserves some thought. Indeed, in contrast to data from microarrays, where the values are continuous signals, the values measured by NGS are discrete, and thus the use of discrete distributions such as the negative binomial distribution may allow better modeling. An R implementation of the CDS statistical test is available at http://cds.ihes.fr.

Data processing

The transcriptome data discussed have first been published in [17], and are available from GEO [23] under Accession No. GSE10927, and mace (http://www.mace.ihes.fr) under Accession No. 2651913582. Data were log-transformed, subjected to an additional round of quality control [24], [25], and normalized using NeONORM [26] for subtraction profiling. No multiple testing correction was performed so as to retain the original P values.

Authors’ contributions

NT, JFGD, SN, AB and AL performed test formulation. NT, JFGD, BT and SN carried out implementation, and NT, BT and AB performed testing. NT, JFGD, AB and AL wrote the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare no competing interests.

19 in total

1. Adrenal cortex remodeling and functional zona glomerulosa hyperplasia in primary aldosteronism.

Authors: Sheerazed Boulkroun; Benoit Samson-Couterie; José-Felipe Golib Dzib; Hervé Lefebvre; Estelle Louiset; Laurence Amar; Pierre-François Plouin; Enzo Lalli; Xavier Jeunemaitre; Arndt Benecke; Tchao Meatchi; Maria-Christina Zennaro
Journal: Hypertension Date: 2010-10-11 Impact factor: 10.190

2. Ratio-based decisions and the quantitative analysis of cDNA microarray images.

Authors: Y Chen; E R Dougherty; M L Bittner
Journal: J Biomed Opt Date: 1997-10 Impact factor: 3.170

3. Probability fold change: a robust computational approach for identifying differentially expressed gene lists.

Authors: Xutao Deng; Jun Xu; James Hui; Charles Wang
Journal: Comput Methods Programs Biomed Date: 2008-10-07 Impact factor: 5.428

Review 4. Mutations in KCNJ5 gene cause hyperaldosteronism.

Authors: Maria-Christina Zennaro; Xavier Jeunemaitre
Journal: Circ Res Date: 2011-06-10 Impact factor: 17.367

5. Distinct transcriptional profiles of adrenocortical tumors uncovered by DNA microarray analysis.

Authors: Thomas J Giordano; Dafydd G Thomas; Rork Kuick; Michelle Lizyness; David E Misek; Angela L Smith; Donita Sanders; Rima T Aljundi; Paul G Gauger; Norman W Thompson; Jeremy M G Taylor; Samir M Hanash
Journal: Am J Pathol Date: 2003-02 Impact factor: 4.307

6. NCBI GEO: archive for functional genomics data sets--10 years on.

Authors: Tanya Barrett; Dennis B Troup; Stephen E Wilhite; Pierre Ledoux; Carlos Evangelista; Irene F Kim; Maxim Tomashevsky; Kimberly A Marshall; Katherine H Phillippy; Patti M Sherman; Rolf N Muertter; Michelle Holko; Oluwabukunmi Ayanbule; Andrey Yefanov; Alexandra Soboleva
Journal: Nucleic Acids Res Date: 2010-11-21 Impact factor: 16.971

7. Quality assessment of transcriptome data using intrinsic statistical properties.

Authors: Guillaume Brysbaert; François-Xavier Pellay; Sebastian Noth; Arndt Benecke
Journal: Genomics Proteomics Bioinformatics Date: 2010-03 Impact factor: 7.691

8. Normalization using weighted negative second order exponential error functions (NeONORM) provides robustness against asymmetries in comparative transcriptome profiles and avoids false calls.

Authors: Sebastian Noth; Guillaume Brysbaert; Arndt Benecke
Journal: Genomics Proteomics Bioinformatics Date: 2006-05 Impact factor: 7.691

9. Testing significance relative to a fold-change threshold is a TREAT.

Authors: Davis J McCarthy; Gordon K Smyth
Journal: Bioinformatics Date: 2009-01-28 Impact factor: 6.937

10. High-sensitivity transcriptome data structure and implications for analysis and biologic interpretation.

Authors: Sebastian Noth; Guillaume Brysbaert; François-Xavier Pellay; Arndt Benecke
Journal: Genomics Proteomics Bioinformatics Date: 2006-11 Impact factor: 7.691

10 in total
