Irina Makarevitch, Cameo Frechette, Natalia Wiatros.
Abstract
Integration of inquiry-based approaches into the curriculum is transforming the way science is taught and studied in undergraduate classrooms. Incorporating quantitative reasoning and mathematical skills into authentic undergraduate biology research projects has been shown to help students develop skills they will need as future scientists and to attract students to science, technology, engineering, and mathematics disciplines. Although large-scale data analysis has become an essential part of modern biological research, students have few opportunities to analyze large biological data sets. RNA-seq analysis, a tool that allows precise measurement of the expression level of every gene in a genome, has revolutionized molecular biology and provides ample opportunities for engaging students in authentic research. We developed, implemented, and assessed a series of authentic research laboratory exercises that incorporate large-scale RNA-seq data analysis into an introductory undergraduate classroom. Our laboratory series focuses on analyzing gene expression changes in response to abiotic stress in maize seedlings; however, it could easily be adapted to any other biological system with available RNA-seq data. Objective and subjective assessment of student learning demonstrated gains in understanding of important biological concepts and in skills related to the process of science.
Year: 2015 PMID: 26163561 PMCID: PMC4710385 DOI: 10.1187/cbe.15-04-0081
Source DB: PubMed Journal: CBE Life Sci Educ ISSN: 1931-7913 Impact factor: 3.325
Activities implemented as a part of lab series on RNA-seq data analysis
| Activities | Assessment |
|---|---|
| Worksheet 1. Transcriptional Response to Cold Stress: Primary Literature Analysis and Developing Testable Hypotheses | |
| Observation and description: phenotypic effects of abiotic stress | Worksheet 1 (completeness and effort, feedback), lab report |
| Primary literature analysis: effects of abiotic stress on gene expression in plants | |
| Formulating hypotheses/predictions: number and types of genes affected by the stress and variation in response to different stress and between different genotypes | |
| Worksheet 2. RNA-seq Analysis: Principles | |
| Concept discussion: classes of RNA molecules, similarities and differences | Worksheet 2 (completeness and effort, feedback), lab report |
| Knowledge building: principles of RNA-seq analysis, creating libraries, and sequencing | |
| Worksheet 3. RNA-seq Analysis: Data Quality and Initial Analysis | |
| Understanding sequence read files (FastQ): what do my data look like? | Worksheet 3 (completeness and effort, feedback), lab report |
| Initial data analysis: data quality control using Green Line of the DNA Subway | |
| Analogy and exercise: principles of mapping and counting RNA-seq reads | |
| Worksheet 4. Data Analysis: Finding Differentially Expressed Genes | |
| DE-Seq analysis: finding differentially expressed (DE) genes | Lists of DE genes, summary tables, lab report |
| Formulating questions, choosing approaches to data visualization | |
| Data visualization and analysis | |
| Worksheet 5. Data Visualization: Common Types of Graphs Used to Show RNA-seq Data | |
| Exploring various approaches to RNA-seq data graphical visualization | Student presentations and discussion, worksheet 5, lab report |
| Data visualization and analysis | |
| Sharing the results with other groups, discussion of data and graphs | |
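
The "Understanding sequence read files (FastQ)" activity in Worksheet 3 can be illustrated with a minimal sketch. It is not the worksheet's own material or the DNA Subway quality-control step. It assumes the common four-lines-per-record FASTQ layout with Phred+33 quality encoding, and the function names `read_fastq` and `phred_scores` and the two example reads are invented for illustration.

```python
from io import StringIO
from statistics import mean


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # separator line that starts with '+'
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], sequence, quality


def phred_scores(quality, offset=33):
    """Decode a quality string into Phred scores (Phred+33 encoding is assumed)."""
    return [ord(char) - offset for char in quality]


# Two invented reads, for illustration only.
EXAMPLE_FASTQ = """@read1
ACGTACGT
+
IIIIIIII
@read2
ACGTNNNN
+
IIII####
"""

# Mean base quality per read: a simple first look at data quality.
mean_quality = {
    read_id: mean(phred_scores(quality))
    for read_id, sequence, quality in read_fastq(StringIO(EXAMPLE_FASTQ))
}
for read_id, score in mean_quality.items():
    print(read_id, score)
```

In this toy input the second read ends in low-quality bases, so its mean quality is lower than that of the first read; real quality-control tools report such summaries per position across millions of reads.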
Figure 1. The flow of an RNA-seq experiment. The steps shown in blue were performed by students. Students completed learning exercises only for the steps shown in green.
Figure 2. Students’ prior exposure to the data analysis approaches used in the lab. During the first week of class, students were asked to rank their prior experience with gene expression analysis, analysis of large data sets, and the use of R or other programming tools for data analysis and data visualization. Data shown are for 85 students in a genetics course who completed this survey in 2014. The actual number of students is shown for each answer choice.
Examples of questions used to assess student learning
| Concept questionᵃ | Pretest (% correct)ᵇ | Posttest (% correct)ᵇ |
|---|---|---|
| Genes and gene regulation (11 questions: 1–6, 8–11, 14) | ||
| Which of the following human cells contains a gene that specifies eye color? | 34 | 85 |
| In what way is the same environmental signal expected to modify gene activity in different individuals? | 19 | 71 |
| What proportion of genes is likely to change their expression levels in response to environmental stress? | 13 | 87 |
| RNA-seq analysis (9 questions: 7, 12, 13, 15–18, 20, 21) | ||
| What is not true about RNA molecules that are “sequenced” during RNA-seq experiments? | 12 | 78 |
| What is not necessary to have in order to perform an RNA-seq experiment? | 68 | 86 |
| Data visualization (2 questions: 19, 22) | ||
| Two graphs below show the comparison of normalized gene counts from an RNA-seq experiment. What can you conclude based on these graphs? | 39 | 90 |
ᵃ A detailed copy of the content assessment test and the correct answers can be found in the Supplemental Material. The numbers of questions corresponding to the content assessment test are listed for each of the concepts assessed by this instrument.
ᵇ The proportion of correct answers for a given question is shown. The questions had a multiple-choice format; some questions were slightly modified (rephrased) to fit into this table.
Evidence of student learningᵃ
| Course and year | Number of students | Pretest average score | Posttest average score |
|---|---|---|---|
| Principles of Genetics, 2014 | 85 | 27 ± 15% | 79 ± 8% |
| Applied Biotechnology, 2015 | 8 | 52 ± 19% | 90 ± 10% |
ᵃ Results of the pretest and posttest used to evaluate student learning after completion of the laboratory project. Average student scores and SDs are shown. The test results were analyzed using a paired two-tailed t test. The results of the pretest and posttest were significantly different at p < 0.001 for Principles of Genetics and at p < 0.1 for Applied Biotechnology.
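
The paired t test named in the footnote can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' analysis; the function name `paired_t_statistic` and the five pairs of scores are invented, and the sketch stops at the t statistic and degrees of freedom rather than computing a p value.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(pre, post):
    """Return (t, degrees_of_freedom) for a paired t test on matched scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    # The test works on per-student differences, not on the two group means.
    differences = [after - before for before, after in zip(pre, post)]
    # Standard error of the mean difference (sample SD divided by sqrt(n)).
    standard_error = stdev(differences) / math.sqrt(len(differences))
    return mean(differences) / standard_error, len(differences) - 1


# Invented scores (percent correct) for five hypothetical students.
pretest = [20, 35, 25, 40, 30]
posttest = [75, 80, 85, 90, 70]

t_value, dof = paired_t_statistic(pretest, posttest)
print(f"t = {t_value:.2f}, df = {dof}")
```

The two-tailed p value is then read from the t distribution with the reported degrees of freedom; `scipy.stats.ttest_rel` performs both steps in one call if SciPy is available. The sketch raises `ZeroDivisionError` if every student improves by exactly the same amount, because the differences then have zero spread.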
Figure 3. Phenotypic effects of exposure to abiotic stress observed by students. B73 seedlings show a strong response to cold stress, with dry necrotic leaf edges and tips, while Mo17 seedlings show only a minimal response to cold. Both B73 and Mo17 seedlings respond to heat stress with wilted leaves.
Figure 4. Examples of graphs students used to visualize data and answer the questions. (A and B) Comparison of variation between two replicates of the same condition and between stress and control conditions. Log2 RPKM values are graphed for all maize genes. (C) The conservation of stress response. Many genes up-regulated in response to cold stress in B73 are also up-regulated in response to cold stress in Mo17, while many genes show response in only one of the genotypes. (D) The proportion of all maize genes, genes up-regulated in response to cold, and genes down-regulated in response to cold is shown. SE is shown with error bars. Three gene ontology categories significantly overrepresented among genes up-regulated in response to cold stress are shown (p < 0.05). (E) Abiotic stress exposure results in up- or down-regulation for a number of maize genes in each genotype. The Z-normalized RPKM values for all differentially expressed genes were used to perform hierarchical clustering of the gene expression values. The genotypes (B73: B; Mo17: M) and conditions (heat: red; control: green; cold: blue) are indicated below each column. Three replicates of each condition are shown. (F) Genes affected by cold stress are frequently up-regulated in response to heat stress as well. Genes up- and down-regulated for cold stress in B73 are shown, as is their response to heat stress. ND: the genes with no differential expression. The number of genes in each category is shown.
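
The quantities named in the Figure 4 legend (RPKM, log2 RPKM, and Z-normalized RPKM) can be sketched for a single gene as follows. RPKM is computed with its standard definition, reads per kilobase of transcript per million mapped reads. Everything else is an assumption made for illustration: the function names, the pseudocount of 1 before the log transform, the use of the population SD for Z-normalization, and the invented read counts. The source does not state which of these choices the authors made.

```python
import math
from statistics import mean, pstdev


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_rpkm(value, pseudocount=1.0):
    """Log2-transform an RPKM value; the pseudocount (an assumption) avoids log2(0)."""
    return math.log2(value + pseudocount)


def z_normalize(values):
    """Center one gene's values on their mean and scale by their population SD."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return [0.0 for _ in values]  # a gene with no variation gets all zeros
    return [(value - center) / spread for value in values]


# Invented example: one 2,000 bp gene in three samples of 10 million mapped reads each.
counts = {"control": 200, "cold": 800, "heat": 200}
gene_rpkm = {sample: rpkm(count, 2_000, 10_000_000) for sample, count in counts.items()}
gene_log2 = {sample: log2_rpkm(value) for sample, value in gene_rpkm.items()}
z_scores = z_normalize(list(gene_rpkm.values()))

print(gene_rpkm)
print(gene_log2)
print(z_scores)
```

Z-normalizing each gene separately puts strongly and weakly expressed genes on a common scale, which is why it is a common preprocessing step before hierarchical clustering of expression values.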
Figure 5. Assessment of student learning. Student learning was assessed using a test consisting of 22 multiple-choice questions. Questions were separated into three categories, and the average proportion of correct answers for the questions in each category was calculated for two courses (Principles of Genetics and Applied Biotechnology). Vertical bars show SD. For all three categories, the differences between pretest and posttest were significant as tested by a paired t test (p < 0.01).
Assessment of data visualization and interpretation in student lab reports
| Rubric category | Criteria for the correct responses | Student scores (out of 4 points for each category) |
|---|---|---|
| Experimental question | Clarity and appropriateness of the experimental question | 3.46 ± 0.68 |
| Graphs | The choice of the visualization approach and the correct organization of the graph | 3.20 ± 0.62 |
| Graph labels | Presence and accuracy of the graph labels | 3.42 ± 0.60 |
| Figure legends | Completeness and accuracy of the figure legends | 3.25 ± 0.64 |
| Data interpretation | Clarity and appropriateness of the conclusions, support of the conclusions by the graphs | 3.20 ± 0.70 |
| Total | | 16.45 ± 2.44 |
Learning gains reported by Principles of Genetics students in the CURE surveyᵃ
| Category | Genetics learning gains (65 students) | CURE participants (4800 students) |
|---|---|---|
| Understanding science process | ||
| Understanding how knowledge is constructed | 3.49 | 3.42 |
| Understanding the research process | 3.50 | 3.46 |
| Understanding how scientists work on real problems | 3.62 | 3.58 |
| Understanding that scientific assertions require supporting evidence | 3.59 | 3.64 |
| Understanding science | 3.66 | 3.58 |
| Data analysis skills | ||
| Ability to integrate theory and practice | 3.38 | 3.46 |
| Ability to analyze data and other information | 3.96 | 3.74 |
| Skill in interpretation of results | 3.62 | 3.54 |
| Ability to read and understand primary literature | 3.45 | 3.34 |
| Communication skills | ||
| Skill in science writing | 3.39 | 3.31 |
aLopatto ).
Student perception of the lab series on RNA-seq data analysis
| Open codes | Theme | Description | Student quotes |
|---|---|---|---|
| Cool lab; Interesting; Unusual lab; Fun | Exciting and interesting | Overall perception of the lab series | “I never had so much fun building graphs.” “Great addition to Genetics.” “The lab was very frustrating and difficult, but I learned a lot!” |
| Real research; Real science; Cool experiment; Real data | Authentic research | Includes references to the research nature of the lab series | “Doing real research in class is really cool.” “We worked with real data on real research problem[s].” “Nobody knew the answers to our questions.” “We got to build graphs in R and they looked like the graphs from the papers we were reading!” |
| Programming; Bioinformatics; Databases; A lot of computation; Large data sets | Computational nature of biology research | Describes the student perception of programming and computational studies as a part of biology | “This was the first time I was involved in large data analysis; it would be great to do it more often!” “I never realized that biology is almost computer science now.” “I wish I knew more programming and was more familiar with computers, this was fun!” |
| Confusion; Frustration; Analysis did not work; Lack of engagement; Too complex | Discontent and frustration | Reflects negative perceptions of the lab series due to lack of interest, confusion, or frustration | “This lab is way too difficult and should not be a part of introductory course.” “I was confused through the whole three weeks.” “My R code never worked and the instructor had to fix it all the time. Very frustrating.” |