Melanie I Stefan, Johanna L Gutlerner, Richard T Born, Michael Springer.
Abstract
The past decade has seen a rapid increase in the ability of biologists to collect large amounts of data. It is therefore vital that research biologists acquire the necessary skills during their training to visualize, analyze, and interpret such data. To begin to meet this need, we have developed a "boot camp" in quantitative methods for biology graduate students at Harvard Medical School. The goal of this short, intensive course is to enable students to use computational tools to visualize and analyze data, to strengthen their computational thinking skills, and to simulate and thus extend their intuition about the behavior of complex biological systems. The boot camp teaches basic programming using biological examples from statistics, image processing, and data analysis. This integrative approach to teaching programming and quantitative reasoning motivates students' engagement by demonstrating the relevance of these skills to their work in life science laboratories. Students also have the opportunity to analyze their own data or explore a topic of interest in more detail. The class is taught with a mixture of short lectures, Socratic discussion, and in-class exercises. Students spend approximately 40% of their class time working through both short and long problems. A high instructor-to-student ratio allows students to get assistance or additional challenges when needed, thus enhancing the experience for students at all levels of mastery. Data collected from end-of-course surveys from the last five offerings of the course (between 2012 and 2014) show that students report high learning gains and feel that the course prepares them for solving quantitative and computational problems they will encounter in their research. We outline our course here which, together with the course materials freely available online under a Creative Commons License, should help to facilitate similar efforts by others.
Year: 2015 PMID: 25880064 PMCID: PMC4399943 DOI: 10.1371/journal.pcbi.1004208
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Learning goals for QMBC.
| Thinking |
| Students will be able to |
| - recognize situations that call for computational methods |
| - conceptualize a problem so it becomes amenable to computational solution |
| - use simulations to build intuition about biological systems |
| - compare the outcome of simulations to real-world data |
| - formulate and test hypotheses |
| - understand a project as a collection of smaller parts |
| - plan steps needed to solve a problem |
| - think of ways to test the validity of a computational approach |
| Doing |
| Students will be able to |
| - import large datasets into MATLAB |
| - parse such datasets into appropriate computational structures |
| - visualize a dataset in multiple ways |
| - compute summary statistics |
| - use elements of programming to implement problem-solving strategies |
| - use trial and error to design a computational approach |
| - read and understand MATLAB documentation |
| - read and understand someone else’s code |
| - find and fix errors in a piece of code |
| - write a program to automatize data analysis |
| - document their code and use programming style in naming variables |
| Feeling |
| Students will |
| - appreciate the value of computational and quantitative approaches |
| - feel confident about approaching and solving a computational problem |
| - persevere when they find a problem difficult or do not immediately understand it |
| - recognize that successful coding can be fun as well as useful |
| - know when to ask for help and where to find support when needed |
| - be willing and ready to learn more |
| - evaluate the quality of computational and quantitative methods in scientific studies |
| - influence the work of others by setting examples of good practice in this domain |
Learning goals for QMBC, categorized into the three domains of thinking, doing, and feeling.
Overview of course Days 1–3.
| Topic | Exercise/Examples | Biological problem |
|---|---|---|
| Day 1 | | |
| Getting Started | | |
| Variables | Creating variables; basic operations on variables | |
| Arrays | Indexing, storing, retrieving, and elementary operations | Image visualization |
| Built-in Functions | Summary statistics | |
| Data visualization | Histograms, color maps, and plots | |
| INTEGRATION | Summary statistics and plotting to characterize an unknown dataset | Mystery ‘microarray’ dataset |
| Arrays II | Cropping and subsampling | Image manipulation |
| Conditional statements | Logical operations on arrays (<, >, ==) | |
| INTEGRATION | Normalize and modify an image with built-in functions and logical operators | Image manipulation and visualization |
| INTEGRATION | | |
| Day 2 | | |
| Review of Day 1 | | |
| Functions | Inputs, outputs, scope, and naming | |
| Functions | Convert script from Day 1 into a function | Image normalization and visualization |
| Loops | for | |
| Conditional statements | if, elseif, else, while | |
| INTEGRATION | 96-well plate growth curve data | |
| Strings | Data type conversion and basic pattern matching | Basic bioinformatics (find a ‘motif’) |
| Cell arrays | Dealing with mixed data types | Data plus metadata |
| INTEGRATION | | |
| Day 3 | | |
| Binomial distribution, null hypothesis, p-value | Binomial rat (simulation) | Choice behavior in animals |
| Bootstrapping methods | 2-sample neuron comparison (resampling) | Morphological characterization of neurons |
| False positive statistics | “Researcher degrees of freedom” and multiple hypothesis testing | Neuronal data (simulation) |
Summary of the topics covered in Days 1–3 of the course, the examples and exercises, and the biological motivation.
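The Day 3 idea of estimating a p-value by simulating the null hypothesis can be sketched as follows. This is a minimal illustration in Python, not the course's MATLAB material; the function name `simulated_p_value`, its parameters, and the example numbers (16 left-arm choices in 20 trials) are assumptions made up for this sketch.

```python
import random


def simulated_p_value(n_trials, n_observed, p_null=0.5, n_sims=10_000, seed=0):
    """Estimate a one-sided p-value by simulating the null hypothesis.

    Each simulated experiment draws n_trials binary choices with success
    probability p_null. The p-value estimate is the fraction of simulated
    experiments with at least n_observed successes.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    at_least_as_extreme = 0
    for _ in range(n_sims):
        successes = sum(rng.random() < p_null for _ in range(n_trials))
        if successes >= n_observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims


# Hypothetical example: an animal chooses the left arm 16 times in 20 trials.
p = simulated_p_value(n_trials=20, n_observed=16)
print(f"simulated one-sided p-value: {p:.4f}")
```

The estimate should land close to the exact binomial tail probability (about 0.006 for these made-up numbers). Raising `n_sims` tightens it at the cost of run time.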
Overview of course Days 4 and 5.
| Topic | Exercise/Examples | Biological problem |
|---|---|---|
| Visualizing and scaling images | Images with varying dynamic ranges | |
| Segmentation versus quantitation | Counting and characterizing cells | |
| Filtering | Understanding filters, visualization of their effects on different images, combining filters | |
| Edge detection | Segmentation by finding boundaries of cells | |
| Morphological operations | Quantitation and segmentation of an image with uneven illumination | |
| | | |
| Loading and parsing data | Uploading and parsing an RNA sequencing experiment | |
| INTEGRATION | | |
| | | |
| (Advanced topics and integration): | | |
| Bring your own data | | |
| Bootstrapping | Neural tuning curves | |
| Principal component analysis | Spike sorting or calcium imaging in zebrafish | |
| Biological image processing | Image filters used by biological vision systems | |
| Quantitative trait loci | Raw sequencing data → enriched alleles: identifying causative loci | |
| Pattern matching | Identifying over- and underrepresented motifs in a genome | |
| Biochemical/signaling models | Introduction to SimBiology and simple models | |
Summary of the topics covered on Days 4 and 5 of the course, the examples and exercises, and the biological motivation.
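To illustrate the pattern-matching topic, the sketch below counts occurrences of a motif in a sequence and compares that count with the number expected if all bases were equally likely and independent. It is written in Python rather than the course's MATLAB. The toy sequence, the function names, and the uniform-base null model are assumptions made for illustration and are not taken from the course materials.

```python
def count_motif(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlaps."""
    k = len(motif)
    return sum(sequence[i:i + k] == motif for i in range(len(sequence) - k + 1))


def expected_count(seq_length, motif_length, alphabet_size=4):
    """Expected count if every base is equally likely and independent."""
    positions = max(seq_length - motif_length + 1, 0)
    return positions / alphabet_size ** motif_length


sequence = "ATGATGCATGA"  # hypothetical toy sequence, not real genomic data
motif = "ATG"

observed = count_motif(sequence, motif)
expected = expected_count(len(sequence), len(motif))
print(f"observed {observed}, expected under uniform model {expected:.3f}")
```

A ratio of observed to expected well above 1 suggests over-representation and a ratio well below 1 suggests under-representation. A real analysis would use a null model that reflects the genome's actual base composition.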
Student enrollment in QMBC.
| Spring 2012 | Summer 2012 | Spring 2013 | Summer 2013 | Spring 2014 |
|---|---|---|---|---|
| 74 | 94 | 37 | 74 | 44 |
Student enrollment in QMBC, Spring 2012 to Spring 2014
Fig 1. Overall course experience.
Students were asked after each course to rate their overall experience on a five-point scale (Poor, Fair, Good, Very Good, or Excellent). Diverging stacked bars are centered between Fair and Good. To allow comparison between different course offerings, data for each offering were normalized by that offering's total number of respondents. Spring 2012: n = 37, Spring 2013: n = 21, Spring 2014: n = 24, Summer 2012: n = 57, Summer 2013: n = 43.
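The normalization and centering described in this caption can be sketched as below: counts are converted to fractions of respondents, and the bar is shifted so that the boundary between Fair and Good sits at zero. This is a Python sketch rather than the authors' plotting code. The counts are invented (n = 20) and the function name `diverging_segments` is hypothetical.

```python
CATEGORIES = ["Poor", "Fair", "Good", "Very Good", "Excellent"]


def diverging_segments(counts, split_index=2):
    """Return (category, left, right) spans as fractions of respondents.

    Categories before split_index (Poor, Fair) lie left of zero; the
    remaining categories lie right of zero.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no respondents")
    fractions = [c / total for c in counts]
    left_edge = -sum(fractions[:split_index])  # shift the split point to zero
    segments = []
    for name, frac in zip(CATEGORIES, fractions):
        segments.append((name, left_edge, left_edge + frac))
        left_edge += frac
    return segments


hypothetical_counts = [1, 1, 4, 6, 8]  # invented ratings, not survey data
for name, left, right in diverging_segments(hypothetical_counts):
    print(f"{name:>9}: {left:+.3f} to {right:+.3f}")
```

Because each bar spans a total width of 1 regardless of the number of respondents, offerings with different n can be compared side by side.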
Fig 2. Increase in self-reported MATLAB programming skills.
In the postcourse survey, students were asked: “Rate your ability to program in MATLAB before the course,” and “Rate your ability to program in MATLAB after the course.” Answers were given on a scale from 0 (novice) to 11 (expert). Upper panel: Summer 2013, lower panel: Spring 2014. Scatter plot: Each student is represented by a circle. The diagonal represents no improvement in skill. Inset: Increase in self-reported skill (after minus before). Summer 2013: n = 43, Spring 2014: n = 24.
Fig 3. Self-assessed understanding of concepts and skills.
Data shown are for the Spring 2014 offering of the course. Students were asked to rate their understanding of specific skills on a five-point scale (Poor to Excellent, as above). Skills are listed in the order in which they are introduced at QMBC. n = 24.
Fig 4. Future impact of the course.
Students were asked to rate their agreement with the following three statements: “This course provided a practical base and starting point for using MATLAB in my own work,” “The workshop provided me with a practical base/starting point for analyzing quantitative problems,” and “This course has increased the likelihood I will use quantitative methods in my research.” Rating was on a five-point scale (Strongly disagree to Strongly agree). Data shown are pooled responses from the last four offerings of the course (Summer 2012 to Spring 2014). n = 141.