Adam J Kleinschmit, Elizabeth F Ryder, Jacob L Kerby, Barbara Murdoch, Sam Donovan, Nealy F Grandgenett, Rachel E Cook, Chamindika Siriwardana, William Morgan, Mark Pauley, Anne Rosenwald, Eric Triplett, William Tapprich.
Abstract
As powerful computational tools and 'big data' transform the biological sciences, bioinformatics training is becoming necessary to prepare the next generation of life scientists. Furthermore, because the tools and resources employed in bioinformatics are constantly evolving, bioinformatics learning materials must be continuously improved. In addition, these learning materials need to move beyond today's typical step-by-step guides to promote deeper conceptual understanding by students. One of the goals of the Network for Integrating Bioinformatics into Life Sciences Education (NIBLSE) is to create, curate, disseminate, and assess appropriate open-access bioinformatics learning resources. Here we describe the evolution, integration, and assessment of a learning resource that explores essential concepts of biological sequence similarity. Pre/post student assessment data from diverse life science courses show significant learning gains. These results indicate that the learning resource is a beneficial educational product for the integration of bioinformatics across curricula.
Year: 2021 PMID: 34506617 PMCID: PMC8432852 DOI: 10.1371/journal.pone.0257404
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Sequence similarity learning resource module descriptions.
| Sequence Similarity Module Title | Module Description |
|---|---|
| Module 1: Similarity and Sequence Alignment | Students explore the meaning of sequence similarity and then investigate how similarity can be quantitatively compared between two proteins of similar length using a Blocks Substitution Matrix (BLOSUM). This core concept and competency has utility for biologists seeking to identify conserved blocks of sequence in homologous proteins that may have structural and functional importance and hint at evolutionary relationships between two sequences. |
| Module 2: Sequence Alignment to a Database of Sequences | Students find local regions of similarity between a query sequence and a database of subject sequences using the Basic Local Alignment Search Tool (BLAST) algorithm and develop a basic understanding of the algorithm through a manual scoring exercise. This core concept and competency has utility for biologists seeking to identify conserved blocks of nucleotide or protein sequence that may or may not be homologous but share common domains (often reused by similar families of proteins). Such domains may hint at the structure and function of a protein and at evolutionary relationships between two sequences. |
| Module 3: Phylogenetic Analysis of Homologous Sequences | Students practice accessing text-based FASTA-formatted sequence information via the National Center for Biotechnology Information (NCBI) databases as they collect protein sequence data for a multiple sequence alignment for the generation of a phylogenetic tree. Students construct a small tree manually using the Neighbor Joining algorithm. This core concept and competency has utility for biologists seeking to identify conserved protein domains and key conserved amino acid residues associated with structure and function within a domain in addition to allowing for visual depiction of evolutionary relationships between three or more sequences. |
| Module 4: Inquiry-Based Investigation | Students apply concepts and competencies from Modules 1–3 to address an authentic biological question. Instructors or students may choose among three investigations involving (1) the evolution of alcohol metabolism in hominids, (2) the evolution of Zika virus, or (3) determining the likely causative agent of an equine corneal ulcer. |
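As a rough illustration of the pairwise scoring idea described for Module 1, the sketch below sums substitution scores over an ungapped alignment of two equal-length peptides. The function names, the example peptides, and the matrix excerpt are assumptions made for illustration and are not taken from the learning resource; the few scores shown are intended to match BLOSUM62 and should be checked against a published copy of the matrix before reuse.

```python
# Small excerpt of substitution scores, intended to match BLOSUM62.
# Assumption: only the pairs needed for the toy example are listed.
BLOSUM62_EXCERPT = {
    ("A", "A"): 4,
    ("I", "L"): 2,
    ("K", "R"): 2,
    ("D", "E"): 2,
    ("W", "W"): 11,
}


def pair_score(a, b, matrix):
    """Look up the score for residues a and b, treating the matrix as symmetric."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]  # raises KeyError if the pair is not in the excerpt


def alignment_score(seq1, seq2, matrix):
    """Sum position-by-position scores for an ungapped alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("ungapped scoring needs sequences of equal length")
    return sum(pair_score(a, b, matrix) for a, b in zip(seq1, seq2))


# Hypothetical five-residue peptides, aligned without gaps.
score = alignment_score("ALKEW", "AIRDW", BLOSUM62_EXCERPT)
print(score)  # 4 + 2 + 2 + 2 + 11 = 21
```

Identical residues and conservative substitutions (such as L/I or K/R) contribute positive scores, which is why a higher total suggests greater similarity. Real alignments also need gap penalties, which this sketch leaves out.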
Fig 1. Development, implementation, and assessment of a NIBLSE OER learning resource.
The original learning resource was conceived by a pair of institutional colleagues and implemented with course-specific student learning objectives. The resource was later expanded and targeted to a wider audience by a community of faculty through a NIBLSE Incubator. Following development of an assessment instrument, a NIBLSE Faculty Mentoring Network (FMN) recruited implementers and refined the assessment while collecting pilot assessment data. Data were collected from multiple institutions and classroom settings concurrently during the FMN and after its conclusion. Vertically overlapping boxes indicate concurrent activities.
The bioinformatics sequence similarity learning resource was implemented in a diverse set of courses spanning course levels and institution classifications.
| Course Content Focus | Undergraduate Course Level† | Institution Classification‡ |
|---|---|---|
| Bioinformatics and Computational Biology | 100 | RI |
| General Biology* | 100 | RI |
| General Biology* | 200 | PUI |
| Genetics | 300 | RI |
| Molecular Biotechnology | 300 | PUI |
| Molecular Biology of the Cell | 300 | RI |
| Developmental Biology | 400 | PUI |
| Virology | 400 | RI |
*The General Biology course offered at the primarily undergraduate institution covered topics focused within the areas of cellular and molecular biology, while the General Biology course offered at the research-intensive institution focused on biological diversity and ecology.
†Within a 4-year undergraduate degree plan, 100–200 level courses are typically introductory and require less prerequisite knowledge or fewer prerequisite courses. 300–400 level courses are typically advanced, specialized in content, and reserved for upper-level students.
‡Research Intensive (RI) Institutions are doctoral degree granting universities with moderate to very high research activity. Primarily Undergraduate Institutions (PUIs) typically focus on conferring bachelor’s degrees, where the primary expectation for faculty is teaching, with research being a secondary focus.
Fig 2. Aggregate pre-/post-assessment quiz scores indicate significant participant learning gains.
The fifteen-item assessment, consisting of a combination of multiple-choice and multiple-select questions, was administered before and after completion of the learning modules. Nine cohorts of student participants (n = 373) at independent institutions completed the assessment instrument with 7–28 days between pre- and post-assessment. Pre- (4.47) and post- (6.78) means are each represented by a narrow black crossbar. The difference between the pre- and post-means is statistically significant (p < 0.00001, GLM). Black error bars represent the 95% confidence interval of the mean, and the number of matched student assessment records is indicated below each swarm plot.
Fig 3. Learning gains from matched pre-/post-assessment quiz scores disaggregated by course type.
Courses at PUIs in which the modules were implemented included General Biology, Molecular Biotechnology, and Developmental Biology. All others, including an additional General Biology course, were at RI institutions. Means are represented by a narrow black crossbar. Black error bars represent the 95% confidence interval of the mean. The black dashed line marks a pre-/post- difference of zero, that is, neither a learning gain nor a loss. Learning gains significantly greater than 0 were observed in all classes (adj. p < 0.001, one-sample t-test). Sample size (n) for each course is shown above the course name.
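To make the per-course comparison concrete, the sketch below computes matched gain scores (post minus pre) and the one-sample t statistic for the null hypothesis of zero mean gain. The scores are invented toy values, not study data, and the function name is an assumption; converting the statistic to a p-value, and adjusting for multiple comparisons as the caption indicates, would require a statistics package and is not shown.

```python
import math
import statistics


def one_sample_t(values, null_mean=0.0):
    """Return the t statistic for testing whether the mean of values equals null_mean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)       # sample standard deviation (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - null_mean) / standard_error


# Hypothetical matched records for one small course (NOT data from the study).
pre_scores = [4, 5, 3, 6, 4]
post_scores = [7, 6, 5, 8, 6]

# Gain for each student: post-assessment score minus pre-assessment score.
gains = [post - pre for pre, post in zip(pre_scores, post_scores)]

t_stat = one_sample_t(gains)
print(gains, round(t_stat, 4))
```

The test has n - 1 degrees of freedom, so for small courses the critical value is noticeably larger than the familiar 1.96 used with large samples.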
Fig 4. Student participants self-reported perceived learning gains.
Aggregate retrospective pre- and post-survey data, collected on a four-point Likert-type scale, are depicted as a divergent stacked bar graph. Nine cohorts of student participants (n = 362) at a variety of institutions completed the survey instrument. For every question, the difference in median Likert-type scale response between retrospective pre- and post-ratings was statistically significant (p < 0.0001, Wilcoxon signed-rank test).
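For readers unfamiliar with the signed-rank approach, the sketch below shows how the rank sums behind a Wilcoxon signed-rank test can be computed for paired ordinal ratings. The ratings are invented toy values, not survey data, and the function name is an assumption; zero differences are dropped and tied absolute differences share an average rank, which is one common convention. Deriving a p-value from the rank sums is left to a statistics package.

```python
def signed_rank_sums(pre, post):
    """Return (W_plus, W_minus): rank sums of positive and negative paired differences."""
    # Drop pairs with no change, as in the classic form of the test.
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ordered = sorted(diffs, key=abs)

    # Assign ranks 1..n by absolute difference, averaging ranks within ties.
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 through j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j

    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical four-point Likert-type ratings (NOT data from the study).
pre_ratings = [1, 2, 2, 1, 3, 2]
post_ratings = [3, 3, 2, 2, 4, 1]

w_plus, w_minus = signed_rank_sums(pre_ratings, post_ratings)
print(w_plus, w_minus)  # 12.5 2.5
```

A large imbalance between the two rank sums indicates that ratings shifted consistently in one direction between the retrospective pre- and post-ratings.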