Natalie Christian¹, Katherine D Kearns². ¹Evolution, Ecology and Behavior Program, Department of Biology, Indiana University, Bloomington, IN 47405. ²Center for Innovative Teaching and Learning, Indiana University, Bloomington, IN 47405.
Abstract
Abstracts play the pivotal role of selling an article to a prospective reader, and for students, the ability to communicate science in concise written form may foster scientific thinking. However, students struggle with abstract composition, and we lack evidence-based educational innovations to help them develop this skill. We designed, implemented, and assessed an intervention for abstract composition with elements of scaffolding and transparency to ask whether deliberate practice improves concise scientific writing in early career undergraduate biology majors. We evaluated student performance by analyzing abstracts written before and after the intervention and by assessing pre- and posttest student concept maps. We found that scaffolded learning improved student abstract writing, with the greatest gains in students' ability to describe the motivation for their work. Using a set of tested tools to teach scientific writing has important implications for strengthening students' capacity to reinforce and synthesize content in the future, whether that is in laboratory course exercises, in independent research, or as a transferable skill to general critical thinking.
A widespread goal in undergraduate biology courses is that students learn to authentically communicate science (1). Primary literature is a main mode of communication within the scientific community, providing scientists an avenue through which to compare and critique methods and share results (2). By learning to read and evaluate primary literature, students can improve their understanding of science and the scientific method and develop the educational and life skills of assessing whether assertions are evidence-based (2). However, to effectively communicate science, students must also be able to write scientifically.
Writing scientifically is important for the express purpose of communicating science, and also because the exercise in scientific argumentation should promote students’ ability to think like scientists (3–5). For instance, when writing courses are offered within a biology curriculum, students may see themselves as scientists learning to write instead of students writing about science for a class (6). In a study with first-year medical students, practicing scientific writing improved the writing itself, and also promoted proficiency in information searches and active and collaborative learning (7). This is not unique to the sciences, and there has been an increased recognition that disciplinary knowledge is inseparable from discipline-specific writing practices (also known as Writing in the Disciplines) (8–10). In general, the process of writing also reinforces and synthesizes content (8). For example, when students are asked to provide a written justification for multiple-choice questions, their performance on those problems improves (11).
Instruction in scientific writing is particularly emphasized in undergraduate laboratory courses, where there is an inherent goal of exposing students to scientific approaches and thinking (12). Although some laboratory curricula have been criticized for relying on “cookbook” exercises that do not allow students to generate and analyze real data (13), educators are increasingly implementing more authentic and open-ended laboratory exercises (14, 15). Regardless, students are often asked to describe and report on their laboratory activities and experiments in the authentic style of primary scientific articles. However, scientific writing is not typically taught before students reach early undergraduate education (16, 17). Thus, early career undergraduates may be ill-prepared to clearly express the content, process, and outcome of scientific inquiry (18).
Although abstracts make up only a fraction of the space in a scientific paper or student lab report, they play a disproportionately large role in scientific communication and potentially in student learning. Abstracts, or summaries of scientific papers, play the pivotal role of convincing someone to read the entire article (19, 20). For students, articulating their scientific process in a concise, written form reinforces how science works: scientists identify a gap in the scientific literature and then devise a way to address it. Learning to write good abstracts thus helps students to more strategically read and use them, an important skill for both scientists and people who use information generated by scientists. Ultimately the process of writing scientifically can strengthen students’ capacity to reinforce and synthesize content in the future, whether in laboratory course exercises or in independent research, and even become a transferable skill to general critical thinking (5, 21, 22).
Students may approach abstract writing with a series of misconceptions. For instance, students may have difficulty distinguishing an abstract from an introduction, or may include extraneous detail, abbreviations, and references (personal observation, NC). Writing scientific abstracts could be considered a “bottleneck topic,” a point in a course or field where the learning of a significant number of students is interrupted (23). Although a quick web search yields copious advice on writing abstracts, there is a lack of evidence-based educational advances focused on developing this skill. Recent studies have described approaches to teaching abstract composition, but the emphasis is on students for whom English is a second language (24, 25). Thus, while previous studies address an important and unique set of pedagogical challenges, more efforts are needed to understand the generalizability of interventions for improving scientific abstract writing.
Consistent with the Middendorf and Pace model for “Decoding the Disciplines” (23), we designed and implemented an intervention for abstract composition with elements of scaffolding and transparency to test the hypothesis that deliberate practice improves abstract writing in early career undergraduate biology majors. Our goal was for undergraduate biology students to learn how to write complete, concise, clear, and effective abstracts.
METHODS
Course context
This study took place in the spring of 2014 at a large public research university in the Midwestern United States. All procedures were approved by the Indiana University Institutional Review Board (Protocol #1312922824). We implemented our intervention in an introductory biology laboratory course. The course is an introduction/survey of biology techniques (e.g., pipetting, spectrophotometry, electrophoresis) and concepts (e.g., photosynthesis, genetics, animal behavior). The course is coordinated by two biology faculty members who mentor approximately 25 teaching assistants (TAs) who teach individual sections. Each section consists of approximately 24 students—from all years but primarily sophomores. The majority of a student’s grade comes from writing five lab reports about the experiments they conduct, which are structured like primary scientific articles. Students write an abstract for every report.
Lesson development
The lesson lasted approximately 45 minutes and comprised six parts (Appendix 1). First, students created concept maps, diagrams showing the mental connections students make related to a focal concept (Appendix 2). A concept map is a Classroom Assessment Technique that provides an assessable record of a student’s conceptual schema (26). The value of this exercise is that it allows the instructor to assess a student, and the student to assess herself metacognitively. For this exercise, the focal concept was “Abstract” (Appendix 2). The concept map was used as a pretest that assessed students’ conceptual associations before the lesson.
The second activity was a guided discussion, accompanied by an outline/worksheet, of the “what, where, why, when, and how” of abstracts (Appendix 3). For this scaffolding activity, student input was solicited to fill out the worksheet, but the instructor facilitated the discussion by elaborating on student responses. The goal of this activity was to reinforce the structural and conceptual framework of an abstract before addressing how to strengthen it.
The third activity was a “meta-abstract” analysis (see Appendix 4). This exercise asked students to imagine that someone had conducted research on what makes an effective abstract and was submitting that work for publication. Students were presented with an abstract of that fictional paper, with brackets delineating different sections of text. As a class, students read each section out loud, and discussed its purpose and contribution to the abstract, as well as its effectiveness in achieving that purpose. The goal of this modeling activity was for students to critically assess what they should include in an abstract and understand how to make each part as clear and concise as possible.
The fourth activity was an interactive discussion detailing what did not belong in an abstract (e.g., excessive detail, citations, references to figures, abbreviations). After being asked the question “what does not belong in an abstract?” students were encouraged to “think, pair, share,” a classic active learning technique that lets students think quietly and jot down their own ideas, then discuss those ideas with a neighbor, and finally reconvene and brainstorm as a class (26). The instructor wrote each correct student response on the blackboard while elaborating on why it was inappropriate for inclusion.
The penultimate activity was analyzing an exemplary (but not perfect) abstract written by a former course student (Appendix 5). The class worked in small groups to place brackets around each section of text, as in the meta-abstract activity, and then to discuss the function and effectiveness of each section. Afterwards, the class came together and discussed what the author did well and what needed improvement. This activity modeled and made transparent instructor expectations of students.
The final activity was a posttest evaluation in which students revisited their concept maps. Students were instructed to add to or edit their maps based on what they had learned during the lesson. Students were asked to circle their additions or edits, so changes in their conceptual schema could be assessed following the lesson.
Lesson implementation
The lesson was implemented on a course-wide scale. NC taught the lesson to all current course TAs, who were instructed to implement the lesson in their sections (21 to 27 students per section) during week three of the semester. Writing samples, consisting of the abstracts that students submitted as part of their lab reports, were collected from all students (n = 453) before and after the intervention. Abstracts were coded and anonymized. A rubric was designed by the two authors and another individual familiar with the course to score the abstracts. It was based on the lab report assignment and the course grading rubric, so it was tightly linked to what students were asked to produce. Additionally, the rubric was designed to capture areas of difficulty that the lesson addressed, such as completeness and concision. The rubric assessed six specific criteria: 1) rationale/background, 2) question/hypothesis/prediction, 3) methods, 4) results, 5) implications/applications, and 6) writing conventions (Appendix 6).
Scoring
We recruited three experienced graduate-level writing center tutors as raters for the abstracts. Before scoring the abstracts, the three raters attended an hour-long norming session in which they used sample lab reports not included in the study to learn how to consistently apply the rubric. We did not want to allow too much time to pass between the norming session and raters’ analysis of student work, as that could negatively affect scoring consistency. Thus, a subset of 152 abstracts (n = 76 pre-intervention; n = 76 post-intervention) was randomly selected for analysis by the raters so that scoring could be completed within one month. This subset represents just over 15% of the total writing samples collected and should be representative of student performance. The raters were blind to the identity of the students, as well as to which abstracts were written pre- or post-intervention. Each abstract was read and scored by all three raters. The raters assigned a categorical value for each rubric category. The categorical values were “meets expectations,” “too much detail,” “too little detail,” and “absent,” with the exception of the last rubric category, “Writing Conventions,” which only contained three values—“meets expectations,” “too much detail,” and “absent” (Appendix 6).
Data analysis and statistical methods
Scores from all three raters were compiled into one datasheet, and all analyses were performed using R v. 3.1.2 (27). To test for agreement in scoring among raters, interrater reliability (measured by Fleiss’ kappa) was computed across the three raters using the irr package (function kappam.fleiss). Across the study, κ = 0.384, suggesting “fair agreement” among raters (28). To test the effect of the intervention on student performance given the less-than-optimal agreement among raters, scores were handled in two complementary ways.
First, scores were left as categorical values to test whether the intervention had a significant effect on the distribution of scores across the rubric. NC manually went through the data sheet, and for each rubric item for each abstract, if at least two of the three raters agreed on a categorical score, that score was assigned to that rubric item. If all three raters scored a rubric item differently, that rubric item’s score was assigned as “No agreement among raters.” Chi-squared tests were performed to compare student performance pre- and post-intervention across all four performance categories, as well as the fifth category, which represented a lack of consensus among raters. A significant p value in the Chi-squared test indicates that the distribution/proportion of scores across rubric categories differed significantly following the intervention.
Second, each categorical value was converted to a numerical value to test whether the intervention significantly improved overall abstract performance. Points were assigned as follows: 3 = “meets expectations,” 2 = “too much detail,” 2 = “too little detail,” and 1 = “absent.” Both “too much detail” and “too little detail” were assigned the same numerical value of 2, because although both categories reflect different student shortcomings, claiming that one is more problematic than the other is subjective. Because the assigned ordinal values are largely arbitrary and do not reflect true quantitative differences between categories, these types of data (e.g., Likert scale data) are analyzed using non-parametric statistical tests. For each abstract, the raters’ scores were averaged to give one score per rubric category. The total number of points ascribed to each abstract, as well as differences between rubric categories, were compared before and after the intervention using Mann-Whitney U Tests, pairing individual student scores pre- and post-intervention.
Concept maps were recovered from 29 of the 76 students whose abstracts were analyzed. The other concept maps were not collected or returned by students’ TAs. Concept maps were analyzed by NC. The number of pre-intervention concepts and linking words and post-intervention concepts and linking words (which students indicated by circling) were manually counted and recorded. A paired Mann-Whitney U Test was used to compare the number of concepts and linking words represented in the concept maps pre- and post-intervention, pairing by student. All pre- and post-intervention concepts and linking words were then typed into a datasheet. The wordcloud package was used to remove common words in the English language (e.g., “a,” “it,” “with”) that would not be related to student conceptualization (functions removeWords and stopwords), before generating two word clouds (function wordcloud). The first word cloud included the 15 most common words from the pre-intervention concept maps, and the second word cloud included the 15 most common words added to the concept maps following the intervention. For both word clouds, the size and color of each word represent the frequency of the word in the concept maps.
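As an illustration of the rater-agreement and consensus steps described in this section, the following Python sketch computes Fleiss’ kappa, the two-of-three consensus label, and the averaged point score for one rubric item. It is not the authors’ code (the study used R and the irr package); the function names and the four example ratings are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter

# Point values as stated in the text; both "detail" categories share a value of 2.
POINTS = {
    "meets expectations": 3,
    "too much detail": 2,
    "too little detail": 2,
    "absent": 1,
}


def fleiss_kappa(ratings):
    """Fleiss' kappa for items that were each scored by the same number of raters.

    `ratings` is a list of tuples, one tuple of category labels per item.
    Undefined (division by zero) if every rating falls in a single category.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += agreeing / (n_raters * (n_raters - 1))
    p_observed = agreement_sum / n_items
    p_expected = sum(
        (total / (n_items * n_raters)) ** 2 for total in category_totals.values()
    )
    return (p_observed - p_expected) / (1 - p_expected)


def consensus(item):
    """Label chosen by at least two raters, else the 'no agreement' category."""
    label, votes = Counter(item).most_common(1)[0]
    return label if votes >= 2 else "no agreement among raters"


def mean_points(item):
    """Average of the raters' point values for one rubric item."""
    return sum(POINTS[label] for label in item) / len(item)


# Hypothetical ratings for four abstracts on one rubric item (three raters each).
example_ratings = [
    ("meets expectations", "meets expectations", "meets expectations"),
    ("meets expectations", "too little detail", "absent"),
    ("absent", "absent", "too little detail"),
    ("too much detail", "too much detail", "meets expectations"),
]

if __name__ == "__main__":
    print(round(fleiss_kappa(example_ratings), 3))
    for item in example_ratings:
        print(consensus(item), round(mean_points(item), 2))
```

Keeping the consensus label and the averaged score as separate functions mirrors the two complementary treatments of the scores: the first feeds the categorical comparison, the second the semi-quantitative one.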
RESULTS
Pre-intervention student performance
Prior to the intervention, students demonstrated proficiency describing the experiment they performed and what they found, with 27.6% (21/76) and 69.7% (53/76) of students meeting expectations for the methods and results categories, respectively (Fig. 1). Students also largely maintained proper writing conventions throughout their abstracts, with 55.3% (42/76) meeting expectations for this category (Fig. 1). Conversely, prior to the intervention, students struggled with providing the motivation for their study and placing it within a larger scientific context, with 77.6% (59/76) and 63.2% (48/76) of students entirely missing a question/hypothesis or implications/applications component of their abstracts, respectively (Fig. 1). Although 21.1% (16/76) of students met expectations for the background/rationale component, there was no agreement among raters for 40.8% (31/76) of student abstracts in this category (Fig. 1). This lack of agreement among raters suggests unclear or confused writing by students that was difficult to score.
FIGURE 1
Distribution of ratings across all rubric items, pre- and post-intervention. For “Writing Conventions,” there was no “too little detail” option on the rubric.
Post-intervention student performance
When examining student scores as categorical, following the intervention students did significantly better addressing the rationale/background of their experiment (χ² = 24.686, df = 4, p < 0.001) but the number of students who met expectations for writing conventions decreased (χ² = 36.970, df = 3, p < 0.001). The distribution of student scores was also significantly different for the results (χ² = 12.682, df = 4, p = 0.013). Specifically, while the overall score was high for results both before and after the intervention, it appears that fewer students omitted results following the intervention. Student performance on the implications/applications component did not change significantly following the intervention, although the number of students who “met expectations” nearly doubled (from 7 to 12 students), with a corresponding decrease in students who included “too little detail.” There were no significant changes in student description of the question/hypothesis (χ² = 5.241, df = 4, p = 0.263) or methods (χ² = 6.193, df = 4, p = 0.185) (Fig. 1).
When examining student scores as semi-quantitative, the average total number of points assigned to abstracts before and after the active learning intervention did not change (pre-intervention = 12.20 points; post-intervention = 12.23 points; V = 1,359.5, p = 0.801). However, student performance on individual rubric categories did change following the intervention. Students’ ability to address the rationale/background in their abstracts significantly improved following the intervention (V = 374, p < 0.001), and their statement of the question/hypothesis improved with marginal significance (V = 234, p = 0.076). There was no change in student scores for the methods (V = 642.5, p = 0.404), results (V = 398.5, p = 0.260), or implications/applications (V = 859, p = 0.846) components. Students’ competence in writing conventions decreased following the intervention (V = 1,695.5, p < 0.001) (Fig. 2). When the writing convention category is removed from the rubric, the average total number of points assigned to abstracts was significantly higher post-intervention (10.38 points) than pre-intervention (9.78 points) (V = 957, p = 0.009) (Fig. 3).
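To make the categorical comparison concrete, here is a minimal Python sketch of a chi-squared test of homogeneity on a two-row table (pre- and post-intervention) with one column per rating category. The counts are invented for illustration and are not the study’s data; the function name is also hypothetical. The degrees of freedom follow the same rule as the values reported above: five categories give df = 4, and the four available for writing conventions give df = 3.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts. Columns whose total is zero
    must be dropped beforehand, otherwise the expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts for one rubric item, 76 abstracts per row.
# Columns: meets expectations, too much detail, too little detail, absent, no agreement.
example_table = [
    [10, 20, 30, 10, 6],  # pre-intervention (illustrative only)
    [20, 20, 20, 10, 6],  # post-intervention (illustrative only)
]

if __name__ == "__main__":
    statistic, df = chi_squared(example_table)
    print(round(statistic, 3), df)
```

The sketch stops at the statistic and degrees of freedom; converting them to a p value needs the chi-squared distribution, which statistical software provides (for example R’s chisq.test or SciPy’s chi2_contingency).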
FIGURE 2
Student scores for all rubric items, pre- and post-intervention. Categorical rubric values were converted to numerical values on a Likert scale, where 3 = “meets expectations,” 2 = “too much detail,” 2 = “too little detail,” and 1 = “absent.” Error bars represent standard error of the mean.
FIGURE 3
The average number of rubric points assigned to each abstract increased following the intervention if the “Writing Conventions” rubric category was excluded from the analysis (p = 0.009). Total possible rubric points = 15. Error bars represent standard error of the mean.
Student concept maps
Students significantly expanded their concept maps following the intervention (V = 0, p < 0.001), with 6.45 average concepts and linking words on the initial maps increasing to 8.34 following the lesson (29.3% increase) (Fig. 4a). Furthermore, following the lesson, the types of words that students added to their maps changed. The three most common words found on students’ initial concept maps included “experiment,” “results,” and “conclusion” (Fig. 4b, Table 1), and none of these three words were added to the concept maps following the intervention (0% increase in use). Instead, following the intervention, students were more likely to add words like “hypothesis” (54.5% increase in use), “purpose” (45.5% increase in use), and “introduction” (38.5% increase in use) (Fig. 4b, Table 1). Additionally, following the intervention, students increased their use of words relating to writing conventions in abstract writing, such as “concise” and “brief” (40.0% and 16.7% increases in use, respectively) (Fig. 4b, Table 1). Some words only appeared once students revised their concept maps, but it is unclear whether some of these words are connected to an increase in student understanding (e.g., “beginning”) (Fig. 4b, Table 1). Two examples of student concept maps can be found in Appendix 7.
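The V statistics reported in these results are consistent with what R prints for a paired rank test, which is usually called the Wilcoxon signed-rank test: V is the sum of the ranks of the positive differences. The Python sketch below shows that calculation under two assumptions that the text does not state: differences are taken as pre minus post, and zero differences are dropped. Under those assumptions, V = 0 for the concept maps simply means that no student’s map shrank. The function name and example counts are hypothetical.

```python
def signed_rank_v(pre, post):
    """Sum of the ranks of positive (pre - post) differences.

    Zero differences are dropped and tied absolute differences share
    the average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(pre, post) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next absolute difference is tied.
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return sum(rank for rank, diff in zip(ranks, diffs) if diff > 0)


# Hypothetical concept counts for four students; every map grew or stayed the same.
example_pre = [5, 6, 7, 8]
example_post = [7, 6, 9, 12]

if __name__ == "__main__":
    print(signed_rank_v(example_pre, example_post))
```

Because the statistic depends only on the direction and rank of each paired difference, it suits ordinal rubric scores and small counts like these.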
FIGURE 4
Students made significant modifications to their concept maps following the intervention. a) The total number of concepts and linking words on student concept maps was significantly higher following the intervention than before the intervention (p < 0.001). Boxes show first and third quartiles with the median as a heavy line, and whiskers extend to 1.5 times the interquartile range. b) The word clouds depict the 15 most common words used in the construction of the initial concept maps and the 15 most common words added by students following the intervention. The size and color of each word are proportional to the number of times the word was used.
TABLE 1
The 10 most common words extracted from both the pre- and post-intervention student concept maps and their usage before and after the intervention.
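| Word | # Times Used Pre-Intervention | # Additional Times Used Post-Intervention | Total # Times Used | Percent Increase in Use |
|---|---|---|---|---|
| results | 29 | 0 | 29 | 0% |
| experiment | 24 | 0 | 24 | 0% |
| conclusion | 21 | 0 | 21 | 0% |
| introduction | 13 | 5 | 18 | 38.5% |
| brief | 12 | 2 | 14 | 16.7% |
| hypothesis | 11 | 6 | 17 | 54.5% |
| purpose | 11 | 5 | 16 | 45.5% |
| concise | 10 | 4 | 14 | 40.0% |
| discussion | 6 | 0 | 6 | 0% |
| entire | 5 | 0 | 5 | 0% |
| report | 0 | 3 | 3 | NA |
| specific | 0 | 3 | 3 | NA |
| background | 0 | 2 | 2 | NA |
| beginning | 0 | 2 | 2 | NA |
| data | 0 | 2 | 2 | NA |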
There is some overlap between the two sets of 10 words.
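The “Percent Increase in Use” column of Table 1 appears to be the number of additional uses divided by the pre-intervention uses, with NA where a word was absent before the lesson. The short Python sketch below recomputes the column from the counts in Table 1 under that reading; the function and variable names are hypothetical.

```python
# (word, times used pre-intervention, additional times used post-intervention)
TABLE_1_COUNTS = [
    ("results", 29, 0),
    ("experiment", 24, 0),
    ("conclusion", 21, 0),
    ("introduction", 13, 5),
    ("brief", 12, 2),
    ("hypothesis", 11, 6),
    ("purpose", 11, 5),
    ("concise", 10, 4),
    ("discussion", 6, 0),
    ("entire", 5, 0),
    ("report", 0, 3),
    ("specific", 0, 3),
    ("background", 0, 2),
    ("beginning", 0, 2),
    ("data", 0, 2),
]


def percent_increase(pre, added):
    """Additional uses as a percentage of pre-intervention uses.

    Returns None when the word was not used before the lesson,
    which corresponds to NA in the table.
    """
    if pre == 0:
        return None
    return round(100 * added / pre, 1)


if __name__ == "__main__":
    for word, pre, added in TABLE_1_COUNTS:
        print(word, pre + added, percent_increase(pre, added))
```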
DISCUSSION
Incorporating deliberate practice and scaffolding when teaching undergraduate biology students to write abstracts enhanced some components of student performance. Following the intervention, students primarily improved at providing rationale and background for their experiment. Prior to the intervention, students already described their methods and results fairly well, so following the intervention, their performance did not change dramatically (though fewer students had a missing results component post-intervention). Students improved minimally or not at all on their questions/hypotheses/predictions and the implications/applications of their work. Due to the poorer quality of writing conventions post-intervention, the overall number of points attributed to each abstract did not change, but students achieved a significantly higher number of rubric points following the intervention when the writing convention category was excluded from the analysis.
Students’ most dramatic improvement was introducing the background and rationale of their work. This result was mirrored by students’ concept maps. Prior to the lesson, students built conceptual frameworks of abstracts using words describing the experiment and its results, whereas following the lesson, students layered on ideas about the study’s motivation and purpose. Students’ ability to articulate their scientific process and distill it down to its most important points provides them the opportunity to see beyond details and contemplate how their experiment fits into and has significance for the larger scientific landscape. This process can motivate students to do their best work and instill in them an authentic appreciation for scientific communication. For example, following a pilot session of the intervention, one student commented “the main takeaway that I got from the presentation is that abstracts are important in the science field as a way to get out your experiment and communicate to the science field effectively.” Although most introductory biology students may never contribute their own article to the scientific literature, an intimate understanding of abstracts can help with scientific reasoning and learning to use information strategically. Such critical thinking can carry students forward into their upper-level courses and beyond, to times in their lives as everyday citizens when they will still need to consider evidence supporting scientific claims.
Overall, student abstract writing skills improved as a result of engaging with the lesson. However, this study is not without limitations. For instance, we did not observe the TAs implementing the lesson, so they may have made changes that we were unaware of. Our intervention was completed in one 45-minute session and we analyzed abstracts from two consecutive assignments; it may take more time for students to develop good abstract writing skills. Analyzing student writing at multiple time points following the intervention could help determine its long-term consequences for student performance. It is also possible that student performance improved partially due to increased exposure to writing abstracts, independent of the intervention itself. Specifically, students received feedback from their TAs on their pre-intervention abstracts that could have helped them avoid similar mistakes post-intervention.
Additionally, the content of the lab and the associated report differed pre- and post-intervention. In the pre-intervention experiment, students serially diluted a known concentration of a common food coloring, and they then measured the absorbance of each dilution to create a standard curve, which was used to determine the concentration of dye in various soft drinks (29). The straightforward experiment primarily taught students laboratory techniques such as operating pipettors and spectrophotometers. Conversely, the post-intervention lab focused on photosynthesis, and involved using multiple methods to investigate how chloroplast membranes capture light energy and use it to energize electrons, and which colors of visible light most effectively generate high-energy electrons in a physiologically active membrane. To reach these goals, students needed to understand polarity, oxidation-reduction reactions, and absorbance spectra (29). In a more recent section of the course (taught by NC), the grades for these two lab reports differed by 10%, indicating that the content of the latter was more difficult for students. The increase in complexity is likely why student adherence to proper writing conventions decreased following the intervention, even though students were more likely to add words related to writing conventions to their post-intervention concept maps. Students struggled to synthesize the post-intervention lab experiments into one cohesive abstract, and as a result, their concision suffered. While the difficulty of the lab activity was independent from what the rubric assessed, moving forward it would be important and informative to implement similar studies that control for both the content of the lab exercise/report and the implementation of the intervention.
Because student performance on some components of abstract writing did not change, it is clear that the null expectation that increased exposure to abstract composition improves student performance is too optimistic. Even following deliberate instruction, some components of the abstract remained unchanged. For instance, students scored the lowest on the hypothesis/research question and implications/applications components of their abstracts, and the proportion of students that met expectations for these categories did not change. Moving forward, it will be important to figure out why students are still not comfortable formulating hypotheses or placing their results within a broader scientific context. One possibility is that because students were not responsible for formulating their own research questions, they may not have felt committed to exploring hypotheses or relating their results to previous work. It is possible that this problem would be alleviated if students were conducting an experiment of their own design (as students do at a later point in this particular course). Future iterations of this intervention could integrate an activity specifically addressing ways to formulate hypotheses or research questions. Another useful area of study would be to survey students about what components of the intervention they found helpful. While we are unable to attribute students’ successes to any one part of our intervention (or to outside influences), surveying or interviewing students about their learning process would help inform future modifications to the intervention.
Based on our results and the results of others, we advocate using scaffolding and deliberate practice to teach difficult topics in the sciences (21, 30–32). Specifically, our evidence-based set of tools will be helpful for improving student abstract writing. It is ready for use in other classrooms and can be improved over time with further modification. Similar exercises could and should also be implemented in lecture- or seminar-based courses, as increased exposure to the comprehension and creation of scientific literature can assist student understanding of the scientific method (33) and ability to critically assess different types of information, whether scientific or not (2), in ways that standard textbooks cannot. However, instructors must be careful to not sacrifice the quality and style of students’ writing conventions for the sake of technical content, as successful scientific literature must excel in both areas (34). Additionally, although writing an abstract could be considered a smaller assignment than writing an entire paper, short forms of writing can foster clarity and concision that aid longer forms of writing and better promote metacognition and reflection. It would be interesting to see whether the results from this study could be extrapolated to an analysis of student composition of full scientific papers. For instance, are students also performing poorly on introduction and discussion sections compared with methods and results, and how can we, as educators, address that? By focusing on the areas where students need more help compared with those where students already perform well, we can maximize both our teaching and student learning in scientific communication.