Jenny L McFarland, Rebecca M Price, Mary Pat Wenderoth, Patrícia Martinková, William Cliff, Joel Michael, Harold Modell, Ann Wright.
Abstract
We present the Homeostasis Concept Inventory (HCI), a 20-item multiple-choice instrument that assesses how well undergraduates understand this critical physiological concept. We used an iterative process to develop a set of questions based on elements in the Homeostasis Concept Framework. This process involved faculty experts and undergraduate students from associate's colleges, primarily undergraduate institutions, regional and research-intensive universities, and professional schools. Statistical results provided strong evidence for the validity and reliability of the HCI. We found that graduate students performed better than undergraduates, biology majors performed better than nonmajors, and students performed better after receiving instruction about homeostasis. We used differential item analysis to assess whether students from different genders, races/ethnicities, and English language status performed differently on individual items of the HCI. We found no evidence of differential item functioning, suggesting that the items do not incorporate cultural or gender biases that would impact students' performance on the test. Instructors can use the HCI to guide their teaching and student learning of homeostasis, a core concept of physiology.
Year: 2017 PMID: 28572177 PMCID: PMC5459253 DOI: 10.1187/cbe.16-10-0305
Source DB: PubMed Journal: CBE Life Sci Educ ISSN: 1931-7913 Impact factor: 3.325
TABLE 1. Overview of the methods used to generate the HCI^a
^a AC, associate’s colleges; BCAS, baccalaureate colleges: arts and sciences focus; MCU, master’s colleges and universities; R1, doctoral universities–highest research activity.
TABLE 2. Types of institutions used in the validation of the HCI
^a See Table 1 for definitions of abbreviations.
TABLE 3. Demographic characteristics of students who participated in the large-scale testing of the HCI (main sample of 669, Table 2)
| Characteristic | Category | Count | Percent |
|---|---|---|---|
| Gender | F | 405 | 61 |
| | M | 246 | 37 |
| | NA | 18 | 3 |
| Age (years) | ≤24 | 494 | 74 |
| | 25–29 | 106 | 16 |
| | ≥30 | 69 | 10 |
| Planning to major in the life sciences | No | 270 | 40 |
| | Yes | 399 | 60 |
| Planning to attend professional school | No | 190 | 28 |
| | Yes | 479 | 72 |
| Race/ethnicity | Asian | 117 | 17 |
| | Black | 39 | 6 |
| | Hispanic | 85 | 13 |
| | White | 343 | 51 |
| | Mixed and other | 54 | 8 |
| | Undisclosed | 31 | 5 |
| English as first language | Yes | 521 | 78 |
| | No | 148 | 22 |
| Year in college | Freshman | 67 | 10 |
| | Sophomore | 137 | 20 |
| | Junior | 171 | 26 |
| | Senior | 216 | 32 |
| | Postbaccalaureate | 78 | 12 |
TABLE 4. The statistical methods used to gather evidence for the validity of the HCI scores
| Method | Analytical question |
|---|---|
| Two-sample t test | Do graduate students in the life sciences perform better on the HCI than undergraduates? |
| Pre/posttesting (t tests) | Do students perform better on the HCI after receiving instruction about homeostasis? Is this improvement bigger than the improvement of students who did not receive any instruction about homeostasis? |
| Mixed-effects linear regression | Do students pursuing majors in the life sciences perform better on the HCI than students pursuing other majors? Is this difference significant when controlling for other variables such as gender, ethnicity, institution, and course? |
| Density plots | Does a range of total scores on the HCI exist for different demographic groups? Can we see a visual difference among the demographic groups? For example, do students pursuing majors in the life sciences perform better on the HCI than students pursuing other majors? Do students from R1 institutions perform better than students from other types of institutions? |
| Test–retest (Pearson correlation) | Is student performance on the HCI repeatable? |
| Cronbach’s alpha | Is the test internally consistent? |
| Test item function (TIF) | How reliable is the HCI for students with different levels of ability? |
| Estimating item difficulty | Does the HCI have a range of difficulties, as indicated by the percentage of students answering each item correctly? |
| Estimating item discrimination | Do strong students perform better on harder questions? |
| Item-person (Wright) map | Does the inventory capture the whole population of students? Do item difficulties correspond to student abilities? |
| Item characteristic curves | Do items have a range of difficulties, and do they have sufficient discrimination? |
| Item information function | For which latent abilities do individual items provide the highest information? |
| DIF analysis | Are the HCI items biased with respect to gender, ethnicity, and English language status? |
| Abstract and applied questions | |
| Paired t test | Is student performance on abstract questions the same as student performance on applied questions? |
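As a rough illustration of the internal-consistency check listed in the table, the sketch below computes Cronbach's alpha from a matrix of 0/1 item scores using only the Python standard library. The function name `cronbach_alpha` and the toy responses are hypothetical and are not taken from the HCI data set; the formula is the standard one, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores).

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-student lists of 0/1 item scores."""
    n_items = len(responses[0])
    # Variance of each item across students.
    item_vars = [
        pvariance([student[i] for student in responses]) for i in range(n_items)
    ]
    # Variance of the students' total scores.
    total_var = pvariance([sum(student) for student in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five students to four items (not HCI data).
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(toy), 3))
```

The same variance estimator is used in the numerator and the denominator, so choosing population or sample variance does not change the ratio.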
FIGURE 1. Students with different levels of experience perform on the HCI as expected. The horizontal midline in box plots is the median, and the top and bottom of each box represent one quartile from the median. Data beyond the end of whiskers are outliers. (A) Graduate students with more exposure to homeostasis perform better than undergraduates (two-sample t test, p = 0.024). (B) Undergraduates perform better on the HCI after receiving instruction (paired t test, p = 0.010). (C) Undergraduates who received instruction about homeostasis had higher gains (measured as the difference in pre–post scores from the sample in B) than master’s students from a professional school studying an unrelated life science field who were naïve to the concept (two-sample t test, p = 0.048). Sample sizes for each comparison are described in Table 2.
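The two-sample comparisons in panels A and C can be sketched as follows. The caption does not say whether equal variances were assumed, so this example uses Welch's unequal-variance form as an assumption; the function name `welch_t` and the scores are hypothetical. It returns only the t statistic and the approximate degrees of freedom, because converting these to a p value needs a t distribution that the standard library does not provide (SciPy's `scipy.stats.ttest_ind` with `equal_var=False` is one option).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each group.
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical total scores (not HCI data).
graduates = [16, 17, 15, 18, 14]
undergraduates = [12, 14, 11, 13, 15]
t_stat, dof = welch_t(graduates, undergraduates)
print(t_stat, dof)
```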
TABLE 5. Final linear mixed-effects model for total score with the demographic variables ordered in terms of how they impact interpretation of the validity of the HCI (as opposed to Table 3)^a
| Parameter | Model-based estimate |
|---|---|
| Intercept | 12.32 ± 0.62*** |
| Major pursued | |
| (reference category: Other) | |
| Life Sciences | 1.01 ± 0.269*** |
| Year | |
| (reference category: Freshman) | |
| Sophomore | 1.00 ± 0.549+ |
| Junior | 0.01 ± 0.552 |
| Senior | 0.26 ± 0.586 |
| Postbaccalaureate | 2.29 ± 0.630*** |
| Gender | |
| (reference category: Male) | |
| Female | −0.77 ± 0.259** |
| NA | −1.77 ± 0.820* |
| Race/ethnicity | |
| (reference category: White) | |
| Asian | −0.48 ± 0.386 |
| Black | −2.28 ± 0.523*** |
| Hispanic | −1.38 ± 0.411*** |
| Mixed and other | −0.75 ± 0.461 |
| NA | −2.31 ± 0.631*** |
| English as first language | |
| (reference category: Yes) | |
| English as second language | −1.53 ± 0.335*** |
^a The differences between school types were not significant after accounting for hierarchical structure with random effects. p values: +<0.1; *<0.05; **<0.01; ***<0.0001.
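To read the estimates in this table, it can help to add them up for a particular student profile. The sketch below does this using the point estimates copied from the table; the names `INTERCEPT`, `EFFECTS`, and `predicted_score` are hypothetical, standard errors are omitted, and the random effects mentioned in the footnote are assumed to be zero, so the result is only the fixed-effects part of a prediction.

```python
# Point estimates copied from the table above (standard errors omitted).
INTERCEPT = 12.32
EFFECTS = {
    "major": {"Other": 0.0, "Life Sciences": 1.01},
    "year": {
        "Freshman": 0.0,
        "Sophomore": 1.00,
        "Junior": 0.01,
        "Senior": 0.26,
        "Postbaccalaureate": 2.29,
    },
    "gender": {"Male": 0.0, "Female": -0.77, "NA": -1.77},
    "race": {
        "White": 0.0,
        "Asian": -0.48,
        "Black": -2.28,
        "Hispanic": -1.38,
        "Mixed and other": -0.75,
        "NA": -2.31,
    },
    "english_first": {"Yes": 0.0, "No": -1.53},
}


def predicted_score(profile):
    """Intercept plus fixed effects; variables left out stay at their reference level."""
    return INTERCEPT + sum(EFFECTS[var][level] for var, level in profile.items())


# Reference student: other major, freshman, male, White, English first language.
print(round(predicted_score({}), 2))
# Life sciences major, postbaccalaureate, female.
print(round(predicted_score(
    {"major": "Life Sciences", "year": "Postbaccalaureate", "gender": "Female"}
), 2))
```

Under these assumptions the reference student's expected total score is the intercept, 12.32, and the second profile comes to 12.32 + 1.01 + 2.29 − 0.77 = 14.85 out of 20.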
FIGURE 2. Density plots comparing total scores on the HCI for different demographic groups. Density plots are read like histograms, where density is analogous to proportion. Each graph shows a range of scores, indicating that the HCI can assess how students from each of these demographic groups understand the concept of homeostasis. (A) Student major. Students pursuing life science majors scored higher than students pursuing other majors (see Table 5; p < 0.001). (B) Course audience. Students enrolled in physiology courses for science majors scored higher than students in courses for allied health students or in courses for nonmajors (but in the final mixed-effects model, this difference is captured by student major and thus not significant). (C) Type of institution. The students in our sample who attended doctoral universities (highest research activity [R1]) tended to perform better than those at master’s colleges and universities (MCU) and baccalaureate colleges: arts and sciences focus (BCAS). Students from associate’s colleges (AC) show a bimodal distribution, and the higher mode is comparable to performance of students at R1 institutions. The final model accounts for the fact that courses are different, which encompasses the difference among institutions.
FIGURE 3. Difficulty, which is represented as the proportion of students who answered the item correctly, and discrimination for the 20 questions in the HCI. The items are arranged by percent correct, with the most frequently incorrect on the left, and the most frequently correct on the right. The horizontal line represents a discrimination of 0.2, which is usually considered the minimum discrimination for items to be included in a concept inventory (Nunnally and Bernstein, 1994). However, since item 17 tests a critical misconception about how the body responds to the complete cessation of a signal from the sensor, we felt it was essential to include it in the HCI.
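The two quantities plotted in this figure can be sketched in a few lines. Difficulty follows the caption's definition (proportion correct). The caption does not state how discrimination was computed, so the example assumes an upper-lower index: the difference in proportion correct between the top and bottom thirds of students ranked by total score. The function names and the toy responses are hypothetical.

```python
def item_difficulty(responses, item):
    """Proportion of students who answered `item` correctly."""
    return sum(student[item] for student in responses) / len(responses)


def upper_lower_discrimination(responses, item, fraction=1 / 3):
    """Proportion correct among top scorers minus that among bottom scorers."""
    ranked = sorted(responses, key=sum)  # ascending by total score
    n = max(1, round(len(ranked) * fraction))
    lower, upper = ranked[:n], ranked[-n:]
    return item_difficulty(upper, item) - item_difficulty(lower, item)


# Hypothetical responses of six students to three items (not HCI data).
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
for item in range(3):
    d = upper_lower_discrimination(toy, item)
    flag = "below 0.2" if d < 0.2 else "ok"
    print(item, round(item_difficulty(toy, item), 2), round(d, 2), flag)
```

The 0.2 cutoff in the loop is the threshold quoted in the caption; items falling below it would be candidates for review rather than automatic removal, as the authors' decision about item 17 shows.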
FIGURE 4. Item-person map. The left panel describes the distribution of student abilities, as determined with a one-parameter IRT model; values are arranged from the most able at the top to least able at the bottom. The items in the right panel are organized from the most difficult at the top to the least difficult at the bottom. Here, an item’s difficulty is expressed on the ability scale: a student whose ability equals the item’s difficulty has a 50% probability of answering the item correctly. For the students in our sample, we found a range of item difficulties in the HCI, from item 3 (frequently correct) to item 17 (frequently incorrect).
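The 50% property described in this caption follows directly from the one-parameter (Rasch) response function, sketched below under the usual logistic form. The function name `rasch_probability` and the numeric values are illustrative assumptions, not estimates from the HCI.

```python
from math import exp


def rasch_probability(ability, difficulty):
    """Probability of a correct answer under a one-parameter logistic model."""
    return 1.0 / (1.0 + exp(-(ability - difficulty)))


# When ability equals item difficulty, the probability is exactly one half.
print(rasch_probability(0.7, 0.7))
# A more able student has a better-than-even chance on the same item.
print(rasch_probability(1.7, 0.7) > 0.5)
```

Because ability and difficulty share one scale, students and items can be placed side by side, which is what the item-person map does.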
FIGURE 5. Results from the three-parameter IRT model. (A) Item characteristic curves describe the probability of an item being answered correctly by a student of a given ability. Ability is displayed as SDs from average. Item discrimination is represented by the slope; the relatively flat curve for item 17 corresponds to its low discrimination for students of low abilities. (B) Item information curves represent how well the items distinguish between strong and weak students for a given ability; note that item 13 discriminates particularly well among students of average ability. (C) The TIF represents the reliability of the test for students of different abilities. Based on the peak of this curve, the HCI is most reliable for students whose abilities range from −1 to 1.
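The three panels can be related to one another with a short sketch of the three-parameter logistic model, assuming the common parameterization with discrimination `a`, difficulty `b`, and guessing parameter `c` and no extra scaling constant. The function names and the item parameters below are hypothetical, not the fitted HCI values. Panel A corresponds to `icc_3pl`, panel B to `item_information`, and panel C to the sum of the item information over all items.

```python
from math import exp


def icc_3pl(theta, a, b, c):
    """Probability of a correct answer at ability theta (item characteristic curve)."""
    return c + (1 - c) / (1 + exp(-a * (theta - b)))


def item_information(theta, a, b, c):
    """Item information for the 3PL model at ability theta."""
    p = icc_3pl(theta, a, b, c)
    return a ** 2 * ((p - c) / (1 - c)) ** 2 * (1 - p) / p


def total_information(theta, items):
    """Test information: the sum of item information over (a, b, c) tuples."""
    return sum(item_information(theta, a, b, c) for a, b, c in items)


# Hypothetical item parameters (a, b, c); not estimates from the HCI.
items = [(1.5, -0.5, 0.2), (0.4, 1.5, 0.25), (2.0, 0.0, 0.2)]
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(total_information(theta, items), 3))
```

A steep item (large `a`) contributes a tall, narrow information peak near its difficulty, whereas a flat item contributes little anywhere, which is the pattern the caption describes for items 13 and 17.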
FIGURE 6. Tetrachoric correlation heat map. The items are ordered into clusters based on how correlated they are with each other. Items 4, 9, and 20 form a cluster; items 7 and 17 do not cluster with any other items; and the rest of the items cluster together.
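Tetrachoric correlations are normally estimated by maximum likelihood in statistical software, and the caption does not say which estimator was used. Purely to illustrate what the heat map cells measure, the sketch below uses the simpler cosine-pi approximation, r ≈ cos(π / (1 + √(ad/bc))), where a, b, c, and d are the cells of the 2 × 2 table for two dichotomous items. This is a swapped-in approximation, not the authors' method, and the function name and data are hypothetical.

```python
from math import cos, pi, sqrt


def tetrachoric_cos_pi(x, y):
    """Cosine-pi approximation to the tetrachoric correlation of two 0/1 vectors."""
    pairs = list(zip(x, y))
    a = pairs.count((1, 1))  # both items correct
    b = pairs.count((1, 0))  # only the first item correct
    c = pairs.count((0, 1))  # only the second item correct
    d = pairs.count((0, 0))  # both items incorrect
    if b * c == 0:
        raise ValueError("approximation undefined when a discordant cell is empty")
    return cos(pi / (1 + sqrt(a * d / (b * c))))


# Hypothetical responses of eight students to two items (not HCI data).
item_x = [1, 1, 1, 1, 0, 0, 0, 0]
item_y = [1, 1, 1, 0, 1, 0, 0, 0]
print(round(tetrachoric_cos_pi(item_x, item_y), 3))
```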
FIGURE 7. Students’ performance on abstract questions was indistinguishable from their performance on related questions that applied knowledge to real-world scenarios (paired t test, p = 0.132). Here, difficulty is represented as the percentage of students who answered the item correctly (% correct), as in Figure 3.
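A paired t test of the kind cited in this caption works on the differences within each matched pair. The sketch below computes the t statistic and degrees of freedom with the standard library; the function name `paired_t` and the percent-correct values are hypothetical, and obtaining a p value would again require a t distribution (for example SciPy's `scipy.stats.ttest_rel`).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return the paired t statistic and degrees of freedom for matched values."""
    diffs = [s - f for f, s in zip(first, second)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical % correct on matched abstract and applied items (not HCI data).
abstract = [60, 50, 70, 40]
applied = [61, 53, 71, 43]
t_stat, dof = paired_t(abstract, applied)
print(round(t_stat, 3), dof)
```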