Michael Bramhall, Oscar Flórez-Vargas, Robert Stevens, Andy Brass, Sheena Cruickshank.
*Bio-Health Informatics Group, School of Computer Science, University of Manchester, Manchester, United Kingdom; and †Manchester Immunology Group, Faculty of Life Science, University of Manchester, Manchester, United Kingdom.
Abstract
BACKGROUND: Current understanding of the onset of inflammatory bowel diseases relies heavily on data derived from animal models of colitis. However, the omission of information concerning the method used makes the interpretation of studies difficult or impossible. We assessed the current quality of methods reporting in 4 animal models of colitis that are used to inform clinical research into inflammatory bowel disease: dextran sulfate sodium, interleukin-10, CD45RB T cell transfer, and 2,4,6-trinitrobenzene sulfonic acid (TNBS). METHODS: We performed a systematic review based on PRISMA guidelines, using a PubMed search (2000-2014) to obtain publications that used a microarray to describe gene expression in colitic tissue. Methods reporting quality was scored against a checklist of essential and desirable criteria. RESULTS: Fifty-eight articles were identified and included in this review (29 dextran sulfate sodium, 15 interleukin-10, 5 T cell transfer, and 16 TNBS; some articles used more than 1 colitis model). A mean of 81.7% (SD = ±7.038) of criteria were reported across all models. Only 1 of the 58 articles reported all essential criteria on our checklist. Animal age, gender, housing conditions, and mortality/morbidity were all poorly reported. CONCLUSIONS: Failure to include all essential criteria is a cause for concern; this failure can have a large impact on the quality and replicability of published colitis experiments. We recommend adoption of our checklist as a requirement for publication to improve the quality, comparability, and standardization of colitis studies; this would make interpretation and translation of data to human disease more reliable.
Inflammatory bowel diseases (IBD) are a spectrum of multifactorial, chronic inflammatory diseases of the digestive tract, typically involving some degree of colitis. The etiology of IBD is still unclear, but genome-wide association studies have identified >160 genetic loci implicated in IBD susceptibility.1 By knocking out or interfering with a number of these IBD-associated genes in animals (e.g., interleukin [IL]-10−/−, IL-2−/−, STAT3−/−),2 many of the symptoms, pathology, pathways, and histological features of IBD can be accurately reproduced in rodent models.3 Mouse models have advanced our understanding of IBD and provided strong evidence of links between genetic predisposition and the loss of microbial tolerance in the onset of chronic colitis, as exemplified by genetically susceptible mice failing to develop colitis when housed in germ-free conditions.4
In order for the vast quantities of data derived from animal experimentation to be translated reliably into human studies, published experiments must be reported in sufficient detail to allow accurate comparison, reproduction, replication, and interpretation.5 The ARRIVE (Animal Research: Reporting of In Vivo Experiments) guidelines suggest that reporting omissions prevent readers from reaching useful conclusions.6 Recent work by the Reproducibility Initiative has highlighted the obstacles that can arise when repeating experimental work if the materials and methods have been insufficiently described in published articles.7 In addition, this problem has become increasingly relevant due to the surge in interdisciplinary research, where experts from clinical or nonbiology backgrounds may be responsible for curating, managing, and analyzing data derived from laboratory experiments. These individuals may not be able to identify or infer the details missing from experimental methods that could affect data quality.
In recent years, a number of methods reporting guidelines and checklists have been developed, with a focus on a particular type of protocol (e.g., the minimum information guidelines group, MIBBI8) or a general theme, such as the ARRIVE guidelines for experiments using animal models.6 These interventions have largely been successful in raising awareness of flawed methods reporting within the scientific literature, gaining the support of journals, publishing houses, and members of the scientific community.5,6,8,9 In several cases, publishers have implemented stricter guidelines for methods quality, introduced broad checklists, and removed limitations on word counts for methods reporting.10 However, there is still a lag between implementation of these measures and adherence to them.11
We recently examined the quality of methods reporting in parasitology experiments,12 highlighting the need for domain-specific guidelines: bespoke checklists tailored by experts that can be used to assess and improve the methods reporting quality within their community. These checklists can be implemented before the point of publication, acting as a barrier to prevent incomplete methods from entering the literature, and also as a review tool for nonexperts when assessing article quality postpublication. Animal models of colitis are numerous, with at least 60 established IBD models currently being used.2 These models use diverse methods, and the exact mechanics of colitis induction (and the IBD they best model) are poorly understood in some cases. In this article, we aim to briefly summarize the types of colitis model that IBD researchers have at their disposal, highlight some of the problems that experimenters face in producing reliable and robust data from these models, and assess the current quality of methods reporting in published experiments in a subset of available colitis models by scoring them against a checklist of essential and desirable reported methods criteria. The selected criteria cover key aspects that can affect the outcome of colitis in animal models.
We have included checklist criteria relating to 3 broad areas. First, animal sex, age, origin, and housing are considered. These can affect the severity of inflammation, the balance of microbiota in the gut (e.g., strain, diet, acclimation), and animal stress levels (e.g., temperature, animals per cage), and therefore collectively modulate the severity of induced colitis.13–18 Second, factors pertaining to the colitis model, such as genetic modification of animals, origin of chemicals,19 and dosing, should be recorded so that the experiment is repeatable under the same conditions. Finally, criteria relating to the measurement of colitis, time course of the experiment, and clinical monitoring of animals during the experiment should be reported as standard to determine the success of colitis induction and to provide a means by which similarity between experiments can be determined for inclusion in systematic reviews and meta-analyses.
Animal Models of IBD
Animal models of colitis have a number of distinct advantages over clinical data when it comes to determining the cause and prevention of IBD. For example, by controlling the onset of inflammation in the laboratory, the failures of immune tolerance, susceptibility genes, and specific proinflammatory pathways involved in triggering colitis can be identified more easily than in a patient admitted with progressive disease and potential comorbidities. Anticolitic preventative measures may also be tested before symptoms occur in an animal model, an impossible task in current treatment of human IBD, where new patients usually only present once the disease reaches clinical significance. The pathway of inflammation can also be accurately modulated in laboratory models to emulate acute or chronic disease depending on the strain of animal used, the mechanism of induction, and the use of intervals between deliveries of proinflammatory stimulus.
Although the range of IBD models is diverse, they can be broadly categorized into 4 groups: chemically induced, biologically induced, genetic (including congenic and genetically modified animals), and cell transfer models. We have chosen a cross-section of colitis models to assess methods reporting quality in this field: dextran sulfate sodium (DSS), IL-10 knockout (IL-10−/−), CD4+ CD45RBhigh T cell transfer, and 2,4,6-trinitrobenzene sulfonic acid (TNBS). In addition to animal housing conditions having an impact on the microbiota composition, which itself has a major impact on colitis models, different colitis models have specific criteria that influence their reproducibility, as summarized below.
DSS-induced Colitis Model
DSS is one of the most commonly used inducers of colitis in animal models, thanks largely to its ease of use and potentially short turnaround times for obtaining results. DSS is typically administered in the drinking water of mice or rats at a dose dependent on the strain of animal, the severity of inflammation desired, and the length of the experiment. Acute and resolving inflammation usually occurs after a single continuous exposure to DSS in drinking water over a week or less, whereas repeated exposure punctuated with recovery periods results in chronic inflammation. The exact mechanism by which DSS induces colitis is still poorly understood, but its primary mode of action seems to be chemical interference with gut mucosal barrier integrity, allowing luminal antigens access to the lamina propria and the proinflammatory cells within.20 Other factors that can influence the severity of, and susceptibility to, DSS-induced colitis are the manufacturer and molecular weight of DSS,19 the strain of animal used (C3H/HeJ and BALB/c mice show increased susceptibility), gender (males are more susceptible), and whether animals are raised in germ-free or specific pathogen-free environments.20
IL-10−/− Chronic Colitis Model
IL-10 is an anti-inflammatory cytokine that functions to prevent excessive inflammatory and autoimmune pathology.21 Genome-wide association studies and clinical observations have identified IL-10 as a susceptibility gene for both Crohn's disease and ulcerative colitis.22 By employing a number of genetic mechanisms, IL-10 or its receptor has been knocked out or functionally impaired to create several murine systems for the study of inflammation. IL-10−/− mice housed under normal conditions develop chronic inflammation in the gut, but they remain healthy when housed under germ-free conditions or with a defined selected microbiota, and administration of antibiotics can prevent the onset of colitis in IL-10−/− mice.21 Consequently, to standardize microbial influence on triggering colitis in the IL-10−/− model, specific enteric microbes such as Enterococcus faecalis or Helicobacter hepaticus may be used as an inoculum for mice that have been raised in germ-free housing.
T Cell Transfer Colitis Model
The T cell transfer model builds on the understanding that T lymphocytes play a pivotal role in the onset of colitis: mediating between antigen presenting cells and generating targeted immune responses to commensal enteric bacteria. In this model, naive T cells (CD4+ CD45RBhigh or CD4+ CD62L+) are adoptively transferred from wild-type mice into genetically identical mice lacking T cells and B cells (e.g., SCID or RAG−/− mice). The onset of symptoms occurs 2 weeks after T cell transfer in the recipient mice, with pancolitis present from 4 weeks.23 Due to the extraction, isolation, purification, and injection of adoptive T cells, this model requires a much more complex and labor-intensive protocol than many other IBD models. Factors that influence the resulting colitis include the strain of animal used, the number and viability of T cells transferred, and the presence of B cells in the recipient animals.23
TNBS-induced Colitis Model
TNBS is a chemical administered rectally in the form of an enema to mice or rats. TNBS is administered in combination with ethanol, which disrupts the mucous barrier, and it is generally thought that TNBS induces colitis by haptenating proteins within the gut, causing them to become preferential targets for immune cells.24 As with other chemically induced colitis models, the severity of TNBS-induced colitis depends largely on the dosage applied and the strain of animal used.24
Scope of this Study
A vast amount of clinical and experimental IBD data is available: a PubMed search for the Medical Subject Headings (MeSH) term “inflammatory bowel diseases”[MeSH] from the year 2000 to present returns 30,931 articles. Researchers and health professionals cannot possibly hope to consult all the data to make decisions, so we are becoming increasingly reliant on meta-analyses and combinatory repositories to inform translation from animal experiments to clinical practice: it is vitally important that these processes are built on reliable foundations. This creates a pressing need to annotate and accurately record experiments from disparate sources, yet this information is often lacking. Not only does this prevent construction of well-founded knowledge-base systems, but it also prevents others from fully understanding the validity of results in the context of the experimental setting. How can a reader know whether 2 experiments are comparable if the methods from each experiment are not explicitly clear? In addition, geographical and language barriers or the use of nondomain experts may prevent the fluid exchange of tacit knowledge, resulting in subtle, yet important, omissions when describing experiments.25
To determine whether experiments in the field of primary colitis research are reported with adequate clarity and detail for replication, reproduction, and comparison, we defined a checklist of essential parameters that must be included and desirable parameters that ought to be included when describing experimental animal colitis. We then conducted a PubMed search to obtain a corpus of articles using DSS, IL-10−/−, T cell transfer, or TNBS colitis models for assessment with the checklist. To gather a manageable number of results, we limited the search to studies published after 2000 that conducted a microarray on colitic tissues.
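As an illustration of the kind of date-restricted MeSH query mentioned at the start of this section, the sketch below builds a request URL for the NCBI E-utilities `esearch` endpoint. It is not the authors' search procedure: the helper name `build_pubmed_query` and the upper year bound of 2014 are assumptions made for the example, and the article count returned today would differ from the figure quoted above. No network request is made.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(term, min_year, max_year):
    """Build an esearch URL restricted to a publication-date range.

    Hypothetical helper for illustration; it only assembles the URL.
    """
    params = {
        "db": "pubmed",        # search the PubMed database
        "term": term,          # query string, e.g. a MeSH term
        "datetype": "pdat",    # filter on publication date
        "mindate": str(min_year),
        "maxdate": str(max_year),
        "retmax": "0",         # ask for the hit count only, no ID list
    }
    return ESEARCH_URL + "?" + urlencode(params)


# The MeSH term is quoted from the text; the year range is an assumption.
url = build_pubmed_query('"inflammatory bowel diseases"[MeSH]', 2000, 2014)
print(url)
```

Model-specific searches would substitute the terms listed in Table 1 for the `term` argument.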
MATERIALS AND METHODS
A systematic search was performed following the recommendations of the PRISMA guidelines.26 Relevant search terms were selected to identify published articles that used 1 (or more) of 4 animal models of colitis: DSS, IL-10−/−, T cell transfer, or TNBS. The search was narrowed down to select only those articles that conducted a microarray on colonic tissues. Assessed criteria were divided into 3 sections in a protocol: aspects relating to the animal and its housing conditions, description of the model of perturbation used, and criteria describing the assessment of colitis and the experimental design. The protocol used here for assessing criteria has not been previously published.
The literature search was conducted using PubMed in June 2014 and included articles published in English from January 1, 2000 to June 1, 2014. The search terms included MeSH (Medical Subject Headings) terms and text strings, as outlined in Table 1. The year 2000 was selected as the cutoff because high-throughput analytical techniques became more commonplace after the publication of the first draft of the human genome. The DSS model was chosen as this is the most commonly used colitis model.19 We also selected TNBS as a comparative chemical inducer of colitis, IL-10−/− to represent genetically modified colitis models, and T cell transfer as an example of a model that requires additional, more complex steps in its methods. Biologically induced colitis models, where bacterial or helminthic challenge is used to induce colitis, were not specifically included in this study. However, a number of IL-10−/− articles did include bacterial induction, where a specific cocktail of common murine bacterial strains was used to inoculate germ-free IL-10−/− mice (the checklist is capable of handling biologically induced colitis models). In addition, Trichuris muris–induced colitis, while not universally accepted as an IBD model, bears many phenotypic and transcriptional similarities to more traditional IBD models.27 However, we chose not to include the T. muris infection model in this review as it was covered to some degree in our previous methods quality article.12
TABLE 1
PubMed Search Terms Used for Each Colitis Model Included in the Systematic Review
Inclusion Criteria
Primary research articles published in English, within the date constraints, that were returned in the PubMed search were considered for inclusion based on the title and abstract. Reviews, meta-analyses, and experiments that did not use any of the 4 chosen models were excluded. In addition, articles that conducted microarrays on human tissue or primary cell culture tissue only were also excluded, along with articles that were based on microarray data from a previous study. We also excluded combined colitis and carcinogenesis models. The resulting corpus of articles was assessed using the bespoke methods reporting checklist for animal models of colitis.
Checklist
A checklist of essential criteria that must be included and nonessential criteria that are useful to include when reporting the results of animal models of colitis was drawn up (Table 2), with additional input by experts in the field of colitis research. Articles were assessed on whether they included each criterion within the published article, supplementary methods, or relevant cited articles. For each criterion, an article received a weighted score if the criterion was present or not applicable, and zero if the item was absent. Total scores for all criteria were tallied to provide a final percentage score for successfully reported criteria. Data extraction and assessment was conducted by one reviewer, and half of the articles were randomly selected and scored blind by the second reviewer. Inconsistencies were discussed by both reviewers until a consensus was reached.
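The scoring rule described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the criterion names and weights are hypothetical placeholders, not the contents of Table 2, and the function name is an assumption.

```python
# Hypothetical criterion weights (illustrative only; not the values in Table 2).
WEIGHTS = {"animal_sex": 10, "animal_age": 10, "acclimation": 7, "mortality": 9}

# Statuses that earn the full weight of a criterion.
CREDITED = ("present", "not_applicable")


def article_score(status, weights=WEIGHTS):
    """Return the percentage of weighted criteria an article reported.

    `status` maps each criterion to "present", "not_applicable", or "absent".
    Present and not-applicable criteria earn their full weight; absent
    (or unlisted) criteria earn zero.
    """
    total = sum(weights.values())
    earned = sum(
        weight
        for name, weight in weights.items()
        if status.get(name, "absent") in CREDITED
    )
    return 100.0 * earned / total


# One hypothetical article: two criteria credited, two absent.
example = {
    "animal_sex": "present",
    "animal_age": "absent",
    "acclimation": "not_applicable",
    "mortality": "absent",
}
print(round(article_score(example), 2))
```

Treating "not applicable" as credited means an article is not penalized for criteria that cannot apply to its model, such as bacterial strain details in a chemically induced model.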
TABLE 2
Checklist of Essential and Desirable Criteria and the Weighting Applied to Each Criterion for Reporting Methods in Animal Models of Colitis
Weighting
Weight per item was determined in consultation with 3 colitis experts (Table 2). Criteria were assigned a weight by a combination of 2 factors: whether the item was considered essential (Y/N), and whether the item was determined to be of low, medium, or high importance (L/M/H). Weighted scores were allocated as follows: (Y = 5 or N = 2) and (L = 3 or M = 4 or H = 5). Therefore, each criterion received a score between 5 and 10, which was then used to determine the weight as a percentage of the sum of all scores. Where disagreement occurred in allocating weighting to criteria, the majority vote was used.
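The weighting arithmetic can be made concrete with a short sketch. The point values (Y = 5, N = 2; L = 3, M = 4, H = 5) are taken from the text; the three criteria and their ratings are hypothetical examples, not entries from Table 2.

```python
# Point values stated in the text.
ESSENTIAL_POINTS = {"Y": 5, "N": 2}
IMPORTANCE_POINTS = {"L": 3, "M": 4, "H": 5}


def raw_score(essential, importance):
    """Sum the essential (Y/N) and importance (L/M/H) points: range 5 to 10."""
    return ESSENTIAL_POINTS[essential] + IMPORTANCE_POINTS[importance]


def percentage_weights(ratings):
    """Convert raw scores to weights expressed as a percentage of their sum."""
    raw = {name: raw_score(e, i) for name, (e, i) in ratings.items()}
    total = sum(raw.values())
    return {name: 100.0 * score / total for name, score in raw.items()}


# Hypothetical ratings for three criteria (illustrative only).
ratings = {
    "animal_sex": ("Y", "H"),         # essential, high importance -> 10
    "acclimation": ("N", "M"),        # desirable, medium importance -> 6
    "chemical_supplier": ("N", "L"),  # desirable, low importance -> 5
}
weights = percentage_weights(ratings)
for name, weight in weights.items():
    print(f"{name}: {weight:.2f}%")
```

The lowest possible raw score is N + L = 5 and the highest is Y + H = 10, which matches the stated range.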
Journal Impact Factor
Journal impact factor (IF) was retrieved from the Institute for Scientific Information (ISI) Journal Citation Reports (JCR) database 2013.
Confirmation of Impartiality in Scoring of Studies
Half of all articles accepted were randomly selected and scored using the checklist by the second reviewer. Differences between scores were assessed using a Bland–Altman comparison and linear correlation to determine whether any reviewer bias was present.
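For readers unfamiliar with the Bland–Altman comparison, the sketch below computes its usual ingredients: the per-article mean and difference of the two reviewers' scores, the mean difference (bias), and conventional 95% limits of agreement. The scores are invented for illustration, and the use of 1.96 standard deviations is the standard convention rather than a detail stated in the text; the analysis in this study was run in GraphPad Prism.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Return Bland-Altman quantities for two paired lists of scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    means = [(a + b) / 2 for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)   # systematic difference between reviewers
    sd = stdev(diffs)    # sample standard deviation of the differences
    return {
        "means": means,  # x-axis of the plot
        "diffs": diffs,  # y-axis of the plot
        "bias": bias,
        "lower": bias - 1.96 * sd,  # lower limit of agreement
        "upper": bias + 1.96 * sd,  # upper limit of agreement
    }


# Hypothetical percentage scores from two reviewers for five articles.
reviewer_1 = [82.0, 75.5, 90.0, 68.0, 84.5]
reviewer_2 = [80.0, 77.0, 89.0, 70.0, 83.0]
result = bland_altman(reviewer_1, reviewer_2)
print(f"bias = {result['bias']:.2f}")
print(f"limits of agreement = {result['lower']:.2f} to {result['upper']:.2f}")
```

A bias close to zero, with differences scattered evenly across the range of means, indicates that neither reviewer scores systematically higher than the other.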
Statistical Analysis
Data were analyzed by two-way analysis of variance, Bland–Altman correlation, and linear correlation using GraphPad Prism version 6.05 (Windows) and 6.0f (Mac), GraphPad Software, La Jolla CA, www.graphpad.com.
Ethical Considerations
There are no ethical considerations.
RESULTS
Search Strategy
A total of 58 unique studies were identified for inclusion in the review (see Fig., Supplemental Digital Content 1, http://links.lww.com/IBD/A789). Six of the included articles were applicable to more than 1 of the colitis models and were subsequently included in the datasets for every relevant model (29 DSS,28–56 15 IL-10−/−,36,49,50,57–68 5 T cell transfer,56,69–72 and 16 TNBS35,56,61,73–85; for details of all included studies see Table, Supplemental Digital Content 2, http://links.lww.com/IBD/A790). Duplicate articles were only included once in summary analyses where data from all models are combined. The PubMed searches returned 256 unique articles (54 DSS, 146 IL-10−/−, 42 T cell transfer, and 21 TNBS), 188 of which were rejected based on the title and abstract. A further 10 articles were excluded after assessing the full text of the article, leaving a corpus of 58 articles for analysis.
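The counts in this paragraph can be cross-checked with a few lines of arithmetic. All numbers below are taken from the text; the variable names are illustrative.

```python
# Article flow through the screening stages (numbers from the text).
returned = 256                 # unique articles returned by PubMed
rejected_title_abstract = 188  # rejected on title and abstract
rejected_full_text = 10        # rejected after full-text assessment
included = returned - rejected_title_abstract - rejected_full_text

# Per-model dataset sizes; articles using several models are counted in each.
per_model = {"DSS": 29, "IL-10-/-": 15, "T cell transfer": 5, "TNBS": 16}
model_entries = sum(per_model.values())

# Extra entries created by articles that appear under more than one model.
extra_entries = model_entries - included

print(included, model_entries, extra_entries)
```

The 7 extra entries are consistent with 6 multi-model articles if 5 of them appear under 2 models and 1 appears under 3, which agrees with reference 56 being listed under DSS, T cell transfer, and TNBS.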
Quality of Methods Reporting
Each article was assessed for inclusion of the criteria outlined in the quality checklist, which was subdivided into 3 domains: animal, model, and experiment, corresponding to subject, perturbation, and outcome. The mean weighted score across all colitis models was 81.7% (SD = ±7.038) of criteria reported. By model, articles using the DSS model had the highest quality of methods reporting (mean = 83.30%, SD = ±7.019), and the lowest quality was observed in articles using the T cell transfer model (mean = 73.19%, SD = ±5.328): significantly lower than DSS (P ≤ 0.01) and IL-10−/− (P ≤ 0.05) colitis models (Fig. 1A). Individually, the lowest-scoring article scored 64.05% (T cell transfer model72), and the highest-scoring article scored 94.86% (DSS model52). No article reported 100% of the criteria on our checklist, but 1 of the 58 articles assessed (DSS model39) reported all essential criteria for every domain.
FIGURE 1
A, Overall scores (percent criteria reported) for the quality of methods reporting for each colitis model included in this review. The T cell transfer model scored significantly lower than DSS (P ≤ 0.01) and IL-10−/− (P ≤ 0.05) colitis models. n = 29 (DSS), 15 (IL-10−/−), 5 (T cell transfer), and 16 (TNBS). Analysis by two-way ANOVA. B, Methods reporting quality (percent criteria reported) for each of the 3 subsections of the quality reporting checklist. Criteria relating to the model subsection scored higher than the animal and experimental design subsections. Within the experimental design subsection, DSS and IL-10−/− scored significantly higher than both T cell transfer (P ≤ 0.05) and TNBS (P ≤ 0.001 and P ≤ 0.01, respectively) colitis models. n = 29 (DSS), 15 (IL-10−/−), 5 (T cell transfer), and 16 (TNBS). Analysis by two-way ANOVA. ANOVA, analysis of variance.
The best reported domain was the model itself (mean = 95.80%, SD = ±3.018), followed by animal criteria (mean = 64.05%, SD = ±6.992) and experiment criteria (mean = 56.44%, SD = ±10.225). Looking at scores per domain by colitis model, IL-10−/− had the highest quality for the animal domain (mean = 70.99%, SD = ±20.194), TNBS had the highest quality for the model domain (mean = 98.94%, SD = ±1.914), and DSS had the highest quality for the experiment domain (mean = 65.78%, SD = ±13.810). The T cell transfer model had the lowest mean scores for all 3 domains (animal = 54.95%, SD = ±7.770; model = 92.00%, SD = ±2.937; experiment = 46.58%, SD = ±14.908) (Fig. 1B). 
For full details of methods reporting quality for each included study see Tables, Supplemental Digital Content 3-14, http://links.lww.com/IBD/A935, http://links.lww.com/IBD/A936, http://links.lww.com/IBD/A937, http://links.lww.com/IBD/A938, http://links.lww.com/IBD/A939, http://links.lww.com/IBD/A940, http://links.lww.com/IBD/A941, http://links.lww.com/IBD/A942, http://links.lww.com/IBD/A943, http://links.lww.com/IBD/A944, http://links.lww.com/IBD/A945, and http://links.lww.com/IBD/A946.
DSS-induced Colitis Model
For DSS colitis, the most poorly reported criteria for the animal domain were food/water, acclimation, animal gender, and animal age (44.83%, 41.38%, 31.03%, and 20.69% of articles failed to report the criteria, respectively). When describing the DSS model itself, 9 articles (31.03%) failed to provide any information about the molecular weight of the DSS used, and 17.24% of articles did not provide information about the supplier of the DSS chemical (Fig. 2). A more detailed examination of the reporting of molecular weight of DSS revealed that of the 20 articles (68.97%) that provided information about the molecular weight of DSS, only 5 (17.24%) used the correct units of measurement: of the remaining 15 articles, 13 (44.83%) provided no units and 2 (6.90%) used incorrect units. Of the 29 articles that used DSS colitis, 24 (82.76%) failed to correctly report the nature of the DSS molecule that they used to induce colitis. The worst reported essential criteria in the experiment design domain were mortality reporting, colon length/weight measurements, animal weight loss, and colitis scoring by histology (72.41%, 51.72%, 20.69%, and 10.34% of articles failed to report these criteria, respectively).
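The molecular weight breakdown can be reproduced from the article counts. The counts are taken from the text; the category labels and variable names are illustrative.

```python
# Number of DSS articles in each molecular weight reporting category.
n_articles = 29
counts = {
    "no molecular weight given": 9,
    "value with correct units": 5,
    "value with no units": 13,
    "value with incorrect units": 2,
}

# Each category as a percentage of all 29 DSS articles.
percentages = {
    label: round(100 * count / n_articles, 2) for label, count in counts.items()
}

# Articles that did not correctly report the molecular weight.
incorrect_or_missing = n_articles - counts["value with correct units"]
incorrect_or_missing_pct = round(100 * incorrect_or_missing / n_articles, 2)

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
print(f"not correctly reported: {incorrect_or_missing} ({incorrect_or_missing_pct}%)")
```

The four categories account for all 29 articles, and the 24 articles outside the "correct units" category give the 82.76% figure quoted above.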
FIGURE 2
Proportion of all DSS articles that correctly and incorrectly described the molecular weight of the DSS used in the experiment. Correct reporting of DSS was only described in 17.24% of articles, and no information at all was provided in 31.03% of the studies assessed (n = 29).
IL-10−/− Chronic Colitis Model
In the animal domain, the criteria most poorly reported in the articles using the IL-10−/− model were very similar to those missing in the DSS model: acclimation, gender, and food/water were the most commonly absent essential criteria (46.67%, 40%, and 33.33% of articles failed to report, respectively). For the IL-10−/− model itself, measurement of bacterial colonization in the gut was poorly reported when specific bacterial inoculation was used to induce colitis (53.33% failed to report criteria). In addition, 26.67% of IL-10−/− articles did not specify the strain(s) of bacteria used to induce colitis. The worst reported criteria relating to the experimental design were mortality reporting and colon weight/length measurements, which were both absent in 66.67% of articles.
T Cell Transfer Colitis Model
For articles using the T cell transfer model, the worst reported criteria in the animal domain were food/water and acclimation (100% and 80% of articles failed to report these criteria, respectively). Gender of animals used was also not specified in 1 of the 5 T cell transfer articles (20%). When describing the T cell transfer model itself, none of the 5 articles described how viability of T cells transferred was measured or whether it was measured at all. For the experimental design, no article using T cell transfer reported mortality of animals used, 60% of articles failed to report colon length/weight measurements, and 40% of articles failed to report animal weight during the experiment.
TNBS-induced Colitis Model
Articles using TNBS to induce colitis were the worst for reporting whether animals had been acclimated (87.5% of articles failed to report this criterion). Also, food/water supply and age of animals used were missing in 50% and 25% of articles, respectively. The TNBS model itself was well reported, although 18.75% of articles failed to report the supplier of the TNBS. Similar to the other colitis models, the worst reported essential criteria in the experiment design domain for TNBS were mortality reporting, colon length/weight measurements, animal weight loss, and colitis scoring by histology (75%, 75%, 43.75%, and 37.5% of articles failed to report these criteria, respectively).
More Recent Articles Have Higher Methods Reporting Quality
Overall scores have significantly improved year on year (P = 0.037, r2 = 0.075). T cell transfer is the only model to have a drop in methods reporting quality over time, but this is not significant. DSS and IL-10−/− show a trend toward improved methods reporting quality over time and TNBS overall reporting quality has significantly improved with time (P = 0.0036, r2 = 0.4659) (Fig. 3A). The improvement in TNBS reporting quality over time has largely come from a significant improvement in the experiment domain (P = 0.0203, r2 = 0.3285) (Fig. 3B).
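The r2 values above come from linear correlation between publication year and reporting score. The sketch below shows how the slope and r2 of such a fit are computed, using invented data rather than the study's scores; it does not compute the P value, which the authors obtained with GraphPad Prism.

```python
from statistics import mean


def linear_fit(x, y):
    """Return (slope, r_squared) for a least-squares line through (x, y)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-variation
    sxx = sum((a - mx) ** 2 for a in x)                   # variation in x
    syy = sum((b - my) ** 2 for b in y)                   # variation in y
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, r_squared


# Hypothetical publication years and reporting scores (illustrative only).
years = [2002, 2005, 2008, 2011, 2014]
scores = [70.0, 74.0, 73.0, 80.0, 85.0]
slope, r_squared = linear_fit(years, scores)
print(f"slope = {slope:.2f} percentage points per year, r2 = {r_squared:.3f}")
```

An r2 of 0.075, as reported for all models combined, means that publication year explains only about 7.5% of the variance in reporting score even though the trend is statistically significant.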
FIGURE 3
A, A significant positive correlation (P ≤ 0.01, r2 = 0.47) is seen between overall methods reporting quality score (%) and year of publication in studies using TNBS-induced colitis. B, This correlation comes largely from the positive correlation (P ≤ 0.05, r2 = 0.33) between reporting quality (%) and year of publication within the experimental design subsection in TNBS colitis papers (n = 16). C, IF of the journal of publication had no impact on the overall quality of methods reporting. D, By subdomain, a nonsignificant negative correlation between methods reporting quality and IF was observed in the animal domain (P = 0.0536, r2 = 0.07) (n = 58). Analyses by linear correlation.
Journal IF Has No Relation to Methods Reporting Quality
IF was not observed to have a significant impact on methods reporting quality in animal models of colitis (Fig. 3C). When broken down into domains, there was a slight negative correlation between IF and quality score in the animal domain, but this was not significant (P = 0.0536, r2 = 0.06488) (Fig. 3D).
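The linear correlations reported above (quality score against year of publication, and against IF) can be illustrated with a minimal least-squares sketch. The function name `linear_fit` and the example values are hypothetical and are not data from this review; the sketch returns r2 and the t statistic for the slope, from which a P value would be obtained using a t distribution with n − 2 degrees of freedom.

```python
from math import copysign, sqrt


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r_squared, t_stat), where t_stat tests
    the null hypothesis of zero slope with n - 2 degrees of freedom.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences of at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    r_squared = r * r
    residual = 1.0 - r_squared
    # A perfect fit leaves no residual variance, so t is unbounded.
    if residual > 0:
        t_stat = r * sqrt((n - 2) / residual)
    else:
        t_stat = copysign(float("inf"), r)
    return slope, intercept, r_squared, t_stat


# Hypothetical example: publication year vs. reporting quality score (%).
years = [2004, 2006, 2008, 2010, 2012, 2014]
scores = [74.0, 79.0, 77.0, 83.0, 82.0, 88.0]
slope, intercept, r_squared, t_stat = linear_fit(years, scores)
print(f"slope={slope:.3f} per year, r2={r_squared:.3f}, t={t_stat:.3f}")
```

A small r2 with a significant P value, as seen for the overall scores, indicates a trend that is detectable but explains little of the variation between articles.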
Verification of Consistency in Scoring of Studies
The second examiner scored 33 of the 58 articles included in the review (DSS = 14, IL-10−/− = 8, T cell transfer = 3, and TNBS = 8). Differences in scores between the 2 examiners were assessed through a Bland–Altman plot (Fig. 4). The difference in scores between examiners was not significantly different from zero (P = 0.149, r2 = 0.066), suggesting that there was no bias in scoring and that articles were scored consistently with the minimum information checklist.
FIGURE 4
Bland–Altman plot to assess agreement between 2 experimenters in scoring articles with the minimum information checklist (n = 33). Articles were scored by the second marker, representing at least half the articles assessed for each model. Difference in scores is not significantly different from zero (P = 0.149, r2 = 0.066).
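The quantities behind a Bland–Altman plot can be sketched as follows. This is a generic illustration of the standard method (mean difference, or bias, with 95% limits of agreement at ±1.96 SD), not a reconstruction of the exact analysis used here; the function name `bland_altman` and the example scores are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Summarise agreement between two examiners scoring the same articles.

    Returns the per-article means and differences (the x and y values of
    the plot), the bias (mean difference), the SD of the differences, and
    the 95% limits of agreement (bias +/- 1.96 SD).
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length sequences of at least 2 scores")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    means = [(a + b) / 2 for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    half_width = 1.96 * sd
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "sd": sd,
        "lower": bias - half_width,
        "upper": bias + half_width,
    }


# Hypothetical quality scores (%) from two examiners for the same articles.
examiner_1 = [80.0, 85.0, 90.0, 75.0]
examiner_2 = [78.0, 86.0, 88.0, 76.0]
result = bland_altman(examiner_1, examiner_2)
print(f"bias={result['bias']:.2f}, "
      f"limits=({result['lower']:.2f}, {result['upper']:.2f})")
```

A bias close to zero with narrow limits of agreement indicates that neither examiner scores systematically higher than the other.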
DISCUSSION
Chronic inflammation is a complex and poorly understood pathway with important clinical significance both in terms of quality of life and financial impact. It is vitally important that the animal experiments that inform almost all clinical practice are conducted rigorously and published in enough detail for others to benefit from and build upon, in agreement with the principles of the 3 Rs (replace, reduce, and refine).86 To examine the quality of methods reporting in animal models of colitis and determine the potential impact on reliability, replicability, and comparability of studies in this field, we have assessed 4 commonly used animal models of colitis: DSS, IL-10−/−, T cell transfer, and TNBS. Our results indicate that although these models score well against a checklist of essential criteria, there are still a variety of fundamental criteria that are repeatedly omitted. It is encouraging to see an improvement over time, even if this effect is quite small. However, the fact that only 1 article from a corpus of 58 reported all essential criteria is a major cause for concern: 98.3% of articles included in this analysis failed to include sufficient information to accurately repeat the experiment.
In the United Kingdom, death as an endpoint in animal experiments is to be avoided wherever possible.87 However, mortality and morbidity do occur from time to time and for a variety of reasons, and they should be reported because they will have a significant impact on the data produced and the results of statistical analyses. A statement referring to animal mortality, even if no animal died during the experiment, was one of the worst reported essential criteria from the checklist across all 4 colitis models included in this analysis (48 of 58 articles, 82.76%, failed to include this criterion). Most animal models of colitis are not expected to cause significant morbidity or death, but the lack of reporting, even to confirm that no unexpected deaths occurred, is problematic. When results from animal experiments fail to disclose mortality, bias may be introduced, giving an overly optimistic estimate of the efficacy of the intervention.88 For example, without adverse event reporting being enforced, there is no obligation for researchers to declare mice that die during an animal study, but failing to declare this information potentially puts the safety of animals and people in future trials at risk.89 We are not suggesting that the studies included in this review are deliberately obscuring potentially harmful results, and we assume a lack of adverse event reporting reflects an absence of adverse events to report. However, without such a declaration, we cannot say for certain either way. Consequently, animal experiments should align more closely with clinical practice in this regard and declare adverse reactions as a matter of course.90
The key role of gut microbiota in the onset and severity of chronic colitis is well defined.14 Thus, it was surprising that more than half of the studies (63.79%) failed to describe how animals had been acclimated to ensure potential differences in microbiota had been accounted for and controlled. In addition, very few articles specified the use of littermate controls, which would be the gold standard for controlling baseline equivalence in microbiota populations. It is insufficient to assume animals obtained from the same supplier or reared within the same experimental facility will harbor equivalent microbial populations, as differences can and do exist even within rooms or across facilities.91 Simple tools to characterize microbiota are available,16 and, ideally, these should be used to improve standardization and tighten controls within experiments. Alternatively, cohousing or littermate controls reduce the likely impact of the environment. Additionally, acclimation serves to compensate for stresses involved in transporting animals. Moving cages to a new location in the same facility can have stressful effects on animals lasting several weeks, ultimately influencing immune responses in experimental conditions.18 Movement of animals should be kept to a minimum, and laboratory animals require up to 7 days for changes in immune and endocrine parameters to return to baseline before experimental procedures begin92; needless to say, these details should be declared in the methods of the study write-up.
Another key factor in determining microbial consistency is diet, with various dietary factors influencing the growth of different bacterial populations in the gut.15,93 Again, over half of the studies (53.45%) in our analysis failed to define the chow fed to experimental animals, a factor that can have significant effects on the severity of induced colitis and the microbiota present in the gut.15 Better standardization is required for studies where gut microbiota can influence results, and colonization of laboratory animals with defined microbial populations would introduce a new level of control in these experiments.94
Reporting the gender of animals was one of the few criteria where the quality differed depending on the animal model used, with 9 DSS studies and 6 IL-10−/− studies failing to report animal gender compared with just 1 study each from the T cell transfer and TNBS-induced models. The role of gender in inflammation is well established, with females (in both mice and humans) being more susceptible to developing autoimmune diseases and mounting a more pronounced inflammatory response than males.17,95 In addition, sex differences occur within animal models of colitis: male mice are more susceptible to DSS colitis, for example.19 Failing to describe the gender of animals in an experiment relying on inflammation obscures vital information when trying to infer meaning from the results and prevents data from different studies from being reliably compared.
A number of criteria relating to animal housing were considered to be nonessential in our checklist, yet temperature, humidity, light/dark cycle, and the number of animals per cage were repeatedly omitted from the methods of between 50% and 100% of the studies assessed, depending on the model used. Temperature in particular can affect the immune system of mice, with low temperatures triggering immunosuppressive responses.96 Many studies are conducted where animal facilities are kept at “room temperature” (19–22°C) to suit the experimenters but not necessarily the animals that they house: wild mice spend daytime inactive, nesting at 30 to 32°C and are therefore experiencing cold stress in the majority of animal facilities.96,97 In addition to experiencing behavioral and immunological changes,98 mice housed alone will have to endure cooler conditions than mice housed in groups.
Severity of colitis in the DSS model is strongly linked to the strain of animal used and the specifications of the DSS itself. Large molecular weight DSS (≥500 kDa) fails to bypass the mucous barrier and does not induce colitis,99 whereas smaller preparations of DSS (5–40 kDa) elicit colitic responses in a spectrum of disease severity.19 Although DSS is commonly prepared at around 40 kDa, not all experimenters obtain DSS from the same supplier or at the same molecular weight. That only 5 of the 29 DSS articles accurately reported the molecular weight of DSS with the appropriate units is problematic. The presence in published studies of arbitrary numbers with no units specified, or with clearly incorrect units that put the claimed molecular weight out by orders of magnitude (e.g., kDa instead of Da, or vice versa), is poor practice. The increased number of interdisciplinary, non-domain specialists involved in curating and annotating datasets for inclusion in meta-analyses means that this sort of information must be included within the methods of published articles. Authors of studies cannot assume that everyone accessing their study has the expertise to be able to infer the fine details of the protocols they used. Thus, these sorts of errors appearing in the literature suggest potential shortcomings in submission, peer review, and journal editing processes. It is often the responsibility of submitting authors to ensure that there are no errors in a submitted manuscript, but peer reviewers ought to be spotting these errors before an article gets to print.
We recommend the continued uptake of methods quality checklists to assist authors and publishers with inclusion of all the relevant methods details that are required to fully interpret data and integrate results into larger analyses. We have provided a domain-specific checklist that can be used in the assessment of methods reporting in any colitis model, and we think this will aid translation of discoveries in animal models into human studies. However, we are aware that by including only microarray studies, we are focusing on a subset of published colitis research. Methods reporting quality for animal models of colitis in general may not reflect the results we have reported here. Also, we have not attempted to address the diversity of experimental design within models or the choice of statistical tests and power calculations used in analysis of data in this field, both of which will impact the feasibility of comparing data from colitis models. It is worth noting that, although all the studies in this review detailed the numbers of mice used per group, none of the studies included any statistical measure of power to justify the number of animals used. This is of concern, as power calculations are important for assessing the validity of statistical tests applied to the data generated and to limit unnecessary use of animals in research.6,100
In conclusion, we have demonstrated that the quality of methods reporting in modeling colitis, while generally appearing high, has serious flaws with long-ranging impact on the translation of primary research into clinical research of IBD. Automated methods, such as computerized histology scoring,101 may become more commonplace in future, assisting experimenters in standardizing their methods, but more needs to be done to promote and enforce existing guidelines. Animal experimenters have an onus to follow the 3 Rs (replace, reduce, and refine), and better reporting of studies will add value to experimental data produced by animal studies.86 Implementation of our colitis methods checklist would improve the quality of publications in this field, ensuring that animal models and the data they produce are used to their full potential. The pipeline from basic science to clinical practice is filled with examples where success in the laboratory fails to translate into human subjects, and improving methods reporting would be an excellent starting point in rectifying this problem at very little cost or effort.
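As an illustration of the kind of power calculation noted in the Discussion as absent from the reviewed studies, the sketch below estimates the number of animals per group needed to detect a given standardized effect size in a two-group comparison. It uses the normal approximation n = 2((z1−α/2 + z1−β)/d)², which slightly underestimates the number required by an exact t-test calculation. The function name `animals_per_group` and the effect sizes are hypothetical and are not drawn from any study in this review.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-group comparison.

    effect_size is the standardized difference between group means
    (difference divided by the common standard deviation).
    Uses the normal approximation, so treat the result as a lower bound.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided test
    z_beta = z.inv_cdf(power)  # quantile corresponding to desired power
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return ceil(n)


# Hypothetical effect sizes: a large and a moderate difference in colitis score.
for d in (1.0, 0.5):
    print(f"effect size {d}: about {animals_per_group(d)} animals per group")
```

Reporting the assumed effect size, significance level, and target power alongside group sizes would let readers judge whether a study was adequately powered.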