Literature DB >> 24498313

A review of the study designs and statistical methods used in the determination of predictors of all-cause mortality in HIV-infected cohorts: 2002-2011.

Kennedy N Otwombe1, Max Petzold2, Neil Martinson3, Tobias Chirwa4.   

Abstract

BACKGROUND: Research in the predictors of all-cause mortality in HIV-infected people has widely been reported in literature. Making an informed decision requires understanding the methods used.
OBJECTIVES: We present a review on study designs, statistical methods and their appropriateness in original articles reporting on predictors of all-cause mortality in HIV-infected people between January 2002 and December 2011. Statistical methods were compared between 2002-2006 and 2007-2011. Time-to-event analysis techniques were considered appropriate. DATA SOURCES: Pubmed/Medline. STUDY ELIGIBILITY CRITERIA: Original English-language articles were abstracted. Letters to the editor, editorials, reviews, systematic reviews, meta-analysis, case reports and any other ineligible articles were excluded.
RESULTS: A total of 189 studies were identified (n = 91 in 2002-2006 and n = 98 in 2007-2011) out of which 130 (69%) were prospective and 56 (30%) were retrospective. One hundred and eighty-two (96%) studies described their sample using descriptive statistics while 32 (17%) made comparisons using t-tests. Kaplan-Meier methods for time-to-event analysis were commonly used in the earlier period (n = 69, 76% vs. n = 53, 54%, p = 0.002). Predictors of mortality in the two periods were commonly determined using Cox regression analysis (n = 67, 75% vs. n = 63, 64%, p = 0.12). Only 7 (4%) used advanced survival analysis methods of Cox regression analysis with frailty in which 6 (3%) were used in the later period. Thirty-two (17%) used logistic regression while 8 (4%) used other methods. There were significantly more articles from the first period using appropriate methods compared to the second (n = 80, 88% vs. n = 69, 70%, p-value = 0.003).
CONCLUSION: Descriptive statistics and survival analysis techniques remain the most common methods of analysis in publications on predictors of all-cause mortality in HIV-infected cohorts while prospective research designs are favoured. Sophisticated techniques of time-dependent Cox regression and Cox regression with frailty are scarce. This motivates for more training in the use of advanced time-to-event methods.

Entities:  

Mesh:

Year:  2014        PMID: 24498313      PMCID: PMC3911971          DOI: 10.1371/journal.pone.0087356

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Appropriate utilization of biostatistical methods is becoming increasingly important in biomedical research. Many journals, if not all, have a dedicated statistical committee that scrutinizes the methods used in analyzing data. In the last decade, several papers addressing study design issues and statistical analysis approaches in different clinical fields have been published underpinning the importance of robustness in methodology [1]–[8]. There is consensus that inappropriate study designs and statistical methodology lead to incorrect results, poor interpretation of study findings and wrong conclusions. An array of study designs and appropriate statistical techniques with varying levels of complexity exists. Selecting the appropriate study design and relevant statistical analysis technique is largely dependent on the complexity of the study and its objectives. Research on statistical content of medical research shows wider usage of techniques [9], [10] beyond descriptive statistics as a result of advanced software that can handle complex analyses. Much as advanced analyses are being conducted, simple techniques of descriptive and inferential statistical analysis like student t-tests and chi-square tests remain popular in the literature [4], [6], [11]. Despite major successes in the development of interventions for prevention of mother to child treatment (PMTCT) and anti-retrovirals (ARVs), HIV still remains a major public health concern. To date, limited information is available if any, reporting on the study design and statistical techniques used in determining the predictors of all-cause mortality in HIV positive cohorts in the last decade. With a large number of clinicians and public health experts relying on published research for new developments in HIV research, it is important they understand appropriateness of study designs and statistical techniques used in determining predictors of all-cause mortality. This study reviews relevant original articles in HIV-infected cohorts with the aim of identifying study designs, statistical methods used and further assess their appropriateness. We also sought to determine whether there was an increase in the use of time-to-event analysis techniques over time and highlight the need for methodological training.

Methods

Search strategy and selection criteria

In this bibliometric analysis, we searched all original English-language articles indexed in Pubmed/Medline using the terms “Predictors of HIV Mortality”, “Determinants of HIV Mortality” and “Factors associated with HIV mortality”. The search covered the period between January 2002 and December 2011, a period of ten years. These were further split into two five year periods; January 2002–December 2006 and January 2007 to December 2011 in order to assess whether there was a variation in the methods used over time. Original articles on HIV-infected cohorts within the specified period were eligible for inclusion. Letters to the editor, editorials, reviews, systematic reviews, meta-analysis and case reports were excluded. Other studies comparing both HIV positive and negative participants were also excluded. We identified a total of 91 and 98 papers between the periods 2002–2006 and 2007–2011 respectively. Each article was reviewed to determine the study design, nature of statistical methods used and their appropriateness. Time-to-event analysis methods were considered optimal or appropriate in this study. A spreadsheet containing a checklist of items of interest was prepared as a data collection tool. Findings were systematically recorded based on statistical methods previously reported [12]. We used a modified version of the classification proposed by Colditz and Emerson (Table 1) [13], [14]. Where a statistical technique was used more than once in an article, we recorded it as having occurred only once. A count of the number of statistical techniques employed in each article was determined for purposes of comparing the two periods.
Table 1

Classification of statistical methods as reported in journals.

CategoryBrief description
t -tests One-sample, matched pair and two-sample t-tests
Contingency tables Chi-square tests, Fisher's exact test, McNemar's test
Pearson correlation Classical product moment-correlation
Simple linear regression Least squares regression with one predictor and one response variable
Power Loosely defined, includes use of the size of detectable (or useful) difference in determining sample size
Epidemiological statistics Relative risk, odds ratio, log odds, measures of association, sensitivity, specificity
Adjustment and standardization Pertains to incidence rates and prevalence rates
Multiway tables Mantel-Haenszel procedure, log-linear models
Non-parametric tests Sign test, Wilcoxon signed-rank test, Mann-Whitney test
Non-parametric correlation Spearman's rho, Kendall's tau, test for trend
Analysis of variance Analysis of variance, analysis of co-variance and F-tests
Multiple comparisons Procedures for handling multiple inferences on same data sets, includes Bonferroni techniques, Scheffe's contrasts, Duncan multiple range procedures, Newmann-Keuls procedure
Transformation Use of data transformation (e.g logarithms) often in regression
Multiple regression Includes polynomial regression and stepwise
Life table Actuarial life table, Kaplan-Meier estimate of survival
Regression for survival Includes Cox regression and logistic regression
Other survival analysis Breslow's Kruskal-Wallis, log-rank, Cox model for comparing survival
Cost benefit analysis The process of combining estimates of cost and health outcomes to compare policy alternatives
Sensitivity analysis Examines sensitivity of outcome to modest changes in parameters of model, or in other assumptions
Other Anything not fitting above headings, includes cluster analysis, discriminant analysis and some mathematical modeling
The statistical methods used in the research articles were classified as either parametric or non-parametric. A further classification was made describing the statistical methods used as either basic or advanced. Methods classified as basic included Student t-test, Chi-Square and Fishers Exacts test, Mann-Whitney, Kruskall-Wallis, Wilcoxon, simple one-way ANOVA and correlation statistics. Modelling approaches such as logistic Regression, Conditional Logistic Regression, Poisson Regression, Cox regression, time-varying Cox-regression and Cox regression with frailty and epidemiologic statistics were classified as advanced (Table 1). The logistic regression is used to analyse the relationship between a binary dependent variable and independent predictor through estimation of the probability of an event occurring. It makes no assumption about normality, linearity and homogeneity of variance. But used with time-to-event outcomes, it fails to account for follow-up time. For this reason, articles reporting use of logistic regression on such outcomes were classified as sub-optimal [15]. Cox regression analysis is a survival analysis technique in time-to-event data that incorporates follow-up time and fixed covariates [16]. Censoring is done when events occur. The method assumes risk of an event is homogeneous. Extensions of the Cox regression exist which include time dependent Cox regression and Cox regression analysis with frailty [17], [18]. Time dependent Cox regression analysis accounts for the inherent correlation that may exist when covariates change over time. Cox regression analysis with frailty, if used in some of the reviewed articles, tries to account for unobserved heterogeneity. The data collected in this study were compared between two periods. Frequency analysis was used to determine the number of studies reporting use of specified statistical techniques. The number of optimal or sub-optimal methods used in the determination of predictors of mortality was determined using frequencies. The comparison between numbers of methods reported between the two periods was compared using the chi-square test where appropriate. All the Statistical analysis was performed using SAS 9.3 software and p-values ≤0.05 were considered a significant difference.

Results

The total number of studies reporting on predictors of HIV mortality that met our criteria in the era January 2002 and December 2011 was 189 (n = 91 in 2002–2006 and n = 98 in 2007–2011). Figure 1 is a flow chart displaying the selection criteria that was followed in arriving at the final number of articles. All the identified articles used at least one (basic or advanced) statistical test. Journal of Acquired Immune Infection (JAIDS) (n = 34, 18%) and AIDS (n = 24, 13%) published more articles on HIV mortality. JAIDS and AIDS published 19/34 (56%) and 14/24 (58%) of these articles in the era 2002–2006. Majority of the articles used a prospective study design and the number was similar in both periods (n = 67, 74% vs. n = 62, 63%; p = 0.66). Sample sizes varied from under 200 to greater than 1,000 participants. Table 2 presents the study design and sample size distribution of the included articles between the two periods. There were no significant differences in the study designs and sample sizes used between the two periods.
Figure 1

Flow chart showing the selection process of articles and the number in each period.

Table 2

Types of study designs and sample sizes.

2002–2006 2007–2011
Research DesignNo.(%)No.(%)p-value
Prospective 677462630.66
Retrospective 232533340.14
Case control 1133-
Prospective1-0--
Retrospective0-3--
Total 9198
Sample sizes
up to 200222415150.5
201 to 50021232526
501 to 1,00016182020
>1,00032353839
The number of studies reporting descriptive statistics for the two periods was similar. Table 3 presents the distribution of commonly reported statistical methods. The number of articles reporting use of t-tests, contingency table analysis, correlation and epidemiological statistics was similar. Few studies in both periods reported using one-way analysis of variance technique (ANOVA). The number of modeling approaches such as logistic, conditional logistic, generalized estimating equations and Poisson regression was significantly higher in the later period compared to the earlier (n = 31, 32% vs. n = 13, 14% p = 0.005).
Table 3

Summary of statistical methods.

2002–2006 2007–2011
Statistical methods No. of publications (N = 91) (%) No. of publications (N = 98) (%) p-value
Descriptive statistics889794960.78
Inferential methods
t-test131419190.35
Contingency table analysis
 Basic (chi-square/Fisher exact)252737380.13
Correlation91011-
Epidemiologic statistics394345460.67
Analysis of variance3311-
Regression
#Multiple regression131431320.005
Survival analysis
 Kaplan-Meier697653540.002
 Cox-Regression687563640.12
 Time dependent Cox regression212311110.03
 Cox-Regression with frailty1166-
Non-parametric methods869591930.64
No. of different inferential methods
2 or 3 methods101111110.96
4 or 5 methods444843440.54
More than 5 methods374144450.56
Basic analysis364045460.38
*Advanced analysis909997990.96

Refers to logistic, conditional logistic, generalized estimating equations and Poisson regression methods.

Refers to Student t-test, Chi-Square and Fishers Exact test, Mann-Whitney, Kruskall-Wallis, Wilcoxon, simple ANOVA and correlation.

Refers to Logistic Regression, Conditional Logistic Regression, Poisson Regression and Survival Analysis and Epidemiologic statistics.

Refers to logistic, conditional logistic, generalized estimating equations and Poisson regression methods. Refers to Student t-test, Chi-Square and Fishers Exact test, Mann-Whitney, Kruskall-Wallis, Wilcoxon, simple ANOVA and correlation. Refers to Logistic Regression, Conditional Logistic Regression, Poisson Regression and Survival Analysis and Epidemiologic statistics. A total of 122 (65%) articles reported using the Kaplan-Meier methods and it was commonly used in the first period compared to the second (p = 0.002). Use of the Cox proportional hazards regression modeling was reported by 131 (69%) articles and the number was similar between the two eras (p = 0.12). The number reporting use of time dependent Cox regression was higher in the first period (n = 21, 23% vs. n = 11, 11%; p = 0.03). Overall Cox regression with frailty was scarcely used (n = 7, 4%) in which 6 (3%) articles were in the later period. There were 22 (12%), 96 (51%) and 71 (38%) articles reporting use of 2 to 3, 4 to 5 and more than 5 statistical methods respectively. There were no significant differences in the number of methods used between the two eras. Similarly there were no significant differences between the two eras in the number of articles reporting use of basic or advanced statistical analysis methods. A total of 149 (79%) of the articles used appropriate methods while 40 (21%) used sub-optimal methods to determine the predictors of mortality in HIV-infected participants. Of the articles using appropriate methods, 116 (78%) were prospective and 33 (22%) retrospective. There were significantly more articles from the first period using appropriate methods compared to the second (n = 80, 88% vs. n = 69, 70%, p-value = 0.003). Table 4 presents findings on the appropriateness of the statistical methods used. A significantly higher number of articles in the first period could have used Cox regression analysis with frailty as the appropriate method, since they had clustered data (n = 82, 92% vs. n = 65, 68%; p<0.0001) while overall they were 147 (78%).
Table 4

Appropriateness of statistical methods for predictors of mortality.

2002–2006 2007–2011
Prospective (N = 68) Retrospective (N = 23) Prospective (N = 62) Retrospective (N = 36)
Statistical methodsNo. (%)No. (%)No. (%)No. (%)
Appropriate statistical analysis
Cox regression53 (78)15 (65)48 (77)15 (42)
Time dependent Cox regression16 (24)5 (22)10 (16)1 (3)
Cox regression analysis with frailty1 (1)0 (0)3 (5)3 (8)
Poisson regression3 (4)0 (0)4 (6)0 (0)
Sub-Optimal statistical analysis
Logistic regression4 (6)4 (17)7 (11)17 (47)
Unclear1 (1)2 (9)2 (3)3 (8)

Note: Totals in this table do not add up to the number of articles because some articles used more than one method in their analysis.

Note: Totals in this table do not add up to the number of articles because some articles used more than one method in their analysis.

Discussion

This paper aimed at reviewing articles on predictors of all-cause mortality in HIV-infected people to investigate the appropriateness of statistical methods used and nature of study designs. We reviewed a total of 189 articles. Like in any other study, there were several limitations. The literature review of the articles included in this study was searched in Pubmed/Medline ostensibly because this was not a systematic review requiring a measure of effect. Any relevant articles indexed elsewhere or in a language other than English were not considered. Our findings concur with others reporting on study designs, statistical methods used and their appropriateness. Prospective study designs remain the most common type of design used in studies of predictors of HIV mortality in the last decade. Retrospective study designs formed about one third of all articles included in this study. It may be that retrospective study designs are used as a cost-effective way of saving on huge expenses required for running prospective studies as a way for stimulating academic research. However there was no significant difference in the type of study designs used between the two periods. Basic statistical analysis procedures like t-tests, Chi-Square and Fishers Exact tests, Mann-Whitney, Kruskall-Wallis and Wilcoxon are commonly used. There was no difference in the number of articles reporting use of t-tests between the two periods. This is similar to previously reported studies that have shown the popularity of t-tests [4], [19]. All the studies used at least two statistical tests. We contend that our inclusion criteria and the nature of studies included in this review all required using a type of statistical analysis to address the research question. Our findings concur with those reported earlier showing majority of articles apply more than one statistical test [5], [7], [20]. But this is contrary to the findings of a review on study designs and statistical methods in Chinese journals that found a low proportion of studies reporting use of multiple statistical tests [21]. Survival analysis approaches remain popular in the studies looking at predictors of mortality in HIV-infected people, especially the Cox proportional hazards regression modeling. Though fewer studies used extensions of the Cox proportional hazards regression, our findings show that there is an interest in using advanced approaches like the time-dependent Cox proportional hazards or Cox proportional hazards regression with frailty in modeling survival data in HIV-infected cohorts. We found a higher proportion of the studies could have used Cox regression analysis with frailty, an appropriate technique. While the methods used were not wrong, they could have gained more information by using Cox regression analysis with frailty. Previously reported work on statistical methods in medical research show that while use of sophisticated methods is increasing, inappropriate techniques still remain a challenge [1], [6], [7], [22]. It may be that recent techniques are advanced and require rigour to implement. Furthermore the techniques may not necessarily be easily implemented in standard statistical software [23]. As a result, researchers use techniques that are fairly straight-forward and implementable in standard statistical software. Our findings show that not all the studies in our sample used optimal statistical tests in the determination of the predictors of mortality. Survival analysis techniques produce better estimates that are more informative when analysed using optimal methods. Furthermore, in clinical research where objectives require a multivariable analysis approach, it is prudent to adjust for confounding appropriately by using optimal statistical methods [24]. Cox regression analysis and its extensions provide a better picture compared to logistic regression when using survival data. Unlike previously reported research [21], [25], the proportion of studies using sub-optimal statistical tests was lower in our sample. These findings are contrary to those reported in other clinical fields where there was a high proportion of articles using sub-optimal methods [3], [11], [19], [26], [27]. Descriptive statistics and survival analysis techniques remain the most common methods of analysis in publications on predictors of all-cause mortality in HIV-infected cohorts while prospective research designs are favoured. These results suggest the importance of understanding advanced survival analysis methods in interpreting research findings in this set-up. However complex and appropriate methods like Cox regression analysis with frailty remain scarcely utilised. Our findings are in agreement with others who also reported a high use of descriptive statistics [4], [6]. The more sophisticated techniques of time dependent Cox regression and Cox regression with frailty are scarcely used. This motivates for more training in the use of advanced time-to-event methods. Prisma checklist. (DOC) Click here for additional data file.
  22 in total

1.  A survey of abstracts of high-impact clinical journals indicated most statistical methods presented are summary statistics.

Authors:  Nathan Taback; Monika K Krzyzanowska
Journal:  J Clin Epidemiol       Date:  2007-09-14       Impact factor: 6.437

2.  Use of statistical methods in dental research: comparison of four dental journals during a 10-year period.

Authors:  Hannu Vähänikkilä; Pentti Nieminen; Jouko Miettunen; Markku Larmas
Journal:  Acta Odontol Scand       Date:  2009       Impact factor: 2.331

3.  Characteristics and trends of radiology research: a survey of original articles published in AJR and Radiology between 2001 and 2010.

Authors:  Kyoung Ja Lim; Dae Young Yoon; Eun Joo Yun; Young Lan Seo; Sora Baek; Dong Hyeon Gu; Soo Jeong Yoon; Ari Han; You Jin Ku; Sam Soo Kim
Journal:  Radiology       Date:  2012-09       Impact factor: 11.105

4.  The impact of heterogeneity in individual frailty on the dynamics of mortality.

Authors:  J W Vaupel; K G Manton; E Stallard
Journal:  Demography       Date:  1979-08

5.  Cox regression analysis of multivariate failure time data: the marginal approach.

Authors:  D Y Lin
Journal:  Stat Med       Date:  1994-11-15       Impact factor: 2.373

6.  Use of statistical analysis in the AJR and Radiology: frequency, methods, and subspecialty differences.

Authors:  A D Elster
Journal:  AJR Am J Roentgenol       Date:  1994-09       Impact factor: 3.959

7.  The statistical content of published medical research: some implications for biomedical education.

Authors:  G A Colditz; J D Emerson
Journal:  Med Educ       Date:  1985-05       Impact factor: 6.251

8.  Characteristics of recent biostatistical methods adopted by researchers publishing in general/internal medicine journals.

Authors:  Paul J Nietert; Amy E Wahlquist; Teri Lynn Herbert
Journal:  Stat Med       Date:  2012-03-13       Impact factor: 2.373

Review 9.  Statistical errors in medical research - a review of common pitfalls.

Authors:  Alexander M Strasak; Qamruz Zaman; Karl P Pfeiffer; Georg Göbel; Hanno Ulmer
Journal:  Swiss Med Wkly       Date:  2007-01-27       Impact factor: 2.193

10.  Quality of reporting statistics in two Indian pharmacology journals.

Authors:  Preeti Yadav
Journal:  J Pharmacol Pharmacother       Date:  2011-04
View more
  5 in total

Review 1.  Review of the Statistical Methods Used in Original Articles Published in Iranian Journal of Public Health from 2015-2019: A Review Article.

Authors:  Sajjad Bahariniya; Farzan Madadizadeh
Journal:  Iran J Public Health       Date:  2021-08       Impact factor: 1.429

2.  Comparison of the performances of survival analysis regression models for analysis of conception modes and risk of type-1 diabetes among 1985-2015 Swedish birth cohort.

Authors:  Adeniyi Francis Fagbamigbe; Emma Norrman; Christina Bergh; Ulla-Britt Wennerholm; Max Petzold
Journal:  PLoS One       Date:  2021-06-25       Impact factor: 3.240

3.  Research design and statistical methods in Pakistan Journal of Medical Sciences (PJMS).

Authors:  Sohail Akhtar; Syed Wadood Ali Shah; M Rafiq; Ajmal Khan
Journal:  Pak J Med Sci       Date:  2016 Jan-Feb       Impact factor: 1.088

4.  Performance evaluation of survival regression models in analysing Swedish dental implant complication data with frailty.

Authors:  Adeniyi Francis Fagbamigbe; Karolina Karlsson; Jan Derks; Max Petzold
Journal:  PLoS One       Date:  2021-01-07       Impact factor: 3.240

5.  Factors associated with mortality in HIV-infected people in rural and urban South Africa.

Authors:  Kennedy N Otwombe; Max Petzold; Tebogo Modisenyane; Neil A Martinson; Tobias Chirwa
Journal:  Glob Health Action       Date:  2014-09-29       Impact factor: 2.640

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.