Maryam Mohammadian-Khoshnoud1, Abbas Moghimbeigi2, Javad Faradmal3, Mahnaz Yavangi4
1. MSc, Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran. info.amargiran@gmail.com
2. PhD, Department of Biostatistics, School of Public Health, Modeling of Noncommunicable Disease Research Center, Hamadan University of Medical Sciences, Hamadan, Iran. moghimbeigi@umsha.ac.ir
3. PhD, Department of Biostatistics, School of Public Health, Modeling of Noncommunicable Disease Research Center, Hamadan University of Medical Sciences, Hamadan, Iran. javad.faradmal@umsha.ac.ir
4. MD, Department of Gynecology, Hamadan University of Medical Sciences, Hamadan, Iran. m_yavangui@yahoo.com
Abstract
Background: Birth weight and gestational age are two important variables in obstetric research. The primary measure of gestational age is based on a mother's recall of her last menstrual period, and this recall may introduce random or systematic errors. Therefore, the objective of this study was to use a Bayesian mixture model to identify implausible gestational ages.
Methods: In this cross-sectional study, the medical records of 502 preterm infants born and hospitalized in Hamadan Fatemieh Hospital from 2009 to 2013 were gathered. Preterm infants were classified into two groups: less than 28 weeks and 28 to 31 weeks. A two-component Bayesian mixture model was used to identify implausible gestational ages; the first component corresponds to correctly classified gestational ages and the second to incorrectly classified ones. The data were analyzed using OpenBUGS 3.2.2 and the 'coda' package of R 3.1.1.
Results: The mean (SD) of the second component for the less than 28 weeks and 28 to 31 weeks groups was 1179 (0.0123) and 1620 (0.0074), respectively. These values were larger than the means of the first component for both groups, which were 815.9 (0.0123) and 1061 (0.0074), respectively.
Conclusion: The errors in recording the gestational ages of these two groups of preterm infants included recording a gestational age lower than the actual value at birth. Therefore, developing scientific methods to correct these errors is essential to providing desirable health services and computing accurate health indicators.
Keywords: Bayesian; Birth weight; Gestational age; Mixture model
Birth weight and gestational age are two important features in prenatal research. These factors play an important role in computing health indicators and adopting health policies. The accuracy of birth weight can almost always be trusted, but the accuracy of gestational age, especially when it is reported based on the last menstrual period (LMP), can be doubted (1-7). Premature or preterm birth refers to birth before the 37th week of pregnancy (8). The prevalence of preterm birth differs between regions and is reported to be between 5% and 10% (9,10). According to studies performed in Iran, the prevalence of prematurity is reported to be 5.6% in Qom, 7% in Zanjan, and, in Shiraz, 14.7% in the high risk group and 3.1% in the low risk group (11-13). Infants require different treatments depending on their gestational age at birth. Therefore, the gestational age of infants is the best health indicator in this critical period of their life (14). Hertz et al. reported that LMP can be trusted in only 18% of women (15). Reasons for this uncertainty include oligomenorrhea and intrauterine hemorrhage, among others (16). Furthermore, illiterate women cannot record their LMP date and cannot calculate the gestational age of their fetus (17). For these reasons, using LMP can lead to random and systematic errors (18-21).
In addition to LMP, ultrasound, uterine evaluation, fetal movement and uterine height measurement are used to determine gestational age (22). The error of gestational age determination using bimanual examination of uterine height is ±2 weeks (23). Uterine height measurement in the second and third trimesters shows a variation of 8 weeks in gestational age (24). The ultrasound error for determining gestational age in the first and second trimesters is about 3 to 5 days (22). While this method is more accurate in the first and second trimesters of pregnancy, ultrasound is not reliable in the third trimester (25).
Accurate estimation of gestational age is essential to women's health in all conditions. For instance, the interpretation of antenatal tests may depend on the gestational age. In particular, the level of α-fetoprotein in the amniotic fluid and the maternal serum depends on the gestational age (26). Also, wrongly classifying a term delivery as a preterm delivery unrealistically increases the infant survival rate at low gestational ages (27).
If the recorded birth weight is not consistent with the gestational age, cut-off points, a simple method for identifying implausible gestational ages, are used (3,7). Using cut-off points is administratively easy, but this method does not control the errors accurately, since it is based on clinical information and judgment and depends on the clinical characteristics of a society, which may not be applicable to all societies (28). In addition, removing weights that are far from the cut-off points leads to a truncated distribution of birth weight and to data loss (29).
The mixture model is a new method for identifying implausible gestational ages which does not result in data loss (28,30-32). It is motivated by the observation that, at earlier reported gestational ages, the distribution of birth weight is often right-skewed or bimodal rather than bell-shaped and symmetric. Techniques for parameter estimation in the normal mixture model include numerical methods, the EM algorithm and the Bayesian approach. According to previous studies, the results of the Bayesian technique are more acceptable than those of the others (34). Therefore, Bayesian mixture models are plausible candidates for identifying implausible gestational ages. Oja et al. proposed a three-component mixture model for the distribution of birth weight (30). Their method was able to detect errors of one menstrual cycle. Platt et al. used a two-component mixture model to study deliveries with misreported gestational ages (28). Tentoni et al. also used a two-component mixture model for identifying implausible gestational ages (32). While many studies have considered the identification of implausible gestational ages, a search of the published literature shows that no study in Iran has considered this issue. Consequently, the objective of this study is to identify implausible gestational ages in preterm infants using a Bayesian mixture model.
Methods
This was a cross-sectional study. The study population was all preterm infants (based on LMP) born in Hamadan Fatemieh Hospital from 2009 to 2013. Hamadan Fatemieh Hospital is the most important center for gynecology and delivery in Hamadan province, in the west of Iran. Preterm infants are those born before the end of the 37th week of pregnancy and, according to Cunningham et al., include infants with the following gestational ages: extremely preterm (less than 28 weeks), very preterm (between 28 and 32 weeks), moderately preterm (between 32 and 34 weeks) and late preterm (between 34 and 36 weeks) (8). Data were collected from the hospital files of all preterm infants born during these years.
The inclusion criteria were that the infant was identified as preterm (gestational age of 37 weeks or less) and was not a twin. All twin infants (192 infants), infants from multiple births (48 infants) and infants without a recorded gestational age (17 infants) were excluded from the study. As a result, of 759 preterm infants, only 502 were included in the study.
To extract the necessary information from the infant records, a checklist was designed with the following variables: duration of infant hospitalization, reason for hospitalization, mother's age, gestational age, weight, height, head circumference, Apgar scores at the first and fifth minutes, gender, number of infants, type of delivery, cause of death (if the infant died), blood type, mother's number of deliveries, and number of stillbirths and previous abortions.
Data analysis
In this study, a two-component Bayesian mixture model was used to identify implausible gestational ages. Following previous research, a combination of gestational age and birth weight information was used for this purpose (29). Infants were classified into 4 groups of less than 28 weeks, 28 to 32 weeks, 32 to 34 weeks and 34 to 36 weeks based on their gestational ages. Then, in each group, a Bayesian mixture model was fitted to birth weight. In a homogeneous population, the distribution of birth weight is assumed to be normal, but because of recording or recall errors, a two-component mixture of normal distributions was assumed for birth weight. The probability density function (PDF) of birth weight can therefore be written as

$$f(y) = \sum_{j=1}^{2} \pi_j \, \phi(y \mid \mu_j, \sigma_j^2)$$

where $\phi(y \mid \mu_j, \sigma_j^2)$ is the normal PDF of the $j$-th birth weight component, $Z \in \{1, 2\}$ is a latent variable following a binary distribution with $\Pr(Z = j) = \pi_j$, and $\pi_j$, $j = 1, 2$, is the component weight or mixing proportion, so that $\sum_{j=1}^{2} \pi_j = 1$. One component is assumed to represent infants with a correctly recorded gestational age and the other, infants with an erroneously reported gestational age.
The parameters of the birth weight distribution were estimated using the Bayesian method. Normal and inverse gamma distributions were used as priors for the mean and variance of each component of the birth weight distribution, respectively. A Beta distribution, the conjugate prior of the binary distribution, was used as the prior for the mixing proportion. The posterior distribution of the unknown parameters was sampled using a Markov chain Monte Carlo (MCMC) algorithm. The convergence of the MCMC algorithm was evaluated with the Geweke, Gelman-Rubin, and Raftery and Lewis convergence diagnostics (33). One way to obtain almost independent samples from the posterior distribution of the parameters is to run the MCMC algorithm for a long time and keep only every k-th draw, which reduces the correlation between successive samples. With the priors introduced above, the Bayesian mixture model converged to the posterior distribution with 20000 iterations (10000 burn-in iterations and a lag of 20). The data were analyzed using OpenBUGS 3.2.2 and the 'coda' package of R 3.1.1.
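The sampling scheme described above can be illustrated with a minimal Gibbs sampler for a two-component normal mixture. This is a sketch under stated assumptions, not the OpenBUGS model used in the study: the function name, the prior hyperparameters, the starting values and the ordering constraint on the means (used here to avoid label switching) are all hypothetical choices, and the defaults are scaled down so the example runs quickly.

```python
import math
import random


def normal_pdf(y, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gibbs_two_component(y, n_iter=2000, burn_in=1000, thin=5, seed=1):
    """Return thinned posterior draws (mu1, mu2, pi1) for a two-component mixture."""
    rng = random.Random(seed)
    n = len(y)
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / (n - 1)

    # Hypothetical priors (not taken from the paper):
    m0, s0_sq = mean_y, 1000.0 ** 2   # normal prior on each component mean
    a0, b0 = 2.0, 10000.0             # inverse-gamma prior on each variance
    beta_a, beta_b = 1.0, 1.0         # Beta prior on the mixing proportion

    # Starting values: lower and upper quartiles, pooled variance, equal weights.
    ys = sorted(y)
    mu = [ys[n // 4], ys[(3 * n) // 4]]
    var = [var_y, var_y]
    pi1 = 0.5

    draws = []
    for it in range(n_iter):
        # Step 1: sample the latent component label Z for every observation.
        groups = ([], [])
        for yi in y:
            w1 = pi1 * normal_pdf(yi, mu[0], var[0])
            w2 = (1.0 - pi1) * normal_pdf(yi, mu[1], var[1])
            p1 = w1 / (w1 + w2) if (w1 + w2) > 0.0 else 0.5
            groups[0 if rng.random() < p1 else 1].append(yi)

        # Step 2: mixing proportion from its Beta full conditional.
        pi1 = rng.betavariate(beta_a + len(groups[0]), beta_b + len(groups[1]))

        # Step 3: component means (normal) and variances (inverse gamma).
        for j in (0, 1):
            n_j = len(groups[j])
            precision = 1.0 / s0_sq + n_j / var[j]
            post_mean = (m0 / s0_sq + sum(groups[j]) / var[j]) / precision
            mu[j] = rng.gauss(post_mean, math.sqrt(1.0 / precision))
            shape = a0 + n_j / 2.0
            rate = b0 + 0.5 * sum((yi - mu[j]) ** 2 for yi in groups[j])
            # An inverse-gamma draw is the reciprocal of a gamma draw.
            var[j] = 1.0 / rng.gammavariate(shape, 1.0 / rate)

        # Keep component 1 as the lower-mean component (identifiability).
        if mu[0] > mu[1]:
            mu.reverse()
            var.reverse()
            pi1 = 1.0 - pi1

        # Discard the burn-in, then keep every `thin`-th draw.
        if it >= burn_in and (it - burn_in) % thin == 0:
            draws.append((mu[0], mu[1], pi1))
    return draws
```

The settings reported above would correspond to `n_iter=20000, burn_in=10000, thin=20` if the 20000 iterations are read as including the burn-in; the text does not state this explicitly.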
Results
From among the five years considered in this study, the largest number of births belonged to 2012 with 131 infants (26.1%) and the smallest number of births belonged to 2010 with 62 infants (12.4%). According to Table 1, the number of preterm boys is more than the preterm girls. The largest number of preterm girls belonged to 2012 and the smallest number of preterm girls belonged to 2010. The preterm infants born during the studied period included 243 girls (48.4%) and 259 boys (51.6%).
Table 1
Frequency of infants' gender by year

| Year | Male, N (%) | Female, N (%) | Total, N (%) |
|---|---|---|---|
| 2009 | 66 (51.6) | 62 (48.4) | 128 (100) |
| 2010 | 30 (48.4) | 32 (51.6) | 62 (100) |
| 2011 | 34 (49.2) | 35 (50.8) | 69 (100) |
| 2012 | 71 (54.4) | 60 (45.6) | 131 (100) |
| 2013 | 58 (51.8) | 54 (48.2) | 112 (100) |
| Total | 259 (51.6) | 243 (48.4) | 502 (100) |
Forty-two percent of the infants were the first born to their mothers. Of all the preterm infants, 47.2% died in the hospital, 49.6% were discharged after successful treatment, and the rest were transferred to other hospitals or discharged with personal consent. The age of mothers at delivery ranged from 14 to 50 years with a mean (SD) of 27.09 (7.68) years, and 13% of mothers were under 20 years. According to Table 2, the largest number of preterm infants belonged to the 28 to 31 weeks gestational age group and the smallest number belonged to the 21 to 27 weeks group. The smallest birth weight was 450 gr, and the average birth weight increased with increasing gestational age (Table 2).
Table 2
Infants' birth weight (gr) in different gestational age groups

| Gestational age group | N (%) | Minimum | Maximum | Mean | SD |
|---|---|---|---|---|---|
| 21 to 27 weeks | 74 (15) | 500 | 1600 | 894.19 | 232.25 |
| 28 to 31 weeks | 218 (43) | 450 | 2950 | 1275.5 | 399.24 |
| 32 to 33 weeks | 118 (24) | 600 | 2500 | 1700.3 | 442.32 |
| 34 to 36 weeks | 92 (18) | 1000 | 3800 | 2256.1 | 506.39 |
The identifiability of the model parameters was examined in every gestational age group. Because the model was not identifiable in the third and fourth groups, infants with a gestational age of 32 weeks or more were removed from the analysis. Therefore, the probability of implausible gestational age was considered for two groups: 21 to 27 weeks and 28 to 31 weeks. For both groups, the Bayesian mixture model converged to a posterior distribution.
The PDF plot and the histogram of birth weight in these gestational age groups clearly show two detectable components in each group (Figs. 1 and 2). The difference between the means of the first and second components is relatively small for infants with a gestational age of 21-27 weeks (Fig. 2a) and larger for infants with a gestational age of 28-31 weeks (Fig. 2b). Table 3 reports the parameter estimates of the Bayesian mixture model and confirms these findings.
Fig. 1
The histogram of birth weight (gr) for (a) 21-27 and (b) 28-31 weeks gestational age of infants born in Fatemieh Hospital of Hamadan during 2009 to 2013.
Fig. 2
The estimated PDF of birth weight (gr) for (a) 21-27 and (b) 28-31 weeks gestational age of infants born in Fatemieh Hospital of Hamadan during 2009 to 2013.
Table 3
The estimated mean of birth weight (gr), standard deviation and mixing proportion for the Bayesian mixture model

| Gestational age group | Component | Mean | Standard deviation | Mixing proportion | Credible interval |
|---|---|---|---|---|---|
| 21 to 27 weeks | First | 815.9 | 0.0123 | 0.74 | 766 - 846 |
| 21 to 27 weeks | Second | 1179 | 0.0123 | 0.26 | 1039 - 1171 |
| 28 to 31 weeks | First | 1061 | 0.0074 | 0.56 | 1030 - 1091 |
| 28 to 31 weeks | Second | 1620 | 0.0074 | 0.44 | 1583 - 1656 |

Table 3 reveals that about three fourths of the recorded gestational ages between 21 and 27 weeks are plausible, whereas in the second group (28 to 31 weeks) only about 56% of the gestational ages are plausible. These results indicate that the accuracy of the recorded gestational age decreases as gestational age increases.
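The mixing proportions in Table 3 describe each group as a whole. As a rough illustration of how a fitted mixture could also be applied to a single infant, the sketch below computes the posterior probability that a given birth weight belongs to the second (implausible) component using Bayes' rule. The means and mixing proportions are those reported for the 28 to 31 weeks group; the component standard deviations of 200 gr are hypothetical values chosen only for illustration and are not taken from the study, and the function names are likewise assumptions.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prob_second_component(weight, means, sds, proportions):
    """Posterior probability that `weight` comes from the second component."""
    first = proportions[0] * normal_pdf(weight, means[0], sds[0])
    second = proportions[1] * normal_pdf(weight, means[1], sds[1])
    return second / (first + second)


# Means and mixing proportions reported for the 28 to 31 weeks group.
MEANS = (1061.0, 1620.0)
PROPORTIONS = (0.56, 0.44)
# Hypothetical SDs in gr, NOT estimates from the study.
ASSUMED_SDS = (200.0, 200.0)

if __name__ == "__main__":
    for w in (900, 1300, 1700):
        p = prob_second_component(w, MEANS, ASSUMED_SDS, PROPORTIONS)
        print(f"{w} gr: P(second component) = {p:.3f}")
```

Under these assumptions the probability rises with birth weight, so heavier infants in the group are the ones most likely to have an under-recorded gestational age.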
Discussion
Erroneous recording of gestational age may result from not knowing the correct date of the LMP or from recalling it wrongly. Several studies have considered the identification of implausible gestational ages using mixture models. In this study, which was inspired by the work of Zhang et al., a Bayesian mixture model was used to identify the proportion of implausible gestational ages. This method was applied to the two groups introduced in the Methods section.
Joseph et al. considered four different cut-off point methods for removing implausible gestational ages (35). These methods detect different proportions of implausible gestational ages, from about 0.09% using the ±4 standard deviation method to about 40% using Tukey's rule. Also, according to the expert opinion method, no gestational age was implausible with regard to the related weight. These four methods detect a smaller proportion of gestational ages as implausible than the Bayesian mixture model used in the current study. This finding is consistent with other studies that used mixture models (29). According to the study by Joseph et al., in all four cut-off point methods the number of implausible gestational ages increases with the recorded gestational age. This finding is in concordance with the result of the current study. Also, in the study by Zhang et al., the percentage of correctly recorded ages for infants with 23 to 27 weeks of gestational age was more than 80%, and this percentage was lower for infants with 28-31 weeks of gestational age (29). These findings are close to those of the current study, where the percentage of correctly recorded ages in the first group is about 74% and is higher than in the second group. Overall, the model shows that gestational age was recorded as lower than its true value for a relatively large proportion of infants.
Although a two-component Bayesian mixture model was fitted in the current study, a model with more components can be used, if necessary, to identify implausible gestational ages. In addition, each component of the model is assumed to have a normal distribution, and this restriction may be questionable, especially at lower gestational ages. As a solution, one may use distribution-free models, which provide a flexible distribution for erroneously reported gestational ages. Wilcox and Russell suggested a two-component mixture distribution (36). One component is a "predominant distribution", which is normal. The second component is a "residual distribution", which has no specified form. Finally, the objective of this study was only to identify implausible gestational ages; correcting the gestational ages that were probably misreported can be the subject of future studies.
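To make the contrast with the cut-off point methods discussed above more concrete, the following sketch implements textbook versions of two of the rules named in the Discussion, the ±4 standard deviation rule and Tukey's rule, for the birth weights within one gestational age group. The exact definitions used by Joseph et al. are not given here, so the fence multiplier of 1.5 and the quartile method are assumptions, as are the function names.

```python
import statistics


def flag_sd_rule(weights, k=4.0):
    """Flag weights more than k sample SDs away from the group mean."""
    mean = statistics.mean(weights)
    sd = statistics.stdev(weights)
    return [abs(w - mean) > k * sd for w in weights]


def flag_tukey(weights, k=1.5):
    """Flag weights outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(weights, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [w < lower or w > upper for w in weights]
```

Both rules discard the flagged observations, which is the source of the truncation and data loss noted in the introduction. The ±4 standard deviation rule is also the more conservative of the two, because an extreme weight inflates the standard deviation against which it is compared.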
Conclusion
This study demonstrates that errors in which the gestational age of preterm infants is recorded as lower than its true value affect a large proportion of the recorded gestational ages. Therefore, improving the recording system and developing scientific methods to correct these gestational ages are essential to providing suitable health services and computing accurate health indicators.
Acknowledgments
This article was extracted from the MSc dissertation of the second author and supported by Hamadan University of Medical Sciences. We are grateful to the personnel of Fatemieh Hospital of Hamadan for their cooperation.
Conflict of interest
All authors declare that they have no significant competing interests.
Funding
This study was funded by the Vice-chancellor of Research and Technology of Hamadan University of Medical Sciences (grant number 9304242121).