The legacy of a standard of normality in child nutrition research.

Abstract

Anthropometric evaluation of children is among the most vital and widely used instruments of public health and clinical medicine. Anthropometry is used for establishing norms, identifying variations, and monitoring development. Yet the accurate assessment of physical growth and development of children remains a perpetually beleaguering subject. This paper focuses on the evolution of anthropometry as a science and its associated measurements, indices, indicators, standards, references, and best practices. This paper seeks to clarify aspects of the assessment of child growth, explores the historical trajectory of the study of anthropometry and its contemporary limitations, and contributes to the debate surrounding references and standards, and the applicability of international anthropometric standards to an individual's health. Among its findings is a surprisingly nonlinear and contested record of events, up to and including leading contemporary practices and datasets. It contextualizes the legacy of child malnutrition studies in a broad framework, including the linkage between the early eugenics movement and contemporary notions of a "normal" child, the interpersonal and institutional competition to become the preeminent child growth authority, the obfuscated distinction between reference growth charts and standards of growth, and the hidden consequences of universal growth standards that no longer reflect any observable populations.

Keywords: Anthropometry; Child nutrition; Normality; References and standards; Samples and populations

Year: 2021 PMID: 34345645 PMCID: PMC8319510 DOI: 10.1016/j.ssmph.2021.100865

Source DB: PubMed Journal: SSM Popul Health ISSN: 2352-8273

Anthropometry is the scientific study of the measurements and proportions of the human body. The World Health Organization asserts “that for practical purposes anthropometry is the most useful tool for assessing the nutritional status of children” (WHO, 1986, p. 929). Other approaches to measuring malnutrition include self-reported hunger levels and estimates based on food supply; however, they are less reliable (Svedberg, 2011). Child malnutrition is an indicator of food and nutrition security (Smith, El Obeid, & Jensen, 2000). Although anthropometry is not the same as health, it is significant and useful for understanding health (Komlos, 2009). There is little reason to doubt the importance and urgency of improving child health and nutrition, substantiated by a resolute anthropometric method. In general contemporary terminology, the basic anthropometric measurements are age, sex, weight, and height. Other measurements include subscapular skinfold thickness, triceps skinfold thickness, mid-upper arm circumference, and head circumference. An index is a combination of measurements (e.g., weight-for-height, height-for-age). Indices are necessary for grouping and interpreting measurements. The most prominent anthropometric index expression is the z-score. It is derived from the difference between a particular child's weight-for-height or height-for-age and the comparable value from a reference population, divided by the standard deviation of that reference population (WHO, 1995). The most ubiquitous growth chart is the 2006 WHO Child Growth Standards (Natale & Rajagopalan, 2014). An indicator is the application of an index prescribing judgement on the health of an individual (e.g., wasted, stunted, underweight). An index is a numerical calculation only, whereas an indicator is a value-based grouping or cutoff (WHO, 1986). The two most widely studied contemporary indicators are wasting and stunting.
Wasting indicates a deficit in tissue and fat mass, either from weight loss or inability to gain weight. Stunting indicates impeded skeletal growth. It is an evaluation of linear growth, representing chronic malnutrition accumulated over time. Nutrition monitoring and intervention programs hinge on specific, accurate, and standardized indicators (UNICEF, 2013). Why are these the dominant accepted paradigms and how did they get to be so? Historians of science know that understanding how and why a science (in this case anthropometry) developed its methods and gained its prominence raises profound questions. Social context, metaphysical assumptions, professional aspirations, and ideological allegiances are significant to the histories of a science. A conventional and sanitized history of science—which ignores blind alleys, errors, and distortions in the past—is incomplete. This paper attempts to grapple with some of these unconventional and ignored questions, particularly questions pertaining to the evolution and prominence of universal growth charts, the lasting impacts of emphasizing “normal” children, why and how categories of healthy growth developed and who was responsible, the oft-ignored distinction between references and standards, and the hidden consequences on the applicability of recommendations of child growth derived from universal growth standards. Broadly speaking, the present article consists of seven compound objectives. The first section chronicles the nascent development of the science of anthropometry, detailing the motivations and findings of contributors to the field at its inception. The second section introduces the premise that the motivations and findings of anthropometric science are inextricably linked to the eugenics movement and how the notion of a “normal” child (described later as still in contemporary practice) derives from this doctrine.
The third section chronicles the development of child growth charts and the struggle of various institutions to supplant one another as the preeminent authority, leading to a movement away from regional and national tables towards a single unified international reference. The fourth section traces the semantic evolution of methods and terminology of anthropometric measurements, indices, and indicators to describe child malnutrition in its various forms. This section also explores the struggle between quantitative and qualitative classifications, and juxtaposes the needs of cold statistical objectivity against individual subjective judgement and evaluation. The fifth section examines the distinction between reference growth charts and standards of growth, the continued development of unified international growth charts, and what it means to be a “normal” child. The sixth section highlights the origins for ongoing debates of the social determinants of health and the meta-histories of anthropometry. The final section analyzes the state of the contemporary preeminent international child growth chart derived from the 2006 WHO Multicentre Growth Reference Study. This framing reveals a picture of anthropometry as a cultural product and a political resource. As Rudwick puts it, “Accepting or rejecting any scientific theory is always and irreducibly a social act, by a specific social group, in particular cultural circumstances” (1981, p. 247). Demonstrating that anthropometry has always been contested and negotiated, this historical awareness helps to keep the subject open to dialogue and debate. Future policies and initiatives will be more effective and successful if they are shaped against a background that includes an understanding of the forces and factors that shaped past developments.

A nascent scientific subject

The genesis of anthropometry is not in medicine or even science, but in the arts and Pythagorean philosophy (Tanner, 1981). It was sculptors and painters, in search of Platonic ideals, who first measured the relative proportions of the human form. The nascent scientific study of measurements and proportions of the human body was conceived by Adolphe Quételet. The Quételet Index, later redubbed the Body Mass Index, is still relevant (Eknoyan, 2008). In his 1832 article Research on the weight of man at different ages, Quételet describes the first cross-sectional study of the height and weight of newborns and children (Quételet, 1832). In his 1835 text A treatise on man and the development of his faculties, Quételet presented his conception of the “average man” and the link between the population distribution of weight and height to the normal Gaussian distribution (i.e., a bell curve) (Quételet, 1835). It was not until after the UK Parliament passed the 1833 Factory Act, reforming inadequate child labor standards in factories, that a need arose for physicians to measure and standardize the growth rates of children. The Act required physicians to certify children's “age and physical capacity for work … and that the [child] has the ordinary strength and appearance of at least 8 years of age” (Roberts, 1876, p. 681). Following the passage of the Act there was a smattering of studies measuring the weight and height of children in select factories. However, it was Roberts (1876) who first endeavored to establish standards of reference for the height and weight of children, collecting measurements from 10,000 boys and girls, aged 8 to 14, across urban and rural populations, and factory and non-factory households. In 1883, the Final Report of the Anthropometric Committee of the British Association for the Advancement of Science was published. 
The Committee was appointed in 1875 “for the purpose of collecting observations on the systematic examination of the height, weight, and other physical characters of the inhabitants of the British Isles” (Galton, 1883a, p. 1). Under the chairmanship of Francis Galton (inventor of correlation and regression, and cousin of Charles Darwin), the Committee collected anthropometric measurements from 917 infants and 651 children under 5 years of age to construct tables of average weight and height. The primary questions of research at the time were concerned with developing general principles of growth and development, understanding the link between social class and mental and physical capacity in children, and discerning the point at which growth matures (Burk, 1898). Similar efforts were also underway in the US (Bowditch, 1877). By the end of the 19th century interest in anthropometry––specifically anthropometry of children––was accelerating. Hartwell (1893) chronicled 117 titles of anthropometric works in the US. In 1898, Burk published growth curves and a study describing the “average” American boy and girl, based on the anthropometric surveys of Boas (88,449 Boston, St. Louis, Milwaukee, Worcester, Toronto, and Oakland children), Bowditch (24,500 Boston children), Peckham (9600 Milwaukee children), and Porter (34,500 St. Louis children).

A pure and normal child

In 1909, Ellen Key's The Century of the Child was published in English. The volume and its title served as spark and slogan for a burgeoning child welfare movement, which was gaining moral and political authority throughout Western Europe and the United States at the turn of the 20th century (Cravens, 1993). Key's message certainly resonated in the United States, especially with people like Cora Hillis of the National Congress of Mothers (the progenitor of the National Parent Teacher Association), who in 1917 fought to establish the Iowa Child Welfare Research Station. The Research Station pioneered methods of assessing children's nutritional status with anthropometry indicators in order to “give the normal child the same scientific study by research methods that we give to crops and cattle” (Bradbury & Stoddard, 1933, p. 7). It was there that the notion of a “normal” child was championed. However, the notion of a “normal” child and the study of anthropometry are inextricably linked to the early eugenics movement. It was Francis Galton himself who coined the term eugenics as “the science of improving [human] stock [through] judicious mating … to give the more suitable races … a better chance of prevailing over the less suitable” (Galton, 1883b, p. 25). The early child wellbeing researchers assumed the national population was divided into a hierarchical series of groups, some superior and some inadequate, with native-born whites of Anglo-Saxon Protestant ancestry at the top (Cravens, 1993). Ellen Key, echoing Galton, called for “very strict rules, to hinder inferior specimens of humanity from transmitting their vices or diseases, their intellectual or physical weaknesses” (Key, 1909, p. 20). Fully in the mainstream of her time, Cora Hillis also campaigned for racial purity in order to promote the Research Station (Cravens, 1993). Anthropometry has been conjoined since its inception as a scientific practice with the ideals of eugenics.
Despite meaningful insights from anthropometry, this legacy has beset the field. From these early studies medical professionals began to use height-weight-age tables as an index of child health and measure of severe malnutrition, replacing the inadequate measure of weight only (JAMA, 1933). The impetus for an index of child health came from the Baldwin-Wood tables, first published in 1910 and revised in 1923, which soon became widely taught and reproduced in most textbooks (Tanner, 1952). Emerson and Manny (1920) first proposed a normal zone—of 7% below to 20% above average weight for height—to identify malnourished children, determining that 20 to 40 percent of US children were malnourished. Alongside the salutary results of the research, the limits of the normal zone were generally misunderstood by anxious parents, who would consult oracular weighing machines to gauge their child's health (Tanner, 1952). Even medical professionals misunderstood and trivialized malnutrition, dominated instead by the ideas of infection (Williams, 1973). But unlike infection, which asks the qualitative question “Whether or Not” (a child is infected), malnutrition asks the quantitative question of “How Much” (a child is malnourished).
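The normal zone attributed above to Emerson and Manny (1920) can be read as a simple interval test on a child's deviation from the average weight for their height. The following Python sketch illustrates that reading only; the function name, the inclusive treatment of the two boundaries, and the sample weights are assumptions made for illustration and are not taken from the 1920 study.

```python
def in_normal_zone(weight, average_weight_for_height):
    """Return True if weight lies from 7% below to 20% above the average.

    Boundaries are treated as inclusive, which is an assumption; the text
    only gives the zone as "7% below to 20% above average weight for height".
    """
    deviation_pct = 100.0 * (weight - average_weight_for_height) / average_weight_for_height
    return -7.0 <= deviation_pct <= 20.0


# Hypothetical example: the average weight for this child's height is 20 kg.
AVERAGE_KG = 20.0
for weight_kg in (18.0, 20.0, 23.0, 25.0):
    label = "inside" if in_normal_zone(weight_kg, AVERAGE_KG) else "outside"
    print(f"{weight_kg} kg is {label} the normal zone")
```

A zone of this kind answers only "inside or outside", which is one way to see why the text contrasts it with the quantitative question of how much a child is malnourished.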

Growth curve standardization and unification

By the early 1940s the study of velocity of growth grew in prominence. Its first advocate, Frank K. Shuttleworth, deemed cross-sectional data inadequate for all meaningful analysis with the exception of “determining the average size of children in general at any given age” (Shuttleworth, 1937, p. 180). However, determining velocity was financially, administratively, and computationally burdensome, requiring longitudinal rather than cross-sectional studies (Tanner, 1952). Boas (1892) realized the importance of longitudinal data, but was largely ignored until 40 years later when he clarified the statistical and scientific gains to be had from following individuals through time (Boas, 1930). The first longitudinal charts came from studies in the United States, consisting of 50–200 children from homologous communities (Bayer & Gray, 1935; Jackson & Kelly, 1945; Palmer, Kawakami, & Reed, 1937; Palmer & Reed, 1935; Robinow, 1942; Simmons & Todd, 1938; Wetzel, 1941). Older studies and charts did exist in a sense. As far back as 1872 Bowditch collected longitudinal data; however, he only studied 13 girls and 12 boys who were all mostly related and older than 5 years of age (Bowditch, 1877). These studies, however, were only quasi-longitudinal, with many children only being observed for a few years at a time. Despite their shortcomings, these standards of reference would not be fully supplanted until 2000 to 2006 (de Onis et al., 2007a, 2007b). In a perpetual trend that continues today, the accepted standards of anthropometric measurement continued to evolve. Growth rate norms developed from data earlier than the 1930s (i.e., the Baldwin-Wood tables) were deemed inadequate for evaluation. Critics like Shuttleworth (1934) decried the inadequacies of the contemporary standards of development.
Pointing to the secular trend over the past century towards heavier and taller populations (see Roberts, 1876), previous standards of reference were quickly deemed out-of-date (Meredith, 1941; Meredith & Meredith, 1944; Tanner, 1952). The secular growth trend debate continues to beleaguer contemporary studies of anthropometry (NCD-RisC, 2017). Stuart and Meredith (1946) provided the first such updated standards, collected from 750 children between the ages of 5–18 years of “northwest European ancestry living under better than average conditions from the standpoints of nutrition, housing, and health care” at the Iowa Research Station (Meredith, 1949, p. 884). In the fifth edition of Mitchell-Nelson's Textbook of Pediatrics (for the past 70 years the most prominent book of its kind), Stuart and Stevenson (1950) provided further updates from the Harvard School of Public Health Longitudinal Studies data, including children from birth to 18 years old. These anthropometry standards—referred to as the Harvard-Iowa standards—remained in prominent use for the next thirty years (Tanner, 1981). Similar efforts were also underway in the Netherlands (de Wijn & de Haas, 1960) and Britain (Tanner & Whitehouse, 1959). Despite their prominence, the Harvard-Iowa standards were recognized as inadequate for a national reference, much less for an international reference, but such is the effect of professional prestige and political power. In an effort to standardize inadequate nutrition assessments, the World Health Organization in 1966 published a simplified combined-sexes version of the Harvard-Iowa standards (Dibley, Goldsby, Staehling, & Trowbridge, 1987). Certifying itself as exemplar, the World Health Organization established methods, techniques, and procedures for defining, collecting, presenting, and interpreting anthropometric measurements (Jelliffe, 1966a, 1966b).
Pediatricians and public health officials were beginning to adopt anthropometry and children's health as a sensitive index of the health of a community (Tanner, Whitehouse, & Takaishi, 1966). Indeed, the Assistant Director-General of the World Health Organization, W. H. Chang, proclaimed, “Health of a population is reflected most accurately by the rate of growth of its children” (Eveleth & Tanner, 1976, p. ix). In 1967 the World Health Organization and UNICEF (United Nations International Children's Emergency Fund) collaborated with the International Biological Programme (under the auspices of the International Council of Scientific Unions) to collect anthropometry data from a globally representative sample spanning 42 countries and 340 projects, in an unprecedented multilateral effort, including a joint longitudinal study of children in Paris and London, to serve as the new reference (Eveleth & Tanner, 1976). Unfortunately, the efforts of the International Biological Programme lacked traction in the nutrition sphere and became defunct by 1972. The First Joint Food and Agriculture Organization/World Health Organization Committee on Nutrition convened in 1949. In keeping with its persistent message, the First Expert Committee prescribed a need for studies of the clinical characteristics of early childhood malnutrition (FAO & WHO, 1949). Under the United Nations’ collective belief that health is a fundamental human right and the healthy development of children is of central importance, nutritional needs assessments in underdeveloped countries began in earnest. By 1971, the Eighth Expert Committee prescribed a need to study incidence and prevalence of malnutrition, and the urgent prerequisite of a general consensus of definitions and classifications.
They also highlighted other concurrent issues such as the etiology of malnutrition and the role of non-illness (socio-economic) factors, and the permanent physical and mental impairment caused by malnutrition (FAO & WHO, 1971). Greater understanding of the mechanisms of malnutrition, highlighted by Emerson and Manny (1920), spurred by Jelliffe (1966a, 1966b), and underscored by Waterlow (1972), led to the supremacy of height-for-age and weight-for-height anthropometric indices, supplanting the inadequate weight-for-age index (Waterlow et al., 1977; WHO, 1976). Perpetuating the discourse of ever more rigorous standards, the Maternal and Child Health Program, the United States Public Health Service, and the American Academy of Pediatrics concurred in 1971 that the Harvard-Iowa standards were inadequate and no longer applied to the US (Hamill et al., 1979). This decision was the impetus for the Health and Nutrition Examination Survey carried out by the Centers for Disease Control and Prevention's National Center For Health Statistics Task Force and later recommended by the US National Academy of Science in 1974 as the new US national anthropometric reference (WHO, 1978). First released in 1977, the National Center For Health Statistics Growth Curves were a combination of data from the National Center For Health Statistics’ Health Examination Surveys, Health and Nutrition Examination Survey and the Fels Research Institute (Hamill et al., 1979). The National Center For Health Statistics data consisted of three pooled quasi-longitudinal surveys (1963–1974) measuring the anthropometry of 2–18 year-olds from a national stratified probability sample (Hamill, Drizd, Johnson, Reed, & Roche, 1977). The Fels data was compiled from a sample of convenience of 867 white middle-class Ohio children during a longitudinal study (1929–1975) of children from birth to 3 years old (Dibley, Goldsby, et al., 1987).
The portmanteau quality of the growth reference led to a discontinuity at the junction point of the disparate data sets (Dibley, Staehling, Nieburg, & Trowbridge, 1987). The discontinuity produced spurious interpretations of anthropometric indicators, which incorrectly implied a drop in prevalence rates at 2 years old. This spurious artifact persists today in many studies on the etiology of malnutrition. Waterlow et al. (1977) of the World Health Organization, described the canonical criteria for an anthropometric reference population, which would establish the US National Center for Health Statistics Growth Curves (Hamill et al., 1979) as the preeminent growth reference for both individuals and populations for the next 30 years. In 1978 the Centers for Disease Control and Prevention developed a statistically normalized version of the National Center for Health Statistics Growth Curves (Dibley, Goldsby, et al., 1987). In the same year the World Health Organization adopted the normalized Growth Curves and succeeded in promoting them as the preeminent international growth reference. The single international reference population allowed pediatricians, public health officers, and organizations like the World Health Organization to compare the results among different nutrition studies, assisting interpretation and improving clarity (WHO, 1978).

Categories, cutoffs, and classifications

Though not the first to try, Waterlow et al. (1977) cemented normalized growth charts and z-scores as the definitive indicator measurement. The most common expressions of anthropometric indices are percent-of-median, percentiles, and z-scores (sometimes referred to as standard deviation scores) used to group and interpret measurements. Percent-of-median is the ratio of an anthropometric measurement or index for a child (e.g., their weight) to the median value of comparable children in the reference population, expressed as a percentage (WHO, 1995). Percent-of-median is the simplest to calculate and a useful measurement if the distribution of the reference population is unknown, unspecified, or otherwise not normalized (Gorstein et al., 1994). Percentiles rank the relative position of a child against comparable children in the reference population, expressed in terms of what percentage of the reference population the child equals or exceeds (WHO, 1995). Percentiles are the most intuitive, and formerly the most common way physicians tracked a child's growth; the 50th percentile or the median (and if the reference is perfectly Gaussian normal, also the mean), describes the central point with 50% of the population above it and 50% of the population below it (Falkner, 1962). Z-scores convey anthropometric measurements as a number of standard deviations below or above the reference population value. Z-scores are the difference between a child's measurement and the mean value of comparable children in the reference population, divided by the standard deviation of the reference population (WHO, 1995). Z-scores require a reference population that follows a normal (Gaussian) distribution. In return, z-score cutoff values are stable across different reference populations (e.g., defining a −2.0 weight-for-height z-score as wasted is consistent across all heights and even through other conditional factors such as age). 
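The three expressions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any particular growth reference: the function names are illustrative, the reference mean and standard deviation are hypothetical numbers rather than values from any published chart, and the percentile conversion assumes a perfectly Gaussian reference (so that the mean and median coincide).

```python
from statistics import NormalDist


def percent_of_median(value, ref_median):
    """Child's measurement as a percentage of the reference median."""
    return 100.0 * value / ref_median


def z_score(value, ref_mean, ref_sd):
    """Number of reference standard deviations above (+) or below (-) the mean."""
    return (value - ref_mean) / ref_sd


def percentile_from_z(z):
    """Percentage of a Gaussian reference population at or below this z-score."""
    return 100.0 * NormalDist().cdf(z)


# Hypothetical reference values for comparable children (NOT from a real chart).
REF_MEAN_KG = 12.0   # also the median, because the reference is assumed Gaussian
REF_SD_KG = 1.0
child_weight_kg = 10.0

z = z_score(child_weight_kg, REF_MEAN_KG, REF_SD_KG)
pom = percent_of_median(child_weight_kg, REF_MEAN_KG)
pct = percentile_from_z(z)
print(f"z-score: {z:.2f}, percent-of-median: {pom:.1f}%, percentile: {pct:.1f}")
```

With these made-up numbers the same child sits 2 standard deviations below the mean, at roughly 83 percent-of-median, and near the 2nd percentile, which shows how the three expressions describe one measurement on different scales.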
Z-score measurements are also useful for comparing measurements across different units (Falkner, 1962), and as a feature of normalization the full distribution of anthropometric values can be expressed with just a mean and standard deviation. Z-scores are now accepted as the best system for analysis and presentation of anthropometric data (de Onis & Blössner, 1997; de Onis & Habicht, 1996; WHO, 1995). The terminology used to describe malnutrition has gone through many renditions. As one anonymous author in the British Medical Journal once said: “All we can demand is … that language shall not lag behind knowledge; and that, as we learn to know things better, we shall also take due pains to name them more perfectly” (Anonymous, 1886, p. 1116). Etymologically speaking, the terms wasting and stunting are ideophones: purely descriptive of the symptomatic thinness and shortness of malnutrition. As early as Emerson and Manny (1920), stunting described low height-for-age whereas malnourished described low weight-for-height. At the First Joint Food and Agriculture Organization/World Health Organization Committee on Nutrition kwashiorkor or malignant malnutrition was the watchword of the day (FAO & WHO, 1949). Kwashiorkor is a Ghanaian word meaning “the disease of the deposed baby when the next is born” (Williams, 1973, p. 361). First described by distinguished pediatric pioneer Cicely Williams (1933), it is a type of clinical malnutrition from deficient protein intake coupled with edema (i.e., an excess of fluid in body tissues and cavities). By the Third Joint Committee, the nutrition lexicon shifted to protein-calorie malnutrition and included descriptions of “wasted muscles” hinting at the ensuing terminology (FAO & WHO, 1953). During the intervening decade, 1950–1960, the field of nutrition shifted emphasis from micronutrients (vitamins A and B, iodine, and zinc) to macronutrients (proteins, fats, and carbohydrates) (Jolliffe, 1962). 
Jelliffe (1966a, 1966b) suggested the term protein-calorie malnutrition of early childhood should be used as a generic term to cover the whole range of manifestations, which would include the clinical syndromes of kwashiorkor and marasmus—a more general form of starvation with signs of “severe wasting,” but not edema. He also distinguished between four forms of malnutrition: undernutrition, specific deficiency, overnutrition, and imbalance. In modern parlance, “severe acute malnutrition” and “severe wasting” have superseded kwashiorkor and marasmus (WHO & UNICEF, 2009). Waterlow (1972) proposed retardation as the slowing of linear growth where stunting would describe a reduction in final stature. Following Seoane and Latham (1971), who noted weight-for-height gauges current nutrition and height-for-age gauges past nutrition, Waterlow (1972) also proposed four categories of nutritional status: normal; malnourished but not retarded (acute malnutrition); malnourished and retarded (acute on chronic malnutrition); and retarded but not malnourished (so-called nutritional dwarfs). Each category was accompanied by a grade to further distinguish the severity. By 1977, the contemporary derivations of wasting (low weight-for-height) and stunting (low height-for-age) were established. But the sorites problem—the ancient Greek paradox of how many grains of sand it takes to make up a heap—remained unresolved. That is, at what point is a child stunted, wasted, underweight, malnourished or severely malnourished? Determining a child's nutritional status based on anthropometric values requires defining cut-off points, which needs a qualitative classification, whereas prevalence and severity need a quantitative classification (Waterlow, 1972). To use Stevens’s (1946) typology of scale, one must transform a ratio measurement into a nominal grouping. Using weight-for-age, Gómez et al.
(1956) imposed explicit cut-off points (i.e., 76–90, 61–75, and less than 60 percent-of-median) to classify malnutrition severity into first degree, second degree, and third degree malnutrition. Ford (1964) suggested that 66 percent-of-median should be the malnutrition line. Garrow (1966) proposed that severe malnutrition occurred only below 70 percent-of-median weight-for-age. Dugdale (1971) believed malnutrition began at 80 percent-of-median reference weight. Waterlow (1972) tweaked the Gómez Classification; using weight-for-height he suggested three delineated malnutrition severities of 90–80, 80–70, and less than 70 percent-of-median. Trowbridge (1979) classified wasting as below 80 percent-of-median and stunting as below 82.5 percent-of-median. The Oomen Malnutrition Index (Oomen, 1955) and Protein-Calorie Malnutrition Score (Jelliffe & Welbourn, 1963) were other attempts to establish a common system, but the Gómez classification is considered the progenitor of the modern malnutrition classification system (de Onis, 2000; Jelliffe, 1966a, 1966b). Originally the Gómez classification was devised to group cases of similar prognosis for children aged 1–4 years and guide physicians in selecting the appropriate place of treatment. It was not intended as a diagnostic classification tool for community surveys nor to be extended to other age groups (FAO & WHO, 1971; Gómez et al., 1956). With the increasing prominence of normalized curves and z-scores, Waterlow et al. (1977) defined the contemporary canonical cut-off points for moderate wasting and stunting as 2 standard deviations below the median reference, and for severe wasting and stunting as 3 standard deviations below the median reference (UNICEF, 2013). Though largely ignored, the Eighth Report of the Joint Food and Agriculture Organization/World Health Organization Expert Committee on Nutrition did warn against the problem of a “normal” standard in tests of nutritional status (FAO & WHO, 1971).
“In most biochemical and haematological measurements it is usual, for practical reasons, to specify ranges and ‘cut-off’ points that distinguish ‘normal’ individuals or groups from those who are ‘at risk’ or ‘deficient’,” the Report goes on to say, “This is an arbitrary procedure, since most parameters vary continuously … [and statistical evaluation] cannot by itself distinguish between what is normal and abnormal in the biological sense” (FAO & WHO, 1971, p. 76). Sole reliance on statistical evaluation continues, with little consideration as to the sensitivity and specificity of an arbitrary cut-off point.
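The step from a continuous index to a nominal grouping, as discussed in this section, can be illustrated with two of the cut-off schemes named above: the Gómez percent-of-median grades and the 2 and 3 standard deviation cut-offs. The sketch below is illustrative only. The function names and category labels are assumptions, and so is the handling of boundary values: the published Gómez bands (76–90, 61–75, less than 60) leave gaps between them, which are closed here by treating the bands as continuous thresholds, and "below" is read as strictly less than.

```python
def gomez_grade(weight_for_age_pct):
    """Grade malnutrition from weight-for-age expressed as percent-of-median.

    Thresholds follow the Gómez bands quoted in the text; values falling in
    the gaps between bands are assigned by assumption to the milder grade.
    """
    if weight_for_age_pct > 90:
        return "normal"
    if weight_for_age_pct > 75:
        return "first degree"
    if weight_for_age_pct >= 60:
        return "second degree"
    return "third degree"


def z_score_grade(z):
    """Grade wasting or stunting from a z-score using the -2 and -3 cut-offs."""
    if z < -3:
        return "severe"
    if z < -2:
        return "moderate"
    return "above cut-off"


# Hypothetical children, for illustration only.
for pct in (95, 80, 70, 55):
    print(f"{pct}% of median weight-for-age -> {gomez_grade(pct)}")
for z in (-0.5, -2.4, -3.2):
    print(f"z = {z} -> {z_score_grade(z)}")
```

A child at z = -1.99 and a child at z = -2.01 land in different categories despite being nearly indistinguishable on the underlying scale, which is the arbitrariness the Eighth Report warns about.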

Reference standards

Using the 1978 normalized Growth Curves, the World Health Organization continued to collect and publish (in 1983, 1989, and 1993) information on the nutritional status of the world's children (de Onis & Blössner, 1997). In 1986, a World Health Organization Working Group published a conclusive guide to define, interpret, and standardize anthropometric indicators (WHO, 1986). By 1993, the Expert Committee on Physical Status, convened by the World Health Organization, concluded that despite previous admonitions, reference growth charts had long been misconstrued as a standard of growth (de Onis & Habicht, 1996). The National Center for Health Statistics and the Centers for Disease Control and Prevention designed both the 1977 smoothed percentiles and the 1978 normalized growth curves as references (Kuczmarski et al., 2002). The sole aim of a reference is to be a common basis in order to group, analyze, and compare different populations, whereas a standard represents a desirable target or norm (WHO, 1995). In practice, however, clinicians use growth charts as standards rather than references (Grummer-Strawn, Krebs, & Reinold, 2010). The distinction may seem trivial, but the requirements of the underlying data will change depending on the intended application, which can produce spurious interpretations and conclusions. The problem is also circular. To be able to identify the normal range in a population the abnormal ones must first be removed, but abnormalities can only be identified once the normal range is defined (Armstrong, 2019; Creadick, 2010; Rose, 2016). Not to mention the well documented paradox that given enough measurement dimensions—even a small number of dimensions across a homogenous sample—exactly zero people will be “average” (Creadick, 2010; Rose, 2016; Subramanian, Kim, & Christakis, 2018). 
However, the question remains of whether it is appropriate to compare children across radically different environments, and whether the reference versus standard distinction is satisfactory or merely evades the larger issue (de Onis & Blössner, 1997). Different subpopulations have different proclivities for growth, based on their environment, gene pools, and the interaction between the two (Eveleth & Tanner, 1976). “Clearly, if there were differences dependent on different gene distributions,” state Waterlow et al., “then the target for one population would not be the same as the target for another. This does not, however, affect the use of the reference data for comparisons between populations” (1977, p. 490). Tempting as it may be, the desire to distill all observed differences in human growth and behavior down to the environment and gene pools should be avoided, especially if accompanied by a numerical ranking, echoing eugenics and environmental determinism. Even the canonical arbiters of the international anthropometric reference conceded that, “Because the reference population cannot be used as a universal target, the question of what is a realistic goal in any particular situation does become important. Decisions of this kind have to be taken locally, and it is not possible to make international recommendations about them” (Waterlow et al., 1977, p. 490). The distinction was, and continues to be, largely overlooked. In constructing the international growth reference chart, the National Center for Health Statistics decided that smoothed growth curves looked better and represented reality better. Although mathematical smoothing techniques have long existed, the 1977 reference was the first to use computers to systematically smooth its curves in a reproducible, quantifiable way (Hamill et al., 1977).
The result was a set of artificial growth curves designed to serve statistical techniques of comparison that depend on the normal (Gaussian) distribution (Dibley, Goldsby, et al., 1987). The increasing normality of the international reference data (in the statistical Gaussian sense), however, exacerbated the phenomenon of misapplying the reference as a standard (WHO, 1995). Recognition of this phenomenon, along with other inadequacies of the data (e.g., discontinuities and unrepresentative samples of convenience), led to the development of new growth charts, which purported to serve as both reference and standard.

Histories, etiologies, and determinants

Pioneering the research on the social causes of malnutrition, José María Bengoa (1940) believed malnutrition to be an ecological problem: the result of overlapping factors in a community's physical, biological, and cultural environments. Physician Norman Jolliffe (1962) proposed a twofold classification for the pathogenesis of nutritional deficiency. Jolliffe's classification places a faulty diet as the primary cause, which is conditioned by inadequate or abnormal nutrient ingestion, absorption, utilization, and excretion. This etiology is firmly couched within the purview of illness-related malnutrition (Mehta et al., 2013). Moving towards a non-illness etiology of the social determinants of health, tropical pediatric expert Dr. Derrick Jelliffe (1966a, 1966b) proposed that the principal aim of nutritional assessment should be to map out the magnitude and geographical distribution of the problem and analyze the direct and indirect ecological factors. The entitlements paradigm, conceived by Nobel Prize-winning economist Amartya Sen (1976), approached the study of poverty and hunger by illuminating the less than obvious economic mechanisms when dealing with less than extreme raw poverty and its antecedents. The same year Sen devised entitlements, physician and demographic historian Thomas McKeown (1976; 1979) proposed that economic growth coupled with better nutrition (i.e., greater caloric intake) caused improvements in health outcomes, rather than targeted public health or medical interventions. Dubbed the “McKeown thesis,” it became the subject of much controversy and shaped the research hypotheses of many scholars (Colgrove, 2002). Motivated by McKeown and coinciding with the search to develop child growth standards, the National Bureau of Economic Research conducted numerous early studies on anthropometric history and trends (Cuff, 2019).
In the late 1970s researchers such as Nobel Prize-winning economist Robert Fogel began to create the new anthropometric history (Steckel, 2009). The founders of this newly developing interdisciplinary perspective were instrumental in bridging child growth and economic development, connecting components of biological welfare with the socioeconomic and epidemiological environment during childhood (Komlos & Baten, 2004). In particular, anthropometric history found a niche in scholarship by incorporating the effects of environmental externalities, cyclical fluctuations, family resource distribution, societal level inequalities, and spatial disparities from historical records (see Floud & Wachter, 1982; Fogel et al., 1978; Fogel, Engerman, & Trussell, 1982; Friedman, 1982; Komlos, 1985, 1998; Margo & Steckel, 1982, 1983; Sokoloff & Villaflor, 1982; Steckel, 1979; Tanner, 1982; Trussell & Steckel, 1978). Many of McKeown's particular arguments about public health have been largely invalidated, but the legacy remains. Stiglitz (1976), picking up where Leibenstein (1957) left off, argued from an efficiency wage perspective that productivity depends (nonlinearly) on nutrition. Szreter (1988) argued that public health measures, especially clean water and improved sanitation, fundamentally reduce mortality and cause improvements in health outcomes throughout history. Others, such as Behrman and Deolalikar (1987), Bouis and Haddad (1992), and Bouis (1994), proposed from an Engel curve for calories perspective that increases in income will not result in substantial improvements in nutrient intake. However, Subramanian and Deaton (1996) argued that calorie elasticity is not zero, suggesting that sufficient daily calories can be readily purchased with only a small fraction of the daily wage. Fogel (1994; 2004) documented direct evidence for the importance of nutrition, connecting levels of calorie availability to their effects on health throughout history.
He postulated that understanding nutrition traps is the key to both improved health and economic development. Smith and Haddad (2000), from an aggregate cross-country perspective, take the broader view, suggesting that the main determinants of malnutrition are national income, poverty, education, and the state of the health environment. Under the chairmanship of Jeffrey Sachs, the WHO Commission on Macroeconomics and Health suggested that good health is a necessary (and possibly sufficient) condition of economic growth, which implies that improving health, and as a consequence stimulating economic growth, requires direct intervention through public health provisioning (WHO, 2001). However, Deaton (2003) concluded that there is no direct link from income inequality to ill-health. Deaton went a step further to emphasize the reinforcing interplay between disease and nutrition. He showed how nutrition traps are much easier to understand once disease is given its proper place in the story. Malnutrition compromises the immune system, while at the same time, disease prevents the absorption of nutrients. For example, giving more food to a malnourished child afflicted with severe diarrhea would not improve her health. As such, scientists, pediatricians, public health policy makers, and nutrition assistance programs need to carefully consider the many nuances of anthropometric modeling.

The new normal

In 2000, the US Centers for Disease Control and Prevention released a revised version of the National Center for Health Statistics growth charts, and recommended them for both clinical and research purposes to evaluate the growth status of children in the US (Kuczmarski et al., 2002). These Growth Charts are based on five nationally representative surveys administered between 1963 and 1994 (de Onis et al., 2007a, 2007b). The revised charts remedied the earlier problems of discontinuity and unrepresentative samples, and an internal evaluation found no systematic differences between the smoothed and empirical data. In a separate effort, the World Health Organization also concluded that the 1978 Growth Curves were inadequate (WHO, 2006a). As a result, the World Health Organization Multicentre Growth Reference Study was implemented between 1997 and 2003. The designers of the new Growth Reference were intentionally prescriptive rather than descriptive (i.e., they designed a reference for how children should grow rather than how children actually grow) (Garza & de Onis, 2004). In other words, it was purposely designed to produce a standard rather than a reference. Despite the fact that the National Center for Health Statistics Growth Curves and the revised Centers for Disease Control and Prevention Growth Charts are references, whereas the World Health Organization Multicentre Growth Reference Study is a standard, there are those who propose to compare the two and recommend one as a universally better tool (de Onis et al., 2006, 2007a, 2007b; Ziegler & Nelson, 2012).
Even as a standard, other studies find the Multicentre Growth Reference Study does not necessarily stand up (Bonthuis et al., 2012; Christesen, Pedersen, Pournara, Petit, & Júlíusson, 2016; de Wilde, van Dommelen, van Buuren, & Middelkoop, 2015; Heude et al., 2019; Júlíusson, Roelants, Hoppenbrouwers, Hauspie, & Bjerknes, 2011; Kêkê et al., 2015; Natale & Rajagopalan, 2014; Scherdel et al., 2015, 2016). Regardless, the Multicentre Growth Reference is the definitive international anthropometric “reference population.” The Multicentre Growth Reference Study (July 1997–December 2003) consists of both cross-sectional and longitudinal surveys from six cities: Davis, California, USA; Muscat, Oman; Oslo, Norway; Pelotas, Brazil; select affluent neighborhoods in Accra, Ghana; and South Delhi, India (WHO, 2006b). The distribution of children across the survey countries for the longitudinal component is: 119 USA; 149 Oman; 148 Norway; 66 Brazil; 227 Ghana; and 173 India. For a definitive global reference, the study is based on a rather small number of children. The distribution of children across the survey countries for the cross-sectional component is: 476 USA; 1438 Oman; 1385 Norway; 480 Brazil; 1403 Ghana; and 1487 India. Children were selected for inclusion based on: no known health or environmental constraints to growth, mothers willing to follow feeding recommendations (although only 20% actually did), no maternal smoking before and after delivery, single term birth, and absence of significant morbidity. Of the 13,741 children screened for the longitudinal survey, fewer than 7% (882 children: 428 boys and 454 girls) were eligible, compliant, and included in the final study. In addition, of the 21,520 children screened for the cross-sectional survey, fewer than 31% (6669 children: 3450 boys and 3219 girls) were eligible, compliant, and included in the final study.
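The inclusion figures above can be checked with simple arithmetic. The sketch below uses only the counts quoted in this section; the variable and function names are illustrative. It confirms that the site counts sum to the reported sample sizes and computes the share of screened children retained in each component.

```python
# Counts as quoted in the text (WHO, 2006b); names are illustrative only.
longitudinal = {"USA": 119, "Oman": 149, "Norway": 148,
                "Brazil": 66, "Ghana": 227, "India": 173}
cross_sectional = {"USA": 476, "Oman": 1438, "Norway": 1385,
                   "Brazil": 480, "Ghana": 1403, "India": 1487}
screened = {"longitudinal": 13741, "cross_sectional": 21520}


def retained_share(included: int, screened_n: int) -> float:
    """Fraction of screened children who ended up in the final sample."""
    return included / screened_n


long_total = sum(longitudinal.values())
cross_total = sum(cross_sectional.values())
long_share = retained_share(long_total, screened["longitudinal"])
cross_share = retained_share(cross_total, screened["cross_sectional"])

print(f"longitudinal: {long_total} included, {long_share:.1%} retained")
print(f"cross-sectional: {cross_total} included, {cross_share:.1%} retained")
print(f"excluded: {1 - cross_share:.0%} to {1 - long_share:.0%}")
```

The complements of the two retained shares give the range of screened children who were not included in the final samples.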
Notwithstanding the discontinuity problem seen in the 1978 Growth Curves, induced by a truncated longitudinal survey of children 0–24 months old, the longitudinal component of the Multicentre Growth Reference Study is an equally truncated survey of children 0–24 months old. Prior to constructing the standards, children more than 3 standard deviations above or below the sample median were excluded (WHO, 2006b). For the cross-sectional sample the truncation procedure was even stricter: children more than 2 standard deviations above or below the sample median were excluded. In other words, even though the study sought out the healthiest, most ideal population to measure, 69–93% of the healthy populace (i.e., a very large percentage of the actual population) did not conform to this ideal (Sandler, 2021). As such, the Multicentre Growth Reference Study is not representative of even a healthy population, much less a malnourished one. The initial Multicentre Growth Reference Study sample was not a standard normal (Gaussian) distribution. After the selective sampling and exclusion exercise, the sample was exceedingly skewed to the right (WHO, 2006b). To rectify the non-normality, the data were cleaved at the median. The values from each new dataset were then reflected across the median to create two symmetrical distributions. Fitting a normal distribution to the data using the LMS method (Cole & Green, 1992), each mirrored distribution was used to derive standard deviation cut-off values for the respective upper and lower portions of the data. This means that when a “population” effect or standard is described, most of the actual, non-statistical, real-world population distribution is fundamentally and structurally not represented.
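The reflection step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the WHO procedure itself: the LMS fitting is omitted, the weights are made-up toy values, the median observation is kept in both halves, and the helper name `mirrored_sd` is hypothetical. It shows how splitting a right-skewed sample at the median and mirroring each half yields a separate standard deviation, and hence a separate cut-off, for each side.

```python
import statistics


def mirrored_sd(values, median, side):
    """SD of one half of the data after reflecting it across the median.

    Assumption for this sketch: the median value belongs to both halves.
    """
    if side == "lower":
        half = [v for v in values if v <= median]
    else:
        half = [v for v in values if v >= median]
    reflected = [2 * median - v for v in half]
    # The mirrored set is symmetric about the median by construction,
    # so the median is used as the centre when computing the SD.
    return statistics.pstdev(half + reflected, mu=median)


# Toy, right-skewed weights in kg (hypothetical values, not study data).
weights = [2.0, 2.5, 2.8, 3.0, 3.1, 3.3, 3.6, 4.2, 5.0]
median = statistics.median(weights)

sd_lower = mirrored_sd(weights, median, "lower")
sd_upper = mirrored_sd(weights, median, "upper")

# Each side gets its own cut-off, here at 2 SD from the median.
lower_cutoff = median - 2 * sd_lower
upper_cutoff = median + 2 * sd_upper

print(f"median={median}, sd_lower={sd_lower:.3f}, sd_upper={sd_upper:.3f}")
print(f"cut-offs: {lower_cutoff:.2f} to {upper_cutoff:.2f}")
```

Because the toy sample has a longer upper tail, the upper mirrored distribution has the larger standard deviation, so the resulting cut-offs are asymmetric around the median.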
The population is a sum of individual identities and should provide a fluid denominator, comparator, context, and analytic space, yet now the population has come to define those very individuals (Armstrong, 2017). Despite its shortcomings and checkered heritage, the Multicentre Growth Reference remains the most ubiquitous and authoritative resource of its kind (Natale & Rajagopalan, 2014). Even the United States Centers for Disease Control and Prevention (CDC), which develops its own child growth charts, “recommends that clinicians in the United States use the 2006 WHO international growth charts, rather than the CDC growth charts, for children aged <24 months” (Grummer-Strawn et al., 2010). Only 47 countries have potential alternative growth charts to the Multicentre Growth Reference (Natale & Rajagopalan, 2014). Elsewhere, in countries where child malnutrition is most severe and country-specific child growth charts do not exist, the Multicentre Growth Reference remains the most relied-upon growth chart of its kind. WHO contends that its growth curves describe how all children should grow in all countries and that any deviations from its standards should be considered as evidence of abnormal growth (Garza & de Onis, 2004; WHO, 2006b). In the context of clinical nosology, Armstrong observed that “when classificatory systems and explanatory frameworks are in flux there is no Archimedean point from which to see things as they really are: neither causes nor reasons can have epistemological priority” (2011, p. 806). The statement aptly characterizes anthropometric evaluation as well. Chronicling the evolution of medical classification is rare and has not received the attention it deserves (Armstrong, 2011; Jutel, 2009).
Overlooking the legacy of a standard of “normality” in anthropometry could have profound consequences for contemporary etiological analyses of nutrition (e.g., Corsi, Mejía-Guevara, & Subramanian, 2016; Kim, Mejia-Guevara, Corsi, Aguayo, & Subramanian, 2017; Kim et al., 2019; Perkins et al., 2017). To uncover its implications, we should continue to interrogate contemporary manifestations of anthropometric ontologies. It is well beyond the reach of this or any other single paper to disentangle the historical strands and perform this sort of examination, although it would not be impossible given more time and space.

Author contribution statement

Austin Sandler is responsible for all conceptualization, methodology, validation, analysis, investigation, resources, curation, drafting, reviewing, and editing.

Ethical statement

No data was collected from human subjects as part of this research.

Financial disclosure

None. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of competing interest

None.

72 in total

1. The McKeown thesis: a historical controversy and its enduring influence.

Authors: James Colgrove
Journal: Am J Public Health Date: 2002-05 Impact factor: 9.308

2. The heights of slaves in Trinidad.

Authors: G C Friedman
Journal: Soc Sci Hist Date: 1982

3. The early achievement of modern stature in America.

Authors: K L Sokoloff; G C Villaflor
Journal: Soc Sci Hist Date: 1982

4. THE GROWTH OF CHILDREN.

Authors: F Boas
Journal: Science Date: 1892-12-23 Impact factor: 47.728

5. A big-data approach to producing descriptive anthropometric references: a feasibility and validation study of paediatric growth charts.

Authors: Barbara Heude; Pauline Scherdel; Andreas Werner; Morgane Le Guern; Nathalie Gelbert; Déborah Walther; Michel Arnould; Marc Bellaïche; Bertrand Chevallier; Jacques Cheymol; Emmanuel Jobez; Sylvie N'Guyen; Christine Pietrement; Rachel Reynaud; Jean-François Salaün; Babak Khoshnood; Jennifer Zeitlin; Jean Maccario; Gérard Breart; Jean-Christophe Thalabard; Marie-Aline Charles; Jérémie Botton; Bruno Frandji; Martin Chalumeau
Journal: Lancet Digit Health Date: 2019-11-07

6. Clinical and biochemical characteristics associated with anthropometric nutritional categories.

Authors: F L Trowbridge
Journal: Am J Clin Nutr Date: 1979-04 Impact factor: 7.045

7. New trends and approaches in the delivery of maternal and child care in health services. Sixth report of the WHO Expert Committee on Maternal and Child Health.

Authors:
Journal: World Health Organ Tech Rep Ser Date: 1976