
An initiative to develop capability-adjusted life years in Sweden (CALY-SWE): Selecting capabilities with a Delphi panel and developing the questionnaire.

Kaspar Walter Meili¹, Anna Månsdotter¹,², Linda Richter Sundberg¹, Jan Hjelte³, Lars Lindholm¹.

Abstract

INTRODUCTION: Capability-adjusted life years Sweden (CALY-SWE) are a new Swedish questionnaire-based measure for quality of life based on the capability approach. CALY-SWE are targeted towards use in cost-effectiveness evaluations of social welfare consequences. Here, we first motivate the measure both from a theoretical and from a Swedish policy-making perspective. Then, we outline the core principles of the measure, namely the relation to the capability approach, embedded equity considerations inspired by the fair-innings approach, and the bases for which capabilities should be considered. The aims were 1) to identify the most vital capabilities for individuals in Sweden, 2) to define a sufficient level of each identified capability to lead a flourishing life, and 3) to develop a complete questionnaire for the measurement of the identified capabilities.
MATERIAL AND METHODS: For the selection of capabilities, we used a Delphi process with representatives of Swedish civil society. To inform the questionnaire development, we conducted a web survey in three versions, each with 500 Swedish participants, to assess the distribution in the Swedish population of the capabilities that resulted from the Delphi process. Each version was formulated with a different degree of strictness, so that less strict wordings of a capability level would apply to a larger share of participants. All versions also included questions on inequality aversion regarding financial, educational, and health capabilities.
RESULTS: The Delphi process resulted in the following six capabilities: Financial situation & housing, health, social relations, occupations, security, and political & civil rights. We formulated the final phrasing for the questionnaire based on normative reasons and the distribution of capabilities in the population while taking into account inequality aversion.
CONCLUSION: We developed a capability-based model for cost-effectiveness evaluations of broader social consequences, specific to the Swedish context.


Year: 2022 PMID: 35134053 PMCID: PMC8824323 DOI: 10.1371/journal.pone.0263231

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

Overview

The aim of this paper is to describe a Swedish initiative to develop a new outcome measure for public health and social interventions, capability-adjusted life years (CALY-SWE), for use in economic evaluations [1]. The measure is based on Sen’s capability approach [2, 3], and the quality-adjusted life years (QALYs) methodology that is frequently used for economic evaluations in healthcare [4]. The paper is organized as follows. The background section motivates CALY-SWE from a Swedish policy perspective and outlines the historical development in health economics from utilitarianism and cost-benefit analysis to extra-welfarism and QALYs, and how this historic context relates to CALYs. Then, the principles of our measure are described, followed by two investigations. The first investigation was a Delphi panel with participants from the Swedish civil society, the purpose being to select the capabilities to include in the measure. The second investigation was a survey with Swedish participants in which we investigated the present distribution of capabilities in the Swedish population (n = 1,500) to inform the phrasing of the instrument. We aimed to formulate questions that measure each capability on three levels, where the highest level corresponds to having the opportunity of living a flourishing life. In the next section, the results from the two empirical investigations are merged, and we suggest a complete questionnaire, including phrasing. Finally, we discuss our proposal for a new outcome measure in relation to other attempts to develop capability-based measures.

Reasons for a capability-based wellbeing measure in Sweden

QALYs were established in the 1980s and have now spread globally as the primary outcome measure in economic evaluations in healthcare and public health. QALYs combine health-related quality of life and time into a single metric [4]. However, finding appropriate measures for policymakers to evaluate interventions with broader social consequences than health is challenging. In the Swedish context, the public sector has three tiers: national, regional, and municipal. National actors and regions are mainly responsible for healthcare and public health and tend to rely on QALYs, while municipalities are mostly responsible for welfare—for example, primary and secondary education, social care and elderly care. Because municipalities are involved with interventions with consequences primarily beyond population health, QALYs are a mostly useless currency, as they do not directly capture effects from relevant welfare areas. Furthermore, the theory of social determinants of health is important in public health. However, there is a tendency in public health to treat social living conditions mainly as a means to an end—better health. Mackenbach [5] judges the UK strategy from 1999 to be the most ambitious attempt in its history to reduce health inequalities. Minimum wages were increased, along with benefits and pensions. Spending on education, housing and urban regeneration was increased. From a public health point of view, the strategy failed, since the health inequalities persisted, which has also been the legacy. However, from a wellbeing perspective, the gains could perhaps have been worthwhile despite the costs. As far as we know, this broader picture has not yet been investigated. To capture broader social consequences, policy actors could in theory rely on cost-benefit analysis, applying willingness to pay. However, willingness-to-pay studies are not common in the evaluation of public health and social policies. 
Instead, in Sweden, a simple cost-saving approach is frequently used [6], which can hardly capture the complex social consequences of the interventions under consideration, such as long-term benefits of improved education. Costs and savings (that is, reduced costs), on the other hand, can be captured with a high degree of certainty. Therefore, these cost-saving analyses may lead to a bias towards resource-saving interventions, while interventions creating quality of life gains are given low priority.

From welfarism to extra-welfarism and capabilities

Welfarism has its roots in utilitarianism, developed by Bentham (1748–1832) and Mill (1806–1873) [7]. There are two pillars of utilitarianism: utility or welfare should be maximised, and actions should be assessed according to their consequences. We need to check whether a policy has good or bad consequences before we assess its value. These ideas have remained at the core of twentieth-century economics. In neoclassical welfare economics, it is assumed that individuals strive to maximize their utility, which is derived from goods and services consumed. Furthermore, individuals choose rationally, which means that they are able to consistently rank all choices at hand. Another important normative assumption is individual sovereignty, i.e. the individual is the only judge of his/her own welfare [8]. The conceptual origin of cost-benefit analyses stems from the tradition of welfarism. In principle, individuals are asked to evaluate any change in monetary terms. If the consequences are positive (winners), they are asked about the largest amount they are willing to pay to achieve this change. If the consequences are negative (losers), they are asked about the minimum compensation they need to accept the change. If the total amount winners are hypothetically willing to pay is enough to compensate the losers, the net is positive and the total utility in the society will increase. This is known as the potential Pareto criterion or Kaldor-Hicks criterion [9]. Comparisons between individuals are only possible by accepting potential Pareto improvement in the form of compensatory payments according to the Kaldor-Hicks criterion. Sen opposed welfarism and cost-benefit analysis, including the behavioural assumptions of rational utility maximization. He voiced his critique, for example, in his ‘Rational Fools’ paper from 1977 [10], and he went on to develop the capability approach.
Specifically, Sen proposed that the most important information to consider is capabilities [2]—i.e. whether an intervention increases individuals’ opportunities that they value for living a flourishing life. Capabilities denote the set of what individuals can be and what they can do. The individuals themselves should have freedom in how to shape their lives by choosing what capabilities they would like to realize, as opposed to measuring wellbeing based on individuals’ endowments or utility. Sen writes that capability is the freedom to achieve valuable functionings [11], and illustrates this by the difference between being hungry due to fasting versus starving. A contemporary example from Sweden would be the difference between voluntary retirement and forced retirement. In parallel to, and influenced by, the development of the capability approach, extra-welfarism was established in health economics. Extra-welfarism extends the information space beyond individual utility and allows for interpersonal comparisons. QALYs, for example, are based on extra-welfarist values, in that health is the underlying value to be maximized [9]. Brouwer et al. [12] made a comparison between welfarism and extra-welfarism, concluding that the latter accepts, and even strives for, broader outcomes and more collective decision-making. Furthermore, according to the extra-welfarist framework, individuals themselves, as well as experts or the general public, can value the outcomes; explicit equity weights can be added, and interpersonal comparisons are desirable. The most common interpretation of this extra-welfarist framework is that healthcare interventions should be judged by their impact on health, preferably in terms of QALYs. According to Cookson [13], the QALY measure displays clear characteristics of the capability approach, such as the focus on instruments with multiple attributes that correspond to functionings and approval of non-utilitarian values. 
For example, estimating QALYs by the well-established instrument EQ-5D implies considering dimensions of mobility, self-care, usual activities, pain/discomfort and anxiety/depression [14]. Capabilities, as an outcome measure in healthcare, elderly care or social services, have gained increasing attention during the last decade [1, 15–18]. While capability-adjusted life years have been suggested and implemented before [19–22], our approach includes equity considerations from the start and is focused on an amalgamation of the QALY approach and a set of capabilities covering vital aspects of a good life relevant for the Swedish policy context.
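The potential Pareto (Kaldor-Hicks) test described earlier in this section reduces to a simple comparison of sums. The following is a minimal sketch, not taken from the paper: the function names and the monetary amounts are hypothetical and serve only to illustrate the criterion.

```python
def kaldor_hicks_net_benefit(winners_wtp, losers_compensation):
    """Total willingness to pay of winners minus total compensation required by losers."""
    return sum(winners_wtp) - sum(losers_compensation)


def passes_kaldor_hicks(winners_wtp, losers_compensation):
    """True if winners could hypothetically compensate losers in full."""
    return kaldor_hicks_net_benefit(winners_wtp, losers_compensation) >= 0


# Hypothetical amounts (any currency unit), for illustration only.
winners = [300, 150, 50]   # largest amounts winners would pay for the change
losers = [200, 100]        # minimum compensation losers would accept

print(kaldor_hicks_net_benefit(winners, losers))
print(passes_kaldor_hicks(winners, losers))
```

Note that the test only requires that compensation be possible, not that it is actually paid, which is one reason the criterion says nothing about how gains and losses are distributed.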

Aims

As discussed, there is a need for a new broader measure, and we argue that the QALY approach merged with Sen’s capability approach would be suitable. Thus, the aims of this article are:
1) To identify the most vital capabilities for individuals in Sweden.
2) To define a sufficient level of each identified capability for leading a flourishing life.
3) To develop a complete questionnaire for the measurement of the identified capabilities.

Capability-adjusted life years (CALYs): Our proposal in principle

With the CALY-SWE measure, we propose to transfer the QALY concept of a weighted lifetime to a capability basis. The QALY has been shown to be very useful in decision-making all over the world, and our intention is to promote CALYs for decision-making guided by both efficiency and equity goals. Whereas CALY-SWE specifically denotes our proposed instrument for CALYs, CALY denotes the capability-weighted life year concept in general, analogous to the distinction between, for example, EQ-5D and QALYs. CALY-SWE weights are based on a descriptive system consisting of capabilities, each of which can be satisfied on three levels. The levels of each capability are ordinal and define the extent of a capability set. For each possible configuration of capabilities and levels, a weight between 0 and 1 is calculated that corresponds to an index for the overall quality of life in terms of capabilities. The conceptual anchoring of 1 and 0 and the corresponding justifications constitute another issue. We aim to anchor 1 to a capability set sufficient for living a life that is “flourishing” [23]. The concept of human flourishing goes back to Aristotle and has also been adopted by Martha Nussbaum [24–26]. Similar to QALYs, the weight 0 is given to a capability configuration equivalent to 0 lifetime. QALYs allow negative weights, that is, states judged as being worse than death [27, 28]. However, in this phase of development, we limit the scale to the range from 0 to 1. The normative assumption in mainstream health economics is the maximization of QALYs subject to budget constraints, following the utilitarian tradition. However, other normative assumptions have been proposed, and they are usually gathered under the label ‘trade-offs’ between efficiency and equity [29–31]. One version of this idea, known as “fair innings”, was suggested by Alan Williams [31]. Williams’ point of departure was lifetime QALYs for the UK population according to social position.
His data showed a well-known pattern: a low position was associated with relatively few QALYs and a high position with many. Williams suggested that a ‘fair innings’ (e.g. average lifetime QALYs) should guide resource allocation [31]: those above should get less and those below should get more. He writes: “It [the fair innings concept] reflects the feeling that everyone is entitled to some ‘normal’ span of health and anyone failing to achieve this has been cheated, whilst anyone getting more than this is ‘living on borrowed time’” [31]. A version of this idea, known as prioritarianism, also appears in contemporary philosophy. Prioritarianism holds that an increment in wellbeing is morally more valuable the lower (in absolute terms) the level of wellbeing from which this increment arises [32]. Wellbeing is summed over all individuals, but extra weight is given to the wellbeing of individuals who are worse-off. Interpretations of prioritarianism dominate the ethics in healthcare and public health. Severity of the condition is perhaps the most important criterion when ascribing priorities in healthcare. Norheim et al. suggest the principles of “greatest benefit” and “worst-off” [33]. Our intention is to follow in the footsteps of Alan Williams and other proponents of prioritarianism, particularly considering the interests of the worst-off. For each capability, we aim to define the sufficient capability level for a flourishing life. For this, we conducted a survey to measure the distribution of capabilities in the Swedish population, reported in the section ‘Setting sufficient capability thresholds in Sweden’. Some capability aspects can easily be measured on continuous scales, e.g. wealth in dollars and health in lifetime QALYs. If sufficient wealth for a flourishing life is decided to be 50,000 USD annually, any income above that threshold increases neither capabilities nor the quantity of CALYs.
In contrast, personal wellbeing, in terms of utility, and total societal welfare across all individuals may certainly increase when income grows if we undertake a utilitarian calculus. Thus, in the CALY calculation, only improvements below the threshold are counted. A crucial difference between wealth and health in Nordic/European welfare states is the rights the citizens are given. In the distribution of health care, need is the most important criterion, and even the poor and sick are principally offered the best possible care, which is clearly stated in the Swedish Health Care Law [34]. Most political parties in Europe support healthcare allocation purely according to need and disregard any association between health outcomes and social position. Regarding wealth, the normative thinking is different. The cash benefits available to people with no other income are at a low level. Income and wealth distributions are skewed to the right and have a very long tail [35, 36]. Deservedness plays a big role when society assesses the fairness of incomes, but less so in the fairness of health outcomes. Societal ambition is high when it comes to a health capability. The ideal distribution is a long and healthy life for all. To achieve this, we certainly need to allocate more resources to those with the poorest health, along the lines suggested by Williams. When it comes to wealth, the societal ambition for redistribution is low, or perhaps moderate. The ideal distribution may be quite scattered. Societies usually do not deny people earning millions or even billions of dollars, but financial success should be a private endeavour. Other capabilities, at least ideally, are dichotomous. All should have equal and complete political rights, and no one deserves to only get partial rights. However, in real life there exist levels between all or nothing. Research in political science shows that even democracies can be ranked [37]: some are complete and others partial.
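The two mechanics described in this section, truncation at a sufficiency threshold and weighting of lifetime by a capability index between 0 and 1, can be sketched as follows. This is an illustration under stated assumptions only: the function names are hypothetical, the 50,000 USD threshold is the example figure used in the text, and the weights in the usage lines are invented placeholders, since no CALY-SWE weights are reported here.

```python
def capped_at_threshold(value, threshold):
    """Only improvements below the sufficiency threshold count in the CALY calculation."""
    return min(value, threshold)


def capability_adjusted_life_years(periods):
    """Sum weight * years over (weight, years) pairs, by analogy with QALYs.

    Each weight must lie in [0, 1]; 1 is anchored to a capability set
    sufficient for a flourishing life (negative weights are not allowed here).
    """
    total = 0.0
    for weight, years in periods:
        if not 0.0 <= weight <= 1.0:
            raise ValueError("capability weights must lie between 0 and 1")
        total += weight * years
    return total


# Example threshold from the text: 50,000 USD annually.
print(capped_at_threshold(80_000, 50_000))   # income above the threshold adds nothing
print(capped_at_threshold(30_000, 50_000))   # income below the threshold counts in full

# Hypothetical weights: 10 years at full capability, then 4 years at weight 0.5.
print(capability_adjusted_life_years([(1.0, 10), (0.5, 4)]))
```

The cap makes the contrast with a utilitarian calculus concrete: a raise from 80,000 to 90,000 USD leaves the capped value unchanged, whereas a raise from 30,000 to 40,000 USD is counted.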

Which capabilities should be included?

Two different views have dominated this debate. Martha Nussbaum [24] suggested a list of dimensions with claims to be universally valid (length of life; health; bodily integrity; senses, imagination and thought; emotions; practical reasons; affiliations; concern for other species; play; and control over one’s environment). Sen never suggested any definite list of capabilities, because he was not willing to anticipate the processes and decisions that are necessary for each context [38]. We agree, and feel that the choice of capabilities must be tailor-made for a particular setting, although framed by universal principles. We believe Sen and Nussbaum developed their visions of the capability approach mainly from the perspective of low-income countries. In high-income democracies like Sweden, it is possible to distinguish different clusters of capabilities: those already being equally distributed (e.g. formal political freedom); those offering opportunities to those in need or who have ambitions (e.g. healthcare and free schooling); those meeting vague distributional obligations (e.g. housing); and those being quite far from societal decision-making (e.g. emotions and leisure). Yet societies are seldom neutral, neither in the proportion of the population that votes or finishes education, nor in the distribution of income or pleasure. This is most clearly expressed regarding employment. A high employment rate (including for women) is a pillar in the welfare state [34]. One important point of departure in our research project is to develop a measure useful for public decision-making, and this gives rise to two considerations: A) Which capabilities are most appropriate for public decision-making? B) Which capabilities vary most between individuals in Sweden today? Regarding A, we think there are two different angles to consider. One angle is normative: many people would like to have a private sphere in which public interventions are not desired.
The other angle is whether public interventions can achieve a certain goal. Love is important in life, but it is not clear if public intervention would be a good idea from a normative point of view, or even effective in terms of benefiting people who feel unloved. Correspondingly, regarding B, formal political rights and free access to nature are not good candidates in Sweden, because a large majority of inhabitants have these capabilities to a level that does not hinder a flourishing life. Incidentally, the national parliament recently initiated an investigation of possible measures of quality of life in Sweden. The mission was assigned to a professor in sociology, Robert Eriksson, whose expert report suggests capabilities as the best measure of wellbeing in Sweden. The report [39] suggested ten essential capabilities for wellbeing in Sweden: balance of time; economic resources; mental and physical health; political resources; knowledge and skills; sound living environment; employment/commitment; social relations; feeling of security; and safe housing. This was the starting point for our purpose, but the next step was to organize a Delphi panel to select particularly relevant capabilities. The panel and its choices are reported in the next section.

The Delphi panel and the choice of capabilities

Procedure

Methods for the selection of capabilities have varied. Both expert-led and participatory approaches have been suggested. Nussbaum’s famous list is not strictly empirically based. Researchers in the UK have tried participatory approaches and used qualitative interviews, focus groups and postal surveys [17, 40, 41]. We suggest a further variant, and have worked together with fair-minded people in a Delphi process [42] to achieve consensus on a specific question or topic through collaborative decision-making. The process takes place in several rounds in an iterative approach, and information from previous steps is communicated anonymously to participants through a central coordinator or facilitator [43]. The Delphi method is of particular use in the evaluation and assessment of areas in which it is generally difficult to reach a common view or agreement [44]. Most importantly, the Delphi method is useful for exposing priorities of personal values or social goals on a collective basis [44]. The participants of the Delphi panel were selected through a nomination process that was based on the idea of ‘fair-minded people’, a concept introduced by Norman Daniels [42]. ‘Fair-minded people’ refers to individuals who agree on mutually justifiable terms of cooperation, and “who want to play by agreed-upon rules and prefer rules that are designed to bring out the best in that game”. The participants were also assumed to have experience of the subject and be willing to dedicate time to the Delphi process (cf. Ogbeifun et al. [43]). We thus invited 22 national (mainly non-profit) organizations (Table 1) to nominate two delegates each (one woman and one man) to take part in our panel. The selection of organizations was influenced by the United Nations Agenda for Sustainable Development for 2030 [45]; we tried to cover individual goals such as no poverty and good health, in the sense that at least one of the organizations had their main activities in these areas.
To cover the goal of decent work and economic growth, we invited unions and one enterprise organization, despite these hardly meeting the non-profit criterion.

Table 1

Actors from the Swedish civil society who participated in the Delphi panel.

Name of organisation	Related SDG goals	Answered	Participated
Amnesty	16: Peace, Justice and Strong Institutions	No
Children’s Rights in Society	1: No Poverty; 5: Gender Equality	Yes	Yes
Crime Victim Support Sweden	16: Peace, Justice and Strong Institutions	Yes	Yes
The National Association for Cancer Patients	1: No Poverty; 3: Good Health and Well-being	No
Disability Human Rights	1: No Poverty; 3: Good Health and Well-being; 8: Decent Work and Economic Growth; 16: Peace, Justice and Strong Institutions	Yes	Yes
The Swedish Network of Refugee Support Groups	1: No Poverty; 16: Peace, Justice and Strong Institutions	No
Islamic Association in Sweden	1: No Poverty; 16: Peace, Justice and Strong Institutions	No
National Organization for Pensioners	1: No Poverty; 3: Good Health and Well-being; 16: Peace, Justice and Strong Institutions	No
Swedish Pensioners’ Association	1: No Poverty; 3: Good Health and Well-being; 16: Peace, Justice and Strong Institutions	No
Comrade Association of Former Criminals	4: Quality Education; 16: Peace, Justice and Strong Institutions	No
Fryshuset Global (supporting young people)	16: Peace, Justice and Strong Institutions	No
The Swedish Federation for Lesbian, Gay, Bisexual, Transgender, Queer, and Intersex Rights	3: Good Health and Well-being; 16: Peace, Justice and Strong Institutions	Yes	Yes
Red Cross	1: No Poverty; 2: Zero Hunger	Yes	Yes
Save the Children	1: No Poverty; 2: Zero Hunger; 4: Quality Education	No
The Swedish Trade Union Confederation	8: Decent Work and Economic Growth	Yes	Yes
The National Organization for White Collar Workers	8: Decent Work and Economic Growth	Yes	No
The Swedish Confederation of Professional Associations	8: Decent Work and Economic Growth	Yes	Yes
Confederation of Swedish Enterprise	8: Decent Work and Economic Growth	No
Salvation Army	1: No Poverty; 2: Zero Hunger	Yes	Yes
The Swedish Council for Information on Alcohol and Other Drugs	3: Good Health and Well-being; 16: Peace, Justice and Strong Institutions	No

SDG, Sustainable Development Goals.

Some of the goals are more individual and others more collective. Access to clean water, sustainable electricity and combat of climate change are in our terminology collective. Deteriorations affect more or less the whole population and improvements benefit all. Furthermore, there are equity goals, in line with the thinking in our model. Individual goals are: 1. No poverty; 2. No hunger; 3. Good health and wellbeing; 4. Quality education; 8. Decent working conditions and economic growth; 16. Peace, justice, and strong institutions.

To introduce the panel members to the research project, to what a Delphi panel entails, and to their role in it, they were invited to a start-up day in Stockholm, where representatives of nine organizations took part. Our research team presented the project, the Delphi process, and the role of the panel. The members in the panel agreed to act like fair-minded people, using the following definition: “Fair-minded people are not selfish but try to act in the best interest of others; they listen to and think about others’ arguments, and they take time to reflect before they take a standpoint.” The procedure in a Delphi panel was explained, and the members were also encouraged to have a dialogue with their partners before answering. In the first round, the task for the panel was to rank ten capabilities suggested in a Swedish public investigation in 2015, where 1 was the most important and 10 the least important. The ten capabilities were: financial situation, health, education and skills, occupation, social relations, residence, security, time balance, local community, and political rights. The panel was invited to suggest further capabilities if they felt anything important was lacking. They were also encouraged to explain their rankings in free text. To evaluate the first-round rankings, we calculated the median and mean.
We also counted the number of times a certain capability was ranked among the top five. The last method requires no interval properties.
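The three summaries used to evaluate the first-round rankings (median, mean, and the number of top-five placements) can be sketched as below. The function name, the data layout, and the three-member example panel are hypothetical; the actual panel responses are not reproduced here.

```python
from statistics import mean, median


def summarize_rankings(rankings, top_k=5):
    """Summarize ranks per capability.

    rankings: list of dicts, one per panel member, mapping
    capability name -> rank (1 = most important).
    """
    summary = {}
    for capability in rankings[0]:
        ranks = [member[capability] for member in rankings]
        summary[capability] = {
            "median": median(ranks),
            "mean": mean(ranks),
            # Counting top-k placements uses only the order of the ranks,
            # so it needs no interval-scale assumption.
            "top_k_count": sum(1 for rank in ranks if rank <= top_k),
        }
    return summary


# Hypothetical mini-panel: three members ranking three capabilities.
panel = [
    {"health": 1, "occupation": 2, "time balance": 3},
    {"health": 1, "occupation": 3, "time balance": 2},
    {"health": 2, "occupation": 1, "time balance": 3},
]

for capability, stats in summarize_rankings(panel, top_k=2).items():
    print(capability, stats)
```

The mean treats the distance between ranks 1 and 2 as equal to that between ranks 9 and 10, which is why the count-based summary is a useful cross-check.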

Outcome

After the first round (16 out of 18 answered), the panel strongly agreed to include: health, social relations, financial situation, and residence. They strongly agreed to exclude: education and skills, time balance and local community. The members gave different rankings for occupation, security and political rights. At the start of the second round (12 answered), the results from the first round were presented, and the panel was given the following new task: to rank occupation, security and political rights (i.e. the capabilities where there were diverging rankings). The research team suggested that financial resources and residence should be merged into one capability because the two are closely related. People having financial resources are able to find a proper residence. We asked the panel if they would support this change and they agreed. Finally, we asked the members to confirm the inclusion of health, social relations, and financial situation including residence. The second round did not come closer to a consensus regarding occupation, security and political rights. They still got almost the same levels of support. However, a large majority of the members approved the merger of financial situation and residence. They also confirmed the inclusion of health, social relations and financial situation (including residence). We decided not to initiate a third round because the assessment was that this would not increase consensus among the participants. Instead, the number of capabilities included was increased to a total of six. The capabilities selected by the panel were: health, social relations, financial situation including residence, occupation, security, and political rights.

Setting sufficient capability thresholds in Sweden

Mitchell et al. [19] define a sufficient threshold “as the level of capability at or above which a person’s level of capability wellbeing is no longer a concern for policy”. Instead, public policies should focus on helping all to reach this level, and in particular to lift those worst-off. However, in the general welfare state those beyond the threshold are still entitled to benefits such as free schooling or subsidized health care. If they were exempted from these benefits, many might drop below the sufficient level. How to set concrete thresholds that are useful in policymaking is not a well-researched topic. We think Williams’ ‘fair innings’ approach can bring insights [31]. Williams investigated lifetime QALYs in different social groups to suggest a fair innings. Similarly, we aim to measure the distribution of capabilities in the Swedish population to set a threshold. Additionally, when considering both efficiency and equity, some people are willing to trade off efficiency in exchange for a more equitable distribution, as in the landmark paper by Atkinson [46]. We considered inequality aversion in the population, which may vary between life spheres.
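One common way to formalize inequality aversion in the tradition of Atkinson is the Atkinson index, in which a parameter (here called epsilon) expresses how much weight is placed on the lower end of a distribution. The sketch below is offered as background only and assumes the standard textbook form of the index; it is not the method used for the survey questions in this study, and the function name is hypothetical.

```python
import math


def atkinson_index(values, epsilon):
    """Atkinson inequality index for strictly positive values.

    epsilon >= 0 is the inequality-aversion parameter: 0 means no
    aversion (index is 0), larger values weight the worst-off more.
    """
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    n = len(values)
    mean_value = sum(values) / n
    if epsilon == 1:
        # Limiting case: equally distributed equivalent is the geometric mean.
        equivalent = math.exp(sum(math.log(v) for v in values) / n)
    else:
        power = 1 - epsilon
        equivalent = (sum(v ** power for v in values) / n) ** (1 / power)
    return 1 - equivalent / mean_value


# Hypothetical two-person distribution, for illustration only.
distribution = [1, 4]
for eps in (0, 1, 2):
    print(eps, atkinson_index(distribution, eps))
```

The index can be read as the share of the total that could be given up, with the remainder equally distributed, without lowering social welfare; a more inequality-averse evaluator is willing to give up more.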

Distribution of capabilities in the population

With the goal of setting a fair innings threshold for the phrasing of the CALY-SWE questionnaire, we conducted a web survey in June 2020 among 1,500 participants from the general Swedish population using a commercial web panel [47]. The aims were: i) to assess the distribution of the capabilities held in the population; ii) to explore the impact of different wordings in the capability statements; and iii) to investigate the degree of inequality aversion regarding income, education, and health in the form of life expectancy. The Swedish Ethical Review Authority approved the study with an advisory statement (Dnr 2019–02848) and participants consented to participate electronically. We excluded answers from participants who stated an age below 18 (ethical concerns) or over 99 (data quality concerns). Detailed methods, participants’ characteristics, limitations, and Swedish question phrasings including English translations, are available in S1 File. The design of the study is summarized in Table 2. Three samples of about 500 persons each, similar regarding sex, age and education, were drawn (A, B, and C). We deemed a sample size of 500 per version sufficient because, at a 5% significance level, a two-sample test for proportions detects a difference in proportions of 0.1 with at least 80% power. The versions differed in how the extent of the capability was described. The A version used the wording of “always” having the capability; the B version used the wording “almost always”; and the C version used the wording “mostly”. The B and C versions also had an additional descriptive clause that specified the extent of “almost always” and “mostly”. Participants needed to select either “completely agree”, “partially agree”, or “do not agree at all” for all statements.
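The sample-size statement can be checked with a normal-approximation power calculation. The sketch below makes several assumptions not stated in the text: a two-sided z-test, equal group sizes, and proportions of 0.45 and 0.55, chosen because proportions near 0.5 have the largest variance and are therefore close to the least favourable case for a difference of 0.1. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for proportions.

    Uses the normal approximation with equal group sizes and ignores the
    negligible probability of rejecting in the wrong direction.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    # Standard error under the null hypothesis (pooled proportion).
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    # Standard error under the alternative hypothesis.
    se_alt = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    z = (abs(p1 - p2) - z_alpha * se_null) / se_alt
    return std_normal.cdf(z)


# Assumed proportions 0.45 vs 0.55 (difference 0.1), 500 participants per version.
print(round(power_two_proportions(0.45, 0.55, 500), 3))
```

Under these assumptions the approximate power comes out above the 80% target, which is consistent with the statement in the text; for proportions further from 0.5 the same difference is detected with higher power.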

Table 2

Design of distribution study.

Version and phrasing of capability extent (sample size)	Answer options
A: “Always” (n = 497)	a. Completely agree
	b. Partially agree
	c. Do not agree at all
B: “Almost always” (n = 503)	a. Completely agree
	b. Partially agree
	c. Do not agree at all
C: “Mostly” (n = 505)	a. Completely agree
	b. Partially agree
	c. Do not agree at all

Aim i) was to explore the distribution of capabilities. In our model, those who “completely agree” already have sufficient capability and those who “do not agree at all” are the “worst-off”. The intention behind “worst-off” is to identify a small group that has the poorest living conditions. For the validity and practical usability of CALY-SWE, the size of this group should be limited. If the proportion were, say, 50%, the term “worst-off” would lose its meaning. A correct size for the proportion of “worst-off” may not be determinable, but a reasonable proportion may be five to ten percent. To be useful in policymaking, the highest capability level (“completely agree”) should not already be achieved by a large majority, but should nevertheless be attainable for a majority. If too many had achieved the highest level, it would be difficult to evaluate reforms that affect the whole distribution, make cross-sectional comparisons, or follow population trends over time.

Aim ii) was to investigate whether the difference in wording between A, B, and C influenced the answer pattern. Because the stricter wording of A (“always”) implies a higher threshold for a “completely agree” answer than B (“almost always”) and C (“mostly”), we expected a higher proportion of “completely agree” answers for C compared to B and for B compared to A. If this expectation was not supported, we would conclude that this way of fine-tuning the phrasing was not meaningful.

The most common answer among participants was “completely agree” followed by “partially agree” for all capabilities except political resources, where most participants answered “partially agree” followed by “completely agree” (Fig 1).

Fig 1

Distribution of answer alternatives in different wording versions, with bootstrapped 95% confidence intervals.

Significant differences with p < 0.05 in a z-test for difference in proportions are marked with*.

The proportion of “do not agree at all” answers, corresponding to the “worst-off”, was stable at around 10 per cent across versions A, B, and C; changing the frequency term in the statements from “always” (A) to “mostly” (C) had no impact on it. The proportion of “completely agree” answers, representing the part of the population that has reached the threshold of sufficient capabilities, was around 50 per cent. The expected increase of “completely agree” answers for the versions with “almost always” and “mostly” wordings occurred for health, financial situation & housing, occupation, and security, where the proportion was significantly higher for C compared to B, for C compared to A, or for B compared to A. The answer patterns for social relations and political rights did not correspond with the expected increase of “completely agree” answers in the B and C versions.
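For readers who want to reproduce this type of comparison, the following sketch shows a pooled two-proportion z-test and a percentile bootstrap confidence interval. It is an illustration under assumptions, not the analysis code of the study: the function names are hypothetical, and the counts of “completely agree” answers (230 and 275) are invented for the example. Only the sample sizes (497 for version A, 505 for version C) are taken from Table 2.

```python
import random
from math import erf, sqrt


def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference in proportions (pooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


def bootstrap_ci(successes, n, n_boot=2000, seed=1):
    """Percentile bootstrap 95% confidence interval for a single proportion."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    data = [1] * successes + [0] * (n - successes)
    props = sorted(sum(rng.choices(data, k=n)) / n for _ in range(n_boot))
    return props[int(0.025 * n_boot)], props[int(0.975 * n_boot) - 1]


# Hypothetical counts of "completely agree" answers; sample sizes from Table 2.
z, p_value = two_proportion_z_test(230, 497, 275, 505)
ci_low, ci_high = bootstrap_ci(230, 497)
print(f"z = {z:.2f}, p = {p_value:.4f}")
print(f"Version A proportion: {230 / 497:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

With these invented counts the difference would be marked as significant at p < 0.05. The actual proportions and intervals are those shown in Fig 1 and S1 File.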

Inequality aversion

We asked participants about the current Swedish distribution in life expectancy, income, and education. Participants could state that the difference was too large or was acceptable, and for income also if it was too small. Detailed phrasings are available in S1 File. There was a large difference in inequality aversion for income compared to education and life expectancy as a proxy for health. About two thirds of participants judged the differences for education and health to be too large, in contrast to income, where only one third found the difference to be too large (Fig 2).

Fig 2

Distribution of answers in inequality aversion questions with 95% confidence intervals.

Answer alternative ‘Difference too small’ not available for education and health.

In the next section, ‘Capability levels: Phrasing the statements’, we use these findings to phrase the CALY-SWE statements.

Capability levels: Phrasing the statements

In this section, we suggest how to phrase the level of a capability sufficient for a flourishing life. We discuss, in light of the theory of public goods, whether it is meaningful to set an explicit threshold for all capabilities. We try to consider all relevant arguments of which we are aware. In particular, we use input from the Delphi process and the study of the distribution of capabilities in the population sample. Alongside these two investigations, we consider Swedish laws and policies, and normative ideas in general, which we judge to be of relevance.

Health: Based on the input from the Delphi panel, we consider physical and mental health as one of the most (maybe the most) important capabilities. The population survey indicated a pronounced inequality aversion. The threshold should facilitate a high and equitable level of health for everybody. The Healthcare Act [48] and other policy documents [49] point in the same direction. However, “always” having the health to do what one wants may be unrealistic and would certainly be extremely demanding of resources, which would complicate the realization of better health for those worst-off, as well as the realization of other capabilities. Therefore, we think the formulation “almost always” is the most reasonable “fair innings”.

Financial situation and housing: Compared to health and social relations, a high degree of equality in financial resources may be less relevant for policymaking, as differences in incomes and accommodation are more acceptable. The population survey indicated a mild inequality aversion. From a global perspective, the living standard in Sweden is very high. Considering climate change and sustainable development, it is desirable to reduce the average consumption of goods and services in this country. Yet there are poor groups, especially among immigrants, single parents, and some segments of pensioners. As reported earlier, around 10 per cent in our study answered “do not agree at all” in each of the versions. The proportion of “completely agree” answers was higher in the “mostly” (C) version than in the “always” (A) and “almost always” (B) versions, indicating an effect of the phrasing difference. We chose “always” for accommodation and “mostly” for financial situation as the capability threshold, i.e. the fair innings.

Occupation: This capability has the highest degree of reciprocity. Work is the foundation of the welfare state and a widespread view is that all should contribute according to their capacity. However, high unemployment rates are a big problem, particularly among young people and people with poor education. From the worst-off angle, unemployment certainly dominates. As for financial situation and accommodation, the wording made a difference, in that “partially agree” proportions were lower and “completely agree” proportions were higher in the “mostly” (C) version compared to the “always” (A) and “almost always” (B) versions. Summing up the arguments above, we think “mostly” is a reasonable societal ambition, i.e. a fair innings. The first priority in policy in this area must be to increase opportunities for the unemployed to get a job, or for those wanting a meaningful activity (mainly retired or long-term ill people).

Security and political rights have in common that these capabilities can mainly be increased by the provision of public or social goods (or services), that is, goods that are both non-excludable and non-rivalrous [50, 51]: individuals cannot be excluded from use, or can benefit without paying, and use by one individual does not reduce availability to others, so the good can be used simultaneously by more than one person. Security is a classic example. More police officers in a certain area benefit all living there. The same goes for measures aimed at preventing crime. Similarly, one person’s exercise of political rights does not reduce any other person’s opportunity to exercise his or her rights. Social relations are a kind of ‘micro social good’ [52] in the sense that improved relations for one individual correspond to improved relations for at least one other individual.

The answer patterns regarding political and civil rights and social relations indicated that the frequency wordings may not have been well understood, and proportions did not differ significantly between versions. For security, the differences in answers between the versions corresponded to the theoretical expectation, with a higher proportion of “completely agree” in the “almost always” (B) and “mostly” (C) versions compared to the “always” (A) version. However, tolerating some degree of unsafety related to violence and crime may collide with Article 3 of the Universal Declaration of Human Rights, which guarantees security of person [53], or with articles 1 and 2 of the Swedish constitution, which proclaim equal treatment and bodily integrity of individuals [54]. Therefore, for political and civil rights, security, and social relations, we rejected all three tested versions. Instead, our phrasing does not include any term for how frequently the capability is available. In the case of political and civil rights, we use plural wording to underline the nature of political decisions. These are typically laws or policies which treat all individuals the same. “We count everyone as one, no one for more than one,” as Bentham and Mill argued regarding aggregation of utility [55, 56]. The finalized questionnaire is available in S2 File.

Discussion

The main result of this research is a capability-based model for the economic evaluation of public health and social policies in Sweden. The intention is to support public decision-making and thereby contribute to an increase in wellbeing in society. The ethical foundation is to consider both equity and efficiency goals, giving particular weight to the interests of the worst-off, and to seek a compromise that can be defended by relevant and convincing arguments.

We judge our proposal to be a logical extension of the cost per QALY approach, which is now common in healthcare and public health all over the world. As Cookson [13] argues, the QALY is an application of Sen’s capability approach. The boundary between health-related quality of life and global quality of life is fluid. Well-known QALY measures include, for instance, ability to work (EQ-5D) [14, 57] and social relations [58]. Thus, there is a continuous transition from pure health measures to general wellbeing measures, rather than a distinct difference.

This paper presents a complete questionnaire, which can itself be used for evaluations and population surveys. We recently used the questionnaire for an evaluation of the COVID-19 pandemic [59]. We are also planning to develop a weight tariff using time trade-off and discrete choice evaluation questions.

In Table 3, we compare Nussbaum’s [20] list of capabilities with the Swedish list. A first observation concerns the language style: Nussbaum’s language is more poetic and the Swedish more prosaic. However, once this difference is set aside, there are many similarities.

Table 3

Relation of CALY-SWE dimensions to Nussbaum’s list of central capabilities.

CALY-SWE dimension	Nussbaum’s central capability (number)	Description
Health	1, 2	A life of normal length and good health.
Financial situation	10B	Being able to hold property. Having the right to seek employment.
Social relations	5	Being able to have attachments to things and people outside ourselves. Being able to live with and towards others.
Political	10A	Being able to participate effectively in political choices.
Security	3	To be secure against violent assault.
Occupation	10B	Being in work, being able to work as a human being. Play for retired people.

All capabilities in the Swedish list are in Nussbaum’s list, but not vice versa. Regarding material capabilities, the commitments in the Swedish list are more far-reaching. For example, Nussbaum wrote “having the right to seek employment” [24]. The corresponding wording in the Swedish list puts a threshold on the function: the right (and duty) to have a job. The same is true for residence: the right to have a permanent residence. We think that differences in phrasing are adaptations to the different settings, in our case the Swedish welfare state. All already have the right to seek employment, and thus we need to be more specific and committed. Our statement is rather about a function: whether, most of the time in previous years, a person has held an occupation they are quite happy with. We also argue there is a moral difference between being able to work but not taking a job in real life, and having a job. All share the benefits of the welfare state, such as free education and healthcare, and should feel an imperative to share the burden of producing goods and services and contributing to the welfare state.

Another capability that clearly needs to be adapted to different settings is political rights. We can find many democracies classified as satisfactory (Sweden is one), where people formally hold reasonable rights [37]. However, the extent to which people exercise these rights varies considerably, and one important factor may lie in the political system [60, 61].

We have chosen to label our measure CALY-SWE, and the reason is, of course, to stress the dependency on setting. As discussed above, the result of our investigation is a quite general set of capabilities. We thus believe they would be valid in neighbouring countries such as Norway, Denmark, and Finland, and probably also in other countries in Northwest Europe. On the other hand, the latter countries have a much higher population density, which perhaps influences the choice of capabilities.
We hope that future research will bring more clarity to this question.

Is there a conflict between prioritizing the worst-off and the welfare state’s general policies?

A symbol of the general policies in Sweden has been child benefit for all, even those families with the highest incomes. Advocates of this policy acknowledge that the material gains for families with high incomes are negligible. Instead, the main reason for the policy is that it contributes to trust and fairness [62] if all taxpayers are also generally eligible for all the benefits in the welfare state. In our model, child benefits to families beyond the threshold in the capability “financial situation” would not contribute to further capability or to more CALYs. However, if the advocates of the welfare state are right, our “political rights” capability would be sensitive to the child benefits scheme. It has been argued [62] that high levels of trust are a condition for the welfare state, perhaps even a consequence of it. Healthcare is distributed according to need, and almost free of cost at the time of consumption. Healthcare is mainly financed by taxes, and resources are redistributed over the lifespan. There is even a redistribution between the healthy and the ill in the population. As stated above, healthcare in the welfare state already applies a worst-off criterion.

Experts, fair-minded people, and population surveys

The development of CALY-SWE has utilized the knowledge and perspectives of experts, citizens, and a hybrid of those groups: fair-minded people. The purpose of this mix was to get advice from advocates of objective wellbeing as well as subjective wellbeing. The list of ten capabilities we used as the point of departure was clearly a work of experts. The fair-minded people, however, were not experts, but rather wise laymen given the opportunity to reflect on the topic and take part in a dialogue over some weeks. They were instructed to act in the public interest and not to base their positions mainly on their personal preferences. They were asked to imagine the interests of others and to try their best to satisfy the interest of the public majority.

We considered two alternative approaches to the Delphi panel we undertook. One was to use a panel design, but to let political parties nominate the members. This design would link very well to a tradition where the parties nominate, for example, jury members. The reason for not choosing this approach was a fear that current political disagreements would colour the work of the panel. Furthermore, we also believe that our panel of fair-minded people has deeper insight into disadvantaged people’s living conditions.

The other approach we considered was a population survey. However, a survey among researchers and doctoral students in health-related medicine [63] pointed to problems with this approach. Firstly, we could not reasonably expect the respondents to use more than about 30 minutes to answer our questions, and that meant an unreasonable imbalance between the task and the opportunity given for solving it. Secondly, population survey respondents may be guided by their personal preferences, which would lead to a purely subjective model. Thirdly, participants in a one-time cross-sectional survey would not be able to respond to the input of other survey participants, and thus a consensus as in the Delphi panel would not have been possible.
The result from the panel can be compared with a previous pilot study. The aim of that study was to investigate whether it was possible to rank the capabilities included in the Swedish parliament report [39]. The participants were researchers or PhD students, mostly in public health. The main finding was that most of the respondents managed to rank the capabilities included in the set. The results suggest that participants deemed health to be most important, followed by social relations and financial situation. Knowledge, occupation, time, security, political resources, housing, and living environment were ranked lower, and the exact order depends on the metric used to synthesise the individual rankings.

Forming a Delphi panel with fair-minded people is a rather novel approach. The usual Delphi panels consist of experts with formal merits in relation to the panel topic. Our approach will be evaluated in a coming study.

CALYs, like QALYs, require weighting of each set of capabilities. Our intention is to estimate the weights in population surveys. In previous surveys [64], the respondents assigned higher importance to the step from “not agree” to “partly agree” than to the step from “partly agree” to “completely agree”. We interpret this pattern as a particular concern about the worst-off.

CALY-SWE compared to other proposals

There are several other instruments based on the capability approach. We provide a non-exhaustive overview of the main approaches and how they compare to CALY-SWE in Table 4. A recently published review [15] provides a more comprehensive overview of the field.

Table 4

Comparison of capability-based instruments.

	ICECAP-A	ICECAP-O	ICECAP-SCM	OxCAP, OCAP-18, and OxCAP-MH	ASCOT	CALY-SWE
Choice of dimensions	Qualitative interviews	Qualitative interviews	Qualitative interviews	Indicators mapped to Nussbaum’s list	Building on previous measures, qualitative interviews, literature review, Delphi process	Delphi process among civil society representatives
Purpose	Economic evaluation in general population	Economic evaluation of health and social care interventions in elderly aged 65+	Economic evaluations for individuals at end-of-life	General population (OxCAP-MH: mental illness)	General population	1) Economic evaluation; 2) Mapping living conditions over time and between areas
Dimensions	Stability, attachment, autonomy, achievement, enjoyment	Attachment, security, role, enjoyment, control	Choice, love and affection, physical suffering, emotional suffering, dignity, being supported, preparation	Life, bodily health, bodily integrity, senses, imagination and thought, emotions, affiliation, other species, play, control over one’s environment	Personal cleanliness and comfort, accommodation cleanliness and comfort, food and drink, safety, social participation and involvement, occupation, control over daily life, dignity	Financial situation and housing, health, social relations, occupation, security, and political and civil rights
Valuation method	Best–worst scaling	Best–worst scaling	N/A	N/A	Best–worst scaling	Time trade-off and discrete choice
Key references	[16, 65]	[40, 66]	[67, 68]	[17, 69, 70]	[18]	[1]

The ICEpop CAPability measure for adults (ICECAP-A) is a measure of capability wellbeing for economic evaluation aimed at the general adult population [2]. It includes the dimensions of stability, attachment, autonomy, achievement, and enjoyment, selected through qualitative interviews with English informants. The developed questionnaire with four levels per dimension was scored using best–worst scaling [65]. ICECAP-A builds on the ICEpop CAPability measure for older people (ICECAP-O), also featuring five dimensions of attachment, security, role, enjoyment and control, with four levels that were selected by analysing qualitative data [40] and evaluated using best–worst scaling. The ICECAP-Supportive Care Measure (ICECAP-SCM) focuses on quality of life at the end of life [67, 68]. Another family of capability-based measures includes the Oxford CAPability (OxCAP), Oxford CAPability 18 (OCAP-18), and the Oxford CAPability Mental Health (OxCAP-MH) instruments [8-10]. The selection of dimensions was also based on qualitative data and guided by Martha Nussbaum’s list of capabilities [24]. The adult social care outcome toolkit (ASCOT) was developed to offer a tool comparable to QALYs for social care. It features the eight dimensions of personal cleanliness and comfort, accommodation cleanliness and comfort, food and drink, safety, social participation and involvement, occupation, control over daily life, and dignity, each on four levels. This selection was based on previous research and the perspective of researchers and users of social care. Tariffs were estimated using best–worst scaling [11]. ASCOT carer is a related version for measuring social care-related quality of life from the perspective of caregivers [71]. In comparison, CALY-SWE focus on the Swedish context, policy relevance, and a paternalistic selection of capabilities in a Delphi process involving fair-minded people. 
Thresholds for policy-relevant capability levels have also been investigated and operationalized using the ICECAP measure in Mitchell et al. [19] and Goranitis et al. [20]. In contrast to our approach, the sufficient capability level was exploratively set to the second and third highest of four levels without empirical investigation. Kinghorn [72] organised deliberative workshops for citizens; in total, 62 persons took part in 8 workshops. One task was to assess the level of sufficient capabilities, as defined by ICECAP-A. The level chosen in the workshops was 3,3,3,3,3, with the best possible state being 4,4,4,4,4. In our model, the threshold is defined by specifying the extent to which a capability is available (always, almost always, or mostly). A person who chose the answer option “completely agree” has reached the “sufficient” level. Similar to ICECAP-A, the thresholds for the different capabilities may vary. The sufficient level in CALY-SWE is based on a synthesis of relevant sources such as policies and laws, the Delphi panel, the present distribution, and inequality aversion in the population. CALY-SWE also explicitly includes special consideration of those “worst-off” (“do not agree at all”). Priority one in policy is to lift those worst-off; priority two is to support those who have not reached a sufficient level but do not belong to those worst-off (“partially agree”). All the attempts described above to establish a sufficient level may be quite immature and explorative, and more research is needed in this area. The level for the CALY-SWE measure may, for instance, change over time due to societal progress.
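The mapping from answer options to policy priorities described above can be summarised in a few lines of code. This is a minimal sketch for illustration only: the names `PRIORITY`, `policy_priority`, and `summarise` are hypothetical, the ten example answers are invented, and the sketch does not represent a scoring algorithm or weight tariff for CALY-SWE.

```python
from collections import Counter

# Answer option -> policy priority for one capability statement.
# 1 = worst-off (first priority), 2 = below the sufficient level
# (second priority), None = sufficient level reached.
PRIORITY = {
    "do not agree at all": 1,
    "partially agree": 2,
    "completely agree": None,
}


def policy_priority(answer):
    """Return the policy priority implied by a single answer."""
    key = answer.strip().lower()
    if key not in PRIORITY:
        raise ValueError(f"Unknown answer option: {answer!r}")
    return PRIORITY[key]


def summarise(answers):
    """Share of respondents in each group for one capability statement."""
    counts = Counter(policy_priority(a) for a in answers)
    n = len(answers)
    return {
        "worst_off": counts[1] / n,
        "below_sufficient": counts[2] / n,
        "sufficient": counts[None] / n,
    }


# Invented answers for illustration only.
example_answers = (
    ["Completely agree"] * 5 + ["Partially agree"] * 4 + ["Do not agree at all"]
)
shares = summarise(example_answers)
print(shares)
```

A summary of this kind could be compared against the design considerations stated earlier, for example whether the “worst-off” share stays within a small proportion of respondents.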

Conclusions

The Delphi panel investigation identified the following capabilities for inclusion in the CALY-SWE measure: health, social relations, financial situation including residence, occupation, security, and political rights. We also defined a sufficient level for living a flourishing life for each capability dimension, based on a “fair innings” approach, which allowed us to finalize the CALY-SWE questionnaire.

Survey.

Screenshots, methods, and results of the web survey, and phrasing in Swedish with English translations. (DOCX)

CALY SWE questionnaire.

Finalized CALY SWE phrasings. (DOCX)

3 Mar 2021

PONE-D-21-04330
An Initiative to Develop Capability-Adjusted Life Years (CALYs) in Sweden: Selecting Capabilities with a Delphi Panel and Developing the Questionnaire
PLOS ONE

Dear Dr. Meili,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please ensure that your revision satisfies PLOS ONE’s publication criteria and not, for example, on novelty or perceived impact.

The reviewer makes three sets of comments which you need to address but accepts this would warrant publication - hence the decision. The second set, “I just do not see any value in the second survey. The three samples are small (500) and there are different samples in each. The whole objective and methodology just seem totally bizarre to me. I do not see the contribution of this, and I do not see the value in publishing it.”, suggests that either you reduce this section and mention it only briefly or you persuade the reviewer it is valuable. The reviewer does not say what appears strange and so the more obvious route would be to downsize the material.

The reviewer comments: Nor, I'm afraid, do I see the logic of using the survey data to inform the wording of the attribute. Surely this is just perpetuating existing inequalities? Discussion around the third objective felt very subjective and speculative. So albeit a discussion, you should find literature or sharper arguments that support your view. Also in the limitations section you should mention the possibility of perpetuating inequalities. If these things are done, the technical questions should be addressed. Please include these comments in your cover letter of reply.
Please submit your revised manuscript by Apr 17 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols We look forward to receiving your revised manuscript. Kind regards, Paul Anand Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. 
The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In the Methods section, please provide additional information on participant recruitment for the online web survey. In particular please describe any inclusion and exclusion criteria used. And please provide a justification for the sample size used in your study, including any relevant power calculations (if applicable). Finally, please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was suitably informed and (2) what type you obtained (for instance, written or verbal). If your study included minors under age 18, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability. Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter.
For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized. Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access. We will update your Data Availability statement to reflect the information you provide in your cover letter. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: No ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. 
For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The paper seeks to achieve 3 objectives: 1) identify the most vital capabilities for individuals in Sweden, 2) to define a sufficient level of each identified capability to lead a flourishing life, and to 3) develop a complete questionnaire for the measurement of the identified capabilities. The first objective is achieved through taking a long list of capabilities from an existing report and using a Delphi method to arrive at a shorter list. One immediate question is, that if the original list by Eriksson was deemed to be a list of 10 "essential" capabilities, why were all ten not included? Why was there a need to reduce the number of capabilities and refine the list (through merging two of the capabilities)? There does not appear to have been any scope for participants to add in anything that they considered to be missing from the original list of 10. "Fair minded" individuals were recruited, largely from NGO type organisations. 
I wonder if there was a danger that the organisations which accepted the invitation to participate approached the task with a particular organisational agenda? The organisations included children's charities, but no reference is made to whether the questionnaire is intended for adult or child completion. If it is only for adults, then what was it the children's charities brought to the Delphi task?

I do have serious reservations about:

1) The appropriateness of calling the set of capabilities a "Capability Adjusted Life Years". The QALY is a broad approach whereby information on health is combined with information on life expectancy. Various questionnaires can be used to assess health. We don't refer to these questionnaires as QALYs. This paper is not the first to suggest CALYs, it has been done by Goranitis et al. and by Mitchell et al. previously, in relation to ICECAP (incidentally, these authors are not referenced, and hence key literature is missed). So, the approach of CALYs is not novel and a CALY could, in theory, be calculated from any capability questionnaire. So I do not feel it is appropriate to refer to one set of capabilities by the term associated with a broader approach.

2) The wording of the questionnaire and whether it can actually be considered to assess capability at all - I would say it assesses functioning. I am not clear how the authors went from the original set of capabilities (as described by Eriksson) to the wording used in their survey? The phrasing of some capabilities is quite complicated, awkward and restrictive - for example, there will be lots of junior researchers who get paid enough to afford safe and acceptable accommodation, it will not be a lack of money that prevents them from securing long term accommodation, but instead short term contracts and possibly the need to move to take up new jobs.
It is difficult for people to respond to questions which combine several different concepts, and I'm not sure how reliably the information could be interpreted.

Whilst there are a few additional details, and some additional justification that I would like to see provided in relation to the Delphi task, I am prepared to accept the process as being sufficiently sound to warrant publication. However, I have serious concerns about the extent to which the second two objectives have been met. I just do not see any value in the second survey. The three samples are small (500) and there are different samples in each. The whole objective and methodology just seem totally bizarre to me. I do not see the contribution of this, and I do not see the value in publishing it. Nor, I'm afraid, do I see the logic of using the survey data to inform the wording of the attribute. Surely this is just perpetuating existing inequalities? Discussion around the third objective felt very subjective and speculative.

I had a few other, presentational issues:

- Is the rationale for merging capability and QALYs entirely pragmatic? I don't see any conceptual justification?
- Page 11, line 209: The argument here is confusing. First, I don't understand what is meant by "satisfaction of capabilities". Then, I fail to see how increasing income would not enhance a person's capability? I don't think a very good understanding of sufficiency is demonstrated, and again, references to sufficiency are missing.
- Should we be allocating more resources to those in the poorest health? Does this not depend on their ability to respond to treatment, and the cost and cost-effectiveness of that treatment?

Table 4 makes reference to TTO/DCE, but I can't see that this has been discussed in the paper.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

25 Apr 2021

Dear Reviewer and Editor

We highly appreciate your efforts in reviewing our manuscript and your valuable comments. We carefully considered your feedback and revised our manuscript accordingly. Specifically, we shortened the section on the methods and results of the empirical survey considerably by moving a substantial part to the supplementary material. We also revised the section to improve clarity and structure, revised Figure 1 to contain the information from Figure 2 in a more understandable format, and removed Figure 2.

We also agree with the reviewer's reasoning regarding the name. In 2017 [1] we suggested CALYs as a proper name for measures inspired by Sen’s capability approach and the QALY approach. All “QALYs” have a measure of health-related quality of life and a time dimension in common.
The scale for health-related quality of life is anchored between “perfect health” and death. Drummond also mentions an additional value premise: “Also it should be remembered that the cost-per-QALY ranking does embody a kind of equality, in that a QALY is considered to be worth the same to every individual” [2]. However, within this family of “QALYs”, different methods are used to construct health-related quality of life measures. Well-known examples are EQ5D, SF36/SF6 and HUI, and established methods for estimating weights are TTO and standard gamble. Despite these differences, we label the construct(s) QALY.

Our thinking regarding CALYs follows the same lines. As for QALYs, different methods will be developed, and the content of the measure will certainly vary between countries. Different methods to assign weights to states will also be used. We think capability measures in general can be transformed to CALYs if the states can be properly located on a scale from 0 to 1. Finally, we think equity should be considered in CALYs, for example as in the “sufficient capabilities” approach [3] or our fair-innings model. However, other methods might exist or be developed for trading off equity and efficiency when using CALYs as a measure for resource allocation.

In our paper from 2017, we refer to and describe the content of Mitchell et al. [3]. Of course, Mitchell et al. (and Goranitis et al. [4]) should be in the present reference list, and we apologize for our carelessness.

Please find our point-by-point response below.

Yours sincerely
The authors

Original editor comments
____________________________
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

Reply: We renamed the figure files to start with a capital letter.
____________________________
2. In the Methods section, please provide additional information on participant recruitment for the online web survey.
In particular, please describe any inclusion and exclusion criteria used. Please also provide a justification for the sample size used in your study, including any relevant power calculations (if applicable). Finally, please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was suitably informed and (2) what type you obtained (for instance, written or verbal). If your study included minors under age 18, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

Reply: We included the following sentences in the section Setting sufficient capability thresholds in Sweden->Distribution of capabilities in the population. Regarding inclusion and exclusion criteria: “We excluded answers from participants that stated an age below 18 (ethical concerns) and over 99 (data quality concerns).” Regarding power analysis: “We deemed a sample size of 500 as sufficient where with a 5% significance level, a two sample test for proportions detects a 0.1 proportion difference with at least 80% power.” We also mention exclusion criteria and power analysis in S1 Survey, under the heading Material and methods. The only exclusion criteria we applied were age < 18 or age > 99; beyond age, we did not apply any inclusion or exclusion criteria. In the same section, we mention that consent was obtained using the online survey.

Where: Lines 383-384; Lines 388-390; S1 supplement, Material and methods
____________________________
3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found.
PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Reply: Thank you for highlighting this. We share PLOS ONE’s philosophy of open access to underlying data. The data is currently under review at the Swedish National Data to be published according to the FAIR principles. We will notify you as soon as the data is published.
____________________________
In-house check April 21
____________________________
1) In the Methods section, please provide additional information on participant recruitment for the online web survey. In particular, please describe any inclusion and exclusion criteria used. Please also provide a justification for the sample size used in your study, including any relevant power calculations (if applicable). Finally, please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was suitably informed and (2) what type you obtained (for instance, written or verbal). If your study included minors under age 18, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

Reply: Thank you for your request. We added a clause on electronic consent, so that the sentence now reads: “The Swedish Ethical Review Authority approved the study with an advisory statement (Dnr 2019-02848) and participants consented to participate electronically.” Also see comment 2. above regarding sample size and inclusion and exclusion criteria.
Where: Lines 382-383
____________________________
2) Thank you for including your ethics statement on the online submission form: "The Swedish Ethical Review Authority approved the study with an advisory statement (Dnr 2019-02848)." To help ensure that the wording of your manuscript is suitable for publication, would you please also add this statement at the beginning of the Methods section of your manuscript file.

Reply: Thank you for pointing out this issue. We included a modified version of the sentence in the section Setting sufficient capability thresholds in Sweden->Distribution of capabilities in the population, after the aims. See also 1).

Where: Lines 382-383
____________________________
____________________________
Reviewer
____________________________
Set 1
____________________________
1) The appropriateness of calling the set of capabilities a "Capability Adjusted Life Years". The QALY is a broad approach whereby information on health is combined with information on life expectancy. Various questionnaires can be used to assess health. We don't refer to these questionnaires as QALYs. This paper is not the first to suggest CALYs, it has been done by Goranitis et al and by Mitchell et al previously, in relation to ICECAP (incidentally, these authors are not referenced, and hence key literature is missed). So, the approach of CALYs is not novel and a CALY could, in theory be calculated from any capability questionnaire. So I do not feel it is appropriate to refer to one set of capabilities by the term associated with a broader approach.

Reply: We agree with your reasoning and propose ‘CALY-SWE’ as a proper name, which we have adopted in the manuscript. In the revision, we consistently use CALY-SWE where applicable, but retain ‘CALY’ where we talk about capability-adjusted life years in general.
To give appropriate credit to the previous applications and conceptualizations of CALYs, we mention the authors you list and some others. We changed the sentence leading up to the aims in the Introduction from “However, to our best knowledge, the amalgamation of the QALY approach and a set of capabilities covering vital aspects of a good life other than health, is novel.” to “While capability-adjusted life years have been suggested and implemented before [3-6], our approach includes equity considerations from the start and is focused on an amalgamation of the QALY approach and a set of capabilities covering vital aspects of a good life other than health.” The references consist of Goranitis et al., Mitchell et al., the 2008 report by Lorgelly et al., and the 2017 ICECAP application by Walters et al.

Where: Throughout the manuscript; Line 155
____________________________
____________________________
Set 2
____________________________
2) The wording of the questionnaire and whether it can actually be considered to assess capability at all - I would say it assesses functioning. I am not clear how the authors went from the original set of capabilities (as described by Eriksson) to the wording used in their survey? The phrasing of some capabilities is quite complicated, awkward and restrictive - for example, there will be lots of junior researchers who get paid enough to afford safe and acceptable accommodation, it will not be a lack of money that prevents them from securing long term accommodation, but instead short term contracts and possibly the need to move to take up new jobs. It is difficult for people to respond to questions which combine several different concepts, and I'm not sure how reliably the information could be interpreted.

Reply: Thank you for your comment. One of the strongest initiatives from the Delphi panel was to raise the social problems we have in Sweden due to the housing shortage. Some people are literally homeless, and many more live in cramped conditions.
Prices have skyrocketed, and in the big cities only high-income earners can afford a reasonable housing standard. The Panel also defended the view that a welfare state is committed to implementing policies which make it possible to get permanent housing within a reasonable waiting time, and at a reasonable cost, for all who want it. This should be considered a capability. If anyone prefers to live with temporary contracts, they are free to do so. Giving university students temporary contracts during the years they study is a practical arrangement and would not be interpreted as a limitation of the capability. However, all the young people who are forced to stay with their parents because they cannot afford their own housing suffer a limitation. To facilitate the interpretation of this capability, we plan to have a question about housing in the background section. In the Discussion, we explain why occupation is formulated as a functioning. One reason is the construction of the welfare state. All who share the benefits of the welfare state (free health care, education, etc.) should also share the burdens.

Where: Line 536
____________________________
I just do not see any value in the second survey. The three samples are small (500) and there are different samples in each. The whole objective and methodology just seem totally bizarre to me. I do not see the contribution of this, and I do not see the value in publishing it.

Reply: The logic of our approach is heavily inspired by Alan Williams’ “Fair Innings” [7]. He describes expected QALYs during a lifetime in different social groups, and against this empirical pattern he suggests a fair inning to be, say, 70 QALYs. If he had set the fair inning at 80 years, none of the groups would have reached it, and if he had set it at sixty, all would have reached it.
A fair inning in some African country would of course be different from the fair inning in the UK, so information about the present distribution in the particular setting is helpful. Our idea is simply to follow in the footsteps of Williams. We want to prioritize those worst-off (also referring to Rawls’ theory of justice), who are those who do “not at all” agree with the stated capability. This threshold needs to take the present distribution of a capability into account, exactly as Williams did. If the whole population were below the threshold, we could not identify those worst-off; “all” can of course not be worst-off. In our data, we found that around 10% were worst-off, and this proportion remained stable across the three samples.

With a sample size of 500, the power to detect a proportion difference of 0.1 between two survey versions, using a two sample test for proportions, is at least 80%, which we deemed adequate for our purposes. We aimed for independent samples because we believe that asking the different capability wordings for each capability in sequence would lead to invalid results, as participants might change how they answer between the wording versions.

The whole section has been shortened and partly rewritten. We have changed the heading to “Setting sufficient capabilities thresholds in Sweden” to better describe the content. The section also has a new introduction to clarify the logic.

Where: Lines 363-433
____________________________
Nor, I'm afraid, do I see the logic of using the survey data to inform the wording of the attribute. Surely this is just perpetuating existing inequalities? Discussion around the third objective felt very subjective and speculative.

Reply: Mitchell et al. [3] discuss how sufficient capabilities can be defined in ICECAP-O. This instrument has four levels for each capability: full capability (level 4), a lot of capability (level 3), a little capability (level 2) and no capability (level 1).
The threshold is set by choosing a level (e.g. 3), and the threshold need not be constant over all capabilities. It can vary depending on the real availability of capabilities in a certain context. In our model, the threshold is defined by specifying the extent to which a capability is available (always, almost always, etc.). A person who chooses the answer option “fully agree” has reached the “sufficient” level. As in ICECAP-O, the thresholds vary between the different capabilities. Neither ICECAP-O nor our method is “objective”. Rather, setting a threshold is a value-based decision. We have tried to be transparent and to describe the different aspects which have been taken into account. How to set concrete thresholds for sufficient capabilities so that they are useful in policy-making is not a well-researched topic. We thus see ours as a first-generation method; we and others are trying to develop such methods, and they will certainly be refined in the future.

Where: Lines 436-503
____________________________
____________________________
Other, presentational issues
____________________________
- Is the rationale for merging capability and QALYs entirely pragmatic? I don't see any conceptual justification?

Reply: We think there are good reasons to define the “severity” of a disease as the loss of quality of life times the duration of the state. Seasickness is terrible in the moment, but since you completely recover when you disembark the ferry, it is not judged to be a serious condition. We think this time dimension is crucial when we judge all capabilities. It matters whether you lack money for a month or suffer poverty for decades.
Thus, the inclusion of time is a deliberate, value-driven decision. Additionally, extending the QALY concept to incorporate capabilities seems logical from a welfare theory point of view, as we outline in the section Introduction->From welfarism to extra-welfarism and capabilities: In parallel to the development of the capability approach and extra-welfarism based on the shortcomings of welfarism, we see CALYs as an application of the capability approach, in line with extra-welfarist ideas, to solving problems of resource allocation. There are indeed conceptual points of friction with the capability approach as well, such as the stricter descriptive system that allows for complete orderings, but we think that our concept goes beyond pure pragmatism.

Where: Lines 103-158
____________________________
- Page 11, line 209: The argument here is confusing. First, I don't understand what is meant by "satisfaction of capabilities". Then, I fail to see how increasing income would not enhance a person's capability? I don't think a very good understanding of sufficiency is demonstrated, and again, references to sufficiency are missing.

Reply: Thank you for highlighting this lack of clarity related to “satisfaction”, which may carry connotations of “preference”. We changed the wording in this sentence to read “does not increase capabilities”. We also changed other uses of “capability satisfaction” in the manuscript. The question of whether our model perpetuates inequalities is central. In fact, our model is constructed to promote equality. Indeed, increased income enhances capability. However, when it comes to setting priorities between conflicting societal goals, e.g. “the equity-efficiency trade-offs”, one idea could be to set a threshold (say 100 000 Euro annually) that is good enough to give reasonable opportunities for a flourishing life.
Mitchell et al. (2016) define the threshold “as the level of capability at or above which a person’s level of capability wellbeing is no longer a concern for policy”. At the other end of the income distribution, we have people earning only 10 000 Euro annually, certainly not enough for a flourishing life. In our model, an increase from 10 000 to, say, 20 000 would yield increased capabilities in the societal calculus. For instance, an individual could change from “Not agree” to “Partially agree” with having the capability. In contrast, an increase from 100 000 to 110 000 would not yield anything in the societal calculus, because 100 000 already leads to the answer “Completely agree” with having the capability. Therefore, only policies improving the conditions for those below the threshold would be able to improve the level of capabilities. In particular, policies benefiting those “worst-off” would yield new levels of capabilities. This parallels the fair innings idea of Alan Williams (1997): in resource allocation in health care, priority should be given to those who did not get their fair inning. Our model extends this principle to all capabilities.

Where: Lines 214, 476, 560
____________________________
- Should we be allocating more resources to those in the poorest health? Does this not depend on their ability to respond to treatment, and the cost and cost-effectiveness of that treatment?

Reply: We agree that resource allocation to people with poor health should depend on the cost-effectiveness of the treatment (which ideally would factor in access to the treatment). Furthermore, the mainstream normative assumption in cost-effectiveness analysis is health maximization, that is, to rank treatments purely according to their cost-effectiveness. This idea has its roots in utilitarianism and has been repeatedly challenged, also within health economics, for example by Wagstaff [8] and Williams [7].
Both argue for abandoning pure health maximization and instead making a trade-off between efficiency and equity. We can add that pure health maximization is not a reasonable goal in policy-making. In Sweden, and many other countries, equity considerations play a prominent role in public decision-making, and our “model” (CALY-SWE) is intended to facilitate decision-making. Sen’s own writing also supports equity considerations and criticizes utility or health maximization [9,10].
____________________________
Table 4 makes reference to TTO/DCE, but I can't see that this has been discussed in the paper.

Reply: Thank you for the comment. We added a sentence that briefly mentions that we plan on using TTO/DCE questions: “We are also planning to develop a weight tariff using time trade-off and discrete choice evaluation questions.”

Where: Line 521

Submitted filename: response_to_reviewers_20210423.docx

12 Aug 2021

PONE-D-21-04330R1
An Initiative to Develop Capability-Adjusted Life Years in Sweden (CALY-SWE): Selecting Capabilities with a Delphi Panel and Developing the Questionnaire
PLOS ONE

Dear Dr. Meili,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Sep 26 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols . We look forward to receiving your revised manuscript. Kind regards, Rabia Hussain Academic Editor PLOS ONE Journal Requirements: Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. 
If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: (No Response) ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified. Reviewer #1: Yes ********** 5.
Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The manuscript is improved and I think I understand the motivation for the online survey now - the authors seem to set the sufficient level as the level that most people already achieve in the small population surveys. I still don't understand why the same survey was not sent to all 1,500 respondents, with a request for them to indicate whether they can achieve the capability somewhat, mostly or always?? Surely this would have made more sense than seeing how many people from a smaller sub-sample strongly agree that they can achieve the capability somewhat, how many people from a separate and equally small subset strongly agree that they can achieve the capability mostly, etc?? It just seems like an unnecessarily complicated way of getting the information, which breaks the sample down into three small and independent sub-samples. Also, I think it could still be clearer that the objective was to match the sufficient level to the existing distribution of achievement. And I STILL think it needs to be acknowledged that this does NOT reflect societal values about what constitutes a good life, but instead risks perpetuating poor performance of social policies or existing inequalities. 
Where is the incentive for social policies to drive improvements in quality of life if all they need to do is perpetuate existing achievement? Will the sufficient level need to be up-dated over time?

There are a few things that are misleading and need to be changed:

Page 9: It is wrong to suggest that ICECAP measures only focus on health, in fact they don't explicitly include health at all. So the work reported here is NOT novel in the sense of moving beyond health outcomes.

Page 10 - I STILL do not believe that the work contributes a sufficient level for a FLOURISHING LIFE (see above) - at no point have the authors asked anyone or considered what constitutes a good life. Instead the work establishes (loosely!!) a rough indication of current achievement on the attributes - the majority of people may have perfectly miserable lives - this is never established or considered.

Lines 396 to 401 on page 23 - I don't understand what is being said here.

Discussion - Kinghorn HAS used empirical methods to establish a sufficient level of capability well-being for ICECAP-A, but their work (published in Social Science & Medicine) is not acknowledged.

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

9 Sep 2021

Dear Reviewer and Editor

We would like to extend our appreciation for your efforts in reviewing our manuscript and for providing valuable comments. We think that the comments and remarks helped us to further improve the manuscript and would like to thank the reviewer. Please consider our point-by-point response below for the changes in the manuscript.

Yours sincerely
Kaspar Walter Meili
Anna Månsdotter
Linda Richter Sundberg
Jan Hjelte
Lars Lindholm

Editor comment: “Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.”

Answer: We went through all the items in the bibliography and did not find any retracted publication.
We did, however, detect a wrongly cited item: bibliography item [46] (Fenner F et al., Family and generic names for viruses approved by the International Committee on Taxonomy of Viruses) was cited by mistake on line 368; the correct citation is item [31] (Williams A, Intergenerational Equity: An Exploration of the ‘Fair Innings’ Argument). We corrected the issue, and the citation is now on line 372.

Reviewer comments:

Major 1: “The manuscript is improved and I think I understand the motivation for the online survey now - the authors seem to set the sufficient level as the level that most people already achieve in the small population surveys. I still don't understand why the same survey was not sent to all 1,500 respondents, with a request for them to indicate whether they can achieve the capability somewhat, mostly or always?? Surely this would have made more sense than seeing how many people from a smaller sub-sample strongly agree that they can achieve the capability somewhat, how many people from a separate and equally small subset strongly agree that they can achieve the capability mostly, etc?? It just seems like an unnecessarily complicated way of getting the information, which breaks the sample down into three small and independent sub-samples.”

Answer: Thank you for the comment and the relevant remark. The intention behind not sending the same survey to all participants was to let three similar subsamples, each representative of the Swedish population, answer the statements independently of each other. We wanted to use wording in the statements that is similar to the final version of the questionnaire.
With the suggested study design this would not have been possible, because either 1) if all participants had received all three versions, the answers to the different statements would not have been independent of each other and would possibly have been biased, or 2) if, as the reviewer suggested, a question phrasing in the style of “do you achieve the capability X?” with the answer options “somewhat”, “mostly”, and “always” had been used, a phrasing quite dissimilar to the final questionnaire and different answer options would have been necessary to keep a respectable level of face validity, and the results would not have had the same legitimacy to guide the choice of phrasing for the statements.

Major 2: “Also, I think it could still be clearer that the objective was to match the sufficient level to the existing distribution of achievement. And I STILL think it needs to be acknowledged that this does NOT reflect societal values about what constitutes a good life, but instead risks perpetuating poor performance of social policies or existing inequalities. Where is the incentive for social policies to drive improvements in quality of life if all they need to do is perpetuate existing achievement? Will the sufficient level need to be up-dated over time?”

Answer: Thank you. We agree that the section that outlines the goals of the survey was not written clearly enough, and we have rewritten the part between lines 387 and 401 to increase clarity. Among other things, we think a short table (Table 2) can be used to explain the design of the study. We cannot exclude that the sufficient level and the phrasing will need to be updated. Indeed, adaptations may be necessary in the future to reflect progress or changed societal values. The revised parts are on lines 367 to 370, 391 to 392 and 401 to 420.

Minor 1: “Page 9: It is wrong to suggest that ICECAP measures only focus on health, in fact they don't explicitly include health at all. So the work reported here is NOT novel in the sense of moving beyond health outcomes.”

Answer: Thank you for your comment. What we write on page 9 is: “While capability-adjusted life years have been suggested and implemented before [1-4], our approach includes equity considerations from the start and is focused on an amalgamation of the QALY approach and a set of capabilities covering vital aspects of a good life relevant other than health.” For instance, we use time trade-off to establish weights, as some QALY approaches do. We know of, and even write about, measures (ICECAP and others) that have moved beyond health. We by no means intended to suggest that ICECAP focuses on health, and we apologise. We therefore changed the wording to (on line 158): “… vital aspects of a good life relevant for the Swedish policy context.”

Minor 2: “I STILL do not believe that the work contributes a sufficient level for a FLOURISHING LIFE (see above) - at no point have the authors asked anyone or considered what constitutes a good life. Instead the work establishes (loosely!!) a rough indication of current achievement on the attributes - the majority of people may have perfectly miserable lives - this is never established or considered.”

Answer: What we argue in the paper, in line with other capability approaches, is that people should be given sufficient capabilities for a flourishing life. Lack of capabilities should not hinder people from living the lives they want. However, sufficient capabilities are not a guarantee of a flourishing life. Even if all capabilities met a high threshold, there would certainly be people dissatisfied with their lives and living conditions. The choice of capabilities in our model has been a process in several steps: the governmental investigation, a survey among public health researchers, and finally the Delphi panel with members from different not-for-profit organisations.
When it comes to the level of a capability sufficient for a flourishing life, we try to consider all relevant arguments of which we are aware. In particular, we use input from the Delphi process and the study of the distribution of capabilities in the population sample. Alongside these two investigations, we consider Swedish laws and policies, and normative ideas in general, which we judge to be of relevance. For example, about health we write (on lines 463-471 in the revised manuscript):

“Health: Based on the input from the Delphi panel, we consider physical and mental health as one of the most (maybe the most) important capabilities. The population survey indicated a pronounced inequality aversion. The threshold should facilitate a high and equitable level of health for everybody. The Healthcare Act [49] and other policy documents [50] point in the same direction. However, “always” having the health to do what one wants may be unrealistic and certainly extremely demanding of resources, and thus complicates the realization of better health for those worst-off, as well as the realization of other capabilities. Therefore, we think the formulation “almost always” is the most reasonable “fair innings”.”

The other thresholds are set using a similar synthesizing reasoning.

Minor 3: Lines 396 to 401 on page 23 - I don't understand what is being said here.

Answer: We reworded the section to increase clarity, from

“Due to the higher threshold implied by the wording of A (“always”) compared to B (“almost always”) and C (“mostly”), we expected an increased proportion of “completely agree” on C compared to B and on B compared to A. Our intent was to select the wording version with a distribution pattern that fitted normative considerations well. No difference between the versions was an argument for adapting the A version with the shortest and least complex formulation.”

to (now on lines 416 to 420)

“Due to the higher threshold for a “completely agree” answer implied by the stricter wording of A (“always”) compared to B (“almost always”) and C (“mostly”), we expected an increased proportion of “completely agree” answers for C compared to B and for B compared to A. If this expectation would not get support, we assumed that this way of finetuning the phrasing was not meaningful.”

Please also refer to Major comment 2.

Minor 4: Discussion - Kinghorn HAS used empirical methods to establish a sufficient level of capability well-being for ICECAP-A, but their work (published in Social Science & Medicine) is not acknowledged.

Answer: Thank you. We now include the 2019 Kinghorn paper in the discussion on line 658 and relate it to our approach on lines 656 to 673:

… “Kinghorn [72] organised deliberative workshops for citizens, and in total 62 persons took part in 8 workshops. One task was to assess the level of sufficient capabilities, defined by ICECAP-A. The level chosen in the workshops was 3,3,3,3,3 where the best possible state is 4,4,4,4,4. In our model the threshold is defined by specifying the extent to which a capability is available (always, almost always, or mostly). A person who chose the answer option “fully agree” has reached the “sufficient” level. Similar to ICECAP-A, the thresholds for the different capabilities may vary. The sufficient level in CALY-SWE is based on a synthesis of relevant sources such as policies and laws, the Delphi-panel, present distribution, and inequity aversion in the population. CALY-SWE also explicitly includes special consideration of those “worst-off” (not agree at all). Priority one in policy is to lift those worst-off, priority two to support those who have not reached a sufficient level, but do not belong to those worst-off (partly agree). All the attempts described above to establish a sufficient level may be quite immature and explorative, and more research is needed in this area. The level for the CALY-SWE measure may, for instance, change over time due to societal progress.”

Other Changes: We would also like to add the results from a pilot study (published in this Journal ref). We suggest the inclusion of the following paragraph after line 509:

“The result from the Panel can be compared with a previous pilot study. The aim of the study was to investigate whether it was possible to rank capabilities included in the Swedish parliament report (35). The participants were researchers or PhD-students, mostly in public health. The main finding was that most of the respondents managed to rank the capabilities included in the set. The results suggest that participants deemed health to be most important, followed by social relations and financial situation. Knowledge, occupation, time, security, political resources, housing, and living environment were ranked lower, and the exact order depends on the metric used to synthesize the individual rankings.”

On line 154 we changed “social care” to “social services” to increase clarity (“Capabilities, as an outcome measure in healthcare, elderly care or social services, have gained increasing attention during the last decade”). We also consistently applied the spelling “worst-off” throughout the manuscript (instead of “worst off”).

Submitted filename: response_to_reviewer_210908.docx

17 Jan 2022

An Initiative to Develop Capability-Adjusted Life Years in Sweden (CALY-SWE): Selecting Capabilities with a Delphi Panel and Developing the Questionnaire

PONE-D-21-04330R2

Dear Dr. Meili,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance.

To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Rabia Hussain
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

31 Jan 2022

PONE-D-21-04330R2

An Initiative to Develop Capability-Adjusted Life Years in Sweden (CALY-SWE): Selecting Capabilities with a Delphi Panel and Developing the Questionnaire

Dear Dr. Meili:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.
If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Rabia Hussain
Academic Editor
PLOS ONE

33 in total

1. Exploring challenges to TTO utilities: valuing states worse than dead.

Authors: Angela Robinson; Anne Spencer
Journal: Health Econ Date: 2006-04 Impact factor: 3.046

2. Welfarism vs. extra-welfarism.

Authors: Werner B F Brouwer; Anthony J Culyer; N Job A van Exel; Frans F H Rutten
Journal: J Health Econ Date: 2007-11-29 Impact factor: 3.883

3. Valuing the ICECAP capability index for older people.

Authors: Joanna Coast; Terry N Flynn; Lucy Natarajan; Kerry Sproston; Jane Lewis; Jordan J Louviere; Tim J Peters
Journal: Soc Sci Med Date: 2008-06-21 Impact factor: 4.634

4. On the promotion of human flourishing.

Authors: Tyler J VanderWeele
Journal: Proc Natl Acad Sci U S A Date: 2017-07-13 Impact factor: 11.205

5. Values for the ICECAP-Supportive Care Measure (ICECAP-SCM) for use in economic evaluation at end of life.

Authors: Elisabeth Huynh; Joanna Coast; John Rose; Philip Kinghorn; Terry Flynn
Journal: Soc Sci Med Date: 2017-07-21 Impact factor: 4.634

6. Using deliberative methods to establish a sufficient state of capability well-being for use in decision-making in the contexts of public health and social care.

Authors: Philip Kinghorn
Journal: Soc Sci Med Date: 2019-09-11 Impact factor: 4.634

7. Complex Valuation: Applying Ideas from the Complex Intervention Framework to Valuation of a New Measure for End-of-Life Care.

Authors: Joanna Coast; Elisabeth Huynh; Philip Kinghorn; Terry Flynn
Journal: Pharmacoeconomics Date: 2016-05 Impact factor: 4.981

8. Developing attributes for a generic quality of life measure for older people: preferences or capabilities?

Authors: Ini Grewal; Jane Lewis; Terry Flynn; Jackie Brown; John Bond; Joanna Coast
Journal: Soc Sci Med Date: 2005-09-15 Impact factor: 4.634

9. Towards capability-adjusted life years in public health and social welfare: Results from a Swedish survey on ranking capabilities.

Authors: Anna Månsdotter; Björn Ekman; Kaspar Walter Meili; Inna Feldman; Lars Hagberg; Anna-Karin Hurtig; Lars Lindholm
Journal: PLoS One Date: 2020-12-01 Impact factor: 3.240

10. Perceived changes in capability during the COVID-19 pandemic: A Swedish cross-sectional study from June 2020.

Authors: Kaspar Walter Meili; Håkan Jonsson; Lars Lindholm; Anna Månsdotter
Journal: Scand J Public Health Date: 2021-07-02 Impact factor: 3.021