
Causal reasoning without mechanism.

Selma Dündar-Coecke, Gideon Goldin, Steven A. Sloman

Abstract

Unobservable mechanisms that tie causes to their effects generate observable events. How can one make inferences about hidden causal structures? This paper introduces the domain-matching heuristic to explain how humans perform causal reasoning when lacking mechanistic knowledge. We posit that people reduce the otherwise vast space of possible causal relations by focusing only on the likeliest ones. When thinking about a cause, people tend to think about possible effects that participate in the same domain, and vice versa. To explore the specific domains that people use, we asked people to cluster artifacts. The analyses revealed three commonly employed mechanism domains: the mechanical, chemical, and electromagnetic. Using these domains, we tested the domain-matching heuristic by testing adults' and children's causal attribution, prediction, judgment, and subjective understanding. We found that people's responses conform with domain-matching. These results provide evidence for a heuristic that explains how people engage in causal reasoning without directly appealing to mechanistic or probabilistic knowledge.


Year:  2022        PMID: 35560140      PMCID: PMC9106179          DOI: 10.1371/journal.pone.0268219

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.752


1. Introduction

Even experienced cyclists cannot reliably draw a picture of the mechanism that makes a bicycle work [1]. Moreover, few people can explain how a ballpoint pen works; in fact, when they try, they discover they do not understand such artifacts as well as they thought they did [2]. Most of us live with the illusion that we have more causal knowledge than in fact we do [3]. It is unrealistic to expect people to remember all that they learn about how things work. Things have many layers of complexity, both because they have parts that themselves can be decomposed at multiple levels, and because they interact with so many other things (understanding ballpoint pens requires understanding writing). So how do people make inferences about what causes what if things are too complex for them to understand? A number of proposals have been offered. Einhorn and Hogarth [4] suggested that people rely on several "cues-to-causality," including covariation, temporal order, contiguity in time and space, and similarity of cause and effect. Evidence for some of these cues has accumulated (see, e.g., Lagnado & Sloman [5] for temporal order and contiguity; LeBoeuf & Norton [6] for similarity; Johnson & Keil [7] and Rottman & Hastie [8] for more recent reviews). However, such cues only provide pairwise information about variables. Causal mechanisms generally involve sets of variables working together in a highly structured way (like the pedals, gears, chain, wheels, and frame of a bicycle), and pairwise relations are not sufficient. Johnson and Keil [7] propose a heuristic for capturing a certain kind of structure (the hierarchical structure of events) by positing a level-matching principle. Our proposal appeals to similarity indirectly and offers a heuristic for making causal inferences that respects structural relations, not merely cause-effect pairs. Our proposal assumes that retaining abstract information about a small set of mechanism categories is cognitively feasible.
Whenever we lack specific knowledge about a process, we can note the broader type of that process to bootstrap causal inference. We refer to such types as “domains.” Our claim is not that mechanisms come in a fixed hierarchy of types, but that any appropriate abstraction process induces a category and therefore a domain. Thus, causal domains tend to reflect regularities in the world; and we are in fact encouraged to consider only the likeliest causal relations, effectively reducing an otherwise overwhelming search space. We propose that humans tend to make inferences and ascribe causal structures based on the domain of the corresponding mechanism. Categorical domains help to identify types of entities [9, 10], but mechanism domains enable us to identify the kinds of parts and processes that are causally related and operate in similar ways. For instance, the knowledge that the mechanism for making calls on a cell phone is electromagnetic is enough to guide a swath of inductive inferences. The belief that a cellphone uses an electromagnetic mechanism suggests that its performance depends on whether the phone case is made of metal, but not its color. In most cases, although a detailed understanding of how a cell phone works is not necessary to use it, mechanism domain knowledge helps to identify the type and scope of relevant information. It makes many inferences feasible and economical. We propose the domain-matching heuristic, hypothesizing that we are likelier to assume two events are causally related if they share the same mechanism domain. When we observe a cause that participates in the mechanical domain, we are more likely to infer a corresponding effect that also participates in the mechanical domain. If we observe an effect in the chemical domain, we will look to possible causes that also participate in the chemical domain. 
Our goal in this paper is to test this proposal and explore whether people’s causal attributions fit mechanism domains when they link potential causes and effects. We also conjecture that “cross-domain” mechanisms might be relatively rare, rendering the domain-matching heuristic a useful guide most of the time.

1.1 Causal attribution in the absence of mechanism knowledge

Causal processes can be reduced to mechanisms: sequences of interconnected events that involve parts and processes [11, 12]. In psychology, the Piagetian framework recognized this long ago, proposing a mechanistic view of causal systems, where mechanism refers to, for example, how a bicycle works. Johnson and Ahn [13] define mechanisms as systems of visible and invisible characteristics interacting systematically, where the same effects are produced by the same causes. Park and Sloman [14] define them as a set of causes, enablers, disablers, and preventers that are involved in producing an effect, unfolding over time. Lombrozo [15; see also 16] highlights that mechanisms have a privileged relationship to explanations; they do not simply identify causes but illustrate how the cause brought about the effect. Exposure to mechanisms is inevitable. Within months of taking our first steps, we start establishing some appreciation of how the world works without developing a deep understanding of the operating mechanisms. Children learn to keep their ice cream in the fridge on a sunny day. Cooks use an oven to bake a cake. How exactly do people think they understand phenomena without knowing the details of the causal interactions between functional parts? We argue that mechanism domains offer a heuristic for a variety of causal reasoning tasks, so that even though most of us do not know how a fridge or an oven works, we rely on beliefs that are rooted in knowledge of mechanism domains. It is this ability that helps us to organize knowledge into content domains like biology or chemistry. Findings showing young children’s ability to distinguish animals from artifacts support this view, in the sense that children demonstrate distinct explanatory understandings of, for instance, biological (animals) and mechanical causal agents (machines or blocks).
They seem to believe there are distinct mechanisms driving the causal relations in different domains [17], with multiple causal-explanatory construals for physical, biological, psychological, and chemical events. For instance, when asked, a preschooler could state that hammers break things whereas water makes things wet, possibly by associating causes with effects in a domain-specific manner [18-21]. Comparing young children’s and adults’ responses over a series of five experiments, Shultz’s [22] study also showed that, for instance, in the physical domain, familiarity with the objects in question is not a strong indicator of mechanism-level thinking. In one of the experiments, where children and adults were presented with sound, wind, and light transmissions in different procedures, participants’ tendency to analyze causal mechanisms was not restricted to prior knowledge. Even young children knew that a spot of light was likelier to be due to a flashlight than a fan, and furthermore, their justifications more often relied on knowledge about the nature of the transfer (e.g., light spots are round) than on past experience (e.g., it looks like my flashlight). The relational complexities inherent in most causal mechanisms seem to drive people to develop beliefs about certain relational patterns, with some explanatory frameworks allowing them to make sense of properties and causal relations. Work on intuitive and lay theories holds that these beliefs are intrinsically limited, incomplete, or partial models of how things work [23, 24]. Although the majority of everyday explanations invoke cause-effect relations, most without requiring domain expertise (the wind blew the fence down), people often seem to determine appropriate relationships through a mental process producing subjective beliefs about reality. Consider someone who believes they will get sick if they fail to use soap.
According to Ahn and Kalish [9], this thought implies a belief in a mechanism, whether it involves viruses, miasma, or something else. Walsh and Sloman [25] highlight that most of the evidence on reasoning about causal relations supports mechanism-based theories (see also [26]). Most of us do not know how soap works, but we act by relying on some beliefs about an underlying process. Ahn and Kalish explain this as “people’s beliefs about causal relations include… [among other things]… a set of more or less elaborated beliefs about the nature of that mechanism, described in theoretical terms” (p. 5). In our view, these beliefs are rooted in knowledge of mechanism domains. Across a series of experiments, Rozenblit and Keil [2] asked people to rate their own understanding of how a number of devices worked. People’s ratings were lower after trying to explain how the device worked, after seeing an expert explanation, or after being asked a key comprehension question about the system, suggesting that people think they know how things work better than they actually do (see also [3]). What allows people to address causal questions when their causal knowledge is so impoverished? We propose that a mechanism-domain-matching heuristic is one of the strategies humans employ in new situations to close the gap between, on the one hand, causal explanation and prediction, and on the other hand, prior knowledge and understanding. Understanding, explaining, and predicting are intimately related but also distinct competences [27], and the differences between them give cues as to how people can predict and explain a causal phenomenon without fully understanding it. With respect to the categorization of knowledge, this kind of representation would constitute what Sloman, Lombrozo, and Malt [28] call an extra-strong ontology, wherein differences between any proposed domains are irreducible to other cognitive representations.
In their analysis of domain-generality versus domain-specificity in higher-order cognition, they outline four more possible ontologies, each of which is decreasingly committed to a strong distinction amongst domains of knowledge (where no ontology is the 5th and alternate extreme). They argue for mild ontology, which they describe as follows: “Domain differences in categorization and inference are systematic, but not cognitively primitive. People tend to reason and classify phenomena using domain-general causal reasoning mechanisms. To the extent that domains correspond to causal discontinuities in the world, systematic differences between domains may emerge, and domains thus serve as a useful shorthand for theorists to roughly classify different types of processing. However, in a given classification or inference, an object is processed the way it is by virtue of its causal history and other causal roles, which will correlate imperfectly with its domain” (p. 201). This implies that mechanism domains are not parts of a pre-specified ontology, but rather, they emerge from our observations of causal regularities and thus we take our hypothesis to be a form of “mild ontology” in the language of Sloman, Lombrozo, and Malt [28]. We propose that mechanism domains constitute the fundamental representations that allow us to generate causal models and explanations quickly and effortlessly.

1.2 Overview of the studies

The proposed hypothesis, that mechanism domains facilitate search by focusing our attention on likely relations, was investigated in five studies. The first study examined whether people’s sorting tendencies generated any clusters when they were asked to sort artifacts by common mechanisms. This study found three mechanism domains: mechanical, chemical, and electromagnetic. Studies 2 and 3 tested one implication of the hypothesis: that people should select a cause that matches the domain of the effect. With a larger number of items, Studies 4 and 5 examined whether people chose within-domain causes over cross-domain causes. Two exploratory studies (2b and 4b) examined younger participants’ (aged 8 to 17) choices and judgments of the same materials, to analyze the factors influencing children's use of the domain-matching heuristic. All studies were conducted online. Only volunteer children and adults participated. The sample sizes across the studies were based on pilot work as well as the availability of volunteer participants. Tests for normality were run for each condition of each experiment. Except for the first study, the data were normally distributed; in Study 1, there was a significant skew in scores by gender. However, this study was exploratory: participants sorted a set of items, and hierarchical clustering was used to examine their classifications without the need for parametric analyses. In Study 4, where 88 children were recruited, main effects of age were followed up with tests assessing differences between age groups, using Bonferroni corrections. Ethical approval for this research was obtained from the Brown University Research Ethics Committee. Adult participants were not required to sign a written consent form. Regarding children’s data, only those whose parents signed the consent form were included. Parents received an online link asking them to review the questionnaires and the aims of the study; those who consented allowed their children to access it.

2. Study 1: Uncovering mechanism domains

Previous literature has highlighted differences in people’s reasoning about natural kinds versus artifacts [29] and, in general, in the phenomena associated with different areas of study [30]. The purpose of this study is to determine how people sort randomly sampled items: whether they sort different sets of real-world items in similar or dissimilar ways, that is, in such a manner that a hierarchical analysis could reveal the domains of mechanisms. To begin, we ran a study to enumerate the domains people use to think about common artifacts. We asked participants to sort items based on their perceived mechanisms. Two other groups of participants were asked to sort the same items based on either function or overall similarity.

2.1 Method

Sixty participants were recruited from Mechanical Turk. Nine were removed for not passing an attention check, and one was removed for not answering all items (final N = 50; 34 male, 16 female; Mage = 32.40 years, SDage = 12.93 years). The study required an average of 12 minutes to complete. We used the 42 artifact-based stimuli from Study 1 in Rozenblit and Keil [2] as our test items, as this set was of reasonable size and had been constructed independently. Items included: quartz watch, zipper, spray bottle, solid-fuel rocket, VCR, cellular phone, helicopter, etc. (see S1 Appendix for the full list). The 42 items were presented on screen in random order, and each of three groups was given a different sorting instruction. Participants were asked to drag and drop the items into distinct on-screen bins based on either (i) how similar the items’ functions are (i.e., what they did; function condition), (ii) how similar the items’ mechanisms are (i.e., how they worked; mechanism condition), or (iii) how similar the items themselves are (general condition).

2.2 Results & discussion

On average, participants sorted the items into about 12 groups (Moverall = 11.88; SDoverall = 5.11; Mmechanism = 11.64, SDmechanism = 5.42; Mfunction = 14.06, SDfunction = 4.97; Mgeneral = 10.11, SDgeneral = 4.47). For each condition, hierarchical clustering was used to explore the nature of people’s groupings. To do this, a 42-item x 42-item distance matrix was computed in each condition by assigning to every cell the number of participants who sorted the intersecting pair of items into distinct groups. Items along the main diagonal were set to the minimum distance, 0, while the maximum value possible for any cell was the number of subjects in the condition (Nmechanism = 14, Nfunction = 17, Ngeneral = 19). An agglomerative hierarchical cluster tree was then built, taking the original distance matrix as input and computing linkages between clusters using their unweighted average distances. Cluster membership for each item was obtained by pruning the binary tree to three top-level branches, reflecting the hypothesis that participants maintained three mechanism domains in their sorting. In the mechanism condition, slicing the tree just past a distance of 12 revealed three clusters (see Fig 1). One cluster contained many items that are chemical in nature (e.g., greenhouse, solid-fuel rocket), while another cluster contained mostly mechanical devices (e.g., piano, helicopter). The last cluster contained almost exclusively electrical items (e.g., VCR, cellular phone). This division of items captured the mechanism domains, as opposed to capturing other domains at a different level of abstraction.
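The distance-matrix and average-linkage procedure described above can be sketched as follows. This is a minimal pure-Python illustration run on toy data (four hypothetical items and three participants), not the study's 42-item sortings:

```python
# Minimal sketch of the sorting analysis: the distance between two items
# is the number of participants who placed them in different groups, and
# clusters are merged by unweighted average linkage (UPGMA) until the
# desired number of top-level branches remains.

def distance_matrix(sortings, n_items):
    """sortings: one list per participant giving each item's group label."""
    d = [[0] * n_items for _ in range(n_items)]
    for groups in sortings:
        for i in range(n_items):
            for j in range(i + 1, n_items):
                if groups[i] != groups[j]:
                    d[i][j] += 1
                    d[j][i] += 1
    return d

def upgma_clusters(d, n_clusters):
    """Agglomerative clustering using unweighted average distances."""
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > n_clusters:
        best = None  # (average distance, index a, index b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                avg = (sum(d[i][j] for i in clusters[a] for j in clusters[b])
                       / (len(clusters[a]) * len(clusters[b])))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters

# Four hypothetical items sorted by three participants.
sortings = [["A", "A", "B", "B"],
            ["A", "A", "B", "B"],
            ["A", "B", "B", "B"]]
d = distance_matrix(sortings, 4)
print(upgma_clusters(d, 2))  # items 0-1 and 2-3 form the two clusters
```

Pruning the resulting tree at three branches, as in the study, simply corresponds to calling the merge loop with n_clusters = 3.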
Fig 1

A dendrogram representing the mechanism condition sortings.

In the function condition, three clusters emerged when cutting the tree at a distance of about 16 (see Fig 2). This clustering depicts different patterns. Here, items tend to group around a purpose rather than a process, such as travel (e.g., solid-fuel rocket grouped with helicopter here, but not in the mechanism condition).
Fig 2

A dendrogram representing the function condition sortings.

In the general sorting condition, electrical items tended to group together, while the other two clusters were not as clear. It appears that participants considered function when judging similarity (see Fig 3). This behavior is consistent with the HIPE theory of function [31, 32], whereby an object’s physical structure is thought to be caused by its historically intended role (i.e., its function).
Fig 3

A dendrogram representing the general condition sortings.

In order to compare the clusterings, three metrics were employed: the Rand Index (RI; [33]), the adjusted Rand Index (ARI; [34]), and the Variation of Information criterion (VI; [35]). The results for these metrics are shown in Table 1.
Table 1

Similarities between the mechanism, function, and general item clusterings (higher numbers mean greater similarity for the RI and ARI, while lower numbers mean greater similarity for the VI).

             Function        General
Mechanism    RI = 0.77       RI = 0.72
             ARI = 0.51      ARI = 0.40
             VI = 0.64       VI = 1.01
Function                     RI = 0.85
                             ARI = 0.66
                             VI = 0.61
The RI is computed by counting the number of item pairs treated consistently in the two clusterings (grouped together in both, or separately in both) and dividing by the total number of pairs, consistent and inconsistent alike (inconsistent meaning together in one clustering but separate in the other). The index spans from 0 to 1, whereby identical clusterings obtain 1 and clusterings that share no consistency score 0. The ARI spans the same range but is designed to correct for agreement that is the result of chance. The VI takes an information-theoretic approach, measuring how much information one clustering provides about the other. As such, two identical clusterings obtain a VI score of 0 (i.e., neither clustering carries information absent from the other), and the VI has an upper bound of log n, where n is the number of points in the data set, since the more items there are, the more clusterings can theoretically exist. Across each metric, the function and general clusterings were the most similar, while the mechanism clustering was somewhat closer to the function clustering than to the general one. These results offer preliminary support for our hypothesis. They not only suggest that participants are sensitive to mechanism, but also that their sorting conforms to three mechanism domains: mechanical, chemical, and electromagnetic.

3. Studies 2a and 2b: Abstract attribution

Schultz, Bonawitz, and Griffiths [36] investigated whether preschoolers would make causal attributions based on the domains of their naïve theories. Motivated by Griffiths and Tenenbaum’s [37] theory-based causal induction framework, Schultz et al. pitted domain-specific evidence (e.g., theories) against domain-general statistical inference. In principle, children’s judgments might emerge from evidence, theories, or a combination of the two; in any case, their judgments should depend on the strength of each component. According to Schultz et al. [36], domain-specific theoretical knowledge manifests in our prior probabilities over theories. However, they also showed that children can pick up on statistical regularities that override domain-specific theories. We presented participants with a series of effects and asked them to select the most likely cause; for each effect, participants were required to select only one cause. The domain-matching hypothesis predicts that people will select the cause that matches the domain of the effect. Study 2a tested adults and Study 2b tested children. For the children, we wanted to know whether they would select a cause that matched the domain of the effect, as the domain-matching hypothesis predicts, or whether their choices varied with age across development.

3.1 Norming

Six items were crafted in total, three causes and three effects. Each item was designed with an intended mechanism domain in mind: mechanical, chemical, or electromagnetic. For example, the mechanical effect item was, "Imagine a machine that alters the shape of objects," and the chemical cause was, "This machine works by invoking a chemical reaction." The complete set of stimuli can be found in Table 2 below. The norming instructions given to subjects were as follows: “For each of the following statements, think about how the event of interest actually works. Think about the kinds of mechanisms involved, and then indicate whether you believe that the primary mechanisms are mainly mechanical, chemical, or energy-based (waves of energy/electricity) in nature.” The last category is referred to as “electromagnetic” in subsequent studies.
Table 2

Normed stimuli used in Studies 2a and 2b, alongside judged domain and chi-square tests for the three causes and the three effects in Studies 2a and 2b.

Mechanical. Cause item: "This machine works by applying physical pressure." (85.00% agreement with intended domain; χ2(2) = 24.10, p < 0.001). Effect item: "Imagine a machine that modifies the shape of objects." (90.00% agreement; χ2(2) = 29.20, p < 0.001).
Chemical. Cause item: "This machine works by invoking a chemical reaction." (95.00% agreement; χ2(2) = 34.30, p < 0.001). Effect item: "Imagine a machine that modifies the color of objects." (85.00% agreement; χ2(2) = 24.70, p < 0.001).
Electromagnetic. Cause item: "This machine works by emitting an electrical current." (80.00% agreement; χ2(2) = 19.60, p < 0.001). Effect item: "Imagine a machine that modifies the temperature of objects." (70.00% agreement; χ2(2) = 12.40, p = 0.002).
On average, the proportion of participants whose judged domains matched our intended domains was high (84.2%). Chemical items were most agreed upon (90.0%), followed by mechanical (87.5%) and electromagnetic items (75.0%; see Table 2). For every item, the intended domain was endorsed more often than either unintended domain. For each item, a chi-square test was performed to assess whether the distribution of judged domains differed from what would be predicted by chance across the three categories.
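The goodness-of-fit test above compares each item's judged-domain counts against a uniform chance distribution over the three domains. A minimal sketch follows; the 20-rater count and the 17/2/1 split are assumptions for illustration (they correspond to an 85% agreement rate and yield a statistic of 24.10, consistent with the mechanical cause item in Table 2, but the study's exact norming counts are not reported here):

```python
# Chi-square goodness-of-fit against chance (uniform over 3 domains).
# Hypothetical counts: 20 norming raters splitting 17/2/1 across the
# three domain options, i.e., 85% agreement with the intended domain.

def chi_square_stat(observed):
    n = sum(observed)
    expected = n / len(observed)  # chance: equal counts per category
    return sum((o - expected) ** 2 / expected for o in observed)

observed = [17, 2, 1]             # endorsements per candidate domain
stat = chi_square_stat(observed)
print(round(stat, 2))             # 24.1; df = 2 critical value at alpha = .05 is 5.99
```

A statistic above the df = 2 critical value of 5.99 indicates that endorsements depart significantly from a uniform chance distribution.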

3.2 Method

In Study 2a, 51 adult participants (24 males, 27 females; Mage = 31.59 years, SDage = 9.85 years) recruited from Mechanical Turk completed the task, taking on average 2 minutes and 53 seconds. In Study 2b, 88 children completed the online task using OptimalSort. The age range was 8 to 17 (N = 88; 38 male, 46 female, 4 preferred not to say; Mage = 13.55 years, SDage = 3.07 years). On each trial, participants in both studies were presented with one effect and asked to rank the three candidate causes in descending order of likelihood. Participants did this for each effect. The questions and their choices were presented in random order.

3.3 Results & discussion

3.3.1 Study 2a

For each effect, the first-ranked causes were analyzed. In the mechanical effect condition, the mechanical cause was selected first 37 (73%) out of 51 times, χ2(2) = 35.29, p < 0.001. In the chemical effect condition, 42 (88%) of 48 judgments (3 participants failed to make judgments) ranked the chemical cause as most likely, χ2(1, N = 48) = 63.88, p < 0.001. In the electromagnetic effect condition, the electromagnetic cause was selected first 26 (53%) of 49 times (2 participants failed to make judgments), χ2(2, N = 49) = 8.61, p = 0.01. Overall, participants were more likely to select a cause that matched the domain of the effect (see Fig 4 and Table 3).
Fig 4

Proportion of responses for which the 1st-ranked cause matched the intended domain of each of the three effects (chance is 33.3%; error bars are 95% confidence intervals).

Table 3

A frequency listing of all ranking data.

                           Rank   Mechanical Effect   Chemical Effect   Electromagnetic Effect
Mechanical Cause           1st           37                  5                    11
                           2nd           10                  7                     5
                           3rd            4                 36                    33
Chemical Cause             1st            7                 42                    12
                           2nd           27                  3                    25
                           3rd           17                  3                    12
Electromagnetic Cause      1st            7                  1                    26
                           2nd           14                 38                    19
                           3rd           30                  9                     4

3.3.2 Study 2b

For the mechanical effect, 66 (75%) out of 88 participants appropriately selected the mechanical cause. For the chemical effect, 67 (76%) participants selected the chemical cause. A similar pattern was observed for the electromagnetic effect, where 61 (69.3%) participants appropriately selected the electromagnetic cause. One-way ANOVAs showed no significant age effects for any domain (mechanical, chemical, or electromagnetic; all ps > .05). The results indicated that children, like adults, were more likely to select a cause that matched the domain of the effect. An alternative account of how people make these decisions is that they are sensitive to base rates: people select what they believe to be the cause with the highest base rate, regardless of the context that the given effect provides. Because we do not know the base rates of our causes, it is unclear what that would mean for these stimuli, so we tested the base-rate account in Study 3. Another account of how people come to attributions is that they base their responses on similarity [6]. For example, they might select the mechanical cause when it is most similar to the mechanical effect. However, the similarity relations in our stimuli do not clearly coincide with domain, at least if we understand “similarity” to refer to perceptual similarity. If we allow “similarity” to include non-perceptual elements, then it becomes too unconstrained to be helpful. For example, perhaps the application of pressure is more similar to shape modification than to color change. However, is emitting an electrical current more similar to temperature change or color change? In most cases, electrical currents are invisible; and to a lesser extent the same is true of temperature fluctuations.
If people are basing their judgments on the fact that temperature changes tend to be associated with electromagnetic activity, then they are relying on the similarity of domains, not on the perceptual similarity of events. Another account might predict that people make these judgments only after a theoretical evaluation of the mechanisms involved in each pair of cause and effect. According to this “knowledge hypothesis,” people would respond this way not because of mechanism domain matching, but because they have a complete-enough understanding of the actual mechanisms to derive a response based on a mental simulation or similar computation. The fact that the children’s data largely replicated the adult data argues against this account. Presumably, the ability to generate a mental simulation of a complex process develops over time, and children are less likely to have a complete-enough understanding of artifact mechanisms. We conclude that the best explanation for the data is the proposed domain-matching heuristic.

4. Study 3: Abstract prediction

Study 3 was designed to mirror Study 2 but in the causal direction, the more natural direction to reason in [38-40]. The domain-matching hypothesis is agnostic with respect to causal directionality, and so the prediction here is the converse: that people should prefer effects that come from the same domain as their cause.

4.1 Method

Fifty-one participants (32 males, 19 females; Mage = 31.31 years, SDage = 9.03 years) recruited from Mechanical Turk completed the task in an average of 3 minutes and 20 seconds. The materials from the previous two studies were reused. This time, participants were first presented with a cause and then asked to rank the three effects in descending order of likelihood. Participants repeated this for each cause. The three questions and their choices were presented in random order.

4.2. Results & discussion

For the mechanical cause, the mechanical effect was selected first 34 (67%) times out of a total of 51 judgments, χ2(2) = 25.76, p < 0.001. In the chemical cause condition, 35 (69%) of 51 judgments indicated the chemical effect was most likely, χ2(2) = 31.53, p < 0.001. For the electromagnetic cause, the electromagnetic effect was selected first 36 (71%) of 51 times, χ2(2) = 31.88, p < 0.001. We again take this as evidence supporting the domain-matching heuristic. This experiment rules out the idea that people are simply choosing the option with the highest base rate: that hypothesis predicts that a single effect would be chosen more often than the others across all items.

5. Studies 4a and 4b: Concrete attribution

Studies 2 and 3 asked participants to reason about relatively abstract items. Study 4 also asked participants to perform causal attribution, but with a larger number of more diverse items. This study serves to directly compare the predictions of our mechanism-domain hypothesis to those of statistical and knowledge-based accounts. Study 4 test items were designed so that causes and effects with matching mechanism domains would be objectively or subjectively counter-normative (i.e., in contradiction with statistical or theoretical knowledge, or both). We normed a set of triplets composed of an effect, a within-domain cause, and a cross-domain cause. As in Study 2, participants were presented with an effect and asked to choose the likelier cause. During our norming phase, we also collected likelihood judgments for all events. A sample triplet, including mechanism domain and the likelihood obtained during norming, is presented in Table A1 in S1 Appendix. We predicted that people would select causes that match the mechanism domain of the effects (i.e., within-domain causes) more often than ones that do not match the domain of the effects (i.e., cross-domain causes). Because the cross-domain causes were chosen to be more likely than the within-domain causes, our predictions diverge from those based on the prevalence of causes or the covariation of causes and effects (which would suggest, for example, that cars are likelier to be in accidents than they are to have their batteries short-circuit). It also implies that people will often respond contrary to what a knowledge-based account would dictate. For example, in one test item we ask participants to determine whether a house fire is more likely to be the result of leaving a wool sweater by the lit fireplace or plugging an air conditioning unit into an extension cord.
Though people might assume that leaving the sweater is the more likely culprit, extension cord fires are a common cause of houses burning down, while wool is actually used in airplane upholstery because of its fire-retardant properties. Our predictions imply there is more to causal attribution than base rates and mechanistic knowledge. Study 4a examined adults’ judgments and Study 4b children’s.

5.1 Method

5.1.1 Norming

Norming was conducted over two sessions. In the first session, a group of participants was asked only about causes; in the second, another group was asked about effects. For session 1, 60 participants (28 males, 32 females; Mage = 34.12 years, SDage = 12.77 years) recruited from Amazon’s Mechanical Turk via TurkGate [41] completed the task. Participants were paid twenty cents for participation, which took on average 6 minutes and 1 second. Session 1 asked half of the participants (N = 30) to make likelihood judgments for statements like, “The rotor inside a random fan broke,” and, “A person spilled bleach on their sweater”. The other half of participants (N = 30) made domain judgments for the same items (i.e., mechanical, chemical, or electromagnetic). The full instructions were, “For each of the following statements, think about how the event of interest actually works. Think about the kinds of mechanisms involved, and then indicate whether you believe that the primary mechanisms are mainly mechanical, chemical, or energy-based (waves of energy/electricity) in nature.”

Session 2 was conducted between two other, unrelated studies on political psychology. 96 participants (35 males, 59 females; Mage = 35.57 years, SDage = 12.27 years) recruited from Amazon’s Mechanical Turk via TurkGate [41] completed the task. Participants were paid sixty cents for participation in a larger group of surveys. In session 2, participants received the same instructions and questions as in session 1; however, they were asked about the effect items rather than the cause items. One group of items was eliminated due to a typo. The final set of 57 test items, alongside their likelihood and domain judgments, can be seen in Table A1 in S1 Appendix, which shows the mean and standard deviation of likelihoods for each item.
In addition, the proportion of participants who selected each mechanism domain is given, along with a chi-square test (df = 2) of whether responses differed significantly from chance. The last column applies the same inclusion criterion used in the previous experiment: whether more than 1/3 of participants agreed with the item’s intended domain.

5.2 Study 4a

30 adult participants (19 males, 11 females; Mage = 35.47 years, SDage = 11.81 years) recruited from Mechanical Turk completed the task. Participants took on average 5 minutes and 18 seconds to complete the study. Participants were presented with effect items from the mechanical, chemical, and electromagnetic mechanism domains, each of which was paired with a cause that was within-domain (e.g., mechanical causes mechanical) and a cause that was cross-domain (e.g., chemical or electromagnetic causes mechanical). For each effect presented, participants were asked to choose which of the two causes they believed was responsible for the effect. All items were presented in random order (see Table A1 in S1 Appendix). The full instructions were, “Thank you for participating! Please respond to the following statements according to your beliefs. For each statement choose the reason that you feel is more likely. Read all of the text but do not hesitate too long on any particular response.”

5.3 Study 4b

In total 88 children completed the task online. Participants responded to 9 questions presented with triplets following the same testing protocol as adults. The triplet items contained an effect, a within-domain cause, and a cross-domain cause as shown in Table 4. As in Study 4a, children were expected to follow the domain-matching hypothesis and select within-domain causes more often than cross-domain causes.
Table 4

Triplets.

Item / causes                                                                    Domain      %
1. Tim couldn’t put the square block in the round hole.                            M
   The square was blue, and the round hole was green.                              C         3.4
   The square was bigger than the round hole.                                      M        96.6
2. John’s car wasn’t very shiny after treating it with wax.                        C
   John used the wrong kind of wax when waxing his car.                            C        64.8
   John used the wrong waxing motion when waxing his car.                          M        35.2
3. Alfie made bread, but he found it to be smaller than he hoped.                  C
   Alfie didn’t mix the dough enough when making bread.                            M        54.5
   Alfie used too little yeast when making bread.                                  C        45.5
4. The radio in Hannah’s car no longer works.                                      E
   Hannah’s car was in an accident.                                                M         4.5
   Hannah’s car’s battery short-circuited.                                         E        95.5
5. Jem’s cell phone battery doesn’t hold charge as well as it used to.             E
   Jem’s phone vibrated during incoming calls.                                     M         9.1
   Jem’s phone’s battery never fully drained.                                      E        90.9
6. Tom’s house caught on fire.                                                     C
   A fireplace was lit with a wool sweater nearby.                                 C        67.0
   A random person plugged an air conditioner into an extension cord.              E        33.0
7. Jack got sick after using a dirty restroom.                                     C
   Jack rubbed his hands together under running water when cleaning his hands.     M        50.0
   Jack dipped and left his hands in soapy water before rinsing them off
   when cleaning his hands.                                                        C        50.0
8. The fan inside Tim’s laptop suddenly slowed down.                               M
   Tim’s laptop was kept in a very dusty room.                                     M        70.5
   Tim’s internet cut off while watching a movie.                                  E        29.5
9. Sara’s sweater got a hole in it.                                                M
   Sara spilled bleach on her sweater.                                             C         6.8
   Sara’s sweater got caught on her belt’s fastener.                               M        93.2

(Domain: M = mechanical, C = chemical, E = electromagnetic. % = percentage of children selecting each cause.)

5.4 Results & discussion

5.4.1 Study 4a

As predicted, participants selected the within-domain cause more often than the cross-domain cause, χ2 (1) = 20.15, p < 0.001 (see Fig 5).
Fig 5

Proportion of attributions (within and cross proportions sum to 1) and mean likelihoods for causes that matched the domain of the effect (within-domain) vs. those that did not (cross-domain).

(Error bars are 95% Confidence Intervals for the proportions).

We also predicted that participants would choose the within-domain causes even when those causes did not receive the highest likelihood judgments during norming; likelihoods were on average higher for cross-domain items (Mcross = 64.90%, SDcross = 33.61%) than within-domain ones (Mwithin = 57.76%, SDwithin = 34.42%; t(838) = -3.04, p = 0.002). Participants’ attributions were typically in disagreement with any theory-based response as well.
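The independent-samples comparison of norming likelihoods can be checked from the summary statistics alone. A minimal sketch, assuming the reported df of 838 reflects an even 420/420 split of the 840 likelihood ratings between within- and cross-domain items (that split is our inference, not a reported figure):

```python
from scipy.stats import ttest_ind_from_stats

# Reported norming likelihoods (in %); the 420-per-group sample sizes are
# an assumption consistent with the reported df of 838 (n1 + n2 - 2).
res = ttest_ind_from_stats(
    mean1=57.76, std1=34.42, nobs1=420,   # within-domain
    mean2=64.90, std2=33.61, nobs2=420,   # cross-domain
)
print(f"t(838) = {res.statistic:.2f}, p = {res.pvalue:.3f}")  # t(838) = -3.04
```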

5.4.2 Study 4b

Children selected the within-domain cause more often than the cross-domain cause. The most challenging triplets were those requiring a choice between a mechanical and a chemical cause (see Table 5). Further one-way ANOVAs for each triplet indicated nonsignificant effects of age on participants’ choices after Bonferroni correction (p > .05 for all triplets except the sixth, for which p = .033).
Table 5

Chi-square and logistic regression tests for each triplet.

          Chi-square tests                    Logistic regressions
Triplet   N    Mean   SD     χ2       Sig    Nagelkerke R2   Wald (age)   Sig
1         88   1.97   .183   76.409   .000   .408            2.195        .138
2         88   1.35   .480    7.682   .006   .050             .393        .531
3         88   1.45   .501     .727   .394   .019             .591        .442
4         88   1.95   .209   72.727   .000   .252             .174        .677
5         88   1.91   .289   58.909   .000   .076            1.712        .191
6         88   1.33   .473   10.227   .001   .067             .071        .789
7         88   1.50   .503     .00   1.0     .062            1.841        .175
8         88   1.30   .459   14.727   .000   .018             .166        .683
9         88   1.93   .254   65.636   .000   .292             .025        .875
To examine whether the percentage of children choosing within-domain causes differed from the percentage choosing cross-domain causes, we used a chi-square test of the proportions against the null hypothesis of no preference. As can be seen in Table 5, the 3rd and 7th triplets were nonsignificant. All other triplets showed a significant difference in participants’ choices favoring within-domain causes. For the first triplet, the regression model explained 40.8% (Nagelkerke R2) of the variance in children’s responses and correctly classified 96.6% of cases, with a nonsignificant age effect. For the second triplet, the model explained 5% of the variance and correctly classified 67% of cases, again with a nonsignificant age effect. The analyses for the remaining triplets likewise indicated nonsignificant age effects, in accord with the ANOVA results (see also Fig 6, which shows participants’ choices across age groups on the triplets used in Studies 4a and 4b).
Fig 6

Participants’ choices across age groups.

Both adults and children selected the within-domain cause more often than the cross-domain cause, suggesting their choices were based on the domains of the mechanisms. The results could be due to participants considering causes from the same domain to have higher causal strength in generating their corresponding effects. To shed light on this issue, we ran Study 5, which presented participants with the same causes paired with multiple effects, so that the within-domain option was not always the one with the greater causal strength.

6. Study 5: Attribution with stable causes

Is it possible that something altogether different from base rates, causal knowledge, similarity, or mechanism domains is driving behavior? Maybe people are responding to another cue. For example, consider the test item: “A person’s sweater got a hole in it.” The corresponding, cross-domain cause is, “The person spilled bleach on their sweater,” while the within-domain cause is, “The person’s sweater got caught on their belt’s fastener.” For the sake of completeness, we test the possibility that some people are endorsing the within-domain cause not because it matches the mechanical mechanism domain, but because they believe belt fasteners are highly causative for some reason. One way to test such factors would be to use the same causes across different effects (i.e., to manipulate effects while holding causes stable). Would the same people endorse belt fasteners regardless of the effects they are presented with? If in these cases people still select the within-domain cause more often, additional support will be provided for the domain-matching hypothesis. To test this, we developed a new set of quadruplets (rather than triplets) as shown in S2 Appendix. For example, one quadruplet’s effects were, “A person’s sweater got discolored,” and, “A person’s sweater got holes in it.” The quadruplet’s causes were, “A person poured bleach over a stain on their sweater,” and, “A person rubbed steel wool over a stain on their sweater.” Bleach can discolor a sweater, but it can also create holes by dissolving the sweater’s fabric. Given the effect of “holes in a sweater,” the domain-matching prediction is that participants would more often select the steel wool cause, even though, again, it might be less probable (steel wool likely does more damage than good in removing common stains). At the same time, the steel wool cause is predicted to elicit a different response given the discoloration effect. 
This study relied on additional norming to establish a set of items in which each cause had a validated within-domain as well as a cross-domain effect, and vice versa. Study 5 then asked participants once again to make a series of causal attributions. The prediction, again, was that participants would select the within-domain cause more often than the cross-domain cause.

6.1. Method

6.1.1. Norming

This study was conducted between two other, unrelated studies on political psychology. 100 participants (41 males, 59 females; Mage = 36.49 years, SDage = 12.75 years) recruited from Amazon’s Mechanical Turk via TurkGate [41] completed the task. Participants were paid sixty cents for their participation in a larger group of surveys. Participants were assigned to one of two conditions, such that any given participant made either a series of likelihood judgments or a series of domain judgments, with all items presented in random order. Judgment results are provided in S2 Appendix. According to the same 1/3 criterion used in previous studies, judged domain matched intended domain for nearly all (92.86%) items. The two quadruplets that did not completely meet the 1/3 criterion are italicized in Table A2 in S2 Appendix. For the test phase, one hundred participants (44 males, 56 females; Mage = 32.95 years, SDage = 11.17 years) recruited from Mechanical Turk completed the task within a larger group of surveys. These participants were provided with an effect and asked to determine the likelier cause. Any given participant was presented with only one of the two effect items per quadruplet.

6.2. Results & discussion

Although we used the same causes across different effects, participants selected the within-domain cause significantly more often (63.00%) than the cross-domain cause (37.00%), χ2 (1) = 33.80, p < 0.001. Thus, the same causes and effects elicited different responses depending on their context (i.e., the effect or cause they were paired with). This finding also discredits both the base-rate hypothesis and the idea that idiosyncratic properties of select events engender stable response patterns. We take this as evidence that the causal strength of each cause for its corresponding effect is higher when the cause matches the domain of the effect; we see greater causal strength when domains match. While it could be that the results occurred because causes that matched the domain of their effect just happened to have greater causal strength, this would be an unlikely coincidence. And while it is possible that each of our items suffers from a different confound, the one invariant across items is that causal strength is higher when domains match, suggesting that commonality of domain is the operative variable.
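The overall test above can be reconstructed from the reported percentages. The total number of judgments is not stated; a total of 500 (315 within-domain, 185 cross-domain) is a hypothetical reconstruction that recovers the reported χ2(1) = 33.80 exactly, so the figures are at least mutually consistent under that assumption.

```python
from scipy.stats import chisquare

# 63% within-domain vs. 37% cross-domain choices are reported, but not the
# total judgment count. The 315/185 counts (N = 500) are an assumed
# reconstruction that reproduces the reported statistic.
stat, p = chisquare([315, 185])  # null hypothesis: an even 250/250 split
print(f"chi2(1) = {stat:.2f}, p = {p:.2g}")  # chi2(1) = 33.80
```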

7. General discussion

This study investigated whether people tend to think about the mechanisms that relate causes to effects as sitting within one of a small number of domains, that is, whether people use a domain-matching heuristic. In total, 5 studies with adults and 2 with children looked for evidence of the phenomenon. The first study required participants to cluster artifacts, and we identified specific domains that people commonly employ in their clusters. Using these domains, the subsequent studies evaluated the domain-matching heuristic by testing predictions about causal attribution, prediction, judgment, and subjective understanding. The results suggest that people do use knowledge of mechanism domains when engaged in various causal reasoning tasks. People’s judgments in attribution, prediction, and believability abide by the domain-matching hypothesis, which states that causal perception is enhanced when cause and effect share a common mechanism domain. These findings cannot easily be explained by an appeal to base rates, mechanism knowledge, or similarity between events alone. The mechanism domains we found evidence for were the mechanical, chemical, and electromagnetic domains. Even though our preliminary evidence supported these domains in particular, the domain-matching hypothesis is not committed to any specific domains, and there very well may be others, such as those in the social realm; we excluded these in order to restrict scope. Other researchers have distinguished biological causal explanations from explanations for artifacts and social entities [29, 42]. Others focus on how teleological explanations differ from causal ones [16, 43–45]. Our studies are limited to artifacts. Within that broad domain, they suggest a basic distinction among mechanical, chemical, and electromagnetic forms. Mechanism domains seem to help people make sense of an otherwise complex world.
They possibly allow us to bypass deep understanding and come to satisfactory conclusions in a cognitively economical manner. We propose that domain matching is a heuristic employed to reduce the search space during causal reasoning; though it can pick up on the veridical structure of the world, it can also lead us astray (albeit in predictable directions). Developmental data supported this interpretation and revealed that, despite limited experience with the artifacts, children were more likely to attribute a causal relation when two events shared a mechanism domain. Their choices were substantially above chance, comparable to adults’. This implies that, even in the absence of mechanistic knowledge, the ability to reason about causal relations, and about the mechanisms linking them, has matured into adult form by around the age of 11. We leave open questions about the manner and extent to which much younger children use the domain-matching heuristic. The data we report here make only a small incursion into our understanding of how people develop their extraordinary ability to understand the causal powers of everyday objects.

S1 Appendix. List of items used in Study 1, and Table A1: norming data for Studies 4a and 4b.

S2 Appendix. Table A2: norming data for Study 5.
Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see:  http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at  https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols . We look forward to receiving your revised manuscript. Thank you for choosing PLOS ONE for communicating your research. Kind regards, Sasha Alexander N. 'Sasha' Sokolov, Ph.D. Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. 
The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. You indicated that you had ethical approval for your study. In your Methods section, please ensure you have also stated whether your IRB specifically approved the method of parental consent. 3. Please amend either the title on the online submission form (via Edit Submission) or the title in the manuscript so that they are identical. 4. We note you have included a table to which you do not refer in the text of your manuscript. Please ensure that you refer to Table 2 in your text; if accepted, production will need this reference to link the reader to the Table. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Partly Reviewer #3: Partly ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. 
For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: No Reviewer #2: Yes Reviewer #3: No ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: This paper examines a longstanding puzzle in cognitive psychology: On the one hand, many results point to the importance of mechanism knowledge in causal reasoning; yet on the other hand, people are remarkably bad at articulating these mechanisms, suggesting that mechanism knowledge is shallow and skeletal. This paper presents evidence for a domain-matching heuristic, such that people categorize causal mechanisms into broad domains (e.g., mechanical, chemical, electromagnetic), matching causes and effects by domain despite a lack of detailed knowledge of how these mechanisms work. Several experiments in both adults and children support this hypothesis. Overall, I enjoyed reading this paper and think it makes a contribution to the literature on causal cognition. Here are a few questions I had while reading it, mainly aimed at further understanding the authors’ theory and how it fits in with other perspectives. 1. 
The coverage in the introduction of the literature on mechanism knowledge in causal reasoning is generally very good. For readers who are unfamiliar with phenomena such has the illusion of explanatory depth and the related results from the authors’ own research group, it would be useful to devote a couple more paragraphs to summarizing the evidence that people have impoverished knowledge of causal mechanisms. 2. One relevant literature the authors do not consider is related work on other similarity-matching heuristics in causal reasoning. Off-hand, I can think of at least four papers in this vein that the authors might find useful in terms of explaining and situating their theory: Einhorn, H. J., & Hogarth, R. M. (1986). Judging probable cause. Psychological Bulletin, 99(1), 3-19. Johnson, S. G. B., & Keil, F. C. (2014). Causal inference and the hierarchical structure of experience. Journal of Experimental Psychology: General, 143(6), 2223. LeBoeuf, R. A., & Norton, M. I. (2012). Consequence-cause matching: Looking to the consequences of events to infer their causes. Journal of Consumer Research, 39(1), 128-141. Lim, J. B., & Oppenheimer, D. M. (2020). Explanatory preferences for complexity matching. PloS one, 15(4), e0230929. Work on the “laws of sympathetic magic” is also broadly relevant, although less directly about casual reasoning in particular: Rozin, P., Millman, L., & Nemeroff, C. (1986). Operation of the laws of sympathetic magic in disgust and other domains. Journal of Personality and Social Psychology, 50(4), 703–712. It would be useful if the authors could explain how their similarity-matching principle relates to the ideas in these papers. Does it share a similar adaptive or informational logic? 3. I am a little confused about the authors’ use of terminology surrounding “domains.” This is not entirely on the authors, as this terminology is somewhat slippery in the broader literature as well. 
(i) *Categories* In a somewhat loose sense, sometimes people use the word “domain” to simply refer to different categories of knowledge or phenomena. For example, professors at a music conservatory might have different “domains” of knowledge such as music history, music theory, and music performance. In that sense, the categories studied in this paper (e.g., electromagnetism vs. mechanics) surely count as domains. It seems like there’s nothing inherently special about these particular categories, but rather these are the level of categories that participants naturally learned. In the vein of work by Coley, Medin, and others on expertise and categorization, these categories could easily shift with increased knowledge and likely would become more specific with more experience. (ii) *Modules* In its more technical sense, “domain” refers to something more like a mental module. For example, folk-psychology or folk-physics could be a domain because core knowledge about these domains plausibly evolved, whereas folk-music-history would not be a good candidate for a domain, nor (I think) would folk-electromagnetism. Presumably modules (to the extent they exist) have not changed over the course of human history, but I am sure that everyday intuitions about electromagnetism have changed very much over the past 100 years. As you can see, I think the authors are basically using the word “domain” synonymously with “category,” but this is causing me some confusion as there are some references to the domain-specificity literature that make it seem like the authors are sometimes using some of the theoretical apparatus of modules. Related – and perhaps this would help with the above point – the discussion of Sloman, Lombrozo, and Malt’s analysis is not integrated as well as it could be with the other points made in the introduction. I like the Sloman et al. 
chapter a lot and find it a good way of summarizing various writers’ views on this topic, and I also find those authors’ “mild ontology” view plausible. But the relationship between mild ontology and the “domain matching” hypothesis isn’t particularly clear, mainly because “domain specificity” or “extra strong ontology” is very much along the lines of modules (option ii), whereas “mild ontology” is much more along the lines of categories (option i). Basically, I would like the authors to better articulate what they mean by “mechanism domains,” and possibly consider using a clearer term that does not carry as much baggage and ambiguity. 4. One thing that made the paper a little bit hard to read was the introduction of alternative explanations throughout the presentation of the results rather than in the front-end of the paper. Could the authors add a section that collects the alternative accounts in one place (e.g., similarity, base rates, mechanism knowledge) and explains the kind of evidence that will be important for addressing them? These alternatives are not explained in very much detail in the current version of the manuscript which makes it somewhat difficult to evaluate their plausibility. In particular, the similarity account seems like a plausible alternative that should be discussed more thoroughly, both in the introduction and discussion, as fleshing this out would likely help to clarify the authors’ own proposal. 5. Overall I found the package of studies to be convincing. I had just a couple suggestions around the presentation: (i) The results of Study 1 sound very interesting but they are only described qualitatively. Could the authors provide some visualization of how participants sorted the items? (ii) Could the authors present figures that show the age effects (including comparisons to adults when the data are comparable)? Even though age effects were not substantial, the developmental aspect of these studies will be interesting to many readers. 
Minor: - The numbering of the tables is confusing, as the tables are numbered sequentially between the main text and appendix. Referring to the appendix tables as A1 and A2 (or simply as Appendix 1 and 2) would be clearer. - Some of the references to tables seem to refer to the incorrect table, including lines 446, 474, and 521… generally throughout Studies 4 and 5 the relationship between the main text and tables is confusing. Thanks very much for the opportunity to read your work, and best of luck with next steps! Sam Johnson (I sign all reviews) Reviewer #2: Review of “Causal reasoning without mechanism” by Dündar-Coecke, Goldin, and Sloman In this paper, Dündar-Coecke, Goldin, and Sloman (DGS henceforth) investigate how reasoners (human adults and children) assess the causal status of co-occurred events in the absence of specific mechanistic or covariational knowledge about the corresponding types of events. The main hypothesis DGS propose in this paper is that people assess causal links between events in such cases by considering abstract mechanism domains. More specifically, the authors propose and test the idea that reasoners are more likely to regard certain events as causally connected if these events belong to the same global mechanism class rather than to different mechanism classes. The authors first established (Experiment 1) and then tested (subsequent studies) three global mechanism categories: (1) mechanical, (2) chemical, and (3) electromagnetic. For both adults and children, it was found that causal judgments, irrespective of whether they were made in the predictive or the diagnostic direction (Experiments 2 and 3), corresponded to the proposed domain matching hypothesis: when selecting the most probable cause (or effect) of a target event, subjects tended to select an event that belonged to the same mechanism category. 
Experiment 4 replicated the effect with test items (cause and effect events) that were more concrete than the rather abstract items used in Experiments 2 and 3. I think that this is an interesting topic and that DGS propose what is, in my view, a very plausible hypothesis. I find it quite natural, for example, that if a reasoner observes an event and tries to uncover its cause(s), he or she will search for events that could plausibly be causally relevant. I share the authors’ intuition that broad mechanism categories are a useful guide in this respect. I also think that the question of how people form causal beliefs in the absence of directly relevant information, such as information about covariation, has been understudied. However, although I’m quite enthusiastic about DGSs’ research, I think that some more work needs to be done before this manuscript can turn into an impactful publication. Some of my remarks, which can probably be dealt with quickly, concern the flow and the presentation of ideas and conclusions, and the lack of some information in the empirical sections, but I also think that an additional final experiment could be necessary to have a more convincing package.

Flow of ideas and structuring: My impression is that the main theory of the paper is told relatively straightforwardly, but I also was a bit confused by some of the paragraphs in the introduction and I think that many of the paragraphs are not very well connected. I fear that this tends to impair the reading flow. For example, in section 1.1 “Causal attribution in the absence of mechanism knowledge” it was not really clear to me what the authors wanted to tell the reader. If the purpose here was to convince the reader that reasoners tend to lack specific mechanism knowledge in many domains but often seem to be guided by a more abstract sense of mechanisms, then this could be done more concisely. I really wondered, for example, why the paragraph on page 5 beginning in line 108 was in there. 
The paragraph starts with “`The domain-based difference’ hypothesis, proposed by [17], suggests that domains can indeed play functional role even after equating strength of belief. In that study, a sample of people were asked several questions from the scientific and religious domains […]” and ends with “[…] This showed that people think a mechanistic explanation is a stronger requirement in science than religion.” I don’t understand why this is relevant for the present paper. I had the same question for the paragraph about Rozenblit and Keil’s results starting in line 145, in which DGS summarize that people’s confidence in their understanding of devices decreases, for example, after they had tried to offer an explanation of how these devices work. Again, I think the point that DGS want to make in their theory section is that people do not seem to rely on specific mechanism knowledge, but this doesn’t mean that they are not guided by more abstract mechanism knowledge. I think the different paragraphs need to be connected better. Right now, it feels a bit like loose pieces and the main message emerges to the reader only somewhat implicitly.

Presentation of empirical results: I think it would be appropriate to give some rationale for the sample sizes that the authors employed in their studies. Did the authors do a priori planning of their studies? Are the sample sizes based on pilot or previous studies? In the results sections when the authors present the means, I think it would be good to also report confidence intervals for the means and not just standard deviations. A related point is that, in Figure 2, the authors present 95% CIs for the proportions but SEs for the likelihoods. Why? I think this should be the same for the bars in that figure. 
Since most readers will probably not expect different types of error bars within the same graph, a spontaneous impression (until one reads the legend) is that the proportion judgments were estimated much less precisely than the likelihood judgments, but this was actually not the case. Also, in Figure 1, the reader is not told what the error bars are. I think it would be best to have 95% CIs all the time.

Paragraph beginning in line 253: DGS wrote “[…] their sorting conforms to the predicted mechanism domains: mechanical, chemical, and electromagnetic.” I think these domains were not predicted in the introduction. I was wondering how the authors can claim here that they found the predicted types of categories given that the goal of that study and the conducted cluster analysis was to identify relevant domains?

Items shown in Table 4: I was wondering whether at least some of the items would have led to totally different results if they had been formulated slightly differently. Take, for example, Item 3: “Alfie made a bread, but he found it to be smaller than he hoped.” The different cause items are “Alfie didn’t mix the dough enough when making bread.” and “Alfie used too much yeast when making bread”. Subjects found the first option to be more likely, but what would have happened if the second option had mentioned “too little” rather than “too much” yeast? Using too little yeast leads to smaller breads, so wouldn’t this have changed subjects’ selections? The items shown in Table 4 bring me to my biggest concern, which I think might have to be addressed in a final separate study. While I think that the authors convincingly address the potential base rate problem, another problem that I see is that subjects might have thought about items and might have based their selections on considerations of causal strength. Take, for example, Item 9 about the hole in the sweater, which is supposed to be an event belonging to the mechanical domain. 
The two potential causes are that bleach was spilled on the sweater (chemical cause) and that the sweater got caught on her belt’s fastener (mechanical). Most subjects apparently chose the “belt fastener”. Apart from belonging to the same domain, doesn’t this cause also have a higher causal strength in leading to this effect? I think it is statistically more likely that a sweater ends up with a hole if it gets tangled up in the belt fastener than when it gets treated with bleach. I think most people who use bleach will not use so much that holes result, while an accidental rip of a pullover on the belt fastener is not unlikely to lead to a hole. An even more drastic example is Item 1 about the square that doesn’t fit through a round hole. The potential causes are that the square was too large for the hole and that their colors didn’t match. Most people know that color is causally irrelevant, while size is not. This knowledge, I think, is very concrete and I don’t think subjects’ decisions here were guided only by means of abstract domain matching. I think that DGSs’ results would be more convincing if they controlled for the causal strength between the potential causes and the effect. Even when controlling for base rate, it might be that subjects based their decisions on causal strength considerations. I see that DGS aimed to address this concern in their Study 5 (intended to control for how “causative” one event is for the other), but I don’t think that the way they did this is compelling. The idea in Study 5 was to control for how “causative” causes are by pairing them with different effects. For example, the cause item “a person rubbed steel wool over their sweater” was once coupled with the effect item “A person’s sweater got discolored” and once with the effect item “A person’s sweater got holes in it”. Another alternative cause item that was also used was “A person poured bleach over a stain on their sweater”. 
In the main task, subjects were presented with one of the effect items and asked to select the most likely cause. It was found, for example, that subjects who saw the effect item “A person’s sweater got holes in it” tended to select the mechanical cause item “A person rubbed steel wool over their sweater”, while they tended to select the cause item “A person poured bleach over a stain on their sweater” when the effect item was “A person’s sweater got discolored”. My concern is that subjects here simply selected the causes that have a higher causal strength in generating the presented target effects. Spilling bleach over a sweater will rarely lead to a hole in a sweater, whereas the causal strength between “rubbing a sweater with steel wool” and “making a hole in the fabric” is probably much higher. The reverse is true for the “discoloration” effect. DGS write that if one of the causes is more “causative” then subjects’ preference shouldn’t change depending on the effect item. But causal strength is relative. One and the same cause can have high causal strength with respect to one type of effect and weak (or zero) causal strength with respect to another. I was also wondering whether a causal strength theory can actually be dissociated from the domain matching theory that DGS propose. If not, that is, if causal strength and domain matching always make similar predictions, what would be needed is an experiment showing that subjects, when searching for the likeliest cause, do not think of causal strength (although it would lead to the same behavior if they did) but really think about broader mechanism classes. 
Minor comments:
- Line 153: I think a paragraph shouldn’t start with “So…”
- Line 153: “[…] gab between, on one hand, […]” should be “on the one hand”
- Line 175: “This implies that mechanism domains are not parts of a […]” should be “part” not “parts”
- Line 185: “Study 2 and 3 tested one […]” should be “Studies 2 and 3 tested one […]”
- Line 187: “[…], study 4 and 5 […]” should be “Studies 4 and 5 […]”
- Line 211: “[…] size and already constucted […]” should be “constructed”
- Line 313: “In the mechanical effect, the mechanical cause […]” should be sth. like “In the mechanical effect condition, […]”
- Line 341: “[…], so we test the base rate account in Study 4.” should be “tested”
- Line 488: “As predicted, people selected […]” should be “subjects”
- Line 497: “[…] causes even if when […]” should be either “if” or “when”

In the reference list, many journal names were abbreviated “Cog Psyc.” or “Phil of Sci,” and I wondered whether this format is correct. All in all, I think that this is an interesting project and I think that this work can make an interesting contribution. However, I think that the writing requires some polishing; my main point here is that I think that the paragraphs are often too loosely connected. I also think that DGS need to address the causal strength argument, although I unfortunately can’t suggest a specific experiment right now that would address this issue straightforwardly. Signed, Simon Stephan

Reviewer #3: This paper examines whether adults and children use a "mechanism domain" heuristic when connecting causes to effects. They find that people tend to associate independently-validated 'mechanical', 'chemical', and 'electromagnetic' effects with causes in the same domain, and vice versa. I have several concerns about Study 1, but I also believe that it could be either substantially expanded or removed from the paper and the end result would be suitable for publication in PLoS ONE. 
First, the main issue with Study 1 is simply the lack of direct data available to readers, or at least reviewers. There is no listed repository that I could see linked in any of the materials or supplemental information that would let me see the raw data from this study, and the results are described at an extremely coarse level of detail. While the authors claim in the data availability statement that the data are available in the supporting information files, no information on Study 1 is found in either appendix, and no other files were in what I, as a reviewer, was able to download. At a minimum, I would have expected a table showing the individual items in each cluster in the mechanism condition (if not in each condition), since the identification of the clusters as "mechanical", "chemical", and "electromagnetic" is not currently supported in any way other than the authors' assertion and a handful of examples from each cluster. However, Study 1 may also be simply redundant, because each experiment includes its own stimulus validation experiment. Right now, with so little information about Study 1 available, I read the categories as largely arbitrarily defined, but that's not necessarily a problem given 1) the validation in each experiment and 2) the consistency of the results. While it would be nice to have a more complete version of Study 1 to give stronger external validation of the categories, the complexity of the analysis, and the underlying degrees of experimenter freedom about, for example, the cutoffs, mean that the paper may be most improved by simply omitting it and treating the three categories as being defined a priori. Since the stimuli of Studies 2-5 are not in fact based on the body of stimuli used in Study 1, Study 1 doesn't even provide direct evidence for these categories as they apply in the other experiments regardless. That said, Studies 2-5 are excellent tests of the idea that people cluster mechanisms by domain. 
Study 5 in particular provides an elegant control for any concerns related to the individual stimuli. I believe that these studies fit all the criteria for publication in PLoS ONE, and either providing much more information about Study 1 or dropping it altogether will yield a paper that is suitable for publication.

Minor things:
- Figs. 1 and 2 have low image resolution, though I expect this will be fixed during production.
- Line 341: The base rate experiment is Study 3, not Study 4.
- Line 453: "previous chapter" should perhaps be "previous experiment"?

********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Sam Johnson Reviewer #2: Yes: Simon Stephan Reviewer #3: Yes: Jonathan F. Kominsky [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. 
24 Sep 2021 Response letter has been uploaded. Submitted filename: Response to reviewers.docx 17 Jan 2022
PONE-D-21-08341R1
Causal reasoning without mechanism
PLOS ONE Dear Dr. Dündar-Coecke, thank you for submitting the revision of your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that carefully addresses the (technical) points raised by Reviewer 2, the thorough discussion of which you will find below.
Please submit your revised manuscript within six months from this date as afterwards any revision has to be considered a new submission. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Thank you for submitting your research to PLOS ONE. Kind regards and stay safe and healthy in 2022, Sasha Alexander N. 'Sasha' Sokolov, Ph.D. Academic Editor PLOS ONE Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. 
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: All comments have been addressed Reviewer #2: (No Response) Reviewer #3: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Partly Reviewer #3: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: No ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? 
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: I am pleased to report that the authors have done a good job of addressing my comments. The issue of “domains” in particular was much clearer in the revision – although this is a fairly small change in terms of prose, I think it makes the theoretical picture much clearer. Overall, I believe this article represents an advance in our understanding of causal inference and I am happy to support publication. Thanks for the opportunity to be one of your first readers! Sam Johnson Reviewer #2: Review of “Causal reasoning without mechanism” #Revision 1 by Dündar-Coecke, Goldin, and Sloman I’d like to apologize for the delayed submission of my review. I was Reviewer #2 in the first review round and now got the chance to read the revised manuscript. My overall impression hasn’t changed. I think that this is very interesting work with the potential to make a relevant contribution to the literature on causal cognition. I also think that the authors did an overall great job (though see below) in addressing the concerns and suggestions provided by the three reviewers. My two main former concerns were: (1) “However, I think that the writing requires some polishing; my main point here is that I think that the paragraphs are often too loosely connected.” I think that the authors have addressed this concern very well. 
(2) “I also think that DGS need to address the causal strength argument, although I unfortunately can’t suggest a specific experiment right now that would address this issue straightforwardly.” By contrast, I think that this second of my concerns has still not been resolved satisfactorily. My feeling is that many of the thoughts the authors have provided in their reply (in the response letter) to this concern would be worth including in the discussion of the results, or even the general discussion. The authors in their reply wrote “The causal strength of each cause for specific effects is higher when the causes match the domain of the effect. This is a possibility in principle. Our claim is that the reason for greater causal strength is because the cause and effect match domain. Perhaps each of our items suffers from a different confound, but the one invariant across items is that causal strength is higher when domains match.” If domain-matching and causal strength were to be prima facie confounded, shouldn’t this interesting observation be presented and discussed? The authors continued “We take this as pretty good evidence that the reason that one cause has greater causal strength for one effect and not the other is because it shares a domain with the former.” I tend to agree with this point, but doesn’t it ultimately mean that there is a confounding explanation here? Do reasoners primarily “think” of domain matching or are they relying on knowledge about strength? What is the primary psychological factor here that drives their decisions? I think this point should be discussed (if not tested, but this could maybe be left open for follow-up studies). However, I also feel inclined to mention that the two other reviewers did not seem to share this concern, so maybe I’m being too strict here. 
A remark on the sample size rationale: the authors now write “Using the G-Power software, minimum power to detect effect sizes in ANOVAs was .80 – a generally accepted level of power.” What is this supposed to mean? A given (or in this case aspired) level of test power for a significance test is relative to an assumed (standardized) effect size. So what readers need to know here is for which effect size estimate (or assumed effect size) the authors aimed to achieve 80 percent test power. Also, why talk about ANOVAs here? Unless I’m missing something here, the tests testing the domain matching theory aren’t ANOVAs. The authors test proportions here. There’s an ANOVA to test age effects of domain matching, but this question was rather a side than a main issue. My suggestion is that the authors either revise this sentence or, in case they actually didn’t base their sample sizes on effect size assumptions, leave it out completely.

Minor things:
- Line 46: “[…] knowledge that in fact we do.” → “than”
- Line 128: the paragraph starts with “Schultz’s study also shows […]” → unless I missed it, this is the first mention of that study, so it is weird to start a paragraph about it as if the readers already know what Schultz et al. did.
- Line 543: “[…] cause even if when those […]” → either “if” or “when”.

Signed (24 Nov 2021), Simon Stephan

Reviewer #3: The figures and added information for Experiment 1 are extremely helpful and justify its inclusion in the paper. The full data for Experiment 1 are still not accessible, at least to reviewers, but there is enough information now to make sense of the results. The other changes are all good as well. I am happy to recommend this paper for publication in PLoS ONE. ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. 
Reviewer #1: Yes: Sam Johnson Reviewer #2: Yes: Simon Stephan Reviewer #3: No
29 Jan 2022 The response letter has been uploaded as a separate file. Submitted filename: Response to reviewers2.docx 10 Feb 2022
PONE-D-21-08341R2
Causal reasoning without mechanism
PLOS ONE Dear Dr. Dündar-Coecke, Thank you for submitting your revised manuscript to PLOS ONE. After careful consideration by several experts in the field and myself, we feel it represents a nice piece of research, has been substantially improved, and can be recommended for publication conditional upon fulfilling PLOS ONE’s publication criteria as to the statistical reports, and considering the minor suggestions of Reviewer 2 below. 
 
Therefore, we invite you to submit a revised version of the manuscript that addresses the following: Please specify in the manuscript whether or not (1) tests for normality of data sets have been run throughout (if yes, please identify which) and the data processing accomplished accordingly, and (2) corrections for multiple comparisons have been employed as necessary. Should any concerns arise, please explain why these procedures have been deemed not applicable. Finally, please consider the minor points of Reviewer 2 as listed below.
We look forward to receiving your revised manuscript. Thank you for considering PLOS ONE for reporting your research. Kind regards, Sasha Alexander N. 'Sasha' Sokolov, Ph.D. Academic Editor PLOS ONE Journal Requirements: Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. 
If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Reviewer #2: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? Reviewer #2: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #2: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? Reviewer #2: Yes ********** 5. 
Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #2: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #2: Review of Revision #2 of “Cauasal reasoning without mechanism” by Dündar-Coecke, Goldin, and Sloman I was Reviewer #2 and now got the chance to read the authors’ second revision of this interesting paper. The concern that I raised in my first and also in my second review was: “(2) I also think that DGS need to address the causal strength argument, although I unfortunately can’t suggest a specific experiment right now that would address this issue straightforwardly.” The authors now added a longer paragraph in which they discuss this point in greater detail and I think that this enough. I don’t have any more substantial concerns. This is an interesting paper that many will like to read. Minor comments: Line 636: The Chi-Sq. test statistic is written as “X2” It was quite a pain that all the figures were placed at the end of the manuscript and not in the text. Having to jump back and forth while reading was not a very pleasant experience. Thanks for giving me the opportunity to read this interesting work as one of the first. Signed (February, 09, 2022), Simon Stephan ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. 
If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #2: Yes: Simon Stephan [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
17 Feb 2022 Minor revisions were explained above. Submitted filename: Rebuttal letter 3.docx Click here for additional data file. 26 Apr 2022 Causal reasoning without mechanism PONE-D-21-08341R3 Dear Dr. Dündar-Coecke, thank you for the revision of your above manuscript.  We’re happy to inform you that your revised manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Thank you for choosing PLOS ONE for reporting your research. Kind regards, Sasha Alexander N. 'sasha' Sokolov, Ph.D. Academic Editor PLOS ONE Additional Editor Comments (optional): Reviewers' comments: 28 Apr 2022 PONE-D-21-08341R3 Causal reasoning without mechanism Dear Dr. Dündar-Coecke: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! 
Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Alexander N. Sokolov Academic Editor PLOS ONE
References: 23 in total (first 10 listed)

1.  The scope of teleological thinking in preschool children.

Authors:  D Kelemen
Journal:  Cognition       Date:  1999-04-01

2.  Functional explanation and the function of explanation.

Authors:  Tania Lombrozo; Susan Carey
Journal:  Cognition       Date:  2005-06-06

3.  Time as a guide to cause.

Authors:  David A Lagnado; Steven A Sloman
Journal:  J Exp Psychol Learn Mem Cogn       Date:  2006-05       Impact factor: 3.051

4.  Do We "do"?

Authors:  Steven A Sloman; David A Lagnado
Journal:  Cogn Sci       Date:  2005-01-02

5.  Young children's psychological, physical, and biological explanations.

Authors:  H M Wellman; A K Hickling; C A Schult
Journal:  New Dir Child Dev       Date:  1997

6.  Causality in thought. (Review)

Authors:  Steven A Sloman; David Lagnado
Journal:  Annu Rev Psychol       Date:  2014-07-21       Impact factor: 24.137

7.  Political extremism is supported by an illusion of understanding.

Authors:  Philip M Fernbach; Todd Rogers; Craig R Fox; Steven A Sloman
Journal:  Psychol Sci       Date:  2013-04-25

8.  Theory-based causal induction.

Authors:  Thomas L Griffiths; Joshua B Tenenbaum
Journal:  Psychol Rev       Date:  2009-10       Impact factor: 8.934

9.  The science of cycology: failures to understand how everyday objects work.

Authors:  Rebecca Lawson
Journal:  Mem Cognit       Date:  2006-12

10.  Preschoolers' understanding of simple object transformations.

Authors:  R Gelman; M Bullock; E Meck
Journal:  Child Dev       Date:  1980-09
