
Modeling compliance specifications in linear temporal logic, event processing language and property specification patterns: a controlled experiment on understandability.

Christoph Czepa¹, Amirali Amiri¹, Evangelos Ntentos¹, Uwe Zdun¹.

Abstract

Mature verification and monitoring approaches, such as complex event processing and model checking, can be applied for checking compliance specifications at design time and runtime. Little is known about the understandability of the different formal and technical languages associated with these approaches. This uncertainty regarding understandability might be a major obstacle for the broad practical adoption of those techniques. This article reports a controlled experiment with 215 participants on the understandability of modeling compliance specifications in representative modeling languages, namely linear temporal logic (LTL), the complex event processing-based event processing language (EPL) and property specification patterns (PSP). The formalizations in PSP were overall more correct. That is, the pattern-based approach provides a higher level of understandability than EPL and LTL. More advanced users, however, seemingly are able to cope equally well with PSP and EPL in modeling compliance specifications.


Keywords: Complex event processing; Controlled experiment; Event processing language; Linear temporal logic; Property specification patterns; Understandability

Year: 2019 PMID: 31975976 PMCID: PMC6944266 DOI: 10.1007/s10270-019-00721-4

Source DB: PubMed Journal: Softw Syst Model ISSN: 1619-1366 Impact factor: 1.910

Introduction

Many domains are subject to a vast and ever-growing number of rules and constraints stemming from sources including laws, legislation, regulations, standards, guidelines, contracts and best practices. One example is compliance in the corporate and financial sector. The Sarbanes–Oxley Act of 2002 (SOX) [55] is a federal law that defines rules in reaction to major corporate accounting scandals in the USA (e.g., Enron and WorldCom). Basel III [4] has been established in response to weaknesses in financial regulation responsible for the financial crisis in 2007/2008. Another example of heavily regulated domains is the construction industry. Compliance rules in this domain are often related to occupational safety and health. For example, certain precautions and safe practices are required if a lead contamination is present or to be presumed in buildings built before 1978 that undergo renovation (cf. United States Environmental Protection Agency’s Lead-Based Paint Renovation, Repair and Painting Rule [83]). A third example is the healthcare sector. Processes in hospitals must comply with state-of-the-art medical knowledge and treatment procedures (e.g., Rovani et al. [71]). From cooperations with industry partners (e.g., Tran et al. [80]), their customers and other company representatives at conferences and workshops, we were able to gain valuable insights into the current situation on how compliance rules are handled in practice. Most often, compliance documents are transformed to internal policies first. They are often described in natural language, but there is also a shift toward structured approaches like the Semantics of Business Vocabulary and Business Rules (SBVR) standard [60]. Later these internal policies become considered in business process models (e.g., BPMN [59]) or other behavioral models (e.g., UML activity diagrams), and/or they become hard-coded in a programming language. 
This often leads to consistency problems and to poor maintainability and traceability between compliance specifications, internal policies, models and the source code. This is especially the case when compliance specifications change frequently. Additionally, practitioners report that it often takes a long time until new compliance specifications are actually supported by their software. Often the compliance rule has long been obsolete before the implementation is ready (cf. [20, 48]). Consequently, the industry shows a strong interest in approaches that are applicable in practice. Such approaches should support a comprehensible, fast and accurate adoption of compliance specifications as well as their automated enactment and verification. All modeling languages that we study in this article are well suited for automated computer-aided compliance checking or monitoring. Nonetheless, companies are still often reluctant to expose their customers or employees to such approaches. In discussions with industry partners (cf. [79, 81]), uncertainty regarding how understandable these approaches are became evident. This uncertainty was stated as one of the major reasons for the reluctance in practical adoption.

Problem statement

Most existing work on design time verification and runtime monitoring focuses on technical contributions rather than empirical contributions. From the perspective of a potential end user who has to implement compliance specifications, the understandability of an offered formal specification language appears to be of major interest. To the best of our knowledge, there are no empirical studies that investigate and compare the understandability of representative languages with respect to the formal modeling of compliance specifications. In particular, the following representative specification languages are considered in this empirical study:

Linear temporal logic (LTL) was proposed in 1977 by Pnueli [65]. LTL is a popular way for defining compliance rules according to Reichert and Weber [66]. In general, LTL is a widely used specification language commonly applied in model checking (cf. Cimatti et al. [12] for NuSMV, Blom et al. [9] for LTSmin, Holzmann [42] for SPIN) and runtime monitoring by non-deterministic finite automata (cf. De Giacomo and Vardi [23] and De Giacomo et al. [25]).

Event processing language (EPL) is the query language of the open-source complex event processing engine Esper. EPL is well suited as a representative for CEP query languages as it supports common CEP query language concepts, such as leads-to (sequence, followed-by) and every (each) operators, that are present in many CEP query languages and engines (e.g., Siddhi and TESLA [15]). Several existing studies on compliance monitoring make use of EPL (cf. Awad et al. [2], Holmes et al. [41] and Tran et al. [82]).

Property specification patterns (PSP) are a collection of recurring temporal patterns proposed by Dwyer et al. [27, 28]. This pattern-based approach abstracts underlying technical and formal languages, most notably LTL and CTL (Computation Tree Logic; cf. Clarke et al. [13]). Numerous existing approaches are based on PSP. Among them are the Compliance Request Language proposed by Elgammal et al. [29] and the declarative business process approach Declare proposed by Pešić et al. [61].

In previous controlled experiments carried out by Czepa and Zdun [17], the understandability of already existing formal specifications in those languages was studied. Those experiments can be seen as the first step toward studying the understandability of those languages. To further study the understandability of these languages, it is crucial to consider the modeling itself as well.

Research objectives

This empirical study has the research objective to investigate the understandability construct of representative languages with regard to the modeling of compliance specifications. The understandability construct focuses on the degree of correctness achieved and on the time spent on modeling compliance specifications. The experimental goal using the goal template of the Goal Question Metric proposed by Basili et al. [5] is stated as follows: Analyze LTL, PSP and EPL for the purpose of their evaluation with respect to their understandability related to modeling compliance specifications from the viewpoint of the novice and moderately advanced software engineer, designer or developer in the context/environment of the Software Engineering 2 Lab and the Advanced Software Engineering Lab courses at the Faculty of Computer Science of the University of Vienna. Based upon the stated goal, questions concerning understandability were generated as shown in Table 1.

Table 1

Questions based upon the goal

Identifier	Question
Q1	How understandable are the tested approaches for participants at the bachelor level (attending the Software Engineering 2 Lab course)?
Q2	Are there differences in understandability between the tested approaches for participants at the bachelor level (attending the Software Engineering 2 Lab course)?
Q3	How understandable are the tested approaches for participants at the master level (attending the Advanced Software Engineering Lab course)?
Q4	Are there differences in understandability between the tested approaches for participants at the master level (attending the Advanced Software Engineering Lab course)?
Q5	How understandable are the tested approaches for participants with industrial working experience?
Q6	Are there differences in understandability between the tested approaches for participants with industrial working experience?

Understandability is measured by three dependent variables, namely the syntactic correctness and semantic correctness achieved in trying to formally model compliance specifications as well as the response time. Correctness and response time are commonly used to measure the construct understandability, for example, in empirical studies by Feigenspan et al. [31] and Hoisl et al. [40]. The study design enables a more fine-grained analysis of the correctness by differentiating between syntactic and semantic correctness as suggested by numerous existing studies, such as Ferri et al. [32], Hindawi et al. [39] and Harel and Rumpe [37]. Besides the main research goal, which focuses on understandability, this work addresses subjective aspects, namely the perceived ease of application and the perceived correctness, which are measures of self-assessment and not directly related to the understandability construct.

Guidelines

This work follows the guidelines for reporting experiments in empirical software engineering by Jedlitschka et al. [45]. These guidelines integrate among others the “Preliminary guidelines for empirical research in software engineering” by Kitchenham et al. [50] and standard books on empirical software engineering by Wohlin et al. [86] and Juristo and Moreno [47]. The “Robust Statistical Methods for Empirical Software Engineering” article by Kitchenham et al. [49] had a strong impact on the statistical evaluation of the data in this article.

Background

This section provides a brief introduction to the specification languages used in this study. Readers already familiar with one or more of the discussed approaches may consider skipping parts of this section. Examples of compliance specifications formalized in all three representations are available in “Appendix A.” These examples are based on the experimental tasks (cf. Sect. 3.3) of this experiment.

Linear Temporal Logic (LTL)

Propositional logic is not expressive enough to describe temporal properties, so a logic called linear temporal logic (LTL) for reasoning over linear traces with the temporal operators $\mathcal{G}$ (or $\square$) for “globally” and $\mathcal{F}$ (or $\lozenge$) for “finally” was proposed by Pnueli [65]. Additional temporal operators are $\mathcal{U}$ for “until,” $\mathcal{W}$ for “weak until,” $\mathcal{R}$ for “release” and $\mathcal{X}$ (or $\circ$) for “next.” The meaning of these operators is described in Table 2. LTL formulas are composed of the aforementioned temporal operators, atomic propositions (the set AP) and the Boolean operators $\wedge$ for “and,” $\vee$ for “or,” $\neg$ for “not” and $\rightarrow$ for “implies” (cf. Baier and Katoen [3]). The weak-until operator is defined as $\psi~\mathcal{W}~\phi \equiv (\psi~\mathcal{U}~\phi) \vee \mathcal{G}\psi$.

Table 2

Informal meanings of LTL operators

Text notation	Symbol notation	Meaning
$\mathcal{G}\psi$	$\square\psi$	$\psi$ must be true at every point in time
$\mathcal{F}\psi$	$\lozenge\psi$	$\psi$ must be true at some future point in time
$\psi~\mathcal{U}~\phi$	-	$\psi$ must remain true at least until the point in time when $\phi$ becomes true
$\psi~\mathcal{R}~\phi$	-	$\phi$ must remain true at least until and including the point in time when $\psi$ becomes true
$\mathcal{X}\psi$	$\circ\psi$	$\psi$ must be true at the next point in time

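An LTL formula is inductively defined as follows: For every $a \in AP$, $a$ is an LTL formula. If $\psi$ and $\phi$ are LTL formulas, then so are $\mathcal{G}\psi$ (or $\square\psi$), $\mathcal{F}\psi$ (or $\lozenge\psi$), $\psi~\mathcal{U}~\phi$, $\psi~\mathcal{R}~\phi$, $\mathcal{X}\psi$ (or $\circ\psi$), $\neg\psi$, $\psi \wedge \phi$ and $\psi \vee \phi$. The semantics of LTL over infinite traces is defined as follows: LTL formulas are interpreted as infinite words over the alphabet $2^{AP}$. The alphabet is all possible propositional interpretations of the propositional symbols in AP. $\pi(i)$ denotes the state of the trace $\pi$ at time instant $i$. $\pi, i \models \psi$ means that a trace $\pi$ at time instant $i$ satisfies the LTL formula $\psi$, and is defined as follows:

$\pi, i \models a$, for $a \in AP$, iff $a \in \pi(i)$.
$\pi, i \models \neg\psi$ iff $\pi, i \not\models \psi$.
$\pi, i \models \psi \wedge \phi$ iff $\pi, i \models \psi$ and $\pi, i \models \phi$.
$\pi, i \models \psi \vee \phi$ iff $\pi, i \models \psi$ or $\pi, i \models \phi$.
$\pi, i \models \mathcal{X}\psi$ iff $\pi, i+1 \models \psi$.
$\pi, i \models \mathcal{F}\psi$ iff $\exists j \ge i$, such that $\pi, j \models \psi$.
$\pi, i \models \mathcal{G}\psi$ iff $\forall j \ge i$, we have $\pi, j \models \psi$.
$\pi, i \models \psi~\mathcal{U}~\phi$ iff $\exists j \ge i$, such that $\pi, j \models \phi$ and $\forall k$, $i \le k < j$, we have $\pi, k \models \psi$.
$\pi, i \models \psi~\mathcal{R}~\phi$ iff $\forall j \ge i$, if $\pi, j \not\models \phi$, then $\exists k$, $i \le k < j$, such that $\pi, k \models \psi$.

For the definition of the semantics of LTL over finite traces, we refer the interested reader to the work of De Giacomo and Vardi [23] and De Giacomo et al. [25].

In model checking, LTL formulas commonly have two possible truth value states, namely true and false. When monitoring a compliance specification in a running system, it may be of interest not only whether the specification is satisfied or violated but also whether further state changes are possible that could resolve or cause a violation of it. That is, the runtime state of a specification is either temporary or permanent. Consequently, an LTL specification at runtime is either temporarily satisfied, temporarily violated, permanently satisfied or permanently violated (cf. Bauer et al. [6, 7]). Several existing studies make use of the concept of four LTL truth value states (cf. Pešić et al. [62], De Giacomo et al. [24] and Maggi et al. [54]).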
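To make the operator semantics more tangible, the following minimal Python sketch evaluates LTL formulas over a finite, non-empty trace. It is an illustration under stated assumptions rather than the semantics used in the experiment: the text above defines LTL over infinite traces, whereas the sketch assumes a finite-trace reading in which "next" is false at the last time instant. The tuple encoding of formulas and the names `holds` and `weak_until` are hypothetical.

```python
from typing import FrozenSet, Sequence, Tuple

State = FrozenSet[str]   # atomic propositions that are true at one time instant
Formula = Tuple          # nested tuples, e.g. ("U", ("ap", "a"), ("ap", "b"))


def holds(f: Formula, trace: Sequence[State], i: int = 0) -> bool:
    """Return True if the trace satisfies formula f at time instant i.

    Assumption: finite-trace reading, so quantifiers range over i..len(trace)-1.
    """
    op = f[0]
    rest = range(i, len(trace))
    if op == "ap":
        return f[1] in trace[i]
    if op == "not":
        return not holds(f[1], trace, i)
    if op == "and":
        return holds(f[1], trace, i) and holds(f[2], trace, i)
    if op == "or":
        return holds(f[1], trace, i) or holds(f[2], trace, i)
    if op == "X":  # next: false at the last instant (finite-trace assumption)
        return i + 1 < len(trace) and holds(f[1], trace, i + 1)
    if op == "F":  # finally: some j >= i satisfies the operand
        return any(holds(f[1], trace, j) for j in rest)
    if op == "G":  # globally: every j >= i satisfies the operand
        return all(holds(f[1], trace, j) for j in rest)
    if op == "U":  # until: right operand at some j, left operand at all k in [i, j)
        return any(
            holds(f[2], trace, j) and all(holds(f[1], trace, k) for k in range(i, j))
            for j in rest
        )
    if op == "R":  # release: wherever the right operand fails, the left held earlier
        return all(
            holds(f[2], trace, j) or any(holds(f[1], trace, k) for k in range(i, j))
            for j in rest
        )
    raise ValueError(f"unknown operator: {op!r}")


def weak_until(psi: Formula, phi: Formula) -> Formula:
    """Weak until as a derived operator: (psi U phi) or (G psi)."""
    return ("or", ("U", psi, phi), ("G", psi))


a, b, c = ("ap", "a"), ("ap", "b"), ("ap", "c")
trace = [frozenset({"a"}), frozenset({"a"}), frozenset({"b"})]
print(holds(("U", a, b), trace))         # a holds until b becomes true
print(holds(weak_until(a, c), trace))    # c never occurs and a does not hold forever
```

Encoding weak until as a derived operator keeps the evaluator small and mirrors how it is usually defined in terms of "until" and "globally".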

Event Processing Language (EPL)

This section discusses the event processing language (EPL) [30] and how it can be applied for runtime monitoring of compliance specifications. An EPL-based specification consists of an initial truth value, which is either temporarily satisfied or temporarily violated, and one or more query–listener pairs. A query–listener pair causes a truth value change in the specification as soon as a matching event pattern is observed in the event stream. Consequently, an EPL-based compliance specification always consists of EPL queries, which are composed of EPL operators, and listeners that cause truth value changes to temporarily satisfied, temporarily violated, permanently satisfied or permanently violated, as already discussed for LTL in Sect. 2.1. The truth value state of the specification is updated by a positive match of the related expression in the event stream. Based on the notation suggested by Czepa et al. [18, 19], a short notation is used for an EPL query–listener pair responsible for changing the truth value of a compliance rule. Further truth value changes are not possible once a permanent state, namely either permanently violated or permanently satisfied, has been reached. According to the EPL reference [30], the semantics is given as shown in Table 3.

Table 3

Semantics of EPL operators

Operator name	Representation	Semantics
and	$e_1$ and $e_2$	Logical conjunction that is matched once both $e_1$ and $e_2$ in any order have occurred
or	$e_1$ or $e_2$	Logical disjunction that is matched once either $e_1$ or $e_2$ has occurred
not	not e	Logical negation that is matched if the expression e is not matched
every	every e	Not just observe the first occurrence of the expression e in the event stream but also each subsequent one
leads-to	$e_1$ -> $e_2$	First, $e_1$ must be observed, and only then is $e_2$ matched. Intuitively, the whole expression is matched once $e_1$ is followed by $e_2$ at the occurrence of $e_2$
until	$e_1$ until $e_2$	Matches the expression $e_1$ until $e_2$ occurs. In practice, this operator is commonly used in the expression not $e_1$ until $e_2$ that demands the absence of $e_1$ before the occurrence of $e_2$
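The interplay of an initial truth value, query–listener pairs and permanent states can be sketched in plain Python. This is not Esper or EPL code: the classes `Single`, `LeadsTo` and `Specification` are hypothetical stand-ins that mimic a single-event query and a non-repeating leads-to query, and the example rule (an "a" followed by a "b" satisfies the specification, a "cancel" violates it) is invented for illustration.

```python
from enum import Enum
from typing import List, Tuple


class TruthValue(Enum):
    TEMP_SAT = "temporarily satisfied"
    TEMP_VIOL = "temporarily violated"
    PERM_SAT = "permanently satisfied"
    PERM_VIOL = "permanently violated"


PERMANENT = {TruthValue.PERM_SAT, TruthValue.PERM_VIOL}


class Single:
    """Hypothetical query that matches the first occurrence of one event."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.done = False

    def feed(self, event: str) -> bool:
        if self.done or event != self.name:
            return False
        self.done = True
        return True


class LeadsTo:
    """Hypothetical query for `first -> second` without `every`: matches once,
    at the occurrence of `second`, provided `first` was observed earlier."""

    def __init__(self, first: str, second: str) -> None:
        self.first, self.second = first, second
        self.seen_first = False
        self.done = False

    def feed(self, event: str) -> bool:
        if self.done:
            return False
        if not self.seen_first:
            self.seen_first = event == self.first
            return False
        if event == self.second:
            self.done = True
            return True
        return False


class Specification:
    """An initial truth value plus query-listener pairs (query, new truth value)."""

    def __init__(self, initial: TruthValue, pairs: List[Tuple[object, TruthValue]]) -> None:
        self.state = initial
        self.pairs = pairs

    def observe(self, event: str) -> TruthValue:
        for query, new_state in self.pairs:
            matched = query.feed(event)
            # The listener changes the truth value only while no permanent state is reached.
            if matched and self.state not in PERMANENT:
                self.state = new_state
        return self.state


spec = Specification(
    TruthValue.TEMP_VIOL,
    [(LeadsTo("a", "b"), TruthValue.PERM_SAT), (Single("cancel"), TruthValue.PERM_VIOL)],
)
for event in ["c", "a", "c", "b", "cancel"]:
    print(event, "->", spec.observe(event).value)
```

In this run the specification stays temporarily violated until "b" follows "a", and the later "cancel" event no longer has any effect because a permanent state has already been reached.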


Property specification patterns (PSP)

Dwyer et al. proposed the property specification patterns (PSP) [27, 28], a collection of recurring specification patterns. For each pattern, there exist transformation rules to underlying formal representations, including LTL and CTL. The patterns are categorized into Occurrence Patterns and Order Patterns as shown in Tables 4 and 5, respectively. Figure 1 shows the area of effect of available scopes, whereas Table 6 discusses their meaning.

Table 4

Intents of occurrence patterns

Pattern name	Representation	Intent
Absence	a never occurs	To describe a portion of a system’s execution that is free of certain events or states
Universality	a always occurs	To describe a portion of a system’s execution which contains only states that have a desired property
Existence	a occurs	To describe a portion of a system’s execution that contains an instance of certain events or states
Bounded existence	a occurs at most n times	To describe a portion of a system’s execution that contains at most a specified number of instances of a designated state transition or event

Table 5

Intents of order patterns

Pattern name	Representation	Intent
Precedence	a precedes b	To describe a relationship between a pair of events/states where the occurrence of the first is a necessary precondition for an occurrence of the second
Response	a leads-to b	To describe a cause–effect relationship between a pair of events/states. An occurrence of the first, the cause, must be followed by an occurrence of the second, the effect
2 Cause–1 Effect Precedence Chain	(a, b) precedes c	To describe a relationship between an event/state sequence (a, b) and an event/state c in which the occurrence of c within the scope must be preceded by a sequence of events/states (a, b) within the same scope
1 Cause–2 Effect Precedence Chain	a precedes (b, c)	To describe a relationship between an event/state a and a sequence of events/states (b, c) in which the occurrence of b followed by c within the scope must be preceded by an occurrence of a within the same scope
2 Stimulus–1 Response Chain	(a, b) leads-to c	To describe a relationship between a stimulus sequence (a, b) and a response event c in which the occurrence of the stimulus events must be followed by an occurrence of the response event within the scope
1 Stimulus–2 Response Chain	a leads-to (b, c)	To describe a relationship between a stimulus event a and a sequence of two response events (b, c) in which the occurrence of the stimulus event must be followed by an occurrence of the sequence of response events within the scope

Fig. 1

Available scopes for property specification patterns (shaded areas indicate the extent over which the pattern must hold)

Table 6

Meaning of scopes

Scope name	Representation	Meaning
global	This scope is implicitly assumed when no other scope is defined	Defines that a pattern must hold during the entire execution of a system
before	before s [ p ]	p must hold before the first occurrence of s
after	after s [ p ]	p must hold after the first occurrence of s
between	between $s_1$ and $s_2$ [ p ]	p must hold between every $s_1$ (i.e., starting the scope) that is followed by $s_2$ (i.e., closing the scope)
after-until	after $s_1$ until $s_2$ [ p ]	p must hold after every $s_1$ (i.e., starting the scope) by no later than $s_2$ (i.e., closing the scope)

The available runtime states of PSP specifications are no different from those of LTL and EPL specifications (cf. Sects. 2.1 and 2.2), namely temporarily satisfied, temporarily violated, permanently satisfied and permanently violated.
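The idea that each pattern can be transformed into an underlying formal representation can be sketched for the global scope. The LTL templates below follow the mappings commonly given for the property specification patterns in the global scope; the textual operator syntax (`G`, `F`, `W`, `!`, `->`), the dictionary `GLOBAL_SCOPE_LTL` and the function `to_ltl` are assumptions made for this illustration, and bounded existence, chain patterns and the other scopes are deliberately left out.

```python
# Global-scope LTL templates for a subset of the patterns in Tables 4 and 5.
# Placeholders {a} and {b} correspond to the a and b used in the tables.
GLOBAL_SCOPE_LTL = {
    "absence": "G (!{a})",                 # a never occurs
    "universality": "G ({a})",             # a always occurs
    "existence": "F ({a})",                # a occurs
    "precedence": "(!{b}) W ({a})",        # a precedes b
    "response": "G (({a}) -> F ({b}))",    # a leads-to b
}


def to_ltl(pattern: str, **props: str) -> str:
    """Instantiate the global-scope LTL template of a pattern (hypothetical helper)."""
    return GLOBAL_SCOPE_LTL[pattern].format(**props)


print(to_ltl("precedence", a="activity1", b="activity2"))
print(to_ltl("response", a="request", b="acknowledge"))
```

Keeping the templates in a lookup table reflects the pattern-based approach itself: the user chooses a named pattern, and the translation to the underlying logic stays hidden.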

Experiment planning

This section describes the outcome of the experiment planning phase, and it provides all information that is required for a replication of the study.

Goals

The primary goal of the experiment is measuring the construct understandability of representative languages that are suitable for modeling compliance specifications. This construct is defined by the syntactic correctness, semantic correctness and response time of the answers given by the participants. This study differentiates between syntactic and semantic correctness as it enables a more fine-grained analysis. This is in line with Chomsky [11], who stressed that the study of syntax must be independent from the study of semantics. Numerous existing studies differentiate between syntactic and semantic correctness (cf. Ferri et al. [32], Hindawi et al. [39] and Harel and Rumpe [37]). In particular, an LTL formula can be entirely correct syntactically without capturing the desired meaning. For example, the specification “activity 2 must not happen unless activity 1 has already happened” is not covered at all in a semantic way by the syntactically correct formula “.” In contrast, the formula “” is both syntactically and semantically correct. In addition to the understandability construct, the experiment aims at studying the perceived ease of application of the languages and the perceived correctness of the formalized compliance specifications.
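The difference between syntactic and semantic correctness can be illustrated by comparing candidate formalizations against a reference reading of the specification “activity 2 must not happen unless activity 1 has already happened.” The sketch below is hypothetical throughout: the two candidates (an "eventually activity 1" reading and a "no activity 2 weak-until activity 1" reading) are chosen for illustration and are not claimed to be the formulas used in the article, traces are finite sequences with one event per time instant, and the comparison is an exhaustive check over short traces rather than a proof.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

EVENTS = ("a1", "a2", "other")  # activity 1, activity 2, any other event
Trace = Sequence[str]


def reference(trace: Trace) -> bool:
    """Reference reading: a2 must not happen unless a1 has already happened."""
    seen_a1 = False
    for event in trace:
        if event == "a1":
            seen_a1 = True
        elif event == "a2" and not seen_a1:
            return False
    return True


def candidate_eventually_a1(trace: Trace) -> bool:
    """Hypothetical candidate 'F a1': well-formed, but it ignores a2 entirely."""
    return "a1" in trace


def candidate_weak_until(trace: Trace) -> bool:
    """Hypothetical candidate '(not a2) W a1' on a finite trace."""
    for event in trace:
        if event == "a1":
            return True
        if event == "a2":
            return False
    return True


def counterexamples(candidate: Callable[[Trace], bool], max_len: int = 3) -> List[Tuple[str, ...]]:
    """All traces up to max_len on which the candidate disagrees with the reference."""
    return [
        trace
        for n in range(max_len + 1)
        for trace in product(EVENTS, repeat=n)
        if candidate(trace) != reference(trace)
    ]


print(len(counterexamples(candidate_eventually_a1)))  # disagreements exist
print(counterexamples(candidate_weak_until))          # no disagreement up to length 3
```

Both candidates would pass a syntax check, yet only the second agrees with the reference reading on every enumerated trace, which is the distinction the semantic correctness variable is meant to capture.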

Experimental units

All 215 participants of the experiment are students who enrolled in the courses “Software Engineering Lab (SE2)” and “Advanced Software Engineering Lab (ASE)” at the Faculty of Computer Science, University of Vienna, Austria. Two kinds of participants can be differentiated:

149 participants of the bachelor-level course SE2 are used as proxies for novice software engineers, designers or developers.

66 participants of the master-level course ASE are used as proxies for moderately advanced software engineers, designers or developers.

Using students as proxies for non-expert users is not an issue according to Kitchenham et al. [50]. Other studies even suggest that students can be used as proxies for experts under certain circumstances (cf. Höst et al. [43], Runeson [72], Svahnberg et al. [78] and Salman et al. [73]). As an incentive for participation and proper preparation, up to 10 bonus points ( of total course points) were awarded based on the participant’s performance in the experiment. All participants were randomly allocated to experiment groups.

Experimental material and tasks

In total, the experiment comprised five distinct tasks stemming from three different domains, as shown in Table 7. Tasks 1 and 2 are related to compliance in the context of lending, Task 3 focuses on compliance regarding hospital processes, and Tasks 4 and 5 are based on compliance specifications in the construction industry. Each task was presented to the participants by stating first the context, then the specification and last the available elements that are to be used during formal modeling of the specification. For an example of how the experimental tasks were presented to the participants, see Fig. 2. The full experimental material is available online (cf. Czepa et al. [22]). For sample solutions of all experimental tasks, see “Appendix A.” It is important to note that these sample solutions show just one way to model the compliance specifications. In the grading process, each proposed solution was carefully assessed, keeping in mind that the sample solution might not be the only way to correctly formalize the specification.

Table 7

Experimental tasks

Task No.	Context/Source	Compliance specification in natural language	Available elements for modeling
1	Request for a loan (cf. Elgammal et al. [29])	The branch office manager has to evaluate the loan risk before signing the contract officially. No one else is allowed to evaluate the loan risk and to sign the contract	Tasks
			Evaluate loan risk
			Officially sign contract
			Roles
			Branch office manager
2	Request for a loan (cf. Elgammal et al. [29])	The checking of the customer bank privilege is followed by checking of the credit worthiness. Both activities must take place before determining the risk level of the loan application	Tasks
			Check customer privilege
			Check credit worthiness
			Evaluate loan risk
3	Medical treatment and surgery of malignant gastric diseases (cf. Rovani et al. [71])	The preoperative screening is performed before any surgical treatment in order to assess whether the patient’s conditions are good enough for the surgery to be performed and to estimate potential risks. As far as the surgical technique is concerned, the gastric resection for malignant diseases can be performed by using either a laparoscopic surgery or a traditional open approach, but not both. Furthermore, in both cases a nursing period is needed to monitor the patient after the operation	Tasks
			Preoperative screening
			Laparoscopic gastrectomy
			Open gastrectomy
			Nursing
4	Renovation work and lead-based paint (cf. United States Environmental Protection Agency [83])	Once a lead contamination has been identified, a certified renovator must be present all time while any cleaning activity is performed until the end of the renovation work	Tasks
			Renovation
			Cleaning
			Presence of certified renovator
			Events
			Lead contamination identified
5	Renovation work and lead-based paint (cf. United States Environmental Protection Agency [83])	Contractors, property managers and others who perform renovations for compensation in residential houses, apartments and child-occupied facilities built before 1978 are required to distribute a lead pamphlet before starting renovation work	Tasks
			Renovation
			Distribute lead pamphlet
			Classify building
			Enter building date
			Data
			Year of construction
			Type of building

Fig. 2

Sample task as presented to the participants


Hypotheses, parameters and variables

PSP abstracts underlying formal representations, such as LTL formulas, by high-level patterns with the intention to facilitate reuse and to enable ease of use. That is, the pattern representations are assumed to provide a better understandability than their underlying LTL formulas. EPL-based constraints are composed of an initial truth value and one or more query–listener pairs that change the truth value state. In contrast to LTL, where meaning is encoded in a formula, different concerns, namely defining the initial truth value and the change criteria for the truth value, are separated from each other in EPL-based constraints. This separation of concerns is assumed to facilitate the understandability of EPL-based constraints as opposed to LTL formulas, where this separation is not present. Consequently, we hypothesized that PSP, as a highly abstract pattern language, is easier to understand than LTL and EPL and that EPL, due to separation of concerns, is easier to understand than LTL. Accordingly, the following pairs of null and alternative hypotheses for the controlled experiment were formulated:

Null hypothesis: There is no difference in terms of understandability between PSP and LTL. Alternative hypothesis: PSP has a higher level of understandability than LTL.

Null hypothesis: There is no difference in terms of understandability between PSP and EPL. Alternative hypothesis: PSP has a higher level of understandability than EPL.

Null hypothesis: There is no difference in terms of understandability between EPL and LTL. Alternative hypothesis: EPL has a higher level of understandability than LTL.

The construct understandability is measured by three interval-scaled dependent variables, namely:

the syntactic correctness achieved in trying to formally model the compliance specifications,

the semantic correctness achieved in trying to formally model the compliance specifications,

the response time, which is the time it took to complete the experimental tasks.

In addition, there are hypotheses that are concerned with the participants’ opinion on the languages under investigation, namely:

Null hypothesis: There is no difference in terms of perceived correctness between PSP and LTL. Alternative hypothesis: PSP has a higher level of perceived correctness than LTL.

Null hypothesis: There is no difference in terms of perceived correctness between PSP and EPL. Alternative hypothesis: PSP has a higher level of perceived correctness than EPL.

Null hypothesis: There is no difference in terms of perceived correctness between EPL and LTL. Alternative hypothesis: EPL has a higher level of perceived correctness than LTL.

Null hypothesis: There is no difference in terms of perceived ease of application between PSP and LTL. Alternative hypothesis: PSP has a higher level of perceived ease of application than LTL.

Null hypothesis: There is no difference in terms of perceived ease of application between PSP and EPL. Alternative hypothesis: PSP has a higher level of perceived ease of application than EPL.

Null hypothesis: There is no difference in terms of perceived ease of application between EPL and LTL. Alternative hypothesis: EPL has a higher level of perceived ease of application than LTL.

The dependent variables associated with these hypotheses are ordinal scaled since the data were gathered by agree–disagree scales. In accordance with the results of a study by Revilla et al. [68], each scale had five categories.

Experiment design and execution

According to Wohlin et al. [86], “it is important to try to use a simple design and try to make the best possible use of the available subjects.” For that reason, a completely randomized experiment design with one alternative per experimental unit was used. That is, each participant was randomly assigned to exactly one experiment group. This assignment was fully automated and unbiased. Preparation documents were distributed to the participants one week before the experiment run. These documents discussed the basics of the approaches, and the participants were encouraged to prepare for the experiment by applying the assigned behavioral constraint representation before the experiment session. To avoid bias, all three preparation documents are similar in length and depth. The approaches were presented to the participants in an approachable manner, as suggested by existing research on teaching undergraduate students in theoretical computer science, formal methods and logic (cf. Habiballa and Kmeť [34], Knobelsdorf and Frede [51], Carew et al. [10] and Spichkova [77]). The training material used is available online (cf. Czepa et al. [22]).
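An automated random assignment of this kind can be sketched in a few lines of Python. The sketch is an assumption about how such a tool might look, not the tool used in the study: the function name, the seed and the round-robin dealing that balances the group sizes are all hypothetical, and the group sizes reported in Table 9 suggest that the actual procedure did not enforce such a balance.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence


def assign_groups(participants: Sequence[str],
                  groups: Sequence[str] = ("LTL", "PSP", "EPL"),
                  seed: int = 0) -> Dict[str, str]:
    """Assign every participant to exactly one group in random order."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    shuffled: List[str] = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled participants round-robin so group sizes differ by at most one.
    return {p: groups[i % len(groups)] for i, p in enumerate(shuffled)}


assignment = assign_groups([f"participant_{i}" for i in range(215)])
print(Counter(assignment.values()))
```

Shuffling first and dealing afterwards keeps the assignment independent of any property of the participants, which is the point of a completely randomized design.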

Procedure

To ensure a smooth procedure and to avoid unnecessary stress, the preparation document informed the participants about the procedure on the experiment day in as much detail as possible. Seating arrangements were made to limit chances for misbehavior, and the participants were instructed how to find a suitable seat. The participants were allowed to use printouts of the preparation material and notes at their own discretion. After a brief discussion of the contents and structure of the experiment document by the experimenters, the participants started working on the experimental tasks. The duration of the experiment was limited to 90 min. For organizational reasons, the experiment was done on paper, and time record keeping was the responsibility of each participant (see Sect. 5.2 for a discussion of this potential threat to validity). After experiment execution, the answers given were evaluated. For that purpose, a method proposed by Lytra et al. [53] was applied, which comprises the independent evaluation of the answers by three experts and a discussion of large differences in grading until a consensus is achieved. The attempted formalization in each experimental task was graded independently by the first, second and third author, who are experts in the investigated languages. To mitigate the risk of grading bias, each expert graded the participants’ answers in random order. Figures 3 and 4 depict the grading process schematically from the individual and overall perspective, respectively. This evaluation of more than a thousand distinct answers comprising approximately 17,000 constraints took about half a year alongside the authors’ normal responsibilities such as teaching and other research. All other answers, which relate to previous knowledge, time records and agree–disagree scale responses, were digitized and subsequently double-checked.
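The consolidation of three independent grades can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say what counts as a large difference or how agreeing grades are combined, so the threshold value, the averaging step and the function name are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple

# Hypothetical threshold (in percentage points) above which graders must discuss.
DISCUSSION_THRESHOLD = 20.0


def consolidate(grades: Sequence[float],
                threshold: float = DISCUSSION_THRESHOLD) -> Tuple[Optional[float], bool]:
    """Return (grade, needs_discussion) for three independent expert grades."""
    if len(grades) != 3:
        raise ValueError("expected exactly three independent grades")
    spread = max(grades) - min(grades)
    if spread > threshold:
        return None, True       # large difference: resolve by discussion first
    return mean(grades), False  # assumption: agreeing grades are averaged


print(consolidate([70.0, 75.0, 80.0]))  # (75.0, False)
print(consolidate([30.0, 75.0, 80.0]))  # (None, True)
```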

Fig. 3

Individual grading procedure

Fig. 4

Overall grading procedure


Analysis

This section is concerned with the treatment and statistics of the data.

Data set preparation

To preserve the integrity of the acquired data, it was necessary to drop potentially unreliable items. In total, the data of eight participants were not considered in the statistical evaluations. Table 8 summarizes all dropped participants including the reasons for non-consideration.

Table 8

Summary of dropped participants

Group	Course	Reason
PSP	SE2	The participant gave up after the first task
PSP	SE2	The participant did not apply PSP, but used a language/formalism that was not part of the study
LTL	SE2	The participant was assigned to LTL, but gave answers in PSP
LTL	SE2	The participant gave positive perceived difficulty and correctness ratings for unsolved tasks
PSP	ASE	The participant did not apply PSP, but wrote basic Boolean formulas
PSP	ASE	The participant came unprepared
PSP	ASE	The participant did not apply PSP, but drew UML activity diagrams
LTL	ASE	The participant gave up after the first task
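As a consistency check, the exclusions listed in Table 8 can be subtracted from the total numbers of observations in Table 9. The short Python sketch below does this; the data structures are transcriptions of the two tables, and the variable names are chosen here for illustration.

```python
from collections import Counter

# Dropped participants as (group, course) pairs, transcribed from Table 8.
dropped = [
    ("PSP", "SE2"), ("PSP", "SE2"), ("LTL", "SE2"), ("LTL", "SE2"),
    ("PSP", "ASE"), ("PSP", "ASE"), ("PSP", "ASE"), ("LTL", "ASE"),
]

# Total observations per (group, course), transcribed from Table 9.
total = {
    ("LTL", "SE2"): 51, ("PSP", "SE2"): 49, ("EPL", "SE2"): 49,
    ("LTL", "ASE"): 22, ("PSP", "ASE"): 20, ("EPL", "ASE"): 24,
}

dropped_count = Counter(dropped)
considered = {key: n - dropped_count[key] for key, n in total.items()}

print(considered)
print(sum(total.values()), sum(considered.values()))  # 215 207
```

The result matches the numbers of considered observations reported in Table 9 (49 : 47 : 49 in SE2 and 21 : 17 : 24 in ASE).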

Descriptive statistics

In this section, the acquired data (cf. Czepa et al. [22]) are analyzed with the help of descriptive statistics. Table 9 shows the number of observations, central tendency and dispersion of the dependent variables syntactic correctness, semantic correctness and response time per group. In the bachelor-level course Software Engineering 2, the sample size is relatively large and evenly distributed (49 : 47 : 49). In the master-level course Advanced Software Engineering, there are less than half as many observations. Unfortunately, the number of participants of the group with the smallest number of observations, namely PSP, was further diminished by the exclusion of three participants (cf. Sect. 4.1). In consequence, the distribution in the ASE course is 21 : 17 : 24.

The median and mean correctness values of the LTL groups in both SE2 and ASE are smaller than those of the other two groups. In SE2, the mean syntactic correctness of the LTL group is 56.52%, thus about 5 percentage points less than in the EPL group (61.82%) and about 12 percentage points less than in the PSP group (68.64%), and the mean semantic correctness of the LTL group is 28.49%, so about 10 percentage points below the EPL group (38.20%) and about 22 percentage points below the PSP group (50.19%). In ASE, the mean syntactic correctness of the LTL group is 57.01%, thus about 8 percentage points less than in the PSP group (65.13%) and about 15 percentage points less than in the EPL group (71.91%). While the PSP group overall achieved a higher syntactic and semantic correctness than the EPL group in SE2, this ranking is reversed in the ASE course, where EPL participants overall achieved a higher syntactic and semantic correctness than their colleagues of the PSP group. The mean syntactic correctness achieved by the PSP group (68.64%) is about 7 percentage points higher than in the EPL group (61.82%) in SE2, whereas the EPL group achieved an about 7 percentage points higher mean syntactic correctness (71.91%) than the PSP group (65.13%) in ASE. In SE2, the mean semantic correctness of the PSP group (50.19%) is about 12 percentage points higher than in the EPL group (38.20%). In ASE, the mean semantic correctness is about 3 percentage points higher in the EPL group (49.71%) than in the PSP group (46.93%).

The mean and median response times are overall shorter in the SE2 course than in the ASE course. In SE2, the mean response time of the LTL group (43.49 min) is slightly shorter than in the EPL group (44.87 min) and a few minutes shorter than in the PSP group (48.68 min). In ASE, the mean response time of the LTL group (52.32 min) is 3–4 min shorter than in the PSP group (55.99 min) and 6–7 min shorter than in the EPL group (58.82 min).
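The gaps between group means discussed above can be recomputed from the arithmetic means in Table 9. The following Python sketch transcribes those means and reports each gap as a difference in percentage points; reading the gaps as absolute rather than relative differences is an assumption, and the helper name `gap` is introduced here for illustration.

```python
# Group means in percent, transcribed from Table 9.
means = {
    ("SE2", "syntactic"): {"LTL": 56.52, "PSP": 68.64, "EPL": 61.82},
    ("SE2", "semantic"):  {"LTL": 28.49, "PSP": 50.19, "EPL": 38.20},
    ("ASE", "syntactic"): {"LTL": 57.01, "PSP": 65.13, "EPL": 71.91},
    ("ASE", "semantic"):  {"LTL": 30.85, "PSP": 46.93, "EPL": 49.71},
}


def gap(course: str, variable: str, higher: str, lower: str) -> float:
    """Difference of two group means in percentage points (not a relative change)."""
    row = means[(course, variable)]
    return round(row[higher] - row[lower], 2)


for (course, variable) in means:
    for group in ("PSP", "EPL"):
        print(course, variable, f"{group} - LTL =", gap(course, variable, group, "LTL"))
```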

Table 9

Number of observations, central tendency and dispersion of the dependent variables semantic/syntactic correctness and response time per group and course

	LTL	PSP	EPL
Software Engineering 2 (SE2) (bachelor-level course)
Total number of observations	51	49	49
Number of considered observations	49	47	49
Syntactic correctness
Arithmetic mean (%)	56.52	68.64	61.82
Standard deviation (SD) (%)	16.40	16.99	16.85
Median (%)	57.84	72.55	61.76
Median absolute deviation (MAD) (%)	19.19	13.37	18.61
Minimum (%)	9.02	24.51	21.18
Maximum (%)	96.27	98.82	89.22
Skew	-0.3	-0.55	-0.53
Kurtosis	0.01	-0.09	-0.4
Semantic correctness
Arithmetic mean (%)	28.49	50.19	38.20
Standard deviation (SD) (%)	14.48	15.74	14.73
Median (%)	27.06	49.61	36.08
Median absolute deviation (MAD) (%)	13.66	15.12	13.66
Minimum (%)	2.75	18.04	10
Maximum (%)	68.43	80.59	72.55
Skew	0.75	-0.08	0.27
Kurtosis	0.24	-0.68	-0.56
Response time
Arithmetic mean (min)	43.49	48.68	44.87
Standard deviation (SD) (min)	13.10	14.39	14.07
Median (min)	40.50	45.67	47.22
Median absolute deviation (MAD) (min)	11.98	17.49	13.66
Minimum (min)	15.07	27.00	14.58
Maximum (min)	75.40	79.93	75.00
Skew	0.33	0.38	0.14
Kurtosis	-0.35	-0.93	-0.41
Advanced Software Engineering (ASE) (master-level course)
Total number of observations	22	20	24
Number of considered observations	21	17	24
Syntactic correctness
Arithmetic mean (%)	57.01	65.13	71.91
Standard deviation (SD) (%)	15.62	21.02	13.78
Median (%)	56.67	67.84	72.06
Median absolute deviation (MAD) (%)	18.90	26.74	10.47
Minimum (%)	29.61	21.76	31.76
Maximum (%)	81.96	89.41	94.71
Skew	-0.15	-0.5	-0.9
Kurtosis	1.22	-1.02	1.05
Semantic correctness
Arithmetic mean (%)	30.85	46.93	49.71
Standard deviation (SD) (%)	12.96	17.14	13.46
Median (%)	29.61	47.84	51.57
Median absolute deviation (MAD) (%)	14.54	19.19	12.06
Minimum (%)	12.75	17.65	19.41
Maximum (%)	63.14	75.69	76.86
Skew	0.6	0.06	-0.37
Kurtosis	-0.41	-1.12	-0.45
Response time
Arithmetic mean (min)	52.32	55.99	58.82
Standard deviation (SD) (min)	15.36	13.64	14.15
Median (min)	49.00	62.00	58.00
Median absolute deviation (MAD) (min)	16.88	11.64	15.64
Minimum (min)	28.00	29.50	37.17
Maximum (min)	84.00	73.08	81.78
Skew	0.42	-0.61	0.15
Kurtosis	-0.94	-1.09	-1.19

Skew is a measure of the asymmetry of a distribution. A positive skew value indicates a right-tailed distribution (e.g., more cases of low correctness than high correctness), a negative skew value indicates a left-tailed distribution (e.g., more cases of high correctness than low correctness), and a skew value close to zero indicates a symmetric distribution. Differences in skew are, for example, present between the semantic correctness distributions of LTL (0.75, indicating that the mass of the distribution is concentrated at lower levels of correctness) and PSP (-0.08, indicating a rather symmetric distribution) in SE2, between the syntactic correctness distributions of LTL (-0.15, indicating a curve that leans slightly to the right) and EPL (-0.9, indicating a distribution with only few measurements in lower correctness ranges) in ASE, between the semantic correctness distributions of LTL (0.6, indicating higher densities in lower correctness ranges) and EPL (-0.37, indicating higher densities in higher correctness ranges) in ASE, and between the response time distributions of LTL (0.42, indicating a left-leaning curve) and PSP (-0.61, indicating a right-leaning curve) in ASE.

Kurtosis is another measure for the shape of a distribution, which focuses on the general tailedness. Positive kurtosis values indicate heavier tails than those of a normal distribution, whereas negative kurtosis values indicate lighter tails. The most severe difference in kurtosis is present between the syntactic correctness distributions of the LTL group (1.22) and the PSP group (-1.02) in ASE.

So far, the dependent variables were analyzed on the basis of separating between course groups, which reflects the participants’ academic level of progression. Next, the dependent variables are investigated focusing on participants with industrial working experience. Table 10 summarizes the descriptive statistics of the dependent variables when focusing on participants with industrial working experience of one year and above. Based on the demographic data collected (cf. “Appendix D”), we consider this subset of participants to be close to the population of industrial practitioners with basic to modest experience. The mean syntactic correctness in the LTL group (58.65%) is about 7 to 8 percentage points lower than in the PSP (66.79%) and EPL (66.01%) groups. The PSP group achieved the highest degree of semantic correctness (48.58%), closely followed by the EPL group (44.46%). The LTL group achieved 30.51% semantic correctness, which is noticeably lower than in the two other groups. The differences in skew and kurtosis indicate differences in central location and distribution shape.
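The shape measures reported in Tables 9 and 10 can be illustrated with a short Python sketch. It uses the plain moment-based definitions of skewness and excess kurtosis, which is an assumption: the statistical software used for the study may apply a sample-size-adjusted estimator, so values computed this way need not match the tables exactly. The function name and the example data are made up.

```python
from statistics import fmean
from typing import Sequence, Tuple


def shape(values: Sequence[float]) -> Tuple[float, float]:
    """Return (skewness, excess kurtosis) computed from central moments."""
    n = len(values)
    mu = fmean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; shape is undefined")
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurtosis


symmetric = [10, 20, 30, 40, 50]
right_tailed = [10, 12, 13, 15, 90]  # one large value stretches the right tail
print(shape(symmetric))
print(shape(right_tailed))
```

The symmetric sample yields a skewness of zero, while the sample with a single large value yields a positive skewness, in line with the reading of positive skew as a right-tailed distribution.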

Table 10

Number of observations, central tendency and dispersion of the dependent variables semantic/syntactic correctness and response time per group of participants with working experience ≥ 1 year

	LTL	PSP	EPL
Number of observations	20	17	22
Syntactic correctness
Arithmetic mean (%)	58.65	66.79	66.01
Standard deviation (SD) (%)	14.68	17.76	14.82
Median (%)	58.82	67.84	70.20
Median absolute deviation (MAD) (%)	16.42	13.08	12.50
Minimum (%)	31.18	21.76	26.67
Maximum (%)	81.96	89.41	89.22
Skew	-0.33	-0.89	-0.87
Kurtosis	-1.03	0.24	0.32
Semantic correctness
Arithmetic mean (%)	30.51	48.58	44.46
Standard deviation (SD) (%)	16.04	16.93	15.20
Median (%)	28.73	49.22	45.78
Median absolute deviation (MAD) (%)	16.86	20.93	18.46
Minimum (%)	8.24	17.65	15.69
Maximum (%)	63.33	75.69	72.55
Skew	0.55	0.2	-0.1
Kurtosis	-0.72	-1.27	-1.07
Response time
Arithmetic mean (min)	49.31	49.19	48.64
Standard deviation (SD) (min)	16.81	13.34	14.03
Median (min)	47.94	48.85	48.13
Median absolute deviation (MAD) (min)	15.80	20.36	15.52
Minimum (min)	15.07	29.50	24.07
Maximum (min)	84.00	66.00	76.08
Skew	0.29	0.21	0.22
Kurtosis	-0.42	-1.56	-0.87

For additional descriptive statistics of the dependent variables syntactic correctness, semantic correctness and response time, we refer the interested reader to “Appendix B.” With regard to the stacked bar chart (cf. Bryer and Speerschneider [44]) in Fig. 5a showing the perceived correctness in SE2, the share of strongly agree responses to the statement “I think that my transformation of the requirement to the constraint language is correct” is higher in PSP than in the other two groups, and the share of (strongly) disagree answers is lower in PSP than in LTL and EPL. The share of neutral answers is also largest in PSP. In ASE (cf. Fig. 5b), the participants appear to be overall slightly more confident regarding the correctness of their formalizations. The largest share of (strongly) agree responses is again present in the PSP group, followed by LTL and EPL. According to the stacked bar charts in Fig. 5, the perceived correctness of PSP appears to be slightly higher than in the other experiment groups in SE2, while EPL has a slightly lower perceived correctness than the other languages in ASE. According to Fig. 5c, a large share of participants with industry experience in the PSP group is undecided whether the given answer is correct. The percentage of neutral answers of participants with industry experience is lowest in the EPL group and only slightly higher in the LTL group. The largest share of (strongly) agree responses of participants with industry experience is present in the EPL group, followed by LTL and PSP.

Fig. 5

Participants’ perceived correctness

Figure 6 contains stacked bar charts of the participants’ perceived ease of application of the tested languages. Interestingly, there appears to be a strong similarity between the perceived correctness and perceived ease of application responses in SE2 regarding the ranking of the approaches (cf. Figs. 5a, 6a). Judging by the shares of (strongly) agreeing and (strongly) disagreeing responses, PSP appears to be slightly easier to apply than EPL, and LTL is perceived as slightly more difficult to apply than EPL. In ASE (cf. Fig. 6b), the application of PSP is perceived to be even easier than in SE2. Interestingly, EPL is perceived to be similarly easy to apply as PSP. As in SE2, LTL is ranked last in perceived ease of application. Figure 6c focuses on industry participants and reveals striking differences between the groups. The perceived ease of application is rated highest in the EPL group, although the responses there are still shifted toward a negative rating. The strongest shift toward low ease of application is present in the LTL group, where only a small share of participants (strongly) agrees. The results of the PSP group lie in between.

Fig. 6

Participants’ perceived ease of application

Statistical inference

Before applying any statistical test, its model assumptions must be checked and met. For a discussion of whether the acquired data violate the normality assumption, see “Appendix C.” Since there is uncertainty regarding normality, a core assumption of parametric testing, nonparametric testing is the preferable approach. Standard nonparametric tests like Kruskal–Wallis cannot be applied if distribution shapes differ apart from their central location (cf. descriptive statistics in “Appendix B”), so Cliff’s delta (cf. Cliff [14] and Rogmann [70]), a robust nonparametric test, is applied. Table 11 summarizes the test results for the bachelor-level course SE2. To take multiple testing into account, the p-values are adjusted based on the method proposed by Benjamini and Hochberg [8]. There is a highly significant result with a medium effect size magnitude, indicating that PSP provides a higher syntactic correctness than LTL. After p-value adjustments, no such result is present in the remaining syntactic correctness tests. All semantic correctness test results are highly significant with medium- to large-sized effects. There is no significant difference between the response times. Consequently, the null hypothesis for the PSP/LTL comparison is rejected on the basis of syntactic and semantic correctness, whereas the null hypotheses for the PSP/EPL and EPL/LTL comparisons can only be rejected based on semantic correctness.
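
The dominance probabilities and Cliff’s d reported in the tables can be illustrated with a minimal Python sketch. The function name `dominance` and the sample scores are hypothetical and are not taken from the experiment. The sign convention d = P(X<Y) − P(X>Y) is an assumption read off the table rows (e.g., 0.2904 − 0.7059 = −0.4155), and the sketch covers only the point estimate, not the variance estimate or the test itself.

```python
from itertools import product


def dominance(xs, ys):
    """Return (p1, p2, p3, d) for two independent samples.

    p1 = P(X > Y), p2 = P(X = Y), p3 = P(X < Y), each estimated over all
    len(xs) * len(ys) pairs. The convention d = p3 - p1 is an assumption
    inferred from the table rows, not a definition given in the text.
    """
    n_pairs = len(xs) * len(ys)
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    equal = sum(1 for x, y in product(xs, ys) if x == y)
    less = n_pairs - greater - equal
    p1, p2, p3 = greater / n_pairs, equal / n_pairs, less / n_pairs
    return p1, p2, p3, p3 - p1


# Hypothetical correctness scores in percent (illustration only).
psp = [80.0, 90.0, 70.0, 85.0]
ltl = [60.0, 75.0, 70.0]

p1, p2, p3, d = dominance(psp, ltl)
print(f"p1={p1:.4f} p2={p2:.4f} p3={p3:.4f} d={d:.4f}")
```

For these made-up scores, ten of the twelve pairs have the first score above the second, one pair is tied and one is reversed, so d = 1/12 − 10/12 = −0.75. Under the assumed convention, a negative d means that the first group tends to score higher.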

Table 11

Cliff’s d of syntactic/semantic correctness and response time in SE2, one-tailed with confidence intervals (cf. Cliff [14] and Rogmann [70]), adjusted p-values (cf. Benjamini and Hochberg [8]), levels of significance (*, **, ***) and effect size magnitudes (cf. Kitchenham et al. [49])

	PSP/LTL	PSP/EPL	EPL/LTL
Syntactic correctness
p1 = P(X>Y)	0.7059	0.6071	0.6028
p2 = P(X=Y)	0.0038	0.0014	0.0046
p3 = P(X<Y)	0.2904	0.3916	0.3926
d	-0.4155	-0.2155	-0.2103
s_d	0.1054	0.1166	0.1148
z	-3.9412	-1.8477	-1.8308
CI low	-0.5733	-0.3976	-0.3899
CI high	-0.2281	-0.0171	-0.0152
p	7.7 × 10^-5	0.0339	0.0351
FDR adjusted p	0.0004	0.0658	0.0658
Level of significance	***	–	–
Effect size magnitude	Medium	–	–
Semantic correctness
p1 = P(X>Y)	0.8448	0.7153	0.6913
p2 = P(X=Y)	0.1356	0.0032	0.0058
p3 = P(X<Y)	0.1535	0.2816	0.3029
d	-0.6913	-0.4337	-0.3884
s_d	0.0794	0.1057	0.1066
z	-8.7104	-4.1028	-3.6445
CI low	-0.8006	-0.5909	-0.549
CI high	-0.5374	-0.2447	-0.2002
p	4.4 × 10^-14	4.4 × 10^-5	0.0002
FDR adjusted p	6.5 × 10^-13	0.0003	0.0008
Level of significance	***	***	***
Effect size magnitude	Large	Large	Medium
Response time
p1 = P(X>Y)	0.5928	0.5632	0.5298
p2 = P(X=Y)	0.0017	0.0023	0.0029
p3 = P(X<Y)	0.4055	0.4346	0.4673
d	-0.1873	-0.1286	-0.0626
s_d	0.1153	0.119	0.1185
z	-1.625	-1.0808	-0.5281
CI low	-0.3685	-0.3176	-0.2533
CI high	0.0076	0.0702	0.1329
p	0.0537	0.1413	0.2993
FDR adjusted p	0.0895	0.1766	0.2993
Level of significance	–	–	–
Effect size magnitude	–	–	–
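
The “FDR adjusted p” rows follow the step-up procedure of Benjamini and Hochberg [8]. The sketch below shows how such adjusted values can be computed. The function name `benjamini_hochberg` and the raw p-values are hypothetical. The adjusted values in the table depend on the complete family of tests that was adjusted together, which the table alone does not reveal, so the sketch does not attempt to reproduce them.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values (illustration only, not from the experiment).
raw = [0.001, 0.04, 0.03, 0.2]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw p = {p:.4f} -> adjusted p = {q:.4f}")
```

In this example the raw values 0.03 and 0.04 receive the same adjusted value, because the monotonicity step caps the former at the latter. The same mechanism would explain two different raw p-values sharing one adjusted value in a table row.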

In the master-level course ASE (cf. Table 12), there is a large-sized difference in syntactic correctness between EPL and LTL. Regarding semantic correctness, there are large-sized effects between PSP/LTL and EPL/LTL, indicating that in each pair the first-mentioned approach outperforms LTL. As in SE2, there are no significant differences regarding the response times. Consequently, the null hypothesis for the PSP/LTL comparison can only be rejected on the basis of semantic correctness, whereas the null hypothesis for the EPL/LTL comparison is rejected based on both types of correctness.

Table 12

Cliff’s d of syntactic/semantic correctness and response time in ASE, one-tailed with confidence intervals (cf. Cliff [14] and Rogmann [70]), adjusted p-values (cf. Benjamini and Hochberg [8]), levels of significance (*, **, ***) and effect size magnitudes (cf. Kitchenham et al. [49])

	PSP/LTL	PSP/EPL	EPL/LTL
Syntactic correctness
p1 = P(X>Y)	0.6303	0.4069	0.7718
p2 = P(X=Y)	0.0058	0	0.006
p3 = P(X<Y)	0.3697	0.5931	0.2222
d	-0.2605	0.1863	-0.5496
s_d	0.1923	0.1991	0.1404
z	-1.3547	0.9354	-3.9153
CI low	-0.5429	-0.1513	-0.7397
CI high	0.0748	0.4849	-0.2783
p	0.092	0.1777	0.0002
FDR adjusted p	0.1971	0.2961	0.0012
Level of significance	–	–	**
Effect size magnitude	–	–	Large
Semantic correctness
p1 = P(X>Y)	0.7815	0.4461	0.8373
p2 = P(X=Y)	0	0.0025	0.002
p3 = P(X<Y)	0.2185	0.5515	0.1607
d	-0.563	0.1054	-0.6766
s_d	0.1517	0.1938	0.1229
z	-3.7109	0.5438	-5.5023
CI low	-0.7633	-0.2153	-0.8322
CI high	-0.2641	0.4055	-0.4221
p	0.0003	0.2948	9.6 × 10^-7
FDR adjusted p	0.0017	0.3641	1.4 × 10^-5
Level of significance	**	–	***
Effect size magnitude	Large	–	Large
Response time
p1 = P(X>Y)	0.5686	0.4755	0.6349
p2 = P(X=Y)	0.0112	0	0.002
p3 = P(X<Y)	0.4202	0.5245	0.3631
d	-0.1485	0.049	-0.2718
s_d	0.194	0.1894	0.1697
z	-0.7652	0.2588	-1.6016
CI low	-0.4451	-0.2595	-0.5243
CI high	0.1775	0.3485	0.0246
p	0.2246	0.3985	0.0583
FDR adjusted p	0.3062	0.4703	0.1507
Level of significance	–	–	–
Effect size magnitude	–	–	–
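
The rows of the tables are related in a simple way, which the following sketch illustrates: the test statistic appears to be z = d / s_d, and the effect size magnitude is a label for |d|. The helper names `z_score` and `magnitude` are hypothetical. The cut-offs 0.112, 0.276 and 0.428 are an assumption, since this section does not state the thresholds behind the labels taken from Kitchenham et al. [49]. They are merely consistent with the labels shown in Tables 11 and 12, where a magnitude is reported only for significant results.

```python
def z_score(d, s_d):
    """Test statistic, assuming z = d / s_d as the table rows suggest."""
    return d / s_d


def magnitude(d, small=0.112, medium=0.276, large=0.428):
    """Label |d| with an effect size magnitude.

    The default cut-offs are assumptions; they are not given in the text.
    """
    size = abs(d)
    if size >= large:
        return "Large"
    if size >= medium:
        return "Medium"
    if size >= small:
        return "Small"
    return "Negligible"


# (d, s_d) pairs copied from the significant results in Table 12.
rows = {
    "Syntactic correctness, EPL/LTL": (-0.5496, 0.1404),
    "Semantic correctness, PSP/LTL": (-0.563, 0.1517),
    "Semantic correctness, EPL/LTL": (-0.6766, 0.1229),
}

results = {
    name: (z_score(d, s_d), magnitude(d)) for name, (d, s_d) in rows.items()
}
for name, (z, label) in results.items():
    print(f"{name}: z = {z:.4f}, magnitude = {label}")
```

Because d and s_d are rounded to four decimals in the table, the recomputed z values agree with the reported ones only up to small rounding differences.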

Table 13 contains the test results for participants with industry experience. There is no significant difference in terms of syntactic correctness and response time. Similarly to ASE, there is no significant difference in semantic correctness between PSP and EPL, while there are significant differences with large-sized effects when comparing PSP against LTL and EPL against LTL.

Table 13

	PSP/LTL	PSP/EPL	EPL/LTL
Syntactic correctness
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_1 = P(X>Y)$$\end{document}p1=P(X>Y)	0.6471	0.5321	0.6636
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_2 = P(X=Y)$$\end{document}p2=P(X=Y)	0.0029	0	0.0023
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_3 = P(X<Y)$$\end{document}p3=P(X<Y)	0.35	0.4679	0.3341
d	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-0.2971$$\end{document}-0.2971	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-0.0642$$\end{document}-0.0642	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-0.3295$$\end{document}-0.3295
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s_d$$\end{document}sd	0.1875	0.1934	0.1702
z	1.5845	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-0.3318$$\end{document}-0.3318	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-1.9358$$\end{document}-1.9358
CI low	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-0.5697$$\end{document}-0.5697	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-0.3682$$\end{document}-0.3682	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-0.5774$$\end{document}-0.5774
CI high	0.0345	0.2523	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-0.0260$$\end{document}-0.0260
p	0.061	0.371	0.0299
FDR adjusted p	0.1526	0.4752	0.1043
Level of significance	–	–	–
Effect size magnitude	–	–	–
Semantic correctness
p1 = P(X>Y)	0.7824	0.5802	0.7295
p2 = P(X=Y)	0	0	0.0023
p3 = P(X<Y)	0.2176	0.4198	0.2682
d	-0.5647	-0.1604	-0.4614
s_d	0.1501	0.1907	0.1582
z	-3.7622	-0.8412	-2.9159
CI low	-0.7632	-0.4513	-0.6812
CI high	-0.2687	0.1613	-0.1652
p	0.0003	0.2028	0.0029
FDR adjusted p	0.0023	0.3803	0.0145
Level of significance	**	–	*
Effect size magnitude	Large	–	Large
Response time
p1 = P(X>Y)	0.5059	0.5134	0.4909
p2 = P(X=Y)	0.0029	0	0.0045
p3 = P(X<Y)	0.4912	0.4866	0.5045
d	-0.0147	-0.0267	0.0136
s_d	0.1986	0.191	0.1822
z	-0.074	-0.14	0.0749
CI low	-0.3314	-0.331	-0.2808
CI high	0.305	0.2825	0.3057
p	0.4707	0.4447	0.4704
FDR adjusted p	0.4752	0.4752	0.4752
Level of significance	–	–	–
Effect size magnitude	–	–	–

Cliff’s d of syntactic/semantic correctness and response time for participants with industry experience, one-tailed with confidence intervals (cf. Cliff [14] and Rogmann [70]), adjusted p-values (cf. Benjamini and Hochberg [8]) [level of significance marked by *, ** and ***] and effect size magnitudes (cf. Kitchenham et al. [49])

Tables 14 and 15 summarize the test results regarding perceived correctness and perceived ease of application. All but two test results are not significant: (1) In SE2, the comparison between PSP and LTL with regard to perceived correctness is significant (FDR-adjusted p = 0.0316) with a medium-sized effect. Consequently, the corresponding null hypothesis can be rejected in SE2. That is, PSP participants are significantly more confident than LTL participants that their formalization is correct at the bachelor level, while no such effect is measurable at the master level or within the sample of industry participants. (2) Participants with industry experience rate the ease of application of EPL significantly higher than that of LTL (FDR-adjusted p = 0.0023). Consequently, the corresponding null hypothesis can be rejected for participants with industry experience.
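The rows of the Cliff's d tables are linked by simple arithmetic, which can help when reading them. The following Python sketch assumes, based on the reported numbers rather than on an explicit statement in the text, that d is obtained as P(X<Y) − P(X>Y) and that z is the ratio of d to s_d. The function names and the toy ratings are hypothetical; the study itself used the R package orddom.

```python
from itertools import product


def dominance(xs, ys):
    """Return (p1, p2, p3, d) for two samples of ordinal scores.

    p1 = P(X>Y), p2 = P(X=Y), p3 = P(X<Y) over all pairs (x, y).
    Sign convention assumed from the tables: d = p3 - p1.
    """
    pairs = list(product(xs, ys))
    n = len(pairs)
    p1 = sum(x > y for x, y in pairs) / n
    p2 = sum(x == y for x, y in pairs) / n
    p3 = sum(x < y for x, y in pairs) / n
    return p1, p2, p3, p3 - p1


def z_score(d, s_d):
    """Test statistic: the estimate divided by its standard error."""
    return d / s_d


# Hypothetical toy ratings (not data from the study).
x = [1, 2, 2, 3]
y = [2, 3, 3, 4]
p1, p2, p3, d = dominance(x, y)
print(p1, p2, p3, d)

# Reported row: semantic correctness, PSP/LTL, industry experience.
d_from_p = 0.2176 - 0.7824            # p3 - p1 from the table
print(round(d_from_p, 4))             # table reports d = -0.5647
print(round(z_score(-0.5647, 0.1501), 4))  # table reports z = -3.7622
```

The small gap between the recomputed and the reported d is consistent with the probabilities being rounded to four decimals. The confidence bounds are not symmetric around d, so they cannot be recovered by adding a fixed multiple of s_d.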

Table 14

Cliff’s d of perceived correctness and ease of application in SE2 and ASE, one-tailed with confidence intervals (cf. Cliff [14] and Rogmann [70]), adjusted p-values (cf. Benjamini and Hochberg [8]) [level of significance marked by *, ** and ***] and effect size magnitudes (cf. Kitchenham et al. [49])

	PSP/LTL	PSP/EPL	EPL/LTL
Software Engineering 2 (Bachelor-level course)
Perceived correctness
p1 = P(X>Y)	0.4336	0.4087	0.392
p2 = P(X=Y)	0.2485	0.2589	0.259
p3 = P(X<Y)	0.3179	0.3324	0.349
d	-0.1157	-0.0763	-0.043
s_d	0.05	0.0511	0.0502
z	-2.3139	-1.494	-0.8566
CI low	-0.197	-0.1597	-0.1253
CI high	-0.0328	0.0081	0.0398
p	0.0105	0.0679	0.196
FDR adjusted p	0.0316	0.1019	0.2262
Level of significance	*	–	–
Effect size magnitude	Medium	–	–
Perceived ease of application
p1 = P(X>Y)	0.4213	0.4005	0.3881
p2 = P(X=Y)	0.2518	0.2569	0.2631
p3 = P(X<Y)	0.3269	0.3426	0.3488
d	-0.0945	-0.0579	-0.0394
s_d	0.0502	0.0513	0.0501
z	-1.8827	-1.1272	-0.7857
CI low	-0.1762	-0.1417	-0.1214
CI high	-0.0114	0.0268	0.0432
p	0.0302	0.1301	0.2162
FDR adjusted p	0.0658	0.1766	0.2317
Level of significance	–	–	–
Effect size magnitude	–	–	–
Advanced Software Engineering (master-level course)
Perceived correctness
p1 = P(X>Y)	0.3675	0.4013	0.3095
p2 = P(X=Y)	0.3039	0.2914	0.324
p3 = P(X<Y)	0.3286	0.3074	0.3664
d	-0.0389	-0.0939	0.0569
s_d	0.0808	0.0778	0.0722
z	-0.481	-1.2065	0.7882
CI low	-0.1706	-0.22	-0.0623
CI high	0.0942	0.0352	0.1745
p	0.3155	0.1145	0.2157
FDR adjusted p	0.3641	0.2147	0.3062
Level of significance	–	–	–
Effect size magnitude	–	–	–
Perceived ease of application
p1 = P(X>Y)	0.4338	0.3752	0.4233
p2 = P(X=Y)	0.2613	0.2616	0.2891
p3 = P(X<Y)	0.3049	0.3632	0.2876
d	-0.129	-0.012	-0.1356
s_d	0.0827	0.0807	0.0725
z	-1.5594	-0.1481	-1.8697
CI low	-0.262	-0.144	-0.2526
CI high	0.009	0.1205	-0.0147
p	0.0603	0.4412	0.0314
FDR adjusted p	0.1507	0.4412	0.1178
Level of significance	–	–	–
Effect size magnitude	–	–	–

Table 15

	PSP/LTL	PSP/EPL	EPL/LTL
Perceived correctness
p1 = P(X>Y)	0.3586	0.344	0.37
p2 = P(X=Y)	0.2778	0.2872	0.2745
p3 = P(X<Y)	0.3636	0.3689	0.3555
d	0.0051	0.0249	-0.0145
s_d	0.0813	0.0793	0.0768
z	0.0622	0.3142	-0.1894
CI low	-0.1283	-0.1055	-0.1402
CI high	0.1383	0.1545	0.1116
p	0.4752	0.3769	0.425
FDR adjusted p	0.4752	0.4752	0.4752
Level of significance	–	–	–
Effect size magnitude	–	–	–
Perceived ease of application
p1 = P(X>Y)	0.4078	0.3006	0.5014
p2 = P(X=Y)	0.2734	0.2524	0.2555
p3 = P(X<Y)	0.3188	0.447	0.2432
d	-0.0889	0.1463	-0.2581
s_d	0.0826	0.0802	0.0733
z	-1.0765	1.8252	-3.5228
CI low	-0.2226	0.0124	-0.3744
CI high	0.048	0.2751	-0.1339
p	0.1416	0.0348	0.0003
FDR adjusted p	0.3033	0.1042	0.0023
Level of significance	–	–	**
Effect size magnitude	–	–	Medium

Cliff’s d of perceived correctness and ease of application for participants with industry experience, one-tailed with confidence intervals (cf. Cliff [14] and Rogmann [70]), adjusted p-values (cf. Benjamini and Hochberg [8]) [level of significance marked by *, ** and ***] and effect size magnitudes (cf. Kitchenham et al. [49])

The statistics software R was used for all statistical analyses. In particular, the following libraries were used in the course of the performed statistical evaluations: biotools [75], car [33], ggplot2 [85], mvnormtest [76], mvoutlier [63], orddom [70], psych [67] and usdm [58].
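The adjusted p-values in the tables follow the step-up procedure of Benjamini and Hochberg [8]. As a worked illustration, the Python sketch below re-applies that procedure to the 15 unadjusted one-tailed p-values reported for participants with industry experience (five measures, three comparisons each). Treating exactly these 15 tests as one family is an assumption inferred from the reported numbers, not something the text states, and the function name `bh_adjust` is hypothetical. The study itself performed the adjustment in R.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Walk from the largest p-value down to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # rank of pvalues[i] in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Unadjusted p-values as printed in the tables,
# in column order PSP/LTL, PSP/EPL, EPL/LTL.
industry_p = {
    "syntactic correctness": [0.061, 0.371, 0.0299],
    "semantic correctness": [0.0003, 0.2028, 0.0029],
    "response time": [0.4707, 0.4447, 0.4704],
    "perceived correctness": [0.4752, 0.3769, 0.425],
    "perceived ease of application": [0.1416, 0.0348, 0.0003],
}
# "FDR adjusted p" rows as printed in the tables, same order.
reported = [
    0.1526, 0.4752, 0.1043,
    0.0023, 0.3803, 0.0145,
    0.4752, 0.4752, 0.4752,
    0.4752, 0.4752, 0.4752,
    0.3033, 0.1042, 0.0023,
]

flat = [p for row in industry_p.values() for p in row]
adjusted = bh_adjust(flat)
for k, name in enumerate(industry_p):
    print(name, [round(a, 4) for a in adjusted[3 * k:3 * k + 3]])

max_dev = max(abs(a - r) for a, r in zip(adjusted, reported))
print("largest deviation from the reported values:", max_dev)
```

Under this assumption the recomputed values stay within a few ten-thousandths of the reported ones, which is what one would expect when the inputs are themselves rounded to four decimals.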

Discussion

This section discusses the results of the study and the threats to its validity.

Evaluation of results and implications

The experimental goal was stated as follows: analyze LTL, PSP and EPL for the purpose of their evaluation with respect to their understandability related to modeling compliance specifications from the viewpoint of the novice and moderately advanced software engineer, designer or developer in the context of the Software Engineering 2 Lab and the Advanced Software Engineering Lab courses at the Faculty of Computer Science of the University of Vienna. Due to the large number of participants with industry experience, it became possible to consider a third population, namely participants with industry experience, who function as proxies for industrial practitioners with basic to modest industry experience. Based upon the stated goal, questions concerning understandability were generated. The understandability construct focuses on the degree of syntactic and semantic correctness achieved and on the time spent on modeling compliance specifications. The results per question are summarized in Table 16. Differentiating between syntactic and semantic correctness revealed that differences in understandability in the formal modeling of compliance specifications lie predominantly in semantic correctness. Almost all test results regarding semantic correctness are highly significant with large-sized effects. Interestingly, no significant difference in semantic correctness is present between the pattern-based PSP approach and the CEP-based EPL language in the master-level course ASE and in the subset of participants with industry experience. That might imply that more experienced users are able to cope equally well with both approaches. Aside from that, the results suggest that the pattern-based PSP approach is more understandable than EPL and LTL and that EPL provides a higher level of understandability than LTL. In terms of syntactic correctness, PSP seems to be more understandable than LTL for less experienced users, while EPL seems to be more understandable than LTL for more experienced users.
This study did not reveal any significant differences in response time. Regarding perceived correctness and perceived ease of application, there are two significant test results, which imply that transformations to PSP are perceived to be more correct than LTL transformations by less experienced users, and more experienced users with industry experience find that EPL is easier to apply than LTL.

Table 16

GQM summary

ID	Question	Summary of results
Q1	How understandable are the tested approaches for participants at the bachelor level (attending the Software Engineering 2 Lab course)?
Q2	Are there differences in understandability between the tested approaches for participants at the bachelor level (attending the Software Engineering 2 Lab course)?	There are significant differences between all tested approaches in terms of semantic correctness, and between PSP and LTL in terms of syntactic correctness
Q3	How understandable are the tested approaches for participants at the master level (attending the Advanced Software Engineering Lab course)?
Q4	Are there differences in understandability between the tested approaches for participants at the master level (attending the Advanced Software Engineering Lab course)?	There are significant differences in terms of semantic and syntactic correctness between EPL and LTL, and between PSP and LTL in terms of semantic correctness
Q5	How understandable are the tested approaches for participants with industrial working experience?
Q6	Are there differences in understandability between the tested approaches for participants with industrial working experience?	There are significant differences in terms of semantic correctness between PSP and LTL as well as between EPL and LTL

Overall, the results imply that the pattern-based PSP approach has advantages with regard to understandability. Therefore, the pattern-based approach seems to be particularly well suited for modeling compliance specifications. Moreover, the results indicate that EPL is more understandable than LTL. This could be important in cases where the set of available PSP patterns is not sufficient to model a compliance specification. In such cases, the compliance specification could be encoded in EPL for runtime verification or an extension of the pattern catalog could take place. In this regard, EPL specifications could be used to aid the creation of new patterns with underlying LTL formalizations by checking the plausibility of the LTL formula (cf. Czepa et al. [18, 19]). Moreover, the results are overall in line with two controlled experiments on the understandability of already existing formal specifications in LTL, EPL and PSP carried out by Czepa and Zdun [17]. The results of these controlled experiments with 216 participants in total suggested that existing specifications in PSP are significantly easier to understand than existing specifications in EPL and LTL. Moreover, the results implied that existing specifications in EPL are significantly easier to understand than existing specifications in LTL. The correctness of understanding was evaluated by letting the participant decide whether a truth value is the correct truth value of a specification, given a specific trace. In contrast to the current study, which focuses on the formal modeling of compliance specifications, no major differences between novice and moderately advanced users were found in understandability of existing specifications. Interestingly, the response times between the experimental groups were significantly different in most cases, an effect which appears to be absent during modeling (cf. Sect. 4.3).

Threats to validity

In the following, all known threats that might have an impact on the validity of the results of this study are discussed.

Threats to internal validity

Threats to internal validity are unobserved variables that might have an undesired impact on the result of the experiment by disturbing the causal relationship of independent and dependent variables. There exist several threats to internal validity, which must be discussed: History effects refer to events that happen in the environment resulting in changes in the conditions of a study. The short duration of the study limits the possibility of changes in environmental conditions, and none were observed. The occurrence of such effects prior to the study cannot be entirely ruled out. However, in such a case, it would be extremely unlikely that the scores of one experiment group are more affected than another, because of the random allocation of participants to groups. Maturation effects refer to the impact the passage of time has on an individual. Like history effects, maturation effects are rather problematic in long-term studies. Since the duration of the experiment was short, maturation effects are considered to be of minor importance, and none were observed. Testing effects comprise learning effects and experimental fatigue. Learning effects were avoided by testing each person only once. Experimental fatigue is concerned with happenings during the experiment that exhaust the participant either physically or mentally. The short time frame of the experiment session limits chances of fatigue. Neither were any signs of fatigue observed nor were there any reports by participants indicating fatigue. Instrumental bias occurs if the measuring instrument (i.e., a physical measuring device or the actions/assessment of the researcher) changes over time during the experiment. Since the answers given in the experiment tasks were evaluated manually, this is a serious threat to validity. It might be the case that the experience gained in scoring some answers had an influence on subsequent evaluations. 
This threat was mitigated by evaluating the results in no specific prescribed order, and in case of substantial differences in grading, a discussion took place until consensus was achieved. Selection bias is present if the experimental groups are unequal before the start of the experiment (e.g., severe differences in previous experience). Selection bias is likely to be more threatening in quasi-experimental research. By using an experimental design with the fundamental requirement to randomly assign participants to the different groups of the experiment, it became possible to avoid selection bias to a large extent. In addition, the investigation of the composition of the groups did not reveal any major differences between them (cf. “Appendix D”). Experimental mortality more likely occurs in long-lasting experiments since the chances for dropouts increase (e.g., participants leaving the town). Due to the short time frame of this study, experimental mortality did not occur. Diffusion of treatments is present if at least one group is contaminated by the treatments of at least one other group. Since the participants share the same social group, and they are interacting outside the research process as well, a cross-contamination between the groups cannot be entirely ruled out. Compensatory rivalry is present if participants of a group put in extra effort when the impression arises that the treatment of another group might lead to better results than their own treatment. This threat was mitigated by clarifying that different degrees of difficulty will be considered and compensated in the calculation of bonus points. Demoralization could occur if a participant is assigned to a specific group that she/he does not want to be part of. No indications of demoralization such as increased dropout rates or complaints regarding group allocation were observed. 
Experimenter bias refers to undesired effects on the dependent variables that are unintentionally introduced by the researcher. All participants received a similar training and worked on the same set of tasks. A manual evaluation of the given answers regarding their correctness was performed. To mitigate the threat of experimenter bias in that regard, the first, second and third author performed the evaluation of all tasks individually. Differentiating between semantic and syntactic correctness overall simplified the evaluation process by enabling a separation of concerns. A potential threat in that regard could be falsely classifying defects. Therefore, after the completion of all individual evaluations, in case of substantial differences in grading, a discussion took place until consensus was achieved.

Threats to external validity

The external validity of a study focuses on its generalizability. In the following, potential threats that hinder a generalization are discussed. Different types of generalizations must be considered: Generalizations across populations: By statistical inference, generalizations from the sample to the immediate population are made. The initial study design considered two populations, namely computer science students that enrolled in the course SE2 as proxies for novice software engineers, designers or developers, as well as computer science students that enrolled in the course ASE as proxies for moderately advanced software engineers, designers or developers. Due to the large number of participants with industry experience, it became possible to consider a third population, namely participants with industry experience, who function as proxies for industrial practitioners with basic to modest industry experience. The results of this study show interesting discrepancies between these populations. In particular, there are no significant differences in understandability between PSP and EPL for more advanced users while a significant difference is measurable when testing less experienced users. In general, this study does not intend to claim generalizability to other populations without further empirical evidence. For example, it might be plausible that leading experts working in the software industry or as business administrators perform similarly to ASE participants or the subset of participants with industry experience, but this study can neither support nor reject such claims. Generalizations across treatments: The treatments are equivalent to specific tested languages. Treatment variations would likely be related to changing the contents, amount or difficulty of experiment tasks or the amount of training provided. 
The experiment design attempts to be as general as possible by using compliance specifications stemming from different domains and applying a moderate amount of training. Generalizations across settings/contexts: The participants of this study are students who enrolled in computer science courses at the University of Vienna, Austria. The majority of the students are Austrian citizens, but there is a large presence of foreign students as well. Surely, it would be interesting to repeat the experiment in different settings/contexts to evaluate the generalizability in that regard. For example, repeating the experiment with English native speakers might lead to different and presumably better results. Generalizations across time: It is hard to foresee whether the results of this study will hold over time. For example, if teaching of a specific tested language is intensified in the computer science curricula at the University of Vienna, then the students would bring in more expertise, which likely would have an impact on the results.

Threats to construct validity

There are potential threats to the validity of the construct that must be discussed: Inexact definition and Construct confounding: This study has a primary focus on the construct understandability, which is measured by the dependent variables syntactic correctness, semantic correctness and response time. This construct is exact and adequate, and the dependent variables syntactic correctness and semantic correctness make an even more fine-grained analysis possible than in existing studies that measure correctness by a single variable (cf. Feigenspan et al. [31] and Hoisl et al. [40]). Mono-method bias: Due to organizational reasons, keeping time records was the personal responsibility of each participant. The participants were carefully instructed how to record start and end times, and we did not detect any irregularities (e.g., overlapping time frames or long pauses) in those records. Nonetheless, this measuring method leaves room for measuring errors, and an additional or alternative measuring method (e.g., direct observation by experimenters or performing the experiment with an online tool that handles record keeping) would reduce this threat. However, these methods would have influenced the overall study design and potentially could have introduced other threats to validity (e.g., prolonged experiment execution potentially leading to an exposure of the experiment task contents or technical problems during experiment execution). To avoid mono-method bias in evaluating the syntactic and semantic correctness, the grading was not performed by a single but by three experimenters individually. Reducing levels of measurements: Both correctness variables and the response time are continuous variables. That is, the levels of measurements are not reduced. The Likert scales used in this study offer 5 answer categories rather than 7 or 11, because the latter would produce data of lower quality according to Revilla et al. [68]. 
Treatment-sensitive factorial structure: In some empirical studies, a treatment might sensitize participants to develop a different view on a construct. The actual level of understandability based on the task solutions provided was measured, so the participants’ view on this construct appears to be irrelevant.

Threats to content validity

Content validity is concerned with the relevance and representativeness of the elements of a study for the measured construct: Relevance: The tasks of this study are based on realistic scenarios stemming from three different domains in which compliance is highly relevant (cf. Elgammal et al. [29], Rovani et al. [71], and United States Environmental Protection Agency [83]). Representativeness: In the formal modeling of the compliance specifications, the use of all core temporal LTL operators and EPL operators was required, which means that the construct understandability was measured comprehensively. The use of each PSP pattern was required two or more times (cf. sample solutions of experimental tasks in “Appendix A”). Unfortunately, it was not possible to test all available pattern–scope combinations. However, the majority of specifications are based on the global scope (cf. Dwyer et al. [27, 28]), which is also reflected in the realistic specifications used in the tasks of this experiment (cf. experimental tasks in Table 7 and sample solutions in “Appendix A”). That is, a representative subset of PSP was tested.

Threats to conclusion validity

Thorough statistical investigations of model assumptions were performed before applying the most suitable statistical test with the greatest statistical power, given the properties of the acquired data. That course of action is considered to be highly beneficial to the conclusion validity of this study. The decision to retain outliers might be a threat to conclusion validity, but all outliers appear to be valid measurements, so deleting them would pose a threat to conclusion validity as well.

Related work

We are not aware of any empirical studies evaluating the understandability related to the formal modeling of compliance specifications in particular. There exists, however, related work focusing on similar issues. Related studies in the field of business process management are concerned with declarative workflows (cf. van der Aalst [1]), which use graphical patterns with underlying formal representations in LTL (cf. Montali [56]) or event calculus (cf. Montali et al. [57]). Haisjackl and Zugal [35] investigated differences between textual and graphical declarative workflows in an empirical study with 9 participants. The descriptive statistics of this study indicate that the graphical representation is advantageous in terms of perceived understandability, error rate, duration and mental effort. The lack of hypothesis testing and the small number of participants are severe threats to the validity of this study. Zugal et al. [87] investigated the understandability of hierarchies on the basis of the same data set. The results of their research indicate that hierarchies must be handled with care. While information hiding and improved pattern recognition are considered to be positive aspects of hierarchies since the mental effort for understanding a process model is lowered, the fragmentation of processes by hierarchies might lower overall understandability of the process model. Another important finding of their study is that users appear to approach declarative process models in a sequential manner even if the user is definitely not biased by previous experiences with sequential/imperative business process models. They conclude that the abstract nature of declarative process models does not seem to fit the human way of thinking. 
Moreover, they observed that the participants of their study tried to reduce the number of constraints to consider by putting away sheets that describe irrelevant sub-processes or by using the hand to hide parts of the process model that are irrelevant. Like in the previously discussed study, it must be assumed that the validity of this study is strongly limited by the extremely small sample size. Haisjackl et al. [36] investigate the users’ understanding of declarative business process models, again on the same data set. As in the previously mentioned study, they point out that users tend to read such models sequentially despite the declarative nature of the approach. The larger a model, the more often hidden dependencies are overlooked, which indicates that increasing numbers of constraints lower understanding. Moreover, they report that single constraints are overall well understood, but there seem to be problems with understanding the precedence constraint. As the authors point out, this kind of confusion could be related to the graphical arrow-based representation of the constraints where subtle differences decide on the actual meaning. That is, the arrow could be confused with a sequence flow as present in flow-driven, sequential business processes. As previously stated for the other two studies that are based on the same data set, the validity of this study is possibly strongly affected by the small sample size. De Smedt et al. [26] tried to improve the understandability of declarative business process models by explicitly revealing hidden dependencies. They conducted an experiment with 95 students. The result suggests that explicitly showing hidden dependencies enables a better understandability of declarative business process models. Pichler et al. [64] compared the understandability of imperative and declarative business process modeling notations. The results of this study are in line with Zugal et al. 
[87] and suggest that imperative process models are significantly more understandable than declarative models, but the authors also state that the participants had more previous experience with imperative process modeling than with declarative process modeling. The small sample size (28 participants) is a threat to validity of this study. Rodrigues et al. [69] compared the understandability of textual and graphical BPMN [59] business process descriptions with 32 students and 41 practitioners. They conclude that experienced users understand a process better if it is presented by a graphical BPMN process model whereas for inexperienced users there is no difference in understandability between the textual and graphical process descriptions. Jost et al. [46] compared the intuitive understanding of process diagrams with 103 students. They conclude that UML activity diagrams provide a higher level of understandability than BPMN diagrams and EPCs. Software architecture compliance, which focuses on the alignment of software architecture and implementation, and requirements engineering are also related to this study. Czepa et al. [21] compared the understandability of three languages for behavioral software architecture compliance checking, namely the natural language constraint (NLC) language, the cause–effect constraint (CEC) language and the temporal logic pattern-based constraint (TLC) language, in a controlled experiment with 190 participants. The NLC language simply refers to using the English language for documenting software architectures. CEC is a high-level structured architectural description language that abstracts EPL. It supports the nesting of cause parts, which observe an event stream for a specific event pattern, and effect parts, which can contain further cause–effect structures and truth value change commands. TLC is a high-level structured architectural description language based on PSP. 
Interestingly, the statistical inference of this study suggests that there is no difference in understandability of the tested languages. This could indicate that the high-level abstractions employed bring those structured languages closer to the understandability of unstructured natural language architecture descriptions. Moreover, it might also suggest that natural language leaves more room for ambiguity, which is detrimental for its understanding. Potential limitations of that study are that its tasks are based on common architectural patterns/styles (i.e., a participant possibly recognizes the meaning of a constraint more easily by having knowledge of the related architectural pattern) and the rather small set of involved patterns (i.e., only very few patterns of PSP were necessary to represent the architecture descriptions). A controlled experiment carried out by Heijstek et al. [38] with 47 participants focused on finding differences in understanding of textual and graphical software architecture descriptions. Interestingly, participants who predominantly used textual architecture descriptions performed significantly better, which suggests that textual architectural descriptions could be superior to their graphical counterparts. An eye-tracking experiment with 28 participants by Sharafi et al. [74] on the understandability of graphical and textual software requirement models did not reveal any statistically significant difference in terms of correctness of the approaches. The study also reports that the response times of participants working with the graphical representations were slower. Interestingly though, the participants preferred the graphical notation. Hoisl et al. [40] conducted a controlled experiment on three notations for scenario-based model tests with 20 participants. 
In particular, they evaluated the understandability of a semi-structured natural language scenario notation, a diagrammatic scenario notation and a fully structured textual scenario notation. According to the authors, the purely textual semi-structured natural language scenario notation is recommended for scenario-based model tests, because the participants of this group were able to solve the given tasks faster and more correctly. That is, the study might indicate that a textual approach outperforms a graphical one for scenario-based model tests, but the validity of the experiment is limited by the small sample size and the absence of statistical hypothesis testing.

Conclusion and future work

The main goal of this empirical study was testing and comparing the understandability of representative approaches for the formal modeling of compliance specifications. The experiment was conducted with 215 participants in total. Major differences were found especially in semantic correctness of the approaches. Since formalizations in the property specification patterns (PSP) were overall more correct than in linear temporal logic (LTL) and event processing language (EPL), there is evidence that the pattern-based PSP approach provides a higher level of understandability. More advanced users, however, seemingly are able to cope equally well with PSP and EPL. That is, for more advanced users, these approaches can be used interchangeably as fitting best to a concrete domain or task. Moreover, EPL provides a higher level of understandability than LTL. Therefore, EPL is well suited to situations that demand runtime verification in which the set of available patterns in PSP is not sufficient to model a compliance specification or to aid the creation of new patterns with underlying LTL formalizations (cf. Czepa et al. [18, 19]). Moreover, the results are overall in line with two controlled experiments with 216 participants in total on the understandability of already existing formal specifications in LTL, EPL and PSP (cf. Czepa and Zdun [16]). In contrast to the current study, which focuses on the formal modeling of compliance specifications, no major differences between novice and moderately advanced users were found in understandability of existing specifications. Interestingly, the response times between the experimental groups were significantly different in most cases, an effect which appears to be absent during modeling. Opportunities for further empirical research are the consideration of an extended set of representations including, for example, event calculus (cf. Kowalski and Sergot [52]) or Declare (cf. 
Pešić and van der Aalst [61]) and studying the understandability construct in different settings with other user groups (e.g., business administrators or professional software engineers). Moreover, besides the understandability construct, additional metrics such as changeability (i.e., “Is one representation easier to change when taking new/amended compliance specifications into account?”) and verifiability (i.e., “Are there differences between the representations when it comes to assessing whether a given compliance specification is fully covered?”) could be investigated.

Table 17

Sample solution of Task 1

Group	Sample solution
EPL	init ==> TS
	not ‘Evaluate Loan Risk’.completed until ‘Officially Sign Contract’.started ==> PV
	‘Evaluate Loan Risk’.completed ==> PS
	init ==> TS
	‘Evaluate Loan Risk’.role != ‘Branch Office Manager’ ==> PV
	init ==> TS
	‘Officially Sign Contract’.role != ‘Branch Office Manager’ ==> PV
LTL	! ‘Officially Sign Contract’.started W ‘Evaluate Loan Risk’.completed
	G! (‘Officially Sign Contract’.role != ‘Branch Office Manager’)
	G! (‘Evaluate Loan Risk’.role != ‘Branch Office Manager’)
PSP	‘Evaluate Loan Risk’.completed precedes ‘Officially Sign Contract’.started
	‘Officially Sign Contract’.role != ‘Branch Office Manager’ never occurs
	‘Evaluate Loan Risk’.role != ‘Branch Office Manager’ never occurs
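
The correspondence between the PSP and LTL sample solutions for the first rule of Task 1 can be illustrated with a minimal Python sketch. It assumes a finite-trace reading of the weak until operator W and represents each trace position as a set of proposition names; the helper names `weak_until` and `precedes` are hypothetical and not part of the study's material.

```python
from typing import Callable, FrozenSet, Sequence

# One trace position: the set of atomic propositions that hold there (assumed encoding).
Event = FrozenSet[str]
Pred = Callable[[Event], bool]


def weak_until(trace: Sequence[Event], hold: Pred, release: Pred) -> bool:
    """Finite-trace reading of `hold W release` (an assumption of this sketch).

    `hold` must be true at every position before the first position where
    `release` is true; if `release` never becomes true, `hold` must be true
    at every position of the trace.
    """
    for event in trace:
        if release(event):
            return True
        if not hold(event):
            return False
    return True


def precedes(trace: Sequence[Event], first: str, second: str) -> bool:
    """Global-scope 'first precedes second', encoded as `!second W first`."""
    return weak_until(
        trace,
        hold=lambda e: second not in e,
        release=lambda e: first in e,
    )


if __name__ == "__main__":
    risk = "Evaluate Loan Risk.completed"
    sign = "Officially Sign Contract.started"
    compliant = [frozenset({risk}), frozenset({sign})]
    violating = [frozenset({sign}), frozenset({risk})]
    print(precedes(compliant, risk, sign))
    print(precedes(violating, risk, sign))
```

Under these assumptions, a trace in which the contract is signed before the risk evaluation completes is rejected, while a trace in which neither event occurs is accepted, which matches the weak (rather than strong) until used in the LTL row.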

Table 18

Sample solution of Task 2

Group	Sample solution
EPL	init ==> TS
	every(‘Check Customer Privilege’.completed -> ‘Check Credit Worthiness’.started) ==> TS
	every ‘Check Customer Privilege’.started ==> TV
	init ==> TS
	not ‘Check Customer Privilege’.completed until ‘Evaluate Loan Risk’.started ==> PV
	‘Check Customer Privilege’.completed ==> PS
	init ==> TS
	not ‘Check Credit Worthiness’.completed until ‘Evaluate Loan Risk’.started ==> PV
	‘Check Credit Worthiness’.completed ==> PS
LTL	G(‘Check Customer Privilege’.completed -> F ‘Check Credit Worthiness’.started)
	! ‘Evaluate Loan Risk’.started W ‘Check Customer Privilege’.completed
	! ‘Evaluate Loan Risk’.started W ‘Check Credit Worthiness’.completed
PSP	‘Check Customer Privilege’.completed leads-to ‘Check Credit Worthiness’.started
	‘Check Customer Privilege’.completed precedes ‘Evaluate Loan Risk’.started
	‘Check Credit Worthiness’.completed precedes ‘Evaluate Loan Risk’.started
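
The first rule of Task 2, written as `leads-to` in PSP and as G(a -> F b) in LTL, can be sketched in the same style. This is a minimal illustration under stated assumptions: traces are finite, F includes the current position, and an obligation that is still open when the trace ends counts as a violation. The function name `leads_to` is hypothetical.

```python
from typing import FrozenSet, Sequence

# One trace position: the set of atomic propositions that hold there (assumed encoding).
Event = FrozenSet[str]


def leads_to(trace: Sequence[Event], cause: str, effect: str) -> bool:
    """Finite-trace reading of G(cause -> F effect) (an assumption of this sketch).

    Every position where `cause` holds must be followed, at the same or a
    later position, by one where `effect` holds.
    """
    pending = False  # True while some occurrence of `cause` still awaits `effect`
    for event in trace:
        if cause in event:
            pending = True
        if effect in event:
            pending = False  # discharges all obligations opened so far
    return not pending


if __name__ == "__main__":
    privilege = "Check Customer Privilege.completed"
    credit = "Check Credit Worthiness.started"
    print(leads_to([frozenset({privilege}), frozenset({credit})], privilege, credit))
    print(leads_to([frozenset({credit}), frozenset({privilege})], privilege, credit))
```

A single flag suffices here because one later occurrence of the effect satisfies every earlier occurrence of the cause; a runtime monitor that has to report intermediate states would need to expose the flag after each event rather than only at the end of the trace.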

Table 19

Sample solution of Task 3

Group	Sample solution
EPL	init ==> TS
	not ‘Preoperative Screening’.completed until ‘Laparoscopic Gastrectomy’.started ==> PV
	‘Preoperative Screening’.completed ==> PS
	init ==> TS
	not ‘Preoperative Screening’.completed until ‘Open Gastrectomy’.started ==> PV
	‘Preoperative Screening’.completed ==> PS
	init ==> TS
	‘Open Gastrectomy’.started leads-to ‘Laparoscopic Gastrectomy’.started ==> PV
	‘Laparoscopic Gastrectomy’.started leads-to ‘Open Gastrectomy’.started ==> PV
	init ==> TS
	every(‘Laparoscopic Gastrectomy’.completed leads-to ‘Nursing’.started) ==> TS
	every ‘Laparoscopic Gastrectomy’.completed ==> TV
	init ==> TS
	every(‘Open Gastrectomy’.completed leads-to ‘Nursing’.started) ==> TS
	every ‘Open Gastrectomy’.completed ==> TV
LTL	! ‘Laparoscopic Gastrectomy’.started W ‘Preoperative Screening’.completed
	! ‘Open Gastrectomy’.started W ‘Preoperative Screening’.completed
	(F ‘Open Gastrectomy’.started -> G! ‘Laparoscopic Gastrectomy’.started) & (F ‘Laparoscopic Gastrectomy’.started -> G! ‘Open Gastrectomy’.started)
	G(‘Laparoscopic Gastrectomy’.completed -> F ‘Nursing’.started)
	G(‘Open Gastrectomy’.completed -> F ‘Nursing’.started)
PSP	‘Preoperative Screening’.completed precedes ‘Laparoscopic Gastrectomy’.started
	‘Preoperative Screening’.completed precedes ‘Open Gastrectomy’.started
	after ‘Open Gastrectomy’.started [ ‘Laparoscopic Gastrectomy’.started never occurs ]
	after ‘Laparoscopic Gastrectomy’.started [ ‘Open Gastrectomy’.started never occurs ]
	‘Laparoscopic Gastrectomy’.completed leads-to ‘Nursing’.started
	‘Open Gastrectomy’.completed leads-to ‘Nursing’.started

Table 20

Sample solution of Task 4

Group	Sample solution
EPL	init ==> TS
	every(‘Lead Contamination identified’ leads-to not ‘Renovation’.completed until [‘Cleaning’.running and not ‘Presence of Certified Renovator’.running]) ==> PV
LTL	G(‘Lead Contamination identified’ & ! ‘Renovation’.completed -> (! (‘Cleaning’.running & ! ‘Presence of Certified Renovator’.running) W ‘Renovation’.completed))
PSP	after ‘Lead Contamination identified’ until ‘Renovation’.completed [‘Cleaning’.running and not ‘Presence of Certified Renovator’.running never occurs]

Table 21

Sample solution of Task 5

Group	Sample solution
EPL	init ==> TS
	not b.finished until r.started ==> PV
	not r.started until b.finished ==> PS
	init ==> TS
	not d.finished until r.started ==> PV
	not r.started until d.finished ==> PS
	init ==> TS
	not p.started until [y < 1978 and (t = ‘residential house’ or t = ‘apartment’ or t = ‘child-occupied facility’) and renovation.started] ==> PV
	p.started ==> PS
LTL	!r.started W (b.finished & !r.started)
	!r.started W (d.finished & !r.started)
	!(y < 1978 & (t = ‘residential house’ | t = ‘apartment’ | t = ‘child-occupied facility’) & renovation.started) W p.started
PSP	before r.started [ b.finished occurs ]
	before r.started [ d.finished occurs ]
	p.started precedes (y < 1978 and (t = ‘residential house’ or t = ‘apartment’ or t = ‘child-occupied facility’) and renovation.started)

Table 22

Shapiro–Wilk test of multivariate normality (* for , ** for , * for )

Group	SE2	ASE
LTL	W = 0.96138	W = 0.89909
	p = 0.09547	p = 0.03359 *
PSP	W = 0.94299	W = 0.96263
	p = 0.02316 *	p = 0.6813
EPL	W = 0.96448	W = 0.91843
	p = 0.1618	p = 0.05393

Table 23

Shapiro–Wilk test of univariate normality (* for p < 0.05, ** for p < 0.01, *** for p < 0.001)

Table 24

Welch’s t test of syntactic/semantic correctness and response time in SE2, one-tailed with confidence intervals calculated for a 95% confidence level (cf. Welch [84]) and adjusted p-values (cf. Benjamini and Hochberg [8]) [Level of significance: * for p < 0.05, ** for p < 0.01, *** for p < 0.001]

	PSP/LTL	PSP/EPL	EPL/LTL
Syntactic correctness
t	3.5867	1.9529	1.5761
df	94.691	91.994	94.863
CI low	0.0651	0.0102	-0.0029
CI high	–	–	–
Mean x	0.6864	0.6864	0.6182
Mean y	0.5652	0.6182	0.5652
p	0.0003	0.0269	0.0592
FDR adjusted p	0.0013	0.0647	0.1109
Level of significance	**	–	–
Semantic correctness
t	7.0831	3.8143	3.2849
df	93.444	91.596	95.061
CI low	0.1661	0.0679	0.048
CI high	–	–	–
Mean x	0.5019	0.5019	0.382
Mean y	0.2849	0.382	0.285
p	1.3 × 10⁻¹⁰	0.0001	0.0007
FDR adjusted p	1.9 × 10⁻⁹	0.0009	0.0027
Level of significance	***	***	**
Response time
t	1.861	1.2971	0.5009
df	93.123	91.955	93.774
CI low	–	–	–
CI high	9.8170	8.6859	5.9519
Mean x	48.6769	48.6769	44.869
Mean y	43.4902	44.869	43.4902
p	0.9671	0.9011	0.6912
FDR adjusted p	0.9671	0.9655	0.7975
Level of significance	–	–	–
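
The fractional values in the df rows are what Welch's unequal-variance test produces through the Welch–Satterthwaite approximation. The following Python sketch, using only the standard library, shows how the t statistic and the degrees of freedom could be computed from two samples. The function name `welch_t` and the sample values are invented for illustration and are not the study's data. The one-tailed p-value would additionally require the CDF of the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's t test of mean(x) - mean(y).

    df follows the Welch-Satterthwaite approximation and is generally
    not an integer.
    """
    nx, ny = len(x), len(y)
    # Squared standard errors of the two sample means (sample variance / n).
    se2_x = variance(x) / nx
    se2_y = variance(y) / ny
    t = (mean(x) - mean(y)) / sqrt(se2_x + se2_y)
    df = (se2_x + se2_y) ** 2 / (se2_x ** 2 / (nx - 1) + se2_y ** 2 / (ny - 1))
    return t, df


# Invented correctness scores for two groups (illustration only).
group_x = [0.70, 0.65, 0.80, 0.60, 0.75]
group_y = [0.55, 0.50, 0.62, 0.58]

t_stat, dof = welch_t(group_x, group_y)
print(f"t = {t_stat:.4f}, df = {dof:.3f}")
```

With equal sample sizes and equal sample variances, the approximation reduces to the familiar pooled value of 2n − 2 degrees of freedom.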

Table 25

Welch’s t test of syntactic/semantic correctness and response time in ASE, one-tailed with confidence intervals calculated for a 95% confidence level (cf. Welch [84]) and adjusted p-values (cf. Benjamini and Hochberg [8]) [Level of significance: * for p < 0.05, ** for p < 0.01, *** for p < 0.001]

	PSP/LTL	PSP/EPL	EPL/LTL
Syntactic correctness
t	1.3239	-1.1642	3.371
df	28.887	25.573	40.268
CI low	-0.023	-0.1671	0.0746
CI high	–	–	–
Mean x	0.6513	0.6513	0.7191
Mean y	0.5701	0.7191	0.5701
p	0.098	0.8724	0.0008
FDR adjusted p	0.2449	0.9254	0.0062
Level of significance	–	–	**
Semantic correctness
t	3.1981	-0.5583	4.7839
df	29.231	29.156	42.581
CI low	0.0754	-0.1125	0.1223
CI high	–	–	–
Mean x	0.4693	0.4693	0.4971
Mean y	0.3085	0.4971	0.3085
p	0.0017	0.7095	10⁻⁵
FDR adjusted p	0.0083	0.8869	0.0002
Level of significance	**	–	***
Response time
t	0.7786	-0.6463	1.4701
df	35.654	35.389	41.049
CI low	–	–	–
CI high	11.6186	4.5789	13.9503
Mean x	55.9853	55.9853	58.8236
Mean y	52.3191	58.8236	52.3191
p	0.7793	0.2611	0.9254
FDR adjusted p	0.8992	0.4352	0.9254
Level of significance	–	–	–
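
The "FDR adjusted p" rows refer to the Benjamini–Hochberg step-up procedure. A minimal Python sketch of that adjustment is given below. The function name `benjamini_hochberg` is an assumption, and the input is only the three raw one-tailed p-values from the syntactic correctness block of this table. Because the adjusted value depends on how many tests are in the family, the output is not expected to reproduce the adjusted row above, which was presumably computed over a larger family of tests.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is capped by the adjusted values of all larger p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Raw p-values for PSP/LTL, PSP/EPL and EPL/LTL (syntactic correctness, ASE).
raw = [0.098, 0.8724, 0.0008]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw p = {p:.4f}  adjusted p = {q:.4f}")
```

The backward pass with a running minimum is what keeps the adjusted values monotone in the raw p-values.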

2 in total

1. The generalisation of student's problems when several different population variances are involved.

Authors: B L WELCH
Journal: Biometrika Date: 1947 Impact factor: 2.445

2. Biotools: an R function to predict spatial gene diversity via an individual-based approach.

Authors: A R da Silva; G Malafaia; I P P Menezes
Journal: Genet Mol Res Date: 2017-04-13

Group	Dependent variable	SE2	ASE
LTL	Syntactic correctness	W = 0.97501, p = 0.3526	W = 0.951, p = 0.3558
LTL	Semantic correctness	W = 0.95487, p = 0.05047	W = 0.94524, p = 0.2761
LTL	Response time	W = 0.98169, p = 0.6127	W = 0.95759, p = 0.469
PSP	Syntactic correctness	W = 0.96204, p = 0.1296	W = 0.91825, p = 0.138
PSP	Semantic correctness	W = 0.98311, p = 0.7232	W = 0.96835, p = 0.7889
PSP	Response time	W = 0.95661, p = 0.0789	W = 0.89976, p = 0.06731
EPL	Syntactic correctness	W = 0.96063, p = 0.1139	W = 0.9358, p = 0.1314
EPL	Semantic correctness	W = 0.98412, p = 0.7652	W = 0.96757, p = 0.6075
EPL	Response time	W = 0.98163, p = 0.6606	W = 0.94779, p = 0.2425