
Free and Open Source Software organizations: A large-scale analysis of code, comments, and commits frequency.

Tadeusz Chełkowski, Dariusz Jemielniak, Kacper Macikowski

Abstract

As Free and Open Source Software (FOSS) increases in importance and use by global corporations, understanding the dynamics of its communities becomes critical. This paper measures up to 21 years of activities in 1314 individual projects and 1.4 billion lines of code managed. After analyzing the FOSS activities on the projects and organizations level, such as commits frequency, source code lines, and code comments, we find that there is less activity now than there was a decade ago. Moreover, our results suggest a greater decrease in the activities in large and well-established FOSS organizations. Our findings indicate that as technologies and business strategies related to FOSS mature, the role of large formal FOSS organizations serving as intermediary between developers diminishes.


Year:  2021        PMID: 34555061      PMCID: PMC8460050          DOI: 10.1371/journal.pone.0257192

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Online communities in general, and Free and Open Source Software (FOSS) communities in particular, have been a subject of stable academic interest since their inception [1-3]. Although individual FOSS projects have been the subject of many in-depth analyses, the organizations that manage and control FOSS projects have not yet garnered much academic interest. Within the field of organization studies, researchers have studied topics such as the emergence of new teams from FOSS development networks [4], continued engagement [5], successful productization of peer production in software [6], group activity, dynamics, and social ties [7,8], diversity [9], leadership [10,11], interdependencies [12], the influence of leaders on project sustainability [13], network ties between projects [14], and IP strategies [15]. As a new phenomenon, FOSS has often been described in terms of its innovative nature, market potential [16], surprising growth [17], ability to “hack” capitalism [18], and key differences from traditional software [19]. The gist of much of the literature is that the peak of the FOSS revolution is ahead of us and that we are still observing its growth and maturation [20,21] as organizational and economic regimes continue to change [22]. It is worth noting that “free software” and “open source software” are similar, but not identical, terms, especially in activist circles, where the distinction is hotly debated. In order to avoid complex ideological and licensing-nuanced discussions, we therefore attempt to stay neutral and use Free/Open Source Software (FOSS) as an umbrella term [23-25]. Using “FOSS” to refer to free and/or open source software is a way to capture two different philosophies: the one formulated by Richard Stallman in 1983 and open source software as defined by the Open Source Initiative [26].
We acknowledge that many researchers have traced the roots of open source as early as 1970 [27], but we understand that the term “open source” was coined in 1998, when the Open Source Initiative sought to set such programs apart from the Free Software Foundation’s ideas of freedom. Because “Open Source” captures two distinct ideas, it is worth emphasizing that even though it is often used as a single term, it refers to two separate movements within the free software community. The first is the mission of promoting users’ freedom to use software without cost or copyright restrictions. The second refers to the more practical aspect of making software source code accessible [28,29]. FOSS now incorporates philosophies and approaches as distant as leftist activism and corporate strategies [27]. For our purposes, we refer to FOSS mainly in its politically neutral field of collaborative and organizational practices. See also: https://gnu.org/gnu/the-gnu-project.html and https://opensource.org/osd. In the early 21st century it seemed that FOSS would revolutionize society. Wikipedia conquered the market for online encyclopedias and marginalized Britannica [30], Linux became the No. 1 server operating system, breaking Microsoft’s monopoly [31], and Firefox was the most popular browser after Internet Explorer, which was bundled with Windows [32]. All these successes led some researchers to hypothesize that FOSS in particular, and peer production in general, had the potential to transform late capitalism [18,33]. Sharing and cooperation were expected to emerge as a new modality of economic production [34], leading to a groundbreaking transformation of markets and societies [35]. FOSS, through the creation of new forms of property, would “infect capitalism like a virus” and challenge the dominant logic of private property and ownership [36-38].
The emergence of private collectives [39] creating new, a-hierarchical, and loosely coordinated structures [29], and relying on the creation of goods with zero reproduction costs, often of a non-competitive character [36], offered the promise of an entirely new organizational model that would gradually take over the existing ones. They also indicated a fundamentally different approach to organizational innovation [40]. On the surface, the narrative about constant growth and increasing importance seems very plausible. The development and global diffusion of FOSS are quite clear [16]. Even though some projects are naturally abandoned [41], there are certain patterns of growth and decline in FOSS projects [42], and we can reasonably expect FOSS organizations to grow and take over an increasing portion of market share from traditional organizations [43-45]. In the early 1980s, the open source community grew, and open source sharing customs were embraced by a growing number of academic and non-academic organizations. In 1985, as a result of the conflict with AT&T over UNIX, Richard Stallman created the Free Software Foundation, protecting the right to keep software freely available [46]. The institutionalization of the open source movement produced a variety of organizations structured around an idea, a project, a group of projects, or, more recently, software vendors [35,47]. Once an open source group of collaborators reaches a certain size, the norms of sharing, the licensing standards, and the maintenance duties of the community need to be maintained. FOSS organizations adopt or design FOSS licensing standards, distribution methods, software development standards, representatives for communication with the outside world, quality and testing procedures, and, finally, tools for community collaboration.
FOSS organizations adopted traditional control structures to the degree needed to control the release process, while remaining relaxed enough to preserve the free nature of the open source movement [46-48]. However, even though digital commons, peer production, and open collaboration are still perceived as showing great promise [49,50], and the dream of open organizing’s transformative powers has not been entirely lost [51], the situation has become much more fuzzy in the past decade. While FOSS has always had some balance of for-profit and for-fun activities [26], large corporations have recently been able to incorporate elements of the FOSS organization and approach into their traditional business development strategies [1], and to exploit FOSS software for closed and proprietary products [52]. In fact, even though the FOSS model initially proved a viable alternative to traditional software development methods, it has not been consistently successful in productization: the creation of products that customers find easy to understand and use [6,53]. Open organizing is a beautiful idea that showed enormous promise when it took the traditional modes of organizing by surprise, but it may already be past its peak. It is also much more hierarchical and bureaucratic than originally assumed [54]. To understand the future and place of FOSS in management and society, it is more important than ever to measure engagement in FOSS projects over time across selected small, medium, and large projects. Such measurement allows both an estimation of the general development of FOSS and a view of the finer details, depending on the size of the organization. Our paper is an attempt to fill this gap.

Project rationale

The Apache Software Foundation is often cited as a paragon of FOSS organization [55-57]. According to Mark Driver, research vice president at Gartner, “The Apache Software Foundation is a cornerstone of the modern open source software ecosystem–supporting some of the most widely used and important software solutions powering today’s Internet economy” (https://blogs.apache.org/foundation/entry/apache-is-open). Indeed, the Apache Software Foundation (ASF) is arguably the most prominent example of a large and successful FOSS organization. It is responsible for fundamental components of modern web architecture (Apache HTTP) [58], the backbone of data mining (Apache Hadoop, Apache Spark), and hundreds of tools essential for programming, integration, and standardization of the internet as we know it [16,59,60]. Since 1999, Apache has been not only a place for project development but also a model of open innovation and open collaboration, in many cases displacing traditional software development methods. However, although the Apache Software Foundation is proud of its continuous growth, it is worthwhile to look more closely at the fine-grained details of the community’s activity. For instance, data presented on the official Apache statistics pages (https://projects.apache.org/statistics.html) indicate an undeniable success in the growing code base; however, activity measured in community emails and issues shows very interesting fluctuations. According to Fig 1, ASF recorded the highest number of emails (78 846) in March 2016, but that number dropped to 42 814 in October 2017, a level last seen in May 2011 (42 400).
Fig 1

Emails, topics and authors.

Source: https://projects.apache.org/statistics.html.

The observed decrease in email communication can be explained by factors such as a change in users’ behavior: many users moved away from email to integrated messaging systems in the code repository interface [61,62]. At the same time, there is empirical evidence that the volume of email messages on public mailing lists correlates with versioning system commits [63] and, consequently, with project activity as a whole. Thus, it may be a possible signal of decreased participation in FOSS projects. However, even though emails and communication in FOSS have been studied as a proxy for project health and growth [56,64], and mature projects are known to rely on well-structured communication [65], research so far has focused on small samples, precluding a more definitive observation of larger trends over time. This observation has inspired us to conduct what we believe is one of the largest analyses of FOSS projects’ code, gathering data from 1314 individual projects and 1.4 billion lines of code managed. The strength of our study lies in a huge sample of commits, which allows us to make more certain observations about the changes, even though it also makes providing explanations more difficult. An additional advantage of our study is its long time frame, as only over longer periods is it possible to observe incremental but clear shifts in the organizational landscape.

Research question

The aim of our study is to improve our understanding of the level of activity among a large sample of FOSS projects. Additionally, stratifying the data by FOSS organization gives us a chance to analyze projects from the perspective of their FOSS organizational association. The importance of FOSS organizations such as the Apache Software Foundation for the modern networked society can hardly be overestimated; any change in their community dynamics is an interesting factor from the vantage points of both academia and business [60,61]. In order to understand it better, we explore the following research question: What is the structure of commits, code, and comments contribution among selected Open Source Software organizations over the last 20 years? For this article, we selected a stratified sample of small, medium, and large FOSS organizations. We collected 21 years of quantitative data describing commits frequency, code insertions and deletions, as well as data about comments attached to code. We classify this study as exploratory research on a large data sample, a first step in the direction of deeper case-oriented analysis. In order to answer our research question, we quantified contributors’ activities at the project level. To picture the activity across the analyzed projects, we use simple contingency tables with commits, comments, and code per project as our main variables and their relationships as calculated variables. We argue that activity in FOSS projects, measured in commits, source code, and comments, has declined over the last 10 years.

Materials and methods

Data source

Our research data, such as numbers of commits, comments, and code lines, were collected from Open Hub, a database and public directory of FOSS (Open Hub and the Open Hub logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions). OpenHub’s automated analytics software regularly visits the most popular source versioning systems, such as Git, Subversion, CVS, and Bazaar. OpenHub curates data using its large online community, where anyone is able to correct and edit OpenHub data entries. It is now arguably the largest and most trustworthy aggregated data source on FOSS. This article is based on the May 20, 2018 snapshot of the OpenHub.net repository. Since the frequency of updating the OpenHub repository may vary, the APIs of developer repositories change, and OpenHub needs time to adapt to these changes, we have not used data after December 31, 2017, to avoid data inconsistencies and to make sure that data completeness is as high as possible. Using a custom-developed application and the publicly available OpenHub API, we collected a large data sample containing a comprehensive overview of the 20-year history of FOSS organizations.

Data sample and data collection

The result of programmers’ work is lines of code and lines of comments distributed among a number of files, in most cases compiled into executable software. Programmers generally produce software using programming languages that fall into one of two categories: interpreted or compiled. Interpreted programming language code must be parsed and executed each time the program is run. Compiled programs are translated by compilers into very efficient lower-level code that can be executed many times. Some programming languages use a dual interpreted/compiled paradigm. The main artifact created in the process is source code, represented as a set of statements written in a programming language like Java, C, C++, or JavaScript, or in XML, CSS, or HTML tags. Software developed in collaborative environments is created in a series of commits; a commit happens every time a developer wants to contribute a piece of work to a shared repository. This process is supported by concurrent development software such as Git [66]. The role of concurrent software development applications is to track the changes in the programmer’s local environment and synchronize them with a remote repository, making sure that potential code changes and code conflicts are resolved and seamlessly merged (see: https://git-scm.com/about). Source-code modifications such as adding, modifying, or removing lines of code, adding or removing files, or changing documentation files are typical examples of commits. Because of the open nature of software repositories and their accessibility, commits have been the subject of numerous software development studies [67,68], and the activity of developers measured in commits is known to be highly unequal in FOSS organizations. Additionally, to make source code clearer and easier for others to understand, programmers add comments that address the meaning of a code block or a code line.
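As an aside, per-commit counts of added and removed lines of the kind aggregated in this study can be derived directly from a repository’s history. The following minimal sketch (not the authors’ tooling; OpenHub supplies the actual counts) tallies added and removed lines per commit from the output of `git log --numstat --pretty=format:%H`:

```python
# Sketch: tallying lines added/removed per commit from `git log --numstat`
# output. Illustration only; the paper's counts come from OpenHub.

def parse_numstat(log_text):
    """Return {commit_hash: (lines_added, lines_removed)} from the output of
    `git log --numstat --pretty=format:%H`."""
    stats = {}
    current = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        parts = line.split("\t")
        if len(parts) == 3:  # numstat line: added, removed, path
            added, removed, _path = parts
            if added == "-":  # binary files report "-" instead of counts
                continue
            a, r = stats[current]
            stats[current] = (a + int(added), r + int(removed))
        else:  # a commit hash line
            current = line.strip()
            stats[current] = (0, 0)
    return stats

sample = (
    "abc123\n"
    "10\t2\tsrc/main.c\n"
    "3\t0\tREADME\n"
    "def456\n"
    "0\t5\tsrc/old.c\n"
)
print(parse_numstat(sample))
# {'abc123': (13, 2), 'def456': (0, 5)}
```

Monthly aggregates of the kind analyzed here would then follow by grouping these per-commit totals by commit date.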
As noted by researchers, “source code comments are a valuable instrument to preserve design decisions and to communicate the intent of the code to programmers and maintainers” [69]. For the purpose of our research, we analyze, for each month, the number of code lines and comment lines added by programmers in each project. To retrieve and collect the research information, we developed an automated application that retrieves and parses data using the headless interface of OpenHub. Our application, which relies on the REST API, listed the requested organizations and retrieved records reflecting committers’ activity, ordered by project in monthly snapshots.
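A minimal sketch of the parsing step is shown below. It assumes monthly activity records shaped like the Ohloh/OpenHub ActivityFact XML; the field names (`code_added`, `comments_added`, and so on) are our assumption about the schema, not a verified specification, and the values reuse the sample record from Table 2:

```python
# Sketch: parsing one monthly ActivityFact-style record from an OpenHub-like
# REST API response. Field names are assumptions, not a verified schema.
import xml.etree.ElementTree as ET

SAMPLE = """
<activity_fact>
  <month>2016-12-01T00:00:00Z</month>
  <code_added>257</code_added>
  <code_removed>891</code_removed>
  <comments_added>24</comments_added>
  <comments_removed>449</comments_removed>
  <blanks_added>17</blanks_added>
  <blanks_removed>136</blanks_removed>
  <commits>10</commits>
  <contributors>1</contributors>
</activity_fact>
"""

def parse_activity_fact(xml_text):
    """Turn one <activity_fact> element into a flat dict of ints
    (the month field is kept as text)."""
    root = ET.fromstring(xml_text)
    record = {child.tag: child.text for child in root}
    for key, value in record.items():
        if key != "month":
            record[key] = int(value)
    return record

rec = parse_activity_fact(SAMPLE)
print(rec["commits"], rec["code_added"])  # 10 257
```

In the actual pipeline, one such record per project per month would be appended to the dataset described in the Data record section.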

Dataset selection and stratification

FOSS organizations differ in many ways: some, like Eclipse, represent large companies and their business goals, while others, like Apache, started as a single open project which, over time, attracted more programmers with new projects and ideas. To reduce sampling error and improve the precision of the results, we divided the FOSS population into homogeneous subgroups before sampling (stratification). Subgroups (strata) were determined by the size of the FOSS organization, measured in the number of managed projects [70-72]. The selection of project count as a stratification criterion has limitations that are discussed in the limitations, data, and results section.

Stratum 1 - [LARGE]: organizations with #Projects >= 100
Stratum 2 - [MEDIUM]: organizations with 25 <= #Projects < 100
Stratum 3 - [SMALL]: organizations with #Projects < 25

The combined sample consists of n = 1314 projects, with a margin of error (MOE) of ±2.30% at the 95% confidence level (CL) and ±3.03% at CL = 99%. It encompasses 15 FOSS organizations, 16 727 184 commits, and over 1.4 billion lines of code. The time span of the collected attributes ranges from 11 to 21 years. For each project in each year, we collected a full 12-month history or a partial history (some project life spans are shorter than 21 years). In total, we collected 3246 data months; in 9 cases a year’s data did not include all 12 months.
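The stratum assignment rule, together with the textbook worst-case margin-of-error formula for a sample proportion, can be sketched as follows. The optional finite-population correction is a standard formula; the total FOSS population size is not stated in the paper, so the `population` parameter here is a hypothetical placeholder:

```python
# Sketch: the paper's stratum assignment rule, plus the usual worst-case
# (p = 0.5) margin-of-error formula for a sample proportion, optionally with
# finite-population correction. The population size N is NOT given in the
# paper; it is a hypothetical placeholder here.
import math

def stratum(n_projects):
    """Classify a FOSS organization by number of managed projects."""
    if n_projects >= 100:
        return "LARGE"    # stratum 1
    if n_projects >= 25:
        return "MEDIUM"   # stratum 2
    return "SMALL"        # stratum 3

def margin_of_error(n, z=1.96, population=None):
    """Worst-case MOE at confidence z (1.96 ~ 95% CL), with optional FPC."""
    moe = z * math.sqrt(0.25 / n)
    if population is not None:
        moe *= math.sqrt((population - n) / (population - 1))
    return moe

print(stratum(343), stratum(63), stratum(14))  # LARGE MEDIUM SMALL
print(round(100 * margin_of_error(1314), 2))   # 2.7 (without correction)
```

Note that the uncorrected MOE for n = 1314 is ±2.70%, slightly above the ±2.30% reported; the reported figure is consistent with a finite-population correction against an unstated population size.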

Data record

Each record consists of raw attributes imported from data sources and variables derived from the collected data. An individual record represents a month’s activity for an analyzed project managed by a FOSS organization. To understand the nature of projects’ activity better, we calculated additional attributes. First, to better understand the committers’ level of activity and the nature of the developed software, we measure a coefficient of code submitted per commit:

CODPC = ∑Lines of Code / ∑Commits

CODPC might indicate the current project stage, as frequent commits with little submitted code may indicate that a project is in the maintenance phase [55]. Second, to capture the number of comment lines submitted per commit, we calculate the comments per commit coefficient:

COMPC = ∑Comments / ∑Commits

This might be an interesting indicator of, for instance, the documentation phase versus the code creation phase of a project [55]. Lastly, we calculated the ratio of comments per line of effective source code:

COMPCOD = ∑Comments / ∑Code

since a high number of comments per line of actual code may indicate more formal organizational processes in a project. The list of variables, with types and sources, is presented in Table 1.
Table 1

The list of variables, with types and source classification.

Variable name | Description | Variable type | Source
Organization name | Open Source Software organization name | Classification variable | Collected from the data source
Project name | Name of the open source project | Classification variable | Collected from the data source
Organization size | Classification of the organization size: Large, Medium, or Small | Classification variable | Calculated, using the number of managed projects as the determination criterion
Lines of code added | Number of code lines added in the observed time (one month) | Main quantitative variable | Collected from the data source
Lines of comments added | Number of comment lines added in the observed time (one month) | Main quantitative variable | Collected from the data source
Blank lines added | Number of blank lines added in the observed time (one month) | Supporting quantitative variable | Collected from the data source
Commits | Number of commits in the observed time (one month) | Main quantitative variable | Collected from the data source
Lines of code removed | Number of code lines removed in the commit-to-commit comparison in the observed time (one month) | Supporting quantitative variable | Collected from the data source
Lines of comments removed | Number of comment lines removed in the commit-to-commit comparison in the observed time (one month) | Supporting quantitative variable | Collected from the data source
Blank lines removed | Number of blank lines removed in the commit-to-commit comparison in the observed time (one month) | Supporting quantitative variable | Collected from the data source
Code per commit (CODPC) | Calculated value: code/commits, calculated for all collected cases | Supporting variable | Calculated variable
Comments per commit (COMPC) | Calculated value: comments/commits, calculated for all collected cases | Supporting variable | Calculated variable
Comments per code (COMPCOD) | Calculated value: comments/code, calculated for all collected cases | Supporting variable | Calculated variable
Year | Time variable, observation year | Main data variable | Collected from the data source
Month | Time variable, observation month | Main data variable | Collected from the data source
A sample data record is presented in Table 2.
Table 2

Example of collected data record.

Organization: Apache
Size: Large
Project: Apache Commons Math
Code added: 257
Code removed: 891
Comments added: 24
Comments removed: 449
Blanks added: 17
Blanks removed: 136
Commits: 10
Contributors: 1
Year: 2016
Month: 12
CODPC: 25.7
COMPC: 2.4
COMPCOD: 0.093385214
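The three derived coefficients can be checked against this sample record with a few lines of code (a minimal sketch, not the authors’ pipeline):

```python
# Sketch: reproducing the three derived coefficients for the Table 2 sample
# record (Apache Commons Math, December 2016).

def derived_metrics(code_added, comments_added, commits):
    codpc = code_added / commits           # CODPC: code lines per commit
    compc = comments_added / commits       # COMPC: comment lines per commit
    compcod = comments_added / code_added  # COMPCOD: comment lines per code line
    return codpc, compc, compcod

codpc, compc, compcod = derived_metrics(code_added=257, comments_added=24, commits=10)
print(codpc, compc, round(compcod, 9))  # 25.7 2.4 0.093385214
```

The values match the CODPC, COMPC, and COMPCOD fields of the sample record.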
The number of collected observations varies among organizations, which is well represented in the frequency table (see Table 3).
Table 3

Frequency distribution of collected observations.

Organization | Frequency (Fc) | Percent | Cumulative percent
Apache | 22 762 | 26.5 | 26.5
Debian | 3 555 | 4.1 | 30.6
Eclipse | 14 666 | 17.0 | 47.6
Gentoo | 1 314 | 1.5 | 49.2
GNOME | 245 | 0.3 | 49.5
JBoss | 2 821 | 3.3 | 52.7
Kde | 18 448 | 21.4 | 74.2
Mozilla | 5 866 | 6.8 | 81.0
nasa | 1 918 | 2.2 | 83.2
openstack | 1 091 | 1.3 | 84.5
OSGeo | 3 073 | 3.6 | 88.1
OW2 | 2 788 | 3.2 | 91.3
OWASP | 1 320 | 1.5 | 92.8
tdf | 248 | 0.3 | 93.1
wikimedia | 5 909 | 6.9 | 100.0
A comparison of projects, commits, and code size is included in Table 4.
Table 4

Projects, commits and code size comparison.

Organization | # of projects | ∑ commits | ∑ code added | ∑ code removed
Large (#projects > 100)
apache | 343 | 1 828 824 | 407 946 922 | 275 627 732
eclipse | 172 | 1 532 770 | 449 599 501 | 295 623 314
kde | 204 | 4 619 403 | 768 319 949 | 521 996 888
nasa | 11 | 196 374 | 44 976 530 | 25 955 044
wikimedia | 168 | 1 164 674 | 63 501 047 | 44 812 409
Medium (25 < #projects < 100)
debian | 29 | 789 435 | 710 261 271 | 460 060 995
JBoss | 35 | 583 720 | 164 351 117 | 117 092 382
mozilla | 94 | 2 559 615 | 777 124 625 | 516 245 203
OW2 | 40 | 501 288 | 519 017 687 | 260 478 326
OWASP | 63 | 49 877 | 31 375 648 | 23 233 759
Small (#projects <= 25)
gentoo | 14 | 768 767 | 39 103 075 | 28 916 805
GNOME | 15 | 23 280 | 37 933 312 | 28 153 328
openstack | 13 | 926 429 | 76 950 341 | 51 594 436
OSGeo | 25 | 383 002 | 70 590 585 | 48 623 746
tdf | 2 | 399 726 | 60 838 350 | 41 705 135
Total | 1 314 | 16 727 184 | 4 221 889 960 | 2 740 119 502

Results

Commits’ analysis

Fig 2 shows that the drop in commits volume growth affecting large FOSS organizations (stratum 1) started around 2010 and continued to the end of the dataset history. A closer look at the data with a trimmed mean (top and bottom 5% of observations removed) reveals that the average annual growth was 25.39% for large FOSS organizations, 30.21% for medium FOSS organizations (stratum 2), and 35.68% for small FOSS organizations (stratum 3). Table 5 presents the combined growth rates for FOSS organizations of all three studied sizes.
Fig 2

Commits analysis in FOSS organizations 1997–2017.

Table 5

Combined growth rates for large, medium and small FOSS organizations 1997–2017.

Statistic | Large | Medium | Small
Trimmed mean (5%) | 25.59% | 30.21% | 35.68%
Mean | 28.27% | 56.17% | 44.42%
Max | 142.27% | 631.01% | 273.87%
Min | -34.88% | -25.42% | -18.95%
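The trimmed mean used in Table 5 drops the top and bottom 5% of observations before averaging. A minimal sketch follows; the growth-rate values in the example are made up for illustration, and `scipy.stats.trim_mean` offers an equivalent library routine:

```python
# Sketch: a two-sided 5% trimmed mean, as used for the growth rates in Table 5.

def trimmed_mean(values, proportion=0.05):
    """Drop the lowest and highest `proportion` of observations, then average."""
    data = sorted(values)
    k = int(len(data) * proportion)   # observations trimmed from each tail
    kept = data[k:len(data) - k] if k else data
    return sum(kept) / len(kept)

# Hypothetical annual growth rates (%) with two extreme outliers at the tails.
rates = [-35, 2, 5, 8, 10, 12, 15, 18, 20, 22,
         25, 28, 30, 33, 36, 40, 44, 50, 60, 640]
print(round(trimmed_mean(rates), 2))  # 25.44
```

Trimming blunts the influence of outliers such as the 631.01% maximum in the medium stratum, which is why the table reports both statistics.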

Code analysis

Fig 3 shows the source code growth dynamic, measured in the number of lines written. It is worth noting the differences in source code contribution between the three analyzed groups and the dominance of the medium FOSS organizations. In order to have a clear view of the code base, we use the net code value as a variable. The net code value represents the number of functional code lines, without blanks and comment lines; additionally, it deducts the deleted lines, since even a single commit can both add and remove code.
Fig 3

Source code committed in number of lines.
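The net code value described above reduces to a simple difference, sketched here with the Table 2 sample record as input:

```python
# Sketch: "net code value" as defined above: functional code lines only,
# i.e. added minus removed code, excluding blank and comment lines.

def net_code(code_added, code_removed):
    """Net new functional source lines for a period (may be negative)."""
    return code_added - code_removed

# Using the Table 2 sample record: a refactoring month can shrink the code base.
print(net_code(257, 891))  # -634
```

A negative monthly value is normal; aggregated over a year, the sign indicates whether a stratum's code base grew or shrank.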

Comments’ analysis

In order to have a clear picture of the comments contribution among the analyzed FOSS organizations, we introduced a new metric: contributed lines of comments per line of code (COMPCOD). COMPCOD, described in Table 6, shows differences among large, medium, and small FOSS organizations in their code-commenting behavior. The medium FOSS organizations recorded the largest average value, 0.86 lines of comments per code line. It is important to emphasize that within the collected observations, only one organization, The Open Web Application Security Project (OWASP), is an outlier, with over 2.46 comment lines per code line. Tables 6–8 show the mean COMPCOD, the mean of comment lines per code line, and the number of source code lines per commit.
Table 6

Mean COMPCOD (mean).

Organization | Mean of comment lines per code line
Large (#projects > 100)
apache | 0.61270
eclipse | 0.51178
kde | 0.32221
nasa | 0.39455
wikimedia | 0.32492
Medium (25 < #projects < 100)
debian | 0.30002
JBoss | 0.43842
mozilla | 0.44154
OW2 | 0.68189
OWASP | 2.48573
Small (#projects <= 25)
gentoo | 0.22984
GNOME | 0.14378
openstack | 0.20929
OSGeo | 0.44055
tdf | 0.67851
Table 8

Number of source code lines per commit (CODPC).

Year | Large (#projects > 100) | Medium (25 < #projects < 100) | Small (#projects <= 25)
1997 | 159.6457 | 213.6047 | 188.9899
1998 | 121.7883 | 1367.6196 | 63.7804
1999 | 122.4032 | 99.2152 | 585.9229
2000 | 100.4900 | 95.5775 | 169.4701
2001 | 137.2856 | 129.0426 | 116.8105
2002 | 112.5782 | 169.6384 | 147.2365
2003 | 75.3101 | 217.3441 | 133.9808
2004 | 202.8971 | 304.0662 | 77.2385
2005 | 254.2520 | 251.1138 | 107.8710
2006 | 208.5257 | 484.9788 | 188.4939
2007 | 209.4148 | 373.9491 | 165.4688
2008 | 218.2877 | 374.0096 | 110.8150
2009 | 212.1694 | 300.7204 | 149.9105
2010 | 222.8003 | 296.1230 | 206.8572
2011 | 252.7280 | 376.7635 | 352.3639
2012 | 218.6569 | 715.4053 | 131.8826
2013 | 209.0371 | 500.2177 | 131.1219
2014 | 226.6461 | 612.1060 | 130.7347
2015 | 250.4403 | 1202.5672 | 102.1072
2016 | 215.5444 | 555.9114 | 105.3622
2017 | 227.4132 | 184.8912 | 143.2757
Trimmed mean (5% bottom and 5% top) | 190.9870 | 387.4562 | 150.5258
Average | 188.4912 | 420.2317 | 167.1283
SD | 51.88962 | 251.80445 | 110.31318

Discussion

Although all the analyzed organizations have grown over the past 20 years (Figs 2 and 3), we observed lower and decreasing growth rates in the large FOSS organizations compared to the medium and small FOSS organizations (stratum 3). Furthermore, in recent years the commits volume of large FOSS organizations (stratum 1) started to drop by an average of 16.7% annually. The best illustration of this trend is the fact that in 2017, small FOSS organizations (stratum 3) surpassed the commits volume of large FOSS organizations (stratum 1) by 10.8%. This is surprising, as 10 years earlier the commits volume of large FOSS organizations was more than 6 times that of small FOSS organizations (626 136 to 93 512) and over three and a half times that of medium FOSS organizations (stratum 2) (626 136 to 172 843) (Fig 2). Moreover, as Fig 2 shows, the drop in commits volume growth affects large FOSS organizations disproportionately, and this phenomenon was not observable in the first 10+ years. As the trimmed means in Table 5 (top and bottom 5% removed) show, the change is dramatic. The drop in large FOSS organizations’ activity measured in commits can be demonstrated by one comparison: commits in 2017 represented only approximately 30% of the record high in 2010 (2010: 951 294; 2017: 287 907). In the first decade of the period under analysis (1997–2017), we observed steady growth in the medium and large FOSS organizations, while small FOSS code growth dynamics tended to fluctuate. However, from 2010, the organizations managing over 100 projects started to receive less new source code than in the previous decade. Compared to 2010, when users committed over 100 million net new source code lines, in 2012 large FOSS organizations (stratum 1) received only 36.1 million lines.
In that same period, medium FOSS organizations (stratum 2) surpassed large FOSS organizations, and by 2012 they had received almost 3.5 times more net new source code than large FOSS organizations. One of the most surprising findings is a noticeable “time shift” in the growth dynamic between the medium and large FOSS organizations. The observed decrease in code base growth dynamics starts two years later in medium FOSS organizations (stratum 2), and a further two years later in small FOSS organizations (stratum 3). A deeper look at the distribution of our proposed metric COMPCOD (Table 6) reveals that one of the OWASP projects, “The OWASP Zed Attack Proxy,” described as “… one of the world’s most popular free security tools and is actively maintained by hundreds of international volunteers” (https://github.com/zaproxy/zaproxy/wiki), includes the code of conduct, instructions, and even elements of documentation in its comment sections. Outliers aside, projects associated with large FOSS organizations registered less commenting activity than projects in the medium FOSS organizations (stratum 2), which reached a ratio of 0.45 comment lines per code line. Finally, small FOSS organizations (stratum 3) are the least active in code comments, providing approximately 1 line of comments for every 3 lines of code (COMPCOD = 0.34). Additionally, an analysis of 20 years of comments-per-code history shows that the standard deviation is smaller in large and small FOSS organizations (stratum 3).

Conclusions

Our results indicate a shift in contribution activity across FOSS projects of different sizes and growth stages over time, and that the largest organizations are slowing their growth at a faster pace than medium and small organizations. There are many possible reasons for the observed phenomenon. We cannot exclude the possibility that the modalities of cooperation have changed over time and that the measures we are using do not hold a stable accuracy over the whole period. However, if the results reflect actual changes in FOSS organizations and projects, they are quite troubling for the open source movement. One possible interpretation of this phenomenon is that open source, as an approach to developing projects, has lost some of its appeal. It is worth remembering that, at first, FOSS principles were interpreted as bringing together an ideological paradigm shift (openness), governance innovations, and technological innovations [52,73]. These three areas were conflated into one and raised the hopes of early enthusiasts that openness as a social norm is inseparable from and consequent to the other two, and supports a redefinition of labor leading to the reshaping of capitalism. In other words, the dominant assumption was that as new forms of governance and technology promote open organizing, we can expect traditional organizing to be gradually replaced, and the far-reaching consequences, according to some authors, might even change capitalism as we know it [35]. Our study does not allow us to make claims about causality, and as our interpretation here is speculative, it should be treated with caution. However, what we believe may be happening is the result of the FOSS technology and organizational model maturing and becoming mainstream.
While initially the governance and technological innovations indeed led to wider adoption of openness as a dominant logic, traditional organizations soon learned how to use (and sometimes abuse) these two innovations to create closed ecosystems and gatekeep their positions. The successes of Google in leveraging Android to win commercially in the mobile market, or of WordPress in building a regular business based on FOSS principles, as well as a series of takeovers, such as Microsoft's acquisition of GitHub for 7.5 billion dollars and IBM's acquisition of Red Hat for the staggering sum of 34 billion dollars, all show that rather than transforming society, FOSS may be trimmed and harnessed for traditional corporate goals. While open source may be on the rise as an effective organizing principle [74], it has been disentangled from at least some of its original premises. The principles of the sharing economy, rooted in collaborative, prosocial, and anti-commercial ideals [75], have also been used rhetorically and adjusted for the mainstream economy, leading to further exploitation and inequality [76]. In a way, the FOSS movement has both "won and lost the war" [77]: it has been widely accepted as a form of software development, but the profits deriving from it have largely been appropriated by corporations. In its 2.0 version, FOSS development becomes yet another business model [78], bordering on freemium rather than a revolutionary society-changing movement. The ideologies of openness, sharing, and collaborating are being repurposed for business as usual [79,80]. The openness of software has become a routine factor influencing productivity and efficiency [81,82]. Moreover, open collaboration software development turned out to be much less collaborative in actual daily practice than had been assumed [83,84]. Finally, far from being stable, FOSS organizations underwent major adaptations to their environment.
One of the major roles of FOSS organizations is to nurture interactions among community members, call for action, set guiding principles, and develop tools that facilitate collaborative software development and streamline coordination [85,86]. Benefits provided to FOSS developers and users by FOSS organizations such as the Apache Software Foundation, the Mozilla Foundation, or the Linux Foundation include project governance and a vital institutional support infrastructure [87,88]. Users or contributors can rely on an organizational framework for intellectual property rights management, as well as for legal support and well-defined development and maintenance processes. In many cases FOSS organizations exist as communities of practice, where people engage in collective development, learning, and solving similar problems [89]. Yet, as technologies develop and organizational practices mature, some functions that had previously been crucial in FOSS development and provided by FOSS organizations may be replaced by software and online services. While this paper does not analyze the new FOSS organizations that emerged after 2017, the analyzed data provide evidence that activities measured as commits are declining, which may have dire side effects for the entire FOSS movement. The existence of large FOSS organizations has made big policy and activism possible. Promoting big ideological changes in the areas of open licensing, fairness in digital file sharing [90], sharing rather than selling as a principle of contemporary society, or openness in general as a strong social norm [91] would not have been possible without their support. Large FOSS organizations brought grand projects, such as new operating systems (Linux) or productivity suites (such as OpenOffice), into existence.
These large projects were essential for the belief that the emerging peer-to-peer economy and the new commons could make an impact on society going beyond isolated cases of software [43,92]. It is possible that ideological manifestos, postulating openness as a new principle of social organizing with the potential to transform capitalism, may not have had as much appeal as it seemed. Yes, FOSS organizations paved the way to distributed structures and to making openness an organizing principle, and according to some measures their influence on capitalism may have been profound. They also developed tools and processes that made virtual collaboration more effective. Yet, our results may indicate that as soon as traditional organizations caught up on both of these fronts, FOSS organizations, and especially the large ones, started to lose momentum. It may be that the demand for a revolution simply was not there, and the general public could not care less about openness. Even though projects with a non-market sponsor, as well as those with open licenses, used to be able to attract greater user interest over time [93], the successes of services such as TripAdvisor, Quora, Google Guides, or Yelp have made it abundantly clear that many users do not have a problem with creating collective content for a for-profit company that uses this content under a restrictive license and relies on corporate-decided community governance, without any open collaboration with regard to organizational structures and roles. They simply enjoy a friendly UI and a peer production mode of contributing. The final nail in the coffin has been the rise of centralized cloud services such as GitHub or BitBucket, which have met many of the organizational and cooperative needs of developers that were previously addressed by open designs.

Limitations of the research model, data and results

Our study relies on data from the period 1997–2017 and does not cover the most recent changes in the open source environment. While this approach is reasonable because of data availability and comparability, it should be noted that in recent years FOSS organizations have explored new ways of supporting open source projects and of managing coordination, including ways that are more difficult to measure and compare with previous years. This paper is a quantitative conceptualization of the activity levels in a stratified sample of projects associated with FOSS organizations. Even though OpenHub is a reliable source of data, the results should be considered within the trust boundary of the data source. There is no guarantee that all projects, commits, comments, or organizations are fully represented in the OpenHub database. It is also worth mentioning that the results apply only to FOSS projects associated with formal FOSS organizations, and thus do not represent the full population of FOSS projects. The proposed perspective of looking at FOSS organizations through the lens of projects, commits, submitted code, and code comments may not fully represent all behind-the-scenes activities, including important cooperative behaviors that are not related to coding but provide the much-needed social glue of interactions. It is widely accepted that communication among FOSS collaborators happens in many different channels [94-97], and we have studied only the structured, technical ones. There are many activities that foster cooperation and that are not code-centric [98,99]. Researchers have used different methods to understand the nature of FOSS collaborations, such as Social Network Analysis or dedicated metrics for understanding the nature of FOSS models [58,63,84,100]. Nevertheless, our findings need to be confirmed through other methods. Our research raises many questions about the potential change in the way that FOSS processes are organized.
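The stratified, per-year aggregation of activity described here can be sketched in a few lines; the record layout and all the figures below are hypothetical placeholders, not the study's actual OpenHub data:

```python
from collections import defaultdict

# Hypothetical per-project records: (stratum, year, commits).
# Stratum 1 = large, 2 = medium, 3 = small FOSS organizations.
records = [
    (1, 2010, 5200), (1, 2015, 3900),
    (2, 2010, 1400), (2, 2015, 1600),
    (3, 2010, 300),  (3, 2015, 280),
]

# Sum commit counts per (stratum, year) cell.
totals = defaultdict(int)
for stratum, year, commits in records:
    totals[(stratum, year)] += commits

for (stratum, year), commits in sorted(totals.items()):
    print(f"stratum {stratum}, {year}: {commits} commits")
```

Any aggregate computed this way inherits the trust boundary of the underlying data source: missing projects or commits in the export propagate directly into the per-stratum totals.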
Since the selected data source and perspective criteria introduce natural bias into our results, these results should not be unreflexively generalized to other FOSS communities or organizations. Moreover, while the sample size is our results' main strength, it is also their major weakness, as it makes an explanatory approach (seeking correlations, reasons, and causes) much more difficult.

Final remarks

Our study is an attempt to determine the basic quantitative indicators of growth of FOSS organizations. We have discovered interesting trends in commits, comments, and code growth dynamics, indicating that there has been a change in the activity levels across all types of FOSS organizations. FOSS organizations are still gaining new code, but the collaborative efforts measured in commits, committed code, and comments are lower than they were a decade ago. Medium and small FOSS organizations seem to be less affected by the overall slowdown, still attracting new users, but not as quickly as in the past. These results might be explained by the increasing adoption of collaborative online services such as GitHub and BitBucket. With more tools and simpler collaborative processes, there may be a diminishing need for organizational proxies, because people can create ad hoc, short-lived structures without dedicated processes and formal committees. If the original success of FOSS was even partly a result of this form of organization substituting for what is now more easily achievable through online services and software tools, it is quite understandable that FOSS organizations develop less dynamically. However, if this is what is happening, the practical implications are considerable: instead of revolutionizing society, or even just software development, FOSS will turn out to be a modest innovation, one that temporarily helped resolve some structural and communication issues, but only until mainstream organizations absorbed some of its model and regular project management tools evolved sufficiently. Another possible explanation may be that we are observing the maturation and aging of the FOSS development model: not only does it no longer rely on archetypal hacking-for-fun, but it has also entered a stage in which many projects require maintenance and stability, and are much less reliant on frequent communication and commits.
If this is the case, the FOSS model is not going to disappear any time soon, but it is still not going to make any radical organizational difference, and will remain a temporary fad in the organization of work. Finally, we cannot exclude the possibility that the larger FOSS organizations are all falling prey to the "rise and decline" phenomenon observed in Wikipedia and some other peer production projects [101,102], rooted in the fossilization of procedures and the growth of quality control systems. We believe that in the near future we may observe a steady decline in the role of large and formal organizations, as large independent FOSS organizations are replaced by corporate-driven FOSS foundations. Perhaps the free software-oriented movement will reorganize itself into smaller, dynamic, tools-oriented networks. FOSS will probably not die, but it may not really live.
Moreover, address the concerns by reviewer 2 on the collected dataset. - English: Reviewer 1 also asked for a revision of the English used, as some parts are much less clear than others. Please submit your revised manuscript by Mar 14 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols We look forward to receiving your revised manuscript. Kind regards, Sergi Lozano Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. 
Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. We suggest you thoroughly copyedit your manuscript for language usage, spelling, and grammar. If you do not know anyone who can help you do this, you may wish to consider employing a professional scientific editing service. Whilst you may use any professional scientific editing service of your choice, PLOS has partnered with both American Journal Experts (AJE) and Editage to provide discounted services to PLOS authors. Both organizations have experience helping authors meet PLOS guidelines and can provide language editing, translation, manuscript formatting, and figure formatting to ensure your manuscript meets our submission guidelines. To take advantage of our partnership with AJE, visit the AJE website (http://learn.aje.com/plos/) for a 15% discount off AJE services. To take advantage of our partnership with Editage, visit the Editage website (www.editage.com) and enter referral code PLOSEDIT for a 15% discount off Editage services.  If the PLOS editorial team finds any language issues in text that either AJE or Editage has edited, the service provider will re-edit the text for free. Upon resubmission, please provide the following: The name of the colleague or the details of the professional service that edited your manuscript A copy of your manuscript showing your changes by either highlighting them or using track changes (uploaded as a *supporting information* file) A clean copy of the edited manuscript (uploaded as the new *manuscript* file) 3. 
Thank you for stating the following financial disclosure: "no" At this time, please address the following queries: Please clarify the sources of funding (financial or material support) for your study. List the grants or organizations that supported your study, including funding received from your institution. State what role the funders took in the study. If the funders had no role in your study, please state: “The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.” If any authors received a salary from any of your funders, please state which authors and which funders. If you did not receive any funding for this study, please state: “The authors received no specific funding for this work.” Please include your amended statements within your cover letter; we will change the online submission form on your behalf. 4. Thank you for stating the following in your Competing Interests section: "no" Please complete your Competing Interests on the online submission form to state any Competing Interests. If you have no competing interests, please state "The authors have declared that no competing interests exist.", as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now This information should be included in your cover letter; we will change the online submission form on your behalf. Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. 
Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests 5. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability. Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized. Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access. We will update your Data Availability statement to reflect the information you provide in your cover letter. 6. Please ensure that you refer to Figures 1, 2, 3, and 4 in your text as, if accepted, production will need this reference to link the reader to the figure. 7. 
Please include a copy of Tables 7, 8, and 9 which you refer to in your text on page 18. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: I Don't Know Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: No ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: No Reviewer #2: Yes ********** 5. 
Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: My review is uploaded as an attachment. This following text is included to meet the minimum character count required by the PLOS ONE web form. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. Reviewer #2: SUMMARY The paper presents a large study of FOSS projects to better understand the growth of FOSS organization via basic quantitative indicators. The dataset built for the analysis includes projects being developed by 15 FOSS organizations, which are classified into 3 groups (i.e., big, medium, small) according to the number of projects they manage. The identified groups are used to conduct the study. Results reveal that collaborative efforts (measured in terms of commits, committed code, and comments) are lower than at the beginning of the current decade. Only projects from medium and small FOSS organizations seems to be less affected by this trend. REVIEW The paper addresses an interested topic of the field and helps to better understand the evolution of the contribution activity in FOSS projects, in particular, in the context of FOSS organizations. Although the analysis is purely quantitative, its large size helps to visualize how collaborative actions (i.e., commits and comments) have changed along the years. 
This is revealing and worth tracking. I have two main concerns: * The collection date to build the dataset looks old, and it should be better noted in the "Limitations of the research model, data and results" section. My main concern is the exponential explosion of activity in collaborative development platforms such as GitHub, which are becoming the de facto standard to develop FOSS projects, and where the organization dimension sometimes is blurry. This is also related to possible novel ways to manage FOSS projects which may be beyond traditional FOSS organizations. I understand that this idea may require additional research, but the text should note the situation. * The discussion and conclusions of the paper are mainly hypotheses about FOSS organization evolution, as no empirical confirmation (i.e., interviews with developers) has been done. I wonder if they could be presented first for FOSS projects in general, and then extrapolate for FOSS organizations (being very cautious). This concern is also related to the previous one, as in the last years (2017-2021) FOSS organizations have been actively working on exploring new ways to support and help FOSS projects. All in all, I believe the paper makes an interesting contribution, but the conclusions are too focused on FOSS organizations and such a link is hard to visualize from the data gathered and analyzed. DETAILED REVIEW The first section introduces the problem and presents most of the related work. It also presents the main triggering idea to conduct the analysis presented in the paper. In the second section, the authors briefly discuss the differences between free and open-source software, which helps to present some characteristics (and issues) typically found in FOSS software. This also serves to motivate the study conducted in the paper. The third section presents the research methodology of the paper. 
Some indications for the subsections: * Table 1 could be simplified indicating the starting year and its collected data-months. When needed, years with less than 12 collected data-months should be indicated (e.g., openstack in 2017) * When describing CODPC, COMPC and COMPCOD and referring to either (or comments), is that the difference between the added and removed code (or comments)? The fourth section presents the results of the study. Companion tables and figures illustrated the obtained results. The fifth and last sections discuss the results and elaborates on some hypothesis to justify them. MINOR ERRORS AND POSSIBLE TYPOS - Normalize the use of "open-source" vs "open source" ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. 
If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. 17 Mar 2021 Our detailed response is included in revision-tracked file. 20 Apr 2021 PONE-D-20-37277R1 Free and Open Source Software: A large-scale analysis of code, comments, and commit frequency PLOS ONE Dear Dr. Jemielniak, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. When revising your text, pay special attention to the following issues: 1.- Both reviewers pointed out that some aspects of the methodology are not properly justified and/or described. Notice PLOS ONE's publication criterion #3 (https://journals.plos.org/plosone/s/criteria-for-publication#loc-3). 2.- Reviewer 3 stressed the need for a better alignment between the results obtained and the conclusions reached. Notice also the journal's publication criterion #4 (https://journals.plos.org/plosone/s/criteria-for-publication#loc-4). Moreover, before submitting your revised manuscript, please be sure that the comments by both reviewers are properly answered. Please submit your revised manuscript by Jun 04 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). 
You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Sergi Lozano Academic Editor PLOS ONE [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #2: (No Response) Reviewer #3: (No Response) ********** 2. 
Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #2: Yes Reviewer #3: No ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #2: Yes Reviewer #3: No ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #2: Yes Reviewer #3: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #2: Yes Reviewer #3: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. 
(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: I have not found an answer to my comments in the revised version of the paper. I hold my decision and attach again my concerns.

The paper addresses an interesting topic of the field and helps to better understand the evolution of the contribution activity in FOSS projects, in particular in the context of FOSS organizations. Although the analysis is purely quantitative, its large size helps to visualize how collaborative actions (i.e., commits and comments) have changed over the years. This is revealing and worth tracking. I have two main concerns:

* The collection date used to build the dataset looks old, and this should be better noted in the "Limitations of the research model, data and results" section. My main concern is the exponential explosion of activity in collaborative development platforms such as GitHub, which are becoming the de facto standard for developing FOSS projects, and where the organization dimension is sometimes blurry. This is also related to possible novel ways of managing FOSS projects which may be beyond traditional FOSS organizations. I understand that this idea may require additional research which is not included in the paper, but the text should note the situation.

* The discussion and conclusions of the paper are mainly hypotheses about FOSS organization evolution, as no empirical confirmation (or attempt to confirm) has been done. I wonder if they could be presented for FOSS projects in general, and then for FOSS organizations (being cautious). This concern is also related to the previous one, as in the last years (2017-2021) FOSS organizations have been actively working on exploring new ways to support and help FOSS projects.

All in all, I believe the paper makes an interesting contribution, but the conclusions are too focused on FOSS organizations and such a link is hard to visualize from the data gathered and analyzed.
DETAILED REVIEW

The first section introduces the problem and presents most of the related work. It also presents the main triggering idea for the analysis conducted in the paper.

In the second section, the authors briefly discuss the differences between free and open source software, which helps to present some characteristics (and issues) typically found in FOSS software. This also serves to motivate the study conducted in the paper.

The third section presents the research methodology of the paper. Some indications for the subsections:

* Table 1 could be simplified by indicating the starting year and its collected data-months. Where needed, years with fewer than 12 collected data-months should be indicated (e.g., GNOME in 2017).

* When describing CODPC, COMPC and COMPCOD and referring to either (or comments), is that the difference between the added and removed code (or comments)?

The fourth section presents the results of the study. Companion tables and figures illustrate the obtained results. The fifth and last sections discuss the results and elaborate on some hypotheses to justify them.

MINOR ERRORS AND POSSIBLE TYPOS

- Normalize the use of "open-source" vs "open source".

Reviewer #3: My review exceeds the length limit and I am uploading the full review as an attachment. A summary of my review is below:

This article undertakes a longitudinal study of participation in FOSS projects, considering commit frequency, lines of code, and code comments. The work concludes that contributors are less active now, and that this decline is more strongly associated with large organizations than smaller ones. This finding is associated with three outcomes: the relationship between peer production and commercial production is maturing, the role of large organizations as intermediaries is declining, and the "likelihood of new big formal organizations" managing "emerging" projects is "systematically declining".
The work concludes that this result suggests that open organizing is a "pipe dream". This paper tackles an important concern with an ambitious scope. The writing is clear, and I appreciate the attention to a range of FOSS-related organizations. The work has strong potential to contribute to the field, with implications for computer science, social computing, and organizational studies. To the extent these organizations are in a state of failure, we need to know about it.

Overall, however, I find that this project faces a crossroads; there is a disconnection between the evidence developed and the argument made, and one side or the other needs to give way. In particular, there needs to be a clear connection between all of the core elements of the paper. To assist in tracking my argument in parallel with the paper's argument, I have broken my concerns into six themes below: the unit of analysis, the sample, the measures, the methods, the connection between the analysis and the discussion, and the ultimate conclusions. However, the gist of my concern (inconsistency between the design section, analysis section, and introduction/discussion/conclusions) is threaded throughout the paper, so some of my concerns as I describe them may seem to repeat/reprise previous concerns; for this I apologize in advance.

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No
Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Submitted filename: Review PONE-D-20-37277R1.pdf

4 Jun 2021. Our full response is included as a separate file.
Submitted filename: REPLY TO REVIEWERS PLOS second revision.docx

30 Jun 2021

PONE-D-20-37277R2
Free and Open Source Software Organizations: A large-scale analysis of code, comments, and commit frequency using OpenHub.net data source
PLOS ONE

Dear Dr. Jemielniak,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. As you can see below, both reviewers are highly positive concerning your article. Nevertheless, Reviewer 3 provided a detailed list of issues (already included in a previous report) that should be addressed and/or properly discussed.
In particular, your revision should pay special attention to point 2 in the reviewer's report (Evidence-Claims Link) as it is directly related to PLOS ONE's publication criterion #4 (https://journals.plos.org/plosone/s/criteria-for-publication#loc-4).

Please submit your revised manuscript by Aug 14 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript: a rebuttal letter that responds to each point raised by the academic editor and reviewer(s) ('Response to Reviewers'), a marked-up copy of your manuscript that highlights changes made to the original version ('Revised Manuscript with Track Changes'), and an unmarked version of your revised paper without tracked changes ('Manuscript').

We look forward to receiving your revised manuscript.

Kind regards,
Sergi Lozano
Academic Editor
PLOS ONE

Journal Requirements: Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here.

Reviewer #2: All comments have been addressed
Reviewer #3: (No Response)

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #2: Yes
Reviewer #3: Partly

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes
Reviewer #3: Yes

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #2: Yes
Reviewer #3: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #2: Yes
Reviewer #3: Yes

6. Review Comments to the Author

Reviewer #2: I would like to thank the authors for the detailed response and for addressing my comments. In my opinion, the new version of the paper has improved with regard to the previous one. I therefore recommend its acceptance.

Reviewer #3: I found myself even more excited about this paper having reviewed the latest version and I appreciate the revision efforts and response to comments. This is valuable and informative work that I am eager to see succeed.
However, some of my core concerns remain, particularly with respect to the claims made versus the evidence presented.

Content-related concerns:

1. Abstract. I reprise my earlier comments here -- the final claim of the abstract is unsustained and needs to be amended. With apologies for the format of the below:

A. "as...the role of large FOSS organizations serving as platforms diminishes" -- it's not clear what is meant by "platforms" here, or whether these organizations seek to serve as such, or whether this service is indeed declining. What's presented is a decline in productivity and growth by various metrics, not a decline in "role as platform".

B. "the likelihood of new formal organizations" -- the analysis presented in this paper explicitly does not address new organizations nor does it measure likelihoods.

C. "managing the...projects emerging is systematically declining" -- the analysis does not address emerging projects; to claim a systematically declining likelihood, one would need evidence addressing the system rather than only its long-lived members (to capture 'systematically'), as well as measuring which organizations an emerging project goes on to join (to capture 'likelihood').

2. Evidence-Claims Link. The "organizations supplanted by platforms" argument is problematic and emblematic of the necessity to connect claims and evidence I cited in my previous review. The use of the term "platform" is unclear and the argument of platforms supplanting organizations is not sustained by the analysis presented.

A. In the abstract, organizations are said to have a role to serve as platforms; 1A has the opener for my points on this. Organizations create and use tools like GitHub to manage their work, but Mozilla for example is not Git nor does it frame itself as a Git provider.

B. The paper presents an argument that "platforms" are supplanting "organizations" (p. 16). This claim needs evidence.
It may be the case that new project founders don't see a need to join an organization when a platform is sufficiently robust, but evidence for this is not presented because of the way the sample is constructed. Valid evidence here would perhaps look like joining rates of new projects through processes like the Apache Incubator versus similarly-successful projects staying independent. But in this case, there's still no supplanting per se because this is a false dichotomy: every project needs both an organization of some kind and an instance on GitHub or similar, with sufficient value in terms of users and code already accumulated before they could hope to join any organization.

3. Sample. The response to my comments with respect to the sample is very much appreciated; a few more of these details should make their way into the paper, given the effort that was expended in this regard. I don't think it's necessary to include the data source in the title---my concern here is more about substance: reporting what is known about the sample, how it is constructed and vetted and verified, its limitations, and so on.

4. Collection of minor points:

A. The paper would benefit from a modest revision pass for typographical issues. I observed single-sentence paragraphs, and some of the captions had typos. Y-axis labels on the figures would be useful.

B. Figure 4 is difficult to make sense of; it may be possible to improve it using a faceted plot or simply a repeat of the style of Figure 3.

C. I'm not sure it's necessary to publish the QQ plot; a direct visualization of the distribution or a measure of skewness would be reasonable.

D. "technological platforms" (p. 17) seems like an overbroad scope; I think what's meant here is only source control/collaboration platforms like GitHub.

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review?

Reviewer #2: No
Reviewer #3: No

10 Aug 2021. Our response is included in a separate file.
Submitted filename: REPLY TO REVIEWERS PLOS third revision.docx

26 Aug 2021

Free and Open Source Software Organizations: A large-scale analysis of code, comments, and commits frequency.
PONE-D-20-37277R3

Dear Dr. Jemielniak,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance.
To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Sergi Lozano
Academic Editor
PLOS ONE

14 Sep 2021

PONE-D-20-37277R3
Free and Open Source Software Organizations: A large-scale analysis of code, comments, and commits frequency.

Dear Dr. Jemielniak:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of Dr. Sergi Lozano
Academic Editor
PLOS ONE
Table 7. Mean values of comment lines per code line (COMPCOD).

Year   Large (#projects > 100)   Medium (25 < #projects < 100)   Small (#projects <= 25)
1997   0.15529                   0.25613                         0.16807
1998   0.14792                   0.24806                         0.21714
1999   0.21548                   1.15850                         0.31579
2000   0.26096                   0.26674                         0.28325
2001   0.31828                   0.60631                         0.23779
2002   0.28140                   0.29048                         0.33766
2003   0.47377                   0.93073                         0.65974
2004   1.36562                   1.60681                         0.37094
2005   0.42731                   0.43290                         0.36057
2006   1.07082                   0.31924                         0.31417
2007   0.37597                   1.00306                         0.27286
2008   0.63819                   0.29977                         0.26000
2009   0.36495                   0.32552                         0.28688
2010   0.64023                   2.14643                         0.95249
2011   0.35907                   0.26990                         0.23549
2012   0.44546                   0.34084                         0.49329
2013   0.33473                   0.45488                         0.24373
2014   0.31285                   0.50129                         0.22618
2015   0.29581                   0.89144                         0.33060
2016   0.41132                   0.23855                         0.32675
2017   0.59586                   0.21995                         0.22574

Trimmed mean (5% bottom and 5% top)   0.4199    0.5495    0.3157
Mean                                  0.4520    0.6099    0.3390
SD                                    0.27530   0.50011   0.16956
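The summary rows of Table 7 can be reproduced directly from the yearly column values. A minimal Python sketch, assuming the 5% trim drops floor(0.05 * n) values from each tail (one value from each end of the 21 years shown), which reproduces the "Large" column's trimmed mean and mean:

```python
# Yearly COMPCOD means for the "Large (#projects > 100)" column of Table 7.
from statistics import mean

large = [0.15529, 0.14792, 0.21548, 0.26096, 0.31828, 0.28140, 0.47377,
         1.36562, 0.42731, 1.07082, 0.37597, 0.63819, 0.36495, 0.64023,
         0.35907, 0.44546, 0.33473, 0.31285, 0.29581, 0.41132, 0.59586]

def trimmed_mean(values, proportion=0.05):
    """Mean after dropping floor(proportion * n) values from each tail."""
    k = int(len(values) * proportion)
    trimmed = sorted(values)[k:len(values) - k] if k else list(values)
    return mean(trimmed)

print(f"{trimmed_mean(large):.4f}")  # 0.4199 (table's trimmed mean)
print(f"{mean(large):.4f}")          # 0.4520 (table's mean)
```

The trimming matters here because single outlier years (e.g. 2004 in the Large column, 2010 in the Medium column) would otherwise pull the plain mean noticeably upward.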
  3 in total

1.  Symposium on health care reform.

Authors:  J P Newhouse
Journal:  J Econ Perspect       Date:  1994

2.  Adaptive sampling in behavioral surveys. (Review)

Authors:  S K Thompson
Journal:  NIDA Res Monogr       Date:  1997

3.  Inequalities in Open Source Software Development: Analysis of Contributor's Commits in Apache Software Foundation Projects.

Authors:  Tadeusz Chełkowski; Peter Gloor; Dariusz Jemielniak
Journal:  PLoS One       Date:  2016-04-20       Impact factor: 3.240

