Preparing GIS data for analysis of stream monitoring data: The R package openSTARS.

Mira Kattwinkel¹, Eduard Szöcs¹, Erin Peterson²,³, Ralf B Schäfer¹.

Abstract

Stream monitoring data provides insights into the biological, chemical and physical status of running waters. Additionally, it can be used to identify drivers of chemical or ecological water quality, to inform related management actions, and to forecast future conditions under land use and global change scenarios. Measurements from sites along the same stream may not be statistically independent, and the R package SSN provides a way to describe spatial autocorrelation when modelling relationships between measured variables and potential drivers. However, SSN requires the user to provide the stream network and sampling locations in a certain format. Likewise, other applications require catchment delineation and intersection of different spatial data. We developed the R package openSTARS that provides the functionality to derive stream networks from a digital elevation model, delineate stream catchments and intersect them with land use or other GIS data as potential predictors. Additionally, locations for model predictions can be generated automatically along the stream network. We present an example workflow of all data preparation steps. In a case study using data from water monitoring sites in Southern Germany, the resulting stream network and derived site characteristics matched those constructed using STARS, an ArcGIS custom toolbox. An advantage of openSTARS is that it relies on free and open-source GRASS GIS and R functions, unlike the original STARS toolbox which depends on proprietary ArcGIS. openSTARS also comes without a graphical user interface, to enhance reproducibility and reusability of the workflow, thereby harmonizing and simplifying the data pre-processing prior to statistical modelling. Overall, openSTARS facilitates the use of spatial regression and other applications on stream networks and contributes to reproducible science with applications in hydrology, environmental sciences and ecology.

Year: 2020 PMID: 32941523 PMCID: PMC7498020 DOI: 10.1371/journal.pone.0239237

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

Streams and rivers are regularly monitored to assess their biological (e.g. species composition or abundance), chemical (e.g. nutrient or pesticide concentrations) and physical (e.g. temperature) status. For example, the EU Water Framework Directive’s (WFD) aim to maintain and improve water quality led to vast monitoring efforts to assess the status of European water bodies, comprising a monitoring network of more than 67000 sites in 2012 [1]. This extensive network is complemented by additional national, regional or local stream monitoring programs, for example to evaluate pesticide concentrations or to monitor industrial discharge [2]. Monitoring data is often related to climatic, land use or hydrological predictors to investigate the effects of anthropogenic impacts on in-stream condition or to support biodiversity conservation [3, 4]. Sampling sites in branching stream networks are often connected by stream flow and may share similar landscape characteristics (e.g. elevation or climate) to sites in close geographic space. Therefore, the measurements may be correlated in geographic space, topological space, or both [5]. This violates the assumption of independence in many classical statistical approaches (e.g. linear regression) and alternative methods accounting for the spatial dependence in the data should be used [6]. The package SSN [7] for the R statistical software [8] provides the functionality to fit spatial statistical stream network (SSN) models using a mixture of covariance functions that account for the unique spatial relationships found in streams data [9]. However, several data preparation steps are necessary to generate the spatial information needed to fit these models. SSN models have been applied in almost 50 case studies, mainly predicting water temperature [10] but also other physico-chemical [11] or biological variables [12, 13].
To the best of our knowledge, all of the applications used the Spatial Tools for the Analysis of River Systems (STARS) toolbox [14] for preparing the spatial input data to allow for subsequent modelling in SSN. Although STARS is freely available, it depends on the proprietary software ArcGIS [15], which does not allow users to study or improve the source code and incurs relatively high license costs [16]. Additionally, a redesign of the ArcGIS Pro environment and discontinuation of the personal geodatabase would require significant modification of STARS to meet the requirements of the latest ArcGIS versions. Here, we introduce the R package openSTARS [17] as an alternative tool for data preparation of spatial stream network data, which can subsequently be used with the SSN package and other applications. The package is independent of proprietary software, relying on the geographic information system (GIS) functionalities of R and GRASS GIS via the package rgrass7 [18]. GRASS GIS is free and open source software (FOSS) with a strong user and developer community and offers powerful functions for deriving stream networks and catchment delineation [19]. Our implementation within R also releases the user from the need to familiarize themselves with GRASS GIS. We provide example code in the S2 File, enabling readers to recreate this workflow using their own stream data, and compare openSTARS with STARS output.

Material and methods

Background

The openSTARS package provides functions to generate the spatial information needed to fit SSN models to stream data using the SSN package [7] (Fig 1). The STARS toolbox uses and cleans an existing stream network in vector format, whereas openSTARS creates the stream network from a digital elevation model (DEM) based on the GRASS functions r.watershed and r.stream.extract [20]. Optionally, an existing stream network can be provided in vector format that guides the stream network derived from the DEM. In the SSN package, several topological conditions are inadmissible [14]: converging nodes (two stream segments converge at a confluence without flowing into another downstream segment), diverging nodes (a stream segment flows into a node and splits into multiple segments downstream of the node), and complex confluences (more than two stream segments flow into a node, and out to a single downstream segment). A major advantage of relying on a streams dataset derived from a DEM is that it is free of true topological errors, i.e. all streams flow downstream, there are no duplicate reaches and there is only a single outflow per network. The algorithms also produce non-braided networks and thereby avoid the diverging nodes that braided sections would create. Hence, this approach can save a significant amount of editing time when there are numerous topological errors in a vector stream network.
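
The three inadmissible node types described above can be illustrated with a small graph check. The following Python sketch is only an illustration under stated assumptions: the function name classify_nodes and the representation of a network as (upstream node, downstream node) pairs are hypothetical and are not part of openSTARS, STARS or SSN.

```python
from collections import defaultdict


def classify_nodes(segments):
    """Flag nodes that violate the topological rules described in the text.

    segments: iterable of (upstream_node, downstream_node) pairs, one per
    stream segment, oriented in flow direction (hypothetical representation).
    Returns a dict mapping each inadmissible node to a label.
    """
    inflow = defaultdict(int)   # number of segments ending at a node
    outflow = defaultdict(int)  # number of segments starting at a node
    for up, down in segments:
        outflow[up] += 1
        inflow[down] += 1

    problems = {}
    for node in set(inflow) | set(outflow):
        n_in, n_out = inflow[node], outflow[node]
        if n_out > 1:
            # one node feeds several downstream segments
            problems[node] = "diverging"
        elif n_in > 2 and n_out == 1:
            # more than two segments join into a single downstream segment
            problems[node] = "complex confluence"
        elif n_in >= 2 and n_out == 0:
            # segments meet but nothing flows on downstream
            problems[node] = "converging"
    return problems


# Toy network: a regular confluence at "c", plus one node of each problem type.
toy = [("a", "c"), ("b", "c"), ("c", "d"),
       ("e", "f"), ("g", "f"), ("h", "f"), ("f", "i"),
       ("j", "k"), ("j", "l"),
       ("m", "o"), ("n", "o")]
print(classify_nodes(toy))
```

A regular confluence (two inflows, one outflow) and a simple outlet (one inflow, no outflow) are not flagged.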

Fig 1

openSTARS workflow.

Workflow

Load data

First, the GRASS environment is set based on the region and projection of the DEM (setup_grass_environment) (Fig 1, S2 File). Second, the DEM and site locations are read into the GRASS location, along with optional data including a stream network in vector format, maps of potential predictors and prediction sites using import_data (Fig 1A; Table 1).

Table 1

Input data for openSTARS.

Data	Mandatory or optional	Format	Description
digital elevation model (DEM)	mandatory	raster	elevation data needed to derive the stream network and delineate catchment boundaries
sampling sites	mandatory	vector	locations of the sampling sites
streams	optional	vector	stream network to be burnt into the DEM to guide the derived one
prediction sites	optional	vector	locations of sites where model predictions will be generated
potential predictors	optional	raster or vector	spatial datasets used to calculate predictor variables for use in the SSN model
measurements	optional	table (e.g. txt, csv)	measurements at the sampling sites (dependent variables and optionally potential predictors)

Derive and clean stream network

Preprocessing starts by deriving a stream network from the DEM (derive_streams; Fig 1B). In this step, the optional stream network can be burnt into the DEM by a given number of meters, guiding the DEM-derived streams to this network. The spatial resolution of the network (i.e. how fine the branching of the network is) can be tuned using the parameters accum_threshold and min_stream_length, which represent the minimum number of accumulated raster cells for delineating a stream line and the minimum stream length in DEM raster cells, respectively. If the resolution of the derived network is too fine or too coarse, this step can be repeated, or other tools can be used to determine an optimal threshold for a given stream network [21]. Next, the streams should be checked for complex confluences (check_compl_confluences) and corrected if necessary (correct_compl_confluences). The latter function moves the downstream node of one stream segment a fraction of the DEM cell size upstream, creating a tiny artificial segment in between the new and old nodes (Fig 1B). As an optional feature, artificial stream segments that flow through lakes and reservoirs can be deleted to create separate unconnected stream networks (delete_lakes). Once the streams have been topologically corrected, several attributes necessary for SSN modelling, including reach contributing area (RCA, i.e. the land area adjacent to each segment that provides lateral overland flow) and catchment areas, must be calculated using calc_edges [14], resulting in a new vector map of streams called ‘edges’.
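
To make the role of the accumulation threshold concrete, the following Python sketch computes flow accumulation on a tiny toy DEM and marks cells as stream cells once the accumulation reaches a threshold. It is a simplified illustration, not the openSTARS implementation: openSTARS delegates this step to the GRASS modules r.watershed and r.stream.extract, whereas the sketch swaps in a plain single-direction (D8) routing. The function names, the toy DEM and the use of ">=" for the threshold are assumptions.

```python
def d8_accumulation(dem):
    """Count, for every cell, how many cells drain through it (itself included).

    Each cell passes its accumulated count to the steepest downslope
    neighbour among its 8 neighbours (simplified D8 routing).
    """
    rows, cols = len(dem), len(dem[0])
    acc = [[1] * cols for _ in range(rows)]
    # Visit cells from highest to lowest so upstream counts are complete
    # before a cell hands its count on.
    cells = sorted(((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
                   reverse=True)
    for z, r, c in cells:
        best_slope, target = 0.0, None
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= rr < rows and 0 <= cc < cols):
                    continue
                distance = (dr * dr + dc * dc) ** 0.5
                slope = (z - dem[rr][cc]) / distance
                if slope > best_slope:
                    best_slope, target = slope, (rr, cc)
        if target is not None:
            acc[target[0]][target[1]] += acc[r][c]
    return acc


def count_stream_cells(acc, accum_threshold):
    """Number of cells whose accumulation reaches the threshold."""
    return sum(cell >= accum_threshold for row in acc for cell in row)


# Hypothetical 3 x 3 DEM with a valley running down the middle column.
toy_dem = [[9, 8, 9],
           [7, 5, 7],
           [6, 2, 6]]
toy_acc = d8_accumulation(toy_dem)
print(toy_acc)
print(count_stream_cells(toy_acc, 4), count_stream_cells(toy_acc, 5))
```

Raising the threshold leaves fewer stream cells, i.e. a coarser network, which is the effect the accum_threshold parameter is described as having.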

Prepare sampling and prediction sites

The site locations are cleaned, i.e. snapped to the edges if they do not exactly intersect line segments (calc_sites; Fig 1C). Additionally, attributes necessary for SSN modelling are assigned and a new vector map ‘sites’ is created. The first step is necessary because of frequent mismatches between site and stream locations due to GPS imprecision, the need to represent three-dimensional streams as lines in a GIS, or when deriving streams from a raster-based DEM. The column ‘dist’ in the sites’ attribute table gives the distance a point was moved in map units. A maximum distance can be provided as an argument in calc_sites, and sites exceeding this distance will be deleted. If a large fraction of sites is moved long distances, this may indicate that the spatial resolution of the stream network is too coarse. The calc_prediction_sites function (Fig 1C) allows prediction sites for use in SSN modelling to be created automatically. The user specifies the number of prediction sites to be created or the distance between sites. Prediction sites are created evenly along all or selected networks in the data set with identical distances from downstream to upstream sites.
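
The snapping step can be sketched in plain Python as a nearest-point-on-segment search with an optional cut-off distance. This is a minimal geometric illustration under stated assumptions, not the calc_sites implementation (which operates on GRASS vector maps): the function names, the dictionary layout and the straight-line edges are hypothetical.

```python
import math


def snap_to_segment(p, a, b):
    """Return the point on segment a-b closest to p, and its distance to p."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    # Relative position of the projection along the segment, clamped to [0, 1].
    t = 0.0 if length_sq == 0 else ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    sx, sy = ax + t * dx, ay + t * dy
    return (sx, sy), math.hypot(px - sx, py - sy)


def snap_sites(sites, edges, max_dist=None):
    """Snap each site to its nearest edge; drop sites farther than max_dist.

    sites: dict site_id -> (x, y); edges: list of ((x1, y1), (x2, y2)).
    The returned 'dist' mirrors the 'dist' column described in the text.
    """
    snapped = {}
    for site_id, p in sites.items():
        point, dist = min((snap_to_segment(p, a, b) for a, b in edges),
                          key=lambda result: result[1])
        if max_dist is None or dist <= max_dist:
            snapped[site_id] = {"xy": point, "dist": dist}
    return snapped


# Hypothetical coordinates in map units.
toy_edges = [((0, 0), (10, 0)), ((10, 0), (10, 10))]
toy_sites = {"s1": (4, 3), "s2": (12, 5), "s3": (5, 200)}
print(snap_sites(toy_sites, toy_edges, max_dist=150))
```

In this toy example the site "s3" lies more than 150 map units from any edge and is therefore removed, analogous to the maximum-distance argument described above.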

Derive potential predictors

Predictor variables are commonly used in SSN models to represent characteristics thought to influence the response (e.g. water quality or organism abundance). These must be assigned to the attribute tables of the sampling and prediction sites (Fig 1D). For approximate assignment as in the STARS toolbox, calc_attributes_edges summarises values within the RCA and the catchment of the downstream node of each edge. Then values are assigned to the sites based on their position on the line segment using calc_attributes_sites_approx. The second option is to derive exact catchments for each site and then summarise predictor values within the catchments (calc_attributes_sites_exact). This can be computationally intensive and take considerably longer than the approximation when there is a large number of sites.

Export data

The optional merge_sites_with_measurements function is used to reduce the computational resources needed to process repeated measurements at a single location (Fig 1E). Before the data are exported, a table of measurement data containing repeated measurements can be merged with the sites attribute table. A new vector point feature is generated for each repeated measurement, which contains the static predictor variables and other attributes generated in the preprocessing steps (Fig 1B–1D). Note that time-varying predictor variables will need to be generated after this step. Finally, export_ssn saves the processed data to a new local directory (a ‘.ssn object’), which contains streams and sample sites as shapefiles (‘edges.shp’, ‘sites.shp’, respectively) and optionally prediction sites, as well as topological relationships stored in text files (‘netX.dat’), with the naming conventions and formats required by SSN (Fig 1E).
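
The idea behind merging repeated measurements can be shown as a one-to-many join: site attributes are computed once and then copied onto every measurement row. The Python sketch below is only an illustration; the function name merge_repeated_measurements, the key column site_id and the example columns are hypothetical and do not reflect the openSTARS function signature.

```python
def merge_repeated_measurements(sites, measurements, key="site_id"):
    """Return one record per measurement, carrying the static site attributes.

    sites: list of dicts with one record per site.
    measurements: list of dicts, possibly several per site.
    Measurements whose site is missing (e.g. removed during snapping)
    are skipped.
    """
    site_by_id = {site[key]: site for site in sites}
    merged = []
    for measurement in measurements:
        site = site_by_id.get(measurement[key])
        if site is None:
            continue
        merged.append({**site, **measurement})
    return merged


# Hypothetical values for illustration only.
toy_sites = [{"site_id": 1, "catch_km2": 12.5},
             {"site_id": 2, "catch_km2": 3.0}]
toy_measurements = [{"site_id": 1, "date": "2012-05-01", "value": 0.4},
                    {"site_id": 1, "date": "2012-06-01", "value": 0.7},
                    {"site_id": 3, "date": "2012-05-01", "value": 1.0}]
for row in merge_repeated_measurements(toy_sites, toy_measurements):
    print(row)
```

Because the expensive spatial attributes are derived per site rather than per measurement, the cost of preprocessing does not grow with the number of repeated samples.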

Application example and comparison with STARS toolbox

We compare the openSTARS and STARS (in ArcGIS version 10.6) output for an analysis based on 39 monitoring sites in Southern Germany (Baden-Württemberg). Point coordinates were provided by the State Environment Agency Baden-Württemberg (LUBW), the DEM was provided by the European Environment Agency [22] and a stream network by the German Federal Institute of Hydrology (www.wasserblick.net). As an example predictor map, we used the share of arable land use (vector format) in the sites’ catchments (based on ATKIS land cover data [23]). STARS requires a stream network in vector format and so we used (i) the one burnt into the DEM in openSTARS and (ii) the one derived from the DEM by openSTARS that exhibited a higher resolution. The results of the two tools were inspected visually and by systematically comparing the calculated catchment sizes of the sampling sites and the area of arable land use.

Results

openSTARS and STARS yielded very similar results with regard to the position of sites snapped to the edges (Fig 2). The number of small tributaries in the stream network derived by openSTARS depends on the choice of the accum_threshold parameter in derive_streams, which was adjusted to minimize the snapping distance of sites to an edge. However, the stream courses of both tools match.

Fig 2

Comparison of openSTARS and STARS edges and snapped sampling sites.

STARS edges with slight offset for readability. The sites marked with a dark circle were removed in STARS because their snapping distance exceeded 150 m.

The derived attributes catchment size and area of arable land use within the catchments of the sites derived with the two tools are very similar (correlation coefficients between the attributes calculated with STARS and openSTARS for the sites: 0.97 and 0.98, respectively; Fig 3), when based on the original stream network. The results were also similar when based on the derived stream network. There are only two exceptions: one site was snapped to a smaller tributary created in openSTARS, which is lacking in the streams dataset used in STARS, and in the other case the network is smaller (S1 File).
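
A comparison of this kind can be sketched in a few lines of Python. The example computes Pearson's correlation coefficient for paired values from two tools and, as an addition that the study itself does not report, the mean difference with Bland-Altman style limits of agreement, since correlation alone cannot reveal a constant offset. All numbers are made up for illustration and the function names are hypothetical.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(x, y):
    """Mean difference (bias) and 95% limits of agreement of x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical catchment areas (km²) from two tools with a constant offset.
tool_a = [10.0, 20.0, 30.0, 40.0, 50.0]
tool_b = [12.0, 22.0, 32.0, 42.0, 52.0]
print(pearson_r(tool_a, tool_b))
print(bland_altman(tool_a, tool_b))
```

In the toy data the two series are perfectly correlated although one is consistently 2 km² larger, which is why an agreement measure is a useful complement to the correlation coefficient.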

Fig 3

Comparison of openSTARS and STARS calculated catchment attributes for the sampling sites.

A: catchment area in km²; B: area of arable land use in km². r is Pearson’s correlation coefficient (including the marked outliers), the dotted line shows the 1:1 relationship, and the solid black dots mark two outliers (S1 File).

Discussion

Despite the differences in the procedures of openSTARS and STARS, the derived catchment characteristics for sampling sites were very similar. The major conceptual difference between the two is that openSTARS derives the stream network from a DEM, whereas STARS relies on an existing stream network in vector format. Hence, openSTARS fills an important gap given that stream networks are either not readily available or are too coarse in many regions of the world, making them unsuitable for use in spatial statistical stream network models. Moreover, existing networks in vector format often contain many topological errors that can be time consuming and difficult to correct. On the other hand, in some regions stream datasets have been topologically corrected for use in SSN modelling [24] or have been attributed with information that can be used as predictor variables [25]. Preserving such information in openSTARS would be challenging as it derives the stream network from the DEM as a new map. Another technical difference between the tools is that calculating RCAs in STARS is based on the D8 flow direction algorithm, while openSTARS applies the more current multiple flow direction (MFD) algorithm. This may lead to differences in the calculation of RCAs and thereby in catchment areas and other potential predictors. The requirement of non-braided streams for SSN modelling leads to another issue. In heavily modified areas (e.g. artificial drainage ditches or channels) it can be challenging to choose just one “true” stream segment. Likewise, independent of the tools used, deriving RCAs and catchments in such situations may be difficult. In openSTARS, deriving the streams from the DEM ensures the absence of braided sections, and complex confluences can be corrected automatically.
In STARS, the cleaning of the network is done manually or semi-automatically using the ArcGIS Topology tools, which can be very laborious depending on the size of the network and the number of such topological errors. A future openSTARS version may incorporate the option for additional manual checks and corrections. The GRASS GIS algorithms used to derive the stream network in openSTARS (r.watershed and r.stream.extract) can fail in very flat areas where the relief energy is very low, although they are deemed to outperform other algorithms [26]. In such cases, a sufficiently large DEM may provide a gradient, even if the sampling sites cover a smaller area. Additionally, burning in an existing stream network can fix this issue. However, the same issue arises for catchment delineation in the STARS toolbox. A major advantage of openSTARS is that it relies on free and open-source GRASS GIS and R functions, unlike the original STARS toolbox for the proprietary ArcGIS software. Moreover, compared to data preparation in ArcGIS and statistical analysis in R, openSTARS unifies the complete workflow in R. It thereby also facilitates the reproducibility and tracking of the data processing routine. In addition, a deeper understanding of GRASS or other GIS is not required. openSTARS supports the wider application of spatial statistical modelling on stream networks, a technique that is growing in popularity for the analysis of stream data, e.g. from biological or chemical monitoring. Such approaches will be particularly useful in the future, as the volume and density of data from low-cost in situ sensors continue to increase [27], and the analyses of these rich datasets may lead to new insights about stream ecosystems.
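
The difference between the D8 and MFD flow direction algorithms mentioned in this discussion can be illustrated for a single raster cell. The Python sketch below is a generic textbook-style illustration under stated assumptions, not the code used by STARS or by GRASS r.watershed: the function name, the neighbour dictionary and the exponent parameter are hypothetical.

```python
def flow_fractions(center, neighbours, method="mfd", exponent=1.0):
    """Share of a cell's outflow received by each downslope neighbour.

    center: elevation of the cell.
    neighbours: dict name -> (elevation, distance to the cell centre).
    method "d8" sends all flow to the steepest neighbour; "mfd" splits it
    among all lower neighbours in proportion to slope ** exponent.
    """
    slopes = {name: (center - z) / dist
              for name, (z, dist) in neighbours.items() if z < center}
    if not slopes:
        return {}  # pit or flat cell: no downslope neighbour
    if method == "d8":
        steepest = max(slopes, key=slopes.get)
        return {steepest: 1.0}
    weights = {name: slope ** exponent for name, slope in slopes.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical cell at 10 m with two lower neighbours and one higher one.
toy_neighbours = {"S": (7.0, 1.0), "E": (9.0, 1.0), "N": (11.0, 1.0)}
print(flow_fractions(10.0, toy_neighbours, method="d8"))
print(flow_fractions(10.0, toy_neighbours, method="mfd"))
```

With D8 the whole contribution of the cell is assigned to one neighbour, whereas MFD spreads it over several, which is one way the delineated RCAs and catchment areas of the two tools can differ.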

S1 File. Differences in catchment areas for STARS and openSTARS in the case study. (PDF)

S2 File. Complete openSTARS workflow (commented R code).

13 Jul 2020

PONE-D-20-15404
Preparing GIS data for analysis of stream monitoring data: The R package openSTARS
PLOS ONE

Dear Mira Kattwinkel,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but needs some revisions. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Aug 27 2020 11:59PM.

We look forward to receiving your revised manuscript.

Kind regards,
Stoyan Nedkov
Academic Editor
PLOS ONE

Additional Editor Comments:

Dear Authors, the reviewers finalized their evaluation suggesting revisions of the paper. I agree with their evaluation and suggest major revision and recommend to pay special attention on their critics. Please thoroughly address all reviewer comments in your reply and in the revised version of your manuscript.

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming.

2. Thank you for stating in your Funding Statement: "MK was partly funded by the EU-INTERREG V Upper Rhine via project 1.6 SERIOR (Security-Risk-Orientation). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. All other authors received no specific funding for this work." Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now.

3. Thank you for stating the following in the Competing Interests section: "The authors have declared that no competing interests exist." We note that one or more of the authors are employed by a commercial company: BASF SE, Biostatistics & Data Sciences.

3.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study.

3.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.

4. We note that Figures 1, 2, S1, S2 in your submission contain map images which may be copyrighted. We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission.

5. Please ensure that you refer to Figure 2 and 3 in your text as, if accepted, production will need this reference to link the reader to the figure.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?
Reviewer #1: Yes
Reviewer #2: Yes

2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: No
Reviewer #2: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available?
Reviewer #1: Yes
Reviewer #2: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #1: Yes
Reviewer #2: Yes

5. Review Comments to the Author

Reviewer #1: The submitted text "Preparing GIS data for analysis of stream monitoring data: The R package openSTARS" by Mira Kattwinkel at.al. describes a complete working licenses free library named "openStars" for R statistical environment.
The software is dedicated to the pre-processing of stream monitoring data for further statistical analysis, mainly with SSN. The tool is already available online and free to use. There is a manual which is easy to understand and reproduce. However, I did not succeed in reproducing the example provided in Supporting_Information_S2. There is a computational error with derive_streams(accum_threshold = 100, condition = T, clean = TRUE, burn = 10) when accum_threshold is equal to 100: calc_edges() and calc_sites() fail to execute correctly, mainly producing missing values for locID, pid and netID. The error does not occur for accum_threshold greater than 170, but the results then differ from those in the supporting information. The GRASS version used is 7.6, with R version 3.6.3 on Fedora 31.

2. The text has to be revisited because it contains unclear statements, such as on p. 8, line 170, where the following text appears: "openSTARS and STARS yielded very similar results (2)". Here the reference to some index (2) is confusing. Another example is the sentence on p. 9, line 182, starting with "The major conceptional difference...", where it is claimed "that the STARS derives the stream network from a DEM, whereas openSTARS relies on an existing stream network in vector format", which is the opposite of what the preceding text says.

3. Regarding the statistical part in Paragraph 3, I think it is too brief and incomplete. Secondly, it is not clear which quantities are correlated or how the correlation is computed. I can only guess that outliers were removed from the computation, because it is very suspicious to have r greater than 0.98 with such large outliers. Moreover, correlation is not a suitable measure here, because samples in poor agreement may still be highly correlated. Likewise, a study of possible bias is missing. As an option for revision, the authors could consider a parametric or non-parametric test, or at least graphical tools such as the Bland-Altman plots widely used in medicine.
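Reviewer #1's third point, that two sets of values can be highly correlated while still disagreeing, can be illustrated with a minimal sketch. The Python code below uses made-up catchment areas, not data from the manuscript. The function names `pearson_r` and `bland_altman` are hypothetical, and the 1.96 multiplier for approximate 95% limits of agreement assumes roughly normally distributed differences.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) of agreement for x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)       # mean difference between the two methods
    sd = stdev(diffs)        # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical catchment areas (km^2) from two tools; illustrative values only.
tool_a = [1.0, 2.5, 4.0, 8.0, 15.0, 30.0]
# Tool B overestimates by 20% plus a constant offset (an assumed bias).
tool_b = [1.2 * a + 0.5 for a in tool_a]

r = pearson_r(tool_a, tool_b)
bias, lower, upper = bland_altman(tool_a, tool_b)

print(f"Pearson r: {r:.4f}")
print(f"Bias (A - B): {bias:.3f}, limits of agreement: [{lower:.3f}, {upper:.3f}]")
```

Because tool B is an exact linear function of tool A, the correlation is perfect even though every value differs; the non-zero bias and the width of the limits of agreement expose the disagreement that r hides.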
Reviewer #2: The submission itself is more or less ready to go, and explains workflow variants, and in the supplementary materials also analysis in the SSN package. The points that I feel deserve attention are that GRASS is now 7.8.3, R 4.0.2, and some of the package code is showing its age.

For example, check_compl_confluences() now reports: "The command: v.to.db --quiet map=streams_v option=length type=line columns=length_new produced an error (1) during execution: ERROR: Column exists. To overwrite, use the --overwrite flag". This may have subsequent consequences.

Another is in calc_edges(), where: "Error in .prepareFastSubset(isub = isub, x = x, enclos = parent.frame(), : RHS of == is length 2 which is not 1 or nrow (1197). For robustness, no recycling is allowed (other than of length 1 RHS). Consider %in% instead. In addition: Warning message: In if (dt[stream == id, prev_str01, ] == 0) { : the condition has length > 1 and only the first element will be used" which I believe occurs from R 4.0.0, because the lengths of compared objects are now checked.

These are minor issues, but would trip up users of the R package. I feel that writing CI code for the package on GitHub, run against the latest released versions of R, this package and the packages it depends on, GRASS, and the GRASS extensions, would be worthwhile.

A possible vulnerability is that both GRASS and R (sp and sf packages) have adapted to changes in PROJ and GDAL, so that the handling of coordinate reference systems across the boundary between the two interfaced software systems needs checking, and relying on PROJ strings may not be as reliable as it has been. So when checking the software and provided scripts for changes from version changes in R and GRASS, it would be very sensible to check whether any vulnerabilities are present as R moves from Proj4 to WKT2 string representations.

6. PLOS authors have the option to publish the peer review history of their article.
If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

7 Aug 2020

Our response can be found in the Response to Reviewers file.

Submitted filename: Response_to_Reviewers.pdf

2 Sep 2020

Preparing GIS data for analysis of stream monitoring data: The R package openSTARS

PONE-D-20-15404R1

Dear Dr. Kattwinkel,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Stoyan Nedkov
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes

3.
Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: N/A

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Finally, I successfully replicated the whole process as shown in Supporting Information 2. The previous issues were partly my own fault: I had set different geographical projections in the GRASS software. However, the error messages were completely useless, and revising them must be considered.
In my opinion, the reliance on GRASS settings must be explained explicitly in the manual. Note that this is a technical remark for further development and does not affect my decision.

Reviewer #2: (No Response)

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

9 Sep 2020

PONE-D-20-15404R1

Preparing GIS data for analysis of stream monitoring data: The R package openSTARS

Dear Dr. Kattwinkel:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Stoyan Nedkov
Academic Editor
PLOS ONE

6 in total

1. Let the four freedoms paradigm apply to ecology.

Authors: Duccio Rocchini; Markus Neteler
Journal: Trends Ecol Evol Date: 2012-04-20 Impact factor: 17.712

2. Social equity shapes zone-selection: Balancing aquatic biodiversity conservation and ecosystem services delivery in the transboundary Danube River Basin.

Authors: Sami Domisch; Karan Kakouei; Javier Martínez-López; Kenneth J Bagstad; Ainhoa Magrach; Stefano Balbi; Ferdinando Villa; Andrea Funk; Thomas Hein; Florian Borgwardt; Virgilio Hermoso; Sonja C Jähnig; Simone D Langhans
Journal: Sci Total Environ Date: 2018-11-28 Impact factor: 7.963

3. How a complete pesticide screening changes the assessment of surface water quality.

Authors: Christoph Moschet; Irene Wittmer; Jelena Simovic; Marion Junghans; Alessandro Piazzoli; Heinz Singer; Christian Stamm; Christian Leu; Juliane Hollender
Journal: Environ Sci Technol Date: 2014-05-12 Impact factor: 9.028

4. Monitoring spatial and temporal variation of dissolved oxygen and water temperature in the Savannah River using a sensor network.

Authors: Christopher J Post; Michael P Cope; Patrick D Gerard; Nicholas M Masto; Joshua R Vine; Roxanne Y Stiglitz; Jason O Hallstrom; Jillian C Newman; Elena A Mikhailova
Journal: Environ Monit Assess Date: 2018-04-10 Impact factor: 2.513

5. IMPROVING PREDICTIVE MODELS OF IN-STREAM PHOSPHORUS CONCENTRATION BASED ON NATIONALLY-AVAILABLE SPATIAL DATA COVERAGES.

Authors: Murray W Scown; Michael G McManus; John H Carson; Christopher T Nietch
Journal: J Am Water Resour Assoc Date: 2017-08

Review 6. Modelling dendritic ecological networks in space: an integrated network perspective.

Authors: Erin E Peterson; Jay M Ver Hoef; Dan J Isaak; Jeffrey A Falke; Marie-Josée Fortin; Chris E Jordan; Kristina McNyset; Pascal Monestiez; Aaron S Ruesch; Aritra Sengupta; Nicholas Som; E Ashley Steel; David M Theobald; Christian E Torgersen; Seth J Wenger
Journal: Ecol Lett Date: 2013-03-04 Impact factor: 9.492
