Alberto Rodriguez-Ramirez1,2, Manuel González-Rivero3,4,5, Oscar Beijbom6,7, Christophe Bailhache8, Pim Bongaerts6,9, Kristen T Brown10,11, Dominic E P Bryant11, Peter Dalton6, Sophie Dove10,11, Anjani Ganase11, Emma V Kennedy6,12, Catherine J S Kim11, Sebastian Lopez-Marcano6,13, Benjamin P Neal6,14,15, Veronica Z Radice11, Julie Vercelloni6,11,16,17, Hawthorne L Beyer10,11, Ove Hoegh-Guldberg18,19,20.
Abstract
Addressing the global decline of coral reefs requires effective actions from managers, policymakers and society as a whole. Coral reef scientists are therefore challenged with the task of providing prompt and relevant inputs for science-based decision-making. Here, we provide a baseline dataset, covering 1300 km of tropical coral reef habitats globally, and comprising over one million geo-referenced, high-resolution photo-quadrats analysed using artificial intelligence to automatically estimate the proportional cover of benthic components. The dataset contains information on five major reef regions, and spans 2012-2018, including surveys before and after the 2016 global bleaching event. The taxonomic resolution attained by image analysis, together with the spatially explicit nature of the images, allows for multi-scale spatial analyses and temporal assessments (decline and recovery), and supports image recognition developments. This standardised dataset across broad geographies offers a significant contribution towards a sound baseline for advancing our understanding of coral reef ecology and thereby taking collective and informed actions to mitigate catastrophic losses in coral reefs worldwide.
Year: 2020 PMID: 33082344 PMCID: PMC7576589 DOI: 10.1038/s41597-020-00698-6
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 The workflow for generating the global dataset of coral reef imagery and associated data. The 860 photographic surveys from the Western Atlantic Ocean, Southeast Asia, Central Pacific Ocean, Central Indian Ocean, and Eastern Australia were conducted between 2012 and 2018. Reef locations are represented by points colour-coded according to the survey region. Survey images were post-processed to transform raw fish-eye images into 1 × 1 m quadrats for manual and automated annotation (inset originally published in González-Rivero et al.[23] as Figure S1). For the image analysis, nine networks were trained. For each network, images were divided into two groups: training and testing images. Both sets were manually annotated to create a training dataset and a verification dataset. The training dataset was used to train and fine-tune the network. The fully trained network was then used to classify the test images, and its outcomes (Machine) were contrasted against the human annotations (Observer) in the test dataset during the validation process. Finally, the non-annotated images (photo-quadrats) were automatically annotated using the validated network. The automated classifications were processed to derive the benthic covers that constitute this dataset. QGIS software was used to generate the map using the layer “Countries WGS84” downloaded from ArcGIS Hub (http://hub.arcgis.com/datasets/UIA::countries-wgs84).
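The last step of the workflow, turning automated point classifications into benthic covers, can be illustrated with a minimal sketch. It assumes that the cover of a label in a quadrat is the fraction of annotated points carrying that label; the caption does not spell out the exact rule, and the function name and label strings below are hypothetical.

```python
from collections import Counter


def proportional_cover(labels):
    """Return {label: fraction of points} for the point labels of one quadrat.

    Assumption: cover = (points with this label) / (all annotated points).
    """
    if not labels:
        raise ValueError("no annotation points supplied")
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


# Hypothetical quadrat with 10 classified points.
points = ["hard_coral"] * 5 + ["algae"] * 3 + ["other"] * 2
cover = proportional_cover(points)
```

Under this assumption the covers of all labels in a quadrat sum to one, which is a convenient sanity check when aggregating to transect level.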
Summary of the photographic surveys conducted between 2012 and 2018.
| Region | Survey year | Country or Territory | Surveys | Quadrats |
|---|---|---|---|---|
| Western Atlantic Ocean | 2013 | Anguilla | 15 | 22,250 |
| | 2013 | Aruba | 3 | 5,615 |
| | 2013 | The Bahamas | 30 | 42,678 |
| | 2013 | Belize | 24 | 32,515 |
| | 2013 | Bermuda | 12 | 25,748 |
| | 2013 | Bonaire | 13 | 23,028 |
| | 2013 | Curacao | 16 | 29,419 |
| | 2013 | Guadeloupe | 14 | 15,277 |
| | 2013 | Mexico | 23 | 38,192 |
| | 2013 | Saint Martin | 4 | 4,368 |
| | 2013 | Saint Vincent and the Grenadines | 25 | 25,498 |
| | 2013 | Saint Eustatius | 3 | 2,033 |
| | 2013 | Turks and Caicos Islands | 13 | 18,387 |
| Eastern Australia | 2012, 2014, 2016, 2017 | Australia (Great Barrier Reef) | 261 | 316,369 |
| Central Indian Ocean | 2015 | The Chagos Archipelago | 29 | 42,777 |
| | 2015, 2017 | Maldives | 63 | 94,921 |
| Southeast Asia | 2016, 2018 | Taiwan | 29 | 30,062 |
| | 2014 | Timor-Leste | 26 | 28,482 |
| | 2014, 2018 | Indonesia | 114 | 124,889 |
| | 2014, 2018 | The Philippines | 24 | 33,038 |
| | 2014 | The Solomon Islands | 20 | 33,667 |
| Central Pacific Ocean | 2015, 2016 | United States | 99 | 93,111 |
| Totals | | | | |
Summary of the images, manual point annotations, and test transects used during the train and test processes of each network.
| Trained network | Region | Folder name at the Repository | Country or Territory | Training images | Training annotations | Testing images | Testing annotations | Test transects |
|---|---|---|---|---|---|---|---|---|
| 1 | WAO | ATL | Anguilla | 10 | 44,900 | 50 | 48,000 | 5 |
| | | | Aruba | 7 | | 30 | | 3 |
| | | | The Bahamas | 70 | | 150 | | 12 |
| | | | Belize | 111 | | 115 | | 13 |
| | | | Bermuda | 17 | | 60 | | 8 |
| | | | Bonaire | 21 | | 115 | | 8 |
| | | | Curacao | 36 | | 90 | | 7 |
| | | | Guadeloupe | 17 | | 75 | | 6 |
| | | | Mexico | 60 | | 115 | | 11 |
| | | | Saint Martin | 9 | | 25 | | 2 |
| | | | Saint Vincent and the Grenadines | 54 | | 60 | | 7 |
| | | | Saint Eustatius | 12 | | 25 | | 1 |
| | | | Turks and Caicos Islands | 25 | | 50 | | 4 |
| 2 | CIO | IND_CHA | Chagos | 359 | 35,900 | 331 | 16,550 | 29 |
| 3 | CIO | IND_MDV | Maldives | 1,171 | 117,100 | 540 | 27,000 | 52 |
| 4 | EA | PAC_AUS | Australia | 1,234 | 129,340 | 1,426 | 57,080 | 130 |
| 5 | CPO | PAC_USA | United States (Hawaii) | 501 | 50,100 | 660 | 33,000 | 60 |
| 6 | SEA | PAC_IDN_PHL | Indonesia | 614 | 75,100 | 600 | 45,000 | 50 |
| | | | Philippines | 137 | | 300 | | 24 |
| 7 | SEA | PAC_SLB | The Solomon Islands | 439 | 44,200 | 300 | 15,000 | 29 |
| 8 | SEA | PAC_TWN | Taiwan | 350 | 35,000 | 300 | 15,000 | 27 |
| 9 | SEA | PAC_TLS | Timor-Leste | 547 | 55,100 | 330 | 16,500 | 29 |
Abbreviations in the Region column: Western Atlantic Ocean (WAO), Central Indian Ocean (CIO), Eastern Australia (EA), Central Pacific Ocean (CPO), Southeast Asia (SEA).
List of tables/files (and their structure) included in the repository.
| Table/File | Data field | Descriptor |
|---|---|---|
| seaviewsurvey_surveys.csv | surveyid | A unique five-digit survey ID representing data collection at one location (a transect location) at one point in time. This ID is unique in this table and is used in the folder naming code. There may be multiple survey IDs associated with a transect ID if multi-temporal surveys were conducted at this reef location |
| | transectid | The unique five-digit transect ID. A transect ID will appear more than once in this field if the transect has been surveyed on more than one occasion |
| | surveydate | The date (YYYYMMDD) on which the survey was completed |
| | ocean | The three-letter code representing the ocean within which the survey occurred: ATL = Atlantic Ocean, IND = Indian Ocean, PAC = Pacific Ocean |
| | country | The three-letter country code of the Exclusive Economic Zone within which the survey occurred (e.g., AUS = Australia) |
| | folder_name | The full name of the zipped folder associated with the survey (e.g., “PAC_AUS_47035_201710”), where quadrats are stored within the “photo_quadrats” folder |
| | lat_start | The latitude of the start of the survey (decimal degrees) |
| | lng_start | The longitude of the start of the survey (decimal degrees) |
| | lat_end | The latitude of the end of the survey (decimal degrees) |
| | lng_end | The longitude of the end of the survey (decimal degrees) |
| | pr_hard_coral | The proportional cover of hard coral |
| | pr_algae | The proportional cover of algae |
| | pr_soft_coral | The proportional cover of soft coral |
| | pr_oth_invert | The proportional cover of other invertebrates apart from hard corals and soft corals |
| | pr_other | The proportional cover of other categories apart from hard corals, algae, soft corals and other invertebrates |
| seaviewsurvey_quadrats.csv | surveyid | The five-digit survey ID number from the “seaviewsurvey_surveys.csv” table. There is a one:many relationship between the surveys table and the quadrats table |
| | imageid | The nine-digit image ID. This corresponds to the five-digit survey ID plus four more digits between 0001 and 9999 (the picture number automatically assigned by the camera). Identical image IDs will appear multiple times in this field because one image can have multiple associated quadrats |
| | quadratid | The unique 11-digit quadrat ID. This corresponds to the nine-digit image ID plus two more digits, usually between 01 and 09, which refer to the quadrat number within an image (automatically assigned during the cropping process). “.jpg” must be added to the quadrat ID to derive the filename of the quadrat (with the extension) within the imagery |
| seaviewsurvey_reefcover_[region].csv | surveyid | The five-digit survey ID number from the seaviewsurvey_surveys.csv table. There is a one:many relationship between the surveys table and the reefcover table |
| | imageid | The nine-digit image ID from the seaviewsurvey_quadrats.csv table. Identical image IDs appear multiple times in this field because one image can have multiple associated quadrats |
| | quadratid | The 11-digit quadrat ID from the seaviewsurvey_quadrats.csv table. There is a one:many relationship between the cover table and the annotations table because each quadrat can have many annotations |
| | lat | The latitude of the survey (decimal degrees) |
| | lng | The longitude of the survey (decimal degrees) |
| | [label] | The proportional cover of the label within the quadrat. A description of each label is presented in the “seaviewsurvey_labelsets.csv” table |
| seaviewsurvey_annotations.csv | quadratid | The 11-digit quadrat ID from the seaviewsurvey_quadrats.csv table. There is a one:many relationship between the quadrat table and the annotations table because each quadrat can have many annotations |
| | y | Y coordinate (row) of the pixel (graphics format image, origin of first pixel in the upper left corner) that has been automatically annotated |
| | x | X coordinate (column) of the pixel (graphics format image, origin of first pixel in the upper left corner) that has been automatically annotated |
| | label | The short code representing the category identified. A description of each label is presented in the “seaviewsurvey_labelsets.csv” table |
| annotations_[ocean]_[country].csv | quadratid | The 11-digit quadrat ID from the seaviewsurvey_quadrats.csv table. There is a one:many relationship between the quadrat table and the annotations table because each quadrat can have many annotations |
| | y | Y coordinate (row) of the pixel (graphics format image, origin of first pixel in the upper left corner) that has been manually annotated |
| | x | X coordinate (column) of the pixel (graphics format image, origin of first pixel in the upper left corner) that has been manually annotated |
| | label_name | The full name of the label identified. A description of each label is presented in the seaviewsurvey_labelsets.csv table |
| | label | The short code representing the category identified. A description of each label is presented in the “seaviewsurvey_labelsets.csv” table |
| | func_group | Main benthic group of the label identified: hard coral, soft coral, other invertebrates, algae, other |
| | method | Type of annotation: random = point selected randomly during the annotation; target = point deliberately targeted during the annotation |
| | data_set | Application of the manual annotations: train = to train a net; test = to validate a net |
| seaviewsurvey_labelsets.csv | region | Name of one of the five regions surveyed: Atlantic, Indian Ocean, Pacific Australia, Southeast Asia, Pacific Hawaii |
| | label | The short code for one of the labels within the region considered |
| | func_group | Main benthic group of the label: hard coral, soft coral, other invertebrates, algae, other |
| | label_name | The full name of the label within the region considered |
| | merged_label | Alternative short code used to reduce the complexity of the label-set. Some related labels were merged for the technical validation |
| | merged_name | The full name of the merged label |
| | description_examples | A brief description of what the labels represent, with examples |
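The ID scheme described in the table above nests one identifier inside another: the 11-digit quadrat ID starts with the nine-digit image ID, which in turn starts with the five-digit survey ID. A minimal sketch of unpacking these follows. The function and class names are hypothetical, the example ID is made up for illustration, and IDs are handled as strings so that no digits are lost.

```python
from typing import NamedTuple


class QuadratRef(NamedTuple):
    surveyid: str   # first 5 digits
    imageid: str    # first 9 digits (survey ID + camera picture number)
    quadratid: str  # all 11 digits (image ID + quadrat number in the image)
    filename: str   # quadrat ID with the ".jpg" extension appended


def parse_quadratid(quadratid: str) -> QuadratRef:
    """Split an 11-digit quadrat ID into the IDs it embeds."""
    qid = str(quadratid)
    if len(qid) != 11 or not qid.isdigit():
        raise ValueError(f"expected an 11-digit quadrat ID, got {quadratid!r}")
    return QuadratRef(
        surveyid=qid[:5],
        imageid=qid[:9],
        quadratid=qid,
        filename=qid + ".jpg",
    )


# Made-up ID: survey 47035, picture 0123, quadrat 03.
ref = parse_quadratid("47035012303")
```

The derived `surveyid` can then be used to join a quadrat back to its row in seaviewsurvey_surveys.csv, and `filename` to locate the image inside the survey's “photo_quadrats” folder.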
Fig. 2 Mean Absolute Error for the automated estimation of the abundance (cover) of main benthic groups per region. Solid bars and error bars represent the mean and 95% Confidence Intervals of the error, respectively. Figure originally published as Fig. 3 in González-Rivero et al.[23].
Fig. 3 Mean Absolute Error for the automated estimation of the abundance (cover) of key benthic categories. Errors are aggregated by main benthic groups, along the y-axis, and regions, along the x-axis. Solid bars and error bars represent the mean and 95% Confidence Intervals of the error, respectively. Figure originally published as Fig. 4 in González-Rivero et al.[23].
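As an illustration of the metric reported in Figs. 2 and 3, the sketch below computes a Mean Absolute Error between machine and observer cover estimates, with a 95% confidence interval. The interval uses a normal approximation (mean ± 1.96 standard errors), which is an assumption: the captions do not state how the intervals were derived. The function name and the example values are hypothetical.

```python
import math
import statistics


def mae_with_ci(machine, observer, z=1.96):
    """Return (MAE, (lower, upper)) for paired cover estimates.

    Assumption: the CI is mean +/- z * standard error of the absolute errors.
    """
    if len(machine) != len(observer):
        raise ValueError("machine and observer must have the same length")
    errors = [abs(m - o) for m, o in zip(machine, observer)]
    if len(errors) < 2:
        raise ValueError("need at least two paired estimates")
    mean = statistics.fmean(errors)
    sem = statistics.stdev(errors) / math.sqrt(len(errors))
    return mean, (mean - z * sem, mean + z * sem)


# Hypothetical per-transect cover of one benthic group.
machine = [0.30, 0.42, 0.10, 0.55]
observer = [0.28, 0.45, 0.12, 0.50]
mae, (ci_low, ci_high) = mae_with_ci(machine, observer)
```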
Fig. 4 Overall agreement between observer (manual) and machine (network) estimations of abundance (cover). Agreement is expressed through two metrics: (a) correlation between machine and observer annotations, and (b) bias (Bland-Altman plot). Each filled circle in these panels represents the estimated cover for one label, classified by both the machine and the observer, in a given transect. The correlation shows that the automated estimations closely match the expert observers' estimations of benthic abundance (R² = 0.97). The Bland-Altman plot shows that, overall, the differences (bias) between machine and observer tend towards a mean of zero (grey continuous line), with a homogeneous error around the mean, defined by the Critical Difference (Critical diff.), or the 95% Confidence Interval of the difference between observers and machines (dashed grey lines). Figure originally published in González-Rivero et al.[23] (as Figure A1).
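The quantities shown in the Bland-Altman panel can be sketched as follows. Bias is taken as the mean of the machine-minus-observer differences, and the critical difference as 1.96 times the standard deviation of those differences, which is the usual Bland-Altman convention and an assumption here, since the caption does not give the formula. The function name and the example values are hypothetical.

```python
import statistics


def bland_altman(machine, observer, z=1.96):
    """Return (bias, critical_difference, (lower, upper)) for paired estimates.

    Assumption: critical difference = z * sample SD of the differences.
    """
    if len(machine) != len(observer):
        raise ValueError("machine and observer must have the same length")
    diffs = [m - o for m, o in zip(machine, observer)]
    if len(diffs) < 2:
        raise ValueError("need at least two paired estimates")
    bias = statistics.fmean(diffs)
    critical_diff = z * statistics.stdev(diffs)
    return bias, critical_diff, (bias - critical_diff, bias + critical_diff)


# Hypothetical per-transect cover estimates for one label.
machine = [0.30, 0.42, 0.10, 0.55]
observer = [0.28, 0.45, 0.12, 0.50]
bias, critical_diff, (lower, upper) = bland_altman(machine, observer)
```

A bias close to zero with limits that do not widen along the cover range is what the caption describes as a homogeneous error around the mean.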
Fig. 5 Comparison of compositional similarity within regions. Community similarity between observed and estimated benthic composition per region was evaluated by the Bray-Curtis similarity index. Solid bars and error bars represent the mean and 95% Confidence Intervals of the error, respectively. Figure originally published in González-Rivero et al.[23] (as Fig. 5).
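For reference, the Bray-Curtis similarity between two compositions is one minus the sum of absolute differences divided by the sum of all abundances. The sketch below applies that standard definition to two cover dictionaries; how compositions were assembled per transect or region is not stated in the caption, so the input structure, function name and label strings are assumptions.

```python
def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity between two {label: cover} compositions.

    similarity = 1 - sum(|a_i - b_i|) / sum(a_i + b_i); labels missing
    from one composition are treated as zero cover.
    """
    labels = set(a) | set(b)
    numerator = sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in labels)
    denominator = sum(a.get(k, 0.0) + b.get(k, 0.0) for k in labels)
    if denominator == 0:
        raise ValueError("both compositions are empty")
    return 1.0 - numerator / denominator


# Hypothetical observer and machine compositions for one transect.
observed = {"hard_coral": 0.5, "algae": 0.3, "other": 0.2}
estimated = {"hard_coral": 0.4, "algae": 0.4, "other": 0.2}
similarity = bray_curtis_similarity(observed, estimated)
```

A value of 1 means the two compositions are identical, and 0 means they share no labels.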
| Measurement(s) | ecosystem • coral reef • composition |
| Technology Type(s) | automated image annotation • machine learning |
| Factor Type(s) | year of data collection • geographic location |
| Sample Characteristic - Organism | Anthozoa • Algae • Porifera |
| Sample Characteristic - Environment | marine coral reef biome • marine coral reef fore reef |
| Sample Characteristic - Location | Atlantic Ocean • Eastern Australia • Indian Ocean • Southeast Asia • Pacific Ocean • Great Barrier Reef |