Alejandra Noreña-P, Andrea González Muñoz, Jeanneth Mosquera-Rendón, Kelly Botero, Marco A Cristancho.
Abstract
BACKGROUND: Latin America harbors some of the most biodiverse countries in the world, including Colombia. Despite the increasing use of cutting-edge genomics and bioinformatics technologies in several biological science fields around the world, the region has fallen behind in applying these approaches to biodiversity studies. In this study, we used data mining methods to search four main public databases of genetic sequences: NCBI Nucleotide, NCBI BioProject, the Pathosystems Resource Integration Center (PATRIC), and the Barcode of Life Data Systems (BOLD Systems). We aimed to determine how much of Colombian biodiversity is represented in the genetic data stored in these public databases and how much of this information has been generated by national institutions. Additionally, we compared these data for Colombia with those of other highly biodiverse Latin American countries: Brazil, Argentina, Costa Rica, Mexico, and Peru.
Keywords: Big data; Biodiversity; Data mining; Latin America; Molecular databases
Year: 2018 PMID: 30537922 PMCID: PMC6288850 DOI: 10.1186/s12864-018-5194-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Search fields for the respective databases in order to filter the entries produced at the national level
| Databases | Search fields |
|---|---|
| Nucleotide | Journal |
| BioProject | <Submission |
| BOLD Systems | Institution_storing |
| PATRIC | Sequencing_center |
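The filtering idea behind the table above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes records are already available as dictionaries keyed by field name, the function names `is_national` and `filter_national` are hypothetical, the BioProject field is written as `Submission` (an assumed spelling of that table entry), and the example records are made up.

```python
# Field used per database to identify nationally produced entries
# (taken from the table above). "Submission" for BioProject is an
# assumed spelling of that table entry.
SEARCH_FIELDS = {
    "Nucleotide": "Journal",
    "BioProject": "Submission",
    "BOLD Systems": "Institution_storing",
    "PATRIC": "Sequencing_center",
}


def is_national(record, database, keywords):
    """Return True if the database-specific field mentions any keyword."""
    field = SEARCH_FIELDS[database]
    value = str(record.get(field, "")).lower()
    return any(keyword.lower() in value for keyword in keywords)


def filter_national(records, database, keywords):
    """Keep only the records whose search field matches a keyword."""
    return [r for r in records if is_national(r, database, keywords)]


# Made-up records for illustration only; not data from the study.
example_records = [
    {"id": "rec-1", "Institution_storing": "Example University, Bogota, Colombia"},
    {"id": "rec-2", "Institution_storing": "Example Museum, Lima, Peru"},
    {"id": "rec-3"},  # field missing: treated as not national
]

national = filter_national(example_records, "BOLD Systems", ["colombia"])
print([r["id"] for r in national])
```

A plain substring match is the simplest choice here; in practice a curated list of institution names would likely be needed, since not every national institution includes the country name.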
Fig. 1 Main Colombian institutes that submit data to the Nucleotide (NCBI) database (Release 219.0 of April 15, 2017). The values shown were compiled and analyzed in this study
Species richness values for six Latin American countries, classified by taxonomic group, and percentage of species represented at the national level for each country relative to the referenced species richness values shown. The percentage values shown were compiled and analyzed in this study. Data were obtained from GenBank (Nucleotide) release 219.0 of April 15, 2017
| Country | Mammals | Birds | Reptiles | Amphibians | Vascular Plants | References |
|---|---|---|---|---|---|---|
| Species richness value per taxonomic group | | | | | | |
| Colombia | 492 | 1921 | 606 | 803 | 51,220 | [ |
| Brazil | 701 | 1712 | 793 | 1042 | 56,215 | [ |
| Mexico | 564 | 1113 | 922 | 382 | 26,071 | [ |
| Argentina | 386 | 1049 | 364 | 439 | 10,593 | [ |
| Costa Rica | 249 | 918 | 259 | 205 | 12,119 | [ |
| Peru | 441 | 1781 | 484 | 592 | 17,144 | [ |
| Percentage of species representation at the national level (Nucleotide database) compared to species richness per taxonomic group | | | | | | |
| Colombia | 9.35% | 4.79% | 2.48% | 13.33% | 1.13% | |
| Brazil | 10.27% | 1.58% | 0.38% | 7.10% | 0.82% | |
| Mexico | 30.32% | 25.34% | 12.36% | 35.60% | 5.53% | |
| Argentina | 26.94% | 5.53% | 38.19% | 44.87% | 12.90% | |
| Costa Rica | 1.61% | 0.11% | 2.32% | 0% | 1.50% | |
| Peru | 1.13% | 0.06% | 1.03% | 0.51% | 0.27% | |
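To make the percentages in the table above more concrete, they can be converted back into approximate species counts by multiplying each percentage by the corresponding richness value. The sketch below does this for the Colombia row; the function name `represented_species` is hypothetical, and the counts are back-calculated estimates rather than figures reported in the study.

```python
def represented_species(richness, percent):
    """Approximate number of species represented, given the richness
    value and the percentage of representation from the table."""
    return round(richness * percent / 100)


# Colombia row: (species richness, percentage represented at the national level)
colombia = {
    "Mammals": (492, 9.35),
    "Birds": (1921, 4.79),
    "Reptiles": (606, 2.48),
    "Amphibians": (803, 13.33),
    "Vascular Plants": (51220, 1.13),
}

for group, (richness, percent) in colombia.items():
    print(group, represented_species(richness, percent))
```

For example, 9.35% of 492 mammal species corresponds to roughly 46 species.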
Fig. 2 Colombian institutes that submit data to the BioProject (NCBI) database (consulted June 2017). The values shown were compiled and analyzed in this study
Fig. 3 Number of records for each taxonomic supergroup in the BioProject (NCBI) database (consulted June 2017) for the six Latin American countries surveyed. The values shown were compiled and analyzed in this study
Fig. 4 Colombian institutes that submit data to the BOLD Systems database and representation of phyla per institute (consulted June 2017). The values shown were compiled and analyzed in this study
Number of DNA barcode sequence records deposited at the national level in the BOLD Systems database (consulted June 2017), classified by taxonomic group, for the six Latin American countries surveyed. Values for each taxonomic group are the number of national records, with the percentage of total national records in parentheses. The values shown were compiled and analyzed in this study
| Country | Total national records | Mammals | Birds | Reptiles | Amphibians | Vascular Plants | Insects |
|---|---|---|---|---|---|---|---|
| Colombia | 1673 | No records | 281 (16.8%) | 253 (15.1%) | 655 (39.2%) | 254 (15.2%) | 127 (7.6%) |
| Brazil | 19,330 | 546 (2.8%) | 1150 (5.9%) | 1 (0.000005%) | 1662 (8.6%) | 639 (3.3%) | 5392 (28.0%) |
| Mexico | 62,096 | 2458 (4.0%) | 1105 (1.8%) | 59 (0.0009%) | 84 (0.001%) | 1908 (3.1%) | 13,693 (22.1%) |
| Argentina | 90,055 | 960 (1.1%) | 2651 (3.0%) | 581 (0.6%) | 79 (0.0009%) | 2 (0.00002%) | 79,298 (88.1%) |
| Costa Rica | 101,266 | No records | No records | 3 (0.00003%) | 17 (0.0002%) | 7563 (7.5%) | 93,387 (92.2%) |
| Peru | 3438 | 13 (3.8%) | 1715 (49.9%) | No records | 12 (3.5%) | 919 (26.7%) | 312 (9.1%) |
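The parenthesized values in the table above are each group's share of a country's total national records. A minimal sketch of that arithmetic, assuming the share is simply the group count divided by the total and multiplied by 100 (the function name `percent_of_total` is hypothetical), is:

```python
def percent_of_total(count, total):
    """Share of a country's national records held by one taxonomic group, in percent."""
    return 100 * count / total


# (group count, total national records) pairs taken from the table above
examples = {
    "Colombia, Birds": (281, 1673),
    "Costa Rica, Insects": (93387, 101266),
    "Peru, Birds": (1715, 3438),
}

for label, (count, total) in examples.items():
    print(f"{label}: {percent_of_total(count, total):.1f}%")
```

Recomputing cells this way also serves as a quick consistency check between the record counts and the printed percentages.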
Fig. 5 Colombian institutes that submit data to the PATRIC database (consulted June 2017). The values shown were compiled and analyzed in this study