Conrad Omonhinmin, Chinedu Onuselogu.
Abstract
The volume of ribulose-bisphosphate carboxylase (rbcL) gene sequence data in the molecular data repository has increased significantly over the years, with contributions from different parts of the world. The abundance of rbcL sequence data has enhanced the gene's applications in several ways. Bulk records were obtained from the National Center for Biotechnology Information (NCBI) GenBank database using the Entrez efetch utility as implemented in the Biopython package version 1.77. Records matching the query "rbcL AND plants [filter] AND biomol_genomic [PROP] AND is _nuccore [filter]" were generated. The generated records were cleaned and then further analysed using the code file in the supplementary materials. Country information was obtained by searching the reference information of each record for matches to countries present in the pycountry package. Where no match was found, null was returned. This data article contains information about the plant families and species whose rbcL gene sequences have been deposited at NCBI and the regions of the world that have contributed to the growth of the rbcL repository. The data can be used to analyse the intra- and inter-family relatedness of plants and compare it with existing relationships, and can support the molecular characterization of plants, evolutionary relationship studies, and inference of the biogeographic origin of plants.
Keywords: Biogeography; Evolutionary; Molecular repository; Phylogeny; rbcL gene
Year: 2022 PMID: 35402671 PMCID: PMC8987485 DOI: 10.1016/j.dib.2022.108090
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
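The country-matching step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' supplementary code: the function name `find_country`, the stand-in list `SAMPLE_COUNTRIES`, and the example reference strings are all hypothetical. In the described workflow the country names come from the pycountry package; a short hard-coded list is used here so the sketch runs without it.

```python
import re


def find_country(reference_text, country_names):
    """Return the first country name that appears in the text, or None."""
    for name in country_names:
        # Whole-word, case-insensitive match so "Niger" does not hit "Nigeria".
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, reference_text, flags=re.IGNORECASE):
            return name
    return None  # mirrors the "null" returned when no country matches


# Hypothetical stand-in list. With pycountry installed, the full list could be
# built as: [country.name for country in pycountry.countries]
SAMPLE_COUNTRIES = ["Niger", "Nigeria", "China", "Brazil"]

# Made-up reference strings, for illustration only.
example = "Submitted (01-JAN-2020) Department of Botany, Example University, Nigeria"
matched = find_country(example, SAMPLE_COUNTRIES)
unmatched = find_country("Submitted (01-JAN-2020) Example Institute", SAMPLE_COUNTRIES)
```

Matching on whole words is a design choice of this sketch; plain substring matching would assign records mentioning "Nigeria" to "Niger" whenever the shorter name is checked first.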
Fig. 1. Most studied plant families with rbcL gene sequence in GenBank.
*The numbers indicate the number of species in each family with an rbcL gene sequence deposited in NCBI GenBank.
*NB: The study found a total of 808 plant families with rbcL gene sequences submitted to NCBI GenBank, too many to include in the tree map in Fig. 1; hence only the plant families with the most rbcL gene submissions are shown in Fig. 1.
Fig. 2. Percentage of plant phyla with rbcL gene data deposited on GenBank.
Fig. 3. Percentage of rbcL sequences contribution from different regions.
Fig. 4. Countries with higher submissions of rbcL sequences on the GenBank repository.
Fig. 5. Map showing global concentration of rbcL sequence contribution to GenBank repository.
*Regions in dark blue have a higher contribution of rbcL gene sequences to NCBI GenBank.
| Subject | Biological sciences |
| Specific subject area | Molecular phylogenetics, Phylogeny and Evolution |
| Type of data | Text, Table, Chart, Figure |
| How data were acquired | Biopython package version 1.77 was used to retrieve the |
| Data format | Raw, Analysed and Filtered. |
| Description of data collection | Bulk data were obtained from NCBI GenBank database using the entrez efetch utility as implemented in the Biopython package version 1.77. Datasets that do not have the matching words |
| Data source location | The data were obtained from the NCBI GenBank database. |
| Data accessibility | With the article. |
| Repository name | Mendeley Data |
| Data identification number | |
| Direct link to the dataset: |
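
The bulk retrieval summarised in the table above can be sketched roughly as follows, using Biopython's `Bio.Entrez` module. This is an assumption-laden illustration rather than the authors' script: the source only states that the Entrez efetch utility was used, so the preceding `esearch` call with the history server, the batch size, the `nuccore` database name, and the helper names `batch_starts` and `fetch_genbank_records` are choices made here. The query string is copied as printed in the abstract.

```python
# Query string copied as printed in the abstract (spacing included).
QUERY = "rbcL AND plants [filter] AND biomol_genomic [PROP] AND is _nuccore [filter]"


def batch_starts(total, batch_size):
    """Return the retstart offsets needed to page through `total` records."""
    return list(range(0, total, batch_size))


def fetch_genbank_records(query, email, batch_size=500):
    """Yield GenBank-format text, one batch at a time (needs network access)."""
    # Imported here so the helpers above work without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks callers to identify themselves

    # Assumption: run the search once and keep the result set on NCBI's
    # history server, then page through it with efetch.
    handle = Entrez.esearch(db="nuccore", term=query, usehistory="y", retmax=0)
    search = Entrez.read(handle)
    handle.close()

    total = int(search["Count"])
    for start in batch_starts(total, batch_size):
        handle = Entrez.efetch(
            db="nuccore",
            rettype="gb",
            retmode="text",
            retstart=start,
            retmax=batch_size,
            webenv=search["WebEnv"],
            query_key=search["QueryKey"],
        )
        yield handle.read()
        handle.close()
```

A caller would iterate over `fetch_genbank_records(QUERY, "you@example.org")`, with a real contact address in place of the placeholder, and write each batch to disk before cleaning and analysis.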