Data on taxonomic status and phylogenetic relationship of tits.

Xue-Juan Li, Li-Liang Lin, Ai-Ming Cui, Jie Bai, Xiao-Yang Wang, Chao Xin, Zhen Zhang, Chao Yang, Rui-Rui Gao, Yuan Huang, Fu-Min Lei.

Abstract

The data in this paper are related to the research article entitled "Taxonomic status and phylogenetic relationship of tits based on mitogenomes and nuclear segments" (X.J. Li et al., 2016) [1]. The mitochondrial genomes and nuclear segments of tits were sequenced to analyze mitochondrial characteristics and phylogeny. The analyzed results are presented here: the data comprise the resulting files on mitochondrial characteristics, heterogeneity, best partitioning schemes, and trees.

Year: 2016; PMID: 28050581; PMCID: PMC5192249; DOI: 10.1016/j.dib.2016.11.079

Source DB: PubMed; Journal: Data Brief; ISSN: 2352-3409


Value of the data

The provided files of comparative mitochondrial characteristics of tits can serve as a basis for further summaries of these characteristics.

The files of phylogenetic relationships can help further study of the phylogeny of tits and of Passeriformes more broadly.

The provided ‘.tree’ files can be compared directly with other results.

Data

In the data, Fig. 1 and Fig. 2 show the base compositions and conserved site percentages of tits, respectively. Fig. 3 shows the result of the heterogeneity analysis. Fig. 4 shows the gene trees and a species tree. Table 1 describes the taxonomic samples. Table 2 lists the primer sequences. Table 3 gives the P-distances based on the mitogenome dataset. Table 4 shows the best schemes.
Fig. 1

Nucleotide compositions of different mitochondrial partitions in 10 tit species. Note: AT-skew ([A−T]/[A+T]), GC-skew ([G−C]/[G+C]), PCG-1st (the first codon positions of protein-coding genes), PCG-2nd (the second codon positions of protein-coding genes), PCG-3rd (the third codon positions of protein-coding genes), tRNA-H (the tRNA genes on H-strand), tRNA-L (the tRNA genes on L-strand).
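The skew definitions in the note to Fig. 1 can be illustrated with a minimal Python sketch. The function name `skews` and the toy sequence are assumptions made for illustration; this is not the MEGA workflow used in the study, and ambiguous bases are simply ignored here.

```python
def skews(seq):
    """Return (AT-skew, GC-skew) for a nucleotide sequence.

    AT-skew = (A - T) / (A + T), GC-skew = (G - C) / (G + C).
    Characters other than A, T, G, C are ignored (an assumption of this sketch).
    """
    s = seq.upper()
    a, t, g, c = (s.count(base) for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew


# Toy sequence: A=3, T=1, G=1, C=3
at_skew, gc_skew = skews("AAATGCCC")
print(f"AT-skew = {at_skew:.3f}, GC-skew = {gc_skew:.3f}")
# prints "AT-skew = 0.500, GC-skew = -0.500"
```

A positive AT-skew means more A than T on the analyzed strand, and a negative GC-skew means more C than G.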

Fig. 2

Conserved site percentages of mitochondrial genes among 10 tit species.
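As a rough illustration of what a conserved site percentage measures, the sketch below counts the alignment columns in which every sequence carries the same character. The function name and the toy alignment are hypothetical, and gaps are treated like any other character, which may differ from the settings used in MEGA.

```python
def conserved_site_percentage(alignment):
    """Percentage of alignment columns in which all sequences are identical.

    `alignment` is a list of equal-length aligned sequences.
    Gaps are compared like ordinary characters (an assumption of this sketch).
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1 or 0 in lengths:
        raise ValueError("sequences must be non-empty and of equal length")
    columns = list(zip(*(seq.upper() for seq in alignment)))
    conserved = sum(1 for column in columns if len(set(column)) == 1)
    return 100.0 * conserved / len(columns)


# Toy alignment: columns 1, 2, 3 and 6 are conserved (4 of 6 sites)
toy = ["ACGTAC", "ACGTTC", "ACGAAC"]
print(f"{conserved_site_percentage(toy):.1f}%")  # prints "66.7%"
```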

Fig. 3

The heterogeneity analyzed by AliGROOVE. Note: The heterogeneity decreases continuously from −1 (red coloring) to +1 (blue coloring). A: the first and second codon positions of protein-coding genes, B: protein-coding genes with the third codon positions not using the RY-coding method, C: mitochondrial genome with the third codon positions not using the RY-coding method, D: nuclear segments dataset. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article).

Fig. 4

The gene trees and the species tree inferred with ASTRAL. Note: The gene trees (A–F) were constructed with the maximum likelihood method. A: MOS; B: FGB; C: ALDOB; D: PCBD1; E: CALB1; F: mitochondrial genome; G: species tree.

Table 1

Taxonomic samples in the study.

The columns Mitogenome, MOS, FGB, ALDOB, PCBD1 and CALB1 give GenBank accession Nos.

| Family | Genus | Species and subspecies | Sample locality/source | Mitogenome | MOS | FGB | ALDOB | PCBD1 | CALB1 |
|---|---|---|---|---|---|---|---|---|---|
| Paridae | Parus | Parus major | Beach forestry centre, Zhouqu County, Gansu Province | KX388473 | KX388398 | KX388413 | KX388428 | KX388443 | KX388458 |
| Paridae | Parus | Parus major | Baihualing, Gaoligongshan, Yunnan Province | KX388480 | KX388405 | KX388420 | KX388435 | KX388450 | KX388465 |
| Paridae | Parus | Parus monticolus | Beach forestry centre, Zhouqu County, Gansu Province | KX388474 | KX388399 | KX388414 | KX388429 | KX388444 | KX388459 |
| Paridae | Parus | Parus monticolus | Dahaoping, Gaoligongshan, Yunnan Province | KX388481 | KX388406 | KX388421 | KX388436 | KX388451 | KX388466 |
| Paridae | Poecile | Parus montanus | Liancheng, Yongdeng County, Gansu Province | KX388478 | KX388403 | KX388418 | KX388433 | KX388448 | KX388463 |
| Paridae | Poecile | Parus montanus baicalensis | Maoershan, Shangzhi City, Heilongjiang Province | KX388479 | KX388404 | KX388419 | KX388434 | KX388449 | KX388464 |
| Paridae | Poecile | Parus palustris | Beach forestry centre, Zhouqu County, Gansu Province | KX388475 | KX388400 | KX388415 | KX388430 | KX388445 | KX388460 |
| Paridae | Cyanistes | Parus cyanus | Kizil, Baicheng County, Xinjiang | KX388472 | KX388397 | KX388412 | KX388427 | KX388442 | KX388457 |
| Paridae | Machlolophus | Parus spilonotus | Longqishan Nature Reserve, Fujian Province | KX388476 | KX388401 | KX388416 | KX388431 | KX388446 | KX388461 |
| Paridae | Lophophanes | Parus dichrous | Sanguanmiao, Shaanxi Province | KX388477 | KX388402 | KX388417 | KX388432 | KX388447 | KX388462 |
| Paridae | Periparus | Parus ater | Wen County, Gansu Province | NC_026223 | KX388408 | KX388423 | KX388438 | KX388453 | KX388468 |
| Paridae | Pardaliparus | Parus venustulus | Yangxin County, Huangshi City, Hubei Province | NC_026701 | KX388410 | KX388425 | KX388440 | KX388455 | KX388470 |
| Paridae | Pseudopodoces | Pseudopodoces humilis | Bird Island, Qinghai Lake, Qinghai Province | KP001174 | KX388407 | KX388422 | KX388437 | KX388452 | KX388467 |
| Paridae | Sylviparus | Sylviparus modestus | Luding County, Sichuan Province | NC_026793 | KX388409 | KX388424 | KX388439 | KX388454 | KX388469 |
| Remizidae | Remiz | Remiz consobrinus | Xinxing Town, Panjin City, Liaoning Province | NC_021641 | KX388411 | KX388426 | KX388441 | KX388456 | KX388471 |

Table 2

The primers used in this study.

| Name | Sequence (5′–3′) | Name | Sequence (5′–3′) |
|---|---|---|---|
| L1263^b,a | AAAGCATRRCACTGAA | H10343^b | TGGGCTCATGTGACKGTRACKCC |
| H1859^b | TCGATTACAGAACAGGCTCCTCTA | L10236^b | TTCTGAGCMTTCTTCCAYTCMAG |
| L1754^b | TGGGATTAGATACCCCACTATG | H10884^b,a | GGGTCRAAWCCRCATTCGTATGG |
| H2294^b | TTTCAGGTGTAAGCTGAATGCTT | L10635^b,a | CACCACTTYGGCTTYGARGCAGC |
| L2260^b,a | CAAGGTAAGTGTACCGGAAGGTG | H11837^b | ARGGTKGCTTCRAATGCRATRTARAA |
| H2891^b,a | TGATGGCTGCTTRARGGCCCAC | L11458^b | TCYACCCGAACYCACGGCTCMGA |
| L2725^b | CGAGCCGGGTGATAGCTGG | H12344^b | CTATGTGGCTKACKGAKGAGTAKGC |
| H3292^b | TGATTGCGCTACCTTTGCACGG | L12156^b | CCHAAAGCMCACGTAGAAGCMCC |
| L3218^b | CGACTGTTTACCAAAAACATAGCC | H13047^b | CTTTTACTTGGATTTGCACCAA |
| H3784^b | CGGTCTGAACTCAGATCACG | L13040^b,a | ATCCAATGGTCTTAGGARCCA |
| L3722^b | GGTTTACGACCTCGATGTTGG | H13563^b,a | TGRAGGGCDGCRGTGTTRGC |
| H4170^b | CCYACRATRTTTGGGCCTTTKCG | L13525^b | GMTGAGAAGGRGTAGGAATCATATC |
| L3803^b | CTACGTGATCTGAGTTCAGACCG | H14127^b | CCTATTTTTCGRATGTCYTGTTC |
| H4644^b | TCRAATGGGGCRCGRTTTGTYTC | L14080^b | TCAACYCACGCATTCTTYAARGC |
| L4500^b | GTAGCCCAAACAATCTCMTAYGARG | H15049^b | GTGTCTGCTGTGTAGTGYATDGC |
| H5201^b | CCATCATTTTCGGGGTATGG | L14770^b | TMGGMCCAGAAGGAYTVGC |
| L5143^b | GAACCTRCACWARAGRGATCAAAAC | H15295^b | CCTCAGAATGATATTTGKCCTCAKGG |
| H5766^b | GGAKGAGAAGGCTAKGATTTTTCG | L14996^b | AACATCTCADCHTGATGAAACTTYGG |
| L5758^b | GGRGGMTGAATAGGMCTAAACCARAC | H15646^b | GGYGTGAARTTTTCTGGRTCTCC |
| H6681^b,a | GGTATAGGGTDCCRATGTCTTTRTG | L15413^b | GGWGGATTYTCAGTAGACAACCC |
| L6615^b,a | CCTCTGTAAAAAGGACTACAGCC | H16064^b,a | CTTCAATCTTTGGYTTACAAGACC |
| H7122^b | GCTGTTGTRATGAAGTTGATDGCYCC | L15725^b,a | AAACCHGAATGATACTTCCTMTTYGC |
| L7036^b | GGAACAGGATGAACYGTNTACCC | H1530^b,a | GGTGGCTGGCACARGATTTACC |
| H7548^b | GTRGCGGATGTRAAGTATGCTCG | CMOSF | GCCTGGTGCTCCATCGACTGG |
| L7525^b | GTNTGAGCMCACCACATRTTYAC | CMOSR | GCAAATGAGTAGATGTCTGCT |
| H8121^b | GGGCAGCCGTGRATTCATTC | FIB4F | CTGTAATATCCCGGTGGTTTCAGG |
| L7987^b | TCAGACTACCCAGAYGCCTAYAC | FIB4R | ATTTCAGATGTTTCACCTCCCTTTC |
| H8628^b | TCGTAGGWTCAGTATCATTGRTGNCC | AldB6F | GAGCCAGAAGTCTTACCTGAYGG |
| L8386^b | GCYTCATCMCCYATCATAGAAGA | AldB7R | CAGCTGTCACCATGTTNGG |
| H9235^b | TCGAAGAAGCTTAGGTTCATGGTCA | DCOH3F | AGGCCTGGCTTCATGAC |
| L8929^b | GGMCAATGCTCAGAAATYTGYGG | DCOH4R | GATAAACCYGTGCARTCYTGGGTGCT |
| H9726^b | AGRTGKCCTGCTGTNAGRTTNGC | Cal9F | AGGGTGTCAARATGTGTGSGAAAGA |
| L9700^b | GAAACAACAAGCCTACTHATYCGHCC | Cal11R | GTANAGCTTCCCTCCATCNGACAA |

^a Means the primers used in LA-PCR.

Table 3

The P-distance based on mitogenome dataset.

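Species (column numbers correspond to the row numbers)

| No. | Species | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Parus cyanus | | | | | | | | | | | | |
| 2 | Parus major | 0.085 | | | | | | | | | | | |
| 3 | Parus monticolus | 0.084 | 0.051 | | | | | | | | | | |
| 4 | Parus palustris | 0.092 | 0.087 | 0.088 | | | | | | | | | |
| 5 | Parus spilonotus | 0.090 | 0.084 | 0.083 | 0.094 | | | | | | | | |
| 6 | Parus dichrous | 0.096 | 0.089 | 0.089 | 0.080 | 0.096 | | | | | | | |
| 7 | Parus montanus | 0.091 | 0.088 | 0.088 | 0.038 | 0.093 | 0.080 | | | | | | |
| 8 | Parus montanus baicalensis | 0.092 | 0.089 | 0.088 | 0.038 | 0.094 | 0.080 | 0.021 | | | | | |
| 9 | Parus major | 0.086 | 0.021 | 0.052 | 0.087 | 0.084 | 0.090 | 0.088 | 0.088 | | | | |
| 10 | Parus monticolus | 0.085 | 0.052 | 0.010 | 0.088 | 0.083 | 0.089 | 0.088 | 0.088 | 0.053 | | | |
| 11 | Pseudopodoces humilis | 0.096 | 0.091 | 0.090 | 0.097 | 0.094 | 0.101 | 0.100 | 0.099 | 0.091 | 0.091 | | |
| 12 | Parus ater | 0.093 | 0.090 | 0.089 | 0.080 | 0.094 | 0.082 | 0.078 | 0.079 | 0.090 | 0.089 | 0.102 | |
| 13 | Parus venustulus | 0.092 | 0.089 | 0.086 | 0.077 | 0.092 | 0.080 | 0.077 | 0.076 | 0.088 | 0.086 | 0.099 | 0.071 |

Genus (column numbers correspond to the row numbers)

| No. | Genus | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| 1 | Cyanistes | | | | | | | |
| 2 | Parus | 0.085 | | | | | | |
| 3 | Poecile | 0.092 | 0.088 | | | | | |
| 4 | Machlolophus | 0.090 | 0.083 | 0.094 | | | | |
| 5 | Lophophanes | 0.096 | 0.089 | 0.080 | 0.096 | | | |
| 6 | Pseudopodoces | 0.096 | 0.091 | 0.099 | 0.094 | 0.101 | | |
| 7 | Periparus | 0.093 | 0.089 | 0.079 | 0.094 | 0.082 | 0.102 | |
| 8 | Pardaliparus | 0.092 | 0.087 | 0.077 | 0.092 | 0.080 | 0.099 | 0.071 |
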
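For readers unfamiliar with the measure in Table 3, a P-distance is the proportion of compared sites at which two aligned sequences differ. The following is a minimal sketch under stated assumptions: the function name and toy sequences are hypothetical, and sites where either sequence has a gap or an ambiguous base are skipped, which may not match the gap treatment chosen in MEGA for the study.

```python
def p_distance(seq1, seq2):
    """Proportion of compared sites at which two aligned sequences differ.

    Only sites where both sequences carry A, C, G or T are compared
    (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x in valid and y in valid:
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Two differences over ten compared sites
print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))  # prints 0.2
```

On this scale, the value of 0.085 between Cyanistes and Parus in Table 3 corresponds to differences at 8.5% of the compared sites.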
Table 4

Best schemes analyzed by PartitionFinder.

| Dataset | Subset | Subset partitions | Optimal model |
|---|---|---|---|
| Protein-coding genes | P1 | atp6_pos1, nad1_pos1, nad2_pos1, nad3_pos1, nad4L_pos1, nad4_pos1, nad5_pos1 | GTR+I+G |
| Protein-coding genes | P2 | atp6_pos2, atp8_pos2, cox3_pos2, cox2_pos2, cox1_pos2, cytb_pos2, nad1_pos2, nad2_pos2, nad3_pos2, nad4L_pos2, nad4_pos2, nad5_pos2 | GTR+I+G |
| Protein-coding genes | P3 | atp6_pos3, atp8_pos1, atp8_pos3, cox3_pos3, cox2_pos3, cox1_pos3, cytb_pos3, nad1_pos3, nad2_pos3, nad3_pos3, nad4L_pos3, nad4_pos3 | GTR+G |
| Protein-coding genes | P4 | cox3_pos1, cox2_pos1, cox1_pos1, cytb_pos1 | GTR+I+G |
| Protein-coding genes | P5 | nad5_pos3, nad6_pos3 | GTR+G |
| Protein-coding genes | P6 | nad6_pos1, nad6_pos2 | GTR+G |
| Mitogenomes | P1 | rrnS, rrnL, atp6_pos1, nad1_pos1, nad2_pos1, nad3_pos1, nad4L_pos1, nad4_pos1, nad5_pos1, trnR, trnD, trnG, trnH, trnI, trnK, trnM, trnF, trnS(agy), trnW, trnV | GTR+I+G |
| Mitogenomes | P2 | atp6_pos2, atp8_pos2, cox3_pos2, cox2_pos2, cox1_pos2, cytb_pos2, nad1_pos2, nad2_pos2, nad3_pos2, nad4L_pos2, nad4_pos2, nad5_pos2 | GTR+I+G |
| Mitogenomes | P3 | atp6_pos3, atp8_pos3, cox3_pos3, cox2_pos3, cox1_pos3, cytb_pos3, D_loop, nad1_pos3, nad2_pos3, nad3_pos3, nad4L_pos3, nad4_pos3 | GTR+I+G |
| Mitogenomes | P4 | cox3_pos1, cox2_pos1, cox1_pos1, cytb_pos1, trnN, trnL(uur), trnL(cun), trnS(ucn), trnT, trnY | GTR+I+G |
| Mitogenomes | P5 | atp8_pos1, nad5_pos3, nad6_pos3 | GTR+I+G |
| Mitogenomes | P6 | nad6_pos1, nad6_pos2, trnA, trnC, trnQ, trnE, trnP | GTR+I+G |
| Nuclear segments | P1 | ALDOB_exon, CALB1_exon, MOS_exon, PCBD1_exon, PCBD1_intron, FGB_exon | GTR+I+G |
| Nuclear segments | P2 | ALDOB_intron, CALB1_intron, FGB_intron | GTR+G |
Note: Pos1, pos2, and pos3 indicate the first, second and third codon positions of protein-coding genes in mitogenomes, respectively.
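The pos1, pos2 and pos3 data blocks named in Table 4 can be pictured as every third site of an in-frame coding alignment. The sketch below is a toy illustration only: the function name is hypothetical, and it assumes the sequence starts at the first codon position and contains no frame-shifting gaps.

```python
def split_codon_positions(cds):
    """Split an in-frame coding sequence into its three codon positions.

    Returns a dict with keys 'pos1', 'pos2' and 'pos3', mirroring the
    suffixes used for the data blocks in Table 4.
    """
    return {f"pos{i + 1}": cds[i::3] for i in range(3)}


# Toy coding sequence with codons ATG, GCC, TAA
blocks = split_codon_positions("ATGGCCTAA")
print(blocks)  # prints {'pos1': 'AGT', 'pos2': 'TCA', 'pos3': 'GCA'}
```

Applied to nad1, for example, the three returned strings would correspond to the blocks nad1_pos1, nad1_pos2 and nad1_pos3.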

Experimental design, materials and methods

This study sampled 13 individuals of tits, with Sylviparus modestus and Remiz consobrinus as outgroups (Table 1). Each gene was aligned independently in MUSCLE [3]. The mitochondrial characteristics, including A+T contents, conserved site percentages and P-distances, were analyzed in MEGA 4.1 [2]; the results are shown in Fig. 1, Fig. 2 and Table 3, respectively. Heterogeneity was analyzed in AliGROOVE [5] for four datasets (Fig. 3): A, the first and second codon positions of protein-coding genes; B, protein-coding genes with the third codon positions not using the RY-coding method; C, 37 mitochondrial genes with the third codon positions of protein-coding genes not using the RY-coding method, plus one control region; D, five nuclear segments. The best schemes were analyzed in PartitionFinder v1.1.1 [4], and the results are given in Table 4. The gene trees in Fig. 4 were constructed in RAxML 7.0.3 [6] with 1000 replications, and these gene trees were then used to construct a species tree in ASTRAL [7].
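RY-coding, as referred to for the third codon positions, replaces purines (A, G) with R and pyrimidines (C, T) with Y, so that only transversions remain visible at the recoded sites. The following minimal sketch shows the recoding itself; the function name is hypothetical, the sequence is assumed to be in frame, and it does not reproduce the exact dataset preparation of the study.

```python
def ry_code_third_positions(cds):
    """Recode the third codon positions of an in-frame coding sequence.

    A and G become R (purine), C and T become Y (pyrimidine); first and
    second codon positions, gaps and ambiguous bases are left unchanged.
    """
    recoded = []
    for index, base in enumerate(cds.upper()):
        if index % 3 == 2:  # third codon position
            if base in "AG":
                base = "R"
            elif base in "CT":
                base = "Y"
        recoded.append(base)
    return "".join(recoded)


# Codons ATG, GCC, TAA become ATR, GCY, TAR
print(ry_code_third_positions("ATGGCCTAA"))  # prints "ATRGCYTAR"
```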

Specifications Table

Subject area: Biology, Genetics and Genomics
More specific subject area: Phylogenetics and Phylogenomics
Type of data: Figures, Tables, Trees
How data was acquired: The A+T contents, conserved site percentages and P-distances were obtained in MEGA 4.1 [2]. Sequences were aligned in MUSCLE [3]. The best schemes were analyzed in PartitionFinder v1.1.1 [4]. The heterogeneity was inferred with AliGROOVE [5]. The gene trees based on six datasets (one mitochondrial dataset and five nuclear segments) were constructed in RAxML 7.0.3 [6]. A species tree was obtained from these gene trees in ASTRAL [7].
Data format: Analyzed
Experimental factors: The RY-coding method was employed for the third sites of protein-coding genes, while the nuclear dataset was divided into different parts (exons and introns).
Experimental features: The phylogeny employed the best schemes inferred by PartitionFinder v1.1.1 [4]. A species tree was obtained from the gene trees in ASTRAL [7].
Data source location: Shaanxi Normal University
Data accessibility: Data is with this article

References

[1] Xuejuan Li; Liliang Lin; Aiming Cui; Jie Bai; Xiaoyang Wang; Chao Xin; Zhen Zhang; Chao Yang; Ruirui Gao; Yuan Huang; Fumin Lei. Taxonomic status and phylogenetic relationship of tits based on mitogenomes and nuclear segments. Mol Phylogenet Evol, 2016-07-19.

[2] Koichiro Tamura; Joel Dudley; Masatoshi Nei; Sudhir Kumar. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol, 2007-05-07.

[3] Robert C Edgar. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res, 2004-03-19.

[4] Robert Lanfear; Brett Calcott; Simon Y W Ho; Stephane Guindon. Partitionfinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol Biol Evol, 2012-01-20.

[5] Patrick Kück; Sandra A Meid; Christian Groß; Johann W Wägele; Bernhard Misof. AliGROOVE: visualization of heterogeneous sequence divergence within multiple sequence alignments and detection of inflated branch support. BMC Bioinformatics, 2014-08-30.

[6] Alexandros Stamatakis. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 2006-08-23.

[7] S Mirarab; R Reaz; Md S Bayzid; T Zimmermann; M S Swenson; T Warnow. ASTRAL: genome-scale coalescent-based species tree estimation. Bioinformatics, 2014-09-01.
