Si Chen1, Lih-Yuan Deng2, Dale Bowman2, Jyh-Jen Horng Shiau3, Tit-Yee Wong4, Behrouz Madahian2, Henry Horng-Shing Lu5.
Abstract
BACKGROUND: It has been a challenging task to build a genome-wide phylogenetic tree for a large group of species containing a large number of genes with long nucleotide sequences. The most popular method, called feature frequency profile (FFP-k), finds the frequency distribution for all words of a certain length k over the whole genome sequence using (overlapping) windows of the same length. For a satisfactory result, the recommended word length (k) ranges from 6 to 15 and it may not be a multiple of 3 (codon length). The total number of possible words needed for FFP-k can range from 4^6 = 4096 to 4^15.
Keywords: Feature frequency profile (FFP); Phylogenetic tree construction; Reading frame; Summary statistics; Tree comparison
Year: 2016 PMID: 27766939 PMCID: PMC5073869 DOI: 10.1186/s12859-016-1222-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
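The FFP-k idea described in the abstract (counting every word of length k over overlapping windows) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ffp`, the toy sequence, and the choice to skip windows containing non-ACGT characters are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def ffp(sequence: str, k: int) -> dict[str, float]:
    """Relative frequency of every length-k word over overlapping windows.

    Hypothetical helper for illustration only; not the published FFP code.
    """
    seq = sequence.upper()
    # Slide a window of length k one base at a time (overlapping windows).
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all 4**k possible words over the A/C/G/T alphabet.
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    # Assumption: windows with ambiguous bases (e.g. N) are ignored.
    total = sum(counts[w] for w in words)
    return {w: (counts[w] / total if total else 0.0) for w in words}


profile = ffp("ATGCGATGCA", 3)  # toy sequence, not real data
print(len(profile))             # number of possible words for k = 3
print(profile["ATG"])           # share of windows equal to "ATG"
```

The profile length grows as 4^k, which is why the word count cited in the abstract rises from 4096 at k = 6 to 4^15 at k = 15; a dense dictionary like the one above would not be practical at the upper end of that range.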
Gene count and the minimum, average, and maximum of gene lengths for each of 56 species
| Strain (Species) | Gene Count | Min | Mean | Max |
|---|---|---|---|---|
| Escherichia_coli_O15_7_H7_VT2Sakai | 5361 | 45 | 903.5 | 15876 |
| Escherichia_coli_0127_H6_E2348_69 | 4703 | 45 | 929.7 | 9672 |
| Escherichia_coli_536 | 4685 | 66 | 934.7 | 9729 |
| Escherichia_coli_55989 | 4919 | 45 | 929.4 | 9492 |
| Escherichia_coli_BL21_DE3 | 4319 | 36 | 937.5 | 7104 |
| Escherichia_coli_BW2952 | 4084 | 45 | 954.8 | 7077 |
| Escherichia_coli_B_REL606 | 4209 | 45 | 953.7 | 7152 |
| Escherichia_coli_C_ATCC_8739 | 4200 | 75 | 974.7 | 6342 |
| Escherichia_coli_E24377A | 4755 | 90 | 907.1 | 6891 |
| Escherichia_coli_ED1a | 5123 | 45 | 900.6 | 9492 |
| Escherichia_coli_IAI1 | 4443 | 45 | 942.0 | 6444 |
| Escherichia_coli_IAI39 | 4892 | 45 | 931.1 | 9492 |
| Escherichia_coli_K_12_substr_DH10B | 4200 | 45 | 945.6 | 7104 |
| Escherichia_coli_K_12_substr_MG1655 | 4321 | 45 | 946.5 | 7077 |
| Escherichia_coli_K_12_substr_W3110 | 4337 | 45 | 950.7 | 8622 |
| Escherichia_coli_O157_H7_EC4115 | 5315 | 93 | 873.0 | 7863 |
| Escherichia_coli_S88 | 4847 | 45 | 924.0 | 9492 |
| Escherichia_coli_SE11 | 4679 | 45 | 929.2 | 5421 |
| Escherichia_coli_SMS_3_5 | 4743 | 75 | 935.4 | 8802 |
| Escherichia_coli_UMN026 | 4907 | 45 | 942.9 | 20778 |
| Escherichia_coli_UTI89 | 5066 | 66 | 911.3 | 9789 |
| Escherichia_fergusonii_ATCC_35469 | 4319 | 45 | 954.2 | 21669 |
| Neisseria_gonorrhoeae_FA_1090 | 2002 | 111 | 845.4 | 5934 |
| Neisseria_meningitidis_053442 | 2020 | 93 | 853.9 | 5364 |
| Neisseria_meningitidis_FAM18 | 1975 | 87 | 916.5 | 6090 |
| Neisseria_meningitidis_MC58 | 2063 | 69 | 871.9 | 8112 |
| Neisseria_meningitidis_Z2491 | 1993 | 93 | 900.1 | 6048 |
| Orientia_tsutsugamushi_Boryong | 2179 | 30 | 796.1 | 6900 |
| Rickettsia_conorii_Malish_7 | 1374 | 126 | 746.4 | 6066 |
| Rickettsia_prowazekii_Madrid_E | 834 | 126 | 1006.9 | 7023 |
| Rickettsia_akari_Hartford | 1259 | 63 | 741.9 | 5682 |
| Rickettsia_bellii_OSU_85-389 | 1476 | 78 | 831.9 | 4752 |
| Rickettsia_bellii_RML369-C | 1429 | 123 | 907.8 | 5946 |
| Rickettsia_felis_URRWXCal2 | 1400 | 123 | 889.4 | 9369 |
| Rickettsia_rickettsii_Iowa | 1384 | 54 | 701.7 | 5622 |
| Rickettsia_rickettsii_Sheila_Smith | 1345 | 63 | 713.4 | 6750 |
| Rickettsia_typhi_wilmington | 838 | 75 | 1002.1 | 6996 |
| Salmonella_enterica_serovar_Typhi_CT18 | 4395 | 42 | 910.1 | 10875 |
| Salmonella_typhimurium_LT2_SGSC1412 | 4451 | 45 | 947.6 | 16680 |
| Salmonella_enterica_Choleraesuis | 4445 | 66 | 898.3 | 16680 |
| Salmonella_enterica_Paratypi_ATCC_9150 | 4093 | 66 | 924.8 | 13683 |
| Shigella_boydii_Sb227 | 4142 | 45 | 880.2 | 4962 |
| Shigella_dysenteriae | 4277 | 45 | 789.9 | 4767 |
| Shigella_flexneri_2a_301 | 4436 | 42 | 912.4 | 5673 |
| Shigella_sonnei_Ss046 | 4224 | 45 | 919.9 | 4962 |
| Wolbachia_pipientis_wMel | 1271 | 93 | 857.0 | 8532 |
| Wolbachia_pipientis_wBm | 805 | 129 | 899.4 | 8520 |
| Yersinia_enterocolitica_8081 | 4060 | 84 | 962.1 | 9486 |
| Yersinia_pestis_Angola | 3837 | 114 | 902.1 | 9492 |
| Yersinia_pestis_Antiqua | 4167 | 69 | 949.0 | 11118 |
| Yersinia_pestis_biovar_Medievalis_91001 | 3895 | 63 | 962.3 | 11133 |
| Yersinia_pestis_CO92 | 4008 | 45 | 973.0 | 11118 |
| Yersinia_pestis_KIM_10 | 4090 | 45 | 937.8 | 11133 |
| Yersinia_pestis_Pestoides_F | 3850 | 87 | 962.9 | 13971 |
| Yersinia_pseudotuberculosis_IP32953 | 3974 | 45 | 998.5 | 16872 |
| Yersinia_pseudotuberculosis_IP_31758 | 4124 | 48 | 952.2 | 14862 |
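As a rough illustration of how the per-strain columns above (gene count, minimum, mean, and maximum gene length) could be derived from a list of gene sequences, here is a short sketch. The function name `gene_length_summary` and the toy sequences are hypothetical and do not come from the paper's data.

```python
from statistics import mean


def gene_length_summary(genes: list[str]) -> dict[str, float]:
    """Summarize gene lengths (in nucleotides) for one strain.

    Hypothetical helper; assumes `genes` is a non-empty list of sequences.
    """
    lengths = [len(g) for g in genes]
    return {
        "count": len(lengths),
        "min": min(lengths),
        "mean": round(mean(lengths), 1),  # one decimal, as in the table
        "max": max(lengths),
    }


# Toy genes of 45, 90, and 300 nucleotides (not real data).
toy_genes = ["ATG" * 15, "ATG" * 30, "ATG" * 100]
print(gene_length_summary(toy_genes))
```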
Fig. 1 Phylogenic tree based on the TUP-R1 vector
Fig. 2 Phylogenic tree based on the TUP-R2 vector
Fig. 3 Phylogenic tree based on the TUP-R3 vector
Fig. 4 Phylogenic tree based on the TUP-All vector
Fig. 5 Phylogenic tree based on the FFP method with length 3
Pairwise distances among various trees
| | | | | |
|---|---|---|---|---|
| | 12.08 | 7.97 | 11.89 | 108.36 |
| | | 16.17 | 22.90 | 117.82 |
| | | | 14.43 | 110.49 |
| | | | | 96.74 |