Tao Xu, Lingfeng Kong, Qi Li.
Abstract
Species identification has recently moved from DNA barcoding to shotgun sequencing-based "genome skimming" alternatives. Genome skims have mainly been used to assemble organelle genomes, which discards much of the nuclear genome. Recently, an alternative approach was proposed for sample identification that uses unassembled genome skims and can effectively improve phylogenetic signal and identification resolution. Studies have shown that the software tools Skmer and APPLES work well for estimating genomic distance and performing phylogenetic placement in birds and insects from low-coverage genome skims. In this study, we use Skmer and APPLES on genome skims of 11 patellogastropods to perform assembly-free and alignment-free species identification and phylogenetic placement. Whether or not data for the query species are present in the reference database, Skmer selects the best-matching or closest species in agreement with COI barcodes across genome skims of different sizes, except when the reference database lacks species from the same family as the query. APPLES cannot place patellogastropods in the correct phylogenetic position when the reference database is sparse. Our study represents the first attempt at assembly-free and alignment-free species identification of marine mollusks using genome skims, demonstrating its feasibility for patellogastropod species identification and highlighting the need to establish a database for sharing genome skims.
Keywords: genome skims; genomic distance; patellogastropoda; phylogenetic placement
Year: 2022 PMID: 35885975 PMCID: PMC9318368 DOI: 10.3390/genes13071192
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
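The genomic distance mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain k-mer Jaccard index and the standard uncorrected transform D = 1 - (2J / (1 + J))^(1/k), which follows from assuming each base mutates independently. It is not Skmer's implementation: Skmer additionally corrects for low coverage and sequencing error, and this toy version also ignores reverse complements. The function names and the toy sequences are hypothetical.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index of two k-mer sets (defined as 1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def kmer_distance(j: float, k: int) -> float:
    """Uncorrected transform from Jaccard index to distance.

    Assumption: no coverage or sequencing-error correction is applied.
    """
    return 1.0 - (2.0 * j / (1.0 + j)) ** (1.0 / k)


# Toy example with two short, made-up sequences and k = 3.
query = kmers("ACGTAC", 3)
reference = kmers("ACGTTC", 3)
toy_distance = kmer_distance(jaccard(query, reference), 3)
print(f"toy distance: {toy_distance:.4f}")
```

With real genome skims the k-mer sets are far too large to hold as Python sets of strings, which is one reason dedicated tools work on sketches of the data instead.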
List of species used in this study.
| Subclass | Family | Species | Locality |
|---|---|---|---|
| Patellogastropoda | Nacellidae | | Yangjiang, Guangdong, China |
| | | | Wenchang, Hainan, China |
| | | | Jeju Island, South Korea |
| | | | Ningde, Fujian, China |
| | Patellidae | | Sansha, Hainan, China |
| | Lottiidae | | Weihai, Shandong, China |
| | | | Weihai, Shandong, China |
| | | | Wenchang, Hainan, China |
| | | | Weihai, Shandong, China |
| | | | Weihai, Shandong, China |
| | | | Qingdao, Shandong, China |
| Vetigastropoda | Trochidae | | Sanya, Hainan, China |
Figure 1. Coverage distribution of P. saccharina lanx (Psl), P. conulus (Pco), P. ryukyuensis (Pry), C. toreuma (GD) (Cto-GD), C. toreuma (HN) (Cto-HN), C. grata (Cgr), C. nigrolineata (Cni), L. goshimai (Lgo), L. cassis (Lca), N. radula (Nra), and S. flexuosa (Sfl) under different sizes of genome skims.
Distances between C. toreuma (HN) and the reference species in this study, calculated from COI and from genome skims of different sizes. Color shows the distance rank of each reference species relative to the query species: the darker the color, the more distant the relationship.
| Reference species | COI | 0.1 Gb | 0.5 Gb | 1 Gb | 2 Gb | 4 Gb | Largest Data |
|---|---|---|---|---|---|---|---|
| | 0.003 | 0.0918 | 0.0095 | 0.0115 | 0.0131 | 0.0145 | 0.0159 |
| | 0.174 | 0.1981 | 0.1272 | 0.1334 | 0.1359 | 0.1382 | 0.1406 |
| | 0.187 | 0.1987 | 0.1236 | 0.1299 | 0.1337 | 0.1373 | 0.1397 |
| | 0.368 | | | | | | |
| | 0.384 | | 0.1985 | 0.2106 | | | |
| | 0.392 | 0.2541 | 0.1883 | 0.1955 | 0.2042 | 0.2174 | 0.2229 |
| | | 0.2480 | 0.1807 | 0.2041 | 0.2112 | 0.2230 | 0.228 |
| | | | | | | | |
| | | 0.2439 | | | 0.2116 | 0.2320 | 0.2389 |
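How a best match is read off such a table can be sketched in a few lines: for each column, the reference with the smallest distance to the query is the closest one. The sketch below uses only the four rows above whose seven values all survive, so it says nothing about rows with missing cells. The labels `ref_a` to `ref_d` are hypothetical placeholders, since the species names are not present in the table.

```python
COLUMNS = ["COI", "0.1 Gb", "0.5 Gb", "1 Gb", "2 Gb", "4 Gb", "Largest Data"]

# Complete rows only; labels are placeholders, not species names.
DISTANCES = {
    "ref_a": [0.003, 0.0918, 0.0095, 0.0115, 0.0131, 0.0145, 0.0159],
    "ref_b": [0.174, 0.1981, 0.1272, 0.1334, 0.1359, 0.1382, 0.1406],
    "ref_c": [0.187, 0.1987, 0.1236, 0.1299, 0.1337, 0.1373, 0.1397],
    "ref_d": [0.392, 0.2541, 0.1883, 0.1955, 0.2042, 0.2174, 0.2229],
}


def closest_reference(distances: dict[str, list[float]], column: int) -> str:
    """Return the label of the reference with the smallest distance in one column."""
    return min(distances, key=lambda label: distances[label][column])


# Closest reference for COI and for each genome-skim size.
best_match = {
    name: closest_reference(DISTANCES, index) for index, name in enumerate(COLUMNS)
}
for name, label in best_match.items():
    print(f"{name}: {label}")
```

Comparing the COI choice with the choice at each skim size is the kind of agreement check the abstract describes, here restricted to the surviving values.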
Distances between S. flexuosa and the reference species in this study, calculated from COI and from genome skims of different sizes. Color shows the distance rank of each reference species relative to the query species: the darker the color, the more distant the relationship.
| Reference species | COI | 0.1 Gb | 0.5 Gb | 1 Gb | 2 Gb | 4 Gb | Largest Data |
|---|---|---|---|---|---|---|---|
| | 0.216 | 0.2541 | | | 0.2174 | 0.2359 | 0.2408 |
| | 0.216 | 0.2439 | 0.1697 | 0.1806 | 0.1974 | 0.2139 | 0.2230 |
| | 0.224 | 0.2582 | 0.1926 | 0.2077 | 0.2173 | 0.2365 | |
| | 0.376 | 0.2503 | 0.1806 | 0.1850 | 0.2037 | 0.2179 | 0.2229 |
| | 0.379 | | 0.1905 | 0.2120 | 0.2194 | | 0.2379 |
| | 0.388 | 0.2480 | 0.1885 | 0.2011 | 0.2112 | 0.2218 | 0.2299 |
| | 0.408 | 0.2330 | 0.1637 | 0.1788 | 0.1881 | 0.2003 | 0.2043 |
| | 0.654 | | | | | | |
| | | 0.2480 | 0.1863 | 0.2030 | | 0.2317 | 0.2360 |
Distances between P. conulus and the reference species in this study, calculated from COI and from genome skims of different sizes. Color shows the distance rank of each reference species relative to the query species: the darker the color, the more distant the relationship.
| Reference species | COI | 0.1 Gb | 0.5 Gb | 1 Gb | 2 Gb | 4 Gb | Largest Data |
|---|---|---|---|---|---|---|---|
| | 0.152 | 0.1155 | 0.0501 | 0.0570 | 0.0656 | 0.0729 | 0.0778 |
| | 0.234 | 0.1900 | 0.1308 | 0.1375 | 0.1481 | 0.1581 | 0.1627 |
| | 0.245 | 0.1744 | 0.1044 | 0.1122 | 0.1216 | 0.1263 | 0.1288 |
| | 0.306 | 0.2176 | 0.152 | 0.1574 | 0.1666 | 0.1787 | 0.1797 |
| | 0.325 | 0.1951 | 0.1310 | 0.1434 | 0.1528 | 0.1597 | 0.1614 |
| | 0.359 | 0.2775 | | 0.2434 | 0.2394 | 0.2458 | |
| | 0.362 | | 0.2225 | | | 0.2435 | 0.2501 |
| | | 0.2630 | 0.1986 | 0.2041 | 0.2300 | | 0.2467 |
Figure 2. Phylogenetic placement of C. toreuma (query species) under different sizes of genome skims.
Figure 3. Phylogenetic placement of S. flexuosa (query species) under different sizes of genome skims.
Figure 4. Phylogenetic placement of P. conulus (query species) under different sizes of genome skims.