Pascal Bally, Jonathan Grandaubert, Thierry Rouxel, Marie-Hélène Balesdent.
Abstract
BACKGROUND: Micro- and minisatellites are among the most powerful genetic markers known to date. They have been used as tools for a large number of applications, ranging from gene mapping to phylogenetic studies and isolate typing. However, identifying micro- and minisatellite markers in large sequence data sets is often a laborious process.
Year: 2010 PMID: 21114810 PMCID: PMC3002364 DOI: 10.1186/1756-0500-3-322
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
FONZIE results for whole genomes of different fungi and oomycetes.
| Organism | Genome size (Mb) | Nb of contigs or supercontigs (a) | Execution time (b) | Markers identified (c): total | Markers identified (c): single copy | Markers identified (c): multiple copies | Amplification products (AP) and primers designed (d): single copy | AP (d): multiple copies | AP (d): no primers | AP (d): no BLAST results | % single-copy AP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | 34.85 | 24 | 2 min 53 sec | 637 | 533 | 104 | 600 | 18 | 17 | 2 | 94.19 |
| | 37.21 | 108 | 4 min 41 sec | 1288 | 978 | 310 | 959 | 154 | 169 | 6 | 74.46 |
| | 37.84 | 47 | 5 min 12 sec | 1141 | 878 | 263 | 935 | 195 | 8 | 3 | 81.94 |
| | 42.66 | 588 | 12 min 03 sec | 3648 | 2719 | 929 | 3339 | 118 | 176 | 15 | 91.53 |
| | 45.12 | 76 | 23 min 36 sec | 2606 | 1799 | 807 | 2405 | 146 | 49 | 6 | 92.29 |
| | 64.88 | 665 | 45 min 39 sec | 5393 | 1516 | 3877 | 1640 | 3409 | 338 | 6 | 30.41 |
| | 228.54 | 4921 | 3 h 21 min 37 sec | 7514 | 1855 | 5659 | 1718 | 5239 | 553 | 4 | 22.86 |
a Number of sequences in the Multifasta file
b Machine used for this test: laptop with an Intel Core 2 Duo, 2.4 GHz, and 3 GB of RAM
c FONZIE results after step d of the workflow (Figure 1), using the TRF default parameters (match = 2, indel = 7, mismatch = 7, pi = 10, pm = 80, minscore = 50, maxperiod = 500) and the screening parameters (core motif size > 3, % identity between motifs = 90%, BLAST cut-off value = 1e-10)
d Final FONZIE results after steps e (Primer pair design) and f (checking for the specificity of the amplification product) of the workflow shown in Figure 1, using a BLAST cut-off value of 1e-40
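The last column of the table can be checked against the other columns. The sketch below assumes that "% single-copy AP" is the number of single-copy amplification products divided by the total number of markers identified; that reading is inferred from the figures themselves, not stated in the table. Under the same reading, the four outcome columns (single copy, multiple copies, no primers, no BLAST results) add up to the total. The function name and tuple layout are illustrative only.

```python
# Rows copied from the table above:
# (total markers, single-copy AP, multiple-copy AP, no primers,
#  no BLAST results, reported % single-copy AP)
ROWS = [
    (637, 600, 18, 17, 2, 94.19),
    (1288, 959, 154, 169, 6, 74.46),
    (1141, 935, 195, 8, 3, 81.94),
    (3648, 3339, 118, 176, 15, 91.53),
    (2606, 2405, 146, 49, 6, 92.29),
    (5393, 1640, 3409, 338, 6, 30.41),
    (7514, 1718, 5239, 553, 4, 22.86),
]


def row_is_consistent(row, tolerance=0.01):
    """Check one row under the assumed definitions.

    Assumption 1: the four outcome columns partition the markers.
    Assumption 2: % single-copy AP = 100 * single-copy AP / total markers.
    The tolerance absorbs rounding in the published percentages.
    """
    total, single, multiple, no_primers, no_blast, reported_pct = row
    outcomes_sum_to_total = (single + multiple + no_primers + no_blast) == total
    pct = 100.0 * single / total
    return outcomes_sum_to_total and abs(pct - reported_pct) < tolerance


for row in ROWS:
    print(row[0], row_is_consistent(row))
```

All seven rows agree with this reading to within 0.01 percentage points.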
Figure 1. The graphical user interface of FONZIE. The graphical user interface is composed of five major sections. a. TRF parameters: all the modifiable parameters of TRF are located in this section. Users can modify each of them separately. b. Screening parameters: users can modify the core motif size and the percent match between core motifs. c. BLAST cut-off parameters: users can modify the two E-value cut-offs used during the screening for marker specificity and the virtual PCR steps. d. Flanking sequence size and Primer3 parameters. e. The file field: in this section, users can open the Fasta- or Multifasta-formatted sequence file, the BLAST database file, and the optional GFF-format file that contains the excluded regions.
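To make the TRF parameter section concrete, the following sketch collects the default values quoted in the table footnote (match = 2, mismatch = 7, indel = 7, pm = 80, pi = 10, minscore = 50, maxperiod = 500) and arranges them in the positional order expected by the standalone `trf` command line. How FONZIE itself invokes TRF is not described here, so the class name, function name, executable name and the choice of the `-d` (write a data file) and `-h` (suppress HTML output) flags are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrfParams:
    """TRF parameters with the defaults quoted in the table footnote."""
    match: int = 2
    mismatch: int = 7
    indel: int = 7        # called "Delta" in the TRF command-line usage
    pm: int = 80          # match probability
    pi: int = 10          # indel probability
    minscore: int = 50
    maxperiod: int = 500


def trf_command(fasta_path: str,
                params: TrfParams = TrfParams(),
                executable: str = "trf") -> List[str]:
    """Build an argument list for the standalone TRF program.

    Positional order follows TRF's usage line:
    File Match Mismatch Delta PM PI Minscore MaxPeriod [options].
    The command is only built here, never executed.
    """
    return [
        executable, fasta_path,
        str(params.match), str(params.mismatch), str(params.indel),
        str(params.pm), str(params.pi),
        str(params.minscore), str(params.maxperiod),
        "-d", "-h",  # assumed options: data file output, no HTML
    ]


print(" ".join(trf_command("genome.fasta")))
```

The resulting list could be passed to `subprocess.run` on a machine where TRF is installed; `genome.fasta` is a placeholder file name.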
Figure 2. The FONZIE workflow. The FONZIE workflow consists of six steps: a. execution of TRF on a Fasta sequence or a Multifasta file; b. optional exclusion of user-defined regions; c. screening by core motif size and percent match parameters; d. screening for marker specificity with BLAST (E-value cut-off of 1e-10 by default) against a database; e. primer design using Primer3; f. virtual PCR, in which the PCR product is searched with BLAST (E-value cut-off of 1e-40 by default) against the database to check its specificity (single-copy locus).
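Step c of the workflow can be illustrated with a small filter. This is a minimal sketch, not FONZIE's code: the record type, its field names and the example records are hypothetical, and the thresholds simply mirror the values quoted in the table footnote (core motif size > 3, 90% identity between motifs). Whether FONZIE treats the 90% bound as inclusive is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TandemRepeat:
    """One tandem repeat candidate (hypothetical field names)."""
    marker_id: str
    consensus: str        # consensus sequence of the core motif
    copies: float         # number of repeats of the core motif
    percent_match: float  # percent match between core motifs


def screen_repeats(repeats: Iterable[TandemRepeat],
                   min_motif_size: int = 3,
                   min_percent_match: float = 90.0) -> List[TandemRepeat]:
    """Keep repeats with a core motif longer than min_motif_size and a
    percent match of at least min_percent_match (inclusive by assumption)."""
    return [
        r for r in repeats
        if len(r.consensus) > min_motif_size
        and r.percent_match >= min_percent_match
    ]


# Invented records for illustration only; these are not FONZIE output.
candidates = [
    TandemRepeat("example_1", "ACGTTGCA", 12.0, 95.0),  # passes both filters
    TandemRepeat("example_2", "GA", 9.0, 100.0),        # motif too short
    TandemRepeat("example_3", "CCATGT", 5.8, 82.0),     # percent match too low
]

kept = screen_repeats(candidates)
print([r.marker_id for r in kept])
```

Keeping the thresholds as function arguments mirrors the interface described for the screening parameters, where both values are user-modifiable.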
Example of the FONZIE final result table, run on Supercontig 16 of the Leptosphaeria maculans genome.
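| MARKER_ID | Consensus sequence (a) | Nb of repeats (b) | First BLAST result (c) | Left primer (d) | Right primer (e) | Primer location, start (f) | Primer location, end (g) | Second BLAST result (h) |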
|---|---|---|---|---|---|---|---|---|
| min_supercontig_16_10 | ATAAAAGTAAACTACTACTTTA | 2.0 | MULTIPLE_COPIES | GCATAAAGCTAATCTTCTCTACCCC | GTATAAACTGCCCTTGTGTATACCT | 100841 | 101019 | MULTIPLE_COPIES |
| min_supercontig_16_11 | GGATCATCAAGGA | 17.3 | UNIQUE_COPY | CGTTTTGGCTTTGTTGTTGA | ACTATGAGCCAGGTGAACCG | 111896 | 112241 | UNIQUE_COPY |
| min_supercontig_16_12 | CGCTCTCTCTCTCTCTCTTTCTCTCT | 4.3 | MULTIPLE_COPIES | CGCCAACAAGACTACCCATC | GAAGCGGTGGCAGTTTTTAG | 112524 | 112812 | UNIQUE_COPY |
| min_supercontig_16_13 | CCATGT | 5.8 | UNIQUE_COPY | ACCTCCCGAGGAAAAGTGAC | CTTGTGTGGTCTGGTTGCAG | 134392 | 134595 | UNIQUE_COPY |
| min_supercontig_16_14 | GAGAGAGAGAGAGAGAGA | 7.4 | MULTIPLE_COPIES | TGACTCGGCGTCTACCCTAC | AGCCAGCCAGCCAGTACTAA | 136186 | 136390 | UNIQUE_COPY |
| min_supercontig_16_15 | AAGCAGAAGGCTATTGAGTCGCCAGAGACAAGTCCACAGTCC | 2.1 | UNIQUE_COPY | AAGTGGCTGGACCTAGCAGA | ACATCGGCGACACGTTTAGT | 142179 | 142347 | UNIQUE_COPY |
| min_supercontig_16_16 | GTGTGG | 11.2 | MULTIPLE_COPIES | TGTGGATGATAGGATGGGGT | GTGACAAGCACATGATTCGC | 156524 | 156707 | UNIQUE_COPY |
Only a few markers generated by FONZIE are displayed in the table.
a, consensus sequence of the core motif of the minisatellite (MS).
b, number of repeats of the core motif.
c, results of the first BLAST step against the BLAST database (E-value cut-off 1e-10): UNIQUE_COPY, the only sequence matching the MS (query) is the query sequence itself; MULTIPLE_COPIES, more than one sequence of the BLAST database matches the query sequence, and the best hit is the query sequence itself.
d and e, sequences of the left and right primers, respectively, generated by Primer3 to amplify the minisatellite locus and its flanking sequences.
f and g, locations (in base pairs) of the primers along the Supercontig 16 sequence.
h, results of the second BLAST step, in which the amplification product is searched against the BLAST database (E-value cut-off 1e-40): UNIQUE_COPY, the only sequence matching the PCR product (query) is the query sequence itself; MULTIPLE_COPIES, more than one sequence of the BLAST database matches the query sequence, and the best hit is the query sequence itself.
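The UNIQUE_COPY / MULTIPLE_COPIES labels described in notes c and h can be sketched as a simple count of database hits that pass the E-value cut-off. This is an illustration under stated assumptions rather than FONZIE's implementation: the function name is invented, each E-value is assumed to represent one distinct matching locus (the query's own locus included), the comparison is assumed to be inclusive, and the `NO_BLAST_RESULT` label is a hypothetical stand-in for the "No BLAST results" column of the first table. The second half of the block copies the two BLAST columns from the example table and lists the markers whose amplification product is single copy.

```python
from typing import Sequence

UNIQUE_COPY = "UNIQUE_COPY"
MULTIPLE_COPIES = "MULTIPLE_COPIES"
NO_BLAST_RESULT = "NO_BLAST_RESULT"  # hypothetical label, not from the source


def classify_copy_number(hit_evalues: Sequence[float], cutoff: float) -> str:
    """Label a query by the number of hits at or below the E-value cut-off.

    Assumption: one E-value per distinct matching locus, including the
    query's own locus, so exactly one significant hit means a unique copy.
    """
    significant = [e for e in hit_evalues if e <= cutoff]
    if not significant:
        return NO_BLAST_RESULT
    return UNIQUE_COPY if len(significant) == 1 else MULTIPLE_COPIES


# (marker, first BLAST result, second BLAST result), copied from the table.
TABLE_ROWS = [
    ("min_supercontig_16_10", MULTIPLE_COPIES, MULTIPLE_COPIES),
    ("min_supercontig_16_11", UNIQUE_COPY, UNIQUE_COPY),
    ("min_supercontig_16_12", MULTIPLE_COPIES, UNIQUE_COPY),
    ("min_supercontig_16_13", UNIQUE_COPY, UNIQUE_COPY),
    ("min_supercontig_16_14", MULTIPLE_COPIES, UNIQUE_COPY),
    ("min_supercontig_16_15", UNIQUE_COPY, UNIQUE_COPY),
    ("min_supercontig_16_16", MULTIPLE_COPIES, UNIQUE_COPY),
]

# Markers whose amplification product is single copy (second BLAST step).
single_copy_products = [m for m, _, second in TABLE_ROWS if second == UNIQUE_COPY]

# Markers that are multi-copy as a bare repeat but single copy as a product.
unique_only_as_product = [
    m for m, first, second in TABLE_ROWS
    if first == MULTIPLE_COPIES and second == UNIQUE_COPY
]

print(len(single_copy_products), unique_only_as_product)
```

In the rows shown, three markers (min_supercontig_16_12, _14 and _16) are MULTIPLE_COPIES at the first BLAST step but UNIQUE_COPY at the second, where the query is the whole amplification product including the flanking sequences.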
Figure 3. Example of one file from the "MARKERS_RECAP" directory generated by FONZIE. From top to bottom: the marker sequence, the marker sequence with flanking regions, the amplification product sequence, and the Primer3 output.