
Introduction of the Python script STRinNGS for analysis of STR regions in FASTQ or BAM files and expansion of the Danish STR sequence database to 11 STRs.

Susanne L Friis, Anders Buchard, Eszter Rockenbauer, Claus Børsting, Niels Morling.

Abstract

This work introduces STRinNGS, an in-house developed Python application for the analysis of STR sequence elements in BAM or FASTQ files. STRinNGS identifies sequence reads containing STR loci by their flanking sequences, analyses the STR sequence and the flanking regions, and generates a report with the assigned SNP-STR alleles. The main output file from STRinNGS contains all sequences with read counts above 1% of the total number of reads per locus. STR sequences are automatically named according to the previously used nomenclature and the repeat unit definitions in STRBase (http://www.cstl.nist.gov/strbase/). Each sequence name consists of (1) the locus name, (2) the length of the repeat region divided by the length of the repeat unit, (3) the sequence(s) of the repeat unit(s), each followed by the number of repeats, and (4) variations in the flanking regions. Lower case letters in the main output file flag sequences with previously unknown variations in the STRs. SNPs in the flanking regions are named by their "rs" numbers and the nucleotides at the SNP position. Data from 207 Danes, sequenced with the Ion Torrent™ HID STR 10-plex, which amplifies nine STRs (CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D16S539, TH01, TPOX, vWA) and Amelogenin, were analysed with STRinNGS. Sequencing uncovered five common SNPs near four STRs and revealed 20 new alleles in the 207 Danes. Three short homopolymers in the D8S1179 flanking regions caused frequent sequencing errors. In 29 of 3726 allele calls (0.8%), sequences with homopolymer errors were falsely assigned as true alleles. An in-house developed R script compensated for these errors by compiling sequence reads that had identical STR sequences and identical nucleotides at the five common SNPs. In the output file from the R script, all SNP-STR haplotype calls were correct.
The 207 samples and six additional samples were sequenced for D3S1358, D12S391, and D21S11 using the 454 GS Junior platform in this and a previous work. Overall, next generation sequencing (NGS) of the 11 STRs reduced the mean match probability 386-fold and increased the typical paternity indices (i.e. the geometric means) for trios and duos 47-fold and 23-fold, respectively, compared with traditional PCR-CE typing of the same population.
Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
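
The workflow described in the abstract (locating reads by their flanking sequences, counting identical repeat regions, keeping sequences above 1% of the reads per locus, and building a name from the locus, the allele length and the repeat blocks) can be sketched roughly as follows. This is a minimal illustration and not the STRinNGS code: the locus name `DEMO`, both flanks, the helper names and the exact name layout are assumptions made for the example, exact string matching stands in for whatever flank matching the real tool uses, and the flanking-region SNP part of the name is left out. Writing leftover bases in lower case and giving partial repeats as `full.extra` (the usual forensic convention) are likewise choices of this sketch.

```python
from collections import Counter

# Hypothetical locus definition: the name and flanks are invented for
# illustration and are not real reference or primer sequences.
LOCUS = {
    "name": "DEMO",
    "left_flank": "GGTCA",
    "right_flank": "CCTGA",
    "repeat_unit": "AATG",
}


def extract_repeat_region(read, left_flank, right_flank):
    """Return the sequence between the two flanks, or None if a flank is missing."""
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    return read[start:end]


def name_allele(locus_name, region, unit):
    """Build an illustrative name: locus, allele length, then repeat blocks."""
    full, extra = divmod(len(region), len(unit))
    allele = f"{full}.{extra}" if extra else str(full)
    parts = []
    i = 0
    while i < len(region):
        if region.startswith(unit, i):
            # Count consecutive copies of the repeat unit.
            count = 0
            while region.startswith(unit, i):
                count += 1
                i += len(unit)
            parts.append(f"{unit}[{count}]")
        else:
            # Bases that do not fit the repeat unit are kept in lower case
            # (an assumption of this sketch, not the tool's rule).
            j = i
            while j < len(region) and not region.startswith(unit, j):
                j += 1
            parts.append(region[i:j].lower())
            i = j
    return f"{locus_name}[{allele}]" + "".join(parts)


def call_sequences(reads, locus, min_fraction=0.01):
    """Count repeat regions and report those above min_fraction of locus reads."""
    counts = Counter()
    for read in reads:
        region = extract_repeat_region(
            read, locus["left_flank"], locus["right_flank"]
        )
        if region is not None:
            counts[region] += 1
    total = sum(counts.values())
    report = {}
    for region, n in counts.items():
        if n / total > min_fraction:  # strictly above the threshold
            name = name_allele(locus["name"], region, locus["repeat_unit"])
            report[name] = n
    return report


# Toy input: 100 reads with 5, 6 and 4 repeats (60, 39 and 1 reads).
reads = (
    ["GGTCA" + "AATG" * 5 + "CCTGA"] * 60
    + ["GGTCA" + "AATG" * 6 + "CCTGA"] * 39
    + ["GGTCA" + "AATG" * 4 + "CCTGA"] * 1
)
for name, count in call_sequences(reads, LOCUS).items():
    print(name, count)
```

In this toy input the four-repeat sequence makes up exactly 1% of the reads, so it is not reported; only sequences strictly above the threshold are kept, matching the "above 1%" wording of the abstract.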

Entities:  

Keywords:  Forensic genetics; Massive parallel sequencing; Next generation sequencing; SNP–STR sequence analysis; STRinNGS Python application

Year:  2015        PMID: 26722765     DOI: 10.1016/j.fsigen.2015.12.006

Source DB:  PubMed          Journal:  Forensic Sci Int Genet        ISSN: 1872-4973            Impact factor:   4.882


  10 in total

Review 1.  Increasing the reach of forensic genetics with massively parallel sequencing.

Authors:  Bruce Budowle; Sarah E Schmedes; Frank R Wendt
Journal:  Forensic Sci Med Pathol       Date:  2017-06-19       Impact factor: 2.007

2.  Characterization of 58 STRs and 94 SNPs with the ForenSeq™ DNA signature prep kit in Mexican-Mestizos from the Monterrey city (Northeast, Mexico).

Authors:  José Alonso Aguilar-Velázquez; Miguel Ángel Duran-Salazar; Miranda Fabiola Córdoba-Mercado; Carolina Elena Coronado-Avila; Orlando Salas-Salas; Gabriela Martinez-Cortés; Ferrán Casals; Francesc Calafell; Benito Ramos-González; Héctor Rangel-Villalobos
Journal:  Mol Biol Rep       Date:  2022-06-03       Impact factor: 2.742

3.  Sequencing of human identification markers in an Uyghur population using the MiSeq FGx™ Forensic Genomics System.

Authors:  Halimureti Simayijiang; Niels Morling; Claus Børsting
Journal:  Forensic Sci Res       Date:  2020-09-10

4.  Sequence-based U.S. population data for 27 autosomal STR loci.

Authors:  Katherine Butler Gettings; Lisa A Borsuk; Carolyn R Steffen; Kevin M Kiesler; Peter M Vallone
Journal:  Forensic Sci Int Genet       Date:  2018-07-19       Impact factor: 4.882

5.  Report from the STRAND Working Group on the 2019 STR sequence nomenclature meeting.

Authors:  Katherine Butler Gettings; David Ballard; Martin Bodner; Lisa A Borsuk; Jonathan L King; Walther Parson; Christopher Phillips
Journal:  Forensic Sci Int Genet       Date:  2019-09-21       Impact factor: 4.882

6.  A technique for setting analytical thresholds in massively parallel sequencing-based forensic DNA analysis.

Authors:  Brian Young; Jonathan L King; Bruce Budowle; Luigi Armogida
Journal:  PLoS One       Date:  2017-05-18       Impact factor: 3.240

Review 7.  An Introductory Overview of Open-Source and Commercial Software Options for the Analysis of Forensic Sequencing Data.

Authors:  Tunde I Huszar; Katherine B Gettings; Peter M Vallone
Journal:  Genes (Basel)       Date:  2021-10-29       Impact factor: 4.096

8.  Transcriptome Analysis and HPLC Profiling of Flavonoid Biosynthesis in Citrus aurantium L. during Its Key Developmental Stages.

Authors:  Jing Chen; Yaliang Shi; Yicheng Zhong; Zhimin Sun; Juan Niu; Yue Wang; Tianxin Chen; Jianhua Chen; Mingbao Luan
Journal:  Biology (Basel)       Date:  2022-07-19

9.  Identification of sequence polymorphisms at 58 STRs and 94 iiSNPs in a Tibetan population using massively parallel sequencing.

Authors:  Dan Peng; Yinming Zhang; Han Ren; Haixia Li; Ran Li; Xuefeng Shen; Nana Wang; Erwen Huang; Riga Wu; Hongyu Sun
Journal:  Sci Rep       Date:  2020-07-22       Impact factor: 4.379

10.  Sequencing of 231 forensic genetic markers using the MiSeq FGx™ forensic genomics system - an evaluation of the assay and software.

Authors:  Christian Hussing; Christina Huber; Rajmonda Bytyci; Helle S Mogensen; Niels Morling; Claus Børsting
Journal:  Forensic Sci Res       Date:  2018-04-09
