Ricardo H Ramirez-Gonzalez, Raoul Bonnal, Mario Caccamo, Daniel Maclean.
Abstract
BACKGROUND: The SAMtools utilities are a widely used suite of software for manipulating files and alignments in the SAM and BAM formats, which are used in a wide range of genetic analyses. The SAMtools utilities are implemented in C and provide an API for programmatic access. To make this functionality available to programmers who wish to develop in the high-level Ruby language, we have developed bio-samtools, a Ruby binding to the SAMtools library.
Year: 2012 PMID: 22640879 PMCID: PMC3473260 DOI: 10.1186/1751-0473-7-6
Source DB: PubMed Journal: Source Code Biol Med ISSN: 1751-0473
Figure 1. bio-samtools and its relationship to the underlying libbam. Green boxes indicate C source code in libbam; red boxes indicate Ruby files that interact with the Ruby FFI (represented in yellow).
Attributes and methods of the Bio::DB::Sam object
| Attribute or method | Description |
|---|---|
| binary | denotes whether this is a binary file |
| compressed | denotes whether this file is compressed |
| fasta_path | path to the reference FASTA file |
| sam | path to the associated BAM file |
| chromosome_coverage | return Ruby Array of coverage over a region |
| fetch | fetch alignments in a region from a BAM file, returning a Ruby Array object |
| fetch_reference | fetch regions of the reference file returning a String object of the relevant sequence |
| fetch_with_function | fetch all alignments in a region passing in a Ruby Proc object as a callback, returning an iterator |
| index_stats | get information about reference and number of mapped reads |
| merge | merge two or more BAM files |
| mpileup | an iterator that returns Pileup objects representing the reads over a single position |
| sort | sort the BAM file |
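
The `chromosome_coverage` entry describes an Array of per-position coverage over a region. The sketch below shows, in plain Ruby, one way such an array could be computed from alignment intervals. It does not call bio-samtools; the helper name `coverage_over_region`, the inclusive 1-based coordinates, and the `[start, end]` interval pairs are assumptions made for illustration only.

```ruby
# Hypothetical helper, not part of bio-samtools.
# intervals: array of [aln_start, aln_end] pairs (inclusive coordinates, assumed).
# Returns an Array with one depth value per position in
# start .. start + length - 1.
def coverage_over_region(intervals, start, length)
  depth = Array.new(length, 0)
  region_end = start + length - 1
  intervals.each do |aln_start, aln_end|
    from = [aln_start, start].max      # clip the alignment to the region
    to   = [aln_end, region_end].min
    (from..to).each { |pos| depth[pos - start] += 1 }
  end
  depth
end

alignments = [[1, 5], [3, 8], [10, 12]]
p coverage_over_region(alignments, 1, 10)
```

Alignments that fall entirely outside the region produce an empty range after clipping and so contribute nothing.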
Attributes of the Bio::DB::Alignment object
| Attribute | Description |
|---|---|
| calend | nucleotide position of the end of the alignment |
| cigar | CIGAR string describing the matches/mismatches |
| failed_quality | this read failed the quality threshold |
| first_in_pair | first of a pair |
| is_duplicate | this read is a suspected optical or PCR duplicate |
| is_mapped | the read was aligned |
| is_paired | the read is one of a pair |
| isize | the insert size distance between mapped mates |
| mapq | the PHRED scaled mapping quality of the alignment |
| mate_strand | the strand of the mate |
| mate_unmapped | the mate is unmapped |
| mpos | start position of the mate on the reference |
| pos | start position of the alignment |
| primary | is a primary alignment |
| qlen | read length |
| qname | read name |
| qual | read quality string |
| query_strand | strand of alignment |
| query_unmapped | query is unmapped |
| rname | name of reference to which read mapped |
| second_in_pair | this is second in the pair |
| seq | read sequence |
| tags | Bio::DB::Tag object representing the tags for this alignment |
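
Several of the attributes above (pairing, strand, duplicate and quality status) correspond to bits of the FLAG field in a SAM record, and the alignment end follows from the start position and the CIGAR string. The sketch below decodes both in plain Ruby. It is an illustration only, not the bio-samtools implementation: the names `decode_sam_flag`, `reference_span` and `alignment_end` are hypothetical, the hash keys describe the flag bits rather than the library's attribute names, and the end coordinate is assumed to be 1-based and inclusive.

```ruby
# Bit values as defined for the FLAG field in the SAM format specification.
SAM_FLAG_BITS = {
  paired:         0x1,
  proper_pair:    0x2,
  query_unmapped: 0x4,
  mate_unmapped:  0x8,
  query_reverse:  0x10,
  mate_reverse:   0x20,
  first_in_pair:  0x40,
  second_in_pair: 0x80,
  secondary:      0x100,
  failed_quality: 0x200,
  duplicate:      0x400
}.freeze

# Hypothetical helper: turn a FLAG integer into a Hash of booleans.
def decode_sam_flag(flag)
  SAM_FLAG_BITS.transform_values { |bit| (flag & bit) != 0 }
end

# Number of reference bases covered by a CIGAR string.
# Only M, D, N, = and X consume reference positions.
def reference_span(cigar)
  cigar.scan(/(\d+)([MIDNSHP=X])/).sum do |len, op|
    "MDN=X".include?(op) ? len.to_i : 0
  end
end

# Assumed convention: 1-based start, inclusive end.
def alignment_end(pos, cigar)
  pos + reference_span(cigar) - 1
end

flags = decode_sam_flag(99)
puts flags.select { |_, set| set }.keys.inspect
puts alignment_end(100, "5S10M2D3M1I4M")
```

Soft clips (`S`) and insertions (`I`) are skipped when computing the span because they consume read bases but no reference positions.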
Attributes and methods of the Bio::DB::Pileup object
| Attribute or method | Description |
|---|---|
| consensus | the consensus nucleotide, calculated as the nucleotide with the highest count; multiple nucleotides are returned in a tie |
| coverage | the number of reads covering this position |
| non_ref_count | the number of reads that disagree with the reference nucleotide |
| non_ref_count_hash | a Hash with A, T, G and C as keys and, as values, the number of times each nucleotide appears in the pileup when that nucleotide is not the reference |
| pos | the position in the reference sequence that this pileup represents |
| read_bases | the read nucleotides covering this position |
| read_quals | the quality scores of the read nucleotides covering this position |
| ref_base | the reference sequence nucleotide |
| ref_count | the number of times the reference nucleotide appears in the read nucleotides covering this position |
| ref_name | the name of the reference sequence |
| ar1, ar2, ar3 | the allele calls from pileup |
| consensus¹ | the consensus of the reads according to the SAMtools method of calculation |
| consensus_quality¹ | the quality score of the consensus according to the SAMtools method of calculation |
| rms_mapq¹ | the root mean square mapping quality at the position |
| snp_quality¹ | the SNP quality at the position |

¹ Ten-column format only.
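
The counting attributes in this table can all be derived from the read-base string of a pileup line. The following plain-Ruby sketch shows one way to do that, assuming the standard SAMtools pileup encoding (`.` and `,` for reference matches, `^` plus one mapping-quality character at a read start, `$` at a read end, `+n`/`-n` followed by n bases for indels, `*` for a deleted base). The helper names `tally_read_bases` and `pileup_summary` are hypothetical, and the sketch is not the code bio-samtools uses. Unlike the table's `non_ref_count_hash`, the hash here only contains bases that were actually observed, and reads showing a deletion (`*`) are not counted towards coverage.

```ruby
# Hypothetical helper: count base calls in a pileup read-base string.
def tally_read_bases(read_bases, ref_base)
  ref = ref_base.upcase
  counts = Hash.new(0)
  i = 0
  while i < read_bases.length
    ch = read_bases[i]
    case ch
    when "^"                 # read start: skip marker and mapping-quality char
      i += 2
    when "+", "-"            # indel: sign, length digits, then that many bases
      digits = read_bases[(i + 1)..-1].to_s[/\A\d+/].to_s
      i += 1 + digits.length + digits.to_i
    when ".", ","            # match to the reference (forward / reverse strand)
      counts[ref] += 1
      i += 1
    when /[ACGTNacgtn]/      # mismatch; lower case means reverse strand
      counts[ch.upcase] += 1
      i += 1
    else                     # "$", "*" and anything unrecognised
      i += 1
    end
  end
  counts
end

# Hypothetical helper: derive the table's counting attributes from the tally.
def pileup_summary(read_bases, ref_base)
  ref = ref_base.upcase
  counts = tally_read_bases(read_bases, ref_base)
  non_ref = counts.reject { |base, _| base == ref }
  top = counts.values.max
  {
    coverage: counts.values.sum,
    ref_count: counts[ref],
    non_ref_count: non_ref.values.sum,
    non_ref_count_hash: non_ref,
    # every base sharing the highest count is returned, so ties give e.g. "AC"
    consensus: counts.select { |_, n| n == top }.keys.sort.join
  }
end

p pileup_summary("..,A^F.a$-2ag,T", "G")
```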
SAMtools options recognised by the Bio::DB::Sam#mpileup method and the symbols used to invoke them
| SAMtools option | Description | Short symbol | Long symbol | Default |
|---|---|---|---|---|
| r | limit retrieval to a region | :r | :region | all positions |
| 6 | assume Illumina scaled quality scores | :six | :illumina_quals | FALSE |
| A | count anomalous read pairs | :A | :count_anomalous | FALSE |
| B | disable BAQ computation | :B | :no_baq | FALSE |
| C | parameter for adjusting mapQ | :C | :adjust_mapq | 0 |
| d | max per-BAM depth to avoid excessive memory usage | :d | :max_per_bam_depth | 250 |
| E | extended BAQ for higher sensitivity but lower specificity | :E | :extended_baq | FALSE |
| G | exclude read groups listed in FILE | :G | :exclude_reads_file | FALSE |
| l | list of positions (chr pos) or regions (BED) | :l | :list_of_positions | FALSE |
| M | cap mapping quality at value | :M | :mapping_quality_cap | 60 |
| R | ignore RG tags | :R | :ignore_rg | FALSE |
| q | skip alignments with mapping quality smaller than value | :q | :min_mapping_quality | 0 |
| Q | skip bases with base quality smaller than value | :Q | :imin_base_quality | 13 |
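
To make the relationship between the symbols and the SAMtools options concrete, the sketch below translates a Ruby options Hash into a list of command-line arguments. It illustrates the mapping in the table only; how bio-samtools passes these options to the library internally is not shown here, and the names `LONG_TO_FLAG` and `mpileup_args` are hypothetical. Symbols are spelled exactly as printed in the table.

```ruby
# Long symbols from the table mapped to the SAMtools option letters.
LONG_TO_FLAG = {
  region: "r", illumina_quals: "6", count_anomalous: "A", no_baq: "B",
  adjust_mapq: "C", max_per_bam_depth: "d", extended_baq: "E",
  exclude_reads_file: "G", list_of_positions: "l",
  mapping_quality_cap: "M", ignore_rg: "R", min_mapping_quality: "q",
  imin_base_quality: "Q"   # spelling as printed in the table above
}.freeze

VALID_FLAGS = LONG_TO_FLAG.values.freeze

# Hypothetical helper: accept short or long symbols and build an argument list.
# true adds a bare switch, false/nil omits it, anything else is passed as a value.
def mpileup_args(opts)
  opts.flat_map do |key, value|
    short = key == :six ? "6" : key.to_s       # :six stands in for "6"
    flag = LONG_TO_FLAG[key] || (short if VALID_FLAGS.include?(short))
    raise ArgumentError, "unknown mpileup option: #{key.inspect}" unless flag

    case value
    when true       then ["-#{flag}"]
    when false, nil then []
    else                 ["-#{flag}", value.to_s]
    end
  end
end

p mpileup_args(region: "chr_1:1-100", no_baq: true,
               min_mapping_quality: 20, extended_baq: false)
p mpileup_args(six: true, Q: 30)
```

Raising on an unknown key is a design choice for the sketch: a mistyped symbol fails loudly instead of being silently dropped.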