Anke Busch, Klemens J Hertel.
Abstract
HEXEvent (http://hexevent.mmg.uci.edu) is a new database that permits the user to compile genome-wide data sets of human internal exons showing selected splicing events. User queries can be customized based on the type and the frequency of alternative splicing events. For each splicing version of an exon, an EST count is given, specifying the frequency of the event. A user-specific definition of constitutive exons can be entered to designate the exon exclusion level that is still acceptable for an exon to be considered constitutive. Similarly, the user has the option to define a maximum inclusion level for an exon to be called an alternatively spliced exon. Unlike other existing splicing databases, HEXEvent permits the user to easily extract alternative splicing information for individual, multiple or genome-wide human internal exons. Importantly, the generated data sets are downloadable for further analysis.
Year: 2012 PMID: 23118488 PMCID: PMC3531206 DOI: 10.1093/nar/gks969
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Cassette exon inclusion levels determined from HEXEvent. The plot shows the relationship between exon inclusion levels and the cumulative number of events.
Table 1. Definition of the columns in the output format of HEXEvent for a randomly chosen exon
| No. | Column name | Example | Description |
|---|---|---|---|
| 1 | chromo | chrX | Reference sequence chromosome name |
| 2 | strand | + | + or − for strand |
| 3 | start | 101 854 639 | First position of the exon (0-based) |
| 4 | end | 101 854 775 | Last position of the exon (1-based) |
| 5 | count | 15 | Number of ESTs that include the exon as given in columns [chromo], [strand], [start], and [end] |
| 6 | alt3 | 10 | Number of ESTs that include the exon with an alternative 3′-splice site |
| 7 | alt5 | 0 | Number of ESTs that include the exon with an alternative 5′-splice site |
| 8 | alt3+5 | 2 | Number of ESTs that include the exon with an alternative 3′ and an alternative 5′-splice site simultaneously |
| 9 | skip | 0 | Number of ESTs in which the exon is skipped |
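| 10 | constitLevel | 0.556 | Constitutive level of the exon (= [count] / ([count] + [alt3] + [alt5] + [alt3+5] + [skip])) |
| 11 | inclLevel | 1.000 | Inclusion level of the exon (= ([count] + [alt3] + [alt5] + [alt3+5]) / ([count] + [alt3] + [alt5] + [alt3+5] + [skip])) |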
| 12 | 3usageLevel | 0.556 | Usage level of the major 3′-splice site of the exon |
| 13 | 5usageLevel | 0.926 | Usage level of the major 5′-splice site of the exon |
| 14 | alt3singleCount | 10 | Number(s) of ESTs for different alternative 3′-splice sites |
| 15 | alt3singleLoc | 101 854 633 | Location(s) of alternative 3′-splice sites |
| 16 | alt5singleCount | 0 | Number(s) of ESTs for different alternative 5′-splice sites |
| 17 | alt5singleLoc | # | Location(s) of alternative 5′-splice sites |
| 18 | alt3and5singleCount | 2 | Number(s) of ESTs for different alternative 3′- and 5′-splice site combinations |
| 19 | alt3and5singleLocName | 101 854 633–101 854 787 | Location(s) of alternative 3′ and 5′-splice site combinations |
| 20 | OnlyESTexonCount | 0 | Number of ESTs in which the exon is overlapped by an alternative version of it, that is not included in the human isoform list of the UCSC Genome Browser yet, but has at least one EST supporting it |
| 21 | OnlyESTexons | # | Location(s) of alternative version(s) of the exon, that is/are not included in the human isoform list of the UCSC Genome Browser yet, but has/have at least one EST supporting it |
| 22 | genename | ARMCX5 | Name of the gene the exon is part of; if no gene name has been assigned yet, this is indicated by ‘onlyEST’ |
The first four columns describe the location of the exon, whereas columns 5–9 give EST counts for inclusion (as given in columns 1–4), alternative splice site usage and exclusion of the exon. Column 10 specifies the constitutive level of the exon, which is calculated by comparing the occurrence of the exon as specified in columns 1–4 with all other alternative events. Column 11 gives the inclusion level of the exon. Here, inclusion is the sum of the ESTs showing the exon with the coordinates given in columns 1–4 and the ESTs showing a version of the exon with an alternative 3′- and/or 5′-splice site, whereas exclusion is the number of ESTs in which the exon is skipped. Columns 12 and 13 show the usage level of the major 3′-splice site and the major 5′-splice site (as given in columns 3 and 4, or columns 4 and 3 when on the negative strand), respectively. The usage level of the major 3′-splice site is the ratio of the number of ESTs showing this 3′-splice site (i.e. all ESTs showing the exon as given in columns 1–4 plus all ESTs showing the exon with an alternative 5′-splice site) to the number of ESTs that include the exon with any splice site. The usage level of the major 5′-splice site is calculated analogously. The locations of all alternative 3′-splice sites of the exon can be found in column 15, whereas the EST count for each of them is given in column 14. The corresponding entries for alternative 5′-splice sites can be found in columns 16 and 17, and those for co-occurring alternative 3′- and 5′-splice sites in columns 18 and 19. The EST count and location of new versions of the exon that have EST evidence but are not yet confirmed events in the UCSC Genome Browser are shown in columns 20 and 21. The location is given in the form ‘chromosomeSTRANDstart-end’. The last column shows the name of the gene the exon is part of. If none has been assigned yet, ‘onlyEST’ is specified.
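The level calculations described above can be sketched in a few lines of Python. This is only an illustration derived from the column definitions, not HEXEvent's own code; the `ExonCounts` class and the `levels` function are hypothetical names introduced here, and the sketch assumes that at least one EST includes the exon.

```python
from dataclasses import dataclass


@dataclass
class ExonCounts:
    """EST counts from columns 5-9 of one output row (hypothetical container)."""
    count: int     # ESTs showing the exon exactly as given in columns 1-4
    alt3: int      # ESTs showing the exon with an alternative 3'-splice site
    alt5: int      # ESTs showing the exon with an alternative 5'-splice site
    alt3and5: int  # ESTs showing both splice sites as alternative (column alt3+5)
    skip: int      # ESTs in which the exon is skipped


def levels(c: ExonCounts) -> dict:
    """Compute columns 10-13 from the EST counts of a single exon."""
    included = c.count + c.alt3 + c.alt5 + c.alt3and5
    if included == 0:
        # Assumption: usage levels are undefined without any including EST.
        raise ValueError("no EST includes this exon")
    total = included + c.skip
    return {
        "constitLevel": c.count / total,
        "inclLevel": included / total,
        # The major 3'-splice site is also used by ESTs that only change the 5' site.
        "3usageLevel": (c.count + c.alt5) / included,
        # The major 5'-splice site is also used by ESTs that only change the 3' site.
        "5usageLevel": (c.count + c.alt3) / included,
    }


# Example exon from the column-definition table: count=15, alt3=10, alt5=0, alt3+5=2, skip=0
example = {name: round(value, 3)
           for name, value in levels(ExonCounts(15, 10, 0, 2, 0)).items()}
print(example)
```

Rounded to three decimals, the example exon yields 0.556, 1.000, 0.556 and 0.926, matching the values listed for columns 10–13.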
Figure 2. Workflow during the creation of HEXEvent. We downloaded the UCSC Genes track, the spliced ESTs track, as well as the human mRNAs track from the UCSC Genome Browser. Using all three data sets, we extracted all known versions of human internal exons. An EST count was assigned to each version of each exon, specifying inclusion and exclusion levels. In a last step, overlapping exons were combined and indicated as alternative versions of each other.
Table 2. Examples of HEXEvent database outputs
| chromo | strand | start | end | count | alt3 | alt5 | alt3+5 | skip | constitLevel | inclLevel | 3usageLevel | 5usageLevel | alt3singleCount | alt3singleLoc | alt5singleCount | alt5singleLoc | alt3and5singleCount | alt3and5singleLoc | OnlyESTexonCount | OnlyESTexons | genename |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All exons of gene ARMCX4 (a) | | | | | | | | | | | | | | | | | | | | | |
| chrX | + | 100 673 897 | 100 673 988 | 15 | 0 | 1 | 0 | 0 | 0.938 | 1.000 | 1.000 | 0.938 | 0 | # | 1 | 100 673 984 | 0 | # | 0 | # | |
| chrX | + | 100 699 039 | 100 699 143 | 9 | 0 | 0 | 0 | 7 | 0.562 | 0.562 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 700 985 | 100 701 032 | 7 | 0 | 0 | 0 | 5 | 0.583 | 0.583 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 741 009 | 100 741 079 | 33 | 5 | 5 | 1 | 1 | 0.733 | 0.978 | 0.864 | 0.864 | 5 | 100 741 012 | 5 | 100 741 085 | 1 | 100 741 012– 100 741 085 | 0 | # | |
| chrX | + | 100 742 179 | 100 742 259 | 45 | 1 | 0 | 0 | 0 | 0.978 | 1.000 | 0.978 | 1.000 | 1 | 100 742 191 | 0 | # | 0 | # | 1 | chrX+100 742 179– 100 742 255 | |
| chrX | + | 100 742 594 | 100 742 678 | 44 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 743 030 | 100 743 086 | 38 | 7 | 0 | 0 | 0 | 0.844 | 1.000 | 0.844 | 1.000 | 7 | 100 742 994 | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 743 430 | 100 744 302 | 5 | 0 | 0 | 0 | 19 | 0.208 | 0.208 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 753 131 | 100 754 353 | 5 | 0 | 2 | 0 | 0 | 0.714 | 1.000 | 1.000 | 0.714 | 0 | # | 2 | 100 753 318 | 0 | # | 0 | # | |
| chrX | + | 100 759 923 | 100 760 342 | 5 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 764 345 | 100 764 414 | 3 | 3 | 0 | 0 | 0 | 0.500 | 1.000 | 0.500 | 1.000 | 3 | 100 764 350 | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 764 576 | 100 764 665 | 7 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 766 011 | 100 766 042 | 7 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 779 190 | 100 779 419 | 2 | 1 | 0 | 0 | 3 | 0.333 | 0.500 | 0.667 | 1.000 | 1 | 100 779 272 | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 786 630 | 100 786 999 | 3 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| All exons with an alternative 3′-splice site of gene ARMCX4 (d) | | | | | | | | | | | | | | | | | | | | | |
| chrX | + | 100 742 179 | 100 742 259 | 45 | 1 | 0 | 0 | 0 | 0.978 | 1.000 | 0.978 | 1.000 | 1 | 100 742 191 | 0 | # | 0 | # | 1 | chrX+100 742 179– 100 742 255 | |
| chrX | + | 100 743 030 | 100 743 086 | 38 | 7 | 0 | 0 | 0 | 0.844 | 1.000 | 0.844 | 1.000 | 7 | 100 742 994 | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 764 345 | 100 764 414 | 3 | 3 | 0 | 0 | 0 | 0.500 | 1.000 | 0.500 | 1.000 | 3 | 100 764 350 | 0 | # | 0 | # | 0 | # | |
| All exons with an alternative 3′-splice site of gene ARMCX4 (e) | | | | | | | | | | | | | | | | | | | | | |
| chrX | + | 100 741 009 | 100 741 079 | 33 | 5 | 5 | 1 | 1 | 0.733 | 0.978 | 0.864 | 0.864 | 5 | 100 741 012 | 5 | 100 741 085 | 1 | 100 741 012– 100 741 085 | 0 | # | |
| chrX | + | 100 742 179 | 100 742 259 | 45 | 1 | 0 | 0 | 0 | 0.978 | 1.000 | 0.978 | 1.000 | 1 | 100 742 191 | 0 | # | 0 | # | 1 | chrX+100 742 179– 100 742 255 | |
| chrX | + | 100 743 030 | 100 743 086 | 38 | 7 | 0 | 0 | 0 | 0.844 | 1.000 | 0.844 | 1.000 | 7 | 100 742 994 | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 764 345 | 100 764 414 | 3 | 3 | 0 | 0 | 0 | 0.500 | 1.000 | 0.500 | 1.000 | 3 | 100 764 350 | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 779 190 | 100 779 419 | 2 | 1 | 0 | 0 | 3 | 0.333 | 0.500 | 0.667 | 1.000 | 1 | 100 779 272 | 0 | # | 0 | # | 0 | # | |
Columns are as defined in Table 1. A hash symbol (#) indicates a non-existing value, meaning that no alternative 3′- or 5′-splice sites are known.
(a) Output files were generated when searching for all internal human exons of the gene ARMCX4 including all of their known alternative splicing events.
(b) Location of both alternative splice sites (3′- and 5′-splice site).
(c) Location of an alternative version of the exon not found in the UCSC Gene list, but in ESTs. It is given in the format ‘chromosomeSTRANDstart-end’.
(d) Output files were generated when searching for internal human exons of the gene ARMCX4 that have an alternative 3′-splice site, but do not show any other alternative event.
(e) Output files were generated when searching for internal human exons of the gene ARMCX4 that have an alternative 3′-splice site and possibly show other alternative splicing events.
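The two alternative 3′-splice site queries described in the footnotes can be reproduced locally from the "all exons" output. The sketch below is an assumption about how such a selection could be expressed, not HEXEvent's implementation: the `Exon` tuple and the two predicate functions are hypothetical names, the counts are copied from the rows above, and the OnlyEST columns are ignored when deciding whether an exon shows "any other alternative event".

```python
from typing import NamedTuple


class Exon(NamedTuple):
    start: int     # column [start]
    count: int     # column [count]
    alt3: int      # column [alt3]
    alt5: int      # column [alt5]
    alt3and5: int  # column [alt3+5]
    skip: int      # column [skip]


# EST counts of all internal ARMCX4 exons, copied from the table above.
ARMCX4_EXONS = [
    Exon(100673897, 15, 0, 1, 0, 0),
    Exon(100699039, 9, 0, 0, 0, 7),
    Exon(100700985, 7, 0, 0, 0, 5),
    Exon(100741009, 33, 5, 5, 1, 1),
    Exon(100742179, 45, 1, 0, 0, 0),
    Exon(100742594, 44, 0, 0, 0, 0),
    Exon(100743030, 38, 7, 0, 0, 0),
    Exon(100743430, 5, 0, 0, 0, 19),
    Exon(100753131, 5, 0, 2, 0, 0),
    Exon(100759923, 5, 0, 0, 0, 0),
    Exon(100764345, 3, 3, 0, 0, 0),
    Exon(100764576, 7, 0, 0, 0, 0),
    Exon(100766011, 7, 0, 0, 0, 0),
    Exon(100779190, 2, 1, 0, 0, 3),
    Exon(100786630, 3, 0, 0, 0, 0),
]


def has_alt3(e: Exon) -> bool:
    """Alternative 3'-splice site present; other events are allowed."""
    return e.alt3 > 0


def alt3_only(e: Exon) -> bool:
    """Alternative 3'-splice site is the only alternative event."""
    return e.alt3 > 0 and e.alt5 == 0 and e.alt3and5 == 0 and e.skip == 0


alt3_only_starts = [e.start for e in ARMCX4_EXONS if alt3_only(e)]
alt3_any_starts = [e.start for e in ARMCX4_EXONS if has_alt3(e)]
print(alt3_only_starts)
print(alt3_any_starts)
```

With these predicates, the strict selection returns the three exons starting at 100 742 179, 100 743 030 and 100 764 345, and the permissive selection additionally returns the exons starting at 100 741 009 and 100 779 190, in agreement with the two table sections.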
Table 3. Output of constitutive exons of the gene ARMCX4
| chromo | strand | start | end | count | alt3 | alt5 | alt3+5 | skip | constitLevel | inclLevel | 3usageLevel | 5usageLevel | alt3singleCount | alt3singleLoc | alt5singleCount | alt5singleLoc | alt3and5singleCount | alt3and5singleLoc | OnlyESTexonCount | OnlyESTexons | genename |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All constitutive exons of gene ARMCX4 (a) | | | | | | | | | | | | | | | | | | | | | |
| chrX | + | 100 742 594 | 100 742 678 | 44 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 759 923 | 100 760 342 | 5 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 764 576 | 100 764 665 | 7 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 766 011 | 100 766 042 | 7 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 786 630 | 100 786 999 | 3 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| All constitutive exons of gene ARMCX4 (b) | | | | | | | | | | | | | | | | | | | | | |
| chrX | + | 100 742 179 | 100 742 259 | 45 | 1 | 0 | 0 | 0 | 0.978 | 1.000 | 0.978 | 1.000 | 1 | 100 742 191 | 0 | # | 0 | # | 1 | chrX+100 742 179– 100 742 255 | |
| chrX | + | 100 742 594 | 100 742 678 | 44 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 759 923 | 100 760 342 | 5 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 764 576 | 100 764 665 | 7 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 766 011 | 100 766 042 | 7 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
| chrX | + | 100 786 630 | 100 786 999 | 3 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 1.000 | 1.000 | 0 | # | 0 | # | 0 | # | 0 | # | |
Columns are as in Table 2 and as defined in Table 1.
(a) Output of constitutive exons of the gene ARMCX4 with no alternative versions allowed.
(b) Output of constitutive exons of the gene ARMCX4 with at most 5% of ESTs showing alternative versions of the exon.
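The effect of the user-defined constitutive threshold can be illustrated with a short Python sketch. It assumes that "at most x% of ESTs showing alternative versions" corresponds to requiring constitLevel ≥ 1 − x; the source does not spell out the exact comparison, so this mapping and the `constitutive_exons` function name are assumptions made for illustration. The constitLevel values are copied from the "all exons" output of ARMCX4 in Table 2.

```python
# constitLevel of every internal ARMCX4 exon, keyed by exon start (from Table 2).
CONSTIT_LEVEL = {
    100673897: 0.938, 100699039: 0.562, 100700985: 0.583,
    100741009: 0.733, 100742179: 0.978, 100742594: 1.000,
    100743030: 0.844, 100743430: 0.208, 100753131: 0.714,
    100759923: 1.000, 100764345: 0.500, 100764576: 1.000,
    100766011: 1.000, 100779190: 0.333, 100786630: 1.000,
}


def constitutive_exons(levels, max_alt_fraction=0.0):
    """Return the sorted start positions of exons regarded as constitutive.

    max_alt_fraction is the tolerated fraction of ESTs that show an
    alternative version of the exon (assumed to equal 1 - constitLevel).
    """
    if not 0.0 <= max_alt_fraction <= 1.0:
        raise ValueError("max_alt_fraction must lie between 0 and 1")
    threshold = 1.0 - max_alt_fraction
    return sorted(start for start, level in levels.items() if level >= threshold)


strict = constitutive_exons(CONSTIT_LEVEL)          # no alternative versions allowed
relaxed = constitutive_exons(CONSTIT_LEVEL, 0.05)   # at most 5% alternative ESTs
print(len(strict), len(relaxed))
```

Under this assumption the strict setting selects five exons and the 5% setting six, the additional one being the exon starting at 100 742 179 (constitLevel 0.978), which is consistent with the two sections of the table.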