Hao Zhao1, Zhifu Sun, Jing Wang, Haojie Huang, Jean-Pierre Kocher, Liguo Wang. 1. Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA, Division of Biomedical Statistics and Informatics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA and Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA.
Abstract
MOTIVATION: Reference genome assemblies are subject to change and refinement from time to time. Generally, researchers need to convert the results that have been analyzed according to old assemblies to newer versions, or vice versa, to facilitate meta-analysis, direct comparison, data integration and visualization. Several useful conversion tools can convert genome interval files in browser extensible data or general feature format, but none have the functionality to convert files in sequence alignment map or BigWig format. This is a significant gap in computational genomics tools, as these formats are the ones most widely used for representing high-throughput sequencing data, such as RNA-seq, chromatin immunoprecipitation sequencing, DNA-seq, etc. RESULTS: Here we developed CrossMap, a versatile and efficient tool for converting genome coordinates between assemblies. CrossMap supports most of the commonly used file formats, including BAM, sequence alignment map, Wiggle, BigWig, browser extensible data, general feature format, gene transfer format and variant call format. AVAILABILITY AND IMPLEMENTATION: CrossMap is written in Python and C. Source code and a comprehensive user's manual are freely available at: http://crossmap.sourceforge.net/.
MOTIVATION: Reference genome assemblies are subject to change and refinement from time to time. Generally, researchers need to convert the results that have been analyzed according to old assemblies to newer versions, or vice versa, to facilitate meta-analysis, direct comparison, data integration and visualization. Several useful conversion tools can convert genome interval files in browser extensible data or general feature format, but none have the functionality to convert files in sequence alignment map or BigWig format. This is a significant gap in computational genomics tools, as these formats are the ones most widely used for representing high-throughput sequencing data, such as RNA-seq, chromatin immunoprecipitation sequencing, DNA-seq, etc. RESULTS: Here we developed CrossMap, a versatile and efficient tool for converting genome coordinates between assemblies. CrossMap supports most of the commonly used file formats, including BAM, sequence alignment map, Wiggle, BigWig, browser extensible data, general feature format, gene transfer format and variant call format. AVAILABILITY AND IMPLEMENTATION: CrossMap is written in Python and C. Source code and a comprehensive user's manual are freely available at: http://crossmap.sourceforge.net/.
Authors: E S Lander; L M Linton; B Birren; C Nusbaum; M C Zody; J Baldwin; K Devon; K Dewar; M Doyle; W FitzHugh; R Funke; D Gage; K Harris; A Heaford; J Howland; L Kann; J Lehoczky; R LeVine; P McEwan; K McKernan; J Meldrim; J P Mesirov; C Miranda; W Morris; J Naylor; C Raymond; M Rosetti; R Santos; A Sheridan; C Sougnez; Y Stange-Thomann; N Stojanovic; A Subramanian; D Wyman; J Rogers; J Sulston; R Ainscough; S Beck; D Bentley; J Burton; C Clee; N Carter; A Coulson; R Deadman; P Deloukas; A Dunham; I Dunham; R Durbin; L French; D Grafham; S Gregory; T Hubbard; S Humphray; A Hunt; M Jones; C Lloyd; A McMurray; L Matthews; S Mercer; S Milne; J C Mullikin; A Mungall; R Plumb; M Ross; R Shownkeen; S Sims; R H Waterston; R K Wilson; L W Hillier; J D McPherson; M A Marra; E R Mardis; L A Fulton; A T Chinwalla; K H Pepin; W R Gish; S L Chissoe; M C Wendl; K D Delehaunty; T L Miner; A Delehaunty; J B Kramer; L L Cook; R S Fulton; D L Johnson; P J Minx; S W Clifton; T Hawkins; E Branscomb; P Predki; P Richardson; S Wenning; T Slezak; N Doggett; J F Cheng; A Olsen; S Lucas; C Elkin; E Uberbacher; M Frazier; R A Gibbs; D M Muzny; S E Scherer; J B Bouck; E J Sodergren; K C Worley; C M Rives; J H Gorrell; M L Metzker; S L Naylor; R S Kucherlapati; D L Nelson; G M Weinstock; Y Sakaki; A Fujiyama; M Hattori; T Yada; A Toyoda; T Itoh; C Kawagoe; H Watanabe; Y Totoki; T Taylor; J Weissenbach; R Heilig; W Saurin; F Artiguenave; P Brottier; T Bruls; E Pelletier; C Robert; P Wincker; D R Smith; L Doucette-Stamm; M Rubenfield; K Weinstock; H M Lee; J Dubois; A Rosenthal; M Platzer; G Nyakatura; S Taudien; A Rump; H Yang; J Yu; J Wang; G Huang; J Gu; L Hood; L Rowen; A Madan; S Qin; R W Davis; N A Federspiel; A P Abola; M J Proctor; R M Myers; J Schmutz; M Dickson; J Grimwood; D R Cox; M V Olson; R Kaul; C Raymond; N Shimizu; K Kawasaki; S Minoshima; G A Evans; M Athanasiou; R Schultz; B A Roe; F Chen; H Pan; J Ramser; H Lehrach; R Reinhardt; W R McCombie; M de la Bastide; N Dedhia; H Blöcker; K Hornischer; G Nordsiek; R Agarwala; L Aravind; J A Bailey; A Bateman; S Batzoglou; E Birney; P Bork; D G Brown; C B Burge; L Cerutti; H C Chen; D Church; M Clamp; R R Copley; T Doerks; S R Eddy; E E Eichler; T S Furey; J Galagan; J G Gilbert; C Harmon; Y Hayashizaki; D Haussler; H Hermjakob; K Hokamp; W Jang; L S Johnson; T A Jones; S Kasif; A Kaspryzk; S Kennedy; W J Kent; P Kitts; E V Koonin; I Korf; D Kulp; D Lancet; T M Lowe; A McLysaght; T Mikkelsen; J V Moran; N Mulder; V J Pollara; C P Ponting; G Schuler; J Schultz; G Slater; A F Smit; E Stupka; J Szustakowki; D Thierry-Mieg; J Thierry-Mieg; L Wagner; J Wallis; R Wheeler; A Williams; Y I Wolf; K H Wolfe; S P Yang; R F Yeh; F Collins; M S Guyer; J Peterson; A Felsenfeld; K A Wetterstrand; A Patrinos; M J Morgan; P de Jong; J J Catanese; K Osoegawa; H Shizuya; S Choi; Y J Chen; J Szustakowki Journal: Nature Date: 2001-02-15 Impact factor: 49.962
Authors: Belinda Giardine; Cathy Riemer; Ross C Hardison; Richard Burhans; Laura Elnitski; Prachi Shah; Yi Zhang; Daniel Blankenberg; Istvan Albert; James Taylor; Webb Miller; W James Kent; Anton Nekrutenko Journal: Genome Res Date: 2005-09-16 Impact factor: 9.043
Authors: Sarah A McClymont; Paul W Hook; Alexandra I Soto; Xylena Reed; William D Law; Samuel J Kerans; Eric L Waite; Nicole J Briceno; Joey F Thole; Michael G Heckman; Nancy N Diehl; Zbigniew K Wszolek; Cedric D Moore; Heng Zhu; Jennifer A Akiyama; Diane E Dickel; Axel Visel; Len A Pennacchio; Owen A Ross; Michael A Beer; Andrew S McCallion Journal: Am J Hum Genet Date: 2018-11-29 Impact factor: 11.025
Authors: Rajandeep S Sekhon; Christopher Saski; Rohit Kumar; Barry S Flinn; Feng Luo; Timothy M Beissinger; Arlyn J Ackerman; Matthew W Breitzman; William C Bridges; Natalia de Leon; Shawn M Kaeppler Journal: Plant Cell Date: 2019-06-25 Impact factor: 11.277
Authors: Lama AlAbdi; Debapriya Saha; Ming He; Mohd Saleem Dar; Sagar M Utturkar; Putu Ayu Sudyanti; Stephen McCune; Brice H Spears; James A Breedlove; Nadia A Lanman; Humaira Gowher Journal: Cell Rep Date: 2020-02-04 Impact factor: 9.423
Authors: Martyna Gajos; Olga Jasnovidova; Alena van Bömmel; Susanne Freier; Martin Vingron; Andreas Mayer Journal: Nucleic Acids Res Date: 2021-05-07 Impact factor: 16.971
Authors: Chao Xie; Zhen Xuan Yeo; Marie Wong; Jason Piper; Tao Long; Ewen F Kirkness; William H Biggs; Ken Bloom; Stephen Spellman; Cynthia Vierra-Green; Colleen Brady; Richard H Scheuermann; Amalio Telenti; Sally Howard; Suzanne Brewerton; Yaron Turpaz; J Craig Venter Journal: Proc Natl Acad Sci U S A Date: 2017-07-03 Impact factor: 11.205
Authors: Charles G Danko; Lauren A Choate; Brooke A Marks; Edward J Rice; Zhong Wang; Tinyi Chu; Andre L Martins; Noah Dukler; Scott A Coonrod; Elia D Tait Wojno; John T Lis; W Lee Kraus; Adam Siepel Journal: Nat Ecol Evol Date: 2018-01-29 Impact factor: 15.460