Fish-T1K (Transcriptomes of 1,000 Fishes) Project: large-scale transcriptome data for fish evolution studies.

Ying Sun1, Yu Huang2, Xiaofeng Li2, Carole C Baldwin3, Zhuocheng Zhou4, Zhixiang Yan5, Keith A Crandall6, Yong Zhang11, Xiaomeng Zhao2, Min Wang7, Alex Wong8, Chao Fang2, Xinhui Zhang2, Hai Huang9, Jose V Lopez10, Kirk Kilfoyle10, Yong Zhang11, Guillermo Ortí6, Byrappa Venkatesh12, Qiong Shi13.   

Abstract

Ray-finned fishes (Actinopterygii) represent more than 50% of extant vertebrates and are of great evolutionary, ecological and economic significance, but they are relatively underrepresented in 'omics studies. Increased availability of transcriptome data for these species will allow researchers to better understand changes in gene expression, and to carry out functional analyses. An international project known as the "Transcriptomes of 1,000 Fishes" (Fish-T1K) project has been established to generate RNA-seq transcriptome sequences for 1,000 diverse species of ray-finned fishes. The first phase of this project has produced transcriptomes from more than 180 ray-finned fishes, representing 142 species and covering 51 orders and 109 families. Here we provide an overview of the goals of this project and the work done so far.

Keywords:  Biodiversity; Database; Fish; Fish-T1K; RNA; Transcriptome

Year:  2016        PMID: 27144000      PMCID: PMC4853854          DOI: 10.1186/s13742-016-0124-7

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524


Background

Ray-finned fishes (Actinopterygii) are the most diverse and abundant group of extant vertebrates. Thus far, approximately 32,900 fish species are recorded in FishBase [1]. Fishes encompass enormous variation in morphology, physiology and ecology. They are of great economic and medical significance as a primary source of protein for people worldwide, as a novel source of active ingredients in pharmaceuticals [2], and as evolutionary models for specific human diseases and conditions [3]. However, genomic resources for fishes are relatively underrepresented and published genetic data represent only a small fraction of extant fish species. So far, the whole genomes of only 38 fish species have been published (Additional file 1) and, although the number is growing (Additional file 2), searching the National Center for Biotechnology Information (NCBI)’s Sequence Read Archive (SRA) database for “fish AND transcriptome” yields 16,975 transcriptomes of only 242 fish species (Table 1). A lack of genomic resources for most fish species motivated us to generate large-scale fish transcriptome data and establish a database that may be used by scientists around the world. To this end, we initiated the “Transcriptomes of 1,000 Fishes” (Fish-T1K) project, an effort devoted to sequencing the transcriptomes of 1,000 different species of ray-finned fishes.
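The SRA search described above can be reproduced programmatically. The following is a minimal sketch, assuming NCBI's public E-utilities ESearch endpoint and its JSON output mode; the function name `build_sra_search_url` is hypothetical. It only builds the query URL and makes no network request. A live query would return the current number of matching records, which will differ from the 16,975 reported here, and counting distinct species would need further processing that is not shown.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (assumed unchanged).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_search_url(term: str) -> str:
    """Return an ESearch URL that queries the SRA database for `term`.

    Hypothetical helper for illustration; it does not contact NCBI.
    """
    params = {"db": "sra", "term": term, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Same search string as quoted in the text.
url = build_sra_search_url("fish AND transcriptome")
print(url)
```

Fetching this URL returns a JSON document whose `esearchresult.count` field holds the number of matching SRA records.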
Table 1

List of fish species with published transcriptome data in NCBI’s SRA, and those generated by Fish-T1K

Order | No. of species in SRA | No. of species in Fish-T1K | No. of new species generated by Fish-T1K
Cypriniformes | 42 | 5 | 3
Cyprinodontiformes | 33 | 2 | 0
Perciformes | 21 | 9 | 9
Cichliformes | 15 | 2 | 2
Salmoniformes | 14 | 0 | 0
Order-level incertae sedis in Eupercaria | 9 | 2 | 2
Pleuronectiformes | 9 | 2 | 1
Osteoglossiformes | 8 | 4 | 2
Siluriformes | 8 | 9 | 8
Clupeiformes | 6 | 1 | 1
Syngnathiformes | 6 | 5 | 5
Gymnotiformes | 5 | 2 | 1
Acipenseriformes | 4 | 1 | 1
Anabantiformes | 4 | 4 | 4
Anguilliformes | 4 | 4 | 3
Centrarchiformes | 4 | 4 | 3
Scombriformes | 4 | 2 | 2
Beloniformes | 3 | 2 | 2
Characiformes | 3 | 6 | 6
Gadiformes | 3 | 1 | 1
Order-level incertae sedis in Ovalentaria | 3 | 5 | 5
Tetraodontiformes | 3 | 4 | 4
Carangiformes | 2 | 2 | 1
Amiiformes | 1 | 1 | 0
Batrachoidiformes | 1 | 1 | 1
Blenniiformes | 1 | 4 | 4
Esociformes | 1 | 0 | 0
Labriformes | 1 | 3 | 2
Lepisosteiformes | 1 | 2 | 2
Ophidiiformes | 1 | 2 | 2
Osmeriformes | 1 | 0 | 0
Pempheriformes | 1 | 2 | 2
Polypteriformes | 1 | 3 | 3
Spariformes | 1 | 2 | 2
Synbranchiformes | 1 | 2 | 2
Argentiniformes | 0 | 1 | 1
Atheriniformes | 0 | 2 | 2
Aulopiformes | 0 | 2 | 2
Chaetodontiformes | 0 | 1 | 1
Elopiformes | 0 | 1 | 1
Ephippiformes | 0 | 2 | 2
Galaxiiformes | 0 | 1 | 1
Gobiiformes | 0 | 3 | 3
Holocentriformes | 0 | 3 | 3
Kurtiformes | 0 | 2 | 2
Lepidogalaxiiformes | 0 | 1 | 1
Lobotiformes | 0 | 1 | 1
Lophiiformes | 0 | 2 | 2
Mugiliformes | 0 | 2 | 2
Order-level incertae sedis in Carangimorphariae | 0 | 3 | 3
Order-level incertae sedis in Percomorpharia | 0 | 12 | 12
Percopsiformes | 0 | 1 | 1
Uranoscopiformes | 0 | 1 | 1
Zeiformes | 0 | 1 | 1
others (Chondrichthyes and Sarcopterygii) | 17 | / | /
All | 242 | 142 | 128
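As a rough worked example of what the Table 1 totals imply, the sketch below computes the share of Fish-T1K species that are new and the combined ray-finned species coverage. It assumes that "new" means a species with no transcriptome in SRA at the time of the search, and that the 17 "others" (Chondrichthyes and Sarcopterygii) should be excluded from the ray-finned count; the variable names are illustrative only.

```python
# Totals taken from the "All" and "others" rows of Table 1.
sra_species_all = 242      # species with transcriptomes in SRA
sra_species_other = 17     # Chondrichthyes and Sarcopterygii in SRA
fisht1k_species = 142      # species sequenced by Fish-T1K
fisht1k_new_species = 128  # Fish-T1K species assumed absent from SRA

# Share of Fish-T1K species that add new taxonomic coverage.
new_share = fisht1k_new_species / fisht1k_species

# Ray-finned species covered by SRA and Fish-T1K together (assumption:
# the "new" species do not overlap with the SRA list).
sra_ray_finned = sra_species_all - sra_species_other
combined_ray_finned = sra_ray_finned + fisht1k_new_species

print(f"New species share: {new_share:.1%}")
print(f"Ray-finned species with transcriptomes: {combined_ray_finned}")
```

Under these assumptions, about nine in ten Fish-T1K species were not previously represented in SRA.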

Fish-T1K

Fish-T1K is an international, collaborative and non-profit initiative officially launched by BGI and the China National Genebank (CNGB) in November 2013. The objective is to generate RNA-seq transcriptome sequences for 1,000 diverse fish species to help scientists unravel the mysteries of fish evolution, and pursue innovative approaches and strategies for addressing challenges in fish breeding, disease control and prevention, seafood safety, and biodiversity conservation. Through this project, an integrated biobank will be established, incorporating a high-level bio-repository and a large-scale transcriptome database. The biobank will collect and store fish genetic resources including vouchers and frozen tissues, DNA and RNA nucleotides, together with related sample information documented according to standard operating procedures (SOPs). A companion database, committed to being the world’s largest database of fish transcriptomes, has already been established and provides access to the sequences via BLAST search.

The Fish-T1K consortium

More than 40 scientists from 25 institutions across seven countries are active members of the Fish-T1K project (Fig. 1; Additional file 3). The Steering Committee consists of six core consortium members who are recognized experts in ichthyology, taxonomy, bioinformatics, phylogenetics, and evolution. In addition to the head office at BGI in Shenzhen, China, we have also established a hub at the Smithsonian National Museum of Natural History (NMNH) in Washington DC, USA, to facilitate quality sample collection from North America.
Fig. 1

Distribution of Fish-T1K Consortium members. See detailed information of these numbered institutions in Additional file 3


Species selection

Fish-T1K proposes to sequence 1,000 different ray-finned fish species representing all the orders and major families [4], and filling important gaps in the phylogenetic tree. Species that are endangered, of great economic and medical significance, or exhibit extreme phenotypes will also be targeted. Candidate species will be decided based on their importance and availability, while the target number will be a compromise between scientific needs and practical limitations such as financial constraints and availability of specimens.

Subprojects

To maximize usage of these transcripts, Fish-T1K has launched several subprojects to address specific questions in fish evolution. The major research goal of Fish-T1K is to reconstruct a comprehensive molecular phylogeny of ray-finned fishes to further resolve and test existing phylogenetic hypotheses. Additional subprojects include analysis of the evolutionary genomics of fish venoms, evolution of the annual life cycle in killifishes, and adaptations related to marine-to-freshwater transitions/migration.

SOPs and best practices

In the past two years, the Fish-T1K Team has established a series of SOPs, approved by BGI’s Institutional Review Board on Bioethics and Biosafety (No. BGI-IRB 15139), to ensure high quality sampling is achieved. Adhering to these SOPs means that all of our genetic resources, data and associated metadata are appropriately obtained, documented, and stored, which is helpful in establishing and optimizing standards common to large-scale transcriptome and genome sequencing projects. Transcriptome data from multiple tissues of five fishes were generated as a pilot quality control test (Additional file 4). Accordingly, total RNA is now routinely extracted from gills and other tissues of interest, and approximately 3.5 Gb of raw data are generated for each sample. Clean reads are assembled de novo into contigs with SOAPdenovo-Trans (v1.3) [5], and the final assembled transcripts are used for annotation, ortholog prediction and other analyses.

Current RNA sequencing progress

The Fish-T1K team has established a collaborative global network for collecting specimens. As of January 2016, 7,000 high quality fish samples were collected from Australia, the Caribbean, Denmark, Singapore, the UK, USA, and many places in China such as the Tibetan Plateau, Sanya, and the Yellow Sea. From these 7,000 samples, RNA samples were extracted from 142 ray-finned species covering 51 orders and 109 families, and around 180 transcriptomes have been produced (Table 1; Additional file 5). Meanwhile, more RNA samples from other species are being isolated and sequenced.
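To put the sequencing figures in perspective, the sketch below estimates raw data volume from the approximately 3.5 Gb generated per sample. It assumes one sample per transcriptome and, for the full project, one sample per species; both are simplifications, since several tissues may be sequenced for some fishes. The function name is hypothetical.

```python
# Approximate raw data generated per sample, as stated in the SOP section.
GB_PER_SAMPLE = 3.5


def estimated_raw_data_gb(n_samples: int) -> float:
    """Estimate total raw data in Gb, assuming a uniform yield per sample."""
    return n_samples * GB_PER_SAMPLE


current_gb = estimated_raw_data_gb(180)   # transcriptomes produced so far
target_gb = estimated_raw_data_gb(1000)   # one sample per target species

print(f"Produced so far: ~{current_gb:.0f} Gb")
print(f"Full project:    ~{target_gb:.0f} Gb")
```

These are order-of-magnitude estimates only; actual yields vary by sample and sequencing run.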

Website and database

The official Fish-T1K website [6] is equipped with a database for BLAST search. The website provides detailed information about the Fish-T1K project, and the database presents sample information (RNA quality, sample provider, etc.) and data quality metrics (raw data size, scaffold size and number, etc.). Users can access the BLAST tool and download sequences of interest. Data will be uploaded periodically as sample collection and transcriptome sequencing progress.

Data sharing policy and data availability

All sequences generated by Fish-T1K will be deposited in NCBI and GigaDB in addition to the Fish-T1K database, following the Fort Lauderdale rules [7] and Toronto International Data Release Workshop guidelines [8], and will be released no later than the publication of any resulting papers. We plan to publish peer-reviewed SOP and method papers, and publications from some of the ongoing subprojects are expected within the coming year or sooner.

Fish-T1K membership

All are welcome to participate in Fish-T1K and to propose new subprojects; these should address a major question in fish evolution and lead to (a) significant publication(s). Interested researchers can email fisht1k@genomics.cn with a brief proposal. The significance, question(s) to be addressed and fishes/tissues to be sequenced and analyzed should be included. On acceptance of a proposal, the lead scientist(s) will be asked to collect any fish tissues that are not already in our list, and to be in charge of analyzing and publishing the generated data.

Conclusions

Similar initiatives already exist to sequence the transcriptomes of large numbers of plants (1KP [9]) and insects (1KITE [10]). They have been well received and have been useful in establishing Fish-T1K. Although some progress has already been made, Fish-T1K is at an early stage. We will continue to expand the scope of the project: in the first phase we aim to cover all orders, and in the second phase all families. More species will be added as required by subprojects. As the world’s first large-scale transcriptome database exclusively for fish, Fish-T1K will greatly enhance the study of fish biology, and eventually contribute to efforts towards global fish biodiversity conservation and the sustainable utilization of natural fish resources.
  6 in total

1.  Phylogenomics resolves the timing and pattern of insect evolution.

Authors:  Bernhard Misof; Shanlin Liu; Karen Meusemann; Ralph S Peters; Alexander Donath; Christoph Mayer; Paul B Frandsen; Jessica Ware; Tomáš Flouri; Rolf G Beutel; Oliver Niehuis; Malte Petersen; Fernando Izquierdo-Carrasco; Torsten Wappler; Jes Rust; Andre J Aberer; Ulrike Aspöck; Horst Aspöck; Daniela Bartel; Alexander Blanke; Simon Berger; Alexander Böhm; Thomas R Buckley; Brett Calcott; Junqing Chen; Frank Friedrich; Makiko Fukui; Mari Fujita; Carola Greve; Peter Grobe; Shengchang Gu; Ying Huang; Lars S Jermiin; Akito Y Kawahara; Lars Krogmann; Martin Kubiak; Robert Lanfear; Harald Letsch; Yiyuan Li; Zhenyu Li; Jiguang Li; Haorong Lu; Ryuichiro Machida; Yuta Mashimo; Pashalia Kapli; Duane D McKenna; Guanliang Meng; Yasutaka Nakagaki; José Luis Navarrete-Heredia; Michael Ott; Yanxiang Ou; Günther Pass; Lars Podsiadlowski; Hans Pohl; Björn M von Reumont; Kai Schütte; Kaoru Sekiya; Shota Shimizu; Adam Slipinski; Alexandros Stamatakis; Wenhui Song; Xu Su; Nikolaus U Szucsich; Meihua Tan; Xuemei Tan; Min Tang; Jingbo Tang; Gerald Timelthaler; Shigekazu Tomizuka; Michelle Trautwein; Xiaoli Tong; Toshiki Uchifune; Manfred G Walzl; Brian M Wiegmann; Jeanne Wilbrandt; Benjamin Wipfler; Thomas K F Wong; Qiong Wu; Gengxiong Wu; Yinlong Xie; Shenzhou Yang; Qing Yang; David K Yeates; Kazunori Yoshizawa; Qing Zhang; Rui Zhang; Wenwei Zhang; Yunhui Zhang; Jing Zhao; Chengran Zhou; Lili Zhou; Tanja Ziesmann; Shijie Zou; Yingrui Li; Xun Xu; Yong Zhang; Huanming Yang; Jian Wang; Jun Wang; Karl M Kjer; Xin Zhou
Journal:  Science       Date:  2014-11-06       Impact factor: 47.728

2.  Fish oil inhibits human lung carcinoma cell growth by suppressing integrin-linked kinase.

Authors:  Shouwei Han; Xiaojuan Sun; Jeffrey D Ritzenthaler; Jesse Roman
Journal:  Mol Cancer Res       Date:  2009-01       Impact factor: 5.852

3.  Evolutionary mutant models for human disease.

Authors:  R Craig Albertson; William Cresko; H William Detrich; John H Postlethwait
Journal:  Trends Genet       Date:  2008-12-26       Impact factor: 11.639

4.  Prepublication data sharing.

Authors:  Ewan Birney; Thomas J Hudson; Eric D Green; Chris Gunter; Sean Eddy; Jane Rogers; Jennifer R Harris; S Dusko Ehrlich; Rolf Apweiler; Christopher P Austin; Lisa Berglund; Martin Bobrow; Chas Bountra; Anthony J Brookes; Anne Cambon-Thomsen; Nigel P Carter; Rex L Chisholm; Jorge L Contreras; Robert M Cooke; William L Crosby; Ken Dewar; Richard Durbin; Stephanie O M Dyke; Joseph R Ecker; Khaled El Emam; Lars Feuk; Stacey B Gabriel; John Gallacher; William M Gelbart; Antoni Granell; Francisco Guarner; Tim Hubbard; Scott A Jackson; Jennifer L Jennings; Yann Joly; Steven M Jones; Jane Kaye; Karen L Kennedy; Bartha Maria Knoppers; Nikos C Kyrpides; William W Lowrance; Jingchu Luo; John J MacKay; Luis Martín-Rivera; W Richard McCombie; John D McPherson; Linda Miller; Webb Miller; Don Moerman; Vincent Mooser; Cynthia C Morton; James M Ostell; B F Francis Ouellette; Julian Parkhill; Parminder S Raina; Christopher Rawlings; Steven E Scherer; Stephen W Scherer; Paul N Schofield; Christoph W Sensen; Victoria C Stodden; Michael R Sussman; Toshihiro Tanaka; Janet Thornton; Tatsuhiko Tsunoda; David Valle; Eero I Vuorio; Neil M Walker; Susan Wallace; George Weinstock; William B Whitman; Kim C Worley; Cathy Wu; Jiayan Wu; Jun Yu
Journal:  Nature       Date:  2009-09-10       Impact factor: 49.962

Review 5.  Data access for the 1,000 Plants (1KP) project.

Authors:  Naim Matasci; Ling-Hong Hung; Zhixiang Yan; Eric J Carpenter; Norman J Wickett; Siavash Mirarab; Nam Nguyen; Tandy Warnow; Saravanaraj Ayyampalayam; Michael Barker; J Gordon Burleigh; Matthew A Gitzendanner; Eric Wafula; Joshua P Der; Claude W dePamphilis; Béatrice Roure; Hervé Philippe; Brad R Ruhfel; Nicholas W Miles; Sean W Graham; Sarah Mathews; Barbara Surek; Michael Melkonian; Douglas E Soltis; Pamela S Soltis; Carl Rothfels; Lisa Pokorny; Jonathan A Shaw; Lisa DeGironimo; Dennis W Stevenson; Juan Carlos Villarreal; Tao Chen; Toni M Kutchan; Megan Rolf; Regina S Baucom; Michael K Deyholos; Ram Samudrala; Zhijian Tian; Xiaolei Wu; Xiao Sun; Yong Zhang; Jun Wang; Jim Leebens-Mack; Gane Ka-Shu Wong
Journal:  Gigascience       Date:  2014-10-27       Impact factor: 6.524

6.  SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.

Authors:  Ruibang Luo; Binghang Liu; Yinlong Xie; Zhenyu Li; Weihua Huang; Jianying Yuan; Guangzhu He; Yanxiang Chen; Qi Pan; Yunjie Liu; Jingbo Tang; Gengxiong Wu; Hao Zhang; Yujian Shi; Yong Liu; Chang Yu; Bo Wang; Yao Lu; Changlei Han; David W Cheung; Siu-Ming Yiu; Shaoliang Peng; Zhu Xiaoqian; Guangming Liu; Xiangke Liao; Yingrui Li; Huanming Yang; Jian Wang; Tak-Wah Lam; Jun Wang
Journal:  Gigascience       Date:  2012-12-27       Impact factor: 6.524
