WIDDE: a Web-Interfaced next generation database for genetic diversity exploration, with a first application in cattle.

Guilhem Sempéré¹, Katayoun Moazami-Goudarzi², André Eggen³, Denis Laloë⁴, Mathieu Gautier⁵, Laurence Flori⁶,⁷.

Abstract

BACKGROUND: The advent and democratization of next generation sequencing and genotyping technologies have led to a huge amount of data for characterizing population genetic diversity in model and non-model species. However, efficient storage, management, cross-analysis and exploration of such dense genotyping datasets remain challenging. This is particularly true for the bovine species, for which many SNP datasets have been generated in various cattle populations with different genotyping tools.

DESCRIPTION: We developed WIDDE, a Web-Interfaced Next Generation Database that stands as a generic tool applicable to a wide range of species and marker types (http://widde.toulouse.inra.fr). As a first illustration, we describe here its first version dedicated to cattle biodiversity, which includes a large and evolving cattle genotyping dataset for over 750,000 SNPs available on 129 (89 public) different cattle populations representative of the world-wide bovine genetic diversity and on 7 outgroup bovid species. This version offers an optional marker and individual filtering step, export of genotyping data in different popular formats, and exploration of genetic diversity through principal component analysis. Users can also explore their own genotyping data together with data from WIDDE, assign their samples to WIDDE populations using a distance-based assignment method and supervised clustering, and estimate their ancestry composition relative to the populations represented in the database.

CONCLUSION: The cattle version of WIDDE represents, to our knowledge, the first database dedicated to cattle biodiversity and SNP genotyping data, and it will be very useful for researchers interested in this field. As a generic tool applicable to a wide range of marker types, WIDDE is intended for the exploration of genetic diversity in any species and will be extended to other species shortly. Its structure makes it easy to include additional output formats and new tools dedicated to genetic diversity exploration.

Year: 2015; PMID: 26573482; PMCID: PMC4647285; DOI: 10.1186/s12864-015-2181-1

Source DB: PubMed; Journal: BMC Genomics; ISSN: 1471-2164; Impact factor: 3.969

Background

Next Generation Sequencing (NGS) and genotyping (NGG) technologies have revolutionized variant genotyping and now allow cost-effective, genome-wide characterization of genetic diversity in a growing number of species, including non-model species [1]. In livestock species, a growing amount of genomic information on several dozen local breeds has been generated with low to high density SNP chips, as exemplified by cattle population studies [2-7] and studies in other species [8-10]. However, efficient storage of the huge resulting datasets for management, sharing and routine exploration purposes remains challenging. We thus developed WIDDE, a Web-accessible NoSQL database dedicated to the storage and management of dense genotyping datasets (e.g. up to hundreds of thousands of markers genotyped on thousands of individuals), coupled with various user-friendly tools for (i) data selection, (ii) data exploration, (iii) export into various popular formats and (iv) population assignment. Via a web interface managing access to public (freely accessible) and private (accessible via login and password) data, users can select data subsets (on a population and/or marker location basis), perform basic data quality checking and standard population genetic analyses via a test for Hardy-Weinberg equilibrium and principal component analysis (PCA), and export the resulting datasets into various popular formats. Users can also jointly analyze their own genotyping data with WIDDE data subsets, in order to explore genetic proximity between populations by allele sharing distance (ASD) calculation, PCA and supervised clustering, and to estimate the ancestry composition of their samples and assign them to populations. WIDDE functionalities are illustrated on a large and evolving cattle dataset that is representative of the world-wide bovine genetic diversity.

Construction and content

Database architecture and implementation details

The WIDDE architecture is shown in Fig. 1. From a technical point of view, we used a NoSQL database engine to store and efficiently query millions of genotypes. MongoDB (http://www.mongodb.org/) was chosen as an open-source solution supporting complex queries and easy scalability. MongoDB manages relationships through collections, which store documents with a flexible structure that can be embedded in one another, and it uses BSON (Binary JSON; http://bsonspec.org/) as its data storage format. Defining a data structure for use with NoSQL relies on a preliminary analysis of the queries that the targeted application will need to execute. The WIDDE data structure was therefore centered on variants (Additional file 1: Figure S1), because the number of variants to be stored is expected to be higher than the number of individuals. The genotyping data documents are stored in a collection whose keys consist of (variant, project, run) triplets. Each such document, which is the most basic unit of data stored in the database, embeds the marker genotypes of all samples involved in the given run.
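The variant-centered layout can be pictured with a minimal sketch. The field names (`_id`, `genotypes`), the sample, project and run labels, and the helper functions are illustrative assumptions rather than the actual WIDDE schema, and plain Python dictionaries stand in for MongoDB documents.

```python
# Hypothetical sketch of a variant-centered genotype store.
# Field names and identifiers are assumptions, not the WIDDE schema.

def make_genotype_document(variant_id, project, run, genotypes):
    """Build one document keyed by the (variant, project, run) triplet.

    `genotypes` maps sample identifiers to genotype strings; all samples
    of the run are embedded in the same document.
    """
    return {
        "_id": {"variant": variant_id, "project": project, "run": run},
        "genotypes": dict(genotypes),
    }


def genotypes_for_samples(collection, variant_id, sample_ids):
    """Collect the genotypes of selected samples at one variant, across runs."""
    wanted = set(sample_ids)
    found = {}
    for doc in collection:
        if doc["_id"]["variant"] != variant_id:
            continue
        for sample, genotype in doc["genotypes"].items():
            if sample in wanted:
                found[sample] = genotype
    return found


# Toy collection: two runs for snp_0001, one run for snp_0002.
collection = [
    make_genotype_document("snp_0001", "projA", "run1", {"ind1": "AA", "ind2": "AG"}),
    make_genotype_document("snp_0001", "projB", "run1", {"ind3": "GG"}),
    make_genotype_document("snp_0002", "projA", "run1", {"ind1": "CC", "ind2": "CT"}),
]

print(genotypes_for_samples(collection, "snp_0001", ["ind1", "ind3"]))
```

With this layout, reading all genotypes at one marker touches only the documents of that variant, which is the access pattern a variant-centered design favors.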

Fig. 1

WIDDE architecture diagram. This high-level diagram illustrates the WIDDE architecture. It provides information about the entities involved when using the information system, the data flows that occur between them, and the third-party software used in the process.

The server application was written in Java, making use of the industry-standard Spring framework (http://spring.io/). The Opal Toolkit (http://nbcr.ucsd.edu/data/docs/opal/) allows submitting jobs to a computer cluster running Sun Grid Engine (SGE), to perform either PCA or individual assignment. The client interface was developed in JSP with the jQuery JavaScript library (http://jquery.com/), and relies on the D3.js library (http://d3js.org/) for PCA result display.

Methods

PCA of individuals based on SNP genotyping data is performed with the smartpca software package [11]. Assignment of new individuals provided by users to WIDDE populations is performed using both a distance method [12] and supervised clustering [13]. The allele sharing distance (ASD), defined as 1-x where x represents the proportion of alleles alike in state averaged over all genotyped SNPs, is calculated between individuals submitted by users and all public individuals included in WIDDE [6]. For each submitted individual, the average ASD with all individuals of each population is also calculated and the top 5 or 10 genetically closest populations are summarized.

Supervised clustering is used to estimate ancestry proportions of samples relative to each reference population represented in the database (world dataset). We relied on a simplified version of the EM algorithm described in [13] and [14] to estimate (genome-wide) ancestry proportions of each individual relative to the reference populations. To that end, we first estimate SNP allele frequencies within each reference population using a Laplace approximation: $\hat{f}_i = (y_i + 1)/(n_i + 2)$, where $y_i$ is the allele count and $n_i$ the total allele count for population i. We then used the likelihood model proposed by FRAPPE’s EM algorithm [14] and Admixture to estimate the fraction q of individual j’s genome assigned to each of the k populations (see equations 2 and 4 in [13]), using different values for the EM algorithm’s ε stopping criterion (0.01, 0.1 and 1). As the EM algorithm converges slowly, a fairly loose ε criterion is used to allow fast termination. A smaller value of ε improves the accuracy of parameter estimates (provided the algorithm is not converging to a local optimum) at the cost of an additional computational burden.
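The supervised clustering step can be sketched for a single individual as follows. This is a minimal illustration under stated assumptions, not the WIDDE implementation: the Laplace estimate is taken to be the usual add-one form (y + 1)/(n + 2), genotypes are coded as counts (0, 1, 2) of one allele with `None` for missing calls, reference allele frequencies are held fixed, and ε is interpreted as the minimum log-likelihood gain required to continue iterating. All function names are hypothetical.

```python
import math


def laplace_frequency(allele_count, total_alleles):
    """Smoothed allele frequency (y + 1) / (n + 2); never exactly 0 or 1."""
    return (allele_count + 1) / (total_alleles + 2)


def mixture_frequency(q, freqs, locus):
    """Expected allele frequency at a locus for ancestry proportions q."""
    return sum(qk * fk[locus] for qk, fk in zip(q, freqs))


def log_likelihood(genotypes, freqs, q):
    """Binomial log-likelihood of one individual's genotypes given q."""
    total = 0.0
    for locus, g in enumerate(genotypes):
        if g is None:  # skip missing genotypes
            continue
        p = mixture_frequency(q, freqs, locus)
        total += g * math.log(p) + (2 - g) * math.log(1 - p)
    return total


def estimate_ancestry(genotypes, freqs, epsilon=0.01, max_iter=1000):
    """EM updates of q with fixed reference frequencies (supervised mode).

    Assumption: iteration stops when the log-likelihood gain is below epsilon.
    """
    n_pops = len(freqs)
    q = [1.0 / n_pops] * n_pops
    previous = log_likelihood(genotypes, freqs, q)
    for _ in range(max_iter):
        new_q = [0.0] * n_pops
        n_alleles = 0
        for locus, g in enumerate(genotypes):
            if g is None:
                continue
            p = mixture_frequency(q, freqs, locus)
            for k in range(n_pops):
                f = freqs[k][locus]
                # Expected number of this individual's alleles drawn from population k.
                new_q[k] += g * q[k] * f / p + (2 - g) * q[k] * (1 - f) / (1 - p)
            n_alleles += 2
        if n_alleles == 0:  # no genotype called: keep the uniform start
            return q
        q = [value / n_alleles for value in new_q]
        current = log_likelihood(genotypes, freqs, q)
        if current - previous < epsilon:
            break
        previous = current
    return q


# Toy reference populations: allele counts out of 40 alleles at four SNPs.
pop_a = [laplace_frequency(y, 40) for y in (38, 36, 2, 4)]
pop_b = [laplace_frequency(y, 40) for y in (2, 4, 38, 36)]
q = estimate_ancestry([2, 2, 0, 0], [pop_a, pop_b], epsilon=0.01)
print([round(value, 3) for value in q])
```

Because the smoothed frequencies are strictly between 0 and 1, the logarithms are always defined, and a looser ε simply ends the loop after fewer updates.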

Data source for WIDDE-cattle

The first WIDDE version includes bovine genotyping data, obtained with medium to high density Illumina SNP chips (54K and 770K), from different breeds arising from biodiversity studies [6, 7, 15–18]. Genotyping data from [6, 7, 15, 18] were already stored in the Dryad Digital Repository [19-21]. Genotyping data produced by Gautier & Naves were available as online supporting information [16], and data from [17], which we produced, were added to the database. HD genotyping data were obtained from Illumina. Only cattle populations with at least 5 individuals were stored in WIDDE, and those with at least 15 individuals were chosen as reference populations for population assignment. SNPchiMp was used to obtain consistent marker lists and identify marker synonyms [22], and all markers were mapped on the current reference genome assembly (bosTau6, UMD3.1). We also detected identical SNPs with different Illumina identifiers (98 duplicates and 6 triplicates) based on chromosome position. At these positions, we checked that the genotypes were identical for each individual and stored only one genotype per chromosome position in WIDDE. Each SNP stored in WIDDE is thus unique, and its ID synonyms are recorded. At the time of writing, the import of any new population into WIDDE can be proposed by contacting the administrators.
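The duplicate-handling step described above can be sketched as follows. This is an illustration only: the marker identifiers, the tuple layout and the function names are hypothetical, and the real curation relied on SNPchiMp and chip annotation rather than on this code.

```python
from collections import defaultdict


def group_markers_by_position(markers):
    """Group marker identifiers sharing the same (chromosome, position)."""
    by_position = defaultdict(list)
    for marker_id, chromosome, position in markers:
        by_position[(chromosome, position)].append(marker_id)
    return by_position


def merge_duplicates(markers, genotypes):
    """Keep one marker per position and record the others as synonyms.

    Raises ValueError if markers at the same position carry different
    genotypes for any individual.
    """
    kept = {}
    for position_key, ids in group_markers_by_position(markers).items():
        primary, synonyms = ids[0], ids[1:]
        for other in synonyms:
            if genotypes[other] != genotypes[primary]:
                raise ValueError(f"conflicting genotypes at {position_key}")
        kept[primary] = {"position": position_key, "synonyms": synonyms}
    return kept


# Toy data: snp_a and snp_b are two identifiers for the same position.
markers = [("snp_a", "1", 1000), ("snp_b", "1", 1000), ("snp_c", "2", 500)]
genotypes = {
    "snp_a": ["AA", "AG"],
    "snp_b": ["AA", "AG"],
    "snp_c": ["CC", "CT"],
}
kept = merge_duplicates(markers, genotypes)
print(kept)
```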

Utility and Discussion

Application features

The WIDDE application has four main functionalities: (i) storing high density genotyping data for hundreds to thousands of individuals, each characterized by its population of origin and genotyping project, (ii) selecting, filtering and exporting genotyping data subsets in several formats (i.e. plink, eigenstrat, hapmap) for downstream analyses, (iii) directly exploring intra-species genetic diversity via PCA and (iv) exploring user-provided genotyped individuals together with WIDDE individuals by assigning them to WIDDE reference populations. This last step includes a visual assignment through PCA, a distance-based assignment without calibration and an estimation of the samples’ ancestry composition relative to reference populations by supervised clustering [12-14]. WIDDE supports storing information from any type of marker derived from NGS (e.g. vcf files), NGG (SNP data) or older technologies (e.g. microsatellite data). Moreover, the WIDDE data structure contains various information about populations, genotyping projects, marker ID synonyms and chromosomal positions on the current reference genome assembly. WIDDE handles both public and private (accessible via login/password) genotyping data.

Web-interface

The WIDDE website consists of five sections accessible from the homepage. The “Home” section provides a concise description of the database and gives general information about the tool. The “Tutorial” section contains a step-by-step tutorial illustrated with several screenshots. The “Data sources” section lists the references and sources of the data included in the database, and the “Contact us” section contains the names, affiliations and email addresses of the main people involved in database conception and maintenance. The “Cattle data” section gives access to the actual application dedicated to the bovine species.

At the top of the application’s screen, a logo and three icons allow users, respectively, to (i) return to the homepage, (ii) visualize the populations’ origins on a map, (iii) upload data to launch population assignment and exploration of the user’s genotyping data, and (iv) authenticate to gain access to private genotyping data. In the middle of the screen, a user-friendly Web interface contains three panels for individual selection, marker selection and quality filtering, each appearing once the previous selection is valid (Fig. 2). The dataset is defined in two steps. Individuals are selected in the first box, according to their population, genotyping project of origin and possible misidentification (e.g. problematic individuals identified by previous genetic analyses, due to population misidentification based on phenotype). As populations are chosen from the list, the total numbers of currently selected individuals and samples are automatically displayed. A batch selection of individuals by population group (European taurine, African taurine, zebu, hybrid and outgroup species) and DNA chip model (Illumina BovineSNP50v1, Illumina BovineSNP50v2 and Illumina BovineHD) is also possible. The selected chips are then displayed in the second box, and markers can be selected according to their DNA location (mitochondrial, autosomal and/or sex chromosomes). The number of markers in the current selection is also kept up to date in real time.

An optional quality filtering step is available in a third box, where two thresholds set the minimum genotyping call rate for individuals and markers (95 % and 75 % by default). The order in which these first two filters are applied can be reversed by ticking a checkbox. By carrying out an exact test for Hardy-Weinberg equilibrium [23], a third filter can be applied to discard outlying markers (P < 0.001 by default). Finally, a filter on the minor allele frequency (MAF), computed over all populations of the selected dataset, can discard poorly informative markers (MAF < 0.01 by default).
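The call-rate and MAF filters can be illustrated with a short sketch. It is not the WIDDE code: the data layout (a dictionary mapping each individual to a list of allele counts, with `None` for missing calls), the function names and the reading of the default thresholds as 95 % for individuals and 75 % for markers are assumptions, and the Hardy-Weinberg exact test is left out for brevity.

```python
def call_rate(values):
    """Fraction of non-missing genotypes (None marks a missing call)."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def minor_allele_frequency(values):
    """MAF from allele counts (0, 1, 2), ignoring missing calls."""
    called = [v for v in values if v is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def keep_individuals(matrix, min_rate):
    return {ind: row for ind, row in matrix.items() if call_rate(row) >= min_rate}


def keep_markers(matrix, marker_ids, predicate):
    """Keep the marker columns for which predicate(column) is true."""
    kept = [j for j in range(len(marker_ids))
            if predicate([row[j] for row in matrix.values()])]
    reduced = {ind: [row[j] for j in kept] for ind, row in matrix.items()}
    return reduced, [marker_ids[j] for j in kept]


def quality_filter(matrix, marker_ids, min_ind_rate=0.95, min_marker_rate=0.75,
                   min_maf=0.01, markers_first=False):
    """Apply both call-rate filters (in either order), then the MAF filter."""
    def marker_step(m, ids):
        return keep_markers(m, ids, lambda col: call_rate(col) >= min_marker_rate)

    if markers_first:
        matrix, marker_ids = marker_step(matrix, marker_ids)
        matrix = keep_individuals(matrix, min_ind_rate)
    else:
        matrix = keep_individuals(matrix, min_ind_rate)
        matrix, marker_ids = marker_step(matrix, marker_ids)
    return keep_markers(matrix, marker_ids,
                        lambda col: minor_allele_frequency(col) >= min_maf)


# Toy dataset; the individual threshold is relaxed because it is so small.
matrix = {
    "ind1": [0, 1, 2],
    "ind2": [0, 2, 1],
    "ind3": [0, None, None],
    "ind4": [0, 1, None],
}
filtered, kept_ids = quality_filter(matrix, ["m1", "m2", "m3"], min_ind_rate=0.6)
print(kept_ids, filtered)
```

In this toy case `ind3` fails the individual call rate, `m3` fails the marker call rate and `m1` is monomorphic, so only `m2` remains for three individuals.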

Fig. 2

Web interface to select individuals and markers, apply quality filter, export data in various formats and launch principal component analysis

Before or after quality filtering, genotyping data can be exported in popular formats (e.g. plink [24], eigenstrat [25]). Users may also explore genetic diversity via an online PCA performed with the smartpca software [11]. The dataset is first converted to eigenstrat format and smartpca is then launched on a computer cluster. Individuals are then plotted by default on the first factorial plane in a new window that allows other components to be selected (Fig. 3). As this step may take time (from a few minutes to a few dozen minutes, depending on the size of the dataset and the cluster queue status), users can enter their email address to be informed of job completion. After job completion, the genotyping data in eigenstrat format, a summary of the individual and marker selection (selection.txt) and the smartpca output files (output.pca.evec, output.eval and sdtout.txt) may be downloaded.
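As a rough illustration of the conversion step, the sketch below writes the genotype and individual parts of an eigenstrat dataset from an in-memory matrix. In that format the genotype file holds one line per SNP and one character per individual (0, 1 or 2 copies of the reference allele, 9 for a missing call), and the individual file holds an identifier, a sex code and a population label. The data layout and function names are assumptions, not the WIDDE exporter.

```python
def to_eigenstrat_geno(matrix, n_markers):
    """Return the .geno lines: one line per SNP, one character per individual.

    `matrix` maps each individual to a list of reference-allele counts
    (0, 1, 2) with None for missing calls, which is written as 9.
    """
    rows = list(matrix.values())
    lines = []
    for j in range(n_markers):
        lines.append("".join("9" if row[j] is None else str(row[j]) for row in rows))
    return lines


def to_eigenstrat_ind(populations):
    """Return the .ind lines; sex is written as unknown (U) in this sketch."""
    return [f"{ind} U {pop}" for ind, pop in populations.items()]


# Toy dataset with two individuals and three SNPs.
matrix = {"ind1": [0, 1, 2], "ind2": [2, None, 1]}
populations = {"ind1": "POP_A", "ind2": "POP_B"}

print("\n".join(to_eigenstrat_geno(matrix, 3)))
print("\n".join(to_eigenstrat_ind(populations)))
```

The individuals must appear in the same order in both outputs, which holds here because both dictionaries are built in the same insertion order.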

Fig. 3

Plot of the individuals according to their coordinates on the first two principal components of the principal component analysis including 44,554 SNPs genotyped on 685 individuals from 22 cattle populations representative of cattle genetic diversity. Eight EUT (Abondance/ABO, Angus/ANG, Aubrac/AUB, Charolais/CHA, Holstein/HOL, Montbéliard/MON, Normande/NOR and Salers/SAL), four AFT (Baoulé/BAO, Lagune/LAG, N’Dama/NDA and Somba/SOM), six ZEB (Brahman/BRM, Nelore/NEL, Gir/GIR, Zebu Bororo/ZBO, Zebu Fulani/ZFU and Zebu from Madagascar/ZMA) and four admixed populations (Borgou/BOR, Kouri/KUR, Oumes Zaër/OUL and Santa Gertrudis/SGT) genotyped on the Illumina BovineSNP50v1 were selected. Data were filtered using default parameters.

By clicking on the assignment icon, users can also analyze, through an upload interface, their own genotyping data (in plink format with nucleotide letters) together with WIDDE public genotyping data. This process, which can be time consuming, is also detached from the web interface and runs on the aforementioned computer cluster via Opal and SGE. Based on the SNPs in common, users can choose (i) to perform a PCA of their genotyping data combined with the public genotyping data stored in WIDDE, and (ii) to assign these new individuals to populations of the public reference dataset based on ASD calculation and on estimation of ancestry proportions by supervised clustering [12-14]. The ASD between each submitted individual and each individual from the reference dataset is calculated, and the five or ten populations of the WIDDE reference dataset with the lowest average ASD are listed (along with the minimum and maximum ASD within each population) for each new individual. The supervised clustering step determines, for each new individual, the proportion of ancestry attributed to each population of the WIDDE reference dataset. Users, who may enter their email address to be informed of job completion, can download: (i) a summary and the complete results of ASD calculation (asd_summary.tsv and asd_results.tsv), (ii) a summary and the complete results of the supervised clustering (ancestry_summary.tsv and ancestry_results.tsv) and (iii) the merged dataset used in the analysis in eigenstrat format.
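The distance-based part of the assignment can be sketched as follows. The sketch assumes genotypes coded as allele counts (0, 1, 2, with `None` for missing calls) and counts the alleles alike in state at a SNP as 2 minus the absolute difference between the two counts; both the coding and the function names are assumptions made for illustration.

```python
def allele_sharing_distance(g1, g2):
    """ASD = 1 - x, with x the proportion of alleles alike in state."""
    shared = 0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:  # skip SNPs missing in either individual
            continue
        shared += 2 - abs(a - b)
        compared += 2
    if compared == 0:
        return None
    return 1 - shared / compared


def closest_populations(query, reference, top=5):
    """Rank reference populations by average ASD to the query individual.

    `reference` maps a population label to a list of genotype vectors.
    Returns (population, mean ASD, min ASD, max ASD) tuples.
    """
    summary = []
    for population, individuals in reference.items():
        distances = [d for d in (allele_sharing_distance(query, ind)
                                 for ind in individuals) if d is not None]
        if not distances:
            continue
        mean = sum(distances) / len(distances)
        summary.append((mean, min(distances), max(distances), population))
    summary.sort()
    return [(pop, mean, low, high) for mean, low, high, pop in summary[:top]]


# Toy reference dataset with hypothetical population labels.
reference = {
    "POP_A": [[0, 0, 2, 2], [0, 1, 2, 2]],
    "POP_B": [[2, 2, 0, 0], [2, 1, 0, 0]],
}
ranking = closest_populations([0, 0, 2, 1], reference, top=2)
for row in ranking:
    print(row)
```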

Illustration with cattle data

As an illustration of WIDDE functionalities, we detail here the cattle module, which currently contains 783,640 SNPs and 3951 (2827 publicly available) individuals belonging to 129 (89 public) different cattle populations and 8 (7 public) populations of outgroup species (two Bos javanicus populations, Bison bison, Syncerus caffer, Bos gaurus, Bubalus depressicornis, Bos grunniens), all carefully selected and curated. These various local cattle populations are representative of bovine genetic diversity and belong to the three main cattle groups, i.e. European (EUT) and African (AFT) taurine (Bos taurus) and zebus (ZEB; Bos indicus). Figure 3 shows the PCA results for a data subset including 685 individuals from 22 cattle populations representative of EUT, AFT and ZEB, genotyped on 44,554 SNPs after quality filtering using default settings (Fig. 2). The first factorial plane recovers the previously described triangle-like two-dimensional global organization of cattle genetic diversity [6]. Briefly, the three main cattle groups are positioned at the apexes of the triangle and admixed populations lie at intermediate positions.

A world reference dataset, including all WIDDE public populations with at least 15 individuals that are representative of the world-wide genetic diversity of the bovine species, was defined to assign user-uploaded individuals to WIDDE public populations. To illustrate this step, we estimated the ancestry proportions of 2250 individuals from 45 public populations of the world reference dataset against the world reference dataset itself using supervised clustering. We started from 33K SNPs (i.e. the highest number of variants taken into account in the analysis), and from 10K SNPs and 1K SNPs randomly chosen within the 33K list, and considered different EM stopping criteria (ε = 0.01, ε = 0.1 and ε = 1). Based on these supervised clustering results, the proportion of assigned individuals and the misassignment rate were then calculated for different ancestry thresholds ranging from 0 to 1 (Fig. 4). More precisely, for a given threshold t, each individual was assigned to a population j if its estimated ancestry proportion q for that population satisfied q > t (for small values of t, if several populations satisfied this criterion, the individual was assigned to the population displaying the highest ancestry proportion). The assignment rate (for a given ancestry threshold t) was thus defined as the proportion of individuals assigned to a population, and the misassignment rate as the proportion of assigned individuals that were assigned to a population different from their population of origin.
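The two rates can be computed from estimated ancestry proportions as in the sketch below. The ancestry values are toy numbers with hypothetical population labels, not results from WIDDE, and the function names are illustrative.

```python
def assign(ancestry, threshold):
    """Return the population with the highest ancestry proportion if that
    proportion exceeds the threshold, otherwise None (not assigned)."""
    population, proportion = max(ancestry.items(), key=lambda item: item[1])
    return population if proportion > threshold else None


def assignment_rates(individuals, threshold):
    """Compute (assignment rate, misassignment rate) at one threshold.

    `individuals` is a list of (population of origin, ancestry dict) pairs.
    The misassignment rate is relative to the assigned individuals only.
    """
    if not individuals:
        return 0.0, 0.0
    assigned = 0
    wrong = 0
    for origin, ancestry in individuals:
        call = assign(ancestry, threshold)
        if call is None:
            continue
        assigned += 1
        if call != origin:
            wrong += 1
    assignment_rate = assigned / len(individuals)
    misassignment_rate = wrong / assigned if assigned else 0.0
    return assignment_rate, misassignment_rate


# Toy ancestry estimates for four individuals of known origin.
individuals = [
    ("POP_A", {"POP_A": 0.97, "POP_B": 0.03}),
    ("POP_B", {"POP_A": 0.40, "POP_B": 0.60}),
    ("POP_B", {"POP_A": 0.85, "POP_B": 0.15}),
    ("POP_A", {"POP_A": 0.55, "POP_B": 0.45}),
]
for t in (0.0, 0.5, 0.8):
    print(t, assignment_rates(individuals, t))
```

Raising the threshold leaves more individuals unassigned, which is the trade-off between assignment rate and misassignment rate explored in the text.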

Fig. 4

Proportion of assigned individuals and misassignment rate in assignment tests based on supervised clustering. The 2250 individuals from 45 public populations of the world reference dataset were assigned against the world reference dataset, using 32,966 (33K) SNPs, 10K SNPs and 1K SNPs, with different values for the EM algorithm’s ε stopping criterion (0.01, 0.1 and 1). The proportion of assigned individuals (a) and the misassignment rate (b) were plotted against ancestry thresholds (0–1).

As expected, the proportion of assigned individuals increased with the number of selected markers and with the stringency of the stopping criterion (Fig. 4a). For instance, at an ancestry threshold of 0.8, the proportions of assigned individuals were above 80 % with 33K SNPs and above 60 % with 10K SNPs, whatever the stopping criterion value. The misassignment rates, in turn, always remained under 2 % with both 33K and 10K SNPs (Fig. 4b). Note, however, that for a small number of SNPs (e.g. 1K) the misassignment rates sometimes reached values above 5 %. As a rule of thumb, one may thus recommend using at least 10K SNPs and an ancestry threshold above 0.75 for assignment purposes.

We also applied the WIDDE diversity exploration and population assignment tools to a test dataset of individuals not included in the database and belonging to two European taurine breeds, i.e. Montbéliard (2 individuals) and Tarentaise (5 individuals) [7]. Tarentaise is simply another name for the Tarine, although it is considered a separate breed in [7]. We first checked that these populations were positioned near the EUT group, as expected, on the first factorial plane of the PCA (Additional file 2: Figure S2). To identify the WIDDE populations genetically closest to the uploaded individuals, we then launched the assignment module using the reference dataset (ε = 0.01). Additional file 3: Tables S1 and S2 summarize, for each new individual, the five nearest WIDDE populations based on average ASD and the proportion of ancestry attributed to each population of the reference dataset, respectively. Since the corresponding populations were already present in WIDDE, we could check that the Montbéliard and Tarentaise individuals were properly assigned. The two supposed Montbéliard individuals showed the smallest genetic distance to MON, with a proportion of ancestry above 95 %, as expected (Additional file 3: Tables S1 and S2). The supposed Tarentaise individuals proved genetically close to the TAR (Tarine) population but with a lower proportion of ancestry attributed to this breed (between 48.8 and 68.7 %), suggesting possible admixture with other European taurine breeds.

Our results demonstrate the utility of WIDDE in assigning any individual to the genetically closest population and in estimating the ancestry proportions of any individual relative to the WIDDE reference populations. The assignment method based on supervised clustering is especially accurate when the true population of origin is included in the reference dataset [12]. When the new individual’s true breed is not present in WIDDE, the tool still provides an estimate of the most closely related population. Future work will require the implementation of an exclusion method to measure the confidence that an individual truly belongs to a given population. As such exclusion methods are currently difficult to implement with a computation time compatible with a high number of markers, they were not integrated into WIDDE, but output for the Geneclass 2.0 software might be easy to generate and will be available shortly [26].

Conclusion

In summary, the NoSQL next generation database WIDDE is a biodiversity database able to manage and explore large amounts of genotyping data, and to assign new individuals to the populations it stores. WIDDE is a generic tool applicable to a wide range of species and marker types. It is versatile, and future versions of the database will include additional output formats and new tools dedicated to genetic diversity exploration. The first module, WIDDE-cattle, described here, represents the first database dedicated to cattle biodiversity and SNP genotyping data, and allows users to explore not only the WIDDE dataset but also their own genotyping data. It will be very useful for researchers interested in cattle genetic diversity and will be extended to other livestock species shortly.

Availability and requirements

WIDDE is deployed on our institutional website at http://widde.toulouse.inra.fr for research and academic use.

References (23 in total)

1. New methods employing multilocus genotypes to select or exclude populations as origins of individuals.

Authors: J M Cornuet; S Piry; G Luikart; A Estoup; M Solignac
Journal: Genetics Date: 1999-12 Impact factor: 4.562

2. A note on exact tests of Hardy-Weinberg equilibrium.

Authors: Janis E Wigginton; David J Cutler; Goncalo R Abecasis
Journal: Am J Hum Genet Date: 2005-03-23 Impact factor: 11.025

3. Estimation of individual admixture: analytical and study design considerations.

Authors: Hua Tang; Jie Peng; Pei Wang; Neil J Risch
Journal: Genet Epidemiol Date: 2005-05 Impact factor: 2.135

4. Principal components analysis corrects for stratification in genome-wide association studies.

Authors: Alkes L Price; Nick J Patterson; Robert M Plenge; Michael E Weinblatt; Nancy A Shadick; David Reich
Journal: Nat Genet Date: 2006-07-23 Impact factor: 38.330

5. PLINK: a tool set for whole-genome association and population-based linkage analyses.

Authors: Shaun Purcell; Benjamin Neale; Kathe Todd-Brown; Lori Thomas; Manuel A R Ferreira; David Bender; Julian Maller; Pamela Sklar; Paul I W de Bakker; Mark J Daly; Pak C Sham
Journal: Am J Hum Genet Date: 2007-07-25 Impact factor: 11.025

6. Adaptive admixture in the West African bovine hybrid zone: insight from the Borgou population.

Authors: Laurence Flori; Sophie Thevenon; Guiguigbaza-Kossigan Dayo; Marcel Senou; Souleymane Sylla; David Berthier; Katayoun Moazami-Goudarzi; Mathieu Gautier
Journal: Mol Ecol Date: 2014-06-19 Impact factor: 6.185

7. Population structure and eigenanalysis.

Authors: Nick Patterson; Alkes L Price; David Reich
Journal: PLoS Genet Date: 2006-12 Impact factor: 5.917

8. The genome sequence of taurine cattle: a window to ruminant biology and evolution.

Authors: Christine G Elsik; Ross L Tellam; Kim C Worley; Richard A Gibbs; Donna M Muzny; George M Weinstock; David L Adelson; Evan E Eichler; Laura Elnitski; Roderic Guigó; Debora L Hamernik; Steve M Kappes; Harris A Lewin; David J Lynn; Frank W Nicholas; Alexandre Reymond; Monique Rijnkels; Loren C Skow; Evgeny M Zdobnov; Lawrence Schook; James Womack; Tyler Alioto; Stylianos E Antonarakis; Alex Astashyn; Charles E Chapple; Hsiu-Chuan Chen; Jacqueline Chrast; Francisco Câmara; Olga Ermolaeva; Charlotte N Henrichsen; Wratko Hlavina; Yuri Kapustin; Boris Kiryutin; Paul Kitts; Felix Kokocinski; Melissa Landrum; Donna Maglott; Kim Pruitt; Victor Sapojnikov; Stephen M Searle; Victor Solovyev; Alexandre Souvorov; Catherine Ucla; Carine Wyss; Juan M Anzola; Daniel Gerlach; Eran Elhaik; Dan Graur; Justin T Reese; Robert C Edgar; John C McEwan; Gemma M Payne; Joy M Raison; Thomas Junier; Evgenia V Kriventseva; Eduardo Eyras; Mireya Plass; Ravikiran Donthu; Denis M Larkin; James Reecy; Mary Q Yang; Lin Chen; Ze Cheng; Carol G Chitko-McKown; George E Liu; Lakshmi K Matukumalli; Jiuzhou Song; Bin Zhu; Daniel G Bradley; Fiona S L Brinkman; Lilian P L Lau; Matthew D Whiteside; Angela Walker; Thomas T Wheeler; Theresa Casey; J Bruce German; Danielle G Lemay; Nauman J Maqbool; Adrian J Molenaar; Seongwon Seo; Paul Stothard; Cynthia L Baldwin; Rebecca Baxter; Candice L Brinkmeyer-Langford; Wendy C Brown; Christopher P Childers; Timothy Connelley; Shirley A Ellis; Krista Fritz; Elizabeth J Glass; Carolyn T A Herzig; Antti Iivanainen; Kevin K Lahmers; Anna K Bennett; C Michael Dickens; James G R Gilbert; Darren E Hagen; Hanni Salih; Jan Aerts; Alexandre R Caetano; Brian Dalrymple; Jose Fernando Garcia; Clare A Gill; Stefan G Hiendleder; Erdogan Memili; Diane Spurlock; John L Williams; Lee Alexander; Michael J Brownstein; Leluo Guan; Robert A Holt; Steven J M Jones; Marco A Marra; Richard Moore; Stephen S Moore; Andy Roberts; Masaaki Taniguchi; Richard C Waterman; Joseph Chacko; Mimi 
M Chandrabose; Andy Cree; Marvin Diep Dao; Huyen H Dinh; Ramatu Ayiesha Gabisi; Sandra Hines; Jennifer Hume; Shalini N Jhangiani; Vandita Joshi; Christie L Kovar; Lora R Lewis; Yih-Shin Liu; John Lopez; Margaret B Morgan; Ngoc Bich Nguyen; Geoffrey O Okwuonu; San Juana Ruiz; Jireh Santibanez; Rita A Wright; Christian Buhay; Yan Ding; Shannon Dugan-Rocha; Judith Herdandez; Michael Holder; Aniko Sabo; Amy Egan; Jason Goodell; Katarzyna Wilczek-Boney; Gerald R Fowler; Matthew Edward Hitchens; Ryan J Lozado; Charles Moen; David Steffen; James T Warren; Jingkun Zhang; Readman Chiu; Jacqueline E Schein; K James Durbin; Paul Havlak; Huaiyang Jiang; Yue Liu; Xiang Qin; Yanru Ren; Yufeng Shen; Henry Song; Stephanie Nicole Bell; Clay Davis; Angela Jolivet Johnson; Sandra Lee; Lynne V Nazareth; Bella Mayurkumar Patel; Ling-Ling Pu; Selina Vattathil; Rex Lee Williams; Stacey Curry; Cerissa Hamilton; Erica Sodergren; David A Wheeler; Wes Barris; Gary L Bennett; André Eggen; Ronnie D Green; Gregory P Harhay; Matthew Hobbs; Oliver Jann; John W Keele; Matthew P Kent; Sigbjørn Lien; Stephanie D McKay; Sean McWilliam; Abhirami Ratnakumar; Robert D Schnabel; Timothy Smith; Warren M Snelling; Tad S Sonstegard; Roger T Stone; Yoshikazu Sugimoto; Akiko Takasuga; Jeremy F Taylor; Curtis P Van Tassell; Michael D Macneil; Antonio R R Abatepaulo; Colette A Abbey; Virpi Ahola; Iassudara G Almeida; Ariel F Amadio; Elen Anatriello; Suria M Bahadue; Fernando H Biase; Clayton R Boldt; Jeffery A Carroll; Wanessa A Carvalho; Eliane P Cervelatti; Elsa Chacko; Jennifer E Chapin; Ye Cheng; Jungwoo Choi; Adam J Colley; Tatiana A de Campos; Marcos De Donato; Isabel K F de Miranda Santos; Carlo J F de Oliveira; Heather Deobald; Eve Devinoy; Kaitlin E Donohue; Peter Dovc; Annett Eberlein; Carolyn J Fitzsimmons; Alessandra M Franzin; Gustavo R Garcia; Sem Genini; Cody J Gladney; Jason R Grant; Marion L Greaser; Jonathan A Green; Darryl L Hadsell; Hatam A Hakimov; Rob Halgren; Jennifer L Harrow; Elizabeth 
A Hart; Nicola Hastings; Marta Hernandez; Zhi-Liang Hu; Aaron Ingham; Terhi Iso-Touru; Catherine Jamis; Kirsty Jensen; Dimos Kapetis; Tovah Kerr; Sari S Khalil; Hasan Khatib; Davood Kolbehdari; Charu G Kumar; Dinesh Kumar; Richard Leach; Justin C-M Lee; Changxi Li; Krystin M Logan; Roberto Malinverni; Elisa Marques; William F Martin; Natalia F Martins; Sandra R Maruyama; Raffaele Mazza; Kim L McLean; Juan F Medrano; Barbara T Moreno; Daniela D Moré; Carl T Muntean; Hari P Nandakumar; Marcelo F G Nogueira; Ingrid Olsaker; Sameer D Pant; Francesca Panzitta; Rosemeire C P Pastor; Mario A Poli; Nathan Poslusny; Satyanarayana Rachagani; Shoba Ranganathan; Andrej Razpet; Penny K Riggs; Gonzalo Rincon; Nelida Rodriguez-Osorio; Sandra L Rodriguez-Zas; Natasha E Romero; Anne Rosenwald; Lillian Sando; Sheila M Schmutz; Libing Shen; Laura Sherman; Bruce R Southey; Ylva Strandberg Lutzow; Jonathan V Sweedler; Imke Tammen; Bhanu Prakash V L Telugu; Jennifer M Urbanski; Yuri T Utsunomiya; Chris P Verschoor; Ashley J Waardenberg; Zhiquan Wang; Robert Ward; Rosemarie Weikard; Thomas H Welsh; Stephen N White; Laurens G Wilming; Kris R Wunderlich; Jianqi Yang; Feng-Qi Zhao
Journal: Science Date: 2009-04-24 Impact factor: 47.728

9. Development and characterization of a high density SNP genotyping assay for cattle.

Authors: Lakshmi K Matukumalli; Cynthia T Lawley; Robert D Schnabel; Jeremy F Taylor; Mark F Allan; Michael P Heaton; Jeff O'Connell; Stephen S Moore; Timothy P L Smith; Tad S Sonstegard; Curtis P Van Tassell
Journal: PLoS One Date: 2009-04-24 Impact factor: 3.240

10. The genome response to artificial selection: a case study in dairy cattle.

Authors: Laurence Flori; Sébastien Fritz; Florence Jaffrézic; Mekki Boussaha; Ivo Gut; Simon Heath; Jean-Louis Foulley; Mathieu Gautier
Journal: PLoS One Date: 2009-08-12 Impact factor: 3.240
