
Construction of a core collection of eggplant (Solanum melongena L.) based on genome-wide SNP and SSR genotypes.

Koji Miyatake¹, Yoshimi Shinmura¹, Hiroshi Matsunaga¹, Hiroyuki Fukuoka¹,², Takeo Saito¹.

Abstract

A core collection of eggplant (Solanum melongena L.) was developed using the Core Hunter II program, based on a genome-wide dataset of 831 SNP and 50 SSR genotypes analyzed in 893 accessions of eggplant genetic resources held in the NARO Genebank. The 893 accessions were collected worldwide, mainly from Asia. Genetic variation and population structure among the 893 accessions were characterized. The genetic diversity of the Asian accessions, especially the South Asian and Southeast Asian accessions, which form the center of diversity in eggplant, was higher than that of the other regions. The resulting core collection, the World Eggplant Core (WEC) collection, consists of 100 accessions, most of them collected from countries with high genetic diversity. Based on the results of the cluster and STRUCTURE analyses with SNP genotypes, the WEC collection was divided into four clusters (S1-S4). Each cluster corresponds to a geographical group: S1, the European, American and African countries; S2, the East Asian countries; S3, the Southeast Asian countries; and S4, the South Asian and Southeast Asian countries. The genotype and phenotype data of the WEC collection are available from the VegMarks database (https://vegmarks.nivot.affrc.go.jp/resource/), and seed samples are available from the NARO Genebank (https://www.gene.affrc.go.jp/databases-core_collections.php).


Keywords: SNP; SSR; Solanum melongena L.; core collection; database; eggplant

Year: 2019 PMID: 31598083 PMCID: PMC6776151 DOI: 10.1270/jsbbs.18202

Source DB: PubMed Journal: Breed Sci ISSN: 1344-7610 Impact factor: 2.086

Introduction

Since its establishment in 1985 as a department of the National Institute of Agrobiological Resources (NIAR) (currently the National Agriculture and Food Research Organization [NARO]), the Genebank Project has carried out the strategic exploration, collection and introduction of genetic resources, followed by their characterization, reproduction and distribution. At present, the Institute of Vegetable and Floriculture Science, NARO, in collaboration with the Genetic Resources Center, NARO, plays the main role in the project for genetic resources of vegetable crops. Nearly 1,000 accessions of eggplant (Solanum melongena L.) and its wild relatives are currently registered and deposited. Some of these accessions have been used in breeding as valuable sources of unique and useful traits. Using accessions derived from the Genebank, a number of leading varieties have been developed, including the rootstock variety ‘Torvum Vigor’ (Yamakawa 1981), which shows composite resistance to bacterial wilt, Verticillium wilt, Fusarium wilt and nematodes; ‘Daitaro’ (Monma ) and ‘Daizaburo’ (Yoshida ), with high resistance to bacterial wilt and Fusarium wilt; and the unique parthenocarpic varieties ‘Anominori’ (Saito ) and ‘Anominori 2 go’ (Saito ). Genetic resources containing wide genetic variation are widely considered indispensable materials with latent potential for future eggplant breeding. However, the genetic resources of eggplant (and probably of other species as well) are not completely organized. There are a significant number of confusing cases: for instance, the same variety or germplasm may be registered more than once under the same name, the same germplasm may be registered under different names, and the same name may be used for different germplasms. These situations must be resolved.
Additionally, for fruit vegetables such as eggplant, the physical effort and field area required to grow and characterize a plant are considerably larger than those for field crops such as wheat or rice. Therefore, a relatively small subset that consists of a limited number of accessions but retains as much as possible of the genetic variation of the whole set of genetic resources is useful for their efficient and pragmatic use. For this purpose, the construction of an eggplant core collection based on molecular genetic information is urgently required. Although the construction of core collections using molecular genetic information, i.e., DNA marker genotypes, has been reported for several field crop species in Japan, including rice (Ebana ), soybean (Kaga ) and rapeseed (Chen ), there are only a few reports on vegetable crops. Outside Japan, Gangopadhyay , Kumar , and Mao have conducted studies to develop eggplant core collections, but their studies are mainly based on regional sources and phenotype data. As one of the few marker-based examples, Cericola reported that, using the genotype data obtained from 24 SSR markers, a core set consisting of 48 lines could be built to cover all 140 SSR alleles found in the 191 germplasms examined. That study provides some novel information. However, the genetic resources held in each country differ in content and size, and it is therefore important to develop distinct core collections from domestic genetic resources. Here we report the construction of a unique eggplant core collection using the genotype data of 831 SNPs and 50 SSRs obtained from 893 eggplant accessions collected from across the world, mainly Asia. Unlike the previous examples, this collection was prepared for distribution and can be procured from the NARO Genebank through a simple procedure; in addition, the genotype and phenotype data are freely available from the VegMarks database.

Materials and Methods

When the experiment was started, the number of eggplant accessions registered in the NARO Genebank was around 1,000. The seeds of all accessions were sown in a greenhouse, and young leaves were sampled from each plant of the 938 accessions that showed good germination and initial growth. Total DNA was extracted using the DNeasy Plant DNA Extraction Kit (Qiagen, Valencia, CA, USA), and 1,536 SNP genotypes were collected for each accession using the GoldenGate Assay Kit (Illumina, San Diego, CA, USA) constructed by Hirakawa . Initially, we removed unreliable markers from the 1,536 SNPs based on the stability of the genotype calls and the rate of missing data (>10%). Then, 200 accessions were selected as members of a tentative core collection using the Core Hunter II software (De Beukelaer ), adopting the Mixed Replica algorithm with the default parameter settings suggested in the instruction manual. Further, 111 microsatellite markers assigned to each of the 12 chromosomes (Fukuoka ) were genotyped in the tentative core collection, and unreliable genotype calls were removed based on the minor allele frequency (<0.05) and the rate of missing data (>15%). The core collection was then reconstructed using Core Hunter II with a combined dataset of biallelic SNP genotypes and multiallelic microsatellite genotypes. Model-based Bayesian clustering analysis of the whole set of eggplant materials and of the core collection was performed using the STRUCTURE 2.3.4 software (Pritchard ) with the admixture-non F model. The GGT 2.0 program (van Berloo 2008) was used to calculate Jaccard similarity coefficients, and an unrooted Unweighted Pair Group Method with Arithmetic mean (UPGMA) tree was constructed from the distance matrix using the MEGA program (version 6), with 1,000 bootstrap replicates (Tamura ). Genetic diversity indices were calculated using GenAlEx 6.5 (Peakall and Smouse 2012) and PowerMarker 3.25 (Liu and Muse 2005).
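
The marker filtering described above can be illustrated with a minimal sketch. It assumes a simple in-memory layout (a dict from marker name to a list of per-accession calls, each call a pair of alleles, with None for a failed call); this layout and the function names are hypothetical and are not taken from the GoldenGate or Core Hunter II tooling. The thresholds follow the text: markers with more than 10% missing data are dropped, and a minor allele frequency of at least 0.05 is required.

```python
from collections import Counter

MISSING = None  # hypothetical code for a failed genotype call


def missing_rate(calls):
    """Fraction of accessions with no call for one marker."""
    return sum(c is MISSING for c in calls) / len(calls)


def minor_allele_frequency(calls):
    """MAF of a biallelic marker; each call is a pair of alleles, e.g. ('A', 'G')."""
    counts = Counter(a for c in calls if c is not MISSING for a in c)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # fully missing or monomorphic marker
    return min(counts.values()) / total


def filter_markers(genotypes, max_missing=0.10, min_maf=0.05):
    """Return the names of markers that pass both thresholds.

    genotypes: dict mapping marker name -> list of calls (one per accession).
    """
    return [
        name
        for name, calls in genotypes.items()
        if missing_rate(calls) <= max_missing
        and minor_allele_frequency(calls) >= min_maf
    ]
```

For the microsatellite data, the same function could be called with max_missing=0.15, although multiallelic markers would need a per-allele frequency check rather than a single MAF.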

Results and Discussion

SNP genotyping of whole accessions and initial construction of core collection

SNP genotype data were obtained by performing the GoldenGate genotyping assay on the DNA samples of the 938 accessions. First, based on the polymorphism and stability of the replicated genotype data of four standard eggplant lines (‘AE-P03’, ‘LS1934’, ‘Nakate-Shinkiro’ and ‘WCGR112-8’) included in the 938 accessions, 987 SNPs were selected as ‘reliable markers’. Second, the 893 accessions that had less than 10% missing data for the 987 ‘reliable’ SNP markers were selected as candidate accessions for constructing the core collection. Two hundred accessions were selected based on SNP genotypes using the Core Hunter II program. Subsequently, possible duplicates, judged from the SNP genotypes, were removed, and 176 accessions were selected as independent members of a tentative core collection.
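
The two accession-level steps in this paragraph (keeping accessions with less than 10% missing data, then removing possible duplicates) might be sketched as follows. The text does not state the criterion used to judge duplicates, so the rule below (two accessions count as duplicates when their calls agree at every marker typed in both) is an assumption for illustration only, as are the function names and the data layout.

```python
def accession_missing_rate(profile):
    """Fraction of markers with no call (None) for one accession."""
    return sum(g is None for g in profile) / len(profile)


def select_candidates(profiles, max_missing=0.10):
    """Keep accessions whose missing rate is strictly below the threshold.

    profiles: dict mapping accession ID -> list of genotype calls.
    """
    return {
        acc: p for acc, p in profiles.items()
        if accession_missing_rate(p) < max_missing
    }


def mismatch_fraction(p, q):
    """Fraction of markers called in both profiles at which the calls differ."""
    shared = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not shared:
        return 1.0  # nothing to compare; treat as distinct
    return sum(a != b for a, b in shared) / len(shared)


def drop_duplicates(profiles, tolerance=0.0):
    """Keep the first accession of every group of near-identical profiles.

    tolerance is an assumed parameter: profiles whose mismatch fraction
    is at or below it are considered duplicates.
    """
    kept = {}
    for acc, p in profiles.items():
        if all(mismatch_fraction(p, q) > tolerance for q in kept.values()):
            kept[acc] = p
    return kept
```

A small positive tolerance would also absorb occasional genotyping errors between true duplicates, at the cost of merging very closely related lines.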

SSR genotyping and second-round construction of the core collection—World Eggplant Core collection

With a view to practical use, we set the final size of the core collection to 100 and further refined the collection. To investigate the genetic background of the 176 accessions in depth, genotype data were collected from 111 microsatellite markers that were polymorphic among the four standard eggplant lines (Nunome ) and analyzed. To remove poorly reliable genotype calls, alleles with a frequency lower than 0.05 and calls scored as heterozygous were treated as missing data. Fifty microsatellites produced genotype data with less than 15% missing data, and the data obtained from these 50 microsatellites were therefore judged reliable enough and used for further analyses (Supplemental Table 1). Among the 176 accessions, the number of alleles at each of the 50 microsatellites seemed reasonable, ranging from 2 to 8 (4.5 on average, 227 in total). Similarly, 831 SNP markers with a minor allele frequency of 0.05 or more were selected from the 987 markers used for the selection of the 176 accessions. The genotype data from these 881 markers (831 SNPs and 50 microsatellites) were used to select 100 accessions for the final core collection, the World Eggplant Core (WEC) collection, using the Core Hunter II program (Fig. 1, Supplemental Table 2). The genetic diversity indices for each country are summarized in Table 1. Generally, the values of expected heterozygosity (He), Shannon’s information index (I) and polymorphism information content (PIC) were higher in Asian countries (especially India and Malaysia) than in other countries. The WEC collection consisted of 3 African (among 22), 4 American (among 27), 80 Asian (among 695) and 8 European (among 87) accessions, and 5 accessions of unknown origin.
Reflecting the number of accessions examined and their diversity indices, many accessions from Malaysia (21), Lao PDR (11) and Japan (10) were selected (Table 1). The WEC collection constructed using the Core Hunter II program had a high allele retention ratio (97.5%; Supplemental Table 3), indicating that it contains almost all alleles observed in the whole collection. Additionally, the diversity indices, except observed heterozygosity (Ho), were not significantly different between the whole collection and the WEC collection (Supplemental Table 3). These results suggest that the WEC collection maintains most of the genetic diversity in the whole collection.
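
For readers unfamiliar with the indices reported here, the sketch below gives the common textbook per-locus definitions of He, I and PIC as functions of allele frequencies, together with a simple allele retention ratio. These are assumptions about the general form of the statistics: GenAlEx and PowerMarker may apply sample-size corrections or averaging conventions that are not reproduced here, and the function names are hypothetical.

```python
import math
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def shannon_index(freqs):
    """I = -sum(p_i * ln p_i), skipping alleles with zero frequency."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


def allele_retention(whole, core):
    """Share of alleles seen in the whole collection that also occur in the core.

    whole, core: dict mapping locus -> set of alleles observed.
    """
    total = sum(len(alleles) for alleles in whole.values())
    kept = sum(len(whole[locus] & core.get(locus, set())) for locus in whole)
    return kept / total
```

For a biallelic SNP with both alleles at frequency 0.5, these definitions give He = 0.5, I = ln 2 and PIC = 0.375, which are the upper bounds for a two-allele locus; this is one reason collection-wide averages dominated by SNPs stay well below 1.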

Fig. 1

Pictures of mature (Left) and immature (Right) fruit of 100 accessions constituting the World Eggplant Core collection (WEC).

Table 1

Genetic diversity indices for the NARO Eggplant collection and WEC among geographic groups

Region	Country	No. of accessions (whole collection)	Average number of alleles	Major allele frequency	Ho	He	I	PIC	Regional total (whole collection)	No. of accessions (WEC)	Regional total (WEC)
Africa	Egypt	5	1.34	0.89	0.01	0.14	0.20	0.11		1
	Ghana	11	1.55	0.85	0.01	0.20	0.30	0.16		2
	Kenya	5	0.98	0.91	0.03	0.02	0.03	0.01		0
	Nigeria	1	–	–	–	–	–	–	22	0	3

America	Brazil	5	1.44	0.88	0.05	0.16	0.23	0.13		2
	Canada	4	1.16	0.94	0.01	0.07	0.10	0.05		2
	Chile	3	1.38	0.87	0.01	0.17	0.24	0.13		0
	USA	15	1.62	0.88	0.06	0.17	0.27	0.14	27	0	4

Asia	Bangladesh	71	1.70	0.86	0.05	0.19	0.29	0.16		7
	China	41	1.78	0.86	0.07	0.20	0.31	0.16		2
	India	18	1.87	0.79	0.06	0.28	0.43	0.23		8
	Indonesia	2	1.23	0.85	0.01	0.15	0.20	0.11		2
	Iran	3	1.41	0.87	0.14	0.16	0.24	0.19		0
	Iraq	1	–	–	–	–	–	–		0
	Japan	301	1.98	0.88	0.06	0.16	0.27	0.14		10
	Lao PDR	70	1.84	0.82	0.07	0.25	0.38	0.20		11
	Malaysia	54	1.91	0.82	0.04	0.28	0.43	0.23		21
	Myanmar	35	1.80	0.82	0.04	0.24	0.37	0.20		5
	Nepal	13	1.61	0.83	0.05	0.22	0.33	0.18		3
	Pakistan	1	–	–	–	–	–	–		1
	Philippines	6	1.55	0.85	0.11	0.20	0.30	0.16		0
	Sri Lanka	2	1.56	0.83	0.20	0.23	0.33	0.18		1
	Taiwan	22	1.74	0.81	0.11	0.25	0.37	0.20		0
	Thailand	11	1.73	0.81	0.07	0.25	0.38	0.20		1
	Turkey	22	1.64	0.89	0.05	0.16	0.25	0.13		0
	Vietnam	22	1.79	0.81	0.05	0.25	0.38	0.20	695	8	80

Europe	Bulgaria	1	–	–	–	–	–	–		0
	France	27	1.58	0.89	0.10	0.16	0.24	0.13		4
	Germany	1	–	–	–	–	–	–		0
	Greece	4	1.18	0.94	0.01	0.08	0.11	0.06		0
	Italy	25	1.60	0.87	0.05	0.19	0.28	0.15		3
	Netherlands	1	–	–	–	–	–	–		0
	Romania	6	1.21	0.94	0.01	0.08	0.12	0.06		1
	Spain	3	1.31	0.90	0.01	0.14	0.20	0.10		0
	UK	19	1.83	0.86	0.09	0.21	0.34	0.18	87	0	8

Unknown	Unknown	62	–	–	–	–	–	–	62	5	5

Whole collection		893	2.00	0.79	0.06	0.29	0.43	0.23

Ho, observed heterozygosity; He, expected heterozygosity; I, Shannon’s information index; PIC, polymorphism information content.

Basic characterization of the WEC collection

A cluster analysis based on the genetic-distance matrix obtained from the SNP genotype data of the WEC collection lines suggested a cluster structure reflecting their geographic origins (Supplemental Fig. 1). The STRUCTURE analysis gave a clearer and more accurate picture of this cluster structure. The optimum cluster number (K) was suggested to be K = 2 according to Evanno’s method (Evanno ); however, judging from the transition of the cluster structure with increasing K and taking the origin of the lines and the dendrogram structure into account, K = 4 might be more appropriate (Supplemental Fig. 1). Each cluster (S1–S4) corresponds to a geographical group. The S1 cluster originated from European, American and African countries; the S2 cluster originated from East Asian countries (mainly Japan); the S3 cluster originated from Southeast Asian countries (mainly Malaysia, Vietnam and Lao PDR); and the S4 cluster originated from some Southeast Asian countries (mainly Malaysia and Myanmar) and South Asian countries (mainly India and Bangladesh) (Supplemental Fig. 1). The Asian accessions were categorized into three groups (S2, S3 and S4). The East Asian accessions, mainly Japanese, formed an independent cluster with higher bootstrap values than those of the other Asian accessions, although they did not form a monophyletic group. The Southeast Asian accessions were divided into two groups (S3 and S4). Malaysian accessions were included in both clusters and seemed to be widely differentiated. In cluster S4, some of the Southeast Asian accessions and the South Asian accessions were mixed. The dendrogram constructed using the SSR markers showed the same categorization, especially for the groups with high bootstrap values in the analysis of the 831 SNPs (data not shown).
In conclusion, the eggplant core collection, the WEC collection, represents the genetic diversity of a large collection within a pragmatic size of 100 accessions. Hence, it will enable easy access to eggplant genetic resources and accelerate their utilization. In addition, the molecular genetic information of the WEC collection will help the strategic planning of future research. Basic trait-based characterization of the core collection is underway, which will contribute to the validation and utilization of the collection. The WEC collection will be a valuable source for developing new breeding materials to improve important and complex traits such as biotic and abiotic stress tolerance, plant architecture, and stable productivity under today’s worsening environmental conditions.
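
As background to the choice of K, the ΔK statistic of Evanno’s method can be sketched as below. This version takes the second-order difference of the mean ln P(D) across replicate STRUCTURE runs and divides it by the standard deviation at that K, which is one common way of applying the method; the exact procedure and the number of replicate runs used in this study are not stated in the text, and the function name and input layout are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_runs):
    """Compute Evanno's delta K for every interior K.

    lnp_runs: dict mapping K -> list of ln P(D) values from replicate runs
    (at least two runs per K). Returns a dict mapping K -> delta K.
    """
    ks = sorted(lnp_runs)
    avg = {k: mean(lnp_runs[k]) for k in ks}
    result = {}
    for k in ks[1:-1]:
        if k - 1 not in avg or k + 1 not in avg:
            continue  # need both neighbours for the second difference
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        sd = stdev(lnp_runs[k])
        if sd > 0:
            result[k] = second_diff / sd
    return result
```

Because ΔK is undefined at the smallest and largest K tested and tends to favour the uppermost level of hierarchical structure, a peak at K = 2 does not rule out biologically meaningful substructure at higher K, which is consistent with the authors also weighing the dendrogram and accession origins.
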

Data and material availability

Basic data of the WEC collection, such as the provenance and phenotype data listed in Supplemental Table 4, are published through the ‘Genetic resources’ menu of the VegMarks database (https://vegmarks.nivot.affrc.go.jp/resource/), and the DNA marker genotypes (831 SNP genotypes and 50 SSR fragment lengths) are published on the same site. Seeds of the WEC collection will be distributed by the NARO Genebank (https://www.gene.affrc.go.jp/index_en.php).

References

1. PowerMarker: an integrated analysis environment for genetic marker analysis.

Authors: Kejun Liu; Spencer V Muse
Journal: Bioinformatics Date: 2005-02-10 Impact factor: 6.937

2. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study.

Authors: G Evanno; S Regnaut; J Goudet
Journal: Mol Ecol Date: 2005-07 Impact factor: 6.185

3. GGT 2.0: versatile software for visualization and analysis of genetic data.

Authors: Ralph van Berloo
Journal: J Hered Date: 2008-01-24 Impact factor: 2.645

4. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0.

Authors: Koichiro Tamura; Glen Stecher; Daniel Peterson; Alan Filipski; Sudhir Kumar
Journal: Mol Biol Evol Date: 2013-10-16 Impact factor: 16.240

5. Development of gene-based markers and construction of an integrated linkage map in eggplant by using Solanum orthologous (SOL) gene sets.

Authors: Hiroyuki Fukuoka; Koji Miyatake; Tsukasa Nunome; Satomi Negoro; Kenta Shirasawa; Sachiko Isobe; Erika Asamizu; Hirotaka Yamaguchi; Akio Ohyama
Journal: Theor Appl Genet Date: 2012-02-16 Impact factor: 5.699

6. GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research--an update.

Authors: Rod Peakall; Peter E Smouse
Journal: Bioinformatics Date: 2012-07-20 Impact factor: 6.937

7. Draft genome sequence of eggplant (Solanum melongena L.): the representative solanum species indigenous to the old world.

Authors: Hideki Hirakawa; Kenta Shirasawa; Koji Miyatake; Tsukasa Nunome; Satomi Negoro; Akio Ohyama; Hirotaka Yamaguchi; Shusei Sato; Sachiko Isobe; Satoshi Tabata; Hiroyuki Fukuoka
Journal: DNA Res Date: 2014-09-18 Impact factor: 4.458

8. Analysis of genetic diversity of rapeseed genetic resources in Japan and core collection construction.

Authors: Ruikun Chen; Takashi Hara; Ryo Ohsawa; Yosuke Yoshioka
Journal: Breed Sci Date: 2017-05-23 Impact factor: 2.086

9. Core Hunter II: fast core subset selection based on multiple genetic diversity measures using Mixed Replica search.

Authors: Herman De Beukelaer; Petr Smýkal; Guy F Davenport; Veerle Fack
Journal: BMC Bioinformatics Date: 2012-11-23 Impact factor: 3.169

10. The population structure and diversity of eggplant from Asia and the Mediterranean Basin.

Authors: Fabio Cericola; Ezio Portis; Laura Toppino; Lorenzo Barchi; Nazareno Acciarri; Tommaso Ciriaci; Tea Sala; Giuseppe Leonardo Rotino; Sergio Lanteri
Journal: PLoS One Date: 2013-09-06 Impact factor: 3.240
