Summer Research Fellowship Programme of India's Science Academies

Development and use of SSR for genotyping in wheat (Triticum aestivum)

Simran Singh

Department of Genetics and Plant Breeding, Institute of Agricultural Science, Banaras Hindu University, Varanasi, Uttar Pradesh 221005

Prof. Pushpendra K. Gupta

Department of Genetics and Plant Breeding, Ch. Charan Singh University, Meerut, Uttar Pradesh 250004


Wheat (Triticum aestivum) is among the most important crops of the world. It has a large genome of about 16×10^9 bp/1C with more than 80% repetitive DNA, and is an allohexaploid (AABBDD, 2n = 6x = 42) with three sub-genomes, namely A, B and D. Molecular marker technology has been used extensively in this crop, and continues to be used for conservation of germplasm, use of genetic resources, evaluation of genetic variation, etc. Among the different types of molecular markers, microsatellites or simple sequence repeats (SSRs) are the markers of choice for tagging genes and assessing genetic diversity. This is mainly because SSR analysis requires only a small amount of DNA. Moreover, SSRs are easily detected by PCR, amenable to high-throughput analysis, co-dominantly inherited, multiallelic, highly polymorphic, abundant and evenly distributed in the genome. SSRs occur throughout the wheat genome in both coding and noncoding regions, which makes them well suited to genotyping, genome mapping and population genetic studies in wheat. The current study on wheat SSRs had two components: (i) a polymorphism survey of 81 SSR markers in two wheat varieties, followed by use of the polymorphic markers for genotyping a subset of recombinant inbred lines (RILs); and (ii) identification of novel SSRs from the pangenome assembly of wheat using a bioinformatics approach. The Genome-wide Microsatellite Analyzing Tool Package (GMATA), an integrated software package for SSR mining and primer design, was used for the bioinformatics component of the study.

Keywords: Triticum aestivum, SSR markers, polymorphic, GMATA, genotyping, pan genome




Wheat (Triticum aestivum L. em Thell, 2n = 42; AABBDD) has an allohexaploid genome structure that arose from two polyploidization events. The first event hybridized the genomes of two diploid species, one related to the wild species Triticum urartu (2n = 2x = 14; AuAu) and the other related to Aegilops speltoides (2n = 2x = 14; SS). This hybridization formed the allotetraploid Triticum turgidum (2n = 4x = 28; AABB), which underwent the second hybridization event with a diploid grass species, Aegilops tauschii (DD), producing the ancestral allohexaploid T. aestivum (2n = 6x = 42; AABBDD). As a result, the hexaploid wheat genome is characterized by its large size (~17 Gb) and complexity, with repetitive sequences accounting for ~80% of the genome.

The genomic DNA of wheat has been used to develop a variety of molecular markers, which have been used directly or indirectly for crop improvement. Among these, simple sequence repeats (SSRs) with 2–6 bp motifs are of tremendous value owing to their relative abundance, co-dominant inheritance, multiple alleles, uniform genome coverage, and simple, reproducible assays (Powell et al., 1996). To date, more than 4000 SSR markers have been developed and used in genetic mapping studies of wheat (Han et al., 2015). These markers have been used extensively in QTL analysis to identify marker-trait associations (MTAs), which are routinely used in marker-assisted selection (MAS) to supplement plant breeding.

Although ~4000 SSRs are already available with designed primers, the entire wheat genome is estimated to contain 476,169 SSRs, occurring at a frequency of 29.73 SSRs per Mb. Many more SSRs can therefore be developed and used for wheat breeding. For this purpose, the available pangenome-based assembly is an invaluable resource for in silico mining of SSRs, which are distributed ubiquitously over the entire wheat genome. GMATA is a recently developed tool that provides multiple functions, such as SSR mining, characterization of SSR distribution, and SSR marker design, on a genome scale. The availability of many polymorphic SSR markers makes them particularly useful for studying the wheat genome: they are ideal for genetic diversity studies and intensive genetic mapping. Moreover, it is necessary to investigate the genetic diversity of wheat germplasm in order to widen the genetic variation available for future wheat breeding (Huang et al., 2002).
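As a quick consistency check, the SSR count and density quoted above together imply a genome length, which can be compared with the ~16–17 Gb genome size given earlier. The short sketch below uses only the figures stated in the text; the variable names are illustrative.

```python
# Figures quoted in the text above.
total_ssrs = 476_169        # estimated SSRs in the whole wheat genome
ssrs_per_mb = 29.73         # reported SSR frequency (SSRs per Mb)
developed_markers = 4000    # approximate number of SSR markers already developed

# Genome length implied by count / density, in Mb and in Gb.
implied_genome_mb = total_ssrs / ssrs_per_mb
implied_genome_gb = implied_genome_mb / 1000

# Share of the estimated SSRs already covered by developed markers.
fraction_developed = developed_markers / total_ssrs

print(f"Implied genome length: {implied_genome_gb:.1f} Gb")
print(f"Markers already developed: {fraction_developed:.2%} of estimated SSRs")
```

The implied length of roughly 16 Gb agrees with the genome size quoted above, and the developed markers amount to less than 1% of the estimated SSRs, which is the gap the in silico mining is meant to address.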


Statement of the Problems

Wheat is the staple food of about 35% of the world population and is among the most preferred cereals in the world. Various abiotic factors reduce yield in many wheat-growing regions of the world. Therefore, the development of abiotic stress tolerant genotypes is a major concern in wheat breeding programs. To study the adaptation of crop plants to stress conditions caused by climatic changes, there is a need to exploit the biodiversity available in crop genotypes growing in diverse environments (Bhargava and Sawant, 2013).

Although morphological traits can be used for assessing genetic diversity, they are often influenced by the environment. Therefore, the use of molecular markers, especially SSRs, for the assessment of genetic diversity is gaining attention from wheat breeders and molecular geneticists (Huang et al., 2002; Salem et al., 2015).

The use of DNA markers in plant (and animal) breeding has opened a new realm in agriculture called ‘molecular breeding’ (Rafalski & Tingey, 1993). SSRs have been used extensively in wheat because of their high level of polymorphism, co-dominant inheritance and abundant distribution in the wheat genome. They have also been reported to show a higher level of polymorphism and informativeness than other marker systems in wheat. A polymorphism survey is often the first step towards building a framework molecular linkage map for QTL mapping projects that employ bi-parental populations. Often several hundred SSR markers are screened to find a sufficient number of polymorphic loci covering the whole genome; the resulting map can then be used to identify markers associated with grain traits for improvement through marker-assisted selection (MAS) during wheat breeding.

Although there have been numerous QTL mapping studies for a wide range of traits in diverse crop species, relatively few markers have actually been implemented in plant breeding programs (Young, 1999). The primary reason for this lack of adoption is that the markers used have not been reliable in predicting the desired phenotype. In many cases, this would be attributable to a low accuracy of QTL mapping studies or inadequate validation (Sharp et al., 2001; Young, 1999).

Key lessons learnt from past research are likely to encourage researchers to develop more reliable markers and plant breeders to adopt MAS. However, Young (1999) emphasized that scientists must recognize the necessity of using larger population sizes, more accurate phenotypic data, independent verification and different genetic backgrounds in order to develop reliable markers for MAS. Various other factors will greatly affect the efficiency and effectiveness of QTL mapping and MAS research in the future: new developments and improvements in marker technology, the integration of functional genomics with QTL mapping, and the availability of more high-density maps. New types of markers and high-throughput marker techniques should play a vital role in the construction of second-generation maps, provided that these methods are not too expensive. Owing to the abundance of single nucleotide polymorphisms (SNPs) and the development of sophisticated high-throughput SNP detection systems, it has been proposed that SNP markers will have a great influence on future mapping studies and MAS (Rafalski, 2002; Koebner & Summers, 2003).

Currently, the cost of utilizing markers is possibly the most important factor that limits the implementation of MAS. However, it is anticipated that in the future, technology improvements and novel applications will result in a reduction in the cost of markers, which will subsequently lead to a greater adoption of markers in plant breeding.

Combining QTL mapping with methods from functional genomics, which were developed for the study of gene expression, is a recent trend. These techniques include expressed sequence tag (EST) and microarray analysis, which can be used to develop markers from genes themselves (Gupta et al., 2001; Morgante & Salamini, 2003). Furthermore, the number of EST and genomic sequences available in databases is growing rapidly, and the accumulation of these sequences will be extremely useful for the discovery of SNPs and for data mining for new markers in the future (Gupta et al., 2001; Kantety et al., 2002). The development of high-density (or ‘saturated’) maps that incorporate SNPs, EST-derived markers and STSs will provide researchers with a wider range of tools for QTL mapping and MAS.

SSRs have been used extensively in genome-wide association studies (GWAS) involving single-locus single-trait (SLST), multi-locus mixed model (MLMM) and multi-trait mixed model (MTMM) approaches for the discovery of MTAs in wheat (Jaiswal et al., 2016). Bulk mining of SSRs from every chromosome is therefore required to support QTL/gene discovery.


Objectives of the Research

Overall objective

1. The present study was designed to genotype a RIL mapping population derived from the cross between C306 (Parent 1) and HUW468 (Parent 2). These genotyping data can then be used to identify important QTL and associated markers for the improvement of grain traits using MAS during wheat breeding.

2. Chromosome-wise mining of SSRs from the pangenome assembly of wheat for all three sub-genomes (A, B and D), and design of primers for genotyping.


SSR markers are expected to be invaluable for the development of high-resolution maps, which will also facilitate the isolation of actual genes (rather than markers) via ‘map-based cloning’ (also called ‘positional cloning’). Map-based cloning involves the use of tightly linked markers to isolate target genes by using the marker as a ‘probe’ to screen a genomic library (Tanksley et al., 1995; Meyer et al., 1996).


Condit and Hubbell (1991) were the first to report a study of SSRs in plants. Since then, SSRs have been widely used in all important crop plants. Several hundred SSR primer pairs have been developed for the three genomes of wheat (Mantovani et al., 2008) and have been used for various purposes, including genome mapping, gene tagging, physical mapping and genetic diversity estimation (Wang et al., 2007). These DNA markers reveal sites of variation in DNA and are therefore used in polymorphism surveys of populations. DNA marker systems have many advantages over the traditional morphological and protein markers used in genetic analyses of plant populations: firstly, an unlimited number of DNA markers can be developed; secondly, DNA markers are not affected by environmental factors; and thirdly, DNA markers, unlike isozyme markers, are not constrained by tissue or developmental stage specificity. In agricultural research, DNA markers are used primarily for the construction of linkage maps. These maps are used to identify chromosomal regions that contain genes for simple traits as well as for complex quantitative traits, the latter through QTL analysis. DNA markers that are tightly linked to agronomically important traits may also be used as molecular tools for marker-assisted selection (MAS).

Compared with other DNA markers, SSRs have several advantages. Firstly, they show high reproducibility, which is the most important requirement in genetic analysis. Secondly, the hypervariable nature of SSRs produces very high allelic variation even among closely related varieties; the level of genetic variation detected by SSR polymorphism (SSRP) analysis is several-fold higher than that revealed by other approaches. Thirdly, the co-dominant nature of SSR polymorphisms is useful in parentage analysis of hybrids.


Along with these merits, SSRs have certain shortcomings as molecular markers. Often, in SSR analyses of a large number of samples from diverse germplasm, a few samples fail to produce PCR products. This is frustrating for experimenters because they cannot tell whether the absence of a PCR product represents a true null allele at the SSR locus or a failed PCR. SSRs derived from ESTs or cDNA often fail to amplify if one or both primer binding sites span a splice site. Likewise, large introns within the amplicon, or primers designed from chimeric cDNA, can prevent successful amplification.

A segregating plant population is required for the construction of linkage maps. The parents selected for the mapping population should differ in one or more traits of interest. A large population should be used for high-resolution mapping if the map is to be used for QTL studies (which is usually the case). Further, the mapping population must be phenotypically evaluated (i.e. trait data must be collected) before QTL mapping.

Recombinant inbred lines (RILs) are chosen for QTL mapping because a RIL population consists of a series of homozygous lines, each containing a unique combination of chromosomal segments from the original parents. The major drawback is the length of time needed to produce RIL populations, because usually six to eight generations are required.
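The need for six to eight generations follows from the standard expectation that each generation of selfing halves the heterozygosity remaining at a locus. The sketch below works through that arithmetic. It assumes strict selfing (for example single-seed descent) from an F1 that is heterozygous at every locus that differs between the parents; the function name is illustrative, and the text does not state whether its generation count starts from the F1 or the F2.

```python
def expected_heterozygosity(generation: int) -> float:
    """Expected fraction of heterozygous loci in generation F_n under strict selfing.

    Assumption: the F1 (generation 1) is fully heterozygous at segregating loci,
    and every selfing generation halves the remaining heterozygosity.
    """
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or higher")
    return 0.5 ** (generation - 1)


# Print the expected heterozygous and homozygous fractions from F2 to F8.
for n in range(2, 9):
    het = expected_heterozygosity(n)
    print(f"F{n}: {het:.2%} heterozygous, {1 - het:.2%} homozygous")
```

Under these assumptions about 3% of loci are still expected to be heterozygous at F6 and under 1% at F8, which is why lines are usually advanced this far before being treated as fixed.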

An overview of the different molecular markers and their uses is given in Table 1.

Table 1: Comparison of widely used molecular markers in plant systems

Characteristic | Isozyme | RFLP | RAPD | AFLP | SSR | SNP
Abundance | Low | Medium | Very high | Very high | High | Very high
Types of polymorphism | Amino acid change in polypeptide | Single base change, insertion, deletion, inversion | Single base change, insertion, deletion, inversion | Single base change, insertion, deletion, inversion | Repeat length variation | Single base change
DNA quality | - | High | Medium | High | Medium | Medium
DNA sequence information | - | Not required | Not required | Not required | Required | Required
Level of polymorphism | Low | Medium | High | High | High | High
Inheritance | Codominance | Codominance | Dominance | Dominance | Codominance | Codominance
Reproducibility | Medium | High | Low | Medium | High | High
Technical complexity | Medium | High | Low | Medium | Low | Medium
Developmental cost | Medium | High | Low | Low | High at start | High
Species transferability | High | Medium | High | High | Medium | Low
Automation | Low | Low | Medium | Medium | High | High



The CTAB method was used for DNA isolation because its versatility, speed and low cost have made it the procedure of choice in our lab. PCR products were separated on urea PAGE because, compared with agarose gels, urea PAGE has higher resolving power: it gave sharp bands where agarose gave fuzzy ones, and it separated low-molecular-weight fragments well. The gels can also be wrapped in plastic film and stored in a refrigerator for later study without distortion.

For SSR mining, GMATA was used because it overcomes several limitations shown by some of the other tools (RF, MISA, SciRoko, GMATo, IMEx, mreps, TROLL and MsDetector): insufficient processing capability when analysing complete genomes; time-consuming processing, especially in pipelines that combine multiple functions and integrate different software; and the lack of a graphical interface or of marker design.


Genotyping of RIL Population

The plant material

The plant material used in the current study consisted of two wheat varieties as parents, C306 (female) and HUW468 (male). A set of 154 recombinant inbred lines (RILs) derived from this cross was developed in the Department of Genetics and Plant Breeding, Banaras Hindu University (BHU), and these lines were selected for genotyping.


Molecular marker analysis

DNA isolation and quantification.

Total genomic DNA from the parents and the 154 RIL genotypes was extracted using the cetyltrimethylammonium bromide (CTAB) method for plant tissues (Murray and Thompson, 1980). To remove RNA contamination, DNA samples were treated with 1 μl of RNase A solution (10 mg/ml) per 50 μl of DNA sample. The quality of the DNA samples was checked on a UV spectrophotometer by recording the ratio of absorbance at 260 nm to that at 280 nm.
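The quality check described above can be sketched as a small calculation. The example assumes the commonly used conventions that one A260 unit corresponds to about 50 ng/μl of double-stranded DNA (1 cm path length) and that an A260/A280 ratio near 1.8 indicates reasonably pure DNA. The acceptance window of 1.7–2.0 and all function names are illustrative assumptions, not values reported in this study.

```python
def dna_concentration_ng_per_ul(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate dsDNA concentration, assuming 1 A260 unit ~ 50 ng/ul (1 cm path)."""
    return a260 * 50.0 * dilution_factor


def purity_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 ratio used as a purity indicator."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def classify_purity(ratio: float, low: float = 1.7, high: float = 2.0) -> str:
    """Classify a ratio against an assumed acceptance window (illustrative thresholds)."""
    if ratio < low:
        return "low"         # may indicate protein or phenol carry-over
    if ratio > high:
        return "high"        # may indicate residual RNA
    return "acceptable"


# Hypothetical readings for one sample measured at a 10-fold dilution.
ratio = purity_ratio(a260=0.9, a280=0.5)
print(dna_concentration_ng_per_ul(0.9, dilution_factor=10), classify_purity(ratio))
```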

SSR-PCR Amplification.

Eighty-one wheat microsatellite (SSR) primer sets were selected and used for screening the wheat genotypes. Primer sequences and PCR conditions were taken from the high-density microsatellite consensus map for bread wheat (Somers et al., 2004). PCR amplification was performed on a benchtop thermal cycler. PCR products were separated on 6% urea PAGE in 1X TBE buffer. A 100 bp DNA ladder was used to estimate the size of each amplified fragment.

Allele scoring.

SSR amplification profiles were scored visually. For the parents, each marker was scored as polymorphic (P) or monomorphic (M); for the RIL genotypes, each band was scored as A (C306 allele) or B (HUW468 allele). Only clear and unambiguous bands were scored. The size (in base pairs) of each amplified band was determined from its migration relative to the standard molecular size marker (100 bp ladder).
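Scoring in this study was done visually, but the logic of sizing a band against the ladder and assigning it to a parent can be illustrated in code. The sketch below assumes that log10(fragment size) varies approximately linearly with migration distance between adjacent ladder bands. The ladder distances, the tolerance of 3 bp and the function names are hypothetical and only serve the illustration.

```python
import math


def estimate_size_bp(distance_mm, ladder):
    """Interpolate fragment size from migration distance.

    ladder: list of (migration_distance_mm, size_bp) pairs for the ladder bands,
    with distinct distances. Interpolation is linear in log10(size) between the
    two ladder bands that bracket the sample band.
    """
    points = sorted(ladder)  # order ladder bands by migration distance
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance_mm <= d2:
            frac = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("band lies outside the ladder range")


def call_allele(band_bp, parent_a_bp, parent_b_bp, tolerance_bp=3.0):
    """Return 'A' or 'B' if the band matches exactly one parent, else '-' (not scored)."""
    near_a = abs(band_bp - parent_a_bp) <= tolerance_bp
    near_b = abs(band_bp - parent_b_bp) <= tolerance_bp
    if near_a and not near_b:
        return "A"
    if near_b and not near_a:
        return "B"
    return "-"


# Hypothetical 100 bp ladder readings: (distance in mm, size in bp).
ladder = [(10.0, 400), (20.0, 300), (30.0, 200), (40.0, 100)]
size = estimate_size_bp(25.0, ladder)
print(round(size), call_allele(size, parent_a_bp=245, parent_b_bp=260))
```

Returning "-" for bands that match neither or both parents mirrors the rule that only clear and unambiguous bands are scored.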

SSR mining from Pangenome assembly of wheat

Data Source

 The whole draft pangenome assembly of T. aestivum (version UWA_1.0) was retrieved in FASTA format from the Wheat Genome Databases: http://wheatgenome.info/wheat_genome_databases.php

 SSR Mining

 For SSR mining, sequences were taken chromosome-wise in FASTA format, and SSR motifs were identified using GMATA.
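To make the idea of SSR mining concrete, the following is a minimal regular-expression sketch that finds perfect tandem repeats of 2–6 bp motifs in a sequence. It is not GMATA's algorithm or interface. The minimum of five repeat units, the exclusion of single-base runs and the function names are assumptions chosen for illustration.

```python
import re
from collections import Counter

MIN_REPEATS = 5  # assumed minimum number of tandem repeat units

# A 2-6 bp motif (shortest possible first) followed by at least MIN_REPEATS - 1 copies.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_REPEATS - 1))

MOTIF_CLASS = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def find_ssrs(sequence):
    """Return (motif, repeat_count, start, end) for each perfect SSR found."""
    ssrs = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip runs of a single base such as AAAAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        ssrs.append((motif, repeats, match.start(), match.end()))
    return ssrs


def motif_class_counts(ssrs):
    """Count SSRs by motif length class (di-, tri-, ... hexanucleotide)."""
    return Counter(MOTIF_CLASS[len(motif)] for motif, *_ in ssrs)


# Toy sequence containing an (AT)6 and an (ACG)5 repeat.
demo = "GGC" + "AT" * 6 + "CCGT" + "ACG" * 5 + "TT"
hits = find_ssrs(demo)
print(hits)
print(motif_class_counts(hits))
```

Applied to each chromosome's FASTA sequence in turn, counts of this kind are what a chromosome-wise summary of SSR number and motif class is built from.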

Primer designing

 The default parameters for primer design were: primer length 18–27 bp, melting temperature 57–63 °C, GC content 30–70% and product size 100–280 bp.
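The parameter ranges above can be read as a simple filter on candidate primers, sketched below. The melting temperature uses a basic GC-content approximation, Tm = 64.9 + 41 × (GC count − 16.4) / N, which is an assumption made for this example. Primer design software typically uses nearest-neighbour thermodynamics, so its Tm values will differ. All function names are illustrative.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * sum(primer.count(base) for base in "GC") / len(primer)


def approx_tm_celsius(primer: str) -> float:
    """Rough Tm estimate from GC count and length (basic approximation, see lead-in)."""
    primer = primer.upper()
    gc_count = sum(primer.count(base) for base in "GC")
    return 64.9 + 41.0 * (gc_count - 16.4) / len(primer)


def primer_passes(primer, length_range=(18, 27), tm_range=(57.0, 63.0),
                  gc_range=(30.0, 70.0)) -> bool:
    """Check one primer against the length, Tm and GC ranges given in the text."""
    return (length_range[0] <= len(primer) <= length_range[1]
            and tm_range[0] <= approx_tm_celsius(primer) <= tm_range[1]
            and gc_range[0] <= gc_percent(primer) <= gc_range[1])


def product_size_ok(size_bp: int, size_range=(100, 280)) -> bool:
    """Check the expected amplicon size against the product size range."""
    return size_range[0] <= size_bp <= size_range[1]


# Hypothetical candidate primers, not taken from this study.
for candidate in ("ATGC" * 6, "AT" * 10):
    print(candidate, round(approx_tm_celsius(candidate), 1), primer_passes(candidate))
```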



A set of 81 SSR primer pairs was used for genotyping C306 (Parent 1) and HUW468 (Parent 2) (Fig. 1). Of the 81 microsatellites, 41 (50.6%) were polymorphic between the two parents (Fig. 2). These primers are located on the D genome (chromosomes 1, 2, 3, 4, 5, 6 and 7), and the amplified PCR products ranged in size from 78 to 326 bp (Fig. 3). The 41 polymorphic SSR markers were then analysed on the RIL population of 154 lines (Fig. 4a, 4b). The scored data will be useful for constructing QTL maps, which will in turn facilitate the isolation of actual genes (rather than markers) via map-based (positional) cloning (Tanksley et al., 1995; Meyer et al., 1996).

Fig. 1: List of microsatellites (81 SSR markers) used for genotyping in C306 (Parent 1) and HUW468 (Parent 2).

Fig. 2: List of polymorphic microsatellites (41 SSR markers) present in C306 (Parent 1) and HUW468 (Parent 2).

Fig. 3: Urea PAGE with representative samples showing different levels of polymorphism in C306 (Parent 1) and HUW468 (Parent 2) with 81 markers (list of markers given in Fig. 1).

Fig. 4: Electrophoretic pattern of 44 RILs in SSR analysis using primers cfd 53 (a) and gwm 52 (b), with A (Parent 1) and B (Parent 2) scoring.

In the in silico mining of SSRs from the pangenome assembly of wheat, a total of 24,457 SSRs were obtained from the three sub-genomes (A, B and D). SSRs were most abundant on the B-genome chromosomes, followed by the D- and A-genome chromosomes. Chromosome-wise, chromosome 3 (summed across sub-genomes) had the highest number of SSRs (9,954), followed by chromosomes 7, 4, 5, 1 and 2. Among individual chromosomes, 3B had the highest number of SSRs (9,884) and 2D the lowest (22).

The number of repeats of each motif type was also calculated for each chromosome. Dinucleotide repeats were the most abundant and pentanucleotide repeats the least, representing 61.8% and 0.004% of the total, respectively (Fig. 5).

Segregation distortion adversely affects linkage map construction and QTL identification. A high-density SSR consensus map can help resolve this issue (Li et al., 2015). Such a high-density linkage map of wheat has been used successfully to discover QTL controlling grain shape and size (Wu et al., 2015). SSRs in bulk can be used for linkage mapping even in a single intraspecific population of common wheat (Torada et al., 2006). The SSRs obtained here can therefore be used to construct a denser genetic map of wheat.
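Segregation distortion at a marker can be screened with a chi-square goodness-of-fit test against the 1:1 ratio of parental alleles expected in a RIL population. The sketch below is a generic illustration of that test, not an analysis performed in this study. The allele counts are hypothetical, and missing or ambiguous scores are assumed to have been excluded beforehand.

```python
import math


def segregation_chi_square(count_a: int, count_b: int):
    """Chi-square test (1 degree of freedom) of A:B counts against a 1:1 expectation.

    Returns (chi_square, p_value). For one degree of freedom the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(chi_square / 2)).
    """
    total = count_a + count_b
    if total == 0:
        raise ValueError("no scored genotypes")
    # With expected counts of total/2 each, the statistic simplifies to (a - b)^2 / n.
    chi_square = (count_a - count_b) ** 2 / total
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical marker scored on 154 RILs: 100 lines with allele A, 54 with allele B.
chi_square, p_value = segregation_chi_square(100, 54)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

Markers with a small p-value deviate from 1:1 and would normally be flagged before map construction.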

Fig. 5: Percentage of occurrence of SSR motifs in the whole genome



SSRs are a good resource for evaluating genetic diversity among varieties of wheat and of other crop plants such as maize and rice, and SSR polymorphisms (SSRPs) have several advantages over other molecular marker techniques. As more DNA sequence information becomes available through ESTs or whole-genome sequencing, the number of available SSR markers is also increasing. RIL populations are highly preferred for the construction of high-density QTL maps.

SSR primers obtained from the pangenome assembly of wheat through in silico mining can next undergo in vitro or e-PCR validation, after which these SSR markers can be used for applications such as QTL analysis, MAS and characterization of genetic diversity.


Bennetzen JL, Ma J, Devos KM (2005). Mechanisms of recent genome size variation in flowering plants. Ann Bot (Lond) 95:127–132

Chakravarthi KB, Naravaneni R (2006). SSR marker based DNA fingerprinting and diversity study in rice (Oryza sativa L). Afr J Biotechnol 5:684–688

Collard BCY, Jahufer MZZ, Brouwer JB, Pang ECK (2005). An introduction to markers, quantitative trait loci (QTL) mapping and marker-assisted selection for crop improvement: The basic concepts. Euphytica 142:169–196

Doyle JJ, Doyle JL (1990). Isolation of plant DNA from fresh tissue. Focus 12:13–15

Gupta PK, Varshney RK, Sharma PC, Ramesh B (1999). Molecular markers and their applications in wheat breeding. Plant Breed 118:369–390

Gupta PK, Roy JK, Prasad M (2001). Single nucleotide polymorphisms: A new paradigm for molecular marker technology and DNA polymorphism detection with emphasis on their use in plants. Curr Sci 80:524–535

Gupta PK, Rustgi S, Sharma S, Singh R, Kumar N, Balyan HS (2003). Transferable EST-SSR markers for the study of polymorphism and genetic diversity in bread wheat. Mol Genet Genomics 270:315–323

Jaiswal S, Sheoran S, Arora V, Angadi UB, Iquebal MA, Raghav N, Aneja B, Kumar D, Singh R, Sharma P, Singh GP, Rai A, Tiwari R, Kumar D (2017). Putative microsatellite DNA marker-based wheat genomic resource for varietal improvement and management. Front Plant Sci 8:2009

Kota R, Varshney RK, Thiel T, Dehmer KJ, Graner A (2001). Generation and comparison of EST-derived SSRs and SNPs in barley (Hordeum vulgare L.). Hereditas 135:145–151

Morgante M, Rafalski A, Biddle P, Tingey S, Olivieri AM (1994). Genetic mapping and variability of seven soybean simple sequence repeat loci. Genome 37:763–769

Petersen G, Seberg O, Yde M, Berthelsen K (2006). Phylogenetic relationships of Triticum and Aegilops and evidence for the origin of the A, B, and D genomes of common wheat (Triticum aestivum). Mol Phylogenet Evol 39:70–82

Phougat D, Panwar IS, Punia MS, Sethi SK (2018). Microsatellite markers based characterization in advance breeding lines and cultivars of bread wheat. J Environ Biol 39:339–346

Powell W, Machray G, Provan J (1996). Polymorphism revealed by simple sequence repeats. Trends Plant Sci 1:215–222

Rafalski A (2002). Applications of single nucleotide polymorphisms in crop genetics. Curr Opin Plant Biol 5:94–100

Rafalski J, Tingey S (1993). Genetic diagnostics in plant breeding: RAPDs, microsatellites and machines. Trends Genet 9:275–280

Rakshit S, Rakshit A, Patil JV (2012). Multiparent intercross populations in analysis of quantitative traits. J Genet 91:111–117

Saha MC, Mian MAR, Eujayl I, Zwonitzer JC, Wang L, May GD (2004). Tall fescue EST-SSR markers with transferability across several grass species. Theor Appl Genet 109:783–791

Scott KD, Eggler P, Seaton G, Rossetto M, Ablet EM, Lee LS, Henry RJ (2000). Analysis of SSRs derived from grape ESTs. Theor Appl Genet 100:723–726

Semagn K, Bjørnstad Å, Ndjiondjop MN (2006). An overview of molecular marker methods for plants. Afr J Biotechnol 5:2540–2568

Singh H, Deshmukh RK, Singh A, Singh AK, Gaikwad K, Sharma TR, Mohapatra T, Singh NK (2010). Highly variable SSR markers suitable for rice genotyping using agarose gels. Mol Breeding 25:359–364

Slavov GT, Howe GT, Gyaourova AV, Birkes DS, Adams WT (2005). Estimating pollen flow using SSR markers and paternity exclusion: Accounting for mistyping. Mol Ecol 14:3109–3121

Somers DJ, Isaac P, Edwards K (2004). A high-density wheat microsatellite consensus map for bread wheat (Triticum aestivum L.). Theor Appl Genet 109(6):1105–1114

Thiel T, Michalek W, Varshney R, Graner A (2003). Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor Appl Genet 106:411–422

Virk DS, Steele KA, Witcombe JR (2007). Mass and line selection can produce equally uniform rice varieties. Field Crops Res 100:341–347

Wang X, Wang L (2016). GMATA: An integrated software package for genome-scale SSR mining, marker development and viewing. Front Plant Sci 7:1350


I am thankful to the Indian Academy of Sciences (IAS) for awarding me the Summer Research Fellowship Programme (SRFP) 2019. I am indebted to Prof. P.K. Gupta for guiding me throughout the course of the work, and to Prof. H.S. Balyan for helping me with the lab work during the tenure of the IAS SRFP 2019. I am also thankful to Ms Rakhi Singh (PhD scholar) and Dr. Kalpana Singh (Research Associate) for assisting me during the wet lab and bioinformatics work, and to the Department of Genetics and Plant Breeding, Ch. Charan Singh University (CCSU), Meerut, for providing a good working environment in the Molecular Biology Lab and the Bioinformatics Infrastructure Lab.
