
Assemblies

Triticum aestivum (Chinese Spring)   

 

  • The IWGSC RefSeq v2.0 assembly is available for download and BLAST (under the Toronto agreement).

Under the leadership of Mingcheng Luo and Jan Dvorak (UC Davis, CA, USA) and with funding from the US National Science Foundation, an improved version of the reference wheat genome has been completed. It is being released to the scientific community in advance of publication under the terms of the Toronto agreement, which affords the data producers the right to publish the first whole-genome analyses of the data.

The genome assembly of Triticum aestivum cv. Chinese Spring (IWGSC RefSeq v1.0; IWGSC, 2018) was improved using whole-genome optical maps and contigs assembled from whole-genome-shotgun (WGS) PacBio SMRT reads (Zimin et al. 2017). Optical maps were used to detect and resolve chimeric scaffolds, anchor unassigned scaffolds, correct ambiguities in positions and orientations of scaffolds, create super-scaffolds, and estimate gap sizes more accurately. PacBio contigs were used for gap closing. Pseudomolecules of the 21 Chinese Spring chromosomes were reconstructed to develop a new reference sequence, IWGSC RefSeq v2.0. In total, the revisions affected approximately 10% of the sequence length of IWGSC RefSeq v1.0.

Importantly, please note that this version has yet to be annotated. The IWGSC annotation team will perform targeted annotation and QC of IWGSC RefSeq v2.0. In addition, all manually curated genes submitted to the IWGSC by the end of August 2019 (see call for contributions) will be integrated into IWGSC RefSeq v2.0, annotation v2.0. The IWGSC aims to release annotation v2.0 in January 2020.

How to access the data

Access requires registration. For specific access terms, see the IWGSC General Data Access agreement.

- Individuals who have not signed the IWGSC Data Access Agreement should FIRST register on the IWGSC website and sign the Agreement; URGI login details will subsequently be provided for access to the data.

- Individuals who have already signed the IWGSC Data Access Agreement can go directly to the URGI website to access the data using their URGI login details. If you have forgotten your URGI credentials, please send an email to urgi-support@inra.fr.

cf. IWGSC announcement.

 

  • The IWGSC RefSeq v1.0 assembly (the first version of the reference sequence of the bread wheat variety Chinese Spring) is publicly available for download, BLAST, display in a browser, and an InterMine.

The IWGSC RefSeq v1.0 assembly (pseudomolecules and scaffolds) is an integration of the IWGSC WGA v0.4 – made available in June 2016 – with IWGSC chromosome-based and other resources, including but not limited to:

-       Physical maps for all chromosomes;

-       Sequenced BACs for 8 chromosomes (1A, 1B, 3B, 3D, 6B, 7A, 7B, 7D) and partial MTP BAC sequences for 2 chromosome arms (4AL, 5BS);

-       MTP BAC WGP™ sequence tags for all chromosomes, except 3B;

-       BioNano optical maps (7A, 7B, 7DS);

-       Alignment to RH maps (D chromosomes); and

-       GBS map of the SynOp RIL population CsxRn genetic map (INRA). 

With the addition of the resources that have been developed by IWGSC members over the past few years, the quality of the assembly increased substantially. Compared with IWGSC WGA v0.4, the chromosomal scaffold/superscaffold N50 increased from 7.0 Mb to 22.8 Mb.
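
For readers unfamiliar with the metric, N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The Python sketch below illustrates that definition only; the function name `n50` and the toy scaffold lengths are assumptions made for this example and are not part of any IWGSC pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one sequence")


# Toy scaffold lengths (hypothetical values, in bp)
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_scaffolds)
```

A higher N50 for the same total length means that the assembly is made of fewer, longer pieces.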

 

How to access the data?

All these data are now in open access. While scientists may freely publish using the IWGSC data, IWGSC does request that the source of the data be properly acknowledged.

>>> The corresponding annotations are accessible here. <<<

The data are also available at NCBI’s Short Read Archive (SRP114784) and ENA (PRJEB27788).

 

Warning:

Note that some bioinformatics tools (e.g. GATK) require the chromosomes to be split into chunks of at most 512 Mbp.
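
One way to plan such a split is sketched below in Python. It assumes that "512 Mbp" means 512,000,000 bases; the helper names (`chunk_intervals`, `split_sequence`) and the `_partN` naming scheme are hypothetical and are not an IWGSC or GATK convention.

```python
# Assumption: "512 Mbp" is read as 512 million bases.
MAX_CHUNK_BP = 512_000_000


def chunk_intervals(name, length, max_chunk=MAX_CHUNK_BP):
    """Return (chunk_name, start, end) tuples covering [0, length).

    Coordinates are 0-based and half-open, so end - start is the chunk size
    and no chunk is longer than max_chunk.
    """
    n_chunks = -(-length // max_chunk)  # ceiling division
    intervals = []
    for i in range(n_chunks):
        start = i * max_chunk
        end = min(start + max_chunk, length)
        intervals.append((f"{name}_part{i + 1}", start, end))
    return intervals


def split_sequence(name, seq, max_chunk=MAX_CHUNK_BP):
    """Cut an in-memory sequence string into named chunks of at most max_chunk bases."""
    return [
        (chunk_name, seq[start:end])
        for chunk_name, start, end in chunk_intervals(name, len(seq), max_chunk)
    ]


# Hypothetical sequence of 800,000,000 bp: two chunks are needed.
example = chunk_intervals("chrExample", 800_000_000)
```

Positions reported on a chunk have to be shifted by that chunk's start offset to recover coordinates on the original pseudomolecule.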

 

 

 

  • The IWGSC WGA v0.4 (Whole Genome Assembly), comprised of Illumina short sequence reads assembled with NRGene’s DeNovoMAGIC™ software, produced scaffolds totaling 14.5 Gb with an L50 of 7.1 Mb that have been assigned to chromosomal locations using POPSEQ data and a HiC map.

Over 99% of chromosome survey contigs map to the IWGSC WGA v0.4 assembly. We hope that the more contiguous sequences of the new assembly will help users accelerate the identification of genes associated with important traits.
A BLAST server has been set up to facilitate rapid access to single or small numbers of queries. The data can also be downloaded from the IWGSC data repository.

How to access the data?

All these data are now in open access. While scientists may freely publish using the IWGSC data, IWGSC requests that the source of the data be properly acknowledged.

 

  • The first release of the TGACv1 genome assembly of Triticum aestivum cv. Chinese Spring was generated by The Genome Analysis Centre, Norwich, as part of the BBSRC-funded project Triticeae Genomics for Sustainable Agriculture. The assembly has an N50 of 88 Kb and a total length of 13.4 Gb in contigs greater than 500 bp. A total of 98,974 genes (99% of the total) annotated on the previously released assembly have been located on the new assembly. Alignments of RNA-seq data from 3 different studies across 18 samples have additionally been located on the new assembly.

 

  • IWGSC Survey sequence chromosomes
     
    • Version 1 and version 2 assemblies are publicly available for download, BLAST and in a browser.
    The version 2 assembly is the version 1 assembly after cleaning, i.e. with duplicates removed. FASTA files for the A, B and D genomes are available for download at MIPS.
    Summary of the different CSS assemblies and versions (TGAC): This new version of the IWGSC CSS wheat survey sequence has been generated by the incorporation of ca. 185 Gbp of mate-pair sequence data produced from libraries ranging in size from 1 to 40 kb from a Chinese Spring + 7EL addition line. The assembly has been produced by A. Sharpe, D. Konkin and C. Pozniak, at the National Research Council Canada and the U. of Saskatchewan, Canada.

 

  • 3B reference sequence (F. Choulet):

Display the 3B pseudomolecule using GBrowse or JBrowse.
Download the 3B data: genomic DNA, CDSs, annotation of features and a README.
BLAST the 3B reference sequence: whole chromosome or CDS only (nucleotide and peptide).

 

  • Low-coverage shotgun sequences (454 data) for some individual chromosomes are also publicly available for BLAST.

 

Other wheat species   
 

  • WGS assemblies (TGAC, TSL) are publicly available for download and BLAST:

- Triticum durum, cv. Cappelli (listed as durum_v1)
- Triticum durum, cv. Strongfield
- Triticum monococcum
- Aegilops speltoides
- Aegilops sharonensis
- Triticum urartu
- Aegilops tauschii

Aegilops speltoides

Aegilops speltoides 29 is registered in the INRA Genetic Resources Center under the code ERGE26012.

Triticum durum cv. Cappelli

The cultivar Senatore Cappelli (known as Cappelli) is a historical durum wheat genotype selected from a North African landrace by N. Strampelli and registered in 1915. Cappelli is one of the founders of the Italian durum wheat breeding program and is present in the pedigree of many durum wheat cultivars released in southern Europe in the 20th century. Cappelli has been widely grown in Mediterranean regions since the late 1950s.
Cappelli is characterized by an elevated water use efficiency (Rizza et al., 2012 Field Crops Research 125, 49–60).

Triticum durum cv. Strongfield

Strongfield durum wheat was developed by Dr. John Clarke during his tenure at Agriculture and Agri-Food Canada. "Strongfield" combines high grain yield and high grain protein concentration with low grain cadmium concentration. Canadian durum wheat production represents 60% of durum wheat traded globally, and Strongfield currently occupies 65% of the total planted area of durum wheat in Canada. A detailed description of the variety has been published (Clarke et al. 2005; Can J. Plant Sci., 83: 651-654; see http://pubs.aic.ca/doi/abs/10.4141/P04-119 ).

  • Aegilops tauschii scaffolds available for BLAST at ATGSP

We would like to announce the launch of the website for the NSF-funded 'Sequencing the Aegilops tauschii genome' project (NSF-IOS-1238231).

The project has been underway only since last September; for now, the website offers the opportunity to BLAST search the individual Ae. tauschii scaffolds available so far.
The scaffolds are not yet ordered and gaps remain, but we are making them available in this preliminary form in the belief that they can be useful for characterization and comparison of single Ae. tauschii genes and surrounding sequences, development of markers, and other genetic applications related to wheat and barley research and improvement. The BLAST database will be updated as new scaffolds are prepared; the most recent update was Sept. 4, 2014.
Olin Anderson

  • Triticum durum cv. Svevo reference sequence available at CNR-ITB.
Update: 07 Aug 2019
Creation date: 03 Sep 2013