|
|
|
|
Information about assembly Zm-W22-REFERENCE-NRGENE-2.0
(also known as W22)
|
|
|
Click here to learn about maize genome and gene model nomenclature rules.
|
|
|
|
|
|
Genome Sequencing Project Information |
|
| |
Stock provided by Hugo Dooner. This stock was derived by R.A. Brink at the U. of Wisconsin for his studies on paramutation at the R locus by five back-crosses of the regular W22 inbred (colorless seed because of c1; r alleles) to a purple seed stock that he apparently obtained from Cornell (Brink, 1956, Genetics 41:872-889). This resulted in the introgression of C1 and R alleles from the purple stock into W22. The C1 allele was characterized by Karen Cone. The R allele is the paramutable R-r: standard allele (Brink, 1956, op. cit.; Dooner & Kermicle, 1971, Genetics 67:427-436). Allele was obtained in 1972 from the U. of Wisconsin cold storage collection (pedigree CS-810) and has been selfed for 30 generations. Known alleles of color genes: C1 (Cooper and Cone, unpub.; GenBank AF320614); R-r: std (Walker et al., 1995, EMBO J. 10: 2360-2363; only partial sequence); Bz1-W22 (Dooner & He, 2008, Plant Cell 20:249-258; GenBank EU338354). |
|
|
This sequence has been released under the
Toronto Agreement. No whole-genome research
may be submitted for publication until the official publication for this genome
assembly has been published.
|
| |
GenBank BioProject |
PRJNA311133 |
| |
Project PI |
Tom Brutnell |
| |
Project start date |
August, 2014 |
| |
Release date |
2017 |
| |
Consortium |
W22 Sequencing Consortium |
| |
Browse Genome |
Genome browser at MaizeGDB |
| |
Data download |
ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/plant/Zea_mays/latest_assembly_versions/GCA_001644905.2_Zm-W22-REFERENCE-NRGENE-2.0 |
| |
Publication status |
in preparation |
|
Project reference |
The maize W22 genome: a foundation for gene discovery and functional genomics.
Tom Brutnell, Omer Barad, Kobi Baruch, Gil Ben-Zvi, Ed Buckler, Ethalinda Cannon, Paul Chomet, Hugo Dooner, Chunguang Du, Georg Jander, Karen Koch, Don McCarty, Ilya Soifer, Doron Shem-Tov, Erik Vollbrecht, Doreen Ware, Maggie Woodhouse
|
|
|
|
|
|
|
|
|
|
|
Stock and Biosample Information |
|
| Stock information |
| |
Stock name |
cultivar:W22 (C1:R-r:std - PI 674445) |
| |
Stock record |
9034197 |
| |
Stock details |
cultivar:W22 (C1:R-r:std - PI 674445) |
| |
Stock provided by |
Hugo Dooner |
|
|
| Biosample information |
| |
GenBank BioSample |
SAMN04479043 |
| |
Sample type |
whole organism |
| |
Sample description |
Plant sample collected by hand; HMW DNA extracted (80-120 kb in size) |
| |
Collection date |
5-Sep-14 |
| |
Collected by |
Jiang Hui |
| |
Age |
9th day after sowing the seed |
| |
Plant structure |
PO:0000003 |
| |
Developmental stage |
seedling |
|
|
|
|
|
|
|
|
|
|
|
Sequencing and Assembly Information |
|
| |
Assembly name |
Zm-W22-REFERENCE-NRGENE-2.0 |
| |
Sequencing description |
Sequence service provider: Roy J. Carver Biotechnology Center (Urbana, IL) at the University of Illinois; Sequencing method: Illumina short read and 10x Genomics; Sequencing hardware: Illumina short read and 10x Genomics |
| |
Assembly description |
Assembly methods: DenovoMAGIC; Construction of pseudomolecules: Scaffolds were ordered and oriented |
| |
Browse Genome |
Genome browser at MaizeGDB |
| |
Data download |
ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/plant/Zea_mays/latest_assembly_versions/GCA_001644905.2_Zm-W22-REFERENCE-NRGENE-2.0 |
| |
Release date |
2017 |
| |
Sequencing method |
NRGene de novo assembly |
| |
Finishing strategy |
Complete genome |
| |
Seq hardware |
Illumina HiSeq2500 |
| |
Seq chemistry |
v4 and rapid mode 2 |
| |
Seq chemistry version |
v4 and rapid mode 2 |
| |
Genome coverage |
210x |
| |
Seq service provider |
Roy J. Carver Biotechnology Center (Urbana, IL) at the University of Illinois |
|
| Assembly statistics |
| |
Scaff num |
306 |
| |
Longest scaff |
83 Mbp |
| |
N50 scaff length |
35 Mbp |
| |
N50 scaff count |
18 |
| |
N90 scaff length |
10,997,073 bp |
| |
N90 scaff count |
58 |
|
|
Scaff num: Total number of scaffolds in the assembly.
Longest scaff: Length of the longest scaffold in the assembly.
N50 scaff length: The length of the scaffold that takes the cumulative length (summing from longest to shortest scaffold) past 50% of the total assembly size.
N50 scaff count: The number of scaffolds counted in reaching the N50 threshold.
N90 scaff length: The length of the scaffold that takes the cumulative length (summing from longest to shortest scaffold) past 90% of the total assembly size.
N90 scaff count: The number of scaffolds counted in reaching the N90 threshold.
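The N50 and N90 definitions above can be illustrated with a minimal Python sketch. The function name `nx_stats` and the toy scaffold lengths are hypothetical and are not taken from the W22 assembly. The sketch also assumes the threshold counts as reached when the running sum is greater than or equal to the given percentage of the total.

```python
def nx_stats(lengths, percent):
    """Return (Nx scaffold length, Nx scaffold count) for a list of lengths.

    `percent` is 50 for N50, 90 for N90, and so on. Integer arithmetic is
    used so the threshold comparison is exact.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    total = sum(lengths)
    running = 0
    # Sum from the longest scaffold to the shortest, as in the definition.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running * 100 >= percent * total:
            return length, count


# Toy example (hypothetical lengths, total = 300).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50_length, n50_count = nx_stats(toy_lengths, 50)
n90_length, n90_count = nx_stats(toy_lengths, 90)
print(n50_length, n50_count)
print(n90_length, n90_count)
```

Because the scaffolds are summed from longest to shortest, the N90 length can never exceed the N50 length, which gives a quick consistency check on reported statistics.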
|
|
|
|
A contig is a contiguous consensus sequence that is
derived from a collection of overlapping reads.
A scaffold is a set of ordered and oriented contigs
that are linked to one another by mate pairs of sequencing reads.
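To make the contig/scaffold distinction concrete, here is a small sketch that recovers contigs from a scaffold sequence. It assumes gaps between contigs are written as runs of `N`, which is a common FASTA convention but is not stated on this page for this assembly. The function name `split_contigs`, the `min_gap` parameter, and the example sequence are hypothetical.

```python
import re


def split_contigs(scaffold_seq, min_gap=1):
    """Split a scaffold into contigs at runs of at least `min_gap` N characters."""
    if min_gap < 1:
        raise ValueError("min_gap must be at least 1")
    gap_pattern = re.compile(r"[Nn]{%d,}" % min_gap)
    # Drop empty strings produced by leading or trailing gaps.
    return [contig for contig in gap_pattern.split(scaffold_seq) if contig]


scaffold = "ACGTNNNNNGGCCNNTTAA"  # hypothetical scaffold with two gaps
print(split_contigs(scaffold))
print(split_contigs(scaffold, min_gap=3))
```

Raising `min_gap` treats short runs of `N` as ambiguous bases inside a contig rather than as gaps between contigs.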
|
|
|
|
|
|
|
|
|
|
|
Annotation |
|
| |
Annotation Identifier |
Zm00004b.1 |
| |
Annotation Provider |
Yinping Jiao, Ware lab |
| |
Annotation Date |
May, 2017 |
| |
Annotation Software |
MAKER-P |
| |
Annotation Description |
Annotation of protein-coding genes was performed using the MAKER-P pipeline software (Campbell et al. 2014), with parameters and evidence similar to those recently used to annotate B73 (Law et al. 2015; Jiao et al. 2016). Repeat masking by RepeatMasker was performed using exemplar transposon sequences (Schnable et al. 2009) available online at the maize transposable element database. We excluded helitron and MULE elements to avoid false-positive masking from captured exon sequences in such elements. Gene expression evidence included PacBio Iso-seq long reads sequenced from cDNA libraries of six tissues in B73 (n=111,151) (Wang et al. 2016). In addition, we included the following transcriptome assemblies, each processed to exclude short transcripts (<300 bp) and redundancies based on application of CD-HIT (Fu et al. 2012): 1) a pooled set of 94 transcriptome assemblies constructed from publicly available RNA-seq reads (n=508,233) (Law et al. 2015), 2) a transcriptome assembly of B73 seedlings (n=112,963) (Martin et al. 2014), 3) a transcriptome assembly of W22 tissues (n=589,743). Cross-species evidence was supplied in the form of the following annotated protein files downloaded from Gramene release 46 (Gramene FTP) (Tello-Ruiz et al. 2016): 1) Arabidopsis_thaliana.TAIR10.27.pep.all.fa, 2) Brachypodium_distachyon.v1.0.27.pep.all.fa, 3) Oryza_sativa.IRGSP-1.0.27.pep.all.fa, 4) Setaria_italica.JGIv2.0.27.pep.all.fa, and 5) Sorghum_bicolor.Sorbi1.27.pep.all.fa. Alignment and downstream processing of sequence evidence to the repeat-masked W22 reference was performed within the MAKER-P pipeline using default parameters. For gene model prediction, the pipeline incorporated AUGUSTUS (Stanke et al. 2006) applied with the maize5 model and FGENESH (Salamov and Solovyev 2000) applied with the monocot model.
Stable gene identifiers were assigned using the format Zm00004bXXXXXX (where the X's represent a random 6-digit number), as specified under A Standard For Maize Genetics Nomenclature available at MaizeGDB. |
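The identifier format described above (the prefix `Zm00004b` followed by six digits) can be checked with a short Python sketch. The function name `is_w22_gene_id` and the example identifiers are hypothetical and are not real gene models. Any transcript or protein suffixes that the nomenclature standard may define are not handled here.

```python
import re

# Prefix for this annotation followed by exactly six digits.
GENE_ID_RE = re.compile(r"Zm00004b\d{6}")


def is_w22_gene_id(identifier):
    """Return True if `identifier` matches the Zm00004bXXXXXX gene ID format."""
    return GENE_ID_RE.fullmatch(identifier) is not None


# Hypothetical identifiers, for illustration only.
for candidate in ["Zm00004b012345", "Zm00004b12345", "Zm00001d012345"]:
    print(candidate, is_w22_gene_id(candidate))
```

A check like this can help catch identifiers from other assemblies or annotation versions that get mixed into a W22 gene list.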
| |
Annotation Download |
ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/plant/Zea_mays/all_assembly_versions/GCA_001644905.2_Zm-W22-REFERENCE-NRGENE-2.0 |
|
|
|
|
|
|
|