Information about assembly Zm-B73-REFERENCE-GRAMENE-4.0 (also known as AGPv4, B73 RefGen_v4)

Genome Sequencing Project Information

   The reference genome of Zea mays ssp. mays, inbred B73, was completely resequenced using PacBio Single Molecule Real-Time technology and a high-resolution genome map. Seed for the sequenced accession is available from NCRPIS (PI 677128).
   Project PI   Doreen Ware
   Project start date   2015
   Release date   2017
   Changes to previous version   De novo assembly of this representative maize genome using PacBio sequencing technologies.
   Funding   NSF Gramene grant IOS-1127112, NSF Cereal Gene Discovery grant 1032105, USDA-ARS CRIS 1907-21000-030-00D, NSF Plant Genome award 1238014, USDA Hatch project CA-D-PLS-2066-H, NSF Plant Genome award 1444514, NSF grant 1444624, USDA NIFA project HAW05022-H, and NSF PGRP PRFB 1523793
   Browse Genome   Genome browser at MaizeGDB
   Data download   ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/plant/Zea_mays/latest_assembly_versions/GCA_000005005.6_B73_RefGen_v4
   Publication status   Published
   Project reference   Improved maize reference genome with single-molecule technologies. Yinping Jiao; Paul Peluso; Jinghua Shi; Tiffany Liang; Michelle C. Stitzer; Bo Wang; Michael Campbell; Joshua C. Stein; Xuehong Wei; Chen-Shan Chin; Katherine Guill; Michael Regulski; Sunita Kumari; Andrew Olson; Jonathan Gent; Kevin L. Schneider; Thomas K. Wolfgruber; Michael R. May; Nathan M. Springer; Eric Antoniou; Richard McCombie; Gernot G. Presting; Michael McMullen; Jeffrey Ross-Ibarra; Kelly Dawe; Alex Hastie; David R. Rank; Doreen Ware

Stock and Biosample Information

Stock information
   Stock name   PI 677128 (maize inbred line B73 from NCRPIS (PI 550473), grown at the University of Missouri)
   Stock details   PI 677128 (maize inbred line B73 from NCRPIS (PI 550473), grown at the University of Missouri)
Biosample information

Sequencing and Assembly Information

   Assembly name   Zm-B73-REFERENCE-GRAMENE-4.0
   Sequencing description   Sequence service provider: Pacific Biosciences
Sequencing technologies: PacBio SMRT
Sequencing method: PacBio Single Molecule Real-Time sequencing
   Assembly description   Assembly methods: Celera Assembler PBcR-MHAP pipeline and Falcon. Quiver from SMRT Analysis v2.3.0 was used to polish the base calls of the contigs.
Construction of pseudomolecules: Sequences from BACs used for the v3 pseudomolecules were aligned to the PacBio contigs using MUMmer. The scaffolds were then ordered and oriented into pseudochromosomes using the order of the BACs as a guide. Gap filling was done with PBJelly. The pseudomolecules were then polished using the Quiver pipeline from SMRT Analysis v2.3.0. Illumina 2500 Rapid reads were used to improve the accuracy of base calls: the reads were aligned to the assembly using BWA-MEM, and SAMtools was used to generate the BAM-format alignment for the Pilon pipeline.
   Browse Genome   Genome browser at MaizeGDB
   Data download   ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/plant/Zea_mays/latest_assembly_versions/GCA_000005005.6_B73_RefGen_v4
   Release date   2017
   Sequencing method   PacBio SMRT
   Finishing strategy   Complete genome, 65X coverage. Of the assemblies produced, the PBcR-MHAP assembly had the fewest contigs (3,303) and was the one adopted for the B73 RefGen_v4 genome.
   Genome coverage   65X
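The short-read polishing step in the assembly description above (Illumina reads aligned with BWA-MEM, a BAM alignment produced with SAMtools, then passed to Pilon) can be sketched as a sequence of command lines. This is a minimal illustration only: the record does not give the parameters that were actually used, so the file names, the thread count, the `pilon.jar` path, and the helper name `polishing_commands` are all hypothetical. The function only builds the commands and does not run them.

```python
from typing import List


def polishing_commands(reference: str, reads_1: str, reads_2: str,
                       prefix: str, threads: int = 4) -> List[List[str]]:
    """Build (but do not run) commands for one round of short-read polishing.

    All paths and the thread count are illustrative assumptions, not the
    settings used for B73 RefGen_v4.
    """
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    return [
        # Index the assembly so BWA can align against it.
        ["bwa", "index", reference],
        # Align paired-end Illumina reads with BWA-MEM, writing SAM output.
        ["bwa", "mem", "-t", str(threads), "-o", sam,
         reference, reads_1, reads_2],
        # Sort the alignment and write it in BAM format with SAMtools.
        ["samtools", "sort", "-o", bam, sam],
        # Index the sorted BAM file, which Pilon requires.
        ["samtools", "index", bam],
        # Run Pilon on the assembly using the aligned read pairs.
        ["java", "-jar", "pilon.jar", "--genome", reference,
         "--frags", bam, "--output", f"{prefix}.pilon"],
    ]


if __name__ == "__main__":
    for cmd in polishing_commands("assembly.fa", "reads_R1.fq",
                                  "reads_R2.fq", "b73"):
        print(" ".join(cmd))
```

Each inner list could be passed to `subprocess.run(cmd, check=True)` on a machine where the three tools are installed.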
Assembly statistics
   Scaff num   356
   Perc seq scaffold   99
   Perc seq unscaffold   1
   Total scaff length   2,075,000,000 bp
   N50 scaff length   9,730,000 bp
   N50 scaff count   79
   N90 scaff length   595,319 bp
   N90 scaff count   356
   Total contig length   2,104,000,000 bp
   N50 contig length   1,180,000 bp
Assembly statistic definitions
   Scaff num   Total number of scaffolds in the assembly.
   Perc seq scaffold   Percentage of the assembly in scaffolded contigs.
   Perc seq unscaffold   Percentage of the assembly in unscaffolded contigs.
   Total scaff length   Total sequence length represented by scaffolds.
   N50 scaff length   Length of the scaffold that takes the cumulative length (summing from longest to shortest scaffold) past 50% of the total assembly size.
   N50 scaff count   Number of scaffolds counted in reaching the N50 threshold.
   N90 scaff length   Length of the scaffold that takes the cumulative length (summing from longest to shortest scaffold) past 90% of the total assembly size.
   N90 scaff count   Number of scaffolds counted in reaching the N90 threshold.
   Total contig length   Total sequence length represented by contigs.
   N50 contig length   Length of the contig that takes the cumulative length (summing from longest to shortest contig) past 50% of the total assembly size.
   Contig   A contiguous consensus sequence derived from a collection of overlapping reads.
   Scaffold   A set of ordered and oriented contigs that are linked to one another by mate pairs of sequencing reads.
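
The N50 and N90 definitions above translate directly into a short calculation. The sketch below is an illustration with made-up lengths, not the code used to produce the statistics in this record. The function name `nx_stats` is hypothetical, and it treats the threshold as reached once the running sum is at least the given fraction of the total, which is the common convention.

```python
from typing import List, Tuple


def nx_stats(lengths: List[int], fraction: float = 0.5) -> Tuple[int, int]:
    """Return (Nx length, Nx count) for a list of sequence lengths.

    Sequences are summed from longest to shortest; the Nx length is the
    length of the sequence at which the running sum first reaches
    `fraction` of the total, and the Nx count is how many sequences
    were needed to get there.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    threshold = sum(lengths) * fraction
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("fraction must be between 0 and 1")


if __name__ == "__main__":
    toy_scaffolds = [100, 80, 60, 40, 20]   # made-up lengths, total 300
    print(nx_stats(toy_scaffolds, 0.5))     # (80, 2): 100 + 80 = 180 >= 150
    print(nx_stats(toy_scaffolds, 0.9))     # (40, 4): 100 + 80 + 60 + 40 = 280 >= 270
```

Passing contig lengths instead of scaffold lengths gives the contig N50 in the same way.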

Annotation

   Annotation Identifier   Zm00001d.1
   Annotation Provider   Doreen Ware, Cold Spring Harbor
   Annotation Date   9-12-2016
   Annotation Software   MAKER-P, Genome Assembly Converter, MUMmer, CrossMap, Augustus, FGENESH
   Annotation Description   Gene annotation based on GenBank cDNA, PacBio Iso-Seq RNA, MAKER-P gene model annotation using Arabidopsis, rice, sorghum, Setaria, and Brachypodium, Augustus and FGENESH gene model prediction, Genome Assembly Converter, MUMmer, and CrossMap with known v3 gene models
   Annotation Download   ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/plant/Zea_mays/latest_assembly_versions/GCF_000005005.2_B73_RefGen_v4
   Annotation Identifier   Zm00001d.2
   Annotation Provider   Doreen Ware, Cold Spring Harbor
   Annotation Date   6-7-2017
   Annotation Software   MAKER-P, Genome Assembly Converter, MUMmer, CrossMap, Augustus, FGENESH
   Annotation Description   Gene annotation based on GenBank cDNA, PacBio Iso-Seq RNA, MAKER-P gene model annotation using Arabidopsis, rice, sorghum, Setaria, and Brachypodium, Augustus and FGENESH gene model prediction, Genome Assembly Converter, MUMmer, and CrossMap with known v3 gene models. Corresponds to Gramene version 36.
   Annotation Download   ftp://ftp.ensemblgenomes.org/pub/plants/release-36/fasta/zea_mays
   Annotation Identifier   NCBI 101
   Annotation Provider   NCBI
   Annotation Date   2017-03-20
   Annotation Software   NCBI Eukaryote Annotation
   Annotation Description   Annotated by the NCBI Eukaryotic Genome Annotation Pipeline.
   Annotation Download   ftp://ftp.ncbi.nlm.nih.gov/genomes/Zea_mays/protein/