MaizeGDB

A Standard For Maize Genetics Nomenclature

From MNL 69:182-184 (1995), as updated Sep 1996; Apr 2000; Apr 2002; Oct 2006.


PREAMBLE: We wish to have a system that is consistent, compatible with the historical background of maize genetics (insofar as these two goals can be reconciled), is easily understood by plant geneticists working with other species, and forms the basis for the importation of maize data into a general plant genetics data base so that the basic knowledge concerning maize genes is available to researchers with other species and vice versa. We believe that this goal is best implemented by the researchers in each species having their own working vocabulary, while the identification of genes that catalyze the same functions in all species should rely on entry into a relational data base of the genes' function as an E.C. number (2.4.1.13), trivial name (sucrose synthase), and systematic name (UDPglucose:D-fructose 2-glucosyltransferase). The situation can be less completely categorized for genes whose products are transcription factors, structural proteins, storage proteins, etc.

If one accepts the premise outlined above that the common ground between species need not reside in the working vocabulary of geneticists using any species as a model system but in the manner in which their data are expressed in the data base, then the previously adopted names for maize genes can be retained. It will not be necessary to rename the genes previously named on the basis of the mutant phenotype produced as soon as the function of the nonmutant alleles becomes known, but we should proceed to define more precisely words or terms whose meanings need clarification and to decide how we wish to deal with the new information becoming available.

1. DEFINITIONS: The words "locus" and "gene" should not be treated as synonymous. A locus can be defined as "a chromosomal site of variable size at or within which is located a gene, a restriction site, a knob, a breakpoint, an insertion, or other distinguishable feature". This necessitates specifying whether we mean a gene locus or an RFLP locus, etc. We can then define a plant gene as "a DNA sequence of which a segment is regularly or conditionally transcribed at some time in either or both generations of the plant. The DNA is understood to include not only the exons and introns of the structural gene but the cis 5' and 3' regions in which a sequence change can affect gene expression". This treats the gene as a functionally defined entity that is not circumscribed by the transcribed region or other fixed limits.

2. ANONYMOUS TRANSCRIPTS: For most of the history of genetics, the existence of a gene was recognized when a mutation occurred, and the gene was then named by a word/term that was descriptive of the mutant phenotype. That will continue to be the practice except with isozyme markers, for which the designation will be the enzyme in question, or the instances in which the biochemical lesion responsible for the mutant phenotype is identified before the locus is reported. The loci of these genes have then been placed on chromosome maps in relation to other mapped loci. However, we now have the possibility of recognizing genes in which no mutation has been detected through the construction of cDNA libraries. These anonymous cDNAs are often used as probes in RFLP mapping. When such a probe hybridizes to a single band, it is clear that the RFLP loci circumscribe the transcriptional unit that encodes the message represented by the cDNA, and these RFLP loci with other RFLP loci can be used as the basis for mapping the gene. Mapping a locus in this fashion is encouraged as a means of obtaining maximum coverage of the genome. As long as the locus retains an anonymous status (unknown function or no mutant phenotype), the symbol for the locus should be assigned according to the convention used for RFLP loci (as umc148, see Section 8). Further information about the probe and its derivation is best provided in tabular or data base form rather than in the symbol itself.

A gene name identifying function for a locus detected with a cloned sequence should be given only when there is unambiguous evidence that this is the site by which that function is encoded. Particular caution should be taken in identifying genes (and their function) from several RFLPs hybridizing to a gene-specific probe from another organism. Until a sequence has been shown to encode the function in question, the gene designation should be that of an RFLP locus (see Section 8).

3. STANDARD NOMENCLATURE AND SYMBOLS: The names and symbols that have been used for maize genes should be retained. The name and symbol of a gene locus should be represented with lower-case, italic characters (defective kernel12, dek12). Note that no hyphen separates the gene name from a numerical suffix, which is a change from previous usage. We use a hyphen in the case of mutant alleles to separate the allele designation from a suffix specifying the particular allele (see Section 5). We advocate strongly that all genes identified in the future be given a three letter symbol. Newly detected maize genes that have been previously identified in other plant species should be named where appropriate (see the last paragraph in Section 2) with reference to the list of generic names compiled by the Commission on Plant Gene Nomenclature.

When designating homozygous genotypes with two or more unlinked genes, the genes are separated by semicolons, e.g. a1;a2;c1;c2;r. If linked, the genes are separated by spaces, e.g. C1 sh1 bz1 Wx1. Heterozygous genotypes should be written with a slash separating the sets of linked genes, e.g. C1 Bz1/c1 bz1. If the genes are unlinked, the proper designation is Sh2/sh2; Bt2/bt2.
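
The separator rules above can be sketched in code. The following Python is a minimal illustration and not part of the standard: the function names are hypothetical, and the spacing around semicolons simply mirrors the examples given in the text.

```python
def format_homozygous(linkage_groups):
    """Format a homozygous genotype.

    linkage_groups is a list of lists; genes in the same inner list are
    linked (joined by spaces), and separate inner lists are unlinked
    (joined by semicolons).
    """
    return ";".join(" ".join(group) for group in linkage_groups)


def format_heterozygous(linkage_groups):
    """Format a heterozygous genotype.

    Each element is a pair (haplotype_1, haplotype_2) of gene-symbol lists
    for one linkage group; a slash separates the two sets of linked genes.
    """
    parts = []
    for hap1, hap2 in linkage_groups:
        parts.append(" ".join(hap1) + "/" + " ".join(hap2))
    # Assumption: "; " between unlinked groups, as in "Sh2/sh2; Bt2/bt2".
    return "; ".join(parts)


print(format_homozygous([["a1"], ["a2"], ["c1"], ["c2"], ["r"]]))
print(format_homozygous([["C1", "sh1", "bz1", "Wx1"]]))
print(format_heterozygous([(["C1", "Bz1"], ["c1", "bz1"])]))
print(format_heterozygous([(["Sh2"], ["sh2"]), (["Bt2"], ["bt2"])]))
```

The four calls reproduce the four example genotypes in the paragraph above.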

4. LOCI WITH THE SAME GENE NAME: Where we have more than one nonallelic mutant with the same gene name, the earlier recommendation was that the first one to receive that name should not have a numerical suffix but the second has 2 as a suffix. Thus we have shrunken (sh), shrunken2 (sh2), and shrunken4 (sh4) mutants. Geneticists outside the maize community are apt to misinterpret this convention. We recommend that we be consistent and write shrunken1 or sh1 and advocate that even if a new locus is identified and given a unique name, it be designated as 1. This has the definite advantage in maintaining data bases and indices that no retrospective correction would be necessary if a second gene locus receives the same designation.

5. ALLELIC DESIGNATIONS: Where a mutant allele is recessive, it should be designated by an italicized symbol (lower case) as dek12, which is the same as the symbol of the locus. Since it is unlikely that any two mutant or nonmutant alleles in a highly polymorphic species such as maize have identical sequences, maize geneticists are encouraged to specify the particular allele with which they are working (see in this Section, Alleles of Independent Mutational Origin and Designation of Nonmutant Alleles). The symbol for dominant, nonmutant (i.e., conditioning a normal phenotype) alleles will be the same italicized three letter symbol as the mutant alleles but with the first letter capitalized (Dek12). The symbol of the gene product should not be italicized and should be written with all letters capitalized (e.g., ADH1). The name of the gene product (alcohol dehydrogenase) should neither be capitalized nor italicized.

When the mutant alleles of a gene are dominant, the first letter of the mutant symbol is capitalized. The nonmutant symbol has all the letters lower case. For example, the corn grass1 (cg1) gene locus has several dominant mutant (Cg1) alleles as well as nonmutant (cg1) alleles. The reference mutant allele is designated as Cg1-R or -1.

Codominant alleles, such as isozymes where the variants are functional and distinguished from each other by electrophoretic mobility, should be designated by symbols with the first letter capitalized and identified by allelic specifications such as Pgm2-5 or Pgm2-7.
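
The capitalization conventions for recessive and dominant mutant alleles, and for gene product symbols, might be expressed as follows. This is a sketch under stated assumptions: the function names are hypothetical, and italics cannot be represented in plain strings, so only letter case is handled.

```python
def allele_symbols(locus_symbol, mutant_is_dominant=False):
    """Return the mutant and nonmutant allele symbols for a gene locus.

    Recessive mutant: mutant is all lower case, nonmutant has the first
    letter capitalized. Dominant mutant: the reverse.
    """
    lower = locus_symbol.lower()
    capitalized = lower[0].upper() + lower[1:]
    if mutant_is_dominant:
        return {"mutant": capitalized, "nonmutant": lower}
    return {"mutant": lower, "nonmutant": capitalized}


def product_symbol(locus_symbol):
    """Gene product symbols are written with all letters capitalized."""
    return locus_symbol.upper()


print(allele_symbols("dek12"))                          # recessive mutant
print(allele_symbols("cg1", mutant_is_dominant=True))   # dominant mutant
print(product_symbol("adh1"))
```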

ALLELES OF INDEPENDENT MUTATIONAL ORIGIN: The unambiguous designation of mutant alleles that have arisen as independent mutational events is increasingly important. It is generally understood that a gene symbol followed by a hyphen plus a letter or number(s) specifies a particular recessive allele at that gene locus. We have referred to the mutation by which the gene was identified as the reference allele; e.g. bz1-Ref or bz1-R. It is equally appropriate to refer to that allele as bz1-1. The mutations in any gene that were identified subsequently have been categorized in various idiosyncratic ways. Alleles that have arisen by independent mutational events have been designated by letters, numbers, a letter plus numbers, the name of the inbred in which the mutation occurred, and sometimes all of these applied to a group of alleles at a gene locus. While all of these designations served the purpose of indicating that these alleles had independent mutational origins, there is a clear advantage to greater standardization. As in the 1973 Nomenclature Standard, it is recommended that new alleles be identified by a laboratory number that might indicate the year of isolation as sh2-6801. This has the definite advantage that two laboratories are unlikely to designate two new mutations of the same gene by the same number. However, if two laboratories are targeting the same locus in mutagenesis experiments, they should consult before naming their new alleles to avoid giving the same designation to different alleles. Also recommended is the convention of referring to a new mutation of a given phenotype by a provisional designation as bt*-lab number until it is ascertained whether the mutant is a new allele of a known gene or identifies a previously unidentified gene. In the first instance, the proper gene symbol (bt1 or sh2) replaces bt*, but the lab number is retained (e.g., bt1-8711). 
In the second instance (a previously unidentified locus), a new gene name and symbol would be selected, and this mutant would become the reference allele (-R or -1).

When mutant alleles are referred to in the generic sense without specification of their origin, a hyphen without further designation (e.g., bz1-, dek12-) is desirable to make it clear that one is referring to an allele or alleles, not the gene locus.
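
As an illustration of the allele designations described in this section, the sketch below splits a designation into its gene symbol and allele suffix. It is a hypothetical helper, not an official parser: it assumes a numbered gene symbol (per Section 4) or a provisional symbol ending in an asterisk, followed by a hyphen and an optional letter/number designation.

```python
import re

# Assumed shape: letters, then a locus number or "*", a hyphen, and an
# optional alphanumeric allele designation (empty for the generic form).
ALLELE_RE = re.compile(r"^(?P<gene>[A-Za-z]+(?:\d+|\*))-(?P<designation>[A-Za-z0-9]*)$")


def parse_allele(text):
    """Split an allele designation such as 'sh2-6801' into its parts."""
    match = ALLELE_RE.match(text)
    if match is None:
        raise ValueError(f"not a recognized allele designation: {text!r}")
    gene = match.group("gene")
    designation = match.group("designation")
    return {
        "gene": gene,
        "designation": designation,
        "provisional": gene.endswith("*"),            # e.g. bt*-8711
        "generic": designation == "",                  # e.g. bz1-
        "reference": designation in {"R", "Ref", "1"},
    }


def resolve_provisional(text, gene_symbol):
    """Replace a provisional symbol (bt*) with the proper gene symbol,
    retaining the lab number."""
    parsed = parse_allele(text)
    if not parsed["provisional"]:
        raise ValueError(f"{text!r} is not a provisional designation")
    return f"{gene_symbol}-{parsed['designation']}"


print(parse_allele("bz1-Ref"))
print(parse_allele("dek12-"))
print(resolve_provisional("bt*-8711", "bt1"))
```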

DESIGNATION OF NONMUTANT ALLELES: Since it is now apparent that in a species as polymorphic as maize, nonmutant alleles from different sources are apt to have a number of sequence differences one from the other, and these differences can be reflected in gene action (nonmutant isoalleles), it is desirable to specify the nonmutant allele being investigated or used as a control. Incorporating the name of the inbred as part of the allelic designation, Bz1-W22, is an appropriate method of doing this. However, mutant alleles should not be designated by the inbred in which they arose (e.g., bz1-W22) to avoid confusion with the progenitor allele. Also, there may eventually be numerous mutant alleles of a particular gene isolated in that inbred if a researcher uses that inbred in a mutagenesis experiment. A particular nonmutant allele may be found in an exotic race or other accession that is not an inbred. A unique designator (e.g., a PI number or Bolivia #) should be part of the allelic designation.

RFLPs AND RAPDs AS ALLELES: The presence or absence of a restriction site or a primer-amplifiable sequence at a particular locus represent Mendelian alternatives. They fall under the broadest definition of an allele, and it is appropriate to refer to these alternatives as alleles as has already been done in some reports.

6. NAMING DELETIONS: When it is clear that a mutation results from a deletion that has removed all or part of two gene loci, it would be appropriate to indicate this in the following manner. For an1-6923, this would be def(an1..bz2)-6923, and for sh-bz-X2, def(bz1..sh1)-X2. When molecular evidence indicates that a deletion has removed all of the structural portion of a gene as is true of wx1-C34, it should be indicated in the same manner; i.e., def(wx1)-C34.
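
The deletion notation can be generated mechanically. The helper below is a hypothetical sketch that covers only the two cases shown in the text: a deletion within one gene, or a deletion spanning two gene loci.

```python
def deletion_name(loci, designation):
    """Build a deletion designation such as def(an1..bz2)-6923.

    loci is a list of one or two gene symbols; designation is the
    allele designation retained after the hyphen.
    """
    if len(loci) == 1:
        span = loci[0]
    elif len(loci) == 2:
        span = f"{loci[0]}..{loci[1]}"
    else:
        raise ValueError("expected one or two gene symbols")
    return f"def({span})-{designation}"


print(deletion_name(["an1", "bz2"], "6923"))
print(deletion_name(["bz1", "sh1"], "X2"))
print(deletion_name(["wx1"], "C34"))
```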

7. MUTATIONS RESULTING FROM TRANSPOSABLE ELEMENT INSERTIONS: There is one further point concerning allelic specification. Maize in particular has many mutable alleles resulting from the insertion of a transposable element. These have been designated by the mutant symbol, a hyphen, a lower case "m", and an isolation number; e.g., wx-m1. When the transposable element insertion [Ac, Ds, Spm(En), dSpm(I), Mu1..MuX, etc.] is known, it is suggested that this be indicated by a double colon following the allele as wx-m1::Ds1. Since a maize stock may have more than one transposable element family active at the same time, firm genetic and/or molecular evidence is necessary to ascribe mutability to a particular transposable element family. Further, mutable alleles generate both stable nonmutant and stable mutant alleles when the transposable element excises from the gene locus. Since the mutant derivatives are certain to differ in sequence from the nonmutant progenitor allele around the site of the transposable element insertion and the nonmutant derivatives are very likely to differ at that site, researchers should be certain to indicate the origin of such alleles in their reports. One means of doing this is to indicate such an origin by an apostrophe following the locus symbol as Bz1'-7801 or bz1'-8905. The specifics of its origin including the transposable element involved could then be included in the text and entered in the Maize Genome Data Base. Since transpositions of a transposable element from a site within a gene often insert in locations where they have no phenotypic effect but can be useful markers, it is desirable to have a standard to refer to such insertions. Designate them as RFLPs would be designated (see Section 8), but follow the institutional symbol and number with a double colon and the symbol of the transposable element (e.g., dnap2094::Ac).
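
A short sketch of how the double-colon and apostrophe conventions could be read programmatically follows. The function names are hypothetical, and the code only separates the parts of a designation; it does not validate the element name against any list of known transposable elements.

```python
def parse_insertion(text):
    """Split 'wx-m1::Ds1' into the allele (or marker) designation and
    the transposable element named after the double colon, if any."""
    designation, separator, element = text.partition("::")
    return {
        "designation": designation,
        "element": element if separator else None,
    }


def is_excision_derivative(text):
    """True when an apostrophe follows the locus symbol, as in Bz1'-7801,
    marking an allele derived from a mutable allele."""
    symbol, _, _ = text.partition("-")
    return symbol.endswith("'")


print(parse_insertion("wx-m1::Ds1"))
print(parse_insertion("dnap2094::Ac"))
print(is_excision_derivative("Bz1'-7801"))
```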

8. NAMING RFLPs AND RAPDs: In naming RFLPs and RAPDs, use a lower case three or four letter code designating the originating university or company followed by a laboratory number (no space between the code and the number). When the probe used is a cDNA or a subclone of a gene, the gene symbol should be added in parentheses after the RFLP locus designation, as umc000(a1). Since a probe not infrequently recognizes RFLPs on two or more chromosomes, these should be designated by the same institutional code, number, and probe followed immediately by A, or B, or C. Insofar as possible, the locus with the strongest hybridization should be designated A and the more weakly hybridizing loci be designated B, C, etc. in descending order of signal strength.
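
One possible reading of this naming pattern as a regular expression is sketched below. The pattern and function name are assumptions made for illustration: in particular, placing the A/B/C suffix after the optional parenthesized gene symbol is an interpretation of "followed immediately by", and two-letter codes are accepted because the appendix lists some (e.g. rz).

```python
import re

# code: lower-case institutional code (the text says three or four letters;
#       two is allowed here because the appendix includes codes such as rz)
# number: laboratory number, no space after the code
# gene: optional gene symbol in parentheses, e.g. (a1)
# copy: optional A, B, C, ... for probes detecting more than one locus
RFLP_RE = re.compile(
    r"^(?P<code>[a-z]{2,4})(?P<number>\d+)"
    r"(?:\((?P<gene>[A-Za-z0-9]+)\))?(?P<copy>[A-Z])?$"
)


def parse_rflp(name):
    """Return the parts of an RFLP/RAPD locus name, or raise ValueError."""
    match = RFLP_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognized RFLP locus name: {name!r}")
    return match.groupdict()


print(parse_rflp("umc148"))
print(parse_rflp("umc000(a1)"))
```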

9. CHROMOSOME REARRANGEMENTS: The conventions for dealing with chromosomal rearrangements are well established and adequate for the purpose. To designate particular reciprocal translocations as T1-2a or T1-9(4995) etc. with the breakpoints noted parenthetically or in a table of supporting information is explicit and sufficient. Additional information (the fact that the translocation stock is homozygous for wx1) can be incorporated by prefacing the translocation number with the gene symbol as the Co-op does in its stock lists (e.g., wx1 T1-9c). Translocations with B chromosomes have designations that indicate the arm of the A chromosome involved (L or S) as well as a lower case letter distinguishing that translocation from any others involving that particular chromosome arm, as TB-5Sc. The cytological breakpoint in the A chromosome as well as the loci uncovered when the TB translocation is used as a male parent can be noted in the text or in a table of supplementary information. The designations for inversions (e.g., Inv9b again with the breakpoints, 9S.05-L.87, listed in a supporting table) are succinct and convey the necessary information.

10. ORGANELLAR GENES: For chloroplast and mitochondrial genes, we accept for the present the proposals already in place. For chloroplast genes, this is Hallick and Bottomley, 1983. Plant Mol. Biol. Rep. 1(4): 38-43, as updated at SwissProt or by the Chloroplast working group for the Commission on Plant Gene Nomenclature. For mitochondrial genes, this is Lonsdale and Leaver, 1988. Ibid. 6(2):14-21, updated by the Mitochondrion working group for the Commission on Plant Gene Nomenclature. For brevity's sake, these are not summarized here.

11. TRANSCRIPTION FACTORS: (Oct 2006 addition) We define here TFs as proteins that contain a DNA-binding domain and that fall within one of the families described in http://arabidopsis.med.ohio-state.edu/AtTFDB/.

There is currently no coherent effort in maize for a rational and organized naming of transcription factors (TFs). The use of GenBank accession numbers, EST names or locus identifiers provides an impractical mechanism, which often leads to ambiguities, for example because of multiple entries in GenBank or of several ESTs for the same protein. Thus, we propose here to create a uniform nomenclature for maize TFs, following the lead from Arabidopsis. A similar proposal is being adopted by the TIGR rice annotation group and by the SUCEST-FUN sugarcane annotation group.

Recommendation
Gene products - Each transcription factor will have an organism identifier (Zm) to be used only in the context of other organisms, followed by letters that represent the TF family (e.g., MYB, bHLH, HD, bZIP) and by a number that will start with '1'. A similar strategy is currently being applied to other maize gene families (e.g., the kinesins, see http://www.maizegdb.org/cgi-bin/displaygprecord.cgi?id=276102). Since we realize that many TFs are known by their genetic names, this nomenclature will permit the use of synonyms. For example, KNOTTED could be named HD1(KN) (or ZmHD1(KN) when being compared to HDs of other species) and C1 would be MYB1(C1) (or ZmMYB1(C1)). In addition, whenever possible, we will try to have the numbers provide a historic perspective of which TFs have been first identified. In that regard, since KN and C1 correspond to the founding members of their respective families in maize, they are assigned the number '1'. Prior genetic nomenclature will be incorporated in the database.

Genes - Existing names for genes encoding TFs will not be altered. If necessary, and only as a way to provide coherence with the naming of the gene products, the synonym strategy described above would be used. In that regard, c1 would continue to be c1 but could also be cross-referenced as c1(myb1). New genes will be named according to their products. If mutant phenotypes are identified at a later date, gene names derived from mutant phenotypes will be added as synonyms, but the original name will not be changed. As indicated for the gene products, the use of the prefix Zm in front of the gene's name will only be used when comparing maize genes with related genes from other species (e.g., Zm myb1).
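
The recommendation for gene products and genes can be summarized in a small sketch. The function names and parameters below are hypothetical and only reproduce the examples given in the text (HD1(KN), ZmMYB1(C1), c1(myb1)); they are not an official naming tool.

```python
def tf_product_name(family, number, synonym=None, cross_species=False):
    """Build a TF product name: family letters, a number starting at 1,
    an optional genetic synonym in parentheses, and the Zm prefix only
    when comparing with other organisms."""
    name = f"{family}{number}"
    if synonym:
        name += f"({synonym})"
    if cross_species:
        name = "Zm" + name
    return name


def gene_cross_reference(gene_symbol, family, number):
    """Keep the existing gene name and append the product-based synonym
    in lower case, e.g. c1(myb1)."""
    return f"{gene_symbol}({family.lower()}{number})"


print(tf_product_name("HD", 1, synonym="KN"))
print(tf_product_name("MYB", 1, synonym="C1", cross_species=True))
print(gene_cross_reference("c1", "MYB", 1))
```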

Note that for generating a position for transcription factors, Erich Grotewold served on the Nomenclature Committee in an ad hoc capacity.

12. GENE MODEL IDENTIFIERS: In 2013, MaizeGDB, MaizeSequence.org, and the Maize Nomenclature Committee recognized the need to formulate a method for naming genes across the subspecies such that the nomenclature would do three things.

  1. Use species- and inbred-specific identifiers instead of project-specific identifiers (e.g. "GRMZM" for "Gramene Zea mays").
  2. Reflect the maturity of the B73 genome assembly by not implying chromosomal order in the name. Current order and orientation of gene models within BACs that make up the pseudomolecule may not represent their correct order and orientation on the chromosome.
  3. Allow the unique diversity among maize lines to be accounted for. Order and orientation (indeed presence/absence and copy number) are not conserved among lines [Wang Q and Dooner H 2006 PNAS 103:17644-17649; Springer NM et al 2009 PLoS Genet 5:e1000734]. Nomenclature of genes based on the order in B73 would likely be in conflict among lines, and could unnecessarily imply or confound the order of genes in other lines.

As of March 2013, gene model names for the B73 reference genome assembly version 2 reflect the species and inbred line, and include a random number that is independent of order on a chromosome. Using this nomenclature, the prefix for Zea mays ssp. mays B73 is "ZEAMMB73" and for Zea mays ssp. mays Palomero Toluqueno is "ZEAMMPT". These guidelines were agreed upon by representatives from MaizeGDB, NCBI, Gramene, and the Maize Nomenclature Committee based on discussions facilitated by Carolyn Lawrence.

The Gramene GRMZM identifiers are retained as synonyms and can be searched. This means that gene model ZEAMMB73_979442 will also be found by searching on GRMZM2G010095.
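
The relationship between the new identifiers and their GRMZM synonyms might be modeled as a simple lookup. This is a sketch under stated assumptions: the identifier shape (prefix, underscore, number) is inferred from the single example above, and the tables hold only the values mentioned in this section rather than the real MaizeGDB data.

```python
# Prefixes named in this section, mapped to the line they denote.
PREFIXES = {
    "ZEAMMB73": "Zea mays ssp. mays B73",
    "ZEAMMPT": "Zea mays ssp. mays Palomero Toluqueno",
}

# Illustrative synonym table with the one pair given in the text.
SYNONYMS = {"GRMZM2G010095": "ZEAMMB73_979442"}


def parse_gene_model(identifier):
    """Resolve a synonym if needed, then split the gene model name into
    its prefix (species and inbred line) and its order-independent number."""
    canonical = SYNONYMS.get(identifier, identifier)
    prefix, separator, number = canonical.partition("_")
    if not separator or prefix not in PREFIXES or not number.isdigit():
        raise ValueError(f"not a recognized gene model identifier: {identifier!r}")
    return {"id": canonical, "line": PREFIXES[prefix], "number": int(number)}


print(parse_gene_model("GRMZM2G010095"))
print(parse_gene_model("ZEAMMB73_979442"))
```

Both calls return the same record, reflecting that the GRMZM name is retained only as a searchable synonym.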

Click here for a table of the current names and synonyms.

13. CLEARING HOUSE FOR NOMENCLATURE: We also believe that it is desirable to initiate a clearing house for maize nomenclature so that a researcher wishing to name a recently identified gene can ascertain almost immediately that no one has used the proposed designation and symbol. This clearing house can, in principle, function through MaizeGDB, which will be refereed by a cooperator. The same facility could be used to ensure that allelic designations are not duplicated or to answer questions concerning nomenclature.

Submitted Sep 10, 1996 by the Nomenclature Subcommittee.

Current Members Include:
Marty Sachs (Chair)
Tom Brutnell
Hugo Dooner
Charles (Chunguang) Du
Toby Kellogg
Carolyn Lawrence
Mary (Polacco) Schaeffer
Philip Stinard

1996 UPDATES:
  • ANONYMOUS TRANSCRIPTS: decision made not to utilize the parenthetic 'gfu' designation for "gene, function unknown". RATIONALE: in common usage, the 'gfu' suffix has proven confusing, implying 'known function', especially to researchers from other species. The confusion arises from the practice in RFLP naming to include parenthetic acronyms where sites are detected by probes with an assigned or putative identity with a particular gene product.

  • ALLELIC DESIGNATIONS: decision made to use '-', rather than '+', in designations of non-mutant alleles. RATIONALE: use of '+' has met with resistance by journal editors; definition of non-mutant alleles can be a grey area.

APPENDIX: PROBE ACRONYMS IN USE

Updated May 2000:

 agr    Agrigenetics                                       
 asg    Asgrow Seed                                        
 ast    Academia Sinica, Taiwan
 bcd    barley cDNA, Cornell University                    
 bnl    Brookhaven National Laboratory 
 bnlg   Brookhaven National Laboratory, SSR probes                    
 cdo    oat leaf cDNA, Cornell University                  
 crc    Carlsberg Research Center                          
 csh    Cold Spring Harbor                                 
 csic   Centro de Investigacion y Desarrollo, Barcelona
 csu    California State University, Hayward
 cuny   City University of New York                        
 dnap   DNA Plant Technology Corp
 dup    Dupont 
 fco    Colorado State U. Fort Collins
 fmi    Friedrich Miescher-Institut                                            
 gii    Genetics Institute Inc.                            
 ias    Iowa State University
 iger   Institute of Grassland and Environmental Research
 inra   Institut National de la Recherche Agronomique
 isc    Ist Sper Cereal
 isu    Iowa State University
 klp    Universitat Hohenheim, Stuttgart                                    
 koln   University of Koln 
 ksu    Kansas State University
 lim    Limagrain
 mmc    Maize Microsatellite Consortium (UK) 
 mmp    Missouri Maize Project                               
 mpik   Max-Planck-Institute, Koln 
 mps    Mycogen Plant Sciences
 nc     North Carolina                        
 ncr    North Carolina Raleigh                             
 ncsu   North Carolina State University                    
 niu    Northern Illinois University                       
 npi    Native Plants Incorporated
 op     Operon Technologies
 osu    Ohio State University
 pbs    Purdue Biological Sciences                         
 pge    Plant Gene Expression Center 
 pgs    Plant Genetic Systems
 phi    Pioneer Hi-Bred International (SSR)                      
 php    Pioneer Hi-Bred International 
 pic    Plant Industry Canberra                     
 psu    Penn State University                              
 rg     rice genomic, Cornell University 
 rgp    Rice Genome Program, Japan                  
 rny    Rockefeller University                             
 rpa    Rhone Poulenc                                      
 rz     rice cDNA, Cornell University
 sb     Sorghum bicolor
 scri   Scottish Crop Research Institute
 std    Stanford University
 tda    Tripsacum dactyloides
 tjp    University of Tokyo, Japan
 ttu    Texas Tech University
 tum    Technische Universitat Munchen
 uat    University of Arizona - Tucson                               
 uaz    University of Arizona                              
 ucb    University of California - Berkeley
 ucd    University of California - Davis
 ucla   University of California - Los Angeles               
 ucr    University of California - Riverside                 
 ucsd   University of California - San Diego                
 ufg    University of Florida - Gainesville                  
 uiu    University of Illinois - Urbana 
 ukd    University of Copenhagen
 uky    University of Kentucky                     
 umc    University of Missouri - Columbia                    
 umn    University of Minnesota
 umsl   University of Missouri - St. Louis
 uob    University of Barcelona
 uom    University of Manitoba
 uor    University of Oregon                            
 uox    University of Oxford
 usu    Utah State University                               
 uwo    University of Western Ontario
 uzh    University of Zurich                      
 wsu    Washington State University                        
 wusl   Washington University, St. Louis                   
 ynh    Yale University                      


Last updated 11:59 am, May 23, 2014.
