Schema for Geneid Genes - Geneid Gene Predictions
  Database: DbipGB2    Primary Table: geneid    Row Count: 25,081
field       example                 SQL type              info
bin         585                     smallint(5) unsigned  range
name        geneid_AFFE02000001.1   varchar(255)          values
chrom       AFFE02000001            varchar(255)          values
strand      -                       char(1)               values
txStart     323                     int(10) unsigned      range
txEnd       407                     int(10) unsigned      range
cdsStart    323                     int(10) unsigned      range
cdsEnd      407                     int(10) unsigned      range
exonCount   1                       int(10) unsigned      range
exonStarts  323,                    longblob
exonEnds    407,                    longblob

Sample Rows
 
bin  name                    chrom         strand  txStart  txEnd  cdsStart  cdsEnd  exonCount  exonStarts  exonEnds
585  geneid_AFFE02000001.1   AFFE02000001  -       323      407    323       407     1          323,        407,
585  geneid_AFFE02000004.1   AFFE02000004  -       140      293    140       293     1          140,        293,
585  geneid_AFFE02000007.1   AFFE02000007  -       641      674    641       674     1          641,        674,
585  geneid_AFFE02000023.1   AFFE02000023  +       10581    10713  10581     10713   1          10581,      10713,
585  geneid_AFFE02000025.1   AFFE02000025  +       6        12     6         12      1          6,          12,
585  geneid_AFFE02000026.1   AFFE02000026  -       4085     4199   4085      4199    1          4085,       4199,
585  geneid_AFFE02000043.1   AFFE02000043  -       780      939    780       939     1          780,        939,
585  geneid_AFFE02000044.1   AFFE02000044  -       961      1159   961       1159    1          961,        1159,
585  geneid_AFFE02000045.1   AFFE02000045  -       411      645    411       645     1          411,        645,
585  geneid_AFFE02000046.1   AFFE02000046  -       11268    11304  11268     11304   1          11268,      11304,
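As an illustration of how rows like these might be consumed, the following minimal Python sketch parses one row into typed fields and pairs up the comma-terminated exonStarts and exonEnds lists. It assumes the table has been exported as tab-separated text with the columns in the order shown above; the function names (parse_coord_list, parse_row, exons) are hypothetical helpers, not part of any database tool.

```python
from typing import Dict, List, Tuple

# Column order as listed in the schema above.
FIELDS = ["bin", "name", "chrom", "strand", "txStart", "txEnd",
          "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds"]
INT_FIELDS = ("bin", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount")


def parse_coord_list(blob: str) -> List[int]:
    """Turn a comma-terminated list such as '323,' into [323]."""
    return [int(item) for item in blob.split(",") if item]


def parse_row(line: str) -> Dict[str, object]:
    """Parse one tab-separated row (assumed export format) into a dict."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    row: Dict[str, object] = dict(zip(FIELDS, values))
    for key in INT_FIELDS:
        row[key] = int(values[FIELDS.index(key)])
    row["exonStarts"] = parse_coord_list(values[FIELDS.index("exonStarts")])
    row["exonEnds"] = parse_coord_list(values[FIELDS.index("exonEnds")])
    # Sanity check: both lists must have exactly exonCount entries.
    if not (len(row["exonStarts"]) == len(row["exonEnds"]) == row["exonCount"]):
        raise ValueError("exonCount does not match the exon coordinate lists")
    return row


def exons(row: Dict[str, object]) -> List[Tuple[int, int]]:
    """Pair each exon start with its end."""
    return list(zip(row["exonStarts"], row["exonEnds"]))


# First sample row from the table above, joined with tabs.
sample = "\t".join(["585", "geneid_AFFE02000001.1", "AFFE02000001", "-",
                    "323", "407", "323", "407", "1", "323,", "407,"])
print(exons(parse_row(sample)))  # [(323, 407)]
```

The trailing comma in exonStarts and exonEnds is why empty items are filtered out before conversion to integers.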

Note: all start coordinates in our database are 0-based, not 1-based.
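The practical effect of the note can be shown with a small Python sketch. It assumes that end coordinates are exclusive, so that each feature is a half-open interval [start, end); the note itself only states that starts are 0-based, so treat the end convention as an assumption to verify. The helper names are hypothetical.

```python
from typing import Tuple


def to_one_based(start: int, end: int) -> Tuple[int, int]:
    """Convert a 0-based, half-open interval to a 1-based, closed one.

    Only the start shifts; under the half-open assumption the end
    already equals the 1-based position of the last base.
    """
    return start + 1, end


def interval_length(start: int, end: int) -> int:
    """Length in bases of a 0-based, half-open interval."""
    return end - start


# txStart and txEnd from the first sample row.
print(to_one_based(323, 407))     # (324, 407)
print(interval_length(323, 407))  # 84
```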

Geneid Genes (geneid) Track Description
 

Description

This track shows gene predictions from the geneid program, developed by Roderic Guigó's Computational Biology of RNA Processing group, which is part of the Centre de Regulació Genòmica (CRG) in Barcelona, Catalunya, Spain.

Methods

Geneid is a program that predicts genes in anonymous genomic sequences. It is designed with a hierarchical structure. In the first step, splice sites and start and stop codons are predicted and scored along the sequence using Position Weight Arrays (PWAs). Next, exons are built from these sites. Each exon is scored as the sum of the scores of its defining sites, plus the log-likelihood ratio of a Markov Model for coding DNA. Finally, the gene structure is assembled from the set of predicted exons by maximizing the sum of the scores of the assembled exons.
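The last two steps can be illustrated with a deliberately simplified Python sketch. It is not geneid's implementation: the Exon class, its field names, and the assemble function are hypothetical, the site scores and coding log-likelihood ratio are taken as given inputs, and the only compatibility rule is that chosen exons must not overlap. A real gene finder would apply further constraints, such as reading-frame compatibility, that are ignored here. Under those assumptions, maximizing the summed exon score reduces to weighted interval scheduling, solved by dynamic programming.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Exon:
    start: int                        # 0-based, inclusive
    end: int                          # exclusive
    site_scores: Tuple[float, float]  # scores of the two defining sites
    coding_llr: float                 # log-likelihood ratio for coding DNA

    @property
    def score(self) -> float:
        # Sum of the defining site scores plus the coding term.
        return sum(self.site_scores) + self.coding_llr


def assemble(candidates: List[Exon]) -> Tuple[float, List[Exon]]:
    """Pick non-overlapping exons that maximize the total score."""
    ordered = sorted(candidates, key=lambda e: e.end)
    ends = [e.end for e in ordered]
    n = len(ordered)
    best = [0.0] * (n + 1)   # best[i]: best total using the first i exons
    prev = [0] * (n + 1)     # prev[i]: how many earlier exons are compatible
    take = [False] * (n + 1)
    for i, exon in enumerate(ordered, start=1):
        # Count earlier exons that end at or before this exon's start.
        prev[i] = bisect_right(ends, exon.start, 0, i - 1)
        with_exon = best[prev[i]] + exon.score
        if with_exon > best[i - 1]:
            best[i], take[i] = with_exon, True
        else:
            best[i] = best[i - 1]
    # Walk back through the table to recover the chosen exons.
    chosen: List[Exon] = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(ordered[i - 1])
            i = prev[i]
        else:
            i -= 1
    chosen.reverse()
    return best[n], chosen


# Made-up candidates: b overlaps both a and c, but a + c scores higher.
a = Exon(0, 10, (1.0, 1.0), 0.5)    # score 2.5
b = Exon(5, 20, (2.0, 2.0), 1.0)    # score 5.0
c = Exon(10, 30, (1.0, 1.0), 1.0)   # score 3.0
total, structure = assemble([a, b, c])
print(total, [(e.start, e.end) for e in structure])  # 5.5 [(0, 10), (10, 30)]
```

Exons with a negative score are never selected in this sketch, because leaving an exon out always scores at least as well as adding a negative term.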

Credits

Thanks to the Computational Biology of RNA Processing group for providing these data.

References

Blanco E, Parra G, Guigó R. Using geneid to identify genes. Curr Protoc Bioinformatics. 2007 Jun;Chapter 4:Unit 4.3. PMID: 18428791

Parra G, Blanco E, Guigó R. GeneID in Drosophila. Genome Res. 2000 Apr;10(4):511-5. PMID: 10779490; PMC: PMC310871