Schema for Geneid Genes - Geneid Gene Predictions
  Database: DsuzGB1    Primary Table: geneid    Row Count: 32,882
field       example                 SQL type               info
bin         585                     smallint(5) unsigned   range
name        geneid_AWUT01012676.1   varchar(255)           values
chrom       AWUT01012676            varchar(255)           values
strand      -                       char(1)                values
txStart     21759                   int(10) unsigned       range
txEnd       44732                   int(10) unsigned       range
cdsStart    21759                   int(10) unsigned       range
cdsEnd      44732                   int(10) unsigned       range
exonCount   3                       int(10) unsigned       range
exonStarts  21759,22150,44639,      longblob
exonEnds    22097,22292,44732,      longblob
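
The schema does not say how the bin value is derived. The sketch below assumes the table follows the standard UCSC binning scheme (five levels, with 128 kb bins at the finest level); the function name bin_from_range and the constant names are illustrative only.

```python
# Assumption: standard UCSC-style binning, not stated on this page.
BIN_OFFSETS = (585, 73, 9, 1, 0)  # finest (128 kb) level first, coarsest last
FIRST_SHIFT = 17                  # 2**17 bases = 128 kb bins at the finest level
NEXT_SHIFT = 3                    # each coarser level has bins 8x larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the range [start, end)."""
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # The range straddles a bin boundary; try the next coarser level.
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit this binning scheme")


# txStart and txEnd from the example column above.
print(bin_from_range(21759, 44732))  # 585
```

Under this assumption, any feature that lies entirely within the first 128 kb of a sequence lands in bin 585, which is consistent with the example value.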

Sample Rows
 
bin  name                   chrom         strand  txStart  txEnd  cdsStart  cdsEnd  exonCount  exonStarts          exonEnds
585  geneid_AWUT01012676.1  AWUT01012676  -       21759    44732  21759     44732   3          21759,22150,44639,  22097,22292,44732,
585  geneid_AWUT01012676.2  AWUT01012676  -       45846    52739  45846     52739   3          45846,51339,52726,  46087,51379,52739,
585  geneid_AWUT01012676.3  AWUT01012676  +       56049    56241  56049     56241   1          56049,              56241,
585  geneid_AWUT01012676.4  AWUT01012676  -       62619    65384  62619     65384   2          62619,65221,        62816,65384,
585  geneid_AWUT01012676.5  AWUT01012676  -       70077    70335  70077     70335   1          70077,              70335,
585  geneid_AWUT01013263.1  AWUT01013263  -       23       9374   23        9374    2          23,9108,            204,9374,
585  geneid_AWUT01013263.2  AWUT01013263  +       17597    47725  17597     47725   2          17597,47376,        17725,47725,
585  geneid_AWUT01013263.3  AWUT01013263  +       55675    55837  55675     55837   1          55675,              55837,
585  geneid_AWUT01013263.4  AWUT01013263  +       57062    57269  57062     57269   1          57062,              57269,
585  geneid_AWUT01013449.1  AWUT01013449  +       8440     9055   8440      9055    1          8440,               9055,
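
As an illustration of how the eleven fields fit together, the following sketch parses one row into a small record. It assumes the row is available as a single tab-separated line (as in a typical table dump); the GenePred class and the parse_row helper are hypothetical names, not part of any UCSC tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GenePred:
    bin: int
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exons: List[Tuple[int, int]]  # (start, end) pairs, one per exon


def parse_coord_list(text: str) -> List[int]:
    """Parse a comma-terminated list such as '21759,22150,44639,'."""
    return [int(item) for item in text.split(",") if item]


def parse_row(line: str) -> GenePred:
    """Parse one tab-separated row with the eleven fields of the schema."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    exon_count = int(fields[8])
    starts = parse_coord_list(fields[9])
    ends = parse_coord_list(fields[10])
    # exonCount must agree with the lengths of both coordinate lists.
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError("exonCount does not match exonStarts/exonEnds")
    return GenePred(
        bin=int(fields[0]),
        name=fields[1],
        chrom=fields[2],
        strand=fields[3],
        tx_start=int(fields[4]),
        tx_end=int(fields[5]),
        cds_start=int(fields[6]),
        cds_end=int(fields[7]),
        exons=list(zip(starts, ends)),
    )


# First sample row, written out with tab separators.
row = ("585\tgeneid_AWUT01012676.1\tAWUT01012676\t-\t21759\t44732\t21759\t44732"
       "\t3\t21759,22150,44639,\t22097,22292,44732,")
gene = parse_row(row)
print(gene.exons)  # [(21759, 22097), (22150, 22292), (44639, 44732)]
```

Pairing the i-th entry of exonStarts with the i-th entry of exonEnds gives one interval per exon, and checking both lists against exonCount catches truncated rows early.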

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
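
A short sketch of what the note means in practice. It assumes the usual half-open convention, in which only the start shifts when converting to 1-based coordinates and the end is left unchanged; both helper names are illustrative.

```python
from typing import Tuple


def to_one_based(start0: int, end: int) -> Tuple[int, int]:
    """Convert a 0-based start and its end to a 1-based, inclusive interval."""
    # Assumption: the stored end is exclusive, so it already equals the
    # 1-based inclusive end and only the start needs to move.
    return start0 + 1, end


def interval_length(start0: int, end: int) -> int:
    """Length in bases of an interval with a 0-based start."""
    return end - start0


# First exon of the example row: exonStarts 21759, exonEnds 22097.
print(to_one_based(21759, 22097))     # (21760, 22097)
print(interval_length(21759, 22097))  # 338
```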

Geneid Genes (geneid) Track Description
 

Description

This track shows gene predictions from the geneid program, developed by Roderic Guigó's Computational Biology of RNA Processing group at the Centre de Regulació Genòmica (CRG) in Barcelona, Catalunya, Spain.

Methods

Geneid is a program, designed with a hierarchical structure, that predicts genes in anonymous genomic sequences. In the first step, splice sites and start and stop codons are predicted and scored along the sequence using Position Weight Arrays (PWAs). Next, exons are built from these sites. Each exon is scored as the sum of the scores of its defining sites plus the log-likelihood ratio of a Markov model for coding DNA. Finally, the gene structure is assembled from the set of predicted exons so as to maximize the sum of the scores of the assembled exons.
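
The scoring and assembly steps can be illustrated with a deliberately simplified sketch. It is not geneid's actual implementation: it assumes candidate exons on a single strand, treats any two non-overlapping exons as compatible (ignoring reading frame and site types), and uses made-up scores. The Exon class and the assemble function are hypothetical names.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Exon:
    start: int                        # 0-based start
    end: int                          # exclusive end
    site_scores: Tuple[float, float]  # scores of the two defining sites
    coding_llr: float                 # log-likelihood ratio for coding DNA

    @property
    def score(self) -> float:
        # Exon score = sum of defining-site scores + coding log-likelihood ratio.
        return sum(self.site_scores) + self.coding_llr


def assemble(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Pick non-overlapping exons that maximize the total score."""
    ordered = sorted(exons, key=lambda e: e.end)
    ends = [e.end for e in ordered]
    n = len(ordered)
    best = [0.0] * (n + 1)   # best[i]: best total using the first i exons
    taken = [False] * (n + 1)
    prev = [0] * (n + 1)
    for i, exon in enumerate(ordered, start=1):
        # Count the earlier exons that end at or before this exon's start.
        p = bisect_right(ends, exon.start, 0, i - 1)
        with_exon = best[p] + exon.score
        if with_exon > best[i - 1]:
            best[i], taken[i], prev[i] = with_exon, True, p
        else:
            best[i] = best[i - 1]
    # Walk back through the table to recover the chosen exons.
    chain = []
    i = n
    while i > 0:
        if taken[i]:
            chain.append(ordered[i - 1])
            i = prev[i]
        else:
            i -= 1
    chain.reverse()
    return best[n], chain


# Made-up candidates for illustration only.
candidates = [
    Exon(100, 200, (2.0, 1.5), 3.0),
    Exon(150, 300, (1.0, 1.0), 8.0),
    Exon(250, 400, (2.5, 2.0), 1.0),
    Exon(420, 500, (1.0, 0.5), 2.0),
]
total, chain = assemble(candidates)
print(total, [(e.start, e.end) for e in chain])
```

In this toy input the single highest-scoring exon (150 to 300) is left out, because the three exons it conflicts with or precedes score more as a set. That is the point of maximizing the summed score over the whole assembly rather than choosing exons greedily.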

Credits

Thanks to the Computational Biology of RNA Processing group for providing these data.

References

Blanco E, Parra G, Guigó R. Using geneid to identify genes. Curr Protoc Bioinformatics. 2007 Jun;Chapter 4:Unit 4.3. PMID: 18428791

Parra G, Blanco E, Guigó R. GeneID in Drosophila. Genome Res. 2000 Apr;10(4):511-5. PMID: 10779490; PMC: PMC310871