Schema for Geneid Genes - Geneid Gene Predictions
  Database: DkikGB2    Primary Table: geneid    Row Count: 28,115
field       example                 SQL type              info
bin         585                     smallint(5) unsigned  range
name        geneid_AFFH02000003.1   varchar(255)          values
chrom       AFFH02000003            varchar(255)          values
strand      -                       char(1)               values
txStart     427                     int(10) unsigned      range
txEnd       1015                    int(10) unsigned      range
cdsStart    427                     int(10) unsigned      range
cdsEnd      1015                    int(10) unsigned      range
exonCount   2                       int(10) unsigned      range
exonStarts  427,515,                longblob
exonEnds    458,1015,               longblob
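
The bin field is an indexing aid rather than a property of the gene. The following is a minimal sketch of how such a value can be derived from txStart and txEnd, assuming this table uses the standard UCSC binning scheme (the schema above does not state this). The function name bin_from_range and the constant names are chosen here for illustration only.

```python
# Sketch of the standard UCSC binning scheme; that this table uses it is an assumption.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # the smallest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level spans 8 times more


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin containing the 0-based, half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # The range straddles two bins at this level, so try the next coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the standard scheme")


print(bin_from_range(427, 1015))  # first sample row: prints 585
```

Under this assumption, every feature that lies entirely within the first 128 kb of its sequence gets bin 585, which matches the example value shown in the schema.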

Sample Rows
 
bin  name                   chrom         strand  txStart  txEnd  cdsStart  cdsEnd  exonCount  exonStarts  exonEnds
585  geneid_AFFH02000003.1  AFFH02000003  -       427      1015   427       1015    2          427,515,    458,1015,
585  geneid_AFFH02000010.1  AFFH02000010  +       1040     1061   1040      1061    1          1040,       1061,
585  geneid_AFFH02000012.1  AFFH02000012  +       775      817    775       817     1          775,        817,
585  geneid_AFFH02000021.1  AFFH02000021  -       969      1002   969       1002    1          969,        1002,
585  geneid_AFFH02000026.1  AFFH02000026  -       669      1685   669       1685    2          669,1030,   971,1685,
585  geneid_AFFH02000037.1  AFFH02000037  +       642      666    642       666     1          642,        666,
585  geneid_AFFH02000038.1  AFFH02000038  +       2098     2128   2098      2128    1          2098,       2128,
585  geneid_AFFH02000039.1  AFFH02000039  +       64       325    64        325     1          64,         325,
585  geneid_AFFH02000040.1  AFFH02000040  -       623      731    623       731     1          623,        731,
585  geneid_AFFH02000041.1  AFFH02000041  +       623      659    623       659     1          623,        659,

Note: all start coordinates in our database are 0-based, not 1-based.
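
To make the coordinate convention concrete, here is a small sketch that parses one row of this table and converts its exons to 1-based coordinates. It assumes the row comes from a tab-separated dump with the columns in schema order, and that end coordinates are exclusive (0-based, half-open intervals). The names GenePred, parse_row and to_one_based are hypothetical helpers, not part of any UCSC tool.

```python
from typing import List, NamedTuple, Tuple


class GenePred(NamedTuple):
    bin: int
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exon_count: int
    exons: List[Tuple[int, int]]  # (start, end) pairs, assumed 0-based half-open


def parse_int_list(field: str) -> List[int]:
    """Parse a comma-terminated list such as '427,515,' into [427, 515]."""
    return [int(x) for x in field.split(",") if x]


def parse_row(line: str) -> GenePred:
    """Parse one tab-separated row laid out in the schema's column order (assumed)."""
    f = line.rstrip("\n").split("\t")
    starts, ends = parse_int_list(f[9]), parse_int_list(f[10])
    exon_count = int(f[8])
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError(f"exon lists disagree with exonCount in {f[1]}")
    return GenePred(int(f[0]), f[1], f[2], f[3], int(f[4]), int(f[5]),
                    int(f[6]), int(f[7]), exon_count, list(zip(starts, ends)))


def to_one_based(exons: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Convert 0-based half-open intervals to 1-based closed intervals."""
    return [(start + 1, end) for start, end in exons]


row = ("585\tgeneid_AFFH02000003.1\tAFFH02000003\t-\t427\t1015\t427\t1015\t2"
       "\t427,515,\t458,1015,")
gene = parse_row(row)
print(gene.exons)                # [(427, 458), (515, 1015)]
print(to_one_based(gene.exons))  # [(428, 458), (516, 1015)]
```

With half-open intervals the length of an exon is simply end minus start, so the two exons of the first sample row are 31 and 500 bases long.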

Geneid Genes (geneid) Track Description
 

Description

This track shows gene predictions from the geneid program, developed by Roderic Guigó's Computational Biology of RNA Processing group at the Centre de Regulació Genòmica (CRG) in Barcelona, Catalunya, Spain.

Methods

Geneid predicts genes in anonymous genomic sequences using a hierarchical design. First, splice sites and start and stop codons are predicted and scored along the sequence using Position Weight Arrays (PWAs). Next, exons are built from these sites. Each exon is scored as the sum of the scores of its defining sites plus the log-likelihood ratio of a Markov model for coding DNA. Finally, the gene structure is assembled from the set of predicted exons so as to maximize the sum of the scores of the assembled exons.
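
The scoring and assembly steps can be illustrated with a deliberately simplified sketch. It is not geneid's implementation: it assumes the only constraint on a gene structure is that the chosen exons do not overlap, and it ignores the reading-frame and other constraints a real gene model must satisfy. The function names, candidate exons and scores below are all made up for illustration.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Exon = Tuple[int, int, float]  # (start, end, score), 0-based half-open


def exon_score(site_scores: Sequence[float], coding_llr: float) -> float:
    """Exon score: sum of the defining site scores plus the coding log-likelihood ratio."""
    return sum(site_scores) + coding_llr


def assemble(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Choose non-overlapping exons with the maximal total score (simplified)."""
    exons = sorted(exons, key=lambda e: e[1])  # order by end coordinate
    ends = [e[1] for e in exons]
    # best[i] holds the best (total score, chain) using only the first i exons
    best: List[Tuple[float, List[Exon]]] = [(0.0, [])]
    for i, (start, _end, score) in enumerate(exons):
        # j = number of earlier exons that end at or before this exon's start
        j = bisect_right(ends, start, 0, i)
        take = best[j][0] + score
        if take > best[i][0]:
            best.append((take, best[j][1] + [exons[i]]))
        else:
            best.append(best[i])
    return best[-1]


# Hypothetical candidate exons with made-up scores
candidates = [
    (0, 100, exon_score([1.0, 1.5], 2.5)),    # 5.0
    (50, 150, exon_score([2.0, 2.5], 3.0)),   # 7.5
    (120, 200, exon_score([1.0, 1.0], 2.0)),  # 4.0
    (160, 300, exon_score([2.0, 1.0], 3.0)),  # 6.0
]
total, chain = assemble(candidates)
print(total, chain)  # 13.5 [(50, 150, 7.5), (160, 300, 6.0)]
```

The dynamic program keeps, for each prefix of the end-sorted exon list, the best-scoring chain seen so far, so a high-scoring exon can displace several lower-scoring ones that overlap it.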

Credits

Thanks to the Computational Biology of RNA Processing group for providing these data.

References

Blanco E, Parra G, Guigó R. Using geneid to identify genes. Curr Protoc Bioinformatics. 2007 Jun;Chapter 4:Unit 4.3. PMID: 18428791

Parra G, Blanco E, Guigó R. GeneID in Drosophila. Genome Res. 2000 Apr;10(4):511-5. PMID: 10779490; PMC: PMC310871