Schema for GeMoMa Genes - GeMoMa Gene Predictions
Database: DereCAF1 Primary Table: GeMoMa Row Count: 13,211   Data last updated: 2022-10-20
field | example | SQL type | info |
---|---|---|---|
bin | 585 | smallint(5) unsigned | range |
name | FBTR0091810_R4 | varchar(255) | values |
chrom | scaffold_1121 | varchar(255) | values |
strand | - | char(1) | values |
txStart | 48 | int(10) unsigned | range |
txEnd | 438 | int(10) unsigned | range |
cdsStart | 48 | int(10) unsigned | range |
cdsEnd | 438 | int(10) unsigned | range |
exonCount | 1 | int(10) unsigned | range |
exonStarts | 48, | longblob | |
exonEnds | 438, | longblob | |
score | 0 | int(11) | range |
name2 | gene_0 | varchar(255) | values |
cdsStartStat | unk | enum('none', 'unk', 'incmpl', 'cmpl') | values |
cdsEndStat | unk | enum('none', 'unk', 'incmpl', 'cmpl') | values |
exonFrames | 0, | longblob | |
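
A row of this table can be loaded into a small record for downstream work. The sketch below is only an illustration, not part of the browser's tooling: `GenePredRow` and `parse_row` are hypothetical names, and it assumes a row arrives as pipe-separated text in the column order listed above, with the comma-terminated list fields (`exonStarts`, `exonEnds`, `exonFrames`) split into integer lists.

```python
from dataclasses import dataclass
from typing import List


def _int_list(text: str) -> List[int]:
    """Split a comma-terminated list such as '1520,1672,' into integers."""
    return [int(item) for item in text.split(",") if item.strip()]


@dataclass
class GenePredRow:  # hypothetical container mirroring the schema fields
    bin: int
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exon_count: int
    exon_starts: List[int]
    exon_ends: List[int]
    score: int
    name2: str
    cds_start_stat: str
    cds_end_stat: str
    exon_frames: List[int]


def parse_row(line: str) -> GenePredRow:
    """Parse one pipe-separated row (assumed layout) into a GenePredRow."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    if len(cells) != 16:
        raise ValueError(f"expected 16 fields, got {len(cells)}")
    row = GenePredRow(
        int(cells[0]), cells[1], cells[2], cells[3],
        int(cells[4]), int(cells[5]), int(cells[6]), int(cells[7]),
        int(cells[8]), _int_list(cells[9]), _int_list(cells[10]),
        int(cells[11]), cells[12], cells[13], cells[14], _int_list(cells[15]),
    )
    # Sanity check: every list field should have exonCount entries.
    for values in (row.exon_starts, row.exon_ends, row.exon_frames):
        if len(values) != row.exon_count:
            raise ValueError("list length does not match exonCount")
    return row


example = parse_row(
    "585 | FBTR0111241_R1 | scaffold_1361 | - | 1520 | 2180 | 1520 | 2180 "
    "| 2 | 1520,1672, | 1615,2180, | 0 | gene_3 | unk | unk | 1,0, |"
)
print(example.name, example.exon_starts, example.exon_ends)
```

The length check against `exonCount` is a cheap way to catch truncated or mis-split rows before they reach later analysis steps.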
Sample Rows
bin | name | chrom | strand | txStart | txEnd | cdsStart | cdsEnd | exonCount | exonStarts | exonEnds | score | name2 | cdsStartStat | cdsEndStat | exonFrames |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
585 | FBTR0091810_R4 | scaffold_1121 | - | 48 | 438 | 48 | 438 | 1 | 48, | 438, | 0 | gene_0 | unk | unk | 0, |
585 | FBTR0082679_R6 | scaffold_1128 | + | 812 | 1817 | 812 | 1817 | 1 | 812, | 1817, | 0 | gene_1 | unk | unk | 0, |
585 | FBTR0077470_R1 | scaffold_1247 | - | 244 | 601 | 244 | 601 | 1 | 244, | 601, | 0 | gene_2 | unk | unk | 0, |
585 | FBTR0111241_R1 | scaffold_1361 | - | 1520 | 2180 | 1520 | 2180 | 2 | 1520,1672, | 1615,2180, | 0 | gene_3 | unk | unk | 1,0, |
585 | FBTR0072634_R0 | scaffold_1383 | + | 3541 | 3964 | 3541 | 3964 | 1 | 3541, | 3964, | 0 | gene_4 | unk | unk | 0, |
585 | FBTR0072634_R1 | scaffold_1383 | + | 6664 | 7087 | 6664 | 7087 | 1 | 6664, | 7087, | 0 | gene_5 | unk | unk | 0, |
585 | FBTR0082086_R0 | scaffold_1388 | + | 6016 | 8968 | 6016 | 8968 | 3 | 6016,6609,8701, | 6551,8426,8968, | 0 | gene_6 | unk | unk | 0,1,0, |
585 | FBTR0347010_R1 | scaffold_1399 | + | 235 | 1671 | 235 | 1671 | 4 | 235,584,1041,1592, | 277,779,1535,1671, | 0 | gene_7 | unk | unk | 0,0,0,2, |
585 | FBTR0083367_R2 | scaffold_1408 | + | 2834 | 3818 | 2834 | 3818 | 3 | 2834,2931,3421, | 2878,3363,3818, | 0 | gene_8 | unk | unk | 0,2,2, |
585 | FBTR0307509_R1 | scaffold_1425 | + | 2697 | 4868 | 2697 | 4868 | 5 | 2697,2868,3118,3877,4830, | 2808,3028,3607,4774,4868, | 0 | gene_9 | unk | unk | 0,0,1,1,1, |
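
The `exonFrames` values in the sample rows can be reproduced from the exon coordinates. The sketch below assumes the usual genePred convention: the frame of an exon is the number of coding bases that precede it in the direction of translation, modulo 3, and exons with no coding bases get -1. The function name `expected_exon_frames` is hypothetical. For minus-strand genes the exons are walked from last to first.

```python
from typing import List


def expected_exon_frames(
    strand: str,
    exon_starts: List[int],
    exon_ends: List[int],
    cds_start: int,
    cds_end: int,
) -> List[int]:
    """Recompute exonFrames under the assumed genePred convention."""
    count = len(exon_starts)
    frames = [-1] * count  # -1 marks exons without coding bases (assumption)
    # Walk exons in the direction of translation.
    order = range(count) if strand == "+" else range(count - 1, -1, -1)
    coding_bases = 0
    for index in order:
        # Clip the exon to the coding region.
        start = max(exon_starts[index], cds_start)
        end = min(exon_ends[index], cds_end)
        if start >= end:
            continue
        frames[index] = coding_bases % 3
        coding_bases += end - start
    return frames


# gene_9 from the sample rows (plus strand, five exons)
print(expected_exon_frames(
    "+",
    [2697, 2868, 3118, 3877, 4830],
    [2808, 3028, 3607, 4774, 4868],
    2697, 4868,
))  # [0, 0, 1, 1, 1]

# gene_3 from the sample rows (minus strand, two exons)
print(expected_exon_frames("-", [1520, 1672], [1615, 2180], 1520, 2180))  # [1, 0]
```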
Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
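
As an illustration of the note above, the sketch below converts stored coordinates to the 1-based form used by formats such as GFF. It assumes that end coordinates are exclusive (a half-open interval), so only the start shifts by one and the length is simply end minus start. The helper names `to_one_based` and `exon_intervals` are hypothetical.

```python
from typing import List, Tuple


def to_one_based(start: int, end: int) -> Tuple[int, int]:
    """Convert a 0-based, half-open interval to a 1-based, closed one."""
    return start + 1, end


def exon_intervals(exon_starts: str, exon_ends: str) -> List[Tuple[int, int]]:
    """Pair up the comma-terminated exonStarts and exonEnds fields."""
    starts = [int(item) for item in exon_starts.split(",") if item]
    ends = [int(item) for item in exon_ends.split(",") if item]
    return list(zip(starts, ends))


# gene_3 from the sample rows
exons = exon_intervals("1520,1672,", "1615,2180,")
for start, end in exons:
    one_start, one_end = to_one_based(start, end)
    print(f"stored {start}-{end} -> 1-based {one_start}-{one_end}, "
          f"length {end - start}")
```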
GeMoMa Genes (GeMoMa) Track Description
Description
D. melanogaster protein sequences from FlyBase were aligned against each
scaffold in the D. erecta (DereCAF1) assembly, and predicted gene models were constructed
from these alignments using GeMoMa.
Methods
D. melanogaster protein sequences were aligned against the D. erecta (DereCAF1) genome
assembly using NCBI TBLASTN with the following parameters:
-evalue 1e-5
-max_intron_length 100000
-matrix BLOSUM80
-gapopen 13
-gapextend 2
-soft_masking true
-db_soft_mask 30
-best_hit_overhang 0.1
-best_hit_score_edge 0.1
The TBLASTN results were used by GeMoMa to produce an initial set of gene predictions.
These predictions were then filtered by the GAF module in GeMoMa to
produce the final set of gene predictions.
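
For readers who want to reproduce the alignment step, the sketch below assembles a TBLASTN command line from the parameters listed above. It only builds and prints the command. The file and database names (`dmel_proteins.fasta`, `DereCAF1_db`, `tblastn_hits.txt`) are placeholders, and the output format that GeMoMa expects is not stated here, so no `-outfmt` option is included.

```python
import shlex
from typing import List


def build_tblastn_command(query_fasta: str, blast_db: str, out_path: str) -> List[str]:
    """Return a TBLASTN argument list using the parameters from Methods."""
    return [
        "tblastn",
        "-query", query_fasta,   # placeholder: D. melanogaster proteins
        "-db", blast_db,         # placeholder: BLAST database of the assembly
        "-out", out_path,        # placeholder: results file
        "-evalue", "1e-5",
        "-max_intron_length", "100000",
        "-matrix", "BLOSUM80",
        "-gapopen", "13",
        "-gapextend", "2",
        "-soft_masking", "true",
        "-db_soft_mask", "30",
        "-best_hit_overhang", "0.1",
        "-best_hit_score_edge", "0.1",
    ]


cmd = build_tblastn_command("dmel_proteins.fasta", "DereCAF1_db", "tblastn_hits.txt")
print(shlex.join(cmd))  # pass cmd to subprocess.run to actually execute it
```

Keeping the arguments as a list rather than a single string avoids shell quoting problems if the command is later run with `subprocess.run`.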
References
Keilwagen J, Wenk M, Erickson JL, Schattat MH, Grau J, Hartung F.
Using intron position conservation for homology-based gene prediction.
Nucleic Acids Res. 2016 May 19;44(9):e89.