Description
This track shows gene predictions from the N-SCAN gene structure prediction
program, using multiple Drosophila species as informants.
Methods
N-SCAN
N-SCAN combines biological-signal modeling in the target genome sequence
with information from a multiple-genome alignment to generate de novo gene
predictions. It extends the TWINSCAN target-informant genome-pair model to allow for
an arbitrary number of informant sequences as well as richer models of sequence
evolution. N-SCAN models the phylogenetic relationships between the aligned
genome sequences, context-dependent substitution rates, insertions,
and deletions.
N-SCAN PASA-EST
N-SCAN PASA-EST incorporates EST alignments into N-SCAN. Similar to the conservation sequence models
in TWINSCAN, separate probability models are developed for EST alignments to genomic sequence in
exons, introns, splice sites and UTRs, reflecting the EST alignment patterns in these regions.
N-SCAN PASA-EST is more accurate than N-SCAN while retaining the ability to discover novel genes
to which no ESTs align.
In N-SCAN PASA-EST, the TransDecoder gene predictions
are used as the 'EST' sequences. The resulting gene models were updated with the
input PASA clusters using the assembly tool of the PASA pipeline. These updates consist of
automatically generated alternative splices, UTR features and sometimes merging of two gene models.
In addition, PASA assigned open reading frames to clusters that did not overlap a gene prediction,
but that did contain a full-length cDNA, and output them as 'novel genes'. Note that PASA does not
use any cDNA annotation from the input; it assigns the ORF itself.
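The 'novel gene' selection rule described above can be illustrated with a minimal Python sketch. It is not PASA's code: the Interval and Cluster types, the has_full_length_cdna flag, and the strand-aware overlap test are all assumptions made for illustration, and the ORF assignment step is omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A feature on one strand of one sequence (0-based, half-open coordinates)."""
    chrom: str
    strand: str
    start: int
    end: int


@dataclass(frozen=True)
class Cluster:
    """Hypothetical stand-in for a PASA alignment cluster."""
    name: str
    span: Interval
    has_full_length_cdna: bool


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the intervals share at least one base on the same sequence and strand.

    Requiring the same strand is an assumption of this sketch.
    """
    return (a.chrom == b.chrom and a.strand == b.strand
            and a.start < b.end and b.start < a.end)


def novel_gene_candidates(clusters, predictions):
    """Keep clusters that contain a full-length cDNA and overlap no gene prediction."""
    return [
        c for c in clusters
        if c.has_full_length_cdna
        and not any(overlaps(c.span, p) for p in predictions)
    ]


# Made-up coordinates, for illustration only.
predictions = [Interval("chr2L", "+", 1000, 5000)]
clusters = [
    Cluster("c1", Interval("chr2L", "+", 4500, 6000), True),     # overlaps a prediction
    Cluster("c2", Interval("chr2L", "+", 8000, 9500), True),     # novel gene candidate
    Cluster("c3", Interval("chr2L", "+", 12000, 13000), False),  # no full-length cDNA
]
print([c.name for c in novel_gene_candidates(clusters, predictions)])  # ['c2']
```

Only c2 passes both tests. The linear scan over predictions keeps the sketch short; an interval index would be a more practical choice for a whole genome.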
References
- Gross SS, Brent MR.
Using multiple alignments to improve gene prediction.
In Proc. 9th Int'l Conf. on Research in Computational Molecular Biology
(RECOMB '05):374-388 and J Comput Biol. 2006 Mar;13(2):379-93.
- Korf I, Flicek P, Duan D, Brent MR.
Integrating genomic homology into gene structure prediction.
Bioinformatics. 2001 Jun 1;17(90001):S140-8.
- Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK Jr, Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD et al.
Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies.
Nucleic Acids Res 2003 Oct 1;31(19):5654-66.