Description
RNA-Seq reads generated by the modENCODE project for D. mojavensis were
mapped against the D. mojavensis genome using TopHat2, and predicted transcripts
were assembled using Cufflinks and CEM. Coding regions within the
predicted transcripts were identified using TransDecoder. This collection of
coding regions was then aligned against the D. mojavensis genome using
BLASTX followed by Spaln2.
Methods
Candidate coding regions in the collection of assembled D. mojavensis
transcripts were identified using TransDecoder
with the following parameters: -m 50, --search_pfam Pfam-A.hmm.
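As a rough sketch of how the parameters above fit into a command line, the snippet below assembles a TransDecoder invocation in Python without running it. Only -m 50 and --search_pfam Pfam-A.hmm come from this description; the -t input flag follows the legacy TransDecoder command-line interface and is an assumption here, as are the helper name `transdecoder_command` and the file name `transcripts.fa`.

```python
import shlex


def transdecoder_command(transcripts, min_protein_len=50, pfam_db="Pfam-A.hmm"):
    """Build (but do not run) a TransDecoder command line.

    Hypothetical helper: the '-t' flag is assumed from the legacy
    TransDecoder interface; '-m' and '--search_pfam' are the parameters
    listed in the Methods text.
    """
    return [
        "TransDecoder",
        "-t", transcripts,            # assumed input flag and file name
        "-m", str(min_protein_len),   # minimum protein length (from Methods)
        "--search_pfam", pfam_db,     # Pfam-A HMM database (from Methods)
    ]


cmd = transdecoder_command("transcripts.fa")
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` on a system where a matching TransDecoder release is installed; the flags should be checked against that release's own help output first.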
The collection of predicted D. mojavensis proteins was initially mapped
against the D. mojavensis genome using BLASTX to identify regions of similarity.
Spaln2 was then used to re-align each protein against its corresponding region
with same-species parameters optimized for D. melanogaster (-Tdromel -yS).
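The two-step mapping described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline actually used for this track: it assumes the BLASTX results are in 12-column tabular form (-outfmt 6) with genome scaffolds as the query and proteins as the subject, it keeps only the highest-scoring hit per protein, and the 1000 bp padding, the function names, and the scaffold and protein identifiers are all hypothetical. The Spaln2 command uses the two options given in the text; passing the genomic region and the protein as positional arguments is an assumption.

```python
import csv
import io


def best_regions(blast_tabular, padding=1000):
    """Map each protein to the padded genomic span of its best BLASTX hit.

    Assumes 12-column tabular output where column 1 is the scaffold (query),
    column 2 the protein (subject), columns 7-8 the query coordinates, and
    column 12 the bit score. The padding value is hypothetical.
    """
    best = {}
    for row in csv.reader(io.StringIO(blast_tabular), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        scaffold, protein = row[0], row[1]
        qstart, qend = int(row[6]), int(row[7])
        bitscore = float(row[11])
        if protein not in best or bitscore > best[protein][0]:
            # Reverse-strand hits report qstart > qend, so normalize first.
            start = max(1, min(qstart, qend) - padding)
            end = max(qstart, qend) + padding
            best[protein] = (bitscore, scaffold, start, end)
    return {protein: hit[1:] for protein, hit in best.items()}


def spaln_command(region_fasta, protein_fasta):
    """Build (but do not run) a Spaln2 command with the options from Methods."""
    # Positional order (genomic sequence, then query) is an assumption.
    return ["spaln", "-Tdromel", "-yS", region_fasta, protein_fasta]


# Hypothetical hits, for illustration only.
hits = (
    "scaffold_A\tprotA\t98.5\t200\t3\t0\t5000\t5600\t1\t200\t1e-100\t400\n"
    "scaffold_A\tprotA\t60.0\t50\t20\t0\t90000\t89851\t10\t59\t1e-5\t45.2\n"
    "scaffold_B\tprotB\t99.0\t100\t1\t0\t700\t401\t1\t100\t1e-50\t210\n"
)
regions = best_regions(hits)
print(regions)
```

The sketch does not clamp the padded end coordinate to the scaffold length, so a real implementation would need the scaffold sizes as well.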
References
Haas B. TransDecoder (Finding Coding Regions Within Transcripts).
Iwata H. and Gotoh O.
Benchmarking spliced alignment programs including Spaln2,
an extended version of Spaln that incorporates additional species-specific features.
Nucleic Acids Research. 2012, 40(20):e161.
The RNA-Seq data were submitted by the modENCODE project.
The original RNA-Seq dataset can be obtained from the NCBI GEO database under accession number GSE28078.