Schema for Unmapped modENCODE RNA-Seq - Assembled Unmapped modENCODE RNA-Seq Reads
  Database: DmojImproved    Primary Table: SRR166834_unmapped    Row Count: 107,263
field        example                   SQL type              info
bin          585                       smallint(5) unsigned  range
chrom        improved_6498             varchar(255)          values
chromStart   804                       int(10) unsigned      range
chromEnd     912                       int(10) unsigned      range
name         SRR166834_unmapped_96833  varchar(255)          values
score        1000                      int(10) unsigned      range
strand       -                         char(1)               values
thickStart   804                       int(10) unsigned      range
thickEnd     912                       int(10) unsigned      range
reserved     0                         int(10) unsigned      range
blockCount   1                         int(10) unsigned      range
blockSizes   108,                      longblob
chromStarts  0,                        longblob
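
The bin field is an index used to speed up range queries. As a minimal sketch, assuming this table follows the standard UCSC binning scheme (the page itself does not say so), the bin for a feature can be computed from its start and end coordinates. The function name and constant names below are illustrative, not taken from this database.

```python
# Constants of the standard UCSC binning scheme (assumed to apply to this table).
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
BIN_FIRST_SHIFT = 17  # the finest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level is 8 times larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that contains the half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # Move up one level: bins become 8 times wider.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is too large for the standard scheme")


# The example row (chromStart 804, chromEnd 912) falls in the first finest bin.
print(bin_from_range(804, 912))  # 585
```

Under this assumption, every feature that lies entirely within the first 128 kb of a scaffold gets bin 585, which is consistent with the example value in the schema.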

Sample Rows
 
bin  chrom          chromStart  chromEnd  name                      score  strand  thickStart  thickEnd  reserved  blockCount  blockSizes  chromStarts
585  improved_6498  804         912       SRR166834_unmapped_96833  1000   -       804         912       0         1           108,        0,
585  improved_6498  5032        5107      SRR166834_unmapped_78727  1000   +       5032        5107      0         1           75,         0,
585  improved_6498  10358       10466     SRR166834_unmapped_96833  1000   -       10358       10466     0         1           108,        0,
585  improved_6498  15109       15217     SRR166834_unmapped_96833  1000   -       15109       15217     0         1           108,        0,
585  improved_6498  52291       98018     SRR166834_unmapped_99469  992    -       52291       98018     0         3           27,136,66,  0,872,45661,
585  improved_6498  56431       56485     SRR166834_unmapped_96075  1000   +       56431       56485     0         1           54,         0,
585  improved_6498  63516       64173     SRR166834_unmapped_52894  998    -       63516       64173     0         1           657,        0,
585  improved_6498  64149       64283     SRR166834_unmapped_96906  956    +       64149       64283     0         2           94,40,      0,94,
585  improved_6498  64261       64353     SRR166834_unmapped_41125  1000   +       64261       64353     0         1           92,         0,
585  improved_6498  77330       77405     SRR166834_unmapped_34188  1000   +       77330       77405     0         1           75,         0,
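
The blockSizes and chromStarts fields are comma-terminated lists, and each chromStarts value is an offset from chromStart rather than an absolute position. The sketch below shows one way to turn them into absolute block intervals; the function name is hypothetical and the input values are copied from the sample rows.

```python
from typing import List, Tuple


def block_intervals(chrom_start: int, block_sizes: str, chrom_starts: str) -> List[Tuple[int, int]]:
    """Convert comma-terminated size and offset lists into absolute (start, end) pairs."""
    # The trailing comma leaves an empty final item, which is skipped here.
    sizes = [int(x) for x in block_sizes.split(",") if x]
    offsets = [int(x) for x in chrom_starts.split(",") if x]
    if len(sizes) != len(offsets):
        raise ValueError("blockSizes and chromStarts must have the same number of entries")
    return [(chrom_start + off, chrom_start + off + size) for off, size in zip(offsets, sizes)]


# The three-block sample row for SRR166834_unmapped_99469.
print(block_intervals(52291, "27,136,66,", "0,872,45661,"))
# [(52291, 52318), (53163, 53299), (97952, 98018)]
```

The end of the last block equals chromEnd (98018), which is a quick consistency check on any multi-block row.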

Note: all start coordinates in our database are 0-based, not 1-based. End coordinates are not shifted, so each interval is half-open and its length is chromEnd minus chromStart.
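
As a small illustration of the 0-based convention, the sketch below converts a table row into the 1-based position string commonly typed into a genome browser. The function names are hypothetical; the coordinates come from the first sample row.

```python
def interval_length(chrom_start: int, chrom_end: int) -> int:
    """Length of a feature whose start is 0-based and whose end is exclusive."""
    return chrom_end - chrom_start


def to_one_based(chrom: str, chrom_start: int, chrom_end: int) -> str:
    """Format a 0-based start as a 1-based, fully closed position string."""
    # Only the start shifts by one; the end is the same number in both systems.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


print(to_one_based("improved_6498", 804, 912))  # improved_6498:805-912
print(interval_length(804, 912))                # 108
```

The length of 108 matches the blockSizes value for that row.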

Unmapped modENCODE RNA-Seq (unmapped_rnaseq) Track Description
 

Description

RNA-Seq reads generated by the modENCODE project for D. mojavensis were mapped against the D. mojavensis genome using TopHat2. Reads that failed to map were collected and assembled using ABySS and CAP3. The resulting contigs were then aligned against the D. mojavensis genome using BLAT.

Methods

Unmapped RNA-Seq reads were partitioned into 1 GB chunks, and each chunk was assembled separately using ABySS. The resulting contigs were then merged using CAP3.
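
The partitioning step can be sketched as follows. This is not the script used to build the track: it assumes the unmapped reads are stored as four-line FASTQ records, and the function name and parameters are hypothetical. The point it illustrates is that a size-based split should break only between records, never inside one.

```python
from typing import Iterable, Iterator, List


def chunk_records(lines: Iterable[str], max_bytes: int,
                  lines_per_record: int = 4) -> Iterator[List[str]]:
    """Yield lists of lines, each holding whole records of at most max_bytes in total.

    A single record larger than max_bytes is still emitted as its own chunk.
    """
    chunk: List[str] = []
    chunk_bytes = 0
    record: List[str] = []
    for line in lines:
        record.append(line)
        if len(record) < lines_per_record:
            continue
        record_bytes = sum(len(item.encode()) for item in record)
        # Start a new chunk if this record would push the current one over the limit.
        if chunk and chunk_bytes + record_bytes > max_bytes:
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.extend(record)
        chunk_bytes += record_bytes
        record = []
    if record:
        # Keep a trailing incomplete record rather than dropping it silently.
        chunk.extend(record)
    if chunk:
        yield chunk


# Three made-up 16-byte records, split with a 32-byte limit for demonstration.
demo = ["@r1\n", "ACGT\n", "+\n", "IIII\n"] * 3
print([len(c) for c in chunk_records(demo, max_bytes=32)])  # [8, 4]
```

For chunks of the size described above, max_bytes would be set to roughly 10**9 and the lines would be read lazily from an open file handle.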

References

Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009 Jun;19(6):1117-23.

Huang X, Madan A. CAP3: A DNA sequence assembly program. Genome Res. 1999 Sep;9(9):868-77.

The RNA-Seq data were submitted by the modENCODE project. The original RNA-Seq dataset can be obtained from the NCBI GEO database under the accession number GSE28078.