Schema for RepeatMasker - Repeating Elements by RepeatMasker
  Database: DbipGB2    Primary Table: rmsk    Row Count: 131,213   Data last updated: 2022-10-20
field       example            SQL type               info
bin         585                smallint(5) unsigned   range
swScore     6437               int(10) unsigned       range
milliDiv    13                 int(10) unsigned       range
milliDel    47                 int(10) unsigned       range
milliIns    6                  int(10) unsigned       range
genoName    AFFE02000007       varchar(255)           values
genoStart   0                  int(10) unsigned       range
genoEnd     1067               int(10) unsigned       range
genoLeft    0                  int(11)                range
strand      +                  char(1)                values
repName     RepeatScout_4446   varchar(255)           values
repClass    Unknown            varchar(255)           values
repFamily   Unknown            varchar(255)           values
repStart    1                  int(11)                range
repEnd      825                int(11)                range
repLeft     -467               int(11)                range
id          1                  char(1)                values
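
As a rough illustration of the schema, the fields can be mirrored in a small typed record. This is only a sketch: the class name RmskRow, the function parse_rmsk_line, and the assumption that rows arrive as tab-separated text (for example from a table dump) are not part of this page. The example line reuses the values from the example column.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RmskRow:
    """One rmsk row; field names and order follow the schema listing."""
    bin: int
    swScore: int
    milliDiv: int
    milliDel: int
    milliIns: int
    genoName: str
    genoStart: int
    genoEnd: int
    genoLeft: int
    strand: str
    repName: str
    repClass: str
    repFamily: str
    repStart: int
    repEnd: int
    repLeft: int
    id: str  # listed as char(1) in the schema, so kept as text


def parse_rmsk_line(line: str) -> RmskRow:
    """Parse one tab-separated line (assumed layout) into an RmskRow."""
    values = line.rstrip("\n").split("\t")
    specs = fields(RmskRow)
    if len(values) != len(specs):
        raise ValueError(f"expected {len(specs)} columns, got {len(values)}")
    # Convert integer columns; leave text columns untouched.
    converted = {
        spec.name: int(value) if spec.type is int else value
        for spec, value in zip(specs, values)
    }
    return RmskRow(**converted)


# Values taken from the example column of the schema.
EXAMPLE_LINE = (
    "585\t6437\t13\t47\t6\tAFFE02000007\t0\t1067\t0\t+\t"
    "RepeatScout_4446\tUnknown\tUnknown\t1\t825\t-467\t1"
)

row = parse_rmsk_line(EXAMPLE_LINE)
print(row.genoName, row.genoStart, row.genoEnd, row.repName)
```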

Sample Rows
 
bin  swScore  milliDiv  milliDel  milliIns  genoName      genoStart  genoEnd  genoLeft  strand  repName           repClass  repFamily  repStart  repEnd  repLeft  id
585  6437     13        47        6         AFFE02000007  0          1067     0         +       RepeatScout_4446  Unknown   Unknown    1         825     -467     1
58514613466AFFE020000230179-14024+RepeatScout_4005LINECR12054223202
58512711524117AFFE02000023142449-13754-Mariner-6_DKDNATcMar-Tc1-238139910543
5851006140418AFFE020000234881756-12447+Jockey-14_DBpLINEI-Jockey81316-33264
585416335178AFFE0200002317632143-12060+Jockey-14_DBpLINEI-Jockey15702062-25804
58542427700AFFE0200002321432255-11948+RepeatScout_4129UnknownUnknown1112-16245
585416335178AFFE0200002322552558-11645+Jockey-14_DBpLINEI-Jockey20632458-21844
585599052647AFFE0200002325693338-10865+RepeatScout_4016LINEJockey24873298-3146
585831345250AFFE0200002333374344-9859+Jockey-14_DBpLINEI-Jockey35474578-644
58536811303244AFFE0200002343455029-9174-Mariner-6_DKDNATcMar-Tc1-880689143

Note: all start coordinates in our database are 0-based, not 1-based.
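
The practical effect of 0-based starts can be sketched as follows. The helper names are hypothetical, and the sketch assumes the usual UCSC convention that end coordinates are exclusive (0-based start, half-open interval), which the note above does not itself state.

```python
from typing import Tuple


def to_one_based(start0: int, end: int) -> Tuple[int, int]:
    """Convert a 0-based, half-open interval to a 1-based, closed interval."""
    if start0 < 0 or end < start0:
        raise ValueError("invalid interval")
    # Only the start shifts; the exclusive 0-based end equals the inclusive 1-based end.
    return start0 + 1, end


def interval_length(start0: int, end: int) -> int:
    """Number of bases covered by a 0-based, half-open interval."""
    return end - start0


# genoStart and genoEnd of the first sample row.
first, last = to_one_based(0, 1067)
print(first, last, interval_length(0, 1067))
```

Under this convention the first sample row covers bases 1 through 1067 in 1-based terms, and its length is simply genoEnd minus genoStart.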

RepeatMasker (rmsk) Track Description
 

Description

This track was created using Arian Smit's RepeatMasker program, which screens DNA sequences for interspersed repeats and low-complexity DNA sequences. The program outputs a detailed annotation of the repeats present in the query sequence (represented by this track), as well as a modified version of the query sequence in which all the annotated repeats have been masked (generally available on the Downloads page). RepeatMasker uses the Repbase Update library of repeats from the Genetic Information Research Institute (GIRI). Repbase Update is described in Jurka (2000), listed in the References section below.

Display Conventions and Configuration

In full display mode, this track displays up to ten different classes of repeats:

  • Short interspersed nuclear elements (SINE), which include Alu elements
  • Long interspersed nuclear elements (LINE)
  • Long terminal repeat elements (LTR), which include retroposons
  • DNA repeat elements (DNA)
  • Simple repeats (microsatellites)
  • Low complexity repeats
  • Satellite repeats
  • RNA repeats (including RNA, tRNA, rRNA, snRNA, scRNA)
  • Other repeats, which include class RC (Rolling Circle)
  • Unknown
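
One way to picture the grouping above is a lookup from a row's repClass value to one of the ten display classes. The sketch below is an assumption throughout: the function name display_class, the display labels, the repClass spellings Simple_repeat and Low_complexity, and the choice to send unrecognized classes to "Other" are illustrative and are not taken from this page.

```python
# Hypothetical lookup tables; the repClass spellings are assumptions.
SAME_NAME = {"SINE", "LINE", "LTR", "DNA", "Satellite", "Unknown"}
RENAMED = {"Simple_repeat": "Simple", "Low_complexity": "Low complexity"}
RNA_CLASSES = {"RNA", "tRNA", "rRNA", "snRNA", "scRNA"}


def display_class(rep_class: str) -> str:
    """Map a repClass value to one of the ten display classes."""
    if rep_class in SAME_NAME:
        return rep_class
    if rep_class in RENAMED:
        return RENAMED[rep_class]
    if rep_class in RNA_CLASSES:
        return "RNA"
    # RC (Rolling Circle) and anything unrecognized fall under "Other".
    return "Other"


for name in ("LINE", "DNA", "tRNA", "RC", "Unknown"):
    print(name, "->", display_class(name))
```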

The level of color shading in the graphical display reflects the amount of base mismatch, base deletion, and base insertion associated with a repeat element. The higher the combined number of these, the lighter the shading.
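
The shading rule can be sketched by summing the three per-thousand fields (milliDiv, milliDel, milliIns) and mapping the total to a gray level. The description above only says that a higher combined value gives lighter shading; the linear scale, the DARKEST and LIGHTEST bounds, and the cap at 1000 are assumptions made for illustration, not the browser's actual formula.

```python
DARKEST = 0     # assumed gray level for a repeat with no divergence
LIGHTEST = 200  # assumed gray level for the most divergent repeats


def combined_divergence(milli_div: int, milli_del: int, milli_ins: int) -> int:
    """Sum of mismatch, deletion, and insertion, each in parts per thousand."""
    return milli_div + milli_del + milli_ins


def gray_level(milli_div: int, milli_del: int, milli_ins: int,
               max_total: int = 1000) -> int:
    """Map combined divergence to a gray level; higher means lighter."""
    total = combined_divergence(milli_div, milli_del, milli_ins)
    capped = max(0, min(total, max_total))
    return DARKEST + round((LIGHTEST - DARKEST) * capped / max_total)


# milliDiv, milliDel, milliIns of the first sample row.
print(combined_divergence(13, 47, 6), gray_level(13, 47, 6))
```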

Methods

UCSC has used the most current versions of the RepeatMasker software and repeat libraries available to generate these data. Note that these versions may be newer than those that are publicly available on the Internet.

Data are generated using the RepeatMasker -s flag, which selects the slower, more sensitive search. Additional flags may be used for certain organisms. Repeats are soft-masked. Alignments may extend through repeats, but are not permitted to initiate in them. See the FAQ for more information.
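
Soft-masking means that repeat bases are written in lowercase while the rest of the sequence stays in uppercase, so no sequence information is lost. A minimal sketch, assuming repeats are given as 0-based, half-open (start, end) pairs like genoStart and genoEnd; the function name soft_mask and the toy sequence are hypothetical.

```python
from typing import Iterable, Tuple


def soft_mask(sequence: str, repeats: Iterable[Tuple[int, int]]) -> str:
    """Return the sequence with repeat intervals lowercased.

    Intervals are 0-based and half-open. Bases outside any interval
    are returned in uppercase.
    """
    chars = list(sequence.upper())
    for start, end in repeats:
        if not 0 <= start <= end <= len(chars):
            raise ValueError(f"interval ({start}, {end}) is out of range")
        chars[start:end] = [base.lower() for base in chars[start:end]]
    return "".join(chars)


masked = soft_mask("ACGTACGTAC", [(2, 5)])
print(masked)
```

Overlapping intervals are harmless here, since lowercasing a base twice gives the same result.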

Credits

Thanks to Arian Smit and GIRI for providing the tools and repeat libraries used to generate this track.

References

Jurka J. Repbase update: a database and an electronic journal of repetitive elements. Trends Genet. 2000 Sep;16(9):418-20. PMID: 10973072