OLOGRAM: Determining significance of total overlap length between genomic regions sets

authors

  • Ferré Q.
  • Charbonnier G.
  • Sadouni N.
  • Lopez F.
  • Kermezli Y.
  • Spicuglia S.
  • Capponi C.
  • Ghattas B.
  • Puthier D.

document type

ART

abstract

Motivation: Various bioinformatics analyses provide sets of genomic coordinates of interest. Whether two such sets possess a functional relation is a frequent question. This is often determined by interpreting the statistical significance of their overlaps. However, few existing methods consider the lengths of the overlaps, and those that do fail to provide a resolutive p-value. Results: Here, we introduce OLOGRAM, which performs overlap statistics between sets of genomic regions described in BED or GTF files. It uses Monte Carlo simulation, taking into account both the distributions of region and inter-region lengths, to fit a negative binomial model of the total overlap length. Exclusion of user-defined genomic areas during the shuffling is supported.
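
The approach described in the abstract can be illustrated with a minimal sketch. This is not OLOGRAM's implementation. It assumes a single chromosome, regions given as sorted, non-overlapping (start, end) pairs, a method-of-moments fit of the negative binomial, and no excluded areas. All function names (total_overlap, shuffle_regions, nbinom_upper_tail, overlap_pvalue) are hypothetical.

```python
import math
import random


def total_overlap(a, b):
    """Total overlap length (bp) between two sorted, non-overlapping interval lists."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def shuffle_regions(regions, chrom_len, rng):
    """Permute region lengths and inter-region lengths independently, then rebuild."""
    lengths = [end - start for start, end in regions]
    # Gaps: before the first region, between consecutive regions, after the last.
    bounds = [0] + [x for region in regions for x in region] + [chrom_len]
    gaps = [bounds[k + 1] - bounds[k] for k in range(0, len(bounds), 2)]
    rng.shuffle(lengths)
    rng.shuffle(gaps)
    shuffled, pos = [], 0
    for gap, length in zip(gaps, lengths):  # the leftover gap is the trailing one
        pos += gap
        shuffled.append((pos, pos + length))
        pos += length
    return shuffled


def nbinom_upper_tail(k, r, p):
    """P(X >= k) for a negative binomial with real size r and success probability p."""
    cdf = 0.0
    for x in range(k):
        log_pmf = (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
                   + r * math.log(p) + x * math.log1p(-p))
        cdf += math.exp(log_pmf)
    # Note: 1 - cdf loses precision for very small tail probabilities.
    return max(0.0, 1.0 - cdf)


def overlap_pvalue(a, b, chrom_len, n_shuffles=200, seed=0):
    """Return (observed overlap, simulated mean, p-value) under the shuffled null."""
    rng = random.Random(seed)
    observed = total_overlap(a, b)
    sims = [
        total_overlap(shuffle_regions(a, chrom_len, rng),
                      shuffle_regions(b, chrom_len, rng))
        for _ in range(n_shuffles)
    ]
    mean = sum(sims) / len(sims)
    var = sum((x - mean) ** 2 for x in sims) / (len(sims) - 1)
    if mean <= 0 or var <= mean:
        raise ValueError("method-of-moments fit needs variance > mean > 0")
    # Method of moments (an assumption of this sketch): mean = r(1-p)/p, var = mean/p.
    p = mean / var
    r = mean * mean / (var - mean)
    return observed, mean, nbinom_upper_tail(observed, r, p)
```

Calling overlap_pvalue(a, b, chrom_len) on two region lists returns the observed overlap, the mean overlap over the shuffles, and the fitted upper-tail probability. Because the p-value comes from the fitted distribution rather than from counting shuffles, it is not bounded below by 1/n_shuffles, which is one reading of the "resolutive p-value" mentioned in the abstract.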
