Software

ChIP-Enrich, Poly-Enrich, and proxReg: Gene Set Enrichment Testing for large sets of genomic regions

Webtool: http://chip-enrich.med.umich.edu
Bioconductor package: http://bioconductor.org/packages/release/bioc/html/chipenrich.html

ChIP-Enrich and Poly-Enrich test for enrichment of biological pathways in large sets of narrow genomic regions, such as ChIP-seq or ATAC-seq peaks or repetitive region families. (For broad genomic regions, see Broad-Enrich instead.) ProxReg is a complementary tool that tests whether the regulation of a gene set tends to come mainly from promoter regions or from enhancer regions. Users can choose among several types of gene sets, including Gene Ontology, KEGG, MeSH, MSigDB, and others. Using an uploaded input file, ChIP-Enrich and Poly-Enrich assign genomic regions to genes based on a chosen "locus definition". The "locus" of a gene is the region from which the gene is predicted to be regulated. We are now adding smart enhancer-gene target links, which we’ve shown perform better than simply assigning each genomic region to the gene with the nearest TSS. ChIP-Enrich uses a logistic regression model to test for association between the presence of at least one peak in a gene and gene set membership, while Poly-Enrich uses a negative binomial regression model to test for association between the number of peaks in a gene and gene set membership. Both empirically adjust for the relationship between the length of the loci (and optionally mappability) and the outcome using a cubic smoothing spline term within the model. Poly-Enrich can also take weighted, or scored, genomic regions. Output includes summary plots, peak-to-gene assignments, and enrichment (and depletion) results including odds ratio, p-value, and FDR for each gene set.
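As a rough illustration of the simplest locus definition mentioned above (nearest TSS), the following Python sketch assigns each peak to the gene with the closest TSS and derives the two per-gene outcomes the models use: presence of at least one peak (ChIP-Enrich) and number of peaks (Poly-Enrich). It is not the chipenrich implementation; the gene names, coordinates, the use of the peak midpoint, and the function names are assumptions made for the example, and the regression step with the spline adjustment is omitted.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical toy data: (gene, TSS position) on a single chromosome.
TSS = [("geneA", 1000), ("geneB", 5000), ("geneC", 12000)]


def nearest_tss_gene(peak_start, peak_end, tss_sorted):
    """Return the gene whose TSS is closest to the peak midpoint (an assumption)."""
    mid = (peak_start + peak_end) // 2
    positions = [pos for _, pos in tss_sorted]
    i = bisect_left(positions, mid)
    # Only the TSSs immediately left and right of the midpoint can be nearest.
    candidates = tss_sorted[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda gene: abs(gene[1] - mid))[0]


def summarize_peaks(peaks, tss):
    """Compute per-gene peak presence (0/1) and per-gene peak counts."""
    tss_sorted = sorted(tss, key=lambda gene: gene[1])
    counts = Counter(nearest_tss_gene(s, e, tss_sorted) for s, e in peaks)
    genes = [name for name, _ in tss_sorted]
    has_peak = {g: int(counts[g] > 0) for g in genes}   # ChIP-Enrich-style outcome
    num_peaks = {g: counts[g] for g in genes}           # Poly-Enrich-style outcome
    return has_peak, num_peaks


# Hypothetical peaks as (start, end) pairs.
peaks = [(900, 1100), (1400, 1600), (4800, 5000)]
has_peak, num_peaks = summarize_peaks(peaks, TSS)
print(has_peak)
print(num_peaks)
```

In this toy example two peaks fall nearest to geneA, one to geneB, and none to geneC, so the binary outcome treats geneA and geneB identically while the count outcome distinguishes them.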

Broad-Enrich: Gene Set Enrichment Testing for large sets of broad genomic regions

Webtool: http://broad-enrich.med.umich.edu
Bioconductor package: http://bioconductor.org/packages/release/bioc/html/chipenrich.html

Broad-Enrich tests sets of broad genomic regions (e.g., from ChIP-seq data for histone modifications or copy number variations) for enriched biological pathways, Gene Ontology terms, or other gene sets. The pre-defined gene sets are the same as those used in LRpath, and can be browsed here. Using an input .bed, .narrowPeak, or .broadPeak file, Broad-Enrich determines the proportion of each gene locus covered by a peak, using a chosen "gene locus definition". The "locus" of a gene is the region from which the gene is predicted to be regulated. Broad-Enrich uses a logistic regression model to test for association between the proportion of each gene locus covered by a peak and gene set membership. It empirically adjusts for the bias due to locus length using a binomial cubic smoothing spline within the logistic model. Output includes summary plots, peak-to-gene assignments, and enrichment (and depletion) results including odds ratio, p-value, and FDR for each gene set.
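The per-gene quantity described above, the proportion of a gene locus covered by peaks, can be sketched in a few lines of Python. This is only an illustration, not Broad-Enrich's code: the coordinates are made up, intervals are assumed to be half-open as in BED files, and the function names are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_proportion(locus, peaks):
    """Fraction of a half-open locus [start, end) covered by at least one peak."""
    locus_start, locus_end = locus
    covered = 0
    # Merging first ensures that overlapping peaks are not counted twice.
    for start, end in merge_intervals(peaks):
        overlap = min(end, locus_end) - max(start, locus_start)
        if overlap > 0:
            covered += overlap
    return covered / (locus_end - locus_start)


# Hypothetical locus and broad peaks on the same chromosome.
locus = (1000, 2000)
peaks = [(900, 1200), (1100, 1300), (1800, 2500)]
ratio = covered_proportion(locus, peaks)
print(ratio)
```

Here the first two peaks merge into one interval covering 300 bases of the locus and the third covers 200, giving a proportion of 0.5.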

Annotatr: Annotation of Genomic Regions to Genomic Annotations

Bioconductor package: https://www.bioconductor.org/packages/release/bioc/html/annotatr.html

The annotatr Bioconductor package provides an easy way to summarize and visualize the intersection of genomic sites/regions with genomic annotations. Given a set of genomic sites/regions (e.g. ChIP-seq peaks, CpGs, differentially methylated CpGs or regions, SNPs, etc.) it is often of interest to investigate the intersecting genomic annotations. Such annotations include those relating to gene models (promoters, 5'UTRs, exons, introns, and 3'UTRs), CpGs (CpG islands, CpG shores, CpG shelves), or regulatory sequences such as enhancers.
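To make the idea of intersecting regions with annotations concrete, here is a small Python sketch that pairs each region with every annotation it overlaps and then tallies the annotation types. It does not use or reproduce the annotatr API; the annotation table, coordinates, and the `overlap_pairs` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical annotations on one chromosome: (type, start, end), half-open.
ANNOTATIONS = [
    ("promoter", 1000, 2000),
    ("exon", 2000, 2500),
    ("CpG island", 1500, 2200),
    ("enhancer", 8000, 9000),
]


def overlap_pairs(regions, annotations):
    """Return (region, annotation type) for every overlapping pair."""
    pairs = []
    for r_start, r_end in regions:
        for a_type, a_start, a_end in annotations:
            # Two half-open intervals overlap when each starts before the other ends.
            if r_start < a_end and a_start < r_end:
                pairs.append(((r_start, r_end), a_type))
    return pairs


# Hypothetical regions, e.g. ChIP-seq peaks or differentially methylated regions.
regions = [(1800, 2100), (8500, 8600), (20000, 20100)]
pairs = overlap_pairs(regions, ANNOTATIONS)
summary = Counter(a_type for _, a_type in pairs)
print(summary)
```

A single region can intersect several annotation types at once (the first region above overlaps a promoter, an exon, and a CpG island), while a region with no overlap contributes nothing to the summary.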

PePr: Peak Prioritization Pipeline

PePr on GitHub
PePr is a Python-based analysis pipeline for ChIP-Seq experiments with biological replicates. The program accounts for the variation among biological replicates using a negative binomial model, and uses local information to improve estimates of variance. It uses a novel between-sample normalization strategy to account for variable antibody efficiency, and post hoc steps to increase peak resolution and reduce false positives. It can be used either to call histone modification or transcription factor binding peaks versus control data, or for two-group comparisons (i.e., differential binding). With PePr, users do not need to call peaks separately in each sample first; the differential peak calling is performed in a single analysis.

Methylation Integration (Mint) Pipeline

Github site: https://github.com/sartorlab/mint

The mint pipeline analyzes single-end reads coming from sequencing assays measuring DNA methylation and hydroxymethylation. The pipeline analyzes reads from both bisulfite-converted assays such as WGBS and RRBS, and from pulldown assays such as MeDIP-seq, hMeDIP-seq, and hMeSeal. Moreover, with data measuring both 5-methylcytosine (5mc) and 5-hydroxymethylcytosine (5hmc), the mint pipeline integrates the two data types to classify genomic regions of 5mc, 5hmc, a mixture, or neither.
The pipeline is available as both a command-line tool (https://github.com/sartorlab/mint) and a Galaxy graphical user interface (https://github.com/sartorlab/mint_galaxy). Both implementations require minimal configuration while remaining flexible to experiment-specific needs.
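The four-way classification of regions described above can be pictured with a minimal Python sketch. It assumes that each region has already been called as carrying 5mc signal, 5hmc signal, both, or neither; how mint makes those calls is not shown, and the region names and function name are hypothetical.

```python
def classify_region(has_5mc, has_5hmc):
    """Label a region from two boolean signal calls (assumed to be given)."""
    if has_5mc and has_5hmc:
        return "mixture"
    if has_5mc:
        return "5mc"
    if has_5hmc:
        return "5hmc"
    return "neither"


# Hypothetical per-region calls: region -> (5mc signal, 5hmc signal).
calls = {
    "region1": (True, False),
    "region2": (False, True),
    "region3": (True, True),
    "region4": (False, False),
}
labels = {region: classify_region(*signals) for region, signals in calls.items()}
print(labels)
```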

LRpath and RNA-Enrich

Webtool: http://lrpath.med.umich.edu

LRpath performs gene set enrichment testing using logistic regression, allowing the input data to remain on a continuous scale. RNA-Enrich additionally takes into account gene coverage for RNA-seq data. This web-based tool tests against several annotation databases, including Gene Ontology, multiple pathway databases, metabolite, transcription factor and microRNA target sets, and literature-derived annotations. LRpath performs well with both small and large sample sizes. Additional benefits of using the LRpath program include (1) the ability to perform both “directional” and “non-directional” enrichment tests that allow for two different perspectives and (2) the ability to easily compare and visualize results across multiple studies using LRpath clustering.
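The core idea, a logistic regression of gene set membership on a continuous significance score, can be sketched in plain Python. This is a toy illustration under stated assumptions, not LRpath's implementation: the scores are invented values standing in for something like -log10 p-values, the model has a single predictor, and it is fitted with a hand-written Newton-Raphson loop (the hypothetical `fit_logistic` function) rather than a statistics library.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit logit(P(y=1)) = b0 + b1 * x by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Entries of the 2x2 information matrix X'WX.
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: beta += inverse(X'WX) * gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical per-gene significance scores and gene set membership (1 = in set).
scores = [0.2, 0.5, 0.9, 1.3, 1.8, 2.4, 3.0, 3.5]
in_set = [0, 0, 1, 0, 1, 0, 1, 1]

intercept, slope = fit_logistic(scores, in_set)
odds_ratio = math.exp(slope)  # odds ratio per unit increase in the score
print(intercept, slope, odds_ratio)
```

A positive slope (odds ratio above 1) means that genes with stronger evidence of differential expression are more likely to belong to the gene set, which is the kind of enrichment signal the continuous-scale approach is designed to detect without imposing a significance cutoff.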

MethylSig – testing for differentially methylated CpGs or regions with bisulfite sequencing data

Bioconductor package: https://bioconductor.org/packages/release/bioc/html/methylSig.html

MethylSig is a Bioconductor package that tests for differentially methylated CpGs (DMCs) or differentially methylated tiled regions (DMRs) between groups of samples using a beta-binomial model. Two testing methods are offered: the methylSig test and the DSS test. The DSS test allows for additional covariates in the model, while the methylSig test allows the use of local information to improve estimates of variance. Several options exist for site-specific or sliding-window tests and for variance estimation.
Check out the newest version of methylSig on GitHub
The easiest way to install the devel version of methylSig is with the devtools R package:

> library(devtools)
> install_github('sartorlab/methylSig')