Workflow: DiffBind - Differential Binding Analysis of ChIP-Seq Peak Data
Differential Binding Analysis of ChIP-Seq Peak Data
---------------------------------------------------

DiffBind processes ChIP-Seq data enriched for genomic loci where specific protein/DNA binding occurs, including peak sets identified by ChIP-Seq peak callers and aligned sequence read datasets. It is designed to work with multiple peak sets simultaneously, representing different ChIP experiments (antibodies, transcription factors and/or histone marks, experimental conditions, replicates), and to manage the results of multiple peak callers.

For more information please refer to:
-------------------------------------

Ross-Innes CS, Stark R, Teschendorff AE, Holmes KA, Ali HR, Dunning MJ, Brown GD, Gojis O, Ellis IO, Green AR, Ali S, Chin S, Palmieri C, Caldas C, Carroll JS (2012). “Differential oestrogen receptor binding is associated with clinical outcome in breast cancer.” Nature, 481, -4.
Inputs
ID | Type | Title | Doc |
---|---|---|---|
alias | String | Experiment short name/Alias | |
threads | Integer (Optional) | Number of threads | Number of threads for those steps that support multithreading |
use_common | Boolean (Optional) | Use common peaks within each condition. Ignore Minimum peakset overlap | Derive consensus peaks only from the common peaks within each condition. Min peakset overlap is ignored. Default: false |
min_overlap | Integer (Optional) | Minimum peakset overlap | Min peakset overlap. Only include peaks found in at least this many peaksets when generating the consensus peakset. Default: 2 |
name_cond_1 | String (Optional) | Condition 1 name, single word with letters and numbers only | Condition 1 name, single word with letters and numbers only |
name_cond_2 | String (Optional) | Condition 2 name, single word with letters and numbers only | Condition 2 name, single word with letters and numbers only |
blocked_file | File (Optional) [Textual format] | Blocking attribute headerless TSV/CSV file for multi-factor analysis with columns to set name and group. If this input is set, blocking attributes above are ignored | Blocking attribute metadata file for multi-factor analysis. Headerless TSV/CSV file. First column: names from --name1 and --name2; second column: group name. --block is ignored |
cutoff_param | https://w3id.org/cwl/view/git/a409db2289b86779897ff19003bd351701a81c50/workflows/diffbind.cwl#cutoff_param/cutoff (Optional) | Parameter to which cutoff should be applied | Parameter to which cutoff should be applied (fdr or pvalue). Default: fdr |
cutoff_value | Float (Optional) | P-value or FDR cutoff for reported results | P-value or FDR cutoff for reported results |
fragmentsize | Integer (Optional) | Reads extension size, bp | Extend each read from its endpoint along the appropriate strand. Default: 125 bp |
promoter_dist | Integer (Optional) | Promoter distance, bp | Max distance from the gene TSS (in both directions); a peak overlapping this range is assigned to the promoter region. Default: 1000 bp |
upstream_dist | Integer (Optional) | Upstream distance, bp | Max distance from the promoter (upstream direction only); a peak overlapping this range is assigned to the upstream region. Default: 20,000 bp |
analysis_method | https://w3id.org/cwl/view/git/a409db2289b86779897ff19003bd351701a81c50/workflows/diffbind.cwl#analysis_method/method (Optional) | Analysis method | Method by which to analyze differential binding affinity. Default: deseq2 |
annotation_file | File [TSV] | Genome annotation | Genome annotation file in TSV format |
min_read_counts | Integer (Optional) | Minimum read counts. Exclude intervals where MAX read counts for all samples < specified value | Min read counts. Exclude all merged intervals where the MAX raw read count among all of the samples is smaller than the specified value. Default: 0 |
chrom_length_file | File [Textual format] | Chromosome length file | Chromosome length file |
peak_files_cond_1 | File[] [xls] | Biological condition 1 samples. Minimum 2 samples | XLS peak files for condition 1 from MACS2. Minimum 2 files. Order corresponds to read_files_cond_1 |
peak_files_cond_2 | File[] [xls] | Biological condition 2 samples. Minimum 2 samples | XLS peak files for condition 2 from MACS2. Minimum 2 files. Order corresponds to read_files_cond_2 |
read_files_cond_1 | File[] [BAM] | Biological condition 1 samples. Minimum 2 samples | Read files for condition 1. Minimum 2 files in BAM format |
read_files_cond_2 | File[] [BAM] | Biological condition 2 samples. Minimum 2 samples | Read files for condition 2. Minimum 2 files in BAM format |
remove_duplicates | Boolean (Optional) | Remove duplicated reads | Remove reads that map to exactly the same genomic position. Default: false |
blocked_attributes | String[] (Optional) | Blocking attributes for multi-factor analysis. Minimum 2 | Blocking attributes for multi-factor analysis. Minimum 2. Either names from --name1 and/or --name2, or an array of strings that R can parse as bool. In the latter case the order and size should correspond to [--read1]+[--read2]. Default: not applied |
sample_names_cond_1 | String[] (Optional) | Biological condition 1 sample names | Aliases for biological condition 1 samples, used in the legends of generated plots. Order corresponds to read_files_cond_1 |
sample_names_cond_2 | String[] (Optional) | Biological condition 2 sample names | Aliases for biological condition 2 samples, used in the legends of generated plots. Order corresponds to read_files_cond_2 |
narrow_peaks_files_cond_1 | File[] (Optional) [ENCODE narrow peak format] | Called peaks for biological condition 1 | Narrow peaks file(s) for biological condition 1 |
narrow_peaks_files_cond_2 | File[] (Optional) [ENCODE narrow peak format] | Called peaks for biological condition 2 | Narrow peaks file(s) for biological condition 2 |
genome_coverage_files_cond_1 | File[] [bigWig] | Genome coverage(s) for biological condition 1 | Genome coverage bigWig file(s) for biological condition 1 |
genome_coverage_files_cond_2 | File[] [bigWig] | Genome coverage(s) for biological condition 2 | Genome coverage bigWig file(s) for biological condition 2 |
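
Several constraints in the table above (at least two samples per condition, peak files ordered like the read files, single-word condition names) can be checked before a run. The following is a minimal sketch, not part of the workflow: the helper names `cwl_file` and `build_job`, the default condition names, and the file paths are assumptions made for illustration; only the input IDs come from the table. The required `annotation_file`, `chrom_length_file` and genome coverage inputs are left out for brevity and would be added the same way.

```python
import json
import re


def cwl_file(path):
    """Return a CWL File object for a local path."""
    return {"class": "File", "path": path}


def build_job(alias, reads_1, peaks_1, reads_2, peaks_2,
              name_1="cond1", name_2="cond2"):
    """Build a partial job object, checking the limits stated in the table.

    Hypothetical helper: only the input IDs are taken from the workflow.
    """
    for label, reads, peaks in (("1", reads_1, peaks_1), ("2", reads_2, peaks_2)):
        if len(reads) < 2:
            raise ValueError(f"condition {label}: at least 2 BAM files are required")
        if len(peaks) != len(reads):
            raise ValueError(f"condition {label}: peak files must pair one-to-one with read files")
    for name in (name_1, name_2):
        # "single word with letters and numbers only"
        if not re.fullmatch(r"[A-Za-z0-9]+", name):
            raise ValueError(f"condition name {name!r} must contain letters and numbers only")
    return {
        "alias": alias,
        "name_cond_1": name_1,
        "name_cond_2": name_2,
        "read_files_cond_1": [cwl_file(p) for p in reads_1],
        "peak_files_cond_1": [cwl_file(p) for p in peaks_1],
        "read_files_cond_2": [cwl_file(p) for p in reads_2],
        "peak_files_cond_2": [cwl_file(p) for p in peaks_2],
    }


if __name__ == "__main__":
    # Placeholder file names; replace with real BAM and MACS2 XLS paths.
    job = build_job(
        "demo",
        ["c1_rep1.bam", "c1_rep2.bam"], ["c1_rep1_peaks.xls", "c1_rep2_peaks.xls"],
        ["c2_rep1.bam", "c2_rep2.bam"], ["c2_rep1_peaks.xls", "c2_rep2_peaks.xls"],
    )
    print(json.dumps(job, indent=2))
```

The printed JSON can be saved and extended into a job file for a CWL runner; failing early on a mismatched list is cheaper than discovering it after alignment files have been staged.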
Steps
ID | Runs | Label | Doc |
---|---|---|---|
pipe | diffbind.cwl#pipe/32343c61-3964-4d73-90be-c30295f297b3 (ExpressionTool) | | |
diffbind | ../tools/diffbind.cwl (CommandLineTool) | DiffBind - Differential Binding Analysis of ChIP-Seq Peak Data | Runs R script to compute differentially bound sites from multiple ChIP-seq experiments using affinity (quantitative) and occupancy data. |
sort_bed | ../tools/linux-sort.cwl (CommandLineTool) | Tool sorts data from `unsorted_file` by key | |
assign_genes | ../tools/iaintersect.cwl (CommandLineTool) | Tool assigns each peak obtained from MACS2 to a gene and region (upstream, promoter, exon, intron, intergenic) | |
select_files | diffbind.cwl#select_files/73f4dd4f-8570-441c-8293-4c303f4e5372 (ExpressionTool) | | |
bed_to_bigbed | ../tools/ucsc-bedtobigbed.cwl (CommandLineTool) | Tool converts bed file to bigBed | |
convert_to_bed | ../tools/custom-bash.cwl (CommandLineTool) | Tool to run custom script set as `script` input with arguments from `param`. Default script runs sed command over the input file and exports results to the file with the same name as input's basename | |
filter_columns | ../tools/custom-bash.cwl (CommandLineTool) | Tool to run custom script set as `script` input with arguments from `param`. Default script runs sed command over the input file and exports results to the file with the same name as input's basename | |
restore_columns | ../tools/custom-bash.cwl (CommandLineTool) | Tool to run custom script set as `script` input with arguments from `param`. Default script runs sed command over the input file and exports results to the file with the same name as input's basename | |
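
The `assign_genes` step labels peaks by region, and the `promoter_dist` and `upstream_dist` inputs set how wide the promoter and upstream windows are. The sketch below only illustrates how those two distances could partition the neighbourhood of a single TSS. It is a simplified assumption, not the iaintersect implementation: the function name `classify_peak`, the closed-interval coordinates, and the catch-all label `"other"` (standing in for exon, intron and intergenic) are all hypothetical.

```python
def classify_peak(peak_start, peak_end, tss, strand,
                  promoter_dist=1000, upstream_dist=20000):
    """Label a peak 'promoter', 'upstream' or 'other' relative to one TSS.

    Simplified illustration of the two distance inputs; NOT the iaintersect code.
    All coordinates are treated as closed intervals.
    """
    def overlaps(a_start, a_end, b_start, b_end):
        return a_start <= b_end and b_start <= a_end

    # Promoter window: promoter_dist on both sides of the TSS.
    prom_start, prom_end = tss - promoter_dist, tss + promoter_dist
    if overlaps(peak_start, peak_end, prom_start, prom_end):
        return "promoter"

    # Upstream window: upstream_dist beyond the promoter, on the upstream side only.
    if strand == "+":
        up_start, up_end = prom_start - upstream_dist, prom_start - 1
    else:
        up_start, up_end = prom_end + 1, prom_end + upstream_dist
    if overlaps(peak_start, peak_end, up_start, up_end):
        return "upstream"
    return "other"
```

With the defaults and a TSS at 10000 on the `+` strand, the promoter window in this sketch spans 9000 to 11000 and the upstream window extends a further 20,000 bp below 9000; on the `-` strand the upstream window sits above 11000 instead.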
Outputs
ID | Type | Label | Doc |
---|---|---|---|
diffbind_ma_plot | File (Optional) [PNG] | MA plot for significantly differentially bound sites | MA plot for significantly differentially bound sites |
diffbind_bed_file | File [bigBed] | Estimated differential peaks | Estimated differential peaks, bigBed |
diffbind_pca_plot | File (Optional) [PNG] | PCA plot for significantly differentially bound sites | PCA plot for significantly differentially bound sites |
diffbind_stderr_log | File [Textual format] | diffbind stderr log | diffbind stderr log |
diffbind_stdout_log | File [Textual format] | diffbind stdout log | diffbind stdout log |
narrow_peaks_cond_1 | File[] (Optional) [ENCODE narrow peak format] | Called peaks for biological condition 1 | Narrow peaks file(s) for biological condition 1 |
narrow_peaks_cond_2 | File[] (Optional) [ENCODE narrow peak format] | Called peaks for biological condition 2 | Narrow peaks file(s) for biological condition 2 |
diffbind_ma_plot_pdf | File (Optional) [PDF] | MA plot for significantly differentially bound sites | MA plot for significantly differentially bound sites |
diffbind_report_file | File [TSV] | Differential binding analysis results | Differential binding analysis results exported as TSV |
diffbind_all_pca_plot | File (Optional) [PNG] | PCA plot for all bound sites | PCA plot for all bound sites |
diffbind_boxplot_plot | File (Optional) [PNG] | Box plots of read distributions for significantly differentially bound sites | Box plots of read distributions for significantly differentially bound sites |
diffbind_pca_plot_pdf | File (Optional) [PDF] | PCA plot for significantly differentially bound sites | PCA plot for significantly differentially bound sites |
diffbind_volcano_plot | File (Optional) [PNG] | Volcano plot for significantly differentially bound sites | Volcano plot for significantly differentially bound sites |
genome_coverage_cond_1 | File[] [bigWig] | Genome coverage(s) for biological condition 1 | Genome coverage bigWig file(s) for biological condition 1 |
genome_coverage_cond_2 | File[] [bigWig] | Genome coverage(s) for biological condition 2 | Genome coverage bigWig file(s) for biological condition 2 |
diffbind_all_pca_plot_pdf | File (Optional) [PDF] | PCA plot for all bound sites | PCA plot for all bound sites |
diffbind_boxplot_plot_pdf | File (Optional) [PDF] | Box plots of read distributions for significantly differentially bound sites | Box plots of read distributions for significantly differentially bound sites |
diffbind_volcano_plot_pdf | File (Optional) [PDF] | Volcano plot for significantly differentially bound sites | Volcano plot for significantly differentially bound sites |
diffbind_db_sites_binding_heatmap | File (Optional) [PNG] | Binding heatmap for significantly differentially bound sites | Binding heatmap for significantly differentially bound sites |
diffbind_peak_correlation_heatmap | File (Optional) [PNG] | Peak overlap correlation heatmap | Peak overlap correlation heatmap |
diffbind_all_peak_overlap_rate_plot | File (Optional) [PNG] | All peak overlap rate plot | All peak overlap rate plot |
diffbind_counts_correlation_heatmap | File (Optional) [PNG] | Raw counts correlation heatmap | Raw counts correlation heatmap |
diffbind_consensus_peak_venn_diagram | File (Optional) [PNG] | Consensus peak Venn Diagram | Consensus peak Venn Diagram |
diffbind_all_data_correlation_heatmap | File (Optional) [PNG] | Not filtered normalized counts correlation heatmap | Not filtered normalized counts correlation heatmap |
diffbind_db_sites_binding_heatmap_pdf | File (Optional) [PDF] | Binding heatmap for significantly differentially bound sites | Binding heatmap for significantly differentially bound sites |
diffbind_db_sites_correlation_heatmap | File (Optional) [PNG] | Normalized counts correlation heatmap for significantly differentially bound sites | Normalized counts correlation heatmap for significantly differentially bound sites |
diffbind_peak_correlation_heatmap_pdf | File (Optional) [PDF] | Peak overlap correlation heatmap | Peak overlap correlation heatmap |
diffbind_peak_overlap_rate_plot_cond_1 | File (Optional) [PNG] | Condition 1 peak overlap rate plot | Condition 1 peak overlap rate plot |
diffbind_peak_overlap_rate_plot_cond_2 | File (Optional) [PNG] | Condition 2 peak overlap rate plot | Condition 2 peak overlap rate plot |
diffbind_all_peak_overlap_rate_plot_pdf | File (Optional) [PDF] | All peak overlap rate plot | All peak overlap rate plot |
diffbind_counts_correlation_heatmap_pdf | File (Optional) [PDF] | Raw counts correlation heatmap | Raw counts correlation heatmap |
diffbind_consensus_peak_venn_diagram_pdf | File (Optional) [PDF] | Consensus peak Venn Diagram | Consensus peak Venn Diagram |
diffbind_all_data_correlation_heatmap_pdf | File (Optional) [PDF] | Not filtered normalized counts correlation heatmap | Not filtered normalized counts correlation heatmap |
diffbind_db_sites_correlation_heatmap_pdf | File (Optional) [PDF] | Normalized counts correlation heatmap for significantly differentially bound sites | Normalized counts correlation heatmap for significantly differentially bound sites |
diffbind_peak_overlap_rate_plot_cond_1_pdf | File (Optional) [PDF] | Condition 1 peak overlap rate plot | Condition 1 peak overlap rate plot |
diffbind_peak_overlap_rate_plot_cond_2_pdf | File (Optional) [PDF] | Condition 2 peak overlap rate plot | Condition 2 peak overlap rate plot |
https://w3id.org/cwl/view/git/a409db2289b86779897ff19003bd351701a81c50/workflows/diffbind.cwl