<!DOCTYPE html>
<html lang="en">
<head>
<title>Galaxy Australia</title>
<meta property="og:title" content="" />
<meta property="og:description" content="" />
<meta property="og:image" content="/assets/media/galaxy-eu-logo.512.png" />
<meta name="description" content="The Australian Galaxy Instance">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<link rel="stylesheet" href="/assets/css/bootstrap.min.css">
<link rel="stylesheet" href="/assets/css/main.css">
<link rel="canonical" href="https://usegalaxy-au.github.io/index-hic.html">
<link rel="shortcut icon" href="/assets/media/galaxy-eu-logo.64.png" type="image/x-icon" />
<link rel="alternate" type="application/rss+xml" title="Galaxy Australia" href="/feed.xml">
<link href="/assets/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-wvfXpqpZZVQGK6TAh5PVlGOfQNHSoD2xbE+QkPxCAFlNEevoEH3Sl0sibVcOQVnN" crossorigin="anonymous">
<script src="/assets/js/jquery-3.2.1.slim.min.js" integrity="sha256-k2WSCIexGzOj3Euiig+TlR8gA0EmPjuc79OEeY5L45g=" crossorigin="anonymous"></script>
<script src="/assets/js/bootstrap.min.js" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script>
</head>
<body>
<div id="wrap">
<div id="main">
<div class="container" id="maincontainer">
<div class="home">
<h1 id="galaxy-hicexplorer">Galaxy HiCExplorer</h1>
<p>Welcome to the Galaxy HiCExplorer – a webserver to process, analyse and visualize Hi-C data.</p>
<p><img src="https://raw.githubusercontent.com/deeptools/HiCExplorer/master/docs/images/hicex2.png" alt="" /></p>
<h2 id="get-started-with-galaxy-hicexplorer">Get started with Galaxy HiCExplorer</h2>
<p>Are you new to Galaxy, or returning after a long time, and looking for help to get started? Take <a target="_parent" href="https://hicexplorer.usegalaxy.eu/tours/core.galaxy_ui">a guided tour</a> through Galaxy’s user interface.</p>
<p>Take <a target="_parent" href="https://hicexplorer.usegalaxy.eu/tours/hixexplorer">a guided tour</a> for an introduction to Galaxy HiCExplorer and Hi-C data analysis. This tour guides you through the Hi-C tutorial on the <a target="_parent" href="https://galaxyproject.github.io/training-material/topics/epigenetics/tutorials/hicexplorer/tutorial.html">Galaxy Training Network</a>, in which you analyse Hi-C data from Drosophila melanogaster. Follow the tutorial to understand the analysis steps better or to see which parameter settings are useful.</p>
<p>A precomputed history of the tutorial can be viewed <a target="_parent" href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/h/drosophila-melanogaster-hi-c-training">here</a>.</p>
<p>A more advanced tutorial is hosted on <a target="_parent" href="https://hicexplorer.readthedocs.io/en/latest/content/mES-HiC_analysis.html">readthedocs.io</a>. It is designed for the shell-based version of HiCExplorer but can easily be adapted to Galaxy HiCExplorer. In this tutorial, mouse stem cells from <a target="_parent" href="https://www.genomebiology.com/2015/16/1/149">Marks et al. (2015)</a> are analysed. We provide the input FASTQ files in our <a target="_parent" href="https://hicexplorer.usegalaxy.eu/library/list#folders/F49c63be29eb6cbc1">data library</a>.</p>
<p>We recommend following the tutorial on <a target="_parent" href="https://galaxyproject.github.io/training-material/topics/sequence-analysis/tutorials/quality-control/tutorial.html">FastQC</a> for quality checks.</p>
<h3 id="example-data">Example data</h3>
<p>The Galaxy Training Network tutorial uses Hi-C data from Drosophila melanogaster, which is hosted on Zenodo: <a target="_parent" href="https://doi.org/10.5281/zenodo.1183661"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.1183661.svg" alt="DOI" /></a></p>
<p>Additionally, we provide the data in <a target="_parent" href="https://hicexplorer.usegalaxy.eu/library/list#folders/F8607ddb1c5387e36">the shared data library</a> of Galaxy HiCExplorer. Unlike the data hosted on Zenodo, the library contains preprocessed intermediate files.</p>
<p>Galaxy HiCExplorer can process large Hi-C datasets. We processed Hi-C data with around 750 million reads from <a href="http://circ.ahajournals.org/content/136/17/1613.long">Rosa-Garrido et al.</a> Have a look at the preprocessed <a target="_parent" href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/h/nar-publication-750-million-reads">files</a>.</p>
<h2 id="galaxy-hicexplorer--many-possibilities">Galaxy HiCExplorer – many possibilities</h2>
<p><img src="/assets/media/publication_plots.png" alt="" />
<b>(A)</b> Galaxy HiCExplorer workflows and tools. Quality control tools: <b>(B)</b> Output of hicCorrelate comparing two wild-type samples and one knockdown sample. <b>(C)</b> Output of hicPlotDistVsCounts showing changes in the number of contacts for different conditions. Analysis tools: <b>(D)</b> hicPlotMatrix of the Pearson correlation matrix derived from a contact matrix for chromosome 6 in mouse, computed with hicTransform. The optional data track at the bottom shows the first eigenvector for A/B compartments obtained using hicPCA. <b>(E)</b> The pixel difference between a corrected Hi-C matrix for the wild-type condition and a knockdown was computed using hicCompareMatrices, and a 7 Mb region is visualized using hicPlotMatrix. Visualization tools: <b>(F)</b> Contact matrix plot of an 80 to 105 Mb region of chromosome 2 in log scale. <b>(G)</b> Example output of hicPlotViewpoint showing the corrected number of Hi-C contacts for a single bin in chromosome 5 (output similar to 4C-seq) (<a target="_parent" href="https://doi.org/10.1101/gr.213066.116">Andrey 2017</a>). <b>(H)</b> A Hi-C matrix was converted into an observed vs. expected matrix using hicTransform, and this matrix, together with the location of high-affinity sites from <a target="_parent" href="https://doi.org/10.1016/j.molcel.2015.08.024">Ramirez 2015</a>, was used to run hicAggregateContacts. <b>(I)</b> 85 Mb to 110 Mb region from human chromosome 2 visualized using hicPlotTADs. TADs were computed by hicFindTADs. The additional tracks correspond to: TAD-separation score (as reported by hicFindTADs), chromatin state, principal component 1 (A/B compartment) computed using hicPCA, ChIP-seq coverage for the H3K27ac mark, DNA methylation, and a gene track. Hi-C data for <b>B</b>, <b>C</b>, <b>E</b> and <b>H</b> from Drosophila melanogaster S2 cells (<a target="_parent" href="https://doi.org/10.1038/s41467-017-02525-w">Ramirez 2018</a>).
Hi-C data for <b>D</b>, <b>F</b> and <b>I</b> from mouse cardiac myocytes (<a target="_parent" href="https://doi.org/10.1038/s41467-017-01724-9">Nothjunge 2017</a>). Additional tracks in <b>I</b> from <a target="_parent" href="https://doi.org/10.1038/s41467-017-01724-9">Nothjunge 2017</a>.</p>
<h2 id="workflows">Workflows</h2>
<p>To automate consecutive steps, we provide workflows in three categories: from scratch (FASTQ files), from scratch (FASTQ files) with summing up of replicates, and starting from an existing contact matrix. Many workflows require collections of FASTQ files as input;
<a href="https://galaxyproject.org/tutorials/collections/">this tutorial</a> shows how to create a collection. Please do not forget to check the quality of the FASTQ files with FastQC.</p>
<p>Please keep in mind that all workflows need additional input from the user. All mapping steps are done with BWA-MEM, and the correct reference genome needs to be defined by the user. The correct restriction site and the bin size for hicBuildMatrix need to be defined too. The matrix is corrected with the default parameters of -1.5 and 5; change these if necessary. Furthermore, the correct region and/or chromosome needs to be defined for plotting the matrix, TADs or PCA.</p>
<h3 id="from-scratch-fastq-files-individual">From scratch (FASTQ files) individual</h3>
<p>These workflows expect collections of FASTQ files as input. The first collection must contain all forward-read FASTQ files and the second all reverse-read FASTQ files. Please make sure that the FASTQ files appear in the same order in both collections. The order matters because it is used to associate each forward-read file with its reverse-read file.</p>
<p>The following workflows are provided:</p>
<ul>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-a-contact-matrix">From scratch to a contact matrix</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-pca">From scratch to PCA</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-tad">From scratch to TAD</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-pca-and-plotting">From scratch to PCA and plotting</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-tad-and-plotting">From scratch to TAD and plotting</a></li>
</ul>
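<p>The ordering requirement for the two FASTQ collections can be checked before uploading. The following is a minimal Python sketch, not part of Galaxy or HiCExplorer. It assumes a hypothetical naming convention in which the two mates of a pair differ only by a tag such as <code>_R1</code>/<code>_R2</code> or <code>_1</code>/<code>_2</code> directly before the file extension; the function names are illustrative.</p>
<pre>
```python
import re

# Assumed (hypothetical) naming convention: mates differ only by "_R1"/"_R2"
# or "_1"/"_2" directly before the extension. Adjust to your own file names.
MATE_TAG = re.compile(r"_R?[12](?=\.f(ast)?q(\.gz)?$)")


def sample_key(filename):
    """Return the file name with the mate tag removed."""
    return MATE_TAG.sub("", filename)


def mismatched_pairs(forward, reverse):
    """Return the positions at which the two collections do not line up."""
    if len(forward) != len(reverse):
        raise ValueError("the two collections differ in length")
    return [
        index
        for index, (fwd, rev) in enumerate(zip(forward, reverse))
        if sample_key(fwd) != sample_key(rev)
    ]
```
</pre>
<p>An empty result from <code>mismatched_pairs</code> means that every forward file sits at the same position as its reverse counterpart under the assumed naming scheme.</p>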
<h3 id="from-scratch-fastq-files-and-summing-up-replicates">From scratch (FASTQ files) and summing up replicates</h3>
<p>These workflows take collections of FASTQ files for the forward and reverse strands as input. A contact matrix is built for each pair, and all resulting contact matrices are summed into one contact matrix. Use these workflows if you want to combine replicates to increase the statistical power of your contact matrix and have checked that the replicates are correct.</p>
<ul>
<li><a href="https://usegalaxy.eu/u/joachim-wolff/w/workflow-hicexplorer-hicsummatrix">From scratch to a contact matrix (summing up replicates)</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-pca-summing-up-replicates">From scratch to PCA (summing up replicates)</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-tads-summing-up-replicates">From scratch to TAD (summing up replicates)</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-pca-and-plot-summing-up-replicates">From scratch to PCA and plot (summing up replicates)</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-tads-and-plot-summing-up-replicates">From scratch to TAD and plot (summing up replicates)</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/from-scratch-to-tads-pca-and-plot-summing-up-replicates">From scratch to TADs, PCA and plot (summing up replicates)</a></li>
</ul>
<h3 id="contact-matrix-as-a-basis">Contact matrix as a basis</h3>
<p>Use the following workflows if you have already created a contact matrix.</p>
<ul>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/a--b-comparments">Plot Pearson matrix and PC1 / PC2</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/plot-tads">Plot TADs</a></li>
<li><a href="https://hicexplorer.usegalaxy.eu/u/joachim-wolff/w/plot-tads-and-pc">Plot TADs and PC</a></li>
</ul>
<h2 id="known-pitfalls">Known pitfalls</h2>
<p>Preprocessed SAM/BAM files:
To build the contact matrix, the SAM/BAM files need to be generated using the --reorder option of bowtie2 / hisat2 so that the reads appear in exactly the same order as in the FASTQ files. For the same reason, the SAM/BAM files should not be sorted. Please make sure your preprocessed SAM/BAM files fulfil these requirements; otherwise the creation of a contact matrix with hicBuildMatrix will fail.</p>
<p>We recommend using BWA-MEM with the Hi-C specific parameters, as shown in our tutorials.</p>
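<p>The two requirements on preprocessed files can be sanity-checked on uncompressed SAM text before building a matrix. The following is a minimal Python sketch using only the standard library; it is not the check that hicBuildMatrix performs, and the function names are illustrative. It reads the sort order declared in the <code>@HD</code> header line (a heuristic, since the header may be missing or wrong) and compares the order in which read names first appear in the two mate files.</p>
<pre>
```python
def sort_order(sam_lines):
    """Return the SO value of the @HD header line, or None if it is absent."""
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header section is over
        if line.startswith("@HD"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


def read_names(sam_lines):
    """Return read names in order of first appearance, skipping header lines."""
    names = (
        line.split("\t", 1)[0]
        for line in sam_lines
        if not line.startswith("@")
    )
    # dict.fromkeys keeps the first occurrence of each name, so extra
    # alignment records for the same read do not disturb the comparison.
    return list(dict.fromkeys(names))


def same_read_order(mate1_lines, mate2_lines):
    """True if both mate files list their reads in the same order."""
    return read_names(mate1_lines) == read_names(mate2_lines)
```
</pre>
<p>A declared sort order of <code>coordinate</code> or <code>queryname</code>, or a <code>False</code> result from <code>same_read_order</code>, suggests that the files should be regenerated from the FASTQ files. For large files, pass open file handles rather than lists; note that <code>read_names</code> still holds all read names in memory.</p>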
<h2 id="citation">Citation</h2>
<p>Joachim Wolff, Vivek Bhardwaj, Stephan Nothjunge, Gautier Richard, Gina Renschler, Ralf Gilsbach, Thomas Manke, Rolf Backofen, Fidel Ramírez, Björn A Grüning.
<strong>“Galaxy HiCExplorer: a web server for reproducible Hi-C data analysis, quality control and visualization”, Nucleic Acids Research</strong>, Volume 46, Issue W1, 2 July 2018, Pages W11–W16, doi: https://doi.org/10.1093/nar/gky504</p>
<h2>Our Data Policy</h2>
<style>
th, td {
border-bottom: 1px solid #ddd;
padding: 10px;
}
th {
background-color: #f2f2f2;
}
</style>
<p><strong>Galaxy Australia</strong> is designed for data analysis and not for
long-term storage of data.</p>
<br />
<p>Use Galaxy Australia to host your input data only for the period required for
analysis. Also, remember to export/download your analysed data as this also
will not be stored beyond the limits set out below.</p>
<p>It is <u>your responsibility</u> as a user of the system to manage your own data
and routinely remove both your input and output data from this community system
to enable capacity for other users. Any data that is not removed by you within
a defined time period (see below) will be automatically and permanently deleted.
</p>
<p>Galaxy Australia maintains a collection of frequently used reference genomes
and annotation datasets. The inclusion of additional reference genomes and/or
annotation data on the system for community use can be <a href="https://docs.google.com/forms/d/e/1FAIpQLSdXuarvkzFA5kRqoCfO8uiUGAB0PplfR4yvAfpCPSpdMcehmA/viewform">
requested</a>. Reference and annotation data hosted by Galaxy Australia
do not count towards your quota, and using them is the best way to access
reference/annotation data.</p>
<h3>Data storage quotas and retention periods</h3><br />
<center>
<table>
<tr>
<th></th>
<th>Storage quota</th>
<th>Data retention period</th>
</tr>
<tr>
<td><strong>Registered Australian researchers</strong></td><td>600GB</td><td>1 year (52 weeks)</td>
</tr>
<tr>
<td><strong>Other registered users</strong></td><td>100GB</td><td>1 year (52 weeks)</td>
</tr>
<tr>
<td><strong>Unregistered users</strong></td><td>5GB</td><td>NA</td>
</tr>
</table>
</center>
<h3>Registered Users</h3>
<ul>
<li><u>Registered Australian Researchers</u> are identified by a registration
email address from:
<ul>
<li>@domain.edu.au</li>
<li>@domain.org.au</li>
<li>@domain.edu - only in the case of known Australian Universities
not on the .au domain</li>
</ul>
</li>
<li><u>Other Registered Users</u> are those registering with an email address
from any other domain</li>
<li>Please contact us ([email protected]) if your institution does not
conform to this rule but you believe it should be classed as
performing publicly funded Australian research</li>
</ul>
<p>Registered users from Australian publicly funded research organisations have
a 600GB data storage quota. Other registered users have a 100GB quota. An
increased data storage quota can be <a href="https://docs.google.com/forms/d/e/1FAIpQLSeiw6ajmkezLCwbXc3OFQEU3Ai9hGnBd967u9YbQ8ANPgvatA/viewform">
requested</a> for a limited time period in special cases.</p>
<p>Registered Users' data (i.e. datasets, histories) will be available on the
system for 1 year (52 weeks) from the point of upload or creation. Within this period,
any data marked by a Registered User as "deleted" will be permanently removed
within 5 days. If a Registered User "purges" a dataset, it will be removed
immediately and permanently.</p>
<h3>Unregistered Users</h3>
<p>Processed data will only be accessible during one browser session, using a
browser cookie to identify an Unregistered User's data. This cookie is not used
for any other purpose (e.g. tracking or analytics).</p>
<h3>What does it mean when I go over quota?</h3>
<p>Your data and histories are still accessible, but you will not be able to run new
jobs or import more data. If you expect to exceed your quota, either
<a href="https://docs.google.com/forms/d/e/1FAIpQLSeiw6ajmkezLCwbXc3OFQEU3Ai9hGnBd967u9YbQ8ANPgvatA/viewform">
request</a> more analysis storage or download and delete old, unwanted data.</p>
<style>
.column {
float: left;
width: 25%;
}
.big_column {
float: left;
width: 100%
}
/* Clear floats after the columns */
.row:after {
content: "";
display: table;
clear: both;
}
/* Responsive layout - makes a two column-layout instead of four columns */
@media screen and (max-width: 1200px) {
.column {
width: 50%;
}
}
/* Responsive layout - makes the two columns stack on top of each other instead of next to each other */
@media screen and (max-width: 600px) {
.column {
width: 100%;
}
}
</style>
<div class="row" width="90%">
<div class="big_column">
<!--<iframe width="100%" height="200px" src="https://stats.genome.edu.au/d-solo/-D4mtTAik/for_embedding?refresh=10s&orgId=1&panelId=2" frameborder="0" ></iframe> -->
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=10s&panelId=2" width="100%" height="200px" frameborder="0"></iframe>
</div>
</div>
<center>
<div class="row" width="90%">
<div class="column">
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=1d&panelId=19" width="95%" height="110px" frameborder="0"></iframe>
</div>
<div class="column">
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=1d&panelId=21" width="95%" height="110px" frameborder="0"></iframe>
</div>
<div class="column">
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=1d&panelId=23" width="95%" height="110px" frameborder="0"></iframe>
</div>
<div class="column">
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=1d&panelId=25" width="95%" height="110px" frameborder="0"></iframe>
</div>
</div>
</center>
<div class="row">
<section class="section-content">
<div class="col-md-12">
</div>
</section>
</div>
</div>
</div>
</div>
</div>
<footer class="navbar-default">
<div class="container">
<div class="row">
<div class="col-lg-12" style="text-align:center">
<p>UseGalaxy.org.au is maintained largely by the <a href="/people">Australian Galaxy Team</a> including staff from QCIF, UQ-RCC and Melbourne Bioinformatics.
All content on this site is available under <a href="https://creativecommons.org/share-your-work/public-domain/cc0/" target="_blank">CC0-1.0</a>, unless otherwise specified.
Galaxy Australia is currently running Galaxy version 21.09 (September 2021)</p>
</div>
</div>
<div class="row">
<div class="col-lg-12" style="text-align:center">
<ul class="contact-info">
<li><i class="fa fa-envelope"></i><a href="mailto:[email protected]">[email protected]</a></li>
<li><i class="fa fa-github"></i><a href="https://github.com/usegalaxy-au" target="_blank">usegalaxy-au</a></li>
<li><i class="fa fa-twitter"></i><a href="https://twitter.com/galaxyaustralia" target="_blank">galaxyaustralia</a></li>
<!-- <li><i class="fa fa-rss"></i>Subscribe <a href="/feed.xml">via RSS (UseGalaxy.eu Feed)</a></li> -->
</ul>
</div>
</div>
</div>
</footer>
</body>
</html>