<!DOCTYPE html>
<html lang="en">
<head>
<title>Galaxy Australia</title>
<meta property="og:title" content="" />
<meta property="og:description" content="" />
<meta property="og:image" content="/assets/media/galaxy-eu-logo.512.png" />
<meta name="description" content="The Australian Galaxy Instance">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<link rel="stylesheet" href="/assets/css/bootstrap.min.css">
<link rel="stylesheet" href="/assets/css/main.css">
<link rel="canonical" href="https://usegalaxy-au.github.io/index-metagenomics.html">
<link rel="shortcut icon" href="/assets/media/galaxy-eu-logo.64.png" type="image/x-icon" />
<link rel="alternate" type="application/rss+xml" title="Galaxy Australia" href="/feed.xml">
<link href="/assets/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-wvfXpqpZZVQGK6TAh5PVlGOfQNHSoD2xbE+QkPxCAFlNEevoEH3Sl0sibVcOQVnN" crossorigin="anonymous">
<script src="/assets/js/jquery-3.2.1.slim.min.js" integrity="sha256-k2WSCIexGzOj3Euiig+TlR8gA0EmPjuc79OEeY5L45g=" crossorigin="anonymous"></script>
<script src="/assets/js/bootstrap.min.js" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script>
</head>
<body>
<div id="wrap">
<div id="main">
<div class="container" id="maincontainer">
<div class="home">
<p><br />
<img src="/assets/media/asaim_logo.png" height="100px" alt="ASaiM logo" /></p>
<p>Welcome to <strong>Galaxy Metagenomics</strong> (<a href="https://asaim.readthedocs.io/en/latest/" target="_blank">ASaiM</a>) – a web server to process, analyse and visualize metagenomic and, more generally, microbiota data.</p>
<ol id="markdown-toc">
<li><a href="#get-started" id="markdown-toc-get-started">Get started</a></li>
<li><a href="#tools" id="markdown-toc-tools">Tools</a></li>
<li><a href="#tutorials" id="markdown-toc-tutorials">Tutorials</a></li>
<li><a href="#workflows" id="markdown-toc-workflows">Workflows</a> <ol>
<li><a href="#analysis-of-raw-metagenomic-or-metatranscriptomic-shotgun-data" id="markdown-toc-analysis-of-raw-metagenomic-or-metatranscriptomic-shotgun-data">Analysis of raw metagenomic or metatranscriptomic shotgun data</a></li>
<li><a href="#assembly-of-metagenomic-data" id="markdown-toc-assembly-of-metagenomic-data">Assembly of metagenomic data</a></li>
<li><a href="#analysis-of-metataxonomic-data" id="markdown-toc-analysis-of-metataxonomic-data">Analysis of metataxonomic data</a></li>
<li><a href="#running-as-in-ebi-metagenomics" id="markdown-toc-running-as-in-ebi-metagenomics">Running as in EBI metagenomics</a></li>
</ol>
</li>
<li><a href="#references" id="markdown-toc-references">References</a></li>
</ol>
<h1 id="get-started">Get started</h1>
<p>Are you new to Galaxy, or returning after a long time, and looking for help to get started? Take <a href="https://metagenomics.usegalaxy.eu/tours/core.galaxy_ui" target="_blank">a guided tour</a> through Galaxy’s user interface.</p>
<p>Want to learn about metagenomics analyses? Check our <a href="#tutorials">tutorials</a> or take one of our guided tours:</p>
<ul>
<li><a href="https://metagenomics.usegalaxy.eu/tours/metagenomics-general-tutorial-amplicon" target="_blank">Introduction of amplicon data analyses using mothur tool suite</a></li>
<li><a href="https://metagenomics.usegalaxy.eu/tours/metagenomics-general-tutorial-shotgun" target="_blank">Introduction to shotgun metagenomics data analyses</a></li>
<li><a href="https://metagenomics.usegalaxy.eu/tours/mothur-miseq-sop" target="_blank">16S Microbial Analysis with Mothur MiSeq SOP</a></li>
</ul>
<p>Also check the standard but customizable <a href="#workflows">workflows</a> available here.</p>
<h1 id="tools">Tools</h1>
<p>More than <a href="https://asaim.readthedocs.io/en/latest/tools/index.html" target="_blank">200 tools</a> are integrated in this custom Galaxy instance. They were chosen for their usefulness in analysing microbiota data:</p>
<ul>
<li><a href="https://asaim.readthedocs.io/en/latest/tools/file_meta_tools.html" target="_blank"><strong>General tools</strong></a>
<ul>
<li><strong>Data retrieval</strong>: EBISearch, ENASearch, SRA Tools</li>
<li><strong>BAM/SAM file manipulation</strong>: SAM tools</li>
<li><strong>BIOM file manipulation</strong>: BIOM-Format tools</li>
</ul>
</li>
<li><a href="https://asaim.readthedocs.io/en/latest/tools/genomics.html" target="_blank"><strong>Genomics tools</strong></a>
<ul>
<li><strong>Quality control</strong>: FastQC, PRINSEQ, Trim Galore! , Trimmomatic, MultiQC</li>
<li><strong>Clustering</strong>: CD-Hit</li>
<li><strong>Sorting and prediction</strong>: SortMeRNA, FragGeneScan</li>
<li><strong>Mapping</strong>: BWA, Bowtie</li>
<li><strong>Similarity search</strong>: NCBI Blast+, Diamond</li>
<li><strong>Alignment</strong>: HMMER3</li>
</ul>
</li>
<li><a href="https://asaim.readthedocs.io/en/latest/tools/microbiota.html" target="_blank"><strong>Microbiota dedicated tools</strong></a>
<ul>
<li><strong>Metagenomics data manipulation</strong>: VSEARCH, Nonpareil</li>
<li><strong>Assembly</strong>: MEGAHIT, metaSPAdes, metaQUAST, VALET</li>
<li><strong>Metataxonomic sequence analysis</strong>: Mothur, QIIME</li>
<li><strong>Taxonomy assignation on WGS sequences</strong>: MetaPhlAn2, Format MetaPhlan2, Kraken</li>
<li><strong>Metabolism assignation</strong>: HUMAnN2, Group HUMAnN2 to GO slim terms, Compare HUMAnN2 outputs, PICRUST, InterProScan</li>
<li><strong>Combination of functional and taxonomic results</strong></li>
<li><strong>Visualization</strong>: Export2graphlan, GraPhlAn, KRONA</li>
</ul>
</li>
</ul>
<h1 id="tutorials">Tutorials</h1>
<p>We are passionate about training, so we work in close collaboration with the <a href="https://galaxyproject.org/teach/gtn/" target="_blank">Galaxy Training Network (GTN)</a> to develop training materials for Galaxy-based data analyses <a class="citation" href="#batut2017community">(Batut <i>et al.</i>, 2017)</a>. These materials, hosted on the GTN GitHub repository, are available online at <a href="https://training.galaxyproject.org" target="_blank">https://training.galaxyproject.org</a>.</p>
<p>We have developed <a href="https://galaxyproject.github.io/training-material/topics/metagenomics/" target="_blank">several tutorials</a>, and more will come:</p>
<ul>
<li>
<p><a href="https://galaxyproject.github.io/training-material/topics/metagenomics/tutorials/general-tutorial/tutorial.html" target="_blank">Analyses of metagenomics data - The global picture</a></p>
<p>This tutorial introduces amplicon and shotgun data analyses, the general principles behind them, and the differences between them.</p>
</li>
<li>
<p><a href="https://galaxyproject.github.io/training-material/topics/metagenomics/tutorials/mothur-miseq-sop/tutorial.html" target="_blank">16S Microbial Analysis with Mothur</a></p>
<p>In this tutorial, the Standard Operating Procedure (SOP) for MiSeq data, developed by the creators of the Mothur software package, is performed within Galaxy.</p>
</li>
</ul>
<h1 id="workflows">Workflows</h1>
<p>To orchestrate tools and help users with their analyses, several <a href="https://asaim.readthedocs.io/en/latest/workflows.html" target="_blank">workflows</a> are available. They formally orchestrate tools in a defined order and with defined parameters, but they are customizable (tools, order, parameters).</p>
<p>The workflows are available in the <a href="https://metagenomics.usegalaxy.eu/workflows/list_published">Shared Workflows</a>, with the label “<strong><em>asaim</em></strong>”.</p>
<h2 id="analysis-of-raw-metagenomic-or-metatranscriptomic-shotgun-data">Analysis of raw metagenomic or metatranscriptomic shotgun data</h2>
<p>The workflow quickly produces, from raw metagenomic or metatranscriptomic shotgun data, accurate and precise taxonomic assignations, extensive functional results and taxonomically related metabolism information.</p>
<p><img src="https://asaim.readthedocs.io/en/latest/_images/main_workflow.png" alt="" /></p>
<p>This workflow consists of:</p>
<ol>
<li>Processing with quality control/trimming (<strong>FastQC</strong> and <strong>Trim Galore!</strong>) and dereplication (<strong>VSearch</strong>)</li>
<li>Taxonomic analyses with assignation (<strong>MetaPhlAn2</strong>) and visualization (<strong>KRONA</strong>, <strong>GraPhlAn</strong>)</li>
<li>Functional analyses with metabolic assignation and pathway reconstruction (<strong>HUMAnN2</strong>)</li>
<li>Functional and taxonomic combination with developed tools combining HUMAnN2 and MetaPhlAn2 outputs</li>
</ol>
<p>It is available in 4 versions, depending on the input:</p>
<ol>
<li>Simple files: <a href="https://metagenomics.usegalaxy.eu/u/berenice/w/asaim-shotgun-workflow">Single-end</a> or <a href="https://metagenomics.usegalaxy.eu/u/berenice/w/asaim---shotgun-workflow-for-paired-end-data">paired-end</a></li>
<li>Collection input files: <a href="https://metagenomics.usegalaxy.eu/u/berenice/w/asaim-shotgun-workflow-se-collection">Single-end</a> or <a href="https://metagenomics.usegalaxy.eu/u/berenice/w/asaim---shotgun-workflow-for-paired-end-data-collection">paired-end</a></li>
</ol>
<h2 id="assembly-of-metagenomic-data">Assembly of metagenomic data</h2>
<p>To reconstruct genomes or to get longer sequences for further analysis, microbiota data needs to be assembled, using the recently developed metagenomics assemblers.</p>
<p>To help in this task, two workflows have been developed using two different well-performing assemblers:</p>
<ul>
<li>
<p><a href="https://metagenomics.usegalaxy.eu/u/berenice/w/asaim-metagenomic-assembly-with-megahit">MEGAHIT</a></p>
<p>It is currently the most computationally efficient assembler: it has the lowest memory and time consumption <a class="citation" href="#van2017assembling">(van der Walt <i>et al.</i>, 2017; Awad <i>et al.</i>, 2017; Sczyrba <i>et al.</i>, 2017)</a>. It produced some of the best assemblies (irrespective of sequencing coverage) with the fewest structural errors <a class="citation" href="#olson2017metagenomic">(Olson <i>et al.</i>, 2017)</a> and outperforms other assemblers in recovering the genomes of closely related strains <a class="citation" href="#awad2017evaluating">(Awad <i>et al.</i>, 2017)</a>, but it is biased towards relatively low-coverage genomes, leading to suboptimal assembly of the genomes of highly abundant community members in very large datasets <a class="citation" href="#vollmers2017comparing">(Vollmers <i>et al.</i>, 2017)</a>.</p>
</li>
<li>
<p><a href="https://metagenomics.usegalaxy.eu/u/berenice/w/asaim-metagenomic-assembly-with-metaspades">MetaSPAdes</a></p>
<p>It is particularly well suited to high-coverage metagenomes <a class="citation" href="#van2017assembling">(van der Walt <i>et al.</i>, 2017)</a>, with the best contig metrics <a class="citation" href="#greenwald2017utilization">(Greenwald <i>et al.</i>, 2017)</a>, and produces few under-collapsed/over-collapsed repeats <a class="citation" href="#olson2017metagenomic">(Olson <i>et al.</i>, 2017)</a>.</p>
</li>
</ul>
<p>Both workflows consist of:</p>
<ol>
<li>Processing with quality control/trimming (<strong>FastQC</strong> and <strong>Trim Galore!</strong>)</li>
<li>Assembly with either <strong>MEGAHIT</strong> or <strong>MetaSPAdes</strong></li>
<li>Estimation of the assembly quality statistics with <strong>MetaQUAST</strong></li>
<li>Identification of potential assembly error signature with <strong>VALET</strong></li>
<li>Determination of percentage of unmapped reads with <strong>Bowtie2</strong> combined with <strong>MultiQC</strong> to aggregate the results.</li>
</ol>
<h2 id="analysis-of-metataxonomic-data">Analysis of metataxonomic data</h2>
<p>To analyze amplicon data, the <strong>Mothur</strong> and <strong>QIIME</strong> tool suites are available here. We implemented the workflows described in the tutorials on the Mothur and QIIME websites, both as examples of amplicon data analyses and as support for the training material. These workflows, like any workflow available here, can be adapted for a specific analysis or used as subworkflows.</p>
<h2 id="running-as-in-ebi-metagenomics">Running as in EBI metagenomics</h2>
<p>The tools used in the EBI Metagenomics pipeline are also available here and can be run as a <a href="https://metagenomics.usegalaxy.eu/u/berenice/w/asaim-ebi-metagenomics-workflow-30" target="_blank">workflow</a> with the same steps as the <a href="https://www.ebi.ac.uk/metagenomics/pipelines/3.0" target="_blank">EBI Metagenomics pipeline (3.0)</a>.</p>
<p><img src="https://asaim.readthedocs.io/en/latest/_images/ebi_metagenomics_workflow.png" alt="" /></p>
<p>However, the parameters must be adjusted by the user as we could not find them in the EBI Metagenomics documentation.</p>
<h1 id="references">References</h1>
<ol class="bibliography"><li><span id="index-metagenomics-awad2017evaluating">Awad,S. <i>et al.</i> (2017) Evaluating Metagenome Assembly on a Simple Defined Community with Many Strain Variants. <i>bioRxiv</i>, 155358.</span></li>
<li><span id="index-metagenomics-batut2017community">Batut,B. <i>et al.</i> (2017) Community-driven data analysis training for biology. <i>bioRxiv</i>, 225680.</span></li>
<li><span id="index-metagenomics-greenwald2017utilization">Greenwald,W.W. <i>et al.</i> (2017) Utilization of defined microbial communities enables effective evaluation of meta-genomic assemblies. <i>BMC genomics</i>, <b>18</b>, 296.</span></li>
<li><span id="index-metagenomics-olson2017metagenomic">Olson,N.D. <i>et al.</i> (2017) Metagenomic assembly through the lens of validation: recent advances in assessing and improving the quality of genomes assembled from metagenomes. <i>Briefings in Bioinformatics</i>, bbx098.</span></li>
<li><span id="index-metagenomics-sczyrba2017critical">Sczyrba,A. <i>et al.</i> (2017) Critical Assessment of Metagenome Interpretation: a benchmark of computational metagenomics software. <i>bioRxiv</i>, 099127.</span></li>
<li><span id="index-metagenomics-van2017assembling">Walt,A.J. van der <i>et al.</i> (2017) Assembling metagenomes, one community at a time. <i>bioRxiv</i>, 120154.</span></li>
<li><span id="index-metagenomics-vollmers2017comparing">Vollmers,J. <i>et al.</i> (2017) Comparing and Evaluating Metagenome Assembly Tools from a Microbiologist’s Perspective-Not Only Size Matters! <i>PloS one</i>, <b>12</b>, e0169662.</span></li></ol>
<h2>Our Data Policy</h2>
<style>
th, td {
border-bottom: 1px solid #ddd;
padding: 10px;
}
th {
background-color: #f2f2f2;
}
</style>
<p><strong>Galaxy Australia</strong> is designed for data analysis and not for
long-term storage of data.</p>
<br />
<p>Use Galaxy Australia to host your input data only for the period required for
analysis. Also, remember to export/download your analysed data, as it will also
not be stored beyond the limits set out below.</p>
<p>It is <u>your responsibility</u> as a user of the system to manage your own data
and routinely remove both your input and output data from this community system
to enable capacity for other users. Any data that is not removed by you within
a defined time period (see below) will be automatically and permanently deleted.
</p>
<p>Galaxy Australia maintains a collection of frequently used reference genomes
and annotation datasets. The inclusion of additional reference genomes and/or
annotation data on the system for community use can be <a href="https://docs.google.com/forms/d/e/1FAIpQLSdXuarvkzFA5kRqoCfO8uiUGAB0PplfR4yvAfpCPSpdMcehmA/viewform">
requested</a>. Reference and annotation data hosted by Galaxy Australia do not
count towards your quota, and using them is the best way to access
reference/annotation data.</p>
<h3>Data storage quotas and retention periods</h3><br />
<center>
<table>
<tr>
<th>User category</th>
<th>Storage quota</th>
<th>Data retention period</th>
</tr>
<tr>
<td><strong>Registered Australian researchers</strong></td><td>600GB</td><td>1 year (52 weeks)</td>
</tr>
<tr>
<td><strong>Other registered users</strong></td><td>100GB</td><td>1 year (52 weeks)</td>
</tr>
<tr>
<td><strong>Unregistered users</strong></td><td>5GB</td><td>NA</td>
</tr>
</table>
</center>
<h3>Registered Users</h3>
<ul>
<li><u>Registered Australian Researchers</u> are identified by a registration
email address from:
<ul>
<li>@domain.edu.au</li>
<li>@domain.org.au</li>
<li>@domain.edu - only in the case of known Australian Universities
not on the .au domain</li>
</ul>
</li>
<li><u>Other Registered Users</u> are those whose registration email address
comes from any other domain</li>
<li>Please contact us ([email protected]) if your institution does not
conform to this rule but you believe it should be classed as
performing publicly funded Australian research</li>
</ul>
<p>Registered users from Australian publicly funded research organisations have
a 600GB data storage quota. Other registered users have a 100GB quota. An
increased data storage quota can be <a href="https://docs.google.com/forms/d/e/1FAIpQLSeiw6ajmkezLCwbXc3OFQEU3Ai9hGnBd967u9YbQ8ANPgvatA/viewform">
requested</a> for a limited time period in special cases.</p>
<p>Registered Users' data (i.e. datasets, histories) will be available on the
system for 1 year (52 weeks) from the point of upload or creation. Within this period,
any data marked by a Registered User as "deleted" will be permanently removed
within 5 days. If a Registered User "purges" a dataset, it will be removed
immediately and permanently.</p>
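<p>As an illustration only, the quota and retention rules above can be sketched in Python. This is not Galaxy Australia's actual implementation: the function names, the category labels and the (empty) allowlist of Australian <code>.edu</code> domains are hypothetical, and "1 year" is taken to mean exactly 52 weeks.</p>

```python
from datetime import date, timedelta

# Hypothetical allowlist: known Australian universities not on the .au domain.
KNOWN_AU_EDU_DOMAINS = frozenset()

# Storage quotas in GB, taken from the table above.
QUOTA_GB = {"australian_researcher": 600, "other_registered": 100, "unregistered": 5}

RETENTION = timedelta(weeks=52)    # "1 year (52 weeks)" from upload or creation
DELETED_GRACE = timedelta(days=5)  # data marked "deleted" is removed within 5 days


def classify_user(email=None, known_au_edu=KNOWN_AU_EDU_DOMAINS):
    """Return the user category for a registration email (None means unregistered)."""
    if not email or "@" not in email:
        return "unregistered"
    domain = email.rsplit("@", 1)[1].strip().lower()
    if domain.endswith((".edu.au", ".org.au")) or domain in known_au_edu:
        return "australian_researcher"
    return "other_registered"


def quota_gb(email=None):
    """Return the storage quota in GB for the given registration email."""
    return QUOTA_GB[classify_user(email)]


def removal_date(created, marked_deleted=None):
    """Return the latest date a registered user's dataset is still kept.

    A dataset expires 52 weeks after creation; if it was marked as deleted,
    it goes at most 5 days after that, whichever comes first.
    """
    expiry = created + RETENTION
    if marked_deleted is None:
        return expiry
    return min(expiry, marked_deleted + DELETED_GRACE)
```

<p>Under these assumptions, <code>quota_gb("alice@example.edu.au")</code> returns 600, while an address on any other domain gets 100. Purged datasets are not modelled, since they are removed immediately.</p>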
<h3>Unregistered Users</h3>
<p>Processed data will only be accessible during one browser session, using a
browser cookie to identify an Unregistered User's data. This cookie is not used
for any other purposes (e.g. tracking or analytics).</p>
<h3>What does it mean when I go over quota?</h3>
<p>Your data and histories are still accessible, but you will not be able to run new
jobs or import more data. If you expect to exceed your quota, consider
<a href="https://docs.google.com/forms/d/e/1FAIpQLSeiw6ajmkezLCwbXc3OFQEU3Ai9hGnBd967u9YbQ8ANPgvatA/viewform">
requesting</a> more analysis storage in advance, or download and delete old, unwanted data.</p>
<style>
.column {
float: left;
width: 25%;
}
.big_column {
float: left;
width: 100%
}
/* Clear floats after the columns */
.row:after {
content: "";
display: table;
clear: both;
}
/* Responsive layout - makes a two column-layout instead of four columns */
@media screen and (max-width: 1200px) {
.column {
width: 50%;
}
}
/* Responsive layout - makes the two columns stack on top of each other instead of next to each other */
@media screen and (max-width: 600px) {
.column {
width: 100%;
}
}
</style>
<div class="row" width="90%">
<div class="big_column">
<!--<iframe width="100%" height="200px" src="https://stats.genome.edu.au/d-solo/-D4mtTAik/for_embedding?refresh=10s&orgId=1&panelId=2" frameborder="0" ></iframe> -->
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=10s&panelId=2" width="100%" height="200px" frameborder="0"></iframe>
</div>
</div>
<center>
<div class="row" width="90%">
<div class="column">
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=1d&panelId=19" width="95%" height="110px" frameborder="0"></iframe>
</div>
<div class="column">
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=1d&panelId=21" width="95%" height="110px" frameborder="0"></iframe>
</div>
<div class="column">
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=1d&panelId=23" width="95%" height="110px" frameborder="0"></iframe>
</div>
<div class="column">
<iframe src="https://stats.usegalaxy.org.au/d-solo/-D4mtTAik/for_embedding?orgId=1&refresh=1d&panelId=25" width="95%" height="110px" frameborder="0"></iframe>
</div>
</div>
</center>
<div class="row">
<section class="section-content">
<div class="col-md-12">
</div>
</section>
</div>
</div>
</div>
</div>
</div>
<footer class="navbar-default">
<div class="container">
<div class="row">
<div class="col-lg-12" style="text-align:center">
<p>UseGalaxy.org.au is maintained largely by the <a href="/people">Australian Galaxy Team</a> including staff from QCIF, UQ-RCC and Melbourne Bioinformatics.
All content on this site is available under <a href="https://creativecommons.org/share-your-work/public-domain/cc0/" target="_blank">CC0-1.0</a>, unless otherwise specified.
Galaxy Australia is currently running Galaxy version 21.09 (September 2021)</p>
</div>
</div>
<div class="row">
<div class="col-lg-12" style="text-align:center">
<ul class="contact-info">
<li><i class="fa fa-envelope"></i><a href="mailto:[email protected]">[email protected]</a></li>
<li><i class="fa fa-github"></i><a href="https://github.com/usegalaxy-au" target="_blank">usegalaxy-au</a></li>
<li><i class="fa fa-twitter"></i><a href="https://twitter.com/galaxyaustralia" target="_blank">galaxyaustralia</a></li>
<!-- <li><i class="fa fa-rss"></i>Subscribe <a href="/feed.xml">via RSS (UseGalaxy.eu Feed)</a></li> -->
</ul>
</div>
</div>
</div>
</footer>
</body>
</html>