.. |PipeCraft2_logo| image:: _static/PipeCraft2_icon_v2.png
:width: 50
:target: https://github.com/pipecraft2/pipecraft
.. |PipeCraft_sticker1sideways| image:: _static/PipeCraft_sticker1sideways.png
:width: 350
:class: right
:target: https://github.com/pipecraft2/pipecraft
.. |main_interface| image:: _static/main_interface.png
:width: 2000
.. raw:: html
.. role:: red
.. meta::
:description lang=en:
PipeCraft manual. PipeCraft is a Graphical User Interface software for metabarcoding data analyses
============================
PipeCraft2 |PipeCraft2_logo|
============================
**PipeCraft2** is a Graphical User Interface (GUI) software that implements :ref:`various popular tools ` for **metabarcoding** data analyses.
It provides various :ref:`ready-to-run (pre-defined) pipelines ` as well as an option to run a
variety of :ref:`individual steps ` outside of a full pipeline.
.. _interface:
*(click on the image for enlargement)*
|main_interface|
| Software settings for pipeline processes contain key options for metabarcoding sequence data analyses, but all options of any implemented program may be accessed via :ref:`PipeCraft console (command line) `.
| Default settings in the panels represent commonly used options for amplicon sequence data analyses, which may be tailored according to user experience or needs.
Custom-designed pipeline settings can be saved, so the exact same pipeline can easily be re-run on other sequencing data (and, for reproducibility, the saved settings may be included as supplementary material in a manuscript).
PipeCraft enables executing the :ref:`full pipeline ` (the user specifies the input, and the output will be e.g. an OTU/ASV table with taxonomic annotations of the generated features),
but also supports :ref:`single-step mode (Quick Tools panel) ` where analyses may be performed in a step-by-step manner *(e.g. perform quality filtering, then examine the output and decide whether to adjust the quality filtering options or
to proceed with the next step, e.g. chimera filtering)*.
Glossary
========
List of terms that you may encounter in this user guide.
=========================== ===================================
**working directory** | the directory (folder) that contains the files for the analyses.
| The outputs will be written into this directory
**paired-end data** | obtained by sequencing two ends of the same DNA fragment,
| which results in read 1 (R1) and read 2 (R2) files per library or per sample.
| Note that PipeCraft expects that :red:`read 1 file contains the string R1`
| and :red:`read 2 contains R2`
| (not e.g. my_sample_L001_1.fastq / my_sample_L001_2.fastq)
**single-end data** | only one sequencing file per library or per sample.
| Herein, this may also mean assembled paired-end data.
**demultiplexed data** | sequences are sorted into separate files, representing individual samples
**multiplexed data** | file(s) that represent a pool of sequences from different samples
**read/sequence** | DNA sequence; herein, reads and sequences are used interchangeably
=========================== ===================================
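
The file naming rule for paired-end data can be checked before starting an analysis.
The following is a minimal sketch, not PipeCraft's own code: the function name ``classify_reads`` is hypothetical,
and it assumes that a plain substring test for ``R1`` / ``R2`` is a reasonable approximation of the rule stated in the glossary.

.. code-block:: python

   def classify_reads(filenames):
       """Sort file names into read 1, read 2 and unrecognized lists.

       Assumption: a file counts as read 1 if its name contains the
       string "R1" and as read 2 if it contains "R2" (simple substring test).
       """
       r1, r2, unrecognized = [], [], []
       for name in filenames:
           if "R1" in name:
               r1.append(name)
           elif "R2" in name:
               r2.append(name)
           else:
               # e.g. my_sample_L001_1.fastq: no R1/R2 string in the name
               unrecognized.append(name)
       return r1, r2, unrecognized


   if __name__ == "__main__":
       files = ["s1_R1.fastq", "s1_R2.fastq", "my_sample_L001_1.fastq"]
       r1, r2, unrecognized = classify_reads(files)
       print("read 1:", r1)
       print("read 2:", r2)
       print("need renaming:", unrecognized)

Files that end up in the last list would need to be renamed so that they contain ``R1`` or ``R2``.
Note that a sample name that itself contains ``R1`` or ``R2`` would be misclassified by this simple test.
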
____________________________________________________
Docker images
==============
.. |pulling_image| image:: _static/pulling_image.png
:width: 280
All processes are run through `docker `_; PipeCraft's GUI simply mediates the
information exchange. Therefore, whenever a process is initiated for the **first time**,
the relevant Docker image (which contains the software required for that analysis step) is pulled from `Docker Hub `_.
Initial PipeCraft2 installation does not contain any software for sequence data processing.
Example: when running DEMULTIPLEXING for the first time |pulling_image|
Thus, a working **Internet connection** is initially required. Once the Docker images are pulled, PipeCraft2 can work without an Internet connection.
:ref:`Docker images ` vary in size, and the first run of a process takes longer because of the Docker image download time.
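
To see in advance whether a given image is already available locally (and so whether the first run will need an Internet connection),
the standard ``docker image inspect`` command can be used. The sketch below wraps it in Python.
The helper names and the image name ``example/image:latest`` are hypothetical placeholders, not actual PipeCraft2 image names.

.. code-block:: python

   import shutil
   import subprocess


   def image_inspect_command(image):
       """Build the 'docker image inspect' command for an image name."""
       return ["docker", "image", "inspect", image]


   def image_is_local(image):
       """Return True if the image is already present on this machine.

       Returns False when the docker executable is not found, when the
       Docker daemon is unreachable, or when the image has not been pulled.
       """
       if shutil.which("docker") is None:
           return False
       result = subprocess.run(
           image_inspect_command(image),
           stdout=subprocess.DEVNULL,
           stderr=subprocess.DEVNULL,
       )
       # 'docker image inspect' exits with a non-zero code for unknown images
       return result.returncode == 0


   if __name__ == "__main__":
       # Hypothetical image name, used only for illustration
       name = "example/image:latest"
       if image_is_local(name):
           print(name, "is available offline")
       else:
           print(name, "would have to be pulled (Internet connection needed)")
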
____________________________________________________
.. _tools:
Currently implemented software
------------------------------
:ref:`See software version on the 'Releases' page `
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| Software | Reference | Task |
+========================================================================+=========================================================================================+=========================================================================================+
| `docker `_ | https://www.docker.com | building env, sharing and running applications |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`DADA2 ` | `Callahan et al. 2016 `_ | ASV pipeline (denoising, taxonomy) |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`vsearch ` | `Rognes et al. 2016 `_ | full OTU/zOTU pipeline operations |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`NextITS ` | `Mikryukov et al. `_ | full pipeline for fungal full-ITS (PacBio) |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`OptimOTU ` | `Furneaux et al. `_ | optimised OTU pipeline for Fungi and Metazoa |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`metaMATE ` | `Creedy et al. 2022 `_ | NUMT/erroneous-sequence filtering for COI / mitochondrial amplicons |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`BlasCh ` | `Hakimzadeh et al. `_ | BLAST-based recovery of false-positive chimeras |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`swarm clustering ` | `Mahé et al. 2021 `_ | single-linkage clustering with dynamic similarity threshold |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`UNCROSS2 ` | `Edgar 2018 `_ | filter tag-jumps from OTU/ASV table |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`FunBarONT ` | `Dziurzynski et al. `_ | fungal barcoding pipeline for Oxford Nanopore (ONT) ITS data |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`trimmomatic ` | `Bolger et al. 2014 `_ | quality filtering |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`fastp ` | `Chen et al. 2018 `_ | quality filtering |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`seqkit ` | `Shen et al. 2016 `_ | multiple sequence manipulation operations |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`cutadapt ` | `Martin 2011 `_ | demultiplexing, cut primers |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| `biopython `_ | `Cock et al. 2009 `_ | multiple sequence manipulation operations |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| `mothur `_ | `Schloss et al. 2009 `_ | submodule in ITSx to make unique and deunique seqs |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`ITS Extractor ` | `Bengtsson-Palme et al. 2013 `_ | extract ITS regions (with :ref:`updated HMM profiles `) |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`fqgrep ` | `Indraniel Das 2011 `_ | core for reorient reads |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`BLAST ` | `Camacho et al. 2009 `_ | assign taxonomy |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`RDP classifier ` | `Wang et al. 2007 `_ | assign taxonomy |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`ORFfinder ` | `NCBI Tool `_ | finding open reading frames of protein coding genes (filtering pseudogenes/off-targets) |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`FastQC ` | `Andrews 2019 `_ | QualityCheck module |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`MultiQC ` | `Ewels et al. 2016 `_ | QualityCheck module |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`LULU ` | `Frøslev et al. 2017 `_ | post-clustering curation |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| :ref:`DEICODE ` | `Martino et al. 2019 `_ | dissimilarity analysis |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
| `GNU Parallel `_ | `Tange 2021 `_ | executing jobs in parallel |
+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+
Let us know if you would like to have a specific software implemented in PipeCraft (:ref:`contacts `), or create an issue in the `main repository `_.
____________________________________________________
Contents of this user guide
---------------------------
.. toctree::
:maxdepth: 1
installation
quickstart
pre-defined_pipelines
quicktools
postprocessing
example_analyses
troubleshoot
licence
citation
releases
docker_images
contact
for_developers
|PipeCraft_sticker1sideways|