.. |PipeCraft2_logo| image:: _static/PipeCraft2_icon_v2.png
  :width: 50
  :target: https://github.com/pipecraft2/pipecraft

.. |PipeCraft_sticker1sideways| image:: _static/PipeCraft_sticker1sideways.png
  :width: 350
  :class: right
  :target: https://github.com/pipecraft2/pipecraft

.. |main_interface| image:: _static/main_interface.png
  :width: 2000

.. raw:: html

.. role:: red

.. meta::
    :description lang=en:
        PipeCraft manual. PipeCraft is a Graphical User Interface software for metabarcoding data analyses

============================
PipeCraft2 |PipeCraft2_logo|
============================

**PipeCraft2** is a Graphical User Interface (GUI) software that implements :ref:`various popular tools ` for **metabarcoding** data analyses.
It provides various :ref:`ready-to-run (pre-defined) pipelines ` as well as an option to run a variety of :ref:`individual steps ` outside of a full pipeline.

.. _interface:

*(click on the image for enlargement)*
|main_interface|

| Software settings for pipeline processes contain key options for metabarcoding sequence data analyses, but all options of any implemented program may be accessed via :ref:`PipeCraft console (command line) `.
| Default settings in the panels represent commonly used options for amplicon sequence data analyses, which may be tailored according to user experience or needs. Custom-designed pipeline settings can be saved, and thus the exact same pipeline may be easily re-run on other sequencing data (and for reproducibility, may be used as supplementary material in the manuscript).

PipeCraft can execute the :ref:`full pipeline ` (the user specifies the input, and the output is e.g. an OTU/ASV table with taxonomic annotations of the generated features), and also supports a :ref:`single-step mode (Quick Tools panel) `, where analyses may be performed step by step *(e.g. perform quality filtering, then examine the output and decide whether to adjust the quality filtering options or to proceed with the next step, e.g.
with the chimera filtering step)*.

Glossary
========

List of terms that you may encounter in this user guide.

=========================== ===================================
**working directory**       | the directory (folder) that contains the files for the analyses.
                            | The outputs will be written into this directory
**paired-end data**         | obtained by sequencing two ends of the same DNA fragment,
                            | which results in read 1 (R1) and read 2 (R2) files per library or per sample.
                            | Note that PipeCraft expects that :red:`read 1 file contains the string R1`
                            | and :red:`read 2 contains R2`
                            | (not e.g. my_sample_L001_1.fastq / my_sample_L001_2.fastq)
**single-end data**         | only one sequencing file per library or per sample.
                            | Herein, this may also mean assembled paired-end data.
**demultiplexed data**      | sequences are sorted into separate files, representing individual samples
**multiplexed data**        | file(s) that represent a pool of sequences from different samples
**read/sequence**           | DNA sequence; herein, reads and sequences are used interchangeably
=========================== ===================================

____________________________________________________

Docker images
==============

.. |pulling_image| image:: _static/pulling_image.png
  :width: 280

All processes are run through `docker `_; PipeCraft's GUI simply mediates the information exchange.
Therefore, whenever a process is initiated for the **first time**, the relevant Docker image (which contains the software required for that analysis step) is pulled from `Docker Hub `_.
The initial PipeCraft2 installation does not contain any software for sequence data processing.

Example: when running DEMULTIPLEXING for the first time |pulling_image|

Thus, a working **Internet connection** is initially required. Once the Docker images are pulled, PipeCraft2 can work without an Internet connection.

:ref:`Docker images ` vary in size, so the first run of each process takes longer by the time needed to download its Docker image.

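Because an image is pulled only when its process is first used, it can help to check which images are already present locally before working offline. The following Python sketch is not part of PipeCraft2. It assumes that the ``docker`` command-line client is installed and relies on its ``docker images --format`` option; the function names are made up for this illustration, and any image names passed in are the user's own assumption (see the Docker images page of this guide for the images PipeCraft2 actually uses).

```python
import shutil
import subprocess


def parse_image_list(output: str) -> set[str]:
    """Convert the text printed by `docker images --format ...` into a set of names."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def local_images() -> set[str]:
    """Return 'repository:tag' names of locally stored images.

    Returns an empty set when the docker client is not on PATH or when
    the command prints nothing (e.g. the Docker daemon is not running).
    """
    if shutil.which("docker") is None:
        return set()
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_image_list(result.stdout)


def missing_images(required: list[str], available: set[str]) -> list[str]:
    """List the required images that are not yet available locally."""
    return [name for name in required if name not in available]
```

Calling ``missing_images(names, local_images())`` with a list of image names returns those that would still need an Internet connection on first use. The names must be given in the same ``repository:tag`` form that Docker prints.
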
____________________________________________________

.. _tools:

Currently implemented software
------------------------------

:ref:`See software version on the 'Releases' page `

.. list-table::
   :header-rows: 1

   * - Software
     - Reference
     - Task
   * - `docker `_
     - https://www.docker.com
     - building env, sharing and running applications
   * - :ref:`DADA2 `
     - `Callahan et al. 2016 `_
     - ASV pipeline (denoising, taxonomy)
   * - :ref:`vsearch `
     - `Rognes et al. 2016 `_
     - full OTU/zOTU pipeline operations
   * - :ref:`NextITS `
     - `Mikryukov et al. `_
     - full pipeline for fungal full-ITS (PacBio)
   * - :ref:`OptimOTU `
     - `Furneaux et al. `_
     - optimised OTU pipeline for Fungi and Metazoa
   * - :ref:`metaMATE `
     - `Creedy et al. 2022 `_
     - NUMT/erroneous-sequence filtering for COI / mitochondrial amplicons
   * - :ref:`BlasCh `
     - `Hakimzadeh et al. `_
     - BLAST-based recovery of false-positive chimeras
   * - :ref:`swarm clustering `
     - `Mahé et al. 2021 `_
     - single-linkage clustering with dynamic similarity threshold
   * - :ref:`UNCROSS2 `
     - `Edgar 2018 `_
     - filter tag-jumps from OTU/ASV table
   * - :ref:`FunBarONT `
     - `Dziurzynski et al. `_
     - fungal barcoding pipeline for Oxford Nanopore (ONT) ITS data
   * - :ref:`trimmomatic `
     - `Bolger et al. 2014 `_
     - quality filtering
   * - :ref:`fastp `
     - `Chen et al. 2018 `_
     - quality filtering
   * - :ref:`seqkit `
     - `Shen et al. 2016 `_
     - multiple sequence manipulation operations
   * - :ref:`cutadapt `
     - `Martin 2011 `_
     - demultiplexing, cut primers
   * - `biopython `_
     - `Cock et al. 2009 `_
     - multiple sequence manipulation operations
   * - `mothur `_
     - `Schloss et al. 2009 `_
     - submodule in ITSx to make unique and deunique seqs
   * - :ref:`ITS Extractor `
     - `Bengtsson-Palme et al. 2013 `_
     - extract ITS regions (with :ref:`updated HMM profiles `)
   * - :ref:`fqgrep `
     - `Indraniel Das 2011 `_
     - core for reorient reads
   * - :ref:`BLAST `
     - `Camacho et al. 2009 `_
     - assign taxonomy
   * - :ref:`RDP classifier `
     - `Wang et al. 2007 `_
     - assign taxonomy
   * - :ref:`ORFfinder `
     - `NCBI Tool `_
     - finding open reading frames of protein coding genes (filtering pseudogenes/off-targets)
   * - :ref:`FastQC `
     - `Andrews 2019 `_
     - QualityCheck module
   * - :ref:`MultiQC `
     - `Ewels et al. 2016 `_
     - QualityCheck module
   * - :ref:`LULU `
     - `Frøslev et al. 2017 `_
     - post-clustering curation
   * - :ref:`DEICODE `
     - `Martino et al. 2019 `_
     - dissimilarity analysis
   * - `GNU Parallel `_
     - `Tange 2021 `_
     - executing jobs in parallel

Let us know if you would like a specific software tool implemented in PipeCraft (:ref:`contacts `), or create an issue in the `main repository `_.

____________________________________________________

Contents of this user guide
---------------------------

.. toctree::
   :maxdepth: 1

   installation
   quickstart
   pre-defined_pipelines
   quicktools
   postprocessing
   example_analyses
   troubleshoot
   licence
   citation
   releases
   docker_images
   contact
   for_developers

|PipeCraft_sticker1sideways|