High-throughput Genome and Big Data Analysis Core Facility

DOC_ID : T15-0002
QC_Fastp module : DOC_ID : M27-3000
Editor : Anita/Mira
Reviewer : Angela

Function :
1. Fastp software features: analyzes quality curves, base content, k-mers, Q20/Q30, GC ratio, duplication, adapter content, and more, so that read quality can be compared before and after filtering. Filters out bad reads (for example, reads whose quality is too low) and removes low-quality regions, and is faster than Trimmomatic. Lets you choose how many bp to trim from the front and tail of each read and cuts adapters. Corrects quality in the overlapping regions of read pairs. Preprocesses data that carry unique molecular identifiers (UMI), whether the UMI is in the insert or in the index. Produces reports in JSON and HTML format.
2. Fastp module features: can take a random subset of reads for analysis, removes low-quality regions and adapters, and handles both Paired-End and Single-End data.

Ref : https://github.com/OpenGene/fastp

Installation : All software is included in the GA environment.

Note :
► Before running an analysis, create a project folder with CreateProject.sh; see the Project standard folder structure document.
► Before running the module, confirm which compute partition (--partition) you belong to: users of the general nodes are advised to use ct56; users of the biomedical nodes are advised to use ngs7G (Note 1).
► To learn how to use the module, run it with the -h option.
# Note 1 : To confirm your user type, log in to NCHC iService and select Member Center / Project Management / My Projects. If the project name is "國家生醫數位資料與分析運算雲端服務平台III" (National Biomedical Digital Data and Analysis Computing Cloud Service Platform III), you are a biomedical node user.

Description :
Tested environment : GApp0.0.0.2
Software version : fastp=0.20.1

Usage (Slurm) : commands in Slurm (Taiwania III)

Rapid Quality Analysis (partial reads)
Paired-End
    sbatch -A $projectID --mail-user=$email --export='projDir='$(pwd)'/,seqType=PE,inFile=Sample01,sampleName=Sample01,reads_to_process=1000000' modules/QC_Fastp.sh
Single-End
    sbatch -A $projectID --mail-user=$email --export='projDir='$(pwd)'/,seqType=SE,inFile=Sample01,sampleName=Sample01,reads_to_process=1000000' modules/QC_Fastp.sh

Quality analysis and read clean-up
Paired-End
    sbatch -A $projectID --mail-user=$email --export='projDir='$(pwd)'/,seqType=PE,inFile=Sample01,sampleName=Sample01,out=cleanup' modules/QC_Fastp.sh
Single-End
    sbatch -A $projectID --mail-user=$email --export='projDir='$(pwd)'/,seqType=SE,inFile=Sample01,sampleName=Sample01,out=cleanup' modules/QC_Fastp.sh

Quality analysis and read trimming
Paired-End
    sbatch -A $projectID --mail-user=$email --export='projDir='$(pwd)'/,seqType=PE,inFile=Sample01,sampleName=Sample01,adapter=path/XXXX,trim_front1=number,trim_front2=number,trim_tail1=number,trim_tail2=number,out=trim' modules/QC_Fastp.sh
Single-End
    sbatch -A $projectID --mail-user=$email --export='projDir='$(pwd)'/,seqType=SE,inFile=Sample01,sampleName=Sample01,adapter=path/XXXX,trim_front1=number,trim_tail1=number,out=trim' modules/QC_Fastp.sh

Usage (Linux console) : commands in the Linux console

Rapid Quality Analysis (partial reads)
Paired-End
    bash modules/QC_Fastp.sh -p $(pwd) -t PE -i Sample01 -s Sample01 -R 1000000
Single-End
    bash modules/QC_Fastp.sh -p $(pwd) -t SE -i Sample01 -s Sample01 -R 1000000

Analyze […]
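The Slurm commands in this entry pass every module option through one comma-separated --export value. The sketch below shows one way such a value could be assembled in a wrapper script. The helper name build_export, the placeholder project ID and e-mail address, and the dry-run echo are illustrative assumptions and are not part of QC_Fastp.sh.

```shell
# Hypothetical helper (not part of QC_Fastp.sh): build the comma-separated
# --export value in the same layout as the documented Slurm commands.
# Arguments: projDir seqType sample [extra key=value options ...]
build_export() {
    proj_dir=$1
    seq_type=$2
    sample=$3
    shift 3
    # projDir ends with "/" exactly as in the documented commands.
    export_str="projDir=${proj_dir}/,seqType=${seq_type},inFile=${sample},sampleName=${sample}"
    for opt in "$@"; do
        export_str="${export_str},${opt}"
    done
    printf '%s\n' "$export_str"
}

# Dry run: print the sbatch command instead of submitting it.
# projectID and email are placeholders; set them to your own values.
projectID=${projectID:-MY_PROJECT}
email=${email:-user@example.org}
export_value=$(build_export "$(pwd)" PE Sample01 reads_to_process=1000000)
echo sbatch -A "$projectID" --mail-user="$email" --export="$export_value" modules/QC_Fastp.sh
```

Removing the echo would submit the job. Values that themselves contain commas would not survive this format, so the sketch assumes simple values such as those shown in the usage lines.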

Module-QC_Fastp

This entry was posted on January 17, 2022 by angela

DOC_ID : T15-0002
QIIME2_DPP_PE module : DOC_ID : M19-3000
Editor : Anita
Reviewer : Angela

Function : QIIME2_DPP performs data import, denoising, dereplication, and chimera filtering of paired-end sequences :

Data import : Imports the data (FASTQ to qza format) according to a manifest file, then uses the qiime demux summarize command to summarize the demultiplexed sequences and report the quality and depth of each sample. The visualization output (qza to qzv format) is saved to the path given by the --o-visualization parameter.

Sequence clean-up : Denoises, dereplicates, and filters chimeras with DADA2, producing 3 output files :
paired-end-demux.qza ⇒ table_dada2.qza ⇒ table.qzv
paired-end-demux.qza ⇒ rep-seqs_dada2.qza ⇒ rep-seqs_dada2.qzv
paired-end-demux.qza ⇒ stats_dada2.qza ⇒ stats_dada2.qzv

Ref : https://docs.qiime2.org/2021.8/tutorials/overview/

Installation : All software is included in the GA environment.

Note :
► Before running an analysis, create a project folder with CreateProject.sh; see the Project standard […]
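The import step depends on a manifest file that maps each sample to its FASTQ files. As a minimal sketch, the function below writes a manifest in QIIME 2's tab-separated PairedEndFastqManifestPhred33V2 layout. The function name and the <sample>_R1.fastq.gz / <sample>_R2.fastq.gz naming are assumptions for illustration, and the excerpt does not say which manifest format the module itself expects.

```shell
# Sketch: print a paired-end manifest (PairedEndFastqManifestPhred33V2 layout)
# for every <sample>_R1.fastq.gz / <sample>_R2.fastq.gz pair in a folder.
# The _R1/_R2 file naming is an assumption, not something the module prescribes.
make_pe_manifest() {
    # The manifest needs absolute paths, so resolve the folder first.
    fastq_dir=$(cd "$1" && pwd) || return 1
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
    for fwd in "$fastq_dir"/*_R1.fastq.gz; do
        [ -e "$fwd" ] || continue      # glob matched nothing
        rev="${fwd%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$rev" ]; then
            echo "skipping $fwd: no matching R2 file" >&2
            continue
        fi
        sample=$(basename "$fwd" _R1.fastq.gz)
        printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev"
    done
}

# Example (writes manifest.tsv for FASTQ files under ./fastq):
#   make_pe_manifest fastq > manifest.tsv
```

A manifest in this layout can be given to qiime tools import with --type 'SampleData[PairedEndSequencesWithQuality]' and --input-format PairedEndFastqManifestPhred33V2 to produce paired-end-demux.qza.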

Module-QIIME2_DPP_PE

This entry was posted on January 17, 2022 by angela

DOC_ID : T15-0002
QIIME2_DPP_SE module : DOC_ID : M55-3000
Editor : Anita
Reviewer : Angela

Function : QIIME2_DPP performs data import, denoising, dereplication, and chimera filtering of single-end sequences :

Data import : Imports the data (FASTQ to qza format) according to a manifest file, then uses the qiime demux summarize command to summarize the demultiplexed sequences and report the quality and depth of each sample. The visualization output (qza to qzv format) is saved to the path given by the --o-visualization parameter.

Sequence clean-up : Denoises, dereplicates, and filters chimeras with DADA2, producing 3 output files :
single-end-demux.qza ⇒ table_dada2.qza ⇒ table.qzv
single-end-demux.qza ⇒ rep-seqs_dada2.qza ⇒ rep-seqs_dada2.qzv
single-end-demux.qza ⇒ stats_dada2.qza ⇒ stats_dada2.qzv

Ref : https://docs.qiime2.org/2021.8/tutorials/overview/

Installation : All software is included in the GA environment.

Note :
► Before running an analysis, create a project folder with CreateProject.sh; see the Project standard folder […]
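The three output chains listed for the clean-up step correspond to one DADA2 call followed by three visualizations. The sketch below prints the QIIME 2 commands that would produce those files. The run helper, the function name, and the truncation length are illustrative assumptions, and the module's actual parameters are not shown in this excerpt.

```shell
# Print commands when DRY_RUN=1 (the default here); execute them otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch of the single-end clean-up chain. trunc_len is a placeholder that
# should be chosen from the demux summarize quality plot.
denoise_single() {
    trunc_len=$1
    run qiime dada2 denoise-single \
        --i-demultiplexed-seqs single-end-demux.qza \
        --p-trunc-len "$trunc_len" \
        --o-table table_dada2.qza \
        --o-representative-sequences rep-seqs_dada2.qza \
        --o-denoising-stats stats_dada2.qza
    # One visualization per artifact, matching the file names in the listing.
    run qiime feature-table summarize \
        --i-table table_dada2.qza --o-visualization table.qzv
    run qiime feature-table tabulate-seqs \
        --i-data rep-seqs_dada2.qza --o-visualization rep-seqs_dada2.qzv
    run qiime metadata tabulate \
        --m-input-file stats_dada2.qza --o-visualization stats_dada2.qzv
}

denoise_single 120
```

Setting DRY_RUN=0 in an environment where QIIME 2 is active would execute the commands instead of printing them.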

Module-QIIME2_DPP_SE

This entry was posted on January 17, 2022 by angela

DOC_ID : T15-0002
QIIME2_DA module : DOC_ID : M21-3000
Editor : Anita
Reviewer : Angela

Function : The QIIME2_DA module performs phylogenetic diversity analysis and alpha and beta diversity analysis. The detailed functions and output files are listed below :

Phylogenetic diversity analyses : Generates and manipulates phylogenetic trees using mafft alignment and fasttree. The output is used in the alpha/beta analysis.
rep-seqs_dada2.qza ⇒ aligned-rep-seqs.qza
aligned-rep-seqs.qza ⇒ masked-aligned-rep-seqs.qza
masked-aligned-rep-seqs.qza ⇒ unrooted-tree.qza
unrooted-tree.qza ⇒ rooted-tree.qza

beta diversity analysis :
Generate core-metrics-phylogenetic (core-metrics-results)
Test for associations between categorical metadata columns and alpha diversity data (Faith Phylogenetic Diversity and Evenness metrics)
Faith Phylogenetic Diversity (a measure of community richness) : faith_pd_vector.qza ⇒ faith-pd-group-significance.qzv
Evenness metrics : evenness_vector.qza ⇒ […]
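In QIIME 2, the four-step tree chain listed in this entry is available as a single pipeline, align-to-tree-mafft-fasttree, whose rooted tree then feeds core-metrics-phylogenetic. The sketch below prints one plausible command sequence. The run helper, the function name, the sampling depth, and the metadata file name are illustrative assumptions, and the excerpt does not confirm that the module calls exactly these commands.

```shell
# Print commands when DRY_RUN=1 (the default here); execute them otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch: tree building, core metrics, and the Faith PD significance test.
# sampling_depth and metadata are placeholders supplied by the caller.
diversity_steps() {
    sampling_depth=$1
    metadata=$2
    run qiime phylogeny align-to-tree-mafft-fasttree \
        --i-sequences rep-seqs_dada2.qza \
        --o-alignment aligned-rep-seqs.qza \
        --o-masked-alignment masked-aligned-rep-seqs.qza \
        --o-tree unrooted-tree.qza \
        --o-rooted-tree rooted-tree.qza
    run qiime diversity core-metrics-phylogenetic \
        --i-phylogeny rooted-tree.qza \
        --i-table table_dada2.qza \
        --p-sampling-depth "$sampling_depth" \
        --m-metadata-file "$metadata" \
        --output-dir core-metrics-results
    # faith_pd_vector.qza is written into the core-metrics output directory.
    run qiime diversity alpha-group-significance \
        --i-alpha-diversity core-metrics-results/faith_pd_vector.qza \
        --m-metadata-file "$metadata" \
        --o-visualization faith-pd-group-significance.qzv
}

diversity_steps 1000 sample-metadata.tsv
```

The sampling depth strongly affects the results, so the value 1000 here is only a stand-in for a depth chosen from the feature table summary.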

Module-QIIME2_DA

This entry was posted on January 17, 2022 by angela

DOC_ID : T15-0002
QIIME2_TA module : DOC_ID : M20-3000
Editor : Anita
Reviewer : Angela

Function : We use taxonomy classifiers to determine the closest taxonomic affiliation with some degree of confidence or consensus, based on alignment, k-mer frequencies, etc. This module contains several steps :

Classification with the Greengenes pre-trained classifier :
Generate taxonomy_gg.qza
tabulate : Interactively explore metadata in an HTML table (taxonomy_gg.qza ⇒ taxonomy_gg.qzv)
barplot : Visualize taxonomy with an interactive bar plot (table_dada2.qza and taxonomy_gg.qza ⇒ taxa_gg-bar-plots.qzv)

Classification with the SILVA pre-trained classifier :
Generate taxonomy_silva.qza
tabulate : Interactively explore metadata in an HTML table (taxonomy_silva.qza ⇒ taxonomy_silva.qzv)
barplot : Visualize taxonomy with an interactive bar plot (table_dada2.qza and taxonomy_silva.qza […]
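For the Greengenes branch, the three listed steps could be run as shown in the sketch below, which prints the commands instead of executing them. It assumes the pre-trained classifier is applied with classify-sklearn, which the excerpt does not state. The run helper, the function name, and the classifier and metadata paths are placeholders.

```shell
# Print commands when DRY_RUN=1 (the default here); execute them otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch of the Greengenes branch: classify, tabulate, bar plot.
# classifier is the path to a pre-trained classifier artifact (placeholder).
classify_gg() {
    classifier=$1
    metadata=$2
    # Assumption: the pre-trained classifier is a naive Bayes (sklearn) one.
    run qiime feature-classifier classify-sklearn \
        --i-classifier "$classifier" \
        --i-reads rep-seqs_dada2.qza \
        --o-classification taxonomy_gg.qza
    run qiime metadata tabulate \
        --m-input-file taxonomy_gg.qza \
        --o-visualization taxonomy_gg.qzv
    run qiime taxa barplot \
        --i-table table_dada2.qza \
        --i-taxonomy taxonomy_gg.qza \
        --m-metadata-file "$metadata" \
        --o-visualization taxa_gg-bar-plots.qzv
}

classify_gg path/to/gg-classifier.qza sample-metadata.tsv
```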

Module-QIIME2_TA

This entry was posted on January 17, 2022 by angela

DOC_ID : T15-0002
PICRUSt2 module : DOC_ID : M22-3000
Editor : Anita
Reviewer : Angela

Function : PICRUSt2 (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) is software for predicting functional abundances based only on marker gene sequences. This module contains several steps :

Sequence placement : Place ASV reads into the reference tree.
.fasta ⇒ out.tre
.fasta ⇒ intermediate/place_seqs/epa_out
.fasta ⇒ intermediate/place_seqs/query_align.stockholm
.fasta ⇒ intermediate/place_seqs/ref_seqs_hmmalign.fasta
.fasta ⇒ intermediate/place_seqs/study_seqs_hmmalign.fasta

Hidden-state prediction of genomes :
Hidden-state prediction of ASV gene families : 16S + out.tre ⇒ marker_predicted_and_nsti.tsv.gz
Hidden-state prediction of ASV gene families, restricted to the EC number database : EC + out.tre ⇒ EC_predicted.tsv.gz
Hidden-state prediction of ASV gene families, restricted to the KO number database : KO + out.tre ⇒ KO_predicted.tsv.gz

Metagenome prediction :
Generate EC_metagenome predictions : feature-table.biom + […]
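The placement and hidden-state prediction outputs listed in this entry map onto the standalone PICRUSt2 scripts place_seqs.py and hsp.py. The sketch below prints one plausible sequence of calls. The run helper, the function name, the input FASTA name, and the single-process setting are illustrative assumptions, and the module may wrap these steps differently.

```shell
# Print commands when DRY_RUN=1 (the default here); execute them otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch: place ASVs into the reference tree, then predict 16S copy number
# (with NSTI) and EC / KO gene family abundances for each ASV.
picrust2_hsp_steps() {
    study_fasta=$1     # FASTA of ASV sequences (name supplied by the caller)
    run place_seqs.py -s "$study_fasta" -o out.tre -p 1 \
        --intermediate intermediate/place_seqs
    # -n also calculates NSTI, hence the "_and_nsti" output name.
    run hsp.py -i 16S -t out.tre -o marker_predicted_and_nsti.tsv.gz -p 1 -n
    for db in EC KO; do
        run hsp.py -i "$db" -t out.tre -o "${db}_predicted.tsv.gz" -p 1
    done
}

picrust2_hsp_steps study_seqs.fasta
```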

Module-PICRUSt2

This entry was posted on January 17, 2022 by angela

DOC_ID : T15-0002
QIIME2_Emperor module : DOC_ID : M57-3000
Editor : Anita
Reviewer : Angela

Function : Ordination is a popular approach for exploring microbial community composition in the context of sample metadata. The QIIME2_Emperor module uses the Emperor tool to explore principal coordinates (PCoA) plots in the context of sample metadata. While the qiime diversity core-metrics-phylogenetic command already generates some Emperor plots, the official documentation suggests passing an optional parameter, --p-custom-axes, which is very useful for exploring time series data. The PCoA results that were used in core-metrics-phylogenetic are also available, making it easy to generate new visualizations with Emperor. The detailed functions and output file list […]

Module-QIIME2_Emperor

This entry was posted on January 17, 2022 by angela

DOC_ID : T15-0002
QIIME2_BetaSig module : DOC_ID : M56-3000
Editor : Anita
Reviewer : Angela

Function : The QIIME2_BetaSig module analyzes sample composition in the context of categorical metadata using PERMANOVA (first described in Anderson (2001)) via the beta-group-significance command. The detailed functions and output files are listed below :

Jaccard distance (a qualitative measure of community dissimilarity) : (fieldName='A:B')*
jaccard_distance_matrix.qza ⇒ jaccard-A-group-significance.qzv
jaccard_distance_matrix.qza ⇒ jaccard-B-group-significance.qzv

Bray-Curtis distance (a quantitative measure of community dissimilarity) :
bray_curtis_distance_matrix.qza ⇒ bray_curtis-A-group-significance.qzv
bray_curtis_distance_matrix.qza ⇒ bray_curtis-B-group-significance.qzv

Unweighted UniFrac distance (a qualitative measure of community dissimilarity that incorporates phylogenetic relationships between the features) :
unweighted_unifrac_distance_matrix.qza ⇒ unweighted-unifrac-A-group-significance.qzv
unweighted_unifrac_distance_matrix.qza ⇒ unweighted-unifrac-B-group-significance.qzv

Weighted UniFrac distance (a quantitative measure of community dissimilarity that […]
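The listing suggests that a value such as fieldName='A:B' yields one visualization per metadata column and per distance metric. The sketch below shows how such a colon-separated value could be expanded into beta-group-significance commands for the Jaccard and Bray-Curtis matrices. The function name and metadata file name are assumptions, and the module's real parsing is not shown in this excerpt. The UniFrac outputs are left out because their file names in the listing use hyphens rather than the matrix prefix.

```shell
# Sketch: expand a colon-separated list of metadata columns (e.g. 'A:B') into
# one beta-group-significance command per metric and column. Commands are
# only printed. Column names containing spaces are not handled.
beta_sig_cmds() {
    field_name=$1
    metadata=$2
    for metric in jaccard bray_curtis; do
        for col in $(printf '%s' "$field_name" | tr ':' ' '); do
            # PERMANOVA is the default method of beta-group-significance.
            echo qiime diversity beta-group-significance \
                --i-distance-matrix "${metric}_distance_matrix.qza" \
                --m-metadata-file "$metadata" \
                --m-metadata-column "$col" \
                --o-visualization "${metric}-${col}-group-significance.qzv"
        done
    done
}

beta_sig_cmds 'A:B' sample-metadata.tsv
```

For 'A:B' this prints four commands, two per distance matrix, with output names matching the Jaccard and Bray-Curtis entries in the listing.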

Module-QIIME2_BetaSig

This entry was posted on January 17, 2022 by angela

A total solution for your genome study