Comparison of commonly used software pipelines for analyzing fungal metabarcoding data

In recent years, high-throughput sequencing (HTS) has resulted in an exponential increase in the detection of new fungal species from various environments. Before metabarcoding results can be analyzed statistically, the generated sequences must undergo sequence processing. Various software pipelines have been developed and made freely available; however, they vary in their applicability to fungal metabarcoding data and can affect the results. Thus, testing the performance of different pipelines on fungal datasets generated from complex, field-collected environmental samples is of particular interest for ecological studies.

Here, we evaluated the performance of three pipelines: two that generate OTUs (which we named mothur_97% and mothur_99%) and one that infers ASVs (which we named dada2). We used these three pipelines to analyze fungal communities from two environmental sample types, fresh bovine feces and pasture soil. For field-collected samples, we compared the fungal alpha and beta diversity results generated by the three pipelines; for a set of identical samples (18 technical replicates per sample type), we evaluated the homogeneity of the results of each pipeline.
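To make the two diversity levels concrete, the following is a minimal sketch of one common alpha diversity measure (observed richness) and one common beta diversity measure (Bray-Curtis dissimilarity). The text above does not state which indices were used, so both choices, as well as the function names and the example counts, are assumptions for illustration only.

```python
def observed_richness(counts):
    """Alpha diversity as observed richness: number of taxa with a nonzero count."""
    return sum(1 for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples.

    `a` and `b` are count vectors over the same taxa in the same order.
    Returns 0.0 for identical samples and 1.0 for samples sharing no taxa.
    """
    if len(a) != len(b):
        raise ValueError("both samples must cover the same taxa")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical OTU/ASV counts for two samples (illustrative values only).
sample_1 = [10, 0, 5, 0]
sample_2 = [4, 6, 5, 0]

richness_1 = observed_richness(sample_1)
dissimilarity = bray_curtis(sample_1, sample_2)
```

Running the same functions on the OTU or ASV table produced by each pipeline would give per-pipeline values that can then be compared statistically.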

Our comparison revealed significant differences in the results obtained from commonly used pipelines, particularly when pipeline-specific default or recommended settings are used. We found that fungal species richness in biological replicates differed significantly among pipelines, and that one pipeline showed high heterogeneity of relative abundances and poor OTU/ASV detection across technical replicates (n = 18) compared with the other pipelines.

With these results, we want to i) draw attention to the strong impact of pipeline settings on adequate OTU/ASV detection and ii) point out that the OTU approach outperformed the ASV approach. Hence, we recommend using a pipeline with OTU clustering and carefully considering the respective pipeline settings in future studies.

Figure 1: Heatmaps showing the relative abundance of fungal genera in identical replicates of bovine feces (left) and pasture soil (right) samples. Fungal genera were identified using three different sequence processing pipelines (dada2, mothur_97%, mothur_99%). Significant differences among pipelines are indicated by asterisks (p < 0.001 ***, p < 0.01 **, p < 0.05 *, p > 0.05 ns). The mean ratio of standard deviation to mean relative abundance (mean Stdv) for every pipeline is given below the heatmaps and shows that the heterogeneity among sample replicates was highest with the dada2 pipeline.
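The heterogeneity measure described in the caption, the ratio of standard deviation to mean relative abundance averaged over genera, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `mean_sd_to_mean_ratio`, the use of the sample standard deviation, the skipping of genera with zero mean, and the example values are all hypothetical and not taken from the study.

```python
from statistics import mean, stdev


def mean_sd_to_mean_ratio(abundances):
    """Average, over genera, of (standard deviation / mean) across replicates.

    `abundances` maps a genus name to its relative abundances in the
    technical replicates. Genera with a mean of zero are skipped because
    the ratio is undefined for them (an assumption of this sketch).
    """
    ratios = []
    for values in abundances.values():
        m = mean(values)
        if m == 0:
            continue
        ratios.append(stdev(values) / m)  # sample standard deviation (assumed)
    if not ratios:
        raise ValueError("no genus with nonzero mean abundance")
    return mean(ratios)


# Hypothetical relative abundances of two genera in three replicates.
pipeline_result = {
    "Genus_A": [0.20, 0.20, 0.20],  # perfectly homogeneous, ratio 0
    "Genus_B": [0.10, 0.30, 0.20],  # variable across replicates
}

heterogeneity = mean_sd_to_mean_ratio(pipeline_result)
```

A lower value indicates that a pipeline returns more consistent relative abundances across identical replicates.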

 

Galla, Giulia; Praeg, Nadine; Rzehak, Theresa; Sprecher, Else; Colla, Filippo; Seeber, Julia; Illmer, Paul; Hauffe, Heidi C. (2024): Comparison of DNA extraction methods on different sample matrices within the same terrestrial ecosystem.
In: Scientific Reports 14, No. 8715. (DOI)
