Algae and cyanobacteria are ubiquitous in freshwater and marine environments, playing essential ecological roles but also posing risks when their excessive growth leads to harmful algal blooms. These blooms can impact ecosystems, water quality, and public health. Accurate identification and quantification of these microorganisms are critical for effective water resource management. Traditional manual methods for counting and classifying algal and cyanobacterial cells are labor-intensive and susceptible to human error. The integration of machine learning-based techniques offers a promising approach to automating this process with improved accuracy and efficiency.
This study evaluates the Fast Segment Anything Model (FastSAM), an advanced image segmentation algorithm, for its effectiveness in identifying and clustering algal and cyanobacterial cells in microscopic images of water samples from Australia, Taiwan, Malaysia, and Canada. FastSAM, a convolutional neural network (CNN)-based alternative to the Vision Transformer (ViT)-based Segment Anything Model, was employed to segment cells from diverse water sources, including freshwater, marine, and treated wastewater samples. The algorithm's performance was assessed by comparing machine-segmented outputs with manually validated segmentations and by applying three clustering evaluation metrics to analyze shape-based classification.
Our findings highlight both the strengths and limitations of machine learning-driven image processing in aquatic microbial analysis. FastSAM segmented whole elements in every microscopic image used in the study. Agreement between machine-based and manually validated segmentations ranged from 50% to 100%, depending on cell shape. Notably, FastSAM correctly segmented 100% of single cells, indicating that it is highly reliable for identifying isolated cells. Segmentation performance for aggregated or irregularly shaped cells was more variable.
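The abstract does not state how similarity between machine-based and manually validated segmentations was computed. As an illustration only, the sketch below uses intersection over union (IoU), a common overlap measure for binary masks; the choice of IoU, the function name `mask_iou`, and the toy masks are assumptions, not the study's method or data.

```python
def mask_iou(machine_mask, manual_mask):
    """Intersection over union of two equally sized binary masks.

    Each mask is a list of rows; any truthy value marks a cell pixel.
    IoU is an assumed similarity measure, not one named in the study.
    """
    if len(machine_mask) != len(manual_mask):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for machine_row, manual_row in zip(machine_mask, manual_mask):
        if len(machine_row) != len(manual_row):
            raise ValueError("mask rows must have the same length")
        for machine_px, manual_px in zip(machine_row, manual_row):
            in_machine, in_manual = bool(machine_px), bool(manual_px)
            intersection += in_machine and in_manual
            union += in_machine or in_manual
    # Two empty masks agree trivially.
    return intersection / union if union else 1.0


# Toy 3x4 masks (hypothetical): the manual mask has one extra pixel.
machine = [[0, 1, 1, 0],
           [0, 1, 1, 0],
           [0, 0, 0, 0]]
manual = [[0, 1, 1, 0],
          [0, 1, 1, 1],
          [0, 0, 0, 0]]
print(f"IoU = {mask_iou(machine, manual):.2f}")  # 4 shared pixels / 5 in union
```

An isolated cell whose outline matches the manual annotation exactly would score 1.0 under this measure, whereas an aggregate split differently by the model and the annotator would score lower.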
Across the evaluation metrics, clustering performance ranged from 57% to 94%; the Spectral Angle Mapper achieved the highest accuracy, 84–94%, when compared with manually chosen clustering benchmarks. These results emphasize the potential of automated image clustering to distinguish between algal and cyanobacterial morphologies, offering a significant advancement for microbial ecology studies and water quality monitoring.
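For readers unfamiliar with the Spectral Angle Mapper, it measures the angle between two feature vectors, so a smaller angle means more similar vectors. The sketch below is a minimal illustration under stated assumptions: the shape descriptors (aspect ratio, circularity, solidity), the reference values, and the helper `assign_to_reference` are hypothetical and are not taken from the study, which does not describe here how the measure was applied.

```python
import math


def spectral_angle(u, v):
    """Angle in radians between two equal-length, non-zero feature vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("spectral angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def assign_to_reference(features, references):
    """Return the name of the reference vector with the smallest angle."""
    return min(references, key=lambda name: spectral_angle(features, references[name]))


# Hypothetical descriptors: (aspect ratio, circularity, solidity).
references = {
    "coccoid": (1.0, 0.95, 0.98),
    "filamentous": (8.0, 0.20, 0.90),
}
cell = (1.1, 0.90, 0.97)
print(assign_to_reference(cell, references))
```

Because the angle depends only on direction, scaling a vector by a positive constant leaves the result unchanged; descriptors on very different numeric ranges would therefore likely need normalizing before a comparison of this kind.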
The biological diversity of cyanobacterial and algal communities underscores the importance of accurate shape-based classification in environmental assessments. As machine learning techniques continue to evolve, their application in ecological research will enhance our ability to monitor and mitigate the impacts of algal blooms, supporting improved water management strategies and public health protection. This study demonstrates the feasibility of machine learning-driven segmentation and clustering techniques as a scalable and efficient solution for algal and cyanobacterial identification across diverse global water systems.