Atmospheric Measurement Techniques (Jun 2025)
Exploring the effect of training set size and number of categories on ice crystal classification through a contrastive semi-supervised learning algorithm
Abstract
The shapes of ice crystals play an important role in global precipitation formation and the radiation budget. Classifying ice crystal shapes can improve our understanding of in-cloud conditions and of these processes. Existing classification methods either rely on hand-crafted features such as the aspect ratio of ice crystals and the environmental temperature, which makes classification performance unstable, or employ supervised machine learning algorithms that depend heavily on human labeling. This poses significant challenges, including human subjectivity in classification and substantial labor costs for manual labeling. In addition, previous deep learning algorithms for ice crystal classification have often been trained and evaluated on datasets of varying sizes and with differing classification schemes, each with distinct criteria and a different number of categories, making fair comparisons of algorithm performance difficult. To overcome these limitations, a contrastive semi-supervised learning (CSSL) algorithm for the classification of ice crystals is proposed. The algorithm consists of an upstream unsupervised network that extracts meaningful representations from a large number of unlabeled ice crystal images and a downstream supervised network that is fine-tuned on a small labeled subset of the entire dataset to perform the classification task. To determine the minimum number of ice crystal images that require human labeling while maintaining algorithm performance, the algorithm is trained and evaluated on datasets of varying sizes and numbers of categories. The ice crystal data used in this study were collected during the NASCENT campaign at Ny-Ålesund and the CLOUDLAB project on the Swiss Plateau using a holographic imager mounted on a tethered balloon system. In general, the CSSL algorithm outperforms a purely supervised algorithm in classifying 19 categories: approximately 154 h of manual labeling can be avoided by using just 11 % of the training set (2048 images) for fine-tuning, at a cost of only 3.8 % in overall precision compared to a fully supervised model trained on the entire dataset. In the four-category classification task, the CSSL algorithm also outperforms the purely supervised algorithm: fine-tuned on just 2048 images (25 % of the dataset), it achieves an overall accuracy of 89.6 %, nearly matching the 91.0 % accuracy of the supervised algorithm trained on 8192 images. When tested on the unseen CLOUDLAB dataset, CSSL shows superior generalization, improving accuracy by an average of 2.19 %. Our analysis also reveals that both the CSSL and the purely supervised algorithms are inherently unstable when trained on small datasets, and that the performance gap between them narrows as the training set size exceeds 2048 samples. These results highlight the strength and practical effectiveness of CSSL compared with purely supervised methods, as well as the potential of the CSSL algorithm to perform well on datasets collected under different conditions.
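
To make the two-stage upstream/downstream design concrete, the sketch below illustrates the general pattern in PyTorch. It is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not specify the encoder architecture, the augmentations, or the contrastive loss, so the sketch assumes a SimCLR-style NT-Xent loss, a toy CNN encoder, and random tensors in place of the holographic images. All names (Encoder, nt_xent) and hyperparameters here are hypothetical.

```python
# Hypothetical sketch of the two-stage CSSL idea: contrastive pretraining on
# unlabeled images, then supervised fine-tuning on a small labeled subset.
# (SimCLR-style NT-Xent loss assumed; not the paper's actual implementation.)
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy CNN encoder standing in for the (unspecified) backbone."""
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Projection head: used only during the contrastive stage.
        self.proj = nn.Sequential(nn.Linear(64, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        h = self.features(x)              # representation used downstream
        z = self.proj(h)                  # projection used for the contrastive loss
        return h, F.normalize(z, dim=1)

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent loss over two augmented views of the same batch (SimCLR-style)."""
    z = torch.cat([z1, z2], dim=0)        # (2N, dim), rows are L2-normalized
    sim = z @ z.t() / tau                 # cosine similarities / temperature
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))  # exclude self-similarity
    # Positive pair of sample i is its other augmented view at index i +/- n.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# --- Stage 1: unsupervised pretraining on many unlabeled crystal images ---
encoder = Encoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
unlabeled = torch.randn(256, 1, 64, 64)                 # placeholder for real images
view1 = unlabeled + 0.1 * torch.randn_like(unlabeled)   # stand-in augmentations
view2 = unlabeled + 0.1 * torch.randn_like(unlabeled)
_, z1 = encoder(view1)
_, z2 = encoder(view2)
opt.zero_grad()
nt_xent(z1, z2).backward()
opt.step()

# --- Stage 2: fine-tune a classifier head on a small labeled subset ---
n_classes = 19                                          # or 4, per the two schemes above
head = nn.Linear(64, n_classes)
ft_opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-4)
labeled = torch.randn(32, 1, 64, 64)                    # e.g. from a 2048-image subset
labels = torch.randint(0, n_classes, (32,))
h, _ = encoder(labeled)
ft_opt.zero_grad()
F.cross_entropy(head(h), labels).backward()
ft_opt.step()
```

The point mirrored from the abstract is the division of labor: the projection output z is discarded after pretraining, while the representation h feeds the downstream classifier, which is fine-tuned on only a small labeled fraction of the data (e.g., the 2048-image subsets evaluated above).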