Publication type: conference paper, abstract, or article in conference proceedings
Conference: Hybrid Methods of Modeling and Optimization in Complex Systems (HMMOCS-III 2024); Krasnoyarsk
Year of publication: 2025
DOI: 10.1051/itmconf/20257204003
Abstract: This paper explores the transformation of heterogeneous features, including continuous data, into binary form using frequency discretization. This method is particularly beneficial for clustering tasks, as binary features simplify the interpretation of results using logical expressions. In unsupervised learning, where class labels are unknown, we propose a binarization approach that converts continuous features into binary values based on their frequency distribution. Our experiments show that this technique not only preserves essential information but also improves clustering quality, as measured by the Rand Index, compared to known groupings of industrial product batches. The method reduces noise, simplifies the feature space, and enhances cluster interpretability. Among various distance metrics, the best results were achieved using Cosine distance. These findings highlight the potential of frequency discretization for improving clustering outcomes.
Journal: ITM Web of Conferences
Pages: 4003
Place of publication: Krasnoyarsk
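
The abstract names three ingredients: frequency discretization of continuous features into binary ones, Cosine distance, and the Rand Index. The following is a minimal sketch of how these pieces might fit together, not the authors' implementation. It assumes equal-frequency (quantile) binning followed by one-hot encoding of the bins, which the abstract does not specify; the function names `frequency_binarize`, `cosine_distance`, and `rand_index` and the parameter `n_bins` are hypothetical.

```python
import numpy as np
from itertools import combinations


def frequency_binarize(X, n_bins=3):
    """Equal-frequency binning per column, then one-hot encoding of the bins.

    Assumption: quantile cut points stand in for the paper's
    frequency-based thresholds, which the abstract does not detail.
    """
    X = np.asarray(X, dtype=float)
    columns = []
    for j in range(X.shape[1]):
        # Interior quantiles, so each bin holds roughly the same number of samples.
        cuts = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
        # Bin index in 0..n_bins-1 for every sample in this column.
        bin_idx = np.searchsorted(cuts, X[:, j], side="right")
        # One binary indicator feature per bin.
        columns.append(np.eye(n_bins, dtype=int)[bin_idx])
    return np.hstack(columns)


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors."""
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


if __name__ == "__main__":
    X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
    B = frequency_binarize(X, n_bins=3)
    print(B)                              # binary indicator matrix, one row per sample
    print(cosine_distance(B[0], B[5]))    # distance between the first and last sample
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because every one-hot row contains exactly one 1 per original feature, no row is all zeros, so the Cosine distance is always defined on the binarized data.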