Formation of Fuzzy Patterns in Logical Analysis of Data Using a Multi-Criteria Genetic Algorithm : scientific publication

Description

Publication type: journal article

Year of publication: 2022

DOI: 10.3390/sym14030600

Keywords: logical analysis of data, pattern generation, genetic algorithm

Abstract: The formation of patterns is one of the main stages in logical data analysis. Fuzzy approaches to pattern generation in logical analysis of data allow a pattern to cover not only objects of the target class, but also a certain proportion of objects of the opposite class. In this case, pattern search is an optimization problem with the maximum coverage of the target class as the objective function and some allowed coverage of the opposite class as a constraint. We propose a more flexible and symmetric optimization model which does not impose a strict restriction on the pattern coverage of the opposite class observations. Instead, our model converts such a restriction (purity restriction) into an additional criterion. Coverage of the target class and coverage of the opposite class are the two objective functions of the optimization problem. The search for a balance between these criteria is the essence of the proposed optimization method. We propose a modified evolutionary algorithm based on the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) to solve this problem. The new algorithm uses pattern formation as an approximation of the Pareto set and considers the solution's representation in logical analysis of data and the informativeness of patterns. We have tested our approach on two applied medical classification problems under conditions of sample asymmetry, where one class significantly dominated the other. The classification results were comparable to and, in some cases, better than the results of commonly used machine learning algorithms in terms of accuracy, without losing interpretability. © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
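
The two-criteria view described in the abstract can be illustrated with a minimal sketch. It is not the authors' modified NSGA-II: it only scores a few hand-written candidate patterns on a toy binary data set and keeps the non-dominated ones (maximize target-class coverage, minimize opposite-class coverage). The pattern encoding, the function names, and the data are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed encoding: a pattern is a conjunction of literals,
# mapping a feature index to the binary value it requires.
Pattern = Dict[int, int]
Observation = Tuple[int, ...]
Score = Tuple[int, int]  # (covered target objects, covered opposite objects)


def covers(pattern: Pattern, obs: Observation) -> bool:
    """True if the observation satisfies every literal of the pattern."""
    return all(obs[i] == v for i, v in pattern.items())


def coverage(pattern: Pattern,
             target: List[Observation],
             opposite: List[Observation]) -> Score:
    """Count covered observations in the target and opposite classes."""
    pos = sum(covers(pattern, o) for o in target)
    neg = sum(covers(pattern, o) for o in opposite)
    return pos, neg


def dominates(a: Score, b: Score) -> bool:
    """a dominates b: no worse on both criteria and different from b."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def pareto_front(patterns: List[Pattern],
                 target: List[Observation],
                 opposite: List[Observation]) -> List[Tuple[Pattern, Score]]:
    """Keep the patterns whose scores are not dominated by any other score."""
    scored = [(p, coverage(p, target, opposite)) for p in patterns]
    return [(p, s) for p, s in scored
            if not any(dominates(t, s) for _, t in scored)]


# Toy data (hypothetical): three binary features per observation.
target = [(1, 1, 0), (1, 0, 0), (1, 1, 1)]
opposite = [(0, 1, 0), (1, 0, 1), (0, 0, 1)]
candidates = [{0: 1}, {0: 1, 2: 0}, {1: 1}]

for pattern, score in pareto_front(candidates, target, opposite):
    print(pattern, score)
```

In this toy case the pattern `{1: 1}` is dropped because another candidate covers more target objects at the same opposite-class coverage. The two remaining patterns represent different trade-offs between coverage and purity, which is the kind of balance the abstract refers to.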

Source

Journal: SYMMETRY-BASEL

Journal issue: Vol. 14, Is. 3

Page numbers: 600

Journal ISSN: 2073-8994

Place of publication: BASEL

Publisher: MDPI

Authors

  • Masich Igor S. (Reshetnev Siberian State Univ Sci & Technol, Inst Informat & Telecommun, 31 Krasnoyarsky Rabochy Av, Krasnoyarsk 660037, Russia; Siberian Fed Univ, Inst Space & Informat Technol, 79 Svobodny Pr, Krasnoyarsk 660041, Russia)
  • Kulachenko Margarita A. (Reshetnev Siberian State Univ Sci & Technol, Inst Informat & Telecommun, 31 Krasnoyarsky Rabochy Av, Krasnoyarsk 660037, Russia)
  • Stanimirovic Predrag S. (Univ Nis, Fac Sci & Math, Visegradska 33, Nish 18000, Serbia)
  • Popov Aleksey M. (Reshetnev Siberian State Univ Sci & Technol, Inst Informat & Telecommun, 31 Krasnoyarsky Rabochy Av, Krasnoyarsk 660037, Russia)
  • Tovbis Elena M. (Reshetnev Siberian State Univ Sci & Technol, Inst Informat & Telecommun, 31 Krasnoyarsky Rabochy Av, Krasnoyarsk 660037, Russia)
  • Stupina Alena A. (Reshetnev Siberian State Univ Sci & Technol, Inst Informat & Telecommun, 31 Krasnoyarsky Rabochy Av, Krasnoyarsk 660037, Russia; Siberian Fed Univ, Inst Business Proc Management, 79 Svobodny Pr, Krasnoyarsk 660041, Russia)
  • Kazakovtsev Lev A. (Reshetnev Siberian State Univ Sci & Technol, Inst Informat & Telecommun, 31 Krasnoyarsky Rabochy Av, Krasnoyarsk 660037, Russia; Siberian Fed Univ, Inst Business Proc Management, 79 Svobodny Pr, Krasnoyarsk 660041, Russia)