A method of sequentially generating a set of components of a multidimensional random variable using a nonparametric pattern recognition algorithm

Description

Publication type: journal article

Year of publication: 2021

DOI: 10.18287/2412-6179-CO-902

Keywords: bandwidths selection of the kernel functions, forming a set of independent features, hypothesis testing, information processing, kernel probability density estimate, nonparametric pattern recognition algorithm, optical data processing, pattern recognition, remote sensing data

Abstract: We study how a priori information on the independence of random variables affects the approximation accuracy of the nonparametric Rosenblatt–Parzen probability density estimate. A new technique for generating sets of independent components of a multidimensional random variable is proposed. The methodology is based on testing hypotheses about the independence of combinations of the components of the multidimensional random variable, using a two-alternative nonparametric kernel pattern recognition algorithm that corresponds to the maximum likelihood criterion. The classes correspond to the domains of definition of the probability densities of sets of independent and dependent components of the multidimensional random variable. Kernel-type nonparametric statistics are used to estimate the probability densities. The bandwidths of the kernel probability density estimates are chosen from the condition of the minimum of the root-mean-square criterion.

The sequential procedure for generating a set of independent components begins with the analysis of pairwise combinations of the components of the multidimensional random variable. For each pair, the probabilities of error in recognizing the classes corresponding to the assumptions of independence and dependence of the components are estimated. The pair with the maximum difference between these errors is determined. If the errors do not differ significantly, the multidimensional random variable contains no independent components. If the estimated probabilities of class recognition errors differ significantly, the pair is accepted as independent. These components are then included in a three-component set, whose combinations are analyzed by the same procedure. The process stops when there is no longer a reliable difference between the probabilities of error in recognizing situations that belong to the accepted classes. In that case, the previous set of independent components is the desired result.

In contrast to the traditional methodology based on the Pearson criterion, the proposed approach avoids the problem of decomposing the range of values of the random variables into multidimensional intervals. The method is illustrated by the results of an analysis of spectral features of remote sensing data of forest tracts obtained from Landsat-8 satellite imagery. © 2021, Institution of Russian Academy of Sciences. All rights reserved.
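The two density estimates contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Gaussian kernels and, as a stand-in for the root-mean-square-optimal bandwidths mentioned in the abstract, a simple normal-reference rule of thumb. All function names (`rule_of_thumb_bandwidth`, `joint_density`, `independent_density`, `fraction_preferring_dependence`) are hypothetical. The joint estimate is a product-kernel Rosenblatt–Parzen estimate, the independence estimate is the product of one-dimensional marginal estimates, and the two-alternative rule assigns a point to the class whose estimated density is larger.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal kernel evaluated elementwise."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def rule_of_thumb_bandwidth(x):
    """Normal-reference bandwidth for one component (an assumption of
    this sketch; the abstract selects bandwidths by an RMS criterion)."""
    n = x.shape[0]
    return 1.06 * x.std(ddof=1) * n ** (-0.2)


def joint_density(points, sample, h):
    """Product-kernel estimate of the joint density.
    points: (m, k), sample: (n, k), h: (k,) bandwidths."""
    u = (points[:, None, :] - sample[None, :, :]) / h      # (m, n, k)
    return (gaussian_kernel(u) / h).prod(axis=2).mean(axis=1)


def independent_density(points, sample, h):
    """Density estimate under the independence assumption:
    the product of the one-dimensional marginal kernel estimates."""
    u = (points[:, None, :] - sample[None, :, :]) / h      # (m, n, k)
    marginals = (gaussian_kernel(u) / h).mean(axis=1)      # (m, k)
    return marginals.prod(axis=1)


def fraction_preferring_dependence(sample):
    """Share of sample points where the joint estimate exceeds the
    independence estimate (a crude two-alternative decision summary)."""
    h = np.array([rule_of_thumb_bandwidth(sample[:, j])
                  for j in range(sample.shape[1])])
    joint = joint_density(sample, sample, h)
    indep = independent_density(sample, sample, h)
    return float(np.mean(joint > indep))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x1 = rng.normal(size=300)
    # Synthetic data for illustration only: one strongly dependent pair
    # and one pair drawn independently.
    dependent = np.column_stack([x1, x1 + 0.1 * rng.normal(size=300)])
    independent = np.column_stack([x1, rng.normal(size=300)])
    print("dependent pair:  ", fraction_preferring_dependence(dependent))
    print("independent pair:", fraction_preferring_dependence(independent))
```

The sketch evaluates both estimates on the same points used to build them, which is optimistic. Estimating recognition error probabilities as the abstract describes would require a held-out or leave-one-out scheme, which is omitted here.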

Publication

Journal: Computer Optics

Journal issue: Vol. 45, Issue 6

Pages: 926-933

Journal ISSN: 0134-2452

Publisher: Institution of Russian Academy of Sciences

Authors

  • Zenkov I.V. (Siberian Federal University, Svobodny Av. 79, Krasnoyarsk, 660041, Russian Federation; Reshetnev Siberian State University of Science and Technology, Krasnoyarsky Rabochy Av. 31, Krasnoyarsk, 660037, Russian Federation; Krasnoyarsk branch of the Federal Research Center for Information and Computational Technologies, Mira Av. 53, Krasnoyarsk, 660049, Russian Federation)
  • Lapko A.V. (Institute of Computational Modelling SB RAS, Akademgorodok 50, Krasnoyarsk, 660036, Russian Federation; Reshetnev Siberian State University of Science and Technology, Krasnoyarsky Rabochy Av. 31, Krasnoyarsk, 660037, Russian Federation)
  • Lapko V.A. (Institute of Computational Modelling SB RAS, Akademgorodok 50, Krasnoyarsk, 660036, Russian Federation; Reshetnev Siberian State University of Science and Technology, Krasnoyarsky Rabochy Av. 31, Krasnoyarsk, 660037, Russian Federation)
  • Kiryushina E.V. (Siberian Federal University, Svobodny Av. 79, Krasnoyarsk, 660041, Russian Federation)
  • Vokin V.N. (Institute of Computational Modelling SB RAS, Akademgorodok 50, Krasnoyarsk, 660036, Russian Federation)
  • Bakhtina A.V. (Reshetnev Siberian State University of Science and Technology, Krasnoyarsky Rabochy Av. 31, Krasnoyarsk, 660037, Russian Federation)