<!DOCTYPE article
PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20190208//EN"
       "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.4" xml:lang="en">
 <front>
  <journal-meta>
   <journal-id journal-id-type="publisher-id">Journal of Technical Research</journal-id>
   <journal-title-group>
    <journal-title xml:lang="en">Journal of Technical Research</journal-title>
    <trans-title-group xml:lang="ru">
     <trans-title>Журнал технических исследований</trans-title>
    </trans-title-group>
   </journal-title-group>
   <issn publication-format="print">2500-3313</issn>
  </journal-meta>
  <article-meta>
   <article-id pub-id-type="publisher-id">106302</article-id>
   <article-categories>
    <subj-group subj-group-type="toc-heading" xml:lang="ru">
     <subject>Информационные технологии и телекоммуникации</subject>
    </subj-group>
    <subj-group subj-group-type="toc-heading" xml:lang="en">
     <subject>Information technology and telecommunication</subject>
    </subj-group>
   </article-categories>
   <title-group>
     <article-title xml:lang="en">Application of machine learning methods and mathematical models for big data analysis</article-title>
    <trans-title-group xml:lang="ru">
     <trans-title>Применение методов машинного обучения и математических моделей для анализа больших данных</trans-title>
    </trans-title-group>
   </title-group>
   <contrib-group content-type="authors">
    <contrib contrib-type="author">
     <name-alternatives>
      <name xml:lang="ru">
       <surname>Исаева</surname>
       <given-names>А. Т.</given-names>
      </name>
      <name xml:lang="en">
       <surname>Isaeva</surname>
       <given-names>Aida Taalaevna</given-names>
      </name>
     </name-alternatives>
     <email>akeldibekova@oshsu.kg</email>
     <xref ref-type="aff" rid="aff-1"/>
    </contrib>
    <contrib contrib-type="author">
     <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-6444-0468</contrib-id>
     <name-alternatives>
      <name xml:lang="ru">
       <surname>Келдибекова</surname>
       <given-names>А. О.</given-names>
      </name>
      <name xml:lang="en">
       <surname>Keldibekova</surname>
       <given-names>Aida Oskonovna</given-names>
      </name>
     </name-alternatives>
     <email>aidaoskk@gmail.com</email>
     <bio xml:lang="ru">
      <p>доктор педагогических наук;</p>
     </bio>
     <bio xml:lang="en">
      <p>doctor of pedagogical sciences;</p>
     </bio>
     <xref ref-type="aff" rid="aff-2"/>
    </contrib>
   </contrib-group>
   <aff-alternatives id="aff-1">
    <aff>
     <institution xml:lang="ru">Ошский государственный университет</institution>
      <city>Ош</city>
     <country>KG</country>
    </aff>
    <aff>
     <institution xml:lang="en">Osh State University</institution>
     <city>Osh</city>
     <country>KG</country>
    </aff>
   </aff-alternatives>
   <aff-alternatives id="aff-2">
    <aff>
     <institution xml:lang="ru">Ошский государственный университет</institution>
     <city>Ош</city>
     <country>Киргизия</country>
    </aff>
    <aff>
      <institution xml:lang="en">Osh State University</institution>
     <city>Osh</city>
     <country>Kyrgyzstan</country>
    </aff>
   </aff-alternatives>
   <pub-date publication-format="print" date-type="pub" iso-8601-date="2025-12-30T00:00:00+03:00">
    <day>30</day>
    <month>12</month>
    <year>2025</year>
   </pub-date>
   <pub-date publication-format="electronic" date-type="pub" iso-8601-date="2025-12-30T00:00:00+03:00">
    <day>30</day>
    <month>12</month>
    <year>2025</year>
   </pub-date>
   <volume>11</volume>
   <issue>4</issue>
   <fpage>23</fpage>
   <lpage>27</lpage>
   <history>
    <date date-type="received" iso-8601-date="2025-11-07T00:00:00+03:00">
     <day>07</day>
     <month>11</month>
     <year>2025</year>
    </date>
    <date date-type="accepted" iso-8601-date="2025-11-14T00:00:00+03:00">
     <day>14</day>
     <month>11</month>
     <year>2025</year>
    </date>
   </history>
   <self-uri xlink:href="https://naukaru.ru/en/nauka/article/106302/view">https://naukaru.ru/en/nauka/article/106302/view</self-uri>
   <abstract xml:lang="ru">
     <p>В условиях стремительного роста объемов информации проблема анализа больших данных (Big Data) приобретает особую актуальность. В данной статье исследуется симбиоз методов машинного обучения (МО) и фундаментальных математических моделей как основа для эффективного извлечения знаний из больших массивов информации. Цель работы — разработка и сравнительная оценка комплекса методов МО, подкрепленных математическим аппаратом, для задач классификации и кластеризации. На основе эксперимента с использованием набора данных UCI Machine Learning Repository проведен сравнительный анализ алгоритмов, включая логистическую регрессию, метод опорных векторов (SVM), случайный лес и многослойный перцептрон. Результаты показывают, что нейронные сети (Accuracy: 0.92, F1-мера: 0.89) и ансамблевые методы демонстрируют превосходство над классическими алгоритмами при работе с разнородными данными. Подчеркивается, что математические модели из областей оптимизации, линейной алгебры и теории вероятностей являются неотъемлемым фундаментом, обеспечивающим корректность и эффективность алгоритмов МО. Делается вывод о целесообразности комплексного подхода, объединяющего вычислительную мощь МО и строгость математических моделей.</p>
   </abstract>
   <trans-abstract xml:lang="en">
     <p>In the context of the rapid growth of information volumes, the problem of big data analysis is becoming particularly relevant. This article investigates the symbiosis of machine learning (ML) methods and fundamental mathematical models as a basis for effective knowledge extraction from large datasets. The aim of the work is the development and comparative evaluation of a set of ML methods, supported by a mathematical apparatus, for classification and clustering tasks. Based on an experiment using a dataset from the UCI Machine Learning Repository, a comparative analysis of algorithms was conducted, including Logistic Regression, Support Vector Machine (SVM), Random Forest, and Multilayer Perceptron. The results show that neural networks (Accuracy: 0.92, F1-score: 0.89) and ensemble methods outperform classical algorithms when working with heterogeneous data. It is emphasized that mathematical models from the fields of optimization, linear algebra, and probability theory form an integral foundation that ensures the correctness and efficiency of ML algorithms. The article concludes that an integrated approach combining the computational power of ML with the rigor of mathematical models is advisable.</p>
   </trans-abstract>
   <kwd-group xml:lang="ru">
    <kwd>большие данные</kwd>
    <kwd>машинное обучение</kwd>
    <kwd>математические модели</kwd>
    <kwd>классификация</kwd>
    <kwd>кластеризация</kwd>
    <kwd>нейронные сети</kwd>
    <kwd>оптимизация</kwd>
   </kwd-group>
   <kwd-group xml:lang="en">
    <kwd>big data</kwd>
    <kwd>machine learning</kwd>
    <kwd>mathematical models</kwd>
    <kwd>classification</kwd>
    <kwd>clustering</kwd>
    <kwd>neural networks</kwd>
    <kwd>optimization</kwd>
   </kwd-group>
  </article-meta>
 </front>
 <body>
  <p></p>
 </body>
 <back>
  <ref-list>
   <ref id="B1">
    <label>1.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Chen, M., Mao, S., &amp; Liu, Y. Big Data: A Survey. Mobile Networks and Applications. 2014. № 19(2). Pp. 171–209.</mixed-citation>
      <mixed-citation xml:lang="en">Chen, M., Mao, S., &amp; Liu, Y. Big Data: A Survey. Mobile Networks and Applications. 2014. No. 19(2). Pp. 171–209.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B2">
    <label>2.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Deisenroth, M. P., Faisal, A. A., &amp; Ong, C. S. Mathematics for Machine Learning. Cambridge University Press. 2020. 398 p.</mixed-citation>
      <mixed-citation xml:lang="en">Deisenroth, M. P., Faisal, A. A., &amp; Ong, C. S. Mathematics for Machine Learning. Cambridge University Press. 2020. 398 p.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B3">
    <label>3.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Bottou, L., Curtis, F. E., &amp; Nocedal, J. Optimization Methods for Large-Scale Machine Learning. SIAM Review. 2018. № 60(2). Pp. 223–311.</mixed-citation>
      <mixed-citation xml:lang="en">Bottou, L., Curtis, F. E., &amp; Nocedal, J. Optimization Methods for Large-Scale Machine Learning. SIAM Review. 2018. No. 60(2). Pp. 223–311.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B4">
    <label>4.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Исаков, Р. Ж., Абдыкалыков, А. А. Возможности применения искусственного интеллекта в диагностике заболеваний в Кыргызстане. Вестник Кыргызско-Российского Славянского университета. 2022. № 22(5). С. 124–130.</mixed-citation>
     <mixed-citation xml:lang="en">Isakov, R. Zh., Abdykalykov, A. A. Possibilities of applying artificial intelligence in disease diagnostics in Kyrgyzstan. Bulletin of the Kyrgyz-Russian Slavic University. 2022. No. 22(5). Pp. 124–130.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B5">
    <label>5.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Murphy, K. P. Probabilistic Machine Learning: An Introduction. MIT Press. 2022. 864 p.</mixed-citation>
     <mixed-citation xml:lang="en">Murphy, K. P. Probabilistic Machine Learning: An Introduction. MIT Press. 2022. 864 p.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B6">
    <label>6.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V., &amp; Gulin, A. CatBoost: unbiased boosting with categorical features. Advances in Neural Information Processing Systems. 2018. Vol. 31.</mixed-citation>
      <mixed-citation xml:lang="en">Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V., &amp; Gulin, A. CatBoost: unbiased boosting with categorical features. Advances in Neural Information Processing Systems. 2018. Vol. 31.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B7">
    <label>7.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Goodfellow, I., Bengio, Y., &amp; Courville, A. Deep Learning. MIT Press. 2016. 800 p.</mixed-citation>
      <mixed-citation xml:lang="en">Goodfellow, I., Bengio, Y., &amp; Courville, A. Deep Learning. MIT Press. 2016. 800 p.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B8">
    <label>8.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Dua, D. and Graff, C. UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science. 2019. [Электронный ресурс]. URL: http://archive.ics.uci.edu/ml</mixed-citation>
     <mixed-citation xml:lang="en">Dua, D. and Graff, C. UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science. 2019. [Electronic resource]. URL: http://archive.ics.uci.edu/ml</mixed-citation>
    </citation-alternatives>
   </ref>
  </ref-list>
 </back>
</article>
