Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Method for Training Perceptron on Tabular Data with Missing Values

https://doi.org/10.15514/ISPRAS-2025-37(6)-22

Abstract

Handling missing values in tabular data remains a critical challenge for building robust machine learning models. This paper presents a novel approach to imputation based on unary classification. The proposed method employs an ensemble of perceptrons trained independently for each class to estimate the likelihood of reconstructed values with respect to the empirical support of that class. A uniform distribution over a bounded region of the feature space is used as a background model, enabling the interpretation of the model’s output as an approximation of the posterior probability that an object belongs to a given class. This probabilistic interpretation is then leveraged within an iterative procedure for missing value imputation and classifier training. The theoretical validity of the proposed estimator is rigorously justified. Experiments on synthetic two-dimensional datasets with missing values generated under the MCAR (Missing Completely At Random) mechanism demonstrate the superiority of the proposed method over classical imputation techniques, particularly in scenarios with high missingness rates and complex class boundaries.
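The two ingredients named in the abstract, MCAR masking and imputation guided by a per-class score, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not spell out the iterative procedure, so the grid search over candidate values, the function names (`mcar_mask`, `impute_by_score`, `toy_class_score`) and the Gaussian stand-in for a trained per-class perceptron are all assumptions made for illustration.

```python
import math
import random
from itertools import product


def mcar_mask(rows, rate, seed=0):
    """Hide each entry independently with probability `rate`.

    Missingness does not depend on the data values, which is what
    MCAR (Missing Completely At Random) means.
    """
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def impute_by_score(row, score, candidates):
    """Fill the None entries of `row` with the candidate values that
    maximize `score`.

    `score` is a stand-in (assumption) for the output of one class's
    trained perceptron; the exhaustive grid search is also an assumption.
    """
    missing = [i for i, v in enumerate(row) if v is None]
    if not missing:
        return list(row), score(row)
    best_row, best_score = None, -math.inf
    for values in product(candidates, repeat=len(missing)):
        trial = list(row)
        for i, v in zip(missing, values):
            trial[i] = v
        s = score(trial)
        if s > best_score:
            best_row, best_score = trial, s
    return best_row, best_score


def toy_class_score(center, width=1.0):
    """Hypothetical class score in (0, 1]: a Gaussian bump around `center`."""
    def score(x):
        d2 = sum((a - c) ** 2 for a, c in zip(x, center))
        return math.exp(-d2 / (2 * width ** 2))
    return score


# Usage: reconstruct the second feature of a point from a toy class
# whose support is centered at (1.0, 2.0).
score_a = toy_class_score(center=(1.0, 2.0))
grid = [i * 0.5 for i in range(-6, 7)]  # candidate values in [-3, 3]
filled, s = impute_by_score([1.0, None], score_a, grid)
```

With several classes, the same search could be repeated with each class's score and the highest-scoring reconstruction kept, which mirrors the abstract's reading of the model output as an approximate class posterior.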

About the Authors

Andrey Igorevich PERMINOV
Ivannikov Institute for System Programming of the Russian Academy of Sciences
Russian Federation

Postgraduate student at the Ivannikov Institute for System Programming of the RAS. Research interests: neural network data processing, digital image processing, trusted artificial intelligence.



Andrey Petrovich KOVALENKO
Ivannikov Institute for System Programming of the Russian Academy of Sciences
Russian Federation

Dr. Sci. (Tech.), researcher at the Center for Trusted Artificial Intelligence of the Ivannikov Institute for System Programming of the RAS. Research interests: trusted artificial intelligence.



Denis Yurievich TURDAKOV
Ivannikov Institute for System Programming of the Russian Academy of Sciences
Russian Federation

Cand. Sci. (Phys.-Math.), head of a department at ISP RAS. Research interests: social network analysis, text mining, information extraction, big data, trusted artificial intelligence.



References

1. Lukyanov K. S. et al. Extrapolation of the Bayesian classifier with an unknown support of the two-class mixture distribution // Russian Mathematical Surveys. 2024. Vol. 79, No. 6, pp. 991-1015.

2. Cybenko G. Approximation by superpositions of a sigmoidal function // Mathematics of Control, Signals and Systems. 1989. Vol. 2, No. 4, pp. 303-314.

3. Kovalenko A. Geometric interpretation of a multilayer perceptron with piecewise linear activation functions // 31st scientific and technical conference "MiTSOBIT". Saint Petersburg, 2022. pp. 34-35.

4. Devroye L. Nonparametric density estimation // The L_1 View. 1985.

5. Devroye L., Györfi L., Lugosi G. A probabilistic theory of pattern recognition. Springer Science & Business Media, 2013. Vol. 31.

6. MissingDataPerceptron, https://github.com/dronperminov/MissingDataPerceptron, last accessed: 01 July 2025.


For citations:


PERMINOV A.I., KOVALENKO A.P., TURDAKOV D.Yu. Method for Training Perceptron on Tabular Data with Missing Values. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2025;37(6):93-106. (In Russ.) https://doi.org/10.15514/ISPRAS-2025-37(6)-22



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)