Wi-Fi sensing Human Detection with Kolmogorov-Wiener Filter and Gated Recurrent Neural Networks

Abstract: Using Received Signal Strength Indicator (RSSI) values to detect human presence is a well-known Wi-Fi sensing technique. This paper surveys existing algorithms for this problem and proposes two new techniques, based on the discrete Kolmogorov-Wiener filter and on a gated recurrent unit neural network. Results of human detection experiments are presented along with an analysis of the algorithms' accuracy.


I. INTRODUCTION
Wi-Fi sensing is based on the use of Wi-Fi devices to recognize people's activity by identifying deviations and features in various characteristics of the Wi-Fi signal. The technology is especially relevant on the eve of the new Wi-Fi standard, which is focused on the active development of Wi-Fi networks for the Internet of Things. Wi-Fi sensing technologies can be used in the field of transport, household security, healthcare and other areas of life.
With the development of Wi-Fi technology, and in particular with improved access point hardware, it has become possible to use Wi-Fi access points as locators. The simplest and most common method of determining the presence of a person near a Wi-Fi access point is based on detecting changes in the RSSI. Various algorithms have already been proposed for detecting such changes, but not all of them achieve acceptable accuracy.
The relevance of this work is due to the active development of Wi-Fi technologies as part of the movement towards the latest standard, 802.11be (Wi-Fi 7) [1]. The Wireless Broadband Alliance has published a document [2] reflecting the current state and problems of Wi-Fi sensing technologies.

II. PROBLEM STATEMENT
The goal is to develop an algorithm that determines the presence of a person by detecting changes in RSSI values (an indicator of the received signal level) obtained from a Wi-Fi access point. Given are a vector of RSSI values $r_{train} = [r_1, r_2, ..., r_n]$ collected at regular intervals and a vector of labels $y_{train} = [y_1, y_2, ..., y_n]$, where the label 1 corresponds to the presence of a person and 0 to absence. It is required to find an algorithm $A(r)$ that is trained on this data (optionally after noise removal). Given new unlabeled data $r_{input}$, the algorithm produces a vector of predicted labels $y_{out}$. The objective is to maximize the accuracy score, that is, the fraction of samples whose labels are predicted correctly, and to minimize the number of Type I and Type II errors.
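The scoring described above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the experiments; the function name `detection_scores` is hypothetical, and it assumes that absence is the null hypothesis, so that a Type I error is a false detection and a Type II error is a missed person.

```python
from typing import Dict, Sequence


def detection_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Score binary presence labels (1 = person present, 0 = absent)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label vectors must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # Type I errors (assumed convention)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # Type II errors (assumed convention)

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Rates are undefined when a class is missing from y_true.
        "tpr": tp / (tp + fn) if (tp + fn) else float("nan"),
        "tnr": tn / (tn + fp) if (tn + fp) else float("nan"),
        "type_I_errors": fp,
        "type_II_errors": fn,
    }


scores = detection_scores([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Reporting the two error counts alongside accuracy matters when the classes are unbalanced, since accuracy alone can hide a detector that rarely reports presence.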

III. RELATED WORKS
The following criteria were used for the survey of the algorithms:
• algorithm class (for example, a statistical filter or a machine learning algorithm)
• presence of a special noise elimination method in the algorithm or in combination with the considered algorithm
• accuracy of the considered algorithm
• equipment required for the experiment
• type and dimensions of the room in which the experiment was performed

A. General approaches
There are two approaches to Wi-Fi sensing. The first is based on the RSSI, and the second is based on Channel State Information (CSI), which describes the state of the communication channel. Both rely on a similar principle: when a person passes between Wi-Fi devices, the RSSI or CSI changes. The RSSI is a physical quantity that characterizes the total power of the signal received by the Wi-Fi device [3]. It is measured on a logarithmic scale in dBm (decibels relative to 1 milliwatt) and can be obtained from almost all devices that work with Wi-Fi. Each Wi-Fi card manufacturer uses its own formula for determining RSSI. The general formula is usually written in the log-distance form
$$P_d = P_{d_0} - 10\, n \log_{10}\left(\frac{d}{d_0}\right),$$
where $d$ is the distance to the signal source in meters, $d_0$ is the distance in meters from the signal source to the reference point at which the RSSI $P_{d_0}$ is measured, $n$ is a dimensionless, empirically determined coefficient of the medium in which the measurement takes place, and $P_d$ is the desired RSSI value.
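As a worked illustration, the sketch below evaluates the standard log-distance path-loss model $P_d = P_{d_0} - 10\, n \log_{10}(d / d_0)$. It assumes that this is the general formula referred to above; the function name and the numeric values are hypothetical, and real devices may report RSSI according to vendor-specific formulas.

```python
import math


def rssi_at_distance(p_d0: float, d: float, d0: float = 1.0, n: float = 2.0) -> float:
    """Predicted RSSI in dBm at distance d (meters).

    p_d0: RSSI in dBm measured at the reference distance d0 (meters).
    n:    dimensionless coefficient of the medium; 2.0 corresponds to free space
          and is only an illustrative default.
    """
    if d <= 0 or d0 <= 0:
        raise ValueError("distances must be positive")
    return p_d0 - 10.0 * n * math.log10(d / d0)


# Hypothetical reading of -40 dBm at 1 m, evaluated at 10 m in free space.
predicted = rssi_at_distance(p_d0=-40.0, d=10.0)
```

With these assumed values each tenfold increase in distance lowers the RSSI by 20 dB, so a person who attenuates the signal by several dB produces a change that is comparable to a noticeable change in distance.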
A very significant physical factor that makes it possible to detect a person using RSSI is the absorption coefficient of the human body. It is close to the absorption coefficient of water (because the human body is mainly composed of water) and differs from that of, for example, walls and furniture [3]. For reference, approximate RSSI absorption values are presented in Table I. For comparison, the absorption value for the human body is approximately 9-30 dB [4], which roughly corresponds to the absorption value for water, 15-20 dB [5].
The CSI-based approach is also worth mentioning. Wi-Fi connections often use MIMO (Multiple Input Multiple Output) technology, which increases the throughput of the communication channel by transmitting data over a system of several antennas. If the transmitting device has an antenna array of M elements and the receiving device has an array of N elements, the connection state can be described by a matrix. During transmission, the j-th element of the transmitter transmits to the k-th element of the receiver ($j \in \{1, ..., M\}$, $k \in \{1, ..., N\}$). The element $h_{kj}$ of the CSI matrix $H = [h_{kj}] = [a_{kj} + b_{kj} i]$, where $i$ is the imaginary unit, characterizes the state of data transmission from the j-th element of the transmitter to the k-th element of the receiver.
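To make the structure of the CSI matrix concrete, the following sketch stores a small matrix of complex entries $a + bi$ and converts each entry to amplitude and phase, the form in which CSI is typically analyzed. The matrix values and the helper name `to_polar` are hypothetical and serve only as an illustration.

```python
import cmath
from typing import List, Tuple

# Hypothetical CSI matrix for N = 2 receiver elements (rows)
# and M = 3 transmitter elements (columns); entries are a + bi.
H: List[List[complex]] = [
    [complex(0.8, 0.6), complex(0.0, 1.0), complex(-0.5, 0.5)],
    [complex(0.3, -0.4), complex(1.0, 0.0), complex(0.6, 0.8)],
]


def to_polar(matrix: List[List[complex]]) -> List[List[Tuple[float, float]]]:
    """Return (amplitude, phase in radians) for every transmitter-receiver pair."""
    return [[(abs(h), cmath.phase(h)) for h in row] for row in matrix]


polar = to_polar(H)
```

Unlike RSSI, which is a single number per measurement, each CSI sample carries one amplitude and one phase per antenna pair.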

B. RSSI-based Approaches
Statistical algorithms are often used to determine the presence of a person from RSSI data. For example, the Kalman filter [6] with subsequent estimation of variance achieves an accuracy of 95%. The experiment in that paper was conducted in a small 3 m × 3 m room using three Wireless Sensor Network (WSN) nodes. A hybrid statistical approach based on a combination of expectation and variance estimates, without noise control, was used in [7]. The experiment was conducted in the halls of a university, and the achieved accuracy exceeded 90%. With special Zigbee modules (used for low-power wireless communication) and an algorithm based on moving averages [8], it was possible to achieve 100% accuracy in laboratory conditions. A probabilistic algorithm based on Markov random fields (MRF) [9] achieved an accuracy of 86% in a large room with an area of 150 m². That study used a Chipcon 1100 wireless communication board.
It is also possible to use machine learning algorithms. An approach based on an extended k-means method achieved an accuracy of 94% [11]. The experiment was carried out in an 18 m × 18 m room using a TelosB board.

C. Survey Conclusion
Thus, the most frequent approach to the problem under consideration relies on various statistical characteristics of the filtered RSSI time series. The best results in this category were shown by an algorithm based on the Kalman filter and by an algorithm based on a combination of moving-average filters. There is also an alternative group of approaches based on machine learning, represented in the survey by an algorithm based on k-means. Table II compares the considered methods according to the selected survey criteria.
New approaches from both categories are proposed for the problem under consideration: the Kolmogorov-Wiener filter as a statistical algorithm, and a gated recurrent unit (GRU) neural network as a machine learning algorithm.

A. Kolmogorov-Wiener Filter
First, the Kolmogorov-Wiener filter is considered; it belongs to the first (statistical) group of approaches. The filter is a simpler analogue of the Kalman filter and has been used successfully to denoise signals [10].
A discrete version of the Kolmogorov-Wiener filter [11] is considered. The filter works as follows. The measured discrete signal $w[n]$ is fed to the filter input; it contains an unknown useful signal $s[n]$. The filter output is an estimate of $s[n]$ that minimizes the mean squared error.
To detect outliers and anomalies corresponding to the presence of a person (assuming that the noise removal was successful), the approach proposed by Frank Hampel [12] is used. Consider a data set $x_1, ..., x_N$ and the sample $X_N = \{X_{(k)}\}$, $k \in \{1, ..., N\}$, where $X_{(k)}$ is the k-th order statistic. As an estimate of the median, the sample median is often recommended: $\mathrm{median}(X_N) = X_{((N+1)/2)}$ for odd $N$ and $\mathrm{median}(X_N) = (X_{(N/2)} + X_{(N/2+1)})/2$ for even $N$. The spread is estimated by the Median Absolute Deviation (MAD):
$$\mathrm{MAD}(X_N) = \mathrm{median}(|x_1 - \mathrm{median}(X_N)|, ..., |x_N - \mathrm{median}(X_N)|).$$
A value $x$ is considered an outlier if
$$|x - \mathrm{median}(X_N)| \ge g(N, a_N)\, \mathrm{MAD}(X_N),$$
where $g(N, a_N)$ is some function of the sample size $N$ and some parameter $a_N$.
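The outlier rule above can be sketched as a sliding-window Hampel identifier. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name, the window half-width, and the constant `threshold` that stands in for $g(N, a_N)$ are all hypothetical choices.

```python
from statistics import median
from typing import List, Sequence


def hampel_outliers(x: Sequence[float], half_window: int = 5,
                    threshold: float = 3.0) -> List[bool]:
    """Flag x[i] when |x[i] - median| >= threshold * MAD in a window centred on i."""
    flags: List[bool] = []
    for i in range(len(x)):
        lo = max(0, i - half_window)
        hi = min(len(x), i + half_window + 1)
        window = x[lo:hi]
        med = median(window)
        mad = median(abs(v - med) for v in window)
        deviation = abs(x[i] - med)
        if mad == 0:
            # Degenerate window (more than half of the values are equal):
            # flag any sample that differs from the median at all.
            flags.append(deviation > 0)
        else:
            flags.append(deviation >= threshold * mad)
    return flags


# Hypothetical denoised RSSI trace in dBm with one sharp drop at index 5.
trace = [-50, -51, -50, -49, -50, -70, -50, -51, -49, -50]
flags = hampel_outliers(trace, half_window=3)
```

Because RSSI is usually reported in whole dBm, windows with zero MAD are common in practice, so the handling of that case (here: any deviation counts) has a visible effect on the false alarm rate and may need tuning.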
Thus, based on the Kolmogorov-Wiener filter with an auxiliary approach, which is often called the Hampel filter, it is possible to construct an algorithm for solving the considered problem of human presence detection.

B. GRU Neural Network
Recurrent Neural Networks (RNNs) are successfully used for processing time series [13]. In particular, Long Short-Term Memory (LSTM) networks performed well on RSSI data. An alternative to LSTM is an architecture based on gated recurrent units (GRU).
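For reference, one common formulation of the GRU cell is given below. The symbols are standard notation rather than notation from this paper: $x_t$ is the input at step $t$, $h_t$ is the hidden state, $\sigma$ is the logistic sigmoid, $\odot$ is element-wise multiplication, and $W$, $U$, $b$ are trainable parameters. Some sources swap the roles of $z_t$ and $1 - z_t$ in the last line.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) && \text{(update gate)} \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) && \text{(candidate state)} \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(new hidden state)}
\end{aligned}
```

Compared with LSTM, the GRU has two gates instead of three and no separate cell state, which reduces the number of parameters per unit.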

C. Decision Trees
Gradient boosting classifiers can be used for time series forecasting [14]. These algorithms combine decision trees so that a strong classifier is built out of weak classifiers. The AdaBoost and Gradient Boosting Machine algorithms are considered in this work.
Random forest is an ensemble algorithm that combines ordinary decision trees. It can also be used for time series forecasting [15].

V. EXPERIMENTAL STAND DESIGN
In practice, the goal is to determine human presence with a system of two Wi-Fi devices: a receiver (a smartphone) and a Wi-Fi access point. Presence is determined from RSSI measurements. As an additional condition, the absence of significant environmental disturbances (movement of large objects, sudden changes in humidity and temperature) is assumed.
The experiment was conducted using a stand consisting of two Wi-Fi devices: a Samsung A7 2018 mobile phone and a TP-Link TL-MR3020 access point running the OpenWrt operating system (firmware dated 07/19/19). Access to bash over an SSH connection is configured on the access point, together with a script that exports RSSI values to a laptop. The distance between the devices is 3 m in the living room and 5 m in the office space. In both cases, the devices were at a height of about 1.5 m.

A. Kolmogorov-Wiener Filter
The Kolmogorov-Wiener filter is implemented as a separate class. During the fit phase, RSSI data measured in the absence of a person is fed to the algorithm, which determines the noise level. In the transform phase, arbitrary RSSI data is passed in to be denoised. Outliers are then detected in the denoised RSSI values with the Hampel filter.
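The fit/transform interface can be sketched as follows. This is not the authors' class: the sketch substitutes a locally adaptive Wiener filter (the same idea as `scipy.signal.wiener`) for the discrete Kolmogorov-Wiener design, and the class name, the window half-width and the use of the sample variance as the noise level are assumptions made for illustration.

```python
from statistics import fmean, pvariance
from typing import List, Optional, Sequence


class LocalWienerDenoiser:
    """Locally adaptive Wiener smoothing with a noise level learned from absence data."""

    def __init__(self, half_window: int = 5) -> None:
        self.half_window = half_window
        self.noise_var_: Optional[float] = None

    def fit(self, rssi_absent: Sequence[float]) -> "LocalWienerDenoiser":
        # Assumption: with nobody present, all variation in RSSI is noise.
        self.noise_var_ = pvariance(rssi_absent)
        return self

    def transform(self, rssi: Sequence[float]) -> List[float]:
        if self.noise_var_ is None:
            raise RuntimeError("call fit() on absence data first")
        out: List[float] = []
        for i, value in enumerate(rssi):
            lo = max(0, i - self.half_window)
            hi = min(len(rssi), i + self.half_window + 1)
            window = rssi[lo:hi]
            mean = fmean(window)
            var = pvariance(window, mean)
            if var <= self.noise_var_:
                # Local variation is explained by noise: keep only the local mean.
                out.append(mean)
            else:
                # Keep the part of the deviation that exceeds the noise level.
                gain = 1.0 - self.noise_var_ / var
                out.append(mean + gain * (value - mean))
        return out


# Hypothetical data: a quiet room for fitting, then a 20 dB drop to denoise.
denoiser = LocalWienerDenoiser(half_window=2).fit([-50, -51, -49, -50, -51, -49])
denoised = denoiser.transform([-50.0] * 5 + [-70.0] * 5)
```

The denoised series would then be passed to an outlier detector such as the Hampel filter. Smoothing flat regions while preserving large jumps is the property that matters here, because the jumps are the events to be detected.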

B. Data Preprocessing
No additional data preprocessing was done for the Kolmogorov-Wiener filter. However, the data was transformed with the sliding window technique before being fed into the tree-based algorithms and the GRU neural network. Let the window size be $w$ for raw RSSI values $r_1, ..., r_n$. The matrix $M = [[r_1, ..., r_w], [r_2, ..., r_{w+1}], ..., [r_{n-w+1}, ..., r_n]]$ is composed and fed into these algorithms. This data set (matrix $M$) is divided into two parts: $M_{train}$ (71.5%) and $M_{test}$ (28.5%). In addition, the hyperparameters of the tree-based algorithms were tuned with cross-validation.
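The construction of the matrix $M$ and its division into two parts can be sketched as below. The function names are hypothetical, and the split is assumed to be chronological (first rows for training, last rows for testing), which the text does not state explicitly.

```python
from typing import List, Sequence, Tuple


def sliding_windows(r: Sequence[float], w: int) -> List[List[float]]:
    """Rows [r_i, ..., r_{i+w-1}] for every start index i; n - w + 1 rows in total."""
    if not 1 <= w <= len(r):
        raise ValueError("window size must be between 1 and len(r)")
    return [list(r[i:i + w]) for i in range(len(r) - w + 1)]


def chronological_split(rows: Sequence, train_fraction: float = 0.715) -> Tuple[list, list]:
    """Split rows into a leading training part and a trailing test part."""
    cut = round(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


M = sliding_windows([-50, -51, -49, -52, -50], w=3)
M_train, M_test = chronological_split(list(range(200)))
```

Neighbouring rows of $M$ share $w - 1$ values, so a shuffled split would place nearly identical rows in both parts; a chronological split avoids that overlap except at the boundary.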

C. GRU Neural Network
The GRU neural network has the following architecture. A GRU layer with 200 units and tanh activation takes $M$ as input. It is followed by two dense layers: the first has 50 units and ReLU activation, and the second has a single unit and linear activation. The Adam optimizer with a mean squared error loss is used. Training takes approximately 36-40 epochs; beyond that, the model starts to overfit.
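A possible Keras definition of this architecture is sketched below. It is an assumption that the original model was built with Keras; the function name is hypothetical, and each window is assumed to be reshaped to `(window_size, 1)`, that is, one RSSI value per time step.

```python
def build_gru_model(window_size: int):
    """GRU(200, tanh) -> Dense(50, ReLU) -> Dense(1, linear), Adam + MSE loss."""
    # Imported lazily so that this module loads even without TensorFlow installed.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(window_size, 1)),  # assumed input layout: (time steps, features)
        keras.layers.GRU(200, activation="tanh"),
        keras.layers.Dense(50, activation="relu"),
        keras.layers.Dense(1, activation="linear"),
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model
```

Since the output is linear and the loss is the mean squared error, a threshold (for example 0.5, an assumed value) is needed to turn predictions into 0/1 labels. The overfitting reported after roughly 36-40 epochs suggests stopping on a validation loss, for which `keras.callbacks.EarlyStopping` is one option.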

A. Goals
It is required to check the following:
1) The accuracy of the algorithms considered and implemented.
2) The reliability of the assembled stand.
3) The correctness of the methodology for detecting a person using RSSI.

B. Methodology
Two experiments were conducted. The first is specific to the Kolmogorov-Wiener filter; the second is common to all the considered algorithms. In the first experiment, the person was absent from the room for 200 s while RSSI data corresponding to human absence was collected. After that, the following actions were performed: 1) for the first 50 seconds, the person crosses the line between the devices and also stays off to the side of it; 2) for the next 150 seconds, the person is out of the room. RSSI values are requested every second. This experiment is intended to show the Kolmogorov-Wiener filter in action.
In the second experiment, which lasts 2100 seconds, the person leaves the room every five minutes. RSSI values are again requested every second. The goal of this experiment is to collect a dataset to fit and score the considered algorithms. The experiments are conducted at two venues: a living room and an office space.

C. Experiment results
The raw RSSI data and the data denoised with the Kolmogorov-Wiener filter are presented in Fig. 1 and Fig. 2.
Accuracy, true positive rate (TPR) and true negative rate (TNR) for all algorithms are estimated on the test part. The accuracy scores for the Kolmogorov-Wiener filter, the GRU neural network and the decision trees are presented in Table III and Table IV.

VIII. CONCLUSION AND FURTHER RESEARCH
Thus, machine learning algorithms are more accurate than the algorithm based on the Kolmogorov-Wiener filter, which, in addition, requires determining the noise level in the room. Machine learning algorithms do not require special adjustment to the noise level. The results and algorithms are publicly available [16].
Wi-Fi sensing is a very promising area of research, as Wi-Fi hardware is constantly improving. Possible directions of further work are as follows:
• develop new algorithms for channel state information data
• develop an algorithm to solve the human detection problem based on both RSSI and CSI values for multiple Wi-Fi access points
• develop techniques to integrate Wi-Fi sensing into existing IoT solutions