Human Activity Recognition || A Deep Neural Network Approach || A Summary of a Research Paper
Wi-Fi Channel State Information (CSI) is a set of data that describes in detail the wireless channel between a Wi-Fi transmitter (such as a Wi-Fi router or access point) and a receiver (a Wi-Fi enabled device such as a smartphone, laptop or IoT device).
The data includes three main components. Those components are:
- Phase: The phase describes how far the received signal is shifted in time relative to the transmitted signal. A Wi-Fi signal takes a different amount of time to reach the receiver depending on what is in its way, so the phase reflects how the signal interacts with objects, obstacles, or people in the environment before it arrives. When the wireless channel changes, for example because of interference or movement, the phase of the received signal changes too. By analyzing changes in phase over time we can infer characteristics of the environment, such as the presence of moving objects or changes in the physical layout of the space.
- Amplitude: Amplitude refers to the strength or magnitude of the received signal. In CSI data, the amplitude describes the power level of the received Wi-Fi signal. By monitoring changes in amplitude, we can infer variations in signal strength, which may indicate changes in the environment or the presence of objects or people that affect signal propagation.
- Frequency: Frequency refers to the spectral components of the Wi-Fi signal, or the range of frequencies over which the signal is spread. In CSI, frequency-domain information allows us to analyze how different frequency components of the transmitted signal are affected by the wireless channel. The wireless channel can introduce frequency-selective fading, where certain frequency components of the signal are attenuated or amplified more than others. By examining changes in the frequency-domain representation of the signal, we can infer characteristics of the wireless channel, such as the presence of multipath propagation, frequency-dependent attenuation and other channel effects.
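To make the three components concrete: CSI is commonly represented as one complex number per subcarrier frequency, from which amplitude and phase are read off directly. The sketch below assumes that representation; the `csi_row` values are made up for illustration and do not come from any real capture.

```python
import cmath

def amplitude_and_phase(csi_row):
    """Split complex per-subcarrier CSI values into amplitudes and phases."""
    amplitudes = [abs(h) for h in csi_row]        # signal strength per subcarrier
    phases = [cmath.phase(h) for h in csi_row]    # radians, in (-pi, pi]
    return amplitudes, phases

# Hypothetical CSI values for four subcarriers of a single packet.
csi_row = [1 + 1j, 0.5 - 0.5j, -2 + 0j, 0 + 3j]
amps, phases = amplitude_and_phase(csi_row)

for index, (a, p) in enumerate(zip(amps, phases)):
    print(f"subcarrier {index}: amplitude={a:.3f}, phase={p:.3f} rad")
```

Each list position corresponds to one subcarrier, which is where the frequency dimension comes from.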
How is this data used in Deep Learning?
- Feature Extraction: Deep learning models can be used to automatically extract relevant features from raw CSI data. These features can capture complex patterns and relationships within the data that are difficult to extract manually. Architectures such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs) can be employed for feature extraction from CSI sequences.
- Activity Recognition: CSI data can be used to recognize human activities or behaviors in smart environments. Deep learning models trained on CSI data can learn to classify different activities, such as walking, sitting or standing based on the patterns in CSI signals caused by the movement of individuals.
- Localization and Tracking: Deep learning models can be trained to localize and track objects or people based on their impact on the wireless channel. By analyzing changes in CSI data over time, these models can estimate the position and trajectory of moving objects within the monitored area.
- Gesture Recognition: CSI data can be used to recognize gestures or hand movements in human-computer interaction applications. Deep learning models can learn to distinguish between different gestures based on the unique patterns that hand movements produce in the CSI signal.
- Environment Sensing: Deep learning models can be utilized for environment sensing applications, such as monitoring air quality or detecting environmental hazards. Changes in the CSI signal caused by environmental factors can be analyzed to infer relevant information about the surrounding environment.
Background on Traditional Activity Recognition Systems
Human activity recognition has gained tremendous attention in recent years due to numerous applications that aim to monitor the movement and behavior of humans in indoor areas. In an active activity recognition system, the person needs to wear sensors such as a gyroscope and an accelerometer. The sensor data is then processed using supervised learning algorithms. This approach achieves a recognition accuracy of over 90%, but always wearing a device is cumbersome and may not be possible in every real-world use case. To solve this problem, a monitoring system based on wireless signals, which does not violate the privacy of people, is desired.
Human Activity Recognition Based on Wi-Fi CSI Data
The Wi-Fi routers used in our daily life are able to estimate the Wi-Fi channel state, which provides a complex but powerful way to sense the surrounding environment. Combining Wi-Fi CSI with machine learning makes human activity recognition possible without the costly setup and the many other drawbacks of a vision-based approach. With such a system we can determine whether a person has collapsed from a heart attack, enhance security systems by detecting people in total darkness, or even potentially check whether a person is breathing and decide whether to call for emergency help. Two mechanisms are used to analyze Wi-Fi channel states: RSSI (Received Signal Strength Indicator) and CSI (Channel State Information). Of the two, CSI gives more precise information about the channel state. For each transmitter-receiver antenna pair and at each subcarrier frequency, it measures the propagating wireless signal and provides the amplitude and phase distortion of the different sub-channels. As a result, CSI variations in the time domain show different patterns for different people, activities and so on, and these patterns can be used for human activity recognition. Various reports show an average accuracy of 20–40 cm for localisation based on CSI data.
Note: Multipath propagation refers to the phenomenon in which radio frequency signals travel from a transmitter to a receiver via multiple paths, often due to reflection, diffraction and scattering off objects and surfaces in the environment. This can result in multiple copies of the transmitted signal arriving at the receiver with different phase shifts and amplitudes.
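A small numerical sketch can show why multipath matters for CSI. It uses the standard textbook model in which the channel response at a frequency is the sum of one complex term per path, each with its own gain and delay. The two-path channel and its gain and delay values are hypothetical, and any constant phase contributed by the carrier frequency is assumed to be folded into the path gains.

```python
import cmath
import math

def channel_response(freq_hz, paths):
    """Sum the contribution of every path at one (baseband) frequency.

    paths is a list of (gain, delay_seconds) tuples.
    """
    return sum(gain * cmath.exp(-2j * math.pi * freq_hz * delay)
               for gain, delay in paths)

# Hypothetical channel: a direct path plus one reflection arriving 50 ns later.
paths = [(1.0, 0.0), (0.6, 50e-9)]

# 56 subcarriers of a 20 MHz channel, spaced 312.5 kHz apart (index 0 unused).
spacing_hz = 312_500
offsets = [k * spacing_hz for k in range(-28, 29) if k != 0]
amplitudes = [abs(channel_response(f, paths)) for f in offsets]

print(f"strongest subcarrier: {max(amplitudes):.3f}")
print(f"weakest subcarrier:   {min(amplitudes):.3f}")
```

The two copies add up on some subcarriers and partly cancel on others, which is the frequency-selective fading described earlier. When a person moves, path gains and delays change, and so does this per-subcarrier pattern.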
Orthogonal Frequency Division Multiplexing
OFDM is a modulation technique commonly used in modern wireless communication systems, such as Wi-Fi, 4G LTE and digital television broadcasting. It works by dividing the available spectrum into multiple orthogonal subcarriers, each carrying a portion of the data. In simple words, OFDM is like sending different colored blocks (subcarriers) together through a tunnel (communication channel) to make it faster, without them getting mixed up along the way. For example, for a channel bandwidth of 20 MHz there are typically 56 subcarriers, and for a bandwidth of 40 MHz there are 114 subcarriers.
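The "blocks that do not get mixed up" idea can be checked with a toy example. In OFDM the transmitter typically combines the subcarriers with an inverse discrete Fourier transform and the receiver separates them with the forward transform. The sketch below uses a plain, slow DFT and only 8 subcarriers to stay readable; the symbol values are arbitrary, and real Wi-Fi adds guard intervals, pilots and much more.

```python
import cmath
import math

def idft(symbols):
    """Transmitter side: combine one symbol per subcarrier into time samples."""
    n = len(symbols)
    return [sum(symbols[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def dft(samples):
    """Receiver side: recover the symbol carried by each subcarrier."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

# One arbitrary complex symbol per subcarrier (8 subcarriers for brevity).
tx_symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]

time_signal = idft(tx_symbols)   # all subcarriers share the same waveform
rx_symbols = dft(time_signal)    # yet each one can be separated again

worst_error = max(abs(a - b) for a, b in zip(tx_symbols, rx_symbols))
print(f"largest reconstruction error: {worst_error:.2e}")
```

Because the subcarriers are orthogonal, every symbol comes back unchanged apart from floating-point rounding.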
Channel Bandwidth in Wi-Fi
In Wi-Fi communication, channel bandwidth refers to the range of frequencies allocated for data transmission. The wider the bandwidth, the more data can be transmitted simultaneously. Wider channels offer higher data rates, but they also require more spectrum space and can be more prone to interference. The spectrum space mentioned here refers to the range of frequencies within the electromagnetic spectrum that are available for various wireless communication technologies and services. The allocation of spectrum space is regulated by government agencies in each country to prevent interference between different services and ensure efficient use of limited spectrum resources.
Data Transmission through Waves
Data is transferred through Wi-Fi waves using a process called modulation. Here’s a simplified explanation of how it works:
- Digital Data: First, the digital data (such as text, images, or videos) that you want to send from one device to another is converted into a stream of binary digits, which are just combinations of 1s and 0s. This data represents the information you want to transmit.
- Modulation: The binary data is then modulated onto a carrier wave. This carrier wave is like the delivery truck that carries your data through the air. The modulation process changes the properties of the carrier wave, such as its frequency, amplitude, or phase, in a way that represents the binary data.
- Wi-Fi Signal Transmission: Once the data is modulated onto the carrier wave, the resulting Wi-Fi signal is transmitted through the air using antennas. The antennas emit the Wi-Fi signal, which travels through the air as electromagnetic waves.
- Reception: At the receiving end, another device (such as a smartphone or a laptop) picks up the Wi-Fi signal using its antenna. The Wi-Fi signal is then demodulated to extract the original binary data. Demodulation reverses the modulation process, recovering the digital data from the carrier wave.
- Data Processing: Finally, the digital data is processed and presented to the user in a usable format, such as displaying a webpage on a screen or playing a video on a device.
Throughout this process, multiple techniques are used to ensure reliable and efficient transmission of data, including error correction, encryption, and channel access protocols. These techniques help to minimize errors, protect the data from unauthorized access, and ensure fair access to the Wi-Fi network for all devices.
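The modulation and demodulation steps above can be sketched with binary phase-shift keying (BPSK), one of the simplest schemes: a 1 is sent with the carrier unchanged and a 0 with the carrier phase flipped by 180 degrees. This is only a toy illustration. The 1 kHz carrier, sample rate and bit length are arbitrary choices, there is no noise, and real Wi-Fi applies modulation per OFDM subcarrier at GHz carrier frequencies.

```python
import math

def bpsk_modulate(bits, carrier_hz, sample_rate, samples_per_bit):
    """Encode each bit in the phase of a cosine carrier (0 or pi radians)."""
    signal = []
    for i, bit in enumerate(bits):
        phase = 0.0 if bit == 1 else math.pi
        for s in range(samples_per_bit):
            t = (i * samples_per_bit + s) / sample_rate
            signal.append(math.cos(2 * math.pi * carrier_hz * t + phase))
    return signal

def bpsk_demodulate(signal, carrier_hz, sample_rate, samples_per_bit):
    """Correlate each bit period with the reference carrier to recover the bit."""
    bits = []
    for i in range(len(signal) // samples_per_bit):
        correlation = 0.0
        for s in range(samples_per_bit):
            n = i * samples_per_bit + s
            reference = math.cos(2 * math.pi * carrier_hz * n / sample_rate)
            correlation += signal[n] * reference
        bits.append(1 if correlation > 0 else 0)
    return bits

bits = [1, 0, 1, 1, 0, 0, 1]
wave = bpsk_modulate(bits, carrier_hz=1000.0, sample_rate=16000.0, samples_per_bit=32)
recovered = bpsk_demodulate(wave, carrier_hz=1000.0, sample_rate=16000.0, samples_per_bit=32)
print(recovered == bits)
```

The receiver only needs to know the carrier and the bit timing to undo what the transmitter did, which is the sense in which demodulation reverses modulation.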
System Setup for CSI
There are three options for getting CSI data from routers: the Linux 802.11n CSI Tool, the Atheros CSI Tool and the Nexmon Channel State Information Extractor. Of these, the Atheros CSI Tool is used here to extract the CSI data. Two routers are needed, one as transmitter and one as receiver. The transmitter works in Access Point (AP) mode and the receiver is configured in Client mode. Both are set up with OpenWRT, which makes it possible to configure the operating frequency, mode, channel bandwidth and other parameters. The client router then has to associate with the created AP via the OpenWRT interface so that both stay in the same network. A custom Atheros CSI Tool firmware was built and installed on the routers to enable extraction of physical layer information.
Note: TX refers to the transmit side, where one device (the transmitter or TX device) sends data to another device. RX refers to the receive side, where another device (the receiver or RX device) detects and captures the transmitted signal.
The system consists of two programs, one that continuously sends a Wi-Fi packet (transmitter) and one that receives it and calculates the CSI (receiver). The packet contains the data used to compute the CSI. CSI represents the signal propagation effects in the channel between the transmitter and the receiver. It describes how the signal changes during transmission, which is crucial for understanding the characteristics of the wireless channel.
As the router does not have enough storage to save large amounts of data, it sends the raw CSI data to the user’s laptop. The laptop receives the raw CSI data from the receiver, processes it, and stores it for further analysis. It also visualizes the CSI data in real time, for example as plots or graphs, so that users can analyze and interpret it on the fly.
Simultaneously, a camera takes photos of the room for automatic scene parsing. These images are used to validate the human activity recognition system: by analyzing them, the system can verify the accuracy of the human activities detected from the received Wi-Fi signals.
The collected CSI data then needs to be preprocessed before it is fed to the deep learning model. A deep learning model, such as a convolutional neural network for image processing or a recurrent neural network for sequential data like CSI, is trained using labeled data. Once trained, the model can be used to predict or classify human activities from new CSI data and images. The model analyzes the features extracted from the data and provides predictions about the observed activities.
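One common way to prepare a packet stream for a sequence model is to cut it into fixed-length, overlapping windows, each with a single label. The sketch below is an assumption about how that step could look, not the procedure from the paper. The function name, the window and step sizes, and the majority-vote labeling rule are all illustrative.

```python
from collections import Counter

def sliding_windows(packets, labels, window, step):
    """Cut a per-packet CSI stream into (window_of_packets, label) pairs.

    Each window is labeled with the most frequent per-packet label inside it.
    """
    segments = []
    for start in range(0, len(packets) - window + 1, step):
        chunk = packets[start:start + window]
        chunk_labels = labels[start:start + window]
        majority = Counter(chunk_labels).most_common(1)[0][0]
        segments.append((chunk, majority))
    return segments

# Hypothetical stream: 10 packets with 2 feature values each, labeled per packet.
packets = [[float(i), float(i) * 0.5] for i in range(10)]
labels = ["walking"] * 5 + ["sitting"] * 5

segments = sliding_windows(packets, labels, window=4, step=2)
for chunk, label in segments:
    print(len(chunk), "packets ->", label)
```

Windows that straddle a change of activity contain a mix of labels. How to label such transitions is a design decision that affects what the model can learn.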
Wi-Fi Experimental Network Configuration
The network configuration is a crucial part of the human activity recognition task, so parameters such as frequency, channel, bandwidth and number of antennas need to be chosen carefully. The optimal configuration is found by comparing the amplitude and phase measured over different channels, bandwidths and so on for the same human motion pattern, plotting the results on a graph, and visually choosing the configuration in which the activity pattern is most easily distinguished. These are the things to consider when configuring the network for human activity recognition:
- It is better to choose the least loaded channel from the recommended non-overlapping list. For example, for 2.4 GHz, it could be 1, 6 or 11. The pattern of walking activity was more clearly distinguishable on channel 11 than on the other channels.
- The pattern visibility was better in 40 MHz bandwidth than the 20 MHz.
- The amplitude variability at 5 GHz is smaller than at 2.4 GHz resulting in less data noise.
- The higher the number of antennas, the better. When a person is not in the line of sight, it is almost impossible to detect any activities and movements in the 1Tx 1Rx configuration. At least two transmitter and two receiver antennas are required to detect human activity outside the line of sight.
Dataset Collection
Two publicly available datasets were considered. Both were collected using outdated hardware, and both are limited in how well they represent real-world scenarios and human behavior, because they contain no transitions between activities and were gathered in an over-controlled, lab-like environment. A new dataset was therefore collected as part of this research. It includes activities such as walking, sitting, standing, lying down, getting up, getting down, and periods of no activity when the room is empty. The collection process involved three different rooms, and each packet of CSI data was labeled with an image, the corresponding activity, and a bounding box indicating the person in the image. This dataset addresses the limitations of the previous ones by providing a more diverse set of activities, including transitions between activities, and by capturing data in a real-world environment.
Human Activity Recognition Model
The process of building a stable and accurate Human Activity Recognition system involves significant data preprocessing due to inherent noise in Channel State Information data, caused by environmental factors and hardware instability. This noise renders raw phase information unusable, necessitating preprocessing techniques.
- Phase Sanitization: Raw phase data is affected by carrier frequency offset and sampling frequency offset, making it unreliable for activity recognition. A linear transformation is applied to mitigate these effects, making the phase data less noisy and enabling the observation of activity-related changes.
- Outliers Removal: Noise in amplitude and phase data, stemming from various sources such as transmission rate changes and thermal noise, introduces outliers. The Hampel identifier algorithm is used to identify and remove these outliers, enhancing the quality of the signal.
- Noise Reduction with Discrete Wavelet Transform: Despite previous preprocessing steps, CSI data may still contain significant noise. The DWT-based noise reduction algorithm is implemented to remove noise from the data without sacrificing signal quality, improving the overall reliability of the dataset.
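The three preprocessing steps can be sketched in a few lines each. These are simplified, commonly used formulations and not necessarily the exact variants from the paper. Phase sanitization here subtracts a straight line fitted across the subcarrier indices. The Hampel step replaces outliers with the local median. The wavelet step is a single-level Haar transform with soft thresholding, whereas practical pipelines usually use several levels and other wavelets. All function names, window sizes and thresholds are assumptions.

```python
import math
import statistics

def unwrap(phases):
    """Remove 2*pi jumps between neighbouring subcarriers."""
    out = [phases[0]]
    for p in phases[1:]:
        delta = p - out[-1]
        delta -= 2 * math.pi * round(delta / (2 * math.pi))
        out.append(out[-1] + delta)
    return out

def sanitize_phase(phases, indices):
    """Subtract the linear trend (slope and mean offset) across subcarriers."""
    unwrapped = unwrap(phases)
    slope = (unwrapped[-1] - unwrapped[0]) / (indices[-1] - indices[0])
    offset = sum(unwrapped) / len(unwrapped)
    return [p - slope * k - offset for p, k in zip(unwrapped, indices)]

def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Replace points far from the local median with that median."""
    scale = 1.4826  # makes the MAD comparable to a standard deviation
    cleaned = list(values)
    for i in range(len(values)):
        window = values[max(0, i - half_window):i + half_window + 1]
        median = statistics.median(window)
        mad = statistics.median([abs(v - median) for v in window])
        if abs(values[i] - median) > n_sigmas * scale * mad:
            cleaned[i] = median
    return cleaned

def haar_denoise(values, threshold):
    """One-level Haar DWT, soft-threshold the details, then invert.

    Expects an even number of samples.
    """
    root2 = math.sqrt(2.0)
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    detail = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / root2, (a - d) / root2])
    return out

amplitude = [1.0, 1.1, 0.9, 1.0, 9.0, 1.05, 0.95, 1.0, 1.1]
print(hampel_filter(amplitude))  # the spike at index 4 is replaced
```

A typical order is to sanitize the phase of each packet across subcarriers, then apply the outlier filter and the wavelet denoising along time for each subcarrier.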
Following data preprocessing, the machine learning model is trained and validated using the prepared dataset. Given the nature of CSI data and the complexity of human activities, conventional methods like SVM and decision trees are deemed ineffective. Hence, a neural network approach, particularly Long Short-Term Memory networks, is chosen due to their ability to capture temporal dependencies in sequential data.
The LSTM-based approach proves more effective for HAR tasks with limited data, providing better accuracy and generalization despite some remaining challenges, such as overfitting and the need for further analysis of classification performance for certain activities.
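To give an intuition for how an LSTM captures temporal dependencies, here is a minimal forward pass of a single LSTM unit using the standard gate equations. It is a teaching sketch, not the model from the paper. It has one hidden unit, hand-picked weights and no training, and the `params` layout is made up for this example. A real system would use an LSTM layer from a deep learning framework.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM.

    params maps each gate name ("i", "f", "g", "o") to a tuple
    (input_weights, hidden_weight, bias).
    """
    def pre_activation(gate):
        w_x, w_h, bias = params[gate]
        return sum(w * v for w, v in zip(w_x, x)) + w_h * h_prev + bias

    i = sigmoid(pre_activation("i"))    # input gate: how much new info to store
    f = sigmoid(pre_activation("f"))    # forget gate: how much old state to keep
    g = math.tanh(pre_activation("g"))  # candidate value to store
    o = sigmoid(pre_activation("o"))    # output gate: how much state to expose
    c = f * c_prev + i * g              # cell state carries memory across steps
    h = o * math.tanh(c)
    return h, c

def run_lstm(sequence, params):
    """Feed a sequence of feature vectors through the unit; return the last output."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h

# Hand-picked weights: only the candidate gate looks at the input.
params = {
    "i": ([0.0], 0.0, 0.0),
    "f": ([0.0], 0.0, 0.0),
    "g": ([1.0], 0.0, 0.0),
    "o": ([0.0], 0.0, 0.0),
}

early = run_lstm([[1.0], [0.0]], params)
late = run_lstm([[0.0], [1.0]], params)
print(f"{early:.4f} vs {late:.4f}")
```

The same two inputs in a different order give different outputs, because the cell state remembers when something happened. This order sensitivity is what makes the architecture suited to CSI time series.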
Future Work
The dataset can be increased in size by incorporating data from additional locations, involving more individuals and considering hardware variations to enhance model generalization. In addition, advanced feature extraction techniques such as Principal Component Analysis, Power Spectral Density, signal skewness and Median Absolute Deviation could be used to extract more meaningful information from the available data, potentially improving model performance.
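Two of the features mentioned above, skewness and Median Absolute Deviation, are simple enough to sketch directly. The code below uses the usual population definitions. The sample `window` values are made up, and whether these features actually help is exactly the open question this section raises.

```python
import statistics

def median_absolute_deviation(values):
    """Median of the absolute distances from the median (robust spread)."""
    median = statistics.median(values)
    return statistics.median([abs(v - median) for v in values])

def skewness(values):
    """Population skewness: third central moment divided by std cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0

# Hypothetical amplitude window for one subcarrier.
window = [1.0, 2.0, 2.0, 3.0, 10.0]
features = {
    "mad": median_absolute_deviation(window),
    "skewness": skewness(window),
}
print(features)
```

Computed per subcarrier and per time window, values like these could serve as a compact input to a classifier instead of, or alongside, the raw CSI sequence.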
