sEMG Signal Processing & Gesture Classification
In 2023, I started a master's program in data science, and for my thesis I found myself in a completely new area: signal processing. My supervisor was working heavily on EEG and EMG applications at the time, and since I really admired the way he thinks and works, I wanted to work with him. In my first year I had been planning a project in marketing, more specifically CRM, where I felt more comfortable. sEMG turned out to be a pleasant surprise, and my interest in digital health actually started with that project.
In this post I will talk about my sEMG Signal Processing & Gesture Classification project. I uploaded my Jupyter notebook to my GitHub, and recently I got an email from a student about it, which made me really happy because I never thought anyone would use that code. That prompted me to write this post, which dives deep into the methods and data science approaches from my study of sEMG classification. You can always reach the code and anonymized data on my GitHub page, and I would love to hear your comments and about your own work in this field. I also tested different approaches on the UCI sEMG public dataset; you can reach the data repository through this link.
What Is sEMG?
Surface electromyography (sEMG) measures the electrical activity of muscles through non-invasive electrodes on the skin. These signals—tiny voltage fluctuations—are a direct result of neural commands sent to muscles. By capturing and analyzing these signals, we can map them to specific gestures.
The goal? To enable prosthetics or robotics to respond to natural human intentions. sEMG-based systems have applications in rehabilitation, assistive devices, and even gaming, but achieving reliable classification requires sophisticated methods.
Our project aimed to create a 2-channel, affordable prosthetic robotic hand. Before moving on, let’s take a moment to clarify what a "channel" is and why the number of channels is so critical—both technically and economically.
A channel in the context of sEMG refers to an electrode pair capturing electrical activity from a specific muscle group. The number of channels directly correlates with the system’s ability to differentiate gestures. For example, more channels allow finer control and recognition of complex gestures, but they also increase the system’s complexity, cost, and processing requirements.
By using only two channels, we aim to strike a balance: achieving functional gesture recognition while keeping the device affordable and accessible. This design choice makes the technology viable for broader adoption, especially in underserved communities where cost is a major barrier. In essence, fewer channels simplify the system without compromising usability for basic prosthetic needs.
Degrees of Freedom and Gestures
Before we move into gestures, it’s crucial to understand the concept of degrees of freedom and their significance in the context of the human hand.
Degrees of freedom (DOF) refer to the independent ways in which a system can move. The human hand is a marvel of biomechanics, boasting over 20 DOFs. Each joint—from the wrist to the phalanges—contributes to the hand’s versatility, enabling complex movements like grasping, pinching, and typing.
In prosthetics, replicating even a fraction of this natural DOF is a significant challenge. More DOFs allow for greater dexterity and control, but they also increase the complexity of the system’s design, control algorithms, and user training. For my project, balancing DOFs with simplicity was key. By focusing on basic but functional gestures, we prioritized accessibility and ease of use without sacrificing core capabilities.
The human hand’s complexity is further highlighted by its structure: 27 bones, numerous muscles, tendons, and over 17,000 tactile sensors work in harmony to deliver precise and intricate movements. Natural hands effortlessly perform a variety of grasping techniques (e.g., spherical, cylindrical, lateral, pinching) and gesturing abilities that make daily activities possible. However, replicating these movements in prosthetics remains a formidable challenge, even with technological advancements.
Recent improvements in prosthetic hands, such as lighter and more compact motors, have enabled higher degrees of freedom, allowing for enhanced functionality. While human hands naturally possess 22 degrees of freedom, our project focused on achieving utility with six selected gestures: rest, wrist extension, pronation, fist, radial deviation, and wrist flexion. These gestures are widely studied in signal processing and supervised learning, making them an ideal foundation for aligning my research with established methodologies.
The Data Pipeline: From Raw Signals to Gestures
My project followed a structured pipeline to process and classify sEMG signals effectively. Here, we delve into the critical steps that ensured accurate classification.
1. Normalization
Normalization is a critical preprocessing step in signal processing, ensuring that data is comparable across sessions, individuals, or conditions. sEMG signals vary widely due to differences in muscle physiology, electrode placement, and recording conditions. Without normalization, these variations could compromise the reliability of classification models.
Different Applications of Normalization:
Peak Activation Levels During Maximum Contractions: This method normalizes signals relative to the maximum activation recorded during a task. It’s commonly used in tasks requiring high muscle force.
M-Max (Peak-to-Peak Amplitude of the Maximum M-Wave): Involves using the maximum muscle response from a single electrical stimulus as a reference point.
Activation Levels During Submaximal Isometric Contractions: Normalizes signals based on sustained, submaximal muscle contractions, offering a more dynamic reference for lower-intensity tasks.
For my project, we adopted a global min-max normalization approach to address the domain shift issue inherent in inter-subject sEMG-based hand gesture classification. This method assumes a reference data cycle has been collected from the testing subject. From this cycle, the maximum and minimum sEMG potentials across all channels for each class are computed. These values are then used to normalize all subsequent data cycles.
Normalization was especially important for my project because the data was collected over different days and from multiple participants. By applying global min-max values, we ensured consistent signal scaling across sessions, mitigating the variability that could otherwise hinder inter-subject classification accuracy.
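The reference-cycle idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's actual code: the array shapes, the two-channel layout, and the random stand-in data are my assumptions here.

```python
import numpy as np

def global_min_max_normalize(signal, ref_min, ref_max):
    """Scale each channel using min/max values taken from a reference cycle,
    so all subsequent cycles share the same amplitude range."""
    return (signal - ref_min) / (ref_max - ref_min)

# Reference cycle: (samples, channels) — in the study this would be one
# calibration cycle recorded from the testing subject. Here it is synthetic.
rng = np.random.default_rng(0)
ref_cycle = rng.normal(size=(1000, 2))
ref_min = ref_cycle.min(axis=0)   # per-channel minimum
ref_max = ref_cycle.max(axis=0)   # per-channel maximum

# Any later cycle is normalized with the *same* reference values,
# which is what mitigates session-to-session amplitude drift.
new_cycle = rng.normal(size=(1000, 2))
normalized = global_min_max_normalize(new_cycle, ref_min, ref_max)
```

Note that values in later cycles can fall slightly outside [0, 1] if they exceed the reference extremes; whether to clip them is a design choice.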
2. Outlier Handling
Outlier handling is another crucial preprocessing step, especially in sEMG signals, which are prone to noise and artifacts due to factors such as electrode movement, skin conductivity changes, or external interference. Ignoring outliers can lead to skewed datasets and compromised classification accuracy.
Different Methods for Outlier Handling:
Z-Score Analysis: Identifies outliers based on the number of standard deviations a data point deviates from the mean.
Interquartile Range (IQR): Flags data points that fall outside the typical range defined by the 25th and 75th percentiles.
Hampel Filter: Replaces outliers with the median of nearby data points within a defined window.
Through experimentation, I observed that the choice of outlier handling method significantly impacted classification accuracy. After applying various techniques, I opted for the Hampel filter due to its robust performance in retaining signal integrity while effectively mitigating outliers. This method provided a balance between preserving useful information and eliminating noise.
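To make the Hampel filter concrete, here is a simple sketch. The window size, threshold, and the injected spike are illustrative assumptions, not the parameters used in the study.

```python
import numpy as np

def hampel_filter(x, window=11, n_sigmas=3):
    """Replace samples deviating more than n_sigmas scaled-MADs from the
    rolling median with that median. `window` should be odd."""
    k = 1.4826  # converts MAD to an std estimate for Gaussian data
    half = window // 2
    y = x.copy()
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        med = np.median(x[lo:hi])
        mad = k * np.median(np.abs(x[lo:hi] - med))
        if np.abs(x[i] - med) > n_sigmas * mad:
            y[i] = med  # outlier: replace with local median
    return y

# Demo: a clean sinusoid with one artificial motion artifact.
t = np.linspace(0, 1, 200)
signal = np.sin(2 * np.pi * 5 * t)
signal[100] = 10.0                 # injected spike
cleaned = hampel_filter(signal)    # spike replaced, waveform preserved
```

Because the replacement value is the local median rather than zero or an interpolation, the underlying waveform shape is largely preserved, which matches why the filter "retains signal integrity."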
3. Filtering
Signal filtering is a critical step in preprocessing sEMG signals, ensuring that only relevant data is retained while noise and artifacts are minimized. Filters operate by isolating specific frequency ranges, allowing for better interpretation and classification of the signals.
Applications in sEMG:
Noise Reduction: Eliminates unwanted frequencies such as powerline interference (50/60 Hz).
Signal Smoothing: Reduces sharp transitions that may arise from electrode movement.
Frequency Isolation: Focuses on the frequency band most relevant to muscle activity.
What is Power Spectrum Density (PSD)? Power Spectrum Density (PSD) is a measure of how power is distributed across different frequency components in a signal. For sEMG, PSD helps identify the dominant frequency range associated with muscle contractions, which typically falls between 10 Hz and 400 Hz. Analyzing PSD ensures that filters are designed to retain meaningful data while excluding irrelevant noise.
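A PSD estimate is straightforward to compute with SciPy's Welch method. The sampling rate, signal composition, and segment length below are assumptions for illustration; the peak simply shows how a dominant frequency stands out in the spectrum.

```python
import numpy as np
from scipy.signal import welch

fs = 1000  # assumed sampling rate in Hz
t = np.arange(0, 2, 1 / fs)

# Synthetic stand-in: activity at 80 Hz (inside the sEMG band) plus
# 50 Hz powerline interference and broadband noise.
rng = np.random.default_rng(1)
x = (np.sin(2 * np.pi * 80 * t)
     + 0.5 * np.sin(2 * np.pi * 50 * t)
     + 0.1 * rng.normal(size=t.size))

freqs, psd = welch(x, fs=fs, nperseg=512)
dominant = freqs[np.argmax(psd)]  # should land near 80 Hz
```

Plotting `psd` against `freqs` makes the 50 Hz interference peak visible as well, which is exactly the kind of diagnostic that guides filter design.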
Filtering Techniques: In our project, we applied a Butterworth filter, known for its smooth frequency response and minimal distortion. Other commonly used filtering techniques include:
Chebyshev Filters: Provide sharper cutoff but introduce ripples in the passband or stopband.
Elliptic Filters: Offer the steepest cutoff rates but at the expense of greater passband and stopband ripples.
Bandpass Filters: Ideal for isolating the 10–400 Hz range, which captures most of the relevant sEMG signal activity.
By leveraging the Butterworth filter, we achieved a balance between effective noise reduction and preservation of the essential frequency components of sEMG signals. This step laid the groundwork for accurate feature extraction and classification in subsequent stages.
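A Butterworth bandpass over the 10–400 Hz band can be sketched with SciPy as below. The filter order, sampling rate, and the synthetic drift/muscle components are my assumptions; the study's exact filter parameters may differ.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(x, low=10.0, high=400.0, fs=1000.0, order=4):
    """Zero-phase Butterworth bandpass; cutoffs in Hz, normalized to Nyquist."""
    nyq = fs / 2
    b, a = butter(order, [low / nyq, high / nyq], btype="band")
    return filtfilt(b, a, x)  # filtfilt avoids phase distortion

# Demo: low-frequency motion artifact plus an in-band 100 Hz component.
fs = 1000.0
t = np.arange(0, 2, 1 / fs)
drift = 2.0 * np.sin(2 * np.pi * 1.0 * t)    # out-of-band artifact
muscle = np.sin(2 * np.pi * 100.0 * t)       # in-band activity
cleaned = bandpass(drift + muscle, fs=fs)    # drift strongly attenuated
```

Using `filtfilt` (forward–backward filtering) keeps the smooth Butterworth response while cancelling phase shift, so signal timing is preserved for the feature-extraction stage.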
4. Feature Extraction
Feature extraction transforms raw sEMG signals into meaningful representations that machine learning models can use for classification. These features capture unique aspects of the signal from different domains, making them critical for accurate gesture recognition.
Time-Domain Features: These features describe the signal’s behavior over time and are computationally efficient to extract. Common examples include:
Mean Absolute Value (MAV): Measures the average amplitude of the signal, reflecting overall muscle activity.
Zero Crossing Rate (ZCR): Counts how often the signal crosses the zero line, providing insights into frequency and signal changes.
Waveform Length (WL): Represents the cumulative length of the signal waveform over time, correlating with signal complexity.
Frequency-Domain Features: Frequency-domain features analyze the power distribution of the signal across various frequencies. They are particularly useful for identifying patterns in muscle activity:
Power Spectral Density (PSD): Indicates the power distribution over frequency, crucial for identifying dominant muscle activity ranges.
Mean Frequency (MNF): Represents the weighted average of the signal's frequency components.
Median Frequency (MDF): The frequency that divides the power spectrum into two equal halves, useful for fatigue analysis.
Time-Frequency Domain Features: These features blend time and frequency information, capturing localized frequency characteristics as they vary over time. Techniques include:
Wavelet Coefficients: Extract detailed, localized frequency features, ideal for transient signal analysis.
Spectrogram Features: Visualize the signal’s frequency content dynamically, highlighting changes over time.
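The time-domain features above have simple closed forms. Here is a minimal NumPy sketch of MAV, ZCR, and WL; the tiny test signal is just for illustration (in practice these would be computed per sliding window, per channel).

```python
import numpy as np

def mav(x):
    """Mean Absolute Value: average signal amplitude."""
    return np.mean(np.abs(x))

def zero_crossings(x):
    """Zero Crossing count: number of sign changes between adjacent samples."""
    return int(np.sum(np.signbit(x[1:]) != np.signbit(x[:-1])))

def waveform_length(x):
    """Waveform Length: cumulative absolute change between samples."""
    return np.sum(np.abs(np.diff(x)))

# Toy window: alternating +1/-1 samples.
x = np.array([1.0, -1.0, 1.0, -1.0])
features = [mav(x), zero_crossings(x), waveform_length(x)]
```

On this toy window, MAV is 1.0, there are 3 zero crossings, and the waveform length is 6.0. Frequency-domain features (MNF, MDF) would then be computed from the PSD of the same window.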
To optimize the feature set, we employed Mutual Information (MI), a statistical measure that evaluates the dependency between features and gesture labels. MI quantifies how much each feature contributes to recognizing a gesture, enabling us to retain the most informative features. By reducing irrelevant features, we achieved a balance between computational efficiency and classification performance. In my study, MI narrowed down the top 10 features from an initial set of 15, streamlining the model while maintaining high accuracy.
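MI-based feature ranking of the kind described above can be sketched with scikit-learn. The synthetic 15-feature matrix (2 informative, 13 noise) and the sample count are my assumptions; only the 15-to-10 reduction mirrors the study.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

# Stand-in data: 15 features as in the study, but synthetic — the first two
# actually determine the label, the rest are pure noise.
rng = np.random.default_rng(2)
X_informative = rng.normal(size=(200, 2))
X_noise = rng.normal(size=(200, 13))
X = np.hstack([X_informative, X_noise])
y = (X_informative[:, 0] + X_informative[:, 1] > 0).astype(int)

# Rank features by estimated mutual information with the labels,
# then keep the top 10.
scores = mutual_info_classif(X, y, random_state=0)
top10 = np.argsort(scores)[-10:]
X_reduced = X[:, top10]
```

The two genuinely informative columns should land in the top-10 set, while most noise columns score near zero, which is the behavior that lets MI prune a feature set without hurting accuracy.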
5. Classification
Classification represents the final step in our sEMG signal processing pipeline. In this stage, the extracted features are used to train machine learning models to accurately recognize hand gestures. We employed a 10-fold cross-validation approach to evaluate the models. This method involves dividing the dataset into ten parts, training the model on nine parts, and testing it on the remaining one, ensuring robust and unbiased performance assessment.
K-Nearest Neighbors (KNN): KNN is a simple yet powerful algorithm that classifies data points based on their proximity to neighboring points. Its intuitive design and effectiveness in low-dimensional spaces made it an excellent fit for our dataset. Remarkably, KNN achieved the highest accuracy at 98%, outperforming more complex models. This result underscores the power of simplicity, demonstrating that for certain datasets, straightforward methods can rival or surpass sophisticated algorithms.
Random Forest: Random Forest, an ensemble learning method, builds multiple decision trees and aggregates their outputs. Known for its ability to handle complex decision boundaries, Random Forest achieved an accuracy of 73.2% in our study. While robust against overfitting, its lower performance compared to KNN highlights the importance of matching the algorithm to the dataset's characteristics.
XGBoost: XGBoost, a highly efficient gradient boosting algorithm, delivered an accuracy of 84%. Although it outperformed Random Forest, it fell short of KNN's performance. This result highlights that more complex models do not always guarantee superior results, particularly in cases with limited input dimensions like our two-channel sEMG system.
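The 10-fold cross-validation protocol described above looks roughly like this in scikit-learn. Everything here is a hedged sketch: the data is synthetic (shaped like a 6-gesture, 10-feature problem), and `n_neighbors=5` is an assumed hyperparameter, so the scores it produces are not the study's results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset: 6 classes (one per gesture) and 10 features
# (the MI-selected set). The data itself is synthetic.
X, y = make_classification(n_samples=600, n_features=10, n_informative=8,
                           n_classes=6, n_clusters_per_class=1,
                           random_state=0)

# 10-fold CV: train on 9 folds, test on the held-out fold, repeat 10 times.
knn = KNeighborsClassifier(n_neighbors=5)
scores = cross_val_score(knn, X, y, cv=10)
mean_accuracy = scores.mean()
```

Swapping `knn` for `RandomForestClassifier` or an XGBoost estimator reproduces the same evaluation protocol for the other two models in the comparison.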
Results Summary:
KNN: 98% accuracy, showcasing its strength in handling small, well-structured datasets.
Random Forest: 73.2% accuracy, revealing limitations when applied to datasets with fewer features and simpler decision boundaries.
XGBoost: 84% accuracy, balancing complexity and performance but not surpassing KNN.
Despite the simplicity of using only two sEMG channels, the preprocessing, normalization, and feature extraction techniques employed ensured that the data was well-prepared for classification. The KNN classifier’s performance demonstrates that simplicity can often deliver outstanding results in machine learning applications, particularly when paired with a well-structured pipeline.
My humble findings highlight several key insights:
Simplicity in model design, as exemplified by KNN, can outperform more complex algorithms like Random Forest and XGBoost when paired with well-curated data.
Preprocessing steps such as global min-max normalization and Hampel filtering play a critical role in standardizing and refining sEMG signals, ensuring reliability across subjects and sessions.
Effective feature extraction and dimensionality reduction, guided by Mutual Information, enabled our models to focus on the most informative aspects of the data, enhancing both efficiency and accuracy.
While the two-channel design demonstrated the feasibility of low-cost prosthetic solutions, it also highlighted certain limitations, such as reduced data richness. Future work could explore the inclusion of additional channels to enhance system performance further, as well as the integration of neural networks to handle more complex decision boundaries.