The Digital Signal Processing (DSP) option enables automatic processing of raw data and feature extraction for data from gyroscopes, accelerometers, magnetometers, electromyography (EMG) sensors, and similar sources. To preprocess audio files, use our desktop app.
Features created in Feature Extraction will be normalized within their own scale.
Raw data will be normalized within the scale of each variable/axis from which they were extracted.
If you turn on the Windowing option, you will see that the normalization type is automatically changed to Autoselection.
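To illustrate what normalizing "within the scale of each variable/axis" can mean, here is a minimal sketch that scales each column using only that column's own minimum and maximum. The choice of min-max scaling and the function name `min_max_per_axis` are assumptions made for illustration; the platform's actual normalization formula is not described here.

```python
def min_max_per_axis(rows):
    """Scale every column (axis) of `rows` to [0, 1] using that column's own min and max.

    Assumption: min-max scaling is used only as an example of per-axis normalization.
    """
    columns = list(zip(*rows))
    scaled_columns = []
    for column in columns:
        lo, hi = min(column), max(column)
        span = hi - lo
        # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in column])
    return [list(row) for row in zip(*scaled_columns)]


# Two axes with very different ranges end up on the same [0, 1] scale.
rows = [[0.0, 100.0], [5.0, 300.0], [10.0, 200.0]]
scaled = min_max_per_axis(rows)
print(scaled)
```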
Windowing & Feature Extraction
This option transforms the signal data into a vector with a specified window size, and also enables users to generate additional variables of their choice.
Windowing
Window size is the number of samples processed together as one unit of the training dataset (the platform uses a static window approach). It must be the same for all events in the training dataset, even if the events differ in duration. Therefore, to train the model correctly, bring the data in the training and validation datasets (if applicable) to the same window size using upsampling or downsampling before loading them into the platform.
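One simple way to bring events of different lengths to the same window size before upload is linear interpolation. The sketch below is an illustration under that assumption; `resample_to_window` is a hypothetical helper, not a platform function, and plain interpolation applies no anti-aliasing filter when downsampling.

```python
def resample_to_window(signal, window_size):
    """Linearly interpolate `signal` so that it has exactly `window_size` samples."""
    if window_size < 2 or len(signal) < 2:
        raise ValueError("need at least 2 samples in both the input and the output")
    last = len(signal) - 1
    resampled = []
    for i in range(window_size):
        # Position of output sample i on the input index axis.
        pos = i * last / (window_size - 1)
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        resampled.append(signal[left] * (1 - frac) + signal[right] * frac)
    return resampled


short_event = [0.0, 2.0, 4.0]                      # 3 samples, upsampled to 5
long_event = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # 7 samples, downsampled to 4
print(resample_to_window(short_event, 5))
print(resample_to_window(long_event, 4))
```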
Using the window size, the training dataset will be grouped by a selected number of samples. All variables extracted from the raw data will be calculated from these groups. For correct feature calculation, the recommended minimum window size is 5 samples/lines.
Since the platform is intended for solving TinyML tasks, it is recommended that the maximum window size does not exceed 1000 samples/lines. However, it all depends on your data and there are no technical restrictions on the window size in the platform.
You have 3 options for the windowing specification:
In Number of Rows/Samples
The platform default is 128, but you can specify the count of rows to use for windowing. For example, for the following dataset:
the window = 10 samples
In Duration
Specify window size in ms and frequency in Hz to define window size for the training dataset.
If you use sensors with different frequencies, then you must first bring all signals to a single frequency using downsampling or upsampling.
Based on these values, the platform determines the window size in samples/lines and then uses it to calculate features and perform other operations.
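The conversion from duration and frequency to a window size in samples is simple arithmetic: duration in seconds multiplied by the sampling frequency. The sketch below shows it; the function name and the rounding to the nearest whole sample are assumptions, since the platform's rounding rule is not stated.

```python
def window_size_in_samples(window_ms, frequency_hz):
    """Convert a window duration in milliseconds to a whole number of samples."""
    samples = window_ms * frequency_hz / 1000
    # Assumption: round to the nearest sample and never return less than 1.
    return max(1, round(samples))


print(window_size_in_samples(1280, 100))  # 1.28 s at 100 Hz
print(window_size_in_samples(500, 50))    # 0.5 s at 50 Hz
```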
Auto Determination
Using Fast Fourier Transform this option automatically selects the window size based on the best model quality (this option is available only for the classification task type). For finer tuning of SRAM usage, it is recommended that you manually enter the window size and frequency of data.
If you plan to use frequency features to solve your problem, then the window size should be a power of two (for example, 64, 128, or 256).
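When choosing a window size for frequency features, it can help to check whether a candidate value is a power of two and, if not, to find the nearest larger one. The helpers below are hypothetical names used for illustration only.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; a power of two has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n."""
    return 1 if n < 1 else 1 << (n - 1).bit_length()


for candidate in (100, 128, 1000):
    print(candidate, is_power_of_two(candidate), next_power_of_two(candidate))
```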
Sliding Shift
The sliding shift controls the window overlap: it indicates by how many rows/samples the window is shifted to form the next window. It is available for both training and inference. If the sliding shift is equal to the window size, the window segmentation is performed without an overlap.
The sliding shift value can't exceed the specified window size. When window size auto-determination is enabled, the sliding shift is automatically set equal to the window size.
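The relationship between window size and sliding shift can be sketched as follows. This is an illustration, not the platform's code; `sliding_windows` is a hypothetical helper, and dropping a trailing incomplete window is an assumption.

```python
def sliding_windows(rows, window_size, shift):
    """Split `rows` into windows of `window_size`, moving `shift` rows each step."""
    if not 1 <= shift <= window_size:
        raise ValueError("shift must be between 1 and window_size")
    # Assumption: a trailing group shorter than window_size is dropped.
    return [rows[start:start + window_size]
            for start in range(0, len(rows) - window_size + 1, shift)]


rows = list(range(10))
overlapping = sliding_windows(rows, window_size=4, shift=2)  # 50% overlap
disjoint = sliding_windows(rows, window_size=4, shift=4)     # no overlap
print(overlapping)
print(disjoint)
```

With a shift of 2, each window shares its last two rows with the next one; with a shift equal to the window size, the windows do not overlap.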
Estimated SRAM Usage
SRAM depends on the window you choose and the list of features that will be created from the raw data.
If you have selected auto-determination of the window, the estimated SRAM usage will not be calculated, because the platform determines the window size only after training starts.
Feature Extraction
When the window size is specified, the platform automatically extracts the following features for each column in the training dataset. The same feature extraction will be automatically executed during inference on the device. You can manage calculated features for each original variable in the training dataset by marking or unmarking the appropriate checkbox.
To delete a variable/axis from the Feature Extraction block, use the Remove button, which appears when hovering over a variable. After doing so, no feature will be created for it, and therefore it will not participate in building the model. By default, Feature Extraction uses all variables/axes.
If you have deleted a variable, but want to return it, then use the “Add variables” button.
You can adjust the list of features and configure their parameters by clicking the Edit button next to the list of features.
If you have deleted all features in a variable/axis, then it is automatically excluded from the list of variables/axes. If you have deleted all variables, then Windowing and Feature Extraction are turned off automatically.
The list of supported features is provided below:
Amplitude
Global peak to peak of high frequency (Enabled by default)
Global peak to peak of low frequency (Enabled by default)
You can specify the Smoothing Factor (4 by default), which is the amount of attenuation for frequencies over the cutoff frequency. It applies to both features, with the same requirements for each.
Technically, the smoothing factor for both features can take any value within the window size, but extreme values may cause the feature to be calculated with an error. The recommended value is therefore 10% of the window size.
For example, if the window size is 128:
the recommended smoothing factor is 12-13
deprecated values – 1-4 and 121-12
In this case, choose the value that is best suited for your data and task.
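The 10% recommendation above is plain arithmetic: for a window of 128 samples, 10% is 12.8, so the nearest whole values are 12 and 13. A small sketch follows; the function name is hypothetical.

```python
import math


def recommended_smoothing_factor(window_size):
    """Return the whole numbers just below and just above 10% of the window size."""
    target = window_size / 10
    return math.floor(target), math.ceil(target)


print(recommended_smoothing_factor(128))
```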
Statistical
Mean – Enabled by default.
Max – Enabled by default.
Min – Enabled by default.
Root mean square – Enabled by default.
Range – Enabled by default.
Mean Absolute Deviation – Enabled by default.
Zero-crossing rate – Enabled by default.
Average Magnitude Difference – Enabled by default.
Absolute Mean – Enabled by default.
Standard Deviation – Enabled by default.
Mean-crossing rate – Enabled by default.
Threshold-crossing Rate – Disabled by default.
For this parameter, the threshold is set by the user. The default value is 0. If you leave it unchanged, the feature will have the same values as the Zero-crossing Rate.
Skewness – Disabled by default.
Kurtosis – Disabled by default.
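To make a few of the statistical features concrete, the sketch below computes some of them for one window of one axis. The formulas are common textbook definitions and are assumptions here; the platform's exact definitions may differ. It also shows why a threshold of 0 makes the Threshold-crossing Rate match the Zero-crossing Rate.

```python
import math


def crossing_rate(window, threshold=0.0):
    """Fraction of consecutive sample pairs that lie on opposite sides of `threshold`."""
    above = [value > threshold for value in window]
    crossings = sum(a != b for a, b in zip(above, above[1:]))
    return crossings / (len(window) - 1)


def basic_features(window):
    """Compute a handful of per-window features (assumed definitions)."""
    n = len(window)
    mean = sum(window) / n
    return {
        "mean": mean,
        "min": min(window),
        "max": max(window),
        "range": max(window) - min(window),
        "rms": math.sqrt(sum(v * v for v in window) / n),
        "zero_crossing_rate": crossing_rate(window, 0.0),
        "mean_crossing_rate": crossing_rate(window, mean),
    }


window = [1.0, -1.0, 2.0, -2.0, 3.0, 3.0]
features = basic_features(window)
print(features)
# With the default threshold of 0, the threshold-crossing rate is the zero-crossing rate.
print(crossing_rate(window) == features["zero_crossing_rate"])
```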
Frequency
FFT first peak power (disabled by default)
FFT first peak frequency (disabled by default)
FFT second peak power (disabled by default)
FFT second peak frequency (disabled by default)
FFT third peak power (disabled by default)
FFT third peak frequency (disabled by default)
These features use the fast Fourier transform to calculate the frequencies of the peaks and their power.
If frequency features are selected, the window size should be a power of 2.
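The idea behind the peak features can be sketched as follows: transform one window to the frequency domain, then report the frequency and power of the strongest bins. To stay dependency-free, this example uses a naive O(n^2) discrete Fourier transform instead of an FFT; the results are the same, only slower. The power normalization and the ranking of peaks by power are assumptions, and the function names are hypothetical.

```python
import cmath
import math


def power_spectrum(window):
    """Power of each non-DC frequency bin up to Nyquist, via a naive DFT."""
    n = len(window)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(window))
        # Assumption: power is |X[k]|^2 / n; other normalizations are possible.
        spectrum.append((k, abs(coeff) ** 2 / n))
    return spectrum


def top_peaks(window, sample_rate_hz, count=3):
    """Return (frequency_hz, power) for the `count` strongest bins, strongest first."""
    n = len(window)
    strongest = sorted(power_spectrum(window), key=lambda item: item[1], reverse=True)
    return [(k * sample_rate_hz / n, power) for k, power in strongest[:count]]


# A 64-sample window at 64 Hz containing a strong 4 Hz and a weaker 10 Hz tone.
n, rate = 64, 64.0
window = [math.sin(2 * math.pi * 4 * i / n) + 0.5 * math.sin(2 * math.pi * 10 * i / n)
          for i in range(n)]
peaks = top_peaks(window, rate, count=2)
print(peaks)
```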
Feature complexity
Feature complexity can be classified as light, medium, or heavy, depending on the level of difficulty in its calculation. As the feature complexity increases, so does the memory requirement for its computation. Therefore, it is recommended to use only light features if you have resource-constrained devices.
Raw Data
Raw Data option – Disabled by default.
If you want to include raw data in training, enable this option. The setting applies to all variables/axes specified in the Feature Extraction block at once, i.e., you can either include raw data for all variables/axes in the training dataset or exclude it for all of them.
The Raw Data option can't be used when the sliding shift values differ from the window size values.
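The two constraints stated in this document (the sliding shift can't exceed the window size, and Raw Data requires the sliding shift to equal the window size) can be expressed as a small pre-check. The function and parameter names below are hypothetical and do not correspond to a platform API.

```python
def check_dsp_settings(window_size, sliding_shift, use_raw_data):
    """Return a list of rule violations for the given (hypothetical) settings."""
    problems = []
    if sliding_shift > window_size:
        problems.append("sliding shift exceeds the window size")
    if use_raw_data and sliding_shift != window_size:
        problems.append("Raw Data requires the sliding shift to equal the window size")
    return problems


print(check_dsp_settings(128, 128, use_raw_data=True))   # valid combination
print(check_dsp_settings(128, 64, use_raw_data=True))    # overlap with raw data
```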
Feature Selection
The option is available only if the user has selected features in the Feature Extraction block.
Feature selection will include all created features for all variables/axes in the Feature Extraction block, except for the raw data. Therefore, if you haven't selected any variable in the Feature Extraction block, but the Raw Data option is enabled, then the Feature Selection block will be disabled.
The following algorithms are available for selecting features (you can choose only 1 of the methods):
Recursive Feature Elimination with Cross-validation (Enabled by default if the Feature Selection option is on)
Eliminates features using various machine learning algorithms with cross-validation.
Correlated Threshold
Removes highly correlated features. The degree of correlation is set by the user.
Variance Threshold
Excludes features whose values fluctuate very little. The threshold sets the minimal fluctuation (variance) a feature must have; features below it are treated as unimportant and excluded.
Information Gain
Ranks all features according to their degree of influence on the target variable and keeps only the most significant ones. You set the number of features that should be kept in the training dataset.
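Two of the methods above, Variance Threshold and Correlated Threshold, are simple enough to sketch in a few lines. The code below illustrates the general techniques under stated assumptions (population variance, Pearson correlation, greedy removal in column order); it is not the platform's implementation, and all names are hypothetical.

```python
import math


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pearson(a, b):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                     sum((y - mean_b) ** 2 for y in b))
    return cov / norm if norm else 0.0


def variance_threshold(features, threshold):
    """Keep features whose variance is strictly above `threshold`."""
    return {name: col for name, col in features.items() if variance(col) > threshold}


def correlation_threshold(features, threshold):
    """Greedily keep a feature only if |r| with every kept feature is <= threshold."""
    kept = {}
    for name, col in features.items():
        if all(abs(pearson(col, other)) <= threshold for other in kept.values()):
            kept[name] = col
    return kept


features = {
    "mean": [1.0, 2.0, 3.0, 4.0],
    "max": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with "mean"
    "flat": [5.0, 5.0, 5.0, 5.0],   # zero variance
    "rms": [1.0, 3.0, 2.0, 5.0],
}
after_variance = variance_threshold(features, threshold=0.0)
after_correlation = correlation_threshold(after_variance, threshold=0.95)
print(list(after_variance))
print(list(after_correlation))
```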