This section provides a glossary of the terms used in this guide. The definitions are not meant to be scientifically rigorous; they are general descriptions intended to be clear to users with or without a data science background.
Accuracy is the share of records that are assigned the correct class. If 73 out of 100 predicted records are assigned the correct class, the accuracy is 0.73, or 73%. The higher the value, the better.
Accuracy does not take class imbalance into account. It is best applicable when classes are represented by an approximately equal quantity of samples.
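As a minimal sketch of the calculation described above (the function name and the sample labels are illustrative, not part of the product):

```python
def accuracy(actual, predicted):
    """Fraction of records whose predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# 100 made-up records, 73 of them predicted correctly.
actual = ["yes"] * 100
predicted = ["yes"] * 73 + ["no"] * 27
print(accuracy(actual, predicted))  # 0.73
```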
Artificial Intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans.
AUC (ROC AUC)
A Receiver Operating Characteristic Curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.
A classification metric “Area Under the Curve” (AUC) related to the ROC curve represents how effectively the model answers the question “Does the current object belong to the corresponding class?”.
The value is always between 0 and 1:
- The higher, the better;
- The lower, the worse; however, a value well below 0.5 means the model works in “reverse” mode, and inverting its predictions would give an AUC of 1 - value;
- ~0.5 is the worst case (the model's outputs are no better than random).
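One way to make these values concrete is to compute AUC as the share of (positive, negative) record pairs that the model ranks in the right order. The sketch below is illustrative only: the function name and sample scores are assumptions, and it is a simple pairwise formulation rather than the product's implementation.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class, 0 for the negative class.
    scores: predicted probability (or any ranking score) of the positive class.
    A tied pair counts as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
print(roc_auc(labels, [0.1, 0.4, 0.35, 0.8]))  # 0.75
# Reversing the ranking gives 1 - 0.75, the "reverse" mode described above.
print(roc_auc(labels, [0.9, 0.6, 0.65, 0.2]))  # 0.25
```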
Balanced Accuracy accounts for imbalanced classes where (for example) one class may be represented by 10% of samples and the second class may be represented by the remaining 90% of samples.
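A common way to compute balanced accuracy, assumed here because this guide does not give a formula, is to average the per-class recall. The sketch below uses the 90%/10% split from the example; the names and labels are illustrative.

```python
def balanced_accuracy(actual, predicted):
    """Average of per-class recall (assumed definition, see lead-in)."""
    recalls = []
    for cls in sorted(set(actual)):
        total = sum(1 for a in actual if a == cls)
        hits = sum(1 for a, p in zip(actual, predicted) if a == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# 90% of samples in one class, 10% in the other; the model always
# predicts the majority class.
actual = ["majority"] * 90 + ["minority"] * 10
predicted = ["majority"] * 100
print(balanced_accuracy(actual, predicted))  # 0.5
```

Plain accuracy on the same data would be 0.9, which hides the fact that the minority class is never predicted.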
Binary Classification is used to predict one of the two possible outcomes or classes (e.g. ‘yes’ or ‘no’, ‘black’ or ‘white’, 0 or 1). If all of the values of your target variable are represented by only two unique values, this is a binary classification task type.
Classification (Classification Task) is the prediction of a target variable represented as a range of discrete classes. Binary classification tasks are represented by a target variable with two possible classes. Multi classification tasks are represented by a target variable with 3 or more classes.
Coefficient of Determination
Coefficient of Determination is the proportion of variance in the dependent variable that is predictable from the independent variables. The metric equals 1 for a perfect model and 0 for a model that does no better than always predicting the mean of the target variable; it can be negative for a model that does worse than that. If the Coefficient of Determination is 0.95, then 95% of the variance in the target variable is explained by the trained model.
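The two reference points (1 for a perfect model, 0 for a model that always predicts the mean) can be checked with a short sketch. It uses the standard formula 1 - SS_res / SS_tot; the function name and the sample values are illustrative.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined when all actual values are identical (SS_tot is zero).
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0]
print(r2_score(actual, [1.0, 2.0, 3.0, 4.0]))  # 1.0, perfect predictions
print(r2_score(actual, [2.5, 2.5, 2.5, 2.5]))  # 0.0, always predicting the mean
```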
Confidence Interval is applicable for regression task types. With XX% probability, it shows the possible prediction spread for each predicted value.
Confusion Matrix, also known as an error matrix, is a specific table layout that helps to visualize the performance of an algorithm. Each row of the Matrix represents the samples in a predicted class while each column represents the samples in an actual class (or vice versa). The Matrix makes it easy to see if the system is confusing two classes.
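A minimal sketch of how such a table can be built. Here rows are actual classes and columns are predicted classes, which is one of the two orientations mentioned above; the function name and labels are illustrative.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


actual = ["cat", "cat", "dog", "dog", "dog"]
predicted = ["cat", "dog", "dog", "dog", "cat"]
for row in confusion_matrix(actual, predicted, ["cat", "dog"]):
    print(row)
# [1, 1]   one cat predicted as cat, one cat predicted as dog
# [1, 2]   one dog predicted as cat, two dogs predicted as dog
```

The off-diagonal cells show which classes the model confuses with each other.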
Data Preprocessing is the process of preparing a dataset for model training, including data cleaning, imputation of missing values, removal of outliers, transformation of variables, etc.
Dataset is a volume of data (statistics) in a tabular format.
Exploratory Data Analysis (EDA)
Exploratory Data Analysis (EDA) is a set of charts displaying the main characteristics of a training dataset.
Feature (independent variable, predictor) is represented by a column in a dataset that characterizes the target variable.
Feature Extraction is an extraction of additional information (creation of the new variables) from existing data.
Feature Importance Matrix (FIM)
Feature Importance Matrix (FIM) is a chart which represents the 10 features that had the most significant impact on the model prediction of the target variable.
- FFT first peak power
- FFT first peak frequency
- FFT second peak power
- FFT second peak frequency
- FFT third peak power
- FFT third peak frequency
These functions use the fast Fourier transform to calculate the frequency and power of the peaks. If frequency features are selected, the window size should be a power of 2.
The F1 Score is the harmonic mean of the precision and recall for each class, where an F1 Score reaches its best value at 1 and worst score at 0. You should use this metric when you want to have a good balance between Precision and Recall.
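To see why F1 rewards a balance between Precision and Recall, the sketch below uses the standard harmonic-mean formula, F1 = 2 * P * R / (P + R). The function name and the input values are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Both pairs have the same arithmetic mean (0.5), but the unbalanced
# pair gets a much lower F1 (about 0.18 versus 0.5).
balanced = f1_score(0.5, 0.5)
unbalanced = f1_score(0.9, 0.1)
print(balanced, unbalanced)
```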
The Gini coefficient applies to binary classification and requires a classifier that can in some way rank examples according to the likelihood of being in a positive class. A Gini value of 0% means that the characteristic cannot distinguish between classes.
The Gini coefficient makes sense for the whole collection of predictions and not individual data points. The Gini coefficient only tells if perfect segregation (based on probability predictions) is possible or not and nothing about the probability threshold. In short, there is no relation between probability threshold and Gini. The Gini coefficient provides an accurate model predictive power measure for imbalanced class problems.
Global Peak to Peak of High Frequency
Global peak to peak of high frequency (amplitude function) is a calculation of the high-frequency signal by subtracting the moving average filter output from the original signal (feature). You can specify the Smoothing Factor, which is the amount of attenuation for frequencies over the cutoff frequency. The smoothing factor value should be a power of 2 and not greater than the window size.
Global peak to peak of low frequency (amplitude function) is a calculation of the low-frequency signal by applying a moving average filter with a smoothing factor. The smoothing factor value should be a power of 2 and not greater than the window size.
Holdout Validation Dataset
Holdout Validation Dataset is an independent portion of data that won’t be used in model training, but on which metrics will be calculated.
Kurtosis is a statistical measure of the combined weight of a distribution's tails relative to the center of the distribution.
Lag Feature is a name for a variable that contains raw data from prior time steps.
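A minimal sketch of building a lag feature from a series of readings. The function name is hypothetical, and filling the missing history with None is an assumption made for illustration.

```python
def lag_feature(values, lag):
    """Shift a series by `lag` time steps; missing history becomes None."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    lag = min(lag, len(values))
    return [None] * lag + list(values[: len(values) - lag])


readings = [10, 11, 12, 13]
print(lag_feature(readings, 1))  # [None, 10, 11, 12]
print(lag_feature(readings, 2))  # [None, None, 10, 11]
```

Each row of the new column holds the value the original variable had `lag` steps earlier.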
Lift is a measure of the performance of a targeting model at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random choice targeting model.
Logarithmic Loss is a measure of prediction confidence level. LogLoss represents the difference between the actual class and the probability of a prediction being in that class. For example, if the model predicts a 0.90 probability of class 1 for a record that really belongs to class 1, it is fairly confident, but 0.1 of uncertainty remains; LogLoss penalizes this uncertainty.
The lower the LogLoss score, the better the model's predictive power. LogLoss takes into account not the rounded-off predicted class but the probability of the prediction corresponding to a certain class.
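The 0.90 example can be reproduced with the standard binary log loss formula. This is a sketch under stated assumptions: the function name is illustrative, and clipping probabilities with `eps` is a common convention added here to avoid taking the logarithm of zero.

```python
import math


def log_loss(labels, probabilities, eps=1e-15):
    """Binary log loss; probabilities are for class 1, labels are 0 or 1."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# A correct prediction made with 0.90 probability is still penalized
# (about 0.105), and a less confident 0.60 is penalized more (about 0.511).
print(log_loss([1], [0.90]))
print(log_loss([1], [0.60]))
```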
Machine Learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead.
Macro Average Precision
Precision is the fraction of relevant samples among the retrieved samples. The precision score reaches its best value at 1 and the worst score at 0. Precision is intuitively the ability of the classifier not to label as positive a sample that is negative. Use this metric when it is most important to find only relevant samples without mistake even if you may skip some of the relevant samples. Macro Average Precision does not take class imbalance into account. It is best applicable when (for example) three classes of a multiclass classification problem are represented by an approximately equal quantity of samples.
Macro Average Recall
Recall is the fraction of relevant samples that have been retrieved over the total amount of relevant samples. The recall score reaches its best value at 1 and worst score at 0. Recall is intuitively the ability of the classifier to find all the positive samples. Use this metric when it is most important to find as many relevant samples as possible, even if you may also mark some of the wrong examples as relevant. Macro Average Recall does not take class imbalance into account. It is best applicable when (for example) three classes of a multiclass classification problem are represented by an approximately equal quantity of samples.
Macro Average F1 Score
The F1 score is the harmonic mean of the precision and recall for each class, where an F1 score reaches its best value at 1 and worst score at 0. You should use this metric when you want to have a good balance between Precision and Recall. Macro Average F1 score does not take class imbalance into account. It is best applicable when (for example) three classes of a multiclass classification problem are represented by an approximately equal quantity of samples.
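Macro averaging can be sketched as computing the metric separately for every class and then taking the plain mean, so that each class counts equally regardless of its size. The example below does this for F1; the function names and labels are illustrative.

```python
def class_f1(actual, predicted, cls):
    """F1 for one class, written as 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == cls and p == cls)
    fp = sum(1 for a, p in zip(actual, predicted) if a != cls and p == cls)
    fn = sum(1 for a, p in zip(actual, predicted) if a == cls and p != cls)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(actual, predicted):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(actual) | set(predicted))
    return sum(class_f1(actual, predicted, c) for c in classes) / len(classes)


actual = ["a", "a", "b", "b", "c", "c"]
predicted = ["a", "a", "b", "c", "c", "c"]
# Per-class F1 is 1.0 for "a", about 0.667 for "b", and 0.8 for "c",
# so the macro average is about 0.822.
print(macro_f1(actual, predicted))
```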
Max is a statistical function that calculates the maximum of the column (feature).
Mean is a statistical function that calculates the arithmetic mean of the column (feature).
Mean Crossings is a statistical function that calculates the number of times the selected column crosses the mean.
Mean Absolute Error (MAE)
Mean Absolute Error is the average of the absolute differences between the observed and predicted values. The direction of the error (positive or negative) does not matter, because only the absolute value (modulus) of each error is used.
Use this metric to minimize the average error.
There are two other metrics, dependent on Mean Absolute Error:
- Max AE is the maximum absolute difference between the actual value (true) and the predicted value;
- Min AE is the minimum absolute difference between the actual value (true) and the predicted value
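The three metrics can be sketched from the same list of absolute errors. The function name and the sample values are illustrative.

```python
def absolute_errors(actual, predicted):
    """Absolute difference between each actual and predicted value."""
    return [abs(a - p) for a, p in zip(actual, predicted)]


actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]
errors = absolute_errors(actual, predicted)  # [0.5, 0.0, 2.0, 1.0]

mae = sum(errors) / len(errors)
max_ae = max(errors)
min_ae = min(errors)
print(mae, max_ae, min_ae)  # 0.875 2.0 0.0
```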
Mean Squared Error (MSE)
MSE is a measure of the quality of an estimator. It is always non-negative, and values closer to zero are better. MSE measures the average of the squares of the errors, that is, the average squared difference between the estimated values and the true values. Squaring removes any negative signs and also gives more weight to larger differences, so bigger errors are penalized more heavily.
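The extra weight on large errors can be seen in a small sketch. Both sets of predictions below are off by 1 on average, but the one with a single large error gets a higher MSE. The function name and values are illustrative.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


actual = [10.0, 10.0, 10.0, 10.0]
even = [11.0, 11.0, 11.0, 11.0]   # four errors of size 1
spiky = [10.0, 10.0, 10.0, 14.0]  # one error of size 4

print(mean_squared_error(actual, even))   # 1.0
print(mean_squared_error(actual, spiky))  # 4.0
```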
Metadata is the information about the data.
Metric is a functional value that describes model quality.
Min is a statistical function that calculates the minimum of the column (feature).
Model is a mathematical representation of dependencies between the features (independent variables) and the target variable.
Model Interpreter allows you to interactively change the values of original features and see prediction results in real time. For continuous and discrete features, the Model Interpreter builds a graphical representation of their relation to the target variable, and for classification tasks, you can see the probabilities of predicted classes on the graph. It also allows you to specify the threshold value and see the feature values for which prediction results are below or above the threshold. Furthermore, feature influence for continuous features will show you the trend in the target variable value.
Model-to-Data Relevance Indicator
Taking into account feature impact on the model prediction, the Model-to-Data Relevance Indicator calculates the statistical similarity between the data uploaded for predictions and the data used for model training. The measure (100-M2D) shows the possible degradation level for the target metric value.
A low value for M2D may indicate a significant change in the model input data (data drift) that can lead to model performance degradation (model decay).
For example, if the Model-to-Data Relevance Indicator on the current dataset (uploaded for predictions) equals 95%, then the user can reasonably expect the model quality to decrease by as much as 5% from the validation metric.
100% = no change in model input data, no model performance degradation. 1% = a significant change in model input data, that leads to significant model performance degradation.
Model-to-Data Relevance Indicator is calculated for:
- every row sent for predictions
- dataset sent for predictions
- all data sent for predictions aggregated over time (Historical Model-to-Data Relevance Indicator)
Model Quality Diagram
Model Quality Diagram is a graphical representation of model quality in relation to metric indicator values that are scaled in the range [0-1], where 1 is the ideal quality of the model, and 0 is the minimum quality of the model.
Model Quality Index
Model Quality Index determines the quality of the model based on the metric indicator values.
Model Quality Index Value Range: 1 - 100%
Maximum quality: 100%
The Model Quality Index is calculated during the model training based on the training, or the validation dataset (if the User uploaded the validation dataset for model training). The correlation between the Training Model Quality Index and the acceptable model predictive power depends significantly on the problem being solved by the model.
Thus for tasks that do not require high model predictive power, the acceptable range of the Model Quality Index values is 75-100%, while for high-precision tasks it is 99-100%.
Multi Classification is used to predict one value of the limited number (greater than 2) of possible outcomes (e.g. ‘red’ or ‘green’ or ‘yellow’; ‘high’ or ‘medium’ or ‘low’; 1 or 2 or 3 or 4 or 5, etc.). If all of the values of your target variable are represented by a discrete (fixed) number of unique values/classes (>2), then this is a multiclass classification task type.
Negative Mean Crossings
Negative Mean Crossings compute the number of times the selected input crosses the mean with a negative slope.
Neural Network is a series of algorithms that endeavors to recognize underlying relationships in a dataset through a process that is close to the way the human brain operates.
Petrosian Fractal Dimension
Petrosian Fractal Dimension is a statistical function that converts the data to a binary sequence and estimates the fractal dimension from the time series.
Positive Mean Crossings
Positive Mean Crossings compute the number of times the selected input crosses the mean with a positive slope.
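Positive and negative mean crossings can be sketched by comparing each pair of consecutive samples with the mean. How a sample exactly equal to the mean is treated is an assumption of this sketch, as is the function name; the product's own handling may differ.

```python
def mean_crossings(values):
    """Return (positive, negative) mean-crossing counts for a sequence.

    A positive crossing goes from below the mean to at or above it;
    a negative crossing goes from at or above the mean to below it.
    """
    mean = sum(values) / len(values)
    positive = negative = 0
    for prev, cur in zip(values, values[1:]):
        if prev < mean <= cur:
            positive += 1
        elif prev >= mean > cur:
            negative += 1
    return positive, negative


signal = [1, 3, 1, 3, 1]       # mean is 1.8
print(mean_crossings(signal))  # (2, 2)
```

Under these assumptions, the total number of mean crossings is the sum of the two counts.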
Prediction is the output of a model after it has been trained and applied to new data when one is trying to predict unknown values of the target variable.
Precision is the fraction of relevant samples among the retrieved samples. The precision score reaches its best value at 1 and worst score at 0. Precision is intuitively the ability of the classifier not to label as positive a sample that is negative. Use this metric when the priority is to find only relevant samples without a mistake, even if you may end up skipping some of the relevant samples.
Recall is the fraction of relevant samples that have been retrieved over the total amount of relevant samples. The recall score reaches its best value at 1 and worst score at 0. The recall is the ability of the classifier to find all the positive samples. Use this metric when the priority is to find as many relevant samples as possible even if you may also mark some of the wrong examples as relevant.
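Both definitions can be sketched from three counts: true positives (TP), false positives (FP), and false negatives (FN). The function name and the sample labels are illustrative.

```python
def precision_recall(actual, predicted, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
# TP = 2, FP = 1, FN = 2: two of the three predicted positives are right
# (precision 2/3), and two of the four real positives are found (recall 0.5).
precision, recall = precision_recall(actual, predicted)
print(precision, recall)
```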
Regression (regression task type) is predicting a continuous value (for example predicting the prices of a house given the house features like location, size, number of bedrooms, etc).
Root Mean Square
Root Mean Square is the root of the arithmetic mean of the squares of a set of numbers.
Root Mean Squared Error (RMSE)
Root Mean Squared Error or RMSE represents the error between observed and predicted values (the square root of the average squared error over all observations). The lower the RMSE, the better the model's predictive power. RMSE is always non-negative, and a value of 0 would indicate a perfect fit to the data. Use it when you want a model that avoids large individual prediction errors.
Root Mean Squared Logarithmic Error (RMSLE)
Root Mean Squared Logarithmic Error or RMSLE can be used when one doesn’t want to penalize huge differences when both the values are huge numbers. The lower RMSLE - the better predictive power the model has. RMSLE can also be used when one wants to penalize underestimates more than overestimates.
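The asymmetry between underestimates and overestimates can be checked with a short sketch. It assumes the common log(1 + x) form of the metric and non-negative values; the function name and numbers are illustrative.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared error between log(1 + actual) and log(1 + predicted)."""
    squared = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Same absolute error of 50, in opposite directions.
under = rmsle([100.0], [50.0])   # underestimate
over = rmsle([100.0], [150.0])   # overestimate
print(under > over)  # True: the underestimate is penalized more
```

Because the metric works on logarithms, it responds to the ratio between predicted and actual values rather than to their absolute difference, which is why large differences between large numbers are not heavily penalized.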
Root Mean Squared Percentage Error (RMSPE)
Root Mean Squared Percentage Error represents a percentage error between the observed and predicted values measured as the square root of the mean of the squared (difference between the actual values and the predicted values divided by the actual values). The lower the RMSPE – the better the model’s predictive power.
RMSPE is always non-negative, and a value of 0 (almost never achieved in practice) would indicate a perfect fit to the data. In general, a lower RMSPE is better than a higher one. Because each error is divided by the actual value, RMSPE is expressed in relative terms and is less dependent on the scale of the target variable than RMSE.
Rows with 0 values in the target variable are filtered out and are not used in the validation pipeline as division by 0 is not possible.
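A minimal sketch of the calculation, including the filtering of rows whose actual value is 0. The function name and the sample values are illustrative, and the result is returned as a fraction rather than a percentage.

```python
import math


def rmspe(actual, predicted):
    """Root mean squared percentage error, skipping rows where actual is 0."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no rows with a non-zero actual value")
    squared = [((a - p) / a) ** 2 for a, p in pairs]
    return math.sqrt(sum(squared) / len(squared))


# The third row has an actual value of 0 and is ignored; the remaining
# two rows are each off by 10%, so the result is approximately 0.1.
print(rmspe([100.0, 200.0, 0.0], [110.0, 180.0, 5.0]))
```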
Sensor Data is the data from gyroscopes, accelerometers, magnetometers, electromyography (EMG), and other similar devices.
Skewness is a statistical function that measures the asymmetry of the distribution of a variable.
Solution is an object in Neuton in which all model parameters are specified. All workflow actions are executed inside the solution.
Splitting is the process of separating a dataset into two parts: one for training and one for validation.
Tabular Data is the data used for solving AutoML tasks (not suitable for TinyML tasks).
Target Variable is a variable the model is learning to predict. The target variable may be represented as a range of discrete classes or as continuous real numbers.
TinyML is a field of study in Machine Learning and Embedded Systems that explores the types of models you can run on small, low-powered devices like microcontrollers. It enables low-latency, low-power, and low-bandwidth model inference on edge devices.
Total Footprint is the amount of space in FLASH memory and SRAM that the model uses for prediction.
Training is the process of learning to uncover relationships between the features of a particular dataset and the target variable.
Training Dataset is the input dataset (or its part) that the machine learning algorithm uses to “learn” to uncover relationships between its features and the target variable.
Validation is the quality assessment process for a model which has been trained and built to predict a particular target variable.
Validation Dataset is another subset of the input data used to predict the target variable with the trained model, and measure the error between the known target values in the validation dataset and the predictions.
Validation Metric is a functional value that describes model quality applying to the holdout validation dataset or cross-validation process.
Variance is a statistical function that calculates the variance of the column (feature), that is, the average squared deviation of its values from the mean.
Weighted Average Precision
Weighted Average Precision accounts for imbalanced classes where (for example) one class may be represented by 10% of samples, the second class may be represented by 60% of samples and the other N classes are represented by the remaining 30% of samples.
Weighted Average Recall
Weighted Average Recall accounts for imbalanced classes where (for example) one class may be represented by 10% of samples, the second class may be represented by 60% of samples and the other N classes are represented by the remaining 30% of samples.
Weighted Average F1 Score
Weighted Average F1 Score accounts for imbalanced classes where (for example) one class may be represented by 10% of samples, the second class may be represented by 60% of samples and the other N classes are represented by the remaining 30% of samples.
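The difference from macro averaging can be sketched with the 10% / 60% / 30% split used in the examples above, treated here as three classes. Weighting each class by its number of actual samples is the usual convention and is an assumption of this sketch, as are the function names and the per-class scores.

```python
def weighted_average(scores, supports):
    """Mean of per-class scores, weighted by the number of samples per class."""
    return sum(s * n for s, n in zip(scores, supports)) / sum(supports)


def macro_average(scores):
    """Plain mean of per-class scores; every class counts equally."""
    return sum(scores) / len(scores)


per_class_precision = [0.5, 0.9, 0.7]  # made-up scores for three classes
supports = [10, 60, 30]                # 10%, 60% and 30% of the samples

# The weighted average (about 0.80) is pulled toward the largest class,
# while the macro average (about 0.70) treats the classes equally.
print(weighted_average(per_class_precision, supports))
print(macro_average(per_class_precision))
```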