Step 1: Select data for training
The model creation process in Neuton begins with creating a new Solution. A solution is the object in which you specify training parameters and manage prediction processes. Only one model can be built within one solution.
My Solutions. Default View.
After you have worked with Neuton and created some solutions, you can view and manage your solutions in the “My Solutions” workspace:
My Solutions
To create a new solution from the list of previously created solutions, click “Add New Solution”. The “New Solution” pop-up window will appear. Choose the input data type you will use for your project: audio, sensor, or tabular.
Select tabular data if your task uses tabular data and does not belong to the TinyML field.
Select sensor data if you plan to run models on tiny devices, even if:
data was collected not only from sensors
data is already pre-processed
Enter the desired “Solution Name” using Latin letters.
New Solution
After that, click “Next” to go to dataset import/selection.
Select Data (Dataset tab)
Please read first about the requirements for training data in the Dataset requirements section.
Training dataset selection
Uploading Options
To select the dataset, use one of the following options (tabs):
Upload dataset
This tab allows you to upload your own dataset. (This option is not available for the test-drive version.)
Select dataset from storage
On this tab, you can select one of your previously uploaded datasets.
Preloaded dataset
Neuton provides preloaded datasets (for demonstration and testing purposes).
Select one of these options to specify the training dataset for your solution. One dataset is used to train one model. If your data is represented by several datasets, you need to combine them in advance.
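If your data is split across several CSV files, one way to combine them before upload is sketched below. This is a minimal illustration that assumes every file has exactly the same header row; the function name `combine_csv_files` and the file names in the usage note are hypothetical and are not part of Neuton.

```python
import csv


def combine_csv_files(input_paths, output_path):
    """Concatenate CSV files that share one header; return the number of data rows written."""
    header = None
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    # The first non-empty file defines the expected header.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header: {file_header}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

For example, `combine_csv_files(["part1.csv", "part2.csv"], "train.csv")` writes a single `train.csv` that can then be uploaded as the training dataset.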
Upon uploading data to the platform, the data is encrypted in the cloud. All the resulting datasets and models are encrypted as well. Check out the Storage section for more details.
Step 2: Upload dataset
Please select “Click to upload CSV file” and browse to the file location on your hard drive (or drag & drop):
Select Dataset from Storage
During upload, the platform checks the dataset. If you see an error message, verify the dataset and upload it again. When the file has been uploaded successfully, you will see a green check mark.
Machine learning operations cannot be performed on inappropriate data structures and variable types. Please make sure your dataset is processed accordingly. To read more about dataset requirements, please refer to the “Dataset requirements” section.
Once the file is uploaded, you can preview the selected dataset in the web interface using the "lens" icon. If you have selected the wrong file by mistake, click the trash icon to select another dataset. To proceed to the next step, click “OK”.
The file name must not contain the following characters: !/[+!@#$%^&*,. ?":{}\\/|<>()[]] If you upload a file with these characters, the platform will automatically rename the file to remove invalid characters from the name and inform you about it.
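To avoid an automatic rename, you can clean the file name locally before uploading. The sketch below removes the characters listed above from the part of the name before the last dot. It assumes that the file extension is left untouched, which the text above does not state; the helper name `sanitize_dataset_name` is hypothetical, and the platform's own renaming logic may differ.

```python
# Characters listed above as not allowed in a dataset file name.
INVALID_CHARS = set('!/[+@#$%^&*,. ?":{}\\|<>()]')


def sanitize_dataset_name(filename: str) -> str:
    """Remove invalid characters from the name, keeping the final extension (assumption)."""
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        # No dot at all: treat the whole string as the name, with no extension.
        stem, extension = filename, ""
    cleaned = "".join(ch for ch in stem if ch not in INVALID_CHARS)
    if not cleaned:
        raise ValueError(f"nothing left of {filename!r} after removing invalid characters")
    return cleaned + dot + extension
```

For example, `sanitize_dataset_name("my data (v2).csv")` returns `"mydatav2.csv"`.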
When uploading a file, the platform checks the uniqueness of the file name. If a file with the same name has already been uploaded, the platform will offer to rename the newly uploaded file. For optimal storage usage, it is recommended to upload a file to the platform once and then select it from the storage.
Select dataset from storage
Select one of the existing datasets by choosing the “Select dataset from storage” tab and navigating to a dataset that has been previously uploaded:
Select Dataset from Storage
Preloaded Dataset
Select one of the preloaded datasets by choosing the “Preloaded dataset” option and navigating to a dataset that has been preloaded in Neuton.
To view information about a dataset, click the corresponding question mark icon (“?”).
Preloaded Dataset
For preloaded datasets, all settings on the Dataset, Training, and Prediction tabs are preconfigured.
Step 3: Specify dataset options
After the training dataset has been defined, specify the target column. At this stage, you can also specify a dataset for holdout validation and drop any feature columns that you consider irrelevant.
Dataset Options
To enable holdout validation and specify the dataset for it, turn on the switch next to “Holdout validation”. With a holdout dataset specified, training proceeds in the same way as without one, but after model training is complete, metrics are calculated on the holdout dataset you uploaded. Otherwise, the platform measures the validation metric at each training iteration using 10-fold cross-validation. Neuton has a built-in patented feature to prevent overfitting (overtraining), which stops training right before overfitting starts to occur.
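For readers unfamiliar with the term, the sketch below shows how 10-fold cross-validation splits a dataset in general: the rows are divided into 10 folds, and each fold serves once as the validation set while the other 9 are used for training. This is a generic illustration, not Neuton's implementation; the function name `k_fold_indices` is hypothetical, and the sketch does no shuffling.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop
```

Each row index appears in exactly one validation fold, so every row is used for validation once and for training in the remaining nine splits.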
The holdout validation dataset must be in the same format as the training dataset.
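Before uploading a holdout file, you can compare its columns with those of the training file. The sketch below assumes that "same format" means, at a minimum, the same column names in the header row; that is an interpretation, and the platform may check more than this. The function names are hypothetical.

```python
import csv


def read_header(path):
    """Return the first row of a CSV file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def compare_headers(train_path, holdout_path):
    """Report columns that are present in one file but not in the other."""
    train_cols = read_header(train_path)
    holdout_cols = read_header(holdout_path)
    return {
        "missing_in_holdout": [c for c in train_cols if c not in holdout_cols],
        "unexpected_in_holdout": [c for c in holdout_cols if c not in train_cols],
    }
```

If both lists in the result are empty, the two files have the same set of column names.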
To exclude features from the training dataset, mark the check box next to the appropriate feature name in the “Remove variables” section. The model will not train on the excluded data. Variables selected for removal are automatically removed from both the training dataset and the validation dataset.
If you are making predictions on the platform, it is acceptable to submit test data that contains columns you removed from the training and validation datasets. The platform will simply ignore these columns and make predictions without them. For predictions on the device, you must exclude this data yourself and not feed it to the model at inference.
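For on-device predictions, the removed columns have to be filtered out before the data reaches the model. A minimal sketch of that filtering step on the host side is shown below; the function name `drop_removed_features` and the column names in the usage note are hypothetical, and real device code would typically do the equivalent in its own language.

```python
def drop_removed_features(sample, removed_columns):
    """Return a copy of one input sample without the columns excluded at training time."""
    removed = set(removed_columns)
    return {name: value for name, value in sample.items() if name not in removed}
```

For example, `drop_removed_features({"temp": 21.5, "device_id": 7, "humidity": 40}, ["device_id"])` returns `{"temp": 21.5, "humidity": 40}`.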
Click “Next” to proceed to the training stage.