"Neuton freed our time for creativity"
Neuton AutoML let us automate the entire model lifecycle and freed our heads for more
complicated tasks. Such tasks should definitely be solved automatically, without the
involvement of expensive highly qualified specialists.
Andrew K.
Head of Data, LITRES, the largest ebook retailer in Europe
Automated prediction of the number of books expected to be sold immediately after release.
Customer
LITRES is one of the largest ebook and audiobook retailers in Europe, operating under
numerous brand names. It currently offers more than 250,000 titles in more than 30
languages. Over 16 apps make the company a leader in the top-grossing charts of the App
Store and Google Play in the non-gaming category. LITRES has more than 23 million registered
users and has sold more than 30 million ebooks over the company’s lifetime!
Challenge
More than 10,000 new books appear on the LITRES platform every month, and they need to be
ranked to determine which are likely to be best-sellers. This ranking lets the company focus
marketing activities and user attention on the releases that are likely to be most popular.
Of course, this may be fairly easy to do for books by eminent authors or for the
continuation of a popular series, but what about tens of thousands of completely new books?
Machine Learning to the rescue!
Solution
Based on data from book sales over several years, LITRES created a model that predicts the
number of sales for new books immediately after release. To build the model, a variety of
data about each book was used, including: data about the author, number of pages, subject,
genre, language, price, author rating, year of publishing, etc. In total, data collected
from 150,000 books sold over the past 3 years was used to train the model.
Internal R&D eventually resulted in a model that predicts the number of sales of any book in
the first month after its release, with an error of 240, measured by RMSE. Prediction
results were then used to determine the likely best-sellers.
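As a minimal sketch of how predicted sales could be turned into a shortlist of likely best-sellers, the snippet below keeps the top k books by predicted first-month sales. The function name, the book identifiers and the numbers are hypothetical and only illustrate the idea; the case study does not describe how LITRES actually performs this step.

```python
import heapq


def top_k_by_predicted_sales(predictions, k):
    """Return the k (book_id, predicted_sales) pairs with the highest predictions."""
    return heapq.nlargest(k, predictions.items(), key=lambda item: item[1])


# Hypothetical predicted first-month sales for four new releases.
predicted = {"book_a": 35.0, "book_b": 410.5, "book_c": 120.0, "book_d": 980.2}

shortlist = top_k_by_predicted_sales(predicted, k=2)
print(shortlist)
```

A fixed k is only one option; a sales threshold or a per-genre quota would work with the same predictions.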
Initially, LITRES built the model on their own, manually preparing the data and using
XGBoost to train the model. The results were good, but the need for manual preparation
prevented them from automating use of the model in their business processes.
Soon after reaching this stage, LITRES decided to explore the possibility of solving this
new challenge with the help of Neuton AutoML. This proved to be a wise decision. Once the
training file was uploaded, all further work was fully automated, model training was
comparatively quick, and the results showed an overall increase in accuracy.
As for where automation helped most, this task involved many categorical variables alongside
a variety of other data points. Preparing categorical variables in the format that machine
learning requires typically takes a significant amount of time. Neuton simplified and sped
up the preprocessing stage by identifying which encoding method to apply to each feature
(e.g. one-hot encoding, mean target encoding, or frequency encoding), which in turn sped up
the creation of the new Neuton-built model and its application in business processes.
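To make the three encoding methods named above concrete, here is a minimal pure-Python sketch of each one applied to a single categorical column. It is an illustration under simple assumptions, not Neuton's implementation: the function names and the sample genres and sales figures are invented for the example.

```python
from collections import Counter, defaultdict


def one_hot_encode(values):
    """Map each value to a 0/1 vector with one slot per distinct category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def frequency_encode(values):
    """Replace each value with the share of rows that carry the same value."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def mean_target_encode(values, targets):
    """Replace each value with the mean target observed for that category."""
    sums = defaultdict(float)
    counts = Counter()
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    return [sums[v] / counts[v] for v in values]


# Hypothetical sample: a genre column and first-month sales as the target.
genres = ["fantasy", "crime", "fantasy", "poetry"]
sales = [300.0, 120.0, 100.0, 20.0]

print(one_hot_encode(genres))
print(frequency_encode(genres))
print(mean_target_encode(genres, sales))
```

One-hot encoding suits columns with few distinct values, while frequency and mean target encoding keep a high-cardinality column (such as author) in a single numeric feature. In practice, mean target encoding is usually computed out-of-fold or with smoothing so that a row's own target does not leak into its feature.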
While it’s often assumed that automating preprocessing or speeding up model processing
will reduce accuracy, LITRES was delighted to find that with Neuton the forecast accuracy
actually improved considerably. The error fell to 129 RMSE, roughly 46% lower (almost
half) than the original model's error of 240.
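For reference, RMSE (root mean squared error) is the square root of the average squared difference between actual and predicted sales. The sketch below shows that calculation and the relative improvement implied by the two RMSE values quoted above (240 and 129); the sample sales numbers and function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(before, after):
    """Fraction by which a metric dropped, e.g. 0.25 means 25% lower."""
    return (before - after) / before


# Hypothetical actual vs. predicted first-month sales for three books.
print(rmse([100, 200, 300], [110, 190, 330]))

# Relative improvement between the two RMSE values reported in the case study.
print(f"{relative_reduction(240, 129):.1%}")
```

Because RMSE squares each error before averaging, a few badly mispredicted titles weigh more heavily than many small misses, which matters when the goal is to spot outliers such as best-sellers.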
Thanks to the speed, quality and accuracy of Neuton’s results, LITRES chose to keep calling
the resulting model automatically through the API, which now lets them make such
predictions in real time. This saves considerable time and money, while also bringing in
more revenue by identifying and promoting likely best-sellers more quickly and effectively.
"The model that we used earlier gave good results and was effective for selecting top
new books, but it required regular manual work to update it and get predictions. It was not
always clear if the model was outdated or not. We just updated it monthly.
It was important for us to find a tool that allows us to automate the entire life of the
model, from its creation to getting the predictions and updating the model. Neuton AutoML
let us do it and freed our heads for more complicated tasks. I am convinced that today such
tasks should definitely be solved automatically, without the involvement of expensive highly
qualified specialists," said Andrew K., Head of Data at LITRES.