Learning users' habits is useful in many settings, such as smart homes or energy consumption. Today we're introducing a new use case: a wellness coach made more relevant and personal thanks to craft ai.
Let’s build a coach!
Our focus is on providing meaningful advice to help the user sleep better. The idea is to use craft ai to predict how long the user is likely to sleep the following night, then select the most relevant advice to send him. Every day, for each user, we have some details about his day and how long he slept during the night. The application uses craft ai to learn, from this data, a model able to make the prediction.
We just released a new Python starter kit. It features the prediction part of this coach and is fully working on some historical data we gathered from a cooperating teammate.
About the data
This starter kit uses real, anonymized data extracted from one of our teammates' connected watch. A total of 97 days were observed. We merged his day-to-day locations and activity (work or not), provided by his smartphone (GPS), with his agenda, so that we know whether the user works on a given day and whether he works the next day. We also added whether he sleeps at home or not. We suspect that these variables have an impact on his sleep quality. The sleep time is measured by a smart watch.
Here are the summarized features:
- date - the date of the day
- day_off - whether or not the user has the day off (True means he does not work this day)
- nextdayoff - whether or not the user has the next day off (False means he works the next day)
- sleepathome - whether or not the user plans to sleep at home
- sleep_start - hour at which he started to sleep
- sleep - sleep time during the night (hours)
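To make the shape of one observation concrete, here is a minimal sketch of a daily record in plain Python. The field names come from the list above; the `DayRecord` class, the `to_context` helper and the choice to encode `sleep_start` as decimal hours are our own illustrative assumptions, not part of the starter kit or of the craft ai client.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DayRecord:
    """One observed day (hypothetical container, fields named after the features)."""
    date: datetime.date
    day_off: bool
    nextdayoff: bool
    sleepathome: bool
    sleep_start: datetime.time
    sleep: Optional[float] = None  # hours slept; None when not yet known


def to_context(record: DayRecord) -> dict:
    """Return the input features only, leaving out the date and the target `sleep`."""
    start = record.sleep_start
    return {
        "day_off": record.day_off,
        "nextdayoff": record.nextdayoff,
        "sleepathome": record.sleepathome,
        # Assumption: time of day encoded as decimal hours (01:30 -> 1.5).
        "sleep_start": start.hour + start.minute / 60,
    }


if __name__ == "__main__":
    record = DayRecord(
        date=datetime.date(2017, 1, 29),
        day_off=True,
        nextdayoff=False,
        sleepathome=True,
        sleep_start=datetime.time(1, 30),
    )
    print(to_context(record))
```

Keeping the target out of the context makes it harder to feed the value we want to predict back into the model by accident.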
We want to predict the sleep feature. Since it is a continuous feature, we are facing a regression problem. This differs from a classification problem, where the output can only take a value from a predefined list of classes.
Fortunately, continuous output properties were just enabled in craft ai!
Running the starter kit
Go to the GitHub repository and download it. Take a look at the Readme, add an environment file with your craft ai credentials and run the script. Feel free to read the code: it simply illustrates how to use the Python client. Now you can go to your account on the craft ai website and see the results!
We get a beautiful decision tree describing the user's habits. The grey nodes represent a split on a feature. The colored ones are leaves and represent the final decisions that can be made for a given context. The color reflects how confident we are in the associated decision.
You can generate the decision tree at any timestamp between your first and last recorded data points by moving the cursor on the timeline. For the rest of this article, and to test a new context, we will use the latest tree generated because it takes the whole dataset into account.
Let's analyze the generated tree. We will walk through it as if we wanted to make a decision based on a new context:
- date: 2017-01-29
- day_off: True, the user doesn’t work this day
- nextdayoff: False, he has to work the next day
- sleepathome: True, he plans to sleep at his own home
- sleep_start: let’s suppose he wants to watch movies (for example) and goes to sleep late, around 1:30am
First, take a look at the root.
The first split generated is on the feature day_off. It looks like our user’s sleep time depends mainly on whether or not he worked during the day. We can make several assumptions about why; for instance, he may still feel job stress, which affects his sleep quality. In our context, day_off is True (he did not work that day), so let’s take the True edge on the right.
The next split is on the sleep_start feature. Since this feature is a time of day, its values only range from 00:00 to 23:59, and each edge is represented as an interval. The user seems to sleep differently depending on whether he starts his night before or after 1:28am. That is consistent, since 1:28am is quite a late hour to go to sleep. In our context, sleep_start is 1:30, which falls inside [1:28,21:18], so let’s take the left edge carrying that label. The next node is just a more precise split on the sleep_start feature, and we continue on the left edge again.
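Because a time of day is cyclic, the complementary edge of [1:28,21:18] has to wrap around midnight. The following sketch shows one way to test membership in such an interval. It is our own illustration, not craft ai's implementation; the function names and the choice of a half-open interval (start included, end excluded) are assumptions.

```python
def minutes(hour: int, minute: int) -> int:
    """Minutes elapsed since midnight."""
    return hour * 60 + minute


def in_time_interval(t: int, start: int, end: int) -> bool:
    """True if `t` lies in [start, end) on a 24-hour clock.

    All arguments are minutes since midnight. When start > end the
    interval wraps around midnight (e.g. 21:18 to 1:28).
    """
    day = 24 * 60
    t, start, end = t % day, start % day, end % day
    if start <= end:
        return start <= t < end
    return t >= start or t < end


if __name__ == "__main__":
    sleep_start = minutes(1, 30)
    # Edge labeled [1:28,21:18] in the article.
    print(in_time_interval(sleep_start, minutes(1, 28), minutes(21, 18)))
    # Complementary edge, wrapping around midnight.
    print(in_time_interval(sleep_start, minutes(21, 18), minutes(1, 28)))
```

Working in whole minutes avoids floating-point comparisons right at the interval bounds.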
We now reach a node that splits on the sleepathome feature. Because this feature is True in our context, we take the right edge and find a leaf node! The prediction for the context is 8.517 hours of sleep, with a 99.31% confidence. This value indicates how confident the tree is when predicting this value here. It is not the probability of observing this value in the real world.
After the training and tree generation steps, let’s analyze the prediction quality. At craft ai, we are used to validating a model with the sliding window validation technique.
As in classic cross-validation, we define a training set and a test set. We loop over time from the first known sample up to the last one. We have to define a window: the time span over which we test the model. We also have to define the step: the number of samples we add to the test set at each sliding iteration. The window slides along the timeline, and we compare the predicted values with the real values. This comparison gives us the percentage of good predictions.
In this instance, we set the window to seven days and the step to one day, so that for each new day we can count the number of good predictions over the following seven days.
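The procedure can be sketched in a few lines of plain Python. This is a simplified reading of the technique, not the code we run at craft ai: here `step` is how far the split point advances at each iteration, the stand-in model simply predicts the mean of the training targets, and `fit_mean`, `sliding_window_scores`, `tolerance` and `min_train` are hypothetical names introduced for the example.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Record = Tuple[dict, float]  # (context, observed target)
Model = Callable[[dict], float]


def fit_mean(train: Sequence[Record]) -> Model:
    """Stand-in learner: always predicts the mean training target."""
    average = mean(target for _, target in train)
    return lambda context: average


def sliding_window_scores(
    records: Sequence[Record],
    fit: Callable[[Sequence[Record]], Model],
    tolerance: float,
    window: int = 7,
    step: int = 1,
    min_train: int = 1,
) -> List[Tuple[int, float]]:
    """Train on the past, test on the next `window` records, then slide.

    Records must be sorted by date. Returns (split index, share of
    predictions within `tolerance` of the observed value) pairs.
    """
    scores = []
    for split in range(min_train, len(records) - window + 1, step):
        model = fit(records[:split])
        test = records[split:split + window]
        good = sum(
            1 for context, actual in test
            if abs(model(context) - actual) <= tolerance
        )
        scores.append((split, good / len(test)))
    return scores
```

With one record per day, `window=7` and `step=1` match the setup described above: each new day extends the training set by one record, and the model is scored on the seven days that follow.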
Let’s analyze the results after running the sliding window technique.
The first chart represents the percentage of good predictions, month after month. Because we are dealing with a regression problem, a prediction is said to be good if it does not deviate from the observed value by more than 2 standard deviations. The second chart shows a linear regression fitted to the first one. We can see that the share of good predictions increases: the more data we provide, the more accurate our model gets.
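Both ideas, the "good prediction" criterion and the trend line, fit in a short sketch. We assume here that the standard deviation is the one of the observed sleep values, which the article does not specify, and the helper names (`is_good_prediction`, `good_rate`, `trend_slope`) are ours.

```python
from statistics import pstdev
from typing import Sequence


def is_good_prediction(predicted: float, actual: float, std: float, k: float = 2.0) -> bool:
    """A prediction is good if it lies within k standard deviations of the observed value."""
    return abs(predicted - actual) <= k * std


def good_rate(predictions: Sequence[float], actuals: Sequence[float]) -> float:
    """Share of good predictions, using the spread of the observed values.

    Assumption: the reference standard deviation is the population
    standard deviation of `actuals`.
    """
    std = pstdev(actuals)
    good = sum(
        1 for predicted, actual in zip(predictions, actuals)
        if is_good_prediction(predicted, actual, std)
    )
    return good / len(actuals)


def trend_slope(ys: Sequence[float]) -> float:
    """Least-squares slope of ys plotted against 0, 1, 2, ... (needs at least 2 points)."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator
```

A positive `trend_slope` over the successive good-prediction rates corresponds to the upward trend visible in the second chart.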
We can see a dip in December. During this month, our user behaved unusually because of the festive season. The predictions were consequently poor over this period, because craft ai had no information about it.
We were able to make sense of the personal data we collected in order to provide useful information to the end user. We got a "white box" prediction model in which we can understand the rules the machine detected. We also observed, through validation and visualization, that our model becomes more accurate over time.
craft ai is convenient if you want to learn from hidden habits or rules you can’t naturally identify. This kind of application in the area of wellness can be used by itself or combined with an expert coach. The user gets great value thanks to the personal decision model craft ai learns.
This example application is simple, but you can imagine many more complex ones, with more diverse data and other useful features to predict. Feel free to experiment on craft ai with the data you collect. You’ll be impressed by how simple it is to build new models for your end users!