By Sergey Yengoyan, Software Engineer
As the number of mobile apps continues to grow at a rapid pace, any dimensionality reduction method that helps decrease the size of a prediction model can improve performance.
We replaced one-hot encodings of apps with dense encodings in a k-dimensional space, where k is much smaller than the total number of apps. Apps that are similar in terms of user interest should be a small distance apart in this k-dimensional space, while dissimilar apps should be far apart.
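To make the contrast concrete, here is a minimal sketch of the two encodings. The catalog size, the value of k, and the random embedding values are placeholders chosen for illustration; in a real model the dense vectors are learned during training.

```python
import math
import random

NUM_APPS = 1000  # hypothetical catalog size
K = 8            # hypothetical embedding dimension, much smaller than NUM_APPS

random.seed(0)
# One dense row per app. Random placeholders here; learned values in practice.
embeddings = [[random.gauss(0.0, 1.0) for _ in range(K)] for _ in range(NUM_APPS)]


def one_hot(app_index, size=NUM_APPS):
    """Sparse encoding: a vector as long as the catalog with a single 1."""
    vec = [0.0] * size
    vec[app_index] = 1.0
    return vec


def dense(app_index):
    """Dense encoding: the k-dimensional row stored for this app."""
    return embeddings[app_index]


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Any two distinct one-hot vectors are exactly sqrt(2) apart, so the
# one-hot encoding cannot express that some apps are more alike than others.
one_hot_gap = distance(one_hot(3), one_hot(7))

# Dense vectors can sit at any distance, which is what lets training
# pull similar apps together and push dissimilar ones apart.
dense_gap = distance(dense(3), dense(7))
```

Besides being shorter (8 numbers per app instead of 1,000 in this sketch), the dense form is the only one of the two in which distance can carry information about similarity.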
For an interest profile consisting of three apps, we would have six training samples, as shown in the figure below:
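The same pairing can be sketched in code. This assumes that each training sample is an ordered (input app, target app) pair drawn from a single profile, which is consistent with three apps yielding six samples; the app names are placeholders.

```python
from itertools import permutations


def training_pairs(profile):
    """Return every ordered (input_app, target_app) pair from one profile."""
    return list(permutations(profile, 2))


profile = ["app_a", "app_b", "app_c"]  # placeholder app identifiers
pairs = training_pairs(profile)

for input_app, target_app in pairs:
    print(input_app, "->", target_app)
```

Under this assumption, a profile with n apps contributes n × (n − 1) samples, so the training set grows quadratically with profile size.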
In our case, the weights represent apps, so we visualized them to see whether the relationships between the weights carry any meaning. We found that apps closest to each other in terms of user interest were also very close in genre, name, and type. For example, racing apps would be in one cluster, apps for selling personal items in another, and so on.
In other words, the model represented by these weights is a predictor of which other apps a user might already have. We not only achieve an effective vectorization of the apps that appear in a profile, but also gain a tool for predicting user interest.
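As a rough illustration of how such weights can act as a predictor, the sketch below assumes a skip-gram-style model with an input weight matrix, an output weight matrix, and a softmax over all apps. That architecture is an assumption made for this example, and the tiny random matrices stand in for trained weights.

```python
import math
import random

NUM_APPS = 5  # hypothetical catalog size, kept tiny for readability
K = 3         # hypothetical embedding dimension

random.seed(0)
# Placeholder weights; a trained model would supply the real values.
w_in = [[random.gauss(0.0, 1.0) for _ in range(K)] for _ in range(NUM_APPS)]
w_out = [[random.gauss(0.0, 1.0) for _ in range(K)] for _ in range(NUM_APPS)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_other_apps(app_index):
    """Estimate how likely each app is to appear alongside the given app."""
    hidden = w_in[app_index]  # the dense encoding of the input app
    scores = [sum(h * w for h, w in zip(hidden, row)) for row in w_out]
    return softmax(scores)


probs = predict_other_apps(0)
# App indices ordered from most to least likely under this toy model.
ranked = sorted(range(NUM_APPS), key=lambda j: probs[j], reverse=True)
```

Given one app a user has installed, the highest-ranked entries are the model's guess at what else is on the device.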
As the saying goes: “If you are targeting everyone, you are not targeting anyone.” At Aarki, our data scientists are constantly experimenting to develop sophisticated machine learning algorithms that help us better understand our data and better predict user profiles. That’s how we reach and acquire the best users and deliver strong ROI to our clients.