Graphical recommender system using personalized PageRank (Random Walk with Restart)
A bipartite graph is a graph where nodes can be divided into two groups V1 and V2 such that no edge connects the vertices in the same group.
Given a query node a in V1, Neighborhood Formation computes the relevance scores of all the nodes in V1 to a.
Random walk with restart is the 'personalized' rangerank algorithm. It models the distribution of rank, given that the distance random walkers can travel from their source is determined by alpha. For each iteration, the probability of going to a connected node is (1-alpha)/(#of connected node), and there is also a alpha probability for the walker to go to a random node. You can refer to the literature folder for more details of the algorithm.
In this dataset, given a patient(user) node, the RWR algorithm will compute and output the closest 10 neighboring user.
In the simulated data above, user2 and user3 are connected by Med2&3 and Lab2&3, whereas user1 doesn't have any shared node connected with user2/3's neighborhood. Therefore, when running the proof of concept python code you will see that the closest neighbor (user edge) for user2 (beside himself) is user3.
- Java 1.8+
- Scala 2.1+
- built.sbt