Skip to content

Latest commit

 

History

History
38 lines (21 loc) · 1.99 KB

README.md

File metadata and controls

38 lines (21 loc) · 1.99 KB

Overview

Graphical recommender system using personalized PageRank (Random Walk with Restart)

Background

Bipartite Graph Structure

A bipartite graph is a graph where nodes can be divided into two groups V1 and V2 such that no edge connects the vertices in the same group.

Image

Neighborhood formation: Random Walk with restart algorithm (personalized pagerank)

Given a query node a in V1, Neighborhood Formation computes the relevance scores of all the nodes in V1 to a.

Random walk with restart is the 'personalized' rangerank algorithm. It models the distribution of rank, given that the distance random walkers can travel from their source is determined by alpha. For each iteration, the probability of going to a connected node is (1-alpha)/(#of connected node), and there is also a alpha probability for the walker to go to a random node. You can refer to the literature folder for more details of the algorithm.

In this dataset, given a patient(user) node, the RWR algorithm will compute and output the closest 10 neighboring user.

Image

In the simulated data above, user2 and user3 are connected by Med2&3 and Lab2&3, whereas user1 doesn't have any shared node connected with user2/3's neighborhood. Therefore, when running the proof of concept python code you will see that the closest neighbor (user edge) for user2 (beside himself) is user3.

Image

Dependencies

  • Java 1.8+
  • Scala 2.1+
  • built.sbt

References

  1. Epasto et al. Reduce and Aggregate: Similarity Ranking in Multi-Categorical Bipartite Graphs
  2. Sun et al. Neighborhood Formation and Anomaly Detection in Bipartite Graphs
  3. GraphX personalized pagerank