Graph-Analytics using Neo4j and Spark's GraphX API

The aim of this project is to develop an end-to-end graph analytics module for big data. The implementation uses the Stack Overflow Questions & Answers dataset, the Neo4j graph database, Spark's GraphX API, the Scala programming language, and an Amazon EC2 cloud instance to host the database.

The demonstration includes:

Graph Cypher queries for the following use cases -
- Finding trends of a technology in the data set
- Identifying the top answerers for JavaScript questions
- Fetching all the answers for each Java question based on their scores
- Listing where else the top answerers of Java were also active
- Finding the users posting the most JavaScript questions
Extended graph analytics using a Scala-based implementation on Spark's GraphX API for -
- Evaluating an expert's rank for a programming language using the PageRank algorithm
- Identifying the connected and non-connected tags in the dataset using the Connected Components algorithm
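
The two GraphX analyses above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the object name `GraphAnalyticsSketch`, the helpers `expertRanks` and `tagComponents`, the toy ids, and the edge conventions (an edge from asker to answerer for PageRank, an edge between two tags that co-occur on a question for connected components) are all assumptions made for the example. Only `Graph.fromEdgeTuples`, `Graph.apply`, `pageRank`, and `connectedComponents` come from GraphX itself.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.sql.SparkSession

object GraphAnalyticsSketch {

  /** PageRank over (askerId, answererId) pairs.
    * Assumption: each edge asker -> answerer counts as an endorsement,
    * so users who answer many questions accumulate rank.
    */
  def expertRanks(spark: SparkSession, answered: Seq[(Long, Long)]): Map[Long, Double] = {
    val edges = spark.sparkContext.parallelize(answered)
    // Vertices are inferred from the edge endpoints; 1 is a dummy vertex attribute.
    val graph = Graph.fromEdgeTuples(edges, defaultValue = 1)
    // Iterate until the ranks change by less than the tolerance.
    graph.pageRank(tol = 0.0001).vertices.collect().toMap
  }

  /** Connected components over tags.
    * Assumption: two tags are linked when they appear on the same question.
    * Returns tag name -> component id (the smallest vertex id in the component);
    * a tag with no co-occurrence edge ends up alone in its own component.
    */
  def tagComponents(spark: SparkSession,
                    tags: Seq[(Long, String)],
                    coOccurs: Seq[(Long, Long)]): Map[String, Long] = {
    val sc = spark.sparkContext
    val vertices = sc.parallelize(tags)
    val edges = sc.parallelize(coOccurs.map { case (a, b) => Edge(a, b, 1) })
    val components = Graph(vertices, edges).connectedComponents().vertices
    vertices.join(components)
      .map { case (_, (name, componentId)) => name -> componentId }
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("GraphAnalyticsSketch")
      .master("local[*]") // local mode, for illustration only
      .getOrCreate()

    // Hypothetical toy data: users 1, 2, 3 asked questions answered by users 10 and 11.
    val ranks = expertRanks(spark, Seq((1L, 10L), (2L, 10L), (3L, 10L), (3L, 11L)))
    ranks.toSeq.sortBy(-_._2).foreach { case (user, rank) => println(s"user $user -> $rank") }

    // Hypothetical tags: java and spring co-occur; javascript and cobol stand alone.
    val comps = tagComponents(
      spark,
      Seq((1L, "java"), (2L, "spring"), (3L, "javascript"), (4L, "cobol")),
      Seq((1L, 2L)))
    comps.groupBy(_._2).values.foreach(group => println(group.keys.mkString(", ")))

    spark.stop()
  }
}
```

In a real run the edge lists would be loaded from the Neo4j graph instead of being hard-coded. Tags that share a component id are connected, and a tag whose component contains only itself is a non-connected tag.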

Experimental Setup

The experimental setup included:
- Programming Language: Scala (Scala SDK 4.7.0)
- Dependencies: spark-core_2.11, spark-sql_2.11, spark-graphx_2.11
- Database: Neo4j 3.2.4
Cloud Implementation for Neo4j Database: Neo4j Graph Database Community Edition was deployed on an AWS EC2 instance, and the graph for the Stack Overflow dataset was implemented on it. The Community Edition is limited to a single machine.
To run locally, download Neo4j Community Edition from http://neo4j.com/download/ and start the server after installation. Replace the default graphdb folder with the unzipped folder, then restart the service.
Clone SparkNeo4j locally, build it with Gradle, and run the project.
