MappIt

MappIt is an application that lets users discover new places to visit and share their adventures and experiences, while helping creators promote their content within the community.

MappIt application architecture

MappIt follows the client-server paradigm. We designed and implemented the server side in detail.

Cluster architecture

MappIt was deployed on a cluster of three servers, each in charge of a different part of the service.

In particular, we had:

server A: ran the Java backend of the service and was part of the MongoDB cluster
server B: ran the Neo4j server and was part of the MongoDB cluster
server C: ran the periodic data population scripts and was part of the MongoDB cluster

Entities

The main entities of MappIt are:

user
place
post
activity

There are different kinds of relations between these entities. Some attributes of the entities are stored only in the document database (MongoDB), while other information is stored only in the graph database (Neo4j).

Neo4j entities and relations

The following schema shows the entities and relations declared in the graph database:

Cross-database consistency management:

In this section we analyze the queries that need special handling because they involve both databases.

For each operation, we designed data flow schemas covering both the successful and the failed case, always aiming to keep the data in a consistent state.

In addition to performing an automatic attempt at consistency recovery, we decided to log errors into an errors.txt file, allowing administrators to manually check and enforce consistency and restore the nominal state.
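The recovery-plus-logging idea can be illustrated with a minimal sketch. It is not the MappIt backend code: the in-memory stores, the `register_user` function and the log line format are all hypothetical stand-ins, and the example only assumes that a failed write to the second database is compensated by undoing the write to the first one.

```python
import io
from datetime import datetime, timezone


class StoreError(Exception):
    """Raised by a store when a write fails."""


class InMemoryStore:
    """Hypothetical stand-in for a database client; `fail` simulates an outage."""

    def __init__(self, fail=False):
        self.items = {}
        self.fail = fail

    def insert(self, key, value):
        if self.fail:
            raise StoreError("store unavailable")
        self.items[key] = value

    def delete(self, key):
        self.items.pop(key, None)


def register_user(username, document_db, graph_db, error_log):
    """Write the user to both stores; undo the first write if the second fails."""
    document_db.insert(username, {"username": username})
    try:
        graph_db.insert(username, {"label": "User"})
    except StoreError as exc:
        # Automatic recovery attempt: drop the document that has no graph node.
        document_db.delete(username)
        # Log the failure so an administrator can verify consistency by hand.
        stamp = datetime.now(timezone.utc).isoformat()
        error_log.write(f"{stamp} register_user({username}) failed: {exc}\n")
        return False
    return True


log = io.StringIO()  # stands in for open("errors.txt", "a")
documents, graph = InMemoryStore(), InMemoryStore(fail=True)
registered = register_user("alice", documents, graph, log)
```

In this run the graph write fails, so the document is removed again and one line is appended to the log.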

Registration of a new user

Adding or Removing a like on post

Analysis queries

The following are some of the queries we ran over the databases to get interesting overviews of the data and extract information.

Document database queries: MongoDB

Most popular posts of a given period

Description:

This query selects the most appreciated posts, in terms of likes received, in a period between two dates, optionally filtering by an activity.

Mandatory parameters: fromDate, toDate

Optional parameters: activity name and maximum number of posts to return

Java method: PostService.getPopularPosts

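// Placeholders: activityName (string), fromDate and toDate (dates), howManyResults (integer).
db.post.aggregate([
  // Keep only the posts of the given activity dated in [fromDate, toDate).
  {$match: {
    activity: {$in: [activityName]},
    postDate: {$gte: fromDate, $lt: toDate}
  }},
  // Most liked posts first.
  {$sort: {likes: -1}},
  {$limit: howManyResults},
  {$project:
    {
      _id: 0,
      title: 1,
      authorUsername: 1,
      placeName: 1,
      desc: 1,
      postDate: 1
    }
  }
])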
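Since the activity filter and the result limit are optional, the pipeline has to be assembled at call time. The sketch below shows one way to do that in Python. It is not the code of PostService.getPopularPosts: the function name `popular_posts_pipeline` and the default limit of 10 are assumptions made for illustration.

```python
from datetime import datetime


def popular_posts_pipeline(from_date, to_date, activity=None, limit=10):
    """Build the aggregation pipeline; `activity` and `limit` are optional."""
    match = {"postDate": {"$gte": from_date, "$lt": to_date}}
    if activity is not None:
        # Add the activity filter only when the caller asks for it.
        match["activity"] = {"$in": [activity]}
    return [
        {"$match": match},
        {"$sort": {"likes": -1}},  # most liked posts first
        {"$limit": limit},
        {"$project": {
            "_id": 0,
            "title": 1,
            "authorUsername": 1,
            "placeName": 1,
            "desc": 1,
            "postDate": 1,
        }},
    ]


pipeline = popular_posts_pipeline(
    datetime(2022, 1, 1), datetime(2022, 2, 1), activity="hiking", limit=5
)
```

With a driver such as PyMongo, the resulting list could be passed to the `aggregate` method of the post collection.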

Graph queries: Neo4j

Domain query:

What are the most visited places among those visited by the users that a specified user follows?

Graph-centric query:

Considering U as all the User vertices that have an incoming edge “FOLLOWS” from a specific User vertex, select Place vertices that have an incoming edge “VISITED” from U vertices. Then count the incoming “VISITED” edges for each of those places.

Equivalent query in Cypher:

MATCH (u:User{username:$username})-[f:FOLLOWS]->(followings:User)-[v:VISITED]->(p:Place)
WITH p.id AS id, p.name AS place, count(v) AS visitTimes
ORDER BY visitTimes DESC
LIMIT $howManyResults
RETURN id, place, visitTimes
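To make the traversal easier to follow, here is a small in-memory sketch of the same logic in Python. It does not talk to Neo4j: the edge lists and the function name `most_visited_by_followings` are hypothetical, and each tuple stands for one FOLLOWS or VISITED edge.

```python
from collections import Counter


def most_visited_by_followings(username, follows, visited, limit):
    """Count VISITED edges per place, restricted to the users `username` follows.

    follows: iterable of (follower, followed) pairs, one per FOLLOWS edge
    visited: iterable of (user, place) pairs, one per VISITED edge
    """
    followings = {dst for src, dst in follows if src == username}
    counts = Counter(place for user, place in visited if user in followings)
    # Sorted by descending count, like ORDER BY visitTimes DESC ... LIMIT.
    return counts.most_common(limit)


follows = [("anna", "bob"), ("anna", "carl"), ("dave", "anna")]
visited = [
    ("bob", "Pisa"),
    ("carl", "Pisa"),
    ("carl", "Rome"),
    ("dave", "Rome"),
    ("dave", "Rome"),
]
top_places = most_visited_by_followings("anna", follows, visited, limit=2)
```

The visits made by dave are ignored here because anna does not follow dave.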

Domain query:

Suggest new posts to check out, located in the same places as the posts a user has liked, ordered by number of likes received.

Graph-centric query:

Considering P as the Post vertices with an incoming edge “LIKES” from a specific User vertex, select PL as the Place vertices that have an incoming edge “LOCATION” from P. Then, considering the other Post vertices with an outgoing edge “LOCATION” to PL, count their incoming “LIKES” edges and sort the Posts by this value.

Equivalent query in Cypher:

MATCH (u:User{username:$username})-[:LIKES]->(p:Post)-[:LOCATION]->(pl:Place)
WITH DISTINCT pl AS places, COLLECT(p) AS likedPosts
MATCH (:User)-[l:LIKES]->(sp:Post WHERE NOT(sp IN likedPosts))-[:LOCATION]->(places)
WITH DISTINCT sp AS suggestedPosts, COUNT(l) AS likeReceived
ORDER BY likeReceived DESC
RETURN suggestedPosts.id, likeReceived, suggestedPosts.title
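The same suggestion logic can be sketched in memory with plain Python. This is only an illustration of the traversal, not code from the project: the function name `suggest_posts` and the data shapes are assumptions, and every post is assumed to have exactly one LOCATION edge.

```python
from collections import Counter


def suggest_posts(username, likes, location):
    """Rank the posts the user has not liked yet, in places of posts they liked.

    likes: iterable of (user, post) pairs, one per LIKES edge
    location: dict mapping each post to its place (the LOCATION edge)
    """
    likes = list(likes)
    liked = {post for user, post in likes if user == username}
    places = {location[post] for post in liked}
    counts = Counter(
        post
        for _, post in likes
        if post not in liked and location[post] in places
    )
    # Posts with no likes never appear, as in the MATCH on the LIKES edge.
    return counts.most_common()


likes = [("anna", "p1"), ("bob", "p2"), ("carl", "p2"), ("bob", "p3"), ("bob", "p4")]
location = {"p1": "Pisa", "p2": "Pisa", "p3": "Pisa", "p4": "Rome"}
suggestions = suggest_posts("anna", likes, location)
```

Post p4 is left out because anna has not liked any post located in Rome.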

Managing redundancies

Since MappIt relies on two kinds of databases, the data population procedures must also take care of storing information in both systems.

Creating a new place, for example, requires inserting not only a document in MongoDB but also a node in Neo4j, while generating social relations mainly involves the Neo4j entities. However, some redundancies, such as the total likes counter, are cross-database.

These redundancies were designed to improve the execution time of frequent database operations, but they can introduce data inconsistencies.

To fix any inconsistencies that may be present, we implemented the redundancies updater procedure, which updates the redundant counters that we added to the documents of some entities in MongoDB, such as the “likes” field in the Post documents, the “followers” field in the User documents, and the “favourites” and “totalLikes” fields in the Place documents.

These redundancies need to be updated only when a user, all the posts of a user, or a single post are deleted from the application: in these cases consistency is delegated to this procedure, so that entity deletions do not put too much load on the server.

This procedure is scheduled periodically.
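A minimal sketch of what such an updater might do for the “likes” counter is given below. It is not the project's procedure: it works on in-memory stand-ins, the function name `refresh_like_counters` is hypothetical, and it assumes that the LIKES edges in the graph are the source of truth for the redundant counter.

```python
from collections import Counter


def refresh_like_counters(post_documents, like_edges):
    """Overwrite each post's `likes` field with the count found in the graph.

    post_documents: dict post id -> document (stand-in for the Post collection)
    like_edges: iterable of (username, post id) pairs (stand-in for LIKES edges)
    Returns the ids of the documents whose counter was stale.
    """
    actual = Counter(post_id for _, post_id in like_edges)
    corrected = []
    for post_id, doc in post_documents.items():
        # A Counter returns 0 for posts that have no LIKES edge left.
        if doc.get("likes") != actual[post_id]:
            doc["likes"] = actual[post_id]
            corrected.append(post_id)
    return corrected


posts = {"p1": {"likes": 5}, "p2": {"likes": 1}}
edges = [("anna", "p2")]  # the users who liked p1 have been deleted
stale = refresh_like_counters(posts, edges)
```

The same pattern would apply to the other counters, with a different edge type and document field for each.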

Full project documentation

We analyzed the aspects above in greater depth and breadth, along with others, such as:

Database query analysis by means of the Operation Frequency Table
Indexes on certain collections and document attributes to improve the performance of certain frequent queries
Redundant fields in documents to improve query performance in terms of executionStats
Database sharding: we proposed sharding the database by the country code of places and users in order to provide higher service availability
Java application package organization
Java application database connection handling
Service endpoints
Application use cases
Functional requirements
Non-Functional requirements
Data population service

Have a look at the full project documentation at this link
