E2E Recommendation System: data engineering preparation project for enthusiasts & students.

trendyol-data-eng-summer-intern-2019/recom-engine-white-paper

End to End Recommender System Development

Description

The aim of this project is to teach how to develop a recommender system and build the data pipeline that feeds it. To that end, participants must research and learn the basics of the required technologies: Docker, Spark Streaming & Spark ML, Kafka, Flume, Spring Boot and MongoDB. The architecture of the project is shown in the figure below.

System Overview

Figure 1: Visualized architecture of the project.

14 Days Learning Plan

  • Investigate Git & Docker (2 days)

  • Design and develop a basic Spring Boot REST API that provides two functions:

    • An event collection service

      • Forward user generated data to Kafka with a proper format (1 day)
    • User recommendation service

      • Return user’s preprocessed recommendations from MongoDB. (1 day)
  • A Flume service that reads data from Kafka and writes it to a file system (1 day)

  • A scheduled Spark ML service that reads the collected data daily or hourly and exports a model trained on it. (3 days)

  • A Spark job that reads user actions in real time, produces user recommendations from the pretrained model, and saves them to MongoDB. (3 days)

  • Dockerize all applications & provide a docker-compose file for easy initialization of the system (3 days)
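
The plan above leaves the data shapes open, so the following is only a sketch of what might cross the service boundaries: an event as the collection service could forward it to Kafka, and a recommendation document as the Spark job could store it in MongoDB. Everything named here is an assumption made for illustration: the `UserEvent` fields, the JSON encoding, the `to_kafka_record`, `top_n` and `to_recommendation_doc` helpers, and the idea that the pretrained model exposes latent factor vectors (as a matrix factorization model such as Spark ML's ALS would). It uses only the Python standard library and does not talk to Kafka, Spark or MongoDB.

```python
import heapq
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class UserEvent:
    """Hypothetical event schema; the project does not prescribe field names."""
    user_id: str
    item_id: str
    action: str  # for example "view" or "purchase" (assumed values)
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))


def to_kafka_record(event: UserEvent) -> Tuple[bytes, bytes]:
    """Encode an event as a (key, value) pair of bytes.

    Keying by user id means Kafka's default partitioner sends all events
    of one user to the same partition, which preserves their order.
    """
    key = event.user_id.encode("utf-8")
    value = json.dumps(asdict(event), sort_keys=True).encode("utf-8")
    return key, value


def top_n(user_vector: Sequence[float],
          item_vectors: Dict[str, Sequence[float]],
          n: int = 3) -> List[str]:
    """Score each item by its dot product with the user vector and
    return the ids of the n highest-scoring items, best first."""
    scores = {
        item_id: sum(u * v for u, v in zip(user_vector, vec))
        for item_id, vec in item_vectors.items()
    }
    return heapq.nlargest(n, scores, key=scores.get)


def to_recommendation_doc(user_id: str, item_ids: Sequence[str]) -> dict:
    """Assumed document shape: one document per user, looked up by _id."""
    return {"_id": user_id, "items": list(item_ids)}
```

Storing one document per user, keyed by user id, would keep the recommendation endpoint a single lookup; the real schema is a design decision for whoever implements the plan.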

Other Requirements

Footnote

This project was originally created during the Trendyol Data Engineering Internship Program. The sources are shared for enthusiasts and students who are learning or improving their data engineering skills, and as a proof of concept of a real-time recommendation system.

Credits

Team: Oğuzhan Bölükbaş, Sercan Ersoy, Yasin Uygun

Mentors: Hatice Özdemir, Veysi Ertekin

Interested In?

Take a look at our job offerings!

License

Apache-2.0