This repository has been archived by the owner on Feb 1, 2024. It is now read-only.

GaspardBT/pfe_rapids

PFE Rapids

End-of-studies project at CentraleSupelec. The goal is to discover and test the Rapids ecosystem, focusing on the cuML library.

Installation

Follow these steps:

  • Install Conda.
  • Create a working environment for Rapids using the following explainer: link. It installs all the dependencies required to run cuML.
  • Activate the environment: conda activate env_name.
  • Install the other dependencies listed for each sub-project.
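Once the environment is activated, a quick check can confirm that the expected packages are importable. The sketch below is only an illustration: the helper name missing_packages is hypothetical and not part of this repository, and the package list should be extended with whatever each sub-project requires.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "cuml" is the import name of cuML; add the dependencies of the
    # sub-project you want to run (this list is an assumption).
    missing = missing_packages(["cuml"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

The check only looks the packages up without importing them, so it does not confirm that a GPU is available.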

Data

You can download the data from the following links:

Hello Scripts

Script to benchmark different clustering algorithms. Code adapted from here.
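As a rough idea of what such a benchmark can look like, here is a minimal timing sketch. It is not the repository's script: time_fit and DummyClusterer are hypothetical names, and the dummy class only stands in for any estimator that exposes a scikit-learn style fit method.

```python
import time


def time_fit(make_estimator, data, repeats=3):
    """Return the best wall-clock time (in seconds) over `repeats` calls to fit."""
    best = float("inf")
    for _ in range(repeats):
        estimator = make_estimator()  # fresh estimator for every run
        start = time.perf_counter()
        estimator.fit(data)
        best = min(best, time.perf_counter() - start)
    return best


class DummyClusterer:
    """Stand-in with a scikit-learn style fit method, for illustration only."""

    def fit(self, data):
        self.n_samples_ = len(data)
        return self


if __name__ == "__main__":
    points = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    # Add one entry per algorithm to compare; each value builds a new estimator.
    candidates = {"dummy": DummyClusterer}
    for name, factory in candidates.items():
        print(f"{name}: {time_fit(factory, points):.6f} s")
```

Taking the best of several runs reduces the effect of one-off warm-up costs on the comparison.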

Test of cuML kNN on real data. Code adapted from here.

A script showing the most basic use of the cuML KMeans implementation.

Test of an integration of cuML in a Flask API.
Run the app with: python app.py
You can pretrain the model with the model_maker.py script; make sure to set the dataset path properly.

Urls Classification

Set the right dataset path and launch the script using python trainer_standalone.py

Full Stream

  • Set the right dataset path in this script.
  • Make sure Kafka is running; you can follow this.
  • Launch the mock producer python src/mock_producer/main.py
  • Launch the trainer python src/trainer/main.py
  • Launch the metric collector python src/metrics_garbage/main.py
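To illustrate the role of the mock producer in the steps above, the sketch below replays rows as JSON messages. It is an assumption about the general shape, not the code in src/mock_producer/main.py: the produce function, the "urls" topic name and the sample row are all hypothetical, and the send argument stands in for the send method of whichever Kafka client is used.

```python
import json
import time


def produce(rows, send, topic="urls", delay=0.0):
    """Send each row as a JSON-encoded message and return the number sent."""
    count = 0
    for row in rows:
        # `send` takes (topic, bytes); a Kafka client's send call can be wrapped to fit.
        send(topic, json.dumps(row).encode("utf-8"))
        count += 1
        if delay:
            time.sleep(delay)  # optional pause to simulate a slower stream
    return count


if __name__ == "__main__":
    # Hypothetical row; print stands in for a real Kafka producer here.
    sample = [{"url": "http://example.com", "label": 0}]
    produce(sample, send=lambda topic, value: print(topic, value))
```

Passing the send function as an argument keeps the replay logic testable without a running broker.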

Metrics Analysis

Notebooks to plot and analyze the collected metrics.
