Writing dummy snippets of code to read, manipulate, and build a simple ML model with PySpark.
Updated Jul 18, 2023 - Jupyter Notebook
Given a set of documents and a minimum similarity threshold, find the number of document pairs whose similarity exceeds the threshold.
This notebook contains detailed code for Spark, machine learning, and Databricks.
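The task described above can be sketched in a few lines. This is only an illustrative sketch, not the repository's code: the description does not name a similarity measure, so Jaccard similarity over lowercase whitespace tokens is assumed here, and the names `jaccard`, `count_similar_pairs`, and the sample `docs` are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets (assumed measure); empty vs. empty is 0.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def count_similar_pairs(documents: list[str], threshold: float) -> int:
    """Count unordered document pairs whose similarity strictly exceeds the threshold."""
    # Assumption: a document is represented by its set of lowercase whitespace tokens.
    token_sets = [set(doc.lower().split()) for doc in documents]
    return sum(
        1
        for a, b in combinations(token_sets, 2)
        if jaccard(a, b) > threshold
    )


# Hypothetical sample data for illustration only.
docs = [
    "spark makes big data simple",
    "spark makes big data fast",
    "cricket match prediction",
]
print(count_similar_pairs(docs, 0.5))
```

This brute-force version compares every pair, so its cost grows quadratically with the number of documents. On a large corpus the comparison step would typically be distributed or replaced by an approximate technique such as MinHash.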
A laboratory to carry out experiments with PySpark
Exploring a best-case Apache Spark working environment for robust data pipelines
An academic project carried out for the Distributed Data Analysis and Mining course (a. y. 2022/2023)
Assignments given in the CSE545 course. All assignments are part of this course.
1061 Data Mining Research and Practice Homeworks
Stock market machine learning
Repo contains various tutorials I've created to help people learn Python and other tools
Predict outcomes of IPL cricket matches for the year 2018 using the Spark MLlib framework.
Leverages parallel PySpark computation with BigDL, Intel's deep learning framework, to solve one-shot learning on a Pokemon dataset with a Siamese network.
An engineering process for data science and big data processing
My submission for Grab AI for S.E.A. challenge