This repository contains code and resources related to machine learning using Google Cloud's Vertex AI. The project leverages BigQuery for data collection and preprocessing before building machine learning models with Vertex AI.
- src/: This directory contains the source code for various machine learning models and utilities.
- data/: You can store your training and testing datasets in this directory.
- notebooks/: Jupyter notebooks showcasing examples and tutorials for using Vertex AI and BigQuery.
- docs/: Documentation files explaining the project structure, setup, and usage.
Before running the code, make sure you have the following installed:
- Python (version 3.10.8)
- Google Cloud SDK
- Additional Python dependencies listed in requirements.txt
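As a quick way to confirm the prerequisites above, here is a minimal sketch of a check script. It is not part of this repository. The function names are hypothetical, and it assumes that any Python 3.10.x interpreter is acceptable and that the Google Cloud SDK has put `gcloud`, `bq`, and `gsutil` on your PATH.

```python
import shutil
import sys

# Assumption: the README pins Python 3.10.8; this sketch only checks the 3.10 line.
REQUIRED_PYTHON = (3, 10)
# Command-line tools that ship with the Google Cloud SDK and are used in this README.
REQUIRED_TOOLS = ("gcloud", "bq", "gsutil")


def python_is_supported(version=None, required=REQUIRED_PYTHON):
    """Return True if the (major, minor) version equals the required pair."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) == tuple(required)


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    if not python_is_supported():
        found = sys.version.split()[0]
        print(f"Expected Python {REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]}.x, found {found}")
    for tool in missing_tools():
        print(f"Missing from PATH: {tool} (installed with the Google Cloud SDK)")
```

Running the script prints nothing when every check passes, and one line per problem otherwise.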
- Clone this repository and change into its directory:
    git clone https://github.com/pradeep-016/Machine-Learning-with-Vertex-AI-on-Google-Cloud.git
    cd Machine-Learning-with-Vertex-AI-on-Google-Cloud
- Install dependencies:
    pip install -r requirements.txt
- Configure Google Cloud credentials:
    gcloud auth application-default login
- Create a BigQuery dataset and table for your project.
- Export data from BigQuery to a CSV file. bq extract writes only to Cloud Storage, so the destination must be a gs:// URI rather than a local path:
    bq extract --destination_format CSV your-project:your-dataset.your-table gs://your-bucket/data/train.csv
  Replace your-project, your-dataset, your-table, and your-bucket with your actual values.
- If your dataset is a local file instead, upload it to Google Cloud Storage:
    gsutil cp data/train.csv gs://your-bucket/data/train.csv
  Replace your-bucket with your actual bucket name.
- Run the following command to train a model using Vertex AI:
    python src/train_model.py --input gs://your-bucket/data/train.csv --output gs://your-bucket/models/model.pkl
  Update the input and output paths to match your bucket and file names.
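The training command above passes two Cloud Storage URIs to src/train_model.py. That script is not shown here, so the following is only a sketch of how such a command-line interface could parse and validate its arguments. The names `parse_gcs_uri`, `build_parser`, and `main` are hypothetical, and the training step is left as a placeholder.

```python
import argparse
from urllib.parse import urlparse


def parse_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_path)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"Expected a URI like gs://bucket/path, got: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def build_parser():
    """Build a parser for the --input and --output flags used in this README."""
    parser = argparse.ArgumentParser(description="Train a model from a CSV in Cloud Storage.")
    parser.add_argument("--input", required=True, help="gs:// URI of the training CSV")
    parser.add_argument("--output", required=True, help="gs:// URI for the saved model")
    return parser


def main(argv=None):
    """Validate both URIs, then hand off to the (not shown) training code."""
    args = build_parser().parse_args(argv)
    source = parse_gcs_uri(args.input)
    destination = parse_gcs_uri(args.output)
    # Placeholder: the actual download, training, and upload logic would go here.
    print(f"Would train on {source} and write the model to {destination}")
    return source, destination

# In a real script, call main() under an `if __name__ == "__main__":` guard.
```

Validating the URIs up front means a mistyped local path such as data/train.csv fails immediately with a clear message, before any training work starts.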