Two GitHub Actions workflows handle Continuous Integration (CI) and Continuous Deployment (CD) of the ML pipeline on Vertex AI.
CI: This verifies proposed changes to the pipeline by creating and compiling it through the TFX CLI. Specifically, tfx pipeline create and tfx pipeline compile will error out if the pipeline code has import or syntax problems. You can also run the pipeline locally with tfx run create. In that case, however, the code needs to be structured so that the number of epochs and the dataset can be set differently for CI. You don't want CI to train for many epochs on the full dataset, since its only purpose is to check that nothing is broken.
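One way to keep a CI run cheap is to read the training budget from the environment, so the same pipeline code can do a short smoke test in CI and a full run elsewhere. The sketch below is only an illustration: the PIPELINE_MODE variable, the TrainingBudget fields, and the numbers are assumptions, not values taken from this project.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TrainingBudget:
    """How much work a single pipeline run is allowed to do."""
    epochs: int
    max_examples: Optional[int]  # None means "use the full dataset"


# Hypothetical budgets; tune them for your own pipeline.
CI_BUDGET = TrainingBudget(epochs=1, max_examples=64)
FULL_BUDGET = TrainingBudget(epochs=50, max_examples=None)


def select_budget(env: Mapping[str, str] = os.environ) -> TrainingBudget:
    """Return the small CI budget when PIPELINE_MODE=ci, otherwise the full one."""
    mode = env.get("PIPELINE_MODE", "full").strip().lower()
    return CI_BUDGET if mode == "ci" else FULL_BUDGET
```

A CI workflow could then export PIPELINE_MODE=ci before calling tfx run create, while the pipeline definition reads select_budget() when it builds the trainer arguments.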
CD: This runs the pipeline on Vertex AI. For flexibility, it uses the workflow_dispatch feature of GitHub Actions, which lets you set parameters manually in the GitHub Actions UI. This project defines the four parameters below (gcpProject, gcpRegion, and pipelineName are passed to the tfx run create command):
gcpProject: The GCP Project ID in which the pipeline runs on Vertex AI. This Project ID is also used to find the credentials for logging in and sending requests to Vertex AI. The credentials (JSON) should be stored as a GitHub Action Secret whose key is the same as the value of gcpProject, except that each dash (-) is replaced by an underscore (_), because - is not an allowed character in a GitHub Action Secret key.
gcpRegion: The GCP region in which the pipeline runs on Vertex AI. The default value is us-central1.
pipelineName: The TFX CLI needs to know which pipeline to run. Pipeline names are defined in the configs.py files and are registered automatically when tfx pipeline create runs. This project has only one pipeline, and its name is segmentation-training-pipeline by default.
enableDataflow: This option delegates the ExampleGen and Transform jobs to the Dataflow service. This is useful for large amounts of data, because the VM spec of each step in Vertex AI Pipelines is limited and a step can otherwise hit Out-Of-Memory (OOM) errors.
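The parameter handling described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the function names and the temp_location argument are hypothetical, while the Beam flags themselves (--runner, --project, --region, --temp_location) are standard Apache Beam pipeline options.

```python
from typing import List


def secret_key_for(gcp_project: str) -> str:
    """Map a GCP Project ID to the GitHub Action Secret key that holds its credentials."""
    # Dashes are replaced by underscores, as described for gcpProject.
    return gcp_project.replace("-", "_")


def beam_args_for(enable_dataflow: bool, gcp_project: str, gcp_region: str,
                  temp_location: str) -> List[str]:
    """Beam arguments for ExampleGen and Transform.

    An empty list means no Dataflow-specific arguments are passed.
    """
    if not enable_dataflow:
        return []
    return [
        "--runner=DataflowRunner",
        f"--project={gcp_project}",
        f"--region={gcp_region}",
        f"--temp_location={temp_location}",  # hypothetical GCS path chosen by the caller
    ]
```

For example, secret_key_for("my-gcp-project") gives "my_gcp_project", which is the key the workflow would look up among its secrets.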
The basic structure of the CD workflow is to run tfx pipeline create, tfx pipeline compile, and tfx run create sequentially:
tfx pipeline create: Creates and registers the pipeline on the local system. This step is required before the downstream commands, because they need to know which pipeline to operate on.
tfx pipeline compile: Compiles the pipeline. It produces the pipeline spec as a JSON file and, when the --build-image option is specified, also builds a Docker image and pushes it to Google Cloud's container registry. It also ensures that each step (component) of the pipeline runs on the newly built Docker image.
tfx run create: Runs the pipeline on Vertex AI when the --engine=vertex option is specified. Three of the GitHub Action parameters are used here: gcpProject for --project, gcpRegion for --region, and pipelineName for --pipeline-name.
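The three steps above can be sketched as a small Python driver that builds the command lines and runs them in order. This is an assumption-laden illustration rather than the workflow's actual definition: the function names and the pipeline path passed by the caller are hypothetical, and the image-building option is left out for brevity.

```python
import subprocess
from typing import List


def build_cd_commands(pipeline_path: str, pipeline_name: str,
                      gcp_project: str, gcp_region: str) -> List[List[str]]:
    """Return the three TFX CLI invocations in the order the CD workflow runs them."""
    return [
        ["tfx", "pipeline", "create",
         f"--pipeline-path={pipeline_path}", "--engine=vertex"],
        ["tfx", "pipeline", "compile",
         f"--pipeline-path={pipeline_path}", "--engine=vertex"],
        ["tfx", "run", "create", "--engine=vertex",
         f"--pipeline-name={pipeline_name}",
         f"--project={gcp_project}",
         f"--region={gcp_region}"],
    ]


def run_cd(commands: List[List[str]]) -> None:
    """Run each command in turn; check=True stops at the first failing step."""
    for command in commands:
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes the mapping from workflow parameters to CLI flags easy to inspect or unit test without invoking the TFX CLI.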