Common solutions and tools developed by Google Cloud's Professional Services team.
This repository and its contents are not an officially supported Google product.
All solutions within this repository are provided under the Apache 2.0 license. Please see the LICENSE file for more detailed terms and conditions.
The examples folder contains example solutions across a variety of Google Cloud Platform products. Use these solutions as a reference for your own or extend them to fit your particular use case.
- Anthos Service Mesh Multi-Cluster - Solution to federate two private GKE clusters using Anthos Service Mesh.
- Anthos CICD with Gitlab - A step-by-step guide to create an example CI/CD solution using Anthos and Gitlab.
- Audio Content Profiling - A tool that builds a pipeline to scale the process of moderating audio files for inappropriate content using machine learning APIs.
- Bigdata generator - Solution that generates large amounts of data for stress-testing bigdata solutions (e.g., BigQuery). For each of the fields you want to generate, you can specify rules for generating their values. The generated data can be stored in BigQuery or GCS (Avro, CSV).
- BigQuery Analyze Realtime Reddit Data - Solution to deploy a (reddit) social media data collection architecture on Google Cloud Platform. Analyzes reddit comments in realtime and provides free natural-language processing / sentiment.
- BigQuery Audit Log Dashboard - Solution to help audit BigQuery usage using Data Studio for visualization and a sample SQL script to query the back-end data source consisting of audit logs.
- BigQuery Audit Log Anomaly Detection - Sample of using BigQuery audit logs for automated anomaly detection and outlier analysis. Generates user friendly graphs for quick bq environment analysis.
- BigQuery Automated Email Exports - Serverless solution to automate the sending of BigQuery export results via email on a scheduled interval. The email will contain a link to a signed or unsigned URL, allowing the recipient to view query results as a JSON, CSV, or Avro file.
- BigQuery Automated Schema Management - Command-line utility for automated provisioning and management of BigQuery datasets and tables.
- BigQuery Billing Dashboard - Solution to help display billing info using Data Studio for visualization and a sample SQL script to query the back-end billing export table in BigQuery.
- BigQuery Cross Project Slot Monitoring - Solution to help monitor slot utilization across multiple projects, while breaking down allocation per project.
- BigQuery Data Consolidator - Solution to consolidate data within an organization from multiple projects into one target Dataset/Table where all Source tables are of same schema (like Billing Exports!); specifically useful for data consolidation and further reporting in Cloud FinOps engagements.
- BigQuery DDL Validator - A utility that will read the Legacy DDL and compare it against the previously extracted DDL and produce an output with the name of the objects where the DDL is no longer matching.
- BigQuery Group Sync For Row Level Access - Sample code to synchronize group membership from G Suite/Cloud Identity into BigQuery and join that with your data to control access at row level.
- BigQuery Long Running Optimization Utility - A utility that reads the entire SQL and provides a list of suggestions that would help to optimize the query and avoid the long running issues.
- BigQuery Oracle DDL Migration Utility - Oracle DDL Migration Utility to migrate the tables schema (DDL) from Oracle DB to BigQuery. The utility leverages BigQuery Translation API and offers additional features such as adding partitioning, clustering, metadata columns and prefixes to table names.
- BigQuery Pipeline Utility - Python utility class for defining data pipelines in BigQuery.
- BigQuery Remote Function - Allows users to implement custom services or libraries in languages other than SQL or JavaScript, which UDFs do not support. The utility contains sample string format Java code to deploy a Cloud Run gen2 instance and invoke the service from BigQuery using a remote function.
- BigQuery Amazon S3 Migration Tool - BigQuery Migration Tool to transfer data from files in Amazon S3 to BigQuery tables based on the configuration provided.
- BigQuery Snowflake Table Migration Tool - BigQuery Snowflake Table Migration Tool helps to migrate table DDLs from Snowflake to BigQuery. The utility leverages BigQuery Translation API and offers additional features such as adding partitioning, clustering, metadata columns and prefixes to table names.
- BigQuery Table Access Pattern Analysis - Sample code to analyse data pipeline optimisation points, by pinpointing suboptimal pipeline scheduling between tables in a data warehouse ELT job.
- BigQuery Tink Toolkit - Python utility class for working with Tink-based cryptography in on-prem or GCP systems in a way that is interoperable with BigQuery's field-level encryption. Includes a sample PySpark job and a script for generating and uploading KMS-encrypted Tink keysets to BigQuery.
- BigQuery to XML Export - Python tool that takes a BigQuery query and returns the output as an XML string.
- BigQuery Translation Validator - A Python utility to compare two SQL files and point out basic differences like column names, table names, joins, function names, is-Null and query syntax.
- BigQuery Generic DDL Migration Utility - Generic DDL Migration Utility to migrate table schemas (DDL) from a database (Oracle, Snowflake, MSSQL, Vertica, Netezza) to BigQuery. The utility leverages BigQuery Translation API and offers additional features such as adding partitioning, clustering, metadata columns and prefixes to table names.
- Bigtable Dataflow Cryptocurrencies Exchange RealTime Example - Apache Beam example that reads from the Crypto Exchanges WebSocket API as Google Cloud Dataflow pipeline and saves the feed in Google Cloud Bigtable. Real time visualization and query examples from GCP Bigtable running on Flask server are included.
- Bigtable Dataflow Update Table Key Pipeline - Dataflow pipeline with an example of how to update the key of an existing table. It works with any table, regardless of the schema. It shows how to update the key of a table with existing data, so you can try out different alternatives to improve performance.
- Carbon Footprint Reporting - Example of using the prebuilt Data Studio & Looker template for analysing GCP Carbon Footprint Estimates.
- Cloud Audit Log Samples - A sample collection of Audit Logs for Users and Customers to better understand the structure, contents, and values contained in various log events.
- Cloud Build Application CICD Examples - Cloud Build CI/CD Examples for Applications like containerization & deployment to Cloud Run.
- Cloud Build with Proxy Running in Background - Examples of cloudbuild with docker-compose running tcp proxy in the background for all build steps.
- Cloud Composer CI/CD - Examples of using Cloud Build to deploy airflow DAGs to Cloud Composer.
- Cloud Composer Deployment in Shared VPC - Terraform code to deploy cloud composer in shared VPC environment.
- Cloud Composer Dependency Management - Example of Cloud Composer Dependency Management designed to orchestrate complex task dependencies within Apache Airflow which addresses the challenge of managing parent-child DAG relationships across varying temporal frequencies (yearly, monthly, weekly etc)
- Cloud Composer Examples - Examples of using Cloud Composer, GCP's managed Apache Airflow service.
- Cloud Data Fusion Functions and Plugins - Examples of Cloud Data Fusion Functions and Plugins.
- Cloud DNS load balancing - Multi-region HA setup for GCE VMs and Cloud Run based applications utilizing Cloud DNS load balancing and multiple Google Cloud load balancer types.
- Cloud DNS public zone monitoring - Visualizing Cloud DNS public zone query data using log-based metrics and Cloud Monitoring.
- Cloud Function Act As - Example of executing a Cloud Function on behalf of, and with the IAM permissions of, the GitHub Workload Identity caller.
- Cloud Function VM Delete Event Handler Example - Solution to automatically delete A records in Cloud DNS when a VM is deleted. This solution implements a Google Cloud Function Background Function triggered on compute.instances.delete events published through Stackdriver Logs Export.
- Certificate Authority Service Hierarchy - Root and Subordinate Certificate Authority Service CA Pools and CAs with examples for domain ownership validation and sample load test script.
- Cloud Run to BQ - Solution to accept events/data on HTTP REST Endpoint and insert into BQ.
- Cloud SQL Custom Metric - An example of creating a Stackdriver custom metric monitoring Cloud SQL Private Services IP consumption.
- Cloud Support API - Sample code using Cloud Support API
- CloudML Bank Marketing - Notebook for creating a classification model for marketing using CloudML.
- CloudML Bee Health Detection - Detect if a bee is unhealthy based on an image of it and its subspecies.
- CloudML Churn Prediction - Predict users' propensity to churn using Survival Analysis.
- CloudML Customer Support and Complaint Handling - BigQuery + AutoML pipeline classifying customer complaints based on expected resolution; adaptable to other support communications use cases.
- CloudML Deep Collaborative Filtering - Recommend songs given either a user or song.
- CloudML Energy Price Forecasting - Predicting the future energy price based on historical price and weather.
- CloudML Fraud Detection - Fraud detection model for credit card transactions.
- CloudML Scikit-learn Pipeline - This is an example of building a scikit-learn-based machine learning pipeline trainer that can be run on AI Platform. The pipeline can be trained locally or remotely on AI Platform. The trained model can be further deployed on AI Platform to serve online traffic.
- CloudML Sentiment Analysis - Sentiment analysis for movie reviews using TensorFlow RNNEstimator.
- CloudML TensorFlow Profiling - TensorFlow profiling examples for training models with CloudML
- Data Generator - Generate random data with a custom schema at scale for integration tests or demos.
- Dataflow BigQuery Transpose Example - An example pipeline to transpose/pivot/rotate a BigQuery table.
- Dataflow Custom Templates Example - An example that demonstrates how to build custom Dataflow templates.
- Dataflow Elasticsearch Indexer - An example pipeline that demonstrates the process of reading JSON documents from Cloud Pub/Sub, enhancing the document using metadata stored in Cloud Bigtable and indexing those documents into Elasticsearch.
- Dataflow BigQuery to AlloyDB - Example that shows how to move data from BigQuery to an AlloyDB table using Dataflow.
- Dataflow Flex Template in Restricted Networking Env - Example implements a python flex template which can be run in an environment where workers can not download python packages due to egress traffic restrictions.
- Dataflow Python Examples - Various ETL examples using the Dataflow Python SDK.
- Dataflow Scala Example: Kafka2Avro - Example to read objects from Kafka, and persist them encoded in Avro in Google Cloud Storage, using Dataflow with SCIO.
- Dataflow Streaming Benchmark - Utility to publish randomized fake JSON messages to a Cloud Pub/Sub topic at a configured QPS.
- Dataflow Streaming Schema Changes Handler - Dataflow example to handle schema changes using schema enforcement and DLT approach
- Dataflow Streaming XML to GCS - Dataflow example to handle streaming of xml encoded messages and write them to Google Cloud Storage
- Dataflow DLP Hashpipeline - Match DLP Social Security Number findings against a hashed dictionary in Firestore. Use Secret Manager for the hash key.
- Dataflow Template Pipelines - Pre-implemented Dataflow template pipelines for solving common data tasks on Google Cloud Platform.
- Dataflow Production Ready - Reference implementation for best practices around Beam, pipeline structuring, testing and continuous deployment.
- Dataflow XML to BigQuery - Example of loading XML data into BigQuery with DataFlow via XMLIO.
- Dataproc GCS Connector - Install and test unreleased features on the GCS Connector for Dataproc.
- Dataproc Job Optimization Guide - Step-by-step guide for optimizing a sample Dataproc Job.
- Dataproc Persistent History Server for Ephemeral Clusters - Example of writing logs from an ephemeral cluster to GCS and using a separate single node cluster to look at Spark and YARN History UIs.
- Dataproc Lifecycle Management via Composer - Ephemeral Dataproc lifecycle management and resources optimization via Composer, Terraform template to deploy Composer and additional reqs, Dynamically generated DAGs from jobs config files.
- Dataproc Running Notebooks - Orchestrating the workflow of running Jupyter Notebooks on a Dataproc cluster via PySpark job
- dbt-on-cloud-composer - Example of using dbt to manage BigQuery data pipelines, utilizing Cloud Composer to run and schedule the dbt runs.
- Data Format Description Language (DFDL) Processor with Firestore and Pubsub - Example to process a binary using a DFDL definition and Daffodil libraries. The DFDL definition is stored in Firestore, the request to process is made through a Pubsub subscription, and the output is published in JSON format to a Pubsub topic.
- Data Format Description Language (DFDL) Processor with Bigtable and Pubsub - Example to process a binary using a DFDL definition and Daffodil libraries. The DFDL definition is stored in Bigtable, the request to process is made through a Pubsub subscription, and the output is published in JSON format to a Pubsub topic.
- Dialogflow Webhook Example - Webhook example for dialogflow in Python.
- Dialogflow CX Private Webhook Example - Webhook example for Dialogflow CX in Python.
- Dialogflow Middleware Example - Dialogflow middleware example in Java.
- Dialogflow Entities Creation and Update - Creation and update of entities for Dialogflow in Python.
- DLP API Examples - Examples of the DLP API usage.
- Ephemeral Projects - Creating short lived gcp projects for sandbox purposes.
- GCE Access to Google AdminSDK - Example to help manage access to Google's AdminSDK using GCE's service account identity
- GCS Hive External Table File Optimization - Example solution to showcase impact of file count, file size, and file type on Hive external tables and query speeds.
- GCS to BQ using serverless services - Example to ingest GCS to BigQuery using serverless services such as Cloud Function, Pub/Sub and Serverless Spark.
- GDCE Terraform Example - Example for provisioning GDCE resources using terraform.
- GKE HA setup using spot VMs - Example for running an application with high availability requirements on GKE spot nodes using on-demand nodes as fallback
- Grpc Server connected to Spanner Database - Basic example of a Grpc server that is connected to a Spanner database.
- Grpc Server connected to Redis - Basic example of a Grpc server that is connected to Redis.
- Gitlab KAS agent for GKE - Terraform solution for deploying a Gitlab KAS agent for synchronizing container deployments from Gitlab repos into a GKE cluster
- Home Appliance Status Monitoring from Smart Power Readings - An end-to-end demo system featuring a suite of Google Cloud Platform products such as IoT Core, ML Engine, BigQuery, etc.
- IAP User Profile - An example to retrieve user profile from an IAP-enabled GAE application.
- IoT Nirvana - An end-to-end Internet of Things architecture running on Google Cloud Platform.
- Kubeflow Pipelines Sentiment Analysis - Create a Kubeflow Pipelines component and pipelines to analyze sentiment for New York Times front page headlines using Cloud Dataflow (Apache Beam Java) and Cloud Natural Language API.
- Kubeflow Fairing Example - Three notebooks that demonstrate how to use Kubeflow Fairing to train machine learning jobs (Scikit-Learn, XGBoost, TensorFlow) locally or in the cloud (AI Platform Training or a Kubeflow cluster).
- Left-Shift Validation Pre-Commit Hook - An example that uses a set of Bash scripts to set up a pre-commit hook that validates Kubernetes resources with Gatekeeper constraints and constraint templates from your choice of sources.
- LookerStudio Cost Optimization Dashboard - SQL scripts to help build Cost Optimization LookerStudio Dashboard.
- Personal Workbench Notebooks Deployer - Terraform sample modules to provision Dataproc Hub using personal auth clusters, and workbench managed notebooks for individual analytical users.
- Project factory with Terragrunt - This implements a State-Scalable project factory pattern for creating Google Cloud Platform projects using Terragrunt and public Terraform modules.
- Python CI/CD with Cloud Builder and CSR - Example that uses Cloud Builder and Cloud Source Repositories to automate testing and linting.
- Pub/Sub Client Batching Example - Batching in Pub/Sub's Java client API.
- QAOA - Examples of parsing a max-SAT problem in a proprietary format, for Quantum Approximate Optimization Algorithm (QAOA)
- Redis Cluster on GKE Example - Deploying Redis cluster on GKE.
- Risk Analysis Asset - Deploying Reliability Risk analysis tool on Cloud Run.
- Spanner Interleave Subquery - Example code to benchmark Cloud Spanner's subqueries for interleaved tables.
- Spanner Change Stream to BigQuery using Dataflow - Terraform code to deploy Spanner change stream and publish changes to BigQuery using Dataflow Streaming Job.
- Spinnaker - Example pipelines for a Canary / Production deployment process.
- STS Metrics from STS Notification - Example code to generate custom metrics from STS notification.
- TensorFlow Serving on GKE and Load Testing - Examples of how to implement TensorFlow model inference on GKE and load test such a solution.
- TensorFlow Unit Testing - Examples of how to write unit tests for TensorFlow ML models.
- Terraform Internal HTTP Load Balancer - Terraform example showing how to deploy an internal HTTP load balancer.
- Terraform NetApp CVS - This example shows how to deploy NetApp CVS volumes using terraform.
- Terraform Resource Change Policy Library - Contains a library of policies written in the OPA Constraint Framework format to be used by gcloud beta terraform vet to validate Terraform resource changes in a CI/CD pipeline.
- Uploading files directly to Google Cloud Storage by using Signed URL - Example architecture to enable uploading files directly to GCS by using Signed URL.
- TSOP object transfer Log processor - This example shows how to send TSOP object transfer logs to Cloud Logging.
- GCS CSV files to BigQuery - This example shows how to load CSV files stored in GCS into BigQuery tables. The files can be uncompressed or compressed in formats such as Bzip2 and GZIP. See https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/Compression.html for the list of supported compression methods.
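As a rough illustration of the last entry above, reading CSV files that may or may not be compressed can be sketched with the Python standard library alone. This is not the example's Beam pipeline. The function name `read_csv_rows` and the suffix-to-opener mapping are hypothetical, and choosing the decompressor from the file suffix is an assumption made for this sketch.

```python
import bz2
import csv
import gzip
from pathlib import Path

# Hypothetical mapping from file suffix to an opener (assumption, not the example's code).
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def read_csv_rows(path):
    """Return all rows of a CSV file, decompressing by suffix when needed."""
    # Fall back to the built-in open() for uncompressed files.
    opener = _OPENERS.get(Path(path).suffix.lower(), open)
    # Text mode with newline="" is what the csv module expects.
    with opener(path, mode="rt", newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))
```

Calling `read_csv_rows("data.csv.gz")` and `read_csv_rows("data.csv")` on files with the same content should return the same rows, which is the behaviour the entry describes for compressed and uncompressed inputs.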
The tools folder contains ready-made utilities which can simplify Google Cloud Platform usage.
- Agile Machine Learning API - A web application which provides the ability to train and deploy ML models on Google Cloud Machine Learning Engine, and visualize the predicted results using LIME through simple post request.
- Airflow DAG Metadata Generator - Use Google's generative models to analyze Airflow DAGs and supplement them with generated description, tags, and doc_md values.
- Airflow States Collector - A tool that creates and uploads an Airflow DAG to the DAGs GCS folder. The DAG incrementally collects Airflow task states and stores them in BQ. It also autogenerates a LookerStudio dashboard querying the BQ view.
- Airpiler - A python script to convert Autosys JIL files to dag-factory format to be executed in Cloud Composer (managed airflow environment).
- Ansible Module for Anthos on Bare Metal - Ansible module for installation of Anthos on Bare Metal
- Anthos Bare Metal Installer - An ansible playbook that can be used to install Anthos Bare Metal.
- Apache Beam Client Throttling - A library that can be used to limit the number of requests from an Apache Beam pipeline to an external service. It buffers requests to not overload the external service and activates client-side throttling when the service starts rejecting requests due to out of quota errors.
- API Key Rotation Checker - A tool that checks your GCP organization for API keys and compares them to a customizable rotation period. Regularly rotating API keys is a Google and industry standard recommended best practice.
- AssetInventory - Import Cloud Asset Inventory resources into BigQuery.
- BigQuery Discount Per-Project Attribution - A tool that automates the generation of a BigQuery table that uses existing exported billing data, by attributing both CUD and SUD charges on a per-project basis.
- BigQuery Policy Tag Utility - Utility class for tagging BQ Table Schemas with Data Catalog Taxonomy Policy Tags. Create BQ Authorized Views using Policy Tags. Helper utility to provision BigQuery Dataset, Data Catalog Taxonomy and Policy Tags.
- BigQuery Query Plan Exporter - Command line utility for exporting BigQuery query plans in a given date range.
- BigQuery Query Plan Visualizer - A web application which provides the ability to visualise the execution stages of BigQuery query plans to aid in the optimization of queries.
- BigQuery z/OS Mainframe Connector - A utility used to load COBOL MVS data sets into BigQuery and execute query and load jobs from the IBM z/OS Mainframe.
- Boolean Organization Policy Enforcer - A tool to find the projects that do not set a boolean organization policy to its expected state and subsequently set the organization policy to that expected state.
- Capacity Planner CLI - A stand-alone tool to extract peak resource usage values and corresponding timestamps for a given GCP project, time range and timezone.
- Capacity Planner Sheets Extension - A Google Sheets extension to extract peak resource usage values and corresponding timestamps for a given GCP project, time range and timezone.
- CloudConnect - A package that automates the setup of dual VPN tunnels between AWS and GCP.
- Cloudera Parcel GCS Connector - This script helps you create a Cloudera parcel that includes Google Cloud Storage connector. The parcel can be deployed on a Cloudera managed cluster.
- Cloud AI Vision Utilities - This is an installable Python package that provides support tools for Cloud AI Vision. Currently there are a few scripts for generating an AutoML Vision dataset CSV file from either raw images or image annotation files in PASCAL VOC format.
- Cloud Composer Backup and Recovery - A command line tool for applying backup and recovery operations on Cloud Composer Airflow environments.
- Cloud Composer DAG Validation - An automated process for running validation and testing against DAGs in Composer.
- Cloud Composer Migration Complexity Assessment - An Airflow DAG that uses a variety of tools to analyze a Cloud Composer 1 environment, determine a work estimate, and accelerate the conversion of airflow 1 dags to airflow 2 dags.
- Cloud Composer Migration Terraform Generator - Analyzes an existing Cloud Composer 1 / Airflow 1 environment and generates terraform. Configures new Cloud Composer 2 environment to meet your workload demands.
- CUD Prioritized Attribution - A tool that allows GCP customers who purchased Committed Use Discounts (CUDs) to prioritize a specific scope (e.g. project or folder) to attribute CUDs first before letting any unconsumed discount float to other parts of an organization.
- Custom Organization Policy Library - A library of custom organization policy constraints and samples. It includes tools to easily generate policies for provisioning across your organization using either Google Cloud (gcloud) or Terraform.
- Custom Role Analyzer - This tool will provide useful insights with respect to custom roles at organization level as well as project level to find predefined roles from which the custom role is built.
- Custom Role Manager - Manages organization- or project-level custom roles by combining predefined roles and including and removing permissions with wildcards. Can run as Cloud Function or output Terraform resources.
- Dataproc Event Driven Spark Recommendations - Use Google Cloud Functions to analyze Cloud Dataproc clusters and recommend best practices for Apache Spark jobs. Also logs cluster configurations for future reference.
- Dataproc Scheduled Cluster Sizing - Use Google Cloud Scheduler and Google Cloud Functions to schedule the resizing of a Dataproc cluster. Changes the primary and secondary worker count.
- DataStream Deployment Automation - Python script to automate the deployment of Google Cloud DataStream. This script will create connection profiles, create stream and start stream.
- DLP to Data Catalog - Inspect your tables using Data Loss Prevention for PII data and automatically tag it on Data Catalog using Python.
- DNS Sync - Sync a Cloud DNS zone with GCE resources. Instances and load balancers are added to the cloud DNS zone as they start from compute_engine_activity log events sent from a pub/sub push subscription. Can sync multiple projects to a single Cloud DNS zone.
- Firewall Enforcer - Automatically watch & remove illegal firewall rules across organization. Firewall rules are monitored by a Cloud Asset Inventory Feed, which trigger a Cloud Function that inspects the firewall rule and deletes it if it fails a test.
- GCE Disk Encryption Converter - A tool that converts disks attached to a GCE VM instance from Google-managed keys to a customer-managed key stored in Cloud KMS.
- GCE switch disk-type - A tool that changes type of disks attached to a GCE instance.
- GCE Quota Sync - A tool that fetches resource quota usage from the GCE API and synchronizes it to Stackdriver as a custom metric, where it can be used to define automated alerts.
- GCE Usage Log - Collect GCE instance events into a BigQuery dataset, surfacing your vCPUs, RAM, and Persistent Disk, sliced by project, zone, and labels.
- GCP Architecture Visualizer - A tool that takes CSV output from a Forseti Inventory scan and draws out a dynamic hierarchical tree diagram of org -> folders -> projects -> gcp_resources using the D3.js javascript library.
- GCP AWS HA VPN Connection terraform - Terraform script to setup HA VPN between GCP and AWS.
- GCP Azure HA VPN Connection Terraform - Terraform code to setup HA VPN between GCP and Microsoft Azure.
- GCP Organization Hierarchy Viewer - A CLI utility for visualizing your organization hierarchy in the terminal.
- GCPViz - a visualization tool that takes input from Cloud Asset Inventory, creates relationships between assets and outputs a format compatible with graphviz.
- GCS Bucket Mover - A tool to move user's bucket, including objects, metadata, and ACL, from one project to another.
- GCS to BigQuery - A tool that fetches object metadata from all Google Cloud Storage buckets and exports it in a format that can be imported into BigQuery for further analysis.
- GCS Usage Recommender - A tool that generates bucket-level intelligence and access patterns across all projects for a GCP project to generate recommended object lifecycle management.
- GCVE2BQ - A tool for scheduled exports of VM, datastore and ESXi utilization data from vCenter to BigQuery for billing and reporting use cases.
- GKE AutoPSC Controller - Google Kubernetes Engine controller, to setup PSC ServiceAttachment for Gateway API managed Forwarding Rules.
- Global DNS -> Zonal DNS Project Bulk Migration - A shell script for gDNS-zDNS project bulk migration.
- GKE Billing Export - Google Kubernetes Engine fine grained billing export.
- gmon - A command-line interface (CLI) for Cloud Monitoring written in Python.
- Google Cloud Support Slackbot - Slack application that pulls Google Cloud support case information via the Cloud Support API and pushes the information to Slack
- GSuite Exporter Cloud Function - A script that deploys a Cloud Function and Cloud Scheduler job that executes the GSuite Exporter tool automatically on a cadence.
- GSuite Exporter - A Python package that automates syncing Admin SDK APIs activity reports to a GCP destination. The module takes entries from the chosen Admin SDK API, converts them into the appropriate format for the destination, and exports them to a destination (e.g: Stackdriver Logging).
- Hive to BigQuery - A Python framework to migrate Hive table to BigQuery using Cloud SQL to keep track of the migration progress.
- IAM Permissions Copier - This tool allows you to copy supported GCP IAM permissions from unmanaged users to managed Cloud Identity users.
- IAM Recommender at Scale - A python package that automates applying iam recommendations.
- Instance Mapper - Maps different IaaS VM instance types from EC2 and Azure Compute to Google Cloud Platform instance types using a customizable score-based method. Also supports database instances.
- IPAM Autopilot - A simple tool for managing IP address ranges for GCP subnets.
- K8S-2-GSM - A containerized golang app to migrate Kubernetes secrets to Google Secret Manager (to leverage CSI secret driver).
- LabelMaker - A tool that reads key:value pairs from a json file and labels the running instance and all attached drives accordingly.
- Logbucket Global to Regional - Utility to change _Default sink destination to regional log buckets
- Machine Learning Auto Exploratory Data Analysis and Feature Recommendation - A tool to perform comprehensive auto EDA, based on which feature recommendations are made, and a summary report will be generated.
- Maven Archetype Dataflow - A maven archetype which bootstraps a Dataflow project with common plugins pre-configured to help maintain high code quality.
- Netblock Monitor - An Apps Script project that will automatically provide email notifications when changes are made to Google’s IP ranges.
- OpenAPI to Cloud Armor converter - A simple tool to generate Cloud Armor policies from OpenAPI specifications.
- Permission Discrepancy Finder - A tool to find the principals with missing permissions on a resource within a project and subsequently grant them the missing permissions.
- Pubsub2Inbox - A generic Cloud Function-based tool that takes input from Pub/Sub messages and turns them into email, webhooks or GCS objects.
- Quota Manager - A python module to programmatically update GCP service quotas such as bigquery.googleapis.com.
- Quota Monitoring and Alerting - An easy-to-deploy Data Studio Dashboard with alerting capabilities, showing usage and quota limits in an organization or folder.
- Ranger Hive Assessment for BigQuery/BigLake IAM migration - A tool that assesses which Ranger authorization rules can be migrated or not to BigQuery/BigLake IAM.
- Reddit Comment Streaming - Use PRAW, TextBlob, and Google Python API to collect and analyze reddit comments. Pushes comments to a Google Pub/sub Topic.
- Secret Manager Helper - A Java library to make it easy to replace placeholder strings with Secret Manager secret payloads.
- Service Account Provider - A tool to exchange GitLab CI JWT tokens against GCP IAM access tokens, in order to allow GitLab CI jobs to access Google Cloud APIs
- Site Verification Group Sync - A tool to provision "verified owner" permissions (to create GCS buckets with custom dns) based on membership of a Google Group.
- SLO Generator - A Python package that automates computation of Service Level Objectives, Error Budgets and Burn Rates on GCP, and exports the computation results to available exporters (e.g: PubSub, BigQuery, Stackdriver Monitoring), using policies written in JSON format.
- Snowflake_to_BQ - A shell script to transfer tables (schema & data) from Snowflake to BigQuery.
- SPIFFE GCP Proxy - A tool to ease the integration of SPIFFE supported On-Prem workloads with GCP APIs using Workload Identity Federation
- STS Job Manager - A petabyte-scale bucket migration tool utilizing Storage Transfer Service
- Vertex AI Endpoint Tester - This utility helps to methodically test variety of Vertex AI Endpoints by their sizes so that one can decide the right size to deploy an ML Model on Vertex AI given a sample request JSON and some idea(s) on expected queries per second.
- VM Migrator - This utility automates migrating Virtual Machine instances within GCP. You can migrate VMs from one zone to another zone/region within the same project or different projects while retaining all the original VM properties like disks, network interfaces, IP, metadata, network tags and much more.
- VPC Flow Logs Analysis - A configurable Log sink + BigQuery report that shows traffic attributed to the projects in the Shared VPCs.
- VPC Flow Logs Enforcer - A Cloud Function that will automatically enable VPC Flow Logs when a subnet is created or modified in any project under a particular folder or folders.
- VPC Flow Logs Top Talkers - A configurable Log sink + BigQuery view to generate monthly/daily aggregate traffic reports per subnet or host, with the configurable labelling of IP ranges and ports.
- Webhook Ingestion Data Pipeline - A deployable app to accept and ingest unauthenticated webhook data to BigQuery.
- XSD to BigQuery Schema Generator - A command line tool for converting an XSD schema representing deeply nested and repeated XML content into a BigQuery compatible table schema represented in JSON.
- Numeric Family Recommender - Oracle - The Numeric Family Recommender is a database script that recommends the best numeric data type for the NUMBER data type when migrating from legacy databases like Oracle to Google Cloud platforms like BigQuery, AlloyDB, Cloud SQL for PostgreSQL, and Google Cloud Storage.
- Cloud Composer Stress Testing - A collection of tools aimed at testing, benchmarking, and simulating workloads within Composer. Great for integration testing and experimenting with different environment configurations.
- Cloud Composer Environment Rotator - Rotate Airflow resources from an old composer environment to a new composer environment with minimal downtime. Ideal for non in-place environment updates, downgrading environment versions, or migrating to different regions.
- Gradio and Generative AI Example - The example code allows developers to create rapid Generative AI PoC applications with Gradio and Gen AI agents.
- Memorystore Cluster Ops Framework - This is a framework that provides the tools to apply cluster level operations that enable capabilities like cluster backups, migration & validation, etc. The framework can be extended for other use cases as required. The framework uses RIOT to bridge current product gaps with Memorystore Clusters
- ML Project Generator - A utility to create a Production grade ML project template with the best productivity tools installed like auto-formatting, license checks, linting, etc.
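The core check described in the API Key Rotation Checker entry above, comparing each key's age to a rotation period, might look roughly like the following. This is a minimal sketch and not the tool's implementation. The function name `keys_due_for_rotation`, the dictionary shape of each key, and the 90-day default are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def keys_due_for_rotation(keys, rotation_days=90, now=None):
    """Return the names of keys older than the rotation period.

    `keys` is a hypothetical list of dicts with "name" and "create_time"
    (a timezone-aware datetime); the real tool's data model may differ.
    The 90-day default is an assumption, not a documented value.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=rotation_days)
    # A key is due when it was created before the cutoff.
    return [key["name"] for key in keys if key["create_time"] < cutoff]
```

Passing `now` explicitly keeps the function deterministic in tests. In practice the key list would come from an API listing of the organization's keys, which is outside the scope of this sketch.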
See the contributing instructions to get started contributing.
Questions, issues, and comments should be directed to [email protected].