Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BigQuery CAI Scripts #394

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 21 additions & 0 deletions scripts/cloud_asset_inventory/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
# Cloud Asset Inventory Scripts

This folder contains scripts that can be run against Cloud Asset Inventory exports to BigQuery. Refer to the [public documentation](https://cloud.google.com/asset-inventory/docs/exporting-to-bigquery) for directions on setting up export to BigQuery. It is recommended to perform this export as part of a schedule. The tables would be partitioned by either `read-time` or `request-time`. In the attached scripts, the tables are assumed to be partitioned by `read-time`.

The schema for the BigQuery tables generated from this export is given in the section [here](https://cloud.google.com/asset-inventory/docs/exporting-to-bigquery#bigquery-schema).

Here are the scripts that are provided.

### [BigQuery Table Readers](./bq_table_all_readers.sql)

This script will help Data Stewards or Platform Owners determine the which are the IAM Principals (groups, user or service accounts) that can read data from a BigQuery table.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo


The access to the principal could be applied at any level of the resource hierarchy - Org, Folder, Project or Dataset.



### [BigQuery Table Editors](./bq_table_all_editors.sql)

This script will help Data Stewards or Platform Owners determine the which are the IAM Principals (groups, user or service accounts) that can edit/write data to a BigQuery table.

The access to the principal could be applied at any level of the resource hierarchy - Org, Folder, Project or Dataset.
98 changes: 98 additions & 0 deletions scripts/cloud_asset_inventory/bq_table_all_editors.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
/*
* Copyright 2023 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/*
* This script creates a table that maps a BigQuery Table to the IAM Principals
* that can edit data in the BigQuery table.
*
* The IAM Principal could be assigned a role at the Table level, the Dataset level
* or any level of Project, Folder or Org which are parents of the Dataset.
*
* This script assumes the following:
* 1. The Cloud Asset Inventory has been exported to tables in BigQuery under
* project - <CAI_EXPORT_PROJECT> and dataset - <CAI_EXPORT_DATASET>
* 2. The user has permissions to query <CAI_EXPORT_PROJECT>.<CAI_EXPORT_DATASET>
* 3. All BigQuery permissions is handled via BigQuery predefined roles
* 4. The BigQuery roles for writing data to a Table are
* a. roles/bigquery.dataEditor
* b. roles/bigquery.dataOwner
* c. roles/bigquery.studioAdmin
* d. roles/bigquery.admin
* 5. Replace <RESOURCE_TABLE> with the Resource table name from the CAI Export
* 6. Replace <IAM_POLICY_TABLE> with the IAM Policy table name from the CAI Export
*
* The schema of the table - `bigquery_table_all_editors` is given in ./schema/table_all_readers_schema.json
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the json file name does not match

*/
DECLARE read_date STRING DEFAULT "2023-12-01";

CREATE SCHEMA IF NOT EXISTS cai_analysis;

CREATE
OR REPLACE TABLE cai_analysis.bigquery_table_all_editors AS WITH bq_table_with_parents AS (
SELECT
name as bq_table,
ARRAY_CONCAT(
[name],
[resource.parent],
ARRAY(
SELECT
CONCAT('//cloudresourcemanager.googleapis.com/', a)
FROM
UNNEST(ancestors) as a
)
) as parent_array
FROM
`<CAI_EXPORT_PROJECT>.<CAI_EXPORT_DATASET>.<RESOURCE_TABLE>`
WHERE
TIMESTAMP_TRUNC(readTime, DAY) = TIMESTAMP(read_date)
AND asset_type = 'bigquery.googleapis.com/Table'
),
bq_editors AS (
SELECT
a.name,
a.asset_type,
b.role,
ARRAY_CONCAT_AGG(b.members) as members,
FROM
`<CAI_EXPORT_PROJECT>.<CAI_EXPORT_DATASET>.<IAM_POLICY_TABLE>` a
INNER JOIN UNNEST(iam_policy.bindings) as b
WHERE
TIMESTAMP_TRUNC(readTime, DAY) = TIMESTAMP(read_date)
AND b.role IN (
'roles/bigquery.dataEditor',
'roles/bigquery.dataOwner',
'roles/bigquery.studioAdmin',
'roles/bigquery.admin'
)
GROUP BY
1,
2,
3
)
SELECT
bqt.bq_table as table_name,
ARRAY_AGG(
STRUCT(
bq_editors.name as parent_name,
bq_editors.role,
bq_editors.members
)
) as inherited_editors
FROM
bq_table_with_parents bqt
INNER JOIN UNNEST(parent_array) as parents
INNER JOIN bq_editors ON parents = bq_editors.name
GROUP BY
1;
100 changes: 100 additions & 0 deletions scripts/cloud_asset_inventory/bq_table_all_readers.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
/*
* Copyright 2023 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/*
* This script creates a table that maps a BigQuery Table to the IAM Principals
* that can read data from the BigQuery table.
*
* The IAM Principal could be assigned a role at the Table level, the Dataset level
* or any level of Project, Folder or Org which are parents of the Dataset.
*
* This script assumes the following:
* 1. The Cloud Asset Inventory has been exported to tables in BigQuery under
* project - <CAI_EXPORT_PROJECT> and dataset - <CAI_EXPORT_DATASET>
* 2. The user has permissions to query <CAI_EXPORT_PROJECT>.<CAI_EXPORT_DATASET>
* 3. All BigQuery permissions is handled via BigQuery predefined roles
* 4. The BigQuery roles for reading data from a Table are
* a. roles/bigquery.dataEditor
* b. roles/bigquery.dataViewer
* c. roles/bigquery.dataOwner
* d. roles/bigquery.studioAdmin
* e. roles/bigquery.admin
* 5. Replace <RESOURCE_TABLE> with the Resource table name from the CAI Export
* 6. Replace <IAM_POLICY_TABLE> with the IAM Policy table name from the CAI Export
*
* The schema of the table - `bigquery_table_all_readers` is given in ./schema/table_all_readers_schema.json
*/
DECLARE read_date STRING DEFAULT "2023-12-01";

CREATE SCHEMA IF NOT EXISTS cai_analysis;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two scripts are almost identical, does it make sense to make them DRY?


CREATE
OR REPLACE TABLE cai_analysis.bigquery_table_all_readers AS WITH bq_table_with_parents AS (
SELECT
name as bq_table,
ARRAY_CONCAT(
[name],
[resource.parent],
ARRAY(
SELECT
CONCAT('//cloudresourcemanager.googleapis.com/', a)
FROM
UNNEST(ancestors) as a
)
) as parent_array
FROM
`<CAI_EXPORT_PROJECT>.<CAI_EXPORT_DATASET>.<RESOURCE_TABLE>`
WHERE
TIMESTAMP_TRUNC(readTime, DAY) = TIMESTAMP(read_date)
AND asset_type = 'bigquery.googleapis.com/Table'
),
bq_readers AS (
SELECT
a.name,
a.asset_type,
b.role,
ARRAY_CONCAT_AGG(b.members) as members,
FROM
`<CAI_EXPORT_PROJECT>.<CAI_EXPORT_DATASET>.<IAM_POLICY_TABLE>` a
INNER JOIN UNNEST(iam_policy.bindings) as b
WHERE
TIMESTAMP_TRUNC(readTime, DAY) = TIMESTAMP(read_date)
AND b.role IN (
'roles/bigquery.dataEditor',
'roles/bigquery.dataViewer',
'roles/bigquery.dataOwner',
'roles/bigquery.studioAdmin',
'roles/bigquery.admin'
)
GROUP BY
1,
2,
3
)
SELECT
bqt.bq_table as table_name,
ARRAY_AGG(
STRUCT(
bq_readers.name as parent_name,
bq_readers.role,
bq_readers.members
)
) as inherited_readers
FROM
bq_table_with_parents bqt
INNER JOIN UNNEST(parent_array) as parents
INNER JOIN bq_readers ON parents = bq_readers.name
GROUP BY
1;
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
[
{
"name": "table_name",
"type": "STRING",
"mode": "NULLABLE",
"description": "Name of BigQuery Table"
},
{
"name": "inherited_readers",
"type": "RECORD",
"mode": "REPEATED",
"fields": [
{
"name": "parent_name",
"type": "STRING",
"mode": "NULLABLE",
"description": "The name of dataset, project, folder or org which is ancestor of the Table"
},
{
"name": "role",
"type": "STRING",
"mode": "NULLABLE",
"description": "The BigQuery predefined role applied on the parent_name resource"
},
{
"name": "members",
"type": "STRING",
"mode": "REPEATED",
"description": "IAM Principals assigned with the role on the parent_name resource"
}
]
}
]
Loading