Data access work #7
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
|
||
# Sample Data Access Audit View | ||
|
||
## Overview | ||
|
||
Google Cloud offers data access logs across many [services](https://cloud.google.com/logging/docs/audit/services). | ||
This sample will walk through [enabling data access logs](https://cloud.google.com/logging/docs/audit/configure-data-access) | ||
in a project, creating a Stackdriver sink, and creating a BigQuery View providing analytical information. | ||
|
||
## Audit View | ||
|
||
The View analyses READ, WRITE, and ADMIN operations across BigQuery datasets and GCS buckets. The logs | ||
contain object- and table-level information, but the View drops it. | ||
|
||
The view is stored in ```data_access.audit_summary```. | ||
|
||
Each service call produces one data access log entry. Each entry also carries a | ||
list, protopayload_auditlog.authorizationInfo, of the permissions | ||
granted (or denied) as part of that call. | ||
|
||
These permissions are listed for [GCS permissions](https://cloud.google.com/storage/docs/access-control/iam-permissions) and | ||
[BigQuery permissions](https://cloud.google.com/bigquery/docs/access-control#bq-permissions). | ||
|
||
The columns of the View are the following: | ||
|
||
| Column | Description | | ||
| ------ | ----------- | | ||
| hour | Top-of-the-hour the access occurred | | ||
| service | Service (storage or bigquery) | | ||
| actor | Service account or end-user | | ||
| op | Operation (READ, WRITE, or ADMIN) | | ||
| granted | Whether access was permitted | | ||
| entity | Project_ID.GCS_Bucket or Project_ID.BigQuery_Dataset | | ||
|
||
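As a rough illustration of how the entity column is put together, the shell sketch below splits a resource path on `/` and joins the project with the dataset or bucket name. The function name `entity_for` and the sample resource paths are hypothetical and chosen only for illustration. The offsets follow the ones used in the view's SQL: path elements 1 and 3 for BigQuery, and the log entry's project ID plus element 3 for GCS.

```shell
# Hypothetical helper (not part of the sample): derive the "entity" value
# from a service name, the project ID on the log entry, and a resource path.
entity_for() {
  ef_service="$1"
  ef_project="$2"
  ef_resource="$3"

  # Split the resource path on '/' into the positional parameters.
  ef_old_ifs="$IFS"
  IFS='/'
  set -f                # disable globbing while splitting
  set -- $ef_resource
  set +f
  IFS="$ef_old_ifs"

  # Like the view, ignore resources with fewer than four path elements.
  [ "$#" -ge 4 ] || return 1

  case "$ef_service" in
    bigquery) printf '%s.%s\n' "$2" "$4" ;;           # project.dataset
    storage)  printf '%s.%s\n' "$ef_project" "$4" ;;  # project.bucket
    *)        return 1 ;;                             # no entity for other services
  esac
}

# Assumed example resource paths, for illustration only:
entity_for bigquery my-project "projects/my-project/datasets/sales/tables/orders"
entity_for storage my-project "projects/_/buckets/my-bucket/objects/report.csv"
```

The sketch returns a non-zero status for an unrecognised service, whereas the view's CASE expression has no ELSE branch and would yield NULL in that situation.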
## Instructions | ||
|
||
Capture the PROJECT_ID of your default project. | ||
|
||
PROJECT_ID=$(gcloud config get-value core/project) | ||
|
||
Enable data access audit logs. | ||
|
||
POLICY_FILE=/tmp/policy_file_${PROJECT_ID}.$$ | ||
|
||
# Get existing project policy | ||
gcloud projects get-iam-policy ${PROJECT_ID} --format=json > ${POLICY_FILE} | ||
|
||
# Set auditConfigs in the policy from data_access_policy.json | ||
cat ${POLICY_FILE} | \ | ||
jq --slurpfile audit data_access_policy.json '.auditConfigs=$audit' \ | ||
> ${POLICY_FILE}.new | ||
|
||
# Apply the new policy to the project | ||
gcloud projects set-iam-policy ${PROJECT_ID} ${POLICY_FILE}.new | ||
if [ $? -ne 0 ]; then | ||
echo Failed applying policy | ||
fi | ||
|
||
rm -f ${POLICY_FILE} ${POLICY_FILE}.new | ||
|
||
Create your data_access dataset. | ||
|
||
bq mk data_access | ||
|
||
Create a data access audit sink. Be sure to grant the BigQuery Data Editor role on the dataset to the sink's writer identity (the service account reported when the sink is created). | ||
|
||
gcloud logging sinks create compute_activity \ | ||
bigquery.googleapis.com/projects/${PROJECT_ID}/datasets/data_access \ | ||
--log-filter="logName=\"projects/${PROJECT_ID}/logs/cloudaudit.googleapis.com%2Fdata_access\"" | ||
|
||
Create your data_access.audit_summary VIEW. | ||
|
||
sed -e "s/\${PROJECT_ID}/${PROJECT_ID}/g" ./create_data_access_view.sql | bq query | ||
|
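The final `sed` step relies on the SQL file carrying a literal `${PROJECT_ID}` placeholder. A minimal sketch of that substitution on a single line follows; the project ID `my-sample-project` is a made-up value, and the template line is copied from the view definition.

```shell
# Assumed, made-up project ID for illustration only.
PROJECT_ID="my-sample-project"

# One line taken from the view definition, with its literal placeholder.
template='`${PROJECT_ID}.data_access.cloudaudit_googleapis_com_data_access_*` d'

# Inside double quotes, \${PROJECT_ID} stays literal (the pattern to find),
# while ${PROJECT_ID} expands to the shell variable (the replacement).
rendered=$(printf '%s\n' "$template" | sed -e "s/\${PROJECT_ID}/${PROJECT_ID}/g")

printf '%s\n' "$rendered"
```

Because the project ID is spliced directly into the `sed` expression, a value containing `/` or `&` would need escaping first.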
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,132 @@ | ||
#standardSQL | ||
|
||
-- | ||
-- This is set up for storage.googleapis.com and bigquery.googleapis.com | ||
-- | ||
-- This is for examining access to BigQuery datasets and GCS buckets. | ||
-- | ||
-- Table- and object-level detail (and more) is available in the logs, but | ||
-- it creates confusion for a dashboard, so it is dropped here. | ||
-- | ||
CREATE OR REPLACE VIEW data_access.audit_summary AS | ||
WITH | ||
-- Pull out Data Access logs | ||
DataAccess AS ( | ||
SELECT | ||
-- Hour truncated | ||
TIMESTAMP_TRUNC(d.timestamp, HOUR) AS hour, | ||
-- Project ID that the access method was called on | ||
d.resource.labels.project_id, | ||
-- Actor | ||
d.protopayload_auditlog.authenticationInfo.principalEmail AS actor, | ||
-- Service: first component of the permission (storage or bigquery) | ||
SPLIT(i.permission,'.')[SAFE_OFFSET(0)] AS service, | ||
-- Permission used | ||
i.permission AS action, | ||
-- Whether granted or denied | ||
IFNULL(i.granted, FALSE) AS granted, | ||
-- Parts of the resource accessed | ||
SPLIT(i.resource, '/') AS parts | ||
FROM | ||
`${PROJECT_ID}.data_access.cloudaudit_googleapis_com_data_access_*` d | ||
CROSS JOIN d.protopayload_auditlog.authorizationInfo i | ||
Review comment: Align CROSS JOIN with FROM. |
||
WHERE | ||
i.resource IS NOT NULL AND | ||
Review comment: Move AND to the next line.
|
||
d.protopayload_auditlog.serviceName IN ('storage.googleapis.com', | ||
'bigquery.googleapis.com') | ||
) | ||
SELECT | ||
hour, | ||
service, | ||
actor, | ||
-- Translate the action into an operation (READ/WRITE/ADMIN) | ||
CASE | ||
WHEN service = 'storage' THEN | ||
Review comment: Can the operation type be determined from the action value alone? If so, consider simplifying the code to a single level of CASE. The same applies to the service = 'bigquery' code below. |
||
CASE | ||
-- See granular permissions here: https://cloud.google.com/storage/docs/access-control/iam-permissions | ||
Review comment: Please indent CASE statements. Use 2 spaces for all indentation throughout the script. |
||
WHEN action IN ('storage.objects.create', | ||
'storage.objects.delete') THEN | ||
'WRITE' | ||
WHEN action IN ('storage.objects.get') THEN | ||
'READ' | ||
WHEN action IN ('storage.objects.getIamPolicy', | ||
'storage.objects.list', | ||
'storage.objects.setIamPolicy', | ||
'storage.objects.update', | ||
'storage.buckets.create', | ||
'storage.buckets.delete', | ||
'storage.buckets.get', | ||
'storage.buckets.getIamPolicy', | ||
'storage.buckets.list', | ||
'storage.buckets.setIamPolicy', | ||
'storage.buckets.update') THEN | ||
'ADMIN' | ||
ELSE | ||
CONCAT('Unknown storage:', action) | ||
END | ||
-- See granular permissions here: https://cloud.google.com/bigquery/docs/access-control#bq-permissions | ||
WHEN service = 'bigquery' THEN | ||
CASE | ||
WHEN action IN ('bigquery.tables.delete', | ||
Review comment: Please indent WHEN statements. |
||
'bigquery.datasets.delete', | ||
Review comment: Align values with the previous value. Same for the following WHEN statements. |
||
'bigquery.jobs.update', | ||
'bigquery.routines.delete', | ||
'bigquery.tables.updateData') THEN | ||
'WRITE' | ||
WHEN action IN ('bigquery.tables.getData', | ||
'bigquery.tables.export', | ||
'bigquery.readsessions.create', | ||
'bigquery.connections.use') THEN | ||
'READ' | ||
WHEN action IN ('bigquery.jobs.create', | ||
'bigquery.jobs.listAll', | ||
'bigquery.jobs.list', | ||
'bigquery.jobs.get', | ||
'bigquery.datasets.create', | ||
'bigquery.datasets.get', | ||
'bigquery.datasets.update', | ||
'bigquery.tables.create', | ||
'bigquery.tables.list', | ||
'bigquery.tables.get', | ||
'bigquery.tables.update', | ||
'bigquery.routines.create', | ||
'bigquery.routines.list', | ||
'bigquery.routines.get', | ||
'bigquery.routines.update', | ||
'bigquery.transfers.get', | ||
'bigquery.transfers.update', | ||
'bigquery.savedqueries.create', | ||
'bigquery.savedqueries.get', | ||
'bigquery.savedqueries.list', | ||
'bigquery.savedqueries.update', | ||
'bigquery.savedqueries.delete', | ||
'bigquery.connections.create', | ||
'bigquery.connections.get', | ||
'bigquery.connections.list', | ||
'bigquery.connections.update', | ||
'bigquery.connections.delete') THEN | ||
'ADMIN' | ||
ELSE | ||
CONCAT('Unknown bigquery:', action) | ||
END | ||
ELSE | ||
CONCAT('Unknown service:', service) | ||
END AS op, | ||
granted, | ||
-- Project is of the resource or, if not there, | ||
-- then for the method accessing it (eg for buckets) | ||
CASE | ||
Review comment: Do you need an ELSE to set a default value here? Otherwise, it will produce NULLs. |
||
-- BigQuery project.dataset | ||
WHEN service = 'bigquery' THEN | ||
CONCAT(parts[SAFE_OFFSET(1)], '.', parts[SAFE_OFFSET(3)]) | ||
-- GCS project.bucket | ||
WHEN service = 'storage' THEN | ||
CONCAT(project_id, '.', parts[SAFE_OFFSET(3)]) | ||
END AS entity | ||
FROM | ||
DataAccess | ||
WHERE | ||
-- Limit to BigQuery dataset / GCS bucket operations | ||
ARRAY_LENGTH(parts) >= 4 | ||
GROUP BY | ||
1,2,3,4,5,6; |
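The review suggestion to use a single level of CASE works because every permission string already begins with its service name. The shell sketch below illustrates that idea outside SQL; `classify_op` is a hypothetical helper, not part of this change, and its BigQuery ADMIN list is deliberately abbreviated (the view's SQL has the full list).

```shell
# Hypothetical helper: classify a permission with one level of matching.
# The BigQuery ADMIN list is abbreviated; see the SQL for the full list.
classify_op() {
  case "$1" in
    storage.objects.create | storage.objects.delete | \
    bigquery.tables.delete | bigquery.datasets.delete | \
    bigquery.jobs.update | bigquery.routines.delete | \
    bigquery.tables.updateData)
      echo "WRITE" ;;
    storage.objects.get | \
    bigquery.tables.getData | bigquery.tables.export | \
    bigquery.readsessions.create | bigquery.connections.use)
      echo "READ" ;;
    storage.objects.getIamPolicy | storage.objects.list | \
    storage.objects.setIamPolicy | storage.objects.update | \
    storage.buckets.create | storage.buckets.delete | \
    storage.buckets.get | storage.buckets.getIamPolicy | \
    storage.buckets.list | storage.buckets.setIamPolicy | \
    storage.buckets.update | \
    bigquery.jobs.create | bigquery.jobs.list | bigquery.jobs.get | \
    bigquery.datasets.get | bigquery.tables.list | bigquery.tables.get)
      echo "ADMIN" ;;
    # Fallbacks keep the same "Unknown ..." labels as the nested version.
    storage.*)
      echo "Unknown storage:$1" ;;
    bigquery.*)
      echo "Unknown bigquery:$1" ;;
    *)
      echo "Unknown service:${1%%.*}" ;;
  esac
}

classify_op storage.objects.get
classify_op bigquery.tables.updateData
classify_op pubsub.topics.get
```

The service-specific fallbacks come last, so a permission that is not listed is still reported under its own service instead of being silently classified.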
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
Review comment: Please add a description of what this file is for. |
||
"service": "allServices", | ||
"auditLogConfigs": [ | ||
{ "logType": "ADMIN_READ" }, | ||
{ "logType": "DATA_READ" }, | ||
{ "logType": "DATA_WRITE" } | ||
] | ||
} |
Review comment: Add AS before alias.