GitHub - nationalarchives/tdr-antivirus

This is the code and configuration to carry out the antivirus checks on a single file from S3

Configuration Parameters

Lambda takes parameters to configure the scanning options.

Parameter Name	Optional	Default Value	Description	Example
consignment_id	false	N/A	TDR UUID for the consignment the object to scan belongs to
file_id	false	N/A	Name of the object to scan
user_id	true	`None`	TDR UUID of the user who uploaded the object to scan
original_path	true	`None`	Original path to the object to scan. Used to create local version of object for scanning
scan_type	true	N/A	Deprecated. Use combination of optional parameters to set configuration. Type of scan to run	`metadata` / `consignment`
s3_source_bucket	true	`tdr-upload-files-cloudfront-dirty-{tdr environment}`	S3 bucket containing the object to scan	`{some AWS S3 bucket name}`
s3_source_bucket_key	true	`{user_id}/{consignment_id}/{file_id}`	S3 bucket key of the object to scan
s3_upload_bucket	true	`tdr-upload-files-{tdr environment}`	S3 bucket to copy clean objects to. If empty string then no copy occurs
s3_upload_bucket_key	true	`{consignment_id}/{file_id}`	S3 bucket key of clean object. If empty value string no copy occurs
s3_quarantine_bucket	true	`tdr-upload-files-quarantine-{tdr environment}`	S3 bucket to copy infected objects to
guard_duty_malware_scan_enabled	true	True	Flag whether source s3 bucket has AWS GuardDuty malicious object scanning enabled.

Example Configuration

Object to scan details:

Consignment Id: bf2181c7-70e4-448d-b122-be561d0e797a
File Id: myFileToScan.txt
Original Path: identifier1/identifer2/myFileToScan.txt
s3 Source bucket name: some-source-bucket
s3 source object key: same as original path
s3 clean bucket name: some-clean-bucket
s3 clean object key: same as s3 source object key
s3 quarantine bucket name: tdr-upload-files-quarantine-{tdr environment}
AWS GuardDuty Malicious Scanning Enabled: False

Event configuration to support the above would be as follows:

    {
        "consignmentId": "bf2181c7-70e4-448d-b122-be561d0e797a",
        "fileId": "myFileToScan.txt",
        "originalPath": "identifier1/identifer2/myFileToScan.txt",
        "s3SourceBucket": "some-source-bucket",
        "s3SourceBucketKey": "identifier1/identifer2/myFileToScan.txt",
        "s3UploadBucket": "some-clean-bucket",
        "s3UploadBucketKey": "identifier1/identifer2/myFileToScan.txt",
        "guardDutyMalwareScanEnabled": False
   }

Building the lambda function

The lambda function is built by GitHub actions. There are three actions in three yml files.

test.yml

This runs git secrets and runs the python tests. This is the standard multibranch pipeline job which runs on PRs and merge to master. If this runs on the master branch, it will rebuild the three Docker images locally and use them to build the lambda.

build.yml

There are three docker images that are used to build the lambda.

File name	Image Name	Description
Dockerfile-yara	yara	Installs yara and some dependencies it needs like openssl on an alpine image.
Dockerfile-compile	yara-rules	Uses yara as the base image. Gets the yara rules from github and compiles them into a single file for yara to use
Dockerfile-dependencies	yara-dependencies	Installs necessary software on an amazon linux image and zips it up to be used by the lambda

These images are built on every master build locally and are not stored in ECR.

deploy.yml

This deploys the lambda from S3

Running locally

Create a virtual environment in the antivirus directory python3 -m venv venv

Activate the environment

source venv/bin/activate

Install dependencies

pip install -r requirements.txt

To run it, you will need to add code to the matcher.py file at the bottom. This will connect to the integration s3 dirty bucket so you will need to run it with integration credentials.

matcher_lambda_handler({
        "userId": "7ad28066-7a76-4e07-b540-f005b6919328",
        "consignmentId": "bf2181c7-70e4-448d-b122-be561d0e797a",
        "fileId": "df216308-e78b-4328-90ef-8e4ebfef6b9d",
        "originalPath": "original/path"
      }, None)

Then either run this through the cli

python src/matcher.py

or debug through Pycharm or the IDE of your choice.

The following environment variables need to be set for this to work although the AWS ones are optional if you're using profiles. AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN

AWS_LAMBDA_FUNCTION_VERSION - This is the lambda function version. It's provided by the lambda and so needs to be set here. Set it to "$LATEST" ENVIRONMENT - intg, staging or prod ROOT_DIRECTORY - This is the root directory for the file download in the lambda file system.

Running the tests

Normal run: python -m pytest

Run with coverage python -m pytest --cov=src

Run with coverage and missing lines python -m pytest --cov-report term-missing --cov=src

Run with coverage, missing lines and junit output python -m pytest --cov-report term-missing --junitxml="result.xml" --cov=src

Yara rules checks.

There is a GitHub actions job TDR Check Antivirus Rules which is run on a schedule from the .github/workflows/check_rules file. This runs at 07:20 on a Monday. This carries out the following steps.

Builds the base yara image.
Builds the rules yara image.
Gets the highest version tag from git.
Downloads the lambda zip file from S3.
Unzips the zip file and copies the existing compiled rule file into the current directory.
Downloads the tests file from the tdr-antivirus-test-files-mgmt S3 bucket.
Builds the docker image from the Dockerfile-run-tests file which copies the existing compiled rules and the test files to the container.
Runs this docker image. This runs a python script which runs these steps:
- Compares the rule identifiers in the old compiled rules file with the new one.
- If there are new yara rules in the new compiled rules, it will run the antivirus checks against the test files from the S3 bucket.
  - NOTE: It will not detect if rules have been removed, and it won't check if they've been changed either. I tried to do it based on hashes but it's likely to bring up too many false positives.
- Yara is run against the test files using the new rules. If no matches are found after the new rules are run against the test files, it returns exit code zero and the GitHub actions job builds the Antivirus multibranch master branch job.

These steps can be run manually by copying the commands from the .github/workflows/check_rules.yml file.

Because this runs the master build job without any code changes, you end up with more than one version pointed to the same commit. We still have separate zip files stored in S3 for each version though so we can still roll back if necessary.

Test files bucket terraform

Important Note: tdr-antivirus uses >= v1.5.0 of Terraform. Ensure that Terraform >= v1.5.0 is installed before proceeding.

In the terraform directory there are some terraform files which are used to create the bucket tdr-antivirus-test-files-mgmt. This bucket is used to store the files against which we run a periodic check of any new yara rules. This should almost never need to be updated.

To run the Terraform:

Navigate to the terraform directory:
```
[location of project]: cd terraform
```

Clone the tdr-terraform-modules repository into the terraform directory

[location of terraform directory]: git clone https://github.com/nationalarchives/tdr-terraform-modules.git

Initiate Terraform:

[location of terraform directory]: terraform init

Select the default Terraform workspace`

[location of terraform directory]: terraform workspace select default

Run Terraform plan / apply commands with AWS credentials that have access to the TDR management environment as needed:
```
[location of terraform directory]: terraform {plan/apply}
```

Name		Name	Last commit message	Last commit date
Latest commit History 283 Commits
.github/workflows		.github/workflows
src		src
terraform		terraform
test		test
.gitignore		.gitignore
Dockerfile-check-rules		Dockerfile-check-rules
Dockerfile-compile		Dockerfile-compile
Dockerfile-dependencies		Dockerfile-dependencies
Dockerfile-yara		Dockerfile-yara
LICENCE		LICENCE
README.md		README.md
build-dependencies.sh		build-dependencies.sh
requirements-runtime.txt		requirements-runtime.txt
requirements-test.txt		requirements-test.txt
requirements.txt		requirements.txt
run_av_tests.py		run_av_tests.py
yara-version.txt		yara-version.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Configuration Parameters

Example Configuration

Building the lambda function

test.yml

build.yml

deploy.yml

Running locally

Running the tests

Yara rules checks.

Test files bucket terraform

About

Releases 69

Packages

Contributors 10

Languages

License

nationalarchives/tdr-antivirus

Folders and files

Latest commit

History

Repository files navigation

Configuration Parameters

Example Configuration

Building the lambda function

test.yml

build.yml

deploy.yml

Running locally

Running the tests

Yara rules checks.

Test files bucket terraform

About

Resources

License

Stars

Watchers

Forks

Releases 69

Packages 0

Contributors 10

Languages

Packages