Skip to content

Commit

Permalink
Merge pull request #13 from PaperMtn/feature/signature-disable
Browse files Browse the repository at this point in the history
Feature/signature disable
  • Loading branch information
PaperMtn authored Nov 18, 2024
2 parents 57e7238 + abc43c0 commit e264982
Show file tree
Hide file tree
Showing 9 changed files with 99 additions and 23 deletions.
19 changes: 11 additions & 8 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,19 +1,22 @@
## [3.1.0] - 2024-11-x
### Changed
- Package management and deployment moved to Poetry
- Docker build process improved using multi-stage builds. The Dockerfile now doesn't contain any unnecessary files, and is much smaller.
- Refactor to separate GitLab client and Watchman processing into modules
- Refactor to implement python-gitlab library for GitLab API calls, instead of the custom client used previously.
- This change allows for more efficient and easier to read code, is more reliable, and also allows for enhancements to be added more easily in the future.

## [3.1.0] - 2024-11-18
### Added
- Signatures now loaded into memory instead of being saved to disk. This allows for running on read-only filesystems.
- Ability to disable signatures by their ID in the watchman.conf config file.
- These signatures will not be used when running Slack Watchman
- Signature IDs for each signature can be found in the Watchman Signatures repository
- Tests for Docker build
- Enhanced deduplication of findings
- The same match should not be returned multiple times within the same scope. E.g. if a token is found in a commit, it should not be returned multiple times in the same commit.
- All dates are now converted and logged in UTC
- Unit tests added for models and utils

### Changed
- Package management and deployment moved to Poetry
- Docker build process improved using multi-stage builds. The Dockerfile now doesn't contain any unnecessary files, and is much smaller.
- Refactor to separate GitLab client and Watchman processing into modules
- Refactor to implement [python-gitlab](https://python-gitlab.readthedocs.io/) library for GitLab API calls, instead of the custom client used previously.
- This change gives more efficient and easier to read code, is more reliable, and also allows for enhancements to be added more easily in the future.

### Fixed
- Error when searching wiki-blobs
- There would often be failures when trying to find projects or groups associated with blobs. This is now fixed by adding logic to check if the blob is associated with a project or group, and get the correct information accordingly.
Expand Down
22 changes: 22 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,18 @@ GitLab Watchman can enumerate potentially useful information from a GitLab insta
### Signatures
GitLab Watchman uses custom YAML signatures to detect matches in GitLab. These signatures are pulled from the central [Watchman Signatures repository](https://github.com/PaperMtn/watchman-signatures). Slack Watchman automatically updates its signature base at runtime to ensure its using the latest signatures to detect secrets.

#### Suppressing Signatures
You can define signatures that you want to disable when running GitLab Watchman by adding their IDs to the `disabled_signatures` section of the `watchman.conf` file. For example:

```yaml
gitlab_watchman:
disabled_signatures:
- tokens_generic_bearer_tokens
- tokens_generic_access_tokens
```
You can find the ID of a signature in the individual YAML files in [Watchman Signatures repository](https://github.com/PaperMtn/watchman-signatures).
### Logging
GitLab Watchman gives the following logging options:
Expand Down Expand Up @@ -106,6 +118,16 @@ You also need to provide the URL of your GitLab instance.
#### Providing token & URL
GitLab Watchman will get the GitLab token and URL from the environment variables `GITLAB_WATCHMAN_TOKEN` and `GITLAB_WATCHMAN_URL`.

### watchman.conf file
Configuration options can be passed in a file named `watchman.conf` which must be stored in your home directory. The file should follow the YAML format, and should look like below:
```yaml
gitlab_watchman:
disabled_signatures:
- tokens_generic_bearer_tokens
- tokens_generic_access_tokens
```
GitLab Watchman will look for this file at runtime, and use the configuration options from here.
## Installation
You can install the latest stable version via pip:
Expand Down
39 changes: 34 additions & 5 deletions src/gitlab_watchman/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,9 @@
import traceback
from dataclasses import dataclass
from importlib import metadata
from typing import List
from typing import List, Dict, Any

import yaml

from gitlab_watchman import watchman_processor
from gitlab_watchman.clients.gitlab_client import GitLabAPIClient
Expand All @@ -19,7 +21,8 @@
GitLabWatchmanNotAuthorisedError,
GitLabWatchmanAuthenticationError,
ElasticsearchMissingError,
MissingEnvVarError
MissingEnvVarError,
MisconfiguredConfFileError
)
from gitlab_watchman.loggers import (
JSONLogger,
Expand Down Expand Up @@ -100,7 +103,7 @@ def perform_search(search_args: SearchArgs):
search(search_args, sig, scope)


def validate_variables() -> bool:
def validate_variables() -> Dict[str, Any]:
""" Validate whether GitLab Watchman environment variables have been set
Returns:
Expand All @@ -112,8 +115,30 @@ def validate_variables() -> bool:
for var in required_vars:
if var not in os.environ:
raise MissingEnvVarError(var)
path = f'{os.path.expanduser("~")}/watchman.conf'
if os.path.exists(path):
try:
with open(path) as yaml_file:
conf_details = yaml.safe_load(yaml_file)['gitlab_watchman']
return {
'disabled_signatures': conf_details.get('disabled_signatures', [])
}
except Exception as e:
raise MisconfiguredConfFileError from e
return {}


def supress_disabled_signatures(signatures: List[signature.Signature],
disabled_signatures: List[str]) -> List[signature.Signature]:
""" Supress signatures that are disabled in the config file
Args:
signatures: List of signatures to filter
disabled_signatures: List of signatures to disable
Returns:
List of signatures with disabled signatures removed
"""

return True
return [sig for sig in signatures if sig.id not in disabled_signatures]


# pylint: disable=too-many-locals, missing-function-docstring, global-variable-undefined
Expand Down Expand Up @@ -183,7 +208,8 @@ def main():

OUTPUT_LOGGER = init_logger(logging_type, debug)

validate_variables()
config = validate_variables()
disabled_signatures = config.get('disabled_signatures', [])
gitlab_client = watchman_processor.initiate_gitlab_connection(
os.environ.get('GITLAB_WATCHMAN_TOKEN'),
os.environ.get('GITLAB_WATCHMAN_URL'))
Expand All @@ -204,6 +230,9 @@ def main():

OUTPUT_LOGGER.log('INFO', 'Downloading and importing signatures')
signature_list = SignatureDownloader(OUTPUT_LOGGER).download_signatures()
if len(disabled_signatures) > 0:
signature_list = supress_disabled_signatures(signature_list, disabled_signatures)
OUTPUT_LOGGER.log('INFO', f'The following signatures have been suppressed: {disabled_signatures}')
OUTPUT_LOGGER.log('SUCCESS', f'{len(signature_list)} signatures loaded')
OUTPUT_LOGGER.log('INFO', f'{multiprocessing.cpu_count() - 1} cores being used')

Expand Down
10 changes: 5 additions & 5 deletions src/gitlab_watchman/clients/gitlab_client.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,9 +42,9 @@ def inner_function(*args, **kwargs):
elif e.response_code == 500:
pass
else:
raise GitLabWatchmanGetObjectError(e.error_message, func) from e
except IndexError as e:
raise GitLabWatchmanGetObjectError('Object not found', func) from e
raise GitLabWatchmanGetObjectError(e.error_message, func, args) from e
except IndexError:
pass
except Exception as e:
raise e

Expand Down Expand Up @@ -112,7 +112,7 @@ def get_user_by_username(self, username: str) -> Dict[str, Any] | None:
GitLabWatchmanNotAuthorisedError: If the user is not authorized to access the resource
GitlabWatchmanGetObjectError: If an error occurs while getting the object
"""
return self.gitlab_client.users.list(username=username)[0].asdict()
return self.gitlab_client.users.list(username=username, active=False, blocked=True)[0].asdict()

@exception_handler
def get_settings(self) -> Dict[str, Any]:
Expand Down Expand Up @@ -272,7 +272,7 @@ def get_group_members(self, group_id: str) -> List[Dict]:
GitLabWatchmanNotAuthorisedError: If the user is not authorized to access the resource
GitLabWatchmanGetObjectError: If an error occurs while getting the object
"""
members = self.gitlab_client.groups.get(group_id).members.list(as_list=True)
members = self.gitlab_client.groups.get(group_id).members.list(as_list=True, get_all=True)
return [member.asdict() for member in members]

@exception_handler
Expand Down
13 changes: 11 additions & 2 deletions src/gitlab_watchman/exceptions.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,8 +36,8 @@ class GitLabWatchmanGetObjectError(GitLabWatchmanError):
""" Exception raised when an error occurs while getting a GitLab API object.
"""

def __init__(self, error_message: str, func):
super().__init__(f'GitLab get object error: {error_message} - Function: {func.__name__}')
def __init__(self, error_message: str, func, arg):
super().__init__(f'GitLab get object error: {error_message} - Function: {func.__name__} - Arg: {arg}')
self.error_message = error_message


Expand All @@ -49,3 +49,12 @@ class GitLabWatchmanNotAuthorisedError(GitLabWatchmanError):
def __init__(self, error_message: str, func):
super().__init__(f'Not authorised: {error_message} - {func.__name__}')
self.error_message = error_message


class MisconfiguredConfFileError(Exception):
""" Exception raised when the config file watchman.conf is missing.
"""

def __init__(self):
self.message = f"The file watchman.conf doesn't contain config details for GitLab Watchman"
super().__init__(self.message)
4 changes: 2 additions & 2 deletions src/gitlab_watchman/loggers.py
Original file line number Diff line number Diff line change
Expand Up @@ -80,9 +80,9 @@ def log(self,
if notify_type == "result":
if scope == 'blobs':
message = 'SCOPE: Blob' \
f' AUTHOR: {message.get("commit").get("author_name")} - ' \
f'{message.get("commit").get("author_email")}' \
f' COMMITTED: {message.get("commit").get("committed_date")} \n' \
f' AUTHOR: {message.get("commit").get("author_name")} ' \
f'EMAIL: {message.get("commit").get("author_email")}\n' \
f' FILENAME: {message.get("blob").get("basename")} \n' \
f' URL: {message.get("project").get("web_url")}/-/blob/{message.get("blob").get("ref")}/' \
f'{message.get("blob").get("filename")} \n' \
Expand Down
4 changes: 4 additions & 0 deletions src/gitlab_watchman/models/signature.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ class Signature:
They also contain regex patterns to validate data that is found"""

name: str
id: str
status: str
author: str
date: str | datetime.date | datetime.datetime
Expand All @@ -33,6 +34,8 @@ class Signature:
def __post_init__(self):
if self.name and not isinstance(self.name, str):
raise TypeError(f'Expected `name` to be of type str, received {type(self.name).__name__}')
if self.id and not isinstance(self.id, str):
raise TypeError(f'Expected `id` to be of type str, received {type(self.id).__name__}')
if self.status and not isinstance(self.status, str):
raise TypeError(f'Expected `status` to be of type str, received {type(self.status).__name__}')
if self.author and not isinstance(self.author, str):
Expand Down Expand Up @@ -65,6 +68,7 @@ def create_from_dict(signature_dict: Dict[str, Any]) -> Signature:

return Signature(
name=signature_dict.get('name'),
id=signature_dict.get('id'),
status=signature_dict.get('status'),
author=signature_dict.get('author'),
date=signature_dict.get('date'),
Expand Down
4 changes: 3 additions & 1 deletion tests/unit/models/fixtures.py
Original file line number Diff line number Diff line change
Expand Up @@ -476,6 +476,7 @@ class GitLabMockData:

MOCK_SIGNATURE_DICT = {
'name': 'Akamai API Access Tokens',
'id': 'akamai_api_access_tokens',
'status': 'enabled',
'author': 'PaperMtn',
'date': '2023-12-22',
Expand Down Expand Up @@ -566,6 +567,7 @@ def mock_user():
def mock_wiki_blob():
return wiki_blob.create_from_dict(GitLabMockData.MOCK_WIKI_BLOB_DICT)


@pytest.fixture
def mock_signature():
return signature.create_from_dict(GitLabMockData.MOCK_SIGNATURE_DICT)
return signature.create_from_dict(GitLabMockData.MOCK_SIGNATURE_DICT)
7 changes: 7 additions & 0 deletions tests/unit/models/test_unit_signature.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ def test_signature_initialisation(mock_signature):

# Test that the signature object has the correct attributes
assert mock_signature.name == GitLabMockData.MOCK_SIGNATURE_DICT.get('name')
assert mock_signature.id == GitLabMockData.MOCK_SIGNATURE_DICT.get('id')
assert mock_signature.status == GitLabMockData.MOCK_SIGNATURE_DICT.get('status')
assert mock_signature.author == GitLabMockData.MOCK_SIGNATURE_DICT.get('author')
assert mock_signature.date == GitLabMockData.MOCK_SIGNATURE_DICT.get('date')
Expand All @@ -27,6 +28,12 @@ def test_field_type():
with pytest.raises(TypeError):
test_signature = signature.create_from_dict(signature_dict)

# Test that correct error is raised when id is not a string
signature_dict = copy.deepcopy(GitLabMockData.MOCK_SIGNATURE_DICT)
signature_dict['id'] = 123
with pytest.raises(TypeError):
test_signature = signature.create_from_dict(signature_dict)

# Test that correct error is raised when status is not a string
signature_dict = copy.deepcopy(GitLabMockData.MOCK_SIGNATURE_DICT)
signature_dict['status'] = 123
Expand Down

0 comments on commit e264982

Please sign in to comment.