
Implement utf16/utf16be/utf16le/wide modifiers #1432

Closed
YamatoSecurity opened this issue Oct 11, 2024 · 7 comments · Fixed by #1503
Labels: enhancement (New feature or request), under-investigation (under investigation to develop)

Comments

YamatoSecurity (Collaborator) commented Oct 11, 2024

Although not used in any rules yet, we would like to support the following modifiers for sigma support completeness:

  • utf16|base64offset|contains
  • utf16be|base64offset|contains
  • utf16le|base64offset|contains
  • wide|base64offset|contains

The following probably do not need to be supported, since base64offset is usually used instead of base64:

  • utf16|base64|contains
  • utf16be|base64|contains
  • utf16le|base64|contains
  • wide|base64|contains

Example:

detection:
  selection:
    CommandLine|wide|base64offset|contains: "ping"

Info: https://sigmahq.io/docs/basics/modifiers.html#wide

Prepends a byte order mark and encodes UTF-16 (only used in combination with base64 modifiers)

Don't end with utf16, utf16le, utf16be or wide

The value modifier chain must not end with character set encoding modifiers (utf16, utf16le, utf16be and wide). The resulting values are internally represented as byte sequences instead of text strings and contain null characters, which are usually difficult to handle in queries. Therefore they should be followed by an encoding modifier (base64, base64offset).
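To make the mechanics concrete, here is a minimal sketch (my own illustration, not Hayabusa's or pySigma's implementation) of how `base64offset` could expand a value: since base64 encodes 3 bytes into 4 characters, a substring can occur at three different byte alignments inside an encoded stream, so three patterns are generated, trimming the leading/trailing characters that depend on surrounding bytes.

```python
import base64

def base64offset_patterns(value: bytes) -> list[str]:
    """Return the 3 base64 substrings that match `value` at any 3-byte
    alignment inside base64-encoded data (the base64offset idea)."""
    start = (0, 2, 3)      # leading chars affected by preceding bytes
    end = (None, -3, -2)   # trailing chars affected by following bytes/padding
    patterns = []
    for shift in range(3):
        enc = base64.b64encode(b"\x00" * shift + value).decode()
        patterns.append(enc[start[shift]:end[(len(value) + shift) % 3]])
    return patterns

# Plain ASCII value: three alignment variants
print(base64offset_patterns(b"ping"))
# wide / utf16le: encode as UTF-16 little-endian before base64offset
print(base64offset_patterns("ping".encode("utf-16le")))
```

Combined with `wide`, the value is first encoded as UTF-16 LE (interleaved null bytes), which is why the patterns for wide strings look so different from the ASCII ones.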

I think we should implement utf16 to check both the utf16be and utf16le variants.
wide should be an alias for utf16le, since wide strings on Windows are UTF-16 LE.
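Under that proposal, the per-modifier encoding step might look like this sketch (illustrative only; the codec names and BOM handling are my assumptions based on the Sigma docs quoted above, not Hayabusa's implementation):

```python
def encoded_variants(value: str, modifier: str) -> list[bytes]:
    """Byte-string variants a value would expand to before base64 matching.
    Illustrative sketch, not Hayabusa's actual implementation."""
    if modifier in ("wide", "utf16le"):   # wide is an alias for utf16le
        return [value.encode("utf-16le")]
    if modifier == "utf16be":
        return [value.encode("utf-16be")]
    if modifier == "utf16":
        # Sigma docs say utf16 prepends a byte order mark; the proposal
        # here is to check both byte orders
        return [b"\xff\xfe" + value.encode("utf-16le"),
                b"\xfe\xff" + value.encode("utf-16be")]
    return [value.encode("utf-8")]
```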

We should probably investigate if these encodings are being used inside base64 encoded payloads to begin with. If not, then it probably is not worth implementing.

YamatoSecurity added the enhancement label on Oct 11, 2024
hitenkoku (Collaborator):

The value modifier chain must not end with character set encoding modifiers (utf16, utf16le, utf16be and wide). The resulting values are internally represented as byte sequences instead of text strings and contain null characters, which are usually difficult to handle in queries. Therefore they should be followed by an encoding modifier (base64, base64offset).

Usually it doesn't make sense to combine the re type modifier with any other modifier.

https://github.com/SigmaHQ/sigma/wiki/Rule-Creation-Guide

YamatoSecurity changed the title from "Implement base64 modifiers" to "Implement utf16/utf16be/utf16le/wide modifiers" on Oct 12, 2024
YamatoSecurity (Collaborator, Author):

@hitenkoku Thanks for the information! I was mistaken on how it was being used. I updated the specifications.

fukusuket (Collaborator) commented Oct 25, 2024

memo:

csv-timeline

./hayabusa csv-timeline -d ../hayabusa-sample-evtx -w -o timeline.csv -C --include-eid 1,4688 -q
cat timeline.csv | awk -F, '{print $8, $9}' > timeline.txt

tokenize script

import re


def process_and_write_tokens(input_file_path, output_file_path):
    """Read the file, split it, trim, strip characters with a regex,
    then tokenize and write one token per line to a separate file."""
    try:
        with open(input_file_path, 'r', encoding='utf-8') as file:
            # Read the file contents, split on " ¦ ", trim, and strip
            # the "FieldName: " prefix with a regex
            split_content = [
                re.sub(r'.*?:\s', '', part.strip())
                for part in file.read().split(" ¦ ")
            ]

            # Write the tokenized words to the output file, one per line
            with open(output_file_path, 'w', encoding='utf-8') as output_file:
                for part in split_content:
                    tokens = part.split()  # tokenize on whitespace
                    for token in tokens:
                        output_file.write(token + '\n')

    except FileNotFoundError:
        print(f"Error: File '{input_file_path}' not found.")


if __name__ == '__main__':
    input_file_path = 'timeline.txt'  # output of the awk command above
    output_file_path = 'output.txt'
    process_and_write_tokens(input_file_path, output_file_path)

base64 check

cat output.txt | sort | uniq | grep -Eo '([A-Za-z0-9+/]{4}){20,}={0,2}' | while read -r line; do echo "$line" | base64 -d > /dev/null 2>&1 && echo "$line"; done

fukusuket self-assigned this on Nov 16, 2024
fukusuket (Collaborator) commented Nov 18, 2024

@YamatoSecurity
I created a simple check tool for records matching the following conditions!

            // Pick the field most likely to contain base64 payloads,
            // depending on the channel and event ID.
            // Process creation events: command line
            if ch == "Security" && id == 4688 {
                value = data["Event"]["EventData"]["CommandLine"].clone();
            }
            if ch == "Microsoft-Windows-Sysmon/Operational" && id == 1 {
                value = data["Event"]["EventData"]["CommandLine"].clone();
            }
            // PowerShell events: module payload / script block text
            if ch == "Microsoft-Windows-PowerShell/Operational" && id == 4103 {
                value = data["Event"]["EventData"]["Payload"].clone();
            }
            if ch == "Microsoft-Windows-PowerShell/Operational" && id == 4104 {
                value = data["Event"]["EventData"]["ScriptBlockText"].clone();
            }

According to the results of running the tool, the following three files in hayabusa-sample-evtx seem to contain base64 data that decodes to UTF-16 LE strings🤔 (there does not appear to be any UTF-16 BE data in hayabusa-sample-evtx)

Possible Base64 + UTF-16 LE(powersploit-security.evtx): ...
Possible Base64 + UTF-16 LE(many-events-security.evtx): ...
Possible Base64 + UTF-16 LE(discovery_sysmon_1_iis_pwd_and_config_discovery_appcmd.evtx): ...
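For reference, a rough sketch of the kind of heuristic such a check could use to classify decoded base64 payloads (my illustration, not the actual check tool): a BOM, or a high density of null bytes in odd/even positions, suggests UTF-16 LE/BE respectively.

```python
import base64

def classify_decoded(b64: str) -> str:
    """Guess the text encoding of a decoded base64 payload.
    Rough heuristic for illustration, not the actual check tool."""
    raw = base64.b64decode(b64)
    n = len(raw)
    # UTF-16 LE: BOM FF FE, or mostly-zero bytes at odd positions
    if raw.startswith(b"\xff\xfe") or (n >= 2 and raw[1::2].count(0) > n * 0.4):
        return "UTF-16 LE"
    # UTF-16 BE: BOM FE FF, or mostly-zero bytes at even positions
    if raw.startswith(b"\xfe\xff") or (n >= 2 and raw[0::2].count(0) > n * 0.4):
        return "UTF-16 BE"
    try:
        raw.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        return "binary"
```

The null-byte check matters because, as noted above, most real payloads are UTF-16 LE without a BOM (e.g. PowerShell's -EncodedCommand).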

YamatoSecurity (Collaborator, Author):

@fukusuket I checked and found all three types in my logs, including Possible Base64 + UTF-16 BE (although most are either UTF-8 or UTF-16 LE). So we should probably support each type. Can you do this one?

fukusuket (Collaborator):

Interesting! Yes, I would love to implement it!💪
