Fluent Bit occasionally corrupts/truncates log entries when processing multiple log files. #8798

avizen-j · 2024-05-06T13:12:32Z

Bug Report

Describe the bug
Fluent Bit occasionally corrupts/truncates log entries when processing multiple log files. To be sure, that issue does not occur on Elasticsearch side where logs end up, 2 different outputs were set up to prove that something is wrong with Fluentbit. Moreover, when resending same log twice (locally), it is being shipped properly, so it seems that Fluentbit does not cope well when there are multiple log files.

Original log from the file:

{"@timestamp":"2024-05-06T11:05:59.5862062+00:00","level":"Debug","messageTemplate":"The request is insecure. Skipping HSTS header.","message":"The request is insecure. Skipping HSTS header.","fields":{"EventId":{"Id":1,"Name":"NotSecure"},"SourceContext":"Microsoft.AspNetCore.HttpsPolicy.HstsMiddleware","RequestId":"HIDDEN","RequestPath":"HIDDEN","ConnectionId":"HIDDEN","LoggingEnvironment":"test","ApplicationName":"HIDDEN","ServerName":"HIDDEN","Product":"HIDDEN"}}

Log from http output (elasticsearch v 8.8.1):

{"@timestamp":"2024-05-06T11:05:59.5862062+00:00","level":"Debug","messageTemplate":"The request is insecu"Id":2},"SourceContext":"Microsoft.AspNetCore.Hosting.Diagnostics","RequestId":"HIDDEN","RequestPath":"HIDDEN","ConnectionId":"HIDDEN","LoggingEnvironment":"test","ApplicationName":"HIDDEN","ServerName":"HIDDEN","Product":"HIDDEN"}}

Log from es output (elasticsearch v 8.12.2):

{"@timestamp":"2024-05-06T11:05:59.5862062+00:00","level":"Debug","messageTemplate":"The request is insecu"Id":2},"SourceContext":"Microsoft.AspNetCore.Hosting.Diagnostics","RequestId":"HIDDEN","RequestPath":"HIDDEN","ConnectionId":"HIDDEN","LoggingEnvironment":"test","ApplicationName":"HIDDEN","ServerName":"HIDDEN","Product":"HIDDEN"}}

For some reason, the log is corrupted/truncated in this place "messageTemplate":"The request is insecu"Id":2}.

Context
Currently Fluentbit is deployed as a Deployment in Kubernetes environment. 1 Fluentbit instance per 1 namespace. In namespace there are multiple apps that write their logs to file. Fluentbit watches around 51 files (but might be up to 150). The evidence is below:

...
[2024/05/06 11:33:03] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1107501554002 watch_fd=49 name=/opt/app-root/app-logs/app1/log20240506.txt
[2024/05/06 11:33:03] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1107424670288 watch_fd=50 name=/opt/app-root/app-logs/app2/log20240429.txt
[2024/05/06 11:33:03] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1106717528779 watch_fd=51 name=/opt/app-root/app-logs/app3/log20240506.txt

Your Environment

Version used: fluent-bit:2.2.2-debug
Environment name and version (e.g. Kubernetes? What version?): Kubernetes
Deployed as: kind: Deployment
Filters and plugins: Using the tail input plugin with a custom dockerjson parser, the modify filter, and both http and es output plugins.

Configuration
fluent-bit.conf:

[SERVICE]
    Parsers_File                        ./custom-parsers.conf
    Log_Level                           info
    Storage.path                        /opt/app-root/data/
    Storage.max_chunks_up               400
    Storage.pause_on_chunks_overlimit   On

# Input to read from a file.
[INPUT]
    Name                tail
    Tag                 product
    Parser              dockerjson
    Path                /opt/app-root/app-logs/*, /opt/app-root/app-logs/*/*, /opt/app-root/app-logs/*/*/*, /opt/app-root/app-logs/*/*/*/*
    Refresh_Interval    30
    Read_from_Head      On
    Skip_Long_Lines     On
    Skip_Empty_Lines    On
    Buffer_Max_Size     2M
    DB                  /opt/app-root/data/fluentbit.db
    Storage.type        filesystem

# Filter to add any additional fields with values to each log that will be sent.
[FILTER]
    Name                modify
    Match               product
    Add                 environment ${ENVIRONMENT}

# First output
[OUTPUT]
    Name                        http
    Match                       product
    Host                        ${OUTPUT_1_HOST}
    URI                         /product
    Port                        443
    Http_User                   ${OUTPUT_1_USER}
    Http_Passwd                 ${OUTPUT_1_PASSWORD}
    Tls                         On
    Tls.verify                  Off
    Format                      json
    Retry_Limit                 10
    Storage.total_limit_size    900M

# Second output.
[OUTPUT]
    Name                        es
    Match                       product
    Host                        ${OUTPUT_2_HOST}
    Port                        9200
    Http_User                   ${OUTPUT_2_USER}
    Http_Passwd                 ${OUTPUT_2_PASSWORD}
    Index                       fluent-bit
    Tls                         On
    Tls.verify                  Off
    Suppress_Type_Name          On
    Retry_Limit                 10
    Storage.total_limit_size    900M

custom-parsers.conf:

[PARSER]
    Name         dockerjson
    Format       json
    Time_Key     timestamp
    Time_Format  %Y-%m-%dT%H:%M:%S.%L
    Time_Keep    On
    # Command      |  Decoder      | Field | Optional Action
    # =============|===============|=======|=========
    Decode_Field_As  escaped    message   
    Decode_Field_As  escaped    messageTemplate

The text was updated successfully, but these errors were encountered:

lecaros · 2024-05-11T23:58:59Z

can you add an stdout output and verify the format?

avizen-j added the status: waiting-for-triage label May 6, 2024

avizen-j changed the title ~~Logs are being corrupted/truncated in Kubernetes [tail input, es & http output].~~ Fluent Bit occasionally corrupts/truncates log entries when processing multiple log files. May 6, 2024

lecaros added the waiting-for-user Waiting for more information, tests or requested changes label May 11, 2024

pecastro mentioned this issue May 17, 2024

fluent-bit is corrupting chunk data #8834

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fluent Bit occasionally corrupts/truncates log entries when processing multiple log files. #8798

Fluent Bit occasionally corrupts/truncates log entries when processing multiple log files. #8798

avizen-j commented May 6, 2024

lecaros commented May 11, 2024

Fluent Bit occasionally corrupts/truncates log entries when processing multiple log files. #8798

Fluent Bit occasionally corrupts/truncates log entries when processing multiple log files. #8798

Comments

avizen-j commented May 6, 2024

Bug Report

lecaros commented May 11, 2024