[BUG] Multiline regex for Tail input plugin does not work #2473

Closed
stanislaw55 opened this issue Aug 17, 2020 · 5 comments · May be fixed by #2495

@stanislaw55

Bug Report

Describe the bug

Multiline regex for Tail input plugin does not work.

My log has a message field that can span one line or several lines. With the configuration below, parsing works well when the message field is a single line:

[INPUT]
    Name tail
    Tag tango-machine
    Path /var/log/tango/*/*/*.log
    Multiline On
    Parser_Firstline tango_start
    Parser_1 tango
    Parser_2 tango_trash
    Parser_3 tango_trash2
[FILTER]
    Name modify
    Match tango-machine
    Remove trash
    Remove trash2

In the parsers file:

[PARSER]
    Name tango_start
    Format regex
    Regex /^<log4j:event logger="(?'logger'.+\/.+\/.+)" timestamp="(?'timestamp'\d+)" level="(?'level'[A-Z]+)" thread="(?'thread'\d+)">$/
    Time_Key timestamp
    Time_Keep On
[PARSER]
    Name tango
    Format regex
    Regex /<log4j:message><!\[CDATA\[(?'message'.+)\]\]><\/log4j:message>/
[PARSER]
    Name tango_trash
    Format regex
    Regex /(?'trash'<log4j:NDC><\!\[CDATA\[\]\]><\/log4j:NDC>)/
[PARSER]
    Name tango_trash2
    Format regex
    Regex /(?'trash2'<\/log4j:event>)/

However, when the message spans multiple lines, all of the remaining content is just appended to the thread key. I've tried using this parser:

Regex /(?m-x)<log4j:message><!\[CDATA\[(?<message>.+)\]\]><\/log4j:message>/

To be honest, I've tried so many combinations that I've lost count, and none of them seems to work as I expect.

To Reproduce

Single line of message field

<log4j:event logger="foo/bar/baz" timestamp="1597056068329" level="INFO" thread="140214760625920">
<log4j:message><![CDATA[Received a valid event from foo/bar/dee/gee_0 for node <gee>.]]></log4j:message>
<log4j:NDC><![CDATA[]]></log4j:NDC>
</log4j:event>

Multiline message field

<log4j:event logger="foo/bar/baz" timestamp="1597129352078" level="ERROR" thread="140214760625920">
<log4j:message><![CDATA[Received an event from foo/bar/dee/gee_0 that contains errors:
  It is currently not allowed to read attribute 
gee]]></log4j:message>
<log4j:NDC><![CDATA[]]></log4j:NDC>
</log4j:event>

Steps to reproduce the problem:
  • Use the tail input plugin
  • Use the parsers defined above
  • See that the multiline message is absent and the whole remainder of the event is appended to the thread field

Expected behavior

The message field in the produced JSON should contain the whole message, even if it spans multiple lines.

Screenshots

Your Environment

  • Version used: 1.5.3
  • Configuration:
  • Environment name and version (e.g. Kubernetes? What version?): Virtual Machine
  • Server type and version:
  • Operating System and version: CentOS 7
  • Filters and plugins:

Additional context

@stanislaw55 stanislaw55 changed the title Mutliline regex for Tail input plugin does not work [BUG] Mutliline regex for Tail input plugin does not work Aug 19, 2020
@tom-dudley

Hey @stanislaw55 - I believe I faced the same or a very related issue as you. I've raised #2495 which seems to fix the issues I was facing. I ran your multiline message through my branch (without any parsers set) and got the following output. Is this roughly what you were expecting?

[0] tail.0: [1598300717.223342800, {"log"=>"<log4j:event logger="foo/bar/baz" timestamp="1597129352078" level="ERROR" thread="140214760625920">"}]
[1] tail.0: [1598300717.223742400, {"log"=>"<log4j:message><![CDATA[Received an event from foo/bar/dee/gee_0 that contains errors:"}]
[2] tail.0: [1598300717.223765100, {"log"=>"  It is currently not allowed to read attribute "}]
[3] tail.0: [1598300717.223786600, {"log"=>"gee]]></log4j:message>"}]
[4] tail.0: [1598300717.223805600, {"log"=>"<log4j:NDC><![CDATA[]]></log4j:NDC>"}]
[5] tail.0: [1598300717.223819300, {"log"=>"</log4j:event>"}]

@stanislaw55
Author

Hi @tom-dudley
Your PR looks promising as a fix for the issue. For now I've gone with a Lua script, and it works fine for me.

About the results: the best I could get was similar to what you have presented, except that each log line after the first was appended to the thread field and the message field was missing. Very similar, just with the whole rest squashed into thread.

What I was expecting was something like this

[0] tail.0: [1598300717.223342800, {"logger"=>"foo/bar/baz", "timestamp"=>"1597129352078", "level"=>"ERROR", "thread"=>"140214760625920", "message"=>"Received an event from foo/bar/dee/gee_0 that contains errors:  It is currently not allowed to read attribute gee"}]

for the multiline message.

@github-actions
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions
Contributor

github-actions bot commented May 4, 2021

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this as completed May 4, 2021
@edsiper
Member

edsiper commented Jul 20, 2021

Multiline Update

As part of Fluent Bit v1.8, we have released new Multiline core functionality. This major feature lets you configure new [MULTILINE_PARSER]s that support multiple formats and auto-detection, adds a new multiline mode to the Tail plugin, and, in v1.8.2 (to be released on July 20th, 2021), adds a new Multiline Filter.

For now, you can take a look at the following documentation resources:

Documentation pages now point to complete config examples that are available on our repository.
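
As a rough illustration of how the new mode might be applied to the log4j events reported in this issue, here is a minimal sketch. The parser name tango_multiline and the file name parsers_multiline.conf are made up for this example, and the two regexes are assumptions that have not been tested against the logs above. The idea is that a line starting with <log4j:event opens a new record and every other line is treated as a continuation.

```
# parsers_multiline.conf (hypothetical file name)
[MULTILINE_PARSER]
    # hypothetical parser name, chosen for this sketch
    name          tango_multiline
    type          regex
    flush_timeout 1000
    # rules | state name    | regex pattern            | next state
    # a line that opens a log4j event starts a new record
    rule      "start_state"   "/^<log4j:event /"         "cont"
    # any line that does not open a new event is a continuation (untested assumption)
    rule      "cont"          "/^(?!<log4j:event ).*/"   "cont"

# main configuration file
[SERVICE]
    parsers_file parsers_multiline.conf

[INPUT]
    name             tail
    tag              tango-machine
    path             /var/log/tango/*/*/*.log
    multiline.parser tango_multiline
```

With a setup along these lines, the lines of one event would be expected to arrive as a single record, most likely under the log key. Extracting logger, timestamp, level, thread, and message into separate keys would still need a separate parsing step, which this sketch does not cover.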

Thanks everyone for supporting this!
