Skip to content

Regex capture groups silently require whitespace #397

@ohmann

Description

@ohmann

Capture groups within the regex(es) used to parse log lines silently require whitespace to precede them. This is undocumented and is only detectable when using the --debugParse flag. In other words, it is likely to confuse users.

Example command:
synoptic.sh -r "(?<ITIME>),(?<ip>),(?<TYPE>)" [... other args ...]

Example debug parse snippet:

INFO: input: (?<ITIME>),(?<ip>),(?<TYPE>) 
INFO: processed: (?:\s*(?<ITIME>\S+)\s*),(?:\s+(?<ip>\S+)\s*),(?:\s+(?<TYPE>\S+)\s*) 
INFO: standard: (?:\s*(\S+)\s*),(?:\s+(\S+)\s*),(?:\s+(\S+)\s*)

Note that the capture group (?<ip>) is transformed into (?:\s+(?<ip>\S+)\s*), i.e., the capture group won't match a log line unless it is preceded by 1+ whitespace characters. This is somewhat deceptive, since the regex the user passed does not reference whitespace at all. The user might think that a line like 123,4.4.4.4,event would be matched, but in fact it will not be.

This only affects default behavior. Manually specifying capture group formatting, e.g., (?<ip>\S+), works as expected and does not silently require whitespace before it.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions