Fluent-bit Configuration for Kubernetes with CRI
This post describes the Fluent-bit configuration changes needed when upgrading your Kubernetes cluster to v1.24 and later.
Motivation
We use Fluent-bit as part of our log publishing pipeline. Our k8s clusters run Fluent-bit as a DaemonSet, which reads all the needed container logs and publishes them through AWS Kinesis Firehose to the ELK stack.
After upgrading the cluster from k8s v1.23 we saw that the log format had changed: instead of our usual JSON log we got a prefix before the JSON content, in the form <Date> <stderr|stdout F> <JSON Content>.
Suddenly all our logging was down, since ELK could not digest the non-JSON format (thanks to that additional <Date> <stderr|stdout F> prefix).
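For illustration, a JSON application log now looks roughly like this on disk (a made-up example, not one of our real log lines):
2022-05-10T12:34:56.123456789Z stdout F {"level":"info","msg":"request handled","status":200}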
Solution
It seems that this was well known. The CRI used in k8s has a different log format than that of Docker (which was used until now). You can see the code here.
The Docker Format
// {"log":"content 2","stream":"stderr","time":"2016-10-20T18:39:20.57606444Z"}
The CRI Format
// 2016-10-06T00:17:09.669794203Z stderr F log content 2
Fluent-bit already references this in its documentation here, and we can see that a new parser was added for CRI to help us strip the prefix and get the log content without it (it can be found inside the Fluent-bit pod at /fluent-bit/parsers/parsers.conf).
# CRI Parser
[PARSER]
    # http://rubular.com/r/tjUt3Awgg4
    Name        cri
    Format      regex
    Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
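Applied to the CRI sample line above, this regex splits the record roughly into the following fields (an illustrative breakdown, not actual Fluent-bit output):
time:    2016-10-06T00:17:09.669794203Z
stream:  stderr
logtag:  F
message: log content 2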
The Problem
In our case, the message streamed to ELK through Kinesis was a string containing all the JSON content, so ELK did not recognize it as JSON and did not index the fields.
The CRI parser helps by removing the additional prefix added in the new CRI format, leaving only the JSON part. That JSON part, however, is extracted as a regexp message group, i.e. as a plain string. To have it parsed as JSON we need to add another parser to the Fluent-bit pipeline, otherwise the log itself arrives as a string under the message field. We fix that by adding the following filter with the json parser (which expects JSON input):
[FILTER]
    Name     parser
    Match    *
    Key_Name message
    Parser   json
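To make the difference concrete, here is roughly how the record looks before and after this filter, using the made-up application log line from the Motivation section (illustrative only):
# with only the cri parser: the whole JSON payload is a single string
message: {"level":"info","msg":"request handled","status":200}
# with the json filter on top: each JSON key becomes its own field
level:  info
msg:    request handled
status: 200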
The Solution
In short, we added a Fluent-bit filter with the json parser to get the fields exported to ELK as JSON rather than as strings. A full configuration example could look like the following:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     debug
        Daemon        off
        Parsers_File  /fluent-bit/parsers/parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-elasticsearch.conf

  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.myservice.*
        Parser            cri
        Path              /var/log/containers/*myservice*.log
        DB                /var/log/flb_kube_myservice.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Ignore_Older      1d
        Refresh_Interval  10

  filter-kubernetes.conf: |
    [FILTER]
        Name      parser
        Match     *
        Key_Name  message
        Parser    json

  output-elasticsearch.conf: |
    [OUTPUT]
        Name             kinesis_firehose
        Match            kube.myservice.*
        region           us-east-1
        delivery_stream  myservice_logs_stream
        workers          2
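As a side note, the Fluent-bit Kubernetes docs linked below also describe a built-in CRI multiline parser for newer versions (1.8 and later), which additionally stitches together partial lines (logtag P). A sketch of that variant of the tail input, keeping the same paths as above:
[INPUT]
    Name              tail
    Tag               kube.myservice.*
    multiline.parser  cri
    Path              /var/log/containers/*myservice*.log
    DB                /var/log/flb_kube_myservice.db
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On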
References
- https://kubernetes.io/docs/setup/production-environment/container-runtimes/
- https://docs.fluentbit.io/manual/installation/kubernetes#container-runtime-interface-cri-parser
- https://github.com/kubernetes/kubernetes/blob/355feb21fdec98a5f6baf0927edcd48a3a5612b9/pkg/kubelet/kuberuntime/logs/logs.go#L125-L169