Fluent-bit Configuration for Kubernetes with CRI

Yossi Cohn
Sep 27, 2023

This post covers the changes needed in the Fluent-bit configuration when upgrading your Kubernetes cluster to v1.24 or later.

Motivation

We use Fluent-bit as part of our log publishing pipeline.

Our k8s clusters run Fluent-bit as a DaemonSet, which reads all the needed container logs and publishes them through AWS Kinesis Firehose to the ELK Stack.

After upgrading our k8s clusters from v1.23, we saw that the log format had changed.

Instead of our usual JSON log, each line now had a prefix before the JSON content:
<Date> <stderr|stdout F> <JSON Content>.

Now all our logging was down, since ELK could not ingest the non-JSON format (because of the additional prefix <Date> <stderr|stdout F>).

Logging Infrastructure overview

Solution

It turns out that this is a well-known issue.

The CRI used in k8s has a different log format than that of Docker (which was used until now).

You can see the Code Here.

The Docker Format


// {"log":"content 2","stream":"stderr","time":"2016-10-20T18:39:20.57606444Z"}

The CRI Format

// 2016-10-06T00:17:09.669794203Z stderr F log content 2

Fluent-bit already covers this in its documentation here.

They added a new parser for CRI that removes the prefix and leaves the log content without it.
(You can find it in the Fluent-bit pod at /fluent-bit/parsers/parsers.conf.)

# CRI Parser
[PARSER]
    # http://rubular.com/r/tjUt3Awgg4
    Name        cri
    Format      regex
    Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
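To see what the cri parser extracts, the same regular expression can be tried outside Fluent-bit. The following is a minimal Python sketch, not Fluent-bit code: Python writes named groups as (?P<name>...) rather than (?<name>...), and the sample line and the parse_cri_line helper are made up for illustration.

```python
import re

# Same pattern as the Fluent-bit cri parser, with Python's (?P<name>...) syntax.
CRI_REGEX = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<message>.*)$"
)


def parse_cri_line(line):
    """Return the CRI fields of a log line as a dict, or None if it does not match."""
    match = CRI_REGEX.match(line)
    return match.groupdict() if match else None


# Hypothetical log line in the CRI format: <time> <stream> <logtag> <message>
sample = '2023-09-27T10:15:30.123456789Z stdout F {"level":"info","msg":"started"}'
fields = parse_cri_line(sample)
print(fields["stream"])   # stdout
print(fields["logtag"])   # F
print(fields["message"])  # {"level":"info","msg":"started"}
```

Note that the message group is still a plain string here, even though its content happens to be JSON.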

The Problem

In our case the message streamed to ELK by Kinesis was a single string containing all the JSON content, hence ELK did not recognize it as JSON and did not index the fields.

The CRI parser removes the additional prefix added by the new CRI format and leaves only the JSON part.

However, the JSON part is extracted as the string value of the regex group named message.

To get it parsed as JSON we need an additional parser in the Fluent-bit pipeline, since otherwise the log itself arrives as a string under the message field.
We fix that by adding the following parser filter, which applies the json parser to the message field.

[FILTER]
    Name     parser
    Match    *
    Key_Name message
    Parser   json
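What the two stages do together can be sketched in Python. This is only an illustration of the idea, not how Fluent-bit is implemented: the sample line and the parse_record helper are hypothetical, and leaving a non-JSON message untouched is an assumption made here for simplicity.

```python
import json
import re

# Stage 1 pattern: the cri parser regex, in Python's named-group syntax.
CRI_REGEX = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<message>.*)$"
)


def parse_record(line):
    """Stage 1: strip the CRI prefix. Stage 2: expand the JSON message into fields."""
    match = CRI_REGEX.match(line)
    if match is None:
        return None
    record = match.groupdict()
    try:
        parsed = json.loads(record["message"])
    except json.JSONDecodeError:
        return record  # not JSON: keep the message as a plain string
    if not isinstance(parsed, dict):
        return record  # valid JSON, but not an object with fields
    return parsed


# Hypothetical log line in the CRI format
sample = '2023-09-27T10:15:30.123456789Z stdout F {"level":"info","msg":"started"}'
print(parse_record(sample))  # {'level': 'info', 'msg': 'started'}
```

The sketch returns only the parsed JSON fields, which matches the default behavior of the parser filter as far as we know. If the other fields of the original record should be kept as well, the filter's Reserve_Data option is worth checking in the Fluent-bit documentation.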

The Solution

We added another Fluent-bit filter, this time with the json parser, so that the fields are exported to ELK as JSON and not as a string.

A configuration example could look like the following:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     debug
        Daemon        off
        Parsers_File  /fluent-bit/parsers/parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-elasticsearch.conf
  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.myservice.*
        Parser            cri
        Path              /var/log/containers/*myservice*.log
        DB                /var/log/flb_kube_myservice.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Ignore_Older      1d
        Refresh_Interval  10
  filter-kubernetes.conf: |
    [FILTER]
        Name      parser
        Match     *
        Key_Name  message
        Parser    json
  output-elasticsearch.conf: |
    [OUTPUT]
        Name             kinesis_firehose
        Match            kube.myservice.*
        region           us-east-1
        delivery_stream  myservice_logs_stream
        workers          2


Yossi Cohn

Software Engineer, Tech Lead, DevOps, Infrastructure @ HiredScore