Fluent-bit Configuration for Kubernetes with CRI
This post describes the Fluent-bit configuration changes needed when upgrading your Kubernetes cluster to v1.24 and later.
Motivation
We use Fluent-bit as part of our log publishing pipeline. Our k8s clusters run Fluent-bit as a DaemonSet, which reads all the needed container logs and publishes them through AWS Kinesis Firehose to the ELK stack.
After upgrading the cluster from k8s v1.23 we saw that the log format had changed: instead of our usual JSON log we got a prefix before the JSON content, in the form <Date> <stderr|stdout F> <JSON Content>.
Suddenly all our logging was down, since ELK could not digest the non-JSON format (thanks to that additional <Date> <stderr|stdout F> prefix).
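For illustration, a JSON application log now looks roughly like this on disk (a made-up example, not one of our real log lines):
2022-05-10T12:34:56.123456789Z stdout F {"level":"info","msg":"request handled","status":200}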
Solution
It seems that this was well known. The CRI used in k8s has a different log format than that of Docker (which was used until now). You can see the code here.
The Docker Format
// {"log":"content 2","stream":"stderr","time":"2016-10-20T18:39:20.57606444Z"}
The CRI Format
// 2016-10-06T00:17:09.669794203Z stderr F log content 2
Fluent-bit already references this in its documentation here, and we can see that a new parser was added for CRI to help us strip the prefix and get the log content without it (it can be found inside the Fluent-bit pod at /fluent-bit/parsers/parsers.conf).
# CRI Parser
[PARSER]
    # http://rubular.com/r/tjUt3Awgg4
    Name        cri
    Format      regex
    Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
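Applied to the CRI sample line above, this regex splits the record roughly into the following fields (an illustrative breakdown, not actual Fluent-bit output):
time:    2016-10-06T00:17:09.669794203Z
stream:  stderr
logtag:  F
message: log content 2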
The Problem
In our case, the message streamed to ELK through Kinesis was a string containing all the JSON content, so ELK did not recognize it as JSON and did not index the fields.
The CRI parser helps by removing the additional prefix added in the new CRI format, leaving only the JSON part. That JSON part, however, is extracted as a regexp message group, i.e. as a plain string. To have it parsed as JSON we need to add another parser to the Fluent-bit pipeline, otherwise the log itself arrives as a string under the message field. We fix that by adding the following filter with the json parser (which expects JSON input):
[FILTER]
    Name     parser
    Match    *
    Key_Name message
    Parser   json
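To make the difference concrete, here is roughly how the record looks before and after this filter, using the made-up application log line from the Motivation section (illustrative only):
# with only the cri parser: the whole JSON payload is a single string
message: {"level":"info","msg":"request handled","status":200}
# with the json filter on top: each JSON key becomes its own field
level:  info
msg:    request handled
status: 200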
The Solution
In short, we added a Fluent-bit filter with the json parser to get the fields exported to ELK as JSON rather than as strings. A full configuration example could look like the following:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     debug
        Daemon        off
        Parsers_File  /fluent-bit/parsers/parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-elasticsearch.conf

  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.myservice.*
        Parser            cri
        Path              /var/log/containers/*myservice*.log
        DB                /var/log/flb_kube_myservice.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Ignore_Older      1d
        Refresh_Interval  10

  filter-kubernetes.conf: |
    [FILTER]
        Name      parser
        Match     *
        Key_Name  message
        Parser    json

  output-elasticsearch.conf: |
    [OUTPUT]
        Name             kinesis_firehose
        Match            kube.myservice.*
        region           us-east-1
        delivery_stream  myservice_logs_stream
        workers          2
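As a side note, the Fluent-bit Kubernetes docs linked below also describe a built-in CRI multiline parser for newer versions (1.8 and later), which additionally stitches together partial lines (logtag P). A sketch of that variant of the tail input, keeping the same paths as above:
[INPUT]
    Name              tail
    Tag               kube.myservice.*
    multiline.parser  cri
    Path              /var/log/containers/*myservice*.log
    DB                /var/log/flb_kube_myservice.db
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On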
References
- https://kubernetes.io/docs/setup/production-environment/container-runtimes/
- https://docs.fluentbit.io/manual/installation/kubernetes#container-runtime-interface-cri-parser
- https://github.com/kubernetes/kubernetes/blob/355feb21fdec98a5f6baf0927edcd48a3a5612b9/pkg/kubelet/kuberuntime/logs/logs.go#L125-L169