A Log Collection Solution for Containerized Environments

一直学下去 · Published 2025-07-13 08:57:35

Background

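The company's business systems are deployed as containers in a K8S cluster, and business traffic is managed through the APISix gateway. Nearly all front-end page requests and back-end API requests pass through APISix, so its request data is highly valuable for business analysis. The APISix request logs therefore need to be delivered to the big data platform. After some evaluation, loggie was chosen as the tool for this. The data flow is: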

User browser request -> APISix logs -> loggie -> kafka -> big data platform

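This collection approach is not limited to the scenario above; it applies to many others as well. See the official diagram:
[Figure: official loggie diagram]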

Caveats

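The project shows no update activity after 2024 and appears to be unmaintained, so pay close attention to the versions of the components it integrates with. In my case I ran into trouble with the kafka version: support currently goes only up to 2.7.1.
For the same reason, the K8S and CRI interfaces it uses may already have been removed in newer K8S versions, which causes failures. In my case I hit the CRI version problem and resolved it by modifying the source code myself. The modified code repository and image addresses are given at the end of this article.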

Environment

Kubernetes

Client Version: v1.32.3
Kustomize Version: v5.5.0
Server Version: v1.32.3

Container runtime

containerd github.com/containerd/containerd v1.7.27 05044ec0a9a75232cad458027ca83437aae3f4da

Operating system and kernel

NAME="openEuler"
VERSION="24.03 (LTS)"
ID="openEuler"
VERSION_ID="24.03"
PRETTY_NAME="openEuler 24.03 (LTS)"
ANSI_COLOR="0;31"
Linux master-01 6.6.0-28.0.0.34.oe2403.aarch64 #1 SMP Mon May 27 22:43:49 CST 2024 aarch64 aarch64 aarch64 GNU/Linux

kafka information

Version: 2.7.1
# SASL authentication is enabled; the relevant server.properties settings are below. For deploying kafka itself, see other articles.
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
inter.broker.listener.name=SASL_PLAINTEXT
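
Clients that connect to a broker configured this way need matching SASL settings. The consumer command later in this article passes a client.properties file whose content is not shown. Below is a minimal sketch of how such a file could be generated, assuming the listener uses SASL_PLAINTEXT with the PLAIN mechanism as in the excerpt above. The function name and the placeholder credentials are mine, not part of the original setup.

```python
def render_client_properties(username: str, password: str) -> str:
    """Build Kafka client settings for a SASL_PLAINTEXT listener using PLAIN.

    Assumption: the broker accepts PLAIN over SASL_PLAINTEXT, as in the
    server.properties excerpt above. Credentials are placeholders, and
    values containing double quotes would need escaping.
    """
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    lines = [
        "security.protocol=SASL_PLAINTEXT",
        "sasl.mechanism=PLAIN",
        f"sasl.jaas.config={jaas}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical credentials; replace with the real account before use.
    print(render_client_properties("kafka", "<password>"), end="")
```

Redirecting the output to client.properties gives a file that can be passed to the console consumer with --consumer.config.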

Implementation steps

Install loggie in the cluster

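# Install via helm
1. Clone the official loggie chart repository locally
git clone https://github.com/loggie-io/installation.git
cd installation
2. Edit the chart's values.yaml to fit your environment. The values.yaml I used is shown below; adjust the paths in it to match your setup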

image: harbor.whyxpt.site/release/loggie:v1.5.1

resources:
  limits:
    cpu: 2
    memory: 2Gi
  requests:
    cpu: 100m
    memory: 100Mi

extraArgs: {}
  # log.level: debug
  # log.jsonFormat: true

extraVolumeMounts:
  - mountPath: /var/log/pods
    name: podlogs
  - mountPath: /var/lib/docker/containers
    name: dockercontainers
  - mountPath: /var/lib/kubelet/pods
    name: kubelet
  - mountPath: /opt/data
    name: db

extraVolumes:
  - hostPath:
      path: /var/log/pods
      type: DirectoryOrCreate
    name: podlogs
  - hostPath:
      path: /data/lib/docker/containers
      type: DirectoryOrCreate
    name: dockercontainers
  - hostPath:
      path: /data/lib/kubelet/pods
      type: DirectoryOrCreate
    name: kubelet
  - hostPath:
      path: /data/lib/loggie
      type: DirectoryOrCreate
    name: db 

extraEnvs: {}
timezone: Asia/Shanghai

## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
nodeSelector: {}

## Affinity for pod assignment
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinity: {}
# podAntiAffinity:
#   requiredDuringSchedulingIgnoredDuringExecution:
#   - labelSelector:
#       matchExpressions:
#       - key: app
#         operator: In
#         values:
#         - loggie
#     topologyKey: "kubernetes.io/hostname"

## Tolerations for pod assignment
## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
tolerations: []
# - effect: NoExecute
#   operator: Exists
# - effect: NoSchedule
#   operator: Exists

updateStrategy:
  type: RollingUpdate

## Agent mode, ignored when aggregator.enabled is true
config:
  loggie:
    reload:
      enabled: true
      period: 10s
    monitor:
      logger:
        period: 30s
        enabled: true
      listeners:
        filesource:
          period: 10s
        filewatcher:
          period: 5m
        reload:
          period: 10s
        sink:
          period: 10s
        queue:
          period: 10s
        pipeline:
          period: 10s
    db:
      file: /opt/data/loggie.db
    discovery:
      enabled: true
      kubernetes:
        # Choose: docker or containerd
        containerRuntime: containerd
        # Collect log files inside the container from the root filesystem of the container, no need to mount the volume
        rootFsCollectionEnabled: true
        # Automatically parse and convert the wrapped container standard output format into the original log content
        parseStdout: false
        # If set to true, it means that the pipeline configuration generated does not contain specific Pod paths and meta information,
        # and these data will be dynamically obtained by the file source, thereby reducing the number of configuration changes and reloads.
        dynamicContainerLog: false
        # Automatically add fields when selector.type is pod in logconfig/clusterlogconfig
        typePodFields:
          logconfig: "${_k8s.logconfig}"
          namespace: "${_k8s.pod.namespace}"
          nodename: "${_k8s.node.name}"
          podname: "${_k8s.pod.name}"
          containername: "${_k8s.pod.container.name}"

    http:
      enabled: true
      port: 9196

## Aggregator mode, by default is disabled
aggregator:
  enabled: false
  replicas: 2
  config:
    loggie:
      reload:
        enabled: true
        period: 10s
      monitor:
        logger:
          period: 30s
          enabled: true
        listeners:
          reload:
            period: 10s
          sink:
            period: 10s
      discovery:
        enabled: true
        kubernetes:
          cluster: aggregator
      http:
        enabled: true
        port: 9196
servicePorts:
  - name: monitor
    port: 9196
    targetPort: 9196
#  - name: grpc
#    port: 6066
#    targetPort: 6066
serviceMonitor:
  enabled: false
  ## Scrape interval. If not set, the Prometheus default scrape interval is used.
  interval: 30s
  relabelings: []
  metricRelabelings: []

3. Install loggie
helm upgrade --install loggie helm-chart -n prd-public-service
4. Check that loggie is running and make sure every Pod is in the Running state
kubectl get pod -n prd-public-service | grep loggie
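
The check in step 4 can also be scripted. The sketch below takes the text printed by kubectl get pod and reports any loggie Pod whose STATUS column is not Running. It assumes the default table layout (NAME, READY, STATUS, RESTARTS, AGE); the function name and the Pod names in the sample are made up for illustration.

```python
from typing import List


def not_running(kubectl_output: str, name_filter: str = "loggie") -> List[str]:
    """Return names of matching Pods whose STATUS column is not 'Running'.

    Assumes the default `kubectl get pod` table layout:
    NAME  READY  STATUS  RESTARTS  AGE
    """
    bad = []
    for line in kubectl_output.splitlines():
        cols = line.split()
        # Skip blank lines, the header row, and Pods that do not match.
        if len(cols) < 3 or cols[0] == "NAME" or name_filter not in cols[0]:
            continue
        if cols[2] != "Running":
            bad.append(cols[0])
    return bad


if __name__ == "__main__":
    # Made-up sample output for illustration only.
    sample = """\
NAME           READY   STATUS             RESTARTS   AGE
loggie-7x2kq   1/1     Running            0          3m
loggie-b9f4d   0/1     CrashLoopBackOff   4          3m
apisix-abc12   1/1     Running            0          9d
"""
    print(not_running(sample))
```

An empty list means every loggie Pod in the output is Running.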

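Configure container log collection via CRD

I need to collect the apisix logs. The apisix Pods all carry the label app=apisix. The yaml files are as follows:
Pushing logs to Kafka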

apiVersion: loggie.io/v1beta1
kind: LogConfig
metadata:
  name: apisix-to-kafka
spec:
  selector:
    type: pod
    labelSelector:
      app: apisix
  pipeline:
    sources: |
      - type: file
        name: nginx
        paths:
          - /var/log/nginx/*.json
    interceptors: |
      - type: transformer
        actions:
         - action: jsonDecode(body)
    sink: |
      type: kafka
      brokers: 
         - 10.128.40.3:9092
      topic: "loggie"
      sasl:
        # options: scram or plain
        type: plain 
        mechanism: SCRAM-SHA-256
        username: "kafka"
        password: "******!"
        # takes effect only when scram is selected; options: sha256, sha512
        algorithm: sha256
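
To make the interceptor step concrete: each collected line arrives as raw text in the event's body, and jsonDecode(body) parses that text so its keys become individual fields. The sketch below imitates this in plain Python. It illustrates the idea and is not loggie's implementation. That the decoded keys land at the top level and body is dropped is an assumption, based on the sample output later in this article, where no body key appears.

```python
import json


def json_decode_body(event: dict) -> dict:
    """Imitate the effect of jsonDecode(body) on one event.

    Assumption: decoded keys are merged into the top level of the event
    and the raw body is removed. Lines that are not a JSON object are
    returned unchanged here; loggie's own error handling may differ.
    """
    try:
        decoded = json.loads(event.get("body", ""))
    except (TypeError, ValueError):
        return dict(event)
    if not isinstance(decoded, dict):
        return dict(event)
    result = {k: v for k, v in event.items() if k != "body"}
    result.update(decoded)
    return result


if __name__ == "__main__":
    raw = {
        "body": '{"uri": "/index.html", "status": "200"}',
        "fields": {"namespace": "prd-public-service"},
    }
    print(json_decode_body(raw))
```

Note that the decoded values keep whatever type they had in the log line, so a status written as "200" stays a string.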

Pushing logs to elasticsearch

apiVersion: loggie.io/v1beta1
kind: LogConfig
metadata:
  name: apisix-to-elasticsearch
spec:
  selector:
    type: pod
    labelSelector:
      app: apisix
  pipeline:
    sources: |
      - type: file
        name: apisix
        paths:
          - /usr/local/apisix/logs/*.json
    interceptors: |
      - type: transformer
        actions:
         - action: jsonDecode(body)
         - action: strconv(request_time, float)
         - action: strconv(upstream_response_time, float)
    sink: |
         type: elasticsearch
         hosts: ["172.88.101.23:9200","172.88.101.24:9200","172.88.101.25:9200"]
         index: "loggie-${+YYYY.MM.DD}"
         schema: "http"
         username: "elastic"
         password: "Sew****EeuPU"

View the logs collected by loggie in kafka

bin/kafka-console-consumer.sh  --consumer.config client.properties \
--bootstrap-server 10.128.40.3:9092 \
--topic loggie  --from-beginning

Sample output:

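# The apisix log format has been standardized to output json, so the collection definition formats it with the transformer's jsonDecode.
# The big data platform can then consume and process the data directly, and if loggie is connected to es as the backend, the fields can also be used for queries.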
{"@timestamp":"2025-07-12T01:40:13+08:00","application_name":"default-applicaton","uri":"/index.html","remote_addr":"10.130.0.127","remote_user":"-","server_addr":"10.130.1.230","upstream_response_time":"-","upstream_addr":"-","body_bytes_sent":"615","http_referer":"-","request_filename":"/usr/local/nginx/html/index.html","fields":{"namespace":"prd-public-service","nodename":"k8s-worker-1","podname":"kydy-admin-view-56ff464f45-bff6p","containername":"kydy-admin-view","logconfig":"front-end-applications"},"http_host":"10.68.124.224","request_time":"0.000","http_x_forwarded_for":"-","proxy_protocol_addr":"-","request":"GET / HTTP/1.1","status":"200","http_user_agent":"curl/7.29.0"}
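
Every value in this record is a string, and fields with no value are written as "-" (see upstream_response_time above). A consumer on the big data side therefore has to convert types itself. Below is a minimal sketch, assuming records shaped like the sample. The function names and the choice to map "-" to None are mine.

```python
import json
from typing import Optional


def to_float(value) -> Optional[float]:
    """Convert a numeric string to float; '-' or a missing value gives None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def normalize(record_json: str) -> dict:
    """Pick out a few fields from one log record and give them real types."""
    rec = json.loads(record_json)
    fields = rec.get("fields", {})
    return {
        "timestamp": rec.get("@timestamp"),
        "uri": rec.get("uri"),
        "status": int(rec["status"]),
        "request_time": to_float(rec.get("request_time")),
        "upstream_response_time": to_float(rec.get("upstream_response_time")),
        "namespace": fields.get("namespace"),
        "podname": fields.get("podname"),
    }


if __name__ == "__main__":
    # Shortened copy of the sample record above.
    line = (
        '{"@timestamp":"2025-07-12T01:40:13+08:00","uri":"/index.html",'
        '"upstream_response_time":"-","request_time":"0.000","status":"200",'
        '"fields":{"namespace":"prd-public-service",'
        '"podname":"kydy-admin-view-56ff464f45-bff6p"}}'
    )
    print(normalize(line))
```

Mapping "-" to None keeps requests that never reached an upstream distinguishable from requests whose upstream answered in 0 seconds.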

References and materials

For CRD usage, see the official documentation
https://loggie-io.github.io/docs/main/reference/discovery/kubernetes/logconfig/
Source code with the modified K8S CRI version
https://gitee.com/kevinliu_CQ/loggie
Images built from the modified code

x86_64: registry.cn-hangzhou.aliyuncs.com/dockerforkevin/loggie:1.5.1
arm64: registry.cn-hangzhou.aliyuncs.com/dockerforkevin/loggie-arm:v1.5.1

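This approach suits any containerized log collection scenario; send me a private message if you have questions. loggie has not been updated for a long time. Personally I think it is a very good project, and I hope the community maintainers can keep updating it. Thanks to them for providing such a good project.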
