Background

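The company's business systems are deployed as containers in a K8S cluster, and business traffic is managed through the APISix gateway. Nearly all front-end page requests and back-end API requests pass through APISix, so its request logs carry significant value for business analysis. We therefore need to ship the APISix request logs to the big data platform. After evaluating the options, we chose loggie as the tool for this. The data flow is: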

User browser request -> APISix logs -> loggie -> kafka -> big data platform

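This log collection approach is not limited to the scenario above; it applies to many other scenarios as well. The official documentation illustrates them with a diagram:
[Image from the official loggie documentation; not available here]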

Caveats

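  • The project has no recorded updates after 2024 and appears to be unmaintained, so pay close attention to the versions of the other components that loggie integrates with. In my case I ran into trouble with the kafka version: loggie currently supports kafka only up to 2.7.1.
  • For the same reason, the K8S and CRI APIs that loggie uses may already have been removed in newer K8S versions, which causes failures. In my case I hit a CRI version problem and solved it by modifying the source code myself. The modified repository and image addresses are given at the end of this article.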
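
Since both caveats come down to version mismatches, it can help to record what the cluster actually runs before installing. The helper below is a minimal sketch and not part of loggie: the function name show_node_versions is made up for illustration, and it only assumes that kubectl is already configured for the target cluster.

```shell
# Print each node's kubelet version, container runtime and kernel so they can
# be compared against what the loggie build in use supports.
# show_node_versions is a hypothetical helper name, not a loggie command.
show_node_versions() {
  kubectl get nodes -o custom-columns='NAME:.metadata.name,KUBELET:.status.nodeInfo.kubeletVersion,RUNTIME:.status.nodeInfo.containerRuntimeVersion,KERNEL:.status.nodeInfo.kernelVersion'
}
```

Calling show_node_versions prints one row per node. The RUNTIME column is the one to check against the CRI caveat above.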

Environment

  • Kubernetes
Client Version: v1.32.3
Kustomize Version: v5.5.0
Server Version: v1.32.3
  • Container runtime
containerd github.com/containerd/containerd v1.7.27 05044ec0a9a75232cad458027ca83437aae3f4da
  • Operating system and kernel
NAME="openEuler"
VERSION="24.03 (LTS)"
ID="openEuler"
VERSION_ID="24.03"
PRETTY_NAME="openEuler 24.03 (LTS)"
ANSI_COLOR="0;31"
Linux master-01 6.6.0-28.0.0.34.oe2403.aarch64 #1 SMP Mon May 27 22:43:49 CST 2024 aarch64 aarch64 aarch64 GNU/Linux
  • kafka
Version: 2.7.1
# SASL authentication is enabled; the relevant server.properties settings are shown below. For kafka deployment details, refer to other articles.
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
inter.broker.listener.name=SASL_PLAINTEXT

Implementation Steps

Installing loggie in the cluster

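# Install with helm
1. Clone the official loggie chart to your local machine
git clone https://github.com/loggie-io/installation.git
cd installation
2. Edit values.yaml to fit your environment. The values.yaml I used is shown below; adjust the paths in it to match your own setup.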
image: harbor.whyxpt.site/release/loggie:v1.5.1

resources:
  limits:
    cpu: 2
    memory: 2Gi
  requests:
    cpu: 100m
    memory: 100Mi

extraArgs: {}
  # log.level: debug
  # log.jsonFormat: true

extraVolumeMounts:
  - mountPath: /var/log/pods
    name: podlogs
  - mountPath: /var/lib/docker/containers
    name: dockercontainers
  - mountPath: /var/lib/kubelet/pods
    name: kubelet
  - mountPath: /opt/data
    name: db

extraVolumes:
  - hostPath:
      path: /var/log/pods
      type: DirectoryOrCreate
    name: podlogs
  - hostPath:
      path: /data/lib/docker/containers
      type: DirectoryOrCreate
    name: dockercontainers
  - hostPath:
      path: /data/lib/kubelet/pods
      type: DirectoryOrCreate
    name: kubelet
  - hostPath:
      path: /data/lib/loggie
      type: DirectoryOrCreate
    name: db 

extraEnvs: {}
timezone: Asia/Shanghai

## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
nodeSelector: {}

## Affinity for pod assignment
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinity: {}
# podAntiAffinity:
#   requiredDuringSchedulingIgnoredDuringExecution:
#   - labelSelector:
#       matchExpressions:
#       - key: app
#         operator: In
#         values:
#         - loggie
#     topologyKey: "kubernetes.io/hostname"

## Tolerations for pod assignment
## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
tolerations: []
# - effect: NoExecute
#   operator: Exists
# - effect: NoSchedule
#   operator: Exists

updateStrategy:
  type: RollingUpdate

## Agent mode, ignored when aggregator.enabled is true
config:
  loggie:
    reload:
      enabled: true
      period: 10s
    monitor:
      logger:
        period: 30s
        enabled: true
      listeners:
        filesource:
          period: 10s
        filewatcher:
          period: 5m
        reload:
          period: 10s
        sink:
          period: 10s
        queue:
          period: 10s
        pipeline:
          period: 10s
    db:
      file: /opt/data/loggie.db
    discovery:
      enabled: true
      kubernetes:
        # Choose: docker or containerd
        containerRuntime: containerd
        # Collect log files inside the container from the root filesystem of the container, no need to mount the volume
        rootFsCollectionEnabled: true
        # Automatically parse and convert the wrapped container standard output format into the original log content
        parseStdout: false
        # If set to true, it means that the pipeline configuration generated does not contain specific Pod paths and meta information,
        # and these data will be dynamically obtained by the file source, thereby reducing the number of configuration changes and reloads.
        dynamicContainerLog: false
        # Automatically add fields when selector.type is pod in logconfig/clusterlogconfig
        typePodFields:
          logconfig: "${_k8s.logconfig}"
          namespace: "${_k8s.pod.namespace}"
          nodename: "${_k8s.node.name}"
          podname: "${_k8s.pod.name}"
          containername: "${_k8s.pod.container.name}"

    http:
      enabled: true
      port: 9196

## Aggregator mode, by default is disabled
aggregator:
  enabled: false
  replicas: 2
  config:
    loggie:
      reload:
        enabled: true
        period: 10s
      monitor:
        logger:
          period: 30s
          enabled: true
        listeners:
          reload:
            period: 10s
          sink:
            period: 10s
      discovery:
        enabled: true
        kubernetes:
          cluster: aggregator
      http:
        enabled: true
        port: 9196
servicePorts:
  - name: monitor
    port: 9196
    targetPort: 9196
#  - name: grpc
#    port: 6066
#    targetPort: 6066
serviceMonitor:
  enabled: false
  ## Scrape interval. If not set, the Prometheus default scrape interval is used.
  interval: 30s
  relabelings: []
  metricRelabelings: []
3. Install loggie
helm upgrade --install loggie helm-chart -n prd-public-service
4. Check that loggie is running and that every pod is in the Running state
kubectl get pod -n prd-public-service | grep loggie
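
Instead of checking the pod list by eye, the rollout can be waited on. This is a minimal sketch under stated assumptions: the DaemonSet name "loggie" is a guess based on the Helm release name used above and may differ in your chart version, and wait_for_loggie is a made-up helper name.

```shell
NAMESPACE="prd-public-service"
# Assumption: the chart names the agent DaemonSet after the Helm release.
# Check with `kubectl get daemonset -n prd-public-service` if unsure.
DAEMONSET="${DAEMONSET:-loggie}"

# wait_for_loggie is a hypothetical helper: it blocks until all agent pods are
# ready (or fails after two minutes), then lists them with their nodes.
wait_for_loggie() {
  kubectl rollout status "daemonset/${DAEMONSET}" -n "${NAMESPACE}" --timeout=120s \
    && kubectl get pods -n "${NAMESPACE}" -o wide | grep loggie
}
```

Run wait_for_loggie after the helm command returns. A non-zero exit status means the agents did not all become ready within the timeout.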

Configuring container log collection through CRDs

  1. I need to collect the apisix logs. The apisix Pods all carry the label app=apisix. The yaml is as follows:
    Shipping logs to Kafka
apiVersion: loggie.io/v1beta1
kind: LogConfig
metadata:
  name: apisix-to-kafka
spec:
  selector:
    type: pod
    labelSelector:
      app: apisix
  pipeline:
    sources: |
      - type: file
        name: nginx
        paths:
          - /var/log/nginx/*.json
    interceptors: |
      - type: transformer
        actions:
         - action: jsonDecode(body)
    sink: |
      type: kafka
      brokers: 
         - 10.128.40.3:9092
      topic: "loggie"
      sasl:
        # Options: scram or plain
        type: plain 
        mechanism: SCRAM-SHA-256
        username: "kafka"
        password: "******!"
        # Only takes effect when scram is selected; options: sha256 or sha512
        algorithm: sha256

Shipping logs to elasticsearch

apiVersion: loggie.io/v1beta1
kind: LogConfig
metadata:
  name: apisix-to-elasticsearch
spec:
  selector:
    type: pod
    labelSelector:
      app: apisix
  pipeline:
    sources: |
      - type: file
        name: apisix
        paths:
          - /usr/local/apisix/logs/*.json
    interceptors: |
      - type: transformer
        actions:
         - action: jsonDecode(body)
         - action: strconv(request_time, float)
         - action: strconv(upstream_response_time, float)
    sink: |
         type: elasticsearch
         hosts: ["172.88.101.23:9200","172.88.101.24:9200","172.88.101.25:9200"]
         index: "loggie-${+YYYY.MM.DD}"
         schema: "http"
         username: "elastic"
         password: "Sew****EeuPU"
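
Either manifest above still has to be applied to the cluster before loggie picks it up. The following is a minimal sketch with assumptions called out: the file name apisix-to-kafka.yaml and the helper name apply_logconfig are made up for illustration, and the default namespace is only a guess. As far as I know, a LogConfig is namespaced and selects Pods in its own namespace, so it should be created where the apisix Pods run.

```shell
# Assumption: APISIX_NAMESPACE is the namespace of the apisix Pods. The
# default below is a placeholder, so override it for your cluster.
APISIX_NAMESPACE="${APISIX_NAMESPACE:-prd-public-service}"

# apply_logconfig is a hypothetical helper: it applies a LogConfig manifest
# and then lists the LogConfig objects in that namespace to confirm it exists.
apply_logconfig() {
  manifest="$1"
  kubectl apply -f "${manifest}" -n "${APISIX_NAMESPACE}" \
    && kubectl get logconfigs.loggie.io -n "${APISIX_NAMESPACE}"
}
```

For example, apply_logconfig apisix-to-kafka.yaml applies the Kafka variant. Because reload is enabled in the values.yaml above, the agents should pick up the new pipeline without a restart.
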
  2. View the logs that loggie collected in kafka
bin/kafka-console-consumer.sh  --consumer.config client.properties \
--bootstrap-server 10.128.40.3:9092 \
--topic loggie  --from-beginning
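
The consumer command reads its credentials from client.properties, which is not shown in this article. Here is a minimal sketch of what that file might contain, assuming the broker settings from the environment section (SASL_PLAINTEXT listener, PLAIN mechanism) and the user name "kafka" from the sink config. KAFKA_PASSWORD is a placeholder and must be replaced with the real credential.

```shell
# Placeholder credentials: the user name comes from the sink config above,
# the password default is a dummy value and must be overridden.
KAFKA_USER="kafka"
KAFKA_PASSWORD="${KAFKA_PASSWORD:-changeme}"

# Write a client config that matches a SASL_PLAINTEXT listener using PLAIN.
cat > client.properties <<EOF
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="${KAFKA_USER}" password="${KAFKA_PASSWORD}";
EOF
```

The file is written to the current directory, so run this from the kafka installation directory where the consumer command is executed.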

Sample output:

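# The apisix log format has been standardized to JSON output, so the collection config uses the transformer's jsonDecode action to parse it.
# This lets the big data platform consume and process the data directly. If loggie is also connected to es, the fields can be used in queries as well.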
{"@timestamp":"2025-07-12T01:40:13+08:00","application_name":"default-applicaton","uri":"/index.html","remote_addr":"10.130.0.127","remote_user":"-","server_addr":"10.130.1.230","upstream_response_time":"-","upstream_addr":"-","body_bytes_sent":"615","http_referer":"-","request_filename":"/usr/local/nginx/html/index.html","fields":{"namespace":"prd-public-service","nodename":"k8s-worker-1","podname":"kydy-admin-view-56ff464f45-bff6p","containername":"kydy-admin-view","logconfig":"front-end-applications"},"http_host":"10.68.124.224","request_time":"0.000","http_x_forwarded_for":"-","proxy_protocol_addr":"-","request":"GET / HTTP/1.1","status":"200","http_user_agent":"curl/7.29.0"}

References and Materials

  1. For CRD usage, refer to the official documentation
    https://loggie-io.github.io/docs/main/reference/discovery/kubernetes/logconfig/
  2. Repository with the modified K8S CRI version
    https://gitee.com/kevinliu_CQ/loggie
  3. Images built from the modified version

x86_64: registry.cn-hangzhou.aliyuncs.com/dockerforkevin/loggie:1.5.1
arm64: registry.cn-hangzhou.aliyuncs.com/dockerforkevin/loggie-arm:v1.5.1

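This approach suits all containerized log collection; send me a private message if you have questions. loggie has not been updated for a long time. I personally think it is an excellent project, and I hope the community maintainers can keep updating it. Thanks to them for providing such a good project.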
