Skip to content

azure_blob output forwarding logs twice with large number of log files (50k) #11645

@sarkisjad

Description

@sarkisjad

Bug Report

Describe the bug
When I am using output_blob to ship 50k small files from a source directory to azure blob, not all files are getting shipped. It's usually 99% but not all files.
These 50k files are being generated while fluent bit is up and each file is small in size
Note: When using S3 output, we don't have this issue, indicating that the issue is coming from the output, not from the input or filter plugins

EDIT:
The missing files are now present after I increased the Retry_Limit (Found in the scheduling and retries: https://docs.fluentbit.io/manual/administration/scheduling-and-retries).
However now, I am coming across duplicate log lines in files.
I saw some similar issues talking about it but for different outputs like:

To Reproduce

  • Steps to reproduce the problem: Use output_blob as an output plugin

Expected behavior
All files are present

Your Environment

  • Version used: 4.2.3
  • Configuration:
  outputs:
    - name: azure_blob
      match: '*'
      path: blob/path
      account_name: {my_account_name}
      auth_type: sas
      sas_token: {sas_token}
      container_name: my-container
      auto_create_container: true

      workers: 5
      file_delivery_attempt_limit: 20
      database_file: /path/azure-blob.db

      upload_file_size: 1M
      upload_timeout: 1m
      tls: on

Additional context

  • What kind of tweaking can we do in the configuration? Is it even something the configuration can fix?

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions