gzip: in_forward: Fix concatenated gzip payloads gzip concatenated #10259


Open · wants to merge 2 commits into base: gzip-concatenated

Conversation

@Tangui0232 Tangui0232 commented Apr 26, 2025

Fix concatenated gzip payload handling.

The implementation of flb_gzip_count is flawed: it relies on scanning for
valid gzip headers, but a gzip payload can be crafted whose compressed body
itself contains a valid gzip header (see test_header_in_gzip_body).
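As a quick illustration of the flaw (a Python sketch, not the actual test_header_in_gzip_body unit test): compressing at level 0 emits stored deflate blocks, so input bytes appear verbatim inside the compressed body, and a header-scanning splitter counts phantom members:

```python
import gzip

# Level 0 produces stored (uncompressed) deflate blocks, so the input
# bytes -- including a fake gzip magic sequence -- appear verbatim
# inside the body of a single gzip member.
data = b"log line " + b"\x1f\x8b\x08\x00" + b" more log data"
payload = gzip.compress(data, compresslevel=0)

# A scanner looking for gzip headers (as flb_gzip_count did) finds the
# real header *and* the embedded fake one in this single-member payload,
# yet the payload decompresses cleanly as one member.
print(payload.count(b"\x1f\x8b\x08") >= 2)   # True
print(gzip.decompress(payload) == data)      # True
```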

Removed flb_gzip_count and its associated handling in favor of using
mz_inflate to find the boundaries between concatenated gzip payloads
during decompression: mz_inflate stops when it reaches the end of a
gzip member, and mz_stream.avail_in reports the bytes left in the
buffer for further processing.
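In Python terms (a sketch of the same idea, not Fluent Bit's C code), zlib's decompressobj exposes the same signal through unused_data: the decompressor stops at each member's end and hands back the remaining bytes, so no header scanning is needed:

```python
import gzip
import zlib

def gunzip_multi(buf: bytes) -> list:
    """Decompress concatenated gzip members one at a time."""
    out = []
    while buf:
        d = zlib.decompressobj(wbits=31)   # 31 = expect a gzip wrapper
        out.append(d.decompress(buf))
        # Everything past this member's footer -- the analogue of the
        # leftover avail_in that mz_inflate reports -- is the next member.
        buf = d.unused_data
    return out

print(gunzip_multi(gzip.compress(b"first") + gzip.compress(b"second")))
# [b'first', b'second']
```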

With this change we can no longer allocate exactly the memory required
by reading the decompressed length out of the gzip footer (since we no
longer know where the footer is). Instead we allocate intermediate
buffers of size FLB_GZIP_BUFFER_SIZE (1 MB) to hold the decompressed
data, and we call mz_inflate with MZ_SYNC_FLUSH instead of MZ_FINISH so
that it returns to us whenever it needs more output space. When more
space is required we allocate another buffer (up to
FLB_GZIP_MAX_BUFFERS (100)) and call mz_inflate again. Once
decompression is complete we allocate the final buffer and copy the
data into it from the intermediate buffers. This means we use at least
twice the amount of memory as before, for the short period during which
the data is copied from the intermediate buffers to the final buffer.
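The fixed-size-buffer loop can be sketched the same way (again a Python analogue, not the C implementation; CHUNK stands in for FLB_GZIP_BUFFER_SIZE, and max_length plays the role of MZ_SYNC_FLUSH returning when the output buffer fills):

```python
import gzip
import zlib

CHUNK = 4096  # stands in for FLB_GZIP_BUFFER_SIZE (1 MB in the PR)

def gunzip_chunked(buf: bytes) -> bytes:
    """Decompress one gzip member through fixed-size intermediate buffers."""
    d = zlib.decompressobj(wbits=31)
    chunks = []
    while buf and not d.eof:
        # Produce at most CHUNK bytes per call, like filling one
        # intermediate buffer before allocating the next.
        chunks.append(d.decompress(buf, CHUNK))
        buf = d.unconsumed_tail  # input deferred because output was full
    # The final copy: join the intermediate buffers into one allocation,
    # which is where the temporary ~2x memory use comes from.
    return b"".join(chunks)

blob = b"example " * 5000
print(gunzip_chunked(gzip.compress(blob)) == blob)  # True
```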

Addresses #9058.


Enter [N/A] in the box if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change

Forward output plugin side:

[SERVICE]
    Flush                             1
    Grace                             60
    Log_Level                         info
    Parsers_File                      /usr/fluentbit/conf/parsers.conf
    Refresh_Interval                  60
    Daemon                            off
    HTTP_Server                       On
    HTTP_Listen                       0.0.0.0
    HTTP_PORT                         2020
    Health_Check                      On
    Hot_Reload                        On
    HC_Errors_Count                   100
    HC_Retry_Failure_Count            5
    HC_Period                         60
    storage.path                      /mnt/logs/fluentbit
    storage.sync                      full
    storage.checksum                  off
    scheduler.base                    10
    scheduler.cap                     60

[INPUT]
    Refresh_Interval            5
    Name                        tail
    Path                        ${ENV_LOG_PATH}
    Path_Key                    path
    Tag                         tail
    Read_from_Head              true
    Ignore_Older                2d
    DB                          /mnt/logs/fluentbit/fluentbitpos.db
    Buffer_Max_Size             20M
    Mem_Buf_Limit               200MB
    threaded                    on

[FILTER]
    Name                        parser
    Match                       *
    Key_Name                    path
    Parser                      path_parser
    Preserve_Key                Off
    Reserve_Data                On

[FILTER]
    Name                        Lua
    Match                       *
    Script                      /usr/fluentbit/conf/lua-filters.lua
    call                        set_path

[OUTPUT]
    Name                        forward
    Match                       *
    Host                        fluentbit-collector-svc
    Port                        ${FORWARD_PORT}
    compress                    gzip
    tls                         On
    tls.verify                  On
    tls.min_version             TLSv1.3
    tls.ca_file                 /usr/fluentbit/cert/ca.crt
    tls.key_file                /usr/fluentbit/cert/tls.key
    tls.crt_file                /usr/fluentbit/cert/tls.crt
    Retry_Limit                 10
    net.keepalive_idle_timeout  5
    net.max_worker_connections  5
    net.connect_timeout         60

Forward input plugin side:

[SERVICE]
    Flush                             1
    Grace                             60
    Log_Level                         debug
    Daemon                            off
    HTTP_Server                       On
    HTTP_Listen                       0.0.0.0
    HTTP_PORT                         2020
    Health_Check                      On
    Hot_Reload                        On
    storage.path                      /usr/fluentbit/logs/.buffer
    storage.sync                      normal
    storage.checksum                  off

[INPUT]
    Name                 forward
    Listen               0.0.0.0
    Port                 24224
    tls                  On
    tls.verify           On
    tls.min_version      TLSv1.3
    tls.ca_file          /usr/fluentbit/cert/ca.crt
    tls.key_file         /usr/fluentbit/cert/tls.key
    tls.crt_file         /usr/fluentbit/cert/tls.crt
    Buffer_Chunk_Size    10M
    Buffer_Max_Size      500M
    Mem_Buf_Limit        100MB
    threaded             on

[INPUT]
    Name                 forward
    Listen               0.0.0.0
    Port                 24225
    tls                  On
    tls.verify           On
    tls.min_version      TLSv1.3
    tls.ca_file          /usr/fluentbit/cert/ca.crt
    tls.key_file         /usr/fluentbit/cert/tls.key
    tls.crt_file         /usr/fluentbit/cert/tls.crt
    Buffer_Chunk_Size    10M
    Buffer_Max_Size      500M
    Mem_Buf_Limit        100MB
    threaded             on

[INPUT]
    Name                 forward
    Listen               0.0.0.0
    Port                 24226
    tls                  On
    tls.verify           On
    tls.min_version      TLSv1.3
    tls.ca_file          /usr/fluentbit/cert/ca.crt
    tls.key_file         /usr/fluentbit/cert/tls.key
    tls.crt_file         /usr/fluentbit/cert/tls.crt
    Buffer_Chunk_Size    10M
    Buffer_Max_Size      500M
    Mem_Buf_Limit        100MB
    threaded             on

[INPUT]
    Name                 forward
    Listen               0.0.0.0
    Port                 24227
    tls                  On
    tls.verify           On
    tls.min_version      TLSv1.3
    tls.ca_file          /usr/fluentbit/cert/ca.crt
    tls.key_file         /usr/fluentbit/cert/tls.key
    tls.crt_file         /usr/fluentbit/cert/tls.crt
    Buffer_Chunk_Size    10M
    Buffer_Max_Size      500M
    Mem_Buf_Limit        100MB
    threaded             on

[INPUT]
    Name                 forward
    Listen               0.0.0.0
    Port                 24228
    tls                  On
    tls.verify           On
    tls.min_version      TLSv1.3
    tls.ca_file          /usr/fluentbit/cert/ca.crt
    tls.key_file         /usr/fluentbit/cert/tls.key
    tls.crt_file         /usr/fluentbit/cert/tls.crt
    Buffer_Chunk_Size    10M
    Buffer_Max_Size      500M
    Mem_Buf_Limit        100MB
    threaded             on

[INPUT]
    Name                 forward
    Listen               0.0.0.0
    Port                 24229
    tls                  On
    tls.verify           On
    tls.min_version      TLSv1.3
    tls.ca_file          /usr/fluentbit/cert/ca.crt
    tls.key_file         /usr/fluentbit/cert/tls.key
    tls.crt_file         /usr/fluentbit/cert/tls.crt
    Buffer_Chunk_Size    10M
    Buffer_Max_Size      500M
    Mem_Buf_Limit        100MB
    threaded             on

[OUTPUT]
    name         vxfile
    match        *
    base_path    /usr/fluentbit/logs
    path_var     path
    log_var      log

I tried running valgrind in an actual running environment as well, but nearly all TLS communication failed. I can look into this more if required.

If this is a change to packaging of containers or native binaries, then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@Tangui0232 (Author) commented Apr 29, 2025

I fixed the Windows and CentOS 7 build issues and split the change into two commits based on my reading of https://github.com/fluent/fluent-bit/blob/master/CONTRIBUTING.md#commit-changes.

@cosmo0920 (Contributor) commented:

This commit 6bb63a5 does not have a Signed-off-by line, so one needs to be added there.

Brandon Strub added 2 commits April 30, 2025 18:38
The implementation of flb_gzip_count is flawed as it relies on looking
for valid gzip headers. A gzip payload can be generated that includes
a valid gzip header in the gzip body - see test_header_in_gzip_body.

Removed flb_gzip_count and associated handling in favor of utilizing
mz_inflate to find the boundaries between concatenated gzip payloads
during decompression. mz_inflate will stop when it reaches the end of
the gzip body and mz_stream.avail_in contains the bytes left in the
buffer for processing.

Signed-off-by: Brandon Strub <[email protected]>

Utilize new flb_gzip_uncompress_multi method to support concatenated
gzip payloads.

Signed-off-by: Brandon Strub <[email protected]>
@Tangui0232 force-pushed the fix-concatenated-gzip-payloads-gzip-concatenated branch from f5e9580 to 6ed4a01 on April 30, 2025 at 23:39
@cosmo0920 cosmo0920 changed the title Fix concatenated gzip payloads gzip concatenated gzip: in_forward: Fix concatenated gzip payloads gzip concatenated May 1, 2025