Skip to content

Semaphore not cleaned up on request timeout #2422

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
dark264sh opened this issue May 21, 2025 · 5 comments
Open

Semaphore not cleaned up on request timeout #2422

dark264sh opened this issue May 21, 2025 · 5 comments

Comments

@dark264sh
Copy link

After a ngx.semaphore.wait() call, if the current request is terminated by Nginx due to client_header_timeout or client_body_timeout, the corresponding post() is never executed. As a result, all subsequent attempts to wait() on the same semaphore hang until they time out, because the semaphore count was never restored.

1. Reproduce by client_header_timeout on ssl_certificate_by_lua_block


#user  nobody;
worker_processes  1;
daemon off;
master_process off;

error_log  logs/debug.log  debug;

#pid        logs/nginx.pid;


events {
    worker_connections  1024;
}

http {
    lua_code_cache on;

    include       mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;

    #gzip  on;

    # Shared dictionary for the balancer
    lua_shared_dict my_balancer_dict 1m;

    init_worker_by_lua_block {
        local semaphore = require "ngx.semaphore"
        -- create a semaphore with capacity = 1
        local sema, err = semaphore.new(1)
        if not sema then
            ngx.log(ngx.ERR, "failed to create semaphore: ", err)
            return
        end
        ssl_semaphore = sema
    }

	server {
		listen       8080;
		server_name localhost;

		location / {
			content_by_lua_block {
				ngx.sleep(10)
				ngx.say("Hello, world!")
			}
		}
	}

    server {
        listen       8443 ssl;
		server_name edge.foo.com;

		client_header_timeout 8;

        # your existing certs:
		ssl_certificate     /home/barca/src/d26-sandbox/go/data/server.crt;
		ssl_certificate_key /home/barca/src/d26-sandbox/go/data/server.key;

        ssl_certificate_by_lua_block {
            local sema = ssl_semaphore
            local ok, err = sema:wait(5)
            if not ok then
                ngx.log(ngx.ERR, "SEMA] semaphore wait failed: ", err)
                return ngx.exit(ngx.ERROR)
            end

            ngx.log(ngx.INFO, "SEMA] acquired semaphore, sleeping 30s")

	        local httpc = require "resty.http"
	        local http_client, err = httpc.new()
	        if not http_client then
	            ngx.log(ngx.ERR, "failed to create http client: ", err)
	            sema:post(1)
	            return ngx.exit(ngx.ERROR)
	        end
			--http_client:set_timeouts(10000, 10000, 10000) -- connect, send, read timeouts
	
	        local request_uri_to_call = "http://127.0.0.1:8080/some/path"
	        local res, err = http_client:request_uri(request_uri_to_call, {
	            method = "GET",
	        })
	
	        if not res then
	            ngx.log(ngx.ERR, "failed to make HTTP request to ", request_uri_to_call, ": ", err)
	            return ngx.exit(ngx.ERROR)
	        end
	
	        ngx.log(ngx.INFO, "HTTP request to ", request_uri_to_call, " completed with status: ", res.status)
	        local ok, err = http_client:set_keepalive()
	        if not ok then
	            ngx.log(ngx.ERR, "failed to set keepalive: ", err)
	        end


            local posted, perr = sema:post(1)
            if not posted then
                ngx.log(ngx.ERR, "SEMA] semaphore post failed: ", perr)
            else
                ngx.log(ngx.INFO, "SEMA] released semaphore")
            end
        }

        location / {
			return 200 "Hello, world!";
        }
    }
}

The first request times out in the SSL phase and never calls post().
All subsequent requests hang until their wait() deadlines.

curl -v -o /dev/null --connect-to edge.foo.com:443:127.0.0.1:8443 'https://edge.foo.com/' --insecure
curl -v -o /dev/null --connect-to edge.foo.com:443:127.0.0.1:8443 'https://edge.foo.com/' --insecure
curl -v -o /dev/null --connect-to edge.foo.com:443:127.0.0.1:8443 'https://edge.foo.com/' --insecure

2. Reproduce by client_body_timeout on access_by_lua_block


#user  nobody;
worker_processes  1;
daemon off;
master_process off;

#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;
error_log  logs/debug.log  debug;

#pid        logs/nginx.pid;


events {
    worker_connections  1024;
}

http {
    lua_code_cache on;

    include       mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;

    #gzip  on;

    # Shared dictionary for the balancer
    lua_shared_dict my_balancer_dict 1m;

	init_worker_by_lua_block {
	    local semaphore = require "ngx.semaphore"
	    local sema, err = semaphore.new(1)
	    if not sema then
	        ngx.log(ngx.ERR, "failed to create semaphore: ", err)
	        return
	    end
	    upload_semaphore = sema
	}

    client_body_timeout   5s;

    server {
        listen       8000;
		server_name edge.foo.com;

        # on 408 jump here
        error_page 408 = @body_timeout;

        location /upload {
            access_by_lua_block {
				local sema = upload_semaphore
                local ok, err = sema:wait(10)
                if not ok then
                    ngx.log(ngx.ERR, "SEMA] semaphore wait failed: ", err)
                    return ngx.exit(500)
                end
                ngx.log(ngx.INFO, "SEMA] acquired semaphore, now reading body")

                ngx.req.read_body()
                local data = ngx.req.get_body_data() or ""

                sema:post(1)
                ngx.log(ngx.INFO, "SEMA] released semaphore")

                ngx.say("Got ", #data, " bytes of body")
            }
        }

        location @body_timeout {
            return 408 "Request Timeout: client body not received within 5s\n";
        }
    }
}

The first slow upload triggers client_body_timeout and never calls post()
Subsequent uploads hang at wait()

(   printf 'POST /upload HTTP/1.1\r\n';   printf 'Host: edge.foo.com\r\n';   printf 'Content-Length: 100\r\n';   printf 'Connection: close\r\n';   printf '\r\n';    sleep 10;    yes | head -c 100; ) | nc 127.0.0.1 8000
(   printf 'POST /upload HTTP/1.1\r\n';   printf 'Host: edge.foo.com\r\n';   printf 'Content-Length: 100\r\n';   printf 'Connection: close\r\n';   printf '\r\n';    sleep 3;    yes | head -c 100; ) | nc 127.0.0.1 8000
(   printf 'POST /upload HTTP/1.1\r\n';   printf 'Host: edge.foo.com\r\n';   printf 'Content-Length: 100\r\n';   printf 'Connection: close\r\n';   printf '\r\n';    sleep 3;    yes | head -c 100; ) | nc 127.0.0.1 8000

Proposed Solution
Register a cleanup handler (as done in ngx_http_file_cache or ngx_http_lua_socket_tcp) when acquiring the semaphore. If the request is aborted, this handler should call sema:post() to restore the count.

@zhuizhuhaomeng
Copy link
Contributor

Would you please create a PR for this issue?

@dark264sh
Copy link
Author

Sounds good, I'll create a pull request.

dark264sh added a commit to dark264sh/lua-nginx-module that referenced this issue Jun 5, 2025
After a ngx.semaphore.wait() call, if the current request is terminated by
Nginx due to client_header_timeout or client_body_timeout, the corresponding
post() is never executed. As a result, all subsequent attempts to wait() on
the same semaphore hang until they time out, because the semaphore count was
never restored.

openresty#2422
dark264sh added a commit to dark264sh/lua-resty-core that referenced this issue Jun 5, 2025
After a ngx.semaphore.wait() call, if the current request is terminated by
Nginx due to client_header_timeout or client_body_timeout, the corresponding
post() is never executed. As a result, all subsequent attempts to wait() on
the same semaphore hang until they time out, because the semaphore count was
never restored.

openresty/lua-nginx-module#2422
@dark264sh
Copy link
Author

@zhuizhuhaomeng
I sent the pull requests for this issue.

lua-nginx-module: #2426
lua-resty-core: openresty/lua-resty-core#502

I didn't send for stream-lua-nginx-module because there is a caution comment like this.
I cannot find this file, I don't know how to handle it.

/*
 * !!! DO NOT EDIT DIRECTLY !!!
 * This file was automatically generated from the following template:
 *
 * src/subsys/ngx_subsys_lua_semaphore.c.tt2
 */

@zhuizhuhaomeng
Copy link
Contributor

From https://github.com/openresty/lua-resty-core/blob/master/lib/ngx/semaphore.md#synchronizing-threads-in-different-contexts, there is : "The same applies to different request contexts as long as these requests are served by the same nginx worker process."

Does you change break this?

@dark264sh
Copy link
Author

dark264sh commented Jun 11, 2025

@zhuizhuhaomeng
Good catch!

In this scenario we’re using ngx.timer.at with a closure before acquiring the semaphore.
It seems to work initially, but problems occur when ngx.timer.at runs with a closure after semaphore acquisition.

I’m not sure how to resolve this yet. But this seems similar to this issue.

From https://github.com/openresty/lua-nginx-module?tab=readme-ov-file#ngxtimerat
There is: "You can pass most of the standard Lua values (nils, booleans, numbers, strings, tables, closures, file handles, etc.) into the timer callback, either explicitly as user arguments or implicitly as upvalues for the callback closure. There are several exceptions, however: you cannot pass any thread objects returned by coroutine.create and ngx.thread.spawn or any cosocket objects returned by ngx.socket.tcp, ngx.socket.udp, and ngx.req.socket because these objects' lifetime is bound to the request context creating them while the timer callback is detached from the creating request's context (by design) and runs in its own (fake) request context. If you try to share the thread or cosocket objects across the boundary of the creating request, then you will get the "no co ctx found" error (for threads) or "bad request" (for cosockets). It is fine, however, to create all these objects inside your timer callback."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants