Skip SSLTest.ResponseBodyTerminatedByConnectionClose under
CPPHTTPLIB_MBEDTLS_SUPPORT until the close_notify-mid-response handling
is brought into parity with the OpenSSL and wolfSSL backends. The test
verifies a successful read past the server's close, which mbedTLS
currently reports as an I/O error.
Mark the mbedTLS matrix legs (ubuntu and macos) as
continue-on-error: true. Several timing-sensitive ServerTest cases
(PostMethod2, GetStreamed, Brotli, ...) flake under ASAN+mbedTLS in
ways unrelated to cpp-httplib code; isolating these into a non-blocking
slot keeps master green while the flakiness is investigated separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a libwolfssl entry to lsan_suppressions.txt to mirror the existing
libcrypto rule: the wolfSSL ECC subsystem caches per-handshake buffers
that are only freed at library shutdown, which the test binaries do
not perform. These are not leaks in cpp-httplib code.
Disable fail-fast on the ubuntu / macos / windows matrices so a failure
in one TLS backend does not cancel the others; with the runner now
detecting failures correctly, we want to see the full picture per run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
These tests wrote to a hardcoded "/tmp/" path which does not exist on
Windows, causing the file write to silently fail and the subsequent
make_file_body / make_file_provider call to return zero-sized data.
Use a relative path under the test working directory instead so the
test runs identically on every platform.
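As an illustration only (the helper and file names below are hypothetical and
not taken from test.cc), a write helper that checks the stream state would
surface this kind of failure instead of silently leaving an empty body:

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes `content` to `path` and reports whether the
// write actually succeeded, so a missing directory is not silently ignored.
inline bool write_test_file(const std::string &path,
                            const std::string &content) {
  std::ofstream ofs(path, std::ios::binary);
  if (!ofs) { return false; } // e.g. the parent directory does not exist
  ofs << content;
  ofs.flush();
  return ofs.good();
}
```

A relative path such as "test_body.bin" resolves against the test working
directory on every platform, whereas a hardcoded "/tmp/..." path depends on
that directory existing.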
Also dump the shard log when a shard's process exits non-zero even
when the gtest summary appears clean (e.g. sanitizer report after
the suite, or assertion-based abort) — previously such failures were
detected only via overall rc and showed no diagnostic output.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous logic considered a shard "passed" if its log contained any
[ PASSED ] line, missing the case where some tests pass and some fail
(both [ PASSED ] N tests. and [ FAILED ] M tests, listed below:
appear in the gtest summary). Exit codes from the test binaries were
also ignored.
Now require all three: a [ PASSED ] line, no [ FAILED ] line, and a
zero exit code. Track each shard's PID so wait can surface non-zero
exits.
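The rule can be sketched as a small predicate. This is an illustration in
C++, not the runner's actual code; the function name is hypothetical, and
the marker strings assume gtest's usual padded form ("[  PASSED  ]" and
"[  FAILED  ]"):

```cpp
#include <string>

// Hypothetical helper: a shard passes only if its log has a PASSED summary
// line, has no FAILED summary line, and the process exited with status 0.
inline bool shard_passed(const std::string &log, int exit_code) {
  const bool has_passed = log.find("[  PASSED  ]") != std::string::npos;
  const bool has_failed = log.find("[  FAILED  ]") != std::string::npos;
  return has_passed && !has_failed && exit_code == 0;
}
```

A log containing both summary lines, or a clean log with a non-zero exit
code, is treated as a failure under this rule.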
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Linux/glibc branch of detail::getaddrinfo_with_timeout used
getaddrinfo_a(GAI_NOWAIT) with a stack-local struct gaicb. On the
connection-timeout branch it called gai_cancel(), which is non-blocking
and may return EAI_NOTCANCELED -- in that case the resolver worker
thread is still alive and writes back to ar_result on the now-destroyed
stack frame after the function has already returned.
Drop the entire #elif _GNU_SOURCE && __GLIBC__ branch and let glibc
fall through to the existing std::thread + std::shared_ptr<State>
implementation that the file already uses for other Unix systems. That
path captures shared ownership in the resolver lambda, so the state
outlives the caller's frame whether or not the worker finishes in
time -- no stack frame is ever referenced after return.
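A minimal sketch of that ownership pattern follows. It is not the code in
httplib.h: the struct, the function name, and the generic blocking callable
(standing in for the getaddrinfo call) are assumptions made for illustration.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical state shared between the caller and the worker thread.
struct LookupState {
  std::mutex mtx;
  std::condition_variable cv;
  bool done = false;
  int result = 0;
};

// Runs `blocking_call` on a detached thread and waits up to `timeout`.
// Returns true and fills `result` if the call finished in time.
inline bool run_with_timeout(std::function<int()> blocking_call,
                             std::chrono::milliseconds timeout, int &result) {
  auto state = std::make_shared<LookupState>();

  // The lambda copies the shared_ptr, so the state stays alive until the
  // worker finishes, even if the caller has already returned on timeout.
  std::thread([state, blocking_call] {
    const int r = blocking_call();
    {
      std::lock_guard<std::mutex> lock(state->mtx);
      state->result = r;
      state->done = true;
    }
    state->cv.notify_one();
  }).detach();

  std::unique_lock<std::mutex> lock(state->mtx);
  if (!state->cv.wait_for(lock, timeout, [&] { return state->done; })) {
    return false; // timed out; the worker only ever touches heap state
  }
  result = state->result;
  return true;
}
```

The worker never holds a pointer into the caller's stack frame, which is the
property the stack-local gaicb version lacked.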
The reproducer added in #2433 (issue-2431 repro CI job) goes from
hanging at job teardown to passing in ~25s with this change.
* Add reproducer for #2431 (getaddrinfo_a use-after-free)
On Linux/glibc, getaddrinfo_with_timeout() runs DNS asynchronously via
getaddrinfo_a(GAI_NOWAIT) using a stack-local gaicb. When gai_suspend()
hits the connection timeout, gai_cancel() is called and the function
returns immediately — but gai_cancel() is non-blocking and can return
EAI_NOTCANCELED, leaving the resolver worker thread alive and still
referencing the destroyed stack frame.
Adds three opt-in gtest cases (GetAddrInfoAsyncCancelTest.*) that
exercise the cancel path repeatedly. They are gated on Linux/glibc +
CPPHTTPLIB_USE_NON_BLOCKING_GETADDRINFO at compile time, and on the
CPPHTTPLIB_TEST_ISSUE_2431=1 env var at runtime, so a normal `make
test` run is unaffected.
Also adds a dedicated CI job (issue-2431-repro) and a Docker-based
local runner (test/run_issue_2431_repro.sh) that sinkhole UDP/53 so
the timeout branch is taken, and run the test under ASAN/LSAN. With
the bug present these runs are expected to fail; with a fix applied
they should pass.
Refs: https://github.com/yhirose/cpp-httplib/issues/2431
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix split build for #2431 reproducer tests
The new GetAddrInfoAsyncCancelTest cases call detail::getaddrinfo_with_timeout
directly. In split builds (make test_split) split.py moves the definition into
httplib.cc and strips `inline`, so the symbol is not declared in the public
httplib.h and test.cc fails to compile -- breaking the ubuntu/test-no-exceptions
CI jobs that the PR description says should be unaffected.
Add a forward declaration in test.cc, gated by the same #if as the tests
themselves, so it links against the split-build symbol without changing the
header-only build.
* Cap issue-2431 repro job at 5 minutes
The bug manifests as orphan getaddrinfo_a resolver workers that keep the
runner from completing job teardown -- the previous run had all steps
succeed in ~1m37s but then hung in "Cleaning up orphan processes" for
~57m before GitHub force-killed the job.
A job-level timeout-minutes makes the failure signal fast and predictable:
bug present -> killed at 5 min, bug fixed -> ~2 min pass. A step-level timeout
isn't enough since the hang is in post-job cleanup, not the test step.
* Enable ASAN detect_stack_use_after_return for #2431 repro
The bug is a textbook stack-use-after-return: a stack-local struct gaicb
is destroyed when getaddrinfo_with_timeout returns after gai_cancel()
yields EAI_NOTCANCELED, then the still-live resolver worker thread writes
back into the freed frame. ASAN's detect_stack_use_after_return is the
direct detector for exactly this pattern -- enabling it lets the failure
surface as a clear ASAN diagnostic during the test run instead of as an
orphan-process hang at job teardown.
* Revert ASAN detect_stack_use_after_return for #2431 repro
The option did not detect the bug in CI -- the resolver worker write
likely lands on the heap (via the gaicb's pai pointer) or happens after
the test process exits, neither of which stack-use-after-return can
catch. Roll back to relying on the job-level timeout: bug present ->
post-cleanup hangs ~8min then job-level timeout cancels at 10min total;
bug fixed -> job completes in ~2min.
* Switch issue-2431 repro to a delayed loopback DNS test fixture
The previous repro setup dropped UDP/53 outright, which made glibc's
resolver hang forever on every lookup -- the worker never actually
received a response and so never reached the buggy write-back path
that #2431 is about. As a result, neither the broken HEAD nor the
fix made any visible difference in CI: both produced "tests pass +
post-cleanup hangs ~10min" because the orphan resolver thread is a
structural property of *any* getaddrinfo path on a hung resolver,
not a property of the bug.
Replace the sinkhole with a small loopback test fixture
(test/dns_test_fixture.py, ~50 lines, stdlib only) that answers DNS
queries after a 3s delay -- longer than the test's 1s timeout. An
iptables NAT rule routes the test job's lookups to the fixture
without touching /etc/resolv.conf, so the rest of the runner's DNS
behaviour is unaffected.
With ASAN's detect_stack_use_after_return enabled, the worker's
late write-back into the destroyed gaicb stack frame is now caught
as a stack-use-after-return diagnostic, so the broken HEAD fails
fast at the test step (clear red) and the fix turns the same job
green in well under a minute.
The same fixture is wired into both the GitHub Actions job and the
docker-based test/run_issue_2431_repro.sh script, so local repro on
macOS and CI repro on Linux exercise the identical path.
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix #2427
* Use setarch -R on Linux to fix ASAN crash on WSL2
WSL2 uses high-entropy ASLR, which conflicts with ASAN's shadow memory
requirements, causing the ASAN runtime to crash at startup. Running tests
via setarch -R (ADDR_NO_RANDOMIZE) disables ASLR for the test process,
allowing ASAN to initialize correctly.
Previously, dynamic threads exited as soon as their current task
completed and the queue was empty. This caused excessive thread
creation/destruction under bursty or long-lived workloads (e.g., SSE
streaming), degrading tail latency. Now dynamic threads loop back and
wait for CPPHTTPLIB_THREAD_POOL_IDLE_TIMEOUT (3s) before exiting,
allowing them to be reused for subsequent tasks.
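The idle-wait loop can be sketched roughly as below. This is a simplified
illustration, not the ThreadPool implementation in httplib.h; the TaskQueue
struct and dynamic_worker function are hypothetical names, and the timeout
is passed in rather than read from CPPHTTPLIB_THREAD_POOL_IDLE_TIMEOUT.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>

// Hypothetical queue shared by the workers.
struct TaskQueue {
  std::mutex mtx;
  std::condition_variable cv;
  std::deque<std::function<void()>> tasks;
  bool shutdown = false;
};

// A dynamic worker keeps taking tasks and exits only after the queue has
// stayed empty for `idle_timeout` (or on shutdown with nothing left to run).
// Returns the number of tasks it executed.
inline int dynamic_worker(TaskQueue &q,
                          std::chrono::milliseconds idle_timeout) {
  int executed = 0;
  for (;;) {
    std::function<void()> task;
    {
      std::unique_lock<std::mutex> lock(q.mtx);
      const bool woke = q.cv.wait_for(lock, idle_timeout, [&] {
        return q.shutdown || !q.tasks.empty();
      });
      if (!woke) { break; }          // idle for the whole timeout
      if (q.tasks.empty()) { break; } // shutdown requested, queue drained
      task = std::move(q.tasks.front());
      q.tasks.pop_front();
    }
    task(); // run outside the lock so other workers can dequeue
    ++executed;
  }
  return executed;
}
```

Compared with exiting as soon as the queue is empty, the worker here is still
available for a task that arrives shortly after the previous one completes.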
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Added `set_websocket_max_missed_pongs` method to configure unresponsive-peer detection.
- Updated README and documentation to clarify WebSocket limitations and features.
- Introduced tests for detecting non-responsive peers and ensuring responsive peers do not trigger timeouts.
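One way such a missed-pong limit could work is sketched below. This is an
assumption about the general technique, not the library's implementation:
the PongTracker type and its methods are hypothetical, and the exact point
at which cpp-httplib counts a pong as missed may differ.

```cpp
// Hypothetical tracker: counts pings that have not been answered by a pong.
struct PongTracker {
  int max_missed_pongs;
  int missed = 0;

  explicit PongTracker(int max_missed) : max_missed_pongs(max_missed) {}

  // Called when the next ping is due. Returns false once the peer has left
  // `max_missed_pongs` consecutive pings unanswered; otherwise records the
  // new ping as outstanding and returns true.
  bool on_ping_due() {
    if (missed >= max_missed_pongs) { return false; }
    ++missed;
    return true;
  }

  // Called when a pong arrives; a responsive peer resets the counter.
  void on_pong() { missed = 0; }
};
```

With a limit of 2, a silent peer is reported as unresponsive when the third
ping comes due, while a peer that answers every ping never hits the limit.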
The server listens on AF_INET6 only (::1), so the test fails:
[ RUN ] WebSocketIntegrationTest.SocketSettings
test/test.cc:17160: Failure
Value of: client.connect()
Actual: false
Expected: true
Fixes #2419.
Co-authored-by: Jiri Slaby <jslaby@suse.cz>
Add support for a pre-existing `zstd::libzstd` target, which is useful for
projects that bundle their own zstd in a way that doesn't get caught by
`CONFIG`.
Signed-off-by: crueter <crueter@eden-emu.dev>