1905 Commits

Author SHA1 Message Date
yhirose
600d220c84 Release v0.43.4 v0.43.4 2026-05-09 21:29:23 +09:00
yhirose
87d62db46b Reject malformed chunk-size in chunked decoder
strtoul silently accepts a leading "-" and wraps via unsigned
arithmetic, so chunk-size "-2" produced ULONG_MAX-1, bypassing the
ULONG_MAX guard and letting a client drive the server toward unbounded
allocation.

Replace strtoul with a manual hex parser that requires at least one hex
digit, detects size_t overflow per digit, and accepts only chunk-ext or
end-of-line after the digits (RFC 9112 §7.1).
2026-05-09 16:52:32 +09:00
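
The manual hex parser described in this commit could look roughly like the sketch below. The name parse_chunk_size and its signature are hypothetical, not the library's actual code. As a simplification, the sketch treats only a ';' directly after the digits as the start of chunk-ext and does not handle optional whitespace before it.

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical helper (not the library's function): parses the chunk-size
// at the start of a chunk header line whose trailing CRLF is already removed.
inline bool parse_chunk_size(const std::string &line, std::size_t &out) {
  const std::size_t max = (std::numeric_limits<std::size_t>::max)();
  std::size_t value = 0;
  std::size_t i = 0;

  for (; i < line.size(); ++i) {
    const char c = line[i];
    unsigned digit = 0;
    if (c >= '0' && c <= '9') {
      digit = static_cast<unsigned>(c - '0');
    } else if (c >= 'a' && c <= 'f') {
      digit = static_cast<unsigned>(c - 'a') + 10;
    } else if (c >= 'A' && c <= 'F') {
      digit = static_cast<unsigned>(c - 'A') + 10;
    } else {
      break; // first non-hex character ends the number
    }
    // Per-digit overflow check: value * 16 + digit must fit in size_t.
    if (value > (max - digit) / 16) { return false; }
    value = value * 16 + digit;
  }

  // At least one hex digit is required; this rejects "", "-2" and "+5".
  if (i == 0) { return false; }

  // Only end-of-line or the start of a chunk-ext may follow the digits.
  if (i != line.size() && line[i] != ';') { return false; }

  out = value;
  return true;
}
```

Unlike strtoul, this never sees a sign character as part of the number, so "-2" fails on the at-least-one-digit check instead of wrapping around.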
yhirose
a1fdc07f34 Guard nullptr res in KeepAliveTest proxy template (#2443)
When the upstream request to httpbingo.org transiently fails, cli.Get()
returns nullptr and the next line dereferences it (res->status / res->body),
producing a SEGV in std::string::begin() under ASan. Sibling templates in
the same file already use ASSERT_TRUE(res != nullptr); apply the same
guard to the four Get() call sites in KeepAliveTest so a flaky network
turns into a clean test failure instead of a crash.
2026-05-06 08:36:38 -04:00
yhirose
eb49a304b6 Use vswhere to locate VS install in 32-bit Windows CI (#2442)
The hosted windows-latest runner is migrating from VS 2022 to VS 2026
(NOTICE: windows-2025 -> windows-2025-vs2026 by 2026-05-12). The
hardcoded path C:\Program Files\Microsoft Visual Studio\2022\Enterprise
no longer exists on the new image, so vcvarsall.bat silently fails and
'cl' is not on PATH.

Resolve the install path via vswhere.exe (stable location, version
agnostic) and exit if vcvarsall.bat fails so future breakage surfaces
immediately instead of as a confusing 'cl not recognized' error.
2026-05-06 08:25:56 -04:00
yhirose
a9bfe5914b Fix #2441 2026-05-06 18:44:14 +09:00
yhirose
ec5ce17929 Release v0.43.3 v0.43.3 2026-05-04 16:19:49 +09:00
yhirose
f6524c0802 Drop Str2tagTest unit test that broke split / no-exceptions builds
The test referenced detail::can_compress_content_type, which lives below
the split BORDER in httplib.h and is therefore not visible to test.cc in
test_split / Windows-CMake builds. EXPECT_NO_THROW also expanded to a
try/catch that would not compile under -fno-exceptions. The OSS-Fuzz
reproducer in test/fuzzing/corpus already serves as the regression test
for #508087118 and is exercised by make fuzz_test.
2026-05-01 22:20:41 +09:00
yhirose
35c4026c7f Make fuzz_test robust to missing corpus files
When a glob like clusterfuzz-testcase-minimized-foo_fuzzer-* did not
match anything, bash passed the literal pattern through. The standalone
runner then tried to open it, tellg() returned -1, and the resulting
size_t cast (SIZE_MAX) crashed std::vector with length_error. This made
fuzz_test fail loudly during bisects to commits before a corpus file
landed. Filter each glob through a -f test so unmatched patterns are
silently skipped with a "(no XXX corpus)" notice, mirroring what was
already done for url_parser_fuzzer.
2026-05-01 21:50:26 +09:00
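
The commit fixes this on the shell side by filtering the globs. The failure mode itself (tellg() returning -1 and being cast to size_t) can also be guarded in C++, as in the sketch below. read_file is a hypothetical helper, not the standalone runner's actual code.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file into `out`. Returns false instead
// of throwing when the file cannot be opened or its size cannot be queried.
inline bool read_file(const std::string &path, std::vector<char> &out) {
  std::ifstream f(path, std::ios::binary | std::ios::ate);
  if (!f) { return false; } // e.g. an unmatched glob passed through literally

  const std::streamoff size = f.tellg();
  if (size < 0) { return false; } // tellg() reports failure as -1

  out.resize(static_cast<std::size_t>(size));
  f.seekg(0, std::ios::beg);
  if (size > 0 && !f.read(out.data(), size)) { return false; }
  return true;
}
```

Checking the sign before the cast keeps a failed tellg() from turning into a SIZE_MAX-sized allocation request.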
yhirose
40e18460bc Document str2tag_core's compile-time-only role 2026-05-01 21:46:13 +09:00
yhirose
92aecf85d8 Fix OSS-Fuzz #508087118: avoid stack overflow in str2tag
str2tag_core is recursive (one frame per character), so a long runtime
input such as a fuzzer-supplied Content-Type would overflow the stack.
Rewrite the runtime entry point str2tag() iteratively while keeping the
recursive constexpr str2tag_core for compile-time UDL evaluation. The
hash output is unchanged for all inputs.
2026-05-01 21:39:46 +09:00
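
The split between a recursive constexpr core and an iterative runtime entry point might be sketched as follows. The names tag_core and tag, and the exact hash constants, are assumptions for illustration and are not confirmed against httplib.h. What the sketch shows is that both forms compute the same value while only the recursive one uses a call frame per character.

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Recursive form: a single return statement, so it is usable in C++11
// constant expressions. Depth grows with the input length.
constexpr unsigned int tag_core(const char *s, std::size_t l, unsigned int h) {
  return l == 0
             ? h
             : tag_core(s + 1, l - 1,
                        (((std::numeric_limits<unsigned int>::max)() >> 6) &
                         (h * 33)) ^
                            static_cast<unsigned char>(*s));
}

// Iterative form for runtime input: same arithmetic, constant stack usage.
inline unsigned int tag(const std::string &s) {
  unsigned int h = 0;
  for (unsigned char c : s) {
    h = (((std::numeric_limits<unsigned int>::max)() >> 6) & (h * 33)) ^ c;
  }
  return h;
}
```

Short string literals can keep going through tag_core at compile time, while untrusted runtime strings such as a header value go through tag.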
yhirose
b223e29778 Add OSS-Fuzz #508370122 reproducer to client_fuzzer corpus
Same root cause as #508342856 (fixed in 2d2efe4): an oversized
Content-Length value (here 4467440718547775) caused res.body.reserve()
to attempt a multi-petabyte allocation. The UBSAN fuzzer job surfaced
it as a std::bad_alloc-driven abort, while the ASAN job for #508342856
reported it as allocation-size-too-big. The payload_max_length_ cap
introduced in 2d2efe4 already addresses both.
2026-05-01 21:34:03 +09:00
yhirose
2d2efe46da Fix OSS-Fuzz #508342856: cap Content-Length reservation by payload_max_length_
A malicious or malformed server response with an enormous Content-Length
header (e.g. 20000000000) caused the client to call res.body.reserve(len)
with the untrusted value, triggering OOM before read_content's
payload_max_length_ check could take effect. Cap the pre-reservation
at payload_max_length_, since reading more than that is never useful.
2026-05-01 21:28:57 +09:00
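
A minimal sketch of capping the pre-reservation, assuming a free function and parameter names that are hypothetical (the library does this inside its client code, not through a helper like this):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: reserve space for a response body without trusting
// the Content-Length value beyond the configured payload limit.
inline void reserve_body(std::string &body, std::uint64_t content_length,
                         std::size_t payload_max_length) {
  const std::uint64_t capped =
      (std::min)(content_length,
                 static_cast<std::uint64_t>(payload_max_length));
  // capped <= payload_max_length, so the narrowing cast cannot truncate.
  body.reserve(static_cast<std::size_t>(capped));
}
```

With a limit of 1024 bytes, a Content-Length of 20000000000 reserves only about 1024 bytes, and the later length check while reading still decides whether the response is accepted.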
yhirose
cae753425e Run all fuzzers via make fuzz_test 2026-05-01 21:28:45 +09:00
yhirose
d412e98c62 Release v0.43.2 v0.43.2 2026-04-30 17:47:53 +09:00
yhirose
806fcb8268 Re-enable getaddrinfo_a with worker-completion wait (#2431) (#2439)
* Restore getaddrinfo_a path with proper worker-completion wait (#2431)

5ebbfee dropped the Linux/glibc getaddrinfo_a branch entirely to avoid
the stack-use-after-free reported in #2431. That sidestepped the bug
but lost the asynchronous-resolution capability getaddrinfo_a is meant
to provide.

Bring the getaddrinfo_a branch back with the actual fix on the
cancellation path: after gai_cancel() — which is non-blocking and may
return EAI_NOTCANCELED while the resolver worker is still mid-operation
— call gai_suspend() with no timeout in a loop until gai_error() stops
returning EAI_INPROGRESS. Only then is it safe to destroy the
stack-local gaicb. freeaddrinfo() is also called on any partially
populated ar_result so that error paths do not leak.

This is the approach suggested in the issue body, with gai_suspend
substituted for the busy-poll over gai_error.

The issue-2431 reproducer test (run under ASAN with sinkhole DNS) is
unchanged and continues to drive the cancel path; it now exercises the
restored getaddrinfo_a code rather than the std::thread fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Simplify getaddrinfo_a branch (idiomatic init, scope_exit, fewer comments)

- Value-initialize gaicb / sigevent / timespec with {} instead of memset
- Replace the two manual freeaddrinfo calls with a scope_exit guard, with
  request.ar_result reset to nullptr on the success path to release
  ownership to the caller (matches the addrinfo cleanup pattern used in
  detail::create_socket and friends)
- Inline the single-call wait_for_request_done lambda
- Drop the (const struct gaicb *const *) cast — the array decays without
  it under C++11
- Tighten the leading comment to the one load-bearing fact (#2431) and
  the trade-off about pathological DNS waits; remove a stale claim that
  the inner loop handles EAI_INTR (the loop checks gai_error, not the
  gai_suspend return value)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:03:37 +09:00
yhirose
c2678f0186 Fix #2435: allow mmap to open files held open for writing (#2438)
* Add test for #2435 mmap::open with concurrent writer

Verifies that detail::mmap can open a file held open with GENERIC_WRITE
by another handle (e.g. an active log file). Currently fails on Windows
because CreateFile2 omits FILE_SHARE_WRITE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix #2435: allow mmap to open files held open for writing

Add FILE_SHARE_WRITE to the share mode passed to ::CreateFile2 so
detail::mmap can open a file even when another process holds it open
with GENERIC_WRITE (e.g. an active log file). Without this, CreateFile2
fails with ERROR_SHARING_VIOLATION because the new opener's share mode
must permit the existing handle's access mode.

This brings the Windows path's behavior in line with the POSIX path
which uses ::open(O_RDONLY) and is unaffected by other processes'
write handles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:42:38 +09:00
DavidKorczynski
0cbeafe6a4 Add client fuzzing harness (#2437)
Cover client request processing logic. The goal is to enable this
running on OSS-Fuzz.

Signed-off-by: David Korczynski <david@adalogics.com>
2026-04-29 11:05:29 +09:00
yhirose
13e866bdb0 Use SHARDS=1 for macOS mbedTLS to stop residual flakiness
The macos-latest runner is consistently slower than ubuntu-latest for
the ASAN+mbedTLS test binary, and SHARDS=2 still flakes there on the
ServerTest fixture's rapid bind/connect cycle against a fixed port.
Serialize fully (SHARDS=1) on macOS only; ubuntu mbedTLS stays at 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:01:10 +09:00
yhirose
db6c9ef27b Drop mbedTLS continue-on-error now that the matrix is stable
With the close_notify mid-response fix and SHARDS=2 mitigation, the
mbedTLS legs run reliably on both ubuntu and macos. Drop the
continue-on-error escape hatch so future regressions actually break the
build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:38:45 +09:00
yhirose
887837c65b Run mbedTLS test shards with SHARDS=2 to reduce flakiness
Under ASAN+mbedTLS, the default 4-way sharding loads CI runners enough
that timing-sensitive ServerTest cases (Delete, PostMethod2, GetStreamed,
...) flake on what looks like first-request keep-alive reuse. Reducing
to 2 shards halves contention and historically stabilizes these on local
runs. The total test time goes up roughly 1.5x (still well under the job
budget) which is an acceptable trade for reliability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:27:18 +09:00
yhirose
3d56762d5c Fix mbedTLS close_notify mid-response handling
The mbedTLS backend's read() returned -1 with err.code = PeerClosed when
the peer sent close_notify, while OpenSSL and wolfSSL surface it as 0
(clean EOF). The result was that an SSL response without Content-Length
or chunked Transfer-Encoding — terminated by connection close — was
reported as "Failed to read connection" on mbedTLS, even though the
body had been fully delivered.

Translate PeerClosed into a return value of 0 to match the other
backends. This re-enables SSLTest.ResponseBodyTerminatedByConnectionClose
on mbedTLS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:04:10 +09:00
yhirose
109e331068 Exclude *_Online tests from default CI runs
These tests reach out to external services (httpbin, YouTube, ...) and
flake on CI runners whenever those services are slow or unreachable.
The previous shard runner script silently masked these failures; now
that runs report them faithfully, default the filter to -*_Online.

Override via workflow_dispatch + the gtest_filter input to include
them when explicitly desired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:41:39 +09:00
yhirose
2ea632264d Skip mbedTLS-specific SSL test; allow flaky mbedTLS jobs
Skip SSLTest.ResponseBodyTerminatedByConnectionClose under
CPPHTTPLIB_MBEDTLS_SUPPORT until the close_notify-mid-response handling
is brought into parity with the OpenSSL and wolfSSL backends. The test
verifies a successful read past the server's close, which mbedTLS
currently reports as an I/O error.

Mark the mbedTLS matrix legs (ubuntu and macos) as
continue-on-error: true. Several timing-sensitive ServerTest cases
(PostMethod2, GetStreamed, Brotli, ...) flake under ASAN+mbedTLS in
ways unrelated to cpp-httplib code; isolating these into a non-blocking
slot keeps master green while the flakiness is investigated separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:30:36 +09:00
yhirose
511cc02278 Suppress wolfSSL library leaks; remove fail-fast from test matrix
Add a libwolfssl entry to lsan_suppressions.txt to mirror the existing
libcrypto rule: the wolfSSL ECC subsystem caches per-handshake buffers
that are only freed at library shutdown, which the test binaries do
not perform. These are not leaks in cpp-httplib code.

Disable fail-fast on the ubuntu / macos / windows matrices so a failure
in one TLS backend does not cancel the others; with the runner now
detecting failures correctly, we want to see the full picture per run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:55:09 +09:00
yhirose
f50bd311fb Fix MakeFileBody/MakeFileProvider tests on Windows
These tests wrote to a hardcoded "/tmp/" path which does not exist on
Windows, causing the file write to silently fail and the subsequent
make_file_body / make_file_provider call to return zero-sized data.
Use a relative path under the test working directory instead so the
test runs identically on every platform.

Also dump the shard log when a shard's process exits non-zero even
when the gtest summary appears clean (e.g. sanitizer report after
the suite, or assertion-based abort) — previously such failures were
detected only via overall rc and showed no diagnostic output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:33:51 +09:00
yhirose
b0866cff8f Detect failing tests in parallel shard runner
The previous logic considered a shard "passed" if its log contained any
[  PASSED  ] line, missing the case where some tests pass and some fail
(both [  PASSED  ] N tests. and [  FAILED  ] M tests, listed below:
appear in the gtest summary). Exit codes from the test binaries were
also ignored.

Now require all three: a [  PASSED  ] line, no [  FAILED  ] line, and a
zero exit code. Track each shard's PID so wait can surface non-zero
exits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:03:37 +09:00
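
The pass/fail rule described here is small enough to state as a predicate. The runner itself is a script. The C++ function below is only an illustration of the rule, with a hypothetical name.

```cpp
#include <string>

// Hypothetical predicate: a shard counts as passed only when the gtest
// summary has a PASSED line, has no FAILED line, and the process exited 0.
inline bool shard_passed(const std::string &log, int exit_code) {
  const bool has_passed = log.find("[  PASSED  ]") != std::string::npos;
  const bool has_failed = log.find("[  FAILED  ]") != std::string::npos;
  return has_passed && !has_failed && exit_code == 0;
}
```

The earlier logic corresponds to returning has_passed alone, which is why a mixed summary or a sanitizer abort after the suite went unnoticed.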
yhirose
5ebbfeef0b Fix #2431: drop getaddrinfo_a path to eliminate stack-use-after-free (#2436)
The Linux/glibc branch of detail::getaddrinfo_with_timeout used
getaddrinfo_a(GAI_NOWAIT) with a stack-local struct gaicb. On the
connection-timeout branch it called gai_cancel(), which is non-blocking
and may return EAI_NOTCANCELED -- in that case the resolver worker
thread is still alive and writes back to ar_result on the now-destroyed
stack frame after the function has already returned.

Drop the entire #elif _GNU_SOURCE && __GLIBC__ branch and let glibc
fall through to the existing std::thread + std::shared_ptr<State>
implementation that the file already uses for other Unix systems. That
path captures shared ownership in the resolver lambda, so the state
outlives the caller's frame whether or not the worker finishes in
time -- no stack frame is ever referenced after return.

The reproducer added in #2433 (issue-2431 repro CI job) goes from
hanging at job teardown to passing in ~25s with this change.
2026-04-28 18:34:14 +09:00
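
The shared-ownership pattern this commit falls back to can be sketched generically as below. State, run_with_timeout, and the int result are hypothetical stand-ins. The real code resolves addresses rather than returning an int.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>

// Heap-allocated state shared by the caller and the worker thread.
struct State {
  std::mutex mtx;
  std::condition_variable cv;
  bool done = false;
  int result = 0;
};

// Runs `work` on a detached thread and waits up to `timeout` for it.
// Returns false on timeout; the worker may still finish later, but it only
// ever touches the shared State, never the caller's stack frame.
inline bool run_with_timeout(std::function<int()> work,
                             std::chrono::milliseconds timeout, int &result) {
  auto state = std::make_shared<State>();

  std::thread([state, work]() {
    const int r = work();
    {
      std::lock_guard<std::mutex> lock(state->mtx);
      state->result = r;
      state->done = true;
    }
    state->cv.notify_one();
  }).detach();

  std::unique_lock<std::mutex> lock(state->mtx);
  if (!state->cv.wait_for(lock, timeout, [&] { return state->done; })) {
    return false; // the worker's copy of `state` keeps the object alive
  }
  result = state->result;
  return true;
}
```

Because the lambda captures the shared_ptr by value, the State object is destroyed only after both the caller and the worker have let go of it, whichever finishes last.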
yhirose
d14e4fc05f Reproducer test for #2431 (getaddrinfo_a use-after-free) (#2433)
* Add reproducer for #2431 (getaddrinfo_a use-after-free)

On Linux/glibc, getaddrinfo_with_timeout() runs DNS asynchronously via
getaddrinfo_a(GAI_NOWAIT) using a stack-local gaicb. When gai_suspend()
hits the connection timeout, gai_cancel() is called and the function
returns immediately — but gai_cancel() is non-blocking and can return
EAI_NOTCANCELED, leaving the resolver worker thread alive and still
referencing the destroyed stack frame.

Adds three opt-in gtest cases (GetAddrInfoAsyncCancelTest.*) that
exercise the cancel path repeatedly. They are gated on Linux/glibc +
CPPHTTPLIB_USE_NON_BLOCKING_GETADDRINFO at compile time, and on the
CPPHTTPLIB_TEST_ISSUE_2431=1 env var at runtime, so a normal `make
test` run is unaffected.

Also adds a dedicated CI job (issue-2431-repro) and a Docker-based
local runner (test/run_issue_2431_repro.sh) that sinkhole UDP/53 so
the timeout branch is taken, and run the test under ASAN/LSAN. With
the bug present these runs are expected to fail; with a fix applied
they should pass.

Refs: https://github.com/yhirose/cpp-httplib/issues/2431

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix split build for #2431 reproducer tests

The new GetAddrInfoAsyncCancelTest cases call detail::getaddrinfo_with_timeout
directly. In split builds (make test_split) split.py moves the definition into
httplib.cc and strips `inline`, so the symbol is not declared in the public
httplib.h and test.cc fails to compile -- breaking the ubuntu/test-no-exceptions
CI jobs that the PR description says should be unaffected.

Add a forward declaration in test.cc, gated by the same #if as the tests
themselves, so it links against the split-build symbol without changing the
header-only build.

* Cap issue-2431 repro job at 5 minutes

The bug manifests as orphan getaddrinfo_a resolver workers that keep the
runner from completing job teardown -- the previous run had all steps
succeed in ~1m37s but then hung in "Cleaning up orphan processes" for
~57m before GitHub force-killed the job.

A job-level timeout-minutes makes the failure signal fast and predictable:
bug present -> killed at 5 min, bug fixed -> ~2 min pass. Step-level timeout
isn't enough since the hang is in post-job cleanup, not the test step.

* Enable ASAN detect_stack_use_after_return for #2431 repro

The bug is a textbook stack-use-after-return: a stack-local struct gaicb
is destroyed when getaddrinfo_with_timeout returns after gai_cancel()
yields EAI_NOTCANCELED, then the still-live resolver worker thread writes
back into the freed frame. ASAN's detect_stack_use_after_return is the
direct detector for exactly this pattern -- enabling it lets the failure
surface as a clear ASAN diagnostic during the test run instead of as an
orphan-process hang at job teardown.

* Revert ASAN detect_stack_use_after_return for #2431 repro

The option did not detect the bug in CI -- the resolver worker write
likely lands on the heap (via the gaicb's pai pointer) or happens after
the test process exits, neither of which stack-use-after-return can
catch. Roll back to relying on the job-level timeout: bug present ->
post-cleanup hangs ~8min then job-level timeout cancels at 10min total;
bug fixed -> job completes in ~2min.

* Switch issue-2431 repro to a delayed loopback DNS test fixture

The previous repro setup dropped UDP/53 outright, which made glibc's
resolver hang forever on every lookup -- the worker never actually
received a response and so never reached the buggy write-back path
that #2431 is about. As a result, neither the broken HEAD nor the
fix made any visible difference in CI: both produced "tests pass +
post-cleanup hangs ~10min" because the orphan resolver thread is a
structural property of *any* getaddrinfo path on a hung resolver,
not a property of the bug.

Replace the sinkhole with a small loopback test fixture
(test/dns_test_fixture.py, ~50 lines, stdlib only) that answers DNS
queries after a 3s delay -- longer than the test's 1s timeout. An
iptables NAT rule routes the test job's lookups to the fixture
without touching /etc/resolv.conf, so the rest of the runner's DNS
behaviour is unaffected.

With ASAN's detect_stack_use_after_return enabled, the worker's
late write-back into the destroyed gaicb stack frame is now caught
as a stack-use-after-return diagnostic, so the broken HEAD fails
fast at the test step (clear red) and the fix turns the same job
green in well under a minute.

Same fixture is wired into both the GitHub Actions job and the
docker-based test/run_issue_2431_repro.sh script, so local repro on
macOS and CI repro on Linux exercise the identical path.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 18:17:19 +09:00
yhirose
33bc1df930 Release v0.43.1 v0.43.1 2026-04-20 01:48:27 -04:00
yhirose
02d3825149 Fix Windows build error 2026-04-20 01:39:51 -04:00
yhirose
9f41fc0447 Release v0.43.0 v0.43.0 2026-04-19 20:18:52 -04:00
yhirose
3cedf31d4c Fix #2427 (#2428)
* Fix #2427

* Use setarch -R on Linux to fix ASAN crash on WSL2

WSL2 uses high-entropy ASLR which conflicts with ASAN's shadow memory
requirements, causing the ASAN runtime to crash at startup. Running tests
via setarch -R (ADDR_NO_RANDOMIZE) disables ASLR for the test process,
allowing ASAN to initialize correctly.
2026-04-13 23:19:31 -04:00
yhirose
cc8f270d4b Fix test style for ResponseBodyTerminatedByConnectionClose
Use HOST/PORT constants and scope_exit cleanup pattern
to match the rest of the SSL test suite.
2026-04-13 20:41:56 -04:00
Kukodam
9f52821be6 fix #2429 (#2430) 2026-04-13 20:32:04 -04:00
yhirose
b045ee7f6b Fix #2424 2026-04-12 17:31:32 -04:00
Andrea Pappacoda
cb3fce964d fix: cast len to 64 bits before right shift in ws (#2426)
Fixes WebSocketIntegrationTest.LargeMessage and
WebSocketIntegrationTest.MaxPayloadAtLimit on i386
2026-04-12 17:27:16 -04:00
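
On i386, size_t is 32 bits wide, and shifting a 32-bit value right by 32 or more bits is undefined behavior in C++. A sketch of writing a 64-bit big-endian length with the widening cast done first follows. The function name is hypothetical, and where exactly the library applies the cast is not shown in this commit message.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: encodes `len` as 8 big-endian bytes, as used for a
// 64-bit extended payload length.
inline std::array<unsigned char, 8> encode_len64(std::size_t len) {
  // Widen first so that every shift below is on a 64-bit operand, even
  // when size_t is only 32 bits.
  const std::uint64_t len64 = static_cast<std::uint64_t>(len);
  std::array<unsigned char, 8> out{};
  for (int i = 0; i < 8; ++i) {
    out[static_cast<std::size_t>(i)] =
        static_cast<unsigned char>((len64 >> (56 - 8 * i)) & 0xFF);
  }
  return out;
}
```

For len = 70000 (0x011170) the last three bytes are 0x01, 0x11, 0x70 and the first five are zero, on both 32-bit and 64-bit targets.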
yhirose
7e2a173072 Fix #2425 2026-04-12 17:25:41 -04:00
yhirose
ee5d15c842 Let dynamic threads wait for work instead of exiting immediately
Previously, dynamic threads exited as soon as their current task
completed and the queue was empty. This caused excessive thread
creation/destruction under bursty or long-lived workloads (e.g., SSE
streaming), degrading tail latency. Now dynamic threads loop back and
wait for CPPHTTPLIB_THREAD_POOL_IDLE_TIMEOUT (3s) before exiting,
allowing them to be reused for subsequent tasks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 11:50:38 -04:00
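
The wait-before-exit behavior can be sketched with a timed condition-variable wait. TaskQueue, next_job, and dynamic_worker are hypothetical names, and the sketch leaves out the fixed threads and shutdown handling that a real pool needs.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical minimal job queue shared by worker threads.
struct TaskQueue {
  std::mutex mtx;
  std::condition_variable cv;
  std::deque<std::function<void()>> jobs;
};

// Waits up to `idle_timeout` for a job. Returns false if the queue stayed
// empty for the whole interval, which is the signal for the thread to exit.
inline bool next_job(TaskQueue &q, std::chrono::milliseconds idle_timeout,
                     std::function<void()> &job) {
  std::unique_lock<std::mutex> lock(q.mtx);
  if (!q.cv.wait_for(lock, idle_timeout, [&] { return !q.jobs.empty(); })) {
    return false;
  }
  job = std::move(q.jobs.front());
  q.jobs.pop_front();
  return true;
}

// Body of a dynamic thread: keep taking jobs and leave only after idling.
inline void dynamic_worker(TaskQueue &q,
                           std::chrono::milliseconds idle_timeout) {
  std::function<void()> job;
  while (next_job(q, idle_timeout, job)) { job(); }
}
```

A producer would push a job under q.mtx and call q.cv.notify_one(); a thread that finishes one job then loops back and can pick up the next burst instead of being destroyed and recreated.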
yhirose
09d00c099c Update README 2026-04-11 22:24:04 -04:00
yhirose
6bdd657713 Enhance WebSocket support with unresponsive-peer detection and documentation updates
- Added `set_websocket_max_missed_pongs` method to configure unresponsive-peer detection.
- Updated README and documentation to clarify WebSocket limitations and features.
- Introduced tests for detecting non-responsive peers and ensuring responsive peers do not trigger timeouts.
2026-04-11 22:17:38 -04:00
yhirose
b4eec3ee77 Removed deprecated APIs (#2423) 2026-04-11 20:54:06 -04:00
yhirose
c0248ff7fc Add links to other topics in Cookbook documents 2026-04-11 20:40:08 -04:00
yhirose
203e1bf2ac Code cleanup 2026-04-11 20:40:08 -04:00
yhirose
ff04679538 Release v0.42.0 v0.42.0 2026-04-11 18:53:36 -04:00
yhirose
d97749a315 Update README 2026-04-11 17:15:37 -04:00
yhirose
994d76ab39 Fix #2422 2026-04-11 15:38:35 -04:00
yhirose
529dafdee3 Add Cookbook other topics (draft) 2026-04-10 19:02:44 -04:00
yhirose
361b753f19 Add Cookbook S01-S22 (draft) 2026-04-10 18:47:42 -04:00
yhirose
61e533ddc5 Add Cookbook C01-C19 (draft) 2026-04-10 18:16:02 -04:00
yhirose
783de4ec4e Update README 2026-04-08 18:35:56 -04:00