Compare commits

...

45 Commits

Author SHA1 Message Date
yhirose
b045ee7f6b Fix #2424 2026-04-12 17:31:32 -04:00
Andrea Pappacoda
cb3fce964d fix: cast len to 64 bits before right shift in ws (#2426)
Fixes WebSocketIntegrationTest.LargeMessage and
WebSocketIntegrationTest.MaxPayloadAtLimit on i386
2026-04-12 17:27:16 -04:00
yhirose
7e2a173072 Fix #2425 2026-04-12 17:25:41 -04:00
yhirose
ee5d15c842 Let dynamic threads wait for work instead of exiting immediately
Previously, dynamic threads exited as soon as their current task
completed and the queue was empty. This caused excessive thread
creation/destruction under bursty or long-lived workloads (e.g., SSE
streaming), degrading tail latency. Now dynamic threads loop back and
wait for CPPHTTPLIB_THREAD_POOL_IDLE_TIMEOUT (3s) before exiting,
allowing them to be reused for subsequent tasks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 11:50:38 -04:00
yhirose
09d00c099c Update README 2026-04-11 22:24:04 -04:00
yhirose
6bdd657713 Enhance WebSocket support with unresponsive-peer detection and documentation updates
- Added `set_websocket_max_missed_pongs` method to configure unresponsive-peer detection.
- Updated README and documentation to clarify WebSocket limitations and features.
- Introduced tests for detecting non-responsive peers and ensuring responsive peers do not trigger timeouts.
2026-04-11 22:17:38 -04:00
yhirose
b4eec3ee77 Removed deprecated APIs (#2423) 2026-04-11 20:54:06 -04:00
yhirose
c0248ff7fc Add links to other topics in Cookbook documents 2026-04-11 20:40:08 -04:00
yhirose
203e1bf2ac Code cleanup 2026-04-11 20:40:08 -04:00
yhirose
ff04679538 Release v0.42.0 2026-04-11 18:53:36 -04:00
yhirose
d97749a315 Update README 2026-04-11 17:15:37 -04:00
yhirose
994d76ab39 Fix #2422 2026-04-11 15:38:35 -04:00
yhirose
529dafdee3 Add Cookbook other topics (draft) 2026-04-10 19:02:44 -04:00
yhirose
361b753f19 Add Cookbook S01-S22 (draft) 2026-04-10 18:47:42 -04:00
yhirose
61e533ddc5 Add Cookbook C01-C19 (draft) 2026-04-10 18:16:02 -04:00
yhirose
783de4ec4e Update README 2026-04-08 18:35:56 -04:00
yhirose
7a7f9b30e7 Update README 2026-04-08 18:22:25 -04:00
yhirose
834a444435 Fixed warnings 2026-04-08 18:10:34 -04:00
yhirose
4f589c1ffb Fix #2421 2026-04-08 18:09:22 -04:00
Jiri Slaby
fc885cc62d test: WebSocketIntegrationTest.SocketSettings: do not set AF_INET (#2420)
The server listens on AF_INET6 only (::1), so the test fails:
 [ RUN      ] WebSocketIntegrationTest.SocketSettings
 test/test.cc:17160: Failure
 Value of: client.connect()
   Actual: false
 Expected: true

Fixes #2419.

Co-authored-by: Jiri Slaby <jslaby@suse.cz>
2026-04-08 07:48:13 -04:00
yhirose
ca82c93772 Refactor SSLVerifierResponse to enum class and add get_param_values method to Request 2026-04-04 00:02:26 -04:00
yhirose
073b587962 Release v0.41.0 2026-04-03 21:50:24 -04:00
yhirose
96785eea21 Add parse_url 2026-04-03 20:54:17 -04:00
yhirose
3093bdd9ab Fix #2416 2026-04-03 18:27:12 -04:00
crueter
6607a6a592 [cmake] Allow using pre-existing zstd target if it exists (#2390)
adds support for pre-existing `zstd::libzstd` which is useful for
projects that bundle their own zstd in a way that doesn't get caught by
`CONFIG`

Signed-off-by: crueter <crueter@eden-emu.dev>
2026-03-30 21:26:20 -04:00
DavidKorczynski
831b64bdeb Add two new fuzzers (#2412)
The goal is to increase code coverage by way of OSS-Fuzz. A recent code
coverage report is available at
https://storage.googleapis.com/oss-fuzz-coverage/cpp-httplib/reports/20260326/linux/report.html

Signed-off-by: David Korczynski <david@adalogics.com>
2026-03-28 15:00:10 -04:00
yhirose
32c82492de Add workflow_dispatch trigger to docs deployment workflow 2026-03-28 11:08:25 -04:00
yhirose
b7e02de4a7 Release v0.40.0 2026-03-28 00:57:24 -04:00
yhirose
a9359df42e Optimize multipart content provider to coalesce small writes and reduce TCP packet fragmentation (Fix #2410) 2026-03-28 00:23:59 -04:00
yhirose
9a97e948f0 Add set_socket_opt function and corresponding test for TCP_NODELAY option (Resolve #2411) 2026-03-28 00:23:59 -04:00
yhirose
6fd97aeca0 Implement request body consumption and reject invalid Content-Length with Transfer-Encoding to prevent request smuggling 2026-03-27 23:16:08 -04:00
yhirose
05540e4d50 Fixed warnings 2026-03-27 22:35:22 -04:00
yhirose
ceefc14e7d Use go-httplibbin 2026-03-27 22:26:14 -04:00
yhirose
a77284a634 Release v0.39.0 2026-03-23 23:14:05 -04:00
yhirose
315a87520d Add release script and update .gitignore for work directory 2026-03-23 23:06:37 -04:00
yhirose
703abbb53b Prevent forwarding of authentication credentials during cross-host redirects as per RFC 9110. Add tests for basic auth and bearer token scenarios. 2026-03-23 22:32:53 -04:00
yhirose
cb8365349f Fix #2404 (Refactor make_file_body to improve file handling and scope management) 2026-03-22 22:40:01 -04:00
yhirose
3792ce0da7 Add socket configuration options and corresponding test case for WebSocketClient. Fix #2401 2026-03-22 22:31:59 -04:00
yhirose
7178f451a4 "Building a Desktop LLM App with cpp-httplib" (#2403) 2026-03-21 23:31:55 -04:00
yhirose
c2bdb1c5c1 SSE Client: Update Authorization Header
Fixes #2402
2026-03-21 13:17:28 -04:00
yhirose
45820de332 Enhance stream handling in LongPollingTest and add new test for client close detection 2026-03-18 18:29:19 -04:00
yhirose
dd8071a7d4 Fix #2397 2026-03-17 17:39:57 -04:00
v-nam
c59ef98b3b [cmake] Update modules.cmake to fix cmake error (#2393)
* Update modules.cmake

* tst1
2026-03-17 12:09:51 -04:00
yhirose
1c6b3ea5a0 Add status: "draft" to multiple documentation pages and enhance navigation sections 2026-03-14 23:46:14 -04:00
yhirose
4a1e9443ee Update deprecation messages to indicate removal in v1.0.0 2026-03-14 23:32:07 -04:00
164 changed files with 17441 additions and 1123 deletions

View File

@@ -4,6 +4,7 @@ on:
branches: [master]
paths:
- 'docs-src/**'
workflow_dispatch:
permissions:
contents: read
pages: write

3
.gitignore vendored
View File

@@ -2,6 +2,7 @@ tags
AGENTS.md
docs-src/pages/AGENTS.md
plans/
work/
# Ignore executables (no extension) but not source files
example/server
@@ -59,9 +60,7 @@ test/*.pem
test/*.srl
test/*.log
test/_build_*
work/
benchmark/server*
docs-gen/target/
*.swp

View File

@@ -242,27 +242,33 @@ endif()
# zstd < 1.5.6 does not provide the CMake imported target `zstd::libzstd`.
# Older versions must be consumed via their pkg-config file.
if(HTTPLIB_REQUIRE_ZSTD)
find_package(zstd 1.5.6 CONFIG)
if(NOT zstd_FOUND)
find_package(PkgConfig REQUIRED)
pkg_check_modules(zstd REQUIRED IMPORTED_TARGET libzstd)
add_library(zstd::libzstd ALIAS PkgConfig::zstd)
if (NOT TARGET zstd::libzstd)
find_package(zstd 1.5.6 CONFIG)
if(NOT zstd_FOUND)
find_package(PkgConfig REQUIRED)
pkg_check_modules(zstd REQUIRED IMPORTED_TARGET libzstd)
add_library(zstd::libzstd ALIAS PkgConfig::zstd)
endif()
endif()
set(HTTPLIB_IS_USING_ZSTD TRUE)
elseif(HTTPLIB_USE_ZSTD_IF_AVAILABLE)
find_package(zstd 1.5.6 CONFIG QUIET)
if(NOT zstd_FOUND)
find_package(PkgConfig QUIET)
if(PKG_CONFIG_FOUND)
pkg_check_modules(zstd QUIET IMPORTED_TARGET libzstd)
if (TARGET zstd::libzstd)
set(HTTPLIB_IS_USING_ZSTD TRUE)
else()
find_package(zstd 1.5.6 CONFIG QUIET)
if(NOT zstd_FOUND)
find_package(PkgConfig QUIET)
if(PKG_CONFIG_FOUND)
pkg_check_modules(zstd QUIET IMPORTED_TARGET libzstd)
if(TARGET PkgConfig::zstd)
add_library(zstd::libzstd ALIAS PkgConfig::zstd)
if(TARGET PkgConfig::zstd)
add_library(zstd::libzstd ALIAS PkgConfig::zstd)
endif()
endif()
endif()
# Both find_package and PkgConf set a XXX_FOUND var
set(HTTPLIB_IS_USING_ZSTD ${zstd_FOUND})
endif()
# Both find_package and PkgConf set a XXX_FOUND var
set(HTTPLIB_IS_USING_ZSTD ${zstd_FOUND})
endif()
# Used for default, common dirs that the end-user can change (if needed)
@@ -317,13 +323,13 @@ if(HTTPLIB_COMPILE)
$<BUILD_INTERFACE:${_httplib_build_includedir}/httplib.h>
$<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}/httplib.h>
)
# Add C++20 module support if requested
# Include from separate file to prevent parse errors on older CMake versions
if(CMAKE_VERSION VERSION_GREATER_EQUAL "3.28")
include(cmake/modules.cmake)
endif()
set_target_properties(${PROJECT_NAME}
PROPERTIES
VERSION ${${PROJECT_NAME}_VERSION}

View File

@@ -74,6 +74,9 @@ sse.set_reconnect_interval(5000);
// Set max reconnect attempts (default: 0 = unlimited)
sse.set_max_reconnect_attempts(10);
// Update headers at any time (thread-safe)
sse.set_headers({{"Authorization", "Bearer new_token"}});
```
#### Control
@@ -154,6 +157,25 @@ httplib::sse::SSEClient sse(cli, "/events", headers);
sse.start();
```
### Refreshing Auth Token on Reconnect
```cpp
httplib::sse::SSEClient sse(cli, "/events",
{{"Authorization", "Bearer " + get_token()}});
// Preemptively refresh token on each successful connection
sse.on_open([&sse]() {
sse.set_headers({{"Authorization", "Bearer " + get_token()}});
});
// Or reactively refresh on auth failure (401 triggers reconnect)
sse.on_error([&sse](httplib::Error) {
sse.set_headers({{"Authorization", "Bearer " + refresh_token()}});
});
sse.start();
```
### Error Handling
```cpp

View File

@@ -3,15 +3,19 @@
A simple, blocking WebSocket implementation for C++11.
> [!IMPORTANT]
> This is a blocking I/O WebSocket implementation using a thread-per-connection model. If you need high-concurrency WebSocket support with non-blocking/async I/O (e.g., thousands of simultaneous connections), this is not the one that you want.
> This is a blocking I/O WebSocket implementation using a thread-per-connection model (plus one heartbeat thread per connection). It is intended for small- to mid-scale workloads; handling large numbers of simultaneous WebSocket connections is outside the design target of this library. If you need high-concurrency WebSocket support with non-blocking/async I/O (e.g., thousands of simultaneous connections), this is not the one that you want.
> [!NOTE]
> WebSocket extensions (`permessage-deflate` and others defined by RFC 6455) are **not supported**. If a client proposes an extension via `Sec-WebSocket-Extensions`, the server silently declines it — the negotiated connection always runs without extensions.
## Features
- **RFC 6455 compliant**: Full WebSocket protocol support
- **RFC 6455 compliant**: Full WebSocket protocol support (extensions are not implemented)
- **Server and Client**: Both sides included
- **SSL/TLS support**: `wss://` scheme for secure connections
- **Text and Binary**: Both message types supported
- **Automatic heartbeat**: Periodic Ping/Pong keeps connections alive
- **Unresponsive-peer detection**: Opt-in liveness check via `set_websocket_max_missed_pongs()`
- **Subprotocol negotiation**: `Sec-WebSocket-Protocol` support for GraphQL, MQTT, etc.
## Quick Start
@@ -352,6 +356,7 @@ if (ws.connect()) {
| `CPPHTTPLIB_WEBSOCKET_READ_TIMEOUT_SECOND` | `300` | Read timeout for WebSocket connections (seconds) |
| `CPPHTTPLIB_WEBSOCKET_CLOSE_TIMEOUT_SECOND` | `5` | Timeout for waiting peer's Close response (seconds) |
| `CPPHTTPLIB_WEBSOCKET_PING_INTERVAL_SECOND` | `30` | Automatic Ping interval for heartbeat (seconds) |
| `CPPHTTPLIB_WEBSOCKET_MAX_MISSED_PONGS` | `0` (disabled) | Close the connection after N consecutive unacked pings |
### Runtime Ping Interval
@@ -373,6 +378,20 @@ ws.set_websocket_ping_interval(10); // 10 seconds
ws.set_websocket_ping_interval(0);
```
### Unresponsive-Peer Detection (Pong Timeout)
By default the heartbeat only sends pings — it does not enforce that pongs come back. To detect a silently dropped connection faster, enable the max-missed-pongs check. Once `max_missed_pongs` consecutive pings go unanswered, the heartbeat thread closes the connection with `CloseStatus::GoingAway` and the reason `"pong timeout"`.
```cpp
ws.set_websocket_max_missed_pongs(2); // close after 2 consecutive unacked pings
```
The server side has the same `set_websocket_max_missed_pongs()`.
With the default ping interval of 30 seconds, `max_missed_pongs = 2` detects a dead peer within ~60 seconds. The counter is reset every time a Pong frame is received, so the mechanism only works when your code is actively calling `read()` — exactly the pattern a normal WebSocket client already uses.
**The default is `0`**, which means "never close the connection because of missing pongs." Pings are still sent on the heartbeat interval, but their responses are not checked. Even so, a dead connection does not linger forever: while your code is inside `read()`, `CPPHTTPLIB_WEBSOCKET_READ_TIMEOUT_SECOND` (default **300 seconds = 5 minutes**) acts as a backstop and `read()` fails if no frame arrives in time. `max_missed_pongs` is the knob for detecting an unresponsive peer faster than that 5-minute fallback.
## Threading Model
WebSocket connections share the same thread pool as HTTP requests. Each WebSocket connection occupies one thread for its entire lifetime.

View File

@@ -8,7 +8,7 @@ It's extremely easy to set up. Just include the **[httplib.h](https://raw.github
Learn more in the [official documentation](https://yhirose.github.io/cpp-httplib/) (built with [docs-gen](https://github.com/yhirose/docs-gen)).
> [!IMPORTANT]
> This library uses 'blocking' socket I/O. If you are looking for a library with 'non-blocking' socket I/O, this is not the one that you want.
> This library uses 'blocking' socket I/O. If you are looking for a library with 'non-blocking' socket I/O, this is not the one that you want. Only **HTTP/1.1** is supported — HTTP/2 and HTTP/3 are not implemented.
> [!WARNING]
> 32-bit platforms are **NOT supported**. Use at your own risk. The library may compile on 32-bit targets, but no security review has been conducted for 32-bit environments. Integer truncation and other 32-bit-specific issues may exist. **Security reports that only affect 32-bit platforms will be closed without action.** The maintainer does not have access to 32-bit environments for testing or fixing issues. CI includes basic compile checks only, not functional or security testing.
@@ -191,7 +191,7 @@ cpp-httplib automatically integrates with the OS certificate store on macOS and
| Platform | Behavior | Disable (compile time) |
| :------- | :------- | :--------------------- |
| macOS | Loads system certs from Keychain (link `CoreFoundation` and `Security` with `-framework`) | `CPPHTTPLIB_DISABLE_MACOSX_AUTOMATIC_ROOT_CERTIFICATES` |
| macOS | Loads system certs from Keychain (link `CoreFoundation` and `Security` with `-framework`). Requires Apple Clang; GCC is not supported for this feature. | `CPPHTTPLIB_DISABLE_MACOSX_AUTOMATIC_ROOT_CERTIFICATES` |
| Windows | Verifies certs via CryptoAPI (`CertGetCertificateChain` / `CertVerifyCertificateChainPolicy`) with revocation checking | `CPPHTTPLIB_DISABLE_WINDOWS_AUTOMATIC_ROOT_CERTIFICATES_UPDATE` |
On Windows, verification can also be disabled at runtime:
@@ -239,6 +239,8 @@ int main(void)
if (req.has_param("key")) {
auto val = req.get_param_value("key");
}
// Get all values for a given key (e.g., ?tag=a&tag=b)
auto values = req.get_param_values("tag");
res.set_content(req.body, "text/plain");
});
@@ -461,7 +463,7 @@ svr.set_pre_request_handler([](const auto& req, auto& res) {
### Response user data
`res.user_data` is a `std::map<std::string, httplib::any>` that lets pre-routing or pre-request handlers pass arbitrary data to route handlers.
`res.user_data` is a type-safe key-value store that lets pre-routing or pre-request handlers pass arbitrary data to route handlers.
```cpp
struct AuthContext {
@@ -471,12 +473,12 @@ struct AuthContext {
svr.set_pre_routing_handler([](const auto& req, auto& res) {
auto token = req.get_header_value("Authorization");
res.user_data["auth"] = AuthContext{decode_token(token)};
res.user_data.set("auth", AuthContext{decode_token(token)});
return Server::HandlerResponse::Unhandled;
});
svr.Get("/me", [](const auto& /*req*/, auto& res) {
auto* ctx = httplib::any_cast<AuthContext>(&res.user_data["auth"]);
auto* ctx = res.user_data.get<AuthContext>("auth");
if (!ctx) {
res.status = StatusCode::Unauthorized_401;
return;
@@ -485,8 +487,6 @@ svr.Get("/me", [](const auto& /*req*/, auto& res) {
});
```
`httplib::any` mirrors the C++17 `std::any` API. On C++17 and later it is an alias for `std::any`; on C++11/14 a compatible implementation is provided.
### Form data handling
#### URL-encoded form data ('application/x-www-form-urlencoded')
@@ -743,6 +743,10 @@ svr.set_keep_alive_timeout(10); // Default is 5
svr.set_read_timeout(5, 0); // 5 seconds
svr.set_write_timeout(5, 0); // 5 seconds
svr.set_idle_interval(0, 100000); // 100 milliseconds
// std::chrono is also supported
svr.set_read_timeout(std::chrono::seconds(5));
svr.set_keep_alive_timeout(std::chrono::seconds(10));
```
### Set maximum payload length for reading a request body
@@ -1053,6 +1057,12 @@ cli.set_write_timeout(5, 0); // 5 seconds
// This method works the same as curl's `--max-time` option
cli.set_max_timeout(5000); // 5 seconds
// std::chrono is also supported
cli.set_connection_timeout(std::chrono::milliseconds(300));
cli.set_read_timeout(std::chrono::seconds(5));
cli.set_write_timeout(std::chrono::seconds(5));
cli.set_max_timeout(std::chrono::seconds(5));
```
### Set maximum payload length for reading a response body
@@ -1452,10 +1462,26 @@ if (ws.connect()) {
SSL is also supported via `wss://` scheme (e.g. `WebSocketClient("wss://example.com/ws")`). Subprotocol negotiation (`Sec-WebSocket-Protocol`) is supported via `SubProtocolSelector` callback.
> **Note:** WebSocket connections occupy a thread for their entire lifetime. If you plan to handle many simultaneous WebSocket connections, consider using a dynamic thread pool: `svr.new_task_queue = [] { return new ThreadPool(8, 64); };`
> **Note:** WebSocket connections occupy a thread for their entire lifetime (plus an additional thread per connection for heartbeat pings). This thread-per-connection model is intended for small- to mid-scale workloads; large numbers of simultaneous WebSocket connections are outside the design target of this library. If you expect many concurrent WebSocket clients, configure a dynamic thread pool (`svr.new_task_queue = [] { return new ThreadPool(8, 64); };`) and measure carefully.
> **WebSocket extensions are not supported.** `permessage-deflate` and other RFC 6455 extensions are not implemented. If a client proposes them via `Sec-WebSocket-Extensions`, the server silently declines them in its handshake response.
> **Unresponsive-peer detection.** Heartbeat pings also serve as a liveness probe when `set_websocket_max_missed_pongs(n)` is set: if the client sends `n` consecutive pings without receiving a pong, it will close the connection. Disabled by default (`0`).
See [README-websocket.md](README-websocket.md) for more details.
## Socket Option Utility
`set_socket_opt` is a convenience wrapper around `setsockopt` for setting integer socket options:
```cpp
auto sock = svr.socket();
httplib::set_socket_opt(sock, IPPROTO_TCP, TCP_NODELAY, 1);
```
> [!TIP]
> For most use cases, prefer `set_tcp_nodelay(true)` or `set_socket_options(callback)` on the Server/Client instead of calling `set_socket_opt` directly.
## Split httplib.h into .h and .cc
```console

View File

@@ -10,7 +10,10 @@ if(HTTPLIB_BUILD_MODULES)
target_sources(${PROJECT_NAME}
PUBLIC
FILE_SET CXX_MODULES FILES
FILE_SET CXX_MODULES
BASE_DIRS
"${_httplib_build_includedir}"
FILES
"${_httplib_build_includedir}/httplib.cppm"
)
endif()

View File

@@ -4,7 +4,7 @@ langs = ["en", "ja"]
[site]
title = "cpp-httplib"
version = "0.38.0"
version = "0.42.0"
hostname = "https://yhirose.github.io"
base_path = "/cpp-httplib"
footer_message = "© 2026 Yuji Hirose. All rights reserved."

View File

@@ -0,0 +1,60 @@
---
title: "C01. Get the Response Body / Save to a File"
order: 1
status: "draft"
---
## Get it as a string
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Get("/hello");
if (res && res->status == 200) {
std::cout << res->body << std::endl;
}
```
`res->body` is a `std::string`, ready to use as-is. The entire response is loaded into memory.
> **Warning:** If you fetch a large file with `res->body`, it all goes into memory. For large downloads, use a `ContentReceiver` as shown below.
## Save to a file
```cpp
httplib::Client cli("http://localhost:8080");
std::ofstream ofs("output.bin", std::ios::binary);
if (!ofs) {
std::cerr << "Failed to open file" << std::endl;
return 1;
}
auto res = cli.Get("/large-file",
[&](const char *data, size_t len) {
ofs.write(data, len);
return static_cast<bool>(ofs);
});
```
With a `ContentReceiver`, data arrives in chunks. You can write each chunk straight to disk without buffering the whole body in memory — perfect for large file downloads.
Return `false` from the callback to abort the download. In the example above, if writing to `ofs` fails, the download stops automatically.
> **Detail:** Want to check response headers like Content-Length before downloading? Combine a `ResponseHandler` with a `ContentReceiver`.
>
> ```cpp
> auto res = cli.Get("/large-file",
> [](const httplib::Response &res) {
> auto len = res.get_header_value("Content-Length");
> std::cout << "Size: " << len << std::endl;
> return true; // return false to skip the download
> },
> [&](const char *data, size_t len) {
> ofs.write(data, len);
> return static_cast<bool>(ofs);
> });
> ```
>
> The `ResponseHandler` is called after headers arrive but before the body. Return `false` to skip the download entirely.
> To show download progress, see [C11. Use the progress callback](c11-progress-callback).

View File

@@ -0,0 +1,36 @@
---
title: "C02. Send and Receive JSON"
order: 2
status: "draft"
---
cpp-httplib doesn't include a JSON parser. Use a library like [nlohmann/json](https://github.com/nlohmann/json) to build and parse JSON. The examples here use `nlohmann/json`.
## Send JSON
```cpp
httplib::Client cli("http://localhost:8080");
nlohmann::json j = {{"name", "Alice"}, {"age", 30}};
auto res = cli.Post("/api/users", j.dump(), "application/json");
```
Pass the JSON string as the second argument to `Post()` and the Content-Type as the third. The same pattern works with `Put()` and `Patch()`.
> **Warning:** If you omit the Content-Type (the third argument), the server may not recognize the body as JSON. Always specify `"application/json"`.
## Receive a JSON response
```cpp
auto res = cli.Get("/api/users/1");
if (res && res->status == 200) {
auto j = nlohmann::json::parse(res->body);
std::cout << j["name"] << std::endl;
}
```
`res->body` is a `std::string`, so you can pass it straight to your JSON library.
> **Note:** Servers sometimes return HTML on errors. Check the status code before parsing to be safe. Some APIs also require an `Accept: application/json` header. If you're calling a JSON API repeatedly, [C03. Set default headers](c03-default-headers) can save you some boilerplate.
> For how to receive and return JSON on the server side, see [S02. Receive JSON requests and return JSON responses](s02-json-api).

View File

@@ -0,0 +1,55 @@
---
title: "C03. Set Default Headers"
order: 3
status: "draft"
---
When you want the same headers on every request, use `set_default_headers()`. Once set, they're attached automatically to every request sent from that client.
## Basic usage
```cpp
httplib::Client cli("https://api.example.com");
cli.set_default_headers({
{"Accept", "application/json"},
{"User-Agent", "my-app/1.0"},
});
auto res = cli.Get("/users");
```
Register the headers you need on every API call — like `Accept` or `User-Agent` — in one place. No need to repeat them on each request.
## Send a Bearer token on every request
```cpp
httplib::Client cli("https://api.example.com");
cli.set_default_headers({
{"Authorization", "Bearer " + token},
{"Accept", "application/json"},
});
auto res1 = cli.Get("/me");
auto res2 = cli.Get("/projects");
```
Set the auth token once, and every subsequent request carries it. Handy when you're writing an API client that hits multiple endpoints.
> **Note:** `set_default_headers()` **replaces** the existing default headers. Even if you only want to add one, pass the full set again.
## Combine with per-request headers
You can still pass extra headers on individual requests, even with defaults set.
```cpp
httplib::Headers headers = {
{"X-Request-ID", "abc-123"},
};
auto res = cli.Get("/users", headers);
```
Per-request headers are **added** on top of the defaults. Both are sent to the server.
> For details on Bearer token auth, see [C06. Call an API with a Bearer token](c06-bearer-token).

View File

@@ -0,0 +1,38 @@
---
title: "C04. Follow Redirects"
order: 4
status: "draft"
---
By default, cpp-httplib does not follow HTTP redirects (3xx). If the server returns `302 Found`, you'll get it as a response with status code 302 — nothing more.
To follow redirects automatically, call `set_follow_location(true)`.
## Follow redirects
```cpp
httplib::Client cli("http://example.com");
cli.set_follow_location(true);
auto res = cli.Get("/old-path");
if (res && res->status == 200) {
std::cout << res->body << std::endl;
}
```
With `set_follow_location(true)`, the client reads the `Location` header and reissues the request to the new URL automatically. The final response ends up in `res`.
## Redirects from HTTP to HTTPS
```cpp
httplib::Client cli("http://example.com");
cli.set_follow_location(true);
auto res = cli.Get("/");
```
Many sites redirect HTTP traffic to HTTPS. With `set_follow_location(true)` on, this case is handled transparently — the client follows redirects even when the scheme or host changes.
> **Warning:** To follow redirects to HTTPS, you need to build cpp-httplib with OpenSSL (or another TLS backend). Without TLS support, redirects to HTTPS will fail.
> **Note:** Following redirects adds to the total request time. See [C12. Set timeouts](c12-timeouts) for timeout configuration.

View File

@@ -0,0 +1,46 @@
---
title: "C05. Use Basic Authentication"
order: 5
status: "draft"
---
For endpoints that require Basic authentication, pass the username and password to `set_basic_auth()`. cpp-httplib builds the `Authorization: Basic ...` header for you.
## Basic usage
```cpp
httplib::Client cli("https://api.example.com");
cli.set_basic_auth("alice", "s3cret");
auto res = cli.Get("/private");
if (res && res->status == 200) {
std::cout << res->body << std::endl;
}
```
Set it once, and every request from that client carries the credentials. No need to build the header each time.
## Per-request usage
If you want credentials on only one specific request, pass headers directly.
```cpp
httplib::Headers headers = {
httplib::make_basic_authentication_header("alice", "s3cret"),
};
auto res = cli.Get("/private", headers);
```
`make_basic_authentication_header()` builds the Base64-encoded header for you.
> **Warning:** Basic authentication **encodes** credentials in Base64 — it does not encrypt them. Always use it over HTTPS. Over plain HTTP, your password travels the network in the clear.
## Digest authentication
For the more secure Digest authentication scheme, use `set_digest_auth()`. This is only available when cpp-httplib is built with OpenSSL (or another TLS backend).
```cpp
cli.set_digest_auth("alice", "s3cret");
```
> To call an API with a Bearer token, see [C06. Call an API with a Bearer token](c06-bearer-token).

View File

@@ -0,0 +1,50 @@
---
title: "C06. Call an API with a Bearer Token"
order: 6
status: "draft"
---
For Bearer token authentication — common in OAuth 2.0 and modern Web APIs — use `set_bearer_token_auth()`. Pass the token and cpp-httplib builds the `Authorization: Bearer <token>` header for you.
## Basic usage
```cpp
httplib::Client cli("https://api.example.com");
cli.set_bearer_token_auth("eyJhbGciOiJIUzI1NiIs...");
auto res = cli.Get("/me");
if (res && res->status == 200) {
std::cout << res->body << std::endl;
}
```
Set it once and every subsequent request carries the token. This is the go-to pattern for token-based APIs like GitHub, Slack, or your own OAuth service.
## Per-request usage
When you want the token on only one request — or need a different token per request — pass it via headers.
```cpp
httplib::Headers headers = {
httplib::make_bearer_token_authentication_header(token),
};
auto res = cli.Get("/me", headers);
```
`make_bearer_token_authentication_header()` builds the `Authorization` header for you.
## Refresh the token
When a token expires, just call `set_bearer_token_auth()` again with the new one.
```cpp
if (res && res->status == 401) {
auto new_token = refresh_token();
cli.set_bearer_token_auth(new_token);
res = cli.Get("/me");
}
```
> **Warning:** A Bearer token is itself a credential. Always send it over HTTPS, and never hard-code it into source or config files.
> To set multiple headers at once, see [C03. Set default headers](c03-default-headers).

View File

@@ -0,0 +1,52 @@
---
title: "C07. Upload a File as Multipart Form Data"
order: 7
status: "draft"
---
When you want to send a file the same way an HTML `<input type="file">` does, use multipart form data (`multipart/form-data`). cpp-httplib offers two APIs — `UploadFormDataItems` and `FormDataProviderItems` — and you pick between them based on **file size**.
## Send a small file
Read the file into memory first, then send it. For small files, this is the simplest path.
```cpp
httplib::Client cli("http://localhost:8080");
std::ifstream ifs("avatar.png", std::ios::binary);
std::string content((std::istreambuf_iterator<char>(ifs)),
std::istreambuf_iterator<char>());
httplib::UploadFormDataItems items = {
{"name", "Alice", "", ""},
{"avatar", content, "avatar.png", "image/png"},
};
auto res = cli.Post("/upload", items);
```
Each `UploadFormData` entry is `{name, content, filename, content_type}`. For plain text fields, leave `filename` and `content_type` empty.
## Stream a large file
To avoid loading the whole file into memory, use `make_file_provider()`. It reads the file in chunks as it sends — so even huge files won't blow up your memory footprint.
```cpp
httplib::Client cli("http://localhost:8080");
httplib::UploadFormDataItems items = {
{"name", "Alice", "", ""},
};
httplib::FormDataProviderItems provider_items = {
httplib::make_file_provider("video", "large-video.mp4", "", "video/mp4"),
};
auto res = cli.Post("/upload", httplib::Headers{}, items, provider_items);
```
The arguments to `make_file_provider()` are `(form name, file path, file name, content type)`. Leave the file name empty to use the file path as-is.
> **Note:** You can mix `UploadFormDataItems` and `FormDataProviderItems` in the same request. A clean split is: text fields in `UploadFormDataItems`, files in `FormDataProviderItems`.
> To show upload progress, see [C11. Use the progress callback](c11-progress-callback).

View File

@@ -0,0 +1,34 @@
---
title: "C08. POST a File as Raw Binary"
order: 8
status: "draft"
---
Sometimes you want to send a file's contents as the request body directly — no multipart wrapping. This is common for S3-compatible APIs or endpoints that take raw image data. For this, use `make_file_body()`.
## Basic usage
```cpp
httplib::Client cli("https://storage.example.com");
auto [size, provider] = httplib::make_file_body("backup.tar.gz");
if (size == 0) {
std::cerr << "Failed to open file" << std::endl;
return 1;
}
auto res = cli.Put("/bucket/backup.tar.gz", size,
provider, "application/gzip");
```
`make_file_body()` returns a pair of file size and a `ContentProvider`. Pass them to `Post()` or `Put()` and the file contents flow straight into the request body.
The `ContentProvider` reads the file in chunks, so even huge files never sit fully in memory.
## When the file can't be opened
If the file can't be opened, `make_file_body()` returns `size` as `0` and `provider` as an empty function object. Sending that would produce garbage — always check `size` first.
> **Warning:** `make_file_body()` needs to fix the Content-Length up front, so it reads the file size ahead of time. If the file size might change mid-upload, this API isn't the right fit.
> To send the file as multipart form data instead, see [C07. Upload a file as multipart form data](c07-multipart-upload).

View File

@@ -0,0 +1,47 @@
---
title: "C09. Send the Body with Chunked Transfer"
order: 9
status: "draft"
---
When you don't know the body size up front — for data generated on the fly or piped from another stream — use `ContentProviderWithoutLength`. The client sends the body with HTTP chunked transfer encoding.
## Basic usage
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Post("/stream",
[&](size_t offset, httplib::DataSink &sink) {
std::string chunk = produce_next_chunk();
if (chunk.empty()) {
sink.done(); // done sending
return true;
}
return sink.write(chunk.data(), chunk.size());
},
"application/octet-stream");
```
The lambda's job is just: produce the next chunk and send it with `sink.write()`. When there's no more data, call `sink.done()` and you're finished.
## When the size is known
If you **do** know the total size ahead of time, use the `ContentProvider` overload (taking `size_t offset, size_t length, DataSink &sink`) and pass the total size as well.
```cpp
size_t total_size = get_total_size();
auto res = cli.Post("/upload", total_size,
[&](size_t offset, size_t length, httplib::DataSink &sink) {
auto data = read_range(offset, length);
return sink.write(data.data(), data.size());
},
"application/octet-stream");
```
With a known size, the request carries a Content-Length header — so the server can show progress. Prefer this form when you can.
> **Detail:** `sink.write()` returns a `bool` indicating whether the write succeeded. If it returns `false`, the connection is gone — return `false` from the lambda to stop.
> If you're just sending a file, `make_file_body()` is easier. See [C08. POST a file as raw binary](c08-post-file-body).

View File

@@ -0,0 +1,52 @@
---
title: "C10. Receive a Response as a Stream"
order: 10
status: "draft"
---
To receive a response body chunk by chunk, use a `ContentReceiver`. It's the obvious choice for large files, but it's equally handy for NDJSON (newline-delimited JSON) or log streams where you want to start processing data as it arrives.
## Process each chunk
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Get("/logs/stream",
[](const char *data, size_t len) {
std::cout.write(data, len);
std::cout.flush();
return true; // return false to stop receiving
});
```
Data arrives in the lambda in the order it's received from the server. Return `false` from the callback to stop the download partway through.
## Parse NDJSON line by line
Here's a buffered approach for processing newline-delimited JSON one line at a time.
```cpp
std::string buffer;
auto res = cli.Get("/events",
[&](const char *data, size_t len) {
buffer.append(data, len);
size_t pos;
while ((pos = buffer.find('\n')) != std::string::npos) {
auto line = buffer.substr(0, pos);
buffer.erase(0, pos + 1);
if (!line.empty()) {
auto j = nlohmann::json::parse(line);
handle_event(j);
}
}
return true;
});
```
Accumulate into a buffer, then pull out and parse one line each time you see a newline. This is the standard pattern for consuming a streaming API in real time.
> **Warning:** When you pass a `ContentReceiver`, `res->body` stays **empty**. Store or process the body inside the callback yourself.
> To track download progress, combine this with [C11. Use the progress callback](c11-progress-callback).
> For Server-Sent Events (SSE), see [E04. Receive SSE on the client](e04-sse-client).

View File

@@ -0,0 +1,59 @@
---
title: "C11. Use the Progress Callback"
order: 11
status: "draft"
---
To display download or upload progress, pass a `DownloadProgress` or `UploadProgress` callback. Both take two arguments: `(current, total)`.
## Download progress
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Get("/large-file",
[](size_t current, size_t total) {
auto percent = (total > 0) ? (current * 100 / total) : 0;
std::cout << "\rDownloading: " << percent << "% ("
<< current << "/" << total << ")" << std::flush;
return true; // return false to abort
});
std::cout << std::endl;
```
The callback fires each time data arrives. `total` comes from the Content-Length header — if the server doesn't send one, it may be `0`. In that case, you can't compute a percentage, so just display bytes received.
## Upload progress
Uploads work the same way. Pass an `UploadProgress` as the last argument to `Post()` or `Put()`.
```cpp
httplib::Client cli("http://localhost:8080");
std::string body = load_large_body();
auto res = cli.Post("/upload", body, "application/octet-stream",
[](size_t current, size_t total) {
auto percent = current * 100 / total;
std::cout << "\rUploading: " << percent << "%" << std::flush;
return true;
});
std::cout << std::endl;
```
## Cancel mid-transfer
Return `false` from the callback to abort the transfer. This is how you wire up a "Cancel" button in a UI — flip a flag, and the next progress tick stops the transfer.
```cpp
std::atomic<bool> cancelled{false};
auto res = cli.Get("/large-file",
[&](size_t current, size_t total) {
return !cancelled.load();
});
```
> **Note:** `ContentReceiver` and the progress callback can be used together. When you want to stream to a file and show progress at the same time, pass both.
> For a concrete example of saving to a file, see [C01. Get the response body / save to a file](c01-get-response-body).

View File

@@ -0,0 +1,50 @@
---
title: "C12. Set Timeouts"
order: 12
status: "draft"
---
The client has three kinds of timeouts, each set independently.
| Kind | API | Default | Meaning |
| --- | --- | --- | --- |
| Connection | `set_connection_timeout` | 300s | Time to wait for the TCP connection to establish |
| Read | `set_read_timeout` | 300s | Time to wait for a single `recv` when receiving the response |
| Write | `set_write_timeout` | 5s | Time to wait for a single `send` when sending the request |
## Basic usage
```cpp
httplib::Client cli("http://localhost:8080");
cli.set_connection_timeout(5, 0); // 5 seconds
cli.set_read_timeout(10, 0); // 10 seconds
cli.set_write_timeout(10, 0); // 10 seconds
auto res = cli.Get("/api/data");
```
Pass seconds and microseconds as two arguments. If you don't need the sub-second part, you can omit the second argument.
## Use `std::chrono`
There's also an overload that takes a `std::chrono` duration directly. It's easier to read — recommended.
```cpp
using namespace std::chrono_literals;
cli.set_connection_timeout(5s);
cli.set_read_timeout(10s);
cli.set_write_timeout(500ms);
```
## Watch out for the long 300s default
Connection and read timeouts default to **300 seconds (5 minutes)**. If the server hangs, you'll be waiting five minutes by default. Shorter values are usually a better idea.
```cpp
cli.set_connection_timeout(3s);
cli.set_read_timeout(10s);
```
> **Warning:** The read timeout covers a single receive call — not the whole request. If data keeps trickling in during a large download, the request can take half an hour without ever hitting the timeout. To cap the total request time, use [C13. Set an overall timeout](c13-max-timeout).

View File

@@ -0,0 +1,42 @@
---
title: "C13. Set an Overall Timeout"
order: 13
status: "draft"
---
The three timeouts from [C12. Set timeouts](c12-timeouts) all apply to a single `send` or `recv` call. To cap the total time a request can take, use `set_max_timeout()`.
## Basic usage
```cpp
httplib::Client cli("http://localhost:8080");
cli.set_max_timeout(5000); // 5 seconds (in milliseconds)
auto res = cli.Get("/slow-endpoint");
```
The value is in milliseconds. Connection, send, and receive together — the whole request is aborted if it exceeds the limit.
## Use `std::chrono`
There's also an overload that takes a `std::chrono` duration.
```cpp
using namespace std::chrono_literals;
cli.set_max_timeout(5s);
```
## When to use which
`set_read_timeout` fires when no data arrives for a while. If data keeps trickling in bit by bit, it will never fire. An endpoint that sends one byte per second can make `set_read_timeout` useless no matter how short you set it.
`set_max_timeout` caps elapsed time, so it handles those cases cleanly. It's great for calls to external APIs or anywhere you don't want users waiting forever.
```cpp
cli.set_connection_timeout(3s);
cli.set_read_timeout(10s);
cli.set_max_timeout(30s); // abort if the whole request takes over 30s
```
> **Note:** `set_max_timeout()` works alongside the regular timeouts. Short stalls get caught by `set_read_timeout`; long-running requests get capped by `set_max_timeout`. Use both for a safety net.

View File

@@ -0,0 +1,53 @@
---
title: "C14. Understand Connection Reuse and Keep-Alive"
order: 14
status: "draft"
---
When you send multiple requests through the same `httplib::Client` instance, the TCP connection is reused automatically. HTTP/1.1 Keep-Alive does the work for you — you don't pay the TCP and TLS handshake cost on every call.
## Connections are reused automatically
```cpp
httplib::Client cli("https://api.example.com");
auto res1 = cli.Get("/users/1");
auto res2 = cli.Get("/users/2"); // reuses the same connection
auto res3 = cli.Get("/users/3"); // reuses the same connection
```
No special config required. Just hold on to `cli` — internally, the socket stays open across calls. The effect is especially noticeable over HTTPS, where the TLS handshake is expensive.
## Disable Keep-Alive explicitly
To force a fresh connection every time, call `set_keep_alive(false)`. Mostly useful for testing.
```cpp
cli.set_keep_alive(false);
```
For normal use, leave it on (the default).
## Don't create a `Client` per request
If you create a `Client` inside a loop and let it fall out of scope each iteration, you lose the reuse benefit. Create the instance outside the loop.
```cpp
// Bad: a new connection every iteration
for (auto id : ids) {
httplib::Client cli("https://api.example.com");
cli.Get("/users/" + id);
}
// Good: the connection is reused
httplib::Client cli("https://api.example.com");
for (auto id : ids) {
cli.Get("/users/" + id);
}
```
## Concurrent requests
If you want to send requests in parallel from multiple threads, give each thread its own `Client` instance. A single `Client` uses a single TCP connection, so firing concurrent requests at the same instance ends up serializing them anyway.
> **Note:** If the server closes the connection after its Keep-Alive timeout, cpp-httplib reconnects and retries transparently. You don't need to handle this in application code.

View File

@@ -0,0 +1,47 @@
---
title: "C15. Enable Compression"
order: 15
status: "draft"
---
cpp-httplib supports compression when sending and decompression when receiving. You just need to build it with zlib or Brotli enabled.
## Build-time setup
To use compression, define these macros before including `httplib.h`:
```cpp
#define CPPHTTPLIB_ZLIB_SUPPORT // gzip / deflate
#define CPPHTTPLIB_BROTLI_SUPPORT // brotli
#include <httplib.h>
```
You'll also need to link against `zlib` or `brotli`.
## Compress the request body
```cpp
httplib::Client cli("https://api.example.com");
cli.set_compress(true);
std::string big_payload = build_payload();
auto res = cli.Post("/api/data", big_payload, "application/json");
```
With `set_compress(true)`, the body of POST or PUT requests gets gzipped before sending. The server needs to handle compressed bodies too.
## Decompress the response
```cpp
httplib::Client cli("https://api.example.com");
cli.set_decompress(true); // on by default
auto res = cli.Get("/api/data");
std::cout << res->body << std::endl;
```
With `set_decompress(true)`, the client automatically decompresses responses that arrive with `Content-Encoding: gzip` or similar. `res->body` contains the decompressed data.
It's on by default, so normally you don't need to do anything. Set it to `false` only if you want the raw compressed bytes.
> **Warning:** If you build without `CPPHTTPLIB_ZLIB_SUPPORT`, calling `set_compress()` or `set_decompress()` does nothing. If compression isn't working, check the macro definition first.

View File

@@ -0,0 +1,52 @@
---
title: "C16. Send Requests Through a Proxy"
order: 16
status: "draft"
---
To route traffic through a corporate network or a specific path, send requests via an HTTP proxy. Just pass the proxy host and port to `set_proxy()`.
## Basic usage
```cpp
httplib::Client cli("https://api.example.com");
cli.set_proxy("proxy.internal", 8080);
auto res = cli.Get("/users");
```
The request goes through the proxy. For HTTPS, the client uses the CONNECT method to tunnel through — no extra setup required.
## Proxy authentication
If the proxy itself requires authentication, use `set_proxy_basic_auth()` or `set_proxy_bearer_token_auth()`.
```cpp
cli.set_proxy("proxy.internal", 8080);
cli.set_proxy_basic_auth("user", "password");
```
```cpp
cli.set_proxy_bearer_token_auth("token");
```
If cpp-httplib is built with OpenSSL (or another TLS backend), you can also use Digest authentication for the proxy.
```cpp
cli.set_proxy_digest_auth("user", "password");
```
## Combine with end-server authentication
Proxy authentication is separate from authenticating to the end server ([C05. Use Basic authentication](c05-basic-auth), [C06. Call an API with a Bearer token](c06-bearer-token)). When both are needed, set both.
```cpp
cli.set_proxy("proxy.internal", 8080);
cli.set_proxy_basic_auth("proxy-user", "proxy-pass");
cli.set_bearer_token_auth("api-token"); // for the end server
```
`Proxy-Authorization` is sent to the proxy, `Authorization` to the end server.
> **Note:** cpp-httplib does not read `HTTP_PROXY` or `HTTPS_PROXY` environment variables automatically. If you want to honor them, read them in your application and pass the values to `set_proxy()`.

View File

@@ -0,0 +1,63 @@
---
title: "C17. Handle Error Codes"
order: 17
status: "draft"
---
`cli.Get()`, `cli.Post()`, and friends return a `Result`. When the request fails — can't reach the server, times out, etc. — the result is "falsy". To get the specific reason, use `Result::error()`.
## Basic check
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Get("/api/data");
if (res) {
// the request was sent and a response came back
std::cout << "status: " << res->status << std::endl;
} else {
// the network layer failed
std::cerr << "error: " << httplib::to_string(res.error()) << std::endl;
}
```
Use `if (res)` to check success. On failure, `res.error()` returns a `httplib::Error` enum value. Pass it to `to_string()` to get a human-readable description.
## Common errors
| Value | Meaning |
| --- | --- |
| `Error::Connection` | Couldn't connect to the server |
| `Error::ConnectionTimeout` | Connection timeout (`set_connection_timeout`) |
| `Error::Read` / `Error::Write` | Error during send or receive |
| `Error::Timeout` | Overall timeout set via `set_max_timeout` |
| `Error::ExceedRedirectCount` | Too many redirects |
| `Error::SSLConnection` | TLS handshake failed |
| `Error::SSLServerVerification` | Server certificate verification failed |
| `Error::Canceled` | A progress callback returned `false` |
## Network errors vs. HTTP errors
Even when `res` is truthy, the HTTP status code can still be 4xx or 5xx. These are two different things.
```cpp
auto res = cli.Get("/api/data");
if (!res) {
// network error (no response received at all)
std::cerr << "network error: " << httplib::to_string(res.error()) << std::endl;
return 1;
}
if (res->status >= 400) {
// HTTP error (response received, but the status is bad)
std::cerr << "http error: " << res->status << std::endl;
return 1;
}
// success
std::cout << res->body << std::endl;
```
Keep them separated in your head: network-layer errors go through `res.error()`, HTTP-level errors through `res->status`.
> To dig deeper into SSL-related errors, see [C18. Handle SSL errors](c18-ssl-errors).

View File

@@ -0,0 +1,51 @@
---
title: "C18. Handle SSL Errors"
order: 18
status: "draft"
---
When an HTTPS request fails, `res.error()` returns values like `Error::SSLConnection` or `Error::SSLServerVerification`. Sometimes that's not enough to pinpoint the cause. That's where `Result::ssl_error()` and `Result::ssl_backend_error()` help.
## Get the SSL error details
```cpp
httplib::Client cli("https://api.example.com");
auto res = cli.Get("/");
if (!res) {
auto err = res.error();
std::cerr << "error: " << httplib::to_string(err) << std::endl;
if (err == httplib::Error::SSLConnection ||
err == httplib::Error::SSLServerVerification) {
std::cerr << "ssl_error: " << res.ssl_error() << std::endl;
std::cerr << "ssl_backend_error: " << res.ssl_backend_error() << std::endl;
}
}
```
`ssl_error()` returns the error code from the SSL library (e.g., OpenSSL's `SSL_get_error()`). `ssl_backend_error()` gives you the backend's more detailed error value — for OpenSSL, that's `ERR_get_error()`.
## Format OpenSSL errors as strings
When you have a value from `ssl_backend_error()`, pass it to OpenSSL's `ERR_error_string()` to get a readable message.
```cpp
#include <openssl/err.h>
if (res.ssl_backend_error() != 0) {
char buf[256];
ERR_error_string_n(res.ssl_backend_error(), buf, sizeof(buf));
std::cerr << "openssl: " << buf << std::endl;
}
```
## Common causes
| Symptom | Usual suspect |
| --- | --- |
| `SSLServerVerification` | CA certificate path isn't configured, or the cert is self-signed |
| `SSLServerHostnameVerification` | The cert's CN/SAN doesn't match the host |
| `SSLConnection` | TLS version mismatch, no shared cipher suite |
> To change certificate verification settings, see [T02. Control SSL certificate verification](t02-cert-verification).

View File

@@ -0,0 +1,57 @@
---
title: "C19. Set a Logger on the Client"
order: 19
status: "draft"
---
To log requests sent and responses received by the client, use `set_logger()`. If you only care about errors, there's a separate `set_error_logger()`.
## Log requests and responses
```cpp
httplib::Client cli("https://api.example.com");
cli.set_logger([](const httplib::Request &req, const httplib::Response &res) {
std::cout << req.method << " " << req.path
<< " -> " << res.status << std::endl;
});
auto res = cli.Get("/users");
```
The callback you pass to `set_logger()` fires once for each completed request. You get both the request and the response as arguments — so you can log the method, path, status, headers, body, or whatever else you need.
## Catch errors only
When a network-layer error happens (like `Error::Connection`), `set_logger()` is **not** called — there's no response to log. For those cases, use `set_error_logger()`.
```cpp
cli.set_error_logger([](const httplib::Error &err, const httplib::Request *req) {
std::cerr << "error: " << httplib::to_string(err);
if (req) {
std::cerr << " (" << req->method << " " << req->path << ")";
}
std::cerr << std::endl;
});
```
The second argument `req` can be null — it happens when the failure occurred before the request was built. Always null-check before dereferencing.
## Use both together
A nice pattern is to log successes through one, failures through the other.
```cpp
cli.set_logger([](const auto &req, const auto &res) {
std::cout << "[ok] " << req.method << " " << req.path
<< " " << res.status << std::endl;
});
cli.set_error_logger([](const auto &err, const auto *req) {
std::cerr << "[ng] " << httplib::to_string(err);
if (req) std::cerr << " " << req->method << " " << req->path;
std::cerr << std::endl;
});
```
> **Note:** The log callbacks run synchronously on the same thread as the request. Heavy work inside them slows the request down — push it to a background queue if you need to do anything expensive.

View File

@@ -0,0 +1,87 @@
---
title: "E01. Implement an SSE Server"
order: 47
status: "draft"
---
Server-Sent Events (SSE) is a simple protocol for pushing events one-way from server to client. The connection stays open, and the server can send data whenever it wants. It's lighter than WebSocket and fits entirely within HTTP — a nice combination.
cpp-httplib doesn't have a dedicated SSE server API, but you can implement one with `set_chunked_content_provider()` and `text/event-stream`.
## Basic SSE server
```cpp
svr.Get("/events", [](const httplib::Request &req, httplib::Response &res) {
res.set_chunked_content_provider(
"text/event-stream",
[](size_t offset, httplib::DataSink &sink) {
std::string message = "data: hello\n\n";
sink.write(message.data(), message.size());
std::this_thread::sleep_for(std::chrono::seconds(1));
return true;
});
});
```
Three things matter here:
1. Content-Type is `text/event-stream`
2. Messages follow the format `data: <content>\n\n` (the double newline separates events)
3. Each `sink.write()` delivers data to the client
The provider lambda keeps being called as long as the connection is alive.
## A continuous stream
Here's a simple example that sends the current time once per second.
```cpp
svr.Get("/time", [](const httplib::Request &req, httplib::Response &res) {
res.set_chunked_content_provider(
"text/event-stream",
[&req](size_t offset, httplib::DataSink &sink) {
if (req.is_connection_closed()) {
sink.done();
return true;
}
auto now = std::chrono::system_clock::now();
auto t = std::chrono::system_clock::to_time_t(now);
std::string msg = "data: " + std::string(std::ctime(&t)) + "\n";
sink.write(msg.data(), msg.size());
std::this_thread::sleep_for(std::chrono::seconds(1));
return true;
});
});
```
When the client disconnects, call `sink.done()` to stop. Details in [S16. Detect client disconnection](s16-disconnect).
## Heartbeats via comment lines
Lines starting with `:` are SSE comments — clients ignore them, but they **keep the connection alive**. Handy for preventing proxies and load balancers from closing idle connections.
```cpp
// heartbeat every 30 seconds
if (tick_count % 30 == 0) {
std::string ping = ": ping\n\n";
sink.write(ping.data(), ping.size());
}
```
## Relationship with the thread pool
SSE connections stay open, so each client holds a worker thread. For lots of concurrent connections, enable dynamic scaling on the thread pool.
```cpp
svr.new_task_queue = [] {
return new httplib::ThreadPool(8, 128);
};
```
See [S21. Configure the thread pool](s21-thread-pool).
> **Note:** When `data:` contains newlines, split it into multiple `data:` lines — one per line. This is how the SSE spec requires multiline data to be transmitted.
> For event names, see [E02. Use named events in SSE](e02-sse-event-names). For the client side, see [E04. Receive SSE on the client](e04-sse-client).

View File

@@ -0,0 +1,84 @@
---
title: "E02. Use Named Events in SSE"
order: 48
status: "draft"
---
SSE lets you send multiple kinds of events over the same stream. Give each one a name with the `event:` field, and the client can dispatch to a different handler per type. Great for things like "new message", "user joined", "user left" in a chat app.
## Send events with names
```cpp
auto send_event = [](httplib::DataSink &sink,
const std::string &event,
const std::string &data) {
std::string msg = "event: " + event + "\n"
+ "data: " + data + "\n\n";
sink.write(msg.data(), msg.size());
};
svr.Get("/chat/stream", [&](const httplib::Request &req, httplib::Response &res) {
res.set_chunked_content_provider(
"text/event-stream",
[&, send_event](size_t offset, httplib::DataSink &sink) {
send_event(sink, "message", "Hello!");
std::this_thread::sleep_for(std::chrono::seconds(2));
send_event(sink, "join", "alice");
std::this_thread::sleep_for(std::chrono::seconds(2));
send_event(sink, "leave", "bob");
std::this_thread::sleep_for(std::chrono::seconds(2));
return true;
});
});
```
A message is `event:``data:` → blank line. If you omit `event:`, the client treats it as a default `"message"` event.
## Attach IDs for reconnect
When you include an `id:` field, the client automatically sends it back as `Last-Event-ID` on reconnect, telling the server "here's how far I got."
```cpp
auto send_event = [](httplib::DataSink &sink,
const std::string &event,
const std::string &data,
const std::string &id) {
std::string msg = "id: " + id + "\n"
+ "event: " + event + "\n"
+ "data: " + data + "\n\n";
sink.write(msg.data(), msg.size());
};
send_event(sink, "message", "Hello!", "42");
```
The ID format is up to you. Monotonic counters or UUIDs both work — just pick something unique and orderable on the server side. See [E03. Handle SSE reconnection](e03-sse-reconnect) for details.
## JSON payloads in data
For structured data, the usual move is to put JSON in `data:`.
```cpp
nlohmann::json payload = {
{"user", "alice"},
{"text", "Hello!"},
};
send_event(sink, "message", payload.dump(), "42");
```
On the client, parse the incoming `data` as JSON to get the original object back.
## Data with newlines
If the data value contains newlines, split it across multiple `data:` lines.
```cpp
std::string msg = "data: line1\n"
"data: line2\n"
"data: line3\n\n";
sink.write(msg.data(), msg.size());
```
On the client side, these come back as a single `data` string with newlines.
> **Note:** Using `event:` makes client-side dispatch cleaner, but it also helps in the browser DevTools — events are easier to filter by type. That matters more than you'd expect while debugging.

View File

@@ -0,0 +1,85 @@
---
title: "E03. Handle SSE Reconnection"
order: 49
status: "draft"
---
SSE connections drop for all sorts of network reasons. Clients automatically try to reconnect, so it's a good idea to make your server resume from where it left off.
## Read `Last-Event-ID`
When the client reconnects, it sends the ID of the last event it received in the `Last-Event-ID` header. The server reads that and picks up from the next one.
```cpp
svr.Get("/events", [](const httplib::Request &req, httplib::Response &res) {
auto last_id = req.get_header_value("Last-Event-ID");
int start = last_id.empty() ? 0 : std::stoi(last_id) + 1;
res.set_chunked_content_provider(
"text/event-stream",
[start](size_t offset, httplib::DataSink &sink) mutable {
static int next_id = 0;
if (next_id < start) { next_id = start; }
std::string msg = "id: " + std::to_string(next_id) + "\n"
+ "data: event " + std::to_string(next_id) + "\n\n";
sink.write(msg.data(), msg.size());
++next_id;
std::this_thread::sleep_for(std::chrono::seconds(1));
return true;
});
});
```
On the first connect, `Last-Event-ID` is empty, so start from `0`. On reconnect, resume from the next ID. Event history is the server's responsibility — you need to keep recent events around somewhere.
## Set the reconnect interval
Sending a `retry:` field tells the client how long to wait before reconnecting, in milliseconds.
```cpp
std::string msg = "retry: 5000\n\n"; // reconnect after 5 seconds
sink.write(msg.data(), msg.size());
```
Usually you send this once at the start. During peak load or maintenance windows, a longer retry interval helps reduce reconnect storms.
## Buffer recent events
To support reconnection, keep a rolling buffer of recent events on the server.
```cpp
struct EventBuffer {
std::mutex mu;
std::deque<std::pair<int, std::string>> events; // {id, data}
int next_id = 0;
void push(const std::string &data) {
std::lock_guard<std::mutex> lock(mu);
events.push_back({next_id++, data});
if (events.size() > 1000) { events.pop_front(); }
}
std::vector<std::pair<int, std::string>> since(int id) {
std::lock_guard<std::mutex> lock(mu);
std::vector<std::pair<int, std::string>> out;
for (const auto &e : events) {
if (e.first >= id) { out.push_back(e); }
}
return out;
}
};
```
When a client reconnects, call `since(last_id)` to send any events it missed.
## How much to keep
The buffer size is a tradeoff between memory and how far back a client can resume. It depends on the use case:
- Real-time chat: a few minutes to half an hour
- Notifications: the last N items
- Trading data: persist to a database and pull from there
> **Warning:** `Last-Event-ID` is a client-provided value — don't trust it blindly. If you read it as a number, validate the range. If it's a string, sanitize it.

View File

@@ -0,0 +1,99 @@
---
title: "E04. Receive SSE on the Client"
order: 50
status: "draft"
---
cpp-httplib ships a dedicated `sse::SSEClient` class. It handles auto-reconnect, per-event-name dispatch, and `Last-Event-ID` tracking for you — so receiving SSE is painless.
## Basic usage
```cpp
#include <httplib.h>
httplib::Client cli("http://localhost:8080");
httplib::sse::SSEClient sse(cli, "/events");
sse.on_message([](const httplib::sse::SSEMessage &msg) {
std::cout << "data: " << msg.data << std::endl;
});
sse.start(); // blocking
```
Build an `SSEClient` with a `Client` and a path, register a callback with `on_message()`, and call `start()`. The event loop kicks in and automatically reconnects if the connection drops.
## Dispatch by event name
When the server sends events with an `event:` field, register a handler per name via `on_event()`.
```cpp
sse.on_event("message", [](const auto &msg) {
std::cout << "chat: " << msg.data << std::endl;
});
sse.on_event("join", [](const auto &msg) {
std::cout << msg.data << " joined" << std::endl;
});
sse.on_event("leave", [](const auto &msg) {
std::cout << msg.data << " left" << std::endl;
});
```
`on_message()` serves as a generic fallback for unnamed events (the default `message` type).
## Connection lifecycle and errors
```cpp
sse.on_open([] {
std::cout << "connected" << std::endl;
});
sse.on_error([](httplib::Error err) {
std::cerr << "error: " << httplib::to_string(err) << std::endl;
});
```
Hook into connection open and error events. Even when the error handler fires, `SSEClient` keeps trying to reconnect in the background.
## Run asynchronously
If you don't want to block the main thread, use `start_async()`.
```cpp
sse.start_async();
// main thread continues to do other things
do_other_work();
// when you're done, stop it
sse.stop();
```
`start_async()` spawns a background thread to run the event loop. Use `stop()` to shut it down cleanly.
## Configure reconnection
You can tune the reconnect interval and maximum retries.
```cpp
sse.set_reconnect_interval(5000); // 5 seconds
sse.set_max_reconnect_attempts(10); // up to 10 (0 = unlimited)
```
If the server sends a `retry:` field, that takes precedence.
## Automatic Last-Event-ID
`SSEClient` tracks the `id` of each received event internally and sends it back as `Last-Event-ID` on reconnect. As long as the server sends events with `id:`, this all works automatically.
```cpp
std::cout << "last id: " << sse.last_event_id() << std::endl;
```
Use `last_event_id()` to read the current value.
> **Note:** `SSEClient::start()` blocks, which is fine for a one-off command-line tool. For GUI apps or embedded in a server, the `start_async()` + `stop()` pair is the usual pattern.
> For the server side, see [E01. Implement an SSE server](e01-sse-server).

View File

@@ -1,95 +1,96 @@
---
title: "Cookbook"
order: 0
status: "draft"
---
A collection of recipes that answer "How do I...?" questions. Each recipe is self-contained — read only what you need.
A collection of recipes that answer "How do I...?" questions. Each recipe is self-contained — read only what you need. For an introduction to the basics, see the [Tour](../tour/).
## Client
### Basics
- Get the response body as a string / save to a file
- Send and receive JSON
- Set default headers (`set_default_headers`)
- Follow redirects (`set_follow_location`)
- [C01. Get the response body / save to a file](c01-get-response-body)
- [C02. Send and receive JSON](c02-json)
- [C03. Set default headers](c03-default-headers)
- [C04. Follow redirects](c04-follow-location)
### Authentication
- Use Basic authentication (`set_basic_auth`)
- Call an API with a Bearer token
- [C05. Use Basic authentication](c05-basic-auth)
- [C06. Call an API with a Bearer token](c06-bearer-token)
### File Upload
- Upload a file as multipart form data (`make_file_provider`)
- POST a file as raw binary (`make_file_body`)
- Send the body with chunked transfer (Content Provider)
- [C07. Upload a file as multipart form data](c07-multipart-upload)
- [C08. POST a file as raw binary](c08-post-file-body)
- [C09. Send the body with chunked transfer](c09-chunked-upload)
### Streaming & Progress
- Receive a response as a stream
- Use the progress callback (`DownloadProgress` / `UploadProgress`)
- [C10. Receive a response as a stream](c10-stream-response)
- [C11. Use the progress callback](c11-progress-callback)
### Connection & Performance
- Set timeouts (`set_connection_timeout` / `set_read_timeout`)
- Set an overall timeout (`set_max_timeout`)
- Understand connection reuse and Keep-Alive behavior
- Enable compression (`set_compress` / `set_decompress`)
- Send requests through a proxy (`set_proxy`)
- [C12. Set timeouts](c12-timeouts)
- [C13. Set an overall timeout](c13-max-timeout)
- [C14. Understand connection reuse and Keep-Alive behavior](c14-keep-alive)
- [C15. Enable compression](c15-compression)
- [C16. Send requests through a proxy](c16-proxy)
### Error Handling & Debugging
- Handle error codes (`Result::error()`)
- Handle SSL errors (`ssl_error()` / `ssl_backend_error()`)
- Set up client logging (`set_logger` / `set_error_logger`)
- [C17. Handle error codes](c17-error-codes)
- [C18. Handle SSL errors](c18-ssl-errors)
- [C19. Set up client logging](c19-client-logger)
## Server
### Basics
- Register GET / POST / PUT / DELETE handlers
- Receive JSON requests and return JSON responses
- Use path parameters (`/users/:id`)
- Set up a static file server (`set_mount_point`)
- [S01. Register GET / POST / PUT / DELETE handlers](s01-handlers)
- [S02. Receive JSON requests and return JSON responses](s02-json-api)
- [S03. Use path parameters](s03-path-params)
- [S04. Set up a static file server](s04-static-files)
### Streaming & Files
- Stream a large file in the response (`ContentProvider`)
- Return a file download response (`Content-Disposition`)
- Receive multipart data as a stream (`ContentReader`)
- Compress responses (gzip)
- [S05. Stream a large file in the response](s05-stream-response)
- [S06. Return a file download response](s06-download-response)
- [S07. Receive multipart data as a stream](s07-multipart-reader)
- [S08. Return a compressed response](s08-compress-response)
### Handler Chain
- Add pre-processing to all routes (Pre-routing handler)
- Add response headers with a Post-routing handler (CORS, etc.)
- Authenticate per route with a Pre-request handler (`matched_route`)
- Pass data between handlers with `res.user_data`
- [S09. Add pre-processing to all routes](s09-pre-routing)
- [S10. Add response headers with a post-routing handler](s10-post-routing)
- [S11. Authenticate per route with a pre-request handler](s11-pre-request)
- [S12. Pass data between handlers with `res.user_data`](s12-user-data)
### Error Handling & Debugging
- Return custom error pages (`set_error_handler`)
- Catch exceptions (`set_exception_handler`)
- Log requests (Logger)
- Detect client disconnection (`req.is_connection_closed()`)
- [S13. Return custom error pages](s13-error-handler)
- [S14. Catch exceptions](s14-exception-handler)
- [S15. Log requests](s15-server-logger)
- [S16. Detect client disconnection](s16-disconnect)
### Operations & Tuning
- Assign a port dynamically (`bind_to_any_port`)
- Control startup order with `listen_after_bind`
- Shut down gracefully (`stop()` and signal handling)
- Tune Keep-Alive (`set_keep_alive_max_count` / `set_keep_alive_timeout`)
- Configure the thread pool (`new_task_queue`)
- [S17. Bind to any available port](s17-bind-any-port)
- [S18. Control startup order with `listen_after_bind`](s18-listen-after-bind)
- [S19. Shut down gracefully](s19-graceful-shutdown)
- [S20. Tune Keep-Alive](s20-keep-alive)
- [S21. Configure the thread pool](s21-thread-pool)
- [S22. Talk over a Unix domain socket](s22-unix-socket)
## TLS / Security
- Choosing between OpenSSL, mbedTLS, and wolfSSL (build-time `#define` differences)
- Control SSL certificate verification (disable, custom CA, custom callback)
- Use a custom certificate verification callback (`set_server_certificate_verifier`)
- Set up an SSL/TLS server (certificate and private key)
- Configure mTLS (mutual TLS with client certificates)
- Access the peer certificate on the server (`req.peer_cert()` / SNI)
- [T01. Choosing between OpenSSL, mbedTLS, and wolfSSL](t01-tls-backends)
- [T02. Control SSL certificate verification](t02-cert-verification)
- [T03. Start an SSL/TLS server](t03-ssl-server)
- [T04. Configure mTLS](t04-mtls)
- [T05. Access the peer certificate on the server](t05-peer-cert)
## SSE
- Implement an SSE server
- Use event names to distinguish event types
- Handle reconnection (`Last-Event-ID`)
- Receive SSE events on the client
- [E01. Implement an SSE server](e01-sse-server)
- [E02. Use named events in SSE](e02-sse-event-names)
- [E03. Handle SSE reconnection](e03-sse-reconnect)
- [E04. Receive SSE on the client](e04-sse-client)
## WebSocket
- Implement a WebSocket echo server and client
- Configure heartbeats (`set_websocket_ping_interval` / `CPPHTTPLIB_WEBSOCKET_PING_INTERVAL_SECOND`)
- Handle connection close
- Send and receive binary frames
- [W01. Implement a WebSocket echo server and client](w01-websocket-echo)
- [W02. Set a WebSocket heartbeat](w02-websocket-ping)
- [W03. Handle connection close](w03-websocket-close)
- [W04. Send and receive binary frames](w04-websocket-binary)

View File

@@ -0,0 +1,66 @@
---
title: "S01. Register GET / POST / PUT / DELETE Handlers"
order: 20
status: "draft"
---
With `httplib::Server`, you register a handler per HTTP method. Just pass a pattern and a lambda to `Get()`, `Post()`, `Put()`, or `Delete()`.
## Basic usage
```cpp
#include <httplib.h>
int main() {
httplib::Server svr;
svr.Get("/hello", [](const httplib::Request &req, httplib::Response &res) {
res.set_content("Hello, World!", "text/plain");
});
svr.Post("/api/items", [](const httplib::Request &req, httplib::Response &res) {
// req.body holds the request body
res.status = 201;
res.set_content("Created", "text/plain");
});
svr.Put("/api/items/1", [](const httplib::Request &req, httplib::Response &res) {
res.set_content("Updated", "text/plain");
});
svr.Delete("/api/items/1", [](const httplib::Request &req, httplib::Response &res) {
res.status = 204;
});
svr.listen("0.0.0.0", 8080);
}
```
Handlers take `(const Request&, Response&)`. Use `res.set_content()` to set the body and Content-Type, and `res.status` for the status code. `listen()` starts the server and blocks.
## Read query parameters
```cpp
svr.Get("/search", [](const httplib::Request &req, httplib::Response &res) {
auto q = req.get_param_value("q");
auto limit = req.get_param_value("limit");
res.set_content("q=" + q + ", limit=" + limit, "text/plain");
});
```
`req.get_param_value()` pulls a value from the query string. Use `req.has_param("q")` if you want to check existence first.
## Read request headers
```cpp
svr.Get("/me", [](const httplib::Request &req, httplib::Response &res) {
auto ua = req.get_header_value("User-Agent");
res.set_content("UA: " + ua, "text/plain");
});
```
To add a response header, use `res.set_header("Name", "Value")`.
> **Note:** `listen()` is a blocking call. To run it on a different thread, wrap it in `std::thread`. If you need non-blocking startup, see [S18. Control startup order with `listen_after_bind`](s18-listen-after-bind).
> To use path parameters like `/users/:id`, see [S03. Use path parameters](s03-path-params).

View File

@@ -0,0 +1,74 @@
---
title: "S02. Receive a JSON Request and Return a JSON Response"
order: 21
status: "draft"
---
cpp-httplib doesn't include a JSON parser. On the server side, combine it with something like [nlohmann/json](https://github.com/nlohmann/json). The examples below use `nlohmann/json`.
## Receive and return JSON
```cpp
#include <httplib.h>
#include <nlohmann/json.hpp>
int main() {
httplib::Server svr;
svr.Post("/api/users", [](const httplib::Request &req, httplib::Response &res) {
try {
auto in = nlohmann::json::parse(req.body);
nlohmann::json out = {
{"id", 42},
{"name", in["name"]},
{"created_at", "2026-04-10T12:00:00Z"},
};
res.status = 201;
res.set_content(out.dump(), "application/json");
} catch (const std::exception &e) {
res.status = 400;
res.set_content("{\"error\":\"invalid json\"}", "application/json");
}
});
svr.listen("0.0.0.0", 8080);
}
```
`req.body` is a plain `std::string`, so you pass it straight to your JSON library. For the response, `dump()` to a string and set the Content-Type to `application/json`.
## Check the Content-Type
```cpp
svr.Post("/api/users", [](const httplib::Request &req, httplib::Response &res) {
auto content_type = req.get_header_value("Content-Type");
if (content_type.find("application/json") == std::string::npos) {
res.status = 415; // Unsupported Media Type
return;
}
// ...
});
```
When you strictly want JSON only, verify the Content-Type up front.
## A helper for JSON responses
If you're writing the same pattern repeatedly, a small helper saves typing.
```cpp
auto send_json = [](httplib::Response &res, int status, const nlohmann::json &j) {
res.status = status;
res.set_content(j.dump(), "application/json");
};
svr.Get("/api/health", [&](const auto &req, auto &res) {
send_json(res, 200, {{"status", "ok"}});
});
```
> **Note:** A large JSON body ends up entirely in `req.body`, which means it all sits in memory. For huge payloads, consider streaming reception — see [S07. Receive multipart data as a stream](s07-multipart-reader).
> For the client side, see [C02. Send and receive JSON](c02-json).

View File

@@ -0,0 +1,53 @@
---
title: "S03. Use Path Parameters"
order: 22
status: "draft"
---
For dynamic URLs like `/users/:id` — the staple of REST APIs — just put `:name` in the path pattern. The matched values end up in `req.path_params`.
## Basic usage
```cpp
svr.Get("/users/:id", [](const httplib::Request &req, httplib::Response &res) {
auto id = req.path_params.at("id");
res.set_content("user id: " + id, "text/plain");
});
```
A request to `/users/42` fills `req.path_params["id"]` with `"42"`. `path_params` is a `std::unordered_map<std::string, std::string>`, so use `at()` to read it.
## Multiple parameters
You can have as many as you need.
```cpp
svr.Get("/orgs/:org/repos/:repo", [](const httplib::Request &req, httplib::Response &res) {
auto org = req.path_params.at("org");
auto repo = req.path_params.at("repo");
res.set_content(org + "/" + repo, "text/plain");
});
```
This matches paths like `/orgs/anthropic/repos/cpp-httplib`.
## Regex patterns
For more flexible matching, use a `std::regex`-based pattern.
```cpp
svr.Get(R"(/users/(\d+))", [](const httplib::Request &req, httplib::Response &res) {
auto id = req.matches[1];
res.set_content("user id: " + std::string(id), "text/plain");
});
```
Parentheses in the pattern become captures in `req.matches`. `req.matches[0]` is the full match; `req.matches[1]` onward are the captures.
## Which to use
- For plain IDs or slugs, `:name` is enough — readable, and the shape is obvious
- Use regex when you want to constrain the URL to, say, numbers only or a UUID format
- Mixing both can get confusing — stick with one style per project
> **Note:** Path parameters come in as strings. If you need an integer, convert with `std::stoi()` and don't forget to handle conversion errors.

View File

@@ -0,0 +1,55 @@
---
title: "S04. Serve Static Files"
order: 23
status: "draft"
---
To serve static files like HTML, CSS, and images, use `set_mount_point()`. Just map a URL path to a local directory, and the whole directory becomes accessible.
## Basic usage
```cpp
httplib::Server svr;
svr.set_mount_point("/", "./public");
svr.listen("0.0.0.0", 8080);
```
`./public/index.html` is now reachable at `http://localhost:8080/index.html`, and `./public/css/style.css` at `http://localhost:8080/css/style.css`. The directory layout maps directly to URLs.
## Multiple mount points
You can register more than one mount point.
```cpp
svr.set_mount_point("/", "./public");
svr.set_mount_point("/assets", "./dist/assets");
svr.set_mount_point("/uploads", "./var/uploads");
```
You can even mount multiple directories at the same path — they're searched in registration order, and the first hit wins.
## Combine with API handlers
Static files and API handlers coexist happily. Handlers registered with `Get()` and friends take priority; the mount points are searched only when nothing matches.
```cpp
svr.Get("/api/users", [](const auto &req, auto &res) {
res.set_content("[]", "application/json");
});
svr.set_mount_point("/", "./public");
```
This gives you an SPA-friendly setup: `/api/*` hits the handlers, everything else is served from `./public/`.
## Add MIME types
cpp-httplib ships with a built-in extension-to-Content-Type map, but you can add your own.
```cpp
svr.set_file_extension_and_mimetype_mapping("wasm", "application/wasm");
```
> **Warning:** The static file server methods are **not thread-safe**. Don't call them after `listen()` — configure everything before starting the server.
> For download-style responses, see [S06. Return a file download response](s06-download-response).

View File

@@ -0,0 +1,63 @@
---
title: "S05. Stream a Large File in the Response"
order: 24
status: "draft"
---
When the response is a huge file or data generated on the fly, loading the whole thing into memory isn't realistic. Use `Response::set_content_provider()` to produce data in chunks as you send it.
## When the size is known
```cpp
svr.Get("/download", [](const httplib::Request &req, httplib::Response &res) {
size_t total_size = get_file_size("large.bin");
res.set_content_provider(
total_size, "application/octet-stream",
[](size_t offset, size_t length, httplib::DataSink &sink) {
auto data = read_range_from_file("large.bin", offset, length);
sink.write(data.data(), data.size());
return true;
});
});
```
The lambda is called repeatedly with `offset` and `length`. Read just that range and write it to `sink`. Only a small chunk sits in memory at any given time.
## Just send a file
If you only want to serve a file, `set_file_content()` is far simpler.
```cpp
svr.Get("/download", [](const httplib::Request &req, httplib::Response &res) {
res.set_file_content("large.bin", "application/octet-stream");
});
```
It streams internally, so even huge files are safe. Omit the Content-Type and it's guessed from the extension.
## When the size is unknown — chunked transfer
For data generated on the fly, where you don't know the total size up front, use `set_chunked_content_provider()`. It's sent with HTTP chunked transfer encoding.
```cpp
svr.Get("/events", [](const httplib::Request &req, httplib::Response &res) {
res.set_chunked_content_provider(
"text/plain",
[](size_t offset, httplib::DataSink &sink) {
auto chunk = produce_next_chunk();
if (chunk.empty()) {
sink.done(); // done sending
return true;
}
sink.write(chunk.data(), chunk.size());
return true;
});
});
```
Call `sink.done()` to signal the end.
> **Note:** The provider lambda is called multiple times. Watch out for the lifetime of captured variables — wrap them in a `std::shared_ptr` if needed.
> To serve the file as a download, see [S06. Return a file download response](s06-download-response).

View File

@@ -0,0 +1,50 @@
---
title: "S06. Return a File Download Response"
order: 25
status: "draft"
---
To force a browser to show a **download dialog** instead of rendering inline, send a `Content-Disposition` header. There's no special cpp-httplib API for this — it's just a header.
## Basic usage
```cpp
svr.Get("/download/report", [](const httplib::Request &req, httplib::Response &res) {
res.set_header("Content-Disposition", "attachment; filename=\"report.pdf\"");
res.set_file_content("reports/2026-04.pdf", "application/pdf");
});
```
`Content-Disposition: attachment` makes the browser pop up a "Save As" dialog. The `filename=` parameter becomes the default save name.
## Non-ASCII file names
For file names with non-ASCII characters or spaces, use the RFC 5987 `filename*` form.
```cpp
svr.Get("/download/report", [](const httplib::Request &req, httplib::Response &res) {
res.set_header(
"Content-Disposition",
"attachment; filename=\"report.pdf\"; "
"filename*=UTF-8''%E3%83%AC%E3%83%9D%E3%83%BC%E3%83%88.pdf");
res.set_file_content("reports/2026-04.pdf", "application/pdf");
});
```
The part after `filename*=UTF-8''` is URL-encoded UTF-8. Keep the ASCII `filename=` too, as a fallback for older browsers.
## Download dynamically generated data
You don't need a real file — you can serve a generated string as a download directly.
```cpp
svr.Get("/export.csv", [](const httplib::Request &req, httplib::Response &res) {
std::string csv = build_csv();
res.set_header("Content-Disposition", "attachment; filename=\"export.csv\"");
res.set_content(csv, "text/csv");
});
```
This is the classic pattern for CSV exports.
> **Note:** Some browsers will trigger a download based on Content-Type alone, even without `Content-Disposition`. Conversely, setting `inline` tries to render the content in the browser when possible.

View File

@@ -0,0 +1,71 @@
---
title: "S07. Receive Multipart Data as a Stream"
order: 26
status: "draft"
---
A naive upload handler puts the whole request into `req.body`, which blows up memory for large files. Use `HandlerWithContentReader` to receive the body chunk by chunk.
## Basic usage
```cpp
svr.Post("/upload",
[](const httplib::Request &req, httplib::Response &res,
const httplib::ContentReader &content_reader) {
if (req.is_multipart_form_data()) {
content_reader(
// headers of each part
[&](const httplib::FormData &file) {
std::cout << "name: " << file.name
<< ", filename: " << file.filename << std::endl;
return true;
},
// body of each part (called multiple times)
[&](const char *data, size_t len) {
// write to disk here, for example
return true;
});
} else {
// plain request body
content_reader([&](const char *data, size_t len) {
return true;
});
}
res.set_content("ok", "text/plain");
});
```
The `content_reader` has two call shapes. For multipart data, pass two callbacks (one for headers, one for body). For plain bodies, pass just one.
## Write directly to disk
Here's how to stream an uploaded file to disk.
```cpp
svr.Post("/upload",
[](const httplib::Request &req, httplib::Response &res,
const httplib::ContentReader &content_reader) {
std::ofstream ofs;
content_reader(
[&](const httplib::FormData &file) {
if (!file.filename.empty()) {
ofs.open("uploads/" + file.filename, std::ios::binary);
}
return static_cast<bool>(ofs);
},
[&](const char *data, size_t len) {
ofs.write(data, len);
return static_cast<bool>(ofs);
});
res.set_content("uploaded", "text/plain");
});
```
Only a small chunk sits in memory at any moment, so gigabyte-scale files are no problem.
> **Warning:** When you use `HandlerWithContentReader`, `req.body` stays **empty**. Handle the body yourself inside the callbacks.
> For the client side of multipart uploads, see [C07. Upload a file as multipart form data](c07-multipart-upload).

View File

@@ -0,0 +1,53 @@
---
title: "S08. Return a Compressed Response"
order: 27
status: "draft"
---
cpp-httplib automatically compresses response bodies when the client indicates support via `Accept-Encoding`. The handler doesn't need to do anything special. Supported encodings are gzip, Brotli, and Zstd.
## Build-time setup
To enable compression, define the relevant macros before including `httplib.h`:
```cpp
#define CPPHTTPLIB_ZLIB_SUPPORT // gzip
#define CPPHTTPLIB_BROTLI_SUPPORT // brotli
#define CPPHTTPLIB_ZSTD_SUPPORT // zstd
#include <httplib.h>
```
You'll also need to link `zlib`, `brotli`, and `zstd` respectively. Enable only what you need.
## Usage
```cpp
svr.Get("/api/data", [](const httplib::Request &req, httplib::Response &res) {
std::string body = build_large_response();
res.set_content(body, "application/json");
});
```
That's it. If the client sent `Accept-Encoding: gzip`, cpp-httplib compresses the response with gzip automatically. `Content-Encoding: gzip` and `Vary: Accept-Encoding` are added for you.
## Encoding priority
When the client accepts multiple encodings, cpp-httplib picks in this order (among those enabled at build time): Brotli → Zstd → gzip. Your code doesn't need to care — you always get the most efficient option available.
## Streaming responses are compressed too
Streaming responses via `set_chunked_content_provider()` get the same automatic compression.
```cpp
svr.Get("/events", [](const httplib::Request &req, httplib::Response &res) {
res.set_chunked_content_provider(
"text/plain",
[](size_t offset, httplib::DataSink &sink) {
// ...
});
});
```
> **Note:** Tiny responses barely benefit from compression and just waste CPU time. cpp-httplib skips compression for bodies that are too small to bother with.
> For the client-side counterpart, see [C15. Enable compression](c15-compression).

View File

@@ -0,0 +1,54 @@
---
title: "S09. Add Pre-Processing to All Routes"
order: 28
status: "draft"
---
Sometimes you want the same logic to run before every request — auth checks, logging, rate limiting. Register those with `set_pre_routing_handler()`.
## Basic usage
```cpp
svr.set_pre_routing_handler(
[](const httplib::Request &req, httplib::Response &res) {
std::cout << req.method << " " << req.path << std::endl;
return httplib::Server::HandlerResponse::Unhandled;
});
```
The pre-routing handler runs **before routing**. It catches every request — including ones that don't match any handler.
The `HandlerResponse` return value is key:
- Return `Unhandled` → continue normally (routing and the actual handler run)
- Return `Handled` → the response is considered complete, skip the rest
## Use it for authentication
Put your shared auth check in one place.
```cpp
svr.set_pre_routing_handler(
[](const httplib::Request &req, httplib::Response &res) {
if (req.path.rfind("/public", 0) == 0) {
return httplib::Server::HandlerResponse::Unhandled; // no auth needed
}
auto auth = req.get_header_value("Authorization");
if (auth.empty()) {
res.status = 401;
res.set_content("unauthorized", "text/plain");
return httplib::Server::HandlerResponse::Handled;
}
return httplib::Server::HandlerResponse::Unhandled;
});
```
If auth fails, return `Handled` to respond with 401 immediately. If it passes, return `Unhandled` and let routing take over.
## For per-route auth
If you want different auth rules per route rather than a single global check, `set_pre_request_handler()` is a better fit. See [S11. Authenticate per route with a pre-request handler](s11-pre-request).
> **Note:** If all you want is to modify the response, `set_post_routing_handler()` is the right tool. See [S10. Add response headers with a post-routing handler](s10-post-routing).

View File

@@ -0,0 +1,56 @@
---
title: "S10. Add Response Headers with a Post-Routing Handler"
order: 29
status: "draft"
---
Sometimes you want to add shared headers to the response after the handler has run — CORS headers, security headers, a request ID, and so on. That's what `set_post_routing_handler()` is for.
## Basic usage
```cpp
svr.set_post_routing_handler(
[](const httplib::Request &req, httplib::Response &res) {
res.set_header("X-Request-ID", generate_request_id());
});
```
The post-routing handler runs **after the route handler, before the response is sent**. From here you can call `res.set_header()` or `res.headers.erase()` to add or remove headers across every response in one place.
## Add CORS headers
CORS is a classic use case.
```cpp
svr.set_post_routing_handler(
[](const httplib::Request &req, httplib::Response &res) {
res.set_header("Access-Control-Allow-Origin", "*");
res.set_header("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE, OPTIONS");
res.set_header("Access-Control-Allow-Headers", "Content-Type, Authorization");
});
```
For the preflight `OPTIONS` requests, register a separate handler — or handle them in the pre-routing handler.
```cpp
svr.Options("/.*", [](const auto &req, auto &res) {
res.status = 204;
});
```
## Bundle your security headers
Manage browser security headers in one spot.
```cpp
svr.set_post_routing_handler(
[](const httplib::Request &req, httplib::Response &res) {
res.set_header("X-Content-Type-Options", "nosniff");
res.set_header("X-Frame-Options", "DENY");
res.set_header("Referrer-Policy", "strict-origin-when-cross-origin");
});
```
No matter which handler produced the response, the same headers get attached.
> **Note:** The post-routing handler also runs for responses that didn't match any route and for responses from error handlers. That's exactly what you want when you need certain headers on every response, guaranteed.

View File

@@ -0,0 +1,47 @@
---
title: "S11. Authenticate Per Route with a Pre-Request Handler"
order: 30
status: "draft"
---
The `set_pre_routing_handler()` from [S09. Add pre-processing to all routes](s09-pre-routing) runs **before routing**, so it has no idea which route matched. When you want per-route behavior, `set_pre_request_handler()` is what you need.
## Pre-routing vs. pre-request
| Hook | When it runs | Route info |
| --- | --- | --- |
| `set_pre_routing_handler` | Before routing | Not available |
| `set_pre_request_handler` | After routing, right before the route handler | Available via `req.matched_route` |
In a pre-request handler, `req.matched_route` holds the **pattern string** that matched. You can vary behavior based on the route definition itself.
## Switch auth per route
```cpp
svr.set_pre_request_handler(
[](const httplib::Request &req, httplib::Response &res) {
// require auth for routes starting with /admin
if (req.matched_route.rfind("/admin", 0) == 0) {
auto token = req.get_header_value("Authorization");
if (!is_admin_token(token)) {
res.status = 403;
res.set_content("forbidden", "text/plain");
return httplib::Server::HandlerResponse::Handled;
}
}
return httplib::Server::HandlerResponse::Unhandled;
});
```
`matched_route` is the pattern **before** path parameters are expanded (e.g. `/admin/users/:id`). You compare against the route definition, not the actual request path, so IDs or names don't throw you off.
## Return values
Same as pre-routing — return `HandlerResponse`.
- `Unhandled`: continue (the route handler runs)
- `Handled`: we're done, skip the route handler
## Passing auth info to the route handler
To pass decoded user info into the route handler, use `res.user_data`. See [S12. Pass data between handlers with `res.user_data`](s12-user-data).

View File

@@ -0,0 +1,56 @@
---
title: "S12. Pass Data Between Handlers with res.user_data"
order: 31
status: "draft"
---
Say your pre-request handler decodes an auth token and you want the route handler to use the result. That "data handoff between handlers" is what `res.user_data` is for — it holds values of arbitrary types.
## Basic usage
```cpp
struct AuthUser {
std::string id;
std::string name;
bool is_admin;
};
svr.set_pre_request_handler(
[](const httplib::Request &req, httplib::Response &res) {
auto token = req.get_header_value("Authorization");
auto user = decode_token(token); // decode the auth token
res.user_data.set("user", user);
return httplib::Server::HandlerResponse::Unhandled;
});
svr.Get("/me", [](const httplib::Request &req, httplib::Response &res) {
auto *user = res.user_data.get<AuthUser>("user");
if (!user) {
res.status = 401;
return;
}
res.set_content("Hello, " + user->name, "text/plain");
});
```
`user_data.set()` stores a value of any type, and `user_data.get<T>()` retrieves it. If you give the wrong type you get `nullptr` back — so be careful.
## Typical value types
Strings, numbers, structs, `std::shared_ptr` — anything copyable or movable works.
```cpp
res.user_data.set("user_id", std::string{"42"});
res.user_data.set("is_admin", true);
res.user_data.set("started_at", std::chrono::steady_clock::now());
```
## Where to set, where to read
The usual flow is: set it in `set_pre_routing_handler()` or `set_pre_request_handler()`, read it in the route handler. Pre-request runs after routing, so you can combine it with `req.matched_route` to set values only for specific routes.
## A gotcha
`user_data` lives on `Response`, not `Request`. That's because handlers get `Response&` (mutable) but only `const Request&`. It looks odd at first, but it makes sense once you think of it as "the mutable context shared between handlers."
> **Warning:** `user_data.get<T>()` returns `nullptr` when the type doesn't match. Use the exact same type on set and get. Storing as `AuthUser` and fetching as `const AuthUser` won't work.

View File

@@ -0,0 +1,51 @@
---
title: "S13. Return a Custom Error Page"
order: 32
status: "draft"
---
To customize the response for 4xx or 5xx errors, use `set_error_handler()`. You can replace the plain default error page with your own HTML or JSON.
## Basic usage
```cpp
svr.set_error_handler([](const httplib::Request &req, httplib::Response &res) {
auto body = "<h1>Error " + std::to_string(res.status) + "</h1>";
res.set_content(body, "text/html");
});
```
The error handler runs right before an error response is sent — any time `res.status` is 4xx or 5xx. Replace the body with `res.set_content()` and every error response uses the same template.
## Branch by status code
```cpp
svr.set_error_handler([](const httplib::Request &req, httplib::Response &res) {
if (res.status == 404) {
res.set_content("<h1>Not Found</h1><p>" + req.path + "</p>", "text/html");
} else if (res.status >= 500) {
res.set_content("<h1>Server Error</h1>", "text/html");
}
});
```
Checking `res.status` lets you show a custom message for 404s and a "contact support" link for 5xx errors.
## JSON error responses
For an API server, you probably want errors as JSON.
```cpp
svr.set_error_handler([](const httplib::Request &req, httplib::Response &res) {
nlohmann::json j = {
{"error", true},
{"status", res.status},
{"path", req.path},
};
res.set_content(j.dump(), "application/json");
});
```
Now every error comes back in a consistent JSON shape.
> **Note:** `set_error_handler()` also fires for 500 responses caused by exceptions thrown from a route handler. To get at the exception itself, combine it with `set_exception_handler()`. See [S14. Catch exceptions](s14-exception-handler).

View File

@@ -0,0 +1,67 @@
---
title: "S14. Catch Exceptions"
order: 33
status: "draft"
---
When a route handler throws, cpp-httplib keeps the server running and responds with 500. By default, though, very little of the error information reaches the client. `set_exception_handler()` lets you intercept exceptions and build your own response.
## Basic usage
```cpp
svr.set_exception_handler(
[](const httplib::Request &req, httplib::Response &res,
std::exception_ptr ep) {
try {
std::rethrow_exception(ep);
} catch (const std::exception &e) {
res.status = 500;
res.set_content(std::string("error: ") + e.what(), "text/plain");
} catch (...) {
res.status = 500;
res.set_content("unknown error", "text/plain");
}
});
```
The handler receives a `std::exception_ptr`. The idiomatic move is to rethrow it with `std::rethrow_exception()` and catch by type. You can vary status code and message based on the exception type.
## Branch on custom exception types
If you throw your own exception types, you can map them to 400 or 404 responses.
```cpp
struct NotFound : std::runtime_error {
using std::runtime_error::runtime_error;
};
struct BadRequest : std::runtime_error {
using std::runtime_error::runtime_error;
};
svr.set_exception_handler(
[](const auto &req, auto &res, std::exception_ptr ep) {
try {
std::rethrow_exception(ep);
} catch (const NotFound &e) {
res.status = 404;
res.set_content(e.what(), "text/plain");
} catch (const BadRequest &e) {
res.status = 400;
res.set_content(e.what(), "text/plain");
} catch (const std::exception &e) {
res.status = 500;
res.set_content("internal error", "text/plain");
}
});
```
Now throwing `NotFound("user not found")` inside a handler is enough to return 404. No per-handler try/catch needed.
## Relationship with set_error_handler
`set_exception_handler()` runs the moment the exception is thrown. After that, if `res.status` is 4xx or 5xx, `set_error_handler()` also runs. The order is `exception_handler``error_handler`. Think of their roles as:
- **Exception handler**: interpret the exception, set the status and message
- **Error handler**: see the status and wrap it in the shared template
> **Note:** Without an exception handler, cpp-httplib returns a default 500 response and the exception details never make it to logs. Always set one for anything you want to debug.

View File

@@ -0,0 +1,64 @@
---
title: "S15. Log Requests on the Server"
order: 34
status: "draft"
---
To log the requests the server receives and the responses it returns, use `Server::set_logger()`. The callback fires once per completed request, making it the foundation for access logs and metrics collection.
## Basic usage
```cpp
svr.set_logger([](const httplib::Request &req, const httplib::Response &res) {
std::cout << req.remote_addr << " "
<< req.method << " " << req.path
<< " -> " << res.status << std::endl;
});
```
The log callback receives both the `Request` and the `Response`. You can grab the method, path, status, client IP, headers, body — whatever you need.
## Access-log style format
Here's an Apache/Nginx-ish access log format.
```cpp
svr.set_logger([](const auto &req, const auto &res) {
auto now = std::time(nullptr);
char timebuf[32];
std::strftime(timebuf, sizeof(timebuf), "%Y-%m-%d %H:%M:%S",
std::localtime(&now));
std::cout << timebuf << " "
<< req.remote_addr << " "
<< "\"" << req.method << " " << req.path << "\" "
<< res.status << " "
<< res.body.size() << "B"
<< std::endl;
});
```
## Measure request time
To include request duration in the log, stash a start timestamp in `res.user_data` from a pre-routing handler, then subtract in the logger.
```cpp
svr.set_pre_routing_handler([](const auto &req, auto &res) {
res.user_data.set("start", std::chrono::steady_clock::now());
return httplib::Server::HandlerResponse::Unhandled;
});
svr.set_logger([](const auto &req, const auto &res) {
auto *start = res.user_data.get<std::chrono::steady_clock::time_point>("start");
auto elapsed = start
? std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::steady_clock::now() - *start).count()
: 0;
std::cout << req.method << " " << req.path
<< " " << res.status << " " << elapsed << "ms" << std::endl;
});
```
For more on `user_data`, see [S12. Pass data between handlers with `res.user_data`](s12-user-data).
> **Note:** The logger runs synchronously on the same thread as request processing. Heavy work inside it hurts throughput — push it to a queue and process asynchronously if you need anything expensive.

View File

@@ -0,0 +1,55 @@
---
title: "S16. Detect When the Client Has Disconnected"
order: 35
status: "draft"
---
During a long-running response, the client might close the connection. There's no point continuing to do work no one's waiting for. In cpp-httplib, check `req.is_connection_closed()`.
## Basic usage
```cpp
svr.Get("/long-task", [](const httplib::Request &req, httplib::Response &res) {
for (int i = 0; i < 1000; ++i) {
if (req.is_connection_closed()) {
std::cout << "client disconnected" << std::endl;
return;
}
do_heavy_work(i);
}
res.set_content("done", "text/plain");
});
```
`is_connection_closed` is a `std::function<bool()>`, so call it with `()`. It returns `true` when the client is gone.
## With a streaming response
The same check works inside `set_chunked_content_provider()`. Capture the request by reference.
```cpp
svr.Get("/events", [](const httplib::Request &req, httplib::Response &res) {
res.set_chunked_content_provider(
"text/event-stream",
[&req](size_t offset, httplib::DataSink &sink) {
if (req.is_connection_closed()) {
sink.done();
return true;
}
auto event = generate_next_event();
sink.write(event.data(), event.size());
return true;
});
});
```
When you detect a disconnect, call `sink.done()` to stop the provider from being called again.
## How often should you check?
The call itself is cheap, but calling it in a tight inner loop doesn't add much value. Check at **boundaries where interrupting is safe** — after producing a chunk, after a database query, etc.
> **Warning:** `is_connection_closed()` is not guaranteed to reflect reality instantly. Because of how TCP works, sometimes you only notice the disconnect when you try to send. Don't expect pixel-perfect real-time detection — think of it as "we'll notice eventually."

View File

@@ -0,0 +1,52 @@
---
title: "S17. Bind to Any Available Port"
order: 36
status: "draft"
---
Standing up a test server often hits port conflicts. With `bind_to_any_port()`, you let the OS pick a free port and then read back which one it gave you.
## Basic usage
```cpp
httplib::Server svr;
svr.Get("/", [](const auto &req, auto &res) {
res.set_content("hello", "text/plain");
});
int port = svr.bind_to_any_port("0.0.0.0");
std::cout << "listening on port " << port << std::endl;
svr.listen_after_bind();
```
`bind_to_any_port()` is equivalent to passing `0` as the port — the OS assigns a free one. The return value is the port actually used.
After that, call `listen_after_bind()` to start accepting. You can't combine bind and listen into a single call here, so you work in two steps.
## Useful in tests
This pattern is great for tests that spin up a server and hit it.
```cpp
httplib::Server svr;
svr.Get("/ping", [](const auto &, auto &res) { res.set_content("pong", "text/plain"); });
int port = svr.bind_to_any_port("127.0.0.1");
std::thread t([&] { svr.listen_after_bind(); });
// run the test while the server is up on another thread
httplib::Client cli("127.0.0.1", port);
auto res = cli.Get("/ping");
assert(res && res->body == "pong");
svr.stop();
t.join();
```
Because the port is assigned at runtime, parallel test runs don't collide.
> **Note:** `bind_to_any_port()` returns `-1` on failure (permission errors, no available ports, etc.). Always check the return value.
> To stop the server, see [S19. Shut down gracefully](s19-graceful-shutdown).

View File

@@ -0,0 +1,57 @@
---
title: "S18. Control Startup Order with listen_after_bind"
order: 37
status: "draft"
---
Normally `svr.listen("0.0.0.0", 8080)` handles bind and listen in one shot. When you need to do something between the two, split them into two calls.
## Separate bind and listen
```cpp
httplib::Server svr;
svr.Get("/", [](const auto &, auto &res) { res.set_content("ok", "text/plain"); });
if (!svr.bind_to_port("0.0.0.0", 8080)) {
std::cerr << "bind failed" << std::endl;
return 1;
}
// bind is done here. accept hasn't started yet.
drop_privileges();
signal_ready_to_parent_process();
svr.listen_after_bind(); // start the accept loop
```
`bind_to_port()` reserves the port; `listen_after_bind()` actually starts accepting. Splitting them gives you a window between the two steps.
## Common use cases
**Privilege drop**: Binding to a port under 1024 requires root. Bind as root, drop to a normal user, and all subsequent request handling runs with reduced privileges.
```cpp
svr.bind_to_port("0.0.0.0", 80);
drop_privileges();
svr.listen_after_bind();
```
**Startup notification**: Tell the parent process or systemd "I'm ready" before starting to accept connections.
**Test synchronization**: In tests, you can reliably catch "the moment the server is bound" and start the client after that.
## Check the return values
`bind_to_port()` returns `false` on failure — typically when the port is already taken. Always check it.
```cpp
if (!svr.bind_to_port("0.0.0.0", 8080)) {
std::cerr << "port already in use" << std::endl;
return 1;
}
```
`listen_after_bind()` blocks until the server stops and returns `true` on a clean shutdown.
> **Note:** To auto-pick a free port, see [S17. Bind to any available port](s17-bind-any-port). Under the hood, that's just `bind_to_any_port()` + `listen_after_bind()`.

View File

@@ -0,0 +1,57 @@
---
title: "S19. Shut Down Gracefully"
order: 38
status: "draft"
---
To stop the server, call `Server::stop()`. It's safe to call even while requests are in flight, so you can wire it to SIGINT or SIGTERM for a graceful shutdown.
## Basic usage
```cpp
httplib::Server svr;
svr.Get("/", [](const auto &, auto &res) { res.set_content("ok", "text/plain"); });
std::thread t([&] { svr.listen("0.0.0.0", 8080); });
// wait for input on the main thread, or whatever
std::cin.get();
svr.stop();
t.join();
```
`listen()` blocks, so the typical pattern is: run the server on a background thread and call `stop()` from the main thread. After `stop()`, `listen()` returns and you can `join()`.
## Shut down on a signal
Here's how to stop the server on SIGINT (Ctrl+C) or SIGTERM.
```cpp
#include <csignal>
httplib::Server svr;
// global so the signal handler can reach it
httplib::Server *g_svr = nullptr;
int main() {
svr.Get("/", [](const auto &, auto &res) { res.set_content("ok", "text/plain"); });
g_svr = &svr;
std::signal(SIGINT, [](int) { if (g_svr) g_svr->stop(); });
std::signal(SIGTERM, [](int) { if (g_svr) g_svr->stop(); });
svr.listen("0.0.0.0", 8080);
std::cout << "server stopped" << std::endl;
}
```
`stop()` is thread-safe and signal-safe — you can call it from a signal handler. Even when `listen()` is running on the main thread, the signal pulls it out cleanly.
## What happens to in-flight requests
When you call `stop()`, new connections are refused, but requests already being processed are **allowed to finish**. Once all workers drain, `listen()` returns. That's what makes it graceful.
> **Warning:** There's a wait between calling `stop()` and `listen()` returning — it's the time in-flight requests take to finish. To enforce a timeout, you'll need to add your own shutdown timer in application code.

View File

@@ -0,0 +1,57 @@
---
title: "S20. Tune Keep-Alive"
order: 39
status: "draft"
---
`httplib::Server` enables HTTP/1.1 Keep-Alive automatically. From the client's perspective, connections are reused — so they don't pay the TCP handshake cost on every request. When you need to tune the behavior, there are two setters.
## What you can configure
| API | Default | Meaning |
| --- | --- | --- |
| `set_keep_alive_max_count` | 100 | Max requests served over a single connection |
| `set_keep_alive_timeout` | 5s | How long an idle connection is kept before closing |
## Basic usage
```cpp
httplib::Server svr;
svr.set_keep_alive_max_count(20);
svr.set_keep_alive_timeout(10); // 10 seconds
svr.listen("0.0.0.0", 8080);
```
`set_keep_alive_timeout()` also has a `std::chrono` overload.
```cpp
using namespace std::chrono_literals;
svr.set_keep_alive_timeout(10s);
```
## Tuning ideas
**Too many idle connections eating resources**
Shorten the timeout so idle connections drop and release their worker threads.
```cpp
svr.set_keep_alive_timeout(2s);
```
**API is hammered and you want max reuse**
Raising the per-connection request cap improves benchmark numbers.
```cpp
svr.set_keep_alive_max_count(1000);
```
**Never reuse connections**
Set `set_keep_alive_max_count(1)` and every request gets its own connection. Mostly only useful for debugging or compatibility testing.
## Relationship with the thread pool
A Keep-Alive connection holds a worker thread for its entire lifetime. If `connections × concurrent requests` exceeds the thread pool size, new requests wait. For thread counts, see [S21. Configure the thread pool](s21-thread-pool).
> **Note:** For the client side, see [C14. Understand connection reuse and Keep-Alive behavior](c14-keep-alive). Even when the server closes the connection on timeout, the client reconnects automatically.

View File

@@ -0,0 +1,69 @@
---
title: "S21. Configure the Thread Pool"
order: 40
status: "draft"
---
cpp-httplib serves requests from a thread pool. By default, the base thread count is the greater of `std::thread::hardware_concurrency() - 1` and `8`, and it can scale up dynamically to 4× that. To set thread counts explicitly, provide your own factory via `new_task_queue`.
## Set thread counts
```cpp
httplib::Server svr;
svr.new_task_queue = [] {
return new httplib::ThreadPool(/*base_threads=*/8, /*max_threads=*/64);
};
svr.listen("0.0.0.0", 8080);
```
The factory is a lambda returning a `TaskQueue*`. Pass `base_threads` and `max_threads` to `ThreadPool` and the pool scales between them based on load. Idle threads exit after a timeout (3 seconds by default).
## Also cap the queue
The pending queue can eat memory if it grows unchecked. You can cap it too.
```cpp
svr.new_task_queue = [] {
return new httplib::ThreadPool(
/*base_threads=*/12,
/*max_threads=*/0, // disable dynamic scaling
/*max_queued_requests=*/18);
};
```
`max_threads=0` disables dynamic scaling — you get a fixed `base_threads`. Requests that don't fit in `max_queued_requests` are rejected.
## Use your own thread pool
You can plug in a fully custom thread pool by subclassing `TaskQueue` and returning it from the factory.
```cpp
class MyTaskQueue : public httplib::TaskQueue {
public:
MyTaskQueue(size_t n) { pool_.start_with_thread_count(n); }
bool enqueue(std::function<void()> fn) override { return pool_.post(std::move(fn)); }
void shutdown() override { pool_.shutdown(); }
private:
MyThreadPool pool_;
};
svr.new_task_queue = [] { return new MyTaskQueue(12); };
```
Handy when you already have a thread pool in your project and want to keep thread management unified.
## Compile-time tuning
You can set the defaults with macros if you want compile-time configuration.
```cpp
#define CPPHTTPLIB_THREAD_POOL_COUNT 16 // base thread count
#define CPPHTTPLIB_THREAD_POOL_MAX_COUNT 128 // max thread count
#define CPPHTTPLIB_THREAD_POOL_IDLE_TIMEOUT 5 // seconds before idle threads exit
#include <httplib.h>
```
> **Note:** A WebSocket connection holds a worker thread for its entire lifetime. For lots of simultaneous WebSocket connections, enable dynamic scaling (e.g. `ThreadPool(8, 64)`).

View File

@@ -0,0 +1,64 @@
---
title: "S22. Talk Over a Unix Domain Socket"
order: 41
status: "draft"
---
When you want to talk only to other processes on the same host, a Unix domain socket is a nice fit. It avoids TCP overhead and uses filesystem permissions for access control. Local IPC and services sitting behind a reverse proxy are classic use cases.
## Server side
```cpp
httplib::Server svr;
svr.set_address_family(AF_UNIX);
svr.Get("/", [](const auto &, auto &res) {
res.set_content("hello from unix socket", "text/plain");
});
svr.listen("/tmp/httplib.sock", 80);
```
Call `set_address_family(AF_UNIX)` first, then pass the socket file path as the first argument to `listen()`. The port number is unused but required by the signature — pass any value.
## Client side
```cpp
httplib::Client cli("/tmp/httplib.sock");
cli.set_address_family(AF_UNIX);
auto res = cli.Get("/");
if (res) {
std::cout << res->body << std::endl;
}
```
Pass the socket file path to the `Client` constructor and call `set_address_family(AF_UNIX)`. Everything else works like a normal HTTP request.
## When to use it
- **Behind a reverse proxy**: An nginx-to-backend setup over a Unix socket is faster than TCP and sidesteps port management
- **Local-only APIs**: IPC between tools that shouldn't be reachable from outside
- **In-container IPC**: Process-to-process communication within the same pod or container
- **Dev environments**: No more worrying about port conflicts
## Clean up the socket file
A Unix domain socket creates a real file in the filesystem. It doesn't get removed on shutdown, so delete it before starting if needed.
```cpp
std::remove("/tmp/httplib.sock");
svr.listen("/tmp/httplib.sock", 80);
```
## Permissions
You control who can connect via the socket file's permissions.
```cpp
svr.listen("/tmp/httplib.sock", 80);
// from another process or thread
chmod("/tmp/httplib.sock", 0660); // owner and group only
```
> **Warning:** Some Windows versions support AF_UNIX, but the implementation and behavior differ by platform. Test thoroughly before running cross-platform in production.

View File

@@ -0,0 +1,49 @@
---
title: "T01. Choosing Between OpenSSL, mbedTLS, and wolfSSL"
order: 42
status: "draft"
---
cpp-httplib doesn't ship its own TLS implementation — it uses one of three backends that you pick at build time via a macro.
| Backend | Macro | Character |
| --- | --- | --- |
| OpenSSL | `CPPHTTPLIB_OPENSSL_SUPPORT` | Most widely used, richest feature set |
| mbedTLS | `CPPHTTPLIB_MBEDTLS_SUPPORT` | Lightweight, aimed at embedded |
| wolfSSL | `CPPHTTPLIB_WOLFSSL_SUPPORT` | Embedded-friendly, commercial support available |
## Build-time selection
Define the macro for your chosen backend before including `httplib.h`:
```cpp
#define CPPHTTPLIB_OPENSSL_SUPPORT
#include <httplib.h>
```
You'll also need to link against the backend's libraries (`libssl`, `libcrypto`, `libmbedtls`, `libwolfssl`, etc.).
## Which to pick
**When in doubt, OpenSSL**
It has the most features and the best documentation. For normal server use or Linux desktop apps, start here — you probably won't need anything else.
**To shrink binary size or target embedded**
mbedTLS or wolfSSL are a better fit. They're far more compact than OpenSSL and run on memory-constrained devices.
**When you need commercial support**
wolfSSL offers commercial licensing and support. If you're shipping in a product, it's worth considering.
## Supporting multiple backends
The usual approach is to treat each backend as a build variant and recompile the same source with different macros. cpp-httplib smooths over most of the API differences, but the backends are not 100% identical — always test.
## APIs that work across all backends
Certificate verification control, standing up an SSLServer, reading the peer certificate — these all share the same API across backends:
- [T02. Control SSL certificate verification](t02-cert-verification)
- [T03. Start an SSL/TLS server](t03-ssl-server)
- [T05. Access the peer certificate on the server](t05-peer-cert)
> **Note:** On macOS with an OpenSSL-family backend, cpp-httplib automatically loads root certificates from the system keychain (via `CPPHTTPLIB_USE_CERTS_FROM_MACOSX_KEYCHAIN`, on by default). To disable this, define `CPPHTTPLIB_DISABLE_MACOSX_AUTOMATIC_ROOT_CERTIFICATES`.

View File

@@ -0,0 +1,53 @@
---
title: "T02. Control SSL Certificate Verification"
order: 43
status: "draft"
---
By default, an HTTPS client verifies the server certificate — it uses the OS root certificate store to check the chain and the hostname. Here are the APIs for changing that behavior.
## Specify a custom CA certificate
When connecting to a server whose certificate is signed by an internal CA, use `set_ca_cert_path()`.
```cpp
httplib::Client cli("https://internal.example.com");
cli.set_ca_cert_path("/etc/ssl/certs/internal-ca.pem");
auto res = cli.Get("/");
```
The first argument is the CA certificate file; the second is an optional CA directory. With the OpenSSL backend, you can also pass an `X509_STORE*` directly via `set_ca_cert_store()`.
## Disable certificate verification (not recommended)
For development servers or self-signed certificates, you can skip verification entirely.
```cpp
httplib::Client cli("https://self-signed.example.com");
cli.enable_server_certificate_verification(false);
auto res = cli.Get("/");
```
That's all it takes to disable chain verification.
> **Warning:** Disabling certificate verification removes protection against man-in-the-middle attacks. **Never do this in production.** If you find yourself needing it outside of dev/test, pause and make sure you're not doing something wrong.
## Disable hostname verification only
There's an in-between option: verify the certificate chain, but skip the hostname check. Useful when you need to reach a server whose cert CN/SAN doesn't match the request's hostname.
```cpp
cli.enable_server_hostname_verification(false);
```
The certificate itself is still validated, so this is safer than fully disabling verification — but still not recommended in production.
## Use the OS cert store as-is
On most Linux distributions, root certificates live in a single file like `/etc/ssl/certs/ca-certificates.crt`. cpp-httplib reads the OS default store at startup, so for most servers you don't need to configure anything.
> The same APIs work on the mbedTLS and wolfSSL backends. For choosing between backends, see [T01. Choosing between OpenSSL, mbedTLS, and wolfSSL](t01-tls-backends).
> For details on diagnosing failures, see [C18. Handle SSL errors](c18-ssl-errors).

View File

@@ -0,0 +1,78 @@
---
title: "T03. Start an SSL/TLS Server"
order: 44
status: "draft"
---
To stand up an HTTPS server, use `httplib::SSLServer` instead of `httplib::Server`. Pass a certificate and private key to the constructor, and you get back something that works exactly like `Server`.
## Basic usage
```cpp
#define CPPHTTPLIB_OPENSSL_SUPPORT
#include <httplib.h>
int main() {
httplib::SSLServer svr("cert.pem", "key.pem");
svr.Get("/", [](const auto &req, auto &res) {
res.set_content("hello over TLS", "text/plain");
});
svr.listen("0.0.0.0", 443);
}
```
Pass the server certificate (PEM format) and private key file paths to the constructor. That's all you need for a TLS-enabled server. Registering handlers and calling `listen()` work the same as with `Server`.
## Password-protected private keys
The fifth argument is the private key password.
```cpp
httplib::SSLServer svr("cert.pem", "key.pem",
nullptr, nullptr, "password");
```
The third and fourth arguments are for client certificate verification (mTLS, see [T04. Configure mTLS](t04-mtls)). For now, pass `nullptr`.
## Load PEM data from memory
When you want to load certs from memory instead of files, use the `PemMemory` struct.
```cpp
httplib::SSLServer::PemMemory pem{};
pem.cert_pem = cert_data.data();
pem.cert_pem_len = cert_data.size();
pem.key_pem = key_data.data();
pem.key_pem_len = key_data.size();
httplib::SSLServer svr(pem);
```
Handy when you pull certificates from environment variables or a secrets manager.
## Rotate certificates
Before a certificate expires, you may want to swap it out without restarting the server. That's what `update_certs_pem()` is for.
```cpp
svr.update_certs_pem(new_cert_pem, new_key_pem);
```
Existing connections keep using the old cert; new connections use the new one.
## Generating a test certificate
For a throwaway self-signed cert, use the `openssl` CLI.
```sh
openssl req -x509 -newkey rsa:2048 -days 365 -nodes \
-keyout key.pem -out cert.pem -subj "/CN=localhost"
```
In production, use certificates from Let's Encrypt or your internal CA.
> **Warning:** Binding an HTTPS server to port 443 requires root. For a safe way to do that, see the privilege-drop pattern in [S18. Control startup order with `listen_after_bind`](s18-listen-after-bind).
> For mutual TLS (client certificates), see [T04. Configure mTLS](t04-mtls).

View File

@@ -0,0 +1,70 @@
---
title: "T04. Configure mTLS"
order: 45
status: "draft"
---
Regular TLS verifies the server certificate only. **mTLS** (mutual TLS) adds the other direction: the client presents a certificate too, and the server verifies it. It's common for zero-trust API-to-API traffic and internal system authentication.
## Server side
Pass the CA used to verify client certificates as the third (and fourth) argument to `SSLServer`.
```cpp
httplib::SSLServer svr(
"server-cert.pem", // server certificate
"server-key.pem", // server private key
"client-ca.pem", // CA that signs valid client certs
nullptr // CA directory (none)
);
svr.Get("/", [](const httplib::Request &req, httplib::Response &res) {
res.set_content("authenticated", "text/plain");
});
svr.listen("0.0.0.0", 443);
```
With this, any connection whose client certificate isn't signed by `client-ca.pem` is rejected at the handshake. By the time a handler runs, the client is already authenticated.
## Configure with in-memory PEM
```cpp
httplib::SSLServer::PemMemory pem{};
pem.cert_pem = server_cert.data();
pem.cert_pem_len = server_cert.size();
pem.key_pem = server_key.data();
pem.key_pem_len = server_key.size();
pem.client_ca_pem = client_ca.data();
pem.client_ca_pem_len = client_ca.size();
httplib::SSLServer svr(pem);
```
This is the clean way when you load certificates from environment variables or a secrets manager.
## Client side
On the client side, pass the client certificate and key to `SSLClient`.
```cpp
httplib::SSLClient cli("api.example.com", 443,
"client-cert.pem",
"client-key.pem");
auto res = cli.Get("/");
```
Note you're using `SSLClient` directly, not `Client`. If the private key has a password, pass it as the fifth argument.
## Read client info from a handler
To see which client connected from inside a handler, use `req.peer_cert()`. Details in [T05. Access the peer certificate on the server](t05-peer-cert).
## Use cases
- **Microservice-to-microservice calls**: Issue a cert per service, use the cert as identity
- **IoT device management**: Burn a cert into each device and use it to gate API access
- **An alternative to internal VPN**: Put cert-based auth in front of public endpoints so internal resources can be reached safely
> **Note:** Issuing and revoking client certificates is more operational work than password-based auth. You'll need either an internal PKI setup or an automated flow using ACME-family tools.

View File

@@ -0,0 +1,88 @@
---
title: "T05. Access the Peer Certificate on the Server Side"
order: 46
status: "draft"
---
In an mTLS setup, you can read the client's certificate from inside a handler. Pull out the CN or SAN to identify the user or log the request.
## Basic usage
```cpp
svr.Get("/me", [](const httplib::Request &req, httplib::Response &res) {
auto cert = req.peer_cert();
if (!cert) {
res.status = 401;
res.set_content("no client certificate", "text/plain");
return;
}
auto cn = cert.subject_cn();
res.set_content("hello, " + cn, "text/plain");
});
```
`req.peer_cert()` returns a `tls::PeerCert`. It's convertible to `bool`, so check whether a cert is present before using it.
## Available fields
From a `PeerCert`, you can get:
```cpp
auto cert = req.peer_cert();
std::string cn = cert.subject_cn(); // CN
std::string issuer = cert.issuer_name(); // issuer
std::string serial = cert.serial(); // serial number
time_t not_before, not_after;
cert.validity(not_before, not_after); // validity period
auto sans = cert.sans(); // SANs
for (const auto &san : sans) {
std::cout << san.value << std::endl;
}
```
There's also a helper to check if a hostname is covered by the SAN list:
```cpp
if (cert.check_hostname("alice.corp.example.com")) {
// matches
}
```
## Cert-based authorization
You can gate routes by CN or SAN.
```cpp
svr.set_pre_request_handler(
[](const httplib::Request &req, httplib::Response &res) {
auto cert = req.peer_cert();
if (!cert) {
res.status = 401;
return httplib::Server::HandlerResponse::Handled;
}
if (req.matched_route.rfind("/admin", 0) == 0) {
auto cn = cert.subject_cn();
if (!is_admin_cn(cn)) {
res.status = 403;
return httplib::Server::HandlerResponse::Handled;
}
}
return httplib::Server::HandlerResponse::Unhandled;
});
```
Combined with a pre-request handler, you can keep all authorization logic in one place. See [S11. Authenticate per route with a pre-request handler](s11-pre-request).
## SNI (Server Name Indication)
cpp-httplib handles SNI automatically. If one server hosts multiple domains, SNI is used under the hood — but normally handlers don't need to care.
> **Warning:** `req.peer_cert()` only returns a meaningful value when mTLS is enabled and the client actually presented a certificate. For plain TLS, you get an empty `PeerCert`. Always do the `bool` check before using it.
> To set up mTLS, see [T04. Configure mTLS](t04-mtls).

View File

@@ -0,0 +1,88 @@
---
title: "W01. Implement a WebSocket Echo Server and Client"
order: 51
status: "draft"
---
WebSocket is a protocol for **two-way** messaging between client and server. cpp-httplib provides APIs for both sides. Let's start with the simplest example: an echo server.
## Server: echo server
```cpp
#include <httplib.h>
int main() {
httplib::Server svr;
svr.WebSocket("/echo", [](const httplib::Request &req, httplib::ws::WebSocket &ws) {
std::string msg;
while (ws.is_open()) {
auto result = ws.read(msg);
if (result == httplib::ws::ReadResult::Fail) {
break;
}
ws.send(msg); // echo back what we received
}
});
svr.listen("0.0.0.0", 8080);
}
```
Register a WebSocket handler with `svr.WebSocket()`. By the time the handler runs, the WebSocket handshake is already complete. Inside the loop, just `ws.read()` and `ws.send()` to get a working echo.
The `read()` return value is a `ReadResult` enum:
- `ReadResult::Text`: received a text message
- `ReadResult::Binary`: received a binary message
- `ReadResult::Fail`: error, or connection closed
## Client: talk to the echo server
```cpp
#include <httplib.h>
int main() {
httplib::ws::WebSocketClient cli("ws://localhost:8080/echo");
if (!cli.connect()) {
std::cerr << "failed to connect" << std::endl;
return 1;
}
cli.send("Hello, WebSocket!");
std::string msg;
if (cli.read(msg) != httplib::ws::ReadResult::Fail) {
std::cout << "received: " << msg << std::endl;
}
cli.close();
}
```
Use a `ws://` (plain) or `wss://` (TLS) URL. Call `connect()` to do the handshake, then `send()` and `read()` work the same as on the server side.
## Text vs. binary
`send()` has two overloads that let you choose the frame type.
```cpp
ws.send("Hello"); // text frame
ws.send(binary_data, binary_data_size); // binary frame
```
The `std::string` overload sends as **text**; the `const char*` + size overload sends as **binary**. A bit subtle, but once you know it, it's intuitive. See [W04. Send and receive binary frames](w04-websocket-binary) for details.
## Thread pool implications
A WebSocket handler holds its worker thread for the entire life of the connection — one connection per thread. For many concurrent clients, configure a dynamic thread pool.
```cpp
svr.new_task_queue = [] {
return new httplib::ThreadPool(8, 128);
};
```
See [S21. Configure the thread pool](s21-thread-pool).
> **Note:** To run WebSocket over HTTPS, use `httplib::SSLServer` instead of `httplib::Server` — the same `WebSocket()` handler just works. On the client side, use a `wss://` URL.

View File

@@ -0,0 +1,80 @@
---
title: "W02. Set a WebSocket Heartbeat"
order: 52
status: "draft"
---
WebSocket connections stay open for a long time, and proxies or load balancers will sometimes drop them for being "idle." To prevent that, you periodically send Ping frames to keep the connection alive. cpp-httplib can do this for you automatically.
## Server side
```cpp
svr.set_websocket_ping_interval(30); // ping every 30 seconds
svr.WebSocket("/chat", [](const auto &req, auto &ws) {
// ...
});
```
Just pass the interval in seconds. Every WebSocket connection this server accepts will be pinged on that interval.
There's a `std::chrono` overload too.
```cpp
using namespace std::chrono_literals;
svr.set_websocket_ping_interval(30s);
```
## Client side
The client has the same API.
```cpp
httplib::ws::WebSocketClient cli("ws://localhost:8080/chat");
cli.set_websocket_ping_interval(30);
cli.connect();
```
Call it before `connect()`.
## The default
The default interval is set by the build-time macro `CPPHTTPLIB_WEBSOCKET_PING_INTERVAL_SECOND`. Usually you won't need to change it, but adjust downward if you're dealing with an aggressive proxy.
## What about Pong?
The WebSocket protocol requires that Ping frames are answered with Pong frames. cpp-httplib responds to Pings automatically — you don't need to think about it in application code.
## Picking an interval
| Environment | Suggested |
| --- | --- |
| Normal internet | 3060s |
| Strict proxies (e.g. AWS ALB) | 1530s |
| Mobile networks | 60s+ (too short drains battery) |
Too short wastes bandwidth; too long and connections get dropped. As a rule of thumb, target about **half the idle timeout** of whatever's between you and the client.
> **Warning:** A very short ping interval spawns background work per connection and increases CPU usage. For servers with many connections, keep the interval modest.
## Detecting an unresponsive peer
Sending pings alone doesn't tell you anything if the peer just silently dies — the TCP socket might still look open while the process on the other end is long gone. To catch that, enable the max-missed-pongs check: if N consecutive pings go unanswered, the connection is closed.
```cpp
cli.set_websocket_max_missed_pongs(2); // close after 2 consecutive unacked pings
```
The server side has the same `set_websocket_max_missed_pongs()`.
With a 30-second ping interval and `max_missed_pongs = 2`, a dead peer is detected within roughly 60 seconds and the connection is closed with `CloseStatus::GoingAway` and the reason `"pong timeout"`.
The counter is reset whenever `read()` consumes an incoming Pong frame, so this only works if your code is actively calling `read()` in a loop — which is what a normal WebSocket client does anyway.
### Why the default is 0
`max_missed_pongs` defaults to `0`, which means "never close the connection because of missing pongs." Pings are still sent on the heartbeat interval, but their responses aren't checked. If you want unresponsive-peer detection, set it explicitly to `1` or higher.
Even with `0`, a dead connection won't linger forever: while your code is inside `read()`, `CPPHTTPLIB_WEBSOCKET_READ_TIMEOUT_SECOND` (default **300 seconds = 5 minutes**) acts as a backstop and `read()` fails if no frame arrives in time. Think of `max_missed_pongs` as the knob for detecting an unresponsive peer **faster** than that.
> For handling a closed connection, see [W03. Handle connection close](w03-websocket-close).

View File

@@ -0,0 +1,91 @@
---
title: "W03. Handle Connection Close"
order: 53
status: "draft"
---
A WebSocket ends when either side closes it explicitly, or when the network drops. Handle close cleanly, and your cleanup and reconnect logic stays tidy.
## Detect a closed connection
When `ws.read()` returns `ReadResult::Fail`, the connection is gone — either cleanly or with an error. Break out of the loop and the handler will finish.
```cpp
svr.WebSocket("/chat", [](const httplib::Request &req, httplib::ws::WebSocket &ws) {
std::string msg;
while (ws.is_open()) {
auto result = ws.read(msg);
if (result == httplib::ws::ReadResult::Fail) {
std::cout << "disconnected" << std::endl;
break;
}
handle_message(ws, msg);
}
// cleanup runs once we're out of the loop
cleanup_user_session(req);
});
```
You can also check `ws.is_open()` — it's the same signal from a different angle.
## Close from the server side
To close explicitly, call `close()`.
```cpp
ws.close(httplib::ws::CloseStatus::Normal, "bye");
```
The first argument is the close status; the second is an optional reason. Common `CloseStatus` values:
| Value | Meaning |
| --- | --- |
| `Normal` (1000) | Normal closure |
| `GoingAway` (1001) | Server is shutting down |
| `ProtocolError` (1002) | Protocol violation detected |
| `UnsupportedData` (1003) | Received data that can't be handled |
| `PolicyViolation` (1008) | Violated a policy |
| `MessageTooBig` (1009) | Message too large |
| `InternalError` (1011) | Server-side error |
## Close from the client side
The client API is identical.
```cpp
cli.close(httplib::ws::CloseStatus::Normal);
```
Destroying the client also closes the connection, but calling `close()` explicitly makes the intent clearer.
## Graceful shutdown
To notify in-flight clients that the server is going down, use `GoingAway`.
```cpp
ws.close(httplib::ws::CloseStatus::GoingAway, "server restarting");
```
The client can inspect that status and decide whether to reconnect.
## Example: a tiny chat with quit
```cpp
svr.WebSocket("/chat", [](const auto &req, auto &ws) {
std::string msg;
while (ws.is_open()) {
if (ws.read(msg) == httplib::ws::ReadResult::Fail) break;
if (msg == "/quit") {
ws.send("goodbye");
ws.close(httplib::ws::CloseStatus::Normal, "user quit");
break;
}
ws.send("echo: " + msg);
}
});
```
> **Note:** On a sudden network drop, `read()` returns `Fail` with no chance to call `close()`. Put your cleanup at the end of the handler, and both paths — clean close and abrupt disconnect — end up in the same place.

View File

@@ -0,0 +1,85 @@
---
title: "W04. Send and Receive Binary Frames"
order: 54
status: "draft"
---
WebSocket has two frame types: text and binary. JSON and plain text go in text frames; images and raw protocol bytes go in binary. In cpp-httplib, `send()` picks the right type via overload.
## How to pick a frame type
```cpp
ws.send(std::string("Hello")); // text
ws.send("Hello", 5); // binary
ws.send(binary_data, binary_data_size); // binary
```
The `std::string` overload sends as **text**. The `const char*` + size overload sends as **binary**. A bit subtle, but once you know it, it sticks.
If you have a `std::string` and want to send it as binary, pass `.data()` and `.size()` explicitly.
```cpp
std::string raw = build_binary_payload();
ws.send(raw.data(), raw.size()); // binary frame
```
## Detect frame type on receive
The return value of `ws.read()` tells you whether the received frame was text or binary.
```cpp
std::string msg;
auto result = ws.read(msg);
switch (result) {
case httplib::ws::ReadResult::Text:
std::cout << "text: " << msg << std::endl;
break;
case httplib::ws::ReadResult::Binary:
std::cout << "binary: " << msg.size() << " bytes" << std::endl;
handle_binary(msg.data(), msg.size());
break;
case httplib::ws::ReadResult::Fail:
// error or closed
break;
}
```
Binary frames still come back in a `std::string`, but treat its contents as raw bytes — use `msg.data()` and `msg.size()`.
## When binary is the right call
- **Images, video, audio**: No Base64 overhead
- **Custom protocols**: protobuf, MessagePack, or any structured binary format
- **Game networking**: When latency matters
- **Sensor data streams**: Push numeric arrays directly
## Ping is binary-ish, but hidden
WebSocket Ping/Pong frames are close cousins of binary frames at the opcode level, but cpp-httplib handles them automatically — you don't touch them. See [W02. Set a WebSocket heartbeat](w02-websocket-ping).
## Example: send an image
```cpp
// Server: push an image
svr.WebSocket("/image", [](const auto &req, auto &ws) {
auto img = read_image_file("logo.png");
ws.send(img.data(), img.size());
});
```
```cpp
// Client: receive and save
httplib::ws::WebSocketClient cli("ws://localhost:8080/image");
cli.connect();
std::string buf;
if (cli.read(buf) == httplib::ws::ReadResult::Binary) {
std::ofstream ofs("received.png", std::ios::binary);
ofs.write(buf.data(), buf.size());
}
```
You can mix text and binary in the same connection. A common pattern: JSON for control messages, binary for the actual data — you get efficient handling of metadata and payload both.
> **Note:** WebSocket frames don't have an infinite size limit. For very large data, chunk it in your application code. cpp-httplib can handle a big frame in one shot, but it does load it all into memory at once.

View File

@@ -18,5 +18,8 @@ Under the hood, it uses blocking I/O with a thread pool. It's not built for hand
## Documentation
- [A Tour of cpp-httplib](tour/) — A step-by-step tutorial covering the basics. Start here if you're new
- [Cookbook](cookbook/) — A collection of recipes organized by topic. Jump to whatever you need
- [Building a Desktop LLM App](llm-app/) — A hands-on guide to building a desktop app with llama.cpp, step by step
## Stay Tuned
- [Cookbook](cookbook/) — A collection of recipes organized by topic. Jump to whatever you need

Binary file not shown.

After

Width:  |  Height:  |  Size: 120 KiB

View File

@@ -0,0 +1,236 @@
---
title: "1. Setting Up the Project Environment"
order: 1
---
Let's incrementally build a text translation REST API server using llama.cpp as the inference engine. By the end, a request like this will return a translation result.
```bash
curl -X POST http://localhost:8080/translate \
-H "Content-Type: application/json" \
-d '{"text": "The weather is nice today. Shall we go for a walk?", "target_lang": "ja"}'
```
```json
{
"translation": "今日はいい天気ですね。散歩に行きましょうか?"
}
```
The "Translation API" is just one example. By swapping out the prompt, you can adapt this to any LLM application you like, such as summarization, code generation, or a chatbot.
Here's the full list of APIs the server will provide.
| Method | Path | Description | Chapter |
| -------- | ---- | ---- | -- |
| `GET` | `/health` | Returns server status | 1 |
| `POST` | `/translate` | Translates text and returns JSON | 2 |
| `POST` | `/translate/stream` | SSE streaming on a per-token basis | 3 |
| `GET` | `/models` | Model list (available / downloaded / selected) | 4 |
| `POST` | `/models/select` | Select a model (automatically downloads if not yet downloaded) | 4 |
In this chapter, let's set up the project environment. We'll fetch the dependency libraries, create the directory structure, configure the build settings, and grab the model file, so that we're ready to start writing code in the next chapter.
## Prerequisites
- A C++20-compatible compiler (GCC 10+, Clang 10+, MSVC 2019 16.8+)
- CMake 3.20 or later
- OpenSSL (used for the HTTPS client in Chapter 4. macOS: `brew install openssl`, Ubuntu: `sudo apt install libssl-dev`)
- Sufficient disk space (model files can be several GB)
## 1.1 What We Will Use
Here are the libraries we'll use.
| Library | Role |
| ----------- | ------ |
| [cpp-httplib](https://github.com/yhirose/cpp-httplib) | HTTP server/client |
| [nlohmann/json](https://github.com/nlohmann/json) | JSON parser |
| [cpp-llamalib](https://github.com/yhirose/cpp-llamalib) | llama.cpp wrapper |
| [llama.cpp](https://github.com/ggml-org/llama.cpp) | LLM inference engine |
| [webview/webview](https://github.com/webview/webview) | Desktop WebView (used in Chapter 6) |
cpp-httplib, nlohmann/json, and cpp-llamalib are header-only libraries. You could just download a single header file with `curl` and `#include` it, but in this book we use CMake's `FetchContent` to fetch them automatically. Declare them in `CMakeLists.txt`, and `cmake -B build` downloads and builds everything for you. webview is used in Chapter 6, so you don't need to worry about it for now.
## 1.2 Directory Structure
The final structure will look like this.
```ascii
translate-app/
├── CMakeLists.txt
├── models/
│ └── (GGUF files)
└── src/
└── main.cpp
```
We don't include library source code in the project. CMake's `FetchContent` fetches them automatically at build time, so all you need is your own code.
Let's create the project directory and initialize a git repository.
```bash
mkdir translate-app && cd translate-app
mkdir src models
git init
```
## 1.3 Obtaining the GGUF Model File
You need a model file for LLM inference. GGUF is the model format used by llama.cpp, and you can find many models on Hugging Face.
Let's start by trying a small model. The quantized version of Google's Gemma 2 2B (~1.6 GB) is a good starting point. It's lightweight but supports multiple languages and works well for translation tasks.
```bash
curl -L -o models/gemma-2-2b-it-Q4_K_M.gguf \
https://huggingface.co/bartowski/gemma-2-2b-it-GGUF/resolve/main/gemma-2-2b-it-Q4_K_M.gguf
```
In Chapter 4, we'll add the ability to download models from within the app using cpp-httplib's client functionality.
## 1.4 CMakeLists.txt
Create a `CMakeLists.txt` in the project root. By declaring dependencies with `FetchContent`, CMake will automatically download and build them for you.
<!-- data-file="CMakeLists.txt" -->
```cmake
cmake_minimum_required(VERSION 3.20)
project(translate-server CXX)
set(CMAKE_CXX_STANDARD 20)
include(FetchContent)
# llama.cpp (LLM inference engine)
FetchContent_Declare(llama
GIT_REPOSITORY https://github.com/ggml-org/llama.cpp
GIT_TAG master
GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(llama)
# cpp-httplib (HTTP server/client)
FetchContent_Declare(httplib
GIT_REPOSITORY https://github.com/yhirose/cpp-httplib
GIT_TAG master
)
FetchContent_MakeAvailable(httplib)
# nlohmann/json (JSON parser)
FetchContent_Declare(json
URL https://github.com/nlohmann/json/releases/download/v3.11.3/json.tar.xz
)
FetchContent_MakeAvailable(json)
# cpp-llamalib (header-only llama.cpp wrapper)
FetchContent_Declare(cpp_llamalib
GIT_REPOSITORY https://github.com/yhirose/cpp-llamalib
GIT_TAG main
)
FetchContent_MakeAvailable(cpp_llamalib)
add_executable(translate-server src/main.cpp)
target_link_libraries(translate-server PRIVATE
httplib::httplib
nlohmann_json::nlohmann_json
cpp-llamalib
)
```
`FetchContent_Declare` tells CMake where to find each library, and `FetchContent_MakeAvailable` fetches and builds them. The first `cmake -B build` will take some time because it downloads all libraries and builds llama.cpp, but subsequent runs will use the cache.
Just link with `target_link_libraries`, and each library's CMake configuration sets up include paths and build settings for you.
## 1.5 Creating the Skeleton Code
We'll use this skeleton code as a base and add functionality chapter by chapter.
<!-- data-file="main.cpp" -->
```cpp
// src/main.cpp
#include <httplib.h>
#include <nlohmann/json.hpp>
#include <csignal>
#include <iostream>
using json = nlohmann::json;
httplib::Server svr;
// Graceful shutdown on `Ctrl+C`
void signal_handler(int sig) {
if (sig == SIGINT || sig == SIGTERM) {
std::cout << "\nReceived signal, shutting down gracefully...\n";
svr.stop();
}
}
int main() {
// Log requests and responses
svr.set_logger([](const auto &req, const auto &res) {
std::cout << req.method << " " << req.path << " -> " << res.status
<< std::endl;
});
// Health check
svr.Get("/health", [](const auto &, auto &res) {
res.set_content(json{{"status", "ok"}}.dump(), "application/json");
});
// Stub implementations for each endpoint (replaced with real ones in later chapters)
svr.Post("/translate",
[](const auto &req, auto &res) {
res.set_content(json{{"translation", "TODO"}}.dump(), "application/json");
});
svr.Post("/translate/stream",
[](const auto &req, auto &res) {
res.set_content("data: \"TODO\"\n\ndata: [DONE]\n\n", "text/event-stream");
});
svr.Get("/models",
[](const auto &req, auto &res) {
res.set_content(json{{"models", json::array()}}.dump(), "application/json");
});
svr.Post("/models/select",
[](const auto &req, auto &res) {
res.set_content(json{{"status", "TODO"}}.dump(), "application/json");
});
// Allow the server to be stopped with `Ctrl+C` (`SIGINT`) or `kill` (`SIGTERM`)
signal(SIGINT, signal_handler);
signal(SIGTERM, signal_handler);
// Start the server
std::cout << "Listening on http://127.0.0.1:8080" << std::endl;
svr.listen("127.0.0.1", 8080);
}
```
## 1.6 Building and Verifying
Build the project, start the server, and verify that requests work with curl.
```bash
cmake -B build
cmake --build build -j
./build/translate-server
```
From another terminal, try it with curl.
```bash
curl http://localhost:8080/health
# => {"status":"ok"}
```
If you see JSON come back, the setup is complete.
## Next Chapter
Now that the environment is set up, in the next chapter we'll implement the translation REST API on top of this skeleton. We'll run inference with llama.cpp and expose it as an HTTP endpoint with cpp-httplib.
**Next:** [Integrating llama.cpp to Build a REST API](../ch02-rest-api)

View File

@@ -0,0 +1,212 @@
---
title: "2. Integrating llama.cpp to Build a REST API"
order: 2
---
In the skeleton from Chapter 1, `/translate` simply returned `"TODO"`. In this chapter we integrate llama.cpp inference and turn it into an API that actually returns translation results.
Calling the llama.cpp API directly makes the code quite long, so we use a thin wrapper library called [cpp-llamalib](https://github.com/yhirose/cpp-llamalib). It lets you load a model and run inference in just a few lines, keeping the focus on cpp-httplib.
## 2.1 Initializing the LLM
Simply pass the path to a model file to `llamalib::Llama`, and model loading, context creation, and sampler configuration are all taken care of. If you downloaded a different model in Chapter 1, adjust the path accordingly.
```cpp
#include <cpp-llamalib.h>
int main() {
auto llm = llamalib::Llama{"models/gemma-2-2b-it-Q4_K_M.gguf"};
// LLM inference takes time, so set a longer timeout (default is 5 seconds)
svr.set_read_timeout(300);
svr.set_write_timeout(300);
// ... Build and start the HTTP server ...
}
```
If you want to change the number of GPU layers, context length, or other settings, you can specify them via `llamalib::Options`.
```cpp
auto llm = llamalib::Llama{"models/gemma-2-2b-it-Q4_K_M.gguf", {
.n_gpu_layers = 0, // CPU only
.n_ctx = 4096,
}};
```
## 2.2 The `/translate` Handler
We replace the handler that returned dummy JSON in Chapter 1 with actual inference.
```cpp
svr.Post("/translate",
[&](const httplib::Request &req, httplib::Response &res) {
// Parse JSON (3rd arg `false`: don't throw on failure, check with `is_discarded()`)
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded()) {
res.status = 400;
res.set_content(json{{"error", "Invalid JSON"}}.dump(),
"application/json");
return;
}
// Validate required fields
if (!input.contains("text") || !input["text"].is_string() ||
input["text"].get<std::string>().empty()) {
res.status = 400;
res.set_content(json{{"error", "'text' is required"}}.dump(),
"application/json");
return;
}
auto text = input["text"].get<std::string>();
auto target_lang = input.value("target_lang", "ja"); // Default is Japanese
// Build the prompt and run inference
auto prompt = "Translate the following text to " + target_lang +
". Output only the translation, nothing else.\n\n" + text;
try {
auto translation = llm.chat(prompt);
res.set_content(json{{"translation", translation}}.dump(),
"application/json");
} catch (const std::exception &e) {
res.status = 500;
res.set_content(json{{"error", e.what()}}.dump(), "application/json");
}
});
```
`llm.chat()` can throw exceptions during inference (for example, when the context length is exceeded). By catching them with `try/catch` and returning the error as JSON, we prevent the server from crashing.
## 2.3 Complete Code
Here is the finished code with all the changes so far.
<details>
<summary data-file="main.cpp">Complete code (main.cpp)</summary>
```cpp
#include <httplib.h>
#include <nlohmann/json.hpp>
#include <cpp-llamalib.h>
#include <csignal>
#include <iostream>
using json = nlohmann::json;
httplib::Server svr;
// Graceful shutdown on `Ctrl+C`
void signal_handler(int sig) {
if (sig == SIGINT || sig == SIGTERM) {
std::cout << "\nReceived signal, shutting down gracefully...\n";
svr.stop();
}
}
int main() {
// Load the model downloaded in Chapter 1
auto llm = llamalib::Llama{"models/gemma-2-2b-it-Q4_K_M.gguf"};
// LLM inference takes time, so set a longer timeout (default is 5 seconds)
svr.set_read_timeout(300);
svr.set_write_timeout(300);
// Log requests and responses
svr.set_logger([](const auto &req, const auto &res) {
std::cout << req.method << " " << req.path << " -> " << res.status
<< std::endl;
});
svr.Get("/health", [](const httplib::Request &, httplib::Response &res) {
res.set_content(json{{"status", "ok"}}.dump(), "application/json");
});
svr.Post("/translate",
[&](const httplib::Request &req, httplib::Response &res) {
// Parse JSON (3rd arg `false`: don't throw on failure, check with `is_discarded()`)
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded()) {
res.status = 400;
res.set_content(json{{"error", "Invalid JSON"}}.dump(),
"application/json");
return;
}
// Validate required fields
if (!input.contains("text") || !input["text"].is_string() ||
input["text"].get<std::string>().empty()) {
res.status = 400;
res.set_content(json{{"error", "'text' is required"}}.dump(),
"application/json");
return;
}
auto text = input["text"].get<std::string>();
auto target_lang = input.value("target_lang", "ja"); // Default is Japanese
// Build the prompt and run inference
auto prompt = "Translate the following text to " + target_lang +
". Output only the translation, nothing else.\n\n" + text;
try {
auto translation = llm.chat(prompt);
res.set_content(json{{"translation", translation}}.dump(),
"application/json");
} catch (const std::exception &e) {
res.status = 500;
res.set_content(json{{"error", e.what()}}.dump(), "application/json");
}
});
// Dummy implementations to be replaced with real ones in later chapters
svr.Get("/models",
[](const httplib::Request &, httplib::Response &res) {
res.set_content(json{{"models", json::array()}}.dump(), "application/json");
});
svr.Post("/models/select",
[](const httplib::Request &, httplib::Response &res) {
res.set_content(json{{"status", "TODO"}}.dump(), "application/json");
});
// Allow the server to be stopped with `Ctrl+C` (`SIGINT`) or `kill` (`SIGTERM`)
signal(SIGINT, signal_handler);
signal(SIGTERM, signal_handler);
// Start the server (blocks until `stop()` is called)
std::cout << "Listening on http://127.0.0.1:8080" << std::endl;
svr.listen("127.0.0.1", 8080);
}
```
</details>
## 2.4 Testing It Out
Rebuild and start the server, then verify that it now returns actual translation results.
```bash
cmake --build build -j
./build/translate-server
```
```bash
curl -X POST http://localhost:8080/translate \
-H "Content-Type: application/json" \
-d '{"text": "I had a great time visiting Tokyo last spring. The cherry blossoms were beautiful.", "target_lang": "ja"}'
# => {"translation":"去年の春に東京を訪れた。桜が綺麗だった。"}
```
In Chapter 1 the response was `"TODO"`, but now you get an actual translation back.
## Next Chapter
The REST API we built in this chapter waits for the entire translation to complete before sending the response, so for long texts the user has to wait with no indication of progress.
In the next chapter, we use SSE (Server-Sent Events) to stream tokens back in real time as they are generated.
**Next:** [Adding Token Streaming with SSE](../ch03-sse-streaming)

View File

@@ -0,0 +1,264 @@
---
title: "3. Adding Token Streaming with SSE"
order: 3
---
The `/translate` endpoint from Chapter 2 returned the entire translation at once after completion. This is fine for short sentences, but for longer text the user has to wait several seconds with nothing displayed.
In this chapter, we add a `/translate/stream` endpoint that uses SSE (Server-Sent Events) to return tokens in real time as they are generated. This is the same approach used by the ChatGPT and Claude APIs.
## 3.1 What is SSE?
SSE is a way to send HTTP responses as a stream. When a client sends a request, the server keeps the connection open and gradually returns events. The format is simple text.
```text
data: "去年の"
data: "春に"
data: "東京を"
data: [DONE]
```
Each line starts with `data:` and events are separated by blank lines. The Content-Type is `text/event-stream`. Tokens are sent as escaped JSON strings, so they appear enclosed in double quotes (we implement this in Section 3.3).
## 3.2 Streaming with cpp-httplib
In cpp-httplib, you can use `set_chunked_content_provider` to send responses incrementally. Each time you write to `sink.os` inside the callback, data is sent to the client.
```cpp
res.set_chunked_content_provider(
"text/event-stream",
[](size_t offset, httplib::DataSink &sink) {
sink.os << "data: hello\n\n";
sink.done();
return true;
});
```
Calling `sink.done()` ends the stream. If the client disconnects mid-stream, writing to `sink.os` will fail and `sink.os.fail()` will return `true`. You can use this to detect disconnection and abort unnecessary inference.
## 3.3 The `/translate/stream` Handler
JSON parsing and validation are the same as the `/translate` endpoint from Chapter 2. The only difference is how the response is returned. We combine the streaming callback of `llm.chat()` with `set_chunked_content_provider`.
```cpp
svr.Post("/translate/stream",
[&](const httplib::Request &req, httplib::Response &res) {
// ... JSON parsing and validation same as /translate ...
res.set_chunked_content_provider(
"text/event-stream",
[&, prompt](size_t, httplib::DataSink &sink) {
try {
llm.chat(prompt, [&](std::string_view token) {
sink.os << "data: "
<< json(std::string(token)).dump(
-1, ' ', false, json::error_handler_t::replace)
<< "\n\n";
return sink.os.good(); // Abort inference on disconnect
});
sink.os << "data: [DONE]\n\n";
} catch (const std::exception &e) {
sink.os << "data: " << json({{"error", e.what()}}).dump() << "\n\n";
}
sink.done();
return true;
});
});
```
A few key points:
- When you pass a callback to `llm.chat()`, it is called each time a token is generated. If the callback returns `false`, generation is aborted
- After writing to `sink.os`, you can check whether the client is still connected with `sink.os.good()`. If the client has disconnected, it returns `false` to stop inference
- Each token is escaped as a JSON string using `json(token).dump()` before sending. This is safe even for tokens containing newlines or quotes
- The first three arguments of `dump(-1, ' ', false, ...)` are the defaults. What matters is the fourth argument, `json::error_handler_t::replace`. Since the LLM returns tokens at the subword level, multi-byte characters (such as Japanese) can be split mid-character across tokens. Passing an incomplete UTF-8 byte sequence directly to `dump()` would throw an exception, so `replace` safely substitutes them. The browser reassembles the bytes on its end, so everything displays correctly
- The entire lambda is wrapped in `try/catch`. `llm.chat()` can throw exceptions for reasons such as exceeding the context window. If an exception goes uncaught inside the lambda, the server will crash, so we return the error as an SSE event instead
- `data: [DONE]` follows the OpenAI API convention to signal the end of the stream to the client
## 3.4 Complete Code
Here is the complete code with the `/translate/stream` endpoint added to the code from Chapter 2.
<details>
<summary data-file="main.cpp">Complete code (main.cpp)</summary>
```cpp
#include <httplib.h>
#include <nlohmann/json.hpp>
#include <cpp-llamalib.h>
#include <csignal>
#include <iostream>
using json = nlohmann::json;
httplib::Server svr;
// Graceful shutdown on `Ctrl+C`
void signal_handler(int sig) {
if (sig == SIGINT || sig == SIGTERM) {
std::cout << "\nReceived signal, shutting down gracefully...\n";
svr.stop();
}
}
int main() {
// Load the GGUF model
auto llm = llamalib::Llama{"models/gemma-2-2b-it-Q4_K_M.gguf"};
// LLM inference takes time, so set a longer timeout (default is 5 seconds)
svr.set_read_timeout(300);
svr.set_write_timeout(300);
// Log requests and responses
svr.set_logger([](const auto &req, const auto &res) {
std::cout << req.method << " " << req.path << " -> " << res.status
<< std::endl;
});
svr.Get("/health", [](const httplib::Request &, httplib::Response &res) {
res.set_content(json{{"status", "ok"}}.dump(), "application/json");
});
// Standard translation endpoint from Chapter 2
svr.Post("/translate",
[&](const httplib::Request &req, httplib::Response &res) {
// JSON parsing and validation (see Chapter 2 for details)
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded()) {
res.status = 400;
res.set_content(json{{"error", "Invalid JSON"}}.dump(),
"application/json");
return;
}
if (!input.contains("text") || !input["text"].is_string() ||
input["text"].get<std::string>().empty()) {
res.status = 400;
res.set_content(json{{"error", "'text' is required"}}.dump(),
"application/json");
return;
}
auto text = input["text"].get<std::string>();
auto target_lang = input.value("target_lang", "ja");
auto prompt = "Translate the following text to " + target_lang +
". Output only the translation, nothing else.\n\n" + text;
try {
auto translation = llm.chat(prompt);
res.set_content(json{{"translation", translation}}.dump(),
"application/json");
} catch (const std::exception &e) {
res.status = 500;
res.set_content(json{{"error", e.what()}}.dump(), "application/json");
}
});
// SSE streaming translation endpoint
svr.Post("/translate/stream",
[&](const httplib::Request &req, httplib::Response &res) {
// JSON parsing and validation (same as /translate)
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded()) {
res.status = 400;
res.set_content(json{{"error", "Invalid JSON"}}.dump(),
"application/json");
return;
}
if (!input.contains("text") || !input["text"].is_string() ||
input["text"].get<std::string>().empty()) {
res.status = 400;
res.set_content(json{{"error", "'text' is required"}}.dump(),
"application/json");
return;
}
auto text = input["text"].get<std::string>();
auto target_lang = input.value("target_lang", "ja");
auto prompt = "Translate the following text to " + target_lang +
". Output only the translation, nothing else.\n\n" + text;
res.set_chunked_content_provider(
"text/event-stream",
[&, prompt](size_t, httplib::DataSink &sink) {
try {
llm.chat(prompt, [&](std::string_view token) {
sink.os << "data: "
<< json(std::string(token)).dump(
-1, ' ', false, json::error_handler_t::replace)
<< "\n\n";
return sink.os.good(); // Abort inference on disconnect
});
sink.os << "data: [DONE]\n\n";
} catch (const std::exception &e) {
sink.os << "data: " << json({{"error", e.what()}}).dump() << "\n\n";
}
sink.done();
return true;
});
});
// Dummy implementations to be replaced in later chapters
svr.Get("/models",
[](const httplib::Request &, httplib::Response &res) {
res.set_content(json{{"models", json::array()}}.dump(), "application/json");
});
svr.Post("/models/select",
[](const httplib::Request &, httplib::Response &res) {
res.set_content(json{{"status", "TODO"}}.dump(), "application/json");
});
// Allow the server to be stopped with `Ctrl+C` (`SIGINT`) or `kill` (`SIGTERM`)
signal(SIGINT, signal_handler);
signal(SIGTERM, signal_handler);
// Start the server (blocks until `stop()` is called)
std::cout << "Listening on http://127.0.0.1:8080" << std::endl;
svr.listen("127.0.0.1", 8080);
}
```
</details>
## 3.5 Testing It Out
Build and start the server.
```bash
cmake --build build -j
./build/translate-server
```
Using curl's `-N` option to disable buffering, you can see tokens displayed in real time as they arrive.
```bash
curl -N -X POST http://localhost:8080/translate/stream \
-H "Content-Type: application/json" \
-d '{"text": "I had a great time visiting Tokyo last spring. The cherry blossoms were beautiful.", "target_lang": "ja"}'
```
```text
data: "去年の"
data: "春に"
data: "東京を"
data: "訪れた"
data: "。"
data: "桜が"
data: "綺麗だった"
data: "。"
data: [DONE]
```
You should see tokens streaming in one by one. The `/translate` endpoint from Chapter 2 continues to work as well.
## Next Chapter
The server's translation functionality is now complete. In the next chapter, we use cpp-httplib's client functionality to add the ability to fetch and manage models from Hugging Face.
**Next:** [Adding Model Download and Management](../ch04-model-management)

View File

@@ -0,0 +1,788 @@
---
title: "4. Adding Model Download and Management"
order: 4
---
By the end of Chapter 3, the server's translation functionality was fully in place. However, the only model file available is the one we manually downloaded in Chapter 1. In this chapter, we'll use cpp-httplib's **client functionality** to enable downloading and switching Hugging Face models from within the app.
Once complete, you'll be able to manage models with requests like these:
```bash
# Get the list of available models
curl http://localhost:8080/models
```
```json
{
"models": [
{"name": "gemma-2-2b-it", "params": "2B", "size": "1.6 GB", "downloaded": true, "selected": true},
{"name": "gemma-2-9b-it", "params": "9B", "size": "5.8 GB", "downloaded": false, "selected": false},
{"name": "Llama-3.1-8B-Instruct", "params": "8B", "size": "4.9 GB", "downloaded": false, "selected": false}
]
}
```
```bash
# Select a different model (automatically downloads if not yet available)
curl -N -X POST http://localhost:8080/models/select \
-H "Content-Type: application/json" \
-d '{"model": "gemma-2-9b-it"}'
```
```text
data: {"status":"downloading","progress":0}
data: {"status":"downloading","progress":12}
...
data: {"status":"downloading","progress":100}
data: {"status":"loading"}
data: {"status":"ready"}
```
## 4.1 httplib::Client Basics
So far we've only used `httplib::Server`, but cpp-httplib also provides client functionality. Since Hugging Face uses HTTPS, we need a TLS-capable client.
```cpp
#include <httplib.h>
// Including the URL scheme automatically uses SSLClient
httplib::Client cli("https://huggingface.co");
// Automatically follow redirects (Hugging Face redirects to a CDN)
cli.set_follow_location(true);
auto res = cli.Get("/api/models");
if (res && res->status == 200) {
std::cout << res->body << std::endl;
}
```
To use HTTPS, you need to enable OpenSSL at build time. Add the following to your `CMakeLists.txt`:
```cmake
find_package(OpenSSL REQUIRED)
target_link_libraries(translate-server PRIVATE OpenSSL::SSL OpenSSL::Crypto)
target_compile_definitions(translate-server PRIVATE CPPHTTPLIB_OPENSSL_SUPPORT)
# macOS: required for loading system certificates
if(APPLE)
target_link_libraries(translate-server PRIVATE "-framework CoreFoundation" "-framework Security")
endif()
```
Defining `CPPHTTPLIB_OPENSSL_SUPPORT` enables `httplib::Client("https://...")` to make TLS connections. On macOS, you also need to link the CoreFoundation and Security frameworks to access the system certificate store. See Section 4.8 for the complete `CMakeLists.txt`.
## 4.2 Defining the Model List
Let's define the list of models that the app can handle. Here are four models we've verified for translation tasks.
```cpp
struct ModelInfo {
std::string name; // Display name
std::string params; // Parameter count
std::string size; // GGUF Q4 size
std::string repo; // Hugging Face repository
std::string filename; // GGUF filename
};
const std::vector<ModelInfo> MODELS = {
{
.name = "gemma-2-2b-it",
.params = "2B",
.size = "1.6 GB",
.repo = "bartowski/gemma-2-2b-it-GGUF",
.filename = "gemma-2-2b-it-Q4_K_M.gguf",
},
{
.name = "gemma-2-9b-it",
.params = "9B",
.size = "5.8 GB",
.repo = "bartowski/gemma-2-9b-it-GGUF",
.filename = "gemma-2-9b-it-Q4_K_M.gguf",
},
{
.name = "Llama-3.1-8B-Instruct",
.params = "8B",
.size = "4.9 GB",
.repo = "bartowski/Meta-Llama-3.1-8B-Instruct-GGUF",
.filename = "Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
},
};
```
## 4.3 Model Storage Location
Up through Chapter 3, we stored models in the `models/` directory within the project. However, when managing multiple models, a dedicated app directory makes more sense. On macOS/Linux we use `~/.translate-app/models/`, and on Windows we use `%APPDATA%\translate-app\models\`.
```cpp
std::filesystem::path get_models_dir() {
#ifdef _WIN32
auto env = std::getenv("APPDATA");
auto base = env ? std::filesystem::path(env) : std::filesystem::path(".");
return base / "translate-app" / "models";
#else
auto env = std::getenv("HOME");
auto base = env ? std::filesystem::path(env) : std::filesystem::path(".");
return base / ".translate-app" / "models";
#endif
}
```
If the environment variable isn't set, it falls back to the current directory. The app creates this directory at startup (`create_directories` won't error even if it already exists).
## 4.4 Rewriting Model Initialization
We rewrite the model initialization at the beginning of `main()`. In Chapter 1 we hardcoded the path, but from here on we support model switching. We track the currently loaded filename in `selected_model` and load the first entry in `MODELS` at startup. The `GET /models` and `POST /models/select` handlers reference and update this variable.
Since cpp-httplib runs handlers concurrently on a thread pool, reassigning `llm` while another thread is calling `llm.chat()` would crash. We add a `std::mutex` to protect against this.
```cpp
int main() {
auto models_dir = get_models_dir();
std::filesystem::create_directories(models_dir);
std::string selected_model = MODELS[0].filename;
auto path = models_dir / selected_model;
// Automatically download the default model if not yet present
if (!std::filesystem::exists(path)) {
std::cout << "Downloading " << selected_model << "..." << std::endl;
if (!download_model(MODELS[0], [](int pct) {
std::cout << "\r" << pct << "%" << std::flush;
return true;
})) {
std::cerr << "\nFailed to download model." << std::endl;
return 1;
}
std::cout << std::endl;
}
auto llm = llamalib::Llama{path};
std::mutex llm_mutex; // Protect access during model switching
// ...
}
```
This ensures that users don't need to manually download models with curl on first launch. It uses the `download_model` function from Section 4.6 and displays progress on the console.
## 4.5 The `GET /models` Handler
This returns the model list with information about whether each model has been downloaded and whether it's currently selected.
```cpp
svr.Get("/models",
[&](const httplib::Request &, httplib::Response &res) {
auto arr = json::array();
for (const auto &m : MODELS) {
auto path = get_models_dir() / m.filename;
arr.push_back({
{"name", m.name},
{"params", m.params},
{"size", m.size},
{"downloaded", std::filesystem::exists(path)},
{"selected", m.filename == selected_model},
});
}
res.set_content(json{{"models", arr}}.dump(), "application/json");
});
```
## 4.6 Downloading Large Files
GGUF models are several gigabytes, so we can't load the entire file into memory. By passing callbacks to `httplib::Client::Get`, we can receive data chunk by chunk.
```cpp
// content_receiver: callback that receives data chunks
// progress: download progress callback
cli.Get(url,
[&](const char *data, size_t len) { // content_receiver
ofs.write(data, len);
return true; // returning false aborts the download
},
[&](size_t current, size_t total) { // progress
int pct = total ? (int)(current * 100 / total) : 0;
std::cout << pct << "%" << std::endl;
return true; // returning false aborts the download
});
```
Let's use this to create a function that downloads models from Hugging Face.
```cpp
#include <filesystem>
#include <fstream>
// Download a model and report progress via progress_cb.
// If progress_cb returns false, the download is aborted.
bool download_model(const ModelInfo &model,
std::function<bool(int)> progress_cb) {
httplib::Client cli("https://huggingface.co");
cli.set_follow_location(true);
cli.set_read_timeout(std::chrono::hours(1));
auto url = "/" + model.repo + "/resolve/main/" + model.filename;
auto path = get_models_dir() / model.filename;
auto tmp_path = std::filesystem::path(path).concat(".tmp");
std::ofstream ofs(tmp_path, std::ios::binary);
if (!ofs) { return false; }
auto res = cli.Get(url,
[&](const char *data, size_t len) {
ofs.write(data, len);
return ofs.good();
},
[&](size_t current, size_t total) {
return progress_cb(total ? (int)(current * 100 / total) : 0);
});
ofs.close();
if (!res || res->status != 200) {
std::filesystem::remove(tmp_path);
return false;
}
// Write to .tmp first, then rename, so that an incomplete file
// is never mistaken for a usable model if the download is interrupted
std::filesystem::rename(tmp_path, path);
return true;
}
```
## 4.7 The `/models/select` Handler
This handles model selection requests. We always respond with SSE, reporting status in sequence: download progress, loading, and ready.
```cpp
svr.Post("/models/select",
[&](const httplib::Request &req, httplib::Response &res) {
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded() || !input.contains("model")) {
res.status = 400;
res.set_content(json{{"error", "'model' is required"}}.dump(),
"application/json");
return;
}
auto name = input["model"].get<std::string>();
// Find the model in the list
auto it = std::find_if(MODELS.begin(), MODELS.end(),
[&](const ModelInfo &m) { return m.name == name; });
if (it == MODELS.end()) {
res.status = 404;
res.set_content(json{{"error", "Unknown model"}}.dump(),
"application/json");
return;
}
const auto &model = *it;
// Always respond with SSE (same format whether already downloaded or not)
res.set_chunked_content_provider(
"text/event-stream",
[&, model](size_t, httplib::DataSink &sink) {
// SSE event sending helper
auto send = [&](const json &event) {
sink.os << "data: " << event.dump() << "\n\n";
};
// Download if not yet present (report progress via SSE)
auto path = get_models_dir() / model.filename;
if (!std::filesystem::exists(path)) {
bool ok = download_model(model, [&](int pct) {
send({{"status", "downloading"}, {"progress", pct}});
return sink.os.good(); // Abort download on client disconnect
});
if (!ok) {
send({{"status", "error"}, {"message", "Download failed"}});
sink.done();
return true;
}
}
// Load and switch to the model
send({{"status", "loading"}});
{
std::lock_guard<std::mutex> lock(llm_mutex);
llm = llamalib::Llama{path};
selected_model = model.filename;
}
send({{"status", "ready"}});
sink.done();
return true;
});
});
```
A few notes:
- We send SSE events directly from the `download_model` progress callback. This is an application of `set_chunked_content_provider` + `sink.os` from Chapter 3
- Since the callback returns `sink.os.good()`, the download stops if the client disconnects. The cancel button we add in Chapter 5 uses this
- When we update `selected_model`, it's reflected in the `selected` flag of `GET /models`
- The `llm` reassignment is protected by `llm_mutex`. The `/translate` and `/translate/stream` handlers also lock the same mutex, so inference can't run during a model switch (see the complete code)
## 4.8 Complete Code
Here is the complete code with model management added to the Chapter 3 code.
<details>
<summary data-file="CMakeLists.txt">Complete code (CMakeLists.txt)</summary>
```cmake
cmake_minimum_required(VERSION 3.20)
project(translate-server CXX)
set(CMAKE_CXX_STANDARD 20)
include(FetchContent)
# llama.cpp
FetchContent_Declare(llama
GIT_REPOSITORY https://github.com/ggml-org/llama.cpp
GIT_TAG master
GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(llama)
# cpp-httplib
FetchContent_Declare(httplib
GIT_REPOSITORY https://github.com/yhirose/cpp-httplib
GIT_TAG master
)
FetchContent_MakeAvailable(httplib)
# nlohmann/json
FetchContent_Declare(json
URL https://github.com/nlohmann/json/releases/download/v3.11.3/json.tar.xz
)
FetchContent_MakeAvailable(json)
# cpp-llamalib
FetchContent_Declare(cpp_llamalib
GIT_REPOSITORY https://github.com/yhirose/cpp-llamalib
GIT_TAG main
)
FetchContent_MakeAvailable(cpp_llamalib)
find_package(OpenSSL REQUIRED)
add_executable(translate-server src/main.cpp)
target_link_libraries(translate-server PRIVATE
httplib::httplib
nlohmann_json::nlohmann_json
cpp-llamalib
OpenSSL::SSL OpenSSL::Crypto
)
target_compile_definitions(translate-server PRIVATE CPPHTTPLIB_OPENSSL_SUPPORT)
if(APPLE)
target_link_libraries(translate-server PRIVATE
"-framework CoreFoundation"
"-framework Security"
)
endif()
```
</details>
<details>
<summary data-file="main.cpp">Complete code (main.cpp)</summary>
```cpp
#include <httplib.h>
#include <nlohmann/json.hpp>
#include <cpp-llamalib.h>
#include <algorithm>
#include <csignal>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <mutex>
using json = nlohmann::json;
// -------------------------------------------------------------------------
// Model definitions
// -------------------------------------------------------------------------
struct ModelInfo {
std::string name;
std::string params;
std::string size;
std::string repo;
std::string filename;
};
const std::vector<ModelInfo> MODELS = {
{
.name = "gemma-2-2b-it",
.params = "2B",
.size = "1.6 GB",
.repo = "bartowski/gemma-2-2b-it-GGUF",
.filename = "gemma-2-2b-it-Q4_K_M.gguf",
},
{
.name = "gemma-2-9b-it",
.params = "9B",
.size = "5.8 GB",
.repo = "bartowski/gemma-2-9b-it-GGUF",
.filename = "gemma-2-9b-it-Q4_K_M.gguf",
},
{
.name = "Llama-3.1-8B-Instruct",
.params = "8B",
.size = "4.9 GB",
.repo = "bartowski/Meta-Llama-3.1-8B-Instruct-GGUF",
.filename = "Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
},
};
// -------------------------------------------------------------------------
// Model storage directory
// -------------------------------------------------------------------------
std::filesystem::path get_models_dir() {
#ifdef _WIN32
auto env = std::getenv("APPDATA");
auto base = env ? std::filesystem::path(env) : std::filesystem::path(".");
return base / "translate-app" / "models";
#else
auto env = std::getenv("HOME");
auto base = env ? std::filesystem::path(env) : std::filesystem::path(".");
return base / ".translate-app" / "models";
#endif
}
// -------------------------------------------------------------------------
// Model download
// -------------------------------------------------------------------------
// If progress_cb returns false, the download is aborted
bool download_model(const ModelInfo &model,
std::function<bool(int)> progress_cb) {
httplib::Client cli("https://huggingface.co");
cli.set_follow_location(true); // Hugging Face redirects to a CDN
cli.set_read_timeout(std::chrono::hours(1)); // Set a long timeout for large models
auto url = "/" + model.repo + "/resolve/main/" + model.filename;
auto path = get_models_dir() / model.filename;
auto tmp_path = std::filesystem::path(path).concat(".tmp");
std::ofstream ofs(tmp_path, std::ios::binary);
if (!ofs) { return false; }
auto res = cli.Get(url,
// content_receiver: receive data chunk by chunk and write to file
[&](const char *data, size_t len) {
ofs.write(data, len);
return ofs.good();
},
// progress: report download progress (returning false aborts)
[&, last_pct = -1](size_t current, size_t total) mutable {
int pct = total ? (int)(current * 100 / total) : 0;
if (pct == last_pct) return true; // Skip if same value
last_pct = pct;
return progress_cb(pct);
});
ofs.close();
if (!res || res->status != 200) {
std::filesystem::remove(tmp_path);
return false;
}
// Rename after download completes
std::filesystem::rename(tmp_path, path);
return true;
}
// -------------------------------------------------------------------------
// Server
// -------------------------------------------------------------------------
httplib::Server svr;
void signal_handler(int sig) {
if (sig == SIGINT || sig == SIGTERM) {
std::cout << "\nReceived signal, shutting down gracefully...\n";
svr.stop();
}
}
int main() {
// Create the model storage directory
auto models_dir = get_models_dir();
std::filesystem::create_directories(models_dir);
// Automatically download the default model if not yet present
std::string selected_model = MODELS[0].filename;
auto path = models_dir / selected_model;
if (!std::filesystem::exists(path)) {
std::cout << "Downloading " << selected_model << "..." << std::endl;
if (!download_model(MODELS[0], [](int pct) {
std::cout << "\r" << pct << "%" << std::flush;
return true;
})) {
std::cerr << "\nFailed to download model." << std::endl;
return 1;
}
std::cout << std::endl;
}
auto llm = llamalib::Llama{path};
std::mutex llm_mutex; // Protect access during model switching
// Set a long timeout since LLM inference takes time (default is 5 seconds)
svr.set_read_timeout(300);
svr.set_write_timeout(300);
svr.set_logger([](const auto &req, const auto &res) {
std::cout << req.method << " " << req.path << " -> " << res.status
<< std::endl;
});
svr.Get("/health", [](const httplib::Request &, httplib::Response &res) {
res.set_content(json{{"status", "ok"}}.dump(), "application/json");
});
// --- Translation endpoint (Chapter 2) ------------------------------------
svr.Post("/translate",
[&](const httplib::Request &req, httplib::Response &res) {
// JSON parsing and validation (see Chapter 2 for details)
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded()) {
res.status = 400;
res.set_content(json{{"error", "Invalid JSON"}}.dump(),
"application/json");
return;
}
if (!input.contains("text") || !input["text"].is_string() ||
input["text"].get<std::string>().empty()) {
res.status = 400;
res.set_content(json{{"error", "'text' is required"}}.dump(),
"application/json");
return;
}
auto text = input["text"].get<std::string>();
auto target_lang = input.value("target_lang", "ja");
auto prompt = "Translate the following text to " + target_lang +
". Output only the translation, nothing else.\n\n" + text;
try {
std::lock_guard<std::mutex> lock(llm_mutex);
auto translation = llm.chat(prompt);
res.set_content(json{{"translation", translation}}.dump(),
"application/json");
} catch (const std::exception &e) {
res.status = 500;
res.set_content(json{{"error", e.what()}}.dump(), "application/json");
}
});
// --- SSE streaming translation (Chapter 3) -------------------------------
svr.Post("/translate/stream",
[&](const httplib::Request &req, httplib::Response &res) {
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded()) {
res.status = 400;
res.set_content(json{{"error", "Invalid JSON"}}.dump(),
"application/json");
return;
}
if (!input.contains("text") || !input["text"].is_string() ||
input["text"].get<std::string>().empty()) {
res.status = 400;
res.set_content(json{{"error", "'text' is required"}}.dump(),
"application/json");
return;
}
auto text = input["text"].get<std::string>();
auto target_lang = input.value("target_lang", "ja");
auto prompt = "Translate the following text to " + target_lang +
". Output only the translation, nothing else.\n\n" + text;
res.set_chunked_content_provider(
"text/event-stream",
[&, prompt](size_t, httplib::DataSink &sink) {
std::lock_guard<std::mutex> lock(llm_mutex);
try {
llm.chat(prompt, [&](std::string_view token) {
sink.os << "data: "
<< json(std::string(token)).dump(
-1, ' ', false, json::error_handler_t::replace)
<< "\n\n";
return sink.os.good(); // Abort inference on disconnect
});
sink.os << "data: [DONE]\n\n";
} catch (const std::exception &e) {
sink.os << "data: " << json({{"error", e.what()}}).dump() << "\n\n";
}
sink.done();
return true;
});
});
// --- Model list (Chapter 4) ----------------------------------------------
svr.Get("/models",
[&](const httplib::Request &, httplib::Response &res) {
auto models_dir = get_models_dir();
auto arr = json::array();
for (const auto &m : MODELS) {
auto path = models_dir / m.filename;
arr.push_back({
{"name", m.name},
{"params", m.params},
{"size", m.size},
{"downloaded", std::filesystem::exists(path)},
{"selected", m.filename == selected_model},
});
}
res.set_content(json{{"models", arr}}.dump(), "application/json");
});
// --- Model selection (Chapter 4) -----------------------------------------
svr.Post("/models/select",
[&](const httplib::Request &req, httplib::Response &res) {
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded() || !input.contains("model")) {
res.status = 400;
res.set_content(json{{"error", "'model' is required"}}.dump(),
"application/json");
return;
}
auto name = input["model"].get<std::string>();
auto it = std::find_if(MODELS.begin(), MODELS.end(),
[&](const ModelInfo &m) { return m.name == name; });
if (it == MODELS.end()) {
res.status = 404;
res.set_content(json{{"error", "Unknown model"}}.dump(),
"application/json");
return;
}
const auto &model = *it;
// Always respond with SSE (same format whether already downloaded or not)
res.set_chunked_content_provider(
"text/event-stream",
[&, model](size_t, httplib::DataSink &sink) {
// SSE event sending helper
auto send = [&](const json &event) {
sink.os << "data: " << event.dump() << "\n\n";
};
// Download if not yet present (report progress via SSE)
auto path = get_models_dir() / model.filename;
if (!std::filesystem::exists(path)) {
bool ok = download_model(model, [&](int pct) {
send({{"status", "downloading"}, {"progress", pct}});
return sink.os.good(); // Abort download on client disconnect
});
if (!ok) {
send({{"status", "error"}, {"message", "Download failed"}});
sink.done();
return true;
}
}
// Load and switch to the model
send({{"status", "loading"}});
{
std::lock_guard<std::mutex> lock(llm_mutex);
llm = llamalib::Llama{path};
selected_model = model.filename;
}
send({{"status", "ready"}});
sink.done();
return true;
});
});
// Allow the server to be stopped with `Ctrl+C` (`SIGINT`) or `kill` (`SIGTERM`)
signal(SIGINT, signal_handler);
signal(SIGTERM, signal_handler);
std::cout << "Listening on http://127.0.0.1:8080" << std::endl;
svr.listen("127.0.0.1", 8080);
}
```
</details>
## 4.9 Testing
Since we added OpenSSL configuration to CMakeLists.txt, we need to re-run CMake before building.
```bash
cmake -B build
cmake --build build -j
./build/translate-server
```
### Checking the Model List
```bash
curl http://localhost:8080/models
```
The gemma-2-2b-it model downloaded in Chapter 1 should show `downloaded: true` and `selected: true`.
### Switching to a Different Model
```bash
curl -N -X POST http://localhost:8080/models/select \
-H "Content-Type: application/json" \
-d '{"model": "gemma-2-9b-it"}'
```
Download progress streams via SSE, and `"ready"` appears when it's done.
### Comparing Translations Across Models
Let's translate the same sentence with different models.
```bash
# Translate with gemma-2-9b-it (the model we just switched to)
curl -X POST http://localhost:8080/translate \
-H "Content-Type: application/json" \
-d '{"text": "The quick brown fox jumps over the lazy dog.", "target_lang": "ja"}'
# Switch back to gemma-2-2b-it
curl -N -X POST http://localhost:8080/models/select \
-H "Content-Type: application/json" \
-d '{"model": "gemma-2-2b-it"}'
# Translate the same sentence
curl -X POST http://localhost:8080/translate \
-H "Content-Type: application/json" \
-d '{"text": "The quick brown fox jumps over the lazy dog.", "target_lang": "ja"}'
```
Translation results vary depending on the model, even with the same code and the same prompt. Since cpp-llamalib automatically applies the appropriate chat template for each model, no code changes are needed.
## Next Chapter
The server's main features are now complete: REST API, SSE streaming, and model download and switching. In the next chapter, we'll add static file serving and build a Web UI you can use from a browser.
**Next:** [Adding a Web UI](../ch05-web-ui)

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,724 @@
---
title: "6. Turning It into a Desktop App with WebView"
order: 6
---
In Chapter 5, we completed a translation app you can use from a browser. But every time, you have to start the server, open the URL in a browser... Wouldn't it be nice to just double-click and start using it, like a normal app?
In this chapter, we'll do two things:
1. **WebView integration** — Use [webview/webview](https://github.com/webview/webview) to turn it into a desktop app that runs without a browser
2. **Single binary packaging** — Use [cpp-embedlib](https://github.com/yhirose/cpp-embedlib) to embed HTML/CSS/JS into the binary, making the distributable a single file
When finished, you'll be able to just run `./translate-app` to open a window and start translating.
![Desktop App](../app.png#large-center)
The model downloads automatically on first launch, so the only thing you need to give users is the single binary.
## 6.1 Introducing webview/webview
[webview/webview](https://github.com/webview/webview) is a library that lets you use the OS's native WebView component (WKWebView on macOS, WebKitGTK on Linux, WebView2 on Windows) from C/C++. Unlike Electron, it doesn't bundle its own browser, so the impact on binary size is negligible.
We'll fetch it with CMake. Add the following to your `CMakeLists.txt`:
```cmake
# webview/webview
FetchContent_Declare(webview
GIT_REPOSITORY https://github.com/webview/webview
GIT_TAG master
)
FetchContent_MakeAvailable(webview)
```
This makes the `webview::core` CMake target available. When you link it with `target_link_libraries`, it automatically sets up include paths and platform-specific frameworks.
> **macOS**: No additional dependencies are needed. WKWebView is built into the system.
>
> **Linux**: WebKitGTK is required. Install it with `sudo apt install libwebkit2gtk-4.1-dev`.
>
> **Windows**: The WebView2 runtime is required. It comes pre-installed on Windows 11. For Windows 10, download it from the [official Microsoft website](https://developer.microsoft.com/en-us/microsoft-edge/webview2/).
## 6.2 Running the Server on a Background Thread
Up through Chapter 5, the server's `listen()` was blocking the main thread. To use WebView, we need to run the server on a separate thread and run the WebView event loop on the main thread.
```cpp
#include "webview/webview.h"
#include <thread>
int main() {
// ... (server setup is the same as Chapter 5) ...
// Start the server on a background thread
auto port = svr.bind_to_any_port("127.0.0.1");
std::thread server_thread([&]() { svr.listen_after_bind(); });
std::cout << "Listening on http://127.0.0.1:" << port << std::endl;
// Display the UI with WebView
webview::webview w(false, nullptr);
w.set_title("Translate App");
w.set_size(1024, 768, WEBVIEW_HINT_NONE);
w.navigate("http://127.0.0.1:" + std::to_string(port));
w.run(); // Block until the window is closed
// Stop the server when the window is closed
svr.stop();
server_thread.join();
}
```
Let's look at the key points:
- **`bind_to_any_port`** — Instead of `listen("127.0.0.1", 8080)`, we let the OS choose an available port. Since desktop apps can be launched multiple times, using a fixed port would cause conflicts
- **`listen_after_bind`** — Starts accepting requests on the port reserved by `bind_to_any_port`. While `listen()` does bind and listen in one call, we need to know the port number first, so we split the operations
- **Shutdown order** — When the WebView window is closed, we stop the server with `svr.stop()` and wait for the thread to finish with `server_thread.join()`. If we reversed the order, WebView would lose access to the server
The `signal_handler` from Chapter 5 is no longer needed. In a desktop app, closing the window means terminating the application.
## 6.3 Embedding Static Files with cpp-embedlib
In Chapter 5, we served files from the `public/` directory, so you'd need to distribute `public/` alongside the binary. With [cpp-embedlib](https://github.com/yhirose/cpp-embedlib), you can embed HTML, CSS, and JavaScript into the binary, packaging the distributable into a single file.
### CMakeLists.txt
Fetch cpp-embedlib and embed `public/`:
```cmake
# cpp-embedlib
FetchContent_Declare(cpp-embedlib
GIT_REPOSITORY https://github.com/yhirose/cpp-embedlib
GIT_TAG main
)
FetchContent_MakeAvailable(cpp-embedlib)
# Embed the public/ directory into the binary
cpp_embedlib_add(WebAssets
FOLDER ${CMAKE_CURRENT_SOURCE_DIR}/public
NAMESPACE Web
)
target_link_libraries(translate-app PRIVATE
WebAssets # Embedded files
cpp-embedlib-httplib # cpp-httplib integration
)
```
`cpp_embedlib_add` converts the files under `public/` into binary data at compile time and creates a static library called `WebAssets`. When linked, you can access the embedded files through a `Web::FS` object. `cpp-embedlib-httplib` is a helper library that provides the `httplib::mount()` function.
### Replacing set_mount_point with httplib::mount
Simply replace Chapter 5's `set_mount_point` with cpp-embedlib's `httplib::mount`:
```cpp
#include <cpp-embedlib-httplib.h>
#include "WebAssets.h"
// Chapter 5:
// svr.set_mount_point("/", "./public");
// Chapter 6:
httplib::mount(svr, Web::FS);
```
`httplib::mount` registers handlers that serve the files embedded in `Web::FS` over HTTP. MIME types are automatically determined from file extensions, so there's no need to manually set `Content-Type`.
The file contents are directly mapped to the binary's data segment, so no memory copies or heap allocations occur.
## 6.4 macOS: Adding the Edit Menu
If you try to paste text into the input field with `Cmd+V`, you'll find it doesn't work. On macOS, keyboard shortcuts like `Cmd+V` (paste) and `Cmd+C` (copy) are routed through the application's menu bar. Since webview/webview doesn't create one, these shortcuts never reach the WebView. We need to add a macOS Edit menu using the Objective-C runtime:
```cpp
#ifdef __APPLE__
#include <objc/objc-runtime.h>
void setup_macos_edit_menu() {
auto cls = [](const char *n) { return (id)objc_getClass(n); };
auto sel = sel_registerName;
auto msg = reinterpret_cast<id (*)(id, SEL)>(objc_msgSend);
auto msg_s = reinterpret_cast<id (*)(id, SEL, const char *)>(objc_msgSend);
auto msg_id = reinterpret_cast<id (*)(id, SEL, id)>(objc_msgSend);
auto msg_v = reinterpret_cast<void (*)(id, SEL, id)>(objc_msgSend);
auto msg_mi = reinterpret_cast<id (*)(id, SEL, id, SEL, id)>(objc_msgSend);
auto str = [&](const char *s) {
return msg_s(cls("NSString"), sel("stringWithUTF8String:"), s);
};
id app = msg(cls("NSApplication"), sel("sharedApplication"));
id mainMenu = msg(msg(cls("NSMenu"), sel("alloc")), sel("init"));
id editItem = msg(msg(cls("NSMenuItem"), sel("alloc")), sel("init"));
id editMenu = msg_id(msg(cls("NSMenu"), sel("alloc")),
sel("initWithTitle:"), str("Edit"));
struct { const char *title; const char *action; const char *key; } items[] = {
{"Undo", "undo:", "z"},
{"Redo", "redo:", "Z"},
{"Cut", "cut:", "x"},
{"Copy", "copy:", "c"},
{"Paste", "paste:", "v"},
{"Select All", "selectAll:", "a"},
};
for (auto &[title, action, key] : items) {
id mi = msg_mi(msg(cls("NSMenuItem"), sel("alloc")),
sel("initWithTitle:action:keyEquivalent:"),
str(title), sel(action), str(key));
msg_v(editMenu, sel("addItem:"), mi);
}
msg_v(editItem, sel("setSubmenu:"), editMenu);
msg_v(mainMenu, sel("addItem:"), editItem);
msg_v(app, sel("setMainMenu:"), mainMenu);
}
#endif
```
Call this before `w.run()`:
```cpp
#ifdef __APPLE__
setup_macos_edit_menu();
#endif
w.run();
```
On Windows and Linux, keyboard shortcuts are delivered directly to the focused control without going through the menu bar, so this workaround is macOS-specific.
## 6.5 Complete Code
<details>
<summary data-file="CMakeLists.txt">Complete code (CMakeLists.txt)</summary>
```cmake
cmake_minimum_required(VERSION 3.20)
project(translate-app CXX)
set(CMAKE_CXX_STANDARD 20)
include(FetchContent)
# llama.cpp
FetchContent_Declare(llama
GIT_REPOSITORY https://github.com/ggml-org/llama.cpp
GIT_TAG master
GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(llama)
# cpp-httplib
FetchContent_Declare(httplib
GIT_REPOSITORY https://github.com/yhirose/cpp-httplib
GIT_TAG master
)
FetchContent_MakeAvailable(httplib)
# nlohmann/json
FetchContent_Declare(json
URL https://github.com/nlohmann/json/releases/download/v3.11.3/json.tar.xz
)
FetchContent_MakeAvailable(json)
# cpp-llamalib
FetchContent_Declare(cpp_llamalib
GIT_REPOSITORY https://github.com/yhirose/cpp-llamalib
GIT_TAG main
)
FetchContent_MakeAvailable(cpp_llamalib)
# webview/webview
FetchContent_Declare(webview
GIT_REPOSITORY https://github.com/webview/webview
GIT_TAG master
)
FetchContent_MakeAvailable(webview)
# cpp-embedlib
FetchContent_Declare(cpp-embedlib
GIT_REPOSITORY https://github.com/yhirose/cpp-embedlib
GIT_TAG main
)
FetchContent_MakeAvailable(cpp-embedlib)
# Embed the public/ directory into the binary
cpp_embedlib_add(WebAssets
FOLDER ${CMAKE_CURRENT_SOURCE_DIR}/public
NAMESPACE Web
)
find_package(OpenSSL REQUIRED)
add_executable(translate-app src/main.cpp)
target_link_libraries(translate-app PRIVATE
httplib::httplib
nlohmann_json::nlohmann_json
cpp-llamalib
OpenSSL::SSL OpenSSL::Crypto
WebAssets
cpp-embedlib-httplib
webview::core
)
if(APPLE)
target_link_libraries(translate-app PRIVATE
"-framework CoreFoundation"
"-framework Security"
)
endif()
target_compile_definitions(translate-app PRIVATE
CPPHTTPLIB_OPENSSL_SUPPORT
)
```
</details>
<details>
<summary data-file="main.cpp">Complete code (main.cpp)</summary>
```cpp
#include <httplib.h>
#include <nlohmann/json.hpp>
#include <cpp-llamalib.h>
#include <cpp-embedlib-httplib.h>
#include "WebAssets.h"
#include "webview/webview.h"
#ifdef __APPLE__
#include <objc/objc-runtime.h>
#endif
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <mutex>
#include <thread>
using json = nlohmann::json;
// -------------------------------------------------------------------------
// macOS Edit menu (Cmd+C/V/X/A require an Edit menu on macOS)
// -------------------------------------------------------------------------
#ifdef __APPLE__
void setup_macos_edit_menu() {
auto cls = [](const char *n) { return (id)objc_getClass(n); };
auto sel = sel_registerName;
auto msg = reinterpret_cast<id (*)(id, SEL)>(objc_msgSend);
auto msg_s = reinterpret_cast<id (*)(id, SEL, const char *)>(objc_msgSend);
auto msg_id = reinterpret_cast<id (*)(id, SEL, id)>(objc_msgSend);
auto msg_v = reinterpret_cast<void (*)(id, SEL, id)>(objc_msgSend);
auto msg_mi = reinterpret_cast<id (*)(id, SEL, id, SEL, id)>(objc_msgSend);
auto str = [&](const char *s) {
return msg_s(cls("NSString"), sel("stringWithUTF8String:"), s);
};
id app = msg(cls("NSApplication"), sel("sharedApplication"));
id mainMenu = msg(msg(cls("NSMenu"), sel("alloc")), sel("init"));
id editItem = msg(msg(cls("NSMenuItem"), sel("alloc")), sel("init"));
id editMenu = msg_id(msg(cls("NSMenu"), sel("alloc")),
sel("initWithTitle:"), str("Edit"));
struct { const char *title; const char *action; const char *key; } items[] = {
{"Undo", "undo:", "z"},
{"Redo", "redo:", "Z"},
{"Cut", "cut:", "x"},
{"Copy", "copy:", "c"},
{"Paste", "paste:", "v"},
{"Select All", "selectAll:", "a"},
};
for (auto &[title, action, key] : items) {
id mi = msg_mi(msg(cls("NSMenuItem"), sel("alloc")),
sel("initWithTitle:action:keyEquivalent:"),
str(title), sel(action), str(key));
msg_v(editMenu, sel("addItem:"), mi);
}
msg_v(editItem, sel("setSubmenu:"), editMenu);
msg_v(mainMenu, sel("addItem:"), editItem);
msg_v(app, sel("setMainMenu:"), mainMenu);
}
#endif
// -------------------------------------------------------------------------
// Model definitions
// -------------------------------------------------------------------------
struct ModelInfo {
std::string name;
std::string params;
std::string size;
std::string repo;
std::string filename;
};
const std::vector<ModelInfo> MODELS = {
{
.name = "gemma-2-2b-it",
.params = "2B",
.size = "1.6 GB",
.repo = "bartowski/gemma-2-2b-it-GGUF",
.filename = "gemma-2-2b-it-Q4_K_M.gguf",
},
{
.name = "gemma-2-9b-it",
.params = "9B",
.size = "5.8 GB",
.repo = "bartowski/gemma-2-9b-it-GGUF",
.filename = "gemma-2-9b-it-Q4_K_M.gguf",
},
{
.name = "Llama-3.1-8B-Instruct",
.params = "8B",
.size = "4.9 GB",
.repo = "bartowski/Meta-Llama-3.1-8B-Instruct-GGUF",
.filename = "Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
},
};
// -------------------------------------------------------------------------
// Model storage directory
// -------------------------------------------------------------------------
std::filesystem::path get_models_dir() {
#ifdef _WIN32
auto env = std::getenv("APPDATA");
auto base = env ? std::filesystem::path(env) : std::filesystem::path(".");
return base / "translate-app" / "models";
#else
auto env = std::getenv("HOME");
auto base = env ? std::filesystem::path(env) : std::filesystem::path(".");
return base / ".translate-app" / "models";
#endif
}
// -------------------------------------------------------------------------
// Model download
// -------------------------------------------------------------------------
// Abort the download if progress_cb returns false
bool download_model(const ModelInfo &model,
std::function<bool(int)> progress_cb) {
httplib::Client cli("https://huggingface.co");
cli.set_follow_location(true); // Hugging Face redirects to a CDN
cli.set_read_timeout(std::chrono::hours(1)); // Long timeout for large models
auto url = "/" + model.repo + "/resolve/main/" + model.filename;
auto path = get_models_dir() / model.filename;
auto tmp_path = std::filesystem::path(path).concat(".tmp");
std::ofstream ofs(tmp_path, std::ios::binary);
if (!ofs) { return false; }
auto res = cli.Get(url,
// content_receiver: Receive data chunk by chunk and write to file
[&](const char *data, size_t len) {
ofs.write(data, len);
return ofs.good();
},
// progress: Report download progress (return false to abort)
[&, last_pct = -1](size_t current, size_t total) mutable {
int pct = total ? (int)(current * 100 / total) : 0;
if (pct == last_pct) return true; // Skip if the value hasn't changed
last_pct = pct;
return progress_cb(pct);
});
ofs.close();
if (!res || res->status != 200) {
std::filesystem::remove(tmp_path);
return false;
}
// Rename after download completes
std::filesystem::rename(tmp_path, path);
return true;
}
// -------------------------------------------------------------------------
// Server
// -------------------------------------------------------------------------
int main() {
httplib::Server svr;
// Create the model storage directory
auto models_dir = get_models_dir();
std::filesystem::create_directories(models_dir);
// Auto-download the default model if not already present
std::string selected_model = MODELS[0].filename;
auto path = models_dir / selected_model;
if (!std::filesystem::exists(path)) {
std::cout << "Downloading " << selected_model << "..." << std::endl;
if (!download_model(MODELS[0], [](int pct) {
std::cout << "\r" << pct << "%" << std::flush;
return true;
})) {
std::cerr << "\nFailed to download model." << std::endl;
return 1;
}
std::cout << std::endl;
}
auto llm = llamalib::Llama{path};
std::mutex llm_mutex; // Protect access during model switching
// Set a long timeout since LLM inference takes time (default is 5 seconds)
svr.set_read_timeout(300);
svr.set_write_timeout(300);
svr.set_logger([](const auto &req, const auto &res) {
std::cout << req.method << " " << req.path << " -> " << res.status
<< std::endl;
});
svr.Get("/health", [](const httplib::Request &, httplib::Response &res) {
res.set_content(json{{"status", "ok"}}.dump(), "application/json");
});
// --- Translation endpoint (Chapter 2) ------------------------------------
svr.Post("/translate",
[&](const httplib::Request &req, httplib::Response &res) {
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded()) {
res.status = 400;
res.set_content(json{{"error", "Invalid JSON"}}.dump(),
"application/json");
return;
}
if (!input.contains("text") || !input["text"].is_string() ||
input["text"].get<std::string>().empty()) {
res.status = 400;
res.set_content(json{{"error", "'text' is required"}}.dump(),
"application/json");
return;
}
auto text = input["text"].get<std::string>();
auto target_lang = input.value("target_lang", "ja");
auto prompt = "Translate the following text to " + target_lang +
". Output only the translation, nothing else.\n\n" + text;
try {
std::lock_guard<std::mutex> lock(llm_mutex);
auto translation = llm.chat(prompt);
res.set_content(json{{"translation", translation}}.dump(),
"application/json");
} catch (const std::exception &e) {
res.status = 500;
res.set_content(json{{"error", e.what()}}.dump(), "application/json");
}
});
// --- SSE streaming translation (Chapter 3) -------------------------------
svr.Post("/translate/stream",
[&](const httplib::Request &req, httplib::Response &res) {
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded()) {
res.status = 400;
res.set_content(json{{"error", "Invalid JSON"}}.dump(),
"application/json");
return;
}
if (!input.contains("text") || !input["text"].is_string() ||
input["text"].get<std::string>().empty()) {
res.status = 400;
res.set_content(json{{"error", "'text' is required"}}.dump(),
"application/json");
return;
}
auto text = input["text"].get<std::string>();
auto target_lang = input.value("target_lang", "ja");
auto prompt = "Translate the following text to " + target_lang +
". Output only the translation, nothing else.\n\n" + text;
res.set_chunked_content_provider(
"text/event-stream",
[&, prompt](size_t, httplib::DataSink &sink) {
std::lock_guard<std::mutex> lock(llm_mutex);
try {
llm.chat(prompt, [&](std::string_view token) {
sink.os << "data: "
<< json(std::string(token)).dump(
-1, ' ', false, json::error_handler_t::replace)
<< "\n\n";
return sink.os.good(); // Abort inference on disconnect
});
sink.os << "data: [DONE]\n\n";
} catch (const std::exception &e) {
sink.os << "data: " << json({{"error", e.what()}}).dump() << "\n\n";
}
sink.done();
return true;
});
});
// --- Model list (Chapter 4) ----------------------------------------------
svr.Get("/models",
[&](const httplib::Request &, httplib::Response &res) {
auto models_dir = get_models_dir();
auto arr = json::array();
for (const auto &m : MODELS) {
auto path = models_dir / m.filename;
arr.push_back({
{"name", m.name},
{"params", m.params},
{"size", m.size},
{"downloaded", std::filesystem::exists(path)},
{"selected", m.filename == selected_model},
});
}
res.set_content(json{{"models", arr}}.dump(), "application/json");
});
// --- Model selection (Chapter 4) -----------------------------------------
svr.Post("/models/select",
[&](const httplib::Request &req, httplib::Response &res) {
auto input = json::parse(req.body, nullptr, false);
if (input.is_discarded() || !input.contains("model")) {
res.status = 400;
res.set_content(json{{"error", "'model' is required"}}.dump(),
"application/json");
return;
}
auto name = input["model"].get<std::string>();
auto it = std::find_if(MODELS.begin(), MODELS.end(),
[&](const ModelInfo &m) { return m.name == name; });
if (it == MODELS.end()) {
res.status = 404;
res.set_content(json{{"error", "Unknown model"}}.dump(),
"application/json");
return;
}
const auto &model = *it;
// Always respond with SSE (same format whether downloaded or not)
res.set_chunked_content_provider(
"text/event-stream",
[&, model](size_t, httplib::DataSink &sink) {
// SSE event sending helper
auto send = [&](const json &event) {
sink.os << "data: " << event.dump() << "\n\n";
};
// Download if not yet downloaded (report progress via SSE)
auto path = get_models_dir() / model.filename;
if (!std::filesystem::exists(path)) {
bool ok = download_model(model, [&](int pct) {
send({{"status", "downloading"}, {"progress", pct}});
return sink.os.good(); // Abort download on client disconnect
});
if (!ok) {
send({{"status", "error"}, {"message", "Download failed"}});
sink.done();
return true;
}
}
// Load and switch to the model
send({{"status", "loading"}});
{
std::lock_guard<std::mutex> lock(llm_mutex);
llm = llamalib::Llama{path};
selected_model = model.filename;
}
send({{"status", "ready"}});
sink.done();
return true;
});
});
// --- Embedded file serving (Chapter 6) ------------------------------------
// Chapter 5: svr.set_mount_point("/", "./public");
httplib::mount(svr, Web::FS);
// Start the server on a background thread
auto port = svr.bind_to_any_port("127.0.0.1");
std::thread server_thread([&]() { svr.listen_after_bind(); });
std::cout << "Listening on http://127.0.0.1:" << port << std::endl;
// Display the UI with WebView
webview::webview w(false, nullptr);
w.set_title("Translate App");
w.set_size(1024, 768, WEBVIEW_HINT_NONE);
w.navigate("http://127.0.0.1:" + std::to_string(port));
#ifdef __APPLE__
setup_macos_edit_menu();
#endif
w.run(); // Block until the window is closed
// Stop the server when the window is closed
svr.stop();
server_thread.join();
}
```
</details>
To summarize the changes from Chapter 5:
- `#include <csignal>` replaced with `#include <thread>`, `<cpp-embedlib-httplib.h>`, `"WebAssets.h"`, `"webview/webview.h"`
- Removed the `signal_handler` function
- `svr.set_mount_point("/", "./public")` replaced with `httplib::mount(svr, Web::FS)`
- `svr.listen("127.0.0.1", 8080)` replaced with `bind_to_any_port` + `listen_after_bind` + WebView event loop
Not a single line of handler code has changed. The REST API, SSE streaming, and model management built through Chapter 5 all work as-is.
## 6.6 Building and Testing
```bash
cmake -B build
cmake --build build -j
```
Launch the app:
```bash
./build/translate-app
```
No browser is needed. A window opens automatically. The same UI from Chapter 5 appears as-is, and translation and model switching all work just the same.
When you close the window, the server shuts down automatically. There's no need for `Ctrl+C`.
### What Needs to Be Distributed
You only need to distribute:
- The single `translate-app` binary
That's it. You don't need the `public/` directory. HTML, CSS, and JavaScript are embedded in the binary. Model files download automatically on first launch, so there's no need to ask users to prepare anything in advance.
## Next Chapter
Congratulations! 🎉
In Chapter 1, `/health` just returned `{"status":"ok"}`. Now we have a desktop app where you type text and translations stream in real time, pick a different model from a dropdown and it downloads automatically, and closing the window cleanly shuts everything down — all in a single distributable binary.
What we changed in this chapter was just the static file serving and the server startup. Not a single line of handler code changed. The REST API, SSE streaming, and model management we built through Chapter 5 all work as a desktop app, as-is.
In the next chapter, we'll shift perspective and read through the code of llama.cpp's own `llama-server`. Let's compare our simple server with a production-quality one and see what design decisions differ and why.
**Next:** [Reading the llama.cpp Server Source Code](../ch07-code-reading)

View File

@@ -0,0 +1,154 @@
---
title: "7. Reading the llama.cpp Server Source Code"
order: 7
---
Over the course of six chapters, we built a translation desktop app from scratch. We have a working product, but it's ultimately a "learning-oriented" implementation. So how does "production-quality" code differ? Let's read the source code of `llama-server`, the official server bundled with llama.cpp, and compare.
`llama-server` is located at `llama.cpp/tools/server/`. It uses the same cpp-httplib, so you can read the code the same way as in the previous chapters.
## 7.1 Source Code Location
```ascii
llama.cpp/tools/server/
├── server.cpp # Main server implementation
├── httplib.h # cpp-httplib (bundled version)
└── ...
```
The code is contained in a single `server.cpp`. It runs to several thousand lines, but once you understand the structure, you can narrow down the parts worth reading.
## 7.2 OpenAI-Compatible API
The biggest difference between the server we built and `llama-server` is the API design.
**Our API:**
```text
POST /translate → {"translation": "..."}
POST /translate/stream → SSE: data: "token"
```
**llama-server's API:**
```text
POST /v1/chat/completions → OpenAI-compatible JSON
POST /v1/completions → OpenAI-compatible JSON
POST /v1/embeddings → Text embedding vectors
```
`llama-server` conforms to [OpenAI's API specification](https://platform.openai.com/docs/api-reference). This means OpenAI's official client libraries (such as the Python `openai` package) work out of the box.
```python
# Example of connecting to llama-server with the OpenAI client
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8080/v1", api_key="dummy")
response = client.chat.completions.create(
model="local-model",
messages=[{"role": "user", "content": "Hello!"}]
)
```
Compatibility with existing tools and libraries is a big design decision. We designed a simple translation-specific API, but if you're building a general-purpose server, OpenAI compatibility has become the de facto standard.
## 7.3 Concurrent Request Handling
Our server processes requests one at a time. If another request arrives while a translation is in progress, it waits until the previous inference finishes. This is fine for a desktop app used by one person, but it becomes a problem for a server shared by multiple users.
`llama-server` handles concurrent requests through a mechanism called **slots**.
![llama-server's slot management](../slots.svg#half)
The key point is that tokens from each slot are not inferred **one by one in sequence**, but rather **all at once in a single batch**. GPUs excel at parallel processing, so processing two users simultaneously takes almost the same time as processing one. This is called "continuous batching."
In our server, cpp-httplib's thread pool assigns one thread per request, but the inference itself runs single-threaded inside `llm.chat()`. `llama-server` consolidates this inference step into a shared batch processing loop.
## 7.4 Differences in SSE Format
The streaming mechanism itself is the same (`set_chunked_content_provider` + SSE), but the data format differs.
**Our format:**
```text
data: "去年の"
data: "春に"
data: [DONE]
```
**llama-server (OpenAI-compatible):**
```text
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"delta":{"content":"去年の"}}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"delta":{"content":"春に"}}]}
data: [DONE]
```
Our format simply sends the tokens. Because `llama-server` follows the OpenAI specification, even a single token comes wrapped in JSON. It may look verbose, but it includes useful information for clients, like an `id` to identify the request and a `finish_reason` to indicate why generation stopped.
## 7.5 KV Cache Reuse
In our server, we process the entire prompt from scratch on every request. Our translation app's prompt is short ("Translate the following text to ja..." + input text), so this isn't a problem.
`llama-server` reuses the KV cache for the prefix portion when a request shares a common prompt prefix with a previous request.
![KV cache reuse](../kv-cache.svg#half)
For chatbots that send a long system prompt and few-shot examples with every request, this alone dramatically reduces response time. The difference is night and day: processing several thousand tokens of system prompt every time versus reading them from cache in an instant.
For our translation app, where the system prompt is just a single sentence, the benefit is limited. However, it's an optimization worth keeping in mind when applying this to your own applications.
## 7.6 Structured Output
Since our translation API returns plain text, there was no need to constrain the output format. But what if you want the LLM to respond in JSON?
```text
Prompt: Analyze the sentiment of the following text and return it as JSON.
LLM output (expected): {"sentiment": "positive", "score": 0.8}
LLM output (reality): Here are the results of the sentiment analysis. {"sentiment": ...
```
LLMs sometimes ignore instructions and add extraneous text. `llama-server` solves this problem with **grammar constraints**.
```bash
curl http://localhost:8080/v1/chat/completions \
-d '{
"messages": [{"role": "user", "content": "Analyze sentiment..."}],
"json_schema": {
"type": "object",
"properties": {
"sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
"score": {"type": "number"}
},
"required": ["sentiment", "score"]
}
}'
```
When you specify `json_schema`, tokens that don't conform to the grammar are excluded during token generation. This guarantees that the output is always valid JSON, so there's no need to worry about `json::parse` failing.
When embedding LLMs into applications, whether you can reliably parse the output directly impacts reliability. Grammar constraints are unnecessary for free-text output like translation, but they're essential for use cases where you need to return structured data as an API response.
## 7.7 Summary
Let's organize the differences we've covered.
| Aspect | Our Server | llama-server |
|------|-------------|--------------|
| API design | Translation-specific | OpenAI-compatible |
| Concurrent requests | Sequential processing | Slots + continuous batching |
| SSE format | Tokens only | OpenAI-compatible JSON |
| KV cache | Cleared each time | Prefix reuse |
| Structured output | None | JSON Schema / grammar constraints |
| Code size | ~200 lines | Several thousand lines |
Our code is simple because of the assumption that "one person uses it as a desktop app." If you're building a server for multiple users or one that integrates with the existing ecosystem, `llama-server`'s design serves as a valuable reference.
Conversely, even 200 lines of code is enough to make a fully functional translation app. I hope this code reading exercise has also conveyed the value of "building only what you need."
## Next Chapter
In the next chapter, we'll cover the key points for swapping in your own library and customizing the app to make it truly yours.
**Next:** [Making It Your Own](../ch08-customization)

View File

@@ -0,0 +1,120 @@
---
title: "8. Making It Your Own"
order: 8
---
Through Chapter 7, we've built a translation desktop app and studied how production-quality code differs. In this chapter, let's go over the key points for **turning this app into something entirely your own**.
The translation app was just a vehicle. Replace llama.cpp with your own library, and the same architecture works for any application.
## 8.1 Swapping Out the Build Configuration
First, replace the llama.cpp-related `FetchContent` entries in `CMakeLists.txt` with your own library.
```cmake
# Remove: llama.cpp and cpp-llamalib FetchContent
# Add: your own library
FetchContent_Declare(my_lib
GIT_REPOSITORY https://github.com/yourname/my-lib
GIT_TAG main
)
FetchContent_MakeAvailable(my_lib)
target_link_libraries(my-app PRIVATE
httplib::httplib
nlohmann_json::nlohmann_json
my_lib # Your library instead of cpp-llamalib
# ...
)
```
If your library doesn't support CMake, you can place the header and source files directly in `src/` and add them to `add_executable`. Keep cpp-httplib, nlohmann/json, and webview as they are.
## 8.2 Adapting the API to Your Task
Change the translation API's endpoints and parameters to match your task.
| Translation app | Your app (e.g., image processing) |
|---|---|
| `POST /translate` | `POST /process` |
| `{"text": "...", "target_lang": "ja"}` | `{"image": "base64...", "filter": "blur"}` |
| `POST /translate/stream` | `POST /process/stream` |
| `GET /models` | `GET /filters` or `GET /presets` |
Then update each handler's implementation. For example, just replace the `llm.chat()` calls with your own library's API.
```cpp
// Before: LLM translation
auto translation = llm.chat(prompt);
res.set_content(json{{"translation", translation}}.dump(), "application/json");
// After: e.g., an image processing library
auto result = my_lib::process(input_image, options);
res.set_content(json{{"result", result}}.dump(), "application/json");
```
The same goes for SSE streaming. If your library has a function that reports progress via a callback, you can use the exact same pattern from Chapter 3 to send incremental responses. SSE isn't limited to LLMs — it's useful for any time-consuming task: image processing progress, data conversion steps, long-running computations.
## 8.3 Design Considerations
### Libraries with Expensive Initialization
In this book, we load the LLM model at the top of `main()` and keep it in a variable. This is intentional. Loading the model on every request would take several seconds, so we load it once at startup and reuse it. If your library has expensive initialization (loading large data files, acquiring GPU resources, etc.), the same approach works well.
### Thread Safety
cpp-httplib processes requests concurrently using a thread pool. In Chapter 4 we protected the `llm` object with a `std::mutex` to prevent crashes during model switching. The same pattern applies when integrating your own library. If your library isn't thread-safe or you need to swap objects at runtime, protect access with a `std::mutex`.
## 8.4 Customizing the UI
Edit the three files in `public/`.
- **`index.html`** — Change the input form layout. Swap `<textarea>` for `<input type="file">`, add parameter fields, etc.
- **`style.css`** — Adjust the layout and colors. Keep the two-column design or switch to a single column
- **`script.js`** — Update the `fetch()` target URLs, request bodies, and how responses are displayed
Even without changing any server code, just swapping the HTML makes the app look completely different. Since these are static files, you can iterate quickly — just reload the browser without restarting the server.
This book used plain HTML, CSS, and JavaScript, but combining them with a frontend framework like Vue or React, or a CSS framework, would let you build an even more polished app.
## 8.5 Distribution Considerations
### Licenses
Check the licenses of the libraries you're using. cpp-httplib (MIT), nlohmann/json (MIT), and webview (MIT) all allow commercial use. Don't forget to check the license of your own library and its dependencies too.
### Models and Data Files
The download mechanism we built in Chapter 4 isn't limited to LLM models. If your app needs large data files, the same pattern lets you auto-download them on first launch, keeping the binary small while sparing users the manual setup.
If the data is small, you can embed it directly into the binary with cpp-embedlib.
### Cross-Platform Builds
webview supports macOS, Linux, and Windows. When building for each platform:
- **macOS** — No additional dependencies
- **Linux** — Requires `libwebkit2gtk-4.1-dev`
- **Windows** — Requires the WebView2 runtime (pre-installed on Windows 11)
Consider setting up cross-platform builds in CI (e.g., GitHub Actions) too.
## Closing
Thank you so much for reading to the end. 🙏
This book started with `/health` returning `{"status":"ok"}` in Chapter 1. From there we built a REST API, added SSE streaming, downloaded models from Hugging Face, created a browser-based Web UI, and packaged it all into a single-binary desktop app. In Chapter 7 we read through `llama-server`'s code and learned how production-quality servers differ in their design. It's been quite a journey, and I'm truly grateful you stuck with it all the way through.
Looking back, we used several key cpp-httplib features hands-on:
- **Server**: routing, JSON responses, SSE streaming with `set_chunked_content_provider`, static file serving with `set_mount_point`
- **Client**: HTTPS connections, redirect following, large downloads with content receivers, progress callbacks
- **WebView integration**: `bind_to_any_port` + `listen_after_bind` for background threading
cpp-httplib offers many more features beyond what we covered here, including multipart file uploads, authentication, timeout control, compression, and range requests. See [A Tour of cpp-httplib](../../tour/) for details.
These patterns aren't limited to a translation app. If you want to add a web API to your C++ library, give it a browser UI, or ship it as an easy-to-distribute desktop app — I hope this book serves as a useful reference.
Take your own library, build your own app, and have fun with it. Happy hacking! 🚀

View File

@@ -1,22 +1,26 @@
---
title: "Building a Desktop LLM App with cpp-httplib"
order: 0
---
Build an LLM-powered translation desktop app step by step, learning both the server and client sides of cpp-httplib along the way. Translation is just an example — swap it out to build your own summarizer, code generator, chatbot, or any other LLM application.
Have you ever wanted to add a web API to your own C++ library, or quickly build an Electron-like desktop app? In Rust you might reach for "Tauri + axum," but in C++ it always seemed out of reach.
## Dependencies
With [cpp-httplib](https://github.com/yhirose/cpp-httplib), [webview/webview](https://github.com/webview/webview), and [cpp-embedlib](https://github.com/yhirose/cpp-embedlib), you can take the same approach in pure C++ — and produce a small, easy-to-distribute single binary.
- [llama.cpp](https://github.com/ggml-org/llama.cpp) — LLM inference engine
- [nlohmann/json](https://github.com/nlohmann/json) — JSON parser (header-only)
- [webview/webview](https://github.com/webview/webview) — WebView wrapper (header-only)
- [cpp-httplib](https://github.com/yhirose/cpp-httplib) — HTTP server/client (header-only)
In this tutorial we build an LLM-powered translation app using [llama.cpp](https://github.com/ggml-org/llama.cpp), progressing step by step from "REST API" to "SSE streaming" to "Web UI" to "desktop app." Translation is just the vehicle — replace llama.cpp with your own library and the same architecture works for any application.
![Desktop App](app.png#large-center)
If you know basic C++17 and understand the basics of HTTP / REST APIs, you're ready to start.
## Chapters
1. **Embed llama.cpp and create a REST API**Start with a simple API that accepts text via POST and returns a translation as JSON
2. **Add token streaming with SSE**Stream translation results token by token using the standard LLM API approach
3. **Add model discovery and download**Use the client to search and download GGUF models from Hugging Face
4. **Add a Web UI** — Serve a translation UI with static file hosting, making the app accessible from a browser
5. **Turn it into a desktop app with WebView**Wrap the web app with webview/webview to create an Electron-like desktop application
6. **Code reading: llama.cpp's server implementation**Compare your implementation with production-quality code and learn from the differences
1. **[Set up the project](ch01-setup)** — Fetch dependencies, configure the build, write scaffold code
2. **[Embed llama.cpp and create a REST API](ch02-rest-api)** — Return translation results as JSON
3. **[Add token streaming with SSE](ch03-sse-streaming)** — Stream responses token by token
4. **[Add model discovery and management](ch04-model-management)** — Download and switch models from Hugging Face
5. **[Add a Web UI](ch05-web-ui)** — A browser-based translation interface
6. **[Turn it into a desktop app with WebView](ch06-desktop-app)** — A single-binary desktop application
7. **[Reading the llama.cpp server source code](ch07-code-reading)** — Compare with production-quality code
8. **[Making it your own](ch08-customization)** — Swap in your own library and customize

View File

@@ -0,0 +1,36 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 426 160" font-family="system-ui, sans-serif" font-size="14">
<rect x="0" y="0" width="426" height="160" rx="8" fill="#f5f3ef"/>
<defs>
<marker id="arrowhead" markerWidth="7" markerHeight="5" refX="7" refY="2.5" orient="auto">
<polygon points="0,0 7,2.5 0,5" fill="#198754"/>
</marker>
</defs>
<!-- request 1 (y=16) -->
<text x="94" y="35" fill="#333" font-weight="bold" text-anchor="end">Request 1:</text>
<rect x="106" y="16" width="138" height="30" rx="4" fill="#d1e7dd" stroke="#198754" stroke-width="1"/>
<text x="175" y="36" fill="#333" text-anchor="middle">System prompt</text>
<text x="256" y="36" fill="#333" text-anchor="middle">+</text>
<rect x="270" y="16" width="138" height="30" rx="4" fill="#cfe2ff" stroke="#0d6efd" stroke-width="1"/>
<text x="339" y="36" fill="#333" text-anchor="middle">User question A</text>
<!-- annotation: cache save -->
<text x="175" y="64" fill="#198754" font-size="11" text-anchor="middle">Saved to KV cache</text>
<!-- arrow -->
<line x1="175" y1="70" x2="175" y2="90" stroke="#198754" stroke-width="1.2" marker-end="url(#arrowhead)"/>
<text x="188" y="85" fill="#198754" font-size="11">Reuse</text>
<!-- request 2 (y=96) -->
<text x="94" y="115" fill="#333" font-weight="bold" text-anchor="end">Request 2:</text>
<rect x="106" y="96" width="138" height="30" rx="4" fill="#d1e7dd" stroke="#198754" stroke-width="1" stroke-dasharray="6,3"/>
<text x="175" y="116" fill="#333" text-anchor="middle">System prompt</text>
<text x="256" y="116" fill="#333" text-anchor="middle">+</text>
<rect x="270" y="96" width="138" height="30" rx="4" fill="#cfe2ff" stroke="#0d6efd" stroke-width="1"/>
<text x="339" y="116" fill="#333" text-anchor="middle">User question B</text>
<!-- bottom labels -->
<text x="175" y="144" fill="#198754" font-size="11" text-anchor="middle">No recomputation</text>
<text x="339" y="144" fill="#0d6efd" font-size="11" text-anchor="middle">Only this is computed</text>
</svg>

After

Width:  |  Height:  |  Size: 2.0 KiB

View File

@@ -0,0 +1,24 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 440 240" font-family="system-ui, sans-serif" font-size="14">
<!-- outer box -->
<rect x="0" y="0" width="440" height="240" rx="8" fill="#f5f3ef"/>
<text x="20" y="28" font-weight="bold" font-size="16" fill="#333">llama-server</text>
<!-- slot 0 -->
<rect x="20" y="46" width="400" height="32" rx="4" fill="#d1e7dd" stroke="#198754" stroke-width="1"/>
<text x="32" y="67" fill="#333">Slot 0: User A's request</text>
<!-- slot 1 -->
<rect x="20" y="86" width="400" height="32" rx="4" fill="#d1e7dd" stroke="#198754" stroke-width="1"/>
<text x="32" y="107" fill="#333">Slot 1: User B's request</text>
<!-- slot 2 -->
<rect x="20" y="126" width="400" height="32" rx="4" fill="#e9ecef" stroke="#adb5bd" stroke-width="1"/>
<text x="32" y="147" fill="#999">Slot 2: (idle)</text>
<!-- slot 3 -->
<rect x="20" y="166" width="400" height="32" rx="4" fill="#e9ecef" stroke="#adb5bd" stroke-width="1"/>
<text x="32" y="187" fill="#999">Slot 3: (idle)</text>
<!-- arrow + label -->
<text x="20" y="224" fill="#333" font-size="13">→ Active slots are inferred together in a single batch</text>
</svg>

After

Width:  |  Height:  |  Size: 1.2 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 130 KiB

View File

@@ -164,13 +164,13 @@ Use `res.user_data` to pass data from middleware to handlers. This is useful for
```cpp
svr.set_pre_routing_handler([](const auto &req, auto &res) {
res.user_data["auth_user"] = std::string("alice");
res.user_data.set("auth_user", std::string("alice"));
return httplib::Server::HandlerResponse::Unhandled;
});
svr.Get("/me", [](const auto &req, auto &res) {
auto user = std::any_cast<std::string>(res.user_data.at("auth_user"));
res.set_content("Hello, " + user, "text/plain");
auto *user = res.user_data.get<std::string>("auth_user");
res.set_content("Hello, " + *user, "text/plain");
});
```

View File

@@ -0,0 +1,60 @@
---
title: "C01. レスポンスボディを取得する / ファイルに保存する"
order: 1
status: "draft"
---
## 文字列として取得する
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Get("/hello");
if (res && res->status == 200) {
std::cout << res->body << std::endl;
}
```
`res->body``std::string`なので、そのまま使えます。レスポンス全体がメモリに読み込まれます。
> **Warning:** 大きなファイルを`res->body`で受け取ると、まるごとメモリに載ってしまいます。サイズが大きい場合は次の`ContentReceiver`を使いましょう。
## ファイルに保存する
```cpp
httplib::Client cli("http://localhost:8080");
std::ofstream ofs("output.bin", std::ios::binary);
if (!ofs) {
std::cerr << "Failed to open file" << std::endl;
return 1;
}
auto res = cli.Get("/large-file",
[&](const char *data, size_t len) {
ofs.write(data, len);
return static_cast<bool>(ofs);
});
```
`ContentReceiver`を使うと、データをチャンクごとに受け取れます。ボディ全体をメモリに溜めずにファイルへ書き出せるので、大きなファイルのダウンロードにぴったりです。
コールバックから`false`を返すと、ダウンロードを途中で止められます。上の例では`ofs`への書き込みが失敗したら自動的に中断します。
> **Detail:** ダウンロード前にContent-Lengthなどのレスポンスヘッダーを確認したいときは、`ResponseHandler`を組み合わせましょう。
>
> ```cpp
> auto res = cli.Get("/large-file",
> [](const httplib::Response &res) {
> auto len = res.get_header_value("Content-Length");
> std::cout << "Size: " << len << std::endl;
> return true; // falseを返すとダウンロードを中止
> },
> [&](const char *data, size_t len) {
> ofs.write(data, len);
> return static_cast<bool>(ofs);
> });
> ```
>
> `ResponseHandler`はヘッダー受信後、ボディ受信前に呼ばれます。`false`を返せばダウンロード自体をスキップできます。
> ダウンロードの進捗を表示したい場合は[C11. 進捗コールバックを使う](c11-progress-callback)を参照してください。

View File

@@ -0,0 +1,36 @@
---
title: "C02. JSONを送受信する"
order: 2
status: "draft"
---
cpp-httplibにはJSONパーサーが含まれていません。JSONの組み立てや解析には[nlohmann/json](https://github.com/nlohmann/json)などのライブラリを使ってください。ここでは`nlohmann/json`を例に説明します。
## JSONを送信する
```cpp
httplib::Client cli("http://localhost:8080");
nlohmann::json j = {{"name", "Alice"}, {"age", 30}};
auto res = cli.Post("/api/users", j.dump(), "application/json");
```
`Post()`の第2引数にJSON文字列、第3引数にContent-Typeを渡します。`Put()``Patch()`でも同じ形です。
> **Warning:** 第3引数のContent-Typeを省略すると、サーバー側でJSONとして認識されないことがあります。`"application/json"`を必ず指定しましょう。
## JSONレスポンスを受け取る
```cpp
auto res = cli.Get("/api/users/1");
if (res && res->status == 200) {
auto j = nlohmann::json::parse(res->body);
std::cout << j["name"] << std::endl;
}
```
`res->body``std::string`なので、そのままJSONライブラリに渡せます。
> **Note:** サーバーがエラー時にHTMLを返すことがあります。ステータスコードを確認してからパースすると安全です。また、APIによっては`Accept: application/json`ヘッダーが必要です。JSON APIを繰り返し呼ぶなら[C03. デフォルトヘッダーを設定する](c03-default-headers)が便利です。
> サーバー側でJSONを受け取って返す方法は[S02. JSONリクエストを受け取りJSONレスポンスを返す](s02-json-api)を参照してください。

View File

@@ -0,0 +1,55 @@
---
title: "C03. デフォルトヘッダーを設定する"
order: 3
status: "draft"
---
同じヘッダーを毎回のリクエストに付けたいときは、`set_default_headers()`を使います。一度設定すれば、そのクライアントから送るすべてのリクエストに自動で付与されます。
## 基本の使い方
```cpp
httplib::Client cli("https://api.example.com");
cli.set_default_headers({
{"Accept", "application/json"},
{"User-Agent", "my-app/1.0"},
});
auto res = cli.Get("/users");
```
`Accept``User-Agent`のように、APIを呼ぶたびに必要なヘッダーをまとめて登録できます。リクエストごとに指定する手間が省けます。
## Bearerトークンを毎回付ける
```cpp
httplib::Client cli("https://api.example.com");
cli.set_default_headers({
{"Authorization", "Bearer " + token},
{"Accept", "application/json"},
});
auto res1 = cli.Get("/me");
auto res2 = cli.Get("/projects");
```
認証トークンを一度セットしておけば、以降のリクエストで自動的に送られます。複数のエンドポイントを叩くAPIクライアントを書くときに便利です。
> **Note:** `set_default_headers()`は既存のデフォルトヘッダーを**上書き**します。あとから1つだけ追加したい場合でも、全体を渡し直してください。
## リクエスト単位のヘッダーと組み合わせる
デフォルトヘッダーを設定していても、個別のリクエストで追加のヘッダーを渡せます。
```cpp
httplib::Headers headers = {
{"X-Request-ID", "abc-123"},
};
auto res = cli.Get("/users", headers);
```
リクエスト単位で渡したヘッダーはデフォルトヘッダーに**追加**されます。両方がサーバーに送られます。
> Bearerトークンを使った認証の詳細は[C06. BearerトークンでAPIを呼ぶ](c06-bearer-token)を参照してください。

View File

@@ -0,0 +1,38 @@
---
title: "C04. リダイレクトを追従する"
order: 4
status: "draft"
---
cpp-httplibはデフォルトではリダイレクトHTTP 3xxを追従しません。サーバーから`302 Found`が返ってきても、そのままステータスコード302のレスポンスとして受け取ります。
自動で追従してほしいときは、`set_follow_location(true)`を呼びましょう。
## リダイレクトを追従する
```cpp
httplib::Client cli("http://example.com");
cli.set_follow_location(true);
auto res = cli.Get("/old-path");
if (res && res->status == 200) {
std::cout << res->body << std::endl;
}
```
`set_follow_location(true)`を設定すると、`Location`ヘッダーを見て新しいURLに自動でリクエストを投げ直します。最終的なレスポンスが`res`に入ります。
## HTTPからHTTPSへのリダイレクト
```cpp
httplib::Client cli("http://example.com");
cli.set_follow_location(true);
auto res = cli.Get("/");
```
多くのサイトはHTTPアクセスをHTTPSへリダイレクトします。`set_follow_location(true)`を有効にしておけば、こうしたケースも透過的に扱えます。スキームやホストが変わっても自動で追従します。
> **Warning:** HTTPSへのリダイレクトを追従するには、cpp-httplibをOpenSSLまたは他のTLSバックエンド付きでビルドしておく必要があります。TLSサポートがないと、HTTPSへのリダイレクトは失敗します。
> **Note:** リダイレクトを追従すると、リクエストの実行時間は伸びます。タイムアウトの設定は[C12. タイムアウトを設定する](c12-timeouts)を参照してください。

View File

@@ -0,0 +1,46 @@
---
title: "C05. Basic認証を使う"
order: 5
status: "draft"
---
Basic認証が必要なエンドポイントには、`set_basic_auth()`でユーザー名とパスワードを渡します。cpp-httplibが自動で`Authorization: Basic ...`ヘッダーを組み立ててくれます。
## 基本の使い方
```cpp
httplib::Client cli("https://api.example.com");
cli.set_basic_auth("alice", "s3cret");
auto res = cli.Get("/private");
if (res && res->status == 200) {
std::cout << res->body << std::endl;
}
```
一度設定すれば、そのクライアントから送るすべてのリクエストに認証情報が付きます。毎回ヘッダーを組み立てる必要はありません。
## リクエスト単位で使う
特定のリクエストだけに認証情報を付けたいときは、Headersを直接渡す方法もあります。
```cpp
httplib::Headers headers = {
httplib::make_basic_authentication_header("alice", "s3cret"),
};
auto res = cli.Get("/private", headers);
```
`make_basic_authentication_header()`がBase64エンコード済みのヘッダーを作ってくれます。
> **Warning:** Basic認証は資格情報をBase64で**エンコード**するだけで、暗号化しません。必ずHTTPS経由で使ってください。平文HTTPで使うと、パスワードがネットワーク上を丸見えで流れます。
## Digest認証
より安全なDigest認証を使いたいときは`set_digest_auth()`を使います。こちらはOpenSSLまたは他のTLSバックエンド付きでビルドしたときだけ利用できます。
```cpp
cli.set_digest_auth("alice", "s3cret");
```
> BearerトークンでAPIを呼びたい場合は[C06. BearerトークンでAPIを呼ぶ](c06-bearer-token)を参照してください。

View File

@@ -0,0 +1,50 @@
---
title: "C06. BearerトークンでAPIを呼ぶ"
order: 6
status: "draft"
---
OAuth 2.0やモダンなWeb APIでよく使われるBearerトークン認証には、`set_bearer_token_auth()`を使います。トークンを渡すと、cpp-httplibが`Authorization: Bearer <token>`ヘッダーを自動で組み立ててくれます。
## 基本の使い方
```cpp
httplib::Client cli("https://api.example.com");
cli.set_bearer_token_auth("eyJhbGciOiJIUzI1NiIs...");
auto res = cli.Get("/me");
if (res && res->status == 200) {
std::cout << res->body << std::endl;
}
```
一度設定すれば、以降のリクエストすべてにトークンが付きます。GitHub APIやSlack API、自前のOAuthサービスなど、トークンベースのAPIを叩くときの定番です。
## リクエスト単位で使う
特定のリクエストだけにトークンを付けたい、あるいはリクエストごとに違うトークンを使いたいときは、Headersで直接渡せます。
```cpp
httplib::Headers headers = {
httplib::make_bearer_token_authentication_header(token),
};
auto res = cli.Get("/me", headers);
```
`make_bearer_token_authentication_header()``Authorization`ヘッダーを組み立ててくれます。
## トークンをリフレッシュする
トークンの有効期限が切れたら、新しいトークンで`set_bearer_token_auth()`を呼び直すだけで更新できます。
```cpp
if (res && res->status == 401) {
auto new_token = refresh_token();
cli.set_bearer_token_auth(new_token);
res = cli.Get("/me");
}
```
> **Warning:** Bearerトークンはそれ自体が認証情報です。必ずHTTPS経由で送ってください。また、ソースコードや設定ファイルにトークンをハードコードしないようにしましょう。
> 複数のヘッダーをまとめて設定したいときは[C03. デフォルトヘッダーを設定する](c03-default-headers)も便利です。

View File

@@ -0,0 +1,52 @@
---
title: "C07. ファイルをマルチパートフォームとしてアップロードする"
order: 7
status: "draft"
---
HTMLフォームの`<input type="file">`と同じ形式でファイルを送りたいときは、マルチパートフォーム(`multipart/form-data`を使います。cpp-httplibは`UploadFormDataItems``FormDataProviderItems`の2種類のAPIを用意しています。使い分けの基準は**ファイルサイズ**です。
## 小さなファイルを送る
ファイル内容をメモリに読み込んでから送る方法です。サイズが小さいなら、これが一番シンプルです。
```cpp
httplib::Client cli("http://localhost:8080");
std::ifstream ifs("avatar.png", std::ios::binary);
std::string content((std::istreambuf_iterator<char>(ifs)),
std::istreambuf_iterator<char>());
httplib::UploadFormDataItems items = {
{"name", "Alice", "", ""},
{"avatar", content, "avatar.png", "image/png"},
};
auto res = cli.Post("/upload", items);
```
`UploadFormData`の各要素は`{name, content, filename, content_type}`の4つです。テキストフィールドなら`filename``content_type`を空文字にしておきます。
## 大きなファイルをストリーミングで送る
ファイル全体をメモリに載せずに、チャンクごとに送りたいときは`make_file_provider()`を使います。内部でファイルを少しずつ読みながら送信するので、巨大なファイルでもメモリを圧迫しません。
```cpp
httplib::Client cli("http://localhost:8080");
httplib::UploadFormDataItems items = {
{"name", "Alice", "", ""},
};
httplib::FormDataProviderItems provider_items = {
httplib::make_file_provider("video", "large-video.mp4", "", "video/mp4"),
};
auto res = cli.Post("/upload", httplib::Headers{}, items, provider_items);
```
`make_file_provider()`の引数は`(フォーム名, ファイルパス, ファイル名, Content-Type)`です。ファイル名を空にするとファイルパスがそのまま使われます。
> **Note:** `UploadFormDataItems`と`FormDataProviderItems`は同じリクエスト内で併用できます。テキストフィールドは`UploadFormDataItems`、ファイルは`FormDataProviderItems`、という使い分けがきれいです。
> アップロードの進捗を表示したい場合は[C11. 進捗コールバックを使う](c11-progress-callback)を参照してください。

View File

@@ -0,0 +1,34 @@
---
title: "C08. ファイルを生バイナリとしてPOSTする"
order: 8
status: "draft"
---
マルチパートではなく、ファイルの中身をそのままリクエストボディとして送りたいときがあります。S3互換APIへのアップロードや、生の画像データを受け付けるエンドポイントなどです。このときは`make_file_body()`を使いましょう。
## 基本の使い方
```cpp
httplib::Client cli("https://storage.example.com");
auto [size, provider] = httplib::make_file_body("backup.tar.gz");
if (size == 0) {
std::cerr << "Failed to open file" << std::endl;
return 1;
}
auto res = cli.Put("/bucket/backup.tar.gz", size,
provider, "application/gzip");
```
`make_file_body()`はファイルサイズと`ContentProvider`のペアを返します。そのまま`Post()``Put()`に渡せば、ファイルの中身がリクエストボディとしてそのまま送られます。
`ContentProvider`はファイルをチャンクごとに読み込むので、巨大なファイルでもメモリに全体を載せません。
## ファイルが開けなかったとき
`make_file_body()`は開けなかった場合、`size``0``provider`を空の関数オブジェクトとして返します。そのまま送信するとおかしな結果になるので、必ず`size`をチェックしてください。
> **Warning:** `make_file_body()`はContent-Lengthを最初に確定させる必要があるため、ファイルサイズをあらかじめ取得します。送信中にファイルサイズが変わる可能性がある場合は、このAPIには向きません。
> マルチパート形式で送りたい場合は[C07. ファイルをマルチパートフォームとしてアップロードする](c07-multipart-upload)を参照してください。

View File

@@ -0,0 +1,47 @@
---
title: "C09. チャンク転送でボディを送る"
order: 9
status: "draft"
---
送信するボディのサイズが事前にわからないとき、たとえばリアルタイムに生成されるデータや別のストリームから流し込むデータを送りたいときは、`ContentProviderWithoutLength`を使います。HTTPのチャンク転送エンコーディングchunked transfer-encodingとして送信されます。
## 基本の使い方
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Post("/stream",
[&](size_t offset, httplib::DataSink &sink) {
std::string chunk = produce_next_chunk();
if (chunk.empty()) {
sink.done(); // 送信終了
return true;
}
return sink.write(chunk.data(), chunk.size());
},
"application/octet-stream");
```
ラムダは「次のチャンクを作って`sink.write()`で送る」だけです。データがもう無くなったら`sink.done()`を呼べば送信が完了します。
## サイズがわかっている場合
送信するボディの**合計サイズが事前にわかっている**ときは、`ContentProvider``size_t offset, size_t length, DataSink &sink`を取るタイプ)と合計サイズを渡す別のオーバーロードを使います。
```cpp
size_t total_size = get_total_size();
auto res = cli.Post("/upload", total_size,
[&](size_t offset, size_t length, httplib::DataSink &sink) {
auto data = read_range(offset, length);
return sink.write(data.data(), data.size());
},
"application/octet-stream");
```
サイズがわかっているとContent-Lengthヘッダーが付くので、サーバー側で進捗を把握しやすくなります。可能ならこちらを使いましょう。
> **Detail:** `sink.write()`は書き込みが成功したかどうかを`bool`で返します。`false`が返ったら回線が切れています。ラムダはそのまま`false`を返して終了しましょう。
> ファイルをそのまま送るだけなら、`make_file_body()`が便利です。[C08. ファイルを生バイナリとしてPOSTする](c08-post-file-body)を参照してください。

View File

@@ -0,0 +1,52 @@
---
title: "C10. レスポンスをストリーミングで受信する"
order: 10
status: "draft"
---
レスポンスのボディをチャンクごとに受け取りたいときは、`ContentReceiver`を使います。大きなファイルを扱うときはもちろん、NDJSON改行区切りJSONやログストリームのように「届いた分だけ先に処理したい」ケースでも便利です。
## チャンクごとに処理する
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Get("/logs/stream",
[](const char *data, size_t len) {
std::cout.write(data, len);
std::cout.flush();
return true; // falseを返すと受信を中止
});
```
サーバーから届いたデータが、到着した順にラムダに渡されます。コールバックが`false`を返すとダウンロードを途中で止められます。
## NDJSONを行単位でパースする
バッファを使って改行区切りのJSONを1行ずつ処理する例です。
```cpp
std::string buffer;
auto res = cli.Get("/events",
[&](const char *data, size_t len) {
buffer.append(data, len);
size_t pos;
while ((pos = buffer.find('\n')) != std::string::npos) {
auto line = buffer.substr(0, pos);
buffer.erase(0, pos + 1);
if (!line.empty()) {
auto j = nlohmann::json::parse(line);
handle_event(j);
}
}
return true;
});
```
バッファに貯めながら、改行が見つかるたびに1行を取り出してパースします。ストリーミングAPIをリアルタイムに処理する基本パターンです。
> **Warning:** `ContentReceiver`を渡すと、`res->body`は**空のまま**になります。ボディは自分でコールバック内で保存するか処理するかしてください。
> ダウンロードの進捗を知りたい場合は[C11. 進捗コールバックを使う](c11-progress-callback)と組み合わせましょう。
> Server-Sent EventsSSEを扱うときは[E04. SSEをクライアントで受信する](e04-sse-client)も参考になります。

View File

@@ -0,0 +1,59 @@
---
title: "C11. 進捗コールバックを使う"
order: 11
status: "draft"
---
ダウンロードやアップロードの進捗を表示したいときは、`DownloadProgress`または`UploadProgress`コールバックを渡します。どちらも`(current, total)`の2引数を取る関数オブジェクトです。
## ダウンロードの進捗
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Get("/large-file",
[](size_t current, size_t total) {
auto percent = (total > 0) ? (current * 100 / total) : 0;
std::cout << "\rDownloading: " << percent << "% ("
<< current << "/" << total << ")" << std::flush;
return true; // falseを返すとダウンロードを中止
});
std::cout << std::endl;
```
コールバックはデータを受信するたびに呼ばれます。`total`はContent-Lengthから取得した値で、サーバーが送ってこない場合は`0`になることがあります。その場合は進捗率が計算できないので、受信済みバイト数だけを表示するのが無難です。
## アップロードの進捗
アップロード側も同じ形です。`Post()``Put()`の最後の引数に`UploadProgress`を渡します。
```cpp
httplib::Client cli("http://localhost:8080");
std::string body = load_large_body();
auto res = cli.Post("/upload", body, "application/octet-stream",
[](size_t current, size_t total) {
auto percent = current * 100 / total;
std::cout << "\rUploading: " << percent << "%" << std::flush;
return true;
});
std::cout << std::endl;
```
## 中断する
コールバックから`false`を返すと、転送を中止できます。UI側で「キャンセル」ボタンが押されたら`false`を返す、といった使い方ができます。
```cpp
std::atomic<bool> cancelled{false};
auto res = cli.Get("/large-file",
[&](size_t current, size_t total) {
return !cancelled.load();
});
```
> **Note:** `ContentReceiver`と進捗コールバックは同時に使えます。ファイルに書き出しながら進捗を表示したいときは、両方を渡しましょう。
> ファイル保存と組み合わせる具体例は[C01. レスポンスボディを取得する / ファイルに保存する](c01-get-response-body)も参照してください。

View File

@@ -0,0 +1,50 @@
---
title: "C12. タイムアウトを設定する"
order: 12
status: "draft"
---
クライアントには3種類のタイムアウトがあります。それぞれ別々に設定できます。
| 種類 | API | デフォルト | 意味 |
| --- | --- | --- | --- |
| 接続タイムアウト | `set_connection_timeout` | 300秒 | TCP接続の確立までの待ち時間 |
| 読み取りタイムアウト | `set_read_timeout` | 300秒 | レスポンスを受信する際の1回の`recv`待ち時間 |
| 書き込みタイムアウト | `set_write_timeout` | 5秒 | リクエストを送信する際の1回の`send`待ち時間 |
## 基本の使い方
```cpp
httplib::Client cli("http://localhost:8080");
cli.set_connection_timeout(5, 0); // 5秒
cli.set_read_timeout(10, 0); // 10秒
cli.set_write_timeout(10, 0); // 10秒
auto res = cli.Get("/api/data");
```
秒数とマイクロ秒を2引数で渡します。細かい指定が不要なら第2引数は省略できます。
## `std::chrono`で指定する
`std::chrono`の期間を直接渡すオーバーロードもあります。こちらのほうが読みやすいのでおすすめです。
```cpp
using namespace std::chrono_literals;
cli.set_connection_timeout(5s);
cli.set_read_timeout(10s);
cli.set_write_timeout(500ms);
```
## デフォルトでは300秒と長めな点に注意
接続タイムアウトと読み取りタイムアウトはデフォルトで**300秒5分**です。サーバーが反応しない場合、このままだと5分待たされます。短めに設定したほうが良いことが多いです。
```cpp
cli.set_connection_timeout(3s);
cli.set_read_timeout(10s);
```
> **Warning:** 読み取りタイムアウトは「1回の受信待ち」に対するタイムアウトです。大きなファイルのダウンロードで途中ずっとデータが流れている限り、リクエスト全体で30分かかっても発火しません。リクエスト全体の時間制限を設けたい場合は[C13. 全体タイムアウトを設定する](c13-max-timeout)を使ってください。

View File

@@ -0,0 +1,42 @@
---
title: "C13. 全体タイムアウトを設定する"
order: 13
status: "draft"
---
[C12. タイムアウトを設定する](c12-timeouts)で紹介した3種類のタイムアウトは、いずれも「1回の`send``recv`」に対するものです。リクエスト全体の所要時間に上限を設けたい場合は、`set_max_timeout()`を使います。
## 基本の使い方
```cpp
httplib::Client cli("http://localhost:8080");
cli.set_max_timeout(5000); // 5秒ミリ秒単位
auto res = cli.Get("/slow-endpoint");
```
ミリ秒単位で指定します。接続、送信、受信をすべて含めて、リクエスト全体が指定時間を超えたら打ち切られます。
## `std::chrono`で指定する
こちらも`std::chrono`の期間を受け取るオーバーロードがあります。
```cpp
using namespace std::chrono_literals;
cli.set_max_timeout(5s);
```
## どう使い分けるか
`set_read_timeout`は「データが来ない時間」のタイムアウトなので、少しずつデータが流れ続ける状況では発火しません。たとえば1秒ごとに1バイト届くようなエンドポイントは、`set_read_timeout`をいくら短くしてもタイムアウトしません。
一方、`set_max_timeout`は「経過時間」に対する上限なので、こうしたケースでも確実に止められます。外部APIを叩くときや、ユーザーを待たせすぎたくないときに重宝します。
```cpp
cli.set_connection_timeout(3s);
cli.set_read_timeout(10s);
cli.set_max_timeout(30s); // 全体で30秒を超えたら中断
```
> **Note:** `set_max_timeout()`は通常のタイムアウトと併用できます。短期的な無反応は`set_read_timeout`で、長時間の処理は`set_max_timeout`で、という二段構えにするのが安全です。

View File

@@ -0,0 +1,53 @@
---
title: "C14. 接続の再利用とKeep-Aliveの挙動を理解する"
order: 14
status: "draft"
---
`httplib::Client`は同じインスタンスで複数回リクエストを送ると、TCP接続を自動的に再利用します。HTTP/1.1のKeep-Aliveが有効に働くので、TCPハンドシェイクやTLSハンドシェイクのオーバーヘッドを毎回払わずに済みます。
## 接続は自動で使い回される
```cpp
httplib::Client cli("https://api.example.com");
auto res1 = cli.Get("/users/1");
auto res2 = cli.Get("/users/2"); // 同じ接続を再利用
auto res3 = cli.Get("/users/3"); // 同じ接続を再利用
```
特別な設定は要りません。`cli`を使い回すだけで、内部的には同じソケットで通信が続きます。とくにHTTPSでは、TLSハンドシェイクのコストが大きいので効果が顕著です。
## Keep-Aliveを明示的にオフにする
毎回新しい接続を張り直したい場合は、`set_keep_alive(false)`を呼びます。テスト目的などで使うことがあります。
```cpp
cli.set_keep_alive(false);
```
ただし、普段はオン(デフォルト)のままで問題ありません。
## リクエストごとに`Client`を作らない
1回のリクエストのたびに`Client`をスコープから抜けて破棄すると、接続の再利用は効きません。ループの外でインスタンスを作り、中で使い回しましょう。
```cpp
// NG: 毎回接続が切れる
for (auto id : ids) {
httplib::Client cli("https://api.example.com");
cli.Get("/users/" + id);
}
// OK: 接続が再利用される
httplib::Client cli("https://api.example.com");
for (auto id : ids) {
cli.Get("/users/" + id);
}
```
## 並行リクエスト
複数のスレッドから並行にリクエストを送りたいときは、スレッドごとに別々の`Client`インスタンスを持つのが無難です。1つの`Client`は1本のTCP接続を使い回すので、同じインスタンスに複数スレッドから同時にリクエストを投げると、結局どこかで直列化されます。
> **Note:** サーバー側のKeep-Aliveタイムアウトを超えると、サーバーが接続を切ります。その場合cpp-httplibは自動で再接続して再試行するので、アプリケーションコードで気にする必要はありません。

View File

@@ -0,0 +1,47 @@
---
title: "C15. 圧縮を有効にする"
order: 15
status: "draft"
---
cpp-httplibは送信時の圧縮と受信時の解凍をサポートしています。ただし、zlibまたはBrotliを有効にしてビルドしておく必要があります。
## ビルド時の準備
圧縮機能を使うには、`httplib.h`をインクルードする前に次のマクロを定義しておきます。
```cpp
#define CPPHTTPLIB_ZLIB_SUPPORT // gzip / deflate
#define CPPHTTPLIB_BROTLI_SUPPORT // brotli
#include <httplib.h>
```
リンク時に`zlib``brotli`のライブラリも必要です。
## リクエストボディを圧縮して送る
```cpp
httplib::Client cli("https://api.example.com");
cli.set_compress(true);
std::string big_payload = build_payload();
auto res = cli.Post("/api/data", big_payload, "application/json");
```
`set_compress(true)`を呼んでおくと、POSTやPUTのリクエストボディがgzipで圧縮されて送信されます。サーバー側が対応している必要があります。
## レスポンスを解凍する
```cpp
httplib::Client cli("https://api.example.com");
cli.set_decompress(true); // デフォルトで有効
auto res = cli.Get("/api/data");
std::cout << res->body << std::endl;
```
`set_decompress(true)`を呼ぶと、サーバーが`Content-Encoding: gzip`などで圧縮したレスポンスを自動で解凍してくれます。`res->body`には解凍済みのデータが入ります。
デフォルトで有効なので、通常は何もしなくても解凍されます。あえて生の圧縮データを触りたいときだけ`set_decompress(false)`にしましょう。
> **Warning:** `CPPHTTPLIB_ZLIB_SUPPORT`を定義せずにビルドすると、`set_compress()`や`set_decompress()`を呼んでも何も起こりません。マクロの定義を忘れていないか、最初に確認しましょう。

View File

@@ -0,0 +1,52 @@
---
title: "C16. プロキシを経由してリクエストを送る"
order: 16
status: "draft"
---
社内ネットワークや特定の経路を通したい場合、HTTPプロキシを経由してリクエストを送れます。`set_proxy()`でプロキシのホストとポートを指定するだけです。
## 基本の使い方
```cpp
httplib::Client cli("https://api.example.com");
cli.set_proxy("proxy.internal", 8080);
auto res = cli.Get("/users");
```
プロキシ経由でリクエストが送られます。HTTPSの場合はCONNECTメソッドでトンネルが張られるので、cpp-httplib側で特別な設定は要りません。
## プロキシに認証を設定する
プロキシ自体が認証を要求する場合は、`set_proxy_basic_auth()``set_proxy_bearer_token_auth()`を使います。
```cpp
cli.set_proxy("proxy.internal", 8080);
cli.set_proxy_basic_auth("user", "password");
```
```cpp
cli.set_proxy_bearer_token_auth("token");
```
OpenSSLまたは他のTLSバックエンド付きでビルドしていれば、Digest認証も使えます。
```cpp
cli.set_proxy_digest_auth("user", "password");
```
## エンドのサーバー認証と組み合わせる
プロキシ認証と、エンドサーバーへの認証([C05. Basic認証を使う](c05-basic-auth)や[C06. BearerトークンでAPIを呼ぶ](c06-bearer-token))は別物です。両方が必要なら、両方設定します。
```cpp
cli.set_proxy("proxy.internal", 8080);
cli.set_proxy_basic_auth("proxy-user", "proxy-pass");
cli.set_bearer_token_auth("api-token"); // エンドサーバー向け
```
プロキシには`Proxy-Authorization`、エンドサーバーには`Authorization`ヘッダーが送られます。
> **Note:** 環境変数の`HTTP_PROXY`や`HTTPS_PROXY`は自動的には読まれません。必要ならアプリケーション側で読み取って`set_proxy()`に渡してください。

View File

@@ -0,0 +1,63 @@
---
title: "C17. エラーコードをハンドリングする"
order: 17
status: "draft"
---
`cli.Get()``cli.Post()``Result`型を返します。リクエストが失敗したときサーバーに到達できなかった、タイムアウトしたなど、返り値は「falsy」になります。詳しい原因を知りたい場合は`Result::error()`を使います。
## 基本の判定
```cpp
httplib::Client cli("http://localhost:8080");
auto res = cli.Get("/api/data");
if (res) {
// リクエストが送れて、レスポンスも受け取れた
std::cout << "status: " << res->status << std::endl;
} else {
// ネットワーク層で失敗した
std::cerr << "error: " << httplib::to_string(res.error()) << std::endl;
}
```
`if (res)`で成功・失敗を判定し、失敗時は`res.error()``httplib::Error`列挙値を取り出せます。`to_string()`に渡すと人間が読める文字列になります。
## 代表的なエラー
| 値 | 意味 |
| --- | --- |
| `Error::Connection` | サーバーに接続できなかった |
| `Error::ConnectionTimeout` | 接続タイムアウト(`set_connection_timeout` |
| `Error::Read` / `Error::Write` | 送受信中のエラー |
| `Error::Timeout` | `set_max_timeout`で設定した全体タイムアウト |
| `Error::ExceedRedirectCount` | リダイレクト回数が上限を超えた |
| `Error::SSLConnection` | TLSハンドシェイクに失敗 |
| `Error::SSLServerVerification` | サーバー証明書の検証に失敗 |
| `Error::Canceled` | 進捗コールバックから`false`が返された |
## ステータスコードとの使い分け
`res`が truthy でも、HTTPステータスコードが4xxや5xxのこともあります。この2つは別物です。
```cpp
auto res = cli.Get("/api/data");
if (!res) {
// ネットワークエラー(そもそもレスポンスを受け取れていない)
std::cerr << "network error: " << httplib::to_string(res.error()) << std::endl;
return 1;
}
if (res->status >= 400) {
// HTTPエラーレスポンスは受け取った
std::cerr << "http error: " << res->status << std::endl;
return 1;
}
// 正常系
std::cout << res->body << std::endl;
```
ネットワーク層のエラーは`res.error()`、HTTPのエラーは`res->status`、と頭の中で分けておきましょう。
> SSL関連のエラーをさらに詳しく調べたい場合は[C18. SSLエラーをハンドリングする](c18-ssl-errors)を参照してください。

View File

@@ -0,0 +1,51 @@
---
title: "C18. SSLエラーをハンドリングする"
order: 18
status: "draft"
---
HTTPSリクエストで失敗したとき、`res.error()``Error::SSLConnection``Error::SSLServerVerification`といった値を返します。ただ、これだけだと原因の切り分けが難しいこともあります。そんなときは`Result::ssl_error()``Result::ssl_backend_error()`が役に立ちます。
## SSLエラーの詳細を取得する
```cpp
httplib::Client cli("https://api.example.com");
auto res = cli.Get("/");
if (!res) {
auto err = res.error();
std::cerr << "error: " << httplib::to_string(err) << std::endl;
if (err == httplib::Error::SSLConnection ||
err == httplib::Error::SSLServerVerification) {
std::cerr << "ssl_error: " << res.ssl_error() << std::endl;
std::cerr << "ssl_backend_error: " << res.ssl_backend_error() << std::endl;
}
}
```
`ssl_error()`はSSLライブラリが返したエラーコードOpenSSLの`SSL_get_error()`の値など)、`ssl_backend_error()`はバックエンドがさらに詳しく提供するエラー値です。OpenSSLなら`ERR_get_error()`の値が入ります。
## OpenSSLのエラーを文字列化する
`ssl_backend_error()`で取得した値を、OpenSSLの`ERR_error_string()`で文字列にするとデバッグに便利です。
```cpp
#include <openssl/err.h>
if (res.ssl_backend_error() != 0) {
char buf[256];
ERR_error_string_n(res.ssl_backend_error(), buf, sizeof(buf));
std::cerr << "openssl: " << buf << std::endl;
}
```
## よくある原因
| 症状 | ありがちな原因 |
| --- | --- |
| `SSLServerVerification` | CA証明書のパスが通っていない、自己署名証明書 |
| `SSLServerHostnameVerification` | 証明書のCN/SANとホスト名が一致しない |
| `SSLConnection` | TLSバージョンの不一致、対応スイートが無い |
> 証明書の検証設定を変えたい場合は[T02. SSL証明書の検証を制御する](t02-cert-verification)を参照してください。

View File

@@ -0,0 +1,57 @@
---
title: "C19. クライアントにログを設定する"
order: 19
status: "draft"
---
クライアントから送ったリクエストと受け取ったレスポンスをログに残したいときは、`set_logger()`を使います。エラーだけを拾いたいなら`set_error_logger()`が別に用意されています。
## リクエストとレスポンスをログに残す
```cpp
httplib::Client cli("https://api.example.com");
cli.set_logger([](const httplib::Request &req, const httplib::Response &res) {
std::cout << req.method << " " << req.path
<< " -> " << res.status << std::endl;
});
auto res = cli.Get("/users");
```
`set_logger()`に渡したコールバックは、リクエストが完了するたびに呼ばれます。リクエストとレスポンスの両方を引数で受け取れるので、メソッドやパス、ステータスコード、ヘッダー、ボディなど好きな情報をログに残せます。
## エラーだけを拾う
ネットワーク層のエラーが起きたとき(`Error::Connection`など)は、`set_logger()`は呼ばれません。レスポンスが無いからです。こうしたエラーを拾いたいときは`set_error_logger()`を使います。
```cpp
cli.set_error_logger([](const httplib::Error &err, const httplib::Request *req) {
std::cerr << "error: " << httplib::to_string(err);
if (req) {
std::cerr << " (" << req->method << " " << req->path << ")";
}
std::cerr << std::endl;
});
```
第2引数の`req`はヌルポインタのこともあります。リクエストを組み立てる前の段階で失敗した場合です。使う前に必ずヌルチェックしてください。
## 両方を組み合わせる
成功時は通常のログ、失敗時はエラーログ、という2本立てにすると便利です。
```cpp
cli.set_logger([](const auto &req, const auto &res) {
std::cout << "[ok] " << req.method << " " << req.path
<< " " << res.status << std::endl;
});
cli.set_error_logger([](const auto &err, const auto *req) {
std::cerr << "[ng] " << httplib::to_string(err);
if (req) std::cerr << " " << req->method << " " << req->path;
std::cerr << std::endl;
});
```
> **Note:** ログコールバックはリクエスト処理と同じスレッドで同期的に呼ばれます。重い処理を入れるとリクエストがその分遅くなるので、必要なら別スレッドのキューに流しましょう。

View File

@@ -0,0 +1,87 @@
---
title: "E01. SSEサーバーを実装する"
order: 47
status: "draft"
---
Server-Sent EventsSSEは、サーバーからクライアントへイベントを一方向にプッシュするためのシンプルなプロトコルです。長時間の接続を保ったまま、サーバーが好きなタイミングでデータを送れます。WebSocketより軽量で、HTTPの範囲で完結するのが魅力です。
cpp-httplibにはSSE専用のサーバーAPIはありませんが、`set_chunked_content_provider()``text/event-stream`を組み合わせれば実装できます。
## 基本のSSEサーバー
```cpp
svr.Get("/events", [](const httplib::Request &req, httplib::Response &res) {
res.set_chunked_content_provider(
"text/event-stream",
[](size_t offset, httplib::DataSink &sink) {
std::string message = "data: hello\n\n";
sink.write(message.data(), message.size());
std::this_thread::sleep_for(std::chrono::seconds(1));
return true;
});
});
```
ポイントは3つです。
1. Content-Typeを`text/event-stream`にする
2. メッセージは`data: <内容>\n\n`の形式で書く(`\n\n`で1イベントの区切り
3. `sink.write()`で送るたびに、クライアントが受け取る
接続が生きている限り、プロバイダラムダが繰り返し呼ばれ続けます。
## イベントを送り続ける例
サーバーの現在時刻を1秒ごとに送るシンプルな例です。
```cpp
svr.Get("/time", [](const httplib::Request &req, httplib::Response &res) {
res.set_chunked_content_provider(
"text/event-stream",
[&req](size_t offset, httplib::DataSink &sink) {
if (req.is_connection_closed()) {
sink.done();
return true;
}
auto now = std::chrono::system_clock::now();
auto t = std::chrono::system_clock::to_time_t(now);
std::string msg = "data: " + std::string(std::ctime(&t)) + "\n";
sink.write(msg.data(), msg.size());
std::this_thread::sleep_for(std::chrono::seconds(1));
return true;
});
});
```
クライアントが切断したら`sink.done()`で終了します。詳しくは[S16. クライアントが切断したか検出する](s16-disconnect)を参照してください。
## コメント行でハートビート
`:`で始まる行はSSEのコメントで、クライアントは無視しますが、**接続を生かしておく**役割があります。プロキシやロードバランサが無通信接続を切ってしまうのを防げます。
```cpp
// 30秒ごとにハートビート
if (tick_count % 30 == 0) {
std::string ping = ": ping\n\n";
sink.write(ping.data(), ping.size());
}
```
## スレッドプールとの関係
SSEは接続がつなぎっぱなしなので、1クライアントあたり1ワーカースレッドを消費します。同時接続数が多くなりそうなら、スレッドプールを動的スケーリングにしておきましょう。
```cpp
svr.new_task_queue = [] {
return new httplib::ThreadPool(8, 128);
};
```
詳しくは[S21. マルチスレッド数を設定する](s21-thread-pool)を参照してください。
> **Note:** `data:`の後ろに改行が含まれる場合、各行の先頭に`data: `を付けて複数の`data:`行として送ります。SSEの仕様で決まっているフォーマットです。
> イベント名を使い分けたい場合は[E02. SSEでイベント名を使い分ける](e02-sse-event-names)を、クライアント側は[E04. SSEをクライアントで受信する](e04-sse-client)を参照してください。

View File

@@ -0,0 +1,84 @@
---
title: "E02. SSEでイベント名を使い分ける"
order: 48
status: "draft"
---
SSEでは、1本のストリームで複数の種類のイベントを送れます。`event:`フィールドで名前を付けると、クライアント側で種類ごとに別々のハンドラを呼べます。チャットの「新規メッセージ」「入室」「退室」のような場面で便利です。
## イベント名付きで送る
```cpp
auto send_event = [](httplib::DataSink &sink,
const std::string &event,
const std::string &data) {
std::string msg = "event: " + event + "\n"
+ "data: " + data + "\n\n";
sink.write(msg.data(), msg.size());
};
svr.Get("/chat/stream", [&](const httplib::Request &req, httplib::Response &res) {
res.set_chunked_content_provider(
"text/event-stream",
[&, send_event](size_t offset, httplib::DataSink &sink) {
send_event(sink, "message", "Hello!");
std::this_thread::sleep_for(std::chrono::seconds(2));
send_event(sink, "join", "alice");
std::this_thread::sleep_for(std::chrono::seconds(2));
send_event(sink, "leave", "bob");
std::this_thread::sleep_for(std::chrono::seconds(2));
return true;
});
});
```
1メッセージは`event:``data:` → 空行、の形式です。`event:`を書かないと、クライアント側ではデフォルトの`"message"`イベントとして扱われます。
## IDを付けて再接続に備える
`id:`フィールドを一緒に送ると、クライアントが切断→再接続したときに`Last-Event-ID`ヘッダーで「どこまで受け取ったか」を教えてくれます。
```cpp
auto send_event = [](httplib::DataSink &sink,
const std::string &event,
const std::string &data,
const std::string &id) {
std::string msg = "id: " + id + "\n"
+ "event: " + event + "\n"
+ "data: " + data + "\n\n";
sink.write(msg.data(), msg.size());
};
send_event(sink, "message", "Hello!", "42");
```
IDの付け方は自由です。連番でもUUIDでも、サーバー側で重複せず順序が追えるものを選びましょう。再接続の詳細は[E03. SSEの再接続を処理する](e03-sse-reconnect)を参照してください。
## JSONをdataに乗せる
構造化されたデータを送りたいときは、`data:`の中身をJSONにするのが定番です。
```cpp
nlohmann::json payload = {
{"user", "alice"},
{"text", "Hello!"},
};
send_event(sink, "message", payload.dump(), "42");
```
クライアント側では受け取った`data`をそのままJSONパースすれば、元のオブジェクトに戻せます。
## データに改行が含まれる場合
`data:`の値に改行が入るときは、各行の先頭に`data: `を付けて複数行に分けて送ります。
```cpp
std::string msg = "data: line1\n"
"data: line2\n"
"data: line3\n\n";
sink.write(msg.data(), msg.size());
```
クライアント側では、これらが改行でつながった1つの`data`として復元されます。
> **Note:** `event:`を使うとクライアント側のハンドリングがきれいになりますが、ブラウザのDevToolsで見たときに種類別で識別しやすくなるというメリットもあります。デバッグ時に効いてきます。

View File

@@ -0,0 +1,85 @@
---
title: "E03. SSEの再接続を処理する"
order: 49
status: "draft"
---
SSE接続はネットワークの都合で切れることがあります。クライアントは自動的に再接続を試みるので、サーバー側では「再接続してきたクライアントに、途中から配信を再開する」仕組みを用意しておくと親切です。
## `Last-Event-ID`を受け取る
クライアントが再接続すると、最後に受け取ったイベントのIDを`Last-Event-ID`ヘッダーに入れて送ってきます。サーバー側ではこれを読んで、その続きから配信を再開できます。
```cpp
svr.Get("/events", [](const httplib::Request &req, httplib::Response &res) {
auto last_id = req.get_header_value("Last-Event-ID");
int start = last_id.empty() ? 0 : std::stoi(last_id) + 1;
res.set_chunked_content_provider(
"text/event-stream",
[start](size_t offset, httplib::DataSink &sink) mutable {
static int next_id = 0;
if (next_id < start) { next_id = start; }
std::string msg = "id: " + std::to_string(next_id) + "\n"
+ "data: event " + std::to_string(next_id) + "\n\n";
sink.write(msg.data(), msg.size());
++next_id;
std::this_thread::sleep_for(std::chrono::seconds(1));
return true;
});
});
```
初回接続では`Last-Event-ID`が無いので`0`から送り始め、再接続時は続きのIDから再開します。イベントの保存はサーバー側の責任なので、直近のイベントをキャッシュしておく必要があります。
## 再接続間隔を指定する
`retry:`フィールドを送ると、クライアント側の再接続間隔を指定できます。単位はミリ秒です。
```cpp
std::string msg = "retry: 5000\n\n"; // 5秒後に再接続
sink.write(msg.data(), msg.size());
```
通常は最初に1回送っておけば十分です。混雑時やサーバーメンテナンス時に、リトライ間隔を長めに指定して負荷を減らすといった使い方もできます。
## イベントのバッファリング
再接続のために、直近のイベントをサーバー側でバッファしておく実装が必要です。
```cpp
struct EventBuffer {
std::mutex mu;
std::deque<std::pair<int, std::string>> events; // {id, data}
int next_id = 0;
void push(const std::string &data) {
std::lock_guard<std::mutex> lock(mu);
events.push_back({next_id++, data});
if (events.size() > 1000) { events.pop_front(); }
}
std::vector<std::pair<int, std::string>> since(int id) {
std::lock_guard<std::mutex> lock(mu);
std::vector<std::pair<int, std::string>> out;
for (const auto &e : events) {
if (e.first >= id) { out.push_back(e); }
}
return out;
}
};
```
再接続してきたクライアントに`since(last_id)`で未送信分をまとめて送ると、取りこぼしを防げます。
## 保存期間のバランス
バッファをどれだけ持つかは、メモリと「どれだけさかのぼって再送できるか」のトレードオフです。用途によって決めましょう。
- リアルタイムチャット: 数分〜数十分
- 通知: 直近のN件
- 取引データ: 永続化して、必要ならDBから取得
> **Warning:** `Last-Event-ID`はクライアントが送ってくる値なので、サーバー側で信用しすぎないようにしましょう。数値として読むなら範囲チェックを、文字列ならサニタイズを忘れずに。

Some files were not shown because too many files have changed in this diff Show More