mirror of
https://github.com/yhirose/cpp-httplib.git
synced 2026-04-11 19:28:30 +00:00
"Building a Desktop LLM App with cpp-httplib" (#2403)
@@ -1,23 +1,26 @@
---
title: "Building a Desktop LLM App with cpp-httplib"
order: 0
status: "draft"
---
Build an LLM-powered translation desktop app step by step, learning both the server and client sides of cpp-httplib along the way. Translation is just an example — swap it out to build your own summarizer, code generator, chatbot, or any other LLM application.
Have you ever wanted to add a web API to your own C++ library, or quickly build an Electron-like desktop app? In Rust you might reach for "Tauri + axum," but in C++ it always seemed out of reach.
With [cpp-httplib](https://github.com/yhirose/cpp-httplib), [webview/webview](https://github.com/webview/webview), and [cpp-embedlib](https://github.com/yhirose/cpp-embedlib), you can take the same approach in pure C++ — and produce a small, easy-to-distribute single binary.

## Dependencies

- [llama.cpp](https://github.com/ggml-org/llama.cpp) — LLM inference engine
- [nlohmann/json](https://github.com/nlohmann/json) — JSON parser (header-only)
- [webview/webview](https://github.com/webview/webview) — WebView wrapper (header-only)
- [cpp-httplib](https://github.com/yhirose/cpp-httplib) — HTTP server/client (header-only)
In this tutorial we build an LLM-powered translation app using [llama.cpp](https://github.com/ggml-org/llama.cpp), progressing step by step from "REST API" to "SSE streaming" to "Web UI" to "desktop app."

If you know basic C++17 and understand the basics of HTTP / REST APIs, you're ready to start.
## Chapters
1. **[Set up the project](ch01-setup)** — Fetch dependencies, configure the build, write scaffold code
2. **[Embed llama.cpp and create a REST API](ch02-rest-api)** — Return translation results as JSON
3. **[Add token streaming with SSE](ch03-sse-streaming)** — Stream responses token by token
4. **[Add model discovery and management](ch04-model-management)** — Download and switch models from Hugging Face
5. **[Add a Web UI](ch05-web-ui)** — A browser-based translation interface
6. **[Turn it into a desktop app with WebView](ch06-desktop-app)** — A single-binary desktop application
7. **[Reading the llama.cpp server source code](ch07-code-reading)** — Compare with production-quality code
8. **[Making it your own](ch08-customization)** — Swap in your own library and customize