Show HN: Blast – Fast, multi-threaded serving engine for web browsing AI agents https://ift.tt/QKx2hd5

May 02, 2025

Show HN: Blast – Fast, multi-threaded serving engine for web browsing AI agents Hi HN! BLAST is a high-performance serving engine for browser-augmented LLMs, designed to make deploying web-browsing AI easy, fast, and cost-manageable. The goal with BLAST is to ultimately achieve google search level latencies for tasks that currently require a lot of typing and clicking around inside a browser. We're starting off with automatic parallelism, prefix caching, budgeting (memory and LLM cost), and an OpenAI-Compatible API but have a ton of ideas in the pipe! Website & Docs: https://ift.tt/lfKXp2u https://ift.tt/ZqVyipF MIT-Licensed Open-Source: https://ift.tt/bHcz9Zs Hope some folks here find this useful! Please let me know what you think in the comments or ping me on Discord. — Caleb (PhD student @ Stanford CS) https://ift.tt/bHcz9Zs May 2, 2025 at 11:12PM

Search This Blog

News updates

Show HN: Blast – Fast, multi-threaded serving engine for web browsing AI agents https://ift.tt/QKx2hd5

Comments

Post a Comment

Popular posts from this blog

Show HN: Agent File (.af) – An open file format for agents https://ift.tt/fzI5HcG

Show HN: Attach Gateway – one-command OIDC/DID auth for local LLMs https://ift.tt/crSdltX

Show HN: Sort lines semantically using llm-sort https://ift.tt/7vEeHKP