Posts

Show HN: PhAIL – Real-robot benchmark for AI models https://ift.tt/BcHAMUw

I built this because I couldn't find honest numbers on how well VLA models [1] actually work on commercial tasks. I come from search ranking at Google, where you measure everything, and in robotics nobody seemed to know.

PhAIL runs four models (OpenPI/pi0.5, GR00T, ACT, SmolVLA) on bin-to-bin order picking, one of the most common warehouse operations. Same robot (Franka FR3), same objects, hundreds of blind runs. The operator doesn't know which model is running.

Best model: 64 UPH (units per hour). Human teleoperating the same robot: 330. Human by hand: 1,300+.

Everything is public: every run with synced video and telemetry, the fine-tuning dataset, training scripts. The leaderboard is open for submissions. Happy to answer questions about methodology, the models, or what we observed.

[1] Vision-Language-Action: https://ift.tt/XUpPNT0

https://phail.ai

March 31, 2026 at 09:55PM
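To put the three throughput figures from the PhAIL post in proportion, the short sketch below computes the ratios between them. It uses only the numbers quoted in the post; the dictionary keys and the helper name `speedup` are made up for illustration. Because the by-hand figure is given as "1,300+", any ratio involving it is a lower bound rather than an exact value.

```python
# Throughput figures quoted in the post, in UPH.
# "1,300+" is a lower bound, so 1300 is used as a conservative stand-in.
UPH = {
    "best_model": 64,
    "human_teleop": 330,
    "human_by_hand": 1300,
}


def speedup(slower: str, faster: str) -> float:
    """Return how many times higher the throughput of `faster` is than `slower`."""
    return UPH[faster] / UPH[slower]


if __name__ == "__main__":
    print(f"teleop vs best model:  {speedup('best_model', 'human_teleop'):.1f}x")
    print(f"by hand vs best model: {speedup('best_model', 'human_by_hand'):.1f}x or more")
    print(f"by hand vs teleop:     {speedup('human_teleop', 'human_by_hand'):.1f}x or more")
```

On these numbers, a human teleoperating the same robot is roughly five times faster than the best model, and a human working by hand is at least twenty times faster.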

Show HN: AI Spotlight for Your Computer (natural language search for files) https://ift.tt/4o3ef2M

Hi HN, I built SEARCH WIZARD, a tool that lets you search your computer using natural language.

Traditional file search only works if you remember the filename. But most of the time we remember things like:

- "the screenshot where I was in a meeting"
- "the PDF about transformers"
- "notes about machine learning"

Smart Search indexes your files and lets you search by meaning instead of filename.

Currently supports:

- Images
- Videos
- Audio
- Documents

Example query: "old photo where a man is looking at a monitor"

The system retrieves the correct file instantly. Everything runs locally except embeddings.

I'm looking for feedback on:

- indexing approaches
- privacy concerns
- features you'd want in a tool like this

GitHub: https://ift.tt/UaqFuem
Demo: https://deepanmpc.github.io/SMART-SEARCH/

March 30, 2026 at 08:43PM
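Searching "by meaning instead of filename", as the file-search post describes, is commonly done by comparing embedding vectors. The sketch below is a generic illustration of that idea and is not taken from the project's code: the file paths, the three-dimensional toy vectors, and the function names `cosine` and `search` are all hypothetical. A real system would get its vectors from an embedding model rather than hard-coding them.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec: list[float], index: dict[str, list[float]], top_k: int = 3) -> list[str]:
    """Return up to `top_k` file paths, most similar to the query first."""
    scored = [(cosine(query_vec, vec), path) for path, vec in index.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in scored[:top_k]]


# Hypothetical index: file path -> toy embedding (made-up values).
INDEX = {
    "photos/man_at_monitor.jpg": [0.9, 0.1, 0.0],
    "docs/transformers.pdf": [0.0, 0.2, 0.9],
    "notes/ml_notes.txt": [0.1, 0.8, 0.5],
}

# Stand-in for the embedding of "old photo where a man is looking at a monitor".
QUERY = [0.8, 0.2, 0.1]

if __name__ == "__main__":
    for path in search(QUERY, INDEX):
        print(path)
```

With these toy vectors the photo ranks first because its vector points in nearly the same direction as the query. The filename never enters the comparison, which is the point of the approach.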

Show HN: Memv – Memory for AI Agents https://ift.tt/0iM4pL2

memv is an open-source Python library that gives AI agents persistent memory. Feed it conversations; it extracts knowledge.

The extraction mechanism is predict-calibrate (Nemori paper): given existing knowledge, it predicts what a new conversation should contain, then extracts only what the prediction missed.

v0.1.2 adds the production path:

- PostgreSQL backend (pgvector for vectors, tsvector for text search, asyncpg pooling). Single db_url parameter: file path for SQLite, connection string for Postgres.
- Embedding adapters: OpenAI, Voyage, Cohere, fastembed (local ONNX).

Other things it does:

- Bi-temporal validity: event time (when was the fact true) + transaction time (when did we learn it), following Graphiti's model.
- Hybrid retrieval: vector similarity + BM25 merged with Reciprocal Rank Fusion.
- Episode segmentation: groups messages before extraction.
- Contradiction handling: new facts invalidate old ones, with full audit trail.

Proc...
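The hybrid retrieval item in the memv post merges a vector-similarity ranking and a BM25 ranking with Reciprocal Rank Fusion. As a rough illustration of the general technique, and not of memv's actual code or API, the sketch below scores each item as the sum of 1 / (k + rank) over every ranking it appears in. The function name, the `fact_*` identifiers, and the choice of k = 60 (a common default from the RRF literature) are assumptions here; the post does not say which constant memv uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several best-first rankings into one list.

    Each item's score is the sum of 1 / (k + rank) over the rankings
    that contain it, with rank starting at 1. Ties are broken by item id
    so the output is deterministic.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return sorted(scores, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    # Hypothetical result lists, best match first.
    vector_hits = ["fact_a", "fact_b", "fact_c"]
    bm25_hits = ["fact_b", "fact_c", "fact_a"]
    print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
```

Because only ranks are used, the two retrievers' raw scores never need to be put on a common scale, which is the usual reason for choosing RRF over a weighted sum of scores.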

Show HN: Coasts – Containerized Hosts for Agents https://ift.tt/xzlCijp

Show HN: QuickBEAM – run JavaScript as supervised Erlang/OTP processes https://ift.tt/WdsFxTk

QuickBEAM is a JavaScript runtime embedded inside the Erlang/OTP VM. If you’re building a full-stack app, JavaScript tends to leak in anyway: frontend, SSR, or third-party code. QuickBEAM runs that JavaScript inside OTP supervision trees.

Each runtime is a process with a `Beam` global that can:

- call Elixir code
- send/receive messages
- spawn and monitor processes
- inspect runtime/system state

It also provides browser-style APIs backed by OTP/native primitives (fetch, WebSocket, Worker, BroadcastChannel, localStorage, native DOM, etc.).

This makes it usable for:

- SSR
- sandboxed user code
- per-connection state
- backend JS with direct OTP interop

Notable bits:

- JS runtimes are supervised and restartable
- sandboxing with memory/reduction limits and API control
- native DOM that Erlang can read directly (no string rendering step)
- no JSON boundary between JS and Erlang
- built-in TypeScript, npm support, and ...

Show HN: Octopus, Open-source alternative to CodeRabbit and Greptile https://ift.tt/n8UFme0

Hey HN, we built Octopus, an open-source, self-hostable AI code reviewer for GitHub and Bitbucket. It uses RAG with vector search (Qdrant) to understand your full codebase, not just the diff, and posts inline findings on PRs with severity ratings. It works with Claude and OpenAI, and you can bring your own API keys.

Video: https://www.youtube.com/watch?v=HP1kaKTOdXw
GitHub: https://ift.tt/rqseX5Y

https://ift.tt/Ei0FTpI

March 28, 2026 at 06:50PM