Show HN: Sparrow-1 – Audio-native model for human-level turn-taking without ASR https://ift.tt/KeXlId5

January 14, 2026

For the past year at Tavus, I've been working to rethink how AI manages timing in conversation. I've spent a lot of time listening to conversations. Today we're announcing the release of Sparrow-1, the most advanced conversational flow model in the world.

Some technical details:

- Predicts conversational floor ownership, not speech endpoints
- Audio-native streaming model, no ASR dependency
- Human-timed responses without silence-based delays
- Zero interruptions at sub-100ms median latency
- In benchmarks, Sparrow-1 beats all existing models at real-world turn-taking baselines

I wrote more about the work here: https://ift.tt/fFRHxa4... https://ift.tt/u7ViAlg

January 14, 2026 at 11:31PM
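For context on the "silence-based delays" item above, here is a minimal sketch of the kind of silence-timeout endpointer that many voice pipelines use as a baseline. It is not Sparrow-1's method and nothing here comes from Tavus; the function name, frame size, energy threshold, and 600 ms timeout are all hypothetical values chosen for illustration. The point is only that such a detector cannot declare the turn over until the full timeout has elapsed after the speaker stops.

```python
def silence_endpoint(frame_energies, frame_ms=20, energy_threshold=0.01, silence_ms=600):
    """Hypothetical baseline: declare end of turn after `silence_ms` of quiet.

    Returns the time in ms at which the decision is made, or None if the
    timeout is never reached. All parameters are illustrative assumptions.
    """
    frames_needed = silence_ms // frame_ms
    quiet_frames = 0
    heard_speech = False
    for i, energy in enumerate(frame_energies):
        if energy >= energy_threshold:
            # Any speech frame resets the silence counter.
            heard_speech = True
            quiet_frames = 0
        elif heard_speech:
            quiet_frames += 1
            if quiet_frames >= frames_needed:
                return (i + 1) * frame_ms
    return None


# One second of speech followed by one second of silence (20 ms frames).
frames = [0.5] * 50 + [0.0] * 50
speech_end_ms = 50 * 20
decision_ms = silence_endpoint(frames)
added_delay_ms = decision_ms - speech_end_ms  # equals the silence timeout
```

With these assumed settings the decision always trails the true end of speech by the timeout itself, and a mid-sentence pause longer than the timeout would be mistaken for the end of the turn. That trade-off is what a model predicting floor ownership directly is meant to avoid.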

