Show HN: Papermusic (draw an instrument, then play it) https://ift.tt/gtGwuMi

This was a fun experiment to try PaliGemma, an open vision-language model. I found that PaliGemma performed better than Gemini Flash for this type of specific image task, especially on latency: roughly 0.9 seconds for PaliGemma inference on a VM, versus 3-4 seconds for Gemini Flash. Would love feedback on ways to improve this setup.

https://ift.tt/QDIFYBb

June 17, 2024 at 09:56PM
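For anyone who wants to reproduce a latency comparison like the one above, here is a minimal sketch of a timing helper. It is not the code used in Papermusic: `median_latency`, `speedup`, and `fake_model_call` are hypothetical names, and the stand-in call just sleeps instead of hitting a real model. You would swap in your own PaliGemma or Gemini Flash request.

```python
import statistics
import time
from typing import Callable


def median_latency(call: Callable[[], object], runs: int = 5) -> float:
    """Return the median wall-clock time in seconds over `runs` invocations."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_s: float, candidate_s: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


def fake_model_call() -> None:
    # Hypothetical stand-in for a real inference request (assumption):
    # replace with the actual model call you want to time.
    time.sleep(0.01)


if __name__ == "__main__":
    measured = median_latency(fake_model_call, runs=5)
    print(f"median latency: {measured:.3f} s")
    # Using the figures quoted in the post: 0.9 s vs. 3-4 s.
    print(f"speedup vs 3 s: {speedup(3.0, 0.9):.1f}x")
    print(f"speedup vs 4 s: {speedup(4.0, 0.9):.1f}x")
```

With the numbers quoted in the post, 0.9 seconds against 3-4 seconds works out to roughly a 3.3x to 4.4x difference. The median is used here rather than the mean so that a single slow cold-start request does not skew the result.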
