Show HN: Papermusic (draw an instrument, then play it)

This was a fun experiment to try out PaliGemma, an open vision-language model. I found that PaliGemma performed better than Gemini Flash for this kind of narrow image task, especially on latency: about 0.9 seconds per PaliGemma inference on a VM, versus 3-4 seconds for Gemini Flash. I would love feedback on ways to improve this setup.

https://ift.tt/dYmSwi4

June 17, 2024 at 09:56PM
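For anyone wanting to reproduce a latency comparison like the one quoted above, here is a minimal timing sketch. It is not the author's benchmark: `measure_latency`, `summarize`, and `fake_inference` are hypothetical names introduced for illustration, and `fake_inference` is only a stand-in for a real PaliGemma or Gemini Flash call, which the post does not show.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(call: Callable[[], object], runs: int = 5, warmup: int = 1) -> List[float]:
    """Return wall-clock seconds for each of `runs` invocations of `call`."""
    for _ in range(warmup):
        call()  # warm-up runs are discarded (model load, connection setup)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings: List[float]) -> Dict[str, float]:
    """Report the median and worst-case latency in seconds."""
    return {"median_s": statistics.median(timings), "max_s": max(timings)}


def fake_inference() -> str:
    # Hypothetical stand-in: replace with the real model request being timed.
    time.sleep(0.01)
    return "piano"


if __name__ == "__main__":
    print(summarize(measure_latency(fake_inference)))
```

Timing both backends with the same image and the same number of runs, and reporting the median rather than a single request, makes the comparison less sensitive to one-off network or cold-start delays.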
Show HN: Papermusic (draw an instrument, then play it) https://ift.tt/zbK1Z5D