Show HN: Kevin-32B – how to do multi-turn RL on writing CUDA kernels

Hey – we just published a blog post about Kevin-32B = K(ernel D)evin. To our knowledge, it's the first open-source model that's RL-trained on CUDA kernels. Our goal was to demonstrate multi-turn RL using GRPO. We used 180 Python->CUDA conversion tasks from the KernelBench dataset. The results were surprisingly strong: we were able to outperform top reasoning models like o3 & o4-mini. We're sharing our training setup and learnings in the blog post. The model is also on HuggingFace: https://ift.tt/yzarSeH https://ift.tt/Fv4XU2o

May 7, 2025 at 01:18AM
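For readers new to GRPO, here is a minimal sketch of its core idea: several completions are sampled for the same task, and each one's advantage is its reward normalized against the rest of the group. This is not Kevin-32B's training code. The function name, the `eps` constant, and the reward values are made up for illustration, and the multi-turn part of the setup is not modeled here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples for the same task.

    Illustrative helper (hypothetical name): each advantage is
    (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four kernels sampled for one task,
# e.g. 0.0 for an incorrect kernel and higher for correct, faster ones.
rewards = [0.0, 0.3, 1.3, 0.0]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.1f} advantage={a:+.3f}")
```

Samples that beat the group average get a positive advantage and the rest get a negative one, so no separate value model is needed to form a baseline.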
Show HN: Kevin-32B – how to do multi-turn RL on writing CUDA kernels https://ift.tt/VEsDdT4