Show HN: Tune LLaMa3.1 on Google Cloud TPUs

Hey HN, we wanted to share our repo where we fine-tuned Llama 3.1 on Google TPUs. We're building AI infra to fine-tune and serve LLMs on non-NVIDIA chips (TPUs, Trainium, AMD GPUs).

The problem: Right now, 90% of LLM workloads run on NVIDIA GPUs, but there are equally powerful and more cost-effective alternatives out there. For example, training and serving Llama 3.1 on Google TPUs is about 30% cheaper than on NVIDIA GPUs. But developer tooling for non-NVIDIA chipsets is lacking.

We felt this pain ourselves. We initially tried using PyTorch XLA to train Llama 3.1 on TPUs, but it was rough: the XLA integration with PyTorch is clunky, libraries were missing (bitsandbytes didn't work), and the Hugging Face errors were cryptic. We then took a different route and translated Llama 3.1 from PyTorch to JAX. Now it's running smoothly on TPUs! We still have challenges ahead (for one, there is no good LoRA library in JAX), but this feels like the right path forward.

Here's a demo ( https://ift.tt/6gGmvOw ) of our managed solution. Would love your thoughts on our repo and vision as we keep chugging along! https://ift.tt/csBmZpF

September 11, 2024 at 08:44PM
Show HN: Tune LLaMa3.1 on Google Cloud TPUs https://ift.tt/zjxGdMI