Show HN: Running LLMs in one line of Python without Docker https://ift.tt/iDwaHbj

Hello Hacker News! We're Yangqing, Xiang and JJ from lepton.ai. We are building a platform that lets you run any AI model as easily as writing local code, and get your favorite models up and running in minutes. It's like a container for AI, but without the hassle of actually building a Docker image.

We built and contributed to some of the world's most popular AI and infrastructure software: PyTorch 1.0, ONNX, Caffe, etcd, Kubernetes, and others. We also managed hundreds of thousands of computers in our previous jobs. Along the way we found that the AI software stack is usually unnecessarily complex, and we want to change that.

Imagine you are a developer who sees a good model on GitHub or HuggingFace. To turn it into a production-ready service, the current solution usually requires you to build a Docker image. But think about it: all you have is a bit of Python code and a few Python dependencies. That sounds like a huge overhead, right?

lepton.ai is a Pythonic way to free you from such difficulties. You write a simple Python scaffold around your PyTorch / TensorFlow code, and lepton launches it as a full-fledged service callable via Python, JavaScript, or any language that understands OpenAPI. We use containers under the hood, but you don't need to worry about all the infrastructure nuts and bolts.

One of the biggest challenges in AI is that it's really "all-stack": in addition to a plethora of models, AI applications usually involve GPUs, cloud infra, web services, DevOps, and SysOps. We want you to focus on your job while we take care of the rest of the "boring but essential" work.

We're really excited to show this to you all! Please let us know your thoughts and questions in the comments.

https://www.lepton.ai/

October 4, 2023 at 10:07PM
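To make the "Python scaffold becomes a callable service" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not lepton's API. The names `handler`, `HANDLERS`, `dispatch`, `Service`, and `serve` are hypothetical, and `echo` is a stand-in for a real PyTorch / TensorFlow model call. It only illustrates the general pattern of registering a plain function and exposing it as a JSON endpoint.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical registry mapping a URL path to a plain Python callable.
HANDLERS = {}


def handler(func):
    """Register a function as a JSON endpoint at /<function name>."""
    HANDLERS["/" + func.__name__] = func
    return func


@handler
def echo(text: str) -> dict:
    # Stand-in for a model call (e.g. a PyTorch forward pass).
    return {"output": text.upper()}


def dispatch(path: str, body: bytes):
    """Route a request to a registered function; return (status, payload)."""
    func = HANDLERS.get(path)
    if func is None:
        return 404, {"error": "unknown path"}
    try:
        kwargs = json.loads(body or b"{}")
        return 200, func(**kwargs)
    except (ValueError, TypeError) as exc:
        # Malformed JSON or arguments that do not match the function signature.
        return 400, {"error": str(exc)}


class Service(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, run the registered function, send JSON back.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = dispatch(self.path, self.rfile.read(length))
        data = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8080):
    """Start the HTTP server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), Service).serve_forever()
```

Calling `serve()` would start a local server that accepts `POST /echo` with a body such as `{"text": "hi"}`. A hosted platform would additionally handle packaging, dependencies, GPUs, and scaling, none of which this sketch attempts.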