Show HN: Warehouse OpenAI requests to your own database

Today we're launching Velvet, an AI gateway for warehousing OpenAI and Anthropic requests to your own PostgreSQL instance.

We originally built an AI SQL editor, but realized that customers were using it to monitor their AI requests in production. We had already built an AI request warehousing tool internally to debug our SQL editor, and gave some customers access. A few days into testing this idea, our pilot customer launched [1] and we began warehousing 1,500 requests per second. We worked closely with their engineering team over the following weeks, completely re-architecting Velvet for scale and adding features such as Batch support. Along the way, other companies began seeking out Velvet to get visibility into their own LLM requests.

We're launching our AI gateway as a self-serve product today, but our pilot customers are already warehousing over 3 million requests per week, so the system is stable and performant. What makes Velvet unique is that you own the data in your own database. We're also the first proxy that gives visibility into OpenAI Batch calls, so you can observe and monitor the async calls that save you money.

Some technical notes:

- Supports OpenAI and Anthropic endpoints.
- Data is formatted as JSON and logged to your own PostgreSQL instance (we can add support for other databases for paying customers).
- You can include queryable metadata in the request headers, such as user ID, org ID, model ID, and version ID.
- Built on Cloudflare Workers, which keeps latency minimal (using our caching feature reduces latency further).
- Built for security; we're starting the SOC 2 process soon.

Why warehouse your requests?

- Understand where money is spent. Use custom headers to calculate the cost per customer, model, or service.
- Download real request/response data so you can evaluate new models (e.g., re-running requests with a cheaper mini model).
- Monitor time to completion of batch jobs (e.g., OpenAI says up to 24 hours, but our customers average 3-4 hours).
- Export a subset of example requests for fine-tuning.

It's just a 2-line code change to get started. Try a sandbox demoing the logging proxy here: https://ift.tt/J9VCxIz

More details in our docs: https://ift.tt/bdfK06w

[1] https://ift.tt/6gTRLV3

https://ift.tt/f0cgUCp

August 28, 2024 at 10:21PM
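The "2-line code change" boils down to pointing your OpenAI client at the gateway's base URL and attaching metadata headers. A minimal sketch of what that looks like — note that the gateway URL and the `x-metadata-*` header names here are illustrative assumptions, not Velvet's documented values:

```python
# Sketch: route OpenAI SDK traffic through a logging gateway.
# The base URL and header names below are hypothetical examples;
# check the Velvet docs for the real values.

def gateway_config(api_key: str, user_id: str, org_id: str) -> dict:
    """Build client settings for the proxy: swap the base URL and add
    queryable metadata headers that get warehoused with each request."""
    return {
        "base_url": "https://gateway.example.com/v1/openai",  # hypothetical proxy URL
        "api_key": api_key,  # your normal provider key, forwarded by the proxy
        "default_headers": {
            "x-metadata-user-id": user_id,  # per-customer attribution
            "x-metadata-org-id": org_id,
        },
    }

# With the official openai package, the change would be:
#   client = OpenAI(**gateway_config(key, "user_42", "org_7"))
# instead of:
#   client = OpenAI(api_key=key)

if __name__ == "__main__":
    cfg = gateway_config("sk-test", "user_42", "org_7")
    print(cfg["default_headers"])
```

Every subsequent request then flows through the proxy unchanged, with the header values landing in Postgres as queryable columns alongside the request/response JSON.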
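Once requests are warehoused, cost attribution is a straightforward aggregation over the logged usage data. A sketch of the per-customer calculation in Python — the row shape (a metadata user ID plus an OpenAI-style `usage` block) and the per-million-token prices are illustrative assumptions; in practice you would run the equivalent query in SQL against the JSON columns:

```python
from collections import defaultdict

# Illustrative per-million-token prices; substitute real model pricing.
PRICES = {"gpt-4o-mini": {"input": 0.15, "output": 0.60}}

def cost_per_customer(rows: list[dict]) -> dict[str, float]:
    """Aggregate spend per user ID from warehoused request logs.

    Each row is assumed to carry the metadata-header user ID and the
    usage block from the logged response JSON.
    """
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        price = PRICES[row["model"]]
        usage = row["usage"]
        cost = (usage["prompt_tokens"] * price["input"]
                + usage["completion_tokens"] * price["output"]) / 1_000_000
        totals[row["user_id"]] += cost
    return dict(totals)

if __name__ == "__main__":
    logs = [
        {"user_id": "u1", "model": "gpt-4o-mini",
         "usage": {"prompt_tokens": 1000, "completion_tokens": 500}},
        {"user_id": "u1", "model": "gpt-4o-mini",
         "usage": {"prompt_tokens": 2000, "completion_tokens": 100}},
    ]
    print(cost_per_customer(logs))  # {'u1': 0.00081}
```

The same grouping by model ID or version ID (from the other metadata headers) tells you which services or releases drive spend.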