Show HN: A vector database with semantic SQL-like filtering

Hi HN! It’s always bothered me that there’s no real equivalent of SQL WHERE for vector content. Filtering is one of the cornerstones of a modern database, but vector DBs only support either top-k sort, which is only useful for fuzzy search, or metadata filtering, which isn’t semantic. I’ve found myself wanting all the results matching my semantic query, not just k! Aside from data analysis, it's relevant if you’re trying to do any LLM reasoning: you don’t make good decisions or reach good conclusions by considering a small subset of information.

So, we’ve designed a filtering primitive on top of vectors and assembled a demo on customer reviews from Trustpilot, Yelp, App Store, etc. You can select any brand/restaurant/app, and slice the review data however you want. The filter should find all matching documents, not just the top-k. Check it out at https://ift.tt/TuJhYrS ! It's not super optimized yet, and really just an exploration, but hopefully it gets the point across.

FAQ:

- Can I try it on my own data? Sure, shoot me a message at hello [at] emberml [dot] com.
- How does it work? We’ve built a custom vector-based index, and we learn a high-quality decision boundary between relevant and irrelevant vectors at query time. You can think of it as forming a few-shot classifier each time.
- What’s the catch? It’s far slower and less scalable than KNN/ANN right now. But I’d rather solve quality before trying to scale up quantity; tbh I’m not satisfied with vector DB performance even at N=1,000. A hot take, maybe?
- Why don’t you just classify the data beforehand? Unstructured data has too many degrees of freedom, so it’s hard to anticipate every search/filter a priori. Our approach is somewhat analogous to schema-on-read.

https://ift.tt/SOz9nKh
September 14, 2023 at 11:34PM
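To make the "How does it work?" answer a bit more concrete, here is a rough sketch of the difference between top-k retrieval and a query-time few-shot filter. It is not the actual index or algorithm behind the demo, which the post doesn't describe in detail. A nearest-centroid rule stands in for the learned decision boundary, the 2-D vectors stand in for real embeddings, and every name (`top_k`, `semantic_filter`, the `r1`..`r5` ids, the positive/negative examples) is made up for illustration.

```python
# Hypothetical sketch only: a nearest-centroid rule is swapped in for the
# learned decision boundary, and toy 2-D vectors replace real embeddings.
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def top_k(query, docs, k):
    """Classic KNN: returns the k nearest ids, however many docs truly match."""
    ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
    return ranked[:k]


def semantic_filter(positives, negatives, docs):
    """Few-shot filter: keep every doc that is closer to the positive centroid
    than to the negative one, i.e. on the 'relevant' side of the boundary."""
    pos_c, neg_c = centroid(positives), centroid(negatives)
    return [d for d, v in docs.items() if cosine(v, pos_c) > cosine(v, neg_c)]


# Toy corpus: r1-r3 point roughly one way, r4-r5 the other.
docs = {
    "r1": [0.9, 0.1],
    "r2": [0.8, 0.3],
    "r3": [0.7, 0.2],
    "r4": [0.1, 0.9],
    "r5": [0.2, 0.8],
}
query = [1.0, 0.0]
positives = [[1.0, 0.0], [0.9, 0.2]]   # a few "relevant" examples
negatives = [[0.0, 1.0], [0.2, 0.9]]   # a few "irrelevant" examples

knn_hits = top_k(query, docs, k=2)
filter_hits = semantic_filter(positives, negatives, docs)
print("top-k:", knn_hits)
print("filter:", filter_hits)
```

With k=2 the top-k call has to drop one of the three matching reviews, while the filter returns all three and nothing else. A real system would presumably need a far better boundary than two centroids, plus some way of getting the positive and negative examples from the query itself.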