Show HN: Using LLMs and Embeddings to classify application errors https://ift.tt/RkBSzoW

Show HN: Using LLMs and Embeddings to classify application errors Hi Hacker News! We’re Vadim and Chris from Highlight.io [1]. We do web app monitoring and are working on using LLMs/embeddings to add new functionality to our error monitoring product. Given that there’s a lot of founders/engineers using LLMs in their products, we figured we’d share how we built the new functionality, their impact on our workflows, and how you can try it out. Our goal was to build two features: (1) tagging errors (e.g. deeming an error as “authentication error” or a “database error”); and (2) grouping similar errors together (e.g. two errors that have a different stacktrace and body, but are semantically not very different). Each of these rely heavily on comparing text across our application. After some experimentation with the OpenAI embeddings API [3], we went ahead and hosted a private model instance of thenlper/gte-large (an open-source MIT licensed model), which is a 1024-dimension model running on an Intel Ice Lake 2 vCPU machine on Hugging face [4]. Our general approach for classifying/comparing text is as follows. As each set of tokens (i.e a string) comes in, our backend makes a request to an inference endpoint and receives a 1024-dimension float vector as a response (see the code here [5]). We then store that vector using pgvector [6]. To compare any two sets for similarity, we simply look at the Euclidian distance between their respective embeddings using the ivfflat index implemented by pgvector (example code here [7]). To tag errors, we assign an error its most relevant tag from a predetermined set decided by us. For example, if we tag an error as an "authentication error" or a "database error", we can allow developers to have a starting point before inspecting an issue.(see the logic here [8]). Anecdotally, this approach seems to work very well. For example, here are two authentication errors that got tagged as “Authentication Error”: * Firebase: A network AuthError has occurred * Error retrieving user from firebase api for email verification: cannot find user from uid. We also use these error embeddings to group similar errors. To decide whether an error joins a group or starts a new one, we decide on a distance threshold (using the euclidean distance) ahead of time. An interesting thing about this approach, compared to using a text-based heuristic, is that two errors with different stack traces can still be grouped together. Here’s an example: * github.com/highlight-run/highlight/backend/worker.(*Worker).ReportStripeUsage * github.com/highlight-run/highlight/backend/private-graph/graph.(*Resolver).GetSlackChannelsFromSlack.func1 Both reported as `integration api error` as they involve the Stripe and Slack integrations respectively. The neat thing is that the LLM can use the full context of an error and match based on the most relevant details about the error. We have rolled out a first version of the error grouping logic to our cloud product [9], and there’s a demo of all the functionality at [2]. Long-term, the HN community has other ideas of what we could build with LLM tooling in observability, we’re all ears. Let us know what you think! Links [1] https://ift.tt/erbfRDU [2] https://ift.tt/UvcFpiy [3] https://ift.tt/vo2TtP7 [4] https://ift.tt/u5yZLUY [5] https://ift.tt/cbU6WRi... [6] https://ift.tt/5LYptSW... [7] https://ift.tt/Nz8u4Wd... [8] https://ift.tt/S9MbFip... [9] https://ift.tt/ifJsl1U https://ift.tt/5QDbZwt September 27, 2023 at 08:47PM

World News

Labels Cloud

Hot News

Socialize

Page Nav

Breaking News

News

Sports

Grid

Menu Footer Widget

Featured

Social Plugin

Videos

Text Widget

Populars

Trending Posts Display

Home Layout Display

Contact Form

Contact Us

Ticker

Latest News

Labels

Ad Code

Like Us

Latest

Brexit

Football

America

Total Pageviews

Home Top Ad

Archive

Post Top Ad

Post Bottom Ad

728x90 AdSpace

Slider

Subscribe Us

Ads Place

Ad Space

Footer Menu

Connect WIth Us

Sports News

Games

Category

Sports

Trends

About Us

News By Picture

Politics

Travel

Tech

Music

Games

Ads Place

Iklan Atas Artikel

Social

Pages

Iklan Tengah Artikel 1

Content Marketing

Iklan Tengah Artikel 2

Privacy Policy

Iklan Bawah Artikel

Fashion & Lifestyle

Popular

Show HN: Using LLMs and Embeddings to classify application errors https://ift.tt/RkBSzoW

0 Comments: