Show HN: LLMs can generate valid JSON 100% of the time https://ift.tt/FS6jJcd

Show HN: LLMs can generate valid JSON 100% of the time Outlines is a Python library that focuses on text generation with large language models. Brandon and I are not LLM experts and started the project a few months ago because we wanted to understand better how the generation process works. Our original background is probabilistic, relational and symbolic programming. Recently we came up with a fast way to generate text that matches a regex ( https://ift.tt/qQKtZlL... ). The basic idea is simple: regular expressions have an equivalent Deterministic-Finite Automaton (DFA) representation. We can transform this DFA into a generative model: in each state we get a list of symbols which correspond to completions that partially match the regular expression. We mask the other symbols in the logits returned by a large language model, sample a new symbol and move to the next state. The subtelty is that language models work with tokens, not symbols, so we derive a new FSM whose alphabet is the model's vocabulary. We can do this in only one pass over the vocabulary. Generating the token masks thus only requires a dictionary lookup at each state. Our method blows other libraries like Microsoft's guidance out of the water. From there it was only a small leap to be able to generate text that follows a JSON schema ( https://ift.tt/i3qZpeO ), or is parseable into a Pydantic model ( https://ift.tt/ExNciWs ). The method works with union types, optional types, nested schemas, arrays, everything. It is guaranteed that the output is parseable. I think it's cool, and I've spent a lot of time watching even tiny models output valid JSON over the weekend. Hope you will too. I look forward to feedback, bug reports, feature requests and discussions! Edit: Link to our pre-print explaining the method and how this can be extended to generate text that follows a Context-Free Grammar https://ift.tt/ZdmslqH https://ift.tt/SyBvN7P August 15, 2023 at 12:22AM

World News

Labels Cloud

Hot News

Socialize

Page Nav

Breaking News

News

Sports

Grid

Menu Footer Widget

Featured

Social Plugin

Videos

Text Widget

Populars

Trending Posts Display

Home Layout Display

Contact Form

Contact Us

Ticker

Latest News

Labels

Ad Code

Like Us

Latest

Brexit

Football

America

Total Pageviews

Home Top Ad

Archive

Post Top Ad

Post Bottom Ad

728x90 AdSpace

Slider

Subscribe Us

Ads Place

Ad Space

Footer Menu

Connect WIth Us

Sports News

Games

Category

Sports

Trends

About Us

News By Picture

Politics

Travel

Tech

Music

Games

Ads Place

Iklan Atas Artikel

Social

Pages

Iklan Tengah Artikel 1

Content Marketing

Iklan Tengah Artikel 2

Privacy Policy

Iklan Bawah Artikel

Fashion & Lifestyle

Popular

Show HN: LLMs can generate valid JSON 100% of the time https://ift.tt/FS6jJcd

0 Comments: