Show HN: Next-token prediction in JavaScript – build fast LLMs from scratch https://ift.tt/0GW13EA

What inspired this project today was watching this amazing video by 3Blue1Brown called "But what is a GPT?" on YouTube ( https://www.youtube.com/watch?v=wjZofJX0v4M - I highly recommend watching it). I added it to the repo for reference. When it clicked in my head that "knowing a fact" is nearly synonymous with predicting a word (or series of words), I wanted to put it to the test, because it seemed so simple.

I chose JavaScript because I can exploit the way it structures objects to aid in the modeling of language. For example: "I want to be at the beach", "I will do it later", "I want to know the answer", ... becomes:

{ I: { want: { to: { be: { ... }, know: { ... } } }, will: { ... } }, ... }

in JavaScript. You can exploit the language's fast object lookup speed to find known sentences this way, rather than recursively searching text - which is the convention and would take forever or not work at all, considering there are several full books loaded in by default (and it could support many more).

Accompanying research taught me what "tokens" and "embeddings" are, what is meant by "training", and most of the rest - though I'm still learning the jargon. I wrote a script to iterate over every single word of every single book to rank how likely it is that a given word will appear next, given a cursor, and extended that to rank entire phrases.

The base decoder started out as what I'll call "token-agnostic": it didn't care if you were looking for the next letter... word... pixel... it's the same logic. But actually it's not, and it soon evolved into a text (language) model. I do have plans to get into image generation next (next-pixel prediction) using this. Overall the concepts are similar, but there are differences, primarily around extraction and formatting.
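
The nested-object idea and the "rank what comes next" step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names createTrie, addSentence, predictNext, and the COUNT symbol are hypothetical, and it assumes whitespace tokenization and a cursor that starts at the beginning of a sentence, which is simpler than ranking from any position in a book.

```javascript
// Hypothetical sketch: none of these names come from the project's repo.
// Each node is a prototype-less object keyed by the next word, so words
// like "constructor" cannot collide with inherited properties.
const COUNT = Symbol("count"); // per-node occurrence counter, hidden from Object.keys

function createTrie() {
  return Object.create(null);
}

// Assumption: tokens are whitespace-separated words.
function tokenize(text) {
  return text.split(/\s+/).filter(Boolean);
}

// Walk the sentence word by word, creating child objects as needed and
// counting how often each path has been seen.
function addSentence(root, sentence) {
  let node = root;
  for (const word of tokenize(sentence)) {
    if (!(word in node)) node[word] = Object.create(null);
    node = node[word];
    node[COUNT] = (node[COUNT] || 0) + 1;
  }
}

// Follow the prefix through the nested objects (one property lookup per
// word), then rank the words seen after it by how often they occurred.
function predictNext(root, prefix) {
  let node = root;
  for (const word of tokenize(prefix)) {
    node = node[word];
    if (node === undefined) return []; // prefix never seen
  }
  return Object.keys(node)
    .map((word) => ({ word, count: node[word][COUNT] }))
    .sort((a, b) => b.count - a.count);
}

const trie = createTrie();
for (const s of [
  "I want to be at the beach",
  "I will do it later",
  "I want to know the answer",
]) {
  addSentence(trie, s);
}

console.log(predictNext(trie, "I"));
// → [ { word: 'want', count: 2 }, { word: 'will', count: 1 } ]
```

Looking up a prefix costs one object lookup per word instead of a scan over the whole text, which is the speed-up the post is pointing at. Dividing each count by the total for that node would turn the ranking into rough probabilities.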

Goals of the project:
- Demystify LLMs for people, show that it's just regular code that does normal stuff
- Actually make a pretty good LLM in JavaScript, with a version at least capable of running in a browser tab

https://ift.tt/9e1LUdk
April 11, 2024 at 02:57AM
