Show HN: Solving NYT Connections with ChatGPT

Just for fun, I decided to see if I could use ChatGPT to solve NYT Connections word puzzles. It uses a pretty straightforward breadth-first search (BFS): the LLM is first prompted to generate several possible groupings of four related words, and then a different prompt is used to evaluate the soundness of each of those groupings. This approach seems to produce the correct solution somewhat less than half the time.

Some observations:

* For whatever reason, ChatGPT-4 seems to be a bit worse than 3.5 at generating Connections groupings. I haven't tested systematically, so maybe this is just small-sample-size bias. But at the very least it isn't obviously better.
* It really struggles with the "words that can fill in the blank" style groups. Often it will come up with the right category (e.g. "words that can precede `cheese`") but will only be able to identify 2 of the 4 words in that grouping.
* It frequently generates very vague categories ("words that can be nouns") despite nothing like that appearing in the proposal prompt. It will also still sometimes score them highly, despite there being several explicit examples in the value prompt disallowing these types of categories.

If you have any ideas for how to improve this, please let me know (or send a PR)!

https://ift.tt/ns9q0kx

December 6, 2023 at 01:41AM
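The propose-then-evaluate search described in the post could look roughly like the sketch below. This is not the author's code: `solve_connections`, `fake_propose`, `fake_score`, the `min_score` threshold and the `max_frontier` cap are all hypothetical names and choices made for illustration, and the two LLM prompts are replaced by deterministic stubs so the example runs without any API access. The toy puzzle is invented as well.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Sequence, Tuple

Group = FrozenSet[str]
State = Tuple[float, Tuple[Group, ...]]  # (cumulative score, groups chosen so far)


def solve_connections(
    words: Sequence[str],
    propose: Callable[[FrozenSet[str]], Iterable[Iterable[str]]],
    score: Callable[[Group], float],
    min_score: float = 0.5,   # assumption: groupings scored below this are dropped
    max_frontier: int = 5,    # assumption: cap on partial solutions kept per level
    group_size: int = 4,
) -> Optional[List[Group]]:
    """Level-by-level search: each level adds one more group to every partial solution."""
    all_words = frozenset(words)
    frontier: List[State] = [(0.0, ())]
    for _ in range(len(all_words) // group_size):
        next_frontier: Dict[FrozenSet[Group], State] = {}
        for total, groups in frontier:
            remaining = all_words - frozenset().union(*groups)
            for candidate in propose(remaining):      # stands in for the proposal prompt
                group = frozenset(candidate)
                if len(group) != group_size or not group <= remaining:
                    continue                          # discard malformed proposals
                value = score(group)                  # stands in for the value prompt
                if value < min_score:
                    continue
                new_groups = groups + (group,)
                key = frozenset(new_groups)           # same groups in another order count once
                if key not in next_frontier or total + value > next_frontier[key][0]:
                    next_frontier[key] = (total + value, new_groups)
        frontier = sorted(next_frontier.values(), key=lambda s: s[0], reverse=True)[:max_frontier]
        if not frontier:
            return None                               # every branch was rejected
    return list(frontier[0][1])


# Invented toy puzzle, used only to exercise the search.
ANSWER = [
    frozenset({"BASS", "TROUT", "SALMON", "PIKE"}),
    frozenset({"RED", "BLUE", "GREEN", "PINK"}),
    frozenset({"OAK", "ELM", "MAPLE", "BIRCH"}),
    frozenset({"IRON", "GOLD", "ZINC", "LEAD"}),
]
WORDS = [w for g in ANSWER for w in sorted(g)]


def fake_propose(remaining: FrozenSet[str]) -> List[List[str]]:
    """Stub proposer: the true groups still available, plus one alphabetical decoy."""
    return [sorted(g) for g in ANSWER if g <= remaining] + [sorted(remaining)[:4]]


def fake_score(group: Group) -> float:
    """Stub evaluator: high score for a true group, low score otherwise."""
    return 1.0 if group in ANSWER else 0.1


solution = solve_connections(WORDS, fake_propose, fake_score)
print(solution)
```

In a real run, `propose` and `score` would wrap the two prompts. Capping the frontier makes this closer to a beam search than a pure BFS, which is a deliberate simplification here to keep the number of evaluation calls bounded.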
Show HN: Solving NYT Connections with ChatGPT https://ift.tt/BW4oIA7