Show HN: I hacked LLMs to work like scikit-learn https://ift.tt/uDO6mC5

Integrating LLMs into existing pipelines is often bloated, complex, and slow. That's why I created FlashLearn, a streamlined library that mirrors the user experience of scikit-learn. It follows a pipeline-like structure that lets you "fit" (learn) skills from sample data or instructions, then "predict" (apply) those skills to new data and get structured results back.

High-Level Concept Flow:

    Your Data --> Load Skill / Learn Skill --> Create Tasks --> Run Tasks --> Structured Results --> Downstream Steps

Installation:

    pip install flashlearn

Learning a New "Skill" from Sample Data

Just like the fit/predict pattern in scikit-learn, you can quickly "learn" a custom skill from minimal (or no!) data. Here's an example where we create a skill that evaluates how likely a user is to purchase a product, based on their comments:

    from flashlearn.skills.learn_skill import LearnSkill
    from flashlearn.client import OpenAI

    # Instantiate your pipeline "estimator" or "transformer", similar to a scikit-learn model
    learner = LearnSkill(model_name="gpt-4o-mini", client=OpenAI())

    data = [
        {"comment_text": "I love this product, it's everything I wanted!"},
        {"comment_text": "Not impressed... wouldn't consider buying this."},
        # ...
    ]

    # Provide instructions and sample data for the new skill
    skill = learner.learn_skill(
        data,
        task=(
            "Evaluate how likely the user is to buy my product based on the sentiment in their comment, "
            "return an integer 1-100 on key 'likely_to_buy', "
            "and a short explanation on key 'reason'."
        ),
    )

    # Save skill to use in pipelines
    skill.save("evaluate_buy_comments_skill.json")

Input Is a List of Dictionaries

Simply wrap each record in a dictionary, much like feature dictionaries in typical ML workflows:

    user_inputs = [
        {"comment_text": "I love this product, it's everything I wanted!"},
        {"comment_text": "Not impressed... wouldn't consider buying this."},
        # ...
    ]

Run in 3 Lines of Code - Concurrency Built-in up to 1000 calls/min

    # Suppose we previously saved a learned skill to "evaluate_buy_comments_skill.json".
    # Assumes GeneralSkill has already been imported from flashlearn.
    skill = GeneralSkill.load_skill("evaluate_buy_comments_skill.json")
    tasks = skill.create_tasks(user_inputs)
    results = skill.run_tasks_in_parallel(tasks)
    print(results)

Get Structured Results

Here's an example of the structured output. Each key is the index of the corresponding record in your original list:

    {
        "0": {
            "likely_to_buy": 90,
            "reason": "Comment shows strong enthusiasm and positive sentiment."
        },
        "1": {
            "likely_to_buy": 25,
            "reason": "Expressed disappointment and reluctance to purchase."
        }
    }

Pass on to the Next Steps

You can use each record's output for downstream tasks such as storing results in a database or filtering high-likelihood leads:

    # Suppose 'flash_results' is the dictionary with structured LLM outputs
    for idx, result in flash_results.items():
        desired_score = result["likely_to_buy"]
        reason_text = result["reason"]
        # Now do something with the score and reason, e.g., store in DB or pass to next step
        print(f"Comment #{idx} => Score: {desired_score}, Reason: {reason_text}")

https://ift.tt/91Ax5Xc
February 1, 2025 at 10:09PM
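
The "filtering high-likelihood leads" step mentioned under "Pass on to the Next Steps" could look roughly like the sketch below. It assumes the results arrive in the index-keyed shape shown in the example output. The select_leads helper and its threshold of 70 are hypothetical and are not part of FlashLearn; the sample dictionaries simply repeat the data from the post, so no LLM call is made.

```python
# Sample inputs and outputs copied from the post, so this runs without any LLM call.
user_inputs = [
    {"comment_text": "I love this product, it's everything I wanted!"},
    {"comment_text": "Not impressed... wouldn't consider buying this."},
]
flash_results = {
    "0": {"likely_to_buy": 90,
          "reason": "Comment shows strong enthusiasm and positive sentiment."},
    "1": {"likely_to_buy": 25,
          "reason": "Expressed disappointment and reluctance to purchase."},
}


def select_leads(inputs, results, threshold=70):
    """Hypothetical helper: join results back to inputs and keep high scores.

    `results` maps string indexes ("0", "1", ...) to dicts containing
    'likely_to_buy' and 'reason'. The threshold is an arbitrary example value.
    """
    leads = []
    for idx, result in results.items():
        score = result.get("likely_to_buy")
        # Skip records where the model did not return a usable integer score.
        if isinstance(score, bool) or not isinstance(score, int):
            continue
        if score >= threshold:
            record = dict(inputs[int(idx)])  # copy so the input list is untouched
            record["score"] = score
            record["reason"] = result.get("reason", "")
            leads.append(record)
    # Highest-scoring leads first.
    return sorted(leads, key=lambda r: r["score"], reverse=True)


leads = select_leads(user_inputs, flash_results)
for lead in leads:
    print(f"{lead['score']:>3}  {lead['comment_text']}")
```

Converting the string key with int(idx) is what ties each result back to its original record, which is why the order of the input list matters.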
