Show HN: Open-source template for end-to-end streaming analytics

To help my future self, I decided to build a repository from which I can quickly deploy an end-to-end modern analytics pipeline, from ingestion to fast analytics and business dashboards, including data exploration, time-series forecasting, and monitoring of the stack. Of course, all the components are open source, and you can use this template as a stepping stone for your own near-real-time streaming analytics.

What's the inspiration?

I've been working with streaming analytics for a long time. I've done not-too-stale analytics with an incremental RDBMS query and a spreadsheet, moved through micro-batch, looks-almost-like-real-time lambda analytics, and worked with near-real-time analytics from kappa onwards. The range and features of tools today are far better than what we had 15 years ago. What remains constant is the requirement for fresh data and for more advanced analytics. This means that you cannot really build a reliable data pipeline for near-real-time analytics at scale using a single component, and every time you start a new project you waste a lot of time just integrating the different moving parts.

When the stack in the repository starts, the pipeline collects public events from the GitHub API, sends them to a message broker (Apache Kafka), persists them into a fast time-series database (QuestDB), and visualizes them on a dashboard (Grafana). It also provides a web-based development environment (Jupyter Notebook) for data science and machine learning. Monitoring metrics are captured by a server agent (Telegraf) and stored back into the same time-series database (QuestDB).

Hopefully others in the community find this useful!

https://ift.tt/E25fmYP

February 9, 2024 at 01:22AM
Show HN: Open-source template for end-to-end streaming analytics https://ift.tt/zh32AxT