Show HN: Bitemporal, Binary JSON Database System and Event Store

I posted this project a couple of years ago and it gained some interest, but a lot has been done since then: performance work, a completely new JSON store, a REST API, refactored internals, an improved JSONiq-based query engine that supports updates and set-oriented join optimizations, a (by now dated) web UI, a new Kotlin-based CLI, and Python and TypeScript clients to ease the use of Sirix. The first prototypes of a precursor date back to 2005.

So, what is it all about? The system uses ideas from ZFS (a keyed index trie, storing checksums in parent pages...) and Git (a persistent index structure that shares unchanged pages between revisions), but appends a new tree root on each commit [1][2].

It is a JSON DBS that stores fine-grained JSON nodes. Thus, there's almost no limit to the structure and size of an object: objects can be arbitrarily nested, and updates are cheap.

On a high level, it supports space-efficient snapshots, tracking changes by author with optional commit messages, time travel queries, reverting to previous revisions (while all revisions in between still exist for audits), and retrieving the changes of whole (sub)trees.

On the one hand, it is thus a bitemporal DBS; on the other hand, it can be used as a simple event store. It stores the state after an event or a change occurs and tracks the changes. An entity, a node in the JSON structure, can be updated to new values and eventually be removed while its history remains easily retrievable, and we can easily revert to a previous state. The system assigns a unique ID to each new node, which never changes and is never reused (even after the node is deleted). Thus, the system stores both the state after the change/event and the change event itself.
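To make the idea of appended roots and shared pages more concrete, here is a minimal toy sketch in Python. It is not Sirix's API or page layout: the names `ToyStore`, `Revision`, and `PAGE_SIZE` are made up for illustration, and a flat tuple of leaf pages stands in for the keyed trie. It only shows that each commit appends a new root, that unchanged pages are shared between revisions, that node IDs are never reused, and that older revisions stay readable.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

PAGE_SIZE = 4  # nodes per leaf page (hypothetical, tiny on purpose)
Page = Tuple[Optional[str], ...]


@dataclass(frozen=True)
class Revision:
    number: int
    author: str
    message: str
    pages: Tuple[Page, ...]  # the "root": references to immutable leaf pages


class ToyStore:
    """Append-only, copy-on-write toy store: every commit appends a new root."""

    def __init__(self) -> None:
        self.revisions: List[Revision] = [Revision(0, "system", "empty", ())]
        self.next_node_id = 0  # IDs only ever grow, so they are never reused

    def commit(self, author, message, inserts=(), updates=None, deletes=()):
        old = self.revisions[-1].pages
        changed: Dict[int, list] = {}  # page index -> private mutable copy

        def slot(node_id: int):
            if not 0 <= node_id < self.next_node_id:
                raise KeyError(node_id)
            idx = node_id // PAGE_SIZE
            if idx not in changed:  # copy a page only when it is first touched
                base = old[idx] if idx < len(old) else (None,) * PAGE_SIZE
                changed[idx] = list(base)
            return changed[idx], node_id % PAGE_SIZE

        new_ids = []
        for value in inserts:
            node_id = self.next_node_id
            self.next_node_id += 1
            page, pos = slot(node_id)
            page[pos] = value
            new_ids.append(node_id)
        for node_id, value in (updates or {}).items():
            page, pos = slot(node_id)
            page[pos] = value
        for node_id in deletes:
            page, pos = slot(node_id)
            page[pos] = None

        n_pages = max(len(old), max(changed, default=-1) + 1)
        # Untouched pages are reused by reference, not copied.
        pages = tuple(
            tuple(changed[i]) if i in changed else old[i] for i in range(n_pages)
        )
        self.revisions.append(Revision(len(self.revisions), author, message, pages))
        return new_ids

    def read(self, revision: int, node_id: int) -> Optional[str]:
        """Time-travel read: look a node up in any past revision."""
        pages = self.revisions[revision].pages
        idx = node_id // PAGE_SIZE
        return pages[idx][node_id % PAGE_SIZE] if idx < len(pages) else None

    def diff(self, rev_a: int, rev_b: int):
        """List (node_id, old, new) changes, skipping pages both revisions share."""
        a, b = self.revisions[rev_a].pages, self.revisions[rev_b].pages
        empty = (None,) * PAGE_SIZE
        changes = []
        for idx in range(max(len(a), len(b))):
            pa = a[idx] if idx < len(a) else empty
            pb = b[idx] if idx < len(b) else empty
            if pa is pb:
                continue
            for pos, (x, y) in enumerate(zip(pa, pb)):
                if x != y:
                    changes.append((idx * PAGE_SIZE + pos, x, y))
        return changes


store = ToyStore()
ids = store.commit("alice", "initial import", inserts=["a", "b", "c", "d", "e"])
store.commit("bob", "update node 4", updates={4: "E"})
new_ids = store.commit("carol", "delete node 4, add a node", deletes=[4], inserts=["f"])
```

In this sketch, revision 2 copies only the page holding node 4, while the page holding nodes 0 to 3 is the very same object in revisions 1 and 2. Reading node 4 at revision 1 still returns the old value, and the node inserted after the deletion gets ID 5 rather than the freed ID 4.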
The leaf pages of the index structures are not simply copied during a write. Instead, a sliding window algorithm is applied, such that only modified nodes and nodes that fall out of the sliding window have to be written. The window length is configurable. This avoids the write peaks that periodic full snapshots would cause, as well as having to read long chains of incremental changes in between. Thus, the system is best suited for fast flash drives with fast random reads and sequential writes. Data is never overwritten, so audit trails come for free.

The system also does not need a WAL (which is basically a second data store): the root index page is switched atomically, and only a single read/write transaction (txn) is permitted at a time, running in parallel to N read-only txns, each of which is bound to a specific revision when it starts. Reads do not involve any locks. [2]

A path summary, an unordered set of all paths to leaf nodes in the tree, is built and enables various optimizations. Furthermore, a rolling hash can optionally be built, in which case all ancestor node hashes are adapted during inserts.

A dated Jupyter notebook with some examples can be found in [3], and the overall documentation in [4].

The query engine, Brackit [5], is retargetable (a couple of interfaces and rewrite rules have to be implemented for a DB system). In particular, it finds implicit joins and, thanks to set-oriented processing of the operators, applies known algorithms from the relational DB world to optimize joins and aggregate functions. [6]

I've given an interview in [7], but I'm usually very nervous, so don't judge too harshly.

Give it a try, and happy coding!

Kind regards
Johannes

[1] https://sirix.io | https://ift.tt/VXiKIQY
[2] https://ift.tt/9VX3DKM
[3] https://ift.tt/9cdwI1O
[4] https://sirix.io/docs/
[5] http://brackit.io
[6] https://ift.tt/WYfbUsD
[7] https://youtu.be/Ee-5ruydgqo?si=Ift73d49w84RJWb2

November 13, 2023 at 11:21PM
Show HN: Bitemporal, Binary JSON Database System and Event Store https://ift.tt/Sy6Ro1U