TechEcho

4 comments

The paper is premised on a dichotomy that doesn't hold in real database systems: specifically, the assertion that "static" shared-nothing architecture performance substantially degrades under skew, which provides a foil for shared-disk architectures whose performance does not. While this is true for many shared-nothing architectures, particularly in open source, robustly skew-tolerant shared-nothing architectures have existed for at least a decade -- highly dynamic shared-nothing architectures are valid (and quite good) designs.

A skew-tolerant shared-nothing database has internals that look a bit like "AnyDB" to the extent that execution of a type of workload is largely disconnected from the physical architecture -- the storage engines are often identical, for example. This allows you to schedule any mixture of fast-twitch operations concurrently with slow analytic queries. The original motivation for these types of architectures was complex mixed workloads. What is missing from the AnyDB shared-nothing architecture to make it skew-tolerant is a mechanism for continuously and smoothly shedding both data and load across cores/machines while maintaining consistency. There are multiple options here; you just need to pick one that makes sense and plays nicely with the concurrency control model.

Similarly, the synchronization-free streaming concurrency control model described in the paper is a standard design idiom. Most "thread-per-core" style database architectures do something like this -- it is one of the major advantages of being thread-per-core.

The database engineering industry has a long history of not publishing design advances, and this is illustrative of that. If someone asked me to point to a paper that describes all this, I'd have a difficult time thinking of one. That isn't where this knowledge tends to be stored.
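To make the "shedding load across cores" point a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `core_loads`, `shed_once`, and `rebalance`, the greedy policy of moving one partition from the hottest core to the coldest, and the idea of summarizing load as a single number per partition. It is not the mechanism of AnyDB or of any particular database, and it deliberately ignores the hard part mentioned above, which is keeping data consistent while a partition moves.

```python
def core_loads(owner, load, cores):
    """Sum per-partition load by owning core (every partition has one owner)."""
    totals = {core: 0 for core in cores}
    for partition, core in owner.items():
        totals[core] += load[partition]
    return totals


def shed_once(owner, load, cores):
    """Move one partition from the hottest core to the coldest, if that helps.

    Returns the moved partition id, or None when no move reduces the imbalance.
    This greedy rule is a hypothetical policy chosen for brevity.
    """
    totals = core_loads(owner, load, cores)
    hot = max(cores, key=lambda c: totals[c])
    cold = min(cores, key=lambda c: totals[c])
    gap = totals[hot] - totals[cold]
    # Only a partition lighter than the gap narrows the difference when moved.
    candidates = [p for p, c in owner.items() if c == hot and 0 < load[p] < gap]
    if not candidates:
        return None
    # Prefer the partition whose load is closest to half the gap.
    best = min(candidates, key=lambda p: abs(load[p] - gap / 2))
    owner[best] = cold  # ownership handoff; a real system must do this consistently
    return best


def rebalance(owner, load, cores, max_moves=100):
    """Repeat shed_once until nothing more can be shed or max_moves is reached."""
    moves = []
    for _ in range(max_moves):
        moved = shed_once(owner, load, cores)
        if moved is None:
            break
        moves.append(moved)
    return moves
```

For example, with `owner = {"a": 0, "b": 0, "c": 0, "d": 1}` and `load = {"a": 50, "b": 25, "c": 15, "d": 10}`, calling `rebalance(owner, load, [0, 1])` moves partitions between the two cores until both carry a load of 50. A production design would shed continuously in small steps and coordinate the handoff with the concurrency control model, which this sketch does not attempt.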

the_duke over 4 years ago

The juicy part of the paper:

> The main idea of an architecture-less database system is that it is composed of a single generic type of component where multiple instances of those components “act together” in an optimal manner on a per-query basis. To instrument generic components at run-time and coordinate the overall DBMS execution, each component consumes two streams: an event and a data stream. While the event stream encodes the operations to be executed, the data stream shuffles the state required by these events to the executing component.

> Through this instrumentation of generic components by event and data streams, a component can act as a query optimizer at one moment for one query but for the next as a worker executing a filter or join operator.

Doesn't sound too different from current distributed DBMS, which already specialize query execution and distribute workload between cores/nodes in a similar manner, but taken to the next level to more easily enable things like different data persistence models or FPGA integration.

Seems challenging to implement without losing significant performance to the abstraction layer.
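The quoted idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's implementation: the `Event` and `Component` classes, the `HANDLERS` table, and the `plan` and `filter` operations are hypothetical names invented here, and both streams are modeled as in-process data structures rather than network channels.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any


@dataclass
class Event:
    """Hypothetical event: an operation name plus the key of the state it needs."""
    op: str
    data_key: str
    arg: Any = None


def plan(query, _arg):
    """Optimizer role: turn a toy query description into a follow-up filter event."""
    return Event(op="filter", data_key=query["table"], arg=query["min_value"])


def filter_rows(rows, min_value):
    """Worker role: keep the values that are at least min_value."""
    return [value for value in rows if value >= min_value]


# The component itself is generic; the event decides which role it plays.
HANDLERS = {"plan": plan, "filter": filter_rows}


class Component:
    def __init__(self):
        self.events = deque()  # event stream: what to execute
        self.data = {}         # data stream: state shipped to this component

    def deliver(self, key, payload):
        self.data[key] = payload

    def submit(self, event):
        self.events.append(event)

    def step(self):
        """Consume one event and run it against the state it refers to."""
        event = self.events.popleft()
        payload = self.data[event.data_key]
        return HANDLERS[event.op](payload, event.arg)


def run_demo():
    component = Component()
    component.deliver("q1", {"table": "orders", "min_value": 10})
    component.deliver("orders", [3, 12, 7, 25])
    component.submit(Event(op="plan", data_key="q1"))
    follow_up = component.step()   # the component acts as an optimizer
    component.submit(follow_up)
    return component.step()        # the same component acts as a filter worker
```

Calling `run_demo()` drives the same `Component` instance through both roles in turn. The dictionary lookup and dynamic dispatch in `step` are also a small-scale picture of the abstraction-layer overhead the comment worries about.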


gregw2 over 4 years ago

So I get that they call it "architecture-less" because it doesn't choose between "shared nothing" and "shared disk" architectures and thus can pivot from OLTP to OLAP.

But I have a different OLTP vs OLAP "architecture" question: is it row-based or columnar? Is it "architectureless" in that regard also? Are they going to store data persisted both ways so you get the worst of both worlds performance-wise, or is there still an OLAP vs OLTP architecture choice there?

I suspect there are still some architectural choices here!
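For readers less familiar with the row-based vs columnar distinction, a minimal Python sketch follows. It is illustrative only: the `ORDERS` table and the helper names `to_columns`, `lookup_row`, and `sum_column` are made up here, in-memory lists stand in for on-disk layouts, and `to_columns` assumes every row has the same fields. Nothing in it reflects how AnyDB actually stores data.

```python
# Row layout: each record keeps all of its fields together.
ORDERS = [
    {"id": 1, "region": "eu", "amount": 10},
    {"id": 2, "region": "us", "amount": 25},
    {"id": 3, "region": "eu", "amount": 7},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per field (columnar layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def lookup_row(rows, row_id):
    """OLTP-style point read: one record, all of its fields at once."""
    for row in rows:
        if row["id"] == row_id:
            return row
    return None


def sum_column(columns, name):
    """OLAP-style aggregate: scan a single column and ignore the others."""
    return sum(columns[name])
```

`lookup_row(ORDERS, 2)` touches one record and returns every field, which suits transactional access, while `sum_column(to_columns(ORDERS), "amount")` reads only the `amount` values, which suits analytic scans. Keeping both layouts up to date means every write is applied twice, which is the cost the comment is pointing at.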


tabtab over 4 years ago

I'd like to see Dynamic Relational implemented to have something in-between "architecture-less" and familiar RDBMS conventions. It's close enough to existing RDBMS to reduce the learning curve, and can be incrementally "locked down" via added constraints and types (via formatting constraints). It tweaks the RDBMS/SQL wheel just enough to get dynamism, NOT reinvent the wheel.

It would be great for prototyping and rush-jobs. For example, a lot of orgs needed to get Covid-related employee illness, notification tracking, and/or vaccination tracking up and running ASAP. Something like DR would make it easy.

https://stackoverflow.com/questions/66385/dynamic-database-schema#46202802

AnyDB: An Architecture-Less DBMS for Any Workload
