May 8

MIT paper says “Databases-in-Virtual Machines offer limited scaling”

MIT stated in a recent research paper that many of the current Database-as-a-Service (DaaS) offerings on the market offer limited scaling, mainly because they rely on virtual machines for server consolidation and lack support for scaling beyond a single node.

Further, they offer no viable solutions for data privacy, and in particular none for processing queries over encrypted data.

As a result, they are building and testing a Relational Cloud, or more accurately a highly secure Database-as-a-Service (DaaS), which is meant to go beyond existing services such as Amazon RDS and Microsoft SQL Azure by overcoming limitations relating to efficient multi-tenancy, elastic scalability and database privacy.

MIT's results were partly obtained from load statistics for about 200 servers in three data centers hosting the production database servers of Wikia.com, Wikipedia, and Second Life, and from load statistics for a cluster of machines providing shared services at MIT CSAIL.

MIT said their Relational Cloud would benefit either a single organization with many individual databases deployed in a private cloud, or multiple organizations using it as a service offered via a public cloud infrastructure.

Typically, a DaaS allows a developer to outsource all the complexity, including hardware and software tuning, to the remote web host while still tapping into the relational (RDBMS) model and related SQL feature sets.

In terms of efficient multi-tenancy, MIT stated that the typical DB-in-VM approach requires 2x to 3x more machines to consolidate the same workloads and, all things being equal, delivers on average 9x less performance than the rival approach of running a single database server on each machine, with that server hosting multiple logical databases.

“The Relational Cloud periodically determines which databases should be placed on which machines using a non-linear optimization formulation, combined with a cost model that estimates the combined resource utilization of multiple databases running on a machine,” said MIT.
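
To make the placement problem more concrete, here is a minimal Python sketch. It is not MIT's method: it swaps the non-linear optimization for a simple first-fit-decreasing heuristic, and it assumes a purely additive cost model in which the load of co-located databases is just the sum of their individual loads. The names (`Machine`, `place`), the resource fields and the capacity units (percent of one machine) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """One database server with fixed CPU and RAM capacity (in percent)."""
    cpu_capacity: int
    ram_capacity: int
    databases: list = field(default_factory=list)

    def load(self):
        # Assumed cost model: combined utilization is the plain sum of the
        # per-database estimates. A real cost model would be non-linear.
        cpu = sum(db["cpu"] for db in self.databases)
        ram = sum(db["ram"] for db in self.databases)
        return cpu, ram

    def fits(self, db):
        cpu, ram = self.load()
        return (cpu + db["cpu"] <= self.cpu_capacity
                and ram + db["ram"] <= self.ram_capacity)


def place(databases, cpu_capacity=100, ram_capacity=100):
    """First-fit-decreasing placement: biggest databases first, each one on
    the first machine that still has room, opening a new machine if none does."""
    machines = []
    by_size = sorted(databases, key=lambda db: max(db["cpu"], db["ram"]),
                     reverse=True)
    for db in by_size:
        target = next((m for m in machines if m.fits(db)), None)
        if target is None:
            target = Machine(cpu_capacity, ram_capacity)
            if not target.fits(db):
                raise ValueError(f"{db['name']} does not fit on an empty machine")
            machines.append(target)
        target.databases.append(db)
    return machines


if __name__ == "__main__":
    workload = [
        {"name": "a", "cpu": 60, "ram": 30},
        {"name": "b", "cpu": 50, "ram": 50},
        {"name": "c", "cpu": 40, "ram": 20},
        {"name": "d", "cpu": 30, "ram": 40},
    ]
    for i, machine in enumerate(place(workload)):
        print(i, [db["name"] for db in machine.databases], machine.load())
```

Re-running a placement like this periodically on fresh load statistics would correspond to the "periodically determines" part of the quote; the heuristic itself makes no claim to optimality.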

Further, their Relational Cloud includes a lightweight mechanism to perform live migration of databases between machines.

In order to allow elastic scaling over multiple machines, MIT will use a workload-aware partitioner that relies on something called ‘graph partitioning’ to map and coordinate workloads across several nodes.
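
The idea behind workload-aware graph partitioning can be sketched in a few lines of Python. This is an illustrative toy, not MIT's partitioner: it assumes each transaction is just a list of the data items it touches, builds a graph whose edge weights count how often two items are used together, and then finds the balanced two-way split with the smallest cut by brute force. Brute force only works for a handful of items; real partitioners use heuristics. All function names here are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_graph(transactions):
    """Edge weight = number of transactions that touch both items."""
    edges = Counter()
    for txn in transactions:
        for a, b in combinations(sorted(set(txn)), 2):
            edges[(a, b)] += 1
    return edges


def cut_weight(edges, part_a):
    """Total weight of edges whose endpoints land on different nodes.
    A high value means many transactions would span both machines."""
    return sum(weight for (a, b), weight in edges.items()
               if (a in part_a) != (b in part_a))


def best_two_way_partition(transactions):
    """Try every balanced two-way split and keep the cheapest cut."""
    edges = build_graph(transactions)
    items = sorted({item for txn in transactions for item in txn})
    half = len(items) // 2
    best = None
    for candidate in combinations(items, half):
        part_a = set(candidate)
        cost = cut_weight(edges, part_a)
        if best is None or cost < best[0]:
            best = (cost, part_a, set(items) - part_a)
    return best


if __name__ == "__main__":
    trace = [
        ("u1", "u2"), ("u1", "u2"),   # u1 and u2 are usually used together
        ("u3", "u4"), ("u3", "u4"),   # so are u3 and u4
        ("u2", "u3"),                 # one transaction links the two groups
    ]
    cost, node_a, node_b = best_two_way_partition(trace)
    print(cost, sorted(node_a), sorted(node_b))
```

In this toy trace, keeping the items that are used together on the same node leaves only the single linking transaction crossing the cut, which is the effect a workload-aware partitioner is after.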

One of the most interesting aspects of their research was finding ways to execute SQL queries over encrypted data, a strategy designed to ensure privacy and raise “trust” in their new system.

What they came up with is something called CryptDB, which provides what MIT calls adjustable security and prevents over-curious administrators from seeing user data. There is some hit to performance, between 20% and 30%, depending on which paper or blog post you read. However, MIT labels this loss as ‘acceptable’.

CryptDB employs different encryption levels for different types of data, based on the types of queries that users run. Queries are evaluated on the encrypted data, and the results are sent back to the client for final decryption; no query processing runs on the client.
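
How a server can answer a query without ever seeing the plaintext is easiest to see with equality lookups. The Python sketch below is a toy under stated assumptions, not CryptDB's actual scheme: it stores each value as a randomized ciphertext plus a deterministic keyed tag (an HMAC), so the server can match tags for `WHERE column = value` style queries while only the client holds the keys. The cipher is a simple HMAC-based keystream written for illustration only and should not be used to protect real data. All class and function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def _keystream(key, nonce, length):
    """Toy keystream: HMAC-SHA256 over nonce plus a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key, plaintext):
    data = plaintext.encode()
    nonce = secrets.token_bytes(16)  # fresh nonce, so equal values look different
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode()


def equality_tag(key, plaintext):
    """Deterministic tag: equal plaintexts give equal tags under one key."""
    return hmac.new(key, plaintext.encode(), hashlib.sha256).digest()


class UntrustedServer:
    """Holds only tags and ciphertexts; it never sees keys or plaintext."""

    def __init__(self):
        self.rows = []

    def insert(self, tag, blob):
        self.rows.append((tag, blob))

    def select_where_equals(self, tag):
        return [blob for t, blob in self.rows if hmac.compare_digest(t, tag)]


class Client:
    """Holds the keys; encrypts on the way in, decrypts results on the way out."""

    def __init__(self, server):
        self.server = server
        self.enc_key = secrets.token_bytes(32)
        self.tag_key = secrets.token_bytes(32)

    def insert(self, value):
        self.server.insert(equality_tag(self.tag_key, value),
                           encrypt(self.enc_key, value))

    def find(self, value):
        blobs = self.server.select_where_equals(equality_tag(self.tag_key, value))
        return [decrypt(self.enc_key, blob) for blob in blobs]


if __name__ == "__main__":
    server = UntrustedServer()
    client = Client(server)
    for name in ("alice", "bob", "alice"):
        client.insert(name)
    print(client.find("alice"))
```

The trade-off this illustrates is the one the article describes: a deterministic tag reveals which rows are equal, which is weaker than fully randomized encryption, so exposing it only for data that actually needs equality queries is one way to read "different encryption levels for different types of data".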

MIT confirmed the Relational Cloud includes a transaction coordinator, which supports both MySQL and PostgreSQL back-ends and will sport a JDBC public interface.

MIT intends to release the above as a public cloud once they have completed testing. Related documentation indicates the work above, especially CryptDB, is being supported by Google and the NSF.

This article was brought to you by VI.net. For dedicated server hosting, cloud servers and 24/7 support, visit our site at www.vi.net
