Big Data(bases): Making Sense of OldSQL vs NoSQL vs NewSQL

A few months ago, I had the great pleasure of meeting and discussing big data with Michael Stonebraker, a legendary computer scientist at MIT who specializes in database systems and is considered to be the forefather of big data. Stonebraker developed INGRES, which helped pioneer the use of relational databases, and has formed nine companies related to database technologies.

Until recently, the choice of a database architecture was largely a non-issue. Relational databases were the de facto standard, and the main choices were Oracle, SQL Server or an open source database like MySQL. But with the advent of big data, scalability and performance issues with relational databases became commonplace. For online processing, NoSQL databases have emerged as a solution to these problems. NoSQL is a catch-all for different kinds of database architectures: key-value stores, document databases, column family databases and graph databases. Each has its own relative advantages and disadvantages. However, in order to get scalability and performance, NoSQL databases give up “queryability” (i.e. you can no longer use SQL) and ACID transactions.
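
To make that trade-off concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: a plain dict stands in for a key-value store, the built-in sqlite3 module stands in for a SQL database, and the accounts table and the transfer helper are hypothetical names made up for this example. None of it reflects how any particular NoSQL or NewSQL product is implemented.

```python
import sqlite3

# Key-value style (a dict as a stand-in): lookups by key are trivial,
# but any other question has to be answered by scanning in application code.
kv_store = {
    "user:1": {"name": "Ada", "balance": 100},
    "user:2": {"name": "Bob", "balance": 50},
}
rich_kv = sorted(v["name"] for v in kv_store.values() if v["balance"] >= 100)

# Relational style: the same question is a declarative SQL query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "Ada", 100), (2, "Bob", 50)],
)
conn.commit()
rich_sql = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM accounts WHERE balance >= 100 ORDER BY name"
    )
]


def transfer(conn, src, dst, amount):
    """Hypothetical helper: move `amount` from src to dst as one transaction."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every statement in it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit
            # above is undone as well (atomicity).
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Calling transfer(conn, 2, 1, 500) fails on the debit and leaves both balances untouched, which is the kind of all-or-nothing guarantee that a store without ACID transactions leaves to the application.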

More recently, a new type of database has emerged that offers high performance and scalability without giving up SQL and ACID transactions. This class of database is called NewSQL, a term closely associated with Stonebraker. He provides an excellent overview of OldSQL vs NoSQL vs NewSQL in this video.

Some key points from the video:

  • SQL is good.
  • Traditional databases are slow not because SQL is slow, but because of their architecture and the fact that they are running code that is 30 years old.
  • NewSQL provides performance and scalability while preserving SQL and ACID transactions by using a new architecture that drastically reduces overhead.

In the video, Stonebraker talks about VoltDB, an open source NewSQL database that comes from a company of the same name founded by him. Some of the performance figures he cites for VoltDB are pretty amazing:

  • 3 million transactions per second on a “couple hundred cores”
  • 45x the performance of “a SQL vendor whose name has more than three letters and less than nine”
  • 5-6 times faster than Cassandra and the same speed as Memcached on key-value operations

VoltDB sounds like an extremely compelling alternative to NoSQL databases, and certainly warrants a look if you want to move from a traditional “OldSQL” database to one that is highly scalable and performant without losing SQL and ACID.