should you say yes to NoSQL? (1/2)

September 17, 2012 Charly Hamy, Cloud

NoSQL is generally read as “Not only SQL,” where SQL is the Structured Query Language.

Maybe you’ve already come across this imposing new term – half buzzword, half call to arms – which has popped up in blogs, presentations, and tweets for the past three years. This movement, which signals a revolution in data management and the start of the Big Data race, has already proved useful for large-scale projects by adding fully developed products to the software architect’s toolbox.

It’s especially well suited to fields like M2M (machine-to-machine), where the diversity and volume of collected data, along with the need for high availability and statistical analysis (often in real time), are quickly pushing traditional products to their limits.

So will NoSQL soon be a central part of your information system?

a brief history of databases

Built on principles established in the 1970s, relational database management systems (RDBMS) became the cornerstone of all large-scale software architecture. These systems store data in tables, with one row per entry. Relationships between entries are expressed through key fields: a primary key identifies each row, and a foreign key in one table refers to the primary key of a row in another.
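To make the table-and-key idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customer and account tables, their columns, and the sample values are assumptions invented for illustration; they do not come from any particular system.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical schema: each account row points at a customer row.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE account (
           id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(id),
           balance INTEGER NOT NULL
       )"""
)

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO account (id, customer_id, balance) VALUES (?, ?, ?)",
    [(10, 1, 500), (11, 1, 250)],
)

# The relationship is followed by joining on the key fields.
rows = conn.execute(
    """SELECT customer.name, account.id, account.balance
       FROM account JOIN customer ON customer.id = account.customer_id
       ORDER BY account.id"""
).fetchall()

# A row that refers to a customer that does not exist is refused.
try:
    conn.execute("INSERT INTO account (id, customer_id, balance) VALUES (12, 99, 0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The join returns one result row per account, each carrying the owning customer's name, and the database itself rejects the account that points at a missing customer.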

The common language supported by the vast majority of RDBMS is Structured Query Language (SQL), which was standardized in 1986. SQL gave these products a shared set of features and a shared way of expressing queries. It became so popular that it is now fairly common to develop applications that query a relational database without knowing whether Oracle, SQL Server, PostgreSQL, or MySQL will sit behind them.

SQL and relational databases eventually emerged as indispensable. Their use for shared or long-term data storage became automatic and was rarely challenged... until recently.

new needs

The logic behind relational databases is particularly suited to storing, manipulating, and sharing data that is organized, permanent (saved to disk), and highly consistent (ensured by transaction mechanisms).

Relational databases work particularly well in the banking sector, where data consistency (so as not to add or lose value during bank transactions, for example) and permanence are absolute priorities.
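The bank example can be sketched with a transaction, again using Python's sqlite3 module. The account table, the transfer function, and the amounts are hypothetical names and values chosen for illustration; a real banking system would involve far more than this.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to account `dst` as one transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )


def balances(conn):
    """Return the current balances as {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account"))


conn = sqlite3.connect(":memory:")
# Hypothetical rule: an account may never go below zero.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

transfer(conn, 1, 2, 30)  # succeeds: both updates are committed together

try:
    transfer(conn, 1, 2, 500)  # the debit would overdraw account 1
except sqlite3.IntegrityError:
    pass  # the credit already applied to account 2 has been rolled back

final = balances(conn)
```

In the failed transfer the credit is applied first and the debit then violates the balance rule, so the whole transaction is undone: no money is created, and the total across both accounts stays the same.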

With the boom in web services – especially social networks – new data storage needs cropped up. Data storage must:

handle requests from thousands or millions of users simultaneously, a number that can climb steeply in a matter of weeks (the infamous scalability – yep, it's coming)
favor performance over consistency (users expect fast service and can put up with a few inaccuracies – like not seeing exactly the same content as the next guy)
maintain a high level of data availability, even when hardware malfunctions

When dealing with this kind of demand, RDBMS solutions are limited by the relational model itself. That’s where solutions based on radically different approaches come into play.

if I had a hammer

The “parable of the hammer” has gained wide currency in pro-NoSQL literature: “when you have a hammer in your hand, everything that juts out looks like a nail.”

In other words, just because a tool is widespread and efficient for a certain group of problems doesn’t mean it’s the best option in every situation. On the contrary, the more general a tool, the less likely it is to be the best solution for a specific problem. Sometimes a Phillips screwdriver can be your best friend.

The NoSQL movement bills itself as an opportunity to expand the range of solutions available in our data management toolbox.

Though the “NoSQL” label has rallied a whole new line of products (and some old ones: the 20-year-old BerkeleyDB is a NoSQL database), it guarantees nothing about the technical capabilities or conceptual approach of those products. There are a lot of ways to not be relational.

That’s what we’ll take a look at in the second part of this article.

Charly

This post was originally published in French here.

Charly Hamy

I am a software architect at It&L@bs, an Orange Business entity, where I have spent the past two years working on cross-technology projects (J2EE, .Net, web, embedded) in the M2M and industrial computing fields. Passionate about computing, I like to keep a close eye on the latest technology trends in web development and information systems engineering.