The law of the hammer goes like this: if all you have is a hammer, everything looks like a nail. The saying symbolizes over-reliance on a familiar tool; the hammer is a metaphor for the habit of using a single tool for every purpose. As a computer science engineer, I have come across one specific pain point in my interactions with many professionals in my field: the resistance many of us have to moving beyond the SQL paradigm when looking for a solution. In my opinion, SQL is a perceived golden hammer in software technology. Almost any technology that has attempted to offer something beyond what SQL offers has needed the blessing of SQL programmers (explained in the next paragraph) to gain adoption. A SQL flavor has become inevitable. One technology facing this challenge is Hadoop, which has also done its bit to lure the SQL community. This blog is my attempt to reason out this resistance.
When RDBMS rose to stardom in the 1980s, SQL offered much-needed convenience to the programmers of the time through its declarative style, abstracting the underlying implementation and allowing the developer to focus on the ‘what’. With the advent of SQL, developers no longer needed to write long programs detailing the ‘how’ (imperative style) to retrieve data elements. BTW, SQL in this article refers only to its widely used non-procedural aspect, unless the procedural part is explicitly included. SQL covered a lot of use cases with its limited set of functions, and the world (especially enterprises) was satisfied for the most part. Over time, this convenience has produced a fairly large community of pure-play SQL programmers who no longer write algorithms, i.e. do imperative-style programming, but expect to solve everything through SQL. In the Hadoop paradigm, Hive was an explicit attempt to lure such SQL programmers. Aster Data introduced SQL-MR (SQL-MapReduce) to penetrate the same community. I recently stumbled upon a white paper on Oracle’s In-DB Hadoop capabilities. All of these exploit the procedural capabilities of SQL to position Hadoop, which again is imperative style.
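To make the ‘what’ versus ‘how’ distinction concrete, here is a minimal sketch in Python using the standard sqlite3 module. The orders table, its columns, the sample rows and the function names are all hypothetical, invented purely for illustration. The first function states the desired result in SQL and leaves the execution strategy to the engine; the second spells out the same aggregation step by step.

```python
import sqlite3

# Hypothetical sample data, invented for illustration only.
ORDERS = [("alice", 30.0), ("bob", 12.5), ("alice", 7.5), ("carol", 20.0)]


def totals_declarative(rows):
    """Declarative: state WHAT we want and let the engine decide HOW."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


def totals_imperative(rows):
    """Imperative: spell out HOW the totals are built, step by step."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals


if __name__ == "__main__":
    print(totals_declarative(ORDERS))
    print(totals_imperative(ORDERS))
```

Both functions return the same per-customer totals. The difference is in who owns the algorithm: the database engine in the first case, the programmer in the second.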
Fundamentally, I see Hadoop/MapReduce as bringing us back to the basics, wherein we use a procedural style to implement the intelligence that we wish to infer from the underlying data. Hadoop comes with a pre-built distributed infrastructure on which our algorithm can be applied to large volumes of data. With analytics being the primary use case for Hadoop, it is intuitive that if we are to apply our own intelligence, we need to be able to define the algorithm, i.e. the implementation, as well. The moment that ‘intelligence’ becomes available as a pre-built SQL function, it has already been commoditized, and one is forced to write a different, more innovative algorithm all over again. So ‘imperative style’ programming is fundamental to analytics. Some technology consultants tend to project Hadoop as a complex paradigm compared to SQL and propose that Hadoop converge with SQL to gain adoption. The proposal is flawed for the fundamental reason that SQL is ‘declarative’ and Hadoop is ‘imperative’. Declarative style is only a high-level abstraction and cannot exist without an imperative-style implementation underneath. We will face this resistance until we embrace ‘imperative style’ programming completely.
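As a rough illustration of what ‘defining the algorithm yourself’ looks like in the map-reduce model, here is a toy, single-process sketch in Python. It is not Hadoop’s API: run_job is a hypothetical stand-in for the distributed framework, and map_phase and reduce_phase are hypothetical user-defined functions for a simple word count. The point is only that the programmer supplies the logic, while the framework handles grouping and distribution.

```python
from collections import defaultdict


def map_phase(record):
    """User-defined map: emit (word, 1) for every word in a line of text."""
    for word in record.lower().split():
        yield word, 1


def reduce_phase(key, values):
    """User-defined reduce: combine all values emitted for one key."""
    return key, sum(values)


def run_job(records, mapper, reducer):
    """Toy stand-in for the framework: map, group by key (shuffle), reduce."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    lines = ["a hammer and a nail", "a nail"]
    print(run_job(lines, map_phase, reduce_phase))
```

Swapping in a different mapper and reducer changes the analysis without touching run_job, which is the sense in which the intelligence lives in the imperative code rather than in a pre-built function.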