Data & Intuition

A few years back, a friend explained how he understood the word ‘intuition’. According to him, the word possibly derives from ‘in-tutor’, referring to the ‘tutor’ or ‘guide’ inside each one of us. If that were true, we reflected on why we find it difficult to follow our inner tutor while making important decisions. I think the reason we get jittery about following our own intuition is that intuition is non-rational. What Mahatma Gandhi calls the ‘inner voice’ in his autobiography is, I believe, intuition. Albert Einstein called the intuitive mind a sacred gift that needs to be honored. Per Steve Jobs, intuition is the most important thing, even more important than intellect.

While intuition is instinctive, the other way to decide is through reason, which happens primarily by analyzing the data or information we have at hand. Many business leaders have stressed the importance of data-driven decisions as well. A quote of N.R. Narayana Murthy goes like this – In God we trust; everyone else must come with data. I recently read an article about Marissa Mayer’s decision to ban Yahoo employees from working from home. The article cites Mayer’s obsession with data and metrics as what helped her decide.

While both forms of decision making (intuitive and data-driven) look absolutely pertinent in our daily lives, this blog is an attempt to resolve a personal dilemma, i.e. which should prevail when one is forced to choose? As a BI practitioner, I can fully appreciate the importance of data and metrics. I continue to see them add value in business and in my personal life. But I’m sure all of us have crossed unique moments in life on gut, hunch or intuition alone. I also believe intuition is a more fundamental experience than inference.

I recently stumbled upon the email posted online by Andrew Mason. Many of us are aware of this charming goodbye email written by the founder of Groupon. In his letter, he shares an important piece of wisdom. His regret, in his own words – My biggest regrets are the moments that I let a lack of data override my intuition on what’s best for our customers. To me, his statement captures the importance of both intuition and data. As I think of it, every data-driven scientific decision that we experience is someone’s intuition in the form of a mathematical model, an algorithm, etc. I think one should strive to decide by intuition. Technology solutions built around data and metrics can serve two purposes here – (1) to execute those decisions or (2) to help the decision maker stay clear of any doubt he/she might have during the process. I’ll sign off with another quote from Mahatma Gandhi – You must try to listen to the inner voice, but if you will not have the expression ‘inner voice’, you may use the expression ‘dictates of reason’, which you should obey.


The Chef & the Cook

Please read him as him/her.

What is the quality or skill that defines a chef, especially when compared to a cook? While the distinction is fairly blurred, a few definitions came close to my intuition as I researched on the Internet. One of them went like this – while a cook is more of an ‘executor’, a chef is a ‘conceptor’. Another said that while a cook can make different kinds of food, a chef could have his own personal recipe. In essence, a chef is likely to be far more technically skilled, one who is capable of building his own secret sauce. From a market perspective, I tend to think the skills of a cook can get commoditized over a period of time but not those of a chef. I also like to see the cook/chef setup as a metaphor for professions in other industries. Any thriving industry will constantly evolve to commoditize the skills of its cooks, and at moments will also get ambitious enough to try commoditizing those of its chefs. No matter how many times that commoditization happens, the same industry is likely to witness the evolution of its chefs in a different form at a later point in time.

In the computing industry too, it is not uncommon to find this. ‘Commoditize’ in this context means the industry offers to automate a programmer’s task or to provide it as a feature of the technology infrastructure itself. Attempting to ‘commoditize the skill of its chefs’ happens when the feature being considered is one whose optimal implementation (in many cases) only a programmer can provide. Here I’ll go through two instances I observe where the industry has offered to commoditize the skills of its software chefs. The first one began four decades ago and the second one is happening at the moment.

Software programmers no longer do many of the tasks they used to do long ago. For instance, they no longer care about garbage collection; they no longer explicitly type cast (in many cases); and so on. While many of these changes have made the programmer more productive, some are yet to. Codd, in his disruptive 1970 paper, articulated the role of the Query Optimizer in the paradigm of the Relational Database. The optimizer arrives at a ‘least cost’ program to retrieve the underlying data and prepare the final result set. In the same paper, Codd agreed that building such an optimizer is a difficult design problem. Although most commercial databases have the optimizer built in, I think this task is still done best by the programmer alone. Gartner’s 2012 Magic Quadrant for data warehouse databases cites performance issues with complex queries as a problem noticed by its clients even with the market leaders, and this is 40 years after the RDBMS was born. As long as the cardinality and the correlation of the data (of the underlying subject area) continue to evolve, something I believe is inevitable, we will continue to solve this problem.
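To make the optimizer’s job concrete, here is a minimal sketch of cost-based plan selection in Python. The cost constants, statistics and plan names are all hypothetical, invented for illustration; a real optimizer models far more (I/O patterns, join orders, memory), but the ‘least cost’ selection Codd describes is essentially this:

```python
# A toy cost-based plan chooser, in the spirit of Codd's optimizer.
# The cost constants and the statistics below are invented for illustration.

SEQ_READ_COST = 1.0   # cost per row for a sequential full-table scan
RAND_READ_COST = 4.0  # cost per row fetched through an index (random I/O)

def estimate_cost(plan, stats):
    """Estimate the cost of an access path from table statistics."""
    matched = stats["total_rows"] * stats["selectivity"]
    if plan == "full_scan":
        return stats["total_rows"] * SEQ_READ_COST
    if plan == "index_lookup":
        return matched * RAND_READ_COST
    raise ValueError(f"unknown plan: {plan}")

def choose_plan(stats):
    """Pick the least-cost plan, as a cost-based optimizer would."""
    return min(("full_scan", "index_lookup"),
               key=lambda p: estimate_cost(p, stats))

# A highly selective predicate favors the index...
print(choose_plan({"total_rows": 1_000_000, "selectivity": 0.001}))  # index_lookup
# ...while a predicate matching most rows favors scanning everything.
print(choose_plan({"total_rows": 1_000_000, "selectivity": 0.9}))    # full_scan
```

Notice how the winning plan flips as the predicate’s selectivity changes; this is precisely why evolving cardinality and correlation keep the optimization problem alive.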

The second instance is the emergence of the various analytical platforms and their claims to make analytics easier and accessible. While I won’t delve deep here, this appears even more ambitious (compared to the above), especially when analytics is all about imagination and intuition.

CAP Theorem: Wearing the right design hat

In 2000, Eric Brewer articulated his conjecture that of the three desirable properties in a distributed system, one can have at most two at any point in time. No system can be designed to give all three, and with a formal proof in 2002, this got established as the CAP theorem. The three desirable properties are Consistency (C), Availability (A) and Partition Tolerance (P). The three possible systems that can be designed are commonly referred to as CA (a system that is not partition tolerant), AP (a system that might forego consistency to stay available) and CP (a system that might forego availability to maintain consistency). While the theorem is simple and intuitive, it can also get confusing, especially when it gets discussed alongside a platform or a database system. This blog is my attempt to get a handle on the CAP intuition.

As noted by Eric himself, one can possibly get confused when relating the ‘C’ of CAP to the ‘C’ of ACID (the key properties that define most relational databases). While both C’s refer to consistency, the C in CAP is a strict subset of the C in ACID. The CAP ‘C’ refers only to the single-copy consistency of a data element, whereas the ACID ‘C’ includes much more (like a guarantee of unique keys, etc.). The other source of confusion is in the way ‘Availability’ in CAP can be construed. Per the CAP intuition, a CP system is not to be construed as a system that is unavailable. It is actually a system that forfeits availability when in doubt, i.e. a portion of the system is technically up and receives the request but chooses not to respond because a complementing portion of the system is unreachable.

In the subsequent paragraphs, I’ll disassociate the CAP intuition from the platform and intuit it from a design perspective. Let us try applying CAP in the context of an online retailer. Every online retailer will most likely have two key functionalities – (1) to add an item to a cart/shopping bag and (2) to check the availability of the item in the inventory. Let us consider different ways of designing this system.

Imagine a retailer who is obsessed with customer satisfaction and does not want to disappoint its customers at any point. This retailer considers it a breach of contract to allow a customer to add an item to the cart that has not been verified for inventory availability. The retailer thus chooses to tightly couple the inventory check functionality to its ‘add to cart’ functionality. Such a monolithic implementation means the customer cannot add an item to his/her online cart if it could not be verified for availability. Such an implementation is a CA system.

Now consider another retailer who is proud of its strong supply chain. This retailer also does an inventory check before the ‘add to cart’ act, but prefers to allow the customer to add the item and proceed to an online checkout even when the availability of the item could not be verified at that moment. This retailer is confident that it will be able to find the right supplier and meet the SLA for the items it promotes in its store. This retailer too implements the inventory check as part of the ‘add to cart’ feature, but the implementation is not monolithic like the previous one. The ‘add to cart’ feature stays available for the customer even when the ‘inventory check’ is down for a brief duration. Most importantly, the entire implementation is kept transparent to the end customer. The customer adds such an item and even completes the order. Such an implementation is an AP system.

Now consider a third retailer who opts to take a middle path in the implementation. This retailer also chooses to check inventory availability at the time of ‘add to cart’. In case the ‘inventory check’ service is down, instead of deciding to continue (like the second retailer), this retailer opts to communicate the status to the customer and allows him/her to decide on the next step. Unlike the previous implementation, before taking an order, the customer is explicitly told about the pending ‘inventory check’ and a possible delay of shipment. Such an implementation is a CP system.
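The three retailer designs can be sketched in a few lines of Python. This is only a single-process caricature (real CAP trade-offs arise from network partitions between nodes), and every name and message here is hypothetical, but the decision each design makes when the inventory service is unreachable is the essence of the trade-off:

```python
class InventoryUnreachable(Exception):
    """Raised when the inventory service cannot be contacted."""

def check_inventory(item, service_up):
    # Stand-in for a remote call; assume the item is in stock when reachable.
    if not service_up:
        raise InventoryUnreachable(item)
    return True

def add_to_cart(item, design, service_up=True):
    try:
        check_inventory(item, service_up)
        return "added: inventory verified"
    except InventoryUnreachable:
        if design == "CA":  # monolithic coupling: the feature fails with the check
            return "error: service unavailable"
        if design == "AP":  # stay available; quietly forgo the consistency check
            return "added: check pending (hidden from customer)"
        if design == "CP":  # stay consistent; surface the doubt to the customer
            return "added: stock unverified, shipment may be delayed"
        raise ValueError(f"unknown design: {design}")

for design in ("CA", "AP", "CP"):
    print(design, "->", add_to_cart("shoes", design, service_up=False))
```

When the service is up, all three behave identically; the designs only diverge in the failure branch, which is why CAP is best read as a statement about behavior during a partition rather than about a platform.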

The Golden Hammer in Software Technology

The law of the hammer goes like this – if all you have is a hammer, everything looks like a nail. The concept symbolizes over-reliance on a familiar tool; the hammer is a metaphor for one’s habit of using a single tool for all purposes. As a computer science engineer, I have come across one specific pain point during my interactions with many professionals in my field: the resistance many of us have to moving beyond the SQL paradigm while finding a solution. In my opinion, SQL is a perceived golden hammer in software technology. Almost any technology that has attempted to offer something beyond what SQL offers has needed the blessing of these SQL programmers (explained in the next paragraph) to gain adoption. A SQL flavor has become inevitable. One such technology facing this challenge is Hadoop, which too has tried its bit to lure the SQL community. This blog is my attempt to reason out this resistance.

When the RDBMS rose to stardom in the 1980s, SQL offered a much-needed convenience to the programmers of the day through its declarative style, abstracting away the underlying implementation and allowing the developer to focus on the ‘what’ aspect. With the advent of SQL, developers no longer needed to write long programs detailing the ‘how’ part (imperative style) to retrieve data elements. By the way, SQL in this article refers only to its non-procedural aspect (the widely used part), unless procedural extensions are explicitly included. SQL did cover a lot of use cases through its limited set of functions, and the world (especially the enterprises) was satisfied for the most part. Over a period of time, this convenience has resulted in a fairly large community of pure-play SQL programmers who no longer write algorithms, i.e. do imperative-style programming, but expect to solve everything through SQL. In the Hadoop paradigm, Hive was an explicit attempt to lure such SQL programmers. Aster Data introduced SQL-MR (SQL MapReduce) to penetrate this market. I recently stumbled upon a white paper on Oracle’s in-database Hadoop capabilities. All of these exploit the procedural capabilities of SQL to position Hadoop, which again is imperative style.
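To see the ‘what’ versus ‘how’ distinction concretely, here is a small sketch using Python’s built-in sqlite3 module. The table and data are invented for illustration; the declarative query states only the result wanted, while the imperative version spells out the iteration, accumulation and sorting by hand:

```python
import sqlite3

# Declarative: state *what* you want; the engine's optimizer decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("alice", 30.0), ("bob", 20.0), ("alice", 25.0)])

declarative = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer").fetchall()

# Imperative: spell out *how* -- iterate, accumulate, then sort.
totals = {}
for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
    totals[customer] = totals.get(customer, 0.0) + amount
imperative = sorted(totals.items())

print(declarative)  # [('alice', 55.0), ('bob', 20.0)]
print(imperative)   # same result, but with a hand-written plan
```

Both produce the same result; the difference is who owns the execution plan, which is exactly the convenience that built the pure-play SQL community.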

Fundamentally, I see Hadoop/MapReduce as bringing us back to basics, wherein we use a procedural style to implement the intelligence we wish to infer from the underlying data. Hadoop comes with a pre-built distributed infrastructure on which our algorithm can be applied to large volumes of data. With analytics being the primary use case for Hadoop, it is intuitive that if we are to apply our own intelligence, we need to be able to define the algorithm, i.e. the implementation, as well. The moment that ‘intelligence’ becomes available as a pre-built SQL function, it means we have already commoditized it, and one is forced to write a different, more innovative algorithm all over again. So ‘imperative style’ programming is fundamental to analytics. Some technology consultants tend to project Hadoop as a complex paradigm compared to SQL and propose that Hadoop converge with SQL to gain adoption. The proposal is flawed for the fundamental reason that SQL is ‘declarative’ and Hadoop is ‘imperative’. Declarative style is only a high-level abstraction and cannot exist without an imperative-style implementation underneath. We will face this resistance until we embrace ‘imperative style’ programming completely.
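The imperative-by-construction nature of MapReduce can be illustrated with a word count, the canonical Hadoop example, sketched here in plain Python. No actual Hadoop is involved; the framework’s contribution is to distribute these same three phases across machines, but the programmer still writes the map and reduce logic imperatively:

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc):
    # Each document independently emits (word, 1) pairs; this independence
    # is what lets a framework like Hadoop fan the work out across machines.
    return [(word.lower(), 1) for word in doc.split()]

def shuffle(pairs):
    # Group every value emitted under the same key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Collapse each key's values into a final result.
    return {key: sum(values) for key, values in grouped.items()}

docs = ["data is data", "intuition beats data"]
counts = reduce_phase(shuffle(chain.from_iterable(map_phase(d) for d in docs)))
print(counts["data"])  # 3
```

The moment this counting logic ships as a pre-built function (a SQL `COUNT`, a Hive query), it is commoditized, and the analyst’s next insight again demands writing a new map and reduce by hand.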

Internet Privacy: Your anonymous signature in the web

Webster’s dictionary defines privacy as the quality or state of being apart from company or observation. ‘Online privacy’ has been a topic of concern and debate for a while. Richard Stallman warned us long ago that Internet users need to swim against the tide if they are to protect their privacy. Recently, Steve Wozniak predicted that cloud computing could cause horrible problems for us in the next five years. Above all, privacy expert Steve Rambam has declared that privacy is actually dead and it’s time we get over it. While none of this is good news for the consumer, technology giants (like Apple, Amazon, Google, Facebook, Microsoft) on the other end claim to take privacy pretty seriously. With all of them showing keenness to crack the human mind, I’m convinced it is fundamentally not possible for any of them to let the consumer experience ‘privacy’ as defined by Webster’s. So, what is the privacy they are talking about? As I looked into their privacy policies, it became evident that at best, these policies explain clearly how one’s privacy (going strictly by Webster’s definition) can be compromised when their products are used. My attempt in this blog is to understand their policies and identify the one that intrudes the least into the user’s privacy.


Google has made its intentions clear through its privacy policy: for its products to provide a compelling user experience that is personalized and relevant, it needs to collect every possible piece of information about customer behavior that is out there. The information includes details about the consumer’s device (hardware, OS, crash events, etc.), the customer’s footsteps on its sites (click stream, IP, cookies) and the customer’s location (GPS, nearby Wi-Fi access points, cell towers). Google apparently strives to get the most out of cookies, anonymous identifiers and pixel tags to know the customer’s mind. While Google’s ambition may be to decipher the human mind, it is feared for the enormity of the data it holds about the customer. Maybe it has enough data to identify the customer through his/her mind.

Some of the controversies in the past clearly reflect this sentiment. The incident where Google inadvertently collected Wi-Fi payload data from public Wi-Fi spots was received with intense criticism. Its street cameras are perceived to be too invasive of our homes. Google’s recent move to unify its privacy policies across products was received with skepticism.

Amidst all these controversies, one thing I see as positive about Google is its Data Liberation initiative, which is about allowing users to liberate themselves, along with their data, from Google when they decide to stop using its services.


Apple collects as much information as Google does when the consumer is on its terrain (the Apple website, iCloud, the App Store, etc.). In fact, in limited cases, Apple also collects the consumer’s SSN. While Google’s message to consumers sounds like ‘We would like to know everything about you to delight you’, Apple’s message is more like ‘Look, we need all this data about you to delight you. You had better be aware of this’. Apple has always maintained that it takes privacy seriously. Apple’s Safari browser blocks third-party cookies by default, and its opt-out channels are relatively straightforward and accessible.

Still, Apple is not free of past controversies. Notable instances include the location-logging controversy last year, wherein iOS was found to be logging one’s location details and backing them up on one of Apple’s servers; Apple acknowledged this to be a bug. A more recent one is about exposing the unique identifier of an iPhone/iPad device (UDID) through one of its APIs; Apple has responded by deprecating this API as well. Apple was also criticized for allowing mobile apps (like Path) to access one’s personal contacts (without explicit consent), which could eventually land on external servers; Apple responded by requiring an explicit consent. Above all this, I imagine the biggest source of personal data for Apple is its flagship app Siri. While Apple has made it clear that Siri’s data will stay only with Siri, that is still a lot of personal data, including our voice.


With its very minimal retail (store) presence, Amazon relies heavily on data for its marketing needs. I believe Amazon crunches enormous amounts of data to understand customer behavior and target customers with personalized product recommendations. Amazon had been pretty effective in handling data, with no major controversies until last year. The only privacy controversy I can recall about Amazon involves the ‘Silk’ browser packaged with its tablets. A default option in Silk routes every HTTP request through Amazon’s cloud infrastructure. While Amazon assures the user of a better experience through speed, this allows it to capture the user’s entire web history. The routing does not apply to SSL connections, and the user can turn the feature off as well.


While the companies mentioned above are feared for what they know about you, Facebook is primarily feared for what it can reveal about you. While its mission is to make the world open and connected, its users have certainly found it difficult to digest that and keep pace with the change. There have been several controversies in the past where Facebook has reimagined privacy. The Beacon feature is an acknowledged misstep. News Feed was another controversial product in the beginning but proved successful over time. Facebook is currently pushing the envelope through its ‘frictionless sharing’ applications, where nothing is private after signing up. As usual, it is both loved and hated at the same time.


While there have been numerous controversies around Microsoft’s control over the platform, when it comes to privacy, Microsoft comes out pretty clean. I find its privacy policy the friendliest of the lot. While Microsoft is equally interested in data about consumer behavior, its privacy policy (Bing) explicitly calls out that personally identifiable information will not be correlated with the behavioral data. Microsoft has demonstrated its ‘privacy by design’ philosophy on multiple occasions. While Siri transmits the user’s voice to a remote server, Microsoft’s Kinect keeps the biometric data local. While everyone else assumes opt-in by default (on many occasions), Microsoft keeps opt-out as the default. Microsoft’s recent heroics in privacy involve insisting on and keeping ‘Do Not Track’ as the default setting in IE10. In my opinion, Microsoft emerges as a clear winner when it comes to privacy.

Free Software: When man ignores a walled garden for a forest

Please read him as him/her. Man refers here to mankind/humanity in general and includes everyone on this planet.

In 2008, when most of the world was excited about cloud computing, he called it worse than stupidity. When almost the entire world mourned the death of Steve Jobs in 2011, he said he was glad that Jobs was gone. Although a computer programmer and a techie himself, he advocates paper voting over machine voting. He does not recommend mobile phones. He is against software patents and Digital Rights Management, prompting Microsoft CEO Steve Ballmer to call the license he authored ‘a cancer that attaches itself in an intellectual property sense to everything it touches’. He is Richard M. Stallman (rms going forward), founder of the Free Software Foundation and author of the GNU General Public License. I have perceived rms to be a stubborn personality, one whose views (when implemented) could cause a lot of inconvenience to the majority. This blog is my curious attempt to get into his mind and imagine his point of view.

‘I prefer the tumult of liberty to the quiet of servitude’ – Thomas Jefferson

Man likes to indulge and, more importantly, exert his capabilities. Products and services emerge to enable him to achieve both with minimal effort. Man doesn’t really care about the ‘how’ part of these as long as they don’t infringe on his very ability to exert. When these products and services become prevalent and rigid, and when man begins to feel constrained by his lack of choice, a force emerges to remind mankind of his fundamental rights and basic freedoms. Free software is one such force in the field of computing. The motivation for free software arose in a similar context during the seventies, when there was a considerable rise in proprietary software, with companies releasing software in binary/executable formats, making it impossible for end users to change or modify it on their computers. In 1980, copyright was extended to software programs. In 1983, rms announced his frustration and the GNU project was born. The goal of the GNU project was to develop a sufficient body of free software to get along without any software that is not free.

At the heart of free software is the GNU General Public License (GNU GPL) authored by rms, and we will focus on that. In the most fundamental sense, free software was and is never about money. As rms explains, it is about liberty, not price. The word ‘free’ should be thought of as in ‘free speech’, not ‘free beer’. A software program that qualifies under this paradigm offers four essential freedoms to its users.

  • The freedom to run the program, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help your neighbor (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

I consider a river a good metaphor for understanding the philosophy of free software. The flow of a river symbolizes freedom. The source of a river is analogous to software written from scratch by a programmer who believes in the above four freedoms and opts to distribute it (free of cost or for a fee) under the GNU GPL. Now anyone who is interested in this water can consume it at his own risk, i.e. under the GNU GPL, he can receive a copy of the software in full and do anything with it. This includes using the software for a commercial purpose, either as such or by modifying or tinkering with it to suit his purpose. The risk is analogous to the lack of warranty on the received software: the GNU GPL offers no warranty. The consumer is not obliged to share the software with anyone, even if he has received it free of cost. It is only when the consumer decides to become part of the river, i.e. when he opts to distribute the software further (as such or in modified form), that the license expects him to distribute it again in full. This is called ‘copyleft’, a very powerful licensing strategy to ensure the freedom does not get diluted downstream. No further restrictions may be added to this software that are in violation of the GNU GPL. The license is built around this very important paradigm, and this is where some of its challenges lie. If there is a conflict of any sort, the license expects the consumer not to distribute the software at all. It is all or none, with no half-truth whatsoever.

So how does free software differ from open source? I’m sure many of us (including me) think they are synonymous, whereas they are not. A well-thought-out article on this topic, written by rms himself, can be accessed here. As might be evident from that article, the philosophy of the GPL and free software is very fundamental. It is not just about keeping the source code of a software program open. While that sure has practical benefits, the philosophy is much deeper. In a way, it is about asserting what is important for humanity, which is freedom. Any movement that asserts something righteous and fundamental is likely to be one that inconveniences the most, and there is a possibility that the very people who were the intended beneficiaries could reject it. I imagine open source to be an initiative (splintered off from free software in 1998) that intended to address this problem. Open source attempts to get the practical benefits out to the user first, before educating him on what is right for him in the first place. While free software is a social movement aimed at respecting (and reminding him of) user freedom, open source is a development methodology that asserts that the only way software can be made better is by keeping it open and accessible. Interestingly, rms calls them ‘free software activists’ and ‘open source enthusiasts’ in his article.

Now I will cite some very specific contexts that have challenged the GPL in the past and how the GPL has evolved. I believe in all these contexts, rms has stood firm without compromising the core philosophy. The first context – what if one has a restriction imposed that prevents him from distributing GPL-covered software in a way that respects other users’ freedom (for example, if a legal ruling states that he or she can only distribute the software in binary form)? In such a setting, rms asserts that the software not be distributed at all. This was the most important change in version 2 of the GPL in 1991. rms calls it ‘liberty or death’.

What if the software is open but the underlying hardware is restrictive? As part of GPL v3, this got addressed (tivoization): the GPL does not permit GPL-covered software to be installed on such restrictive hardware. Free software likewise considers laws around DRM (Digital Rights Management, or Digital Restrictions Management) restrictive. Here again rms has stood firm, while some prominent open source proponents (like Linus Torvalds) hold a differing point of view. Linus’s view is that software licenses should control only software.

What about patents? The GPL addressed this in version 3 as well. The free software philosophy does not encourage patents, especially for software on general-purpose computers. If someone has a patent claim on software covered under the GPL that he wishes to distribute, the GPL prohibits him from imposing a license fee, royalty, or other charge (specific to the patent claim) for exercise of the rights granted under the license.

While rms’s views (on software) may not align with those of the vast majority and could also be inconvenient, when listened to, the philosophy cannot be proved incorrect. The problem with pure indulgence in general is that it momentarily blinds one’s judgment. Like any other thought, freedom, along with one’s basic rights, needs to be remembered and exercised, or else it might just go away. In line with the famous quote – ‘the price of freedom is eternal vigilance’ – I do see vigilance in every revision of the GPL. I think free software is a very important force and rms needs to be more than listened to.

Patent Wars: Guarding the Open secret

Please read him as him/her.

Not long back, I stumbled upon a URL that showed who is suing whom in the technology industry. To see some of those companies, whose products we all love, involved in such legal brawls was not an inspiring sight. Last month, in his interview with All Things Digital, Apple CEO Tim Cook summed up the whole scenario as a pain in the ass. In this blog I have attempted to get a handle on the current ‘patent situation’ in the tech industry (mostly around the mobile space) and imagine why it is a difficult problem to solve.

Let us start with the basics. Why a patent? The intuition of a patent is to provide a commercial incentive to a creator who is willing to disclose his idea in complete detail. By making his idea public, the inventor creates the possibility of new ideas being spawned from the original; the commercial incentive he gets in return is a legal monopoly to monetize his idea in the market for a limited period of time. How does this work? For an idea to be patented, it needs first to be patentable (novel and non-obvious, to start with). Inventors apply at a patent office (the USPTO, for instance), which subsequently verifies and approves the application. Broadly, a patent application has two components – the claims and the specifications. The claims describe the scope of the invention, and the specifications describe the technique involved in meeting that scope. Each claim should strive to be specific; it can be broad but not generic. When a particular claim is implemented in full in another product, using any technique, without the consent of the patent holder, it can be called an infringement. The catch here is that for an infringement to occur, at least one claim in the patent needs to be implemented in full. This is precisely why it gets difficult to prove an infringement, as will become clear in the next few paragraphs.

Any new invention (a product or a service) is most likely to be a unique combination of some existing features (features already present in the world) and new features. Each existing feature is equivalent to a ‘claim’ in some patent. For the invention to be free of any infringement, there should be a consensus in some form with the original patent holder of each existing feature for including that feature in the current invention. With this knowledge, let us look at the various scenarios alleged around the industry.

Here is a classic infringement allegation – a novel patented feature found in another product without the consent of the creator. The novel feature I will quote here is the ‘data-tapping’ feature (FOSS Patents) that we find in smartphones (the iPhone). This invention marks up addresses or phone numbers in an unstructured document like an email, to help users bring up relevant applications, like maps or dialer apps, that can process such data. Apple filed this complaint against HTC (Android), and the ITC (International Trade Commission) ruled in favor of Apple in December 2011. HTC had to either drop this popular feature or find a workaround, and HTC chose to work around it.
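Apple’s actual implementation is of course not public, but the kind of markup the data-tapping claim covers can be sketched with a naive regex. The pattern and the tag format below are my own inventions for illustration; real detectors handle many more formats, locales and data types:

```python
import re

# A naive pattern for US-style phone numbers; real detectors are far more robust.
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def mark_up(text):
    """Wrap detected phone numbers in a (hypothetical) tappable-link marker."""
    return PHONE_RE.sub(lambda m: f"<tap:dialer>{m.group(0)}</tap>", text)

email = "Call me at 555-123-4567 before the demo."
print(mark_up(email))
# Call me at <tap:dialer>555-123-4567</tap> before the demo.
```

A workaround like HTC’s would deliver a similar user experience while avoiding the exact combination of steps spelled out in the patent’s claims, which is why claim wording matters so much.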

In the US, the legal monopoly offered for a design patent is 14 years, and for a utility patent it is 20 years. What if a feature/claim becomes so common and quickly evolves into a standard, acquiring enough mindshare that it becomes part of the infrastructure? Such features become ‘standards-essential’, and once this state is reached, the inventions are FRAND (Fair, Reasonable And Non-Discriminatory) pledged by their creators with a license fee. Again, the catch here is ‘fair’, which is very subjective and plays a crucial role when one’s direct competitors own these patents. Kodak has alleged patent infringement against Apple and HTC relating to its patents on digital imaging technology. Another instance is Nokia alleging that Apple infringed its patents on 3G and GSM, a dispute Apple finally settled.

An interesting combination of the above two occurs when two inventions allegedly overlap. A classic example would be some of the allegations between Samsung and Apple. Apple alleged that certain Samsung smartphones have the same “look and feel” (called a trade dress in the IP world) as its own products. In another context, Samsung alleged that Apple infringed its W-CDMA (standards-essential) patent. Cross-licensing each other’s patents is one way such conflicts can be resolved.

A defensive patent strategy is one where companies go out and acquire patents to protect themselves from possible lawsuits by other companies. A famous example is Google announcing its plan to buy Motorola Mobility in 2011 to protect its Android platform from competitors (mainly Apple). An aggressive variation of this is the patent troll, an entity that buys patents with the intention of suing others.

Patent pending is another clever strategy, wherein the creator releases his product after filing a patent that has not been issued yet. He observes the competition’s response and refines the pending patent to trap competitors into infringement.

How does one defend?

The alleged infringer would typically defend his stance by trying to prove the patent invalid. One way to establish invalidity is to claim that the invention was obvious at that point in time (when the patent was issued); the other is to cite a prior publication or prior art that invalidates the novelty of the issued patent.

So, coming back to my original question – why is this a difficult problem to solve? I’ll try answering this by splitting it into two sub-questions – (1) why is infringement inevitable? and (2) why is it difficult to spot an infringement? The answer to the first question lies in a fundamental difficulty the human mind has in letting go of a good idea after experiencing it. In this information age, innovating and differentiating have become mandatory for any business to survive. Companies will find it challenging to ignore the innovation happening around them and may end up indulging in creative workarounds or alternatives. The answer to the second question lies in another fundamental difficulty: anyone other than the alleged creator will never know for sure whether the creation was inspired or copied. Inspiration is important for the very advancement of science, while copying may not be. This necessitates that any patent framework we attempt to build be designed to assume ‘inspiration’ when in doubt. Doing it the other way would defeat the whole purpose of the framework (remember, the intent of a patent framework is also to make progress).

In a competitive environment (the tech industry, for example) where the stakes are high, patenting one’s invention (detailing out the idea along with the ‘how’ part) is actually a risk. Every idea needs an ‘engineering magic’ to materialize into a unique user experience, and if that ‘engineering magic’ is hard to guess, most companies would retain the ‘how’ part as a trade secret without patenting it (Coca-Cola is a classic example). It is only when the ‘engineering magic’ is guessable that companies opt for a patent to prevent someone from copying it, and I think ‘visual’ user experiences are relatively guessable. While the intent of the patent framework was to promote the progress of science and the useful arts, the current use case, in my opinion, is completely different. I anticipate the patent framework will continue to be used as a defensive weapon at best – to intimidate and distract the alleged infringer and delay his progress.