When Holmes meets Bayes

Please read him as him/her. ‘Man’ refers to mankind/humanity in general.

Like many people, I too have been disturbed over the past two weeks thinking about what could have happened to the missing Malaysian flight MH370. On Saturday, the 15th of March, the Malaysian Prime Minister communicated that the evidence is consistent with someone acting deliberately from inside the plane, and I could not help noticing the shift in direction since then. The focus is now squarely on the pilots, crew members and passengers. The dearth of a solid, meaningful clue has understandably made the search difficult, and if it persists, I'm worried Sherlock Holmes might get to solve this with help from Thomas Bayes.

Bayesian inference is a useful tool in statistics to help manage uncertainties. Mathematically, Bayes' theorem gives the relationship between the probabilities of events A and B, P(A) and P(B), and the conditional probabilities of A given B and B given A, P(A|B) and P(B|A). In its most common form, it is:

P(A|B) = P(B|A) × P(A) / P(B)

However, in real life, Bayesian inference can get 'belief shattering' when Holmes begins to use it. In an investigation involving uncertainties and humans (like the missing MH370), without Holmes in the picture, the Bayesian way of solving it would be to find the probability of the underlying event occurring, given that the people involved are innocent. The premise, or the prior knowledge held, is simple: the people involved are innocent to start with.

However, Bayesian inference as practiced by Holmes is different. Holmes believes that when one has eliminated the impossible, whatever remains, however improbable, must be the truth. Holmes tries to solve the problem by attempting to calculate the probability of one's innocence given all the evidence. The premise, or the prior knowledge, that Holmes focuses on is the probability of such evidence arising at all in earlier occurrences involving innocent people.
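
To see how different the two directions of reasoning can be, here is a small worked example with entirely hypothetical numbers: suppose only 1 in 1,000 people on board would act maliciously, i.e. P(guilty) = 0.001, and suppose a piece of evidence E shows up for 95% of guilty people but also for 5% of innocent ones. Then:

```
P(guilty|E) = P(E|guilty) × P(guilty) / P(E)
            = (0.95 × 0.001) / (0.95 × 0.001 + 0.05 × 0.999)
            ≈ 0.019
```

Even 'strong' evidence leaves under a 2% probability of guilt here, because the prior of innocence is overwhelming; reading P(E|innocent) = 0.05 as if it were P(innocent|E) would be a grave error.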

With the search entering the 11th day, I’m trying hard to believe in a miracle. I hope the entire episode gets explained through facts very soon. I hope Holmes does not have to go far into theorizing this mystery.

That Counterintuitive Moment

I explored the relation between data and intuition in my previous blog. Here, I ponder the concept of counterintuition and the importance of perseverance in a pursuit. I think a discovery in science is that unique moment when one's intuition intersects with a phenomenon, which subsequently gets rationalized either mathematically or through an experiment. Some of those discoveries happen to be counterintuitive too, i.e. they tend to defy (sometimes completely reversing) a previously held intuition, only to evolve into a new intuition providing us with that extra bit of knowledge. In the subsequent paragraphs, I recall a few instances that I think were/are counterintuitive.

Most of us are aware of the flat world theory that prevailed prior to the 5th century BC. When proposed, the concept of a spherical world was difficult for many to grasp. Even the scholars of that time were unable to accept it until they began to see empirical evidence in its favor. I think moving to the spherical world would have been a counterintuitive moment for those who had embraced the flat world theory completely.

The second one is Heisenberg's uncertainty principle. This principle asserts a fundamental limit to the precision with which certain pairs of physical properties (such as position and momentum) of a particle can be measured. This can be demonstrated through an experiment wherein a laser beam is made to pass through a slit (of variable width) and projected on a screen. The image shrinks as we narrow the slit, which is intuitive, but beyond a specific point (when the slit is narrowed to less than a 100th of a cm), the behavior reverses – the image actually begins to widen as we narrow the slit further. I think for someone who has been exposed to classical physics alone, this behavior could sound counterintuitive.
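
A back-of-the-envelope way to see why the reversal happens (a sketch using the uncertainty relation directly, not a full diffraction treatment, with d denoting the slit width):

```
Δx · Δp ≥ ħ / 2          (the uncertainty relation)

A slit of width d pins the photon's transverse position down to
Δx ≈ d, which forces a transverse momentum spread of at least

Δp ≥ ħ / (2d)

so the beam's angular spread θ ≈ Δp / p grows as d shrinks:
narrowing the slit past a point widens the image on the screen.
```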

My last example is related to the game of cricket. For a long time, the fast bowlers of the game were used to the intuition of conventional swing: a cricket ball, when bowled fast, swings in the direction of the side that is relatively rougher. For decades, fast bowlers exploited this by polishing only one side of the ball over the course of the game to maximize the possibility of a swing, troubling the batsman. The intuition was completely justified and aerodynamically rationalized, until reverse swing was observed a few decades back. Reverse swing occurs when the cricket ball is bowled even faster (say 80 mph). At this critical speed, the intuition is defied and the ball swings in the direction of the polished side instead. Understandably, this counterintuitive behavior took some time to settle in the cricketing world.

In all the above cases, I sense someone's (the first observer of each phenomenon) relentless pursuit to evolve an intuition. I happened to stumble upon an interesting perspective by Richard Feynman on what physics is all about. Feynman compares physics to a grand game of chess played by God, with humans trying to interpret its rules by watching just a few moves here and there. All of a sudden, the pursuit of intuiting appears more important than the act of intuiting itself.

Data & Intuition

A few years back, my friend explained how he understood the word 'intuition'. Per him, the word possibly derives from 'in-tutor', referring to the 'tutor' or the 'guide' inside each one of us. If that were true, we reflected, why do we find it difficult to follow our inner tutor while making important decisions? I think the reason we get jittery about following our own intuition is that intuition is non-rational. What Mahatma Gandhi calls the 'inner voice' in his autobiography is, I believe, intuition. Albert Einstein calls the intuitive mind a sacred gift that needs to be honored. Per Steve Jobs, intuition is the most important thing, even more important than intellect.

While intuition is instinctive, the other way to decide is through reason, which happens primarily by analyzing data or the information we have at hand. Many business leaders have stressed the importance of data-driven decisions as well. A quote attributed to N.R. Narayana Murthy goes: 'In God we trust; everyone else must come with data.' I recently read an article about Marissa Mayer's decision to ban Yahoo employees from working from home. The article cites Mayer's obsession with data and metrics to help her decide.

While both forms of decision making (intuitive and data-driven) look absolutely pertinent in our daily lives, this blog is an attempt to resolve a personal dilemma: what should prevail when one is forced to choose? As a BI practitioner, I can fully appreciate the importance of data and metrics. I continue to see them add value in business and in my personal life. But I'm sure all of us have crossed those unique moments in life through gut, hunch or intuition. I also believe intuition is a more fundamental experience than inference.

I recently stumbled upon an email posted online by Andrew Mason. Many of us are aware of this charming goodbye email written by the founder of Groupon. In his letter, he shares an important piece of wisdom. His regret, in his own words: 'My biggest regrets are the moments that I let a lack of data override my intuition on what's best for our customers.' To me, his statement captures the importance of both intuition and data. As I think of it, every data-driven scientific decision that we experience is someone's intuition in the form of a mathematical model, an algorithm, etc. I think one should strive to decide by intuition. Technology solutions built around data and metrics can serve two purposes here: (1) to execute those decisions, or (2) to help the decision maker stay clear of any doubt he/she might have during the process. I'll sign off with another quote from Mahatma Gandhi – 'You must try to listen to the inner voice, but if you will not have the expression "inner voice", you may use the expression "dictates of reason", which you should obey.'

The Chef & the Cook

Please read him as him/her.

What is the quality or skill that defines a chef, especially when compared to a cook? While the distinction is fairly blurred, a few definitions came close to my intuition as I researched on the Internet. One of them went like this: while a cook is more of an 'executor', a chef is a 'conceptor'. Another said that while a cook can make different kinds of food, a chef could have his own personal recipe. In essence, a chef is likely to be far more technically skilled, one who is capable of building his own secret sauce. From a market perspective, I tend to think the skills of a cook can get commoditized over a period of time, but not those of a chef. I also like to see the cook/chef setup as a metaphor for professions in other industries. Any thriving industry will constantly evolve to commoditize the skills of its cooks, and at moments will also get ambitious enough to attempt to commoditize those of its chefs. No matter how many times that commoditization happens, the same industry is likely to witness the evolution of its chefs in a different form at a later point in time.

This is not uncommon in the computing industry either. To 'commoditize' in this context is for the industry to automate a programmer's task or provide it as a feature of the technology infrastructure itself. Attempting to 'commoditize the skill of its chefs' happens when a programmer alone can provide the optimal implementation (in many cases) of the feature being considered. Here I'll go through two instances I have observed where the industry has offered to commoditize the skills of its software chefs. The first one began four decades back, and the second is happening at the moment.

Software programmers no longer do many of the tasks they used to do long back. For instance, they no longer manage memory themselves (garbage collection does it for them); they no longer explicitly type cast (in many cases); and so on. While many of these changes have made the programmer more productive, some are yet to. Codd, in his disruptive 1970 paper, articulated the role of the query optimizer in the paradigm of the relational database: the optimizer arrives at a 'least cost' program to retrieve the underlying data and prepare the final result set. In the same paper, Codd did agree that building such an optimizer is a difficult design problem. Although most commercial databases have an optimizer built in, I think this task is still done best by the programmer. Gartner's 2012 Magic Quadrant for data warehouse databases cites performance issues with complex queries as a problem noticed by its clients even with the market leaders – and this is 40 years after the RDBMS was born. As long as the cardinality and the correlation of the data (of the underlying subject area) continue to evolve – something I believe is inevitable – we will continue to solve this problem.
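
To make the optimizer's dilemma concrete, here is a minimal sketch in Python. The cost formulas, statistics and numbers are all made up for illustration (real optimizers use far richer models), but the failure mode is the same: the 'least cost' plan is least cost only with respect to the statistics it was given.

```python
# A toy cost-based optimizer: pick the cheaper of two access paths
# using (possibly stale) cardinality statistics.

def full_scan_cost(table_rows: int) -> float:
    # Assume a full scan reads every row sequentially, at unit cost per row.
    return table_rows * 1.0

def index_scan_cost(table_rows: int, selectivity: float) -> float:
    # Assume an index scan pays a per-row random-I/O penalty
    # on the rows that match the predicate.
    return table_rows * selectivity * 4.0

def choose_plan(table_rows: int, estimated_selectivity: float) -> str:
    full = full_scan_cost(table_rows)
    index = index_scan_cost(table_rows, estimated_selectivity)
    return "index scan" if index < full else "full scan"

# Statistics say 1% of rows match, so the optimizer picks the index...
print(choose_plan(table_rows=1_000_000, estimated_selectivity=0.01))  # index scan

# ...but the data has since skewed: 60% of rows now match, and the
# chosen plan is 2.4x more expensive than a plain full scan.
print(index_scan_cost(1_000_000, 0.60) / full_scan_cost(1_000_000))   # 2.4
```

Once the data's cardinality drifts away from the collected statistics, the programmer who knows the data can still beat the built-in optimizer, which is the point of the paragraph above.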

The second instance is the emergence of various analytical platforms and their claims to make analytics easier and more accessible. While I'll not delve deep, this appears even more ambitious than the above, especially since analytics is all about imagination and intuition.

Overriding thoughts & the inevitable simplicity

Years ago, there was a time when I was worried about the future and, for a brief period, had lost my ability to stay happy. Looking at my state, my friend advised that every thought is our own creation and that we have absolute control over them, implying happiness is my own choice. Recently, another friend of mine shared his intuition that material existence involves navigating one's collection of nested thoughts. One thing appears clear: thoughts are inevitable as we get through this life.

While we can think at will, i.e. while we can choose our next thought, our choice is bound by what we remember. We won't know what we have forgotten at the moment, right? And I am sure all of us forget something all the time. Also, once in a while, all of us experience something 'out of the blue' or through instinct, i.e. those 'out of context' thoughts. This makes me think, and even conclude, that the arrival of the next thought is elusive, something over which we will never have absolute control. Now, that makes me nervous – the lack of control over our own thoughts, which eventually drive our actions. Yet an appropriate subsequent thought holds importance, especially during moments of uncertainty. So how do we handle this paradox?

While we cannot control, I think we can influence. Influencing happens through proactively conditioning the mind with the thoughts we would like to experience. Through such conditioning, we improve the probability of experiencing the right thought at the appropriate time. Now, if we decide to condition, what could be our selection of thoughts? Thoughts that could come in handy and help us decide during uncertain moments would be one good choice – especially those driven by 'will', which override the convenient options in favor of the not-so-convenient ones for a deeper purpose. I'll call these crucial thoughts 'overriding' thoughts. As one embarks on such a conditioning pursuit, he/she will most likely run into a constraint: such overriding thoughts can only be those that are easy to remember. To be remembered, a thought has to be simple and fundamental, one that finds frequent applicability. I have an intuition that such thoughts can be nothing other than fundamental qualities (both positive and negative) around decency, discipline, spontaneity, etc. It does appear that simplicity (in both thought and action) is inevitable in any pursuit – only that the pursuit has to be genuine.

CAP Theorem: Wearing the right design hat

In 2000, Eric Brewer articulated his conjecture that of the three desirable properties of a distributed system, one has to choose two. No system can be designed that gives all three at any point in time, and with a formal proof in 2002, this was established as the CAP theorem. The three desirable properties are Consistency (C), Availability (A) and Partition tolerance (P). The three possible systems that can be designed are commonly referred to as CA (a system that is not partition tolerant), AP (a system that might forgo consistency to stay available) and CP (a system that might forgo availability to maintain consistency). While the theorem is simple and intuitive, it can also get confusing, especially when it is discussed alongside a platform or a database system. This blog is my attempt to get a handle on the CAP intuition.

As noted by Eric himself, one can get confused when relating the 'C' of CAP to the 'C' of ACID (the key properties that define most relational databases). While both C's refer to consistency, the C in CAP is a strict subset of the C in ACID. The CAP 'C' refers only to single-copy consistency of a data element, whereas the ACID 'C' includes much more (like a guarantee of unique keys, etc.). The other source of confusion is the way 'availability' in CAP can be construed. Per the CAP intuition, a CP system is not to be construed as a system that is unavailable. It is actually a system that forfeits availability when in doubt, i.e. a portion of the system is technically up and receives the request but chooses not to respond because a complementing portion of the system is unreachable.

In the subsequent paragraphs, I'll disassociate the CAP intuition from the platform and intuit it from a design perspective. Let us try applying CAP in the context of an online retailer. Every online retailer will most likely have two key functionalities: (1) add an item to a cart/shopping bag and (2) check the availability of the item in the inventory. Let us consider different ways of designing this system.

Imagine a retailer who is obsessed with customer satisfaction and does not want to disappoint its customers at any point. This retailer considers it a breach of contract to allow a customer to add an item to the cart that has not been verified for inventory availability. The retailer thus chooses to tightly couple the inventory check to its 'add to cart' functionality. Such a monolithic implementation means the customer cannot add an item to his/her online cart that could not be verified for availability. Such an implementation is a CA system.

Now consider another retailer who is proud of its strong supply chain. This retailer also does an inventory check before the 'add to cart' act, but prefers to let the customer add the item and proceed to online checkout even when the availability of the item cannot be verified at that moment. This retailer is confident it will find the right supplier and meet the SLA for the items it promotes in its store. This retailer too implements the inventory check as part of the 'add to cart' feature, but the implementation is not monolithic like the previous one: 'add to cart' stays available to the customer even when the 'inventory check' is down for a brief duration. Most importantly, the entire implementation is kept transparent to the end customer; the customer adds such an item and even completes the order. Such an implementation is an AP system.

Now consider a third retailer who opts to take a middle path. This retailer also checks inventory availability at the time of 'add to cart'. In case the 'inventory check' service is down, instead of silently continuing (like the second retailer), this retailer opts to communicate the status to the customer and lets him/her decide the next step. Unlike the previous implementation, before taking the order, the customer is explicitly told about the pending 'inventory check' and a possible delay in shipment. Such an implementation is a CP system.
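
To make the three designs concrete, here is a minimal sketch in Python. The class and function names are my own invention, and the inventory service is stubbed to be always unreachable so the differing behaviors show:

```python
# Three hypothetical 'add to cart' designs under an inventory-service outage.

class InventoryUnreachable(Exception):
    """Raised when the inventory service cannot be contacted."""

def check_inventory(item: str) -> bool:
    # Stub: in this sketch the inventory service is always down.
    raise InventoryUnreachable(item)

def add_to_cart_ca(cart: list, item: str) -> str:
    # CA: monolithic coupling; if the check cannot run, the whole action fails.
    if check_inventory(item):          # raises -> the request errors out
        cart.append(item)
    return "added"

def add_to_cart_ap(cart: list, item: str) -> str:
    # AP: stay available; accept the item and reconcile inventory later,
    # keeping the outage transparent to the customer.
    try:
        check_inventory(item)
    except InventoryUnreachable:
        pass                           # proceed anyway
    cart.append(item)
    return "added"

def add_to_cart_cp(cart: list, item: str) -> str:
    # CP: refuse to guess; surface the uncertainty and let the customer decide.
    try:
        check_inventory(item)
    except InventoryUnreachable:
        return "inventory unverified: confirm to proceed, shipment may be delayed"
    cart.append(item)
    return "added"

cart: list = []
try:
    add_to_cart_ca(cart, "book")
except InventoryUnreachable:
    print("CA: request failed outright")
print("AP:", add_to_cart_ap(cart, "book"))
print("CP:", add_to_cart_cp(cart, "book"))
```

The CAP choice here is not a property of the database underneath; it is the design decision about what 'add to cart' should do when the inventory check is unreachable.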

Mathematics: A discovery or an invention?

Please read him as him/her.

I believe the primary responsibility of a BI (Business Intelligence) practitioner is to comprehend and quantify a phenomenon through data and make it accessible for decision-making. Understandably, numbers form an important part of his life. Numbers, and more fundamentally mathematics, have long intrigued me. Is mathematics a product of human thought, or is it a reality that humans stumbled upon? While there have been many perspectives on this question, I'll cite those of two great physicists. In 1960, Eugene Wigner articulated the unreasonable effectiveness of mathematics in the natural sciences, reflecting the deep connection he observed between science and mathematics. Albert Einstein held the following opinion: 'As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.' While answering my own question is certainly beyond my capacity or imagination at the moment, this blog is my attempt to find those mathematical laws and crucial numbers (the ones I have come across) that have evolved out of reality. In a way, all of us experience these or use them for our decision-making. Again, these laws and numbers are purely empirical.

The number 7 is the magical number when it comes to decision-making. Originally conceived by cognitive psychologist George A. Miller, this is referred to as Miller's law. Miller articulates that when it comes to humans processing information, the number of objects an average human can hold in working memory is 7 (the range is between 5 and 9).

The bell curve (the binomial and normal distributions) is an effective way to describe and imagine a random process. The empirical rule, or the 68-95-99.7 rule, states that in a normal distribution approximately 68% of the data falls within one standard deviation of the mean, 95% falls within two standard deviations, and within three standard deviations we should find almost all (99.7%) of the data. There are many contexts in which we can see this at work, or made to work. In his theory of Diffusion of Innovations, Everett Rogers places 34% of the community as the early majority and another 34% (68% in total) as the late majority. A more personal example could be its association with performance evaluations (see vitality curve) in an enterprise. Performance evaluations in many enterprises will most likely recognize no more than 30% of the group as high and low performers, placing the remaining 70% close to the mean, i.e. those who perform well enough to meet expectations.
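
A quick way to convince oneself of the 68-95-99.7 rule is to simulate it. A minimal sketch in Python (sample size and seed are arbitrary):

```python
import random

# Draw a large sample from a standard normal distribution and measure
# how much of it falls within 1, 2 and 3 standard deviations of the mean.
random.seed(42)
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]

for k in (1, 2, 3):
    within = sum(1 for x in samples if abs(x) <= k) / len(samples)
    print(f"within {k} sd: {within:.1%}")   # ~68.3%, ~95.4%, ~99.7%
```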

The 80-20 rule, or the Pareto principle, is commonly used to explain cause and effect. It was originally conceived by Pareto in 1906, when he observed that 80% of the land in Italy was owned by 20% of the population. We can observe this phenomenon in business as well, wherein 80% of the business is most likely to come from 20% of the clients.

Another concept, born out of breakthroughs in technology and communication, is 'six degrees of separation'. Per this concept, the social distance between any two random people in this world is approximately 6, i.e. through my friend I can connect to his friends, and when I go down this chain four more times, I am most likely to reach most of the people in this world. When we say 'it is a small world', this is exactly what we mean. Again, in an enterprise, there will most likely be no more than six levels of hierarchy between any employee and the ultimate decision maker (the CEO, maybe).
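
A rough plausibility check with a made-up branching factor: if each link in the chain contributes, say, 45 acquaintances not already counted, then six hops reach

```
45^6 = 8,303,765,625 ≈ 8.3 billion people
```

which is on the order of the world's population. The branching factor is hypothetical, but it shows why a chain as short as six can plausibly span the planet.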

The last one is Zipf's law, another curious and mysterious reality. Originally proposed by linguist George Kingsley Zipf, this law states that given a corpus of natural language, the frequency of any word is inversely proportional to its rank in the frequency table. For example, the most frequently used word, 'the', occurs almost twice as often as the next most frequent word, 'of'. It is mysterious because the same pattern is evident in the populations of the world's cities as well.
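
Zipf's law is also easy to probe with a few lines of code. A minimal sketch in Python – the inline text below is only a stand-in, and any sizeable corpus should be substituted to see the effect properly. If the law holds, rank × frequency stays roughly constant down the table:

```python
import re
from collections import Counter

# Count word frequencies and inspect rank * frequency, which Zipf's law
# predicts should be roughly constant. Replace `text` with a large corpus
# (a novel, a log file) to see the pattern emerge properly.
text = """
the quick brown fox jumps over the lazy dog and the dog barks at the fox
while the cat watches the dog and the fox from the top of the wall
""".lower()

freq = Counter(re.findall(r"[a-z']+", text))
for rank, (word, count) in enumerate(freq.most_common(10), start=1):
    print(f"{rank:>2}  {word:<8} {count:>4}  rank*freq = {rank * count}")
```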