Monday, April 12, 2010
Botanical art and data analysis-Huh?
Birgitta Volz is an unusual botanical artist who works in Auroville, India. She makes bark prints (by coloring the tree bark with organic natural color and taking direct impressions on paper from it), wood prints and plant prints. When I first laid eyes on her work, the analyst in me stood in awe. The beauty of the art lay in its clean, clear ink capture of the plant, and the finished piece clearly brought out the thought, the process, the work and the story behind it.
When I look at an analysis, what I usually see is a story, sometimes well put together and at other times all over the place. What is usually missing is the creative aspect, the something I'm always looking for: it must make me feel!
What, then, is the art in the science, you may ask? Here is my version:
1. Has the doer really visualised the details of her work in her mind's eye? It could be the story, the charts, the main points, the 'how to tell it' or the statistical work.
2. Has she followed a process (sometimes structured, sometimes abstract) to arrive at the end?
3. Has she worked smart and hard? The fleshing out of the story with the right numbers and analysis is the key.
4. Does the overall analysis flow well? Is there attention to detail?
5. Has she stuck with the smaller picture and its detail, the larger picture with less detail, or both?
6. Does the final product reflect all that has gone in? Does it go where it intended to at the start?
Unfortunately, some of the best analyses or numerical work are missing the above. Most of them leave me cold, not because the work is not done but because 'the something extra' is missing. That something extra is not the 'art of creative storytelling' but rather the passion of number crunching that brings out a good story.
For all those of you not yet convinced that data analysts should learn about the 'art' of their work from painters, here are some examples of the art I call science. They tell the story of plants.
Chameli Ramachandran
ASBA(The American Society Of Botanical Artists) member gallery
Wow!
Back to the number writing
I will be posting regularly from now on so keep reading.
Wednesday, June 17, 2009
To do SEM or not to do SEM, that is the question
I like structural equation modeling (SEM) for its ability to model the rhythm, flow and intricacies of relationships in real data, BUT (and here's the hard part) I would not recommend using it unless a strict set of conditions for its use is met.
For the uninitiated, here is what structural equation models do (a toy simulation of the kind of structure they handle follows the list):
- They are a step up from regression models and allow you to incorporate multiple independent and dependent variables in the analysis.
- The dependent and independent variables may be latent constructs formed by observed variables.
- They allow testing of specified relationships among the observed and latent variables, effectively testing hypothesized theoretical frameworks.
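To make the idea concrete, here is a minimal toy sketch (my own illustration, not tied to any particular study) of the kind of data structure SEM is built for: two latent constructs, each measured by three observed indicators, with a structural path from one to the other. The construct names and loadings below are made up; packages such as R's lavaan or Python's semopy estimate models specified in roughly the syntax shown in the string.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500  # hypothetical sample size

# Latent construct 1 ("satisfaction") with a structural path to construct 2 ("loyalty")
satisfaction = rng.normal(0, 1, n)
loyalty = 0.6 * satisfaction + rng.normal(0, 0.8, n)  # structural relationship

def indicators(latent, loadings, noise_sd=0.6):
    """Observed items as noisy reflections of a latent construct (measurement model)."""
    return np.column_stack([l * latent + rng.normal(0, noise_sd, len(latent))
                            for l in loadings])

# Each construct is measured by three observed items
sat_items = indicators(satisfaction, [0.9, 0.8, 0.7])
loy_items = indicators(loyalty, [0.85, 0.8, 0.75])

# An SEM package would be handed the six observed items plus a specification
# like this (lavaan-style) string, and would estimate the loadings and the
# structural path simultaneously:
model_spec = """
satisfaction =~ s1 + s2 + s3
loyalty      =~ l1 + l2 + l3
loyalty       ~ satisfaction
"""

# Sanity check: the simulated items of the two constructs are indeed related
print(np.corrcoef(sat_items.mean(axis=1), loy_items.mean(axis=1))[0, 1])
```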
Here is why SEM completely falls apart in the hands of unskilled (or even somewhat skilled) or 'software trigger happy' practitioners; a couple of the more mechanical checks are sketched in code after the list:
- Use of SEM in situations where the measurement structure underlying a set of items is not well established and there is no sound theoretical framework available for possible patterns of relationships among constructs
- Having too many single indicator constructs
- Items loading on more than one construct
- Low sample sizes relative to number of parameters to be estimated
- Failure to address issues such as outliers or non-normality of variables
- Use of too few measures for assessment of fit of model or use of measures of fit that do not address sample size biases
- Building models that are too complex
- Lack of use of measures of reliability to assess the quality of construct measurement
- Little attention given to variance fit measures in the structural model
- No specification and testing of alternative nested models
- Using modification indices and residual analysis too liberally to re-specify the model
- No cross-validation of the model
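At least two of these pitfalls, low sample size relative to the number of parameters and weak construct reliability, can be screened for before any SEM software is opened. Here is a rough sketch of such pre-flight checks; the data are simulated, and the 10-observations-per-parameter rule and the 0.7 alpha threshold are common rules of thumb rather than hard requirements.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of items assumed to measure one construct."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

def enough_sample(n_obs: int, n_params: int, ratio: float = 10.0) -> bool:
    """Rule-of-thumb check: at least `ratio` observations per estimated parameter."""
    return n_obs >= ratio * n_params

# Hypothetical data: 300 respondents answering three items for one construct
rng = np.random.default_rng(0)
latent = rng.normal(size=300)
df = pd.DataFrame({f"item{i}": 0.8 * latent + rng.normal(scale=0.6, size=300)
                   for i in range(1, 4)})

print(f"Cronbach's alpha: {cronbach_alpha(df):.2f}")           # want roughly > 0.7
print("Enough sample for ~25 parameters?", enough_sample(len(df), 25))
```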
Tuesday, June 9, 2009
When numbers just don't add up
I read about last month's row between the Fox network and Nielsen over American Idol's low ratings with some amusement. It came hot on the heels of my sister's comment that networks really need to deepen their understanding of what is going on with show ratings, given how much is riding on them. She works with a large network and is responsible for a couple of its big reality shows; she voiced the same angst as Fox's CEO Tony Vinciquerra. The frustration echoed by the networks is something I have heard time and time again, usually when the 'unexplainable' happens. I've also been privy to the war of methodology between two TV rating providers in the Indian market (they finally merged through acquisitions).
Personally, having been on the agency’s end for years now, I must take Nielsen’s side for no better reason than to talk about how 'glitches' in analysis are looked at by agencies and why this may just be a case of bad communication on both sides along with a product recalibration issue.
Analytical solutions provided by agencies, whether as products or customized research, are usually well accepted by their clients until something goes wrong. The chaos that ensues is greater with established research and products than with new customized solutions. Usually the agency is the first to catch the 'glitch', and after the dreaded news is broken to the client, many a night is spent arriving at an internal explanation and fixing it or accounting for it.
The 'what went wrong' analysis usually centers on three areas at the agency's end:
- Quality control (human error is the first thing that is checked)
- ‘The non normal event’ that could have caused the error
- Methodology-usually sample design and representation
While the first two are the easier items to find and fix or explain, it’s the third that is usually more troubling to discover and to correct.
On the Nielsen rating issue with Fox, the first thing the agency would probably do is check whether the reporting of the Idol ratings was correct. For this they would look at the generation of the report itself, or sample various breaks to re-tally results.
If this does not throw up anything, they would then search for other events that could explain what happened. For this, they would use information already available through other studies or look at past issues that pointed to the problem. In Nielsen's case, my conjecture is that one of the hypotheses generated would have been non-compliance in homes, i.e. people using their meters incorrectly. This insight, which Nielsen stated came from a separate study, would form one of the many hypotheses the agency would explore. I'm not sure if this was the only reason for the low ratings communicated to the networks, but my feeling is that it was the one picked up and blown apart. Nielsen, for their part, is right to contend that the 8% discrepancy they cited is not a number that can trickle down to a network show's rating, since it was garnered in an entirely different study done for different reasons. But I doubt the networks are listening.
With so much money on the line, the way the client sees it is 'he'd better have answers or someone will pay'. In all fairness, if I were the client I would feel the same way: enraged, puzzled and frustrated. The problem is that 'fixing the issue fast' may not be easy or appropriate unless a real causal link is established between the non-compliant households and the low ratings. To do this, the agency will have to test the hypothesis comprehensively, arrive at an x% figure for non-compliance or under-reporting, and then link it to the low ratings across disputed and undisputed shows. Not a simple task.
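To give a flavor of what establishing that link involves, here is a toy sketch of the first step: comparing the viewing contributed by compliant and non-compliant panel homes and asking whether the gap is both real and large enough to move the overall number. All the figures below are invented for illustration; real panel data would involve weighting and far more care.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical viewing contributions (rating points) from panel homes
compliant = rng.normal(loc=1.00, scale=0.30, size=800)     # homes using meters correctly
noncompliant = rng.normal(loc=0.85, scale=0.35, size=70)   # homes flagged for misuse

# Welch's t-test: do non-compliant homes really contribute lower ratings?
t_stat, p_value = stats.ttest_ind(compliant, noncompliant, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Even a significant gap must be big enough to explain the drop in the overall rating
share_noncompliant = len(noncompliant) / (len(compliant) + len(noncompliant))
impact = share_noncompliant * (compliant.mean() - noncompliant.mean())
print(f"Approximate drag on the overall average: {impact:.3f} rating points")
```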
This brings me to the third point, the one most agencies dread: questioning their design if all else fails. Since established research and products go through a thinking and creation phase, casting doubt on the design or revamping it is not the preferred response to most outages. The need to re-look at methodology only arises if the issue raised is not resolved and if the agency comes to believe that a change in methodology would fundamentally benefit clients. Methodological issues, especially those of sample selection, type of sampling, weighting, margin-of-error reporting and power calculations, still leave room for improvement in large studies like TV ratings and deserve another blog entry. Nielsen may very well be investigating the first two possible reasons for the data error and learning from them.
The Fox CEO's feeling that things are very unclear is one that a lot of clients echo when issues arise with research results. Some pointers that may help clients manage these situations better in future:
- Understand the various elements of the study design, especially sampling (its type and representativeness) and weighting. Ask the agency for a pros-and-cons analysis of the design in place or proposed.
- Evaluate the margin of error in reporting, not just at an overall level but for the segments that matter, and understand what it means at ground level (see the sketch after this list).
- Invest in a statistician (or several) or another analytics agency at your end (if you don't have one already). Their job should be to slice and dice the numbers, give you more insights and raise pertinent questions while working with the agency.
- Calibrate results received from the agency against other findings, internal or external. This is harder in a monopolistic situation, but past data and parallel studies should guide you.
- When data issues arise, work with the agency to fix them; once they are fixed, test the new situation. Play devil's advocate; don't rest after things stabilize. Try to get the agency to establish a cause-and-effect analysis for the data blip, controlling for other factors.
- Question, Question, Question-it keeps the agency on its toes and helps preempt disasters.
- Ask for the quality control process employed by the agency in its data collection and reporting. Review it, check for loopholes and facilitate correction.
- Be kind (unless the agency is a repeat offender). Recognize that in the data analytics game, agencies usually try and give you the best that they’ve got. Data outages are also frustrating and traumatic for them.
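On the margin-of-error point above, the arithmetic is simple but sobering: a margin that looks fine for the overall panel balloons for the segments a network actually trades on. A quick sketch, with hypothetical sample sizes and the standard formula for a proportion from a simple random sample:

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """95% margin of error for a proportion estimated from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical panel: 10,000 homes overall, far fewer in any one reporting segment
for label, n in [("Overall panel", 10000),
                 ("Women 18-34", 900),
                 ("Women 18-34, one metro", 120)]:
    print(f"{label:25s} n = {n:6d}   MOE = +/- {margin_of_error(n):.1%}")
```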
Monday, April 13, 2009
Poor randomized testing-why a rose by any other name does not smell as sweet?
In my experience, randomized tests fail for two broad reasons:
- Lack of rigor in the design
- Execution of a half-hearted test to show evidence
The lack of rigor in the test design creeps in in many ways:
- Small sample sizes (not adequate to yield statistically valid results). Clients usually cite cost as the reason, but a large margin of error in the results makes the test a no-go right from the start. This applies not just to the overall sample size but also to the sample sizes of the breakouts at which the data needs to be analyzed and reported (a quick power calculation, sketched after this list, makes the point concrete).
- Inadequate matching of test and control groups. Not enough analysis goes into making the test and control groups comparable, so results cannot be attributed to the new stimulus because of confounding factors. The rush to start the experiment is another reason for this lack of fit between test and control.
- Wrong number of cells in the design. Complex designs, usually factorial, exist that reduce the number of cells needed without compromising reads on the data, yet simpler, less adequate designs continue to be used. While I like the idea of simple models explaining complex phenomena, that should not deter the use of more complex designs for complex real-world scenarios.
- Too short a testing period. In a rush to complete the test and convey results, clients don't give the test the time it needs to generate stable metrics (especially if those metrics have high variance).
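On the sample-size point in the first item, a power calculation done before the test launches tells you whether the design can detect the lift you care about at all. Here is a sketch using statsmodels; the 2.0% baseline response rate, the 2.5% target and the 80% power are hypothetical choices for illustration.

```python
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

# Hypothetical scenario: baseline response rate of 2.0%, and we care about a lift to 2.5%
p_control, p_test = 0.020, 0.025
effect_size = proportion_effectsize(p_test, p_control)  # Cohen's h

# Sample size per cell needed to detect that lift with 80% power at alpha = 0.05
n_per_cell = NormalIndPower().solve_power(effect_size=effect_size,
                                          alpha=0.05, power=0.80, ratio=1.0)
print(f"Required sample per cell: {n_per_cell:,.0f}")

# If the regional breakout you must report on gets only a tenth of that sample,
# the regional read is a no-go before the test even starts.
```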
Since most marketers recognize the need for a 'test-learn-roll out' approach, the second reason why randomized tests fail is harder to understand. There seems to exist a 'need to test' to show evidence of 'having tested' and the results from such tests are couched in scientific jargon with liberal extrapolations. Initiative roll out decisions are made on the basis of these tests with numerous rationalizations, for example:
- The results pan out for some regions, so they will work at a national level
- The results are positive even though the margins of error are large; with a big enough sample things will be fine
Here is my advice for marketers -
DON'T TEST if a new approach cannot be tested (for whatever reasons, some of them valid). Use a think tank of key executives to do a SWOT analysis and go with their final call.
DON'T TEST if you don't want to test due to a lack of belief in testing or a disinclination to test with rigor. Roll out the new product without testing and be ready to explain to the boss if the initiative fails. Something that looks and feels like a test is not a test.
BUT...
DO TEST if you-
- Want to find out what really works and put your hypothesis under a rigorous scanner.
- Want to optimize the money you put behind a new product or idea before pushing it to customers (who may be unwilling to accept it).
- Want to learn and apply and not make the same mistakes twice.
Saturday, April 11, 2009
Trends: Recommendations-Tell me what else I should buy and do it well
Here are three scenarios that address the power of recommendations and how they can work for consumers and marketers-
Scene 1: I log in to Amazon and search for the book 'Predictably Irrational', their recommendation algorithm tells me the other books that customers who bought this book have also bought i.e. 'Sway', 'Outliers', 'The Drunkard's Walk', 'The Black Swan', 'The Wisdom of Crowds' and many more. Sometimes the recommendations are interesting enough for me to look through them and I end up buying more books than I budgeted for.
Scene 2: I enter Debenhams, the UK department store, with my son in tow for a quick buy to wear at an anniversary lunch. I am in a huge rush, so getting it right quickly is key. I show the shop assistant the style I am looking for and she promptly picks up three of the same kind and hands them to me. While I try them on, she comes back with some more tops that match my style. She tells me what a deal I would be getting on them: Betty Jackson designs at 70% reduction, that's a steal! Well, you guessed it, I buy three tops and a pair of shoes and walk out happy and satisfied after thanking her personally.
Scene 3: I call Nirulas for a home delivery order and ask for my favorite item on the menu, their hot chocolate fudge (HCF for short). For those new to this homegrown north Indian brand: they have the best hot chocolate fudge in the world. Well, before I can say 'some extra nuts and chocolate please', the order taker tells me that if I add extra nuts and chocolate, they will charge me Rs 17 extra for each. While the consumer in me is chagrined at having to cough up money for something I got for free for years, the data analyst in me realizes someone has been analyzing the orders and pricing better.
Recommendations make sense to us because they help us sift through piles of information and focus quickly on what will maximize our buying experience i.e. finding relevant, new and interesting things. However for them to work, the underlying assumptions must hold:
- They must come from a deemed 'trusted' source whose judgment we value
- They must hit our sweet spot in terms of experience
- They must be consistent and thus build trust
How does this translate at ground level? With data on purchases being recorded both offline and online, I envisage very soon walking into a store (physical or online) and being told not just what I should buy based on my taste but what else I should be looking at. While a lot of e-commerce websites already offer this personalized shopping experience through crude and sophisticated variants of recommendation algorithms, recommendations that genuinely fit individual customer preferences still have a long way to go.
Consumers inundated with choice want good subsets of that choice, but within the context of 'what they like or would like'. Marketers want to offer the consumer products that have a higher probability of being bought. Looking at past purchase data or user ratings of items attempts to marry the need of the customer with that of the marketer. The problem lies in how to read and interpret what the customer is looking for. My experience has been that the answer is tied to satisfaction and loyalty. If the customer comes back for more and increases his burn rate over time, then what you are recommending is working; if not, there is scope for improvement in the recommendation algorithm. Testing which recommendations worked can help fine-tune what did not. Analyzing customers who picked up recommended items versus those who did not, for a particular purchased product, may also lend insights into what is going on (a toy version of the 'customers also bought' logic is sketched below).
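At its crudest, the 'customers who bought this also bought' logic of Scene 1 is just co-occurrence counting over past orders. A toy sketch follows; the baskets are made up, and a production system would normalize for item popularity and do far more.

```python
from collections import Counter, defaultdict

# Hypothetical purchase baskets
orders = [
    {"Predictably Irrational", "Sway", "The Black Swan"},
    {"Predictably Irrational", "Outliers"},
    {"Predictably Irrational", "Sway", "The Drunkard's Walk"},
    {"Outliers", "The Wisdom of Crowds"},
]

# Count how often every other item appears in baskets containing a given item
co_counts = defaultdict(Counter)
for basket in orders:
    for item in basket:
        for other in basket - {item}:
            co_counts[item][other] += 1

def also_bought(item, top_n=3):
    """Most frequently co-purchased items for a given seed item."""
    return co_counts[item].most_common(top_n)

print(also_bought("Predictably Irrational"))
# e.g. [('Sway', 2), ('Outliers', 1), ('The Black Swan', 1)]
```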
An interesting article by Anand V. Bodapati titled "Recommendation Systems with Purchase Data" in the Journal of Marketing Research Vol XLV Feb 2008, talks about why recommendation decisions should be based not on historical purchase probabilities but on the elasticity of purchase probabilities to the action taken on the recommendation.
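As I read it, the paper's elasticity argument is close in spirit to ranking items by how much the recommendation changes the purchase probability rather than by the probability itself. A toy illustration with invented numbers (this is the general idea, not the paper's model):

```python
# Hypothetical per-item purchase probabilities, with and without the recommendation
items = {
    #                 (P(buy | recommended), P(buy | not recommended))
    "Bestseller A":   (0.30, 0.28),  # would sell anyway: recommending adds little
    "Niche title B":  (0.12, 0.04),  # recommending triples the chance of purchase
    "New release C":  (0.08, 0.05),
}

# Ranking by raw purchase probability vs. by the lift the recommendation adds
by_probability = sorted(items, key=lambda k: items[k][0], reverse=True)
by_lift = sorted(items, key=lambda k: items[k][0] - items[k][1], reverse=True)

print("By purchase probability:  ", by_probability)  # Bestseller A comes first
print("By lift from recommending:", by_lift)         # Niche title B comes first
```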
How would I rank the suggestions given by the three companies based on my experience with them and would I go back for more?
- Debenhams: Bang on. I got what I wanted at a good price and looked at the right variety of relevant alternatives before making my choice (remember, time was an issue). Would definitely go back.
- Amazon: It's a hit or a miss and the list of suggestions is very long and not always worth the browse. They could do better but it's not bad. Would definitely go back.
- Nirulas: While I appreciate that someone recognized that the 'extras' needed to be paid for, I would like some suggestions like 'try our Jamoca almond fudge' or 'the Mango Tango is to die for'. They could do much, much better. Would definitely go back (it's a monopolistic situation: no other brand comes close on the HCF).
Tuesday, March 31, 2009
Quantifying creativity in developing and evaluating package design
I'm a product and graphic design buff, and as I sat drooling over Phaidon's wonderful book Area_2 on upcoming graphic designers, I wondered whether quantitative research could really pick out winners in this creative field. I am no expert, but I can tell instantly if I like or don't like what I see in a visual image. If I am undecided, I need to process the visual and then understand it before I take a call.
Yes, things work a lot differently when product packages are on the shelf and consumers are filtering the visual among others with heaps of information in their heads (brand affinity, the time they have, the size of package they buy, advertising awareness, frequency of buying and a lot more). But is it so hard to pick a winner quantitatively when it comes to package design, or do companies simply rely more on non-quantitative or flawed quantitative approaches to choose one?
I am a loyal Tropicana consumer, and I thought the change in package design for the brand smacked of 'not listening to consumers and not quantifying their voice in research'; why else would a design change that drastic (it makes the brand look ordinary) make it through research? If consumers called and e-mailed to express their feelings about the new design, where were these consumers when the design was tested? A well-done online test (among other tests), with the right samples of loyalists and other segments, would have saved PepsiCo a lot of grief.
What could have gone wrong in the research? Some hypotheses I generated about the consumers in the study:
- They were the wrong sample (it can happen)
- There were not enough of them, in number or in voice
- They gave wrong answers
- They favored the new design but had a violent reaction later when they saw it on the shelf and wanted the old packaging back (blame it on the recession)
- They were misinformed or did not understand the research
- They were not taken seriously about something as creative as packaging design
- They could not evaluate the new design clearly since it was a radical change from the original
- ...
What would a stronger process have looked like? One approach:
- Set goals and objectives for the new design using qualitative research.
- Communicate the objectives and vision for the new design clearly to package designers.
- Evaluate the initial rough designs through online testing. Identify the best four or five.
- Fine tune the best designs through quantitative research.
- Quantitatively test the best designs via various simulated tests (online or offline) to identify the winner.
- Go ahead with the winning design only if it emerges as a clear winner with respect to the control (keeping in mind the status quo bias in marketing research); a simple sketch of such a comparison follows this list.
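For that last step, 'clear winner with respect to the control' can be given teeth with a simple significance test on choice shares from the simulated shelf test. A sketch using a two-proportion z-test; the counts are hypothetical, and a real study would also weight by segment and look at loyalists separately.

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical simulated-shelf results: respondents who picked the product
# when shown the new design vs. the current (control) design
picked = [312, 268]   # new design, control design
shown = [1000, 1000]

# One-sided test: is the new design chosen significantly more often than the control?
z_stat, p_value = proportions_ztest(count=picked, nobs=shown, alternative='larger')
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")

# Only a convincing p-value AND a meaningful gap in shares (31.2% vs. 26.8% here)
# should justify replacing the design that loyal buyers already recognize.
```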
Online package design research tools are helping marketers evaluate and quantify how consumers will react to the creative aspects of the design. Package Design Magazine talks about three of these solutions.
Pure quantitative analysis of a creative process like package design is still viewed with skepticism among marketers. However, using the numbers to aid the creative process helps companies avoid big mistakes and lets designers work and create within a framework that echoes the consumer's needs and wants.
One loyal customer is happy Pepsi scrapped the new Tropicana package and brought back the old.