Monday, April 13, 2009

Poor randomized testing: why a rose by any other name does not smell as sweet

While rigorous testing of new ideas, offerings and approaches is the order of the day at companies like Capital One (my hero), Amazon, Google and Netflix, and at some retailers, direct marketers and pharmaceutical companies, at many others important decisions are still based on 'gut feel' and 'wrong evidence'. In spite of the availability of software, capability and adequate research in the area of randomized testing, most companies continue to flounder when it comes to executing a test.

The two main reasons why I have seen testing break down (in spite of good intentions and an adequate hypothesis) are:

  1. Lack of rigor in the design

  2. Execution of a half-hearted test to show evidence

The lack of rigor in the test design shows up in many ways:

  • Small sample sizes (not adequate to yield statistically valid results): Clients usually cite cost as the reason, but a large margin of error in the results makes the test a no-go right from the start. This applies not just to the overall sample size but also to the sample sizes for the breakouts at which data needs to be analyzed and reported.

  • Inadequate matching of test to control groups: Not enough analysis goes into matching the test and control groups, which should be closely comparable. As a result, differences in outcomes cannot be attributed to the new stimulus because confounding factors are present. The rush to start the experiment is another reason for this lack of fit between test and control.

  • Wrong number of cells in the design: While complex designs (usually factorial) exist that reduce the number of cells needed without compromising reads on the data, simpler, less adequate designs continue to be used. While I like the idea of simple models being able to explain complex phenomena, that should not be a deterrent to using more complex models for complex real-world scenarios.

  • A testing period that is too short: In a rush to complete the test and convey results, clients don't give the test the time it needs to generate stable metrics (especially if those metrics have a high variance).
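
To make the sample size point concrete, here is a rough sketch of how many customers each cell needs before a test can detect a given lift in a response rate. It uses the standard normal-approximation formula for comparing two proportions; the function name, the 5% significance level, the 80% power and the example response rates are my own assumptions for illustration, not figures from any client test.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_test, alpha=0.05, power=0.80):
    """Approximate customers needed in EACH group (test and control)
    to detect the difference between two response rates with a
    two-sided test, using the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # critical value for power
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    effect = p_test - p_control
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical example: a 2.0% baseline response rate and a hoped-for 2.5%.
n = sample_size_per_group(0.02, 0.025)
print(f"Customers needed per group: {n}")  # roughly 14,000
```

The same calculation has to hold for every breakout that will be reported (region, segment, channel), not just for the overall test, which is why the required sample grows so quickly once breakouts are added.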

Since most marketers recognize the need for a 'test-learn-roll out' approach, the second reason why randomized tests fail is harder to understand. There seems to exist a 'need to test' to show evidence of 'having tested', and the results from such tests are couched in scientific jargon with liberal extrapolations. Roll-out decisions are then made on the basis of these tests with numerous rationalizations, for example:

  • The results pan out for some regions, so they will work at a national level
  • The results are positive even though the margins of error are large; with a big enough sample things will be fine
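
The second rationalization is easy to check with a few lines of arithmetic. The sketch below computes a confidence interval for the lift (test response rate minus control response rate) using the normal approximation. The function name and the counts are hypothetical, chosen only to show how a 'positive' result can have an interval that comfortably includes zero.

```python
from math import sqrt
from statistics import NormalDist


def lift_confidence_interval(conv_control, n_control, conv_test, n_test,
                             confidence=0.95):
    """Confidence interval for (test rate - control rate),
    based on the normal approximation for two proportions."""
    p_c = conv_control / n_control
    p_t = conv_test / n_test
    std_err = sqrt(p_c * (1 - p_c) / n_control + p_t * (1 - p_t) / n_test)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p_t - p_c
    return diff - z * std_err, diff + z * std_err


# Hypothetical small test: 20 of 1,000 respond in control, 25 of 1,000 in test.
low, high = lift_confidence_interval(20, 1000, 25, 1000)
print(f"Small test: lift between {low:.4f} and {high:.4f}")

# Same observed rates with 100 times the sample in each group.
low_big, high_big = lift_confidence_interval(2000, 100000, 2500, 100000)
print(f"Large test: lift between {low_big:.4f} and {high_big:.4f}")
```

In the small test the interval straddles zero, so the data are consistent with no lift at all, or even a drop. The larger sample gives a tight interval only if the observed rates actually hold at that scale, and that is exactly what has not been established yet.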

Here is my advice for marketers:

DON'T TEST if a new approach cannot be tested (for whatever reasons, some of them valid). Use a think tank of key executives to do a SWOT analysis and go with their final call.

DON'T TEST if you don't want to test due to a lack of belief in testing or a disinclination to test with rigor. Roll out the new product without testing and be ready to explain to the boss if the initiative fails. Something that looks and feels like a test is not a test.


DO TEST if you:

  1. Want to find out what really works and put your hypothesis under a rigorous scanner.
  2. Want to optimize the money you put behind a new product or idea before pushing it to customers (who may be unwilling to accept it).
  3. Want to learn and apply and not make the same mistakes twice.
