Estimating Demand via Experiments (when They won't let you randomize price...)
The easiest way to get around draconian restrictions and backwards attitudes like "we shouldn't treat our customers like lab rats."
I’ve complained about the demonization of pricing experiments before, but alas, there’s no arguing religion. Sometimes, you just have to accept that you can’t run price experiments, and still, They want you to estimate demand.
The trick, of course, is to convince Them to let you randomly send some folks a 10% off coupon for whatever you’re hawking. Often, the marketing/promo team has already done this, so you can just steal their data. This is best.
The idea here is that the promo introduces price variation, but you have to be careful with how you use the data to identify demand. So, I thought I’d write a little post on how to do it. Price well!
A simplified empirical demand model (ignoring all the nonsense you control for, etc) looks something like this:

log(Q_jt) = β·log(P_jt) + E_jt

Where j indexes products, t indexes time, Q_jt is output, and E_jt collects all the other factors that shift demand.
The problem is that historical price changes are related to all the other factors that affect output (E). For example, often this equation more or less describes the price variable:

P_jt = P_j_old for t < t*, and P_j_new for t ≥ t*

i.e. the price increased at some point in the past. It is extremely unlikely that nothing else changed with demand before and after the price change. Demand is always changing.
So, you can’t just run the regression.
The standard solution to this problem is to use an instrument Z that is uncorrelated with the other factors that drive output (e.g., demand shocks), but is correlated with price, because then:

Cov(Z, log(Q_jt)) = β·Cov(Z, log(P_jt))

So, you can just solve for the elasticity β:

β = Cov(Z, log(Q_jt)) / Cov(Z, log(P_jt))
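To make the bias concrete, here's a quick numpy sketch (all numbers made up) with a price that shares a shock with demand: the naive OLS slope is off, while the instrument ratio recovers the true elasticity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = -2.0  # true elasticity (made up)

z = rng.normal(size=n)                        # instrument: shifts price, nothing else
u = rng.normal(size=n)                        # cost shock feeding into price
demand_shock = 0.8 * u + rng.normal(size=n)   # correlated with price!

log_p = 0.5 * z + u                           # price responds to costs and the instrument
log_q = beta * log_p + demand_shock

# Naive OLS slope: biased, because log_p and the demand shock share u
ols = np.cov(log_p, log_q)[0, 1] / np.var(log_p)

# IV ratio: beta = Cov(Z, log Q) / Cov(Z, log P)
iv = np.cov(z, log_q)[0, 1] / np.cov(z, log_p)[0, 1]

print(ols, iv)
```

Here OLS is biased toward zero (the demand shock pushes price and quantity the same way), while the IV ratio lands on β.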
But you don’t have any instruments. You know this. I know this. The supply of instruments is much lower than the demand, and, if you’re working in industry, it’s just extremely unlikely that some natural experiment exists for your particular company.
So, you have to make your own instruments. That’s what experiments do. Experiments are instrument factories. They make random variables that are uncorrelated with everything but what comes after being bucketed into the experiment.
But the 10% off coupon mentioned earlier isn’t a valid instrument for this demand model, sadly.
It’s randomized across users, not products. So, there’s no reason to suspect it’s excluded from a direct shock to demand. The randomization ensures that variant assignment at the individual level is uncorrelated with other user characteristics, but this doesn’t solve the endogeneity problem, which lives at the product-time level!
The problem is that this demand model is based on product-time variation. So, to use this instrument, we’ll need to build up a customer-level demand model and derive the aggregate demand for each product from the individual demand curves.
A natural way to do this is with a discrete choice model. To save on notation, let’s suppose you’re hawking one product, so that this amounts to modeling the conversion rate. Here’s a pretty standard model (a logit), but you can use something more flexible:

Pr(purchase_i = 1) = Λ(a·X_i + b·P_i)

Where Λ is the logistic CDF, X_i are some customer characteristics, and P_i is the final price of the product for customer i. Here, we’re allowing for other reasons a particular customer might have a lower price aside from access to the 10% discount. P_i is just whatever the customer’s final price ends up at.
Now, let Zi = 1 if the customer gets the 10% promo and 0 otherwise. The key idea is to use this Zi as an instrument for price.
For doing binary choice models with instrumental variables, I usually use a little something I wrote circa 2017-2018: repeated two-stage least squares.
The basic idea is that you can show that doing the following procedure is a contraction mapping:
1. Run two-stage least squares for the linear probability model of 1(purchase_i) on (X, P) with (X, Z) as instruments. Call this estimate c.
2. Set k = 1 and pick some initial guess of a and b, say (a_1, b_1).
3. Compute the predicted conversion rate implied by (a_k, b_k). Run the same two-stage least squares estimation with the dependent variable being the predicted conversion rate. Call this estimate c_k.
4. Update: [a_{k+1}, b_{k+1}] = [a_k, b_k] + c - c_k, increment k, and repeat from step 3 until the updates are negligible.
Under an assumption that the instrument has a “strong enough” relationship to (X, P), this procedure converges to the true a and b in a large enough sample. It also has standard asymptotic normal inference because there’s an analogous GMM estimator.
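The procedure can be sketched in a few lines of numpy (simulated data, logit link, all parameter values made up). Since the model is exactly identified, each 2SLS step collapses to a single IV solve. For easy checking, the simulation keeps the structural error independent so the truth is recoverable; the mechanics are identical when price is endogenous.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Simulated data (all values made up for illustration)
x = rng.normal(size=n)
z = rng.integers(0, 2, size=n).astype(float)          # 1 = got the 10% promo
p = 2.0 * (1 - 0.10 * z) + 0.1 * rng.normal(size=n)   # final price

a0, a1, b = 1.0, 0.5, -1.5                            # true coefficients
index = a0 + a1 * x + b * p
y = (rng.random(size=n) < 1 / (1 + np.exp(-index))).astype(float)

W = np.column_stack([np.ones(n), x, z])  # instruments (X, Z)
V = np.column_stack([np.ones(n), x, p])  # regressors  (X, P)

def iv_solve(dep):
    # Exactly identified 2SLS collapses to solving (W'V) c = W' dep
    return np.linalg.solve(W.T @ V, W.T @ dep)

c = iv_solve(y)                      # step 1: 2SLS on the actual outcome
theta = np.zeros(3)                  # initial guess of (a, b)
for _ in range(4000):
    fitted = 1 / (1 + np.exp(-(V @ theta)))  # predicted conversion rate
    step = c - iv_solve(fitted)      # c - c_k
    theta += step
    if np.abs(step).max() < 1e-10:   # updates negligible: done
        break

print(theta)  # roughly recovers (a0, a1, b)
```

The fixed point of this iteration satisfies the instrumented moment condition W'(y − Λ(Vθ)) = 0, which is why it lines up with the analogous GMM estimator.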
The key thing is that you can use these kinds of experiments to generate plausible instruments for demand estimation even when you don’t have the ability to do more direct price experimentation (for whatever reason).
Once you have this model, you can simulate changes in price by just plugging in new prices, but identification here has its limits. All you’ve done is exogenously moved prices 10% for some customers. If you start looking at 30% price increases, your inferences will be based more on functional form than anything in the data. The real info you want to get is “Are we over/under-priced? Would a small increase/decrease help us?”
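Mechanically, the counterfactual is just a plug-in (hypothetical fitted numbers below); the discipline is in keeping the price moves inside the range the coupon actually traced out.

```python
import numpy as np

# Hypothetical fitted logit coefficients (made up for illustration)
a0, a1, b = 1.0, 0.5, -1.5

def conv_rate(x, price):
    """Predicted conversion rate at a counterfactual price."""
    return 1 / (1 + np.exp(-(a0 + a1 * x + b * price)))

base = conv_rate(0.0, 2.00)
down = conv_rate(0.0, 1.80)   # -10%: inside the experiment's support
up = conv_rate(0.0, 2.20)     # +10%: already an extrapolation

print(base, down, up)
```

The −10% number is backed by actual experimental variation; the +10% number leans entirely on the logit's functional form.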
A couple things to try to improve power and flexibility here:
If you can get more promo depths (lower or higher discount %), that’s super useful because it will give you more instruments. Maybe you can get a more complex model of the pricing curve. With only the one 10% promo, we can’t introduce a P² term into the model because we don’t have a separate instrument for it. But if we have a few different treatments, we can get it in there.
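For instance, with two promo depths you get two instrument dummies, which is exactly enough to identify both a P and a P² term. A toy numpy sketch of the idea (linear-in-parameters conversion curve, all numbers made up):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300_000
depth = rng.choice([0.0, 0.10, 0.20], size=n)     # randomized promo depth
p = 2.0 * (1 - depth) + 0.02 * rng.normal(size=n)

# True (made-up) conversion curve, quadratic in price, plus exogenous noise
y = 0.20 - 0.05 * p + 0.01 * p**2 + 0.02 * rng.normal(size=n)

z10 = (depth == 0.10).astype(float)
z20 = (depth == 0.20).astype(float)
W = np.column_stack([np.ones(n), z10, z20])       # instruments: depth dummies
V = np.column_stack([np.ones(n), p, p**2])        # regressors: 1, P, P^2

c = np.linalg.solve(W.T @ V, W.T @ y)             # exactly identified IV
fitted = lambda price: c[0] + c[1] * price + c[2] * price**2
print(c)
```

With only the single 10% dummy, W would have two columns against three regressors and the solve above would be underdetermined, which is the "no separate instrument for P²" problem in matrix form.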
If you can get them to send out the promo a few times where you re-randomize the recipients, this is good, because it generates additional variation in variant assignment, giving you more power.
There’s an important wrinkle here. The e-mail or notification that tells a customer they have a promo might itself be responsible for increased demand, i.e. aside from any discount the customer gets.
The trick to deal with this is to send promos to everyone but vary the discount level. Instead of doing 10% vs 0%, do 20% vs 10%. Then, you can send the same e-mail to everyone and just change the discount. This helps isolate just the pricing impact of the experiment.
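A quick simulated sanity check of that design (effect sizes made up): when everyone gets the e-mail, the e-mail's own demand bump is the same in both arms, so the 20%-vs-10% contrast isolates the price response.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
deep = rng.integers(0, 2, size=n)          # 1 = 20% off, 0 = 10% off
p = 2.0 * np.where(deep == 1, 0.80, 0.90)  # final price in each arm

email_bump = 0.03                          # e-mail effect: identical in both arms
b = -0.25                                  # made-up price slope
prob = 0.10 + email_bump + b * (p - 2.0)
y = (rng.random(size=n) < prob).astype(float)

# Difference in conversion over difference in price: the e-mail bump cancels
slope = (y[deep == 1].mean() - y[deep == 0].mean()) / (1.6 - 1.8)
print(slope)  # ≈ b
```

If the 10% arm were a no-e-mail control instead, the same contrast would bundle the e-mail bump into the price effect.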
At this point, we might ask: “Well, if we’ve got this nifty trick to run a more palatable experiment, why would we ever want to run pricing experiments?” This is another case where it’s important to remember that the data’s only really informative in a neighborhood of actual prices people are seeing. This strategy only tests prices below the status quo, not higher. So, you’re not really getting direct evidence on what would happen at higher prices. You’re just assuming that the elasticities you see for the 10% lower price are roughly relevant when the price is 10% higher. That’s better than vibes and purely observational analyses, but it’s an assumption all the same. So, you have to be more cautious when increasing prices than decreasing them using this method.
Thanks for reading!
Zach
Connect at: https://linkedin.com/in/zlflynn

