Should You Use ChatGPT to Do Customer Research for Your Organization?
by: Grant Gooding
Read Time: 3 minutes
The idea is tempting. Many of you have probably already tried using ChatGPT to test ideas, messaging, or product concepts. I’ve been excited about the potential of synthetic audiences myself. But the most in-depth study comparing them to real audiences was just released, and the results were far worse than I expected.
Verasight, a research firm trusted by universities, media outlets, and Fortune 500s for reaching hard-to-access audiences, put synthetic polling to the test.
They ran a real, nationally representative survey of 1,500 U.S. adults (for context: the conventional benchmark is 385 respondents for a ±5% margin of error at 95% confidence; 1,500 tightens that to roughly ±2.5%, which is very precise). The questions included Trump’s approval rating, congressional vote intention, and a newer policy issue around zoning and housing.
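For the curious, both of those figures fall out of the standard margin-of-error formula. A quick worked version, assuming the usual 95% confidence level (z = 1.96) and the worst-case proportion p = 0.5:

\[
\text{MOE} = z\sqrt{\frac{p(1-p)}{n}} = 1.96\sqrt{\frac{0.25}{1500}} \approx 0.0253 \approx \pm 2.5\%
\]

\[
n = \frac{z^2\,p(1-p)}{e^2} = \frac{1.96^2 \times 0.25}{0.05^2} \approx 385 \quad \text{(the familiar } \pm 5\% \text{ benchmark)}
\]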
Then they built a “synthetic sample” by giving ChatGPT-style models the same personas (age, race, gender, income, education, party ID, etc.) and asking the AI to respond as if it were those individuals.
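Verasight hasn’t published its exact prompts, so treat the sketch below as illustrative only: the persona fields mirror the ones listed above, but the model name, question wording, and prompt phrasing are my assumptions, not the study’s.

```python
# Illustrative sketch of persona-conditioned "synthetic polling."
# Assumes the OpenAI Python SDK (openai>=1.0); all specifics are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A made-up respondent profile with the demographic fields the study describes.
persona = {
    "age": 46,
    "race": "Black",
    "gender": "female",
    "income": "$50k-$75k",
    "education": "some college",
    "party_id": "Independent",
}
profile = ", ".join(f"{k.replace('_', ' ')}: {v}" for k, v in persona.items())

question = (
    "Do you approve or disapprove of the way Donald Trump is handling his job "
    "as president? Answer with exactly one of: Approve / Disapprove / Don't know."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # stand-in; the study compared several models
    messages=[
        {
            "role": "system",
            "content": f"Answer survey questions as if you were this person: {profile}.",
        },
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```

Loop that over 1,500 sampled personas and you have a “synthetic sample.” It looks rigorous on paper.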
Here’s what happened:
- Topline results weren’t just off; they were catastrophic. Even the best-performing AI model was off by up to 23 percentage points. In research terms, that isn’t a rounding error; that’s the kind of miss that gets CEOs fired.
- Minority voices were erased. Black respondents were misrepresented by 15 points. Other ethnic groups were off by 8 to 20 points. If you care about culture, community, or how usage differs across your customer or prospect base, expect the wrong answers.
- Uncertainty vanished. Real people said “don’t know” about 3% of the time. AI respondents? Zero. Synthetic samples don’t hesitate, which might look tidy in a chart until your real prospects hesitate to buy and you don’t know why.
- Polarization was manufactured. On issues without clear partisan divides, the models exaggerated differences and pushed minority groups disproportionately into left-leaning positions.
- On newer issues, AI got it backwards. In one test, the model not only exaggerated support but flipped the majority opinion. That’s not a small miss — that’s a different reality altogether.
- Attempts to improve accuracy made things worse. Feeding the model extra examples degraded performance. Errors grew from 4% to 6%, and crosstab correlations dropped. Translation: synthetic audiences don’t get smarter when you spoon-feed them; they just fail with more confidence.
Why This Matters
It may seem like I am being overly critical, but I’m not. In research, a few points off can shift millions in budget, tank a product launch, or send a campaign in the wrong direction. Synthetic samples aren’t just imprecise; they’re systematically biased and misleading.
My Final Thought
AI is brilliant at generating language. But it’s not brilliant at generating people.
When accuracy matters (and it always does in research, politics, and markets), synthetic polling isn’t just risky; it’s deceptive. For now, there’s still no substitute for real people, but it won’t be this way forever.