The 2 Sigma Problem

Can the dream of personalized learning finally become a reality?

Oct 16, 2024

What if I told you there’s a scientifically proven way to outperform 98% of your peers?

First, you’d tell me I was lying. Second, in case I wasn’t lying, you’d ask how.

The answer is 1:1 tutoring and the phenomenon is known as Bloom’s 2 Sigma Problem. It’s a terribly un-catchy name, which is probably why so few people know about it. It comes from Benjamin Bloom who published research on “The 2 Sigma Problem” in 1984.

Bloom’s research tested the effects of different variables on student achievement. He found that “tutorial instruction” (aka tutoring) blows every other intervention out of the water. Specifically, tutoring has an effect size of 2 standard deviations above the mean, while the second-highest variable is only 1.2 standard deviations above the mean. Put differently, the average tutored student outperformed roughly 98% of students in a conventional classroom. No contest.
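For the curious, here is a rough sketch of how an effect size maps to a percentile. It assumes test scores follow a normal distribution, which is my simplifying assumption for illustration, and the function name `percentile_from_effect_size` is made up for this example.

```python
from statistics import NormalDist


def percentile_from_effect_size(sigma: float) -> float:
    """Percentile of the comparison group that the average treated student reaches.

    Assumes scores are normally distributed (an illustrative assumption).
    """
    return NormalDist().cdf(sigma) * 100


# The two effect sizes mentioned above: tutoring and the second-highest variable.
for label, sigma in [("Tutoring", 2.0), ("Second-highest variable", 1.2)]:
    pct = percentile_from_effect_size(sigma)
    print(f"{label}: {sigma} sigma -> {pct:.1f}th percentile")
# Tutoring: 2.0 sigma -> 97.7th percentile
# Second-highest variable: 1.2 sigma -> 88.5th percentile
```

Under that assumption, 2 sigma lands at about the 98th percentile, while 1.2 sigma lands at about the 88th.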

See the full list of variables, their effect size (measured in standard deviations above the mean), and percentile equivalent below:

Ok. But why haven’t test scores improved?

Let’s say, for the sake of a fun example, that you’re Marty McFly in Back to the Future, which was released in 1985, just after Bloom’s 2 Sigma paper. You’ve just read the paper and then time travel to 2024. What would you expect from America’s education system?

You’d expect significant improvement in student outcomes! We learned the formula for improving learning and had 40 years to implement it.

Unfortunately, the reality is quite different…

Source: National Center for Education Statistics

Scores have barely increased! Today, reading scores are actually lower than a decade ago.

Why is this? Tutoring doesn’t scale.[1]

Tutoring is inherently one-to-one. Tutoring also needs to be scheduled in advance and is bound to a set time frame.

What if you have a question outside that time? Too bad.

Can ChatGPT be your tutor?

With today’s AI tools, it would be easy to say that the potential of Bloom’s 2 Sigma paper will finally be realized. Everyone now has a tutor in their pocket.

ChatGPT can explain school concepts in an infinitely scalable one-on-one manner. And everyone is already using it!

College students already pool ChatGPT Plus subscriptions for access to the latest models. $20/mo spread across three or four friends is an order of magnitude cheaper than one hour of human tutoring.

However, most students aren’t using ChatGPT as a tutor.

ChatGPT is like the friend who did the homework and lets you copy, or the friend who took a class a year ago, saved their work, and handed it down since the professor doesn’t re-write their tests.

ChatGPT helps you cheat.

If you get the answers, you may pass the test, but the knowledge isn’t retained. If you cheat through freshman calculus, you’ll struggle in your sophomore math class. Eventually, it catches up.

Only the most motivated students will avoid the trap of the easy answer.

How does this impact work?

The bear case: Junior employees won’t learn anything

AI assistants are impacting more than just classrooms. As in the classroom, there’s a question of how this will affect junior employees’ ability to learn the trade.

Matt Levine makes an interesting case[2] about the potential downstream impact of “AI analysts” in investment banking:

If you work 100 hours a week for two years writing pitchbooks, you will develop a deep sense of what sorts of deals get pitched and what pitches work. The knowledge and intuition necessary to do the job will be pumped into you at great speed. “Drinking through a firehose” is the cliche.
And then you eventually graduate to being a senior banker, or more realistically a private equity investor, and you are useful. You know stuff, because you learned it, fast.
And now we’re pretty close to a world where an AI can do most of those first three items, build the models and write the pitchbooks and put the stuff in the data room and so on. You could imagine that having the effect of making junior bankers’ lives easier in the short term, but undermining their development in the long term: They won’t learn, at a deep level, what drives returns on an LBO, because the AI does all the math for them. The apprenticeship model will break down.

As a recovering consultant, this certainly resonates.

I recall many late nights crunching through Excel models or building slide decks for my clients. Despite my weary eyes, the work (usually) paid off when the client asked some esoteric question about the model or the footnote on slide 67 and I could answer immediately.

How? I had spent every waking hour of the previous week building that model and checking the footnotes. Later, this grunt work let me quickly grasp what mattered most even when I wasn’t the one who built the model.

Had AI built most of the model for me, I might not have hardened those skills or been ready for the next role. That’s the risk of AI at work.

The bull case: Our tutor for work

While some late nights likely helped me in the long run, that wasn’t always the case.

Spending hours figuring out why the Excel model was throwing an error wasn’t a great reason to go to bed at 2am instead of midnight. Nor was perfecting every line, color, and caption of a chart where I already understood the takeaway from the spreadsheet.

Now, consider another model of the AI analyst: What if I had access to a 24/7 “AI Senior Analyst” to help me debug my work and move faster? That would be a game changer.

This latter model is how many jobs already work today.

When I started in consulting, I was joined at the hip with my Senior Associate Consultant who was 2 years older and 1,000x better at my job. In those first months, she sat right over my shoulder as I stumbled through Excel formulas or slide formats. When I really struggled, she’d jump in and show me how she would do it.

As her fingers flew across the keyboard, my face usually looked like this.

[Image: jaw drop] Junior consultant me seeing what good looked like

Every skill she demonstrated added a new tool to my toolkit. Given the results, it’s not surprising that consulting is just one of many professions using this model. Pair programming is the equivalent in software engineering and is how some of the GOATs teach.

Anyone who was pair programming with Greg is surely a 10x engineer.

Calculators: Foreshadowing AI’s future?

Before people worried that AI would turn our brains into mush, they thought the same of calculators.

The 1986 National Council of Teachers of Mathematics annual meeting was famously the stage for anti-calculator protests. A Washington Post article titled “Math Teachers Stage a Calculated Protest”[3] elaborates:

John Saxon, a math book publisher and retired Oklahoma math teacher, and about 20 others carried signs reading "The Button's Nothin' Til the Brain's Trained" and "Beware: Premature Calculator Usage May Be Harmful to Your Child's Education."
They were protesting a National Council of Teachers of Mathematics policy recommending "the integration of the calculator into the school mathematics program at all grade levels in classwork, homework and evaluation." The policy urges that "at each grade level every student should be taught how and when to use the calculator."

To spoil the punchline: Calculators didn’t make us dumber. Quite the opposite.

Just five years after the anti-calculator protests, the College Board was writing about how calculators had positively impacted mathematics education:[4]

Calculators are transforming the way we teach, and the way our students learn, mathematics. We are freeing ourselves of the arithmetic-driven curriculum. We no longer require students to practice computation endlessly. We no longer need to construct problems so that the numbers involved do not interfere with the mathematical principles being taught. For example, in textbooks, problems involving cubes and cube roots often use the numbers 8, 27, 64, or 125. Square and square-root problems use 16, 25, 64. If applying the law of cosines is part of a solution, either the numbers have to be "cooked," or the students spend more time on arithmetic than on thinking about how to solve triangle problems.
Calculators and computers are tools that support student work. Searching for patterns, solving problems, even learning basic facts, are all enhanced when we use this technology. Our students are engaged and enthusiastic about the fundamentals of mathematics.

Clearly, the calculator was extremely additive[5] to math education.

What’s in store for us with AI

While fear is a natural response to new technology, building new tools that replace “human jobs” is quite literally what makes us human. No tools = no progress.

Some tools are met at first with fear but later prove harmless. Others are adopted with excitement and only questioned later, like smartphones[6] or Teflon[7].

The danger lies in the underestimated technology shifts.

While there’s no shortage of attention on the dangers from the race towards AGI, I’d argue there’s not enough focus on the more nuanced side effects that could arise from an increased use of AI-based products. The ability to learn, and to do so quickly, is the most remarkable trait of the human species. Accelerating this accelerates progress. Dampening this could be disastrous.

Think about it — every group, from companies to countries, is only as strong as its people. What would happen if those people could learn even 1% faster?

What gets automated[8], to what extent, and with what interaction model are all being decided now. Seemingly minor nuances, like the AI notetaker that still requires you to take notes, can make all the difference.

As buyers, builders, students, and teachers, we all have the opportunity to influence the next generation of AI products and how we use them. Let’s make “The 2 Sigma Problem” more than an academic paper.

[1] The US education system is a multi-faceted and extremely complex machine. There are, of course, many issues that drive poor test score performance. However, it is fair to say that the silver bullet of tutoring has not worked at this scale.

[2] https://news.bloomberglaw.com/crypto/matt-levines-money-stuff-bankers-hours-are-still-pretty-bad

[3] I loved this headline as someone who appreciates a good pun

[4] https://elective.collegeboard.org/calculators-classroom-archive-college-board-review

[5] A math pun of my own!

[6] https://www.theatlantic.com/magazine/archive/2017/09/has-the-smartphone-destroyed-a-generation/534198/

[7] https://lindsaydahl.com/wp-content/uploads/2015/08/Teflon.jpg

[8] To be clear, there will be lots of jobs where full automation is the best answer. Companies already outsource plenty of work today, which makes clear that they don’t believe those skills are needed to climb the ladder. And that’s ok.

Leo

Oct 17, 2024

This is great. The other parallel I think of is personal trainers / physical therapists—information on what workouts to do and how to do them is widely available, but people still pay to have someone tell them “do 20 more reps.”

Virtual workout classes are somewhere in-between.

Obviously plenty of people get in great shape without a trainer, but I wonder if those are the “highly motivated” students who already do fine without AI / a tutor. Will be interesting to see!

Ben Goldberg

Nice read!
