Seth Stephens-Davidowitz estimates that an NBA player’s identical twin has a 50% chance of also playing in the league. He figures any person on earth who is at least 7 feet tall has a one in seven chance of becoming an NBA player. And he has a pretty good idea why some countries produce more elite basketball players than others. (Answer: probably not what you are thinking.)
How does he know these things? Training as a data scientist, reliable input, knowledge and inspiration to ask the right questions, and a series of plain-language sessions with ChatGPT.
After spending a few months learning how the popular large language model (LLM) works, Stephens-Davidowitz researched, wrote, and created artwork for his third book, “Who Makes the NBA: Data-Driven Answers to Basketball’s Biggest Questions.” It took him 30 days.
“Artificial intelligence is shrinking the time it takes to do this type of work,” Stephens-Davidowitz said. “Things that used to take me four months now take me four hours.”
Stephens-Davidowitz visited Yale recently for a pair of talks about his latest book and the methods he used to put it together. The events were co-sponsored by the Data-Intensive Social Science Center (DISSC), the Institution for Social and Policy Studies (ISPS), the Yale Institute for Foundations of Data Science (FDS), and the Tobin Center for Economic Policy.
During the 2004-05 school year, Stephens-Davidowitz worked as a predoctoral associate at Yale for legal scholar and economist John J. Donohue, now the C. Wendell and Edith M. Carlsmith Professor of Law at Stanford Law School, before earning a Ph.D. in economics at Harvard University. He worked for Google as a data scientist and wrote the bestselling book “Everybody Lies” about secrets revealed in internet data.
“If you asked me in 2022 what skill set is necessary to be a good data scientist, I would have said 70% is computer coding and maybe 30% is asking the right questions,” he said. “And now, I say facility with AI is 55%, asking the right questions is 40%, and coding is 5%. And that 5% is getting smaller and smaller.”
Ron Borzekowski, executive director of DISSC, thanked Stephens-Davidowitz for visiting and sharing his experience with a pair of eager audiences.
“We live in a world with increasingly easier access to larger and larger reams of data,” Borzekowski said. “It is extremely useful to hear from Seth about the various ways the latest tools can help streamline the process of analyzing data and generating interesting findings for consumption by a wide audience.”
ISPS Director Alan Gerber, Sterling Professor of Political Science, expressed his enthusiasm for how AI can accelerate social science.
“One thing I’ve heard said about AI that rings true to me is that it will never be worse than it is today,” Gerber said. “We can see from Seth just how quickly laborious tasks can be routinized and done almost instantaneously. It’s amazing how quickly this technology is improving, and who knows what other aspects of the process of developing ideas and executing them will be improved and sped up by AI.”
In researching his book on the NBA, Stephens-Davidowitz tapped his love of basketball, online datasets, and a curiosity to see if a high-powered AI tool could quickly and affordably help him analyze them to answer questions that had been on his mind for years.
Questions such as: What players are systematically undervalued in the NBA draft? Are clutch shooters born or made? Why are so many NBA players named Chris? Who would be the best NBA player of all time if every player were the same height?
Perhaps unsurprisingly, height serves as a running theme in the book. And perhaps even less surprising for basketball fans of a certain age, the answer to that last question is Muggsy Bogues.
But Stephens-Davidowitz delves beyond the most obvious characteristic of so many stars in a game where the rims are hung 10 feet above the floor. For example, he calculates that every inch of height roughly doubles the chances of becoming an NBA player. And this result triggered a series of additional investigations ultimately exposing the relatively poor athleticism of the tallest players, who tend to have lower vertical leaps, slower sprint speeds, worse free throw percentages, and worse performance late in close games.
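To get a feel for what "every inch roughly doubles the chances" implies, here is a minimal arithmetic sketch. It assumes the doubling holds as a constant multiplier across the heights being compared, which the estimate above does not claim in detail. The function name is hypothetical and is not taken from the book.

```python
def relative_chance(height_a_in: float, height_b_in: float) -> float:
    """Return how many times more likely a person of height_b_in is to reach
    the NBA than a person of height_a_in, ASSUMING the chance doubles with
    every additional inch (a rule of thumb, not an exact model)."""
    return 2.0 ** (height_b_in - height_a_in)


# A 6-foot-6 person (78 inches) versus a 6-foot person (72 inches):
# six extra inches means six doublings.
print(relative_chance(72, 78))  # 64.0
```

Under that assumption, a six-inch difference compounds to a 64-fold difference in odds, which helps explain why height dominates so many of the book's findings.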
“Tall players are kind of shockingly bad athletes,” he said, citing Shaquille O’Neal and Wilt Chamberlain as just two 7-foot-1-inch Hall of Fame players who were poor free throw shooters.
What could account for the fall-off in measurable skills among the tallest players? Stephens-Davidowitz argues players of such height simply do not go through the same merciless selection process to make the league as shorter players. They just have to be in the top percentile for height. The skills come afterward and likely never reach the level of shorter (but usually still very tall) players who must display extraordinary ballhandling, jumping, shooting, and success hitting shots in the clutch.
“If you can’t handle pressure and you are 6 feet tall, you won’t make the NBA,” he said. “You will lose out to players who can handle pressure at that height. But for taller players, there is just not that much competition.”
Stephens-Davidowitz learned how to craft prompts to achieve the results he needed from ChatGPT. For example, he asked the chatbot to “find the teams that had the best decade in over-performance in their draft and teams that had the worst.” He didn’t define overperformance or underperformance. Once given the data, ChatGPT built a model predicting players’ success and put it into a clear chart.
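The article does not say what model ChatGPT built, so the following is only one plausible way to make "over-performance in the draft" concrete. It assumes a player's expected value is the average value of everyone taken at the same pick number, and it scores each team and decade by the average gap between actual and expected value. The rows, the `career_value` metric, and the function name are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only: (team, decade, draft_pick, career_value).
# "career_value" stands in for any success metric; none of these numbers are real.
ROWS = [
    ("Team A", 1990, 1, 90.0),
    ("Team B", 1990, 1, 70.0),
    ("Team A", 1990, 2, 60.0),
    ("Team B", 1990, 2, 40.0),
]


def overperformance(rows):
    """Average (actual - expected) value for each (team, decade).

    Expected value for a pick is the mean value of all players
    taken at that pick number (an assumed, deliberately simple baseline).
    """
    by_pick = defaultdict(list)
    for _, _, pick, value in rows:
        by_pick[pick].append(value)
    expected = {pick: mean(values) for pick, values in by_pick.items()}

    residuals = defaultdict(list)
    for team, decade, pick, value in rows:
        residuals[(team, decade)].append(value - expected[pick])
    return {key: mean(values) for key, values in residuals.items()}


scores = overperformance(ROWS)
best = max(scores, key=scores.get)    # largest average overperformance
worst = min(scores, key=scores.get)   # largest average underperformance
print(best, worst)
```

A real analysis would need a richer baseline than a per-pick average, but the structure (predict, subtract, aggregate by team and decade) is the part the prompt left for the chatbot to work out.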
“Everybody can be a data scientist now,” he said. “The way to get good at AI is using AI.”
In a second session at Yale, Stephens-Davidowitz told a room filled largely with predoctoral students how ChatGPT eliminated the grunt work that would have previously driven him toward procrastination.
“All of the annoying parts of this type of work just come back completed right away, and I’m back to doing something interesting,” he said.
For one chart, he wanted entries for each player’s college to reflect official college colors, a task that would have taken hours to look up and code properly. But the AI took a few seconds.
“Just ask,” he said. “You never know what it can do.”
And while he regularly checked the results and the underlying code to be sure he could trust them, he found that most errors were easy to catch because they were not something a human would have done.
“The mindset of what’s possible has transformed overnight because of these tools,” he said. “It’s only going to get more powerful. Don’t be the one betting against AI.”
Or against someone 7 feet tall eventually getting handed a basketball.