2025 Small Data SF

In the long run, everything is a fad

To be clear - I'm not saying that analytics and data engineering are a fad. I'm not saying the data teams are doomed to fade away, or that the old fundamentals of data modeling are wrong, or that the urge to quantify everything is a mistake. I'm saying that things seem pretty good, right now. But, you know. Like Charles Schwab constantly says, past performance is no guarantee of future results. So someone else might say all of that in the future - because, as John Maynard Keynes said, in the long run, we are all dead.
Speaker
Benn Stancil

Former Co-founder and Chief Analytics Officer at Mode


Benn Stancil was a co-founder of Mode, a business intelligence tool that was acquired by ThoughtSpot in 2023. While at Mode, Benn held roles leading Mode's data, product, marketing, and executive teams. He regularly writes about data and technology at benn.substack.com. Prior to founding Mode, Benn worked on analytics teams at Microsoft and Yammer.

0:00[music]

0:09[music] I am Benn. I am the former co-founder of Mode. Uh, which is like a polite way to say, like, um [laughter] uh, I also spent a good bit of time, like, yelling at the internet in kind of a grumpy way. So take all that as an intro to what this might be like.

0:27Um but naturally like I want to start where all these things start which is the Olympics. Uh specifically the 2024 Olympics and specifically gymnastics and women's gymnastics and the women's floor exercise which is that big thing in the middle. Um so there's like an event final to figure out who is the best in the world at the floor exercise. This is

0:45like a thing that is just how the Olympics work. Uh in 2024 these were the people who were competing for it. Uh and so they all went out and did a bunch of crazy stuff. Here is Simone Biles doing like a crazy thing. And so like a big part of this, right, if you're doing this event is to be like, how do you

0:58judge who is the best? Like to me, this is all insane. How do we decide which one of these is the best of these people who competed? Uh, and so there's a whole process for it. Uh, I did not know this until recently, but this is the whole process. Uh, some of you may be familiar with it. Uh, so you start with

1:11what is called the D-score, or the difficulty score. It's like your starting score. Uh, this is the combination of scores of the eight hardest elements in your routine as determined by the table of elements and the Code of Points. This is the Code of Points. It's like a giant PDF. It's full of stuff like this, uh, which is like

1:27weird diagrams and all this stuff and tables and things like that. And so these are the things that define what goes into your D-score. You get eight elements in them. So like E1 plus E2 plus E3, blah blah blah. So that is your starting score. The next thing you have is an E-score, which is your execution score. Uh, that is like the classic 0 to

1:4310 perfect 10 kind of thing where the judges judge you based on artistry and form and technique and all the things that I guess make gymnastics gymnastics.

1:51Uh after that there are penalties. These are like the things that we sit on the couch and yell at people for like, oh my god, they took a step on their landing.

1:57Um, and it's falls and all this sort of stuff. And so there's like standardized deductions for penalties. And you add all this up and you get a final score. And so in this event in 2024, uh, these were the scores after all but one competitor had gone. Uh, Rebeca Andrade, the Brazilian, was in first place. Simone

2:14Biles kind of messed up and had all these deductions and was surprisingly in second. Uh, and then two Romanians were tied for third, with Ana Bărbosu ahead because the tiebreaker is on your E-score. And so she was in third and the other one was in fourth. And Jordan Chiles was the last one to go. So an

2:27American, Jordan, was the last person to perform. Uh, and when she went, so she went, she did her routine. She got a score. She had a difficulty score of 5.8. She got an execution score of 7.866. She got zero penalties. Nice job.
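The scoring rule described so far can be sketched in a few lines of Python. This is only an illustration of the arithmetic as the talk describes it (difficulty plus execution minus penalties, with difficulty taken as the sum of the eight hardest elements); the function names are made up for this sketch, the real Code of Points has more moving parts, and the only numbers used are the ones quoted for this routine.

```python
from decimal import Decimal

def difficulty_score(element_values):
    """Sum of the eight highest element values (the talk's simplified description)."""
    top_eight = sorted(element_values, reverse=True)[:8]
    return sum(top_eight, Decimal("0"))

def final_score(d_score, e_score, penalties="0"):
    """Final score = D-score + E-score - penalties."""
    return Decimal(d_score) + Decimal(e_score) - Decimal(penalties)

# Numbers quoted in the talk for the last routine of the final.
score = final_score("5.8", "7.866", "0")
print(score)
```

Decimal is used instead of float so that three-decimal scores add up exactly, which matters when placings are separated by a tenth of a point or less.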

2:40Um, and her score was 13.666, uh, which if you like put that in the ranking, it comes in fifth, which like damn, that's not a medal. Bummer. Um, however, uh, Jordan Chiles's coach noticed something. She noticed that this difficulty score of 5.8 was actually calculated wrong. Uh, that they did not include the points the right way. And

3:01this should actually have been a 5.9. And so if you do the math and say that's actually 5.9 instead of 5.8, this score would change to a 13.766. And then if you rerank them, she comes in third. So she should get a medal. Uh, and so her coach went to the judges and said like, "Hey, this was wrong." The judges

3:15said, "Yes, you're right." Jordan Chiles got third. She got a medal. They took this very cute picture. Uh, and so great. Okay, that's what happened. But there are other characters in the story, like, namely, this woman. This is Ana Bărbosu. This is the woman who was in third and then dropped to fourth. Uh, so like bummer for her. Here's her finding

3:32out. [laughter] She then had coaches and they went into a different rule book, the technical regulations for 2024. These have the regulations for how you inquire about a score. And it says there that inquiries for a difficulty score are allowed provided that they happen before the last gymnast goes, or before the gymnast who is next goes. But that doesn't work because if

3:51you remember, Jordan Chiles went last, so there is no next gymnast. So what do you do? You have to go to the next part of Article 8.5, which says for the last gymnast in the rotation you get one minute. You have to protest in one minute. Uh, and so she went to the judges and said, okay, actually it was 1 minute

4:04and 24 seconds. So she took this to the Court of Arbitration for Sport, which is a thing.

4:11They consulted with the official timekeeper of the Olympics um and discovered it was not 1 minute and 24 seconds. In fact it was 1 minute and 4 seconds [laughter] but that is more than a minute. So they said actually that score is disallowed.
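The timing rule and its effect on the score can be sketched the same way. This is a minimal illustration, assuming the one-minute window is inclusive (the talk does not say how the exact boundary is treated); the names are hypothetical and the numbers are the ones quoted above.

```python
from decimal import Decimal

# One-minute limit for the last gymnast in the rotation, as described in the talk.
INQUIRY_WINDOW_SECONDS = 60

def inquiry_allowed(elapsed_seconds):
    """True if the inquiry arrived inside the window (inclusive boundary is an assumption)."""
    return elapsed_seconds <= INQUIRY_WINDOW_SECONDS

def total_after_inquiry(elapsed_seconds):
    """Use the corrected D-score (5.9) only if the inquiry was on time, else keep 5.8."""
    d_score = Decimal("5.9") if inquiry_allowed(elapsed_seconds) else Decimal("5.8")
    return d_score + Decimal("7.866")

# 84 seconds was the time first claimed, 64 seconds was the time the timekeeper reported.
for elapsed in (84, 64):
    print(elapsed, inquiry_allowed(elapsed), total_after_inquiry(elapsed))
```

Both the claimed and the measured time fall outside the window, so the total stays at the value computed with the 5.8 difficulty score either way.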

4:25She gets a 5.8. She is back in fifth and Ana Bărbosu got third. Uh, Jordan Chiles naturally filed a lawsuit with the Federal Supreme Court of Switzerland, which had a bunch of stuff in it about this, which is like synchronization of video recordings and audio recordings.

4:40This is an insane document to read. Um but this was how they were trying to figure it out. But this is actually not where the story ends because there's also this woman. This woman was the woman who came in fourth or fifth. We don't know. Her name is Sabrina. Uh you'll notice she has this deduction.

4:53This tenth-of-a-point deduction, from the, uh, Code of Points, is for stepping out of bounds. Uh, it was here. One step landing out of bounds is a tenth of a point of deduction. Uh, this is a video of her stepping out of bounds. You may notice she ain't out of bounds.

5:09[laughter] So she was like, actually, you know what, this should be a zero. If that's a zero, my score should not be that. It should be this. That would put me in third. And importantly, even if you take Jordan Chiles's score and make it back a 5.9, that means that's a 13.766. Jordan Chiles ends up in fourth. So, I should

5:28get third. So, she filed another thing with this tribunal saying, "Hey, this was wrong. This penalty was given without merit. Her score should therefore be increased to 13.8." A day or so later, the tribunal came back with their media release and said, "Jordan Chiles's thing was after the one-minute deadline. It is disallowed." Uh, Sabrina's claim is dismissed because you

5:46can't actually file, like, reviews on deductions. And so this is where we were like two days after the Olympics. Uh, then Nadia Comăneci tweeted about it or posted on Instagram about it. People argued in the comments about what you're allowed to do. The Romanian prime minister decided they're not going to go to the closing ceremony. Jordan Chiles filed

6:02more lawsuits. Someone proposed just giving them all silver medals, or bronze medals, because nobody knows what to do.

6:08And then 457 days later, we still do not know what the [ __ ] is going on.
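To see why the dispute has no obvious answer, the four combinations of rulings can be enumerated. This is a rough sketch, not an official ranking procedure: the 13.7 totals for the two Romanians are inferred from the talk (they were tied, and removing a 0.1 deduction would give Sabrina 13.8), the tie is resolved in Bărbosu's favor because the talk says she led on E-score, and the function and key names are made up for illustration.

```python
from decimal import Decimal
from itertools import product

# Order used to break exact ties; per the talk, Barbosu led the tie on E-score.
TIEBREAK_ORDER = ["Barbosu", "Sabrina", "Chiles"]

def bronze(chiles_inquiry_upheld, sabrina_deduction_removed):
    """Return who places third among the three contenders under one pair of rulings."""
    totals = {
        "Barbosu": Decimal("13.700"),
        "Sabrina": Decimal("13.800") if sabrina_deduction_removed else Decimal("13.700"),
        "Chiles": Decimal("13.766") if chiles_inquiry_upheld else Decimal("13.666"),
    }
    # Highest total wins; on an exact tie, the earlier name in TIEBREAK_ORDER wins.
    return max(totals, key=lambda name: (totals[name], -TIEBREAK_ORDER.index(name)))

for upheld, removed in product([False, True], repeat=2):
    print(f"inquiry upheld={upheld}, deduction removed={removed}: {bronze(upheld, removed)}")
```

Each of the three gymnasts finishes third under at least one combination of rulings, which is roughly the situation described above.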

6:14[laughter] Okay. So, how do we get here? Like, what are we talking about? I mean this or in general. Um, so what happened?

6:23Basically a lot of us know there used to be like regular ways to judge the Olympics. It would be like judges they watch the events they give them scores.

6:28They write it all down. Uh, that was like kind of famously corrupt. Um, and there were a lot of scandals with this. In particular, something happened in 2004, I think, in Athens. They're like, we got to get rid of this system. We reverted to this Code of Points, which is this very elaborate way of like scoring things numerically. Um, and it says stuff

6:45like this in it about means to provide an objective means for evaluating gymnastics to get away from this like subjective thing of judges.

6:52But there's another thing kind of in this that I think is reflective of a broader trend uh that we want to take a small detour for from whatever detour we're currently on uh which is there was this famous article in the New York Times I don't know this was 5 10 years ago about the music that we like it was

7:06like, what music do we think is best as individuals? And so this was the chart that explained it, and basically what it says is everybody thinks the best music was the music when they were teenagers, that the most popular music to us is like whatever it was when we were growing up. And so the Washington Post looked at this a little further. They put

7:23out another sort of like study, or whatever they do, that basically found that a ton of things are like this. That the most close-knit communities were when we were like teenagers. The most moral society, the best music, the best fashion, the best economy, the best sporting events, all of this stuff, the best things were all when we were teenagers. And so if you

7:40like generalize this, it basically says the best stuff is like this. It comes from our adolescence that whatever it was when we were like growing up and sort of our formative years, this is when we decide what is best. And so like okay what is our best stuff professionally? Um and for a lot of us like that kind of formative thing is

7:57basically this. It's basically we were all people who are like, ah, we aren't these old geezers who are doing things on vibes and like scouting people, and that we are like the quantitative people. We are thinking about things rigorously. There's a whole generation of people who idolize this kind of thinking. And so it's either Moneyball or if you like take the sort of more

8:16political angle. It was like we used to have all these talking head pundits like David Brooks. And then we had a Nate Silver. Uh this is all from like the Charlie Rose show like oops. Um and so this is now kind of the way things were.

8:27So the best stuff professionally it used to be like this sort of stuff and it used to basically be vibes. It was pundits and then it became for us in our adolescence math.

8:37And so a lot of things follow this pattern where they became very quantified in part because it was stuff that we all liked. And so like there is an obvious question here on this chart which is like there's a third set like what is this? Um mainly because like this is not a thing that stays fixed.

8:52This basically is like a generational thing. And as we age there is a new generation that starts to ride up this curve. And so where does this go? I mean I can probably figure this out but the second page of that code of points like literally the second page the first one was one you saw. This is the second page

9:07is an ad from Fujitsu for their like 3D sensing AI and gymnastic scoring system.

9:13And so what does this thing do? This thing basically, it watches gymnasts do stuff. And what it does not do is it does not say, okay, take those things, transfer them into this crazy Code of Points, put them all into this crazy scoring, and give them that. Basically it says, hey, we're going to watch it. We're going to compare it to a whole bunch of

9:28examples of other routines that are sort of like said these are how good these routines are and then we're going to give you a final score. So, we're going to have this thing that is basically the perfect judge. It isn't a judge that's going through all this quantification.

9:38It's a judge that says, "I've seen every gymnastics routine ever, and I can attempt to say which ones are best." And so, the best stuff professionally used to be this. Uh, and later, I don't know, it's like AI stuff or it's vibes. It's

9:52like a different kind of vibes, but it's vibes. And so, if you look at this chart, you may notice three things. Uh, one is that this is already a thing that's kind of happening. Like, vibes are a thing that are ascendant. So, uh, Zohran was elected mayor of New York, uh, yesterday. A lot of his campaign was sort of vibes

10:10based. There was a big trend in like 2008 and the Obama years that campaigns are very oriented around like extreme number crunching. A lot of the new campaigns like Zohran's are very much like, well, we do it on TikTok and we base it on vibes. And this has nothing to do with AI. This is just like this is

10:24how people are starting to think. There's some rejection of, we over-quantified stuff. Um, in the presidential election, Trump kind of won a lot of this way. I worked for Kamala for four or five months. Um, this was how Kamala ran her campaign in some ways. The sort of brat thing was sort of a famous part of it. Um, but it also

10:39shows up in other places too that if you talk to like the youths, uh, a lot of them say things about like taste eating silicon. This is uh, someone else, but like taste eating Silicon Valley.

10:48There's a lot of stuff being built in Silicon Valley that is less oriented around how do we optimize everything?

10:53How do we A/B test everything? And much more around how do we just build something that is really well crafted?

10:57How do we build something that has really good vibes? And so like vibes in some ways here are broadly ascendant, and kind of what some of the folks in the panel earlier were saying. A second thing you may notice from that like vibe chart, uh, and a lot of us in this room may be like, is like, wait a minute, we

11:11need something objective, and this ain't object... like, this is not objective. This is like some computer telling us what to do. This is crazy. We don't want to do that. Now, part of that is we naturally would say that because we're these people, like this is our whole thing.

11:23But another thing is like this is actually not that objective of a thing to measure that there are some sports in the Olympics like the 100 meter dash uh also famously close um that this is a very objective thing like who crossed the line first it's pretty straightforward it's actually not that straightforward because like what part of your body is first I don't know like

11:39it's your shoulders apparently um or clavicle I think uh but it's like fairly objective this despite us putting it in charts and tables is not really that objective this is a fairly subjective thing and you'll notice that if you zoom in on that bottom step stuff you can get deductions for is like poor rhythm and elements. Like I don't know what that I

11:56mean maybe they know better than I do, but like that's not a numeric thing inherently. That is like us applying a numeric thing to a subjective thing. But also the other big part of this is this score, which looks like a giant formula, is full of a ton of subjective things. So like why are these elements

12:12worth these points? Why do we make these decisions that certain things are worth certain points? Why are there eight? Why not 10? Why not six? I don't know.

12:18Somebody just decided eight. Uh, why are E-scores out of 10? There's a certain weighting that happens here when the D-scores are at certain numbers and the E-scores are out of 10. Why do we evaluate it that way? Like why are the deductions the deductions? Why can't certain scores be reviewed? There's all these weird laws about what you can review and not

12:32and also like what in the world is going on with this? Um, all of these things are fairly subjective rules that get applied for like the air of quantitativeness, but in reality there's a lot of subjective stuff that gets added.

12:43And so really the best stuff professionally here isn't just like math. It's kind of like numeric vibes.

12:49It's like quantitative-ish. Um, and so that's like not that different. The third thing that you may notice though, if you look at this chart, uh, is this chart looks like another chart, which is like this one.

13:01Um, and so actually it may be that the best way to judge this is like how are the vibes that why do we have all of this crazy rubric? Like maybe we should just do this.

13:15Okay. So like also like what are we talking about? Um, some people like, okay, this is all fine. Like, makes sense. We're talking about gymnastics. We do data stuff. It's businesses. We have built an enormous number of tools to support not vibes, but data. Um, like how in the world does any of this happen? What do you say?

13:32Okay. So, to make this like a little more business-oriented in a way. Um, imagine that you are an analyst. Uh, and you're sitting here and your boss calls you into the office and your boss is like, "Hey, I have a question for you." And you're like, "Who, me?" And your boss is like, "Yes, you. Is our business in good shape?"

13:50And you like look at it for a second, you say, uh, like, well, it depends on what you mean by good. It depends on what kind of business you're referring to. The overall balance sheet is in flux. We lost $70 billion last night.

13:59Uh, but that's exclusive of this 3x revolver, and our exposure is already down 120%.

14:05[laughter] And then your boss looks at you like this. Uh, and says, "It's a simple question, doctor." Um, [laughter] so like, okay, this is like an extreme example. Well, most of us probably have not sat in a collapsing financial firm as the world falls apart. But, uh, we probably have gotten a lot of questions that are like this, of like,

14:23what do our customers think of our new release? Why do we lose the deals we lose? Uh, which customers are at risk of churning? Shout out to Jordan. This shows you when these slides got made.

14:31Um, even how much money do we make seems like a question we should have an answer to. And we give answers like this. We give these sorts of things of like, well, it depends on what you mean by money. What's an active user? I don't know. We have conservative ones.

14:43We have like aggressive ones. What's a weekly active user? How do we define all this stuff? And we end up presenting charts like this that have all of this kind of like weird minutiae in them, and they aren't actual real answers. Then we get a look from our boss being like, what the... Um, and if they had a tool that was not

14:58this, that was not us giving these weird answers, but was just this, was just like, okay, what are your customers saying? And this is not built... in the prior conversation, a lot of it was talking about how do you build like agentic analysts. This is not built on top of that. This is like the thing that does the gymnastics evaluation, where it

15:13is not write a query, make a chart. It is just saying like there's a bunch of support tickets. I'm going to read them.

15:19That's kind of what I'm good at. And then I'll give you a vibe. Like, are they going to use that? Are they going to use that? Are they going to use the analyst that gives this giant paragraph of answers as to why things are different, what they mean, why revenue is not what they say it is, all that

15:33sort of stuff. And again, we may have this objection of like it's not objective. Like that doesn't work. You can't do that. Which one, as a brief plug, uh we're working on some stuff to build some benchmarks to figure that out. Uh but the other thing I'll say is to quote this is a weird way to go. Uh

15:46to quote Stalin, um

15:51which I don't really want to do. Like, this is disputed, whether Stalin said it or Napoleon said it. Uh, quantity has a quality all its own. And so if you get a ton of answers from things like this, at some point the quality doesn't matter that much, that if you're having to wait 457 days for an answer versus you get a score right

16:08away, you'll take that one. And so like this is how we transition into this new world of vibes is these things will give us answers really quickly, whereas a lot of the data stuff won't and it'll give us this weird dispute. So uh the last thing I want to talk about here or close with is uh there is this tweet from

16:22Sean Taylor from a while back. Some of y'all may know Sean. Sean's great. Uh, that basically is like, there is a ton of faith in doing data work, that like, okay, what does it mean? We have to hire a bunch of really expensive people, they're going to look at really like esoteric numbers and do all this stuff, and how do

16:35we know if any of it works we'll hire more of them they'll tell us and like okay there's a whole lot of faith built into this and a whole lot of faith built into like this trend that has been the data stuff that we've done for so long but this faith is fairly fragile and if we run into problems like this where

16:50there's a whole bunch of lawsuits because we try to do data stuff, if we run into problems like this where a lot of the tools we built don't really work out, if everything else is getting done by vibes. So, people are writing a ton of code with vibes as all this stuff happens, and if the world sort of exists

17:03around vibes. So, this is an article from the New York Times a couple days ago about like new dating apps. Uh, they are matching people not based on these giant like algorithmic matching patterns, but based on like, what are your profiles, and we'll just smash it together with vibes. And if this stuff works, which apparently it does in this

17:18article, someone from San Francisco, if you're here, hello. Um, was pleasantly surprised by the way the AI worked, and she said the AI actually did a good job of finding compatibility. If this stuff works, and this is from people who are the up-and-coming generation, then like this is probably what's going to happen, that this will be the next

17:36best stuff professionally and we'll be the folks that are the old geezers off to the side. Uh, and with that I'm going to stop. Thank you everybody. That's it.

17:47[music]
