Squeezing Maximum ROI Out of Small Data
Speaker

Lindsay is a data leader with 13 years of experience. She has successfully launched and led data teams at startups, including BenchSci, Maple, and Secoda. Her expertise spans the development of internal data products and “modern data stack” infrastructures, hiring and developing data engineering teams, and crafting data strategies that help businesses deliver on organizational objectives. As an active member of the data community, Lindsay organizes the 2500-member Toronto Modern Data Stack Meetup group, has taught over 100 students Advanced dbt with Uplimit, and hosts a weekly podcast called Women Lead Data.
0:00 [Music]
0:16 Back in 2002, an innovative new company launched with a singular mission: to enable people to live on other planets. This company faced significant competitive pressure from the incumbents, and it needed to find a way to build reusable rockets so that it could compete with companies that had been around a lot longer than
0:43 them. Taking on these constraints, SpaceX was
0:47 able to deliver reusable rocket technology, something that was deemed impossible by many in the industry before this. Over the next two decades, SpaceX embraced the constraint of reusability, and this helped them to deliver their first recovery of a rocket, a Falcon 9, in 2015. By 2017 they made history by successfully relaunching a rocket back
1:14 into space that had previously been retrieved. But this focus on constraints wasn't only a development for SpaceX; it also changed space travel as we know it. The cost of launching a rocket is 10 times cheaper than it was 10 years ago because of the work that SpaceX has been doing. So by focusing on the constraints of reusability and cost effectiveness,
1:36 SpaceX was able to drive massive innovation for themselves and for the industry. So why am I talking about rockets when we're at a data conference? Well, the focus on thinking inside the box in this story helps us understand how focusing on constraints can drive innovation and make the impossible possible, and that's what I want to talk about today:
2:00 how can data teams squeeze maximum ROI out of small data by focusing our efforts and making constraints play in our favor? But constraints are not something we typically talk a lot about, especially in the modern data stack. We often think really big, and when we talk about constraints we're often focused on the lack of them. With massively scalable
2:23 cloud data warehouses, we can spin up new databases and scale infrastructure in minutes, regardless of what our actual data size or needs are. We have the ability to sync more and more data with all the different connectors that we have options for, we can add more data models to our dbt projects, and we can deliver more and more
2:46 dashboards to the business, seemingly with less effort and faster than we've ever been able to before. But if we stop and wonder, where did this abundance mindset come from? How did we get here? Well, if we look back to the time when the modern data stack had its glow-up period, when it
3:04 started to develop, we were in a somewhat atypical economic condition: a bull market and massive economic growth for just over 10 years. Paired with this was the zero interest rate period that we've been talking so much about. This is over now, but when the modern data stack was growing up it was
3:22 very easy to borrow money and build companies. These conditions are what allowed us to go from this in 2012 to this in 2024. As data people, we now have literally endless options to buy our data infrastructure and expand our capabilities. But is this really a good thing? This level of competitiveness and abundance mindset is what also gets
3:49 us here, with thousands of data models that all need to be maintained for data quality, have documentation written about them, and more. On the delivery side of things, we end up with more BI dashboards than the business knows how to handle, and half the time no one's looking at any of them. So to maximize our ROI, we need to
4:11 find a way to challenge this abundance mindset that the modern data stack has thrust upon us. We need to go back to first principles and think about the value that we deliver divided by the cost that it takes for us to run our data teams. So let's focus on this first piece: to maximize value, we have to
4:31 focus. Deciding what not to do is as important as deciding what to do. But this can be really hard for data teams, partly because for decades now the business has been telling us how sexy we are. First it was the data scientist, then analytics engineers took that title, and most recently data engineers have
4:55 been titled the sexiest job of the 21st century. So companies who want to become data driven have jumped at the chance to hire their first data team, the first data person who's going to create magic from the data in their business. And so we've become pretty accustomed to being in demand, and if you work on a data team
5:17 you know what this feels like: a long lineup of stakeholders, all of whom are vying for our time. There are more requests than ever before and there's not enough time to get it all done. Every data team that I've ever worked on has felt chronically under-resourced. But I think what we need to realize is that not all stakeholder requests and
5:37 not all stakeholders are created equal. One tool that we can use to think about this is the stakeholder engagement matrix: thinking about stakeholders rather than the requests that they submit, thinking about the people in the business who are the most powerful, and focusing our time on those individuals. The worst place to get caught as a data team is in this
6:01 bottom right-hand corner, the keep informed quadrant. These are the people who are very hungry for data. They're typically focused on tactical execution in the business, and the work that these people can submit to your backlog will keep you busy for years on end. The people that are really going to make a difference in the business are the ones
6:19 in the top quadrants, and oftentimes we may need to spend our time engaging those in the keep satisfied quadrant and moving them into the engage and consult quadrant by increasing their interest in data. So how do we do that? Well, we have to speak their language. We have to fundamentally understand the business, and we can't be talking about
6:42 pipelines and orchestration and warehouse partitioning when we talk to these people; this will not get their buy-in. We have to understand the business at least as well as they do, if not better. One of the ways that we can do this is by thinking about the business as a growth model: how does the business make money? I work for a company called
7:02 Hiive. We are a secondary marketplace for pre-IPO stock trading. This is a metrics tree that would apply to the business that I work in. Being able to take this type of tool and start speaking to the leaders in the business changes the conversation from data to how this business actually functions and how we can measure
7:23 performance. This isn't just a documentation asset. It can be translated into weekly metrics reviews, forecasting projects, and things where you can actually show the business: if we pull these levers, here's how we generate more revenue. When we start talking like this, we start getting more visibility with those senior stakeholders and moving them into
7:44 that engage and consult quadrant. Or maybe you have a stakeholder who is already in the engage and consult quadrant, but they're asking you to implement a generative AI model to do something extremely simple that doesn't require it. Maybe you've got a CEO that's talking a little bit like this: they want to use AI
8:08 for everything. This can also become a great opportunity to educate these stakeholders, to start talking about the value that you can deliver from first principles, and to focus on the constraints that you have in your business. So let's talk about that next. Focusing on constraints, as we saw with SpaceX, actually helps to drive more
8:31 innovation, and as I said earlier, I think the modern data stack has this challenge that we try to minimize the constraints; we talk as if we don't have any. Quick poll: how many folks do we have in the room who are trained engineers? I'm curious. OK, it's about 50, a little less than 50%. So that
8:53 validates my hypothesis that a lot of data folks, especially folks who have been in the industry a little longer or are coming from different backgrounds, haven't necessarily been formally trained in the idea of focusing on constraints and how those affect the quality of our projects. The iron triangle is a really great
9:11 tool to use when you're planning out your data projects. This is an important conversation to have with the business: in delivering a high-quality project, you're always going to be balancing scope, cost, and time. A lot of times the business wants us to move extremely fast and deliver a high-quality project without
9:31 regard for the long-term cost and the scope creep of a project. If we bring these into our conversations about data project planning, and bring this into a formal engineering design process, we can then start to think about constraints and keep them in mind as we build out our projects. When we think about cost, though,
9:52 we don't only want to think about the cost of delivering a project in that moment. We also need to think about the hidden costs, that is, the total cost of ownership. The initial purchase price of a new tool, or building a new model, or your compute costs, is one thing, but we also need to think about
10:10 ongoing maintenance of projects, the downtime, the scaling costs: things that aren't necessarily easy to identify when you're first planning a project. The other thing to consider is that sometimes we build in tools that may all of a sudden change their pricing models, and this can really shake things up for us. This is not unknown to a lot of data teams, where you've
10:34 become dependent on something and then it becomes a lot more expensive overnight. So in this post-ZIRP economy, we as data teams need to be careful how we're building and think about what happens if something I've become very dependent on changes on me
10:54 overnight. If you remember, there was a concept of a millennial tax that was talked about back in the early 2010s, where Uber, DoorDash, and companies like that would give you a discount on their service so that you would get hooked on using the app. I think
11:12 that's what we're seeing a little bit in the modern data stack. There's this tax where a lot of times you can get hooked on using these products very early on, with free credits and other incentives to get you bought in, and then pricing models change. So it's very important for us as data teams to think about the choices
11:29 that we make now and how they will scale into the future. We're even seeing tools now that help us reduce cost: more tools that we can buy, but these are actively helping us shrink our bills. These also become great assets that you can take back to the business
11:46 to drive accountability. I don't know how many folks have created assets for stakeholders who don't have an understanding of the cost that it takes to maintain something. People don't have a concept of what it takes to build a dashboard or to maintain a pipeline, one that maybe doesn't get used very much. Typically stakeholders are
12:07 so far removed from the cost of their requests that we need to make that more visible to them and start thinking about pushing that accountability back on the
12:18 business. The tools that we are here celebrating today, dlt, MotherDuck, and some others, can help us think about cost containment from the beginning. When we build from the ground up, we may be able to build with these constraints in mind and get ahead of these costs before they happen. So as we start to unravel this illusion
12:36 that constraints don't exist: they've always been there, we've just been choosing to ignore them. With that I will close things off and say: think about the two key levers of ROI, the value that we deliver and the cost that it takes for us to deliver this value. To maximize our value, we must
12:56 resist this abundance mindset that we can just continue to build nonstop, and we have to think about focusing our efforts on the stakeholders who are most important in the business. To minimize our costs, we can turn constraints into innovation: focus on constraints and costs first and treat them as first-class citizens at your planning table. If we
13:18 treat them accordingly, as an opportunity to innovate, we can build higher-quality data products. All right, thanks for listening, and I would love to connect. Thank you. [Music]
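
The two ROI levers from the talk (value delivered divided by the cost of delivering it), together with the total-cost-of-ownership components it mentions (initial cost, ongoing maintenance, downtime, scaling), can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not something shown in the talk: the `DataAsset` class, its field names, and every dollar figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataAsset:
    """Hypothetical data asset (dashboard, model, pipeline) with made-up yearly figures."""

    name: str
    annual_value: float        # estimated business value per year (assumed)
    initial_cost: float        # one-time build or purchase cost
    annual_maintenance: float  # ongoing maintenance effort, expressed as cost
    annual_downtime: float     # cost of outages and incidents
    annual_scaling: float      # extra compute/storage as usage grows

    def total_cost_of_ownership(self, years: int) -> float:
        """Initial cost plus all recurring (hidden) costs over the period."""
        recurring = self.annual_maintenance + self.annual_downtime + self.annual_scaling
        return self.initial_cost + recurring * years

    def roi(self, years: int) -> float:
        """Value delivered divided by the cost of delivering it."""
        return (self.annual_value * years) / self.total_cost_of_ownership(years)


# Two illustrative assets: one used in weekly metrics reviews, one rarely opened.
metrics_review = DataAsset("weekly_metrics_review", 50_000, 10_000, 4_000, 500, 500)
unused_dashboard = DataAsset("rarely_viewed_dashboard", 1_000, 8_000, 3_000, 0, 1_000)

for asset in (metrics_review, unused_dashboard):
    tco = asset.total_cost_of_ownership(years=3)
    print(f"{asset.name}: 3-year TCO = {tco:,.0f}, ROI = {asset.roi(years=3):.2f}")
```

Putting both numbers side by side for each asset is one way to make the cost of a request visible to the stakeholder who asked for it; how value is estimated is left open here and would need to be agreed with the business.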


