Doing More with our Data at DoSomething.Org: a Nonprofit’s Small Data Story
Speaker

Sahil is a data-focused technology practitioner who uses his interest in social impact to advance the initiatives of charitable organizations so they can better achieve their missions. As Senior Data Engineer at DoSomething, Sahil leads all things data and empowers the organization to benchmark its work to ensure DoSomething continues to fuel young people to change the world.
0:00[music]
0:05[music] The title of the talk is Doing More with our Data at DoSomething.org: a Nonprofit's Small Data Story. My name is Sahil Gupta. I'm the senior data engineer.
0:18DoSomething is 32 years old, and on our 30th anniversary we made this video.
0:24It's a short introduction. I'll just play it for you now.
0:42We call BS.
0:51[video plays: music, applause]
2:03>> So, DoSomething has gone through many evolutions. You may remember us from the MTV era if you were a teenager at that time. In our current iteration, we're 20 staff members based in New York, and we support a membership of almost 8 million young people to date.
2:21Our mission as a nonprofit is to fuel young people to change the world. We spent three decades learning, building, and growing alongside the generation that will make the world better. Our target demographic is usually 13 to 25.
2:35Make the world better for themselves and for everyone in it. DoSomething has fueled 8 million young people to turn their passions into action since 1993.
2:43And we built one of the largest youth-centered movements for doing good. This is what our website looks like today. You'll see a volunteer opportunity with a call to action highlighted. We call them report backs, but this is what happens when a young person completes a volunteer opportunity: they send us a picture showing that they did it. So here's a
3:08highlighted case on the top, and if you click the call to action you'll be taken to the Take Action page. This is where we have highlighted or featured opportunities, but also the opportunity to filter by cause area, like environment or mental health, choose if you want to do an activity alone or with
3:31a group, and how long you expect to spend on that activity. You'll see that some of our activities, or actions as we call them, are eligible for scholarships. So if you complete the action, send us a picture, and write a blurb about what you did, and we approve it, you'll be considered for a scholarship.
3:54Young people can also get volunteer credit in the form of a certificate. Many young people have community service requirements for high school or for middle school, so that's a big part of our user base.
4:06If you were to filter by all, or see what we have in the top row, we have our most recent actions, which were around safe summer driving, sponsored by GM. We have two actions, so these are two volunteer opportunities, and a program, which is sort of an umbrella for actions.
4:25This is what our impact page looks like. To date we've activated 8 million young people. We found that members are three times more likely to volunteer or advocate for social issues, and there have been over 650,000 actions taken by DoSomething members since 2018 alone.
4:46Of those actions, 28,000 were climate and sustainability related. 87% of members reported acquiring skills essential for sustainable living. And over a ton of toxic chemicals were diverted from landfills.
5:01So that's some data. How do we get that data? Well, we have a data pipeline. We have a data infrastructure. And we've worked with data for a while. DoSomething prides itself on being technologically forward. As a nonprofit, we pioneered the use of SMS for interacting with young people programmatically through code. Similarly, our data team has been pretty
5:21robust for a while. But before we joined the small data movement, queries took hours, if they ran at all. We had 4 terabytes of data in Postgres, which admittedly is not the best choice of data warehouse. Mostly they were web events, and any query that involved those events was prohibitively slow.
5:41Sometimes I would leave them running all day or overnight. They might finish, they might not. The cost was way too high for the value that we were getting. As a nonprofit, we only truly relied on a few simple metrics, which didn't justify our bill. We have no embedded analytics.
5:58Our users are not doing complex analytics of their activity. We just needed to get the metrics that we needed for our nonprofit purposes.
6:09Our staff, especially our non-technical staff, lacked direct access to insights. Tableau running on Postgres as a back end was way too slow if we tried to have live dashboards or views, and we were bottlenecked with Tableau by capacity, in terms of which staff members were comfortable even going to Tableau just to consume a dashboard.
6:31So what does our new data system look like? This is something we co-created with MotherDuck, and we did that in 2024. Our key considerations were to be cost-effective and right-sized, and to be suitable now for one data practitioner. When I joined, I took over a system that was built by a team of 15 engineers, and today
6:51I'm the only data person at DoSomething, so we needed a system that accommodated that reality.
6:57So now for extraction, or ingestion, we use Fivetran. We used to use Fivetran as well, but now we've limited it to just one data source, which is GA4, Google Analytics 4. We used to have 15 data sources. To make up for some of the data sources we removed, we have some simple batch scripts and Python scripts to
7:16augment that data. We load it all into MotherDuck instead of Postgres, and we transform. We used to use dbt Cloud. Now we use dbt Core running in a GitHub Action every night.
7:30So our strategy for extraction was to limit our sources only to key operational data and only aggregated web events, not consuming every individual web event, which drastically reduced our ingestion bill. In fact, we are now on the free tier of Fivetran.
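The idea of loading aggregated web events instead of every individual event can be illustrated with a minimal sketch. The event fields (`ts`, `page`, `event`) and the sample rows below are hypothetical and are not DoSomething's actual schema; the point is only that many raw rows collapse into one count per day, page, and event type before anything is ingested.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw web events; field names and values are illustrative only.
raw_events = [
    {"ts": "2024-06-01T09:15:00", "page": "/take-action", "event": "page_view"},
    {"ts": "2024-06-01T09:16:30", "page": "/take-action", "event": "page_view"},
    {"ts": "2024-06-01T10:02:10", "page": "/impact", "event": "page_view"},
    {"ts": "2024-06-02T08:45:00", "page": "/take-action", "event": "page_view"},
]


def aggregate_daily(events):
    """Collapse individual events into one count per (date, page, event)."""
    counts = Counter()
    for e in events:
        day = datetime.fromisoformat(e["ts"]).date().isoformat()
        counts[(day, e["page"], e["event"])] += 1
    # One output row per group, sorted so the result is stable.
    return [
        {"date": d, "page": p, "event": ev, "count": n}
        for (d, p, ev), n in sorted(counts.items())
    ]


daily_rows = aggregate_daily(raw_events)
for row in daily_rows:
    print(row)
```

Here four raw events become three aggregated rows; at the scale of real web traffic the reduction is far larger, which is where the ingestion savings would come from.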
7:47We load it into MotherDuck, which allows us to do high-performance analysis for a fraction of the cost of our previous data warehouse, which was Postgres, and it's obviously much faster and more performant. Our transformations now happen in GitHub Actions, and because they're running on MotherDuck instead of Postgres, they're much faster. It's not apples to apples because our data models have
8:07changed, but our dbt run on dbt Cloud used to take 8 hours and now it takes 2 minutes in GitHub Actions, so it fits very nicely into the 2,000 minutes that we get monthly for free.
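As a rough check on that claim, the arithmetic can be worked through in a few lines. The 2-minute and 8-hour run times and the 2,000 free minutes come from the talk; the assumption of exactly one run per night over a 31-day month is mine, and actual billed minutes would also include job setup time, so treat this as a lower bound.

```python
FREE_MINUTES_PER_MONTH = 2_000  # free monthly allowance cited in the talk
NEW_RUN_MINUTES = 2             # nightly dbt run time cited in the talk
OLD_RUN_MINUTES = 8 * 60        # previous 8-hour run, for comparison only
RUNS_PER_MONTH = 31             # assumption: one run per night, longest month


def monthly_minutes(run_minutes, runs=RUNS_PER_MONTH):
    """Total runner minutes consumed by a nightly job over one month."""
    return run_minutes * runs


new_usage = monthly_minutes(NEW_RUN_MINUTES)
old_usage = monthly_minutes(OLD_RUN_MINUTES)
speedup = OLD_RUN_MINUTES / NEW_RUN_MINUTES

print(f"New pipeline: {new_usage} of {FREE_MINUTES_PER_MONTH} free minutes")
print(f"Old run time would need: {old_usage} minutes")
print(f"Run-time ratio: {speedup:.0f}x")
```

Under these assumptions the nightly run uses 62 minutes a month, about 3% of the free allowance, whereas an 8-hour nightly run would need 14,880 minutes. As the speaker notes, the data models changed, so the 240x ratio is not a like-for-like benchmark.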
8:20So there are too many wins to name them all. Here are the top three. Our data system, all included, from Fivetran to our warehouse to our transformations, is 20x cheaper. And given the reality of constrained nonprofit resources, especially in the fundraising environment we find ourselves in today, reducing costs here to right-size against downstream value made data-driven operations sustainable. It's also simple
8:46and fast. As one data person, I am now able to coherently develop, test, and iterate on a system that is not only much simpler, but also orders of magnitude more performant. I mentioned queries used to take hours or run overnight.
8:59In MotherDuck, we don't have a query that goes longer than a minute. A surprising win, but a powerful one, is that the accessibility of our data has increased. Surprisingly, non-technical team members now feel comfortable engaging directly with our data and extracting necessary insights themselves most of the time (sometimes they still file a data request), using
9:23MotherDuck's features like Column Explorer and AI SQL. And that's something that has been a game changer for us.
9:31So, what does this mean for nonprofits at large? We're not the only charitable organization that needs data, has a significant amount of it to process (our total data size is about 5 terabytes), but is operating with inherent operational constraints, constraints that are part of being a nonprofit.
9:51In fact, 76% of nonprofits lack a formal data strategy, according to Salesforce. Less than 3% of most nonprofit operating budgets is spent on technology, according to the Chronicle of Philanthropy. And 27% of nonprofits experience problems working with data, according to NTEN.
10:12So, the nonprofit data quandary, as I'm calling it: the fundamental issue for nonprofits and data is that nonprofits are underserved by technology providers.
10:23The sector requires substantial capacity building to make data useful. But on the other hand, nonprofits need data to make better decisions and drive outcomes. And even more concretely, our funders require it. When we get a grant or apply for a grant, there are certain outcomes, certain data outputs, that we are obligated to provide. And the public, whether it's on Charity
10:45Navigator or GuideStar or through our IRS Form 990, expects nonprofits to be transparent about their inputs and outputs.
10:56So, how can small data help? Well, we can right-size our data practices by only collecting what we need. So instead of collecting every web event, collect aggregated web events. Transform data using open-source tools so that we are not locked in and can leverage free usage.
11:16Leverage tools with the best price to performance, namely MotherDuck in this case. Set realistic expectations for data freshness and metric complexity. As a nonprofit, daily refreshes for reports are more than enough, and trying to bake in real-time data analytics would have made the system much more complex.
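A daily-refresh expectation can be stated explicitly rather than left implicit. The following is a minimal sketch under my own assumptions: a hypothetical `is_fresh` helper and a tolerance of one day plus a six-hour grace period for the nightly run. Neither the helper nor the threshold comes from the talk.

```python
from datetime import datetime, timedelta

# Assumption: data refreshed nightly is acceptable, with a grace period
# so that a slightly late run does not count as stale.
MAX_AGE = timedelta(days=1, hours=6)


def is_fresh(last_loaded_at, now, max_age=MAX_AGE):
    """Return True if the most recent load is within the agreed tolerance."""
    return now - last_loaded_at <= max_age


now = datetime(2024, 6, 2, 9, 0)
loaded_last_night = datetime(2024, 6, 2, 2, 0)
loaded_three_days_ago = datetime(2024, 5, 30, 2, 0)

fresh = is_fresh(loaded_last_night, now)
stale = is_fresh(loaded_three_days_ago, now)
print(fresh, stale)
```

A check like this is deliberately coarse: it only flags a missed nightly run, which matches a reporting cadence of once a day and avoids the complexity of real-time monitoring.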
11:36And in general, keeping things simple, flexible, and small. How can you help, technology providers in the room? Philanthropy isn't just about money. Charitable organizations like nonprofits and foundations have unique constraints, that's true, including not having huge budgets for technology. But we also have powerful proximity to impact,
11:59which can help you have the impact that might help you better achieve your mission, improve public perception, or whatever it may be. So my plea, what I'm advocating
12:13for, why I'm here today, is that I would like you to reach out to nonprofit organizations doing work that aligns with yours. You don't need to do mass cold outreach.
12:24Find targeted nonprofits that align with your for-profit mission. Understand their unique data nuances and needs. Find powerful collaborations and solutions, just like we have with MotherDuck. And lastly, offer nonprofit pricing.
12:40So that's how DoSomething, and nonprofits in general, can have even bigger impact with small data. Thank you.


