Mike Evans, cofounder of Grubhub, released a memoir in 2022. It’s called Hangry. It’s an interesting story of a startup with a marketplace business model. He glosses over a lot of the dirty work of building a business of that variety, but he repeatedly comes back to the concept of mutual embiggening.
Strangely comical term, but altogether sound business principle.
The idea is pretty straightforward - as a business, you grow as your customers grow. In the case of Grubhub, the more orders placed through their platform, the more sales restaurants are making, and in turn, the more revenue Grubhub generates from those restaurants. Everyone wins. Mutual embiggening.
Neither the idea nor the general business model is new. Both exist in many verticals.
But it is a rare sight in the world of data.
What is the purpose of data?
Plenty of tech and non-tech companies alike have data teams of some flavor. Analytics, Machine Learning, Business Intelligence, Decision Science - they’re all flavors within the wide spectrum of “data teams”.
But what is the general purpose or mandate of a data team?
The knee-jerk reaction might be “to drive better decision-making” or something filled with more meaningless jargon (i.e. “to uncover actionable insights!”… please excuse my virtual eye roll).
Really, there are only 3 things a “data team” of any flavor is meant for:
1. Improve operations of the business through objectivity
2. Provide an alternative mechanism for monetization
3. Act as a signal to outsiders that this company “knows what they’re doing” even if they actually don’t
Those might seem too generic - where does the data science team fit in? Why are we ignoring data engineering?
In reality, those teams also fall into these categories. If you’re building machine learning pipelines, you’re either working to improve some existing inefficiency in the business, or you’re producing a monetizable feature that powers some user experience. No need to complicate things.
What do these flavors of data team work look like? #1 usually takes the form of dashboards, reports or some automation of an existing workflow. It might mean sending data from one tool to another, delivering some metrics on a regular schedule to the weekly Exec meeting, or just dumping a spreadsheet for an analyst. This is work that is meant to highlight improvements in other parts of the business.
The second is what you’ll hear referred to as a “data product”. This is often the goal for most teams, but getting to this point can be a challenge. Sometimes a company may think they’re on the path to building a data product only to fall into the trap of internal business intelligence reporting.
Or, they may have ambitions of monetizing their internal datasets only to be derailed by a lack of internal data management, poor data modeling, and plenty of unexpected work to make things usable.
Data products are the best opportunity for mutual embiggening, though. But we need to talk about billing before we can get there.
On Billing in Today’s Data Ecosystem
For the last 5-10 years, a wave of optimism has washed over the tech industry. Data has been viewed as a “more is better” commodity - a phenomenon that has produced some wildly popular products in the space (Snowflake, BigQuery, Fivetran, Census, etc, etc… the list goes on).
FirstMark’s ML/AI/Data Landscape now lists an apparent 1400+ companies, and there are plenty missing. Each one of those businesses is battling for your budget.
And while there are plenty of tools that provide real value in specific scenarios, looking at their pricing pages and billing models should leave you with one significant takeaway - these companies by and large show no alignment between your usage and the value you derive from the tool.
Usage-based billing has taken over. OpenView Partners says it themselves - usage-based billing has nearly doubled within B2B SaaS over the last 5 years.
And guess what? SaaS companies in the data space are all B2B; let’s be real (Side note - you could make the case a product like Whoop is a data product, but that is the exception).
At a high level, this is fine. These companies are making money charging for a product that people want.
But what’s shocking is how foreign Mike Evans’s mutual embiggening concept is to the data ecosystem. Businesses are designed to make money, but why has this entire industry turned away from a model that serves so many other businesses incredibly well? Call me crazy, but it can feel like a cash-grab at times.
The cost-value alignment of pricing models in other industries is obvious. The businesses fortunate enough to fit the “mutual embiggening” paradigm seem to do pretty well, regardless of size - Grubhub, Postscript (a former employer of yours truly), and ConvertKit are good examples.
The prevalent theme? They make money as their customers make money - delivery orders, ecommerce, and newsletter subscriptions, respectively.
And this is the core problem.
Many of the companies on the FirstMark MAD Landscape are service providers or internal tools, not revenue drivers. And without that alignment, you can’t embrace mutual embiggening. Even data SaaS businesses that claim cost savings are still just an expense at the end of the day. The economics are tough.
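To make the contrast concrete, here is a minimal sketch of the two billing shapes. Everything in it is made up for illustration: the `Month` record, the `price_per_unit` and `take_rate` parameters, and all of the numbers are hypothetical and are not taken from any vendor's actual pricing.

```python
from dataclasses import dataclass


@dataclass
class Month:
    units_used: int          # platform activity, e.g. rows synced (hypothetical)
    customer_revenue: float  # revenue the customer earned through the product


def usage_based_fee(month: Month, price_per_unit: float) -> float:
    """Vendor charges for activity on the platform, regardless of outcome."""
    return month.units_used * price_per_unit


def revenue_share_fee(month: Month, take_rate: float) -> float:
    """Vendor charges a cut of the revenue the customer actually made."""
    return month.customer_revenue * take_rate


# Two made-up months with identical usage but very different outcomes.
slow_month = Month(units_used=1_000_000, customer_revenue=0.0)
good_month = Month(units_used=1_000_000, customer_revenue=50_000.0)

for label, month in [("slow", slow_month), ("good", good_month)]:
    usage = usage_based_fee(month, price_per_unit=0.002)
    share = revenue_share_fee(month, take_rate=0.05)
    print(f"{label}: usage-based fee = {usage:.2f}, revenue-share fee = {share:.2f}")
```

Under these assumptions the usage-based fee is identical in both months, because it only tracks activity. The revenue-share fee drops to zero when the customer earns nothing and grows when the customer does, which is the mutual embiggening shape.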
The Chosen Few
There is another way to look at this mutual growth business model - it has to do with consumables.
We need to be careful here since it’s easy to conflate consumption with usage, but for this conversation, they have to be viewed differently. Consumption-based products are valued and priced based on the outcomes they provide; in our B2B SaaS scenario, that usually is tied to some other application making use of the product. Usage-based products provide an interface into a service of some sort, and charge based on the activity on the platform. There is no concept of “consumption” in another application context.
For instance, a software product that provides observability into errors in your running code (à la Grafana or Datadog) is a usage-based product. They are providing a service (introspection and observability), not a consumable.
On the other hand, a company like SafeGraph sells a consumable - they provide data to be consumed and processed by their customers. Coincidentally, their website explicitly says they offer “No pay-per-use policies”.
SafeGraph also happens to embody a mutual embiggening paradigm. Their data sets offer clear revenue generation opportunities, which in turn drives more consumption. It’s a win-win.
The fact is many SaaS businesses within the data space just don’t provide opportunities for revenue generation and monetization for their customers. The select few that do are data-as-a-product companies that offer datasets for enrichment purposes. Clearbit is another such example.
Data teams are constantly pushed to move towards data products and monetization, but the reality is that the incentives often don’t align. Even within the broader data industry, pricing models overwhelmingly lean into a “we charge you for using our product” position rather than one of mutual benefit.
And until mutual embiggening becomes a more common practice for both vendors and data teams alike, the unspoken truth is that these vendors and teams will remain cost centers.