Feature launch: Identity graphs | RudderStack

By neub9


RudderStack is the warehouse-native CDP built for data teams. Identity graphs are a foundational component of almost every use case involving customer data, from analytics to personalization. At a basic level, an identity graph builds a complete view of a business entity (user, account, etc.) by finding all unique identifiers for that entity across a variety of data sources.

Building an identity graph, though, is challenging. If you’ve ever undertaken an identity graph project, you already know the pitfalls. At the beginning of the project, identity resolution starts simply enough: deduplicating users that exist as multiple records or have multiple identifiers across a known set of disparate data sources. This process essentially creates a simple identity graph across your data, most often with SQL joins.
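To make the idea concrete, here is a minimal sketch of deterministic identity resolution in Python. It is not the SQL that Profiles generates; it only illustrates the underlying technique (a union-find over identifier pairs), and the identifier names and the `resolve_identities` helper are hypothetical.

```python
from collections import defaultdict


def resolve_identities(edges):
    """Group identifiers that are linked directly or transitively."""
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Always keep the smaller root so the result does not depend on edge order.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    for a, b in edges:
        union(a, b)

    clusters = defaultdict(set)
    for identifier in parent:
        clusters[find(identifier)].add(identifier)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical identifier pairs, e.g. rows produced by joining two source tables.
edges = [
    ("email:ann@example.com", "crm_id:C-100"),
    ("crm_id:C-100", "device:abc123"),
    ("email:bob@example.com", "crm_id:C-200"),
]
identity_clusters = resolve_identities(edges)
```

Each inner list is one resolved entity. The device identifier ends up in the same cluster as the email address even though no single row links them directly, which is the transitive step that plain pairwise joins tend to miss.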

But, inevitably, edge cases begin to creep in. The data model in your CRM is different from what the team anticipated. A legacy system uses a different data model and has legacy data that needs to be resolved to your current (and much better) data model. So, over time your simple model grows into an unmanageable mess… that downstream teams can’t live without because it’s the most comprehensive source of customer data available for data activation use cases, from analytics to lists for marketing.

These models also become more and more fragile over time as you add new sources, data models change, and team members move on. Before you know it, the model becomes a risk in and of itself. Adding new datasets or even minor conditionals begins to affect existing use cases and break downstream systems. Adding a new data source becomes a long-lived project where newly introduced bugs are chased at each step. Ultimately, you end up with an increasingly brittle model that few trust but the business relies on. These problems can’t be solved in marketing platforms or audience-building tools.

That’s why it’s time for dedicated tooling that helps data teams solve identity resolution at the root and easily scale the project as data and business needs change.

Introducing identity graphs for RudderStack Profiles

We launched the first version of our Profiles product at Snowflake Summit in June of 2023. Since then, we’ve been working directly with customers to understand their identity resolution projects on a deeper level and how we could improve our product to make their work even more streamlined.

Today, we’re announcing identity graphs for Profiles, a feature that allows data teams to easily configure and generate deterministic identity graphs, at any level of complexity, directly in their warehouse.

Like all RudderStack features, identity graphs are warehouse-native, meaning:

  • All of the jobs run transparently in your warehouse or data lake
  • The code Profiles produces is transparent, auditable, portable SQL
  • Schedules and data sets are fully configurable, making it easy to control compute cost while still meeting the needs of every downstream team
  • The output tables for identity graphs live within your warehouse

One major challenge businesses face is that traditional data models from marketing, customer success, and sales tools rely on an overly simplistic user/account taxonomy. Most business models transcend such a narrow view.

Data teams often need to consider not only individual entities (customers, users, etc.) but also how they roll up or relate to each other in the context of a household, business, or account. IoT companies often also need to associate physical devices with users and households.

Identity graphs in RudderStack Profiles are completely agnostic to business entities and can support any kind of relationship between entities, drastically simplifying the task of modeling business logic.
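As a rough illustration of what entity-agnostic relationships look like as data, the sketch below rolls child entities up to their parents. The row layout, entity names, and the `roll_up` helper are assumptions made for the example, not the Profiles configuration format.

```python
from collections import defaultdict

# Hypothetical relationship rows:
# (child_entity_type, child_id, parent_entity_type, parent_id)
relationships = [
    ("user", "u1", "household", "h1"),
    ("user", "u2", "household", "h1"),
    ("device", "d1", "user", "u1"),
    ("device", "d2", "household", "h1"),
]


def roll_up(rows):
    """Map each parent entity to the sorted list of entities attached to it."""
    members = defaultdict(set)
    for child_type, child_id, parent_type, parent_id in rows:
        members[(parent_type, parent_id)].add((child_type, child_id))
    return {parent: sorted(children) for parent, children in members.items()}


rollups = roll_up(relationships)
```

Because the entity type is just a value in each row, adding a new kind of entity (an account, a vehicle, a store) means adding rows, not changing the function.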

In addition to multiple entities, many businesses have multiple business lines/brands or multiple categories of users to consider. Companies with this more complex business logic often need to resolve user identities across business lines while maintaining dedicated identities and user feature sets individually for each business line.

Profiles identity graphs make it easy for data teams to maintain a global view of every entity alongside dedicated identity graphs for brands, product lines, teams, or any other business logic component.
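The difference between a global graph and per-brand graphs can be sketched by resolving the same identifier pairs twice: once all together, and once per brand. This is only an illustration under assumed inputs; the brand tags, identifiers, and the `connected_components` helper are hypothetical and do not reflect how Profiles implements this.

```python
from collections import defaultdict


def connected_components(edges):
    """Return sorted clusters of identifiers linked by the given pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical edges tagged with the brand whose data produced them.
tagged_edges = [
    ("brand_a", "email:ann@example.com", "brand_a_id:1"),
    ("brand_b", "email:ann@example.com", "brand_b_id:9"),
    ("brand_b", "email:bob@example.com", "brand_b_id:7"),
]

# Global view: every edge contributes to one graph.
global_graph = connected_components([(a, b) for _, a, b in tagged_edges])

# Dedicated view: one graph per brand, built only from that brand's edges.
edges_by_brand = defaultdict(list)
for brand, a, b in tagged_edges:
    edges_by_brand[brand].append((a, b))
brand_graphs = {
    brand: connected_components(brand_edges)
    for brand, brand_edges in edges_by_brand.items()
}
```

In the global graph, Ann's two brand-specific IDs land in one cluster because they share an email address. In the per-brand graphs, each brand sees only its own ID for her.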

