Joining data

Last updated:

|Edit this page

The real power of the data warehouse is the ability to combine data from multiple tables in a single query. Joins enable you to do this. They enable you to choose fields that act as connections between PostHog and external sources.

Table joins

You can join external data on existing PostHog schemas and other external data tables. These joins are saved and interpreted anytime they're accessed on the origin table.

To define a join, go to the data warehouse tab, click the three dots next to your source table, and click Add join. Here you define the source table key, joining table, and joining table key as well as how the fields are accessed.

For example, if you import your Stripe data, you can define a join between the events table's distinct_id key and the stripe_customer table's email key. You can then access the stripe_customer table through the events table like SELECT stripe_customer.id FROM events.

Create a join

Once joined, source properties can be used in filters, breakdowns, and HogQL expressions.

To edit or delete a table join, click the three dots next to your source table, click View table schema, click the three dots next to your joined table, and select Edit or Delete.

Person joins

Person joins are a special type of table joins. They are joins on the persons table in PostHog. When you join external data on this table, we enable you to use it like a person filter in insights.

Filter on joined person properties

Note: Be sure that your joined keys actually match. For example persons.id returns a UUID, even if you use an email as a distinct_id when capturing events. You might need to add a person property like email.

Query joins

If you only want to join data together for a single insight or query, you can use SQL commands like WHERE IN and JOIN SQL commands.

For example, to get a count of events for your Hubspot contacts you can filter events.distinct_id by email FROM hubspot_contacts like this:

SQL
SELECT COUNT() AS event_count, distinct_id
FROM events
WHERE distinct_id IN (SELECT email FROM hubspot_contacts)
GROUP BY distinct_id
ORDER BY event_count DESC

You can also use a JOIN such as INNER JOIN or LEFT JOIN to combine data. For example, to get a count of events for your Stripe customers you can INNER JOIN on distinct_id and email like this:

SQL
SELECT events.distinct_id, COUNT() AS event_count
FROM events
INNER JOIN prod_stripe_customer ON events.distinct_id = prod_stripe_customer.email
GROUP BY events.distinct_id
ORDER BY event_count DESC

Questions? Ask Max AI.

It's easier than reading through 580 docs articles.

Community questions

  • Tim
    4 months ago

    Stripe to Person Properties

    Hey!

    I'm looking to add stripe data to person properties so that I can filter people and see more info about where our paid users came from, recordings etc.

    I'm unsure how to get the stripe info to appear on the list of person properties.

    Are you able to help?

    • Chris
      2 months ago

      Thank you for the fast reply Tim! I've joined the persons and stripe_customers table by email address, but can't the properties are not added to the customer yet. Will try something with the SQL editor, many thanks!

  • Axel
    5 months ago

    Joins to groups table

    Hey! I had earlier used the Person joins only and was quite bummed that the same behavior isn't available for Groups yet, even though connecting data felt easier as you had the key immediately available in the main table (whereas for persons you need to send the joining key as a person property specifically).

    Is there any ETA on when I'd be able to use "extended group properties" in the same way as "extended person properties"?

    • Eric(he/him)
      5 months agoSolution

      Hey Axel!

      Thanks for expressing interest on extended group properties. We haven't implemented this yet but will put this on our radar for upcoming priorities and will reach out when it's ready.

  • Lena
    5 months ago

    Join of two events for visualization

    Hey there!

    I would need the property of one event as a filter criteria for the visualization of another event. I know how to do this in hogQL, but is there any chance to do this in the visualization tab itself? Because I still want to filter out the internal users not in a manual way.

    Thank you in advance. :)

    • Eric(he/him)
      5 months ago

      Hi Lena!

      This most likely isn't possible in the main visualization features as those are very geared towards analysis of a specific event (without joins) but can you describe more what you're trying to visualize and filter?

Was this page useful?

Next article

Creating views

In the PostHog data warehouse, you can save your most used queries as views and reference them in subsequent queries. Creating a view Query views are created directly inside SQL insights. If the query has valid view characteristics, the "Save as view" button is enabled. When clicked, you are prompted to give the view a name which can then be referenced in other queries. For a query to be a valid view, all fields being accessed must be aliased (with the SQL as keyword). The alias names are…

Read next article

PostHog.com doesn't use third party cookies - only a single in-house cookie.

No data is sent to a third party.

Ursula von der Leyen, President of the European Commission