Analyzing event-action relationships in SQL databases

Hey folks, I’m stuck on a SQL problem. I need to figure out which events happen right before actions in my database. I’ve got two tables:

  1. user_activity with columns:

    • activity_timestamp
    • user_identifier
  2. user_interactions with columns:

    • interaction_timestamp
    • interaction_category
    • user_identifier

I want to count how many users took an action based on the interaction type that came just before it. For instance, if a user’s history looks like this:

interaction > interaction > interaction > activity > interaction

I want to count the interaction type in bold.

My output should look something like this:

| Interaction_Category | User_Count |
|----------------------|------------|
| Type A               | 50         |
| Type B               | 75         |

I tried writing a query, but it’s not working. Can anyone help me figure out the right approach? Thanks!

ooh, interesting problem! have you considered using a window function to compare timestamps? maybe something like LAG() or LEAD() could help identify the interaction right before each activity? just brainstorming here… what other approaches have u tried so far?

To tackle this issue, I’d recommend using a correlated subquery combined with a window function. Here’s a potential approach:

SELECT interaction_category, COUNT(DISTINCT user_identifier) AS user_count
FROM (
  SELECT ua.user_identifier, ui.interaction_category,
         ROW_NUMBER() OVER (PARTITION BY ua.user_identifier, ua.activity_timestamp
                            ORDER BY ui.interaction_timestamp DESC) AS rn
  FROM user_activity ua
  JOIN user_interactions ui ON ua.user_identifier = ui.user_identifier
  WHERE ui.interaction_timestamp < ua.activity_timestamp
) subq
WHERE rn = 1
GROUP BY interaction_category

This query joins the tables, finds the most recent interaction before each activity, and then counts unique users for each interaction type. It should provide the output you are looking for.

yo, that’s a tricky one! maybe try using a subquery to find the latest interaction before each activity? smthing like:

SELECT interaction_category, COUNT(DISTINCT user_identifier) AS user_count
FROM (
SELECT…
WHERE…
AND interaction_timestamp = (
SELECT MAX(interaction_timestamp)…
)
)
GROUP BY interaction_category

just a thought. good luck!