How to Query
Activities and Relationships
The simplest query on an activity stream is to simply choose an activity. Let’s say we chose ‘Page View’. We’d see the following table
Customer | Activity | Page Viewed |
---|---|---|
Ann | Page View | landing page |
Ann | Page View | faq |
Ann | Page View | product |
Matt | Page View | home |
Ann | Page View | checkout form |
This is useful, but querying activities becomes interesting when we can start describing a sequence of events.
Narrator realizes that activities have relationships with each other. In the example activity stream it’s clear that Ann is in the process of selecting a product, checking out, and submitting an order.
Let’s say you recently launched a new marketing campaign and want how it’s performing. One metric might be the number of times a customer submitted an order after visiting the campaign’s landing page.
More specifically, we want to find all page views called ‘summer promo’ followed by a ‘submit order’ activity. In traditional SQL this can be relatively challenging, but Narrator thinks this way by default.
When querying the activity stream (say though building a Dataset), selecting two activities means that each row in the result will represent a customer doing both activities, in that order.
This is our query: show people who did ‘Page View’ and also did ‘Submit Order’ after ‘Page View’
Here’s what that looks like for our example
Customer | Activity 1 | Page Viewed | Activity 2 | Order ID |
---|---|---|---|---|
Ann | Page View | checkout form | Submit Order | Order 25 |
Ann | Page View | checkout form | Submit Order | Order 37 |
The first thing to notice is that this has limited the Dataset. We only see those page views that happened strictly before an Order. Conceptually it’s the same as finding each Order activity that matches, then walking backwards in time to the nearest page view.
The second thing to notice is that this isn’t correct. It should only show the 2nd order, and it matched both with the checkout form page view.
A revised query looks like this:
show people who did ‘Page View’ where ‘page viewed = summer promo’ and also did ‘Submit Order’ after ‘Page View’
Customer | Activity 1 | Page Viewed | Activity 2 | Order ID |
---|---|---|---|---|
Ann | Page View | summer promo | Submit Order | Order 37 |
This basic principle applies when adding three or more activities as well. Each one will further limit the Dataset to instances where all three (or four, etc) activities happened in order.
Relationship Options
Ignoring filters for a moment, a full query conceptually looks like
show people who (did / did not) <activity 1> with occurrence = (all / first / last) and also (did / did not) <activity 2> with occurrence = (all / first / last) (before / after) <activity 1>
Before / After
This is relatively straightforward: use before / after to select the order in which the activities relate.
Did / Did Not
This can be used to specify a sequence of events that did or did not happen. For example: show all people who viewed the summer promo page but did not submit an order.
Occurrence
This allows us to limit our Dataset even further.
Activity occurrence is a property of the activity stream. It’s simply an integer that counts how many times that activity has happened for that customer.
Customer | Activity | Order Number | Activity Occurrence |
---|---|---|---|
Ann | Submit Order | Order 25 | 1 |
Ann | Submit Order | Order 37 | 2 |
In the example above, let’s we I wanted to find everyone whose first order could be attributed to the promotion. The query would look like
show people who did ‘Page View’ where ‘page viewed = summer promo’ and also did ‘Submit Order’ (occurrence = first) after ‘Page View’
Specifying an occurrence in relationships is the same as adding a filter on the activity occurrence column.