Understanding Snowplow Analytics Custom Contexts

Context can improve your understanding of event data

By Joao Correia
Jul 5, 2018 in Snowplow Analytics

At a time when tracking everything that moves seems to be the norm, context is often overlooked.

Context describes the circumstances surrounding an event, which is critical to understanding the event fully. This applies to real-life situations too.

It's a goal! Who scored? Which teams were playing?

What are Snowplow Analytics Custom Contexts?

In Google Analytics and Adobe Analytics, the way to add context to an event is to map an event property to a custom dimension or traffic variable, usually referenced by a number, or by a letter and a number.

The mapping can quickly become confusing, with dictionaries and spreadsheets needed to track which property occupies which numbered slot. We’ve all been there.

Snowplow custom contexts aim to do what Google custom dimensions and Adobe traffic variables do, with one significant difference: they are self-describing JSON.

With Snowplow you can send a custom context as an extra argument for any Snowplow event type.

There are several advantages to this approach:

You will never run out of space for context
Schemas are versioned and can expand/contract with business needs
Schemas are reusable
Schema variables are easy to read and understand
The analysis is easier because each schema lives in a dedicated SQL table

Below is an example of a custom context that BestBuy could use to describe a laptop product detail pageview.

// Custom Context
var pageviewContext = [{
    schema: "iglu:com.bestbuy/computer/jsonschema/1-0-0",
    data: {
        sku: '8532502',
        brand: 'Apple',
        name: 'MacBook® Pro',
        type: 'laptop',
        price_display: 1499.99,
        model: 'MJLQ2LL/A',
        cpu: 'Intel Core i7',
        memory_gb: 16,
        screen_size: 15.4,
        hard_drive_gb: 256,
        hard_drive_type: 'flash'
    }
}];

// Snowplow Pageview with Custom Context
window.snowplow('trackPageView', null , pageviewContext);
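
Because the context is just an extra argument, the same array can accompany other event types. The following is a minimal sketch, assuming the same positional-argument JavaScript tracker API as the call above. The `makeContext` helper, the `page` schema, and the event values are hypothetical and only for illustration.

```javascript
// Hypothetical helper: wraps a schema URI and a data object into the
// self-describing JSON shape used for custom contexts.
function makeContext(schema, data) {
  return { schema: schema, data: data };
}

// The product context from above, trimmed to a few fields for brevity.
var productContext = makeContext('iglu:com.bestbuy/computer/jsonschema/1-0-0', {
  sku: '8532502',
  brand: 'Apple',
  type: 'laptop'
});

// Hypothetical second context; the "page" schema and its field are assumed.
var pageContext = makeContext('iglu:com.bestbuy/page/jsonschema/1-0-0', {
  template: 'product-detail'
});

// Several contexts can travel with one event as a single array.
var contexts = [productContext, pageContext];

// Only runs in a browser where the Snowplow JavaScript tracker is loaded.
// Arguments: category, action, label, property, value, contexts.
if (typeof window !== 'undefined' && typeof window.snowplow === 'function') {
  window.snowplow('trackStructEvent', 'product', 'add-to-cart', null, null, null, contexts);
}
```

Each entry in the array ends up in the table of its own schema, so the product and page details can be analyzed separately.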

If your event doesn't fit the usual category, action, label model, you can define your own as a self-describing event (also known as an unstructured event).

// Returning an item event
// Note: this object is the event itself, not a context attached to another event.
var returnEvent = {
    schema: "iglu:com.bestbuy/return/jsonschema/1-0-0",
    data: {
        transaction_id: 'T6318372',
        transaction_value: 164.83,
        item_condition: 'Excellent'
    }
};

// Snowplow Self Describing Event
window.snowplow('trackSelfDescribingEvent', returnEvent);
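
A self-describing event can carry custom contexts too. The sketch below assumes the tracker accepts the context array as the argument after the event JSON. The `trackReturn` wrapper and the recording stand-in tracker are hypothetical names introduced so the example also runs outside a browser, and the product data is trimmed for brevity.

```javascript
// Sends a "return" self-describing event together with a product context.
// `tracker` is whatever function exposes the Snowplow API
// (window.snowplow in the browser).
function trackReturn(tracker, returnData, productData) {
  var returnEvent = {
    schema: 'iglu:com.bestbuy/return/jsonschema/1-0-0',
    data: returnData
  };
  var contexts = [{
    schema: 'iglu:com.bestbuy/computer/jsonschema/1-0-0',
    data: productData
  }];
  tracker('trackSelfDescribingEvent', returnEvent, contexts);
}

// Stand-in tracker that only records the calls it receives.
var recordedCalls = [];
function recordingTracker() {
  recordedCalls.push(Array.prototype.slice.call(arguments));
}

trackReturn(
  recordingTracker,
  { transaction_id: 'T6318372', transaction_value: 164.83, item_condition: 'Excellent' },
  { sku: '8532502', brand: 'Apple', type: 'laptop' }
);
```

In a page with the tracker loaded, passing `window.snowplow` instead of `recordingTracker` would send the event for real.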

How to create a Snowplow Analytics Custom Context

Before you start sending custom contexts with events, you have to create a JSON schema, validate it, upload it to Iglu (schema repository in Snowplow), upload the jsonpaths to S3, and create the corresponding SQL tables.

The jsonpaths and SQL table definitions are generated automatically with igluctl, a tool to help you manage the schema registry.

INFO

Each custom context gets its own table in Redshift, which can be easily joined with the main atomic.events table by matching the context table's root_id and root_tstamp columns to event_id and collector_tstamp.

Let's get ready to create our custom context.

Download and install Igluctl

Download the sample schema-registry template

A schema registry is a repository for schemas composed of three folders: jsonpaths, schemas, and sql. Don't edit any of the files inside the jsonpaths and sql folders; they will be generated automatically by igluctl from your defined schemas.

Schema Registry Folder Structure Explanation

Note that the schema file name, 1-0-0, follows Snowplow's SchemaVer, which is defined as MODEL-REVISION-ADDITION:

MODEL - when you make a breaking schema change which will prevent interaction with any historical data
REVISION - when you introduce a schema change which may prevent interaction with some historical data
ADDITION - when you make a schema change that is compatible with all historical data
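
The three rules above can be sketched as a small helper that works out the next version for a given kind of change. This is an illustration only: `parseSchemaVer` and `bumpSchemaVer` are hypothetical names, not part of igluctl or any Snowplow library, and the pattern assumes MODEL starts at 1, as in the 1-0-0 examples here.

```javascript
// Parse a SchemaVer string such as "1-0-0" into its three numeric parts.
// Assumption: MODEL starts at 1; REVISION and ADDITION start at 0.
function parseSchemaVer(version) {
  var match = /^([1-9][0-9]*)-(0|[1-9][0-9]*)-(0|[1-9][0-9]*)$/.exec(version);
  if (!match) {
    throw new Error('Not a valid SchemaVer: ' + version);
  }
  return {
    model: Number(match[1]),
    revision: Number(match[2]),
    addition: Number(match[3])
  };
}

// Return the next version for a change of the given kind.
// Bumping a part resets every part to its right to 0.
function bumpSchemaVer(version, kind) {
  var v = parseSchemaVer(version);
  switch (kind) {
    case 'MODEL':
      return (v.model + 1) + '-0-0';
    case 'REVISION':
      return v.model + '-' + (v.revision + 1) + '-0';
    case 'ADDITION':
      return v.model + '-' + v.revision + '-' + (v.addition + 1);
    default:
      throw new Error('Unknown change kind: ' + kind);
  }
}

console.log(bumpSchemaVer('1-0-0', 'ADDITION')); // prints "1-0-1"
console.log(bumpSchemaVer('1-0-3', 'REVISION')); // prints "1-1-0"
console.log(bumpSchemaVer('1-2-3', 'MODEL'));    // prints "2-0-0"
```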

WARNING

If you want to change a schema that is in production, DO NOT edit or rename the existing JSON schema file. Instead, create a copy named with the next version according to SchemaVer, and leave 1-0-0 in place. Events sent against the old schema will still be processed.

Open 1-0-0, the example_event schema file.

{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for an example event",
  "self": {
    "vendor": "com.example_company",
    "name": "example_event",
    "format": "jsonschema",
    "version": "1-0-0"
  },

  "type": "object",
  "properties": {
    "exampleStringField": {
      "description": "Example string field",
      "type": ["string","null"],
      "maxLength": 255
    },
    "exampleIntegerField": {
      "description": "Example integer field",
      "type": ["integer","null"],
      "minimum": 0,
      "maximum": 100000
    },
    "exampleNumericField": {
      "description": "Example number field",
      "type": ["number","null"],
      "multipleOf": 0.0001,
      "minimum": -1000000,
      "maximum":  1000000
    },
    "exampleTimestampField": {
      "description": "Example timestamp field",
      "type": ["string","null"],
      "format": "date-time"
    },
    "exampleArray": {
      "description": "Example array",
      "type": ["array","null"],
      "items": {
        "type": ["string","null"],
        "description": "Each item inside the array",
        "maxLength": 50
      }
    }
  },
  "minProperties":1,
  "required": ["exampleStringField", "exampleIntegerField"],
  "additionalProperties": false
}

Edit the sample schema file with fields that help describe your event.
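
The four values in the schema's "self" block are the same four segments that appear in the `iglu:` URI used by the tracker and in the file path under the schemas folder. A small sketch of that correspondence follows; `igluUri` and `registryPath` are hypothetical helper names used only for illustration.

```javascript
// The "self" block from the sample schema above.
var self = {
  vendor: 'com.example_company',
  name: 'example_event',
  format: 'jsonschema',
  version: '1-0-0'
};

// URI to put in the `schema` field of the JSON sent by the tracker.
function igluUri(selfBlock) {
  return 'iglu:' + [selfBlock.vendor, selfBlock.name, selfBlock.format, selfBlock.version].join('/');
}

// Relative path of the schema file inside the schema registry.
function registryPath(selfBlock) {
  return ['schemas', selfBlock.vendor, selfBlock.name, selfBlock.format, selfBlock.version].join('/');
}

console.log(igluUri(self));      // prints "iglu:com.example_company/example_event/jsonschema/1-0-0"
console.log(registryPath(self)); // prints "schemas/com.example_company/example_event/jsonschema/1-0-0"
```

If you change the vendor or name in the "self" block, the folder names and the URI used in tracking code need to change with it.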

Validating Your JSON Schema

If an event has an invalid custom context, Snowplow will not know how to process the data, and the event will be sent to the bad rows folder.

As a side note, Snowplow stores both unprocessed and processed data. Processed data goes into one of two folders, good or bad, so you can reprocess data if needed.
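
Some invalid contexts can be caught in tracking code before they are ever sent. The sketch below is a deliberately small check, not a JSON Schema validator: it only looks at `required`, `additionalProperties: false`, and top-level `type` on a flat schema, and ignores constraints such as `maxLength` or `format`. `checkContext` and `jsonType` are hypothetical names, and the schema is a trimmed copy of the sample above.

```javascript
// Map a JavaScript value to a JSON Schema type name.
// Integers are reported as "number" and handled separately below.
function jsonType(value) {
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'array';
  return typeof value; // "string", "number", "boolean", "object"
}

// Return a list of problems found in `data`; an empty list means none found.
function checkContext(schema, data) {
  var errors = [];
  var properties = schema.properties || {};

  (schema.required || []).forEach(function (key) {
    if (!(key in data)) errors.push('missing required field: ' + key);
  });

  Object.keys(data).forEach(function (key) {
    var spec = properties[key];
    if (!spec) {
      if (schema.additionalProperties === false) {
        errors.push('unexpected field: ' + key);
      }
      return;
    }
    var allowed = [].concat(spec.type || []); // accepts a string or an array
    var actual = jsonType(data[key]);
    var ok = allowed.length === 0 ||
      allowed.indexOf(actual) !== -1 ||
      (actual === 'number' && Number.isInteger(data[key]) &&
        allowed.indexOf('integer') !== -1);
    if (!ok) errors.push('wrong type for ' + key + ': ' + actual);
  });

  return errors;
}

// Trimmed copy of the sample schema.
var sampleSchema = {
  properties: {
    exampleStringField: { type: ['string', 'null'] },
    exampleIntegerField: { type: ['integer', 'null'] }
  },
  required: ['exampleStringField', 'exampleIntegerField'],
  additionalProperties: false
};

var problems = checkContext(sampleSchema, { exampleStringField: 'a', extra: 1 });
console.log(problems);
```

A check like this only reduces the number of bad rows. The pipeline's own validation against the schema in Iglu still decides whether an event is accepted.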

You can validate your schemas using igluctl.

Windows

java -jar /path/to/igluctl lint schemas/com.example_company/example_event/jsonschema/1-0-0

Mac/Linux

/path/to/igluctl lint schemas/com.example_company/example_event/jsonschema/1-0-0

If the schema validated successfully, you will see "TOTAL: 1 Schemas were successfully validated."

Congratulations, you've created your first custom context!

Custom contexts are one of the most powerful features in Snowplow Analytics. They give you the freedom to add context to your events on any platform, and if the built-in events don't fit your business model, you can always create your own.
