The How and Why of Flask-Bitmapist

An Analytics Extension for Flask

Flask-Bitmapist is a Flask extension that creates a simple interface to the Bitmapist analytics library. It is easy to set up and start collecting data on how users are interacting with an application, then to work with that data and build out cohorts to learn more about user engagement and retention.

What We Wanted and What We Built

For the most part, current analytics libraries were not meeting our needs. Available options tended to be prohibitively expensive, particularly for use with small projects. Those that were affordable took too much time and effort to implement to be appealing for repeated use across projects.

We found and liked the Bitmapist library because it addresses the first issue by being open source. We were also excited about its bitmap-based approach, which allows bitwise operations to be performed on the data; this means that the library’s operations can be extremely fast and lightweight. You can read more about the what, the why, and the how of Bitmapist in the author’s Medium post.
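
To make the bitmap idea concrete, here is a minimal sketch that uses plain Python integers as bitmaps, with one bit per user id. It is not Bitmapist code and does not touch Redis; the helper names (set_bit, marked_user_ids) and the sample ids are made up for illustration.

```python
def set_bit(bitmap, user_id):
    """Return the bitmap with the bit at position user_id switched on."""
    return bitmap | (1 << user_id)


def marked_user_ids(bitmap):
    """List the positions of all set bits, i.e. the ids that were marked."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# One bitmap per event; the user ids here are invented sample data.
logged_in = 0
for uid in (1, 3, 4, 7):
    logged_in = set_bit(logged_in, uid)

purchased = 0
for uid in (3, 7, 9):
    purchased = set_bit(purchased, uid)

# A single bitwise operation answers a question about every user at once.
both = logged_in & purchased    # users who did both things
either = logged_in | purchased  # users who did at least one

print(marked_user_ids(both))
print(marked_user_ids(either))
```

Because each user costs only one bit per event and time bucket, combining events comes down to a handful of cheap bitwise operations, which is where the speed comes from.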

Still, we wanted something that we could essentially drop into a project each time without having to do much else. Most of our projects to date have been built using the Flask microframework for Python, so we decided to write a custom Flask extension.

Thus was Flask-Bitmapist born: a Flask extension that, once installed, requires only three lines of code (one line to import, two lines to initialize) to be ready to register events (i.e., to record a given user action and when it occurred).

How to Get Set Up

One of our major goals for Flask-Bitmapist was for it to be easy to set up.

To add it to a project, just pip install flask-bitmapist as you would any other package. Then, import and initialize Flask-Bitmapist with your Flask app, and you are ready to start registering events.

Note: Since Bitmapist uses Redis to record the registered user events, Redis must also be running for Flask-Bitmapist to work.

How to Register Events

Once you have set up Flask-Bitmapist, you get not one, not two, but four ways to register events.

If you are using Flask-Login, user login/logout events will be registered automatically. Add a mixin to your user model to register changes made to database objects. Use a decorator to register an event for a given view or function call, or call a function directly wherever else you might want to register an event.

Examples of when you might want to use each method:

  • Flask-Login: You are already using Flask-Login for user session management, and you want to record when users are logging in and out of your application.
  • ORM mixin: You are using SQLAlchemy, and you want to record when changes are being made to (user) objects in the database. (See the “Moving Forward” section for details about additional ORM support.)
  • Decorator: You want to register an event for whenever a user accesses a particular view (e.g., to render a template, submit a form, access an API, etc.).
  • Function: You want to register events at multiple points throughout a single process, or you want to register different events based on branching or conditional results within some process.

Flask-Login

Flask-Login is a popular library for user session management in Flask applications, so it was important for us to have built-in user login/logout event registration in Flask-Bitmapist.

Setting up Flask-Bitmapist to work with Flask-Login requires nothing beyond setting up both as you normally would: initialize Flask-Bitmapist and Flask-Login with the Flask app, create your User class (e.g., with the UserMixin), and Flask-Bitmapist will automatically listen for the user_logged_in and user_logged_out signals from Flask-Login to register the corresponding events.

Note: The registered event names will be ‘user:logged_in’ and ‘user:logged_out’ for users logging in and out, respectively.

ORM Mixin

As with session management, we wanted event registration for changes to objects in the database to be straightforward.

Import the Bitmapistable mixin and add it to your User class definition. Flask-Bitmapist will then automatically register the appropriate event whenever a user is created, updated, or deleted.

Note: We started with the ORM we most commonly use (SQLAlchemy), but we are looking to add support for others (Peewee, MongoEngine, etc.) in the future.

Note: The mixin is currently intended for the application’s User (or equivalent) class only, pending implementation of a flexible (i.e., not dependent on using Flask-Login) means of retrieving the current session’s user id.

Decorator & Function

Import the mark decorator function from flask_bitmapist and attach it to the desired function, providing the event name and the id of the current user (e.g., Flask-Login’s current_user).

Import the mark_event function from flask_bitmapist and call it with the event name and user id.

Note: The ‘user:action’ event name structure is merely a convention. You could, potentially, name events any number of ways, including breaking them down by domain (e.g., ‘support:bug_reported’) if you so wished.

Note: In most cases, you will probably want the decorator and function to use the current time for registering the event. You can, however, specify a datetime for the event by passing it as the now argument.
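
The difference between the decorator and the function call can be sketched with in-memory stand-ins. Nothing below imports Flask-Bitmapist: mark_sketch and mark_event_sketch are hypothetical names that only mimic the shape described above (an event name, a user id, and an optional now), and the EVENTS list stands in for Redis. Recording the event after the wrapped function returns is a choice made for this sketch, not a statement about the library.

```python
from datetime import datetime, timezone
from functools import wraps

EVENTS = []  # in-memory stand-in for the Redis-backed event store


def mark_event_sketch(event_name, user_id, now=None):
    """Record an event directly; use the current UTC time unless now is given."""
    when = now if now is not None else datetime.now(timezone.utc)
    EVENTS.append((event_name, user_id, when))


def mark_sketch(event_name, user_id, now=None):
    """Decorator form: record the event each time the wrapped function runs."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            mark_event_sketch(event_name, user_id, now=now)
            return result
        return wrapper
    return decorator


@mark_sketch('user:viewed_dashboard', user_id=42)
def dashboard():
    return 'dashboard'


dashboard()  # recorded with the current time

# Direct call with an explicit datetime, using a domain-style event name.
mark_event_sketch('support:bug_reported', 42, now=datetime(2014, 10, 21))
```

The decorator suits the case where one view maps to one event; the direct call suits branching code where the event name depends on what happened.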

How to Use the Data

Flask-Bitmapist provides multiple ways to retrieve and process the data as well.

You can get all of the users registered with an event for a given time/time scale using the get_event_data function. To get a user cohort based on multiple events over a given time frame/time scale, the get_cohort function will serve. Additionally, Flask-Bitmapist by default registers a blueprint with a sample interface for generating a heatmap, a table constructed to visually present the cohort data based on the selected inputs.

Single Event at a Single Time: get_event_data()

The get_event_data function is what you will want to use to get the registered events for a single event (e.g., ‘user:spilled_soda’) at a single point in time. The optional time_group argument determines the span of time to include (i.e., get the events spanning one day, a week, a month, or a year) and defaults to a scale of ‘days’; the optional now argument determines when to pull the events from (e.g., right now, last month, or October 21, 2014).

The function returns a Bitmapist events collection object, which allows you to use Bitmapist’s built-in operations (e.g., BitOpAnd, BitOpOr, etc.) to further combine with other Bitmapist event collections. Or, if you prefer, you can cast the collection to a list to get just the list of user ids for those users registered with the event within the given time frame.

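For example, if you combine the collection for a popcorn-spilling event with the collection for a cheesepuff-spilling event using BitOpOr and cast the result to a list, the resulting user_ids will be a list of the ids for the users who either spilled popcorn OR spilled cheesepuffs (today, with default time_group of ‘days’ and default now of datetime.utcnow()).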
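
How time_group and now interact can be sketched without Redis by keeping a set of user ids per event and time bucket. Everything here is an assumption for illustration: the register and event_users helpers, the bucket key formats, the group names other than ‘days’, and the two event names are invented, and Python sets stand in for Bitmapist’s bitmaps.

```python
from collections import defaultdict
from datetime import datetime, timezone

TIME_GROUPS = ('days', 'weeks', 'months', 'years')
STORE = defaultdict(set)  # (event, time_group, bucket key) -> set of user ids


def bucket(time_group, when):
    """Build the key of the time bucket that a datetime falls into."""
    if time_group == 'days':
        return when.strftime('%Y-%m-%d')
    if time_group == 'weeks':
        year, week, _ = when.isocalendar()
        return f'{year}-W{week:02d}'
    if time_group == 'months':
        return when.strftime('%Y-%m')
    if time_group == 'years':
        return when.strftime('%Y')
    raise ValueError(f'unknown time group: {time_group}')


def register(event, user_id, when):
    """Mark the user in every time bucket that contains the event's datetime."""
    for group in TIME_GROUPS:
        STORE[(event, group, bucket(group, when))].add(user_id)


def event_users(event, time_group='days', now=None):
    """Return the users registered with the event in the bucket containing now."""
    when = now if now is not None else datetime.now(timezone.utc)
    return set(STORE.get((event, time_group, bucket(time_group, when)), set()))


register('user:spilled_popcorn', 1, datetime(2014, 10, 21, 9, 0))
register('user:spilled_popcorn', 2, datetime(2014, 10, 3, 12, 0))
register('user:spilled_cheesepuffs', 3, datetime(2014, 10, 21, 18, 30))

day = datetime(2014, 10, 21)
popcorn = event_users('user:spilled_popcorn', 'days', day)
cheesepuffs = event_users('user:spilled_cheesepuffs', 'days', day)

# Set union plays the role that BitOpOr plays for event collections.
user_ids = sorted(popcorn | cheesepuffs)
popcorn_this_month = sorted(event_users('user:spilled_popcorn', 'months', day))
```

Widening the time group from days to months picks up the user who spilled popcorn earlier in October, while the day-level query sees only the spills on October 21.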

Multiple Events Over Time: get_cohort()

The get_cohort function takes, at minimum, two event names that form the foundation of a cohort. Optional arguments for additional events (to further refine the cohort’s users), time group, and the size of the returned results are available as well. The function returns a tuple containing the cohort with its dates and totals; the returned cohort is a list of lists, with the items in each nested list containing the count of users who were registered with both the primary event and the secondary event (and any additional events provided) at that time. This perhaps makes more sense with an example.

Say you wanted to look at users who had ordered products from infomercials in the last six months and then, over the next four months, made the first of two easy payments AND either made their second easy payment OR returned the product. The first two event names would be passed as required positional arguments to get_cohort (‘user:ordered_infomercial_product’ and ‘user:made_first_easy_payment’ in this example).

The latter two events would be passed as the optional additional_events argument, with their corresponding operations (AND for the second payment, OR for the return). The number of rows corresponds to how far back to get results (num_rows = 6 for looking at the last six months), and the number of columns corresponds to how far forward from each date to get results (num_cols = 4 for looking at four months from whenever the user ordered the product).

The get_cohort function returns the cohort, the cohort dates, and the cohort totals.

  • The cohort itself is a list of lists, structured like a matrix or a table, with each element containing the number of users who, from the initial event (e.g., ordered product in May 2016), were registered with the subsequent events (i.e., made first payment AND made second payment OR returned product) for the given date offset (e.g., + 2 months). See the heatmap example below for a visual representation; that table is laid out based on the cohort data’s structure.
  • The cohort dates will be the list of dates calculated for defining the cohort. If today were September 12, 2016, the cohort dates returned would be (the first day of) April 2016, May 2016, June 2016, July 2016, August 2016, and September 2016.
  • The cohort totals will be the count of users for the primary event at each of the above cohort dates. For example, if the first cohort date were April 2016, and 416 users ordered a product in April 2016, the first cohort total will be 416.
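
The cohort dates in the example above can be reproduced with a few lines of date arithmetic. This is a sketch of the calculation for a monthly time group only, under the assumption that the current month counts as the last row; cohort_month_starts is a hypothetical helper, not part of Flask-Bitmapist, and the placeholder matrix only shows the shape of the cohort, not real counts.

```python
from datetime import date


def cohort_month_starts(today, num_rows):
    """First day of each of the last num_rows months, oldest first.

    The current month is included as the final entry.
    """
    month_index = today.year * 12 + (today.month - 1)  # months since year 0
    starts = []
    for offset in range(num_rows - 1, -1, -1):
        year, month0 = divmod(month_index - offset, 12)
        starts.append(date(year, month0 + 1, 1))
    return starts


num_rows, num_cols = 6, 4
cohort_dates = cohort_month_starts(date(2016, 9, 12), num_rows)

# Placeholder with the same shape as the cohort: one row per cohort date,
# one column per month offset. Real cells would hold user counts.
cohort_shape = [[None] * num_cols for _ in cohort_dates]

for cohort_date in cohort_dates:
    print(cohort_date.isoformat())
```

Working in "months since year 0" avoids special-casing the wrap from January back to December when the window crosses a year boundary.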

Note: Currently, cohort generation prioritizes OR in the order of operations, and allows OR to operate only on its immediate predecessor. For example, first AND second AND third OR fourth will be handled as first AND second AND (third OR fourth), since progressing through the chain of events otherwise would give ORs too much weight, e.g., (first AND second AND third) OR fourth.
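
The effect of that precedence rule is easiest to see on small sets. The sketch below is not the library’s implementation: combine is a hypothetical helper, the (operation, users) pair format is invented for the example, and the user ids are made up. It only illustrates OR attaching to its immediate predecessor before the ANDs are applied.

```python
def combine(primary, chain):
    """Combine user sets so that each OR merges into its immediate predecessor.

    chain is a list of (operation, users) pairs, where operation is
    'and' or 'or' and users is a set of user ids.
    """
    groups = [set(primary)]
    for operation, users in chain:
        if operation == 'or':
            groups[-1] = groups[-1] | set(users)  # widen the previous group
        elif operation == 'and':
            groups.append(set(users))             # start a new AND-ed group
        else:
            raise ValueError(f'unknown operation: {operation}')

    # AND all of the groups together.
    result = groups[0]
    for group in groups[1:]:
        result = result & group
    return result


first = {1, 2, 3, 4}
second = {2, 3, 4}
third = {3}
fourth = {4, 9}

# first AND second AND (third OR fourth)
grouped = combine(first, [('and', second), ('and', third), ('or', fourth)])

# For contrast: (first AND second AND third) OR fourth
left_to_right = (first & second & third) | fourth

print(sorted(grouped))
print(sorted(left_to_right))
```

In the left-to-right reading, user 9 lands in the result despite never triggering the first or second event, which is the "too much weight" problem the note describes.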

Visualizing a Cohort: Heatmap

By default, Flask-Bitmapist will register a ‘bitmapist’ blueprint – this can be disabled by setting the BITMAPIST_DISABLE_BLUEPRINT configuration option to True. With the blueprint enabled, visiting ‘/bitmapist/cohort’ will provide you with a starter interface to retrieve a user cohort based on given events (pulled from your existing Bitmapist event names in Redis) and selected settings. The generated cohort is used to build a heatmap to display the data and help you visualize the results.

Flask-Bitmapist cohort heatmap example

Moving Forward

But wait – there’s more! We’ve got big goals for Flask-Bitmapist moving forward. In the near future, for example, we are planning to add:

  • A more robust and flexible object-oriented back-end structure
  • Broader ORM support, to include multiple ORM options (Peewee, MongoEngine, etc.)
  • Better ORM support, for flexible user id configuration (i.e., so that using the mixin with non-user models will not be dependent on a specific user session manager)

If you’d like to read through the code, or if you’d like to contribute, check out the project on GitHub. We’d also like to see a similar Django integration for Bitmapist, so hit us up if you’re interested in helping out.
