[R&D] Should Cairn be deprecated in favor of openedx/tutor-contrib-aspects #50

Open
DawoudSheraz opened this issue Jan 6, 2025 · 3 comments

@DawoudSheraz
Contributor

Cairn ships with the main Tutor plugin index and offers an out-of-the-box, real-time analytics solution built on:

  • Vector to collect tracking logs on the server
  • Clickhouse to store tracking events and provide live and materialized views for interacting with the data
  • Superset to view the data

While Cairn is maintained and a new version is released with every Open edX release, for the past two releases (Redwood and Sumac), several Open edX features, such as reporting on the Instructor dashboard, have required https://github.com/openedx/tutor-contrib-aspects. Aspects is also an analytics system, inspired by Cairn, but it includes much more. That raises a question: should Edly keep maintaining Cairn, given that Aspects is needed for certain features? The aim of this spike is to:

  • Compare Aspects with Cairn
  • List where the two systems are similar
  • List the features of Aspects that are not available in Cairn, and vice versa
  • If there are features/views in Cairn that would be useful in Aspects, estimate how much effort is needed to port them

Apart from the above, any additional information comparing Aspects and Cairn will be useful. With all this information and research, determine whether we need to keep maintaining Cairn or whether it can be deprecated in favor of Aspects.

@Danyal-Faheem
Collaborator

Danyal-Faheem commented Jan 13, 2025

Here are my findings on this:

It is possible I have missed some things, so please feel free to add to this.

Technologies

Cairn and Aspects use the following technologies; the first three are shared:

| Cairn | Aspects |
| --- | --- |
| Vector (log collection) | Vector (log collection) [Optional] |
| Superset (Dashboard) | Superset (Dashboard) |
| Clickhouse (OLAP Database) | Clickhouse (OLAP Database) |
| Raw SQL Queries | DBT for SQL pipeline queries |
| | Ralph as a Learner Record Store to process and verify xAPI statements |
| | Event-routing-backends to transform logs into xAPI statements |

Methodology

In terms of methodology, a relatively simplified explanation is given in the following table. It is also possible to configure Aspects to work the same way as Cairn, with Vector acting as the log collector and forwarding tracking logs to Aspects.

| Cairn | Aspects |
| --- | --- |
| 1. Vector reads tracking logs from containers | 1. Tracking logs are transformed by event-routing-backends and sent to Ralph |
| 2. Sends tracking logs to Clickhouse | 2. Ralph uses Clickhouse as its database so all verified xAPI statements are stored there |
| 3. Clickhouse will then process data based on queries and materialized views | AND/OR |
| 4. Dashboards created on top of datasets in Superset | 1. Vector reads tracking logs from containers |
| | 2. Sends tracking logs to Clickhouse |
| | 3. Clickhouse will then process data based on queries and materialized views |
| | 4. Dashboards created on top of datasets in Superset |
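
To make the first Aspects step more concrete, here is a minimal Python sketch of what "transforming a tracking log into an xAPI statement" means. It is not the event-routing-backends implementation: the function name `to_xapi_like`, the `home_page` URL, and the two-entry verb mapping are assumptions made up for illustration, and a real statement carries more fields than shown here.

```python
import json

# Illustrative mapping only; NOT the mapping event-routing-backends actually uses.
VERB_BY_EVENT_TYPE = {
    "edx.course.enrollment.activated": "http://adlnet.gov/expapi/verbs/registered",
    "problem_check": "http://adlnet.gov/expapi/verbs/answered",
}


def to_xapi_like(raw_line, home_page="https://lms.example.com"):
    """Turn one JSON tracking-log line into a simplified xAPI-style statement.

    Returns None for event types that have no entry in the mapping.
    `home_page` is a hypothetical LMS URL used to build identifiers.
    """
    event = json.loads(raw_line)
    verb = VERB_BY_EVENT_TYPE.get(event["event_type"])
    if verb is None:
        return None
    course_id = event["context"]["course_id"]
    return {
        "actor": {"account": {"homePage": home_page, "name": event["username"]}},
        "verb": {"id": verb},
        "object": {"id": f"{home_page}/courses/{course_id}"},
        "timestamp": event["time"],
    }


# A made-up tracking-log line, shaped like the JSON lines Vector would read.
line = json.dumps({
    "event_type": "problem_check",
    "username": "alice",
    "time": "2025-01-13T10:00:00+00:00",
    "context": {"course_id": "course-v1:Org+Course+Run"},
})
statement = to_xapi_like(line)
print(json.dumps(statement, indent=2))
```

In the Cairn flow there is no such transformation step: the raw JSON line is what ends up in Clickhouse, and the SQL views do the interpretation.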

Integrated Dashboards

Out of the box, Cairn and Aspects ship the following dashboards. Aspects has many more reports and charts available for course teams. Moreover, the Aspects dashboards are being continuously developed by Axim as well as the community, while Cairn's ready-made dashboards have seen no such development.

| Cairn | Aspects |
| --- | --- |
| Course Overview | Course Dashboard |
| | Operator Dashboard |
| | Individual Learner |
| | Course Comparison |
| | At-risk learner |

Similarities

Here are the common features between the two of them.

Common Features
- Role based access control
- Customizable Dashboards
- Multiple Language support for the platform
- Automatic course block updates synced with Clickhouse
- Support to backfill old data using tracking logs
- Support to backfill old course structures

Differences

And finally, here are the different features each of them provides.

| Cairn | Aspects |
| --- | --- |
| Shorter learning curve as the tool is generally less complex | Multiple out-of-the-box dashboards, with more on the way |
| Comparatively less resource intensive than Aspects | Dashboards in multiple languages |
| Can access data directly from MySQL, allowing joins with Clickhouse tables | Easier to customize due to the number of charts and datasets present out of the box |
| | Support for dummy data |
| | Support to enable or disable personally identifiable information |
| | External Clickhouse cluster support |
| | Much more support for persistent customization using Tutor plugins instead of through Superset. Has patches and Tutor settings available for almost everything |
| | Alerts and reports available out of the box |
TLDR

Aspects primarily uses Ralph and xAPI, whereas Cairn uses Vector and tracking logs. Aspects has many more integrated dashboards than Cairn, which has just one. Most, if not all, of the features provided by Cairn are present in Aspects. Aspects also has other key features, such as dashboards in other languages.

Deprecation

With Aspects providing most, if not all, of what Cairn provides, I do not see a reason to keep Cairn around. I think we should use Aspects going forward as the analytics tool for Open edX.

@regisb
Collaborator

regisb commented Jan 14, 2025

This is a great review, thanks @Danyal-Faheem! Can you also comment on the possibility of accessing MySQL data straight from Superset? As far as I know, it's not possible to JOIN data from MySQL and Clickhouse in Aspects, which only allows reading from specific Clickhouse tables.

@DawoudSheraz DawoudSheraz moved this from Backlog to In Progress in Tutor project management Jan 14, 2025
@Danyal-Faheem
Collaborator

Hi @regisb, yes you are correct. I've updated my original comment to reflect this as well.
