Triib Scrape

TL;DR

If you are a gym owner, and you're using Triib to manage membership, workouts, etc., you can get all your members' workout data off of the site with one simple command: yarn start. You'll be prompted for additional information about the task you want to perform and the member group you want to perform it for. Then you'll wait while the script scrapes Triib's admin pages. Repeat for each member group you're interested in and then move on to the next task. By the end you'll have all your members' information in a nice CSV file that you can open in Excel or something.

Feel free to skip to the Prerequisites section below if you want to dig in right away.

Back Story

This repo is the result of trying to figure out a way to extract workout results from Triib and transform them into a useful format for importing into another fitness-tracking app. Triib bills itself as "everything you need to excite your members, manage your gym and build your business." And they do offer a lot—from workout tracking to billing to customer management. But when the owner of my gym wanted to switch to another service, the Triib folks suddenly became very unhelpful.

I suspected things might get difficult because their site and mobile app are, in my opinion, kind of janky. They've never offered an API or a way for gym members to export all of their workout results. They only had one place where we could download a CSV file of the latest result from each workout.

Our gym owner is a data geek, and he loves to monitor his members' progress. He also wanted a way for us to be able to hold on to all of our info that we painstakingly entered into the app after each workout. Unfortunately, Triib wasn't much help. In fact, they seemed almost obstructionist, telling the owner that there was "no possible way to get the data." I mean, it must be in a database, right? How hard could it be?

I didn't have direct access to the database, of course, so I needed to get at the data a different way. Using the owner's credentials, I poked around in the admin area and discovered that there was, in fact, a way to get all of the data. The service our gym is switching to now said they could import our old data if we gave it to them in a CSV file with a specific set of "columns": athlete, date, workout title, "is rx," result, set rep scheme, and notes.

Data Extraction Steps

Because our gym has had a lot of members over the years, and those members have recorded a lot of workouts, getting all of the information was going to be a slow process, involving over 100,000 requests to Triib's server. As much as I didn't appreciate their lack of help, I didn't feel it would be right to bring the server to its knees. So, within each step, the requests are made one at a time. Here is what we do:

  1. Scrape the active members page for name, email, and link to workout history. When that is done, scrape the "other members" page for inactive members. Then do the same for "on-hold" and "archived" members. When that is done, we have a directory for each member status, and in those directories a JSON file for each member.
  2. For each member status, run a task that opens all the workout pages for each member. Add an array of workouts, along with the result details page for each one, to the member info and save it all in a new JSON file.
  3. Again, for each member status, run a task that opens the result details page for every workout, grabs all those details, and saves them in yet another JSON file for every member.
  4. Now that we've pulled down all the info from the site, we go through all of the "results/scores" files and clean them up, removing duplicates, adjusting the "score" column so it matches the expected format, etc.
  5. From the "csv-ready" files, create a CSV file for each member group.
  6. Combine the CSV files into a single CSV of all the data.
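The "one at a time" approach used within each step can be sketched roughly like this. It is a minimal illustration, not the code in this repo: `fetchSequentially`, `fetchPage`, and `delayMs` are hypothetical names, and the page-fetching function is passed in so the sketch doesn't assume any particular HTTP or scraping library.

```javascript
// Hypothetical helper: fetch each URL one at a time, optionally pausing between requests.
// `fetchPage` is injected by the caller, so no specific HTTP library is assumed.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchSequentially(urls, fetchPage, delayMs = 0) {
  const results = [];

  for (const url of urls) {
    // Awaiting inside the loop means only one request is ever in flight.
    results.push(await fetchPage(url));

    if (delayMs > 0) {
      // Optional breathing room for the server between requests.
      await sleep(delayMs);
    }
  }

  return results;
}
```

Awaiting each request inside a plain `for...of` loop is slower than firing everything off with `Promise.all`, but that slowness is the point: the server only ever handles one request from the script at a time.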

That's it for the narrative part. The rest are the technical details if you'd like to try this at home.

Prerequisites

  • Node.js
  • An admin account for a gym (aka "box") at triib.com
  • Patience

Installation and Setup

  1. Clone the repo

    git clone git@github.com:kswedberg/triib-scrape.git
  2. Install dependencies

    yarn install
  3. Copy .env.example to .env

    cp .env.example .env
  4. In .env, enter your Triib login credentials and the URL to your Triib instance

    USER_EMAIL='YOUR_TRIIB_EMAIL'
    USER_PWD='YOUR_TRIIB_PASSWORD'
    BASE_URL='https://YOUR_SUBDOMAIN.triib.com'
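If you want to sanity-check those settings before kicking off a long scrape, something like the following could work. This is a sketch under assumptions: `readConfig` is a hypothetical helper, not a function from this repo, and it assumes the values from .env have already been loaded into the environment (how the script actually loads them isn't covered here).

```javascript
// Hypothetical helper: validate the settings that .env is expected to provide.
const REQUIRED_KEYS = ['USER_EMAIL', 'USER_PWD', 'BASE_URL'];

function readConfig(env = process.env) {
  // Collect every missing key so the error names them all at once.
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);

  if (missing.length > 0) {
    throw new Error(`Missing settings in .env: ${missing.join(', ')}`);
  }

  return {
    email: env.USER_EMAIL,
    password: env.USER_PWD,
    // Drop trailing slashes so paths can be appended consistently.
    baseUrl: env.BASE_URL.replace(/\/+$/, ''),
  };
}
```

Failing fast on a missing value beats finding out after the first login attempt.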

Usage

There is only one command: yarn start. All files will be stored in the project's (git-ignored) /data/ directory, in sub-directories named by task. Here are the prompts you might see (depending on the task you choose at the first prompt):

$ yarn start
? task: (Use arrow keys)
  Fetch Members
  Fetch workouts for each member
  Fetch scores for each workout
  Prepare scores data for CSV
  Create CSV
  Combine CSV files into "all.csv"

? which member type are we dealing with? (Use arrow keys)
  active
  inactive
  archived
  on-hold

? where in the array of members do you want to start? (0)

? where in the array of members do you want to end?

The last two prompts are there because sometimes you need to stop a task (e.g. Fetch workouts for each member) before it completes and pick it up later. Or you might want to just test things by fetching scores for only the first 5 members, for example.
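To make the start/end idea concrete, here is a rough sketch of how those two answers could be turned into a subset of members. `selectMembers` is a hypothetical name, and the sketch assumes the end index is exclusive (the same convention as `Array.prototype.slice`) and that an empty answer means "through the last member"; the script's actual behavior may differ.

```javascript
// Hypothetical helper: pick the members to process from the start/end prompt answers.
// Assumption: `end` is exclusive, as with Array.prototype.slice.
function selectMembers(members, start = 0, end = '') {
  // Prompt answers typically arrive as strings, so parse them defensively.
  const from = Number.parseInt(start, 10) || 0;
  const parsedEnd = Number.parseInt(end, 10);

  // An empty or non-numeric end answer means "through the last member".
  const to = Number.isNaN(parsedEnd) ? members.length : parsedEnd;

  return members.slice(from, to);
}

// Example: only the first 5 members, then everyone from index 5 onward.
const members = ['a', 'b', 'c', 'd', 'e', 'f', 'g'];
const firstFive = selectMembers(members, '0', '5');
const theRest = selectMembers(members, '5', '');
```

Under those assumptions, resuming an interrupted task is just a matter of starting at the index where the previous run stopped.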

Questions, bugs, requests

Feel free to post an issue if you have any questions or concerns about this insane project.