InstacartFlation

A Python script that scrapes your Instacart order history and saves the data in a JSON file.

The data scraped includes:

  • Order date
  • Number of unique items
  • Order total
  • Whether the order was cancelled
  • The delivery photo URL (if any)
  • The list of items, where the data for each item includes:
    • Item name
    • Item unit price
    • Item unit description (usually weight if applicable)
    • Item unit quantity
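
For illustration only, a single order in the output JSON might look roughly like the example below. The key names and value formats shown are assumptions based on the field list above, not a guaranteed schema:

    {
      "date": "2023-05-14 18:32",
      "cancelled": false,
      "unique_items": 2,
      "total": "$23.48",
      "delivery_photo_url": "https://...",
      "items": [
        {
          "name": "Bananas",
          "unit_price": "$0.79",
          "unit_description": "per lb",
          "quantity": 3
        },
        {
          "name": "Whole Milk",
          "unit_price": "$4.99",
          "unit_description": "4 L",
          "quantity": 1
        }
      ]
    }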

Usage

  1. Ensure Python dependencies are installed: pip install -r requirements.txt
  2. Ensure you have Chromium or Google Chrome installed.
  3. Ensure you have ChromeDriver installed and that it is compatible with your installed version of Chromium/Chrome.
    • On Linux, you can run installChromeDriver.sh to automatically install/update ChromeDriver in /usr/local/bin.
  4. Optionally, create a .env file with your Instacart credentials defined as INSTACART_EMAIL and INSTACART_PASSWORD (or ensure those environment variables are present in some other way).
    • If you skip this, you will need to log in manually when the script starts.
    • Note that even with these variables defined, you may still need to manually solve the occasional CAPTCHA.
  5. Run the script: python main.py (a complete example invocation follows this list).
    • You can use the --after argument to only include orders after a certain date/time (format is %Y-%m-%d %H:%M).
    • The output is printed to the terminal; if you would also like to save it to a file, use the --file argument with a valid file path.
      • If the specified file already exists, it is assumed to be a JSON file previously generated by this script for the same Instacart account. In that case, only orders newer than the last order in the existing file are scraped, and the output is a merged file containing all orders. The --after argument cannot be used in this case.
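
Putting the steps above together, a typical first run might look like the following. The credential values and the orders.json path are placeholders, and only the flags described above are used:

    # Step 1: install Python dependencies
    pip install -r requirements.txt

    # Step 4 (optional): provide credentials so the script can log in for you
    printf 'INSTACART_EMAIL=you@example.com\nINSTACART_PASSWORD=your-password\n' > .env

    # Step 5: scrape orders placed after a given date/time and save them to a file
    python main.py --after "2023-01-01 00:00" --file orders.json

On later runs against the same orders.json, omit --after; only orders newer than the last one already in the file are scraped and merged in.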

More Automation

  • You can use the Node.js script downloadImages.js with a JSON file generated by main.py to download all delivery photos and product thumbnails linked in that file.
  • You can use backup.sh to run main.py and downloadImages.js in a way that is ideal for periodic backups to a dedicated directory (including auto-copying delivery photos to another directory and running another script on them).
  • You can run analyze.py with the path to a JSON file generated by main.py to generate a CSV that lists all unique items with their average units purchased per month, average units purchased per order, and price fluctuation history.
    • You can use the --select argument to print that information to the terminal for a single item only.
    • analyze.py also accepts the --after argument described above (example invocations follow this list).
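
As a rough sketch of how these pieces might be chained together manually, assuming each script takes the JSON path as its first argument (check each script's source for the exact interface):

    # Download delivery photos and product thumbnails referenced in the JSON
    node downloadImages.js orders.json

    # Generate a CSV summary of all unique items
    python analyze.py orders.json

    # Or print the stats for a single item, limited to orders after a given date/time
    python analyze.py orders.json --select "Bananas" --after "2023-01-01 00:00"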