GitHub - osean-man/recipe-scrapers: Python package for scraping recipes data

A simple web scraping tool for recipe sites.

pip install recipe-scrapers

then:

from recipe_scrapers import scrape_me

# give the url as a string, it can be url from any site listed below
scraper = scrape_me('https://www.allrecipes.com/recipe/158968/spinach-and-feta-turkey-burgers/')

# Q: What if the recipe site I want to extract information from is not listed below?
# A: You can give it a try with the wild_mode option! If there is Schema/Recipe available it will work just fine.
scraper = scrape_me('https://www.feastingathome.com/tomato-risotto/', wild_mode=True)

scraper.title()
scraper.total_time()
scraper.yields()
scraper.ingredients()
scraper.instructions()
scraper.image()
scraper.host()
scraper.links()

Note: scraper.links() returns a list of dictionaries containing all of the <a> tag attributes. The attribute names are the dictionary keys.

Scrapers available for:

Contribute

Part of the reason I want this open sourced is because if a site makes a design change, the scraper for it should be modified.

If you spot a design change (or something else) that makes the scraper unable to work for a given site - please fire an issue asap.

If you are programmer PRs with fixes are warmly welcomed and acknowledged with a virtual beer.

If you want a scraper for a new site added

Open an Issue providing us the site name, as well as a recipe link from it.
You are a developer and want to code the scraper on your own:
- If Schema is available on the site - you can do this
  - How do I know if a schema is available on my site?
- Otherwise, scrape the HTML - like this

For Devs / Contribute

Assuming you have python3 installed, navigate to the directory where you want this project to live in and drop these lines

git clone [email protected]:hhursev/recipe-scrapers.git &&
cd recipe-scrapers &&
python3 -m venv .venv &&
source .venv/bin/activate &&
pip install -r requirements-dev.txt &&
pre-commit install &&
python -m coverage run -m unittest &&
python -m coverage report

In case you want to run a single unittest for a newly developed scraper

python -m coverage run -m unittest tests.test_myscraper

FAQ

How do I know if a website has a Recipe Schema? Run in python shell:

from recipe_scrapers import scrape_me
scraper = scrape_me('<url of a recipe from the site>', wild_mode=True)
# if no error is raised - there's schema available:
scraper.title()
scraper.instructions()  # etc.

Special thanks to:

All the contributors that helped improving the package. You are awesome!

Name		Name	Last commit message	Last commit date
Latest commit History 459 Commits
.github/ISSUE_TEMPLATE		.github/ISSUE_TEMPLATE
recipe_scrapers		recipe_scrapers
tests		tests
.coveragerc		.coveragerc
.coveralls.yml		.coveralls.yml
.flake8		.flake8
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.travis.yml		.travis.yml
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.rst		README.rst
requirements-dev.txt		requirements-dev.txt
setup.py		setup.py

License

osean-man/recipe-scrapers

Folders and files

Latest commit

History

Repository files navigation

Scrapers available for:

Contribute

If you want a scraper for a new site added

For Devs / Contribute

FAQ

Special thanks to:

About

Resources

License

Stars

Watchers

Forks

Languages