Apartments web scraper for the website: https://www.forestclub.com.pl/
ForestClub is a python program, written for PERSONAL needs, to automate tracking the changes in the following residential investment website: https://www.forestclub.com.pl/wyszukaj/?flat-type=Mieszkanie&area=&room=&floor=#flats-list.
NOTE: the script aims to be used for several months from the date of creation (as the new information should appear on the website in this timeframe) and its main goal is to send the notification once the new apartments appear on the website for sale.
- scraps and parses a given site using selenium web-driver and Beautiful Soup
- saves all found apartments in apartments.csv (if file exists it is overwritten)
- saves statistics concerning number of total/free/sold apartments to stats.csv (if file exists new stats are appended)
- in case new statistics are different than previous ones a proper notification e-mail is generated and sent using Gmail API
- launching the script should be triggered automatically (e.g. daily or after each reboot or ...) via cron configuration or other workflow manager
- forestclub.py - main program (web scraper)
- my_gmail.py - module (imported by forestclub.py) implementing sending e-mails via Gmail API
- .env - file holding information about sender and recipient e-mail addresses according to the format presented in .env.example template
- credentials.json - file with Gmail API credentials generated from personal project on Google platform, according to the format presented in credential.json.example template
- apartments.csv - file with list of apartments found on the website
- stats.csv - file with statistics concerning apartments
- token.pickle - file with saved credentials to access Gmail API
- Linux shell:
- add execute permission: chmod +x forestclub.py
- run: ./forestclub.py (or python3 forestclub.py)
- schedule launching the script in cron according to the instructions in crontab.example
- python libraries/modules from requirements.txt
- create the project on Google Cloud Platform and enable Gmail API:
- then create credentials, download them and save as credentials.json