This is an industry-level web scraper written in Python in a Jupyter notebook. Its purpose is to collect data on 86K products for an ML-based recommender system. It is built with BeautifulSoup, requests, pandas, and other libraries, along with Python's built-in data structures. To collect data at a high rate and use the machine's full capacity, it relies on multiprocessing and code optimizations. For readability, the code is organized into Python modules, classes, and objects. Pipelines streamline scraping, collecting, cleaning, and storing the data directly in an SQL database.
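The scrape, clean, and store flow described above could look roughly like the sketch below. It is not the repository's actual code: to stay runnable without network access or third-party installs, it swaps in the standard library's `html.parser` for BeautifulSoup, takes already-downloaded HTML strings instead of calling requests, and writes to SQLite through `sqlite3` instead of going through pandas. The `name` and `price` CSS classes, the `products` table, and every function name are hypothetical.

```python
import sqlite3
from html.parser import HTMLParser
from multiprocessing import Pool


class ProductParser(HTMLParser):
    """Collects the text of elements whose class is 'name' or 'price'
    (hypothetical markup, not taken from any real site)."""

    FIELDS = ("name", "price")

    def __init__(self):
        super().__init__()
        self.record = {}
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.FIELDS:
            self._field = css_class

    def handle_data(self, data):
        if self._field is not None:
            self.record[self._field] = data.strip()
            self._field = None

    def handle_endtag(self, tag):
        self._field = None


def parse_product(html):
    """Scrape and clean one page.

    Returns (name, price) or None when a field is missing or malformed.
    """
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    name = parser.record.get("name")
    price_text = parser.record.get("price")
    if not name or not price_text:
        return None
    try:
        price = float(price_text.replace("$", "").replace(",", ""))
    except ValueError:
        return None
    return name, price


def store_products(conn, rows):
    """Write cleaned rows to the database in a single batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products (name TEXT PRIMARY KEY, price REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO products VALUES (?, ?)", rows)
    conn.commit()


def run_pipeline(pages, conn, processes=None):
    """Parse pages in parallel, drop bad records, store the rest.

    processes=None uses one worker per CPU core; processes=0 runs
    serially in the current process, which is handy for debugging.
    """
    if processes == 0:
        parsed = list(map(parse_product, pages))
    else:
        with Pool(processes) as pool:
            parsed = pool.map(parse_product, pages, chunksize=64)
    rows = [row for row in parsed if row is not None]
    store_products(conn, rows)
    return len(rows)
```

Parsing is the CPU-bound step, so it is the part handed to the worker pool, while the database write stays in the parent process so that only one connection touches the file. When calling `run_pipeline` with a pool from a script, place the call under an `if __name__ == "__main__":` guard so worker processes can import the module safely.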
jordi399/web_scrapping