An interactive, easy, and unique way to scrape components from web pages.
It scrapes and parses the following objects:
- Headings and subheadings
- Links
- Images
- Tables (Coming soon)
- CSS-based class selector
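As a rough illustration of what scraping these objects can look like with BeautifulSoup4, here is a minimal sketch. The function name `scrape_components`, the `SAMPLE_HTML` string, and the shape of the returned dictionary are assumptions made for this example; they are not taken from this project's code.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used only for this illustration.
SAMPLE_HTML = """
<html><body>
  <h1>Main title</h1>
  <h2 class="lead">Subtitle</h2>
  <a href="/docs">Docs</a>
  <a>No target</a>
  <img src="/logo.png" alt="Logo">
  <p class="lead">Intro paragraph</p>
</body></html>
"""


def scrape_components(html, css_class=None):
    """Collect headings, links, images and, optionally, elements with a CSS class."""
    soup = BeautifulSoup(html, "html.parser")

    # Headings and subheadings, in document order, as (tag name, text) pairs.
    headings = [
        (tag.name, tag.get_text(strip=True))
        for tag in soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"])
    ]
    # Only anchors that actually carry an href attribute.
    links = [a["href"] for a in soup.find_all("a", href=True)]
    # Only images that actually carry a src attribute.
    images = [img["src"] for img in soup.find_all("img", src=True)]
    # Text of every element matching the CSS class selector, if one was given.
    by_class = (
        [el.get_text(strip=True) for el in soup.select(f".{css_class}")]
        if css_class
        else []
    )
    return {
        "headings": headings,
        "links": links,
        "images": images,
        "by_class": by_class,
    }


result = scrape_components(SAMPLE_HTML, css_class="lead")
```

Filtering on `href=True` and `src=True` skips tags without those attributes, so the list comprehensions cannot raise a `KeyError` on incomplete markup.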
It is built with:
- Requests (fetches web page contents)
- Inquirer (interactive CLI)
- BeautifulSoup4 (parses the HTML)
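The way Requests and BeautifulSoup4 typically fit together can be sketched as follows. The helper name `fetch_soup`, its `get` parameter, and the 10-second timeout are assumptions for illustration and may not match how this module is written. The interactive Inquirer prompt is left out because it needs a terminal.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, get=requests.get, timeout=10):
    """Download a page and return it as a parsed BeautifulSoup tree.

    `get` defaults to requests.get; it is a parameter only so that the
    function can be exercised without network access.
    """
    response = get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return BeautifulSoup(response.text, "html.parser")
```

Calling `fetch_soup("https://example.com")` would return a tree that can then be queried with methods such as `find_all` or `select`.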
To run this, you need pipenv installed on your system. Install it with:

    pip install pipenv

Then run the module with:

    pipenv run start
To do:
- Work on tables
- Pretty print tables
- Select the title of the page
- Gather all the text of the page
- Get all the `<p>` tag content
- Whole source code
- Enable writing to file
- Work on the CSS-based selector
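Several of the items above could be approached along these lines with BeautifulSoup4 and the standard library. This is only a sketch under assumptions: the names `extract_extras`, `format_table`, `save_text`, and `SAMPLE_PAGE` are hypothetical, and none of this is the project's planned implementation.

```python
from pathlib import Path

from bs4 import BeautifulSoup

# Hypothetical sample page, used only for this illustration.
SAMPLE_PAGE = (
    "<html><head><title>Demo</title></head><body>"
    "<p>One</p><p>Two</p>"
    "<table><tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Apple</td><td>3</td></tr></table>"
    "</body></html>"
)


def extract_extras(html):
    """Return the page title, <p> contents, all visible text, and table rows."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else None
    paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
    text = soup.get_text(separator=" ", strip=True)
    # Each table row becomes a list of cell strings (header and data cells alike).
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in soup.find_all("tr")
    ]
    return {"title": title, "paragraphs": paragraphs, "text": text, "rows": rows}


def format_table(rows):
    """Pad each column to its widest cell so that the rows line up."""
    if not rows:
        return ""
    column_count = max(len(row) for row in rows)
    widths = [
        max(len(row[i]) for row in rows if i < len(row))
        for i in range(column_count)
    ]
    lines = [
        " | ".join(cell.ljust(widths[i]) for i, cell in enumerate(row)).rstrip()
        for row in rows
    ]
    return "\n".join(lines)


def save_text(path, content):
    """Write scraped output to a file as UTF-8."""
    Path(path).write_text(content, encoding="utf-8")


extras = extract_extras(SAMPLE_PAGE)
table_text = format_table(extras["rows"])
```

Keeping extraction, formatting, and file output in separate functions means each to-do item can be finished and tested on its own.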