This project is a custom scraper for the Hacker News website. It is designed to extract news articles from multiple pages of Hacker News, filtering and sorting them based on the number of upvotes. The final output includes articles that have garnered more than 99 upvotes, providing a curated list of popular and relevant news items.
- Scrapes multiple Hacker News pages.
- Filters articles with more than 99 upvotes.
- Sorts articles based on upvote count.
- Utilizes BeautifulSoup for efficient HTML parsing.
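The filter-and-sort behavior listed above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `extract_articles` and `sort_by_votes`, the constant `MIN_VOTES`, and the CSS class names (`titleline`, `subtext`, `score`) are assumptions based on the Hacker News markup as commonly observed, which may change.

```python
from bs4 import BeautifulSoup

MIN_VOTES = 99  # keep only articles with MORE than this many upvotes


def extract_articles(html):
    """Return a list of {'title', 'link', 'votes'} dicts for popular stories."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: each story has one '.titleline > a' link and one '.subtext'
    # cell, and both appear in the same order on the page.
    links = soup.select(".titleline > a")
    subtexts = soup.select(".subtext")

    articles = []
    for link, subtext in zip(links, subtexts):
        score = subtext.select_one(".score")
        if score is None:
            # Some entries (for example job postings) have no score.
            continue
        # Score text is assumed to look like "150 points".
        votes = int(score.get_text().split()[0])
        if votes > MIN_VOTES:
            articles.append(
                {"title": link.get_text(), "link": link.get("href"), "votes": votes}
            )
    return articles


def sort_by_votes(articles):
    """Return the articles ordered from most to fewest upvotes."""
    return sorted(articles, key=lambda article: article["votes"], reverse=True)
```

Calling `sort_by_votes(extract_articles(html))` on the HTML of one page yields that page's qualifying articles, most upvoted first. Results from several pages can be concatenated before sorting.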
- Clone this repository.
- Install the required dependencies: requests and beautifulsoup4.
- Add the URLs of the Hacker News pages you want to scrape to URLs_list.txt.
- Run the script with python main.py.
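As a rough sketch of how the URL file from the steps above might be consumed, the snippet below reads URLs_list.txt and downloads each page with requests. The helper names `read_urls` and `fetch_pages`, the one-URL-per-line file format, and the 10-second timeout are assumptions made for illustration; the real main.py may work differently.

```python
import requests


def read_urls(path="URLs_list.txt"):
    """Return the URLs in the file, assuming one URL per line."""
    with open(path, encoding="utf-8") as handle:
        # Skip blank lines and strip surrounding whitespace.
        return [line.strip() for line in handle if line.strip()]


def fetch_pages(urls, timeout=10):
    """Download each URL and return the HTML bodies in the same order."""
    pages = []
    for url in urls:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # stop on HTTP errors instead of parsing bad pages
        pages.append(response.text)
    return pages
```

Under these assumptions, `fetch_pages(read_urls())` returns one HTML string per listed page, ready to be parsed with BeautifulSoup.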
- Python 3.x
- requests
- beautifulsoup4
Thanuja Cherukuri - [[email protected]]