WebScraping-HN

Description

This project is a custom scraper for Hacker News. It extracts articles from multiple Hacker News pages, keeps only those with more than 99 upvotes (that is, 100 or more), and sorts them by upvote count. The output is a curated list of popular news items.

Features

Scrapes multiple Hacker News pages.
Filters articles with more than 99 upvotes.
Sorts articles by upvote count.
Uses BeautifulSoup for HTML parsing.
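
The filter-and-sort step listed above can be sketched as follows. This is a minimal illustration, not the code in scraping.py. It assumes each article has already been extracted into a dict with hypothetical keys "title", "link", and "votes", and that the sort is descending (the description does not state the direction).

```python
# Minimal sketch of the filter-and-sort step; not the project's actual code.
# Assumptions: each article is a dict with "title", "link", and "votes" keys
# (hypothetical names), and the most-upvoted articles should come first.

MIN_UPVOTES = 100  # "more than 99 upvotes" means 100 or more


def filter_and_sort(articles, min_upvotes=MIN_UPVOTES):
    """Keep articles with at least min_upvotes, highest vote count first."""
    popular = [a for a in articles if a["votes"] >= min_upvotes]
    return sorted(popular, key=lambda a: a["votes"], reverse=True)


# Made-up sample data, for illustration only.
sample = [
    {"title": "Story A", "link": "https://example.com/a", "votes": 99},
    {"title": "Story B", "link": "https://example.com/b", "votes": 100},
    {"title": "Story C", "link": "https://example.com/c", "votes": 250},
]

for article in filter_and_sort(sample):
    print(article["votes"], article["title"])
```

With the sample data, "Story A" is dropped because 99 is not more than 99, and the remaining two are listed from most to fewest upvotes.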

How to Use

Clone this repository.
Install the required dependencies (requests and beautifulsoup4), for example with pip install requests beautifulsoup4.
Add URLs of the Hacker News pages you want to scrape in URLs_list.txt.
Run the script: python main.py.
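
The format of URLs_list.txt is not documented here. The sketch below assumes one URL per line and skips blank lines. The function name read_urls is hypothetical and may not match what main.py actually does.

```python
# Minimal sketch of loading the page URLs; not the project's actual code.
# Assumption: URLs_list.txt holds one URL per line; blank lines are ignored.
from pathlib import Path


def read_urls(path="URLs_list.txt"):
    """Return the non-empty, whitespace-stripped lines of the given file."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

Under that assumption, a file containing https://news.ycombinator.com/news on the first line and https://news.ycombinator.com/news?p=2 on the second would yield a list of those two URLs, which the scraper could then fetch in turn.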

Requirements

Python 3.x
requests
beautifulsoup4

Contact

Thanuja Cherukuri - [[email protected]]
