Web Scraping: The Arabic Library Books

This project is a Python scraper, built with BeautifulSoup, that extracts book metadata from the Arabic Library website. For each book it collects the author, title, language, page count, publishing house, size, format, category, and URL. The results are written to a clean CSV file, ready for further analysis or for use in machine learning models.

Project Overview

The goal of this project is to demonstrate the ability to scrape data from a website and store it in a structured format for analysis. The Arabic Library website was chosen as the target site for this project.

Features

  • Extracts detailed information about books from the Arabic Library website.
  • Stores the scraped data in a clean CSV file.
  • Produces output that is ready for further analysis or machine learning models.
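
The extraction step listed above might look roughly like the following minimal sketch. The HTML snippet, the CSS class names (`book`, `title`, `author`, `pages`, `category`), the `example.com` base URL, and the helper names `text_or_none` and `parse_book` are all assumptions made for illustration; the real site's markup and the actual contents of scraper.py are not documented here and will likely differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup standing in for one book entry on a listing page.
SAMPLE_HTML = """
<div class="book">
  <h2 class="title"><a href="/books/1">كتاب تجريبي</a></h2>
  <span class="author">مؤلف تجريبي</span>
  <span class="pages">250</span>
</div>
"""


def text_or_none(parent, selector):
    """Return the stripped text of the first element matching selector, or None."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_book(card, base_url="https://example.com"):
    """Turn one book element into a flat dict; missing fields become None."""
    link = card.select_one(".title a")
    return {
        "title": text_or_none(card, ".title"),
        "author": text_or_none(card, ".author"),
        "pages": text_or_none(card, ".pages"),
        "category": text_or_none(card, ".category"),  # absent in the sample
        "url": urljoin(base_url, link["href"]) if link else None,
    }


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
books = [parse_book(card) for card in soup.select("div.book")]
```

Returning `None` for a missing field, rather than raising, keeps one incomplete book entry from aborting a whole scraping run. In a live scraper the HTML would come from a Requests response instead of an inline string.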

Technologies Used

  • Python
  • BeautifulSoup
  • Requests
  • Pandas

Installation

  1. Clone the repository:
    git clone https://github.com/husseini2000/Web_Scraping.git
  2. Navigate to the project directory:
    cd Web_Scraping
  3. Install the required libraries:
    pip install -r requirements.txt

Usage

  1. Run the scraper script:
    python scraper.py
  2. The script will generate a CSV file named arabic_library_books.csv containing the scraped data.
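
As a rough illustration of how scraped records could end up in that CSV file, the sketch below writes and re-reads a file using only the standard library `csv` module. The column names in `FIELDS` are assumed from the field list in the project description, and `write_books_csv` and `read_books_csv` are hypothetical helpers, not functions confirmed to exist in scraper.py. The project lists Pandas, whose `DataFrame.to_csv` would serve the same purpose.

```python
import csv
import os
import tempfile

# Assumed column names, based on the fields named in the project description.
FIELDS = [
    "author", "title", "language", "pages", "publishing_house",
    "size", "format", "category", "url",
]


def write_books_csv(records, path):
    """Write a list of dicts to CSV; fields missing from a record are left empty."""
    # utf-8-sig prepends a BOM, which helps spreadsheet tools detect UTF-8
    # and display Arabic text correctly.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, restval="")
        writer.writeheader()
        writer.writerows(records)


def read_books_csv(path):
    """Read the CSV back into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8-sig") as f:
        return list(csv.DictReader(f))


# Illustrative record only; not real scraped data.
sample = [{
    "author": "مؤلف تجريبي",
    "title": "كتاب تجريبي",
    "url": "https://example.com/books/1",
}]

# A temporary directory keeps this demo from overwriting real output.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "arabic_library_books.csv")
    write_books_csv(sample, path)
    rows = read_books_csv(path)
```

Fixing the column order in one list keeps the header stable across runs, which makes the output easier to load in later analysis steps.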

Contributing

Contributions are welcome! If you have any suggestions or improvements, please create a pull request or open an issue.

License

This project is licensed under the MIT License. See the LICENSE file for details.
