ParserOzon

Overview

This project extracts data from the Ozon website in JSON format using the Scrapy and Selenium libraries.

Install

Download the program to your computer:

git clone https://github.com/murreds/ParserOzon.git

Then install the required libraries:

python3 -m pip install -r requirements.txt

Note: the Firefox browser must be installed.
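
As a quick sanity check for the Firefox requirement, the sketch below looks for an executable named `firefox` on the PATH. This is only an assumption about how Firefox is exposed on your system: on Windows and macOS the browser is often installed without being on the PATH, so a False result does not necessarily mean Firefox is missing. The function name `firefox_on_path` is hypothetical and not part of the project.

```python
import shutil


def firefox_on_path() -> bool:
    """Return True if an executable named 'firefox' is found on PATH."""
    # shutil.which returns the full path of the executable, or None.
    return shutil.which("firefox") is not None


print("Firefox found on PATH:", firefox_on_path())
```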

Features

Extracting product card data from the first pages of a category.
Obtaining the characteristics of each product in a category.

Usage

First, go to the project directory:

cd ./ozonscraper

To get data about each product card from the first pages, enter:

scrapy crawl cardproduct -a category={category} [-a page={page}] [-a mode=full]

To get product characteristics data, enter:

scrapy crawl chproduct -a category={category}

The category must be one of: smartphone, tv, tablets, laptop.
Data is stored in the directory /path/to/project/directory/ozonscraper/data
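
Once the spiders have run, the output can be inspected from Python. The following is a minimal sketch, assuming that the data directory contains files with a `.json` extension and that each file holds a single JSON document; the README does not state the file names or layout, so adjust the glob pattern if your output differs. The helper name `load_json_files` is hypothetical.

```python
import json
from pathlib import Path


def load_json_files(data_dir):
    """Return {file name: parsed content} for every *.json file in data_dir."""
    results = {}
    # Assumption: output files end in .json and each holds one JSON document.
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            results[path.name] = json.load(fh)
    return results
```

For example, `load_json_files("ozonscraper/data")` called from the repository root returns a dictionary keyed by file name, or an empty dictionary if no matching files exist yet.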

Important: run the first command before the second.

Example

scrapy crawl cardproduct -a category=laptop -a page=2 && scrapy crawl chproduct -a category=laptop
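
The same sequence can be driven from a Python script. The sketch below is not part of the project: the names `CATEGORIES`, `build_commands`, and `run_all` are hypothetical, and it assumes that `scrapy` is on the PATH and that the script is started from the repository root, so that `ozonscraper` is the project directory. It only uses the spider names and arguments shown above.

```python
import subprocess

# Categories listed in the Usage section.
CATEGORIES = ("smartphone", "tv", "tablets", "laptop")


def build_commands(category, page=None, mode=None):
    """Return the two scrapy commands in the required order."""
    if category not in CATEGORIES:
        raise ValueError(f"category must be one of {CATEGORIES}, got {category!r}")
    card = ["scrapy", "crawl", "cardproduct", "-a", f"category={category}"]
    if page is not None:
        card += ["-a", f"page={page}"]
    if mode is not None:
        card += ["-a", f"mode={mode}"]
    characteristics = ["scrapy", "crawl", "chproduct", "-a", f"category={category}"]
    # cardproduct must run before chproduct.
    return [card, characteristics]


def run_all(category, page=None, mode=None, cwd="ozonscraper"):
    """Run both spiders in order; stop if the first one fails."""
    for cmd in build_commands(category, page, mode):
        # check=True raises CalledProcessError on a non-zero exit code,
        # which mirrors the behaviour of '&&' in the shell example.
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_all("laptop", page=2)` issues the same two commands as the shell example above.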
