teamtnt/crawler

About

A distributed crawler written in PHP.

Requirements

Installation

Via Composer:

composer require teamtnt/crawler

Configuration

Each instance needs an identifier, which can be set in .env:

NODE_NAME="Instance 1"
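When several instances are provisioned from a script, the identifier can be written into each instance's .env automatically. The following is a minimal sketch, not part of the package: `set_node_name` is a hypothetical helper, and it assumes the .env file is a plain list of KEY=value lines.

```shell
# Hypothetical helper (not provided by teamtnt/crawler):
# set or update NODE_NAME in a .env file, leaving other keys untouched.
set_node_name() {
  env_file=$1
  node_name=$2
  # Create the file if it does not exist yet.
  touch "$env_file"
  # Copy every line except an existing NODE_NAME entry.
  # grep exits non-zero when nothing is selected, hence "|| true".
  grep -v '^NODE_NAME=' "$env_file" > "$env_file.tmp" || true
  # Append the new identifier, quoted so names with spaces work.
  printf 'NODE_NAME="%s"\n' "$node_name" >> "$env_file.tmp"
  mv "$env_file.tmp" "$env_file"
}

# Example call (assumes .env is in the current directory):
#   set_node_name .env "Instance 1"
```

Because an existing NODE_NAME line is dropped before the new one is appended, running the helper repeatedly leaves exactly one entry in the file.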

The domain feeder needs to start with a seed domain. After that, start the crawler by running:

php artisan crawler

To scrape a single URL:

php artisan url:frontier www.example.com/something

Crawler Topology

Domain Feeder

Single Instance

URL Frontier
