Web service for retrieving a list of URLs from a web domain. The URLs are classified into PDF and other.
Install dependencies with pipenv install, then start the service with
pipenv run python app.py -p 8080
Then you can query the service from localhost like so:
curl localhost:8080/crawl -d "url=https://www.centralpark-hamburg.de" -X POST
By default, the search is two layers deep. To go one layer deeper, pass layers=3 in the request:
curl localhost:8080/crawl -d "url=https://www.centralpark-hamburg.de&layers=3" -X POST
The same query can be sent to the remote deployment:
curl https://pdf-crawl-5ekifxtyca-ew.a.run.app/crawl -d "url=https://www.centralpark-hamburg.de" -X POST
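For scripted use, the curl requests above can be reproduced from Python with the standard library. This is a minimal sketch, not part of the service: the helper names build_crawl_request and crawl are hypothetical, and because the response format is not described here, the body is returned as raw text without parsing.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_crawl_request(target_url, layers=None, base="http://localhost:8080"):
    """Build the form-encoded POST request that the curl commands send.

    `base` defaults to the local service started with `-p 8080`;
    the function name and its parameters are illustrative only.
    """
    fields = {"url": target_url}
    if layers is not None:
        # Only sent when given, so the service default depth applies otherwise.
        fields["layers"] = layers
    body = urlencode(fields).encode("ascii")
    return Request(f"{base}/crawl", data=body, method="POST")


def crawl(target_url, layers=None, base="http://localhost:8080", timeout=60):
    """Send the request and return the raw response body as text."""
    request = build_crawl_request(target_url, layers, base)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the service running locally, crawl("https://www.centralpark-hamburg.de", layers=3) should correspond to the second curl command. Unlike curl -d, urlencode percent-encodes the target URL, which a form-decoding server is expected to read as the same value.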