This module is designed to run as a prerender client that caches to S3. It renders webpages either locally or in Docker, then posts the rendered static HTML pages to S3. The idea is to give bots a place to scan static HTML pages.
- Create an S3 Bucket.
- Have a domain with a robots.txt file (e.g. https://example.com/robots.txt); see the example below.
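The robots.txt is what the client reads for sitemap info (see the `robots_url` option below). A minimal sketch, assuming your sitemap lives at https://example.com/sitemap.xml:

```
# Example robots.txt (all URLs are placeholders)
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```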
If developing, install the dependencies from requirements.txt:
pip install -r requirements.txt
docker build -t prerender .
docker run -e AWS_ACCESS_KEY_ID=AWSKEY -e AWS_SECRET_ACCESS_KEY=AWSSECRET -it prerender python -c "from prerender.prerender import Prerender; Prerender(#Options).capture()"
python scraper/setup.py install
python prerender/setup.py install
from prerender.prerender import Prerender
pre = Prerender(
# Options
)
| Required | Variable | Info |
|---|---|---|
| True | `robots_url` | Path to your root robots.txt file. This contains the sitemap info. |
| True | `s3_bucket` | Name of the S3 bucket used as the cache archive. |
| False | `auth` | Credentials used for basic authentication to the page. |
| False | `query_char_deliminator` | (recommended) Character used to replace the question mark in stored page keys. S3 cannot serve a static page whose key contains a `?`, so swapping it for another character fixes this. For example, with a `query_char_deliminator` of `'#'`, the page `/subpage?id=1` is stored as `/subpage#id=1`. |
| False | `allowed_domains` | List of domains to allow. If specified, all other domains are blocked during page capturing. |
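For reference, a filled-in constructor might look like the sketch below, assuming the options in the table map directly to keyword arguments. The bucket name and domain are placeholder values:

```python
from prerender.prerender import Prerender

# All values below are placeholders; swap in your own bucket and domain.
pre = Prerender(
    robots_url="https://example.com/robots.txt",  # root robots.txt that lists the sitemap
    s3_bucket="my-prerender-cache",               # S3 bucket used as the cache archive
    query_char_deliminator="#",                   # /subpage?id=1 is stored as /subpage#id=1
    allowed_domains=["example.com"],              # only capture pages on this domain
)
```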
To invalidate the existing cache:
pre.invalidate()
To capture the full domain:
pre.capture()
If you prefer to capture a single page rather than a full domain:
pre.capture_page_and_upload("https://example.com")
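To refresh a handful of specific pages, the same call can be looped over a list of URLs. A small sketch, reusing the `pre` instance constructed above (the URLs are placeholders):

```python
# Placeholder URLs; capture_page_and_upload is called once per page.
pages = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/subpage?id=1",
]

for url in pages:
    pre.capture_page_and_upload(url)
```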