sucho-archiving/web-archive-browser

 
 

Web Replay Gen

Generate a website for viewing web archives.

🌐 Live demo

Features

  • Compatible with web archives in WACZ format
  • Automatic deploy to GitHub Pages
  • List & autocomplete-search web archives
  • Embedded web archive replay

Quick Start

1. Create new project from template

Click "Use this template" on the repository page to create your own repository, then clone it as usual.

2. Install dependencies

Navigate to your project directory and run:

npm install

3. Update wrg-config.json

Add your website title and web archive URL:

{
  "site": {
+   "title": "My Web Archives"
  },
  "archives": [
+   "https://my-site.example.com/wacz/archive.wacz",
-   "s3://webrecorder-builds/warcs/netpreserve-twitter.warc"
  ]
}

See Configuration for all options.

4. Preview website

To preview your site at http://localhost:8080, run:

npm run serve

5. Deploy to GitHub Pages

Push to main to automatically deploy your site to GitHub Pages. ✨

You can also opt out of Pages to use another hosting provider. See Deployment for more information.


Configuration

Configure options in wrg-config.json:

site

Object for configuring site details.

Key           Default Value    Value Type  Description
site          {}               Object
site.title    "Web Archives"   string      Website title, used in browser title bar and as the primary heading
site.url      ""               string      Website base URL
site.logoSrc  ""               string      Website logo, any valid <img> src
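
As a rough illustration of how the documented defaults combine with your own values, the sketch below merges a partial site object over those defaults. The helper name applySiteDefaults is hypothetical and not part of Web Replay Gen; only the keys and default values come from the table above.

```javascript
// Documented defaults for the "site" object (from the table above).
const SITE_DEFAULTS = { title: "Web Archives", url: "", logoSrc: "" };

// Hypothetical helper: fill in any missing "site" keys with their defaults.
function applySiteDefaults(config = {}) {
  return { ...config, site: { ...SITE_DEFAULTS, ...(config.site || {}) } };
}

// Only the title is overridden; url and logoSrc keep their defaults.
const merged = applySiteDefaults({ site: { title: "My Web Archives" } });
console.log(merged.site);
```

Setting only site.title, as in the Quick Start, is therefore enough for a working configuration under this reading.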

replay

Object for configuring the embedded ReplayWeb.page <replay-web-page> tag.

Key             Default Value                                 Value Type                     Description
replay          {}                                            Object
replay.embed    "replayonly"                                  "replayonly"|"full"|"default"  ReplayWeb.page embed option
replay.baseUrl  "https://cdn.jsdelivr.net/npm/replaywebpage"  string                         Base URL for ReplayWeb.page scripts. replay.version will be ignored if a base URL is specified.
replay.version  ""                                            string                         ReplayWeb.page version. Omit for the latest. See releases
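
One way to read the interaction between replay.baseUrl and replay.version is sketched below. It assumes that a pinned version is expressed with jsDelivr's package@version URL form. The helper resolveReplayBaseUrl is hypothetical, the version string is only an example value, and Web Replay Gen's actual resolution logic may differ.

```javascript
// Default base URL, as documented for replay.baseUrl.
const DEFAULT_BASE_URL = "https://cdn.jsdelivr.net/npm/replaywebpage";

// Hypothetical helper: decide where the ReplayWeb.page scripts are loaded from.
function resolveReplayBaseUrl(replay = {}) {
  // An explicit base URL wins; replay.version is ignored in that case.
  if (replay.baseUrl) return replay.baseUrl;
  // Assumption: jsDelivr pins a version with the "package@version" form.
  if (replay.version) return `${DEFAULT_BASE_URL}@${replay.version}`;
  // Neither option set: use the default base URL (latest version).
  return DEFAULT_BASE_URL;
}

// Example version string only; check the ReplayWeb.page releases for real ones.
console.log(resolveReplayBaseUrl({ version: "1.8.0" }));
```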

archives

Configure location of web archive files.

Key       Default Value  Value Type
archives  "archives"     string|string[]|{name:string;url:string}[]

The option value can be:

  • Relative path to a project directory containing .wacz files
  • Relative path to a .txt file with newline-separated list of remote URLs
  • JSON array whose items are plain URL strings or objects with name and url
  • Relative path to a JSON file with an archives key where the value is a JSON array

Paths must point to a subdirectory or file inside your project root (i.e. in your repo). Examples:

{
  "archives": "./wacz-files/"
}
{
  "archives": "data/archives.json"
}
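
The four accepted forms and the project-root constraint can be sketched as a small classifier. This is an illustration only: the helper classifyArchivesOption and its return labels are hypothetical, and telling the string forms apart by file extension is an assumption rather than documented behavior.

```javascript
const path = require("node:path");

// Hypothetical helper: classify an "archives" option value into one of the
// four documented forms. Throws if a path escapes the project root.
function classifyArchivesOption(value, projectRoot = process.cwd()) {
  if (Array.isArray(value)) return "inline-array";
  if (typeof value !== "string") {
    throw new TypeError("archives must be a string or an array");
  }

  // Resolve relative to the project root and reject paths outside of it.
  const resolved = path.resolve(projectRoot, value);
  const relative = path.relative(projectRoot, resolved);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`archives path must be inside the project: ${value}`);
  }

  // Assumption: the string forms are told apart by file extension.
  const ext = path.extname(resolved).toLowerCase();
  if (ext === ".txt") return "url-list-file";
  if (ext === ".json") return "json-file";
  return "directory";
}

console.log(classifyArchivesOption("./wacz-files/"));
console.log(classifyArchivesOption("data/archives.json"));
```

A value such as "../elsewhere" is rejected in this sketch because it resolves outside the project root.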

Example JSON array, with one plain URL string followed by one object with name and url:

{
  "archives": [
    "s3://my-bucket/a/archive.wacz",
    {
      "name": "My Web Archive",
      "url": "s3://my-bucket/b/archive.wacz"
    }
  ]
}
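
Because array items may be either strings or objects, code that consumes the list usually benefits from a single shape. The sketch below normalizes both item kinds to { name, url }. The helper normalizeArchives is hypothetical, and using the file name as the display name for plain strings is an assumption, not documented behavior.

```javascript
// Hypothetical helper: turn each array item into a { name, url } object.
function normalizeArchives(items) {
  return items.map((item) => {
    if (typeof item === "string") {
      // Assumption: a plain URL string falls back to its file name as the name.
      const fileName = item.split("/").pop();
      return { name: fileName, url: item };
    }
    if (item && typeof item.url === "string") {
      // Objects keep their name; a missing name falls back to the URL.
      return { name: item.name || item.url, url: item.url };
    }
    throw new TypeError("archive entries must be a URL string or { name, url }");
  });
}

const archives = normalizeArchives([
  "s3://my-bucket/a/archive.wacz",
  { name: "My Web Archive", url: "s3://my-bucket/b/archive.wacz" },
]);
console.log(archives);
```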

The default behavior is to list web archive files in the archives directory. Web archive files (.wacz, .warc) are ignored by git and copied to the output _site directory by default, retaining their directory structure.

Deployment

GitHub Pages

By default, Web Replay Gen deploys to Pages on every push to the main branch, as configured in the publish-gh-pages.yml GitHub workflow. To change the deployment (e.g. to use a different release branch), update that workflow file.

Local web archive support

Due to GitHub's file size limit and the lack of Git LFS support in Pages, you may run into issues deploying large web archive files. To work around this, you can create a separate workflow that uploads web archive files elsewhere (e.g. to an S3 bucket) and configure your site with the remote URLs. Alternatively, you can self-host.

Opt-Out

To prevent deployment to Pages, either disable the workflow through the GitHub UI or delete the workflow file (publish-gh-pages.yml).

Self-hosting

First, remove the Pages workflow. Run the build script to output your site into a local directory:

npm run build

This will output a production-ready build to /_site. Transfer the contents of /_site to your host.

Dev Server

Run the dev server with npm run serve to serve files from /_site.

Saving changes to src will automatically reload the page. See 11ty Browsersync docs to customize the dev server.

Local configuration

Create and configure options in wrg-config.local.json to specify different site options during local development.

To use wrg-config.local.json, run the following:

echo 'WRG_CONFIG_NAME=wrg-config.local.json' > .env

To disable, comment out the line in .env:

# WRG_CONFIG_NAME=wrg-config.local.json
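
To make the effect of commenting out the line concrete, here is a minimal sketch of how a .env entry could select the config file. Both helpers are hypothetical, the fallback to wrg-config.json is an assumption based on the default file name used throughout this README, and the project's actual loading code may work differently.

```javascript
// Minimal .env-style parser: "KEY=value" per line; blank lines and lines
// starting with "#" are skipped. Real .env loaders handle more cases.
function parseEnv(text) {
  const env = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue;
    env[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return env;
}

// Hypothetical helper: pick the config file name.
// Assumption: without WRG_CONFIG_NAME, wrg-config.json is used.
function resolveConfigName(env) {
  return env.WRG_CONFIG_NAME || "wrg-config.json";
}

// Active line: the local config is selected.
console.log(resolveConfigName(parseEnv("WRG_CONFIG_NAME=wrg-config.local.json")));
// Commented-out line: the variable is unset, so the fallback applies.
console.log(resolveConfigName(parseEnv("# WRG_CONFIG_NAME=wrg-config.local.json")));
```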

Templates

Web Replay Gen templates are written in Nunjucks. You are free to use any templating language Eleventy supports, such as plain HTML, Markdown, or EJS.

Web Components

Web components in the /components directory are not pre-rendered at build time. Use the <is-land> tag to render web components at runtime. See archive.njk for an example and refer to the 11ty/is-land docs.

Styling

TailwindCSS

TailwindCSS is enabled in all Eleventy template files. You can install a specific Tailwind version with npm install -D tailwindcss@{version}.

Note: Tailwind is not available in web components (/components/*.js) due to limitations with the shadow DOM. See workarounds if you'd like to access Tailwind classes in web components.

Customization

Tailwind supports inline-style-like customization through arbitrary values in class names. For a more global approach to customization (for example, if you have a vendor CSS file), include a <link rel="stylesheet"> tag in your template file. Any .css files in /src will be copied to the output site folder and can be referenced in the <link> tag.
