Embedded fonts are not included in WARCs #125

Open
machawk1 opened this issue Oct 6, 2020 · 3 comments
@machawk1
Owner

machawk1 commented Oct 6, 2020

On my own site (e.g., https://matkelly.com), I reference some fonts that are loaded by the page and used in its CSS, e.g.,

<link rel="preload" href="/_font/IM_FELL_English_Roman.woff2" as="font" type="font/woff2" crossorigin>

The resource resolution procedure never fetches these, so the HTML representation is affected at replay. The request for the resource does appear in the WARC.

machawk1 added the bug label Oct 6, 2020
@machawk1
Owner Author

machawk1 commented Oct 6, 2020

A generic query selector like document.querySelectorAll('link') will return all of the link elements in the document (head), but I am still searching for a less generic way to identify fonts, in the same spirit as the current logic that uses (e.g.) document.styleSheets for CSS.
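One possible way to stay close to the document.styleSheets logic is to walk the style sheets and pull url(...) targets out of @font-face rules. The following is only a sketch under assumptions: the names extractFontFaceUrls and collectFontFaceUrls are hypothetical and not part of the extension, only top-level @font-face rules are inspected, and cross-origin sheets (whose cssRules access throws) are skipped.

```javascript
// Hypothetical helpers for illustration; not the extension's actual code.

// Return the absolute URLs named by url(...) in one @font-face rule's text.
function extractFontFaceUrls(cssText, baseUrl) {
  const urls = [];
  const pattern = /url\(\s*(['"]?)(.*?)\1\s*\)/g;
  let match;
  while ((match = pattern.exec(cssText)) !== null) {
    if (match[2].startsWith('data:')) continue; // already embedded inline
    try {
      urls.push(new URL(match[2], baseUrl).href);
    } catch (err) {
      // Ignore values that cannot be resolved to a URL.
    }
  }
  return urls;
}

// Walk a StyleSheetList-like collection and gather font URLs.
function collectFontFaceUrls(styleSheets, documentUrl) {
  const found = new Set();
  for (const sheet of Array.from(styleSheets)) {
    let rules;
    try {
      rules = sheet.cssRules; // throws for cross-origin style sheets
    } catch (err) {
      continue;
    }
    for (const rule of Array.from(rules || [])) {
      if (!rule.cssText.trimStart().startsWith('@font-face')) continue;
      // Relative URLs resolve against the sheet, or the page for inline styles.
      const base = sheet.href || documentUrl;
      for (const url of extractFontFaceUrls(rule.cssText, base)) found.add(url);
    }
  }
  return [...found];
}
```

In a page context this would be called as collectFontFaceUrls(document.styleSheets, document.baseURI). It would not find fonts that are only referenced from a link element, such as the preload above.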

@ibnesayeed
Collaborator

You may want to use a "not perfect but good enough" approach of matching patterns in the href, as, and/or type attribute values, using the attribute selectors of CSS Selectors in your querySelectorAll call.
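A rough sketch of what that could look like follows. The selector list, the constant FONT_LINK_SELECTOR, and the function collectFontLinkUrls are assumptions made for illustration; the file extensions are a guess at common font formats and would miss hrefs that carry a query string.

```javascript
// Hypothetical selector: any <link> whose as, type, or href hints at a font.
const FONT_LINK_SELECTOR = [
  'link[as="font"]',       // e.g. rel="preload" as="font"
  'link[type^="font/"]',   // e.g. type="font/woff2"
  'link[href$=".woff2"]',
  'link[href$=".woff"]',
  'link[href$=".ttf"]',
  'link[href$=".otf"]',
].join(', ');

// Collect absolute, de-duplicated font URLs from a document-like root.
function collectFontLinkUrls(root, baseUrl) {
  const urls = new Set();
  for (const link of Array.from(root.querySelectorAll(FONT_LINK_SELECTOR))) {
    const href = link.getAttribute('href');
    if (!href) continue;
    try {
      urls.add(new URL(href, baseUrl).href);
    } catch (err) {
      // Skip hrefs that cannot be resolved to a URL.
    }
  }
  return [...urls];
}
```

In a page context this would be called as collectFontLinkUrls(document, document.baseURI). Because the selectors are joined with commas, a link that matches several of them (like the preload in the report) is still returned once.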

@machawk1
Owner Author

machawk1 commented Oct 6, 2020

@ibnesayeed That will be my first approach. I am still investigating whether other resources are also missing from the WARC but referenced by these elements. If so, the more generic approach of querying the DOM for all link elements would yield additional representations to store.
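For that investigation, a small survey of every link element could help show what else is referenced. This is a sketch only: describeLinkedResources and groupByRel are hypothetical names, and which rel values are worth fetching (as opposed to, say, canonical or alternate) is left as an open decision.

```javascript
// Hypothetical helper: describe every <link href> under a document-like root.
function describeLinkedResources(root, baseUrl) {
  const resources = [];
  for (const link of Array.from(root.querySelectorAll('link[href]'))) {
    let url;
    try {
      url = new URL(link.getAttribute('href'), baseUrl).href;
    } catch (err) {
      continue; // skip hrefs that cannot be resolved
    }
    resources.push({
      url,
      // rel is a space-separated token list, e.g. "shortcut icon".
      rel: (link.getAttribute('rel') || '').toLowerCase().split(/\s+/).filter(Boolean),
      as: link.getAttribute('as'),
      type: link.getAttribute('type'),
    });
  }
  return resources;
}

// Group the resolved URLs by rel token to see which kinds of links exist.
function groupByRel(resources) {
  const groups = {};
  for (const resource of resources) {
    for (const rel of resource.rel) {
      (groups[rel] = groups[rel] || []).push(resource.url);
    }
  }
  return groups;
}
```

In a page context, groupByRel(describeLinkedResources(document, document.baseURI)) gives a quick overview of the rel values present, which can then be compared against what is already stored.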
