Fix incorrect "total" numbers in Security chapter (2024, 2022, ?) #3912

Draft
wants to merge 8 commits into
base: main

@JannisBush JannisBush commented Nov 21, 2024

Some queries in the security chapter calculated a "total" number incorrectly. For example, the total number of iframes counted only iframes that had either an allow or a sandbox attribute.
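The effect of the restricted denominator can be illustrated with a minimal Python sketch. The rows below are made-up toy data and the helper `share_with_allow` is hypothetical; the real queries are SQL and are not reproduced here.

```python
# Toy data: each dict is one <iframe>. Only the attribute names come from
# the description above; the rows themselves are invented for illustration.
iframes = [
    {"allow": "fullscreen", "sandbox": None},
    {"allow": None, "sandbox": "allow-scripts"},
    {"allow": "camera", "sandbox": "allow-scripts"},
    {"allow": None, "sandbox": None},
    {"allow": None, "sandbox": None},
]


def share_with_allow(rows, only_attributed):
    """Return the share of iframes that have an allow attribute.

    only_attributed=True mimics the flawed "total" (iframes that have
    allow or sandbox); False uses every iframe as the denominator.
    """
    if only_attributed:
        denominator = [
            r for r in rows
            if r["allow"] is not None or r["sandbox"] is not None
        ]
    else:
        denominator = rows
    with_allow = sum(1 for r in rows if r["allow"] is not None)
    return with_allow / len(denominator)


flawed = share_with_allow(iframes, only_attributed=True)   # 2 of 3 iframes
fixed = share_with_allow(iframes, only_attributed=False)   # 2 of 5 iframes
print(f"flawed: {flawed:.1%}, fixed: {fixed:.1%}")
```

Because the flawed denominator drops iframes without either attribute, every share computed against it is inflated.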

This pull request fixes all the queries in the security chapter with incorrect "total" numbers.

  • Identify all queries with incorrect total numbers (security chapter 2024)
  • Fix all queries
  • Identify where the incorrect numbers were used in the Almanac (security chapter 2024)
  • Run the updated queries and store the results in Google Sheets (_fixed)
    • Only for the instances where the incorrect numbers were actually used for the chapter
  • Update the text in the Almanac (2024)
  • Optional: update the main queries/results/text for prior Almanac instances as well.

@JannisBush (Contributor, Author) commented

I went through all the queries, and it seems that only iframe_attribute_usage.sql and meta_csp_disallowed_directives.sql were really "wrong". For a couple of others, the word "total" was confusing because it referred to a subset, so I changed that as well, but we did not use the incorrect "total" in the text of the Almanac.

I uploaded the new data for the two fixed queries to Google Sheets: https://docs.google.com/spreadsheets/d/1b9IEGbfQjKCEaTBmcv_zyCyWEsq35StCa-dVOe6V1Cs/edit?gid=1587787684#gid=1587787684 and https://docs.google.com/spreadsheets/d/1b9IEGbfQjKCEaTBmcv_zyCyWEsq35StCa-dVOe6V1Cs/edit?gid=2132002234#gid=2132002234

We still have to adapt the text. Text passages to change:

  • Allow 2024
    • 21.4 million <iframe> -> 30.4 million <iframe>; we should probably also add "from the desktop crawl"
    • half included the allow -> 35.2%
    • only 21% of <iframe> elements had the allow attribute -> 14.4%
  • Sandbox 2024
    • Change 28.4% and 27.5% to 19.9% and 19.8%
    • Change 35.2% and 32% to 22.1% and 21.2%
  • Meta CSP 2024
    • The simple option would be to change "1.70% of pages" to "1.70% of pages that use CSP in a <meta> tag".
    • Another option would be to change the percentage to be of pages, but then all values will be <0.01
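As a quick sanity check on the first Allow correction, the sketch below rescales a share when only the denominator changes. It assumes that "half" means exactly 50% and that the number of iframes with the attribute is unchanged; `rescale_share` is a hypothetical helper, and the other corrected figures in the list are not derived here.

```python
def rescale_share(old_share, old_total, new_total):
    """Re-express a share after only the denominator changes."""
    return old_share * old_total / new_total


# Totals from the list above, in millions of <iframe> elements.
OLD_TOTAL = 21.4
NEW_TOTAL = 30.4

# Assumption: "half" is treated as exactly 50%.
allow_share = rescale_share(0.50, OLD_TOTAL, NEW_TOTAL)
print(f"{allow_share:.1%}")  # prints 35.2%
```

Under these assumptions the result matches the corrected 35.2% figure.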

About the pre-2024 versions:

  • The Meta CSP issue only exists for 2024.
  • The Allow/Sandbox issue also exists for 2022, 2021, and 2020. 2019 does not contain the incorrect query.
  • I updated the queries with a comment only for now.
  • We could use the newest query to also get the data for 2021 and 2020 (it already contains the data for 2022) and only update the text.
