Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add sanitization to dangerouslySetInnerHTML values #530

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

add sanitization to dangerouslySetInnerHTML values #530

wants to merge 1 commit into from

Conversation

davidsword
Copy link
Contributor

@davidsword davidsword commented Sep 25, 2018

This fixes: #526

There are several spots in this plugin using Reacts dangerouslySetInnerHTML attribute to place content from the server. This can easily lead to cross-site scripting (XSS) attacks.

Before the HTML is dangerously placed, running it through DOMPurify helps to sanitize out malicious HTML.

iframe's (an easy method for XSS), are allowed for the embedded media, however the domain used must be whitelisted - otherwise the src is removed.

The domains for the iframe whitelist reflect the front end "Embed Media" list https://github.com/Automattic/liveblog/blob/master/README.md#embedding-media

⚠️ This sanitization, by design, removes any <script> and/or non-whitelisted <iframe> a user has added themselves in an entry, including content within 'HTML Blocks'. When merged into a upcoming release, this should be mentioned in the Upgrade Notice.

(& thank you for ignoring the typo in my branch ;)

There are several spots in this plugin using Reacts 
`dangerouslySetInnerHTML` attribute to place content from the server. 
This can easily lead to cross-site scripting (XSS) attacks.

Before the HTML is dangerously placed, running it through DOMPurify 
https://github.com/cure53/DOMPurify helps to sanitize out malicious 
HTML.

iframe (an easy method for XSS), are allowed for the embedded media, 
however the domain used must be whitelisted - otherwise the iframes src 
is removed.
@davidsword davidsword changed the title add sanitization to dangerouslySetInnerHTML values. Re: #526 add sanitization to dangerouslySetInnerHTML values Sep 25, 2018
@cain
Copy link
Contributor

cain commented Sep 25, 2018

@davidsword Looks like you need to build the assets :)

@sboisvert
Copy link
Contributor

This looks great!

I suspect @cain is correct in needing to rebuild the compiled JS to get it working.

One thing that came to mind is, what happens if someone has an extra domain they want whitelisted. For example to their own site or to some other service (that they trust)? Is there a way we could have a filter in the plugin where they could add domains?

'open.spotify.com',
'player.vimeo.com',
'www.youtube.com',
// Instagram and Twitter don't use iframes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we will strip out Twitter and Instagram embeds?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, how was this list of domains generated?

Copy link
Contributor Author

@davidsword davidsword Sep 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

negative. Twitter & Instagram continue to work and can be used.

Specifically, for Tweets: the Rest API has the entry content with <blockquote class="twitter-tweet">...</blockquote> and then the Twitter embed is fully rendered by using the platform.twitter.com/widgets.js script that's already enqueued into wp_head()

Copy link
Contributor Author

@davidsword davidsword Oct 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list of domains was generated taking liveblogs advertised media embed features, adding them to an entry, and seeing what WordPress/oembed gave back. In writing that, I realize I need to take a closer look into these domains (ie: subdomain changes for different regions or languages) and each services documentation to ensure the domains are correct and reliable for each service.

& as pointed out by @sboisvert, a filter to manage this whitelist would be needed.

@sboisvert
Copy link
Contributor

Chatted with @mjangda on this.
A few great questions that came from it:

  1. What is our motivation behind this?
    Is the content that's being fed to the dangerouslySetInnerHTML something that isn't being validated by WP? If it isn't, why not? If it is, what are we going to do about things that we validate in WP and accept and then filter out in JS? If it's impossible for something to come to the JS without having been thru PHP validation, is the increase in security worth breaking current backwards compatibility?

  2. What will this break?
    If we're introducing a breaking change, should we first notify users in a version? (For example the next release could lists all the things that would be removed and link to documentation on how to validate / whitelist it properly)
    Is there a way we could look over the most common things this would strip and then figure out if those should be whitelisted by default?

* @return {string} sanitized HTML
*/
export const sanitizeHTML = (dirty) => {
// Whitelist iframes for the plugins 'embded media' feature.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in embed

@davidsword
Copy link
Contributor Author

Thank you all for the input and questions. I will answer and look into this more this week.

@davidsword
Copy link
Contributor Author

davidsword commented Oct 9, 2018

The custom endpoints on the REST API that are feeding dangerouslySetInnerHTML are not sanitized. If database was compromised, an XSS attack could output from the REST API and the React app would display it.

Since XSS prevention is needed and no sanitization is occurring (besides on new entry submissions), wether we do it with PHP at the custom endpoints, or with JS using DOMPurify, it seems breaking changes are inevitable. In reviewing this again, I now believe both locations should be sanitized using the same filter-able white/blacklist. This way we're not serving stored XSS and we're also escaping as late as possible.

I'm not sure why things are in this current state, perhaps a broader history lessons needs to happen. It looks like the old PHP version of the plugin did have wp_kses() and an extra whitelist of allowed tags before output - however the current REST/React setup does not use it.

As for what DOMPurify breaks (as the PR currently is), here are the implicit blacklist for tags & attrs (whitelist: tags & attrs). This PR only varies from the defaults by whitelisting iframe with our own function to validate src values. I can see the need for a discussion around what the defaults should be.

& thanks for bringing up the questions. This PR lacked plugin documentation for what is default white/blacklisted, and Upgrade Notice details on breaking change, and so on. More info needed.

As well as having the white/blacklist be filterable, and documentation on that. I will add that in as this goes along.

@GaryJones
Copy link
Contributor

@davidsword Any interest in fixing the merge conflicts and finishing this one off?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

Review usage of dangerouslySetInnerHTML
7 participants