Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

πŸ› Bug Report β€” Strange header issue with fetch URLs that redirect #2223

Open
Hacksore opened this issue Jun 5, 2024 · 25 comments
Labels

Comments

@Hacksore
Copy link

Hacksore commented Jun 5, 2024

Warning

WIP issue please be patient

I have a worker where I am attempting to fetch artifacts from the github api.

This API returns a 302 to a signed download URL. So what I think is happening is the Authentication header is also being sent to the signed url server which results in an error.

Repro that I'm working on:
https://github.com/Hacksore/fetch-with-github-artifacts

Same issue exists in prod:
https://fetch-with-github-artifacts.hacksore.workers.dev

I was able to figure out that it has to be the auth token making it down stream.

curl -vvv -H "Authorization: Bearer ligma" https://productionresultssa16.blob.core.windows.net

Note workaround:

We just have to fetch without redirect and then do another fetch to omit sending the Authenticated header.

@Hacksore
Copy link
Author

Hacksore commented Jun 5, 2024

Seems CF has the same issue
πŸ’€ https://github.com/cloudflare/workers-sdk/pull/4816/files

@Hacksore
Copy link
Author

Hacksore commented Jun 5, 2024

You can repro this with curl with the -L & --location-trusted flag and it will yield the same issue we see in the worker.

curl -o test.zip -L -vvv --location-trusted \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer <yeeted>" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  https://api.github.com/repos/Hacksore/overlayed/actions/artifacts/1569393086/zip

@Hacksore
Copy link
Author

Hacksore commented Jun 5, 2024

FWIW this was a CVE https://curl.se/docs/CVE-2018-1000007.html and it's expected to not pass auth headers to a redirection if the origins are not the same.

and in node it makes sure not to send it in a redirection to a cross origin call.
https://github.com/nodejs/node/blob/0281e2cbf0531ade992265b7b734dd3f1ecffe8a/deps/undici/src/lib/web/fetch/index.js#L1286

They delete it here:
https://github.com/nodejs/node/blob/0281e2cbf0531ade992265b7b734dd3f1ecffe8a/deps/undici/src/lib/web/fetch/index.js#L1311

@nickhudkins
Copy link

The whatwg spec for fetch, specifically item 13 is what is at play here: https://fetch.spec.whatwg.org/#http-redirect-fetch

It appears that it is mostly handled but a few items from the spec have been missed here:

jsg::Promise<jsg::Ref<Response>> handleHttpRedirectResponse(
jsg::Lock& js,
jsg::Ref<Fetcher> fetcher,
jsg::Ref<Request> jsRequest, kj::Vector<kj::Url> urlList,
uint status, kj::StringPtr location) {
// Reconstruct the request body stream for retransmission in the face of a redirect. Before
// reconstructing the stream, however, this function:
//
// - Throws if `status` is non-303 and this request doesn't have a "rewindable" body.
// - Translates POST requests that hit 301, 302, or 303 into GET requests with null bodies.
// - Translates HEAD requests that hit 303 into HEAD requests with null bodies.
// - Translates all other requests that hit 303 into GET requests with null bodies.
auto redirectedLocation = ([&]() -> kj::Maybe<kj::Url> {
// TODO(later): This is a bit unfortunate. Per the fetch spec, we're supposed to be
// using standard WHATWG URL parsing to resolve the redirect URL. However, changing it
// now requires a compat flag. In order to minimize changes to the rest of the impl
// we end up double parsing the URL here, once with the standard parser to produce the
// correct result, and again with kj::Url in order to produce something that works with
// the existing code. Fortunately the standard parser is fast but it would be nice to
// be able to avoid the double parse at some point.
if (FeatureFlags::get(js).getFetchStandardUrl()) {
auto base = urlList.back().toString();
KJ_IF_SOME(parsed, jsg::Url::tryParse(location, base.asPtr())) {
auto str = kj::str(parsed.getHref());
return kj::Url::tryParse(str.asPtr(), kj::Url::Context::REMOTE_HREF, kj::Url::Options {
.percentDecode = false,
.allowEmpty = true,
});
} else {
return kj::none;
}
} else {
return urlList.back().tryParseRelative(location);
}
})();
if (redirectedLocation == kj::none) {
auto exception = JSG_KJ_EXCEPTION(FAILED, TypeError,
"Invalid Location header; unable to follow redirect.");
return js.rejectedPromise<jsg::Ref<Response>>(kj::mv(exception));
}
// Note: RFC7231 says we should propagate fragments from the current request URL to the
// redirected URL. The Fetch spec seems to take the position that that's the navigator's
// job -- i.e., that you should be using redirect manual mode and deciding what to do with
// fragments in Location headers yourself. We follow the spec, and don't do any explicit
// fragment propagation.
if (urlList.size() - 1 >= MAX_REDIRECT_COUNT) {
auto exception = JSG_KJ_EXCEPTION(FAILED, TypeError, "Too many redirects.", urlList);
return js.rejectedPromise<jsg::Ref<Response>>(kj::mv(exception));
}
urlList.add(kj::mv(KJ_ASSERT_NONNULL(redirectedLocation)));
// "If actualResponse’s status is not 303, request’s body is non-null, and
// request’s body’s source [buffer] is null, then return a network error."
// https://fetch.spec.whatwg.org/#http-redirect-fetch step 9.
//
// TODO(conform): this check pedantically enforces the spec, even if a POST hits a 301 or
// 302. In that case, we're going to null out the body, anyway, so it feels strange to
// report an error. If we widen fetch()'s contract to allow POSTs with non-buffer-backed
// bodies to survive 301/302 redirects, our logic would get simpler here.
//
// Follow up with the spec authors about this.
if (status != 303 && !jsRequest->canRewindBody()) {
auto exception = JSG_KJ_EXCEPTION(FAILED, TypeError,
"A request with a one-time-use body (it was initialized from a stream, not a buffer) "
"encountered a redirect requiring the body to be retransmitted. To avoid this error "
"in the future, construct this request from a buffer-like body initializer.");
return js.rejectedPromise<jsg::Ref<Response>>(kj::mv(exception));
}
auto method = jsRequest->getMethodEnum();
// "If either actualResponse’s status is 301 or 302 and request’s method is
// `POST`, or actualResponse’s status is 303 and request's method is not `HEAD`, set request’s
// method to `GET` and request’s body to null."
// https://fetch.spec.whatwg.org/#http-redirect-fetch step 11.
if (((status == 301 || status == 302) && method == kj::HttpMethod::POST) ||
(status == 303 && method != kj::HttpMethod::HEAD)) {
// TODO(conform): When translating a request with a body to a GET request, should we
// explicitly remove Content-* headers? See https://github.com/whatwg/fetch/issues/609
jsRequest->setMethodEnum(kj::HttpMethod::GET);
jsRequest->nullifyBody();
} else {
// Reconstruct the stream from our buffer. The spec does not specify that we should cancel the
// current body transmission in HTTP/1.1, so I'm not neutering the stream. (For HTTP/2 it asks
// us to send a RST_STREAM frame if possible.)
//
// We know `buffer` is non-null here because we checked `buffer`'s nullness when non-303, and
// nulled out `impl` when 303. Combined, they guarantee that we have a backing buffer.
jsRequest->rewindBody(js);
}
// No need to wait for output locks again when following a redirect, because we didn't interact
// with the app state in any way.
return fetchImplNoOutputLock(js, kj::mv(fetcher), kj::mv(jsRequest), kj::mv(urlList));
}

Happy to throw my hat in the ring to fix if I get some time.

@jasnell jasnell added bug Something isn't working spec-compliance api fetch labels Jun 5, 2024
@nickhudkins
Copy link

"Thinking aloud": main...nickhudkins:workerd:fix/fetch-redirect-spec

@kentonv
Copy link
Member

kentonv commented Jun 6, 2024

[EDIT: They changed the spec. 🀦 See comments below.]

This is functioning according to spec. The behavior is exactly the same as what is implemented in browsers.

To understand this, it is important to understand the difference between ambient and explicit credentials. Ambient credentials are credentials which a browser automatically adds to all requests based on the hostname to which they are addressed. This includes cookies and HTTP basic auth (old browser-builtin username/password authentication that no one uses anymore). Explicit credentials are credentials which the application explicitly passes to fetch(), such as when writing code like:

fetch(url, {headers: {"Authorization": "foo"}})

Note that HTTP basic auth uses the Authorization header, but the above code example is not HTTP basic auth. HTTP basic auth is ambient, but the above example is explicit.

Only ambient credentials are required to be dropped on a redirect. Explicit header values are never dropped.

This is the behavior in Chrome and other browsers. Chrome WILL keep the Authorization header through a cross-host redirect if the header was explicitly provided by the app. In fact, as far as fetch() is concerned, when the app explicitly specifies Authorization, this header is not any different from any other header by any other name. It treats it the same as Content-Type or Foo or X-Auth-Key -- it's just some strings.

Cloudflare Workers does not implement any ambient credentials. The Authorization header is only sent when you explicitly specified it. So the header is always kept through redirects.

This is not a security vulnerability in fetch(). If an HTTP server implements a REST API that uses non-ambient authorization via headers, and from this API it returns a redirect to a host it does not trust, then the security vulnerability is in the HTTP server.

In fact, this behavior is desirable. Say you are serving an API from some hostname, and you decide to move to the API to a different hostname using a redirect. Say your API client authenticate via an OAuth access token. If redirects dropped authorization headers, then all your clients would break -- the redirect would be useless. You therefore cannot move your API to a new hostname.

I do realize that some HTTP client libraries have nevertheless decided to implement this behavior of dropping the Authorization header on redirects. I would argue that this is a mistake. However, as long as those libraries aren't trying to implement a standard API, they are free to do as they like. In Workers, we are implementing the standard fetch() API, and we must follow the standard.

@nickhudkins
Copy link

I think I may have miscommunicated because I am not talking about the difference between ambient and explicit credentials, which, thank you for the thorough explanation, but....

This is about the whatwg spec, Step 13. Which explicitly calls out dropping certain headers when origins do not match during the redirection process. Is there some different spec that Workers follows?

@Hacksore
Copy link
Author

Hacksore commented Jun 6, 2024

@nickhudkins yeah I see it here as well described in 4.4. HTTP-redirect fetch -> step 13.

image

@kentonv
Copy link
Member

kentonv commented Jun 6, 2024

I agree that that line appears to contradict what I'm saying, which is weird because this issue came up in the past and we investigated it pretty carefully at the time. I specifically tested in Chrome and verified that the Authorization header wasn't dropped.

Did the spec change?

@nickhudkins
Copy link

Interesting! Not sure if the spec changed, and I understand how seriously this could break A LOT of folks if introduced without gating it behind a flag or something, really just surprising behavior due to the behavior of NodeJS, and (my) maybe incorrect? assumption that workerd's fetch implementation would be in parity with Node (bugs and all!)

@kentonv
Copy link
Member

kentonv commented Jun 6, 2024

It appears the spec was indeed changed, several months after I reviewed this last.

Changed November 2022: whatwg/fetch@9004f4e

That time when Node changed this and I argued they were breaking spec, Jan 2022: node-fetch/node-fetch#1449 (comment)

Ugh.

@jasnell
Copy link
Member

jasnell commented Jun 6, 2024

Just for a differential... I tested quickly in Node.js, Deno, and Bun... Node.js and Bun both remove the Authorization header on the redirect. Deno preserves it. I have to assume that Node.js implemented things after the spec changed and Bun likely copied Node.js' behavior. Deno, like us, likely copied the browser's behavior. Aren't standards fun?

@kentonv
Copy link
Member

kentonv commented Jun 6, 2024

I have to assume that Node.js implemented things after the spec changed and Bun likely copied Node.js' behavior.

Nope! When Node implemented this, it was against spec. The spec changed later on to match Node!

@jasnell
Copy link
Member

jasnell commented Jun 6, 2024

Well, that was node-fetch wasn't it? That's not the implementation that is actually in Node.js (which is based on undici) ... it looks like the change in undici's fetch was possibly made the same day the spec change was landed https://github.com/nodejs/undici/blame/8422aa988243fb4c6c37b78519954d7337cc240b/lib/fetch/index.js#L1339 ... possibly predating it by a bit but I suspect the change was made first in the node-fetch package.

@nickhudkins
Copy link

Ah yes, node-fetch may have gone through a change here, but I believe that fetch as long as it has been native to NodeJS has used undici

@jasnell beat me to it :)

@kentonv
Copy link
Member

kentonv commented Jun 6, 2024

Oh... sorry, I didn't realize node-fetch isn't node's fetch implementation.

@jasnell
Copy link
Member

jasnell commented Jun 6, 2024

Either way... I guess we need to decide how to handle this as it is technically a breaking change. Our options are:

  1. Keep it like it is. Essentially ignore the requirement in the spec
  2. Change the behavior with a compatibility flag / date
  3. Change the behavior with an extension option passed into the fetch
  4. Other options?

@jasnell
Copy link
Member

jasnell commented Jun 6, 2024

Oh... sorry, I didn't realize node-fetch isn't node's fetch implementation.

No worries lol... you're not the first and won't be the last. It's caused quite a bit of confusion over the years. Node.js implementation is provided by https://github.com/nodejs/undici

@kentonv
Copy link
Member

kentonv commented Jun 6, 2024

I am surprised they made this spec change as it is backwards-incompatible. There could be API servers out there that were relying on the ability to redirect to a new hostname. I thought I heard of at least one person broken by the node-fetch change in the wild.

So on one hand I feel like this needs a compat flag.

But on the other hand it's weird to use a compat flag on what could arguably be a security bug?

I think this is only a security problem in the presence of other security problems. For example, in whatwg/fetch#944 they give the example of an attacker setting their username to ../../redirect?url=http://evil.com, and if this username gets inserted into the URL of an API request made by some other user, now their credentials are leaked. The real bug here is URL injection (which can probably be exploited in other ways), but stripping Authorization provides some defense in depth.

Of course, many APIs use other header names for authorization and so aren't defended by this rule. So again... it can't be a security flaw in itself, it's more of a defensive measure.

So probably a compat flag is the right way to go?

@jasnell
Copy link
Member

jasnell commented Jun 6, 2024

So probably a compat flag is the right way to go?

That's where I'm leaning also.

@nickhudkins
Copy link

While I do not have deciding powers, a compat flag makes most sense to me as well. Thanks for getting through that. @Hacksore and I spent an unreasonable amount of time digging around to see where the divergence showed up across different runtimes, and then found the curl CVE etc and ended up deep in the rabbit hole of "how does everyone do fetch"

@Hacksore
Copy link
Author

Hacksore commented Jun 6, 2024

I am happy for anything the CF team decides to solve this. Glad @nickhudkins and I were able to discover the root cause cause it was driving me insane.

@kentonv
Copy link
Member

kentonv commented Jun 6, 2024

Here's a thread where people complained about the spec change breaking things: whatwg/fetch#1631

So seems pretty clear this should be a compat flag.

@jasnell
Copy link
Member

jasnell commented Jun 6, 2024

"In non-server environments you could not rely on this behavior so there it is not considered a breaking change..."

The WHATWG's philosophy of not caring about anything but browsers bites again. Fun.

@jasnell
Copy link
Member

jasnell commented Jun 6, 2024

@Hacksore @nickhudkins thanks for reporting! Not yet sure when we'll get the fix in but will get it scheduled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants