Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove trailing whitespace from HTTP headers #53005

Closed
brianquinlan opened this issue Jul 21, 2023 · 16 comments
Closed

Remove trailing whitespace from HTTP headers #53005

brianquinlan opened this issue Jul 21, 2023 · 16 comments
Assignees
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. breaking-change-request This tracks requests for feedback on breaking changes library-io

Comments

@brianquinlan
Copy link
Contributor

brianquinlan commented Jul 21, 2023

Change Intent

The intent of this change is to make the headers field values contained (through HttpHeaders) in HttpRequest and HttpClientResponse never contain trailing spaces and tabs.

For example, an HTTP request like:

POST /test HTTP/1.1
Header-A:  \t  cat \t 

Would be parsed like:

// CURRENT BEHAVIOR: leading whitespace removed, trailing preserved
assert(headers['header-a'] == 'cat \t ');
// PROPOSED BEHAVIOR: leading and trailing whitespace removed.
assert(headers['header-a'] == 'cat');

Justification

It is easier to write correct code if the whitespace has been removed from the field values.

The Dart HTTP parser itself has issues with trailing whitespace. For example, the chunked content encoding detector currently checks for the exact string 'chunked' rather than for 'chunked[ \t]*'.

RFC 2616 section 4.2 specifically says that leading and trailing whitespace is not semantic:

The field-content does not include any leading or trailing LWS:
linear white space occurring before the first non-whitespace
character of the field-value or after the last non-whitespace
character of the field-value. Such leading or trailing LWS MAY be
removed without changing the semantics of the field value.

Finally, this behavior would be consistent with the behavior of XMLHTTPRequest (dart:html HttpRequest) , Cronet (package:cronet_http) and Apple's Foundation URL loading system (package:cupertino_http).

Impact

There are three scenarios that I can imagine:

  1. usage that is currently incorrect but is fixed by this change, e.g., if (header['header-a'] == 'cat') ...
  2. usage that already accounts for the current behavior but will not be broken, e.g., if (header['header-a'].trim() == 'cat') ...
  3. usage that already accounts for the current behavior in a broken way, e.g., if (header['header-a'] == 'cat \t ') ...

I'm guessing that case (1) is the most common and case (3) is the least common.

Mitigation

I think that this is unlikely to have significant impact.

@brianquinlan brianquinlan added library-io breaking-change-request This tracks requests for feedback on breaking changes labels Jul 21, 2023
@aam
Copy link
Contributor

aam commented Jul 21, 2023

I don't think http client library should be in business of trimming and cleansing header values, should not attempt to apply semantic meaning to them. It should try to get content it received into developers hands as is and let them decide whether to trim values or not.
Beside relatively recent dart internal issue #51532 where Content-Length value was not trimmed before parsing, do we have any user reports of having a problem with current behavior?

@itsjustkevin
Copy link
Contributor

@Hixie @grouma @vsmenon breaking change request, hot off of the presses!

@brianquinlan
Copy link
Contributor Author

I don't think http client library should be in business of trimming and cleansing header values, should not attempt to apply semantic meaning to them. It should try to get content it received into developers hands as is and let them decide whether to trim values or not.

We already normalize the header name for case. We also remove leading whitespace from the field value.

Beside relatively recent dart internal issue #51532 where Content-Length value was not trimmed before parsing, do we have any user reports of having a problem with current behavior?

No. But I suspect that is because most servers do not append additional whitespace to field values.

@a-siva a-siva added the area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. label Jul 21, 2023
@aam
Copy link
Contributor

aam commented Jul 21, 2023

I don't think http client library should be in business of trimming and cleansing header values, should not attempt to apply semantic meaning to them. It should try to get content it received into developers hands as is and let them decide whether to trim values or not.
We already normalize the header name for case. We also remove leading whitespace from the field value.

Normalizing header name is kind of different thing compared to "normalizing" the value(note also that we do have an option preserveHeaderCase on https://api.dart.dev/stable/3.0.6/dart-io/HttpHeaders/add.html).

@a-siva
Copy link
Contributor

a-siva commented Jul 24, 2023

I don't think http client library should be in business of trimming and cleansing header values, should not attempt to apply semantic meaning to them. It should try to get content it received into developers hands as is and let them decide whether to trim values or not.

Could you summarize as to why this is important and what potential pitfalls you see from the removal of trailing whitespaces from HTTP headers.

@sgjesse
Copy link
Contributor

sgjesse commented Jul 25, 2023

As already mentioned "RFC 2616 section 4.2 specifically says that leading and trailing whitespace is not semantic", so any intermediate (cache server, proxy server, etc.) MAY already remove this whitespace. So in practice one can not rely on whitespace added by the server being preserved anyway. The current behavior in principle require trimming all header values which seems error prone.

@aam
Copy link
Contributor

aam commented Jul 25, 2023

@a-siva wrote

Could you summarize as to why this is important and what potential pitfalls you see from the removal of trailing whitespaces from HTTP headers.

When dart developer controls both server and client they might want to use trailing spaces as they are not prohibited by the RFC. Also it is hard to rule out possibility of some legacy server sending out trailing spaces as part of the header value content.

Note #33501 that resulted in dart HttpHeaders add and set getting preserveHeaderCase option when representing header names - developers asked for flexibility around that even though it's not per RFC.

https://datatracker.ietf.org/doc/html/rfc2616#section-4.2 talks about several things that MAY happen to the content(trimming of leading/trailing WS, longer WS sequences replaced with single WS, combination of multiple header values into single comma-separated list) - I don't think we have done none of that so far(well, except of leading WS). If we implement trimming of trailing WS, It would be good to clarify our strategy with other items.

@itsjustkevin
Copy link
Contributor

Checking in, have we come to a decision here, or is this still being worked through?

@brianquinlan
Copy link
Contributor Author

@itsjustkevin

We are ready for approval.

@itsjustkevin
Copy link
Contributor

Calling all reviewers (@Hixie, @grouma, @vsmenon) another breaking change, hot off the press!

@grouma
Copy link
Member

grouma commented Aug 11, 2023

SGTM. I can see this potentially impacting our internal web development workflow but we should have sufficient E2E coverage to catch any issues.

@Hixie
Copy link
Contributor

Hixie commented Aug 11, 2023

fine by me

@vsmenon
Copy link
Member

vsmenon commented Aug 14, 2023

lgtm

@brianquinlan
Copy link
Contributor Author

@grouma I already have a PR in progress and it passes the CBUILD.

@grouma
Copy link
Member

grouma commented Aug 14, 2023

@grouma I already have a PR in progress and it passes the CBUILD.

I'm not very familiar with CBUILD but I assume it doesn't do a TGP?

I'm concerned about our internal workflows as we have a handful of Dart based web servers / tools. Any issues will hopefully be caught when this change is rolled into Google3 though.

@brianquinlan
Copy link
Contributor Author

@grouma I don't know much about CBUILD but you seem to be right - it tests <5,000 Google3 targets.

copybara-service bot pushed a commit that referenced this issue Aug 15, 2023
Bug: #53005
Bug: #51532
Change-Id: I8a2fc04f48d50103819d655ccd300e73d59fbecc
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/319903
Reviewed-by: Alexander Aprelev <aam@google.com>
Commit-Queue: Brian Quinlan <bquinlan@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. breaking-change-request This tracks requests for feedback on breaking changes library-io
Projects
None yet
Development

No branches or pull requests

9 participants