...

If a character encoding is not specified, the Servlet specification requires that ISO-8859-1 be used. The character encoding for the body of an HTTP message (request or response) is specified in the Content-Type header field. For example, the header Content-Type: text/html; charset=ISO-8859-1 explicitly states that the default (ISO-8859-1) is being used.
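The fallback rule described above can be illustrated with a minimal Java sketch. The class and method names (ContentTypeCharset, charsetOf) are hypothetical and are not part of the Servlet API or Tomcat; the parsing is deliberately simplified and only shows the idea of reading the charset parameter and falling back to ISO-8859-1 when it is absent.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ContentTypeCharset {

    // Hypothetical helper: returns the charset named in a Content-Type value,
    // or ISO-8859-1 when the header is missing or has no charset parameter.
    // Charset.forName throws an unchecked exception for unknown or illegal names.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            // Parameters follow the media type, separated by ';'.
            String[] parts = contentType.split(";");
            for (int i = 1; i < parts.length; i++) {
                String param = parts[i].trim();
                int eq = param.indexOf('=');
                if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                    String value = param.substring(eq + 1).trim();
                    // Strip optional surrounding double quotes.
                    if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                        value = value.substring(1, value.length() - 1);
                    }
                    return Charset.forName(value);
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=ISO-8859-1"));
        System.out.println(charsetOf("text/html; charset=UTF-8"));
        System.out.println(charsetOf("text/html")); // no charset given, so the default applies
    }
}
```

In a real web application the container performs this lookup itself; the sketch is only meant to make the default visible.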

References: HTTP 1.1 Specification, Section 3.7.1

How do I change how GET parameters are interpreted?

...

Set the URIEncoding attribute on the <Connector> element in server.xml to something specific (e.g. URIEncoding="UTF-8").
Set the useBodyEncodingForURI attribute on the <Connector> element in server.xml to true. This will cause the Connector to use the request body's encoding for GET parameters.
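To see why the decoding charset matters, the following sketch decodes the same percent-encoded value twice using the standard java.net.URLDecoder class. It does not use Tomcat itself, and the class name QueryDecodingDemo is hypothetical; it only imitates what happens when the bytes of a GET parameter are interpreted as UTF-8 versus ISO-8859-1. Note that URLDecoder also turns '+' into a space, which is form-encoding behaviour rather than plain URI decoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class QueryDecodingDemo {

    // Decodes one percent-encoded parameter value using the named charset.
    // The checked exception is wrapped so callers need no throws clause.
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "caf" followed by U+00E9 (e with acute accent), percent-encoded from its UTF-8 bytes.
        String raw = "caf%C3%A9";

        String asUtf8 = decode(raw, "UTF-8");
        String asLatin1 = decode(raw, "ISO-8859-1");

        // Decoded as UTF-8, the two bytes C3 A9 form the single character U+00E9.
        System.out.println(asUtf8.equals("caf\u00e9"));         // true
        // Decoded as ISO-8859-1, each byte becomes its own character (U+00C3, U+00A9).
        System.out.println(asLatin1.equals("caf\u00c3\u00a9")); // true
    }
}
```

The second result is the familiar "mojibake" that appears when a browser sends UTF-8 encoded parameters and the server decodes them with a different charset.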

References: Tomcat 6 HTTP Connector, Tomcat 6 AJP Connector

How do I change how POST parameters are interpreted?

...

Default encoding for request and response bodies

See 'Default Encoding for POST' below.

Default encoding for GET

The character set for HTTP query strings (that's the technical term for 'GET parameters') can be found in sections 2 and 2.1 of the "URI Syntax" specification. The character set is defined to be US-ASCII. Any character that does not map to US-ASCII must be encoded in some way. Section 2.1 of the URI Syntax specification says that characters outside of US-ASCII must be encoded using % escape sequences: each byte is encoded as a literal % followed by the two hexadecimal digits of its value. Thus, a (US-ASCII character code 0x61, decimal 97) is equivalent to %61. No default encoding for URIs is specified anywhere, so a non-ASCII character can be turned into bytes in more than one way before it is escaped, which is why there is a lot of confusion when it comes to decoding these values.
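The escaping mechanism can be sketched in a few lines of Java. The class and method names (PercentEncodingSketch, percentEncode) are hypothetical, and the sketch is deliberately stricter than the specification: it escapes every byte that is not an ASCII letter or digit and assumes an ASCII-compatible charset. It shows that the letter a has the hexadecimal code 61, and that one non-ASCII character produces different escape sequences depending on the charset used to turn it into bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class PercentEncodingSketch {

    private static final String HEX = "0123456789ABCDEF";

    // Percent-encodes every byte of the text that is not an ASCII letter or digit.
    // Simplification: the URI specification allows more characters to stay unescaped.
    static String percentEncode(String text, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            int octet = b & 0xFF; // treat the byte as an unsigned value 0..255
            boolean alnum = (octet >= 'a' && octet <= 'z')
                    || (octet >= 'A' && octet <= 'Z')
                    || (octet >= '0' && octet <= '9');
            if (alnum) {
                out.append((char) octet);
            } else {
                // A literal '%' followed by two hexadecimal digits.
                out.append('%').append(HEX.charAt(octet >> 4)).append(HEX.charAt(octet & 0x0F));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The letter 'a' is decimal 97, hexadecimal 61.
        System.out.println(Integer.toHexString('a')); // 61

        String eAcute = "\u00e9"; // e with acute accent, outside US-ASCII
        System.out.println(percentEncode(eAcute, StandardCharsets.ISO_8859_1)); // %E9
        System.out.println(percentEncode(eAcute, StandardCharsets.UTF_8));      // %C3%A9
    }
}
```

Because both %E9 and %C3%A9 are valid escapes for the same character, the receiver has to know which charset the sender used in order to decode the value correctly.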

Some notes about the character encoding of URIs:

...
