...

  1. What is the default character encoding of the request or response body?
  2. How do I change how GET parameters are interpreted?
  3. How do I change how POST parameters are interpreted?
  4. How can I test if my configuration will work correctly?
  5. I'm having a problem with character encoding in Tomcat 5
  6. What can you recommend to just make everything work?

Answers

What is the default character encoding of the request or response body?

If a character encoding is not specified, the Servlet specification requires that ISO-8859-1 be used. The character encoding for the body of an HTTP message (request or response) is specified in the Content-Type header field. An example of such a header is Content-Type: text/html; charset=ISO-8859-1, which explicitly states that the default (ISO-8859-1) is being used.
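
The rule above can be illustrated with a minimal sketch in plain Java. The class ContentTypeCharset and the method charsetOf are hypothetical names introduced here for illustration; they are not part of Tomcat or the Servlet API, and a real container handles more edge cases than this does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class ContentTypeCharset {

    // Hypothetical helper: returns the charset named in a Content-Type value,
    // or ISO-8859-1 (the default described above) when no charset is given.
    // Charset.forName throws an unchecked exception for unknown charset names.
    public static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim();
                    // Strip optional surrounding double quotes.
                    if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                        name = name.substring(1, name.length() - 1);
                    }
                    return Charset.forName(name);
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8")); // prints UTF-8
        System.out.println(charsetOf("text/html"));                // prints ISO-8859-1
    }
}
```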

How do I change how GET parameters are interpreted?

There are two ways to specify how GET parameters are interpreted:

  1. Set the URIEncoding attribute on the <Connector> element in server.xml to something specific (e.g. URIEncoding="UTF-8").
  2. Set the useBodyEncodingForURI attribute on the <Connector> element in server.xml to true. This will cause the Connector to use the request body's encoding for GET parameters.
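
To see why the encoding used for GET parameters matters, here is a small sketch that percent-decodes the same query value with two different encodings. It uses java.net.URLDecoder from the JDK as a stand-in; it is not Tomcat's own decoding code, and the class name QueryDecodingDemo and the sample value are assumptions made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class QueryDecodingDemo {

    // Percent-decodes a query value using the named encoding.
    // The checked exception is wrapped so callers need no throws clause.
    public static String decode(String raw, String encoding) {
        try {
            return URLDecoder.decode(raw, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // "caf\u00e9" as a client would send it when it encodes the URL as UTF-8.
        String raw = "caf%C3%A9";

        // Decoding with the encoding the client used recovers the text.
        System.out.println("caf\u00e9".equals(decode(raw, "UTF-8")));      // prints true

        // Decoding as ISO-8859-1 turns the two UTF-8 bytes into two characters.
        System.out.println("caf\u00e9".equals(decode(raw, "ISO-8859-1"))); // prints false
        System.out.println(decode(raw, "ISO-8859-1").length());           // prints 5
    }
}
```

The second result is the typical symptom of a mismatch between the encoding the browser used for the URL and the one the connector assumes.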

How do I change how POST parameters are interpreted?

POST requests should specify the encoding of the parameters and values they send. Because many clients fail to set an explicit encoding, the default (ISO-8859-1) is used. In many cases this is not the preferred interpretation, so one can employ a javax.servlet.Filter to set the request encoding. Writing such a filter is trivial, and Tomcat already comes with an example filter. Please take a look at:
4.x::

...

  • 23929 (http://issues.apache.org/bugzilla/show_bug.cgi?id=23929)
  • 25360 (http://issues.apache.org/bugzilla/show_bug.cgi?id=25360)
  • 25231 (http://issues.apache.org/bugzilla/show_bug.cgi?id=25231)
  • 25235 (http://issues.apache.org/bugzilla/show_bug.cgi?id=25235)
  • 22666 (http://issues.apache.org/bugzilla/show_bug.cgi?id=22666)
  • 24557 (http://issues.apache.org/bugzilla/show_bug.cgi?id=24557)
  • 24345 (http://issues.apache.org/bugzilla/show_bug.cgi?id=24345)
  • 25848 (http://issues.apache.org/bugzilla/show_bug.cgi?id=25848)

What can you recommend to just make everything work?

Using UTF-8 as your character encoding for everything is a safe bet. This should work for pretty much every situation. In order to completely switch to using UTF-8, you need to make the following changes:

  1. Set URIEncoding="UTF-8" on your <Connector> in server.xml
  2. Use a character encoding filter (see "How do I change how POST parameters are interpreted?") with the default encoding set to UTF-8
  3. Change all your JSPs to set the correct Content-Type (use <%@page contentType="mime/type; charset=UTF-8" %>)
  4. Change all your servlets to set the content type for responses with UTF-8 as the charset (e.g. response.setContentType("text/html; charset=UTF-8"))
  5. Change any content-generation libraries you use (Velocity, Freemarker, etc.) to use UTF-8 as the character encoding of the content they generate
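
The reason every step in the list has to agree on one encoding can be shown with a short round-trip sketch in plain Java. The class RoundTripDemo and its roundTrip method are hypothetical and only model the idea of one side writing bytes and the other side reading them; no servlet container is involved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {

    // Encodes text with the writer's charset, then decodes the resulting
    // bytes with the reader's charset, as happens between client and server.
    public static String roundTrip(String text, Charset writer, Charset reader) {
        byte[] wire = text.getBytes(writer);
        return new String(wire, reader);
    }

    public static void main(String[] args) {
        String text = "Gr\u00fc\u00dfe"; // sample text with two non-ASCII characters

        // Both sides use UTF-8: the text survives unchanged.
        String same = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(text.equals(same));  // prints true

        // Written as UTF-8 but read as ISO-8859-1: the text is corrupted.
        String mixed = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(text.equals(mixed)); // prints false
    }
}
```

If any one layer (connector, filter, JSP, servlet, or template library) falls back to a different encoding, it plays the role of the mismatched reader in the second call.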