Comment: Clarify that <request-character-encoding> requires Servlet 4.0/Tomcat 9 or later.

...

Permalink to this page: https://cwiki.apache.org/confluence/x/liklBg

Questions

  1. Why
    1. What is the default character encoding of the request or response body?
    2. Why does everything have to be this way?
  2. How
    1. How do I change how GET parameters are interpreted?
    2. How do I change how POST parameters are interpreted?
    3. What can you recommend to just make everything work? (How to use UTF-8 everywhere).
    4. How can I test if my configuration will work correctly?
    5. How can I send higher characters in HTTP headers?
  3. Troubleshooting
    1. I'm having a problem with character encoding in Tomcat 5

Answers

Why

What is the default character encoding of the request or response body?

...

The container-agnostic way to specify the request character encoding for applications using Servlet 4.0 or later (Tomcat 9.0 and later) is to set the <request-character-encoding> element in the web application's web.xml file:

...

Note: The request encoding setting takes effect only if it is applied before the parameters are parsed. Once parsing has happened, it cannot be undone. Parameter parsing is triggered by the first method call that asks for a parameter name or value. Make sure that the filter is positioned before any other filters that read request parameters. The position depends on the order of the filter-mapping declarations in the WEB-INF/web.xml file, although since the Servlet 3.0 specification there are additional options for controlling the order. To check the actual order, you can throw an Exception from your page and look for the filter names in its stack trace.

Tomcat 9.x and later: do not use a <filter> at all and instead specify <request-character-encoding> in your application's web.xml file.
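
To illustrate why the encoding has to be in place before the parameters are parsed, here is a minimal sketch that decodes the same percent-encoded value twice using only the JDK (Java 10 or later for this URLDecoder overload). It is not Tomcat's parsing code. The class and method names are hypothetical, and the ISO-8859-1 case assumes a container falling back to the Servlet specification's default for request bodies.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only mimics the decoding step of parameter parsing.
class ParameterDecodingDemo {

    // Decodes a percent-encoded parameter value with the given charset.
    static String decode(String rawValue, Charset charset) {
        return URLDecoder.decode(rawValue, charset);
    }

    public static void main(String[] args) {
        // "caf\u00e9" as a browser would send it when the form is submitted as UTF-8.
        String raw = "caf%C3%A9";

        // Correct charset: the two bytes C3 A9 become one character.
        System.out.println(decode(raw, StandardCharsets.UTF_8));

        // Wrong charset: each byte becomes its own character, so the value is garbled.
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1));
    }
}
```

The second call yields a different string of five characters instead of four. Once a container has cached the parsed values, changing the request encoding afterwards does not re-run this step, which is why the setting must come first.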

What can you recommend to just make everything work? (How to use UTF-8 everywhere).

...

  1. Set URIEncoding="UTF-8" on your <Connector> in server.xml. References: Tomcat 7 HTTP Connector, Tomcat 7 AJP Connector, Tomcat 8.5 HTTP Connector, Tomcat 8.5 AJP Connector.
  2. Set the default request character encoding in either the Tomcat conf/web.xml file or the web app's web.xml file, either by setting <request-character-encoding> (for applications using Servlet 4.0 / Tomcat 9.x+) or by using a character encoding filter.
  3. Change all your JSPs to include the charset name in their contentType. For example, use <%@page contentType="text/html; charset=UTF-8" %> for the usual JSP pages and <jsp:directive.page contentType="text/html; charset=UTF-8" /> for pages in XML syntax (aka JSP Documents).
  4. Change all your servlets to set the content type for responses and to declare UTF-8 as the charset in that content type. Use response.setContentType("text/html; charset=UTF-8") or response.setCharacterEncoding("UTF-8"), and do so before calling response.getWriter(), because the encoding cannot be changed afterwards.
  5. Change any content-generation libraries you use (Velocity, Freemarker, etc.) to use UTF-8 and to specify UTF-8 in the content type of the responses that they generate.
  6. Disable any valves or filters that may read request parameters before your character encoding filter or JSP page has a chance to set the encoding to UTF-8. For more information, see https://www.mail-archive.com/users@tomcat.apache.org/msg21117.html.
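
One rough way to check steps 3 to 5 is to look at the Content-Type header your application actually returns and confirm that it declares UTF-8. The following is a minimal sketch of such a check using only the JDK. The class and method names (ContentTypeCharset, charsetOf, declaresUtf8) are hypothetical helpers and not part of Tomcat or the Servlet API, and the parsing is deliberately simple rather than a full implementation of the HTTP header grammar.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper for inspecting a Content-Type header value.
class ContentTypeCharset {

    // Returns the charset declared in a value such as "text/html; charset=UTF-8",
    // or empty if no charset parameter is present or its name is not recognized.
    static Optional<Charset> charsetOf(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; the parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional surrounding double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Optional.of(Charset.forName(value));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name.
                return Optional.empty();
            }
        }
        return Optional.empty();
    }

    // True only if the header explicitly declares UTF-8 (charset names are case-insensitive).
    static boolean declaresUtf8(String contentType) {
        return charsetOf(contentType).map(StandardCharsets.UTF_8::equals).orElse(false);
    }

    public static void main(String[] args) {
        System.out.println(declaresUtf8("text/html; charset=UTF-8"));
        System.out.println(declaresUtf8("text/html"));
    }
}
```

A header without a charset parameter is reported as not declaring UTF-8, since in that case the client is left to guess the encoding.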

...