1. JSP pages must include the header:
<%@ page contentType="text/html; charset=UTF-8" %>
2. In catalina.bat (Windows), catalina.sh (Unix/Linux), or apache$jakarta_config.com (OpenVMS), a switch must be added to the command that launches java. The switch is:
-Dfile.encoding=UTF-8
Strictly speaking this is a Java system property, not an environment variable; it sets the JVM's default character encoding. I could not find it documented anywhere, but it is essential.
3. Input coming back from the browser must be translated from ISO-8859-1 to UTF-8, so you need a method that does the conversion. It seems to me that ISO-8859-1 is used in all regions: I have had people in countries such as Greece and Bulgaria test this, and their input always came back in ISO-8859-1 encoding. The method, which you will use constantly, should go something like this:
import java.io.UnsupportedEncodingException;

/**
 * Convert an ISO-8859-1 format string (which is the default sent by IE)
 * to the UTF-8 format that the database is in.
 */
public String toUTF8(String isoString) {
    String utf8String = null;
    if (null != isoString && !isoString.equals("")) {
        try {
            // Recover the raw bytes, then decode them again as UTF-8.
            byte[] stringBytesISO = isoString.getBytes("ISO-8859-1");
            utf8String = new String(stringBytesISO, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // As we can't translate, just send back the best guess.
            System.out.println("UnsupportedEncodingException is: " + e.getMessage());
            utf8String = isoString;
        }
    } else {
        // Null and empty strings are returned unchanged.
        utf8String = isoString;
    }
    return utf8String;
}
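To see what the conversion in step 3 is undoing, here is a minimal standalone sketch that can be run outside Tomcat. The class name Utf8RoundTrip and the helper misdecode are hypothetical names made up for this illustration; misdecode only simulates UTF-8 bytes being read as ISO-8859-1, and it is an assumption that this matches what happens to form input on a given server.

```java
import java.nio.charset.StandardCharsets;

class Utf8RoundTrip {

    /** Same idea as toUTF8 in step 3: take the ISO-8859-1 bytes and decode them as UTF-8. */
    static String toUTF8(String isoString) {
        if (isoString == null || isoString.isEmpty()) {
            return isoString;
        }
        byte[] raw = isoString.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Hypothetical helper: simulate UTF-8 bytes being wrongly decoded as ISO-8859-1. */
    static String misdecode(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Four Greek letters, written as escapes so the source file encoding does not matter.
        String original = "\u0393\u03b5\u03b9\u03b1";
        String garbled = misdecode(original);   // one char per UTF-8 byte
        String repaired = toUTF8(garbled);

        System.out.println(garbled.length());          // prints 8
        System.out.println(original.equals(repaired)); // prints true
    }
}
```

Using the StandardCharsets constants avoids the checked UnsupportedEncodingException. One caution: the conversion should only be applied to strings that really were misdecoded. Calling it on text that is already correct replaces every character outside ISO-8859-1 with a question mark.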
I have found that these three steps are all that is necessary to make your site accept any language that UTF-8 can work with. I extend my thanks to those of you on the Tomcat users list who helped me find these little gems.
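A quick way to confirm that the switch from step 2 is actually in effect is to print what the JVM reports. This is only a sketch: the class name DefaultCharsetCheck is hypothetical, and it needs to be launched with the same options Tomcat gets, for example java -Dfile.encoding=UTF-8 DefaultCharsetCheck.

```java
import java.nio.charset.Charset;

class DefaultCharsetCheck {

    /** Build a one-line report of the property value and the charset the JVM uses by default. */
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the report shows something other than UTF-8, the switch is probably not reaching the java command. As far as I know, recent JDKs (18 and later) already default to UTF-8, so on those the switch may make no visible difference.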
(from the tomcat-user mailing list)