Hi Eric,
Thank you for taking the time to reply, and sorry it's taken me so long
to get to this. Other tasks popped up. :-)
Anyway, the app I'm working with uses Struts and com.oreilly.servlet. I
don't see response.getOutputStream() or new OutputStreamWriter()
anywhere. Maybe it's inside the cos.jar?
Thank You,
Troy
On Apr 5, 2005, at 1:54 PM, Broyles, Eric wrote:
Troy,
UTF-8 is the encoding that you want to use throughout your web
application. I'm not sure what framework you're using, but whatever it
is you'll need to make sure that the encoding is being set to UTF-8 for
the Writer. The easiest way to do this is to set the encoding of your
JSP to UTF-8 and pass the value of the getCharacterEncoding() method
from the HttpServletRequest to the Writer upon creation.
Here's some sample code:
    OutputStream os = response.getOutputStream();
    Writer writer = new OutputStreamWriter( os, request.getCharacterEncoding() );
This will construct a Writer using the encoding set in the request,
which, if your JSP is set to UTF-8, should be UTF-8. This has the
benefit of being able to change your character encoding without having
to change code. If the framework you're using has special support for
specifying the character encoding so you don't have to manually supply
it on every page, that is all the better.
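
A minimal sketch of the Writer construction described above, written against plain JDK classes so it runs outside a servlet container. The class and method names are hypothetical, and the requestEncoding parameter merely stands in for the value of request.getCharacterEncoding(). That method returns null when the request names no encoding, and OutputStreamWriter rejects a null charset name, so the sketch assumes a fallback to UTF-8 is acceptable. It also assumes a modern JDK (8 or later).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Struts, cos.jar, or the servlet API.
class EncodedWriterSketch {

    // Builds a Writer over the stream. requestEncoding stands in for
    // request.getCharacterEncoding(), which may be null.
    static Writer openWriter(OutputStream os, String requestEncoding) {
        Charset charset = (requestEncoding != null)
                ? Charset.forName(requestEncoding)
                : StandardCharsets.UTF_8; // assumed fallback
        return new OutputStreamWriter(os, charset);
    }

    // Writes text through openWriter and returns the bytes that reached the stream.
    static byte[] encode(String text, String requestEncoding) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (Writer writer = openWriter(os, requestEncoding)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return os.toByteArray();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println(encode(text, null).length);         // 5 bytes as UTF-8
        System.out.println(encode(text, "ISO-8859-1").length); // 4 bytes as latin1
    }
}
```

In a servlet, the ByteArrayOutputStream would be replaced by response.getOutputStream(); the byte counts in main only show that the chosen encoding really changes what is sent.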
Essentially, anything written to the response must be UTF-8 encoded. If
it is, it should handle all the world's known languages today. We have
an application that is currently localized in German, French, Italian,
Spanish, Korean, Chinese (Simplified and Traditional), Portuguese and
Japanese.
I hope this is helpful. I spent a lot of time trying to get this to
work (trial and error and lots of debugging) and it boiled down to a bug
in the framework which was easily solved by the above code sample. That
is really key to the whole thing.
Eric
-----Original Message-----
From: mavery@xxxxxxxxxxxxxxx [mailto:mavery@xxxxxxxxxxxxxxx]
Sent: Tuesday, April 05, 2005 1:27 PM
To: Troy Davis
Cc: Broyles, Eric; cburkey@xxxxxxxxxxxx
Subject: Re: [cinjug-users] Looking for i18n Java information for
webapp
Sorry for the delay in writing back. I am cc-ing your questions to the
guys who *actually* implemented this in Open Edit and other apps. I'm
not sure about the pasting issue.
Thank you for writing back!
I'm familiar with the 1252 character encoding, but I thought that MS
apps try to paste text as unicode, not cp1252?
I suppose I could set the document encoding to cp1252, but then I have
an inverted problem when linux and mac users paste text in. Their high
bit characters will get changed in that circumstance.
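
To make the "inverted problem" concrete, here is a small sketch of what happens when bytes written in one encoding are read back in the other. The class and method names are hypothetical, and the sketch assumes the JDK in use provides the windows-1252 charset, which mainstream JDKs do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class; names are illustrative only.
class MojibakeSketch {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Text sent as UTF-8 but decoded by the server as cp1252.
    static String misreadUtf8AsCp1252(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    // Text sent as cp1252 but decoded by the server as UTF-8.
    static String misreadCp1252AsUtf8(String text) {
        return new String(text.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // e-acute becomes two characters: A-tilde followed by a copyright sign.
        System.out.println(misreadUtf8AsCp1252("\u00e9"));
        // A Word-style left curly quote becomes the replacement character U+FFFD.
        System.out.println(misreadCp1252AsUtf8("\u201c"));
    }
}
```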
And I'm ok with storing the text in the database as unicode and then
encoding high bit characters as HTML entities when displayed. I think
where I'm getting hung up is whether to change all the HTML and source
files to UTF-8, UTF-8 no BOM, UTF-16, etc. All the Java and JSP files
are latin1; when I tried to change them to UTF-8 I got compiler errors
about illegal characters, so I never got to test any data passing
functionality.
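
A rough sketch of the "encode high bit characters as HTML entities" step mentioned above, assuming decimal numeric character references are acceptable output. The class and method names are hypothetical. The sample string uses \u escapes so the source file stays pure ASCII whatever encoding it is saved in.

```java
// Hypothetical helper; names are illustrative only.
class EntityEncodingSketch {

    // Replaces every non-ASCII code point with a decimal numeric character
    // reference. ASCII characters, including markup characters, pass through
    // unchanged, so this is not a substitute for escaping & and <.
    static String toNumericEntities(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (codePoint > 0x7F) {
                out.append("&#").append(codePoint).append(';');
            } else {
                out.appendCodePoint(codePoint);
            }
            i += Character.charCount(codePoint); // step over surrogate pairs
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toNumericEntities("caf\u00e9"));      // caf&#233;
        System.out.println(toNumericEntities("\u201cx\u201d"));  // &#8220;x&#8221;
    }
}
```

The compiler errors are a separate matter from the data path: javac's -encoding option states which encoding the source files use, so the setting has to match how the files were actually saved.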
Which encodings do you use for java sources and jsp files? (Assuming
you use jsp...)
Thank you!
Troy
On Apr 4, 2005, at 11:11 AM, mavery@xxxxxxxxxxxxxxx wrote:
Hi Troy,
My posts to CinJUG are being rejected because the client I am
currently using is misconfigured so I am responding directly. We had
to iron out problems like this for Open Edit a while back so I will
*try* to remember all the steps.
First, Windows uses a default encoding that is MS1252 (Windows-1252),
not UTF-8. You will have to set your request encoding to MS1252, then
convert it to UTF-8 on the server side in order to handle the copy and
paste from Word into the text area. Once you have it in UTF-8, you can
store it that way and re-read and resend it all as UTF-8. One caveat is
that if you paste text into a text area, then try to display it as HTML,
there are several characters that will need to be escaped as HTML
entities, e.g. "&". Again, I am not *entirely* sure that I have these
instructions correct, but it
should be enough to get you started. We wrestled with this for quite
a while before getting it all straightened out.
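
The steps above might be sketched roughly as follows. This is an illustration under stated assumptions, not Open Edit's actual code: the class and method names are hypothetical, the pasted bytes are hard-coded rather than read from a request, and the JDK is assumed to provide the windows-1252 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not taken from Open Edit.
class PasteConversionSketch {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Step 1: interpret the submitted bytes as cp1252.
    static String decodeCp1252(byte[] raw) {
        return new String(raw, CP1252);
    }

    // Step 2: re-encode as UTF-8 for storage and for sending back out.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Step 3: escape markup characters before showing the text as HTML.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Word-style curly quotes (0x93, 0x94 in cp1252) around "A&B".
        byte[] pasted = { (byte) 0x93, 'A', '&', 'B', (byte) 0x94 };
        String text = decodeCp1252(pasted);
        System.out.println(toUtf8(text).length); // 9: each curly quote takes 3 bytes
        System.out.println(escapeHtml(text));    // the & becomes &amp;
    }
}
```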
You might consider using Open Edit (openedit.org), or parts thereof,
since it already handles these problems and has a nice WYSIWYG
editor. It is open source and not hard at all for an experienced
developer to use as an API, but since it is more of a "product" and
not intended to be an API, the documentation is somewhat lacking on
that front.
Hi Everyone,
I'm extending a Java web application for my company, and it's
probably time to internationalize it in order to handle unicode
characters. It currently works with the latin1 character set
exclusively. Most of this need is driven by the desire to stop the
app's habit of mangling text pasted from MS Word into textarea
fields in the web interface.
I've tried to simply change the encoding of the pages and database
tables to UTF-8 along with the html metatags that specify the
character encoding, and the machine's locale is already set to
en_US.UTF-8, but that didn't do the trick. I've also read the i18n
section of Sun's Java tutorial, but I'm confused about the steps
needed to internationalize a webapp; Sun's tutorial only covers a
simple command-line app.
Would anyone be willing to share some tidbits of wisdom on this
topic?
Thank You,
Troy
__________________
Troy Davis
Technology Director
Metaphor Studio
538 Reading Road
Loft 200
Cincinnati, Ohio 45202
Tel: 513-723-0290
Fax: 513-723-0670
http://metaphorstudio.com