UTF-8 and Firefox

dir3wolf Registered Users Posts: 3 Beginner grinner
edited September 15, 2005 in SmugMug Support
I have a problem with Firefox not recognizing the UTF-8 charset properly. It always sets the character encoding to Western (ISO-8859-1).

I noticed that the first tag in my SmugMug page is:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

I saved the SmugMug page on my computer and removed lang="en" from that tag. Afterwards Firefox correctly identified the charset as UTF-8. Obviously Firefox reads the page language as Western from the first tag and automatically sets the page to Western, ignoring the later <meta> tag which sets it to UTF-8.
Is there any way around this?
Most of my family and friends use Firefox now, as it is much more customizable than IE, and on all their computers my page shows Croatian text incorrectly.

Thanks

Comments

  • Andy Registered Users Posts: 50,016 Major grins
    edited September 9, 2005
    hiya wolf, welcome to dgrin :D

    one of the experts will chime in to help, i'm sure
  • dir3wolf Registered Users Posts: 3 Beginner grinner
    edited September 12, 2005
    Andy wrote:
    hiya wolf, welcome to dgrin :D

    one of the experts will chime in to help, i'm sure
    Well... thank you for your warm welcome!

    But obviously nobody knows how to fix my problem... or those who know don't care to answer. My page is still a mess.
  • luke_church Registered Users Posts: 507 Major grins
    edited September 12, 2005
    dir3wolf wrote:
    I have a problem with Firefox recognizing UTF-8 charset properly. It always sets the character encoding to Western (ISO-8859-1).
    Could you please send a sample URL for the problem? I suspect it can be fixed by adjusting the language settings (Tools -> Options -> Languages).

    dir3wolf wrote:
    ... page to Western, ignoring the later <meta> tag which sets it to UTF-8.
    Is there any way around this?
    I'm *guessing* that Firefox might interpret this as the page being available in both languages, rather than overriding the previous META tag, so setting UTF-8 as the default and Croatian as the preferred language might fix this, by instructing it that you would prefer to use that mapping rather than the default English, ISO-8859-1.

    If you give me a URL, I'll try it out for you :)

    Cheers,

    Luke
  • Andy Registered Users Posts: 50,016 Major grins
    edited September 12, 2005
    hey wolf -
    dir3wolf wrote:
    ... or those who know don't care to answer.

    stick around a while - that's *absolutely* not the spirit of dgrin. you'll find that ordinary people take extraordinary measures to go out of their way to help folks.
  • devbobo Registered Users, Retired Mod Posts: 4,339 SmugMug Employee
    edited September 12, 2005
    dir3wolf wrote:
    Well... thank you for your warm welcome!

    But obviously nobody knows how to fix my problem... or those who know don't care to answer. My page is still a mess.
    Hey wolf,

    I have been having a play around, and I think I have a workaround but it's kinda ugly.

    eg. išao

    Using the Character Map in Windows, you can determine that š is 161 hex or 353 decimal.

    So instead of typing išao you can type...
    [PHP]i&#353;ao[/PHP]

    [PHP]&#353;[/PHP]
    being the escape code for this character.

    As I said, it's a bit ugly but it works.

    Hope this helps some.

    David
    David Parry
    SmugMug API Developer
    My Photos
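
The workaround above, replacing each non-ASCII character with its numeric character reference, can be automated rather than looked up by hand. A minimal Python sketch, assuming the description text is available as a string; the function name `to_char_refs` is made up for this illustration and is not part of any SmugMug tooling:

```python
def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal HTML character reference."""
    # The built-in "xmlcharrefreplace" error handler emits &#NNN; for each
    # character that cannot be encoded as ASCII.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(to_char_refs("išao"))  # i&#353;ao
```

Because the result is pure ASCII, it should render the same way whether the browser decodes the page as ISO-8859-1 or as UTF-8.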
  • {JT} Registered Users Posts: 1,016 Major grins
    edited September 12, 2005
    I tried your fix on our testing site - and it did nothing for the problem :(

    It may be a local fix that works only on your machine AFTER you have saved the file. I really wish it was a valid fix, because I know this problem is affecting a lot of people.

    dir3wolf wrote:
    Well... thank you for your warm welcome!

    But obviously nobody knows how to fix my problem... or those who know don't care to answer. My page is still a mess.
  • luke_church Registered Users Posts: 507 Major grins
    edited September 12, 2005
    {JT} wrote:
    I tried your fix on our testing site - and it did nothing for the problem :(
    JT, would it be possible to parse every 'non-standard, non-html' character into escape codes before saving to the webserver? I would be surprised if this wasn't heavily optimised in most web libraries due to its use in defence against XSS attacks.

    Hideously ugly, but if it's hurting a lot of people?

    Just a thought,

    Luke
  • {JT} Registered Users Posts: 1,016 Major grins
    edited September 12, 2005
    Well, it is really not hurting a lot of people. I don't like to make any one group feel like they are not a priority though :)

    The problem is multifaceted - and in this thread we are only talking about one aspect of it. Don is looking into it too though, since he controls the DB and webserver; I just make the UI work :)

    luke_church wrote:
    JT, would it be possible to parse every 'non-standard, non-html' character into escape codes before saving to the webserver? I would be surprised if this wasn't heavily optimised in most web libraries due to its use in defence against XSS attacks.

    Hideously ugly, but if it's hurting a lot of people?

    Just a thought,

    Luke
  • dir3wolf Registered Users Posts: 3 Beginner grinner
    edited September 13, 2005
    Thank you all for trying. I have to bite my tongue and change my opinion about this forum (you were right, Andy).
    luke_church wrote:
    ... so setting UTF-8 as the default and Croatian as the preferred language might fix this, by instructing it that you would prefer to use that mapping rather than the default English, ISO-8859-1.

    If you give me a URL, I'll try it out for you :)
    That was the first thing I tried. No help. The link is http://dir3wolf.smugmug.com/gallery/792641
    {JT} wrote:
    I tried your fix on our testing site - and it did nothing for the problem :(

    It may be a local fix that works only on your machine AFTER you have saved the file. I really wish it was a valid fix, because I know this problem is affecting a lot of people.
    Yep, I tried it again without even removing lang="en" and the page displayed correctly. Strange... So when I save the page on my computer and open it from there, Firefox correctly identifies the encoding as UTF-8, but when I open it online it always falls back to Western. It's not my field, but could it be something on the SmugMug side then... some scripting or CSS or something?

    devbobo wrote:
    Hey wolf,

    I have been having a play around, and I think I have a workaround but it's kinda ugly.
    Even if it's not quite an elegant solution, I'll be forced to use your solution if I want my family to be able to read gallery descriptions.

    Thanks everybody... if the greasehanded mechanics come up with the proper solution to this problem be sure to fly it high in the sky.
  • Scala Registered Users Posts: 95 Big grins
    edited September 13, 2005
    For me the special characters display correctly if I type them in the old-fashioned customize gallery screen (not using the new AJAX dynamic edit feature).

    Let's say I type in the character š on the customize gallery screen (gallery description box). It displays correctly but as soon as I edit the description with the new dynamic feature the character changes to two characters Å¡.

    In my book this is called UTF-8 misinterpreted as ISO-8859-1 :) Apparently the old feature saves text in the ISO-8859-1 encoding and the new feature in the UTF-8 encoding. Selecting the UTF-8 encoding in the browser displays the characters correctly.

    It's worth noting that special characters typed in the customize gallery screen display incorrectly if character encoding is set to UTF-8 from Firefox (or IE I guess).

    It will be interesting to see how the people at Smugmug will be able to fix this and keep all existing special characters displaying correctly. UTF-8 is of course the way to go these days...
    My smugmug site: www.majakorpi.net
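
The two-character garbage described above is what you get when UTF-8 bytes are decoded as ISO-8859-1. A small Python sketch that reproduces it and then undoes it; the helper names `misread_utf8_as_latin1` and `repair` are hypothetical, chosen only for this illustration:

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as ISO-8859-1 (the wrong charset)."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(mojibake: str) -> str:
    """Undo the misreading: recover the original bytes, then decode them as UTF-8."""
    return mojibake.encode("iso-8859-1").decode("utf-8")


if __name__ == "__main__":
    garbled = misread_utf8_as_latin1("š")
    print(garbled)          # Å¡
    print(repair(garbled))  # š
```

The character š is stored in UTF-8 as the two bytes C5 A1, and ISO-8859-1 maps those bytes to Å and ¡, which matches the symptom reported in the post.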
  • luke_church Registered Users Posts: 507 Major grins
    edited September 13, 2005
    dir3wolf wrote:
    Yep, I tried it again without even removing lang="en" and the page displayed correctly. Strange... So when I save the page on my computer and open it from there, Firefox correctly identifies the encoding as UTF-8, but when I open it online it always falls back to Western. It's not my field, but could it be something on the SmugMug side then... some scripting or CSS or something?
    I can confirm this behaviour...

    I confess I'm a little out of my depth on international charsets; it's not something I've done anything significant with, yet...

    However, I have a suspicion. I've done a packet dump of the HTTP headers:

    GET /gallery/792641/2 HTTP/1.1
    Host: dir3wolf.smugmug.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.7.10) Gecko/20050717 Firefox/1.0.6
    Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Language: hr,en-gb;q=0.7,en;q=0.3
    Accept-Encoding: gzip,deflate
    Accept-Charset: UTF-8,*
    Keep-Alive: 300
    Connection: keep-alive
    Referer: http://dir3wolf.smugmug.com/gallery/792641
    Cookie: depth=32; res=1280x1024

    HTTP/1.1 200 OK
    Date: Tue, 13 Sep 2005 10:34:33 GMT
    Server: Apache
    X-Powered-By: smugmug/1.2.0
    Set-Cookie: SMSESS=97a5156771f4df7646a439b4974fe192; path=/; domain=.smugmug.com
    Cache-Control: private, max-age=1, must-revalidate
    Pragma:
    X-Extra: 0.2676408290863
    Content-Encoding: gzip
    Vary: Accept-Encoding
    ETag: sm-7c30b1cff06301b4ac5d27dd877ca651-sm
    Content-Length: 6228
    Keep-Alive: timeout=5, max=100
    Connection: Keep-Alive
    Content-Type: text/html; charset=ISO-8859-1

    I suspect that this will be the problem. The web server is effectively forcing the browser to treat the text as ISO-8859-1. When the file is saved, this information is stripped. Hence when it's reloaded, Firefox can interpret it however it likes and reverts to UTF-8.

    I really need to read the HTTP specification to see what might be done about this, but it might at least be a pointer to the problem?

    Luke
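
The response above declares `charset=ISO-8859-1` in its `Content-Type` header, and under the HTML specification a charset given in the HTTP header takes precedence over a `<meta>` declaration inside the page. A rough Python sketch of that precedence, using the standard library's `email.message.Message` to parse the header value; `header_charset`, `effective_charset`, and their parameters are hypothetical names, and real browsers apply further rules (byte order marks, user overrides, sniffing) that are left out here:

```python
from email.message import Message
from typing import Optional


def header_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased, or None when absent


def effective_charset(content_type: str, meta_charset: Optional[str]) -> Optional[str]:
    """Prefer the HTTP header's charset; fall back to the page's <meta> charset."""
    from_header = header_charset(content_type)
    if from_header is not None:
        return from_header
    return meta_charset.lower() if meta_charset else None


if __name__ == "__main__":
    # Served online: the header wins, so the <meta> UTF-8 declaration is ignored.
    print(effective_charset("text/html; charset=ISO-8859-1", "UTF-8"))  # iso-8859-1
    # Saved to disk: no HTTP header exists, so the <meta> declaration applies.
    print(effective_charset("text/html", "UTF-8"))  # utf-8
```

If this is indeed the cause, the fix would have to happen on the server side, for example by sending `Content-Type: text/html; charset=UTF-8` once the stored text is consistently UTF-8.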
  • Scala Registered Users Posts: 95 Big grins
    edited September 13, 2005
    luke_church wrote:
    Content-Type: text/html; charset=ISO-8859-1
    I suspect that this will be the problem. The web server is effectively forcing the browser to treat the text as ISO-8859-1. When the file is saved, this information is stripped. Hence when it's reloaded, Firefox can interpret it however it likes and reverts to UTF-8.
    Luke
    I suspected this as well but didn't have the energy to actually check the headers. Good catch!
    My smugmug site: www.majakorpi.net
  • Scala Registered Users Posts: 95 Big grins
    edited September 14, 2005
    This thread was moved to the customization forum although the issue here is a bug.
    My smugmug site: www.majakorpi.net
  • Mike Lane Registered Users Posts: 7,106 Major grins
    edited September 15, 2005
    http://www.contentwithstyle.co.uk/Articles/7/utf-8-documents-with-a-lot-of-character

    Not sure if this'll add anything to the issue, but there it is.
    Y'all don't want to hear me, you just want to dance.

    http://photos.mikelanestudios.com/
  • Mike Lane Registered Users Posts: 7,106 Major grins
    edited September 15, 2005
    Scala wrote:
    This thread was moved to the customization forum although the issue here is a bug.
    Lots and lots of threads to move. This just got caught in the fray. I'm moving it back.
    Y'all don't want to hear me, you just want to dance.

    http://photos.mikelanestudios.com/