Encoding for caption in pictures uploaded by smugmug uploader 1.0 on MacOS X

kajhkajh Registered Users Posts: 14 Big grins
edited May 11, 2007 in SmugMug Support
Hi!

I'm new to smugmug and I'm trying to find the best way to upload pictures from MacOS X. I have the pictures in iPhoto. Since the smugmug uploader 1.0 does not support copying iPhoto comments into captions in smugmug, I used Caption Buddy to copy the iPhoto comment for a picture into the jpg file as iptc properties [1].

See the iptc properties in the following screenshot from Preview on osx at http://folk.uio.no/kajh/tmp/iptc.png

I tried to upload this picture with smugmug uploader 1.0. I get the iptc caption into the picture in smugmug, but the character encoding seems to be wrong.

The caption below the picture shows "test Ã¦Ã¸Ã¥". If I change the encoding for the page to utf-8 in Safari, the caption is displayed as "test æøå".

You can see this caption at http://hkbilder.smugmug.com/gallery/1066244

The same character encoding issue goes for the keywords for the picture.

What is the best way to solve this issue? Should the smugmug uploader convert the character encoding to iso-8859-1?

-Kaj :)


[1] thank you for implementing support for iPhoto comments in version 2.0 of the smugmug uploader for osx! :)

Comments

  • AndyAndy Registered Users Posts: 50,016 Major grins
    edited December 30, 2005
    Hi Kaj - not sure what the best route to take is here. It may be, "wait for V2." I'll ask Ben to comment if he can..

    Cheers!
  • kajhkajh Registered Users Posts: 14 Big grins
    edited December 30, 2005
    Andy wrote:
    Hi Kaj - not sure what the best route to take is here. It may be, "wait for V2." I'll ask Ben to comment if he can..


    Hi!

    Thanks for answering and for letting Ben know about this thread!

    I think the encoding issues are present in the latest beta of v2 too. It would be great if you could fix the encoding issue before release of v2. There are quite a few ppl needing more than just the ascii characters ;-)


    -Kaj :)
  • kajhkajh Registered Users Posts: 14 Big grins
    edited December 30, 2005
    Hi again :)

    I just gave the Drag & Drop Upload BETA a try. It worked very well in Safari on osx 10.4.3 except for two possible bugs:

    1) The keywords in the iptc tags for the photos didn't make it into smugmug

    2) I see the same character encoding issues as described above.

    I use Caption Buddy to copy the iPhoto meta info into iptc tags.


    -Kaj :)
  • cabbeycabbey Registered Users Posts: 1,053 Major grins
    edited December 30, 2005
    Part of the problem you may be having is that IPTC doesn't have any implicit code page included... it's just a binary blob. Most applications assume that they can interpret that blob via whatever their local locale is, or ascii if they don't have one. From the looks of things, the smugmug side is interpreting as ascii/iso-8859-1, whereas your application is interpreting them as utf-8, or possibly Norwegian?

    You need both ends of the translation to agree on what encoding is being used in order for clear transmission of data. In the example you show, you "got lucky" in that the UTF-8 data you entered as "æøå" (0xC3A6C3B8C3A5) happens to be "valid" (if meaningless) iso-8859-1 text, "Ã¦Ã¸Ã¥", but if you had used different code points in UTF-8 they might NOT have been legit printable 8859-1 values (which might explain the ones that didn't make it). Anyway, since they were legit, smugmug stored them and then fed them back to your browser along with the rest of the 8859-1 (a superset of ascii) page. In addition to the uploader, you might need to get someone on the backend team involved to see if they can even store non-ascii comment data. (I'd like to hope so, but depending on the database they're using on the backend, it might not have been enabled by default.)
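
The byte-level mix-up described above can be reproduced in a few lines of Python. This is only a sketch: it assumes the caption was written into the IPTC block as UTF-8 and then read back as ISO-8859-1, which is what the symptoms suggest but has not been confirmed by anyone at SmugMug.

```python
# The caption as typed in iPhoto.
caption = "test æøå"

# Assumption: the caption is stored in the IPTC block as UTF-8 bytes.
raw = caption.encode("utf-8")
print(raw.hex())  # 7465737420c3a6c3b8c3a5

# A reader that assumes ISO-8859-1 maps every single byte to one character,
# so each two-byte UTF-8 sequence shows up as two unrelated characters.
misread = raw.decode("iso-8859-1")
print(misread)  # test Ã¦Ã¸Ã¥

# Reading the very same bytes as UTF-8 recovers the original text,
# which is what switching the page encoding in the browser does.
recovered = raw.decode("utf-8")
print(recovered)  # test æøå

# Not every character is this "lucky": the UTF-8 form of "Å" is C3 85, and
# 0x85 is an invisible C1 control character in ISO-8859-1.
control = "Å".encode("utf-8").decode("iso-8859-1")
print(repr(control))  # 'Ã\x85'
```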

    update: some of the release notes related to the addition of the AJAX editing indicate that "non-latin foreign characters" in comments are supported, or at least they're fixing bugs related to them. That's a good sign.
    SmugMug Sorcerer - Engineering Team Champion for Commerce, Finance, Security, and Data Support
    http://wall-art.smugmug.com/
  • verseguruverseguru Registered Users Posts: 3 Beginner grinner
    edited January 1, 2006
    IPTC does actually support character set encoding. It uses dataset 1:90 (Coded Character Set) to identify the encoding. Unfortunately it's not widely used because it employs a rather complex system (ISO 2022 control sequences) that is confusingly documented. Some libraries do already have support for it (libiptc, possibly ImageMagick) and I suspect PHP's image functions do, or if not should soon. The most common implementation is that IPTC capable apps write Latin-1 or MacRoman encodings but generally without setting 1:90. UTF-8 capable IPTC apps set 1:90 (to ESC%G / hex 1B2547).
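
As a rough illustration of what honoring 1:90 could look like, here is a Python sketch. It assumes the raw IPTC dataset stream has already been extracted from the JPEG, handles only standard (non-extended) dataset lengths, and recognizes only the UTF-8 escape sequence. The names `iter_datasets`, `decode_caption` and `make_dataset` are made up for this example and do not come from any library mentioned in this thread.

```python
import struct

UTF8_ESCAPE = b"\x1b%G"  # ESC % G, i.e. hex 1B2547


def iter_datasets(iim: bytes):
    """Yield (record, dataset, value) for each dataset in a raw IPTC stream."""
    pos = 0
    # Each dataset starts with the tag marker 0x1C, followed by the record
    # number, the dataset number and a 2-byte big-endian length.
    while pos + 5 <= len(iim) and iim[pos] == 0x1C:
        record, dataset, length = struct.unpack_from(">BBH", iim, pos + 1)
        if length & 0x8000:
            raise ValueError("extended-length datasets are not handled here")
        start = pos + 5
        yield record, dataset, iim[start:start + length]
        pos = start + length


def decode_caption(iim: bytes, fallback: str = "iso-8859-1"):
    """Decode the caption (2:120), using UTF-8 only when 1:90 announces it."""
    charset = fallback
    caption = None
    for record, dataset, value in iter_datasets(iim):
        if (record, dataset) == (1, 90) and value == UTF8_ESCAPE:
            charset = "utf-8"
        elif (record, dataset) == (2, 120):
            caption = value
    return None if caption is None else caption.decode(charset)


def make_dataset(record: int, dataset: int, value: bytes) -> bytes:
    """Build one dataset, used here only to create demo input."""
    return struct.pack(">BBBH", 0x1C, record, dataset, len(value)) + value


payload = "test æøå".encode("utf-8")
tagged = make_dataset(1, 90, UTF8_ESCAPE) + make_dataset(2, 120, payload)
untagged = make_dataset(2, 120, payload)

print(decode_caption(tagged))    # test æøå
print(decode_caption(untagged))  # test Ã¦Ã¸Ã¥
```

Falling back to ISO-8859-1 when 1:90 is missing is a choice made for this sketch; as noted above, files without 1:90 may just as well contain MacRoman.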

    The solution would be for SmugMug to implement IPTC 1:90 support, and/or to respect incoming UTF-8 XML encoding (for the XML-RPC) and then convert to Latin-1 (which is what they use for HTML encodings at any rate). UTF-8 to Latin-1 conversion only involves a few lines of code.

    It should also be possible to send Latin-1 encoded (ISO 8859-1) text their way. I tried this using the upload API, however, and it choked and failed to even process the image. I haven't tried uploading Latin-1 in IPTC yet though.
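
To give an idea of the "few lines of code", here is a minimal Python sketch of a UTF-8 to Latin-1 conversion. The function name `utf8_to_latin1` is hypothetical, and this is not how SmugMug's backend is known to work; it only shows the conversion step itself.

```python
def utf8_to_latin1(data: bytes, errors: str = "replace") -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-1 (Latin-1).

    Characters with no Latin-1 equivalent become "?" under the default
    errors="replace". Input that is not valid UTF-8 raises UnicodeDecodeError.
    """
    text = data.decode("utf-8")
    return text.encode("iso-8859-1", errors=errors)


converted = utf8_to_latin1("test æøå".encode("utf-8"))
print(converted)  # b'test \xe6\xf8\xe5'

# The euro sign does not exist in Latin-1, so it is replaced.
print(utf8_to_latin1("€".encode("utf-8")))  # b'?'
```

The conversion is lossy for anything outside Latin-1, which is why storing and serving UTF-8 end to end would be the more robust fix.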

    Incidentally, Kaj, PictureSync does now write IPTC 1:90 itself but that doesn't solve anything with regard to SmugMug.
  • cabbeycabbey Registered Users Posts: 1,053 Major grins
    edited January 1, 2006
    verseguru wrote:
    IPTC does actually support character set encoding. It uses dataset 1:90 (Coded Character Set) to identify the encoding. Unfortunately it's not widely used because it employs a rather complex system (ISO 2022 control sequences) that is confusingly documented.
    Wow, I've been playing with IPTC header tags off and on for a year or so, and I've never run across 1:90 before. Though with that added to a google search I've run before, I've finally found a publicly accessible *complete* copy of the spec. :)

    And after paging through it for about 20 minutes I've come to the conclusion it is probably the single worst written specification I've ever had to read. :puke
  • kajhkajh Registered Users Posts: 14 Big grins
    edited January 12, 2006
    verseguru wrote:
    The solution would be for SmugMug to implement IPTC 1:90 support, and/or to respect incoming UTF-8 XML encoding (for the XML-RPC) and then convert to Latin-1 (which is what they use for HTML encodings at any rate). UTF-8 to Latin-1 conversion only involves a few lines of code.

    It should also be possible to send Latin-1 encoded (ISO 8859-1) text their way. I tried this using the upload API, however, and it choked and failed to even process the image. I haven't tried uploading Latin-1 in IPTC yet though.

    Incidentally, Kaj, PictureSync does now write IPTC 1:90 itself but that doesn't solve anything with regard to SmugMug.


    Hi!

    Thank you for posting this information!

    PictureSync is a great program, and it would be sad if it weren't possible to use PictureSync for uploading pictures from iPhoto for those of us who need more than just the ascii characters.

    I have used PictureSync a bit when uploading to Flickr (from iPhoto) and PictureSync is a very nice program to work with! Please :) make it possible to use PictureSync with Smugmug too.

    I have two questions to the Smugmug ppl here :)

    1)
    Do you think it will be possible for you to make the upload API support utf-8 or in some other way make PictureSync upload work with iPhoto / 8-bit characters, ref the post I quote above?

    2)
    Will the new Smugmug uploader for MacOS X support 8-bit characters in comments and keywords in iPhoto?

    Even with a Smugmug uploader for MacOS X with support for iPhoto (which would be a great thing!) I wouldn't want to not be able to use PictureSync.


    -Kaj, a happy PictureSync user and a not so happy (yet) Smugmug user :)
  • kajhkajh Registered Users Posts: 14 Big grins
    edited May 11, 2007
    kajh wrote:
    The caption below the picture shows "test Ã¦Ã¸Ã¥". If I change the encoding for the page to utf-8 in Safari, the caption is displayed as "test æøå".

    Hi!

    Any news about this issue?


    -Kaj :)