Getting Extra Image information in Response from Upload

jschoen Registered Users Posts: 12 Big grins
I have noticed that when I upload images through the API, SmugMug apparently modifies a few of the files, which changes the image's MD5Sum. This causes a problem for my library, which tries to prevent uploading duplicate images.

My current solution is to call smugmug.images.getInfo after the upload completes and compare the MD5Sum it returns with the one I calculated beforehand. If they differ, I can at least tell the caller, or maybe even download the file from SmugMug and overwrite the original.

What I would like is for the upload response to include the MD5Sum of the image as stored on SmugMug, along with the URL to download the file. That would save me a second API call (which adds up when you are doing this for tons of images).

For a previous discussion of this issue, see this thread.
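The check described above (hash the file before upload, then compare with the MD5Sum reported afterwards) could look roughly like the sketch below. The helper names `file_md5` and `upload_was_modified` are made up for illustration, and the remote value is assumed to be the MD5Sum string taken from a smugmug.images.getInfo response; the API call itself is not shown.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large originals are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_was_modified(local_md5, remote_md5):
    """True when the MD5 reported by the service differs from the local one.

    remote_md5 is assumed to come from a smugmug.images.getInfo response;
    the comparison ignores case and surrounding whitespace.
    """
    return local_md5.strip().lower() != remote_md5.strip().lower()
```

Hashing before the upload starts means the local value is ready as soon as the remote one is available, so the only extra cost is the second API call.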

Comments

  • huffman Registered Users Posts: 19 Big grins
    edited December 2, 2012
    In my experience, the only change SmugMug makes to the original is that, if the rotation flag in the JPEG says the picture should be rotated, they immediately rotate it and set the flag to "no rotation". For me, that's all portrait-oriented pictures, since they come out of the camera stored just like landscape ones but with the orientation tag set differently. I presume SmugMug does this so they don't have to deal with one more complexity.

    From my experience, the algorithm they use to rotate is identical to the one used by the Linux command "jhead -autorot *.jpg". It is a lossless algorithm, at least if your picture dimensions are multiples of 8 (or maybe it's 16).

    The relevant section of the man page is below. It looks like the program "jpegtran" that jhead calls is available for Windows as well. This means that you can do the rotation yourself and get the MD5Sum that SmugMug will get. At least that's always worked for me.

    Bill

    JHEAD Documentation:
    Using the 'Orientation' tag of the Exif header, rotate the image so that it is upright. The program jpegtran is used to perform the rotation. This program is present in most Linux distributions. For windows, you need to get a copy of it. After rotation, the orientation tag of the Exif header is set to '1' (normal orientation). The thumbnail is also rotated. Other fields of the Exif header, including dimensions are untouched, but the JPEG height/width are adjusted.

    JPEGTRAN Documentation:
    jpegtran works by rearranging the compressed data (DCT coefficients), without ever fully decoding the image. Therefore, its transformations are lossless: there is no image degradation at all, which would not be true if you used djpeg followed by cjpeg to accomplish the same conversion. But by the same token, jpegtran cannot perform lossy operations such as changing the image quality.
  • jschoen Registered Users Posts: 12 Big grins
    edited December 2, 2012
    Thanks for the info. While your suggestion is valid in this instance, it still seems fragile. I can (and probably will) perform the check and make the changes before the upload, but it still seems brittle from a programming standpoint. What if there are other instances where SmugMug makes changes?

    I guess my point is that, without documentation stating when they change images on upload, having the MD5Sum returned would at least let us know from the return value. For now, I have to check whether the image needs rotating, make the upload call, then use the image ID and key it returns to call smugmug.images.getInfo and get the MD5Sum to know if anything changed. All this adds up when you start considering the overall performance of the application.

    So I would like 2 things:
    1) Documentation from SmugMug on the Uploading page listing when they modify uploaded images
    2) At a minimum, the MD5Sum added to the results returned from an upload (a URL to download the modified file would be nice too, but if the documentation above is provided, its use would be limited anyway)
  • huffman Registered Users Posts: 19 Big grins
    edited December 4, 2012
    I agree about the fragility. I find the API pretty fragile anyway, and this has not been a source of problems compared to other things. I believe I asked, and they said this was the only way they changed images, but that does nothing for the possibility of future changes.

    I think the only reasonable way is to check later. If you are concerned about the performance of checking later, I would think you would be more concerned about the performance of having the API response wait long enough to do a jpegtran rotate followed by an MD5sum. ;-)

    For best performance, you might try uploading all images to one album (after applying the known change, rotation) and then calling the smugmug.images.get method on the album with the "heavy" flag set. It returns a structure with many things, including the MD5 sums (and download URLs) of every picture in the album. That way they can all be checked in one pass, just to make sure there's no other change. If another kind of change ever shows up, you can update your code then, but that won't be needed until SmugMug changes its behavior.

    Bill
  • jschoen Registered Users Posts: 12 Big grins
    edited December 5, 2012
    I am not concerned about the performance of jpegtran, since that is not the bottleneck. The bottleneck is the actual upload of the image to SmugMug and the API calls. Each upload is spawned off into its own thread, so I can upload multiple items at a time. The jhead calls are made before the threads are put into a queue to be run, so they overlap with the uploads and at most cost a few seconds at startup while the queue is being loaded with upload threads.
  • huffman Registered Users Posts: 19 Big grins
    edited December 6, 2012
    That all sounds reasonable.

    But what I was talking about was that the answer to the upload request wouldn't come back from SmugMug until they had done the jpegtran rotation and the md5sum. I suspect that wouldn't bother you either if it took a couple of seconds, for the same reason: you'd just have more transactions going on in parallel because of the extra wait at the end.

    But other users would likely complain loudly about that couple of seconds.

    Worse, I have watched what happens when I upload to a gallery I'm currently viewing. The pictures appear somewhat out of order and tens of seconds after I upload them. I suspect that means SmugMug has completely separate systems, running in parallel, that do all that work and put the pictures into place. Holding a transaction open and waiting tens of seconds to hear back from those other systems, assuming that's what would happen, would be a serious negative.

    So, I'm guessing your request would cost too much elsewhere. I'm guessing you will have to handle that delay in your program by asking for the whole album's worth of MD5 sums later and doing what you need to do.

    Bill
  • jschoen Registered Users Posts: 12 Big grins
    edited December 6, 2012
    Sorry about that, I misunderstood what you meant. That actually seems reasonable; I had not considered it. Thanks for the insight and the thought exercise. While it may not actually fix anything, it has helped me gain a clearer picture of the process.
  • gingerlime Registered Users Posts: 3 Beginner grinner
    edited January 19, 2014
    Just a quick comment, since I bumped into the same issue: shouldn't SmugMug keep the original MD5 sum of the image from before any modifications? They already store lots of data for each image, and this value will be sent with the upload. Storing another 128-bit field with the original MD5 alongside the current MD5 would make a lot of sense. It would make comparing the uploaded file to the one on SmugMug much easier, and avoid duplicates (which would take even more storage for SmugMug). Just my 2 cents.
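The workaround discussed in the comments (auto-rotate locally first, then check a whole album's MD5 sums in one pass) could be sketched as follows. This is only an illustration under assumptions: it relies on huffman's observation that "jhead -autorot" reproduces SmugMug's rotation, it needs jhead and jpegtran installed, and `remote_md5s` is a hypothetical dict of file name to MD5Sum that you would build yourself from a smugmug.images.get response with the "heavy" flag set. All function names are invented for the example.

```python
import hashlib
import shutil
import subprocess
import tempfile
from pathlib import Path


def md5_of(path):
    """Hex MD5 digest of a file's bytes."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def jhead_autorot(path):
    """Rotate a JPEG in place according to its Exif Orientation tag.

    Requires the jhead and jpegtran programs to be installed.
    """
    subprocess.run(["jhead", "-autorot", str(path)], check=True)


def expected_md5(path, rotate=jhead_autorot):
    """MD5 of an auto-rotated copy; the original file is left untouched."""
    with tempfile.TemporaryDirectory() as tmp:
        copy = Path(tmp) / Path(path).name
        shutil.copy2(path, copy)
        rotate(copy)
        return md5_of(copy)


def find_mismatches(local_paths, remote_md5s, rotate=jhead_autorot):
    """Return file names whose expected MD5 differs from the album's.

    remote_md5s is a hypothetical {file name: MD5Sum} dict built from one
    album-wide listing; names missing from it are reported as mismatches.
    """
    mismatched = []
    for path in local_paths:
        name = Path(path).name
        if remote_md5s.get(name, "").lower() != expected_md5(path, rotate):
            mismatched.append(name)
    return mismatched
```

Anything returned by `find_mismatches` would point to a change other than the known rotation (or a missing upload), which is the case the thread worries about. The `rotate` parameter is there so the rotation step can be swapped out, for example on a machine without jhead.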