rsync to Smugmug?

bmcent1 Registered Users Posts: 24 Big grins
Does anything like this exist or could it be possible to program with the API?

I occasionally lose track of which photos I've uploaded. I usually upload to albums named by month and I keep photos organized the same way locally.

Is it possible to run an rsync to Smugmug to upload only those photos not already in the album?

Comments

  • luke_church Registered Users Posts: 507 Major grins
    edited February 26, 2006
    bmcent1 wrote:
    Is it possible to run an rsync to Smugmug to upload only those photos not already in the album?

    It would be relatively easy to do with the MD5 hash computations on the images... The only issue is that *any* changes to the files would result in the file being reuploaded. Would this be acceptable?
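
A minimal sketch of the whole-file MD5 comparison described above, in Python. The names `file_md5`, `files_to_upload` and `remote_md5s` are hypothetical, and how the set of already-uploaded hashes is obtained from SmugMug is not shown in this thread, so it is simply passed in as an assumption.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a whole file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_upload(local_dir, remote_md5s):
    """List files in local_dir whose MD5 is not in remote_md5s.

    remote_md5s is assumed to be a set of hex digests for the photos
    already in the album (hypothetical input, not a real API call).
    """
    pending = []
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file() and file_md5(path) not in remote_md5s:
            pending.append(path)
    return pending
```

Because the digest covers every byte, a file whose keywords or EXIF data were edited after upload would show up as pending again, which is the trade-off raised above.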

    Luke
  • rutt Registered Users Posts: 6,511 Major grins
    edited February 26, 2006
    Try sm_tool.py. You can download here: http://www.michaelmcdaniel.net/files/sm_tool.py

    It keeps information locally instead of using the API to compare, but when I wrote it I wanted it to behave like rsync.
    If not now, when?
  • bmcent1 Registered Users Posts: 24 Big grins
    edited February 26, 2006
    luke_church wrote:
    It would be relatively easy to do with the MD5 hash computations on the images... The only issue is that *any* changes to the files would result in the file being reuploaded. Would this be acceptable?

    MD5 hashing would not only be acceptable, it would be AWESOME :-)

    I took a look at the JFIF file format and it appears there are separate parts to a JPEG image: the image itself, and then COM (comment) sections, an APP13 section which might contain IPTC, and EXIF...

    It would be best if the MD5 hash was computed only against the image. This way, if keywords or EXIF changed, the MD5sum would not change, but if the image itself is redone/different, it would be apparent. Alternately, storing both the image-only MD5 and the whole-file MD5 could be useful, but then it's more complicated.

    On Linux, a tool called jhead can strip a JPEG down to strictly the image portion with 'jhead -purejpg', so it might be something that's already possible within the JPEG library.
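
The image-only hash described above could be sketched roughly as follows. This is not how jhead works internally; it is a simplified walk over the JPEG segments that skips APPn (EXIF, IPTC, thumbnails) and COM sections and hashes everything else. The function name `image_only_md5` is hypothetical, and the sketch assumes a well-formed file with no padding bytes between segments.

```python
import hashlib
import struct


def image_only_md5(data: bytes) -> str:
    """Hex MD5 of a JPEG's bytes, ignoring APPn and COM segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    digest = hashlib.md5()
    pos = 2  # first segment starts right after SOI
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: the compressed image data follows, so
            # hash everything from here to the end of the file.
            digest.update(data[pos:])
            return digest.hexdigest()
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        is_metadata = 0xE0 <= marker <= 0xEF or marker == 0xFE
        if not is_metadata:
            digest.update(data[pos:pos + 2 + length])
        pos += 2 + length
    raise ValueError("no start-of-scan marker found")
```

Two files that differ only in their EXIF or comment sections would get the same digest here, while any change to the quantization tables, Huffman tables or scan data would change it.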
  • bmcent1 Registered Users Posts: 24 Big grins
    edited February 26, 2006
    rutt wrote:
    Try sm_tool.py. You can download here: http://www.michaelmcdaniel.net/files/sm_tool.py

    It keeps information locally instead of using the API to compare, but when I wrote it I wanted it to behave like rsync.

    Thanks for the link. That's a cool utility. I'm not sure it can help me because I've got photos on a couple different computers (my fault :-) and also on SmugMug and Flickr. I'm really hoping for MD5sums or some other SmugMug or API mechanism which saves me from uploading duplicates.
  • luke_church Registered Users Posts: 507 Major grins
    edited February 27, 2006
    bmcent1 wrote:
    I took a look at the JFIF file format and it appears there are separate parts to a JPEG image.

    No kidding there are ;) I've recently written a custom JPEG compressor by hand (don't ask), so I know more about the insides of the JPEG/JFIF file format than I ever wish to....

    There are also a bunch of other items, such as APP1 (off the top of my head), thumbnails, etc.

    bmcent1 wrote:
    It would be best if the MD5 hash was computed only against the image. This way, if keywords or EXIF changed, the MD5sum would not change, but if the image itself is redone/different, it would be apparent.

    Unfortunately that's a slightly naive view. It depends on the manner in which the keywords were changed. Most JPEG compressors will also subtly alter the image data when they save it, even if they've been written well enough not to recompress the image data (which most haven't). They may do things like recomputing the Huffman tables or modifying the restart markers in the file. Even the smallest modification will result in this not working.

    bmcent1 wrote:
    Alternately storing both the image-only MD5 and the whole-file MD5 could be useful but then it's more complicated.

    There is another advantage of whole-file MD5 in that this is what Smugmug uses. If you upload an image and query for the MD5 of that image, it is 'sm-'WholeFileMD5'-sm'

    I don't understand why 'sm-' etc. has been added, but since they store it, it would be even easier to use to prevent reuploading.

    There is some possibility of trying to do clever things to prevent duplicates, I'll have a think about that in a couple of weeks.

    Cheers,

    Luke
  • luke_church Registered Users Posts: 507 Major grins
    edited February 27, 2006
    bmcent1 wrote:
    I'm really hoping for MD5sums or some other SmugMug or API mechanism which saves me from uploading duplicates.

    We can definitely get it right on the assumption that the metadata doesn't change. We'll have to see about image-only comparisons. Quick wins first and all that ;)

    Luke
  • bmcent1 Registered Users Posts: 24 Big grins
    edited February 27, 2006
    luke_church wrote:
    No kidding there are ;) I've recently written a custom JPEG compressor by hand (don't ask)

    Heh. Okay, I won't. But it reminds me of when I wrote a silly general purpose compression program to understand how compressors worked. :-)

    luke_church wrote:
    There is another advantage of whole-file MD5 in that this is what Smugmug uses. If you upload an image and query for the MD5 of that image, it is 'sm-'WholeFileMD5'-sm'

    I don't understand why 'sm-' etc. has been added, but since they store it, it would be even easier to use to prevent reuploading.

    Oh, awesome. Since it's already there, hopefully it should be easier to expose via the API.

    luke_church wrote:
    There is some possibility of trying to do clever things to prevent duplicates, I'll have a think about that in a couple of weeks.
    Luke

    Cool. Another way I was considering was some sort of image comparison, maybe built on top of ImageMagick or possibly something in the netpbm toolkit. I was thinking of overlaying two images (like Photoshop layers + difference) and treating either an all-black frame as identical, or possibly differences below an arbitrary threshold as similar enough.

    Just throwing that out there as an idea. MD5sums will definitely make me happy! :-)
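
The overlay-and-difference idea above can be sketched in plain Python, assuming both images have already been decoded to raw 8-bit samples of the same dimensions (for example by ImageMagick or a netpbm tool, which is left out here). The names `difference_ratio` and `looks_same`, and the default thresholds, are hypothetical choices for illustration only.

```python
def difference_ratio(samples_a: bytes, samples_b: bytes, tolerance: int = 0) -> float:
    """Fraction of channel samples that differ by more than tolerance.

    A result of 0.0 with tolerance 0 corresponds to the "all black"
    difference frame, meaning the two images are identical.
    """
    if len(samples_a) != len(samples_b):
        raise ValueError("images must have the same dimensions")
    if not samples_a:
        return 0.0
    differing = sum(
        1 for a, b in zip(samples_a, samples_b) if abs(a - b) > tolerance
    )
    return differing / len(samples_a)


def looks_same(samples_a: bytes, samples_b: bytes,
               tolerance: int = 8, max_ratio: float = 0.01) -> bool:
    """Treat images as similar enough if few samples differ noticeably."""
    return difference_ratio(samples_a, samples_b, tolerance) <= max_ratio
```

Unlike an MD5 comparison, this would tolerate small recompression differences, at the cost of having to download and decode the remote image before comparing.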