MD5 hash comparison no longer detects duplicates?

Jmaz5 Registered Users Posts: 3 Beginner grinner
I set up a Python script that uploads albums, and it worked fine until I switched to the new SmugMug design. It first retrieves the MD5 hashes for each album, then compares them against the MD5 hash of each local file to detect duplicates before uploading. The duplicate detection no longer works, and duplicates are uploaded regardless. Any ideas on why this has changed? I can't use filenames to detect duplicates because they often get reused. I haven't had much time to insert breakpoints and troubleshoot yet, but I am wondering if there was a change to the API that would affect this.
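A minimal sketch of the check described above, assuming the album's MD5 hashes have already been fetched into a Python collection. The names `file_md5` and `files_to_upload` are hypothetical and not taken from the original script, and no SmugMug API call is shown.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_upload(local_paths, remote_hashes):
    """Return the local paths whose MD5 is not already in the album.

    remote_hashes is assumed to be an iterable of hex digests fetched
    from the album beforehand (hypothetical input, any letter case).
    """
    known = {h.lower() for h in remote_hashes}
    pending = []
    for path in local_paths:
        digest = file_md5(path)
        if digest not in known:
            pending.append(Path(path))
            known.add(digest)  # also skip duplicates within this batch
    return pending
```

This only works if the hash the service reports is the hash of the exact bytes that were uploaded; if the stored file differs by even one byte, every comparison will miss.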

Comments

  • wiredprairie Registered Users Posts: 12 Big grins
    edited August 29, 2013
    Long ago, I noticed that if SmugMug makes any automatic changes to an image, the checksum no longer matches the original file. I couldn't always predict when that would happen, either. Maybe something about 2.0 is always making a subtle change to the original?
  • gingerlime Registered Users Posts: 3 Beginner grinner
    edited January 18, 2014
    Having similar issue
    I just bumped into the same issue. I'm writing a small command-line tool to upload images to SmugMug (look for gingerlime/smugsync on GitHub). I have built duplicate detection into it: it grabs the MD5 hashes from the album and compares them to the hashes of the local files before uploading.

    Unfortunately, it seems that some images get a different MD5 hash after being uploaded. I compared my code's output to the local md5sum utility, and it's calculating the sum correctly... Strange.

    Has anybody managed to work out how the MD5 hashes get calculated, or why they seem to change after the file is uploaded?
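One way to narrow this down is to list, per filename, where the local hash and the hash reported for the album disagree. This is a rough diagnostic sketch, assuming the album data is already available as a dict mapping filename to MD5 hex digest. That shape and the name `find_hash_mismatches` are hypothetical, and the real API response may be structured differently.

```python
import hashlib
from pathlib import Path


def find_hash_mismatches(local_dir, remote_md5_by_name):
    """Return (filename, local_md5, remote_md5) for files whose hashes differ.

    remote_md5_by_name is an assumed dict of filename -> hex digest,
    built beforehand from whatever the album listing returns.
    """
    mismatches = []
    for path in sorted(Path(local_dir).iterdir()):
        if not path.is_file():
            continue
        remote = remote_md5_by_name.get(path.name)
        if remote is None:
            continue  # not in the album under this name
        # Reads the whole file into memory; fine for a quick check.
        local = hashlib.md5(path.read_bytes()).hexdigest()
        if local != remote.lower():
            mismatches.append((path.name, local, remote))
    return mismatches
```

Matching by filename is only for diagnosis here. Checking whether the mismatched files share a trait might hint at which uploads get altered.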