
  • Another photo clean-up

    I've been a bit busy lately. Before Christmas my middle son asked if I would get the images from their old computer and put them on a portable disk, ready for when they get a new computer.

    "Sure", I said.

    So I took the hard drive out of their (unused for some months) computer and installed it in mine. Booted up and the drive is recognised and readable - great.

    It had about 65,000 images and video clips on it - uh-oh!

    Some were in folders named for the subject, some were just dumped into the Pictures folder.

    After a bit of trial and error, I decided to:
    • Keep most of the subject folders as is and re-arrange the others into a Year > Month structure based on date taken. Year > Month > Day created too many folders with just one or two images.
    • Import and restructure onto one of my hard drives using Faststone.
    • Even after changing permissions, some files proved unreadable without individually changing permissions. I'm sure I could have scripted this somehow but it would have taken too long to work out how, so I used backup mode in Robocopy to copy the existing structure to another drive, which reset file permissions, then imported from there to restructure.
    • Clean up each folder by visual inspection. I started using Faststone then changed to Adobe Bridge.
    To further complicate matters, some cameras seemed to have had incorrect dates, and some files were missing date-taken. Quite a few files ended up with the date I copied them as one of the dates. Images came from a number of different phones, a DSLR, screen-shots and downloaded images.
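
    The Year > Month layout, with a fallback for files that have no date-taken, could be sketched roughly as below. This is only an illustration, not what Faststone does internally: the function names are hypothetical, and reading the actual date-taken value from a file is left to whatever tool is in use.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def year_month_folder(root: Path, date_taken: Optional[datetime], fallback: datetime) -> Path:
    """Return root/YYYY/MM for a photo (hypothetical helper).

    date_taken is None when the file has no date-taken value; in that
    case the fallback date decides the folder instead.
    """
    when = date_taken if date_taken is not None else fallback
    return root / f"{when.year:04d}" / f"{when.month:02d}"


def file_modified_time(path: Path) -> datetime:
    """File-system modified time. This is often just the date the file
    was copied, so it is a last-resort fallback only."""
    return datetime.fromtimestamp(path.stat().st_mtime)
```

    Files that fall back to the modified time are the ones most likely to land in the wrong month, so they are worth a manual check.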

    It now looked organised but was still a mess. Someone had imported photos from their phone multiple times, sometimes using different naming conventions. A couple of folders ended up with six copies of each image. Sorting each folder by date, then eyeballing them and checking file size gave a reasonably simple method of deleting duplicates (at least if they were in the same folder). Some folders ended up with around 4000 images.
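
    The file-size check can be partly automated for copies that are byte-for-byte identical. The sketch below is an assumption-laden illustration rather than the method used here: it groups the files in one folder by size, then confirms matches with a SHA-256 hash. The function names are hypothetical, and it will not catch copies that were re-saved or whose metadata differs, since those are no longer the same bytes.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exact_duplicates(folder: Path) -> List[List[Path]]:
    """Return groups of byte-identical files directly inside folder."""
    # Cheap first pass: only files of equal size can be identical.
    by_size: Dict[int, List[Path]] = defaultdict(list)
    for path in sorted(folder.iterdir()):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    groups: List[List[Path]] = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Second pass: hash only the candidates that share a size.
        by_hash: Dict[str, List[Path]] = defaultdict(list)
        for path in same_size:
            by_hash[sha256_of(path)].append(path)
        groups.extend(group for group in by_hash.values() if len(group) > 1)
    return groups
```

    Nothing is deleted by this sketch; it only reports groups, so the final decision on which copy to keep stays manual.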

    I am now down to 49,000 photos in 310 GB. I do an hour or two at a time then give it away for a day. They don't seem in a hurry for it, which is just as well.

    At some point I will decide it is clean enough and it will go back to them for final culling - with written instructions on importing images in future. I could have just copied them across and handed them back as-is, but I don't think they would have had time or knowledge to fix it. They both work and have two young children.

    I haven't been on here much lately and now you know why.
    Alan W


  • #2
    I don't envy you having to search through all those files for similar/duplicate images, Alan. I have had a lot of success with VisiPics. Although it doesn't seem to scan RAW files, it does a pretty good job on JPEG, TIFF, etc.

    A screenshot of a scan I have just done. It took about half an hour.
    [Image: VisiPics.jpg]

    I'm also trying out a program from the Microsoft page called Duplicate & Similar Photo Cleaner, although it loads up as Photo Forensics Free (go figure). I'll let you know how it goes.

    [Image: Photo Forensics 1.jpg]


    • wigz
      wigz commented
      I think I have found the problem. The EXIF orientation flag is different in the identical images that don't match. The following from ExifTool output:

      Image 1: Orientation : Rotate 90 CW
      Image Size : 3264x2448

      Image 2: Orientation : Horizontal (normal)
      Image Size : 2448x3264

      I suspect both programs are not handling this correctly. I have no idea why this flag is different in these images - remember these are not mine.
      Back to manual checking and deletion - at least there aren't as many now.
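
      To illustrate why those two files can look identical on screen while their stored sizes differ, here is a minimal sketch. It assumes the Image Size reported above is the stored pixel grid and that the orientation flag is only applied at display time. The numeric codes are the standard EXIF Orientation values (6 is "Rotate 90 CW", 1 is "Horizontal (normal)"); the function name is hypothetical.

```python
# EXIF Orientation values 5 to 8 involve a quarter turn, so the displayed
# width and height are swapped relative to the stored pixel grid.
QUARTER_TURN_ORIENTATIONS = {5, 6, 7, 8}


def displayed_size(stored_width: int, stored_height: int, orientation: int) -> tuple:
    """Return (width, height) as a viewer that honours the flag would show it."""
    if orientation in QUARTER_TURN_ORIENTATIONS:
        return (stored_height, stored_width)
    return (stored_width, stored_height)


# Image 1: stored 3264x2448 with Orientation 6 (Rotate 90 CW)
image_1 = displayed_size(3264, 2448, 6)
# Image 2: stored 2448x3264 with Orientation 1 (Horizontal (normal))
image_2 = displayed_size(2448, 3264, 1)
```

      Both calls give a 2448x3264 portrait image, so a comparison of stored pixels that ignores the flag would be looking at one image lying on its side, which may be why neither program matched them.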

      Below are some tests I ran to try to determine what was going on.

      I tried VisiPics on all photos and it picked up about 60 that Duplicate and Similar Photo Cleaner (DASPC) did not find. These seemed to be where there was a drastic size difference such as 50k to 3000k.

      There are many images that both programs missed. In one folder of about 230 images there were about 100 duplicates with minor differences in file size but the images were dimensionally and visually identical, 2.3MB to 2.4MB for example. Both programs missed these.

      Using DASPC on this folder at the lowest similarity setting and highest CPU setting found one pair of visually similar images that were in fact similar, not duplicates, but missed all of the duplicates.

      Using VisiPics at the Basic similarity setting found 19 pairs of similar images that were actually duplicates, but showed one of them rotated ninety degrees - both thumbnail and preview. All other programs showed these oriented correctly. I had the option 'Scan for 90 degree rotations' set.

      I took two identical photos that were not being picked up, opened them in Photoshop, drew a small spot on each, saved each under a different name, and checked with each program again.

      Both programs identified three images as identical: both modified versions and one of the identical images that I had not touched. They missed the other original unmodified image.

      VisiPics has a number of usability issues that make it more awkward to use than DASPC.

  • #3
    I have to say that I'm pretty impressed with the Duplicate & Similar Photo Cleaner program. It found a lot more similar images than VisiPics as it also analysed DNG files. It did not analyse CR2 (Canon) or RW2 (Panasonic) RAW files. It would probably be worth the $15 to register if you had a lot of images to manage.

    [Image: Photo Forensics 2.jpg]

    [Image: Photo Forensics 3.jpg]


    • wigz
      wigz commented
      One big scan John. I included videos and it did pick up a lot of them. I have also found a number of other duplicate jpegs it missed for which I can see no reason.

      I moved four images that it had picked up as two sets of two duplicates into a separate folder and just ran a scan of that folder. It gave the same result.

      Interestingly I had it scan those four for similar, rather than duplicates, and it found none. It obviously has some issues but did find a lot more duplicates than I would have found without it, and in less time.

    • Grumpy John
      Grumpy John commented
      As a comparison, have you tried VisiPics?
      How did it go with the video files?

    • wigz
      wigz commented
      Haven't tried VisiPics as 'Duplicate and Similar...' seemed good until I was most of the way through. I might give it a go to see if it picks up any others.

      I haven't found many videos it missed so far. I'm still doing a final review and clean-up.

  • #4
    G'day fellas

    Good info & experiences here .............. and I'm sure we've all done tasks like this and with varying results
    Phil
    __________________
    > Offers Digital Photography workshops in outback eastern Australia
    > recent images at http://www.flickr.com/photos/ozzie_traveller/sets/



    • #5
      Hi wigz, have you heard back from the developers of Duplicate and Similar Photo Cleaner yet? If so, did they have any clues on improving performance?


      • Ozzie_Traveller
        Ozzie_Traveller commented
        G'day fellas
        GJ you said above ...

        I'm also trying out a program from the Microsoft page called Duplicate & Similar Photo Cleaner, although it loads up as Photo Forensics Free (go figure). I'll let you know how it goes.

        This is typical MS behaviour -- buy up an existing bit of "pretty good" software, reverse engineer it into Windows and then put it out as an MS creation.

        Alan - your efforts are beyond the realm of "good works for the family" ... I hope they take you out to a damn fine eatery once it's all finished.

        Phil

      • wigz
        wigz commented
        Phil,

        It's not Microsoft software but distributed in their store, which is their version of the Apple and Google stores.

        There is some proper software there but also plenty of minimally functional stuff.

        About the cleanup - yep, would be nice but I'm not holding my breath.

      • Grumpy John
        Grumpy John commented
        I'm glad that you're not holding your breath for an answer Alan as I feel you would turn 50 shades of purple before hearing back.
        I am currently in email conversation with DxO about issues that I am having with NIK Collection. So far it has been 7 weeks and the problem has still not been resolved. They have made plenty of suggestions, but nothing has worked.