The Duplicate Problem

Every photographer who shoots bursts or bracketed sequences knows the pain: you come back from a shoot with hundreds of near-identical frames. A single moment captured in a 10-frame burst gives you 10 images that are almost indistinguishable at a glance. Manually comparing them to pick the sharpest one is slow and mind-numbing.

Duplicate detection software solves this problem automatically. Here is how it works and how to use it.

Types of Duplicates

Exact Duplicates

True exact duplicates occur when you have copied a file twice or imported the same card twice. These are easy to detect — the file hash matches exactly. Any decent file management tool can find these.
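
As a rough illustration of hash-based exact matching, the sketch below groups files by their SHA-256 digest using only the Python standard library. The function names are made up for this example, and this is not a description of how any particular tool implements it.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(folder):
    """Group files under `folder` by digest; return groups of 2 or more."""
    by_digest = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling find_exact_duplicates on an import folder returns one list of paths per set of byte-identical files, which is what you would see after importing the same card twice.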

Near-Duplicates (Burst Sequences)

Near-duplicates are the more common and more challenging problem. Ten frames of a burst sequence might have slightly different exposure, tiny differences in subject position, and variations in motion blur — but they are visually almost identical. Simple file hashing cannot detect these.

How imagic Detects Near-Duplicates

imagic uses perceptual hashing to find near-duplicate images. Unlike a cryptographic hash, which changes completely when even a single byte of the file changes, a perceptual hash represents the visual structure of the image. Two images that look nearly identical will have very similar perceptual hash values, even if they differ slightly in sharpness, exposure, or framing.
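
To make the idea concrete, here is a minimal sketch of one simple perceptual hash, the average hash. The article does not say which perceptual hash imagic uses, so treat this as a stand-in technique rather than imagic's implementation. It assumes the image has already been reduced to a tiny grayscale grid (real tools typically downscale to something like 8x8 first), and the toy pixel values are invented for the example.

```python
def average_hash(gray):
    """Average hash of a small grayscale grid (rows of 0-255 values).

    Each bit records whether a pixel is brighter than the grid's mean.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Toy 4x4 "thumbnails": a frame, a slightly brighter copy, a different scene.
frame = [[10, 20, 200, 210], [15, 25, 205, 215],
         [10, 20, 200, 210], [15, 25, 205, 215]]
brighter = [[p + 8 for p in row] for row in frame]
other = [[200, 210, 10, 20], [205, 215, 15, 25],
         [10, 20, 200, 210], [15, 25, 205, 215]]

d_similar = hamming(average_hash(frame), average_hash(brighter))
d_different = hamming(average_hash(frame), average_hash(other))
```

Here the brighter copy produces the same hash as the original (distance 0), while the different scene differs in 8 of the 16 bits. A small distance between two hashes is what marks two frames as near-duplicates.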

The process works as follows: imagic generates a perceptual hash for every image in your shoot during the Analyse step. It then clusters images with similar hash values into groups. Each group is treated as a burst sequence. Within each group, imagic ranks the images by its combined quality score (sharpness, exposure, noise, composition, detail) and pre-selects the highest-scoring frame.
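
The grouping and pre-selection steps described above can be sketched as follows. This is an illustration under stated assumptions, not imagic's code: the Frame class, the distance threshold of 6 bits, and the rule of comparing each frame with the previous one in capture order are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    phash: int    # perceptual hash stored as an integer
    score: float  # combined quality score, higher is better


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def group_bursts(frames, max_distance=6):
    """Group frames in capture order: a frame joins the current group
    when its hash is within max_distance bits of the previous frame."""
    groups = []
    for frame in frames:
        if groups and hamming(groups[-1][-1].phash, frame.phash) <= max_distance:
            groups[-1].append(frame)
        else:
            groups.append([frame])
    return groups


def preselect(groups):
    """Pick the highest-scoring frame from each group."""
    return [max(group, key=lambda f: f.score) for group in groups]


# Invented hashes and scores: three frames of one burst, then a new scene.
frames = [
    Frame("IMG_0001", 0b1111000011110000, 0.71),
    Frame("IMG_0002", 0b1111000011110001, 0.88),
    Frame("IMG_0003", 0b1111000011110011, 0.64),
    Frame("IMG_0004", 0b0000111100001111, 0.80),
]
picks = preselect(group_bursts(frames))
```

With this sample data the first three frames form one group and IMG_0002 is pre-selected from it, while IMG_0004 stands alone as its own group. Comparing only neighbouring frames keeps the sketch short; a real tool may cluster differently.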

The Time Savings

Consider a wedding photographer who shoots 3,000 frames and captures most key moments in 5 to 10 frame bursts. Without burst detection, every burst must be manually compared frame by frame — an exhausting process. With imagic's automatic grouping, the photographer reviews one pre-selected image per burst and simply confirms or overrides the AI's choice. The review time for burst sequences drops from hours to minutes.

Setting Up Duplicate Detection in imagic

Install imagic by running pip install imagic, then import your shoot. The Analyse step runs automatically and handles duplicate detection in the same AI analysis pass that scores image quality. There is no separate step to configure: burst detection is built into the standard workflow.

Storage Benefits

Beyond time savings, duplicate removal has a significant impact on storage. A single burst of 10 RAW files might consume 300 to 500 MB. On a wedding shoot with hundreds of bursts, the non-selected frames represent gigabytes of storage that can be safely archived or deleted once you have confirmed your selects. imagic's Cull step makes this clean separation easy — rejects are separated from selects so you can archive or delete them in a single operation.
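
To put a rough number on this, the sketch below works through the arithmetic. It takes the 300 to 500 MB per 10-frame burst from above and assumes, purely for illustration, 200 bursts with one keeper per burst; your own figures will differ.

```python
def reclaimable_gb(bursts, frames_per_burst, burst_mb, keepers_per_burst=1):
    """Storage held by non-selected frames, in GB (1 GB = 1000 MB)."""
    frame_mb = burst_mb / frames_per_burst
    rejected = bursts * (frames_per_burst - keepers_per_burst)
    return rejected * frame_mb / 1000


# Assumed shoot: 200 bursts of 10 frames, one select kept per burst.
low = reclaimable_gb(bursts=200, frames_per_burst=10, burst_mb=300)
high = reclaimable_gb(bursts=200, frames_per_burst=10, burst_mb=500)
print(f"{low:.0f} to {high:.0f} GB")
```

Under those assumptions the 1,800 non-selected frames account for roughly 54 to 90 GB.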

Best Practices
