Note
This is a community plugin, an external project maintained by its respective author. Community plugins are not part of FiftyOne core and may change independently. Please review each plugin’s documentation and license before use.
Image Deduplication Plugin#
This Python plugin streamlines image deduplication workflows.
With this plugin, you can:
Find exact duplicate images using a hash function
Find near duplicate images using an embedding model and similarity threshold
View and interact with duplicate images in the App
Remove all duplicates, or keep a representative image from each duplicate set
Installation#
fiftyone plugins download https://github.com/jacobmarks/image-deduplication-plugin
Operators#
find_approximate_duplicate_images#
This operator finds near-duplicate images in a dataset using a specified similarity index paired with either a distance threshold or a fraction of samples to mark as duplicates.
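To make the threshold idea concrete, here is a minimal, library-free sketch of near-duplicate detection over precomputed embeddings. It is not the plugin's implementation: the helper names (`cosine_distance`, `find_near_duplicates`), the choice of cosine distance, and the greedy first-seen grouping are all assumptions made for illustration. It covers only the distance-threshold mode, not the fraction mode.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (0.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def find_near_duplicates(embeddings, thresh):
    """Greedily group samples whose embeddings lie within `thresh` of a kept sample.

    `embeddings` maps a (hypothetical) sample ID to its embedding vector.
    Returns a dict mapping each kept sample ID to the IDs flagged as its
    near duplicates; samples without near duplicates are omitted.
    """
    kept = {}
    for sample_id, vec in embeddings.items():
        for rep_id in kept:
            if cosine_distance(vec, embeddings[rep_id]) <= thresh:
                # Close enough to an already-kept sample: flag as its duplicate
                kept[rep_id].append(sample_id)
                break
        else:
            # No kept sample is within the threshold: keep this one
            kept[sample_id] = []
    return {rep: dups for rep, dups in kept.items() if dups}


# Toy embeddings: "b" points almost the same way as "a", "c" is orthogonal
embeddings = {"a": [1.0, 0.0], "b": [0.999, 0.01], "c": [0.0, 1.0]}
near_dups = find_near_duplicates(embeddings, thresh=0.05)
```

A smaller threshold flags fewer images as duplicates. A larger one merges more loosely related images into the same group.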
find_exact_duplicate_images#
This operator finds exact duplicate images in a dataset using a hash function.
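As a rough illustration of hash-based exact-duplicate detection, the sketch below groups files by a digest of their raw bytes. The source does not say which hash function the plugin uses, so SHA-256 and the helper names (`file_digest`, `find_exact_duplicates`) are assumptions.

```python
import hashlib
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(paths):
    """Group file paths by content digest; return only groups with 2+ files."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]
```

Because the digest is computed over file bytes, two images that look identical but were re-encoded or resized will not match. That case is what the near-duplicate operators are for.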
display_approximate_duplicate_groups#
This operator displays the images in a dataset that are near-duplicates of each other, grouped together.
display_exact_duplicate_groups#
This operator displays the images in a dataset that are exact duplicates of each other, grouped together.
remove_all_approximate_duplicates#
This operator removes all near-duplicate images from a dataset.
remove_all_exact_duplicates#
This operator removes all exact duplicate images from a dataset.
deduplicate_approximate_duplicates#
This operator removes near-duplicate images from a dataset, keeping a representative image from each duplicate set.
deduplicate_exact_duplicates#
This operator removes exact duplicate images from a dataset, keeping a representative image from each duplicate set.
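The difference between the "remove all" and "deduplicate" operators can be sketched in a few lines. This is an illustration based on how the operator descriptions read, not the plugin's code: the function name `ids_to_delete` is hypothetical, and treating the first member of each group as the representative is an assumption.

```python
def ids_to_delete(duplicate_groups, keep_representative):
    """Return the sample IDs that would be removed from the dataset.

    `duplicate_groups` is a list of lists, each holding the IDs of images
    that are duplicates of one another. When `keep_representative` is True,
    the first ID in each group survives (deduplicate behavior); otherwise
    every ID in every group is removed (remove-all behavior).
    """
    to_delete = []
    for group in duplicate_groups:
        start = 1 if keep_representative else 0
        to_delete.extend(group[start:])
    return to_delete


groups = [["a", "b", "c"], ["d", "e"]]
removed_when_deduplicating = ids_to_delete(groups, keep_representative=True)
removed_when_removing_all = ids_to_delete(groups, keep_representative=False)
```

Deduplicating keeps one copy of every distinct image, while removing all duplicates drops every image that has any duplicate.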