
simhash

generate similarity hashes to find nearly duplicate files

Description

It is often useful to know how similar two files are. This command-line tool estimates the "degree of similarity" between a pair of nominally sequential files, such as text files. The tool uses Manasse's "shingleprinting" technique.
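The general idea behind shingle-based fingerprints can be sketched in a few lines of Python. This is an illustration only and is not taken from the simhash source: the function names `shingleprint` and `similarity`, the parameters `shingle_size` and `keep`, and the choice of CRC32 as the hash are all assumptions made for the example. Each file is reduced to the smallest hashes of its overlapping byte windows ("shingles"), and two files are compared by how many of the smallest hashes they share.

```python
import zlib


def shingleprint(data: bytes, shingle_size: int = 8, keep: int = 128) -> set[int]:
    """Return the `keep` smallest CRC32 hashes over all byte shingles of `data`.

    Hypothetical helper for illustration; not the simhash tool's actual code.
    """
    if len(data) < shingle_size:
        # Input shorter than one shingle: treat the whole input as a single shingle.
        shingles = {data}
    else:
        shingles = {
            data[i:i + shingle_size]
            for i in range(len(data) - shingle_size + 1)
        }
    hashes = sorted({zlib.crc32(s) for s in shingles})
    return set(hashes[:keep])


def similarity(a: set[int], b: set[int], keep: int = 128) -> float:
    """Estimate similarity from two fingerprints built with the same `keep`.

    Takes the `keep` smallest hashes of the combined fingerprints and
    returns the fraction of them that occur in both.
    """
    smallest = sorted(a | b)[:keep]
    if not smallest:
        return 1.0
    shared = sum(1 for h in smallest if h in a and h in b)
    return shared / len(smallest)
```

Under these assumptions, two files would be compared by calling `shingleprint` on the bytes of each and passing both results to `similarity`; a value near 1.0 suggests near-duplicates, and a value near 0.0 suggests unrelated content.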



Homepage

http://wiki.cs.pdx.edu/forge/simhash.html

