CrcCheckCopy - compare folders by the CRC hash of each file

CrcCheckCopy is a command-line utility that lets you compare the files of different folders. It generates a CRC32 for each file and stores the results in a single checksum file. This small file can then be used either to check another folder hierarchy against the original or to verify the integrity of the same folder hierarchy later. The files are read in binary mode, byte by byte, so matching CRCs give you strong evidence that the files are exact, identical copies.
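
As a rough illustration of the idea (not CrcCheckCopy's actual code), the sketch below computes a CRC32 per file in Python with the standard `zlib` module and keys each result by the file's path relative to the scanned folder. The function names `file_crc32` and `scan_folder`, the chunk size and the in-memory dictionary are assumptions made for this example; the real layout of the checksum file is not described here.

```python
import os
import zlib


def file_crc32(path, chunk_size=1024 * 1024):
    """Return the CRC32 of a file, read in binary mode one chunk at a time."""
    crc = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in so large files never sit in memory.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def scan_folder(root):
    """Map each file path (relative to root, '/' separators) to its CRC32."""
    stamps = {}
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(folder, name)
            relative = os.path.relpath(full_path, root).replace(os.sep, "/")
            stamps[relative] = file_crc32(full_path)
    return stamps
```

Storing relative paths is what would let the same set of checksums be checked against a copy that lives on another drive or computer.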

The checksum file enables you to compare folders on two remote computers without transferring the whole folder structure over the internet. Other file/folder comparison utilities need concurrent access to both folder hierarchies and must read the files of the source directory again (in most cases several gigabytes spread across many files).

CrcCheckCopy is a console application (command-line tool) that runs on Windows and Apple macOS.

You can use the comments section below for any other feature requests and suggestions related to disk/folder/file comparison tasks.

Who is it for?

IT administrators, who must do reliable file copies of large volumes of data.

IT administrators can use CrcCheckCopy to ensure (with proof) reliable data copies to their managed servers in their data centers.

Professionals in the DVD/CD authoring, mastering or mass-production industry

CrcCheckCopy was originally created for professionals in the multimedia sector. These included DVD/CD manufacturing factories, DVD/CD software authors, and DVD/CD preprocessing and mastering studios.

CrcCheckCopy allows you to create a checksum file containing the CRC of each file on an incoming disk and, when the processing of the files is completed, to verify the final disk produced. The utility can show which files have (expected) differences. This cannot be done with a single CRC for the whole disk.
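
To make the benefit of per-file checksums concrete, here is a minimal Python sketch that compares two sets of per-file CRCs and names the files that are missing, changed or extra. The function `compare_stamps` and the dictionary layout (relative path mapped to CRC) are assumptions for illustration only; they are not the tool's report format.

```python
def compare_stamps(source, destination):
    """Compare two {relative_path: crc} mappings and list the differences."""
    missing = sorted(path for path in source if path not in destination)
    extra = sorted(path for path in destination if path not in source)
    changed = sorted(
        path
        for path in source
        if path in destination and source[path] != destination[path]
    )
    return {"missing": missing, "changed": changed, "extra": extra}
```

A single CRC over the whole disk could only say that something differs. With one CRC per file, the result points at the individual files, so an expected change (for example a re-authored video file) can be told apart from an unexpected one.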

Power users who want to ensure the reliability of their data copies. 

Not all copies are 100% good. This might sound funny in an era when using a computer and making copies is so easy for anyone. But here are just a few examples of what can go wrong during a copy without the user noticing it:

  • Some files do not get copied because of the maximum path length limitation. The message shown by Windows does not tell you which files were affected.
  • USB flash disks wear out with use. Some of their files might become "corrupted" (this is not a moral term but a computer term meaning that the files were altered by defects of the medium).
  • Copying to a different file system might alter the filename characters, e.g. Chinese or Eastern European characters get converted to something that is compatible with the destination file system. These files might then become unreadable by other computers. You will most probably not see a warning about this during your file copy operation. It is a silent error.
  • Accidental user mistakes, e.g. you accidentally drag and drop some files into a different folder (an accidental file move can very easily go unnoticed).

CrcCheckCopy can generate signatures (CRC32 checksums) of all the source files and compare them with the destination copy, alerting you to any differences found.

You can even store the CrcCheckCopy utility and the checksum file to the destination disk or folder so that the recipient of the copy can verify them at any time. 

People who want to archive data files and be sure that nothing has altered them, as in the case of long-term preservation.

Storing the checksum stamps file (and the CrcCheckCopy utility itself) together with the archived files will give you the ability to verify the integrity of your data at any time in the future. Just run the utility in verification mode, using the existing checksum stamps file.

You?

How are you using our file comparison and crc verification utility? Use the comments below to send us your use cases. You can help us fine-tune the software and/or add new uses for it.

How to compare the files between a source and a destination folder/disk/CD/DVD/NAS shared folder/Network share

Note for users who are not familiar with command-line utility software:
You cannot run the program with a double click. You first need to open the Command Prompt (on Windows) or the Terminal (on macOS) and then type the command that will run CrcCheckCopy.
If you repeatedly scan and verify the same folders, it is better to create a small script with those commands. That script can run with a double click to save you time.

 

  1. To scan all files in a folder, e.g. "c:\my-source-folder", type

    On Windows

    CrcCheckCopy /scan c:\my-source-folder

    On macOS

    ./CrcCheckCopy /scan /my-source-folder

    The utility will read all files and store their CRCs in a file named "CRCstamps.txt", in the current directory.

    You can open this file to observe its contents. The report will also contain a list of files that have zero size and a list of duplicate files.

  2. To verify and compare another folder, e.g. "c:\my-destination-folder", type

    On Windows

    CrcCheckCopy /verify c:\my-destination-folder

    On macOS

    ./CrcCheckCopy /verify /my-destination-folder

    The utility will open the file "CRCstamps.txt" from the current directory, and verify that the files inside "c:\my-destination-folder" have the same CRCs.

    A new text file "CrcCheckCopy-verification-report.txt" will be created in the current directory, listing any errors found.

For a more detailed description of how to compare the files of two folders see "How it works".
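
For repetitive runs, the two steps above can be wrapped in a small script. The following Python sketch only assembles the two documented commands (`/scan` and `/verify`) and runs them one after the other from the same working directory, so both use the same "CRCstamps.txt". The names `TOOL`, `build_command` and `scan_then_verify` are made up for this example, and the meaning of the program's exit codes is not documented here, so the sketch simply returns them without interpreting them.

```python
import subprocess

# Assumption: the executable is on the PATH or in the current directory.
# On macOS the text above uses "./CrcCheckCopy" instead.
TOOL = "CrcCheckCopy"


def build_command(mode, folder, tool=TOOL):
    """Build the argument list for a /scan or /verify run."""
    if mode not in ("scan", "verify"):
        raise ValueError("mode must be 'scan' or 'verify'")
    return [tool, "/" + mode, folder]


def scan_then_verify(source, destination, tool=TOOL, run=subprocess.run):
    """Scan the source folder, then verify the destination folder.

    Returns the two exit codes as reported by the operating system.
    """
    codes = []
    for mode, folder in (("scan", source), ("verify", destination)):
        completed = run(build_command(mode, folder, tool))
        codes.append(completed.returncode)
    return codes
```

Adding a final line such as `scan_then_verify(r"c:\my-source-folder", r"c:\my-destination-folder")` turns the sketch into a script that performs both steps in one go.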

Software identity card
Software name: 
CrcCheckCopy - Folder comparison using CRC32 checksums.
Verify by CRC that your file/folder copies are identical.
Author: 
StarMessage software
Category: 
Disk/File utilities
Subcategory: 
File/folder/directory comparison utilities
Operating system: 
Microsoft Windows
Apple macOS
Rating: 
5
Average: 5 (7 votes)

Comments

Excellent and small software. It was perfect and bug-free for the HUGE data of our engineering company. Thanks to the programmer for his support and for releasing this new version.

After copying an old partition to a newly purchased hard drive, I used this software to ensure that all data were identical and to have a report for all files. I ran it on a 400 GB folder with about 300,000 files and many, many sub-folders; some file names were in Unicode.

Used software was: CrcCheckCopy v1.31 (2018)

The PC system was: OS: windows server 32bit / CPU: intel dual core 2.6GHz (an ancient CPU in 2010's) / HDD: Western Digital Gold 2TB (purchased in 2018)

It runs in less than 3 hours on the above system to generate a CRC-32 report for this huge data.

Rating: 
5

Very good, thank you so much! :-)
I love tiny and useful command line softwares! :-D

Can I do a little suggestion? The help says the syntax is:

CrcCheckCopy {/scan path | /verify path} [options]

In truth, the correct syntax is:
CrcCheckCopy {/scan | /verify} path [options]

Forcing an error, it was clear to me:
"Error: 2nd argument must be the path of the folder to check.
Run the program without parameters to see its help."

This small detail confused me a moment ago.

Once again, many thanks! *thumbsup*

PS.: I am sending the same comment again because I forgot to give 5 stars to this jewel. ;-)

Rating: 
5

Thanks for using this file comparison utility and thank you for the remark about the syntax. The next version will have the updated syntax:

CrcCheckCopy {/scan | /verify} path [options]

Rating: 
0

would be nice if path parm (2nd) was optional. if not specified the current path would be used
thanks

Rating: 
0

not working
AppData\Roaming\StarMessage software\CrcCheckCopy.ini Error message:No such file or directory

Rating: 
0

Thanks for reporting this bug. It is fixed in v1.8.

Rating: 
0

The Windows executable is identified as malicious by several antivirus programs when analyzed in virustotal.com. I downloaded the zipped combination of Mac/Windows versions.

Rating: 
0

Thank you for reporting this. This is a false positive. It is due to the fact that it is a small command-line utility and it is not digitally signed. We have submitted the program as a false positive to these antivirus companies and we expect their engines to get updated. While their desktop engines get updated fast, it is not known how frequently the engines on virustotal are updated.

We plan to acquire a code signing certificate in the next few weeks to remove the false positives.

In the meantime, please try v1.9.

Rating: 
0

The software works really well and I have used it to check the copying of some files from an old file repository to a new one. However, on about 10 of the files I got [Warning, CRC=-1] <name of file>. The software said there were no errors but what does this warning mean?

Rating: 
5

This warning is only reported because it is peculiar to find a CRC of -1 (0xFFFFFFFF in hex) as if the CRC algorithm did not even start.
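
The two spellings in this reply are the same 32 bits read in two ways: as an unsigned number they are 0xFFFFFFFF, and as a signed 32-bit number they are -1. A minimal Python illustration follows; the helper names are made up for this example and nothing here is taken from CrcCheckCopy's code.

```python
import struct


def as_signed_32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


def as_hex_32(value):
    """Format the low 32 bits of value as eight hexadecimal digits."""
    return "0x{:08X}".format(value & 0xFFFFFFFF)
```

Here `as_signed_32(0xFFFFFFFF)` gives -1 and `as_hex_32(-1)` gives "0xFFFFFFFF". For comparison, Python's own `zlib.crc32(b"")` returns 0 for empty input.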

Do these file pairs have the same contents? Can you make a binary comparison of them?
On Windows: fc /b fullpathtofile1 fullpathtofile2
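
Where `fc` is not at hand, a comparable byte-by-byte check can be sketched with Python's standard `filecmp` module. The wrapper name `same_bytes` is an assumption for this example.

```python
import filecmp


def same_bytes(path_a, path_b):
    """Return True only if both files have identical contents.

    shallow=False makes filecmp read and compare the file contents
    instead of trusting size and modification time alone.
    """
    return filecmp.cmp(path_a, path_b, shallow=False)
```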

Other cases to check for are:
- some Microsoft ISO disk images have such a CRC
- these files are symbolic links (not actual files in the directory)
- there are file permission issues for these files
- there are strange characters in their filename or file path

 

Rating: 
5

Cool software, very lightweight. Have tried a few different hash checkers - this one is the only one that manages to combine all the best features in one application:

Multi-file hash (surprisingly uncommon),

Multi hash comparisons from a saved file (also surprising, some just compare single pasted hashes),

Location agnostic - the hashes are saved with relative location so files can be moved to different drives without issues,

And, the most important, highlighting which file hashes don't match in a simple format. This is especially useful when comparing hundreds of thousands of files. Some other hashers mention there are errors but need you to manually scroll through all the files to find the error.

Rating: 
5

Great app. Was looking for something like this for a while. Wrote my own Powershell script to do something similar, but this is clean and simple. I love it. Just wish they made it for Linux as well, this way I could validate data directly on my Linux NAS rather than over the network from my Windows PC.

Rating: 
5

Great piece of software. Works very fast to scan files and create the crc file. I noticed that at the end of the CRCstamps file is a comment about only reporting up to 5 duplicates unless you have the PRO version. How is the PRO version obtained? I don't see any mention of it here on your site.

Rating: 
5

Jason, thank you very much for your kind words. The PRO version is new. Currently, the PRO license is given to anyone who makes a donation (big or small).

Rating: 
0

I made a donation back in February. Was I supposed to receive a link to the pro version? I don't have anything indicating as such. Thanks.

Rating: 
0

Mark, the PRO edition is a completely new thing; it did not exist in February. This extra functionality in the PRO edition (finding duplicate files) was created specifically for donors like you, as an extra "Thank you" for your support. You (and all other donors) will receive an email with your activation license in the coming days.

Rating: 
0

Can we have an option to provide a name for the CRCstamp file during scan & verify.
We can then include the program in scripts to automatically deal with multiple sets of files & folder.

For example:
crccheckcopy /scan "E:\MyData" /log "Mydata_CRCstamps.txt"
crccheckcopy /verify "\\MyNAS\MyShare\MyData" /log "Mydata_CRCstamps.txt"

Rating: 
0

Steve, interesting suggestion. I will contact you by email to better understand the need. For example, now that this option does not exist, can't your script rename the CRCstamps file before or after running CrcCheckCopy?
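
The renaming workaround could look roughly like the Python sketch below. It assumes only what the page states, namely that the tool reads and writes a file named "CRCstamps.txt" in the current directory. The helper names and the per-dataset filename are hypothetical.

```python
import os
import shutil

STAMPS_NAME = "CRCstamps.txt"  # fixed filename used by the tool, per the text above


def keep_stamps_as(saved_name, work_dir="."):
    """After a /scan: move CRCstamps.txt aside under a per-dataset name."""
    os.replace(
        os.path.join(work_dir, STAMPS_NAME),
        os.path.join(work_dir, saved_name),
    )


def restore_stamps_from(saved_name, work_dir="."):
    """Before a /verify: copy a saved stamps file back to CRCstamps.txt."""
    shutil.copyfile(
        os.path.join(work_dir, saved_name),
        os.path.join(work_dir, STAMPS_NAME),
    )
```

A script would call `keep_stamps_as("Mydata_CRCstamps.txt")` right after each scan and `restore_stamps_from("Mydata_CRCstamps.txt")` right before the matching verification.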

Rating: 
0

One suggestion for this utility? Would it be possible to allow CrcCheckCopy to only generate hashes of and verify files listed in a text file?

For example, I'll crccheckcopy /scan to generate a hash of all my files I am backing up for offsite cold backup every few months. After a few months, I get the disk to update the data, I confirm existing data is OK with crccheckcopy /verify.

Then when I go to update the data on the backup disk, my backup script generates a log of files that have changed and therefore copied over/updated. It would be nice to have crccheckcopy just generate hashes just for those paths\filenames that are new or updated, instead of having to check the hash of every file on the source every time.

Thanks.

Rating: 
5

One more suggestion? If there is a way to exclude a folder or file name or file extension with `/scan` that would also be a nice feature to have. Thanks.

Rating: 
5

Thanks for the suggestion to add a parameter for files and folders to ignore from the comparison.

Rating: 
0

Just downloaded, opened non-admin command prompt on Win10/64.
crccheckcopy /scan *.*
gives 0 files on multiple NTFS drives (31GB on 1TB drive and 5TB directory on 11TB drive, all single level)
TXT file gets created
--------------------
CrcCheckCopy v2.4.1
--------------------
Scanning source path:*.*
and saving the results to:CRCstamps.txt
Scanning folder and building the file list [Done]
Sort the file list [Done]
Files to check:0
Scanning was completed and the CRCs were saved in the file: CRCstamps.txt
You can use this file to verify the files in the destination folder.
Performance statistics:
00:00:00.0 total processing time, of which,
00:00:00.0 were for the scanning of the disk folders and
00:00:00.0 for the sorting of the files in memory.
0 files, 0 bytes.
Processing rate: 0 files/sec, 0 Kb/sec.
Added app file to Permissions entry in settings.
Command prompt in Admin makes no difference.

Rating: 
0

Dear Hans,

Thanks for flagging this issue. CrcCheckCopy does not accept wildcards. I will probably need to show an error message if wildcards are used.

Assuming that the CrcCheckCopy executable can be found in the "path" or is stored in the current directory, the Windows syntax for scanning is:

CrcCheckCopy /scan <path>

Where <path> can be an absolute path or a relative path from the current working directory.

In your case, you can scan the current working directory by putting "./" instead of "*.*".

CrcCheckCopy /scan ./

Rating: 
0

All DOS/Windows command line apps need the file path hacks to handle the various file specs. RFE please implement.

BTW I will process ./*.xyz because those are the important files. C:\temp\*.xyz syntax doesn't work.

I don't want to process the other larger files. It's already slow enough using only one core per directory/drive. Good thing I have lots of cores and lots of drives to crc.

Now, is there a way to incrementally crc new files only? Running the tool always overwrites the output file.

Is there a tool that can copy/move files based in these CRCs? I can diff and script, but I am getting spoiled with nice GUIs. Any recommendations?

Thanks!

Rating: 
0

Hi,

I think your program has promise for me, but I cannot get it to verify against files on my Synology NAS. I have the NAS mapped to Y: and "CrcCheckCopy /scan y:" starts the NAS chugging away just fine so I know I have the mapping correct. But "crcCheckCopy /verify y:" doesn't access the NAS, and instead it just gives [Missing file] for every file I'm looking for. As a Sanity check I copied the folder to D: and "crcCheckCopy /verify d:" works just fine.

Are there special instructions required for accessing mapped NAS folders? And I have it working fine for USB and local drives btw.

Thanks

Rating: 
0

Hi and thanks for reporting this. I was able to reproduce it on my Synology. I will try to fix it for the next version.

There are two workarounds:

Add the trailing \ at the drive letter:

crcCheckCopy /scan y:\
crcCheckCopy /verify y:\

Or, instead of mapping the network share to a drive letter, you can directly use the network address, e.g.

crcCheckCopy /verify \\myservername\sharedfolderX

Rating: 
0

Thanks for the reply,

I tried the workarounds but neither of them works for me. And I had already tried the direct network address earlier on to no avail. Can you provide any clues to help me work out why your Synology works and mine doesn't? I'm really keen to get this working. Thanks

Rating: 
0

Why does the Overview page refer to a command line tool CheckCopy when in fact this command line tool is called CrcCheckCopy? Confusing....

Rating: 
0

Howdy, I love this; it did exactly what I needed. I did have a question as to whether or not you were going to add that option to ignore files or folders.

Rating: 
5

I ran this on my 3TB of photos, comparing to a backup done with FreeFileSync. It worked and verified the whole backup. But, I don't backup files like thumbs.db (Windows thumbnail cache) and some cache files from Adobe Bridge which makes for a lot of missing files. I had to filter out the /Scan output to remove those before comparing with the /Verify run. That took me awhile to figure out. I could really use some filter options to ignore certain files. It wouldn't surprise me if many backup tools are configured to ignore certain types of files or even whole directories as they are temporal or replaceable data.

The program took 16 hours to run on 3TB of reasonably fast hard drives. When CrcCheckCopy was running, it didn't seem like it was maxing out the drive transfer speed or even a single CPU. I wonder if it could be made to run faster on large numbers of files with some threads so one thread is just grabbing data from the disk as fast as possible and one or more threads are calculating CRC on buffers. If the code was open source and a toolset I have some familiarity with, it might be something I'd be willing to work on.

Rating: 
4

Thank you for these ideas. I will add thumbs.db to the internal list of ignored files. Do the Adobe Bridge cache files have specific filenames?

Yes, one manual way to ignore specific files is to remove their lines from the /scan output file.
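
That manual step can be automated. The Python sketch below drops every line of a text file that mentions one of the ignored file names. It rests on loud assumptions: that each scanned file occupies one line of the /scan output, that the line contains the file's name, and that the file is readable as UTF-8. None of this is confirmed by the page, so inspect your own "CRCstamps.txt" before relying on it. The function names are made up for the example.

```python
def drop_ignored_lines(lines, ignored_names):
    """Return the lines that mention none of the ignored names (case-insensitive)."""
    needles = [name.lower() for name in ignored_names]
    return [
        line
        for line in lines
        if not any(needle in line.lower() for needle in needles)
    ]


def filter_stamps_file(source_path, target_path, ignored_names, encoding="utf-8"):
    """Write a filtered copy of a stamps file; return how many lines were dropped."""
    # newline="" keeps the original line endings untouched.
    with open(source_path, "r", encoding=encoding, newline="") as handle:
        lines = handle.readlines()
    kept = drop_ignored_lines(lines, ignored_names)
    with open(target_path, "w", encoding=encoding, newline="") as handle:
        handle.writelines(kept)
    return len(lines) - len(kept)
```

Writing to a separate target file keeps the original scan output intact in case the filter removes more than intended.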

Yes, I will add the option to specify (change) the filename of the CRCstamps.txt. In my case, I always store this file inside the scanned folder, so it always accompanies the data. In that case, a fixed filename suffices.

This program started when SSDs did not exist, so seek time was an important speed factor for the disk. This is why it still reads one file after the other. Since some files are too large to fit in memory, the CRC is calculated on the fly. I take good note of your idea to cut the CRC calculation into smaller chunks and assign them to different threads.

Update: version 2.6 has a new, faster CRC calculation algorithm which increases speed up to 20x when calculating the CRC from files on a local SSD drive.

Rating: 
0

It's OK for you to hard code ignoring thumbs.db I guess since it's just a cache file that most backup programs will ignore by default, but I don't think you should be hard-coding other files to ignore that people may or may not want to consider. If you ignored the Bridge files, but people didn't also manually ignore them in their backup, then you're going to falsely report mismatched files. That's a user level decision (IMO), not something your tool should do. I don't understand why you don't just allow us to specify an ignore list, both on the command line and via a default settings file.

Likewise, it's getting to be quite a hassle that you don't let us specify the name of the CRCStamps.txt file that is being used for any given operation. I have to create batch files to constantly rename files to/from the CRCStamps.txt name because I'm scanning multiple sources and retaining that file.

Why not open source the code so others can contribute? I can't imagine that this is either a big money maker for you or that the source contains trade secrets. You can still be donation-ware AND open source.

FYI, my comments about not maxing full transfer speed were not related to SSDs at all - just plain spinning drives, but you're probably right that you'd need threads even more with SSDs (less time waiting for disk, higher percentage of time waiting for CRC). The point is that you can calculate CRC and wait for the next disk transfer in parallel rather than in serial. It won't probably do you any good to have multiple threads working on separate files because a spinning drive's read head can only be one place at a time, but not making the next disk read wait for a CRC to complete on the previous read is probably worth doing.
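
The overlap described in this comment (one thread reading ahead while another computes the CRC of the previous buffer) can be sketched in Python as follows. This is only an illustration of the suggestion, not how CrcCheckCopy works; the function name, chunk size and queue depth are arbitrary assumptions.

```python
import queue
import threading
import zlib


def crc32_overlapped(path, chunk_size=4 * 1024 * 1024, max_pending=4):
    """CRC32 of one file, with disk reads and CRC updates running in parallel."""
    chunks = queue.Queue(maxsize=max_pending)  # bounded, so memory use stays small

    def reader():
        try:
            with open(path, "rb") as handle:
                while True:
                    chunk = handle.read(chunk_size)
                    chunks.put(chunk)  # an empty chunk marks end of file
                    if not chunk:
                        break
        except OSError as error:
            chunks.put(error)  # hand the failure over to the consumer

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    crc = 0
    while True:
        item = chunks.get()
        if isinstance(item, Exception):
            thread.join()
            raise item
        if not item:
            break
        # The reader is already fetching the next chunk while this runs.
        crc = zlib.crc32(item, crc)
    thread.join()
    return crc & 0xFFFFFFFF
```

The file is still read sequentially from start to end, which suits a spinning drive. Only the waiting is overlapped, as the comment proposes.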

Rating: 
0

John, all your comments are great. Thank you. I will explore them and work on them.

Rating: 
0

Just to thank you for such a powerful and simple tool! You helped me recover my family file archive of 1M+ files after an accidental delete, a RAID failure and formatting on top of that.

I was consolidating my files from old PCs and laptops and NAS to one dedicated file server (2 disks in raid-1 and 2 disks for backups). And I managed somehow to delete/format/lose during raid rebuild about 90% of all data.

All initial disks were partially re-written then, so I was not able to fully recover any partition. Backups were outdated..

So I had to recover millions of files from 4 HDDs + add something from old backups. At the end I got 6 completely different sets of recovered files, partially missing, partially corrupted and partially of different versions. The task was to combine them all together and to determine where the differences are, in order to decide which version of each file to keep.

CRC is fast enough to compare versions. I tried MD5 tools, but they were too slow and not that adjusted for folders and sub-folders.

So, your tool helped a lot to define if files were the same (in this case I just kept one copy) and if there was a difference (in this case I checked individually).
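
The kind of grouping described here can be sketched in a few lines of Python: given a mapping of file paths to CRCs collected from several recovered sets, list the paths that share a CRC. The function name and the input layout are assumptions for illustration. Because a CRC32 has only 32 bits, files with equal CRCs are best treated as candidates and confirmed by size or by a byte-by-byte comparison before deleting anything.

```python
from collections import defaultdict


def group_by_crc(stamps):
    """Return {crc: [paths]} for every CRC shared by two or more paths."""
    groups = defaultdict(list)
    for path, crc in stamps.items():
        groups[crc].append(path)
    return {crc: sorted(paths) for crc, paths in groups.items() if len(paths) > 1}
```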

I would say this tool can be quite useful both for personal and commercial use for data recovery or file archive management applications. Maybe some GUI and you can shareware it to the market.

Thanks again.

Rating: 
5
