Thursday, October 25, 2007

CRC Check / CRC Values

CRC values are calculated using an algorithm known as the Cyclic Redundancy Check, or "CRC" for short. Basically, this involves generating a 32-bit number (or "CRC value") based on the contents of a file. If the contents of a file change, its CRC value will almost certainly change as well. This allows the CRC value to be used as a "checksum" to identify whether or not the file has changed. It also allows you to tell which version of a file you have by comparing its CRC value to the known CRC values of the original versions.
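To make this concrete, here is a minimal sketch in Python of computing a 32-bit CRC value from a file's contents. It assumes the common CRC-32 variant provided by Python's standard zlib module; whether UpdatesDownloader uses exactly this variant is an assumption, and the function name file_crc32 is just an illustrative choice.

```python
import zlib

def file_crc32(path, chunk_size=65536):
    """Return the CRC-32 of a file's contents as an unsigned 32-bit integer."""
    crc = 0
    with open(path, "rb") as f:
        # Read in chunks so that large files do not have to fit in memory.
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # zlib.crc32 accepts a running value, so the CRC is built up piece by piece.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```

Calling file_crc32 on the same file before and after an edit and comparing the two numbers is enough to tell whether the contents changed.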

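The basic idea of the CRC algorithm is to treat all the bits in a file as one big binary number, and then divide that number by a standard value (often called the generator polynomial). The division is done in modulo-2 arithmetic, which means there are no carries or borrows and each subtraction step is simply an XOR. The remainder from the division is the CRC value.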
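The division idea can be sketched with a toy example. The code below is only an illustration, not the full CRC-32 procedure: it uses a small 4-bit divisor (binary 1011) chosen for readability, and it leaves out the extra steps that real CRC-32 implementations typically add, such as a non-zero starting value and a final inversion. The function name crc_remainder is hypothetical.

```python
def crc_remainder(message, message_bits, generator):
    """Divide the message by the generator using modulo-2 (XOR) arithmetic
    and return the remainder."""
    degree = generator.bit_length() - 1
    # Append `degree` zero bits so there is room for the remainder.
    dividend = message << degree
    total_bits = message_bits + degree

    remainder = 0
    # Bring the dividend down one bit at a time, as in long division.
    for i in range(total_bits - 1, -1, -1):
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # If the top bit is set, "subtract" the generator; in modulo-2
        # arithmetic subtraction is just XOR.
        if remainder >> degree:
            remainder ^= generator
    return remainder

# A 14-bit message divided by the 4-bit value 1011 leaves a 3-bit remainder.
toy_crc = crc_remainder(0b11010011101100, 14, 0b1011)
print(format(toy_crc, "03b"))  # prints 100
```

CRC-32 works the same way in principle, but with a 33-bit divisor, so the remainder is 32 bits long.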

You can think of this value as being like a fingerprint for each file. Unlike human fingerprints, however, it is possible for two different files to have the same CRC-32 value. UpdatesDownloader uses an industry-standard CRC-32 algorithm, which generates CRC values that are 32 bits in length. This means there are 4,294,967,296 possible values, so any two different files have roughly a 1 in 4,294,967,296 chance of sharing the same CRC "fingerprint."
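The arithmetic behind that number, and how the odds change when many files are involved, can be sketched as follows. This assumes CRC values are spread evenly over all 32-bit numbers, which is only an approximation for real files, and it uses the standard "birthday" estimate; the function name collision_probability is an illustrative choice.

```python
import math

CRC_VALUES = 2 ** 32  # number of distinct 32-bit values: 4,294,967,296

def collision_probability(n_files):
    """Approximate chance that at least two of n_files distinct files share
    a CRC-32 value, assuming values are uniformly distributed."""
    pairs = n_files * (n_files - 1) / 2
    # 1 - exp(-pairs / CRC_VALUES), computed with expm1 for small-number accuracy.
    return -math.expm1(-pairs / CRC_VALUES)

print(CRC_VALUES)
print(collision_probability(2))       # about 1 in 4.29 billion
print(collision_probability(77163))   # close to one half
```

So for a single pair of files a match is extremely unlikely, but in a collection of tens of thousands of files it becomes quite plausible that some pair shares a CRC value.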

A file doesn't have to change much for its CRC value to be different. In fact, if even just one bit in a file changes, the CRC value for that file will change as well. If all you did was change one letter in a readme.txt file between version 1 and version 2, the CRC value for that readme.txt file would be completely different.
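Here is a small sketch of that effect, again assuming the CRC-32 variant in Python's zlib module. The sample text stands in for a hypothetical readme.txt; a single bit of its first byte is flipped and the two CRC values are compared.

```python
import zlib

original = b"Readme for version 1 of the program."

# Flip the lowest bit of the first byte; every other bit stays the same.
modified = bytearray(original)
modified[0] ^= 0x01
modified = bytes(modified)

crc_a = zlib.crc32(original)
crc_b = zlib.crc32(modified)

# Count how many of the 32 CRC bits differ between the two values.
differing_bits = bin(crc_a ^ crc_b).count("1")

print(f"{crc_a:08X} {crc_b:08X} ({differing_bits} of 32 bits differ)")
```

The two values never match in this situation, because a CRC always detects a change to a single bit.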

CRC values can be calculated for any type of file.

Although the chances of any two files having the same CRC value are incredibly small, the CRC value alone isn't enough to guarantee an accurate identification. If you need to be absolutely sure, check the CRC in addition to other information about the file, such as the size of the file in bytes and its location on the user's system.
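A combined check along those lines might look like the sketch below. It assumes you already know the expected CRC-32 value and size of the file you are looking for; the function name matches_known_file and its parameters are hypothetical, and the CRC is computed with Python's zlib module as an assumed stand-in for the CRC-32 variant in use.

```python
import os
import zlib

def matches_known_file(path, expected_crc, expected_size):
    """Return True only if the file has both the expected size and CRC-32."""
    # The size check is cheap, so do it first and skip reading on a mismatch.
    if os.path.getsize(path) != expected_size:
        return False

    crc = 0
    with open(path, "rb") as f:
        # Read in 64 KB chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(65536), b""):
            crc = zlib.crc32(chunk, crc)
    return (crc & 0xFFFFFFFF) == expected_crc
```

Checking the file's expected location as well is then just a matter of which path you pass in.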

2 comments:

btn said...

Is it possible to detect which parts of files are changed?

KIRAN said...

yes it is possible