EMD/CRT

From Ft Wm, Inv & Tor CC
< EMD

The Chinese Remainder Theorem (CRT) is a useful result in the theory of prime numbers

In June 2022 EMD used the CRT to devise a maximally collision-resistant checksum with an entropy in excess of 323 bits without the hassle of careful CRC design. He then wrote simple C programs to select near-optimal prime moduli based a candidate table cribbed from <https://primes.utm.edu/lists/2small/0bit.html>, and to determine good 'geometries' (ie formats) for temporary files recording bitvectors totalling 547 30-bit words, suitable for recording whether or not a file having length congruent to each number in the range 1..2^14 exists in a given directory (aka 'folder'); 547x30 = 2^14+26 = 16410. Satisfied, he wrote a rather longer C program to:-

  • Traverse a filestore taking a snapshot of the directory structure, filenames, and file lengths.
  • Split the snapshot into many subfiles using a 'geometry' appropriate to the total number of directories encountered.
  • Load, in turn, each subfile into main memory, and calculate and output 324-bit checksums of each file in the filestore.

For obvious reasons, the resulting output file has a '.CRT' filename extension. EMD then wrote another, somewhat shorter, C program to read up to five CRT files simultaneously and generate a report of consistencies and inconsistencies; these two programs were soon integrated into a single executable, 'SinoSure.EXE', intended to verify that filesystems have been accurately duplicated onto backup media.

By 11th July 2022, EMD had added per-file reporting to SinoSure, meaning that individual files not fully backed-up were named-and-shamed explicitly; this feature was arguably crucial to its subsequent adoption as a standard.