[ Home ]
CRC RevEng: arbitrary-precision CRC calculator and algorithm finder
Copyright (C) 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Gregory Cook
This file is part of CRC RevEng.
CRC RevEng is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
CRC RevEng is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with CRC RevEng. If not, see <https://www.gnu.org/licenses/>.
CRC RevEng incorporates source code from ASPEX under the terms of version 3 of the GNU General Public License (GPLv3). ASPEX is:
Copyright (C) 1998, 2003 by David A. Hinds -- All Rights Reserved
ASPEX is licensed under version 2 and all later versions of the GNU General Public License, therefore CRC RevEng remains licensed under version 3 and all later versions.
Depending on one's interpretation of the License, the author created a
modified copy of ASPEX as a whole by extracting files
getopt.h verbatim on 19 December 2010. These are found in the
directory. A compatible
getopt module is available as standard in many
environments and will work just as well, but this copy is included for
those ANSI C environments that lack one.
A copy of the RISC OS Shared C Library (CLib), and a patch to make
certain versions of RISC OS compatible with this version of CLib, are
supplied with the RISC OS binary under licence from Castle Technology
Limited for distribution to end users for the purpose of upgrading, if
required. These are found in the
bin/riscos/ directory. The ARM Tube
OS binary, located in the
bin/armtubeos/ directory, is statically linked
with (that is, it incorporates) the RISC OS Shared C Library.
CLib qualifies as a "System Library" and GPLv3 permits licensees to
"combine GPLed software with GPL-incompatible System Libraries, [...]
and distribute them both together." See Sections 1, 5 and 6 and A Quick
Guide to GPLv3, <https://www.gnu.org/licenses/quick-guide-gplv3.html>
The patch is an independent program and is not bound by GPLv3 by mere
inclusion. See Section 5.
CRC RevEng is an arbitrary-precision, machine word length-independent,
byte order-independent CRC calculator and algorithm finder in ANSI C.
It is a port and expansion of the author's
crcbfs.pl script from 2007,
and runs up to 200 times faster on equivalent problems. It is also a
reference implementation of the author's "Catalogue of parametrised CRC
algorithms", with the 102 currently listed models available as presets.
To the author's knowledge CRC RevEng is the first published compiled
application to address the general case of CRC algorithm reversal and
reverse engineering, its predecessor
crcbfs.pl being the first published
application of any type to do so. Greg Ewing of Canterbury University
in New Zealand solved a CRC algorithm manually on similar principles in
2010, but partly by feeding chosen plaintexts into an implementation at
CRC RevEng is hosted by SourceForge, from where the latest version can be downloaded and documentation browsed:
Compiling CRC RevEng is straightforward: in the i386 GNU/Linux, MinGW and Raspbian environments, simply cd to the directory containing the source files, and enter
A special makefile can also be used in MinGW to make a Windows executable with version information and an icon:
make -f Mk.Win32
In RISC OS, with the Acorn Desktop Development Environment (DDE)
installed, click Menu on the file
RISCOSify, and set its type to
Obey. Double-click Select first on
RISCOSify, then on
In non-ASCII compatible environments, the array
aliases in file
preset.c may need to be reordered to suit the local string collation
order. The correct order of
aliases can be verified at
compile time by the error-free completion of this command:
make clean pretst
Otherwise, enter commands similar to the following to compile CRC RevEng on any ANSI C compliant compiler:
gcc -O3 -Wall -ansi -c bmpbit.c cli.c model.c poly.c preset.c \ reveng.c cd contrib gcc -O3 -Wall -ansi -c getopt.c cd .. gcc -o reveng bmpbit.o cli.o model.o poly.o preset.o reveng.o \ contrib/getopt.o
The platform-independent method does not compile the preset models. To
compile them, you will need to edit the configuration options in
config.h to suit your architecture. Having done so, define the macro
config.h and recompile as above.
Some enterprise users may wish to disable the -F switch to minimise CPU
usage. To do this, define the macro
config.h or on the
Usage: reveng -cdDesvhu? [-bBfFGlLMrStVXyz] [-a BITS] [-A OBITS] [-i INIT] [-k KPOLY] [-m MODEL] [-p POLY] [-P RPOLY] [-q QPOLY] [-w WIDTH] [-x XOROUT] [STRING...] Options: -a BITS bits per character (1 to n) -A OBITS bits per output character (1 to n) -i INIT initial register value -k KPOLY generator in Koopman notation (implies WIDTH) -m MODEL preset CRC algorithm -p POLY generator or search range start polynomial -P RPOLY reversed generator polynomial (implies WIDTH) -q QPOLY search range end polynomial -w WIDTH register size, in bits -x XOROUT final register XOR value Modifier switches: -b big-endian CRC -B big-endian CRC output -f read files named in STRINGs -F skip preset model check pass -G skip brute force search pass -l little-endian CRC -L little-endian CRC output -M non-augmenting algorithm -r right-justified output -S print spaces between characters -t left-justified output -V reverse algorithm only -X print uppercase hexadecimal -y low bytes first in files -z raw binary STRINGs Mode switches: -c calculate CRCs -d dump algorithm parameters -D list preset algorithms -e echo (and reformat) input -s search for algorithm -v calculate reversed CRCs -h | -u | -? show this help
You can use one of the preset models or specify your own.
reveng -m crc-32
selects the CRC-32 algorithm used in PKZIP and elsewhere. You can dump any preset model as an extended Williams model record using -d:
reveng -m crc-32 -d width=32 poly=0x04c11db7 init=0xffffffff refin=true refout=true xorout=0xffffffff check=0xcbf43926 residue=0xdebb20e3 name="CRC-32"
You can specify the parameters of the model instead:
reveng -w 32 -p 04c11db7 -i ffffffff -l -x ffffffff
This is equivalent to
reveng -m crc-32
except the model will have no name when dumped. (The -l switch sets both RefIn = True and RefOut = True. To set RefOut separately, use switches -L and -B.)
The options and switches for specifying a model are:
0x18005is specified as C002. This automatically provides the Width value.
0x18005is specified as 8005.
0x18005is specified as A001. This automatically provides the Width value.
Other model-related options:
Messages for CRC RevEng to process can be specified as files, as raw binary strings, or as numerical (typically hexadecimal) string arguments on the command line. Output from CRC RevEng is either as extended Williams model records (having their own fixed format) or as numerical string arguments printed one per line on standard output.
When passing numerical arguments on the command line, each argument is conceptually divided into characters, each character consisting of one or more hexadecimal digits. For each character, enough hex digits to specify it are read then a number of bits (specified by the -a option) are taken from the least significant end, reversed (if RefIn = True) and appended to the binary representation of the argument; any excess bits are discarded. The -a option (q.v.) permits a number of useful representations of a given underlying binary sequence.
When passing messages as files or as raw binary strings, the same division into characters applies; bytes of the message are read until enough bits have been collected, then a specified number of bits are taken from the specified least significant end of the collection (see -y), reflected (if RefIn = True) and added to the binary representation.
When printing CRCs, the binary representation is again divided into characters, each of which is reversed (if RefOut = True) and printed with the minimum sufficient number of hex digits.
Output model records conform to the Williams model set out in "A
Painless Guide to CRC Error Detection Algorithms". To recap, the model
consists of a linear feedback shift register (LFSR) having the number
of bit cells defined in the
width parameter, and shifting from right to
init parameter defines the settings of the bit cells at the
start of each calculation, before reading the first message bit. The
refin parameter, if equal to
false, specifies that the characters of the
message (whose size is specified by -a) are read bit-by-bit, most
significant bit (MSB) first; if equal to
true, the characters are read
bit-by-bit, least significant bit (LSB) first. Each sampled message bit
is then XORed with the bit being simultaneously shifted out of the
register at the most significant end, and the result is passed to the
feedback taps. The
poly parameter specifies the feedback taps of the
register – it is the result of initialising the register to all zeroes,
then reading a single one bit.
refout parameter, if equal to
false, specifies that the contents of
the register after reading the last message bit are unreflected before
presentation; if equal to
true, it specifies that they are reflected,
character-by-character, before presentation. For the purpose of this
definition, the reflection is performed by swapping the content of each
cell with that of the cell an equal distance from the opposite end of
the register; the characters of CRCs output by -c and -v are then true
images of parts of the reflected register, the character containing the
original MSB always appearing first. The
xorout parameter defines the
XOR value applied to the contents of the register after the last message
bit has been read and after the optional reflection. The
parameter defines the contents of the register after initialising,
reading the UTF-8 string
"123456789" (as 8-bit characters), optionally
reflecting, and applying the final XOR. The
residue parameter defines
the contents of the register after initialising, reading an error-free
codeword and optionally reflecting the register (if
not applying the final XOR. This is mathematically equivalent to
initialising the register with the
xorout parameter, reflecting it as
refout=true), reading as many zero bits as there are cells
in the register, and reflecting the result (if
refin=true). The residue
of a crossed-endian model is calculated assuming that the characters of
the received CRC are specially reflected before submitting the codeword.
width is printed as a decimal integer.
residue are true images of the shift register at various times, printed
as hexadecimal integers.
refout are Boolean values, displayed
name parameter is the name assigned to
a preset model, enclosed in double quotes, otherwise
There are a few more options for controlling the presentation of input and output:
When a model has been specified, use -c or -v to calculate CRCs for input messages.
If -V and -v are given together, their respective model reversals cancel
out. CRC RevEng then calculates an ordinary CRC for each argument,
processing the characters from right to left and likewise emitting the
CRC characters in reverse order.
Correspondingly, to obtain the same effect as -v using a model reversed by -V, the user must present the characters of his or her message, and process those of the returned CRC, in reverse order.
Take care when the CRC width (-w) is not a multiple of the character width (-a). If the result of a calculation is not what you expect, try selecting left-justification (with -t) or right-justification (with -r).
The -c mode is, of course, useful for creating a checksum to append to a message so that the combination will pass a particular CRC check. The -v mode, on the other hand, is useful for editing a message so as to force its checksum to a desired or at least predetermined value. In order to do this there must be some part of the message's data that can be modified freely without observable effect; many network protocols and file formats, including executables, images and word processor documents, have (or can be altered to have) reserved fields or comment sections that cannot be easily viewed, and whose contents are entirely ignored.
Among the simplest ways to control a CRC calculation is to find one such unused space that is both contiguous and large enough to hold a checksum. For example, suppose we have an existing message with an X.25 CRC:
0: 44 6F 67 73 2F 2A 12 34 2A 2F 72 6F 63 6B 4E 47 | Dogs/*.4*/rockNG
4E 47 is the X.25 checksum, and we wish to alter the message
without either changing the checksum or failing the CRC. We notice that
the 7th and 8th bytes can be replaced at will, and these can contain a
calculated value to force the CRC. Firstly we change the text as we
0: 43 61 74 73 2F 2A 12 34 2A 2F 72 75 6C 65 4E 47 | Cats/*.4*/ruleNG
Calculate the CRC of the part on the left of the unused space with XorOut = 0:
reveng -m x-25 -x 0 -c 436174732f2a 9dc5
Then reverse-calculate the CRC of the part on the right, including the old CRC, with Init = 0:
reveng -m x-25 -i 0 -v 2a2f72756c654e47 1505
Now exclusive-OR the two returned CRCs together, and insert the result in the unused space. CRC RevEng can be used to do the exclusive-OR if a hex calculator is not to hand:
reveng -w 16 -p 0001 -c 9dc51505 88c0
Our edited message now looks like this:
0: 43 61 74 73 2F 2A 88 C0 2A 2F 72 75 6C 65 4E 47 | Cats/*.A*/ruleNG
To confirm that it still passes the X.25 CRC:
reveng -m x-25 -c 436174732f2a88c02a2f72756c65 4e47
For more flexible editing options, see Mark Adler's source file
spoof.c takes an abbreviated description of the CRC, the
exclusive-or of the current CRC of the message and the desired CRC, the
length of the message, and a list of bit locations in a message, and
tells you which of those bits should be inverted in the message to get
the desired CRC. Note that it does not need the message itself, due to
the linearity property of CRCs."
* * *
In Stigge et al. (section 4.1) a polynomial q(x) is calculated as the
multiplicative inverse of xN such that (xN) q(x) = 1 (mod pCRC(x)).
The authors promote the extended Euclidean algorithm as a means of
calculating q(x), however any CRC calculator can also produce it. The
reciprocal of the CRC-32 polynomial is
0xdb710641, as output by:
reveng -w 32 -p 04c11db7 -V -d
The authors' constant
CRCINV, a reflected representation of q(x), is the
CRC of the reversal of the desired remainder,
reveng -w 32 -p db710641 -c 80000000 5b358fd3
Equivalently, the -v function returns q(x) in direct order from the unreflected parameters:
reveng -w 32 -p 04c11db7 -v 00000001 cbf1acda
The most important feature of CRC RevEng is the ability to recover the parameters of a CRC algorithm from a handful of codewords created by that algorithm. In general at least four data points are needed, either as known parameters or as message-CRC pairs. Extra data points help to eliminate false results and to confirm models that are found.
Known parameters are specified using -w, -p, -i and -x (see SPECIFYING A MODEL above). The width, -w, is a required parameter for all searches and counts as one of the data points. The size of characters (words) in the protocol must also be known and set with -a if this is not 8 bits.
The search function is selected with -s, and message-CRC pairs are given as arguments, each message and CRC combined into one argument. There must not be any non-participating characters between each message and its CRC, or the search will not work. Typically, end-of-message markers do not participate in the CRC.
As non-standard algorithms are comparatively rare, the program first tries all the preset models of the given width, reporting and exiting if one is found. Otherwise it commences a full search. As it proceeds it prints a progress message from time to time, allowing the search to be restarted from that point:
reveng: searching: width=32 poly=0x50000001 refin=false refout=false
If -b or -l are specified, CRC RevEng only searches for algorithms with that bit ordering. Otherwise, it tries RefIn = False, RefOut = False then RefIn = True, RefOut = True. Crossed-endian algorithms are also uncommon and the program will not search for them.
To find the Poly value when Init is not known, at least two arguments must have the same length.
To find both the Init and XorOut values, at least two arguments must have different lengths; otherwise there is only enough information to determine one value, given the other. If all arguments have the same length then, by default, CRC RevEng fixes XorOut at zero and calculates Init accordingly. (In hardware it is easier to set a non-zero Init than to apply a non-zero XorOut.) To set XorOut to another value, specify -x XOROUT; to fix Init and calculate XorOut instead, use -i INIT.
When bit strings are input with -a 1, there is no information on
endianness. In such cases -s returns the big- and little-endian forms
of each algorithm found. The Check values of these forms will differ,
as they are always calculated on the 8-bit UTF-8 string
To restart a stopped search, or to divide a search between several processors, CRC RevEng can be instructed to search within a specified range of generator polynomial values.
The full search space comprises all 'odd' polynomials of the specified WIDTH, that is, polynomials of the form xn + ... + 1. Treating the concatenated coefficients as a binary integer, the range can be up to (but excluding) a specified polynomial, from a specified polynomial upwards, or from one polynomial up to (but excluding) another.
Polynomial range searching is enabled using
] -q QPOLY, where
POLY and QPOLY are hex strings. -p POLY, if given, must precede
-q QPOLY. To start searching at a polynomial, use -p POLY -q 0. To
stop searching at a polynomial (exclusive), use -q QPOLY. To search
between two polynomial values, use -p POLY -q QPOLY.
Range limiting does not apply to the initial check against the preset models, or to Init or XorOut values, which are computed using a fast, efficient algorithm.
For example, to split a 32-bit search into four processes:
reveng -w 32 -q 40000000 -s c98964f6b9 a5fa49f2fd 13370aee7df0 reveng -w 32 -p 40000000 -q 80000000 \ -s c98964f6b9 a5fa49f2fd 13370aee7df0 reveng -w 32 -p 80000000 -q c0000000 \ -s c98964f6b9 a5fa49f2fd 13370aee7df0 reveng -w 32 -p c0000000 -q 0 \ -s c98964f6b9 a5fa49f2fd 13370aee7df0
To continue an interrupted search:
NB: If an either-endian search is stopped while RefIn/RefOut = False then it takes two further command lines to complete the search: one big-endian range search, and one little-endian full search.
reveng -w 32 -p 50000001 -q 0 -b \ -s c98964f6b9 a5fa49f2fd 13370aee7df0 reveng -w 32 -l -s c98964f6b9 a5fa49f2fd 13370aee7df0
The full list of search options is as follows:
CRC RevEng provides a few additional options for convenience:
reveng -w 16 -l -F -s 31816b 32c16a 31326a0a reveng -w 32 -p 04c11db7 -l -s c98964f6b9 a5fa49f2fd 13370aee7df0 reveng -w 32 -l -s c98964f6b9 a5fa49f2fd 13370aee7df0
A comprehensive list is being compiled.
In addition to the disclaimers listed at the top of this file and in the GNU General Public License (see file COPYING), remember that CRC RevEng is merely a search tool, and not authoritative. Searching is only statistical and any particular result may be a fluke, especially from a small number of samples. Also any output is only as accurate as the input.
Model reversal (-V, -v) makes little sense on crossed-endian models.
CRC RevEng lacks a facility to generate code to implement specified
algorithms. Pycrc by Thomas Pircher (2014) is one suitable code
generator; Mark Adler's
crcany source package (2014) also produces
source code when compiled and run, and references the author's CRC
Adler, Mark (15 January 2017). "zlib Home Site" (section "CRC (Cyclic
Redundancy Check) Bonus Information"). Contains links to
Bies, Lammert; et al. "Computer Interfacing Forum" (section "Error detection and correction").
Cook, Greg (20 June 2017). "Catalogue of parametrised CRC algorithms".
Ewing, Gregory C. (March 2010). "Reverse-Engineering a CRC Algorithm". Christchurch: University of Canterbury.
Koopman, Philip (July 2002). "32-Bit Cyclic Redundancy Codes for Internet Applications". The International Conference on Dependable Systems and Networks: 459–468. doi:10.1109/DSN.2002.1028931.
Koopman, Philip (23 January 2017). "Best CRC Polynomials". Pittsburgh: Carnegie Mellon University.
Koopman, Philip; Chakravarty, Tridib (June 2004). "Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks". The International Conference on Dependable Systems and Networks: 145–154. doi:10.1109/DSN.2004.1311885.
Pircher, Thomas (12 December 2016). pycrc. Python based parametrised CRC calculator and C code generator.
Stigge, Martin; Plötz, Henryk; Müller, Wolf; Redlich, Jens-Peter (May 2006). "Reversing CRC – Theory and Practice". Berlin: Humboldt University Berlin.
Williams, Ross N. (24 September 1996). "A Painless Guide to CRC Error Detection Algorithms V3.00".
CRC RevEng came about from the coincidence of four events:
The author would like to thank Dr. Mark Adler, Lammert Bies, Wolfgang Ehrhardt, Greg Ewing, Prof. Philip Koopman, Thomas Pircher, Dr. Ross Williams, and all contributors to CRC RevEng and the CRC Catalogue.
[ Top of page ]