How CRC (Cyclic Redundancy Check) Works

CRC (Cyclic Redundancy Check)

I'm not gonna talk about definitions here. It has been well documented, well, millions of times I guess. What I'm going to talk about here is the logic behind CRC and why certain arithmetic approach is adopted to calculate CRC.

What is XOR

Still not a definition. Come to think about what XOR does is to find out the difference between two entities. And entity may be a number, may be a frame transmit through network. For clarification, I borrowed the picture from accu.org (full url can be found in reference, I really recommend everybody to read it) to show XOR's capability.

This is what logic AND does, to find the same part which is the intersection.

https://accu.org/journals/overload/20/109/lewin_1915/0.png

This is what OR does. To find the union of two entity. It's an either...or... situation where you pick whatever is available.

https://accu.org/journals/overload/20/109/lewin_1915/1.png

And this is for XOR. As mentioned above, it is to find the difference between the two circles, or entities, or sets, name it whatever.

https://accu.org/journals/overload/20/109/lewin_1915/2.png

You know how to efficiently swap two numbers.


a = a ^ b; b = a ^ b; a = a ^ b;

This can be think of like this.

First find the difference between a and b and store the difference in a.

Then, search in every bit of b with the difference bits, cut them out and store the result in b. What's b without the difference with a, it's a. So a is swapped into variable b now.

The last step, bear in mind that the value in variable b is a, find out all the different bits in a (those different from b), and cut them out, what's left is the value of b.

Two numbers are swapped.

Why XOR

It is really counter intuitive first time I read about the bytes and bits behind XOR. This is so fascinating about the computer world that things are going to melt your brain, at any moment. You never know. You stay foolish.

Right back at the topic. To answer why utilize XOR in CRC is pretty simple. Because CRC is designed to check if the frame received is identical to the one sent. Another saying would be, to find the difference between any one bit that's been altered or tampered that cause the final frame different from the one sent (CRC is only good for one-bit-incorrect only,  need to work on this later).

As mentioned before, XOR is capable of finding the difference between two entities, thus it is what we look for.

Now, to further understand how CRC works. Some concepts should be talked about.

Frame Check Sequence

The Frame Check Sequence, or FCS, will be appended to the end of transmitting message to check if the message is corrupted.

As we'll see later, the FCS is calculated with Modulo 2 Arithmetic and the length of FCS is corresponding to the highest polynomial degree of the generating polynomial, which all of them will be discussed below.

Modulo 2 Arithmetic

XOR has this Modulo-2 arithmetic explanation. And modulo 2 arithmetic is used in final calculation of the FCS (Frame Check Sequence).

Blah! First things first. Let's come to see the theory, then explain them a bit.

Addition Equals to Subtraction Equals to XOR

In modulo 2 arithmetic, addition is equal to subtraction and , is equal to the result of XOR. HA!

Rules for addition:


0 + 0 = 0 0 + 1 = 1 1 + 0 = 1 1 + 1 = 0

This is pretty basic, since the result is the same as applying XOR on the two bits.

Now, if 1 + 1 = 0, then 0 - 1 = 1, which we can deduce. Hence,


0 - 0 = 0 0 - 1 = 1 1 - 0 = 1 1 - 1 = 0

Pretty interesting! Now, mark this in mind. Subtraction is addition, which is XOR.

Polynomial Representation of Data

Think of the data we want to transmit as 111010, the data can be represented by 1 * x^5 + 1* x^4 + 1 * x^3 + 0 * x^2 + 1 * x^1 + 0 * x^0, hence x^5 + x^4 + x^3 + x.

The power starts with 0 from the least significant bit all the way to the most significant bit. This is also true with generating polynomial.

Generating Polynomial

The generating polynomial can be seen as a "key" to calculate the FCS.

For example, consider 11001 is the generating polynomial for calculating FCS when sending data 111010. Then, the length of final calculated FCS will be 4 bits long. Since the highest polynomial degree of the generating polynomial is 4 (x^4 + x^3 + 1).

Generating polynomial can be any bits, but there're certain best practices to follow. Check it out here!

Long Division

Long division is the last step in calculating FCS. It requires the same bits be selected from the data starting with the highest bits, to do an XOR or subtraction with the generating polynomial. If the length is not long enough to match the generating polynomial, then a 0 is added to the result. Otherwise, it is a 1. Keep doing this until you have no bits left in the data, then the remainder left-padded with enough 0 to match the length of the highest polynomial degree of the generating polynomial is the final FCS to be appended to the data, and send to the recipient.

Video Link

I'm not going to show the actual steps in calculating the FCS. Here is the video tutorial to calculate it. It's well explained.

Final Thoughts

Just another interesting bit I encountered when reading TCP/IP Volume 1. XOR has lots of applications, read the article in the first reference link. It's good.

References

https://accu.org/journals/overload/20/109/lewin_1915/

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.5.5027&rep=rep1&type=pdf