Previously we discussed the current state of symmetric cryptography, and discussed how the security of the ciphertext, no matter how long that is, all pivots on the security of that one secret key. A single key that's used to encrypt and decrypt. We need some mechanism of transmitting our secret key across the internet such that eavesdroppers can't snatch the key away from us and listen to or tamper with messages we secure using that key.
Asymmetric cryptography (or Public Key Cryptography as it's more commonly referred to) is the solution, and was conceptually described by James H Ellis in 1970. Having described it though, they could find no immediate way of implementing it.
Conceptually you want to build a portable safe and a corresponding key. Unusually you hand out your safe to someone you want to receive a message from across a crowd of people. Everyone on the route between the sender and receiver can peek inside the safe, but there's nothing there. The recipient of the safe finally puts a document in the safe and closes it. It’s locked shut. The safe is then handed back down the chain, but no one can open it. Only once it's delivered back can the original sender open the safe with the corresponding key. That’s the theory.
Where we looked back to Julius Caesar for the embryonic beginnings of Symmetric Key Cryptography, we have to peer back to some of the great Greek mathematicians to find a mathematical mechanisms we can use to create this effect for digital communications.
All public key cryptography relies on some mathematical function that is easy to compute, but from the result is very difficult to reverse without some "clue", this is commonly called a "trapdoor function". In our real world example above, the safe lock was the trapdoor function; anyone could close the lock, but only the key holder could unlock it. The squaring function is not a trapdoor function, but it does show how simply knowing the method of creating a result and the result itself does not allow someone to reverse the function to retrieve the original input parameter. You see, while 3 squared is 9, the square-root of 9 is either 3 or -3, so we cannot be sure to know the original input parameter without also knowing its polarity.
The RSA cryptosystem (named by taking the initial letters of the surnames of its inventors, Ron Rivest, Adi Shamir, and Leonard Adleman) is a trapdoor algorithm which uses integer factorisation as its trick. You may remember factorisation from when you were learning about fractions at school, but here's a quick re-cap.
One integer (a number with no fractional part, like 7, 26 or 103) is said to be the factor of another integer if it can evenly divide it into another integer with no fractional part.
So the factors of 18 are 1,2,3,6, 9 and 18. Now, how did we work that out? Well, since 18 is an even number, we know it's divisible by 2. But for all the other factors we had to test, and divide 18 by every other integer from 3 up to and including 9 (since we know 18 divided by 2 is 9, there will not be any other factors of 18 larger than 9). If that division gave us an integer, we wrote it down; if it didn't we moved on to the next number.
This process is ok for numbers such as 18, but as we start going above 100 the mental maths starts to get somewhat tiresome. And in fact while there are a few clever tricks and shortcuts we can use, there are no really easy ways to factor any given number.
Prime numbers are special numbers which are only factored by 1 and themselves. So, 19 is a prime number since there is no number between 1 and 19 (other than 1 and 19) that are its factors.
Making the public key
In the RSA algorithm we choose two prime numbers at random, for our example we'll use 5 and 11 since they are reasonably small. (Why don't you follow along and pick a couple of different prime numbers to use. Make sure they're small, or else you'll be dealing with a smoking calculator!) If we multiply these two numbers together we get 55. 55 is called our "maximum". Next we calculate a value called the totient of the product. This is calculated as (5-1) x (11-1), which is 40.
Next we pick another number, called the public key, between 1 and our totient, which is coprime to our totient. What coprime really means is that it must not share any factors with our totient. So let's pick 23. 23 is a prime number also so we know it won't share any factors, but most importantly it is co-prime (doesn’t share any factors) with our totient. (5, for instance, is prime, but is a factor of 40, so we can’t use that). 23 our public key.
Now to calculate the private key
The next principle we need to discuss before we can continue is the modulus function.
When we divide two integers, sometime they don't divide exactly and we get a fraction. The modulus function filters out the whole part of the division operation and reveals the remainder instead.
The modulus function is sometimes called a "clock function" for the following reason. Say we wanted to calculate 52(mod 12), we'd get a piece of string that is 52 units long. We'd tie one end to the "12" position on a clock, and wrap the string around the clock face (clock wise) until we get to the end of the piece of string. Where the end of the string lands is the solution to 52(mod 12), which is 4.
In our example we need to find a number which when multiplied by 23 gives us a remainder of 1 when divided by 40. Gosh that hard… but thankfully our friend Euclid had a way for us to solve this.
Applying Euclid's Extended algorithm (http://planetcalc.com/3298/) to our public key (23) and our totient (40) we get the coefficient to be 7 (the larger of the two coefficients). So 7 is our private key. This is possibly where I need to gloss over the maths, but for those who are interested, here is Dr Sinead Lyle’s comments ‘In general if you want to use the algorithm to find the inverse of P \mod T then you begin by finding integers a and b such that aT+bP=1. The inverse is then b \mod T as b might be negative. For example, if I wanted to find the inverse of 29 \mod 40, I would begin by calculating 8*40-11*29 = 1. The inverse is then (-11) \mod 40 = 29 \mod 40.
So these are the three numbers we need to perform RSA encryption and decryption (maximum: 55, public: 23 and private: 7)
Now we are ready to do our encryption:
- Think of any number you like as our plaintext (I'll pick 9). Multiply it by itself 23 times (I get 8,862,938,119,652,501,095,929). Apply modulus 55 (I get 14… so 14 is my ciphertext) (There are other mathematical tricks such as modulating each step or using the repeated squaring method which help to simplify the modulation, but for the sake of your sanity, I suggest relying on your calculator to manage the big numbers and go the long way round!)
- Now take your ciphertext value and multiply it by itself 7 times (I get 105,413,504). Apply modulus 55 (and I'm back to 9… Whoop! I recovered my plaintext)
It's not possible to recover the private key knowing only the public key and the modulus, and in fact the only way to brute-force to achieve this is to factor out these numbers, a task we've already seen to be quite exhausting.
The numbers we picked initially (5 and 11 in my example) were quite small, but in modern implementations of the RSA algorithm these numbers are significantly larger. The internet standard is to use public keys that are 2048-bits long, which would make them about decimal 600 digits long if we were to write them out! Too big for my head to work with. But modern computers can chew through the reasonably simple integral and modular mathematics and huge numbers to create the keys and perform encryption and decryption without too much effort. Sure key generation can take some time, and it did in our example too, but encryption and decryption are not too labour intensive, even with the massive numbers in use.
The take away from this though is that the entire system is now reliant on the secrecy of that private key. Distribute your public key to everyone, they can then encrypt messages intended for you, and then only you (with your ultra-secret private key) can decrypt those messages.
In practice what happens is that the plaintext message is actually a symmetric key. Encrypting bulk data using RSA is very time consuming, so we tend to encrypt our data with a symmetric cryptosystem, and then encrypt the symmetric key with the public key of the recipient. We deliver the encrypted message along with the encrypted form of the symmetric key, safe in the knowledge that both are secured. Once delivered, this means that only the person with the corresponding private key can recover the symmetric secret key, and thus do both parts of the decryption necessary to recover the message.
It's common to use 2048-bit keys because the RSA key is then the same length as the symmetric key (in AES-256 that's… 256 bytes, which is (256 x 8) 2048 bits). Using a shorter RSA key than the symmetric key means you have to encrypt the symmetric key in chunks… which can reveal information about the private key, so is something you would probably prefer to avoid.
The mathematics in this section have been checked and commented on by Dr Sinead Lyle, Senior Lecturer in Mathematics at the University of East Anglia. I’ve tried to include her comments where doing so holds true to the goal of remaining accessible to the expected audience. I’d like to thank and credit her for the time she took in correcting some of the ‘glossing’ I’ve done in this post, and correcting some of my elementary mistakes.
That was quite an extensive discussion on RSA, and so next time we'll have a look at how we can use RSA to build a network of trust. We will also look at an alternative mechanism for establishing a common symmetric key over an insecure medium, that does not rely on any private keys… curious? Sharpen your pencils and recharge your calculator ready for the next post.