LABTITLE=: 'Public Key Encryption' LABAUTHOR=: 0 : 0 David Ness 16 May 2001 ) LABCOMMENTS=: 0 : 0 The purpose of this lab is to introduce Public Key Encryption. Topics covered are: 1) Where We'll End Up 2) The needed mathematical functions; 3) Diffie-Hellman Key Exchange 4) Experimenting with PKE 5) The Mathematics of PKE 6) References ) LABERRORS=: 1 LABWINDOWED=: 0 p =: p: 750000x 1250000x Radix__ =: */p SetRdx =: 3 : 0 if. y. do. Rdx =: y. P__ =: Rdx&|@+ M__ =: Rdx&|@- T__ =: Rdx&|@* PW__ =: Rdx&|@^ end. Rdx ) D__ =: <.@% ModRecip =: 4 : 0 p =. 0 1 v =. x.,y. s =. 0 while. 1~:1{v do. p =. }. p,(0{p) P__ (D__/v)T__ 1{p v =. }. v,x.|y. * 1{p if. 0 = _1{v do. _. return. end. if. s =. 1 - s do. v =. (x. M__ 1{v) 1}v end. end. if. s do. x. M__ 1{p else. 1{p end. ) SetRdx Radix__ Secret__ =: 12131939x Public__ =: (*/p - 1) ModRecip Secret__ NB. ========================================================= Lab Chapter Where We'll End Up NB. ========================================================= Lab Section An Excursion While writing the end of this lab, it occurred to me that it might be wise to start it with a quick visit of the conclusion. This may help motivate and explain some of the steps that we later take on the journey. ) NB. ========================================================= Lab Section Some Assertions Here's an overview of how and why Public Key Encryption works. Later we'll get the details straight. Assertion 1: If p and q are large primes, it is possible to 'publish' p*q without exposing either of the factors. We call p*q 'Radix' in this discussion. Assertion 2: If you know p and q, it's easy to choose a large number, call it 'Secret' that is relatively prime to Radix. Assertion 3: It's easy to calculate a number, call it 'Public', such that Public * Secret = k * (p - 1) * (q - 1) + 1 for some integer k. Here are a Radix, Secret and Public that we have generated to experiment with. We have also defined P, M, T, PW and D 'operators' that do modular arithmetic calculations for +, -, *, ^ and %. ) Radix Secret Public NB. ========================================================= Lab Section A Couple of Points A couple of points should be made. First, we have chosen small primes for this early examle. The idea is to keep the numbers from being dozens or hundreds of digits long. For this part of the exercise we'll restrict our numbers to be in the neighborhood of 1e14 or smaller. Second, one has to be careful when working here to remember to type `x' at the end of most numbers. We are often dealing with numbers that are large, and it is important that they be exact integers, not floating point approximations. If something seems to go wrong with a calculation, look for this problem first. ) NB. ========================================================= Lab Section More Assertions Assertion 4: It is possible to expose Radix and Public without anyone being able to infer p, q or Secret. Try to figure out how you'd get Secret if all you knew was Radix and Public. Assertion 5: It is possible to calculate, and expose, Publish = Message^Public (mod Radix), without anyone being able to infer anything about Message from Publish (without knowledge of Secret, that is). Assertion 6: Publish^Secret = Message (mod Radix), i.e. with knowledge of Secret it is trivial to recover the Message. Try it. You'll like it. ) ]Message1 =: 11122322111x ]Publish1 =: Message1 PW Public ]Message2 =: 111111111x ]Publish2 =: Message2 PW Public ]Message3 =: 2x ]Publish3 =: Message3 PW Public NB. Now decoding: Publish1 PW Secret Publish2 PW Secret Publish3 PW Secret NB. ========================================================= Lab Section Summary So that is Public Key Encryption in a nutshell. Look at a summary of the Messages and their Published equivalents. Some experimentation should help convince you that there's 'nothing to see there' unless you know Secret, and then you can see it all very easily ) 3 2$Message1;Publish1;Message2;Publish2;Message3;Publish3 NB. ========================================================= Lab Chapter Introduction to Public Key Encryption NB. ========================================================= Lab Section Introduction Working with PKE (Public Key Encryption) can be interesting. This lab is designed to help understand PKE and the mathematics that supports it. The lab starts by introducing some functions that will be needed for performing important calculations in this domain. ) NB. ========================================================= Lab Section J makes the calculations needed to experiment with PKE easy. First, it makes performing computations with very long integers easy. This is important because in this domain, the approximations that would be introduced by any 'floating point' representation would be intolerable. All calculations must be exact, and in practical circumstances they often involve numbers that are dozens or hundreds of digits in length. Second, modular arithmetic is needed for performing PKE calculations. J makes defining modular operations very simple. ) NB. ========================================================= Lab Section Modular Arithmetic When dealing with PKE, calculations use radix arithmetic. This lab assumes some knowledge of radix arithmetic, so the treatment here will be rather cursory. In radix (or modular) arithmetic the integers are from the range i. Radix Addition, subtraction, multiplication and exponentiation all work very simply (and quite parallel to their usual function) except that after calculating the 'normal' result, it is reduced mod the Radix back to an integer from i. Radix Division is rarely used, and is simply defined to be the integer floor of the regular quotient. Here we create a function SetRdx Radix that defines P (plus), M (minus), T (times), and PW (power) operators that represent +, -, *, and ^, but where the result of the operation is reduced mod Rdx ) SetRdx =: 3 : 0 if. y. do. Rdx =: y. P =: Rdx&|@+ M =: Rdx&|@- T =: Rdx&|@* PW =: Rdx&|@^ end. Rdx ) D =: <.@% NB. ========================================================= Lab Section Let's experiment. Setting a radix of the 1000th prime (which happens to be 7927), we can calculate 5000 + 5000 in modular form and observe that it is the same as 2 * 5000 and (5000 + 5000) mod 7927 in conventional arithmetic. ) p: 1000 SetRdx 7927 5000 P 5000 2 T 5000 (5000 + 5000) - 7927 NB. ========================================================= Lab Section So now we have a way of generating computations in modular arithmetic. Let's apply this to the Diffie-Hellman `Key Exchange' algorithm to gain some practice doing things with these functions. ) NB. ========================================================= Lab Chapter Diffie-Hellman Key Exchange NB. ========================================================= Lab Section The Problem Diffie and Hellman tackled what I find to be a fascinating, yet very simple, problem. This problem is: Two people want to 'share' a key (i.e. they want to agree about it), but 1) they want it to be secret; 2) all of their communications about it are 'public' (so they can't just 'write it down' and hand it to the other guy); 3) it doesn't matter what the key is, so long as both agree. Diffie and Hellman were able to solve this problem with a clever algorithm and some modular mathematics. ) NB. ========================================================= Lab Section The Solution Remember, all communications are public. So, in public, Diffie and Hellman (we might as well use them as our protagonists, they won't mind) agree on, and 'publish' two numbers: 7927 and 1111 The only property required is that these numbers be relatively prime. We call these numbers n and g, and we set up to perform our calculations mod n. The role of g will be clear in a moment. ) n =: 7927 g =: 1111 SetRdx n NB. ========================================================= Lab Section Now, Diffie picks a number that he keeps secret. Say it is 1213. We call this number 'DiffieSecret'. He calculates: g PW 1213 NB. g PW DiffieSecret and announces this number to the world 6415 We call it DiffiePublic. ) DS =: DiffieSecret =: 1213 ]DP =: DiffiePublic =: g PW DiffieSecret NB. ========================================================= Lab Section Hellman performs a similar task. He also picks a secret number: 514. Then he performs the g PW 514 NB. g PW HellmanSecret calculation and announces 4045 to the public. ) HS =: HellmanSecret =: 514 ]HP =: HellmanPublic =: g PW HellmanSecret NB. ========================================================= Lab Section Now, let's recapitulate. Everyone (Diffie, Hellman and _all_ the rest of us) knows four facts: n, g, DiffiePublic and HellmanPublic However, assuming (and we will discuss this in detail in a moment) _only_ Hellman knows HellmanSecret and only Diffie knows DiffieSecret, we will show that they can 'share' a number that is not discernible by those who only have these four 'public' facts. ) n;g;DiffiePublic;HellmanPublic NB. ========================================================= Lab Section Hellman calculates: DiffiePublic PW HellmanSecret which he can do because he knows the radix (it is public), DiffiePublic (it is, too) and he (and _only_ he) knows HellmanSecret. Diffie, in a symmetric situation, calculates: HellmanPublic PW DiffieSecret And voila! Both 'know' 2389: 1) without ever mentioning it to one another; and 2) in a way that no one else can infer it (the rest of the world knows neither of the Secret numbers) ) DiffiePublic PW HellmanSecret HellmanPublic PW DiffieSecret NB. ========================================================= Lab Section Analysis But our analysis can't stop there. We have to show, somehow, that it is not possible to infer the secret numbers. In fact, for numbers of the size we have used in this example, it's actually quite easy to infer these numbers, and a little analysis of how we do that can show why this algorithm is really effective if the numbers used are moved into a sensible range. ) NB. ========================================================= Lab Section In J, it's easy to get DiffieSecret from DiffiePublic. For example: (# i.@#) DiffiePublic = g PW i. Rdx yields 1213 5716 one of which is DiffieSecret Similarly (# i.@#) HellmanPublic = g PW i. Rdx produces 514 4477 thus exposing HellmanSecret ) (# i.@#) DiffiePublic = g PW i. Rdx (# i.@#) HellmanPublic = g PW i. Rdx NB. ========================================================= Lab Section Unimpressed? You should be until we look a little more closely. Think of how we discovered the secret. We performed 'Rdx' PW operations. This is all fine and dandy if Rdx is a reasonable number, but J makes it easy for us to study what happens as we increase the size of Rdx. ) NB. ========================================================= Lab Section There are two 'costs' to analyze. One is the 'cost' to Diffie (or Hellman) of generating their Public numbers. The other cost is the cost to us (the outsiders) of 'inferring' the Secret code given the public code. Let's use 6!:2 to time these functions for different sizes of modulus. [Note: we could analyze for different g as well, but a little preliminary analysis shows that the value of g has _no_ substantial influence on the time consumed in the calculations. This is left as an exercise to the student, if you are in doubt.] ) NB. ========================================================= Lab Section Let's introduce a timer and run it to generate some times. The timer will generate a key (Diffie and Hellman's 'cost') and it will generate a solution (our cost for code breaking). We build an array of the results. ) Result =: 0 3$0#0 Time =: 4 : 0 SetRdx y. GenPub =: 6!:2 'PubKey =: g PW x.' GenSol =: 6!:2 '(# i.@#)PubKey = x. PW i. Rdx' Result =: Result,y.,GenPub,GenSol 0#'' ) NB. ========================================================= Lab Section Calculation Building some results shows us the behavior that we might expect. The time taken by Diffie or Hellman to generate their Public numbers is essentially independent of the size of the modulus, but the time taken to crack the solution rises dramatically as the size of the modulus increases. [Depending on the speed of your computer, this calculation may take a while. Please be patient.] ) 12343&Time each p: 1000 2000 3000 12343&Time each p: 10000 20000 30000 12343&Time each p: 100000 Result NB. ========================================================= Lab Section Results Even with the small amount of data we have generated, it seems clear that the search time to break the code is rising worse than linearly. In the range studied here, and running on one of my slower machines, the time it takes to evaluate a modulus of size n is about n % 40000 NB. seconds with the denominator decreasing as n increases. At this rate it take about 1 min to evaluate a modulus in the neighborhood of 2.5E6. This would imply an hour to look at 1.5E8, a week for 2.5E10, a year for 1E12, and a century for a modulus of scale near 1E14 etc. ) ]f=:%/0 2{|:Result NB. ========================================================= Lab Section So here we choose a modulus that is in the 1E48 range. If our previous analysis is reasonable, this should take on the order of 10^34 centuries to crack. Long enough, one might think, to be some discouragement. And how about using it to calculate the public number equivalent to a secret number. Let's redo our earlier work with this new n and a new g (chosen rather arbitrarily) and some fairly random (large) secret keys. ) SetRdx n =: 1234567898765432123456789876543212345678987654477x g =: n D 123454321x NB. Some 'random' number DiffieSecret =: 99887766775544221122334455566554433242232x HellmanSecret =: 98789878987899878775323243434324325343x NB. ========================================================= Lab Section Now comes the 'proof of the pudding'. We calculate the public codes at the cost of about %8 of a second in J. And our protagonists now share an exact number (in this case approximating 6.6E47) as was required. ) 6!:2 'DiffiePublic =: g PW DiffieSecret' 6!:2 'HellmanPublic =: g PW HellmanSecret' Hellmans =: DiffiePublic PW HellmanSecret Diffies =: HellmanPublic PW DiffieSecret Hellmans -: Diffies Diffies NB. ========================================================= Lab Section Closure However, we should be clear that while this clearly demonstrates the practicality of what has been done here, it does not prove that we scheme can't be cracked. We only introduced one simple algorithm (exhaustive search) and then proved that such an algorithm doesn't work. This obviously does not prove that there isn't some other approach that might work. However, if you find one that materially affects the timing, then you probably have a publishable article in the waiting. ) NB. ========================================================= Lab Chapter Public Key Encryption NB. ========================================================= Lab Section Introduction The analysis of the Diffie-Hellman key exchange relates directly to, but is not the same as, 'Public Key Encryption'. However, our route to understanding the functioning of PKE should be eased by the exercise we have just finished. Public Key Encryption encompasses two separate, but related, problems: (1) Being able to make authenticating 'signatures'; and (2) Encrypting messages. The technology discussed here can be used for both purposes. ) NB. ========================================================= Lab Section The Mathematics The mathematics of PKE relies on some ancient results that I will not 'prove' in the course of this Lab. A good book on finite mathematics would probably do it well. The mathematics that we need involves the behavior of a very specific kind of modulus, namely one that is the product of exactly two factors. As we did with the Diffie-Hellman analysis, let's look at this case with some very small numbers that will be easy to work with, and then later make the numbers of more useful size. ) NB. ========================================================= Lab Section Let's choose two primes, and we'll make them _very_ small for this first go round at analysis. We also need the concept of a 'reciprocal'. In the world of modular arithmetic this is somewhat different than our usual notion of reciprocal, although its definition is similar. Let's call Rdx ModRecip n the reciprocal of n with respect to modulus Rdx. This means that x T (Rdx ModRecip x) = 1 NB. (mod Rdx) ) primes =: 31 47 ModRecip =: 4 : 0 p =. 0 1 v =. x.,y. s =. 0 while. 1~:1{v do. p =. }. p,(0{p) P (D/v)T 1{p v =. }. v,x.|y. * 1{p if. 0 = _1{v do. _. return. end. if. s =. 1 - s do. v =. (x. M 1{v) 1}v end. end. if. s do. x. M 1{p else. 1{p end. ) NB. ========================================================= Lab Section Make 59 the modulus for a moment. We then calculate the reciprocal of 28 and see that it is 19. We then observe that 1 -: 19 T 28 NB. mod 59 as is required for it to be the reciprocal. ) SetRdx 59 59 ModRecip 28 1 -: 19 T 28 NB. ========================================================= Lab Section Now on to the key theorem. Setting the modulus to the product of two factors, we can also calculate a number Unk, Unk =: */primes - 1 and choose an arbitrary number PubKey (it must be relatively prime to */primes) and calculate: PrivKey =: Unk ModRecip PubKey So far so good. ) SetRdx */primes Unk =: */primes - 1 PubKey =: 317 ]PrivKey =: Unk ModRecip PubKey NB. ========================================================= Lab Section But now the interesting result. Calculate 29 PW PubKey and we get 922. What's 275? try 275 PW PrivKey Lo, and behold, we get 29 back. Indeed, try any (1234 PW PubKey) PW PrivKey J lets us conveniently exhaustively search a space as small as the one relevant here with (i. Rdx) -: ((i. Rdx)PW PrivKey) PW PubKey ) 29 PW PubKey 275 PW PrivKey (1234 PW PubKey) PW PrivKey (i. Rdx) -: ((i. Rdx)PW PubKey) PW PrivKey NB. ========================================================= Lab Section There is some interesting mathematics behind this calculation. We will discuss it further in the next Chapter. The result we have just obtained is a result of 'Euler's generalization of Fermat's little theorem'. This theorem suggests that x = x PW 1380 for all x which are relatively prime to 1427. That is to say, in somewhat more general terms, x = x PW Unk NB. (mod Rdx) So, in the case we have just been considering we get: (10#1) -: (1 2 3 4 5 6 7 8 9 10) PW 1380 We'll 'do the math' on this theorem in the next chapter. ) (10#1) -: (>:i.10) PW 1380 NB. ========================================================= Lab Section Sending a Message For the purposes of this discussion assume that a 'message' is a number in the range 0 ... <:Rdx. Let's say I 'publicize' my modulus and PubKey. Now someone wanting to send me a message can calculate pub =: message PW PubKey NB. (mod modulus) and transmit this to me. If someone else gets this message it means nothing to them, but I can calculate: pub PW PrivKey NB. (mod modulus) and I will recover the 'message'. This allows private, encrypted conversation to take place even if the channel involved is public. ) message =: 1111 ]pub =: message PW PubKey ]pub PW PrivKey NB. ========================================================= Lab Section Note that I can also take a message that I want to publish and calculate publish =: mymess PW PrivKey NB. (mod modulus) Anyone receiving this message can take my published PubKey and modulus and calculate publish PW PubKey NB. (mod modulus) Since the message decodes with my PubKey the recipient has validated that the message came from me, I'm the only one that could have encoded it to a key that will decode with my PubKey. This corresponds to 'signing' the message, and so long as you 'know' that what purports to be my 'public key' is really mine, you can be sure I did generate the message in question. ) mymess =: 711 ]publish =: mymess PW PrivKey publish PW PubKey NB. ========================================================= Lab Section The Usual Caveat Of course this is ok, but I hear you saying, 'I'd have no trouble calculating the PrivKey, given the modulus and PubKey' because we could just calculate Unk by using J to factor Rdx as in tmp=:*/(q:Rdx)-1 from which I can then calculate tmp ModRecip PubKey recovering PrivKey. ) ]tmp=:*/(q:Rdx)-1 tmp ModRecip PubKey NB. ========================================================= Lab Section But, this is just a parallel of the situation we were in with respect to the Diffie-Hellman inversion. It's easy for J to calculate q: when the numbers involved are of reasonable size, but if we make them _big_ even J can't help us with the factoring task. The size of the factoring problem rises with the size of the numbers involved, and while we can factor 35 in our head, and J can factor 1457 for us, even J has trouble when the numbers are dozens or hundreds of digits long. ) q: 35 q: 1457 NB. ========================================================= Lab Section An Example We do the pertinent calculations. We have chosen some large primes, and they produce a modulus that's about 110 digits. Notice, by the way, that while J may have not responded instantaneously (this clearly depends on the speed of your computer), all of this is really quite tolerably speedy. ) p1 =: 1644106683470046030365874053686206256832866505723642231489x p2 =: 80844284208043284480667106756687223457372344211571009x primes =: p1,p2 SetRdx */primes Unk =: */primes-1 PrivKey =: 10#. x:(? 43#9),3 NB. Generate a random 44 digit Private Key PubKey =: Unk ModRecip PrivKey NB. ========================================================= Lab Section Just to prove the point about factoring, asking J to 'factor' the current Rdx results in a fairly long computation which terminates in a 'nonce error', indicating that J was unable to factor the number in question. This is not to say that the product of two large primes cannot be factored. However, at the moment no one knows a constructive way of doing this, and a lot of hard work has been going on for at least a decade attempting to do it, so far without significant useful results. So we're probably safe for a while, after all it took centuries to crack `Fermat's Last Theorem' and the 4-color problem. ) q:Rdx NB. ========================================================= Lab Section Let's close with a couple of simple functions that will let us code and decode a real message. J lets us make them pretty straightforward. Now when I say something like pub=:103930904746043676664663162719862894693425577481721794668062509906195950848050361100388206415598844156707111376x you can just apply decode pub PW PubKey And you'll not only 'get the message' you'll _know_ it came from me, as my PubKey decoded it. ) code =: 3 : '256#.x:a.i.y.' decode =: 3 : '(256#.^:_1 y.){a.' message =: 'I hope this was useful. David Ness' ]pub =: (code message) PW PrivKey decode pub PW PubKey NB. ========================================================= Lab Section Concluding It is probably worth noting that one doesn't learn much from the messages unless you are able to decode them. For example, creating a 'size' function so that we can 'see' the numbers (they are too long to give any feeling by just looking at them), we see that the coded versions of 2, 3, 4, 5 and 6 don't 'look' much like they relate in any particular way. Remember also in looking at messages that there are 10 times as many messages of magnitude 1E109 as there are of magnitude 1E108, etc. so do not be surprised by the fact that most messages encode into 'large' numbers. ) size =: 3 : 'x:^:_1 y.' size each (2+i.5) PW PrivKey NB. ========================================================= Lab Chapter The Math of PKE NB. ========================================================= Lab Section The Overview The idea of Public Key Encryption relies on some very interesting principles. Let's start by understanding their role in a sequence of steps. First, we start with something that we can make public without exposing its complete structure. Assuming we can do that, the second step is to generate something that depends on our knowledge of the structure which cannot be directly inferred from what we have made public. Now consider a number which we construct by multiplying two primes that we have chosen. This combination satisfies our first criterion. We can expose the product and yet not expose the parts. If we can then exploit our knowledge of the parts PKE becomes possible. Notice that this argument relies on the fact that it will not be possible for someone who only has knowledge of the product to factor it into its component parts. Factoring, as we understand it today, is a process that demands search, and if the numbers involved are huge, the problem is quite intractable. ) NB. ========================================================= Lab Section Exploiting the Factoring How could we expoit our knowledge of the factoring to accomplish some useful objective? Euler's generalization provides us one way. We will discuss this in detail in a moment, but suffice it for now to observe that there is a concept called the totient of a number for which x PW totient n = 1 NB. mod n How is this useful? There is a way. Let's say we choose a number, call it Pub. It is a straightforward matter to calculate the reciprocal of Pub mod totient of n. Let's call it Priv. Notice that while we had no trouble generating Priv from Pub, we relied on having totient(n). We will show that calculating totient(n) is easy if you know the `structure' of n, and difficult otherwise. ) SetRdx n =: 1457 Totient1457 =: 1380 NB. Of course later we'll figure out why 3 PW Totient1457 129 PW Totient1457 Pub =: 137 ]Priv =: Totient1457 ModRecip Pub NB. ========================================================= Lab Section Before investigating totient(n), let's go forward to our final goal. First, we 'publish' Rdx and Pub, making these available to the public. If we accept the fact that totient(Rdx) can't be inferred, this is not sufficient information to construct Priv. Now assume someone wants to send us a secret message. If they calculate message PW Pub NB. mod Rdx consider (message PW Pub) PW Priv This is message PW (Pub T Priv) ) message =: 777 ]send =: message PW Pub (message PW Pub) PW Priv Pub T Priv message PW (Pub T Priv) NB. ========================================================= Lab Section Euler's generalization is the one important mathematical element in the preceeding presentation where we pretty much just 'waved our hands', assumed the result, and proceeded. Now it's time to face the music and try and make some sense of the mathematics involved. Euler's generalization involves a very interesting function called `Euler's Totient Function' The totient of ) NB. ========================================================= Lab Chapter References NB. ========================================================= Lab Section References The Diffie-Hellman scheme described here is adapted from Bruce Schneier's 'Applied Cryptography', particularly the chapter on 'Key-Exchange Algorithms' that begins on Page 515 in the Second Edition. )