Civilization is the progress toward a society of Privacy. The savage's existence is public, ruled by the laws of his tribe. Civilization is the process of setting man free from men.
- Ayn Rand, The Fountainhead
By FoxThree (foxthree@antionline.org)
Abstract. In this paper, I hope to touch upon the basics of Cryptography (hereafter referred to as Crypto). Topics covered include basic Crypto terminology, basic Crypto algorithms, Crypto hash functions, Crypto Random Number Generators (CRNG), cryptanalysis, file verification methods including Digital Signatures, and (everyone's favorite) launching crypto attacks. A brief mention is made of Crypto Protocols.
Keywords: Cryptography, Crypto Algorithms, Crypto Protocols, Digital Signatures.
Table of Contents
- Introduction
- Basic Terminology
- Basic Cryptographic Algorithms
- Symmetric Ciphers
- Asymmetric Ciphers
- File Verification Technologies
- Cryptographic Random Number Generators
- Strength of Cryptographic Algorithms
- Cryptanalysis and Attacks on Crypto systems
- Cryptographic Protocols
- Conclusion
1. Introduction
The advent of modern computing has had a major impact on our current-day society. Suddenly, we find that computing power is not just a privilege of major corporations and universities, but is within the reach of the common man. This portends both good and evil. The good side is that the application of such a revolutionary tool to various fields dramatically improves Quality of Service (QOS). The down side is that people just do not seem to realize what they are up against. What was once kept inside a bank's vaults is now available on electronic media, and the systems that hold it are thrown open to public networks. Many of you will agree with me when I say that one of the current-day concerns is Secure Computing, and many more of you will agree when I say that these concerns are no longer considered paranoid. The need for security is very real.
But, then how do we tackle the issue of security? The answer to this question lies in the field of Mathematics called Cryptography. Cryptography deals with (A) All aspects of secure messaging, (B) Authentication, (C) Digital Signatures and (D) E-Money.
As we move into an information society, the technological means for global surveillance of millions of individual people are becoming available. Cryptography has become one of the main tools for privacy, trust, access control, electronic payments, corporate security, and countless other fields.
Cryptography is no longer a military matter that should not be messed with. It is time to demystify cryptography and make full use of the advantages it provides for modern society. Widespread cryptography is also one of the few defenses people have against suddenly finding themselves in a totalitarian surveillance society that can monitor and control everything they do.
2. Basic Terminology
Suppose that Tom wants to send a message to Jerry. Tom opens his message editor, punches in the message and sends it across to Jerry. However, there is a possibility that someone else along the way has read the message or altered its contents.
In Crypto terminology, the message is referred to as plain text or clear text. So, what does Tom do? Tom - intelligent as he is - encodes the message. Encoding the contents of a message in such a way that its contents are hidden from outsiders is called encryption. The encrypted message is called the ciphertext. The process of retrieving the plaintext from the ciphertext is called decryption. Encryption and decryption usually make use of a key, and the coding method is such that decryption can be performed only by knowing the proper key.
Cryptography is the art or science of scrambling data so that none other than the desired parties can decipher it. In our case they happen to be Tom and Jerry. Cryptanalysis is the art of breaking ciphers, i.e. retrieving the plaintext without knowing the proper key. People who do cryptography are cryptographers, and practitioners of cryptanalysis are cryptanalysts. Cryptology is the branch of mathematics that studies the mathematical foundations of cryptographic methods.
3. Basic Cryptographic Algorithms
Broadly, we can divide Crypto Algos into two classes:
- Algos which rely on the secrecy of the Algorithms used
- Algos which rely on keys for encryption and decryption
Of the above, the algorithms which rely on the secrecy of the algorithms used for encryption and decryption are only of historical interest and are not adequate for real-world needs. This is so because current technology permits even a relative novice to disassemble or reverse engineer the code.
Key-based encryption algorithms, by contrast, add a new dimension to the whole game. A message can be decrypted only with the proper decryption key. The key used for decryption can be different from the encryption key, but for most algorithms they are the same.
There are two classes of key-based algorithms, Symmetric (or secret-key) and Asymmetric (or public-key) algorithms. The difference is that symmetric algorithms use the same key for encryption and decryption (or the decryption key is easily derived from the encryption key), whereas asymmetric algorithms use a different key for encryption and decryption, and the decryption key cannot be derived from the encryption key.
Generally, symmetric algorithms are much faster to execute on a computer than asymmetric ones. In practice they are often used together: a public-key algorithm is used to encrypt a randomly generated encryption key, and that random key is used to encrypt the actual message using a symmetric algorithm. This philosophy is illustrated in Fig. 1.
4. Symmetric Ciphers
Symmetric algorithms can be divided into stream ciphers, block ciphers and ciphers based on cryptographic hash functions. Stream ciphers can encrypt a single bit of plaintext at a time, whereas block ciphers take a number of bits (typically 64 bits in modern ciphers), and encrypt them as a single unit.
How a stream cipher works is pretty straight forward, but then how does a block cipher work?
How does a Block Cipher work?
Block ciphers take a fixed-size block of data (usually 64 bits) and transform it into another 64-bit block using a function selected by the key. The cipher essentially defines a key-dependent one-to-one mapping (a permutation) on the set of 64-bit blocks.
If the same block is encrypted twice with the same key, the resulting ciphertext blocks are the same (this method of encryption is called Electronic Code Book mode, or ECB). This information could be useful for an attacker.
In practical applications, it is desirable to make identical plaintext blocks encrypt to different ciphertext blocks. Two methods are commonly used for this:
- CFB mode: a ciphertext block is obtained by encrypting the previous ciphertext block and XORing the resulting value with the plaintext.
- CBC mode: a ciphertext block is obtained by first XORing the plaintext block with the previous ciphertext block and then encrypting the resulting value.
The previous ciphertext block is usually stored in an Initialization Vector (IV). An initialization vector of zero is commonly used for the first block, though other arrangements are also in use.
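The difference between ECB and CBC can be sketched in a few lines of Python. The block_encrypt function below is a keyed-hash stand-in, not a real block cipher (it is not even invertible, so decryption is omitted), but it is enough to show that ECB repeats ciphertext blocks while CBC does not:

```python
import hashlib

BLOCK = 8  # 64-bit blocks, as in the ciphers described above

def block_encrypt(key: bytes, block: bytes) -> bytes:
    # Keyed-hash stand-in for a real 64-bit block cipher (illustration only).
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ecb_encrypt(key: bytes, blocks: list) -> list:
    # ECB: every block is encrypted independently.
    return [block_encrypt(key, b) for b in blocks]

def cbc_encrypt(key: bytes, blocks: list, iv: bytes = bytes(BLOCK)) -> list:
    # CBC: XOR each plaintext block with the previous ciphertext block
    # (the IV for the first block), then encrypt the result.
    prev, out = iv, []
    for b in blocks:
        prev = block_encrypt(key, bytes(x ^ y for x, y in zip(b, prev)))
        out.append(prev)
    return out
```

With two identical plaintext blocks, the ECB output repeats itself (useful information for an attacker), while the CBC output does not.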
- Data Encryption Standard
DES is an algorithm developed in the 1970s. It was made a standard by the US government, and has also been adopted by several other governments worldwide. It is widely used, especially in the financial industry.
DES is a block cipher with 64-bit block size. It uses 56-bit keys. This makes it fairly easy to break with modern computers or special-purpose hardware. DES is still strong enough to keep most random hackers and individuals out, but it is easily breakable with special hardware by government, criminal organizations, or major corporations. In large volumes, the cost of breaking DES keys is on the order of tens of dollars. DES is getting too weak, and should not be used in new designs.
A variant of DES, Triple-DES or 3DES, is based on using DES three times (normally in an encrypt-decrypt-encrypt sequence with three different, unrelated keys). Many people consider Triple-DES to be much safer than plain DES.
- Blowfish
Blowfish is an algorithm developed by Bruce Schneier. It is a block cipher with 64-bit block size and variable length keys (up to 448 bits). It has gained a fair amount of acceptance in a number of applications. No attacks are known against it.
Blowfish is used in a number of popular software packages, including Nautilus and PGPfone.
- International Data Encryption Algorithm
IDEA (International Data Encryption Algorithm) is an algorithm developed at ETH Zurich in Switzerland. It uses a 128-bit key, and it is generally considered to be very secure. It is currently one of the best publicly known algorithms. It is a fairly new algorithm, but it has already been around for several years, and no practical attacks on it have been published despite numerous attempts to analyze it.
IDEA is patented in the United States and in most of the European countries. The patent is held by Ascom-Tech. Non-commercial use of IDEA is free.
- RC4
RC4 is a cipher designed by RSA Data Security, Inc. It used to be a trade secret, until someone posted source code for an algorithm in Usenet News, claiming it to be equivalent to RC4. There is very strong evidence that the posted algorithm is indeed equivalent to RC4. The algorithm is very fast. Its security is unknown, but breaking it does not seem trivial either. Because of its speed, it may have uses in certain applications. It can also accept keys of arbitrary length. RC4 is essentially a pseudo random number generator, and the output of the generator is xored with the data stream. For this reason, it is very important that the same RC4 key never be used to encrypt two different data streams.
The United States government routinely approves RC4 with 40 bit keys for export. Keys that are this small can be easily broken by governments, criminals, and amateurs.
- SAFER
SAFER is an algorithm developed by J. L. Massey (one of the developers of IDEA). It is claimed to provide secure encryption with fast software implementation even on 8-bit processors. Two variants are available, one for 64-bit keys and the other for 128-bit keys. An implementation is available at ftp.funet.fi:/pub/crypt/cryptography/symmetric/safer. An analysis of SAFER-K64 was presented at Crypto '95 and appears in the proceedings.
Ciphers based on Cryptographic Hash functions
Hash functions are typically used as efficient search algorithms. Consider, for example, that Tom needs to store 'n' items. Tom uses a function to generate unique indices for each of these items, and the indices are stored. Later, when Jerry wants to retrieve an item, the same function is used to generate an index, a look-up is made on the indices table and voila! Jerry has retrieved the item. Are we overlooking something here? Yep. The hash function must generate indices in such a way that no other item would hash to that particular index. In other words, it must generate unique indices for unique data items.
Any cryptographically strong hash function can be turned into a cipher. There are several possible arrangements; the general idea is that the hash function is used as a random number generator, and the hash value is XORed with the data to be encrypted. When all bytes of the hash value have been used, a new hash value is computed. The data to be hashed may include a key, the previous hash value, a sequence number, previous plaintext, etc.
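As a rough sketch of the idea, the following Python function builds a stream cipher from SHA-256 (an arbitrary choice here) by chaining hash values of the key and the previous state, and XORing them with the data. Because the operation is pure XOR, the same function both encrypts and decrypts. This is an illustration of the construction, not a vetted cipher:

```python
import hashlib

def hash_cipher(key: bytes, data: bytes, iv: bytes = bytes(32)) -> bytes:
    # state_0 = H(key || iv); when all bytes of a hash value have been
    # used, the next state is H(key || previous state), as described above.
    out, state, pos = bytearray(), iv, 0
    while pos < len(data):
        state = hashlib.sha256(key + state).digest()
        chunk = data[pos:pos + len(state)]
        # XOR the hash value with the data to be encrypted.
        out += bytes(d ^ s for d, s in zip(chunk, state))
        pos += len(state)
    return bytes(out)
```

Running the function twice with the same key recovers the plaintext; a different key yields garbage.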
- Enigma was the cipher used by the Germans in World War II. It is trivial to break with modern computers. A similar rotor cipher is used by the Unix crypt(1) program, which should thus not be used.
- Vigenere is a historical cipher mentioned in many textbooks. It is easily solvable.
5. Asymmetric Ciphers
Asymmetric ciphers (also called public-key algorithms or generally public-key cryptography) permit the encryption key to be public (it can even be published in a newspaper), allowing anyone to encrypt with the key, whereas only the proper recipient (who knows the decryption key) can decrypt the message. The encryption key is also called the public key and the decryption key the private key or secret key.
Public key methods are important because they can be used to transmit encryption keys or other data securely even when the parties have no opportunity to agree on a secret key in private. All known methods are quite slow, and they are only used to encrypt session keys (randomly generated "normal" keys), that are then used to encrypt the bulk of the data using a symmetric cipher. (Recall Fig. 1.)
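The hybrid approach can be sketched as follows. The RSA numbers are textbook toy values (p = 61, q = 53, giving n = 3233), and the "symmetric cipher" is a hash-counter keystream standing in for a real algorithm; both are assumptions for illustration only, since real keys are at least 1024 bits:

```python
import hashlib, os

# Toy RSA parameters (p = 61, q = 53) for illustration only.
n, e, d = 3233, 17, 2753

def keystream(key: bytes, length: int) -> bytes:
    # Hash-counter keystream standing in for a real symmetric cipher.
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:length]

def hybrid_encrypt(message: bytes):
    session_key = os.urandom(1)  # one byte, so it fits under the toy modulus
    # Public-key step: encrypt the random session key with (e, n).
    encrypted_key = pow(session_key[0], e, n)
    # Symmetric step: encrypt the bulk of the data under the session key.
    ct = bytes(m ^ k for m, k in zip(message, keystream(session_key, len(message))))
    return encrypted_key, ct

def hybrid_decrypt(encrypted_key: int, ct: bytes) -> bytes:
    # Recover the session key with the private exponent, then decrypt the bulk.
    session_key = bytes([pow(encrypted_key, d, n)])
    return bytes(c ^ k for c, k in zip(ct, keystream(session_key, len(ct))))
```

Only the short session key goes through the slow public-key operation; the message itself is handled by the fast symmetric step.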
RSA
RSA (Rivest-Shamir-Adleman) is the most commonly used public key algorithm. It can be used both for encryption and for signing. It is generally considered to be secure when sufficiently long keys are used (512 bits is insecure, 768 bits is moderately secure, and 1024 bits is good). The security of RSA relies on the difficulty of factoring large integers; dramatic advances in factoring would make RSA vulnerable. RSA is currently the most important public key algorithm. It is patented in the United States (the patent expires in the year 2000), and free elsewhere.
Many implementations of RSA are freely available. See ftp.funet.fi:/pub/crypt/cryptography/asymmetric/rsa.
Diffie-Hellman
Diffie-Hellman is a commonly used public-key algorithm for key exchange. It is generally considered to be secure when sufficiently long keys and proper generators are used. The security of Diffie-Hellman relies on the difficulty of the discrete logarithm problem (which is believed to be computationally equivalent to factoring large integers).
Diffie-Hellman is sensitive to the choice of the strong prime and the generator. The size of the secret exponent is also important for its security. Conservative advice is to make the random exponent twice as long as the intended session key.
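The exchange itself is short enough to sketch in full. The prime and generator below (p = 23, g = 5) are toy textbook values; real parameters are hundreds of bits long and chosen with the care described above:

```python
import secrets

# Toy Diffie-Hellman parameters for illustration only.
p, g = 23, 5

a = secrets.randbelow(p - 2) + 1   # Tom's secret exponent
b = secrets.randbelow(p - 2) + 1   # Jerry's secret exponent

A = pow(g, a, p)   # Tom sends A = g^a mod p over the open channel
B = pow(g, b, p)   # Jerry sends B = g^b mod p

# Both sides arrive at the same shared secret g^(a*b) mod p,
# without ever transmitting it.
shared_tom = pow(B, a, p)
shared_jerry = pow(A, b, p)
```

An eavesdropper sees only p, g, A and B; recovering the shared secret from those is the discrete logarithm problem.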
A study showed that the work needed for the pre-computation of discrete logarithms relative to a particular prime is approximately equal to, or slightly higher than, the work needed for factoring a composite number of the same size; that is, breaking RSA and Diffie-Hellman of comparable key sizes requires similar effort. In practice this means that if the same prime is used for a large number of exchanges, it should be larger than 512 bits in size, preferably 1024 bits.
Elliptic curve public key cryptosystems
Elliptic curve public key cryptosystems are an emerging field. They have been slow to execute, but have become feasible with modern computers. They are considered to be fairly secure, but haven't yet undergone the same scrutiny as, for example, RSA.
Digital Signature Standard
DSS is a signature-only mechanism endorsed by the United States Government. Its design process was not made public, and many people have found potential problems with it (e.g., leaking hidden data in the signature, and revealing your secret key if you ever happen to sign two different messages using the same random number).
ElGamal public key cryptosystem
The ElGamal public key cryptosystem is based on the discrete logarithm problem.
LUC
LUC is a public key encryption system. It uses Lucas functions instead of exponentiation. Its inventor, Peter Smith, has since implemented four other algorithms with Lucas functions: LUCDIF, a key negotiation method like Diffie-Hellman; LUCELG PK, equivalent to ElGamal public-key encryption; LUCELG DS, equivalent to ElGamal digital signatures; and LUCDSA, equivalent to the US Digital Signature Standard. LUC Encryption Technology Ltd has obtained patents for cryptographic use of Lucas functions in the United States and New Zealand.
Source code can be found at ftp.funet.fi:/pub/crypt/cryptography/asymmetric/luc
6. File Verification Technologies
Imagine the following situation: Tom just e-mailed Jerry a memo containing confidential budget figures for the upcoming year. Now Jerry has to be absolutely certain that the copy he has received is genuine and that the numbers in it weren't altered during transmission. A couple of transposed characters could cost Jerry's company millions of dollars. Jerry is suspicious that the document may have been tampered with en route, because some of the numbers don't add up and the e-mail was routed through an external mail system. How does he know beyond a shadow of a doubt that the document he is looking at is identical to the document that was originally transmitted?
This scenario isn't as farfetched as it may seem. In a day and age when digital commerce is fast becoming a reality, one's confidence in digital transactions is wholly dependent upon the security surrounding such transactions. If you send a spreadsheet file to someone via e-mail or on a floppy disk, how does the recipient know that the spreadsheet wasn't altered by someone who had intermediate access to it? If you send a credit card number over the Internet, how does the recipient know that it really was you who placed the order?
The answer lies in what are called File Verification Technologies, and we will look at some of them in increasing order of sophistication.
Checksums
The simplest method of verifying the integrity of digitally transmitted data is the checksum method. A checksum is a value computed by adding together all the numbers in the input data. If the sum of all the numbers exceeds the highest value that a checksum can hold, the checksum equals the modulus of the total--that is, the remainder that's left over when the total is divided by the checksum's maximum possible value plus 1. In mathematical terms, a checksum is computed with the equation
Checksum = Total % (MaxVal + 1)
where Total equals the sum of the input data and MaxVal is the maximum checksum value you will allow.
Suppose the document whose contents you wish to verify is the following stream of 10 byte values:
36 211 163 4 109 192 58 247 47 92
If the checksum is also a 1-byte value, then it can't hold a number greater than 255. The sum of the values in the document is 1,159, so the 8-bit checksum is the remainder left when 1,159 is divided by 256, or 135. If the person who sent the document calculated a checksum of, say, 246, and you get a checksum of 135, then the data was altered. A checksum is the simplest form of digital fingerprint--a value, calculated from the content of other data, that changes if the data upon which it's based changes. Checksums have been used since the dawn of computing and are still the basis for error checking in some forms of the popular XMODEM file-transfer protocol.
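The whole scheme fits in a line of Python; the figures below reproduce the example above (the 10-byte document and its 8-bit checksum of 135):

```python
def checksum(data, max_val=255):
    # Checksum = Total % (MaxVal + 1): sum the input values and
    # keep the remainder modulo the maximum checksum value plus 1.
    return sum(data) % (max_val + 1)

# The 10-byte example document from the text (sum = 1,159).
doc = [36, 211, 163, 4, 109, 192, 58, 247, 47, 92]
```

Note that the checksum depends only on the sum, so reordering the bytes leaves it unchanged.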
The problem with checksums is that although conflicting checksums are proof positive that a document has been altered, matching checksums do not necessarily prove that the data was not altered. You can reorder the numbers in the document any way you want and the checksum won't change. Worse, you can change individual numbers in the document and tweak others so that the checksum comes out the same. When you use 8-bit checksums, there's a 1 in 256 chance that two completely random data streams will have the same checksum. Expanding the checksum length to 16 or 32 bits decreases the odds of coincidental matches, but checksums are still too susceptible to error to provide a high degree of confidence in the data they represent.
Cyclic Redundancy Checks
A better way to digitally fingerprint a block of data is to compute a cyclic redundancy check (CRC) value for it. CRCs have been used for years in network adapters, disk controllers, and other devices to verify that what goes in equals what comes out. Also, many modern communications programs use them to perform error checking on packets of data transmitted over phone lines.
The mathematics behind CRCs is based on polynomial division, in which each bit in a chunk of data represents one coefficient in a large polynomial. A polynomial is a mathematical expression like this one:
f(x) = 4x^3 + x^2 + 7
For the purpose of CRC calculations, the polynomial that corresponds to the byte value 85 (whose 8-bit binary equivalent is 01010101) is
0*x^7 + 1*x^6 + 0*x^5 + 1*x^4 + 0*x^3 + 1*x^2 + 0*x^1 + 1*x^0 = x^6 + x^4 + x^2 + 1
The key to CRC calculations is that polynomials can be multiplied and divided just like ordinary numbers. Dividing a "magic" polynomial (one whose coefficients are dictated by the CRC algorithm you're using) into the polynomial generated from a data stream yields a quotient polynomial and a remainder polynomial. The latter forms the basis of a CRC value. Like checksums, CRCs are small (usually 16 or 32 bits in length), but they're far more reliable at detecting minor changes in the input data. If just one bit in a large block of data changes, there's a 100 percent chance that the CRC will change, too. If two bits change, there's better than a 99.99 percent chance that a 16-bit CRC will catch the error. Swapping two bytes or adding 1 to one and subtracting 1 from another won't fool a CRC as it will a checksum.
CRCs are extremely useful for checking files downloaded from online services. If the CRC computed from the downloaded file is not the same as the CRC for the file uploaded, we know that there was an undetected transmission error during the download. (It does happen!)
How CRCs work
Suppose a sender on the network sends a frame of data. Before transmission, what is called a Frame Check Sequence (FCS) is appended to the frame. The FCS is generated in such a way that the original data (represented in polynomial form) together with the FCS is exactly divisible by some "magic" polynomial. This polynomial is what is referred to as the CRC Polynomial.
Consider an example:
Suppose that
- M = Original Frame to be transmitted (K bits long) = 1010001101 (K = 10)
- F = Resulting FCS to be added to M (N bits long)
- T = Result of cascading M and F (K + N bits long)
- P = Predefined CRC Polynomial (N + 1 bits long) = 110101 (N + 1 = 6)
Now the FCS needs to be calculated. It is generated in such a way that T is exactly divisible by P. In our example, the FCS will be 5 bits in length. Shifting M left by 5 bits and dividing by P results in a remainder of 01110. Hence, the actual transmitted frame would be 1010001101 01110.
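The same computation can be sketched in Python, treating bit strings as integers; mod2_div performs the modulo-2 long division described above:

```python
def mod2_div(value: int, poly: int) -> int:
    # Modulo-2 long division: XOR the divisor in under each leading 1 bit;
    # what remains once the value is shorter than the divisor is the remainder.
    while value.bit_length() >= poly.bit_length():
        value ^= poly << (value.bit_length() - poly.bit_length())
    return value

M = 0b1010001101           # frame to transmit, K = 10 bits
P = 0b110101               # CRC polynomial, N + 1 = 6 bits
fcs = mod2_div(M << 5, P)  # shift M left by N = 5 bits; the remainder is the FCS
T = (M << 5) | fcs         # transmitted frame: M followed by the FCS
```

The receiver repeats the division on T; a clean division (remainder zero) means the frame arrived intact.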
Mathematical Proof:
Let us write M * 2^n for M shifted left by n bits. Then we have
T = M * 2^n + F    (Eq. 1)
i.e. we shift M by n bits and append F. Now consider M * 2^n / P. Since M * 2^n is not necessarily divisible by P, we get a quotient Q and a remainder R:
M * 2^n / P = Q + R/P    (Eq. 2)
We choose F = R, so that T = M * 2^n + R. Dividing T by P then gives
T/P = (M * 2^n + R) / P
    = (M * 2^n) / P + R/P
    = Q + R/P + R/P    (from Eq. 2)
    = Q + (R + R)/P
    = Q                (addition under mod 2 arithmetic is XOR, so R + R = 0)
Thus T is exactly divisible by P, as required.
Overview of CRC
To summarize: the transmitting network station takes the frame, left shifts it by n bits and divides by P. It then appends the remainder (the FCS) to the frame and transmits it. The receiving network station receives the frame and divides it by P. If the division is clean, the frame is uncorrupted; otherwise it requests that the frame be retransmitted.
Hashing Algorithms
The problem with even a 32-bit CRC value is that although it's pretty good at detecting inadvertent changes to the input data (such as those introduced by transmission errors), it doesn't stand up very well to intentional attacks. If you use a 32-bit CRC to fingerprint a file, it's relatively easy for someone with access to a computer to generate a completely different file that produces the same CRC value.
A step beyond CRCs are one-way hash algorithms that produce "hash" values. "One way" means it's easy to input A and get B, but it's impossible--or nearly impossible--to work backward from B to A. A good hash algorithm has one very important property: The values that it generates are so unique and so difficult to duplicate that not even someone with a bank of Cray supercomputers and a few centuries to kill could find two sets of input data that produce the same hash value. Typically, hash values are at least 128 bits in length. The greater the length, the more difficult it is to reproduce the input or to find another set of input data that produces a matching result.
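Computing such one-way hash values takes only a couple of lines in, for example, Python's standard hashlib module. The sample document string is made up for illustration; MD5 gives a 128-bit digest and SHA-1 a 160-bit one:

```python
import hashlib

document = b"Confidential budget figures for the upcoming year."

md5_fp = hashlib.md5(document).hexdigest()   # 128-bit message digest
sha_fp = hashlib.sha1(document).hexdigest()  # 160-bit digest (SHA-1)

# Changing even one character produces a completely different fingerprint.
tampered = document.replace(b"Confidential", b"confidential")
```

Anyone holding the original fingerprint can rerun the hash on a received copy and compare the two values.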
Two of the most widely known one-way hash algorithms are the MD5 message digest algorithm developed by MIT professor Ron Rivest (one of the developers of the highly regarded RSA public-key cryptosystem) and the Secure Hash Algorithm (SHA) developed by the National Institute of Standards and Technology (NIST) and the National Security Agency (NSA). MD5 produces a 128-bit digital fingerprint from a set of input data, and SHA produces a 160-bit value. Assuming no one discovers a heretofore unknown "trap door" in either algorithm, it is computationally infeasible to take a hash value produced by SHA or a "message digest" produced by MD5 and work backward to find the input data. Thus, if someone sends you a file and an MD5 or SHA fingerprint generated from the file, and if you run the same hash algorithm on the file you receive and get the same result, you can be virtually certain the file was received intact.
Digital Signatures
A digital signature - looked at objectively - is nothing but a block of data generated using a secret key, for which there is a public key that can be used to verify that the block was generated using that secret key. The algorithm relies on the fact that without the secret key it is impossible to generate a message that would verify as valid. Some of the proposed uses of Digital Signatures are user authentication, time stamping of documents and messages, verification that a public key belongs to a particular person, etc.
How do we achieve the above? A digital signature of an arbitrary document is typically created by computing a message digest from the document, and concatenating it with information about the signer, a timestamp, etc. The resulting string is then encrypted using the private key of the signer using a suitable algorithm. The resulting encrypted block of bits is the signature. It is often distributed together with information about the public key that was used to sign it.
To verify a signature, the recipient first determines whether it trusts that the key belongs to the person it is supposed to belong to. How does the receiver verify the public key itself? Normally public keys are certified by signing the combination of the key and the information about its owner with a trusted key. The reason for trusting that key may again be that it was signed by another trusted key. Eventually some key must be a root of the trust hierarchy (that is, it is not trusted because it was signed by somebody, but because you believe a priori that the key can be trusted). In a centralized key infrastructure there are very few roots in the trust network (e.g., trusted government agencies; such roots are also called certification authorities). In a distributed infrastructure there need not be any universally accepted roots, and each party may have different trusted roots (such as the party's own key and any keys signed by it). This is the web of trust concept.
The receiver authenticates the public key itself using one of the above methods and then decrypts the signature using the public key of the person. If the signature decrypts properly and the information matches that of the message (proper message digest etc.), the signature is accepted as valid.
Let's consider a practical example. Suppose Jerry wants to send Tom a contract or credit card number electronically and Tom needs an electronic signature to verify authenticity. First Jerry sends the document. Then he uses a hash algorithm to generate a fingerprint for the document, encrypts the hash value with his private key, and sends the encrypted hash value to Tom. This is Jerry's digital signature. Tom uses the same hash algorithm to fingerprint the document he received and then decrypts the hash value he received from Jerry using Jerry's public key. If the two hash values match, then Tom not only knows that the document he received is authentic, but he also knows that Jerry's signature is real. Conducting commercial transactions this way is arguably more secure than using paper signatures, because paper signatures can be forged. And if the information that Jerry sent to Tom is sensitive (for example, if it contains a credit card number), then it, too, can be encrypted so that only Tom can read it.
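Jerry's procedure can be sketched end to end. The RSA pair below is a textbook toy (p = 61, q = 53), and the digest is reduced below the tiny modulus only so the toy numbers work; a real implementation signs the full digest with a full-size key:

```python
import hashlib

# Toy RSA pair (p = 61, q = 53) for illustration only.
n, e, d = 3233, 17, 2753

def fingerprint(doc: bytes) -> int:
    # Hash the document, reduced below the toy modulus so it can be signed.
    return int.from_bytes(hashlib.sha256(doc).digest(), "big") % n

def sign(doc: bytes, priv: int) -> int:
    # Jerry "encrypts" the document's fingerprint with his private key.
    return pow(fingerprint(doc), priv, n)

def verify(doc: bytes, sig: int, pub: int) -> bool:
    # Tom decrypts the signature with Jerry's public key and
    # compares it against the fingerprint he computes himself.
    return pow(sig, pub, n) == fingerprint(doc)
```

If the document is altered in transit, the two fingerprints no longer match and verification fails.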
This model--or one similar to it--is probably the one we'll use to conduct business on the Internet and elsewhere. It is the basis for the U.S. government's proposed Digital Signature Standard (DSS), which relies on the Secure Hash Algorithm to produce hash values and a public-key cryptosystem known as the Digital Signature Algorithm (DSA) to produce signatures from hash values. The DSS has been criticized for various reasons, but much of the criticism comes from parties who have a financial interest in seeing that it's not adopted.
Time will tell which, if any, method for creating digital signatures will become the standard. Regardless of the outcome, of greater importance is that it is possible to conduct electronic commerce in a secure fashion.
7. Cryptographic Random Number Generators
Cryptographic random number generators generate random numbers for use in cryptographic applications, such as for keys. Conventional random number generators available in most programming languages or programming environments are not suitable for use in cryptographic applications (they are designed for statistical randomness, not to resist prediction by cryptanalysts).
In the optimal case, random numbers are based on true physical sources of randomness that cannot be predicted. Such sources may include the noise from a semiconductor device, the least significant bits of an audio input, or the intervals between device interrupts or user keystrokes. The noise obtained from a physical source is then "distilled" by a cryptographic hash function to make every bit depend on every other bit. Quite often a large pool (several thousand bits) is used to contain randomness, and every bit of the pool is made to depend on every bit of input noise and every other bit of the pool in a cryptographically strong way.
When true physical randomness is not available, pseudorandom numbers must be used. This situation is undesirable, but often arises on general purpose computers. It is always desirable to obtain some environmental noise - from device latencies, resource utilization statistics, network statistics, keyboard interrupts, or whatever. The point is that the data must be unpredictable for any external observer; to achieve this, the random pool must contain at least 128 bits of true entropy.
Cryptographic pseudorandom generators typically have a large pool ("seed value") containing randomness. Bits are returned from this pool by taking data from it, optionally running the data through a cryptographic hash function to avoid revealing the contents of the pool. When more bits are needed, the pool is stirred by encrypting its contents with a suitable cipher using a random key (which may be taken from an unreturned part of the pool) in a mode which makes every bit of the pool depend on every other bit of the pool. New environmental noise should be mixed into the pool before stirring, to make predicting previous or future values even harder.
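A minimal sketch of such a pool, assuming SHA-512 as the mixing function and SHA-256 as the output function (real designs are far more careful about entropy estimation, reseeding and state compromise):

```python
import hashlib, os, time

class RandomPool:
    # Minimal sketch of a hash-stirred randomness pool, as described above.
    def __init__(self):
        self.pool = bytes(64)

    def mix(self, noise: bytes) -> None:
        # Every bit of the new pool depends on the noise and the old pool.
        self.pool = hashlib.sha512(self.pool + noise).digest()

    def get_bytes(self, count: int) -> bytes:
        out = b""
        while len(out) < count:
            # Return a hash of the pool, never the pool contents themselves.
            out += hashlib.sha256(b"output" + self.pool).digest()
            self.mix(b"stir")  # stir so successive outputs differ
        return out[:count]

pool = RandomPool()
# Environmental noise; in real use: interrupt timings, audio noise, etc.
pool.mix(os.urandom(32))
pool.mix(str(time.time()).encode())
```

In practice on a modern system you would simply use the operating system's gatherer (os.urandom or the secrets module in Python) rather than rolling your own pool.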
Even though cryptographically strong random number generators are not very difficult to build if designed properly, they are often overlooked. Their importance must thus be emphasized - if done badly, the random number generator will easily become the weakest point of the system.
Some machines may have special purpose hardware noise generators. Noise from the leak current of a diode or transistor, the least significant bits of audio inputs, times between interrupts, etc. are all good sources of randomness when processed with a suitable hash function. It is a good idea to acquire true environmental noise whenever possible.
8. Strength of Cryptographic Algorithms
Good cryptographic systems should always be designed so that they are as difficult to break as possible. It is possible to build systems that cannot be broken in practice (though this cannot usually be proved). This does not significantly increase system implementation effort; however, some care and expertise is required. There is no excuse for a system designer to leave the system breakable. Any mechanisms that can be used to circumvent security must be made explicit, documented, and brought to the attention of the end users.
In theory, any cryptographic method with a key can be broken by trying all possible keys in sequence. If using brute force to try all keys is the only option, the required computing power increases exponentially with the length of the key. A 32 bit key takes 2^32 (about 10^9) steps. This is something any amateur can do on his/her home computer. A system with 40 bit keys (e.g. US-exportable version of RC4) takes 2^40 steps - this kind of computing power is available in most universities and even smallish companies. A system with 56 bit keys (such as DES) takes a substantial effort, but is quite easily breakable with special hardware. The cost of the special hardware is substantial but easily within reach of organized criminals, major companies, and governments. Keys with 64 bits are probably breakable now by major governments, and will be within reach of organized criminals, major companies, and lesser governments in a few years. Keys with 80 bits may become breakable in future. Keys with 128 bits will probably remain unbreakable by brute force for the foreseeable future. Even larger keys are possible.
For public key algorithms, refer to Bruce Schneier's excellent analysis of RSA key length recommendations.
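The exponential-cost argument above can be demonstrated with a toy brute-force search. The repeating-XOR "cipher", the key, and the message here are all made up for the demo; real ciphers are nothing like XOR, but the arithmetic is the same - every extra key bit doubles the number of candidates to try.

```python
import itertools

# Toy "cipher": repeating-key XOR with a 16-bit key (65,536 candidates).
def xor_crypt(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

secret_key = bytes([0x13, 0x37])
ciphertext = xor_crypt(b"ATTACK AT DAWN", secret_key)

# A ciphertext-only attacker often guesses part of the plaintext (a "crib");
# here the attacker guesses the message begins with "ATTACK".
crib = b"ATTACK"
found = None
for candidate in itertools.product(range(256), repeat=2):
    key = bytes(candidate)
    if xor_crypt(ciphertext[:len(crib)], key) == crib:
        found = key
        break

print(found.hex())  # recovers 1337 after at most 2^16 trials
```

A 16-bit key falls instantly; each step up the list above (2^32, 2^40, 2^56, ...) multiplies this work by thousands or millions.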
9. Cryptanalysis and Attacks on Crypto Systems
Cryptanalysis is the art of deciphering encrypted communications without knowing the proper keys. There are many cryptanalytic techniques. Some of the more important ones for a system implementor are described below.
Ciphertext-only attack: This is the situation where the attacker does not know anything about the contents of the
message, and must work from ciphertext only. In practice it is quite often possible to make guesses about the plaintext, as many types of messages have fixed format headers. Even ordinary letters and documents begin in a very predictable way. It may also be possible to guess that some ciphertext block contains a common word.
Known-plaintext attack: The attacker knows or can guess the plaintext for some parts of the ciphertext. The task is to decrypt the rest of the ciphertext blocks using this information. This may be done by determining the key used to encrypt the data, or via some shortcut.
Chosen-plaintext attack: The attacker is able to have any text he likes encrypted with the unknown key. The task is to determine the key used for encryption. Some encryption methods, particularly RSA, are extremely vulnerable to chosen-plaintext attacks. When such algorithms are used, extreme care must be taken to design the entire system so that an attacker can never have chosen plaintext encrypted.
Man-in-the-middle attack: This attack is relevant for cryptographic communication and key exchange protocols. The idea is that when two parties are exchanging keys for secure communications (e.g., using Diffie-Hellman), an adversary puts himself between the parties on the communication line. The adversary then performs a separate key exchange with each party. The parties will end up using a different key, each of which is known to the adversary. The adversary will then decrypt any communications with the proper key, and encrypt them with the other key for sending to the other party. The parties will think that they are communicating securely, but in fact the adversary is hearing everything.
One way to prevent man-in-the-middle attacks is that both sides compute a cryptographic hash function of the key
exchange (or at least the encryption keys), sign it using a digital signature algorithm, and send the signature to the other side. The recipient then verifies that the signature came from the desired other party, and that the hash in the signature matches that computed locally.
Timing Attack: This very recent attack is based on repeatedly measuring the exact execution times of modular exponentiation operations. It is relevant to at least RSA, Diffie-Hellman, and Elliptic Curve methods.
10. Cryptographic Protocols
Of late, cryptographic wrappers have been placed around common protocols to make them secure. Examples include SSL, SHTTP, and S/MIME. Let us look briefly at each of these.
SSL
SSL stands for Secure Socket Layer. The current version is 3.0. It was originally conceived and developed by Netscape, but current implementations are supported by both Microsoft and Netscape. It helps in creating a secure channel over the Internet, through which transactions can be carried out securely. It uses the server's public key for server authentication and encryption. Alternatively, the server may request the client to submit its own public key for authentication. Sites using SSL have https instead of http in their URL. SSL is independent of the application layer, and any application protocol can be layered on top of it. For example, secure implementations of telnet and ftp over SSL are not uncommon.
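As a brief sketch of what such a channel looks like from the client side, here is how a client-side context is configured in Python's ssl module (modern stacks negotiate TLS, the standardized successor of SSL 3.0; the file names in the comment are placeholders):

```python
import ssl

# Default client context: the server must present a certificate, which is
# verified against the system's trusted CAs, and the hostname is checked.
context = ssl.create_default_context()
assert context.check_hostname is True
assert context.verify_mode == ssl.CERT_REQUIRED

# Optional client authentication, used when the server requests the client's
# own credentials (paths are hypothetical):
#   context.load_cert_chain(certfile="client.pem", keyfile="client.key")

# Any TCP socket can then be wrapped, e.g.
#   tls = context.wrap_socket(sock, server_hostname="www.example.com")
# which is why telnet- or ftp-style protocols can run over SSL unchanged.
print(context.verify_mode)
```

The wrapping step is what makes SSL application-layer independent: the protocol above the socket never needs to know the channel is encrypted.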
SHTTP
SHTTP stands for Secure HTTP. Whereas SSL creates a secure channel over which any amount of data can be sent, SHTTP is designed to send individual messages securely; the two are basically complementary technologies. Like SSL, SHTTP can perform both server and client authentication, provide complete transaction privacy, and additionally check data integrity.
S/MIME
S/MIME is just a secure wrapper over the MIME protocol. Future email clients may implement it to increase email security.
11. Conclusion
As Philip Zimmermann, the author of PGP, once put it,
Privacy is just like any other right, exercise it or risk losing it.
Disclaimer: Any opinions and evaluations presented here are speculative, and the author cannot be held responsible for their correctness.