A few simple definitions on privacy and cryptography,
the way I understand them
Privacy means (among other things) that information transmitted from a sender to a receiver is not accessed by others.
A secure communication channel is a channel that enforces privacy by making it impossible to eavesdrop on the communication. An insecure channel makes eavesdropping possible.
Cryptography is the use of techniques which make information readable by the sender and receiver, but unintelligible to anyone else (including anyone with complete access to the ciphertext, see below). Information is encrypted by the sender before sending, and decrypted by the receiver. Thus, cryptography makes privacy possible even on an insecure channel.
Plaintext (or cleartext) is non-encrypted information. The sender converts plaintext to ciphertext (see below) by applying an encryption algorithm to the plaintext.
Ciphertext is encrypted information. The receiver recovers the original plaintext by decrypting the received ciphertext.
A cryptographic key is a sequence of numbers, characters or other data used by a cryptographic algorithm to convert plaintext into ciphertext or vice versa.
Secret key cryptography use a key which is kept secret and known only to the sender and receiver. Leaking a secret key compromises the privacy of any ciphertext generated with this key. Therefore, a secret key should be sent only on a secure channel.
Public key cryptography uses a pair of keys, one for encrypting and the other for decrypting. The two keys constituting a pair are generated together by an algorithm. The key used for encrypting is called public, i.e., it can be publicly known without compromising the security of ciphertext. The key used for decrypting is called secret. Anybody in possession of a public key can encrypt a plaintext, but only the party in possession of the corresponding secret key can decrypt the ciphertext. A public key can be sent on an insecure channel.
Weak cryptography is the use of cryptographic techniques that make encrypted information unintelligible to a casual observer not in possession of the key. A determined opponent with large human, technological and financial resources (e.g., a government), on the other hand, may be able to decrypt a weakly-encrypted ciphertext in a short time without having the key.
Strong cryptography is the use of cryptographic techniques which make it impossible or extremely slow and expensive (even for a government) to decrypt a plaintext without having the key.
Snake-oil cryptography is the use of proprietary, home-made or secret cryptographic algorithms. Public knowledge of a cryptographic algorithm and its extensive review by cryptographers and mathematicians is the only guarantee that an algorithm is strong (i.e., that a way to break it within a reasonable time frame has not been found). Secret algorithms have not received public review, and their strength therefore cannot be assessed. Also, there is no intrinsic safety in a secret algorithm. If a strong cryptographic algorithm is used, knowledge of the algorithm does not help a third party to crack the ciphertext. On the other hand, in the great majority of cases, snake-oil algorithms are quite easy to crack by professional cryptanalysts. If you trust snake-oil ointments, then you will love snake-oil cryptography.
Key-escrow cryptography is the use of a cryptographic scheme in which a secret key is deposited (escrowed) with a trusted party. The deposited key (which may or may not be the same as the key in possession of the sender and/or receiver) allows the plaintext to be recovered. Generally, different parts of an escrowed key are held by two or more agencies or independent entities, and are released to the government upon motivated request. Not storing the whole key at one location makes it unlikely that the key will be used in unauthorised ways (unless the government decides to use it in unauthorised ways, in which case the multiple-agency system offers no protection). As a result, persons using a key-escrow system must trust the government not to eavesdrop on their encrypted communications without just reason, but have no real guarantee that this is indeed the case (in a similar - totally hypothetical - case, you could be asked to trust your president or prime minister not to lie, but would have no way to tell whether he does).
A government may use national security and the fight against crime as pretexts to justify the adoption of key-escrow, but again, private persons have no guarantee that this is the real reason. It is obvious, however, that two more steps must follow the adoption of a key-escrow system, lest it remains useless. The first step is making it obligatory to participate in the key-escrow scheme (the system is no use to the government if only a few persons use it). As a result, the government will have access to each and every citizen's escrowed key. The second step is outlawing other types of cryptography (since a key-escrow system is useless if a ciphertext generated with a key-escrow system contains another ciphertext generated with strong cryptography). Thus, the government effectively will have abolished the right to privacy of the people. At this point, only terrorists and criminals will still have privacy.
Unbreakable cryptography is the use of a cryptographic algorithm which is demonstrably (in a mathematical sense) impossible to break. Only one such algorithm is known (i.e., the one-time pad). This algorithm is very simple. Unfortunately, it is a secret-key algorithm, the key must be at least as long as the plaintext, and the security of both key and plaintext is compromised if the same key is used more than once. Variations on the one-time pad in order to remedy one or more of its limitations invariably destroy its unbreakability.
Because of the mysticism and sometimes incorrect information that seem to surround the one-time pad, I am providing here more information. A simple implementation of the one-time pad uses a random sequence of bytes as key. You can use a pseudo-random sequence (i.e., a sequence generated by a deterministic algorithm) which passes mathematical tests for pseudo-randomness (i.e., fails to show any detectable pattern in the sequence. There is no mathematical test to distinguish real randomness from pseudo-randomness, so a "good" pseudo-random sequence will do, as long as it is generated using a random seed (which is a shorter sequence of data used to initialize a pseudo-random generator). Random and pseudo-random sequences are a field of research in their own, so don't expect an exhaustive treatment here. The key must be at least as long as the plaintext.
- The cleartext is XORed with the key. It is preferrable, but not indispensable, to compress the cleartext before doing so. XORing is a logical operation that takes two input bits and outputs a TRUE if either input bit is TRUE, and FALSE if both input bits are TRUE or FALSE. Each bit in a byte of the key and plaintext is XORed in sequence, and each byte in the plaintext and key is XORed in sequence.
- The key is stored in a safe place, and/or sent to the receiving part through a safe channel. This is the weakest part of this cypher, of course, and the most difficult to implement.
- Decryption consist of applying the same procedure to the cyphertext.
- The unbreakability of this cypher consists in the fact that there is no way to distinguish the true key from any of the possible keys (i.e., all possible combinations and permutations of the bits constituting the key). As a result, a brute-force approach (which consists of applying all possible keys to the cyphertext) is meaningless, because it will produce all possible plaintexts of the same length as the cyphertext, with no way to tell which of these plaintexts is real.
- If you use the same key for more than one plaintext, however, the cyphertexts will display a statistical correlation and, especially if part of one plaintext is known, this allow a third party to compute part or all of the key, thus compromising the security of all cyphertexts obtained with the same key.
Finally, note the following, highly summarised points about the one-time pad:
- The one-time pad is unbreakable in itself, but only as good as the
key being used, and the way the key is stored and/or transmitted.
- If a good key is used
(i.e., one that passes tests for pseudo-randomness), a cyphertext
generated with the one-time pad is undistinguishable from a nonsense random sequence: nobody can prove it contains - or does not contain - a plaintext.
- The same thing applies
to a key: nobody can prove it is a cryptographic key unless they have access to the corresponding cyphertext.
- Once a key is lost or destroyed, there is no way to recover the cyphertext. This is true, unless you happen to know how you obtained the
key, and are able to duplicate the process. However, if you can duplicate a key, the key itself
is not random, and the method used to generate it is no good. You can use this as one of the tests a key-generation scheme must pass.
- You don't need a computer to use
the one-time pad. You can apply it with paper and pencil, and you can use statistically unflawed dice, coins or other valid manual methods to generate a key.
Deniable encryption is the use of
cryptographic techniques which produce a ciphertext that can decrypt to two or more plaintexts when different keys are used. This can be useful to a sender or a receiver who wishes to deny the contents of a ciphertext even when forced to give up the
decryption key. For instance, the original plaintext may say "The president sucks". The ciphertext may decrypt to "The president sucks" when one key is used, and to "What a nice day" when another key is used. When forced to give up the key (e.g., under
threat), a party may give up the second key, which yields an innocent plaintext. In order to make the scheme more credible, the "contingency" plaintext may have a slightly incriminating content, in order to justify the use of cryptography - in the eyes of
the authorities - by the person(s) involved, but not important enough to cause serious consequences to them. Very little practical work has been done on deniable encryption so far.
A possible implementation of deniable encryption uses the one-time pad described above. In simple terms, and indicating the XORing operation with X, we have:
- key X plaintext1 = cyphertext1
- cyphertext1 X plaintext2 = cyphertext2
- Plaintext1 is the real message, plaintext2 the decoy message. Note that you used cyphertext1 as a key. At this point,under threat you can disclose cyphertext1, saying that it is the key (there is no way to prove otherwise). The adversary will perform the operation:
- cyphertext2 X cyphertext1 = plaintext2
and will have to be satisfied with it. You or your ally, on the other hand, can perform the operation:
cyphertext1 X key = plaintext1
cyphertext2 X plaintext2 X key = plaintext1
As shown above, cyphertext1 does not need to be kept, as long as the other three texts are known.
Steganography is the application of techniques that hide information within data. For instance, steganographic techniques exist for hiding a text message within a photograph stored in electronic format. The data constituting the photograph act as a container for the hidden message. The photograph looks normal, but the hidden text can be recovered by using an appropriate algorithm. As a rule, the information that must be hidden is substantially smaller than the information acting as a container. Also, the container must be of a nature that allows for a slight loss of information without becoming unusable or carrying tell-tale signs of hiding extraneous data. Pictures, movies and sound are obvious choices for containers. Redundant data formats which allow the automatic correction of corrupted and/or missing bits (e.g., the data recovery records on a CD-ROM) or the file slack (i.e., unused storage filling the space between the end of a file and the end of the last allocation unit on a hard disk) can also be used as containers. Steganography can be combined with cryptography, so that a steganographic container may contain an encrypted message.
Cryptography alone is
not a complete solution to achieve privacy. Indirect techniques may allow an observer to gain information from the act of sending a ciphertext even when she cannot decrypt or break the ciphertext. For instance,
traffic analysis (e.g., the knowledge of who
sent a ciphertext to whom, and when in time) can already tell something useful to an observer. Time-based traffic analysis can be foiled, for instance, by frequently transmitting nonsense ciphertext, so that a meaningful ciphertext (i.e., a ciphertext
containing a meaningful plaintext), to the observer, will be undistinguishable from the nonsense ones. Destination-based traffic analysis can be foiled by uploading ciphertext to a location where it is publicly available (e.g., a Usenet group), where in
principle anybody can download it. The destinatary can thus remain secret. Steganography can be used to prevent source-based traffic analysis, i.e., by hiding from an observer the fact that the sender is sending a ciphertext. Depending on the algorithm
being used, the same plaintext may always encrypt to the same ciphertext, or to a different ciphertext each time the algorithm is used. The latter type of algorithm is preferable, since in this case an observer will have no way to detect that the same
message has been sent on different occasions. The length of the plaintext may also be significant. This can be hidden by appending nonsense data of random length to the real plaintext before encrypting it.
If you want to know more, read
the cryptography FAQ (posted at regular intervals on sci.crypt).