Cryptography 101

Part I: Cryptographic Methods
Cryptography is a means of obfuscating and/or validating data in such a way that only people or systems possessing the proper credentials (‘keys’) can access or validate the data. Three cryptographic methods are most commonly used: encryption (and decryption), hashes, and digital signatures.

Encryption: The best-known use of cryptography is to encrypt data. Encryption transforms information so that it remains visible, but is (literally) indecipherable to anyone not possessing the correct key. Its primary benefit is confidentiality: under most circumstances, the information cannot be read by an unauthorized party. Many modern encryption schemes (known as authenticated encryption) also protect integrity, meaning that any modification made between the time the information was encrypted and the time it was decrypted can be detected; encryption on its own does not guarantee this. With such a scheme, the information delivered is verifiably the same information that was originally encrypted.

Hashing: Hashing is another useful feature of cryptography. It provides a means by which to “fingerprint” a block of data in a replicable and, for practical purposes, unique manner. It is often referred to as a “one-way” hash because, unlike encryption, it is not reversible: hashing is “lossy”, insofar as the original content cannot be reconstructed from the result (the “hash”) itself. Its value lies in its lightweight nature; a very large file can be processed, and a very small, fixed-size hash results. If one is concerned only with the integrity of the original block of information, and has independent access to it, one can re-create the hash and compare it to the stored value, confirming that the data has not been modified in the interim, without needing to store a second copy of the entire original.
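The fingerprint-and-compare workflow described above can be sketched in a few lines. This is a minimal illustration, assuming SHA-256 as the hashing algorithm (the text does not name one); the function name fingerprint and the sample data are hypothetical.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# A stand-in for a very large file (about one million bytes).
original = b"A" * 1_000_000
stored_hash = fingerprint(original)

# The hash has a fixed, small size regardless of the input size:
# a SHA-256 digest is always 64 hexadecimal characters.
hash_length = len(stored_hash)

# Later: re-create the hash and compare it to the stored value.
unchanged = fingerprint(original) == stored_hash

# Changing even a single byte produces a different hash.
tampered = original[:-1] + b"B"
modification_detected = fingerprint(tampered) != stored_hash
```

Only the 64-character hash needs to be kept; the comparison works as long as the data itself can be obtained again when verification is needed.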

Digital Signatures: Modern cryptography also provides a means by which to digitally sign information. In this process, information is still available to be read by anyone (i.e., it does not provide for confidentiality), but the integrity of the data, and the identity of its signer, are ensured through the digital signature. This provides a digital analogue to the real-world example of a doctor’s signature on a scrip: anyone can read the prescription, but the authenticity of the scrip is provided by the doctor’s (hopefully inimitable) signature. (Of course, doctors’ handwriting is famously indecipherable, perhaps making this a poor analogy…)

[For the technically-minded, a digital signature is generated by hashing the information, then encrypting the hash with a key that only the signatory holds, in such a way that anyone can verify that the signatory produced it.]
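The hash-then-sign idea can be sketched with “textbook” RSA using only the Python standard library. Everything here is an assumption made for illustration: the choice of RSA and SHA-256, the function names, and especially the key, which is built from two publicly known Mersenne primes and uses no padding. It shows the mechanics only and offers no security; real systems should use a vetted cryptographic library.

```python
import hashlib

# Toy key material (assumption, NOT secure): two publicly known primes.
P, Q = 2**107 - 1, 2**127 - 1
N = P * Q                             # modulus, part of both keys
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (Python 3.8+)

def digest_as_int(message: bytes) -> int:
    """Hash the message and map the digest to an integer below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Transform the hash with the private exponent, held only by the signer."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone holding the public values (E, N) can check the signature."""
    return pow(signature, E, N) == digest_as_int(message)

prescription = b"Take one tablet twice daily"
signature = sign(prescription)

genuine = verify(prescription, signature)
forged = verify(b"Take ten tablets twice daily", signature)
```

The message itself travels in the clear; only the small hash is transformed, which is why signing stays cheap even for large documents.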

Part II: Cryptographic Algorithm Types
Three types of algorithm (algorithms are, essentially, applied mathematical functions) are used to manage cryptographic functions: symmetric algorithms, asymmetric algorithms, and hashing algorithms.

Symmetric algorithms: The most common type of cryptographic algorithm is called symmetric. It is the fastest and least onerous type to implement, and is the basis for all historic (i.e., pre-computer) cryptography. Its application is analogous to a physical lock: a key turns the lock, locking it. The same key turns the opposite direction, unlocking it. Simple, easy, and effective. (Symmetric algorithms are used only for encryption and decryption – they are useless for hashing or digital signatures.)
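The “same key locks and unlocks” property can be illustrated with a one-time pad, arguably the simplest symmetric scheme. This is a sketch under stated assumptions: the function name xor_bytes is hypothetical, the key must be random, as long as the message, and never reused, and practical symmetric ciphers such as AES require a third-party library rather than the Python standard library.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Combine data with the key byte by byte; the same call encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"

# The shared secret: both parties must hold an identical copy of this key.
key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, key)     # "locking"
recovered = xor_bytes(ciphertext, key)   # "unlocking" with the same key
```

Nothing in the ciphertext records which holder of the key produced it, which is the accountability limitation of symmetric schemes.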

The problem with symmetric algorithms comes in the forms of key distribution and accountability. Since the same key provides multiple functions (locking and unlocking, or encrypting and decrypting, respectively), it is impossible to determine who has locked or unlocked the lock. In addition, every entity that needs access to the data must be given an identical (and indistinguishable) copy of the same key. So from a practical perspective, every time a piece of data needs to be encrypted uniquely between two entities, a new lock (and resulting key) must be created. This quickly becomes a logistical nightmare when (especially on the Internet) the number of relationships needing unique protection can easily run into the hundreds of millions. Also, since the key must somehow be delivered to the other party, who is often remote, and may pass through many hands along the way, it is impossible to know that the two authorised parties are the only ones in possession of the key.
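The scale of the problem can be made concrete with a little arithmetic. A minimal sketch, assuming every pair of parties needs its own unique key; the function name is hypothetical.

```python
def pairwise_keys_needed(parties: int) -> int:
    """Number of distinct shared keys if every pair of parties needs its own."""
    return parties * (parties - 1) // 2

# The key count grows roughly with the square of the number of parties.
for parties in (10, 1_000, 1_000_000):
    print(parties, pairwise_keys_needed(parties))
# 10 45
# 1000 499500
# 1000000 499999500000
```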

Asymmetric algorithms: A newer form of cryptographic algorithm is asymmetric, which, as its name suggests, differs from the symmetric algorithms in that the key used to encrypt information is different from the key used to decrypt it. This is achieved by the use of a (mathematically-related) key pair, called the “public” and “private” keys. As the names suggest, the “private” key belongs to a single entity and is kept secret from all others, while the “public” key can be distributed to the world-at-large without compromising the security of the system or the private key. As a result, a single key pair can be used in any number of unrelated transactions, allowing for “one-to-many” use, in contrast to the symmetric key’s one-to-one limitation. This reduces the complexity of key management by orders of magnitude.
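A public/private key pair can be sketched with “textbook” RSA. This is an illustration under loud assumptions: RSA is only one asymmetric algorithm, the function names are hypothetical, and the key is built from two publicly known Mersenne primes with no padding, so it provides no real security. Production code should rely on a vetted library.

```python
# Toy key material (assumption, NOT secure): two publicly known primes.
P, Q = 2**89 - 1, 2**107 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # modular inverse (Python 3.8+)

public_key = (E, N)     # may be handed to anyone
private_key = (D, N)    # kept secret by its single owner

def encrypt(message: bytes, public_key: tuple) -> int:
    """Encrypt with the recipient's public key."""
    e, n = public_key
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this toy key")
    return pow(m, e, n)

def decrypt(ciphertext: int, private_key: tuple) -> bytes:
    """Decrypt with the matching private key (leading zero bytes are not preserved)."""
    d, n = private_key
    m = pow(ciphertext, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")

message = b"meet at noon"
ciphertext = encrypt(message, public_key)    # anyone can do this
recovered = decrypt(ciphertext, private_key) # only the key owner can do this
```

Any number of senders can reuse the same public key, which is the “one-to-many” property described above.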

Asymmetric systems can be used for encryption and digital signatures, but not for hashing. The most common implementation of asymmetric cryptography is in “Public-Key Infrastructure”, or PKI.

The downside of asymmetric algorithms is their cost in efficiency when compared to symmetric algorithms; to provide comparable security, asymmetric keys are often far larger than their symmetric counterparts (for example, RSA keys of 2048 bits or more, compared with symmetric keys of 128 or 256 bits), with a significant increase in the computing power needed and a resulting loss of speed.

Hashing algorithms: Hashing algorithms are used only for hashing; on their own, they cannot be used for encryption or digital signatures. Please see the section on Hashing in Part I for further explanation of the uses for hashing algorithms.

Part III: Public Key Infrastructure
The usefulness of public and private key pairs cannot be overestimated. But with the proliferation of a practical one-to-many system came the problem of identity. The use of public-key cryptography inherently allows for the verification of the consistency of the party on the other end, but not that party’s authenticity. That is, it is impossible for one party to uniquely identify the other without some sort of pre-existing arrangement or process. And the process of identifying the other party can be taxing.

As a solution, the Public-Key Infrastructure (PKI) was developed. This enables a trusted third party to establish a process for identifying an entity, such as a business or individual, and issuing a credential tied to that entity, so that anyone who trusts the third party can trust its process and, derivatively, trust any credential it issues. This operates as a digital analogue to a licensing agency such as a Motor Vehicle Agency. By trusting the agency’s process (identification, and eye, knowledge, and driving tests) and the certificates it produces (driving licences), one can be reasonably assured that the individual associated with that credential is who they say they are.

A PKI establishes a Certificate Authority (CA) to perform this function. A CA validates the identity of the holder of a particular key pair, then digitally signs a hash of that entity’s public key together with its identifying information. This produces a certificate issued to the requesting entity that can then be relied upon to verify that entity’s identity (as long as one trusts the CA’s process). The entity can then distribute the certificate to anyone, knowing that its private key, which is never included in the certificate or disclosed to others, is stored safely away.
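Certificate issuance and verification can be sketched as follows. This is a simplified model under stated assumptions: the CA signs a hash of the subject’s name and public key, the certificate is a plain dictionary rather than a real certificate format such as X.509, all names are hypothetical, and the toy RSA keys are built from publicly known primes and are not secure.

```python
import hashlib

E = 65537  # public exponent shared by both toy key pairs

def make_keypair(p: int, q: int) -> tuple:
    """Return ((e, n), (d, n)): a toy public key and its private key."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))   # Python 3.8+
    return (E, n), (d, n)

# Assumption: publicly known Mersenne primes, for illustration only.
ca_public, ca_private = make_keypair(2**107 - 1, 2**127 - 1)
subject_public, subject_private = make_keypair(2**61 - 1, 2**89 - 1)

def certificate_digest(name: str, public_key: tuple, modulus: int) -> int:
    """Hash the subject's name and public key; the private key is never involved."""
    data = f"{name}|{public_key[0]}|{public_key[1]}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

def issue_certificate(name: str, public_key: tuple, ca_private: tuple) -> dict:
    """The CA signs the digest with its own private key."""
    d, n = ca_private
    signature = pow(certificate_digest(name, public_key, n), d, n)
    return {"subject": name, "public_key": public_key, "signature": signature}

def verify_certificate(cert: dict, ca_public: tuple) -> bool:
    """Anyone who trusts the CA's public key can check the certificate."""
    e, n = ca_public
    expected = certificate_digest(cert["subject"], cert["public_key"], n)
    return pow(cert["signature"], e, n) == expected

cert = issue_certificate("example.test", subject_public, ca_private)
valid = verify_certificate(cert, ca_public)

# Altering the subject name invalidates the CA's signature.
altered = dict(cert, subject="impostor.test")
altered_valid = verify_certificate(altered, ca_public)
```

The certificate contains only public information, so it can be distributed freely while the subject’s private key stays with its owner.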

A PKI may use one or more Registration Authorities (RAs) to manage the registration processes and security of the Certificate Authority, allowing for greater flexibility in deployment and increased security for the CA. RAs are secured systems that act as a trusted advisor to the CA, processing Certificate Signing Requests and only sending authorized requests to the CA for signing.

Part IV: Certificate Uses
Authentication can be handled through the use of certificates. Because information can be targeted to a single entity, and that entity is identified by their CA-issued certificate, it is possible to allow that entity to assert and prove its authenticity using a digital certificate and corresponding private key. Certificate-based authentication is commonly used for connections between servers and clients.
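One common way to prove possession of a private key is a challenge-response exchange, sketched below. The protocol shape, the function names, and the toy RSA keys (built from publicly known primes, without padding, and therefore insecure) are all assumptions made for illustration; real deployments use established protocols and libraries.

```python
import hashlib
import secrets

E = 65537

def make_keypair(p: int, q: int) -> tuple:
    """Return ((e, n), (d, n)): a toy public key and its private key."""
    n = p * q
    return (E, n), (pow(E, -1, (p - 1) * (q - 1)), n)

def digest_as_int(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def respond(challenge: bytes, private_key: tuple) -> int:
    """Client side: sign the server's challenge with the private key."""
    d, n = private_key
    return pow(digest_as_int(challenge, n), d, n)

def check(challenge: bytes, response: int, public_key: tuple) -> bool:
    """Server side: verify using the public key taken from the client's certificate."""
    e, n = public_key
    return pow(response, e, n) == digest_as_int(challenge, n)

# Assumption: publicly known Mersenne primes, for illustration only.
client_public, client_private = make_keypair(2**107 - 1, 2**127 - 1)
_, impostor_private = make_keypair(2**61 - 1, 2**89 - 1)

# A fresh random challenge prevents an old response from being replayed.
challenge = secrets.token_bytes(32)

authenticated = check(challenge, respond(challenge, client_private), client_public)
impostor_accepted = check(challenge, respond(challenge, impostor_private), client_public)
```

The private key never leaves the client; the server learns only that the client was able to produce a valid response to this particular challenge.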

Encryption is made simpler with PKI by allowing anyone with access to a certificate to encrypt information to an entity with the knowledge that only that entity (as the sole holder of the certificate’s corresponding private key) can decrypt that information. Thus the key distribution problem inherent in symmetric cryptography is solved.

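[In some algorithms, this process is reversible, insofar as the private key can be used to transform information such that anyone with access to the certificate/public key can reverse the transformation. Because this provides no confidentiality, it is rarely used as a means of encryption; it is, however, the principle on which digital signatures rely.]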

Digital Signatures are also made possible with PKI by allowing an entity to digitally sign a block of data. This is accomplished by the entity using its private key to sign the data block such that anyone with access to its certificate may verify both that the entity signed the data and that the data have not been modified since they were signed.