Tuesday, 13 February 2018

Some Timely Fundamentals of Encryption


This post is in response to some questions about the mechanics of encryption that have been getting more frequent as cryptocurrency and related technologies become more well-known.




First, some definitions:

Alice: The traditional placeholder name for the sender of an encrypted message.

Bob: The name for the INTENDED receiver.

Oscar: The name for a theoretical unintended receiver.


Plaintext: The message Alice wants to send to Bob, in a readable format.

ciphertext: The message that Alice actually sends to Bob, which is the plaintext, but encrypted (made much harder for Oscar to read).

Decryption: The opposite of encryption; the conversion of ciphertext back into the original plaintext.

Hash: A ciphertext of some information that is not intended to be decrypted. A set of data will always produce the same hash, but the hash is 'lossy' in that it does not contain all of the information in the original data. The only reliable way to reproduce a hash is to encrypt the same plaintext. Passwords on reputable websites are stored as hashes.

Symmetric Cipher: A system in which the same key is used to encrypt a message as decrypt it. There are many different such systems, but only a few are used because they are proven to be strong. The primary advantage of symmetric encryption is computational speed.

Asymmetric Cipher: A system with a public key and a private key. A plaintext that has been encrypted with a public key cannot be decrypted by it. Only the private key can decrypt the ciphertext. The advantage of such a system is that only the public key ever needs to given out for communication, and it shouldn't matter who has that. Every asymmetric encryption



Q1: I'm not a spy or a criminal or a bitcoin nerd. Why should I care?

A1: Because encryption is vital to identity verification. When you use information that identifies you digitally, such as using a bank or credit card, or whenever you use a password on a website, encryption is involved to confirm that it is actually you (or the cardholder) making the purchase or logging in.


Q2: So what's the deal with bitcoin and encryption?

A2: Encryption comes into bitcoin, and other cryptocurrency, in the protection from counterfeiting. In order to 'create' more bitcoin, you need to guess a hash based partially on a record of all the bitcoin transactions [Footnote 1] that have ever happened, and partially on some information that literally nobody knows in advance. Guessing the 'winning' hash [Footnote 2] credits your bitcoin account with a reward, and only one person can claim this award every 10 minutes across the entire system.



The best known way to get the winning hash is by blind guessing. This has two big implications:

1) There are so many possible hashes that it takes a tremendous amount of computer power to have a good chance at a winner. In short, it takes a great deal of computing work to make new bitcoin. The result is an object that represents a large amount of work that can't be feasibly counterfeited.

2) A record of all transactions up to a given time point is required to even make guesses. This requirement incentivizes the public record-keeping required to make bitcoin transactions function.

Combining (1) and (2), you have an object that is hard to counterfeit and easy to transfer, two essentials for a currency.



Q3: So what's 'better', symmetric or asymmetric?

A3: In modern commerce, it's asymmetric encryption that matters.

For example, consider making a purchase with a bank card. The bank may have a public key, and data like your account number, the merchant number, and the amount of the purchase could all be encrypted with that public key. That ciphertext can be sent safely over the internet. Even if someone gets a copy of the ciphertext, they can't do anything with it without the bank's private key. They can't get your account number, and they can't send a similar ciphertext message to the bank pretending to be you.

If something large, like gigabytes of confidential data, needs to be transmitted (or carried in a cellphone or hard drive), and an asymmetric cipher would be too slow, the key for a symmetric cipher can be sent with a public key, and then everything else could be encrypted with a symmetric cipher. That way, the speed of a symmetric cipher is used while maintaining the flexibility of an asymmetric cipher.


Q4: What's the deal with prime numbers and encryption?

A4: Many asymmetric ciphers known to the public uses prime numbers to make its public and private keys. A public key is the product of prime numbers multiplied together (e.g. 77). The private key is the original prime numbers (e.g. 7 and 11). It is easy to multiply two numbers, but extremely hard to retrieve the original two numbers, especially if the two numbers were large. The best known methods are slightly [Footnote 3] better than guessing one of the numbers and checking. (e.g. Does 2 divide 77? No. Does 3? No. Does 5? No. Does 7? Okay, yes.)



Q5: What else can asymmetric encryption do.

A5: Lots of things! There are ciphers for which the cipher text cannot be decrypted by one private key, but by any three of a set of five private keys. If five people each have one private key, then each key represents a 'vote' to decrypt something.

At a larger scale, each person could send their voting ID and selection the same way they would send their bank transactions.

For more information:
Voting implementation example: http://www.zuj.edu.jo/wp-content/uploads/2014/05/PA_05.pdf

Computerphile's argument against computer voting: https://www.youtube.com/watch?v=w3_0x6oaDmI

I'm personal against it, along with other problems, I would concerned about one person with a quantum computer, or even a quantum annealer breaking the lock and posting infinity votes.



Q6: What does quantum computing change?

A6: Everything. Although stable, large scale quantum computers do not currently exist (we think), programs for them have been under development for decades.

Each additional processor added to a classical computer adds the same amount of capacity; a system 10 cores is at most 10 times as fast as a 1-core. Every qubit added to a quantum computer DOUBLES this capacity as it applies to classical problems; a system 10 qubits are 2048 times as fast as a 1-qubit system.

Shor's algorithm can use this capacity to retrieve the product of two primes into their original primes, and it can do so in linear time. This would render many asymmetric encryption systems used today breakable overnight. For more, see https://en.wikipedia.org/wiki/Shor%27s_algorithm



Q7: Why isn't there a wider variety of encryption systems in place?

A7: First, the mainstream ones are pretty good. The more they are used, the greater the incentive to find exploits. The fact that exploits in these encryption systems are rarely if ever found is a testament to their strength.

It's easy to make a cipher, but it's very hard to prove the security of the cipher. Until a cipher is tested extensively by people other than the system's creator, there is no reason to believe it cannot be broken. Without a reason to test the cipher, such as a large cash prize protected by the cipher to be tested, nobody will bother.


Q8: Does that stop people from trying.

A8: Some, not all. My undergraduate research project was to develop an encryption system. I called it Freakazoid for its obnoxious level of unpredictability, and although I don't have the means to prove its security, I'm still proud of it 10 years later. Freakazoid is described in the post in this link.



Footnotes

1. The transactions themselves are in plaintext, but they're all of the form 'Address X sends Y bitcoin to address Z', where the addresses are randomly generated numbers. No identities are attached to the addresses, which keeps bitcoin transactions anonymous.

2.  There could be more than one winning hash, and the number of hashes that are winners is automatically adjusted to make it unlikely that two people will guess a winning hash in a close time to each other. For more information, look up “bitcoin difficulty”.

3. 'Slightly better' isn't exactly true. There is a method called a 'general number field sieve' that can factor extremely large numbers. However, even this cannot factor primes of n bits in length in a time that is polynomial with respect to length n.