Leanpub: Publish Early, Publish Often

Chapter 4: Secure Channels and Key Exchange

In the previous chapter, we looked at symmetric security. While fast, symmetric cryptography requires both sides to share the same key. However, it’s assumed that if encryption is required, the communication medium is insecure. In this chapter, we’ll look at some mechanisms for exchanging keys over an insecure channel.

Secure channel

The goal of this chapter is to set up a secure channel between two peers: one in which an attacker cannot eavesdrop or forge messages. Typically, we’ll use symmetric security to provide a secure channel, and some key exchange mechanism to set up the initial keys. This channel will be established in two directions, which means that there should be a separate set of keys (the combined encryption and authentication keys) for communications in both directions.

What does the secure channel look like? It’ll be message-oriented, for the same reasons described in the previous chapter. It will be built on top of an insecure channel, like the Internet, that can’t be trusted. An attacker might be able to disrupt or manipulate this insecure channel, for example; we also can’t hide how large messages are or who is talking. Ideally, messages will not be replayable or have their order changed (and that it will be apparent when either occurs).

The easiest way to prevent a replay is to keep track of message numbers; this number might be serialised as part of the message. For example, a message might be considered as the pairing of a message number and some message contents. We’ll track numbers for both sent and received messages.

type Message struct {
       Number   uint32
       Contents []byte
}

Serialising the message appends the contents to the 4-byte message number. The out variable is initialised with only four bytes, but with a capacity that accounts for the message contents.

func MarshalMessage(m Message) []byte {
       out := make([]byte, 4, len(m.Contents) + 4)
       binary.BigEndian.PutUint32(out[:4], m.Number)
       return append(out, m.Contents...)
}

Unmarshaling a message first checks the assumption that the message contains a sequence number and at least one byte of contents. Then, the message number and contents are extracted.

func UnmarshalMessage(in []byte) (Message, bool) {
       m := Message{}
       if len(in) <= 4 {
               return m, false
       }

       m.Number = binary.BigEndian.Uint32(in[:4])
       m.Contents = in[4:]
       return m, true
}

Including message numbers is only useful if they’re being checked. We’ll keep track of message numbers for a given session: as we’ll see in this chapter, we exchange unique keys for each session, so we only track message numbers for the scope of a session. A message replayed outside of a session won’t be decryptable, so we won’t worry about it. We’ll also want to keep track of the message number of messages we’ve sent and messages we’ve received separately. We also keep separate keys for each direction.

type Channel io.ReadWriter

type Session struct {
       lastSent uint32
       sendKey *[32]byte

       lastRecv uint32
       recvKey *[32]byte

       Channel Channel
}

func (s *Session) LastSent() uint32 {
       return s.lastSent
}

func (s *Session) LastRecv() uint32 {
       return s.lastRecv
}

When we encrypt a message as part of a session, we’ll set the message number to the last sent message number plus one. The serialised message number and contents are then encrypted, and handed off to be sent.

func (s *Session) Encrypt(message []byte) ([]byte, error) {
       if len(message) == 0 {
               return nil, secret.ErrEncrypt
       }

       s.lastSent++
       m := MarshalMessage(Message{s.lastSent, message})
       return secret.Encrypt(s.sendKey, m)
}

Now, ensuring that we have an incremented message number is a requirement for decryption. If the message number hasn’t incremented, we assume it’s a replayed message, and considered it a decryption failure.

func (s *Session) Decrypt(message []byte) ([]byte, error) {
       out, err := secret.Decrypt(s.recvKey, message)
       if err != nil {
               return nil, err
       }

       m, ok := UnmarshalMessage(out)
       if !ok {
               return nil, secret.ErrDecrypt
       }

       if m.Number <= s.lastRecv {
               return nil, secret.ErrDecrypt
       }

       s.lastRecv = m.Number

       return m.Contents, nil
}

This example could be trivially extended to include other message metadata; using an AEAD like GCM means that this metadata doesn’t need to be encrypted. However, we’ll prefer to encrypt as much as we can to limit the amount of information about messages to an eavesdropper.

A more complete example of sessions can be found in the example source code in the chapter4/session/ package.

Password-based key derivation

The simplest mechanism for exchanging keys is to use a password to derive the key. There are a few choices for doing this, but the one we’ll prefer is scrypt; it is provided in the golang.org/x/crypto/scrypt package. Internally, scrypt uses another key derivation function (KDF), called PBKDF2, but it increases the resource requirements. This requires an implementation to choose between using a lot of memory or a lot of CPU in an attempt to reduce the effectiveness of hardware implementations, making them more expensive to produce.

The resource requirements of scrypt are controlled by its parameters:

N: the number of iterations. In the original presentation[Perc09b] for the algorithm, 16384 (2¹⁴) is recommended for interactive logins and 1048576 (2²⁰) for file encryption. In this book, we use 2²⁰ as a default for securing cryptographic secrets; a one-time initial cost to derive the key is acceptable for the applicatons we’ll use it for.
r: relative memory cost parameter (controls the blocksize in the underlying hash). The original presentation recommends a value of 8.
p: relative CPU cost parameter. The original presentation recommends a value of 1.

On the machine I’m writing this on, a 2010 ThinkPad with a 2.67 GHz quad-core Intel Core i5 M560 and 6G of memory, scrypt takes on average 6.3s to derive a key; there isn’t an appreciable timing difference between generating 32-byte NaCl or AES-GCM keys and generating 80-byte AES-CBC or AES-CTR keys due to the way that key material is derived.

In order to generate a key, scrypt takes a password and a salt. The salt is not a secret, and we’ll prepend it to data encrypted using an scrypt-derived key. The salt is passed to PBKDF2, and is used to prevent an attacker from just storing (password,key) pairs as a different salt yields a different output from scrypt. Each password has to be checked with the salt used to derive the key. We’ll use randomly-generated 256-bit salts as recommended in [Ferg10].

In the previous chapter, it was mentioned that our key exchange method permitted us to use random nonces. When we use scrypt for key exchange, each encryption uses a new salt. This effectively means that each encryption will use a different key, even though we’re using the same passphrase. This means it is astronomically unlikely that we’ll reuse a nonce with the same key.

Using scrypt in Go is a one-liner (plus error checking), so let’s try encrypting something with NaCl using a passphrase. The first step is to write the key derivation function, which mostly translates the byte slice returned by scrypt.Key to the *[32]byte used by the secretbox package.

// deriveKey generates a new NaCl key from a passphrase and salt.
func deriveKey(pass, salt []byte) (*[secret.KeySize]byte, error) {
       var naclKey = new([secret.KeySize]byte)
       key, err := scrypt.Key(pass, salt, 1048576, 8, 1, secret.KeySize)
       if err != nil {
               return nil, err
       }

       copy(naclKey[:], key)
       util.Zero(key)
       return naclKey, nil
}

The encryption function will take a passphrase and a message, generate a random salt, derive a key, encrypt the message, and prepend the salt to the resulting ciphertext.

func Encrypt(pass, message []byte) ([]byte, error) {
       salt, err := util.RandBytes(SaltSize)
       if err != nil {
               return nil, ErrEncrypt
       }

       key, err := deriveKey(pass, salt)
       if err != nil {
               return nil, ErrEncrypt
       }

       out, err := secret.Encrypt(key, message)
       util.Zero(key[:]) // Zero key immediately after
       if err != nil {
               return nil, ErrEncrypt
       }

       out = append(salt, out...)
       return out, nil
}

The derived key is only needed in the call to secret.Encrypt: it is derived immediately before the call, and it’s zeroised immediately after the call. This is an attempt to keep the secret in memory only as long as it’s needed. Likewise, the passphrase should be immediately zeroised by the caller after a call to this function. Error handling can wait until after the key has been zeroised, which helps to prevent the possibility of the key material leaking out of this function.

To decrypt, we check our assumption about the length of the message (that is, a properly encrypted passphrase-secured message will have a salt prepended, plus the NaCl nonce and overhead). Then, we derive the key, decrypt, and return the plaintext.

const Overhead = SaltSize + secretbox.Overhead + secret.NonceSize

func Decrypt(pass, message []byte) ([]byte, error) {
       if len(message) < Overhead {
               return nil, ErrDecrypt
       }

       key, err := deriveKey(pass, message[:SaltSize])
       if err != nil {
               return nil, ErrDecrypt
       }

       out, err := secret.Decrypt(key, message[SaltSize:])
       util.Zero(key[:]) // Zero key immediately after
       if err != nil {
               return nil, ErrDecrypt
       }

       return out, nil
}

Once again, an attempt is made to limit the scope of key material in memory. It’s no guarantee, but it’s the best we can do.

Using passwords is also called a “pre-shared secret”; pre-shared means that the password has to be exchanged over some other secure channel. For example, your wireless network probably uses encryption with a pre-shared secret (your wireless password); this password is used to secure network communications, and you probably tell your friends the password which is fairly secure under the assumption that an attacker targeting your network will not be physically present (unless they have installed a remote-access tool on your machine and is listening on the microphone). Generally, the attackers who can eavesdrop on this verbal key exchange aren’t the ones that you are concerned about eavesdropping on your network traffic.

Asymmetric key exchange: ECDH

Another alternative is to agree on symmetric keys by performing a key exchange using asymmetric keys. In particular, we’ll use an asymmetric algorithm based on elliptic curves ([Sull13]) called the Elliptic Curve Diffie-Hellman key agreement protocol, which is a variant on the Diffie-Hellman key exchange for elliptic curves.

An asymmetric encryption algorithm, or public key encryption algorithm, is one in which the key used to encrypt and the key used to decrypt are different. The key that’s used to encrypt is made public, and anyone who has this public key can encrypt a message to it. The decryption key is kept private by the holder. In symmetric security, all communicating parties have to share keys, which means (N * (N - 1)) / 2 keys. With asymmetric keys, only N public keys have to be exchanged.

In an asymmetric key exchange, both sides have a key pair consisting of a private component and a public component. They send each other their public components, even over an insecure channel, and ECDH means that combining your private key and their public key will yield the same symmetric key as combining their private key and your public key. However, ECDH can only be performed between keys that use the same curve.

Generating elliptic curve key pairs is generally a fast operation, and in this book, we’ll use ephemeral key pairs for sessions: key pairs that are generated at the start of each session, and discarded at least once the session is over, if not earlier. Using ephemeral keys limits the damage of key compromise: an attacker who manages to recover an ephemeral key can only decrypt messages for that session.

Introducing asymmetric cryptography into a system brings a lot of other concerns, as mentioned in the engineering concerns chapter. It comes down to trust, and at some point there needs to be a root of trust that all participants in a particular key exchange agree to trust. This is not a trivial concern, and should be considered carefully. While it does afford the ability to perform key agreement over an insecure channel without a pre-shared secret, the benefits of this will need to be weighed against the other factors. For example, we’ll look at digital signatures in an upcoming chapter, but for now it suffices to note that often a long-term identity key is used to sign session keys to prove their ownership. Now both sides have to determine how they’ll get the other’s identity key, how they’ll know they can trust it, and how they can match the public key to the peer they’re talking to. Some method for distributing and trusting public keys has to be determined.

NaCl: Curve25519

In NaCl, generating new key pairs and performing a key exchange is simple, and provided by functions in golang.org/x/crypto/nacl/box.

1 pub, priv, err := box.GenerateKey(rand.Reader)

We can use the Session defined earlier to set up to perform the key exchange. In this function, the dialer is the party who initiated the session.

func NewSession(ch Channel) *Session {
       return &Session{
               recvKey: new([32]byte),
               sendKey: new([32]byte),
    Channe: ch,
       }
}

func keyExchange(shared *[32]byte, priv, pub []byte) {
       var kexPriv [32]byte
       copy(kexPriv[:], priv)
       util.Zero(priv)

       var kexPub [32]byte
       copy(kexPub[:], pub)

       box.Precompute(shared, &kexPub, &kexPriv)
       util.Zero(kexPriv[:])
}

func (s *Session) Rekey(priv, peer *[64]byte, dialer bool) {
       if dialer {
               keyExchange(s.sendKey, priv[:32], peer[:32])
               keyExchange(s.recvKey, priv[32:], peer[32:])
       } else {
               keyExchange(s.recvKey, priv[:32], peer[:32])
               keyExchange(s.sendKey, priv[32:], peer[32:])
       }
       s.lastSent = 0
       s.lastRecv = 0
}

The Precompute function actually carries out the key exchange. The NaCl package can also do the key exchange for each message, though that may be less efficient, depending on the application.

1 out := box.Seal(nil, message, nonce, peerPublic, priv)

This might be useful for a one-off message. In this case, it’s usually useful to generate a new ephemeral key pair for each message that is used for the key exchange, and pack the ephemeral public key at the beginning so the receiver can decrypt.

In NaCl, all key pairs use the same elliptic curve, Curve25519 ([DJB05]). This makes the interface simple, and simplifies exchanging public keys: they will always be 32 bytes. As long as we’re using Curve25519, we won’t have to worry about which curve the sender might be using.

NIST curves

The Go standard library has support for the commonly-used NIST curves, of which there are several. In order to communicate using these NIST curves, both sides have to agree on which curve to use. This adds additional overhead to the previously mentioned considerations: now, the system designer has to consider not only where keys come from and how they’re trusted, but which type of curve to use. There is serious concern over the provenance of these curves and whether they have been designed with an NSA backdoor; they should be used only when required for compatibility or as part of a specification. They can be generated using

1 priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)

The three curves of interest are P256, P384, and P521 (or secp256r1 / prime256v1, secp384r1, and secp521r1). P521 is believed to have an equivalent security level to AES-256 (notwithstanding questions about the integrity of the curve), so it is appropriate for use with the AES-256 plus HMAC ciphersuites.

Doing this with the Go crypto/ecdsa package is more involved. We first have to verify that the curves match, and then perform the scalar multiplication (or point multiplication). The number that is returned is the shared key, but it won’t be uniformly random. We do want our shared key to have this indistinguishability property, so we’ll compute the SHA-512 digest of this number which produces a 64-byte value: this is large enough to use with AES-256-CBC with HMAC-SHA-256 and AES-256-CTR with HMAC-SHA-256.

var ErrKeyExchange = errors.New("key exchange failed")

func ECDH(priv *ecdsa.PrivateKey, pub *ecdsa.PublicKey) ([]byte, error) {
       if prv.PublicKey.Curve != pub.Curve {
               return nil, ErrKeyExchange
               return
       }

       x, _ := pub.Curve.ScalarMult(pub.X, pub.Y, prv.D.Bytes())
       if x == nil || (x.BitLen()+7)/8 < secret.KeySize {
               return nil, ErrKeyExchange
       }

       shared := sha512.Sum512(x.Bytes())
       return shared[:secret.KeySize], nil
}

The crypto/ecdsa keys are used here instead of the ones defined in the crypto/elliptic package due to the support for serialising ECDSA keys in the crypto/x509 package (i.e. using the MarshalPKIXPublicKey and ParsePKIXPublicKey functions).

func ParseECPublicKey(in []byte) (*ecdsa.PublicKey, error) {
       // UnmarshalPKIXPublicKey returns an interface{}.
       pub, err := x509.UnmarshalPKIXPublicKey(in)
       if err != nil {
               return nil, err
       }

       ecpub, err := pub.(*ecdsa.PublicKey)
       if !ok {
               return nil, errors.New("invalid EC public key")
       }

       return ecpub, nil
}

Due to the concerns about the NIST curves, we’ll mostly avoid using them in this book (though they are defined for some other systems we’ll look at), and will primarily stick to Curve25519.

Other key exchange methods

Password-authenticated key exchanges are another mechanisms for agreeing on a symmetric key that also use a public key. I haven’t seen any implementations in Go, so we won’t be using them in this book.

Another mechanism is the original Diffie-Hellman algorithm; one implementation is at “github.com/dchest/dhgroup14”. We won’t use DH except in the context of ECDH in this book, though.

More complex mechanisms such as that found in Kerberos are also used.

We’ll look at exchanging keys using an asymmetric algorithm called RSA later, though this won’t be our preferred mechanism. Elliptic curve keys can be generated fast enough to make ephemeral keys practical; RSA key generation is much slower and RSA implementations are difficult to get right. The alternative is to use RSA to sign the ephemeral key; this is often used with Diffie-Hellman.

Practical: File encryptor

To put this into practice, try to build a file encryption program that uses passwords to generate the key. First, think about how the program should work: should it encrypt multiple files, or a single file? What does the security model look like? What kinds of platforms does it need to run on? Are there compatibility requirements?

My solution to this is in the filecrypt repository. Your requirements and specification (and therefore security model) may be different, so you may end up with a different solution.