Leanpub: Publish Early, Publish Often

Updates

As soon as your system reaches a certain level of complexity, you will require updates. Reason for that are the inevitable bugs and feature requests.

And as bugs can be vulnerabilities you will not be offering a secure solution without updates.

The software offered today is highly complex. Caused by the foundation that is used: Operating Systems, libraries, frameworks, …

Even CPU microcode can be updated.

Updates are the solution. And they add more complexity.

Things that a good update strategy covers

Be able to rollback
Verify signature
Secure key handling
Verify version - only upgrade and no downgrades
Track distribution
Be able to stop update roll outs
Distributing the updates
Compress updates
Diff updates
Incentives to update
Automatic updates

Things to avoid

TOCTTOU Time of check to time of use - error
Break the update chain

Especially breaking the update chain (for example by ruining the update on the client system) is one of the biggest fears.

Response Time

This is the time you have to response to an incident. Normally it is the span between learning about an issue and successfully updating all your customers systems.

If a vulnerability gets known to the bad guys, expect a time-to-exploit of a few days. Normal process is that someone identifies the vulnerability and writes an exploit (plus a blog post). The exploit is then stabilized and integrated into attack systems (metasploit for the more white hack side, exploit kits for the bad guys).

If a vulnerability gets reported as “responsible disclosure” by a friendly hacker you will have about 90 days (standard) to release the fix to all customers. If you have good reasons (complex situation where it is hard to get all your users updated) you can maybe negotiate and get some more days. But there is a very good reason for the 90 days: It is a reasonable time to get the fix released and a longer time frame would increase the risk that a bad guy finds the same vulnerability in parallel. This happened more often than you think.

Short: You will not be able to ship the fixes with the next regular update. Be ready to for quick response. Build your processes from support call to all-your-users update accordingly.

One more thing: As soon as the first client gets an update, the bad guys could get it as well. Normally they do a binary diff (using disassembly tools). From that diff they can learn which vulnerability you just patched. The race is on between the bad guys creating a reliable exploit and you getting the update to your last customer.

These disassembler tools may sound arcane, but are quite simple to find and sophisticated.

IDA, paid and free test version. For Mac/Win/Linux
Hopper, Mac/Linux. Test version available
Radare2, Mac/Windows/Linux/… %% TODO Add Ghidra

Update strategy details

Be able to rollback

Be able to roll back your changes either automatically or by user interaction if the update breaks something. User interaction could be “press the reset button for 10 seconds”. Rollback could be to the factory system or last-know-good software state. This will reduce your stress when doing emergency updates.

Verify signature

Cryptographically sign your binaries (public key cryptography is the keyword).

One way of signing your binaries is to create at least a sha256 hash of your binaries (md5, CRC and sha1 are broken). Create a update config file containing all your files, their versions and their hashes. This update config file should also have a unique version number that is incremented only.

Now sign this config file using public key signatures.

This config file will be checked against the public key file on the end-system. If the key matches and the version is bigger than the current one the binaries can be downloaded and verified against the hashes.

Secure key handling

The secret key must be stored on an isolated system. Not a build machine and especially not the download server. You sign the update binary or config config prior to release with this key.

You will want a second emergency key (its public part also distributed to the clients). This one is stored in a safe. If the first key is compromised you can can revoke it and do an emergency update now binding the emergency key to the clients.

Verify version - only upgrade

Only update to a newer version. No updates to older versions. If you ever want to roll back a change, release an old version of your programs with a higher version number.

The reason for this is that an attacker could trick clients to install updates with older versions of your program that still contains known vulnerabilities. And hack the systems afterwards.

Track distribution

You want to track your update roll out. And if systems are functional post-update. Especially that.

Be able to stop updates

If you roll out a broken update, you should be able to remove it from your update servers very quick. That way other users do not risk ruining their systems. Automatic roll back (mentioned a few chapters above) will take care for the updated systems.

Slowly increasing the percentage of client systems getting the update offered is also part of this. Start with a few percent, measure success, increase percentage.

Result of these precautions is that everyone (including management) feels much more comfortable rolling out updates. And this will make it simpler to update clients with a new version that contains “only one security fix that is not even abused - as far as we know”.

Distributing the updates

If your updates are big your servers will not have enough bandwidth for to update all clients in time. You could use a CDN for your updates. To further reduce load update in stages:

Check first if there is a newer version than the one installed. This is a few bytes only
Download config file. Verify it.
Download binaries

As you have your files signed, an option to be protected against DDOS-es vs. your update infrastructure would be to also have a fallback to a peer-to-peer network (BitTorrent). Some bigger software applications (games) use BitTorrent. Do not make that the only source ! BitTorrent could be blocked in some environments like companies. Also do not waste the users bandwidth without asking. It could be expensive for some of them if they are connected using a mobile network.

Compress updates

Compressing updates reduces stress on your update servers and makes them more resilient against DDOS attacks. Test common algorithms with your update files. The algorithms have different strengths and weaknesses. Also: De-compressing is very often much faster than compressing. And this is what your clients are doing.

Diff updates

If you are shipping large updates, you want to ship them as diffs to previous versions. Depending on your file types there can be different diff algorithms that are quite effective. Please test before using. For PE files there is “courgette”. Always verify after diffing if the created file is the proper one (by hash). If something went wrong, fall back to whole-file download.

That way there is no additional risk.

Incentives

Some users care more about new Emojis than about vulnerability fixes. Maybe bundle some eye candy (new backgrounds, …) with the updates to get the users to update. Sad but true.

Plan for several channels

When creating an updater be ready to add new channels later. Valuable channels to have are for example an “internal” and a “beta” channel.

Automatic updates

Do updates in the background. Without user interaction. Especially if you have corporate users offer an Opt-out hidden in some config. Companies want to test before they deploy to thousands of users.

If the update requires a restart of the software think about waiting for the user to naturally restart it - with a timer. If the user does not restart it after X days nudge him.

Control the update infrastructure

It is essential that you control your update infrastructure. There are several things that can break given time:

certificates run out
You lose ownership of your domain
server needs system updates

Plan ahead and create scripts that test your current situation and warn you weeks ahead before your domain ownership runs out or before you should create a new certificate.

Another thing to control is the certificate that signs the updates. It must not be on the update server. It also should not be on the build server (where lots of people from you company can access it) but on a dedicated system.

Details for things to avoid

TOCTTOU Time of check to time of use - error

Wikipedia on TOCTOU

A possible issue when checking the signature of an update package: If the package is on an external medium and the updater is not copying it locally, the attacker can switch the (legit) package after verification with a malicious package.

You verify the (clean) external data’s signature
Attacker replaces clean software with malicious software
Your tool now installs the malicious software from external medium

Solution: Copy the package into a local storage before verification and installation.

Break the update chain

Never break the update chain for your devices. As long as updates work you can fix anything. Focus of your testers should be to (automatically) test updates over several versions.

Up next

Passwords