Addresses
Postal addresses are an important link between the virtual realm of the Internet and the physical world. Apart from the obvious role in shipping the stuff people buy online, knowing users’ addresses is also critical for calculating delivery prices, correctly accounting for tax, and applying territory-specific limitations.
But postal addresses often play three more roles in software that they were never intended for. In the absence of globally unique personal identifiers, addresses are used to distinguish between two people with the same name, especially in countries that don’t have mandatory ID cards, such as the UK. Addresses also often serve as an additional piece of personal identification to match records from different systems, for example when banks check credit ratings. And parts of addresses, such as zip or post codes, are increasingly used as semi-secret information to prevent fraud, for example when verifying online credit card transactions.
For hundreds of years, postal delivery processes evolved to deal with inconsistent and incomplete addresses, but the new digital roles for address information require precise and uniform data. As with names, the conflicting roles of addresses create plenty of opportunities for software bugs.
Here are some often-overlooked facts that cause problems when handling addresses in software:
ZIP codes or post codes aren’t mandatory. Some countries, such as Fiji and the UAE, don’t use post codes at all.
Post code formats aren’t permanent; they change over time. For example, Singapore used two digits in the ’60s, four digits in the ’80s, and now uses six digits. Older records with post codes might use different formats from the current ones.
Some countries started using post codes relatively recently. For example, nationwide post codes were introduced in Ireland in 2015. In such cases, even though current addresses might have post codes, slightly older address records might not have that information.
Post codes aren’t always consistently used, even within a single country. For example, Jamaica doesn’t use post codes (the country tried to, but the system was suspended in 2007), but there are two-digit area codes for the capital, Kingston. China uses post codes, but Hong Kong does not.
There’s no universally agreed length for post codes. For example, Austria and Switzerland use four-digit codes. The Faroe Islands use three-digit codes. Iran uses codes of up to ten digits.
Post codes aren’t just numeric; many countries use alphanumeric post codes. For example, EC11AA is a valid UK post code.
Post codes can contain spaces. For example, EC1 1AA is a common way of writing a post code in the UK.
Having the same or similar post codes doesn’t necessarily imply physical proximity. For example, rural codes in New Zealand can be far apart.
Post codes aren’t always the same in a city or area. In the UK, post codes are allocated to estates, blocks, buildings or even individual houses.
There’s no universally agreed minimum or maximum length for location names, including for street names or city names. For example, Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch is a place in Wales, and Y is a place in France. There are six villages called Å in Norway.
IP addresses aren’t a reliable link to a physical location. Although many home broadband subscribers now effectively have an allocated IP address, there are too many exceptions and ways to spoof this information.
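The post code facts above suggest treating post codes as optional free text: normalize them gently, compare them without regard to spacing, and flag anything unexpected instead of rejecting it. Here is a minimal Python sketch of that idea. The function names, the tiny pattern table and the list of countries without post codes are my own illustrative assumptions, built only from the examples in this section; they are not a complete or authoritative rule set, and they need checking against current postal rules.

```python
import re
from typing import Optional

# Hypothetical, deliberately incomplete table of current formats, keyed by
# ISO country code. It covers only a few examples from this section and is
# an assumption for illustration, not an authoritative rule set.
KNOWN_PATTERNS = {
    "AT": re.compile(r"\d{4}"),  # Austria: four digits
    "CH": re.compile(r"\d{4}"),  # Switzerland: four digits
    "FO": re.compile(r"\d{3}"),  # Faroe Islands: three digits
    "SG": re.compile(r"\d{6}"),  # Singapore: six digits (current format)
}

# Places where a missing post code is normal (again, examples only).
NO_POST_CODE_EXPECTED = {"FJ", "AE", "HK"}


def normalize_post_code(raw: Optional[str]) -> Optional[str]:
    """Trim, collapse inner whitespace and upper-case; None if empty."""
    if raw is None:
        return None
    cleaned = " ".join(raw.split()).upper()
    return cleaned or None


def same_post_code(a: Optional[str], b: Optional[str]) -> bool:
    """Compare two post codes, ignoring case and spacing."""
    na, nb = normalize_post_code(a), normalize_post_code(b)
    if na is None or nb is None:
        return False
    return na.replace(" ", "") == nb.replace(" ", "")


def check_post_code(country: str, raw: Optional[str]) -> str:
    """Return 'ok', 'missing' or 'suspicious'; never reject outright."""
    code = normalize_post_code(raw)
    if code is None:
        return "ok" if country in NO_POST_CODE_EXPECTED else "missing"
    pattern = KNOWN_PATTERNS.get(country)
    if pattern is None:
        return "ok"  # no known rule for this country: accept as entered
    return "ok" if pattern.fullmatch(code) else "suspicious"
```

With this sketch, same_post_code("EC1 1AA", "ec11aa") is true, and an old four-digit Singapore code such as "0104" comes back as "suspicious" instead of being refused, so an older record can still be stored and reviewed later. A missing post code is reported as "missing" and left for the caller to decide, because older records from countries that adopted post codes recently may legitimately lack one.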
Out of all the categories of problems, real-world rules around addresses seem to change the most frequently at the moment. The examples and edge cases listed above were correct when I wrote this in 2017, but do check.