Personal names

Personal identifiers such as names are a crucial piece of our identities. Many software problems result from a fundamental conflict between the two key aspects of names. On one hand, names are personal, so they carry a lot of cultural and family heritage, tracing back to a time long before computers. Specific spellings, accents and parts of names have meaning and can’t just be simplified or changed to make processing easier. On the other hand, useful computer identifiers need to be standardised and easy to store and process. In many cases, various software systems need to agree on someone’s identity. All those systems were designed by different people, working under specific constraints, making their own assumptions. That’s why small inconsistencies and bugs in handling names in one component can easily create a mess in collaborating systems.

Here are some often-overlooked oddities of personal names, which you should remember when designing software:

Single-letter names aren’t always initials, so it’s bad practice to use length checks to prevent people from entering initials (Stephen O, A Martinez, O Rissei).

There’s no universally acceptable standard for maximum name length. The International Civil Aviation Organization (ICAO) allows up to 64 characters per name. Many governments today limit registered baby names to those that fit on a passport, which may be up to 40 letters. Of course, different governments have different standards. Also, people born before machine-readable passports were introduced weren’t subject to that restriction. Some names can get very long (e.g. Christodoulopoulos, Srinivasaraghavan, StopFortnumAndMasonFoieGras, Wolfeschlegelsteinhausenbergerdorff, Rhoshandiatellyneshiaunneveshenk, Keihanaikukauakahihuliheʻekahaunaele). Double-barrelled surnames can also get quite long (e.g. Plunkett-Ernle-Erle-Drax).

Names aren’t permanent, and in most countries people can easily change their names as many times as they want.

Names don’t just consist of letters. They can contain accents and apostrophes (O’Stephen, Keihanaikukauakahihuliheʻekahaunaele), dots (GoVeg.com), dashes (Thurman-Busson), numbers (Number 16 Bus Shelter, Jon Blake Cusack 2.0) and probably some other classes of symbols. It’s best not to assume any specific character set for validity checks.
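As a rough illustration of a validity check that assumes no specific character set, here is a minimal Python sketch. The function name clean_name and the decision to reject only empty input and control characters are assumptions made for this example, not a recommended standard.

```python
import unicodedata


def clean_name(raw: str) -> str:
    """Tidy a name for storage without restricting which characters it uses."""
    # NFC stores accented letters consistently without changing how they look.
    name = unicodedata.normalize("NFC", raw).strip()
    if not name:
        raise ValueError("name is empty")
    # Assumption for this sketch: only control characters (category Cc),
    # such as tabs or newlines, are treated as invalid.
    if any(unicodedata.category(ch) == "Cc" for ch in name):
        raise ValueError("name contains control characters")
    return name
```

With this approach, single letters, digits, dots, dashes and apostrophes all pass unchanged, and there is no length check to trip over.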

People don’t always have a given name and a surname. Some people are mononymic – they have only a single name (e.g. They, Teller, Naqibullah). It’s best not to ask for first and last name separately. When communicating with external systems, make sure you can handle cases in which one of those two fields is missing.
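One possible way to cope with a missing field is sketched below in Python. The PersonName type, its field names and the from_external helper are all hypothetical, and joining the parts in given-then-family order is an assumption that won’t suit every culture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonName:
    full_name: str                     # the name as a single string
    given_name: Optional[str] = None   # filled only if the external system sent it
    family_name: Optional[str] = None  # filled only if the external system sent it


def _blank_to_none(value: Optional[str]) -> Optional[str]:
    """Treat missing and whitespace-only fields the same way."""
    if value is None or not value.strip():
        return None
    return value.strip()


def from_external(given: Optional[str], family: Optional[str]) -> PersonName:
    """Build a name from an external record where either field may be missing."""
    given, family = _blank_to_none(given), _blank_to_none(family)
    parts = [p for p in (given, family) if p is not None]
    if not parts:
        raise ValueError("record has no name at all")
    # Assumption: given name first, then family name.
    return PersonName(" ".join(parts), given_name=given, family_name=family)
```

The rest of the system can then rely on full_name alone and treat the two separate parts as optional extras.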

There’s no universally accepted standard for working with mononymic names. Many government systems require first and last names to be recorded separately, and some will set mononymic names as the given name, some as the surname. Some use the mononymic name for both fields (Neli Neli). When matching names against external sources, consider that the sources might be using different schemes for single names. Some countries use markers such as FNU, LNU or XXX for the other name when recording mononymic people. Detect those markers and consider them when matching or validating external records, so you don’t end up interpreting them as given names (No Name Given Sandhya). But don’t assume these are always markers (someone can theoretically change their name to XXX).
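The matching advice above could look something like the following Python sketch. The PLACEHOLDERS set contains only the markers mentioned here, and the function names are hypothetical; a real list of markers depends on the external source.

```python
from typing import Optional, Set

# Assumed marker list; real sources may use others.
PLACEHOLDERS = {"FNU", "LNU", "XXX"}


def mononym_candidates(given: Optional[str], surname: Optional[str]) -> Set[str]:
    """Return the single names an external two-field record might stand for."""
    g = (given or "").strip()
    s = (surname or "").strip()
    candidates: Set[str] = set()
    if g and not s:
        candidates.add(g)              # mononym stored as the given name
    elif s and not g:
        candidates.add(s)              # mononym stored as the surname
    elif g and s:
        if g.casefold() == s.casefold():
            candidates.add(g)          # same name repeated in both fields
        if g.upper() in PLACEHOLDERS:
            candidates.add(s)          # marker in the given-name field
        if s.upper() in PLACEHOLDERS:
            candidates.add(g)          # marker in the surname field
    return candidates


def matches_mononym(mononym: str, given: Optional[str], surname: Optional[str]) -> bool:
    """Check whether a single-name person could be the one in the external record."""
    wanted = mononym.strip().casefold()
    return wanted in {c.casefold() for c in mononym_candidates(given, surname)}
```

Note that the sketch only proposes candidates. It never rewrites the external record, so someone whose name really is XXX is still stored exactly as recorded.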

People don’t always have just one or two given names and surnames. Tracy Nelson has 138 middle names. Another example is Rosalind Arusha Arkadina Altalune Florence Thurman-Busson. For a good edge case, remember James Dr No From Russia with Love Goldfinger Thunderball You Only Live Twice On Her Majesty’s Secret Service Diamonds Are Forever Live and Let Die The Man with the Golden Gun The Spy Who Loved Me Moonraker For Your Eyes Only Octopussy A View to a Kill The Living Daylights Licence to Kill Golden Eye Tomorrow Never Dies The World Is Not Enough Die Another Day Casino Royale Bond.

Null isn’t just a computer kill word; it’s also a perfectly valid name.
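To keep a person called Null apart from a value that is actually missing, compare against the language’s own null value and never against the text. A minimal Python sketch, assuming a hypothetical JSON payload with a surname key:

```python
import json
from typing import Optional


def surname_from_json(payload: str) -> Optional[str]:
    """Return the surname, or None only when the value is genuinely absent."""
    value = json.loads(payload).get("surname")
    if value is None:
        return None      # JSON null or missing key: no surname recorded
    return str(value)    # the text "Null" is kept as a real surname
```

The check is on the type of the value, so the surname Null passes through like any other string.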

Test, Sample and many other common words are also valid names. Just because a user’s surname is Test doesn’t mean that it’s actually a test account. When testing, avoid using specific names to mark example data, because real users might get caught by this as well.
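One alternative to name-based markers is an explicit flag on the record. The Account type, the is_test field and the accounts_to_bill function below are hypothetical names chosen for this Python sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Account:
    full_name: str
    is_test: bool = False  # explicit marker, set only by test tooling


def accounts_to_bill(accounts: List[Account]) -> List[Account]:
    """Skip test data based on the flag, never on what the name looks like."""
    return [a for a in accounts if not a.is_test]
```

A real customer called Sample Test is billed as usual, while an example account is skipped whatever name it carries.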

Fictional character names aren’t necessarily fake (Superman Wheaton, Buzz Lightyear, Darth Vader). Names that are also those of popular brands aren’t always fake either (Facebook Jamal Ibrahim, Google Kai). Common English (or any other language) words or phrases in a name don’t necessarily make it fake (Elaine Yellow Horse, Above Znoneofthe).