Context Creates Clarity
The hierarchy of names forms the relationship between the current item (function, class, variable) and its surroundings.
It is rather like the relationship between the items in a grocery store and the aisle in which they can be found.
Some chocolates are found in the candy section. Semi-sweet baking chocolates are found in the baking aisle. They are essentially the same, but “baking chocolate” and “chocolate bars” are shelved separately and have separate intentions.[^1] Likewise, the cocoa found in the hot beverages aisle with coffee and tea may be a rather different product from the cocoa found in the baking aisle.
[^1]: They are the same thing, and I have often enjoyed semi-sweet baking chocolate as a treat.
The names in a source file are expected to harmonize with their environment.
Let’s explore this concept:
The Hierarchical Nature of Code
I have a variable named left.
Is that a good name or a bad name?
The answer “it depends” is appropriate here.
In the right context, left may be an excellent name. In a binary tree you may
have a node with references to left and right nodes.
To judge the name, we have to ask about its surroundings:
- What names does it exist alongside?
- What algorithm is using it?
- What is the name of the containing function?
- What class is it in (if any)?
- What module is it in?
- What is the application that contains it?
Names are not standalone entities.
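To illustrate the binary-tree case mentioned above, here is a minimal sketch in which left and right are hard to improve on. The Node class and the in_order function are hypothetical names invented for this example, not taken from any particular codebase.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    """A binary tree node (hypothetical, for illustration only)."""
    value: int
    left: Optional["Node"] = None   # the left child, if any
    right: Optional["Node"] = None  # the right child, if any


def in_order(node: Optional[Node]) -> Iterator[int]:
    """Yield the left subtree, then this node's value, then the right subtree."""
    if node is None:
        return
    yield from in_order(node.left)
    yield node.value
    yield from in_order(node.right)


tree = Node(2, left=Node(1), right=Node(3))
print(list(in_order(tree)))
```

Surrounded by Node, right, and in_order, the name left needs no further explanation.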
    # operator_functions.py

    def addition(left: int, right: int) -> int:
        return left + right
In a very simple case like this, I know that the module is called
operator_functions, where functions are declared to wrap standard operators in
Python. The function is called addition, and the two variables are left and
right.
They are the number on the left of the plus and the number on its right.
This doesn’t leave a lot of room for confusion.
    from typing import Any, List

    def bite_head_off(original_array: List[Any]) -> List[Any]:
        removed, *left = original_array
        return left
In this case, we can see that left was intended to mean “remaining” and not
“the item on the left.”
On the other hand, the function is said to “bite the head off,” which is not a wonderful name, but it introduces the concept of a head. Yet there is no head or tail in the body of the function. The name and the body do not agree.
Perhaps this works better:
    from typing import Any, List

    def bite_head_off(original_array: List[Any]) -> List[Any]:
        head, *tail = original_array
        return tail
The context of the function name needs to carry into the body of the function. That is a piece of context best not left to the reader to translate.
If we rename the function, we will want to rename the variables to match the function.
It may also be that bite_head_off is perfectly coherent within the themes
used in other functions in the current file or module. If so, we would keep it.
If it seems out of place in its context, we would do well to rename it.
Every name is expected to be a good citizen in its context, even if this reduces the creativity of expression.
Consider the differences in the word left when in proximity to either
arrived, right, or taken. Then consider the names in a function or
paragraph of code–do they reflect the meaning in balance/harmony with the
names around them?
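A minimal sketch of that exercise follows. The three functions are hypothetical and exist only to put left next to different neighbours; notice how the neighbour changes what the reader assumes left means.

```python
from typing import List


def still_present(arrived: List[str], left: List[str]) -> List[str]:
    """Next to `arrived`, `left` reads as 'those who departed'."""
    return [guest for guest in arrived if guest not in left]


def width(left: int, right: int) -> int:
    """Next to `right`, `left` reads as a position."""
    return right - left


def starting_stock(taken: int, left: int) -> int:
    """Next to `taken`, `left` reads as 'the amount remaining'."""
    return taken + left
```

The identifier is spelled the same in all three functions, yet no reader would confuse the meanings, because the surrounding names do the disambiguation.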
What is the domain name associated with this concept?
A useful principle was offered to me by Jeff Langr some time ago while pair-programming:
What is it called in the test?
This question is key and may lead us to some important insights for programmers.
The appropriate name may already exist either in the problem domain or else the solution domain.
The problem domain may be something like banking, insurance, factory automation, loan processing, or other. If we are doing something that is recognizably part of a well-known business domain, then using the business term is always a good choice.
But we might be dealing with something that is not recognisable in the business domain because it is a simple programming idea far below the “water line” of the business analyst. In that case, is there an appropriate name for the parts of an algorithm we are using?
Names like numerator, denominator, product, head, tail may be
perfectly good names. Likewise, map, reduce, filter, or dataset may be
the best names available to us in a moment.
Windshield Naming
These may have even better names, though, based not on what they are but what they are for.
I could name the left and right variables from the addition function int1 and
int2 and probably nobody would be offended or terrified.
They say what the numbers are and differentiate between the two of them. In such a small function, that’s probably enough that nobody will be confused.
I would not extend that style of naming to a function with dozens of variables.
It would be too confusing to remember the meanings of int1 vs int11 vs
float1.
Consider using Windshield Naming.
Allegedly (and I’ve never sought proof), the glass in front of the driver of a vehicle has different names depending on the context.
Whereas you or I may call it a “windshield” or a “windscreen”, depending on where we spent our formative years, I’m told that in some factories it is called neither.
There, the name is “front glass.” This is valid, as it’s made of (safety) glass, and is in the front of the car.
The apocryphal factory name says what it is. The common name says what it is for.
It is to shield or screen us from the wind. That’s its utility to us, and why we put a glass in the front of a car. Lacking a windscreen, we would need goggles to drive, and probably a scarf to keep bugs out of our faces.
When we are inventing names, I posit that our best choice is Windshield Naming: call the item after its purpose rather than its composition.
In the bite_head_off example, we may want to call the tail of the array something
else. In such a short function, even calling it result is intention-revealing
to an extent: it is the part to be returned.
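To make the contrast concrete, here is a minimal sketch. Both dictionaries hold identical contents; only the names differ. The names string_dictionary, greeting_by_language, and greet are hypothetical, chosen for this illustration.

```python
from typing import Dict

# "Front glass" style: named for what it is made of.
string_dictionary: Dict[str, str] = {"en": "Hello", "fr": "Bonjour"}

# "Windshield" style: named for what it is for.
greeting_by_language: Dict[str, str] = {"en": "Hello", "fr": "Bonjour"}


def greet(language: str) -> str:
    """Return the greeting for a language code, falling back to English."""
    return greeting_by_language.get(language, greeting_by_language["en"])
```

The type hint already tells the reader that both are dictionaries of strings. Only the second name says why the dictionary exists.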
Unnecessary Context
Names in code don’t exist in isolation. They exist within layers of context that work together to create meaning. Understanding this hierarchical system—from system to module to file to class to method to variable—is helpful to avoid redundancy.
Consider this example:
    // Over-specified names that ignore context
    package com.payroll.employee;

    public class EmployeePayrollCalculator {
        public Money calculateEmployeeGrossPay(Employee employee, PayPeriod payPeriod) {
            BigDecimal employeeHourlyRate = employee.getEmployeeHourlyRate();
            int employeeHoursWorked = payPeriod.getEmployeeHoursWorked();
            return new Money(employeeHourlyRate.multiply(new BigDecimal(employeeHoursWorked)));
        }
    }
The redundant use of “employee” and “payroll” throughout this code creates noise rather than clarity. Is there anything in this function that is not about employees and payroll?
While some people seem to recommend that long names are preferable and “more readable”, we find that brevity has value, and brevity is preferable to redundancy.
    package com.payroll.employee;

    public class EmployeePayrollCalculator {
        public Money calculateGrossPay(Employee employee, PayPeriod payPeriod) {
            BigDecimal rate = employee.getHourlyRate();
            int hours = payPeriod.getHoursWorked();
            return new Money(rate.multiply(new BigDecimal(hours)));
        }
    }
Benner describes the useful value of “distinguishable”: a term should stand out from surrounding terms. I can’t applaud this virtue enough.
Notice that in this example, rate is the only rate in the function, and it is
explained by the initialization (employee.getHourlyRate) so there is
sufficient context for that name. The variable hours parallels rate.
It is a very short example (for publishing reasons) but even if the function were a dozen lines long, these facts would hold.
The Context Hierarchy
You may be looking for a hammer and a pack of floor tiles. Your first stop will likely be to find a DIY store in your area. Then, once you are in the DIY store, you will look for signs pointing you to the tools department. Once in the tool department, you’ll look for hammers and select a likely candidate. The next step is flooring, in which you look for tiles, and you will make a selection.
If you wandered into the DIY store and found a “Fiction” section lined with books, you would be rather surprised. In the Fiction section you would not expect to see a fresh seafood counter.
When you look at your company’s software system, do you see that the type of system is reflected in the top-most list of modules? Is each module reflected in the names of the submodules? Do the filenames reflect the submodule, and the classnames in the files reflect their purpose?
A system can be rather like a big filing system.
Now, of course, it isn’t exactly like a filing system, but it is a useful way to think about the system.
Wouldn’t it be nice to be hired by a company where the software’s organisation is that intuitive?
System Context
At the highest level, the system name establishes the overall domain. If you’re working in a system called “InventoryManagement,” you don’t need to prefix every class with “InventoryManagement.”
Now, there is usually a go-to-market name that must be chosen for market reasons. I use an app called Obsidian, which has nothing at all to do with mineralogy or geology. The source code is unlikely to have a system of geologic names. It is more likely that the names are connected to the ideas of note-taking, Markdown display and editing, tagging, linking, and graphing tags and links.
I once spent some time refactoring a game. When I showed the original code, nobody could tell it was game code. It was full of invented terms and overly concrete ones like string_dictionary (abbreviated to ‘strdict’). When I was done, everyone who saw it knew that it was a game, and which familiar game it was. I don’t say that to brag, but to point out that names in a system can reflect the purpose of the system, and it is helpful for readers to see what the module contributes to the system at hand.
Module/Package Context
In an inventory management system, you might expect to see modules for inventory, purchasing, fulfilment, and so on.
A good module name will sound like it belongs in the system, and a good submodule name will sound like it belongs in the module.
Names of packages, modules, and subsystems should serve as a bridge from the general language of the system to the more specific terms and actions they encompass.
Consider the name to be a “one dot” name, System.Subsystem, with a convenient
convention of eliding the outer context. The fulfilment module of an online
catalogue sales system is best understood as onlineCatalogueSales.Fulfilment.
Where low-level details “leak” into higher level names, it’s likely that some re-architecting may need to be done. I am not one to recommend unnecessary layers but neither do I suggest skipping necessary levels, and odd details leaking into module names can be a clear sign that we missed something useful.
File Context
The file name contributes significant context:
    // In file: order-validation.ts
    export class Validator {
      // Clear from filename what we're validating
      validate(order: Order): ValidationResult {
        // Implementation
      }
    }

    // In file: email-validation.ts
    export class Validator {
      // Same class name, different context
      validate(email: string): boolean {
        // Implementation
      }
    }
Note, however, that if these Validator classes are used outside of the files
containing them, then they need to bring their context along for the ride. If
I’m in some remote file outside of the module and file defining these objects,
I will need them to carry context.
Sometimes this is done via the import statements. In Python, one can add context on import:
    # Assumes a project-local module named email, not the standard library's email package
    from email import Validator as email_validator
Or sometimes we can import the package and give a full reference:
    import email

    # and later...
    email_validator = email.Validator()
Class Context
Class names establish what we’re working with:
    public class Invoice {
        private decimal amount;     // Not: invoiceAmount
        private DateTime date;      // Not: invoiceDate
        private Customer customer;  // Not: invoiceCustomer

        public void Calculate() {   // Not: CalculateInvoiceTotal
            // We know we're calculating something invoice-related
        }
    }
Here the Calculate method isn’t a random name, but is really the invoice module’s
Invoice class’s Calculate method. One might call it
“Invoice::Invoice.Calculate().”
It can be weird to be wandering the invoicing area of your application and see
something called a StrategyManagerFactory. That seems as startling as finding a
cooler of fish fingers for sale in the middle of the lumber department of your
local DIY store. Pattern combination names like StrategyFactory, combined
with noise words like data or manager, impart little meaning and fail to
reflect the business at hand. See the section on “Windshield Naming.”
Classes and functions are often intended for use outside of the module that defines them. This is a special consideration; the name may have to restate enough context that it will make sense when used elsewhere in the application, far from its defining context. A bit of redundancy may be necessary. You (collective ‘you’) will have to see the method/class used outside of its context in order to decide how much additional context to embed in the name.
In Python, you may find yourself aliasing a name with
from module import classname as localname. If this happens frequently and the
local name is consistent across modules, it may signal that you should change the
class name to match the frequently used local name.
Method Context
Method names provide the immediate action context for their parameters and local variables:
    def calculate_tax(gross_income, num_dependents, deductions, filing_status):
        standard_deduction = get_standard_deduction(filing_status, num_dependents)
        taxable_income = max(0, gross_income - deductions - standard_deduction)
        return tax_due(taxable_income, filing_status)
Because the scope of each variable is the scope of the function, all the
variables can be understood in light of calculate_tax. The full name of the
taxable_income variable is really federal_income_tax.calculate_tax.taxable_income.
The code sample is not offered as an ideal phrasing of the problem, but a typical one.
There are two incomes represented here, the gross income and the taxable income. These are made distinguishable via naming.
That is helpful, but naming is the second-best way to distinguish similar things in an OO system. Ideally, the types of gross_income and taxable_income would be sufficient to prevent one from performing invalid operations on them.
Also note that the above code sample lacks not only type hints, but also units. A better OO system would look rather different. Alas, this is a naming book. Perhaps the next book will be about types.
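For the curious, here is a minimal sketch of what letting the types do the distinguishing might look like. GrossIncome, TaxableIncome, the dollars field, and the flat rate_percent are all assumptions made for this illustration, not a real tax calculation. Python does not enforce the hints at runtime, so the protection comes from a static type checker such as mypy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrossIncome:
    dollars: int  # the unit lives in the field name


@dataclass(frozen=True)
class TaxableIncome:
    dollars: int


def taxable_income(gross: GrossIncome, deductions: int) -> TaxableIncome:
    """Only a GrossIncome can be turned into a TaxableIncome."""
    return TaxableIncome(max(0, gross.dollars - deductions))


def tax_due(income: TaxableIncome, rate_percent: int) -> int:
    """Accepts only TaxableIncome; a type checker would flag a GrossIncome here.

    The flat percentage rate is a simplifying assumption for this sketch.
    """
    return income.dollars * rate_percent // 100
```

With the types in place, the parameter names gross and income can be short, because the signature carries the distinction that the longer variable names carried before.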
The Variable Name Sweet Spot
A variable name should be just specific enough to be unambiguous within its context, and explain its intent.
The context provides the rest of the meaning.
    public class OrderProcessor {
        public ProcessingResult process(Order order) {
            if (!order.isValid()) {
                return ProcessingResult.INVALID;
            }
            Payment payment = order.getPayment();
            if (!charge(payment)) {
                return ProcessingResult.PAYMENT_FAILED;
            }
            return ProcessingResult.SUCCESS;
        }
    }
Could these names be even shorter if we used abbreviations? Sure. But what would it buy us? Is there any advantage to using terms like ‘o,’ ‘ord,’ ‘pmt,’ or the like?
Let’s talk about ProcessingResult for a moment.
“Processing” is what we consider a noise word. Any imaginable operation can be called “processing” and anything it returns can be called “result.”
This appears to be an enum by typographic convention. Are all processes
returning a ProcessingResult value? If so, do all processes have a
possibility of PAYMENT_FAILED, or is that value irrelevant most of the time?
Do all the values relate to all the operations using them?
It seems to be about financial transactions and not at all as generic as one
might think. It would take a discussion with the team in this codebase to
determine if it is really a TransactionResult or if it is so generic it
contains every possible failure and success entity and should just be called
Result.
As always, context and team understanding must be considered.
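If the team did conclude that the type is really about financial transactions, the narrower name might look something like this sketch. TransactionResult, its members, and the process function's boolean parameters are hypothetical, used here only to show the rename in a self-contained form.

```python
from enum import Enum


class TransactionResult(Enum):
    """Hypothetical, narrower replacement for ProcessingResult."""
    SUCCESS = "success"
    INVALID_ORDER = "invalid order"
    PAYMENT_FAILED = "payment failed"


def process(order_is_valid: bool, charge_succeeded: bool) -> TransactionResult:
    """Mirror the three outcomes of the order-processing example above."""
    if not order_is_valid:
        return TransactionResult.INVALID_ORDER
    if not charge_succeeded:
        return TransactionResult.PAYMENT_FAILED
    return TransactionResult.SUCCESS
```

Every member now relates to every operation that returns the type, which is the question the original name left open.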
Over-Contextualization Anti-Pattern
Let’s just use one more example for comparison. Which do you think is easier to take in at a glance?
    class UserAccountManager {
      processUserAccountRegistration(userAccountData: UserAccountData) {
        let userAccountEmail = userAccountData.userAccountEmail;
        let userAccountPassword = userAccountData.userAccountPassword;

        if (!this.validateUserAccountEmail(userAccountEmail)) {
          throw new UserAccountEmailValidationError();
        }

        let userAccountRecord = this.createUserAccountRecord(userAccountData);
        this.saveUserAccountRecord(userAccountRecord);

        return new UserAccountRegistrationResult(userAccountRecord);
      }
    }

Compare with:

    class AccountManager {
      processRegistration(user) {
        let email = user.email;
        let password = user.password;
        if (!this.validateEmail(email)) {
          throw new EmailValidationError();
        }
        let account = this.createAccount(user);
        this.save(account);
        return new RegistrationResult(account);
      }
    }
A Quick Sidenote
We will not argue over whether validateEmail is a reasonable member of the
account manager.
The name, though, is helpful. At some point, if we see there are a number of methods floating about that deal with email, we might search for the term ‘email’ and find all the disparate bits to gather into a more coherent class or module.
This is a good thing to remember when you are working in one area and need a method that isn’t entirely cohesive with the class or module name, as ‘email’ is a somewhat different topic than ‘account.’ Your naming can help you in your later refactoring and design.
Domain Context: Speaking the Language
Domain context is particularly important because it establishes the vocabulary that makes sense to domain experts:
    # In a medical records system
    class Patient:
        def __init__(self):
            self.mrn = None  # Medical Record Number - domain term
            self.dob = None  # Date of Birth - standard abbreviation
            self.allergies = []

        def schedule_appointment(self, provider, datetime):
            # "provider" not "doctor" because domain includes
            # various clinicians.
            pass
Medical Record Number (MRN), Date of Birth (DOB), and Provider are well-known terms in this space.
We don’t avoid using domain acronyms. It’s good for developers to learn them and use them in these situations.
Might a developer, encountering MRN for the first time, be confused? Yes, but this is what we refer to as ‘profitable struggle’. When the developer learns this term, they have something of value. It will appear not only in the source code, and on various UI screens and reports, but also in conversations with the sponsors and users of the system.
    # In a financial system
    class Account:
        def __init__(self):
            self.routing_number = None
            self.account_number = None
            self.apy = None

        def calculate_accrued_interest(self, days):
            # Domain-specific calculation method name
            pass
Annual Percentage Yield (APY) is a well-known term in this domain.
Should such acronyms be capitalized in identifiers or not? That’s a toughy. I leave this to the team to decide.
Process Context: Understanding the Flow
The position within a process or workflow provides additional context for naming:
    def process_order_pipeline(raw_order_data):
        # Early in pipeline: descriptive of source
        validated_data = validate_order_data(raw_order_data)

        # Mid-pipeline: focus on current state
        order = create_order_from_data(validated_data)

        # Later in pipeline: focus on next action
        enriched_order = add_customer_data(order)

        # End of pipeline: focus on outcome
        result = submit_for_fulfillment(enriched_order)
        return result
There is a side-issue with using the term ‘data’, which is generally a noise word. It seldom has anything to add to a name.
What is the difference between ‘CustomerRecord,’ ‘CustomerData,’ and ‘CustomerInfo’? In most cases, as domain entities, these names are garbage. This is especially true if more than one of them appear in the same codebase!
In the case above, however, there is a difference between the raw data gathered
from some source and the domain entity order.
This is one of the exceptions that tests the rule. When, in Clean Code, I wrote that ‘data’ is a noise word best avoided, people took that as a unilateral and universal ban.
The more important thing here is the principle. We don’t add needless words to an identifier, only what is needed for it to be understood in its context.
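A minimal sketch of the distinction between raw input and a domain entity follows. The Order fields (sku, quantity) and the create_order function are assumptions invented for this example; the point is only that raw_order and Order are different things, so the qualifier earns its place.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Order:
    """The domain entity: validated and typed (hypothetical fields)."""
    sku: str
    quantity: int


def create_order(raw_order: Dict[str, Any]) -> Order:
    """Turn untrusted input into a domain entity, or raise ValueError."""
    quantity = int(raw_order["quantity"])
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return Order(sku=str(raw_order["sku"]).strip(), quantity=quantity)


order = create_order({"sku": " A-1 ", "quantity": "2"})
print(order)
```

Here ‘raw’ tells the reader something true and useful: the dictionary has not been checked yet. Once it has been checked, the name sheds the qualifier and becomes simply order.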
Mathy or Prosey?
While a lot of people in the craft camp have said it is crucial that code reads like well-written prose, I offer a suggestion:
Code is often better ‘mathy’ than ‘prosey’
Many brilliant engineers have spent decades designing languages specifically for expressing programming ideas. These languages are generally superior to human languages for expressing the same ideas.
Let the languages do what they do best.
Work with the colleagues who share your codebase, and see when your algorithms are better expressed in prose vs code.
I tend to prefer the one that most readily yields its meaning at a glance.
Mathematical Operations
In mathematical contexts, single-letter variables often align with domain conventions:
    import math

    def distance(point_a, point_b):
        x = point_a.x - point_b.x
        y = point_a.y - point_b.y
        return math.sqrt(x * x + y * y)

    def quadratic_formula(a, b, c):
        discriminant = b * b - 4 * a * c
        if discriminant < 0:
            return None
        sqrt_discriminant = math.sqrt(discriminant)
        return [(-b + sqrt_discriminant) / (2 * a),
                (-b - sqrt_discriminant) / (2 * a)]
Consider these contexts. Are the names sufficient? Would you change them, and if so, how would you? If you set yours and the original side-by-side, which would your team members choose?
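As one possible answer to set beside the original, here is a sketch of distance in which the differences are named dx and dy, since x and y in the original hold differences rather than coordinates. The Point type is an assumption added to make the example self-contained; math.hypot is the standard library's Euclidean norm.

```python
import math
from typing import NamedTuple


class Point(NamedTuple):
    """A 2-D point (hypothetical helper type for this sketch)."""
    x: float
    y: float


def distance(a: Point, b: Point) -> float:
    dx = a.x - b.x  # difference along the x axis
    dy = a.y - b.y  # difference along the y axis
    return math.hypot(dx, dy)
```

Whether dx and dy are clearer than x and y is exactly the sort of question to settle with your team rather than by decree.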
Index Variables in Clear Contexts
    matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    # Traditional and clear
    for i in range(len(matrix)):
        for j in range(len(matrix[i])):
            matrix[i][j] *= 2

    # Overly verbose for this context
    for row_index in range(len(matrix)):
        for column_index in range(len(matrix[row_index])):
            matrix[row_index][column_index] *= 2
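When the indices are used for nothing except lookup, a third option is to name the things rather than their positions. This sketch assumes that building a new matrix is acceptable, which differs from the in-place update in the listing above; the names row, cell, and doubled are illustrative.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Name the things (row, cell) rather than their positions (i, j).
# This builds a new list of lists and leaves `matrix` unchanged.
doubled = [[cell * 2 for cell in row] for row in matrix]
```

If the algorithm genuinely needs the positions, i and j remain the conventional and perfectly readable choice.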
When to Use Minimal Names
Minimal variable names work best when:
- Scope is tiny - The variable exists for just a few lines
- Context is unambiguous - It’s clear what the variable represents
- Single concept in scope - No chance of confusion with other variables
- Convention supports it - Domain or technical conventions make it natural
- Operation is simple - Complex logic needs more descriptive names
When Context Is Mixed
    def calculate_discount(customer, product):
        customer_type = customer.type
        product_type = product.type
        if customer_type == 'premium' and product_type == 'luxury':
            return 0.15
        return 0.05
There is more than one ‘type’ variable here. We can’t use ‘type’ as the name of two variables, nor could we call the first ‘customer’ because there is already a customer variable.
Admittedly, in a function this small, we don’t really need to hold the customer and product types in separate variables. We don’t save any space by doing so, nor make the code any shorter.
    def calculate_discount(customer, product):
        if customer.type == 'premium' and product.type == 'luxury':
            return 0.15
        return 0.05
In a longer or more complicated function, though, we may find it useful to extract these values.
Practical Guidelines for Context-Aware Naming
Trust your context - Don’t repeat information that’s already clear from the surrounding context.
Consider the reader’s journey - Names should make sense as someone reads through the hierarchy from system to variable. Likewise, if a concept travels from its initial context into a new context, it must bring its meaning along through more comprehensive naming.
Match domain vocabulary - Use the same terms your domain experts use, even if they’re abbreviated or technical.
Embrace minimal names in minimal contexts - See the Length chapter, coming up shortly.
Evolve names with changing context - As code moves between contexts (refactoring, extraction), update names to match their new environment.
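The last guideline can be sketched briefly. In the first function below there is only one rate, so rate is enough. When the same value travels into a function that also handles a tax rate, it needs to bring more of its meaning along. The payslip function and its flat tax_rate are assumptions made for this illustration.

```python
from decimal import Decimal
from typing import Dict


def gross_pay(rate: Decimal, hours: int) -> Decimal:
    """Only one rate is in scope, so `rate` is unambiguous."""
    return rate * hours


def payslip(hourly_rate: Decimal, tax_rate: Decimal, hours: int) -> Dict[str, Decimal]:
    """Two rates are in scope, so each name carries more context."""
    gross = gross_pay(hourly_rate, hours)
    tax = gross * tax_rate
    return {"gross": gross, "tax": tax, "net": gross - tax}


print(payslip(Decimal("20"), Decimal("0.25"), 10))
```

Neither name is better in the abstract. Each is just specific enough for the context it lives in.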