A Shell of a Problem
To understand the history of PowerShell, and its subsequent impact, you need to understand a bit of Microsoft history. At least, a simplified version of a tiny piece of it.
Until 1993, Microsoft Windows was a desktop operating system, meaning it ran on individual computers used by individual people. Most of those computers were fairly large, beige boxes that sat on, or under, a desk; laptop computers at the time were primitive and bulky compared to what you might see today. The Windows of the day was version 3.1; it couldn’t fully participate in even the primitive computer networks of the era, and home users tended to rely on dial-up services like America Online rather than the always-on Internet we take for granted today. Windows 3.1 was soon succeeded by Windows for Workgroups, the first fully network-capable Windows operating environment, and arguably the first Windows created with business scenarios specifically in mind. But even then, Windows for Workgroups could really only join a network; there was no version of Windows capable of hosting one. Networks of the time were hosted by a server, with the most common servers running either a product called NetWare 3.1, produced by a company called Novell, or a variant of the UNIX operating system.
Unix (as it is more commonly styled nowadays) had been around for a while by then, primarily at military research facilities and large universities, but also running on the enormously expensive mainframe and midrange computers used by the largest enterprise companies. It featured robust networking, and in fact provided the underpinnings of what would become today’s Internet. Unix at the time was both incredibly complex to use (compared to Windows) and incredibly expensive; a single Unix computer could easily represent an investment in the tens of thousands of dollars.
NetWare was common in small- and medium-sized businesses. It was a fairly complex product to install and manage, and included its own protocols for network communications, but it ran on smaller, cheaper computers, and it was somewhat simpler than Unix for network administrators to learn to use.
Perhaps most importantly, many decent-sized companies didn’t have a “network” at all: they had a midrange computer, like an IBM AS/400, that handled all of the business’s computing. Users connected to these midrange computers through “terminal emulation” cards that plugged into their bulky desktop computers, and through terminal-emulator applications that ran on Windows. Essentially, these terminal emulators turned each desktop into a “dumb terminal”—literally not much more than a television that was hardwired to the midrange machine—capable of sending keystrokes and displaying whatever the midrange sent back. Truly huge companies often had a mainframe, basically a giant equivalent of a midrange such as the Digital Equipment VAX line of computers.
The point is that, in the early 1990s, computer networking wasn’t a big thing for most businesses, and you needed specialized personnel to build and run a network if you did have one. It wasn’t like today, where your smartphone can join a wireless network with a couple of taps, and you can set up your own home WiFi network just by plugging in a box and running a setup app on your phone.
So, the landscape of the time: big companies had a single enormous computer, and perhaps a bunch of pricey Unix machines to supplement it. Smaller companies maybe had a few desktop computers connected to a NetWare server. Without networking, it was a pretty big pain to own more than a handful of computers. Even a giant tech university like MIT could probably have counted up all the computers they owned without much effort—a marked contrast to today’s world, where most of us have a pocket computer (called a smartphone), maybe a laptop or tablet, maybe a wrist computer (smart watch), a gaming machine, and more. In the early 1990s, a single human being didn’t run around owning a half-dozen computers!
In 1993, Microsoft launched Windows NT, its first truly “business grade” edition of Windows. Most critically, it launched Windows NT Server, Microsoft’s first operating system capable of hosting a robust network. Windows NT wasn’t yet anywhere near the class of a midrange or mainframe operating system, but it could certainly compete with Novell NetWare as the centerpiece of a small- or medium-sized business network. Even large companies started buying Windows NT. Not to replace their AS/400 or VAX machines, mind you; in most cases, Windows NT “snuck into” the environment, purchased by a single department that was tired of not getting the computing resources it wanted from the company’s midrange or mainframe. Compared to NetWare, Windows NT was easy to set up and straightforward to operate: it adopted the same graphical user interface that had made Windows so popular on the desktop. It was an easy sell: “add file and print services to your network as easily as you’d open your word processor!”
This was a truly critical point in computing history: suddenly, everyone could have a server capable of hosting shared files, providing shared printing services, and handling other basic tasks. Servers were cheaper, and Windows NT made it easy for almost anyone to set up a network. Network computing had been democratized and commoditized, and almost every business wanted in on it. Servers began to proliferate markedly in the world’s companies. Microsoft followed Windows NT, initially versioned 3.1, with Windows NT 3.5 and 3.51, and then Windows NT 4. Dropping the “NT,” they then followed with Windows 2000 Server, and then Windows Server 2003.
By 2003, Windows—and the many applications that ran on it—had grown up a lot. It was fully capable of taking on numerous enterprise-class workloads: messaging, collaboration, databases, and more. Microsoft—themselves an AS/400 shop for much of the company’s history—committed to running their own business on Windows Server, and by the 2003–2005 timeframe had made the jump.
Here’s the advantage of moving your computing from giant, multimillion-dollar AS/400 and VAX systems to smaller, “commodity” servers running Windows Server: it’s cheaper. Sure, a single Windows Server machine might not be able to handle messaging and file and print and databases and whatever else all at once, but you could buy three dozen Windows Server machines for far less than the price of a single AS/400. In the late Nineties, as the public Internet and World Wide Web came online and proliferated, people quickly realized that having more, cheaper servers was often better than having one expensive one. Want to stand up a website that can survive getting mentioned on Oprah? Have that website served up by an entire building full of cheap web servers, each taking a small part of the overall workload. Today’s cloud, in the form of Amazon Web Services, Microsoft Azure, Google Cloud, and others, is built entirely around the concept of “lots of cheap computers.”
But here’s the downside of all those servers: someone has to manage them all. They need to be configured to work properly, and they need to stay configured. They need periodic security updates and bug patches. Patching one server back in 1994 was no big deal; patching a building containing thousands of servers in 2003 became an entirely different thing.
And that’s where Windows Server ran into trouble.
Sure, Windows Server was “as easy to run as the desktop you already know and love,” but the ability to click through a “Wizard” to install a patch didn’t scale well. Having to run through the same “Wizard” on ten computers—clicking Next, Next, Next, Next, Finish on each one—might be acceptable, but doing it for a thousand computers? Not so much.
Windows started to bog down in terms of the labor it required, and computers running Linux started giving Microsoft trouble in enterprise environments.
Linux, an open-source operating system that’s largely compatible with Unix, was created mainly in response to the high cost of Unix operating systems. Linux is free to use in most cases, and it runs on the same cheaper, commodity hardware that Windows Server can run on. Linux is a lot harder to administer, though: it favors cryptic commands typed into the computer, versus Windows’ pretty icons and Next-Next-Finish “Wizards.” The upside of Linux, though, is that once you do learn to manage it, it’s almost as easy to manage a hundred computers as it is just one. Instead of typing the commands into each computer yourself, you simply type them into a text file—not unlike a word-processing document—and tell all of your computers to “run” that text file. The text file becomes a script, like you might hand out to actors in Hollywood, with each computer “reading their lines” so that you don’t have to.
Microsoft started struggling to close deals in large companies, due in part to the perception that managing large batches of Windows Server machines took more labor than doing the same thing with cheap Linux machines.
Microsoft first countered with Visual Basic Scripting Edition, or VBScript. This “scripting language” was intended to let you manage Windows Server by typing commands into text files, just as Linux administrators could.
But the fundamental architecture of Windows Server wasn’t the same as that of Linux, and it impacted Windows’ ability to have an effective scripting language.
Linux, as with Unix before it, is a “text-based operating system.” Everything that tells the server how to behave—its configuration—is basically lines in text files. The operating system’s means of communicating with other devices are similarly simple. Changing a text file is easy: most Linux administrators quickly figured out the small number of tools that enabled them to change text files on hundreds of computers at once, effectively reconfiguring those computers with a few keystrokes, if needed. Sure, those tools were cryptic, with incomprehensible names like grep, sed, awk, cat, and more, but you only needed to learn them once. Once you did, the world of Linux administration was open to you. Learn a little, and you could do a lot.
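To make that concrete, here’s a minimal sketch of the text-file style of administration, assuming a hypothetical service that reads its settings from a plain text file (the file name and settings below are invented for this illustration):

```shell
# Create a sample text-based configuration file (file name and settings
# are invented for this example).
cat > server.conf <<'EOF'
port=8080
max_connections=100
logging=off
EOF

# One sed command rewrites the setting; the same one-liner could be pushed
# to hundreds of machines over ssh, reconfiguring them all at once.
sed 's/^logging=off$/logging=on/' server.conf > server.conf.tmp
mv server.conf.tmp server.conf

# Confirm the change took effect.
grep '^logging=' server.conf    # prints: logging=on
```

Run that same one-liner against a list of hostnames, and you’ve reconfigured an entire fleet; that’s the “learn a little, do a lot” economy Linux administrators enjoyed.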
Windows, on the other hand, is an “API-based operating system.” Each component inside Windows defines a set of interfaces that you use to tell it what to do. When you click an icon in Windows, one bit of software uses those interfaces to tell another bit of software to do something: open a file, send a message, or whatever. Automating Windows administration, then, is less about changing text files and more about some pretty serious computer programming. These interfaces are (for the most part) documented, but that documentation presumes you’re an experienced software engineer. Sadly, the people hired to manage computer networks tend not to be experienced software engineers. In the Windows world, they were used to clicking icons, not coding programs of a hundred lines or more. VBScript helped a bit, but VBScript couldn’t access all of the APIs needed to make Windows do everything it did. Eventually, someone using VBScript would run into a situation they simply couldn’t handle, leaving them to go back to clicking icons to make stuff happen.
Worse, Windows’ various APIs had all been created by developers who never expected anyone but themselves to use them. Some APIs required you to use low-level programming languages like C or C++, while others could be driven from more-accessible, higher-level languages like VBScript. Still others were best used from Microsoft’s .NET Framework, a set of APIs released in the early 2000s to make software development faster and more consistent. But the .NET Framework didn’t cover everything a server administrator might need.
Lest you think Microsoft had been remiss in their architecture, rest assured that’s not the case. Using APIs to “wall off” different components from each other is not only a standard practice, it’s a recommended practice. APIs let multiple teams of people work on different subsystems, without interference or dependencies on other teams. Team A can do whatever they like with their piece of software, knowing that all they need to do is publish an interface through which other teams could access whatever was needed. It’s a bit like the radio in your car: you might not know how a radio works, but you can use the interface provided to change stations and adjust the volume. The back side of the radio sports another interface that lets the car supply power, antenna signals, and so on to the radio. If you buy a Ford truck, you’re welcome to swap out the Ford radio for a Pioneer one, provided the Pioneer can support the same interface that your truck expects of a radio.
A problem with interfaces, though, is that they can only give you the things their developer anticipated you needing. If your truck radio has no interface for taking a satellite radio signal, then there’s nothing you can do about that. And that’s where Windows administrators often found themselves: if the developers of some Windows subsystem hadn’t anticipated an administrator needing to do something, then the subsystem’s interfaces wouldn’t make it possible, and the administrator was out of luck.
And here’s another problem Windows had: many of the teams who built Windows’ various components assumed nobody would ever do anything other than click the pretty icons they’d created. That wasn’t necessarily a bad decision on the part of those teams, because the whole point of Windows was its graphical user interface; for many of them, suggesting that people might need to administer their product using something other than icons and “Wizards” approached heresy. Teams were required to deliver a comprehensive and easy-to-use graphical user interface; anything else was often optional in terms of Microsoft’s architecture standards, and optional things tend to fall by the wayside when resources get tight and timelines get short. For those components, it was essentially impossible to automate administration, because they simply had no interfaces through which to do so. It was, frankly, a bit of a mess, and it caused no end of frustration to Windows administrators who were managing a rapidly growing number of servers in their environments.
Linux, to be clear, also technically relies on APIs. It’s just that nearly every piece of Linux adopted “put stuff in text files” as its interface. If you want to reconfigure a piece of Windows—say, you need to add a user account to the company directory—you have to hope the directory subsystem’s APIs offer a way to do that, and then you have to learn what data structure to pass them to make them do it. With Linux, you often just add a line to a text file. Notably, many recent Microsoft products have shifted to this text-based approach: with Microsoft Azure, for example, a specially formatted text file can be used to make Azure do almost anything.
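As a sketch of what “just add a line” looks like, consider the passwd-style text format Linux uses for local user accounts. This example works against a scratch file rather than the real /etc/passwd, and the “alice” account is invented:

```shell
# A scratch file in the same one-line-per-account format as /etc/passwd.
cat > passwd.demo <<'EOF'
root:x:0:0:root:/root:/bin/bash
EOF

# "Adding a user" is literally appending one line of text; the fields are
# name:password:UID:GID:comment:home:shell (account invented for the demo).
echo 'alice:x:1001:1001:Alice Example:/home/alice:/bin/bash' >> passwd.demo

# Any text tool can now query the "database" -- here, alice's login shell.
grep '^alice:' passwd.demo | cut -d: -f7    # prints: /bin/bash
```

No special API, no data structures to learn: the same grep, sed, and awk skills apply to user accounts as to every other piece of the system’s configuration.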
But in the early- to mid-2000s, complex APIs still ruled Windows Server. It seemed like all the bits were there to automate most Windows administration, but they were scattered across a half-dozen difficult and sometimes-incompatible languages and technologies. It’s like going into an auto shop and realizing you need a set of metric sockets for the frame of the vehicle, Imperial sockets for the body, a torch welder for the roof, and a magic wand for the engine. If you can master all of the different tools, maybe you’re fine, but it’s a lot to wrap your head around.
Now, this problem would have been solvable: you just needed your Windows server administrators to be really broad in terms of the technologies they could support, and really fast at learning new things. Basically, if your admins were capable of being ersatz developers, you were fine. Except that wasn’t the sales pitch Microsoft had been making for a decade: “Administer your network as easily as you use your own desktop!” had been the message, not “learn four programming languages and spend all your time writing code!” The bulk of Microsoft’s administrator audience wasn’t up to speed on software programming, and in a lot of cases they weren’t interested in learning languages like C#, C++, VBScript, or whatever else. Again, it’s as if Microsoft had attracted a large audience of competent, intelligent, hardworking automotive mechanics, and then carried a nuclear reactor into the shop and said, “you can do this too, right?” The audience was used to the level of consistency and abstraction that a graphical user interface affords, and they simply hadn’t been prepared to have Windows’ underlying inconsistencies and ugliness dumped in their laps.
Understand, too, that in 2003, Windows administrators tended to be paid markedly less than software developers with equivalent seniority, and in many cases less than similarly situated Linux or Unix administrators. The assumption that “managing Windows is easy!” was baked into their salaries, and the idea of suddenly being asked to take on a very different kind of role, without necessarily being paid more, didn’t sit well.
This is the world that PowerShell (originally Windows PowerShell) was born into: Windows Server struggling to compete with Linux in large-scale companies, due in large part to the relative difficulty of automating Windows administration at scale. Under the hood, Windows was a hodgepodge of different interconnected systems, each one optimized for its own task, and each one difficult to automate without knowing a half-dozen or more different technologies and approaches.
Thing is, Microsoft had known this was a problem for quite a while, and their initial solution wasn’t even aimed at Windows administrators.