Leanpub: Publish Early, Publish Often

Working with SQL Server Data

We often see people struggle - really, really hard, in some cases - to store data in Microsoft Excel, using PowerShell to automate the process. This makes us sad. Programmatically, Excel is kind of a hunk of junk. Sure, it can make charts and graphs - but only with significant effort and a lot of delicacy. But, people say, “I already have it!” This also makes us sad, because for the very reasonable price of $FREE, you can have SQL Server (Express), and in fact, you probably have some flavor of SQL Server on your network that you could use. But why?

SQL Server is easy to use from PowerShell code. Literally a handful of lines, and you’re done.
SQL Server Reporting Services (also free in the Express edition) can turn SQL Server data into gorgeous reports with charts and graphs - and can automate the production and delivery of those reports with zero effort from you.
SQL Server is something that many computers can connect to at once, meaning you can write scripts that run on servers, letting those servers update their data in SQL Server. This is faster than a script that reaches out to query many servers to update a spreadsheet.

We don’t know how to better evangelize using SQL Server for data storage over Microsoft Excel.

SQL Server Terminology and Facts

Let’s quickly get some terminology and basic facts out of the way.

SQL Server is a service that runs on a server. Part of what you’ll need to know, to use it, is the server’s name. A single machine (physical or VM) can run multiple instances of SQL Server, so if you need to connect to an instance other than the default, you’ll need the instance name also. The naming pattern is SERVER\INSTANCE.
A SQL Server instance can host one or more databases. You will usually want a database for each major data storage purpose. For example, you might ask a DBA to create an “Administration Data” database on one of your SQL Server computers, giving you a place to store stuff.
Databases have a recovery mode option. Without getting into a lot of details, you can use the “Simple” recovery mode (configurable in SQL Server Management Studio by right-clicking the database and accessing its Options page) if your data isn’t irreplaceable and you don’t want to mess with maintaining the database’s log files. For anything more complex, either take a DBA to lunch or read Don’s Learn SQL Server Administration in a Month of Lunches.
Databases contain tables, each of which is analogous to an Excel worksheet.
Tables consist of rows (entities) and columns (fields), which correspond to the rows and columns of an Excel sheet.
Columns have a data type, which determines the kind of data they can store, like text (NVARCHAR), dates (DATETIME), or numbers (INT). The data type also determines the data ranges. For example, NVARCHAR(10) can hold 10 characters; NVARCHAR(MAX) has no limit. INT can store smaller values than BIGINT, and bigger values than TINYINT.
SQL Server defaults to Windows Authentication mode, which means the domain user account running your scripts must have permission to connect to the server (a login), and permission to use your database (a database user). This is the safest means of authentication as it doesn’t require passwords to be kept in your script. If running a script as a scheduled task, the task can be set to “run as” a domain user with the necessary permissions.

Just seven little things to know, and you’re good to go.

Connecting to the Server and Database

You’ll need a connection string to connect to a SQL Server computer/instance, and connect to a specific database. If you’re not using Windows Authentication, the connection string can also contain a clear-text username and password, which is a horrible practice. We use ConnectionStrings.com to look up connection string syntax, but here’s the one you’ll use a lot:

Use Server=SERVER\INSTANCE;Database=DATABASE;Trusted_Connected=True; to connect to a given server and instance (omit the \INSTANCE if you’re connecting to the default instance) and database. Note that SQL Server Express usually installs, by default, as an instance named SQLEXPRESS. You can run Get-Service in PowerShell to see any running instances on a computer, and the service name will include the instance name (or just MSSQLSERVER if it’s the default).

With that in mind, it’s simple to code up a connection:

$conn = New-Object -Type System.Data.SqlClient.SqlConnection
$conn.ConnectionString = 'Server=SQL1;Database=MyDB;Trusted_Connection=True;'
$conn.Open()

You can leave the connection open for your entire script; be sure to run $conn.Close() when you’re done, though. It’s not a tragedy to not close the connection; when your script ends, the connection object will vanish, and SQL Server will automatically close the connection a bit later. But if you’re using a server that’s pretty busy, the DBA is going to get in your face about leaving the connection open. And, if you run your script multiple times in a short period of time, you’ll create a new connection each time rather than re-using the same one. The DBAs will definitely notice this and get agitated.

Writing a Query

The next thing you need to do is retrieve, insert, update, or remove some data. This is done by writing queries in the Transact-SQL (T-SQL) language, which corresponds with the ANSI SQL standard, meaning most queries look the same on most database servers. There’s a great free online SQL tutorial if you need one, but we’ll get you started with the basics.

To do this, you’ll need to know the table and column names from your database. SQL Server Management Studio is a good way to discover these.

For the following sections, we’re going to focus on query syntax, and then give you an example of how we might build that query in PowerShell. Once your query is in a variable, it’s easy enough to run it - and we’ll cover how to do that in a bit. Also, we’re not going to be providing exhaustive coverage of SQL syntax; we’re covering the basics. There are plenty of resources, including the aforementioned online tutorial, if you need to dig deeper.

Adding Data

Adding data is done by using an INSERT query. The basic syntax looks like this:

INSERT INTO <tablename>
    (Column1, Column2, Column3)
    VALUES (Value1, Value2, Value3)

So you’ll need to know the name of the table you’re adding data to, and you’ll need to know the column names. You also need to know a bit about how the table was defined. For example, if a “Name” column is marked as mandatory (or “NOT NULL”) in the table design, then you must list that column and provide a value for it. Sometimes, a table may define a default value for a column, in which case you can leave the column out if you’re okay with that default value. Similarly, a table can permit a given column to be empty (NULL), and you can omit that column from your list if you don’t want to provide a value.

Whatever order you list the columns in, your values must be in the same order. You’re not forced to use the column order that the table defines; you can list them in any order.

Numeric values aren’t delimited in T-SQL. String values are delimited in single quotes; any single quotes within a string value (like “O’Leary”) must be doubled (“O’‘Leary”) or your query will fail. Dates are treated as strings and are delimited with single quotes.

We might build a query in PowerShell like this:

$ComputerName = "SERVER2"
$OSVersion = "Win2012R2"
$query = "INSERT INTO OSVersion (ComputerName,OS) VALUES('$ComputerName','$OSVersion\
')"

This assumes a table named OSVersion, with columns named ComputerName and OS. Notice that we’ve put the entire query into double quotes, allowing us to just drop variables into the VALUES list.

Removing Data

A DELETE query is used to delete rows from a table, and it is almost always accompanied by a WHERE clause so that you don’t delete all the rows. Be very careful, as there’s no such thing as an “UNDO” query!

DELETE FROM <tablename> WHERE <criteria>

So, suppose we’re getting ready to insert a new row into our table, which will list the OS version of a given computer. We don’t know if that computer is already listed in the table, so we’re going to just delete any existing rows before adding our new one. Our DELETE query might look like this:

$query = "DELETE FROM OSVersions WHERE ComputerName = '$ComputerName'"

There’s no error generated if you attempt to delete rows that don’t exist.

Changing Data

An UPDATE query is used to change an existing row, and is accompanied by a SET clause with the changes, and a WHERE clause to identify the rows you want to change.

UPDATE <tablename>
   SET <column> = <value>, <column> = <value>
   WHERE <criteria>

For example:

$query = "UPDATE DiskSpaceTracking `
          SET FreeSpaceOnSysDrive = $FreeSpace `
          WHERE ComputerName = '$ComputerName'"

We’d ordinarily do that all on one line; we’ve broken it up here just to make it fit more easily in the book. This assumes that $FreeSpace contains a numeric figure and that $ComputerName contains a computer name.

Retrieving Data

Finally, the big daddy of queries is the SELECT query. This is the only one that returns data (although the other three will return the number of rows they affected). This is also the most complex query in the language, so we’re only tackling the basics.

SELECT <column>,<column>
       FROM <tablename>
       WHERE <criteria>
       ORDER BY <column>

The WHERE and ORDER BY clauses are optional, and we’ll come to them in a moment.

Beginning with the core SELECT, you follow with a list of columns you want to retrieve. While the language permits you to use * to return all columns, this is a poor practice. For one, it performs slower than a column list. For another, it makes your code harder to read. So stick with listing the columns you want.

The FROM clause lists the table name. This can get a ton more complex if you start doing multi-table joins, but we’re not getting into that in this book.

A WHERE clause can be used to limit the number of rows returned, and an ORDER BY clause can be used to sort the results on a given column. Sorting is ascending by default, or you can specify descending. For example:

$query = "SELECT DiskSpace,DateChecked `
          FROM DiskSpaceTracking `
          WHERE ComputerName = '$ComputerName' `
          ORDER BY DateChecked DESC"

Creating Tables Programmatically

It’s also possible to write a data definition language (DDL) query that creates tables. The four queries we’ve covered up to this point are data manipulation language (DML) queries. The ANSI specification doesn’t cover DDL as much as DML, meaning DDL queries differ a lot between server brands. We’ll continue to focus on T-SQL for SQL Server; we just wanted you to be aware that you won’t be able to re-use this syntax on other products without some tweaking.

CREATE TABLE <tablename> (
    <column> <type>,
    <column> <type>
)

You list each column name, and for each, provide a datatype. In SQL Server, you’ll commonly use:

Int or BigInt for integers
VarChar(x) or VarChar(MAX) for string data; “x” determines the maximum length of the field while “MAX” indicates a binary large object (BLOB) field that can contain any amount of text.
DateTime

You want to use the smallest data type possible to store the data you anticipate putting into the table, because oversized columns can cause a lot of wasted disk space.

Running a Query

You’ve got two potential types of queries: ones that return data (SELECT) and ones that don’t (pretty much everything else). Running them starts the same:

$command = New-Object -Type System.Data.SqlClient.SqlCommand
$command.Connection = $conn
$command.CommandText = $query

This assumes $conn is an open connection object, and that $query has your T-SQL query. How you run the command depends on your query. For queries that don’t return results:

$command.ExecuteNonQuery()

That can produce a return object, which you can pipe to Out-Null if you don’t want to see it. For queries that produce results:

$reader = $command.ExecuteReader()

This generates a DataReader object, which gives you access to your queried data. The trick with these is that they’re forward-only, meaning you can read a row, and then move on to the next row - but you can’t go back to read a previous row. Think of it as an Excel spreadsheet, in a way. Your cursor starts on the first row of data, and you can see all the columns. When you press the down arrow, your cursor moves down a row, and you can only see that row. You can’t ever press the up arrow, though - you can only keep going down the rows.

You’ll usually read through the rows using a While loop:

while ($reader.read()) {
  #do something with the data
}

The Read() method will advance to the next row (you start “above” the first row, so executing Read() the first time doesn’t “skip” any data), and return True if there’s a row after that.

To retrieve a column, inside the While loop, you run GetValue(), and provide the column ordinal number of the column you want. This is why it’s such a good idea to explicitly list your columns in your SELECT query; you’ll know which column is in what position. The first column you listed in your query will be 0, the one after that 1, and so on.

So here’s a full-fledged example:

$conn = New-Object -Type System.Data.SqlClient.SqlConnection
$conn.ConnectionString = 'Server=SQL1;Database=MyDB;Trusted_Connection=True;'
$conn.Open()

$query = "SELECT ComputerName,DiskSpace,DateTaken FROM DiskTracking"

$command = New-Object -Type System.Data.SqlClient.SqlCommand
$command.Connection = $conn
$command.CommandText = $query
$reader = $command.ExecuteReader()

while ($reader.read()) {
    [PSCustomObject]@{'ComputerName' = $reader.GetValue(0)
                         'DiskSpace' = $reader.GetValue(1)
                         'DateTaken' = $reader.GetValue(2)
                        }
}

$conn.Close()

This snippet will produce objects, one object for each row in the table, and with each object having three properties that correspond to three of the table columns.

If by chance you don’t remember your column positions, you can use something like this to auto-discover the column number.

while ($reader.read()) {
    [PSCustomObject]@{
    'ComputerName' = $reader.GetValue($reader.GetOrdinal("computername"))
    'DiskSpace' = $reader.GetValue($reader.GetOrdinal("diskspace"))
    'DateTaken' = $reader.GetValue($reader.GetOrdinal("datetaken"))
    }
}

Regardless of the approach, we’d usually wrap this in a Get- function, so that we could just run the function and get objects as output. Or a corresponding Set-, Update- or Remove- function depending on your SQL query.

Invoke-SqlCmd

If you by chance have installed a local instance of SQL Server Express, you will also have a set of SQL-related PowerShell commands and a SQLSERVER PSDrive. We aren’t going to cover them as this isn’t a SQL Server book. But you will want to take advantage of Invoke-SqlCmd.

Instead of dealing with the .NET Framework to create a connection, command and query, you can simply invoke the query.

Invoke-SqlCmd "Select Computername,DiskSpace,DateTaken from DiskTracking" `
-Database MyDB

You can use any of the query types we’ve shown you in this chapter. One potential downside to this approach in your toolmaking is that this will only work locally, or where the SQL Server modules have been installed. Also, there is a bit of a lag while the module is initially loaded.

Thinking About Tool Design Patterns

If you’ve written a tool that retrieves or creates some data that you intend to put into SQL Server, then you’re on the right track. The next step would be a tool that inserts the data into SQL Server (Export-Something), and perhaps a tool to read the data back out (Import-Something). This approach maintains a good design pattern of each tool doing one thing, and doing it well, and lets you create tools that can be composed in a pipeline to perform complex tasks. You can read a bit more about that approach, and even get a “generic” module for piping data in and out of SQL Server databases in Ditch Excel: Making Historical & Trend Reports in PowerShell, a free ebook.

Let’s Review

Because we don’t want to assume that you have access to a SQL Server computer, we aren’t going to present a hands-on experience in this chapter. However, we do encourage you to try and answer these questions:

How to you prevent DELETE from wiping out a table?
What method do you use to execute an INSERT query?
What method reads a single database row from a reader object?

Review Answers

Here are our answers:

Specify a WHERE clause to limit the deleted rows.
The ExecuteNonQuery() method.
The Read() method.

Up next

Part 5: Seriously Advanced Toolmaking