5. Writing Basic Queries

A query is a SQL statement that is used to extract a subset of data from your database and presents it in a readable format. As we have seen previously, the SELECT command is used to run queries in SQL. You can further add clauses to your query to get a filtered, more meaningful result. Since the majority of operations on a database involve queries, it is important to understand them in detail. While this chapter will only deal with queries run on a single table, you can run a SELECT operation on multiple tables in a single statement.

5.1 Selecting a limited number of columns

We have already seen how to extract all the data from a table when we were verifying our results in the previous chapters. But as you might have noted - a query can be used to extract a subset of data too. We first test this by limiting the number of fields to show in the query output by not specifying the * selection criteria, but by naming the fields explicitly.

Listing: selecting a subset of fields from a table
1 SELECT language, year FROM proglang_tbl;
Figure: Output of running the chosen fields SELECT query
language year
Prolog 1972
Perl 1987
APL 1964

You can see that the query we constructed mentioned the fields we wish to see, i.e. language and year. Also note that the result of this query is useful by itself as a report for looking at the chronology of programming language creation. While this is not a rule enforced by SQL or a relation database management system, it makes sense to construct your query in such a way that the meaning is self-evident if the output is meant to be read by a human. This is the reason we left out the field id in the query, since it has no inherent meaning to the reader except if they wish to know the sequential order of the storage of records in the table.

5.2 Ordering the results

You might have noticed that in our previous query output, the languages were printed out in the same order as we had inserted them. But what if we wanted to sort the results by the year the language was created in. The chronological order might make more sense if we wish to view the development of programming languages through the decades. In such cases, we take the help of the ORDER BY clause. To achieve our purpose, we modify our query with this additional clause.

Listing: Usage of the ORDER BY clause
1 SELECT language, year FROM proglang_tbl ORDER BY year;
Figure: Output of the ordered SELECT query
language year
APL 1964
Prolog 1972
Perl 1987

The astute reader will notice that the output of our ORDER BY clause was ascending. To reverse this, we add the argument DESC to our ORDER BY clause as below.

Listing: Usage of the ORDER BY clause with the DESC argument
1 SELECT language, year FROM proglang_tbl ORDER BY year DESC;
Figure: Output of the ordered SELECT query in descending order
language year
Perl 1987
Prolog 1972
APL 1964

5.3 Ordering using field abbreviations

A useful shortcut in SQL involves ordering a query result using an integer abbreviation instead of the complete field name. The abbreviations are formed starting with 1 which is given to the first field specified in the query, 2 to the second field and so on. Rewriting our above query to sort the output by descending year, we get:

1 SELECT language, year FROM proglang_tbl ORDER BY 2 DESC;
Figure: Output of the ordered SELECT query in descending order using field abbreviations
language year
Perl 1987
Prolog 1972
APL 1964

The 2 argument given to the ORDER BY clause signifies ordering by the second field specified in the query, namely year.

5.4 Putting conditions with WHERE

We have already seen how to select a subset of data available in a table by limiting the fields queried. We will now limit the number of records retrieved in a query using conditions. The WHERE clause is used to achieve this and it can be combined with explicit field selection or ordering clauses to provide meaningful output.

For a query to run successfully, it must have atleast two parts - the SELECT and the FROM clause. After this we place the optional WHERE condition and then the ordering clause. Thus, if we wanted to see the programming language (and it’s author) which was standardized by ANSI, we’d write our query as below.

Listing: Using a WHERE conditional
1 SELECT language, author FROM proglang_tbl WHERE standard = 'ANSI';

As you may have noticed, the query we forulated specified the language and author fields, but the condition was imposed on a separate field altogether - standard. Thus we can safely say that while we can choose what columns to display, our conditionals can work on a record with any of its fields.

Figure: Output of the SELECT query with a WHERE conditional clause
language author
APL Iverson

You are by no means restricted to use = (equals) for your conditions. It is perfectly acceptable to choose other operators like < and >. You can also include the ORDER BY clause and sort your output. An example is given below.

Listing: Combining the WHERE and ORDER BY
1 SELECT language, author, year FROM proglang_tbl WHERE year > 1970 ORDER BY autho\
2 r;
Figure: Output of the SELECT query with a WHERE and ORDER BY
language author year
Prolog Colmerauer 1972
Perl Wall 1987

Notice that the output only shows programming languages developed after 1970 (atleast according to our database). Also since the ordering is done by a varchar field, the sorting is done alphabetically in an ascending order.

5.5 Combining conditions

If we can only specify one condition using the WHERE clause, it will fulfill only a tiny fraction of real world requirements. We can however construct complex conditions using the boolean operators AND and OR.

When we want our resultset to satisfy all of the multiple conditions, we use the AND operator.

Listing: using the AND operator to combine conditions
1 SELECT language, author, year FROM proglang_tbl
2 WHERE year > 1970 AND standard IS NULL;
language author year
Perl Wall 1987
Tcl Ousterhout 1988

The result satisfies both the conditions we specified, namely - the language should not be standardized and it must have been created after 1970.

If we want our resultset to satisfy any one of our conditions, we use the OR operator.

Listing: using the OR operator
1 SELECT language, author, year FROM proglang_tbl
2 WHERE year > 1970 OR standard IS NULL;
language author year
Prolog Colmerauer 1972
Perl Wall 1987
Tcl Ousterhout 1988

The result now contains languages which were created either after 1970 like Prolog or which are not standardized like Perl or Tcl. In this particular example, the first condition satisfies all the rows of the resultset. But if there were a language which was created before 1970 and wasn’t yet standardized, it would show up as a result of this query. We can even create yet more complex queries by combining these operators.