5. Writing Basic Queries
A query is a SQL statement that is used to extract a subset of data from your database and presents it in a readable format. As we have seen previously, the SELECT command is used to run queries in SQL. You can further add clauses to your query to get a filtered, more meaningful result. Since the majority of operations on a database involve queries, it is important to understand them in detail. While this chapter will only deal with queries run on a single table, you can run a SELECT operation on multiple tables in a single statement.
5.1 Selecting a limited number of columns
We have already seen how to extract all the data from a table when we were verifying our results in the previous chapters. But as you might have noted - a query can be used to extract a subset of data too. We first test this by limiting the number of fields to show in the query output by not specifying the * selection criteria, but by naming the fields explicitly.
1 SELECT language, year FROM proglang_tbl;
| language | year |
|---|---|
| Prolog | 1972 |
| Perl | 1987 |
| APL | 1964 |
You can see that the query we constructed mentioned the fields we wish to see, i.e. language and year. Also note that the result of this query is useful by itself as a report for looking at the chronology of programming language creation. While this is not a rule enforced by SQL or a relation database management system, it makes sense to construct your query in such a way that the meaning is self-evident if the output is meant to be read by a human. This is the reason we left out the field id in the query, since it has no inherent meaning to the reader except if they wish to know the sequential order of the storage of records in the table.
5.2 Ordering the results
You might have noticed that in our previous query output, the languages were printed out in the same order as we had inserted them. But what if we wanted to sort the results by the year the language was created in. The chronological order might make more sense if we wish to view the development of programming languages through the decades. In such cases, we take the help of the ORDER BY clause. To achieve our purpose, we modify our query with this additional clause.
1 SELECT language, year FROM proglang_tbl ORDER BY year;
| language | year |
|---|---|
| APL | 1964 |
| Prolog | 1972 |
| Perl | 1987 |
The astute reader will notice that the output of our ORDER BY clause was ascending. To reverse this, we add the argument DESC to our ORDER BY clause as below.
1 SELECT language, year FROM proglang_tbl ORDER BY year DESC;
| language | year |
|---|---|
| Perl | 1987 |
| Prolog | 1972 |
| APL | 1964 |
5.3 Ordering using field abbreviations
A useful shortcut in SQL involves ordering a query result using an integer abbreviation instead of the complete field name. The abbreviations are formed starting with 1 which is given to the first field specified in the query, 2 to the second field and so on. Rewriting our above query to sort the output by descending year, we get:
1 SELECT language, year FROM proglang_tbl ORDER BY 2 DESC;
| language | year |
|---|---|
| Perl | 1987 |
| Prolog | 1972 |
| APL | 1964 |
The 2 argument given to the ORDER BY clause signifies ordering by the second field specified in the query, namely year.
5.4 Putting conditions with WHERE
We have already seen how to select a subset of data available in a table by limiting the fields queried. We will now limit the number of records retrieved in a query using conditions. The WHERE clause is used to achieve this and it can be combined with explicit field selection or ordering clauses to provide meaningful output.
For a query to run successfully, it must have atleast two parts - the SELECT and the FROM clause. After this we place the optional WHERE condition and then the ordering clause. Thus, if we wanted to see the programming language (and it’s author) which was standardized by ANSI, we’d write our query as below.
1 SELECT language, author FROM proglang_tbl WHERE standard = 'ANSI';
As you may have noticed, the query we forulated specified the language and author fields, but the condition was imposed on a separate field altogether - standard. Thus we can safely say that while we can choose what columns to display, our conditionals can work on a record with any of its fields.
| language | author |
|---|---|
| APL | Iverson |
You are by no means restricted to use = (equals) for your conditions. It is perfectly acceptable to choose other operators like < and >. You can also include the ORDER BY clause and sort your output. An example is given below.
1 SELECT language, author, year FROM proglang_tbl WHERE year > 1970 ORDER BY autho\
2 r;
| language | author | year |
|---|---|---|
| Prolog | Colmerauer | 1972 |
| Perl | Wall | 1987 |
Notice that the output only shows programming languages developed after 1970 (atleast according to our database). Also since the ordering is done by a varchar field, the sorting is done alphabetically in an ascending order.
5.5 Combining conditions
If we can only specify one condition using the WHERE clause, it will fulfill only a tiny fraction of real world requirements. We can however construct complex conditions using the boolean operators AND and OR.
When we want our resultset to satisfy all of the multiple conditions, we use the AND operator.
1 SELECT language, author, year FROM proglang_tbl
2 WHERE year > 1970 AND standard IS NULL;
| language | author | year |
|---|---|---|
| Perl | Wall | 1987 |
| Tcl | Ousterhout | 1988 |
The result satisfies both the conditions we specified, namely - the language should not be standardized and it must have been created after 1970.
If we want our resultset to satisfy any one of our conditions, we use the OR operator.
1 SELECT language, author, year FROM proglang_tbl
2 WHERE year > 1970 OR standard IS NULL;
| language | author | year |
|---|---|---|
| Prolog | Colmerauer | 1972 |
| Perl | Wall | 1987 |
| Tcl | Ousterhout | 1988 |
The result now contains languages which were created either after 1970 like Prolog or which are not standardized like Perl or Tcl. In this particular example, the first condition satisfies all the rows of the resultset. But if there were a language which was created before 1970 and wasn’t yet standardized, it would show up as a result of this query. We can even create yet more complex queries by combining these operators.