This chapter expands on the operators and functions available to PostgreSQL. These character symbols and identifiers allow you to flexibly modify and compare results within SQL statements. The results of these operations can be used in a variety of ways, from updating existing row data, to constraining query reults to only rows matching conditions.
PostgreSQL supports the usual variety of standard SQL operators and functions as defined by the ANSI/ISO SQL standards, such as mathematical operators, basic text formatting funtions, and date and time value extraction. PostgreSQL also comes with a rich set of custom PostgreSQL extensions, such as regular expression comparison operators, and the flexible to_char() text conversion function.
Take note that these sections describe the native operators and functions available to PostgreSQL. An excellent feature of PostgreSQL is its extensibility in this area. Once you have a solid understanding of operators and functions, you may be interested in developing your own. This technique is described in Chapter 7.
In Chapter 3, operators are defined syntactically as tokens which are used to perform operations on values (e.g., constants, or identifiers), and return the results of that operation. In addition to these syntactic character symbols, there are some SQL keywords which are considered operators due to their effect on values in a SQL statement. Throughout this section, both these symbols and keywords will be referred to as operators.
The function of each operator is highly dependent on its context. Applications of operators range from performing mathematical operations, to concatenating character strings, to performing a wide variety of comparisons yielding boolean results. This section describes the general usage of operators in SQL, with successive sections on the following family of operators:
Character string operators
For an up to date and complete list of PostgreSQL supported operators, you can use psql's \do slash command to view a list of available operators. Understand that many of the listed operators are PostgreSQL-specific, and therefore may not exist in other SQL-capable databases implementations.
Following the discussions of the various types of operators, you'll find information on dealing with NULL values in expressions, and on the order in which operators are evaluated.
Operators either operate on a single value, or a pair of values. The majority of operators operate on two values, with the operator placed between the values it is to operate upon (e.g., a - b). Operators which affect only one value are called unary operators, and either precede or follow the value they affect (e.g., the @ operator preceding a value is a unary operator indicating the absolute value).
Many operators, while invoked with the same keyword or character symbol, will have different effects depending on the data types to which they are applied. Further, operators will not always have a relevant use to every data type (see Chapter 3, for more information on what data types are available to PostgreSQL).
For example, you can use the addition operator (+) to add two integer values together, but you cannot use it to add an integer to a text type. This is an undefined (and therefore ambiguous, and disallowed) use of the operator. The operator character itself (+, in this case) will still be recognized, but you will receive an error such as the one shown in Example 5-2 if you try to misuse an operator.
Consider the Book Town authors table, which correlates author's names with numeric identifiers.
Table "authors" Attribute | Type | Modifier ------------+---------+---------- id | integer | not null last_name | text | first_name | text | Index: authors_pkey
Two identifiers in this table are the columns id, and last_name, which are of type integer (a 4-byte integer) and text, respectively. Since the id column is of type integer, it may be used with a mathematical operator along with another numeric value.
Example 5-1 demonstrates correct usage of the addition (+) operator.
Example 5-1. Correct Operator Usage
booktown=# SELECT id + 1 AS id_plus_one, last_name booktown-# FROM authors booktown-# ORDER BY id DESC LIMIT 5; id_plus_one | last_name -------------+-------------- 25042 | Bianco 15991 | Bourgeois 7807 | Christiansen 7806 | Lutz 4157 | King (5 rows)
Notice, the result of trying to add incompatible types in Example 5-2.
Example 5-2. Incorrect Operator Usage
booktown=# SELECT id + last_name AS mistake booktown-# FROM authors; ERROR: Unable to identify an operator '+' for types 'int4' and 'text' You will have to retype this query using an explicit cast
Fortunately, as you can see in Example 5-2, PostgreSQL's operator-misuse error messages supply a reason for failure, rather than blindly failing. These can be helpful in determining the next step in developing your statement, in order to make it a valid query.
PostgreSQL contains a comprehensive set of character string operators, from simple text concatenation and string comparison, to a strong set of regular expression matching. Character string operators are valid upon values of types char, varchar, and PostgreSQL's own text type.
The following sections describe the basic comparison and concatenation operators, as well as the implementation of case-sensitive and case-insensitive regular expression operators.
Each of the basic character string comparison and concatenation operators supported by PostgreSQL are listed in Table 5-1.
Note that the LIKE and ILIKE keywords, which call to the like() are sometimes referred to as string comparison operators. These keywords are covered in the section called Functions.
Table 5-1. Basic Character String Operators
'string' = 'comparison'
A comparison returning true if string matches comparison identically
'string' != 'comparison'
A comparison returning true if string do not match comparison identically
'string' <> 'comparison'
Identical to the != operator
'string' < 'comparison'
A comparison returning true if string should be sorted alphabetically before comparison
'string' <= 'comparison'
A comparison returning true if string should be sorted alphabetically before comparison, or if the values are identical
'string' > 'comparison'
A comparison returning true if string should be sorted alphabetically after comparison
'string' >= 'comparison'
A comparison returning true if string should be sorted alphabetically after comparison, or if the values are identical
Each of the string comparison operators returns a boolean result of either true or false. The alphabetical sorting referred to by Table 5-1 compares each sequential character in a string, determining if one character is considered 'greater than' or 'less than' the other. If the leading characters in two strings are at all identical, each character is checked from left to right until two different characters are found for comparison. In this sorting scheme, characters are determined to be higher than one another based on their ASCII value, as demonstrated in the following example:
booktown=# SELECT letter, booktown-# ascii(letter) booktown-# FROM text_sorting booktown-# ORDER BY letter ASC; letter | ascii --------+------- 0 | 48 1 | 49 2 | 50 3 | 51 A | 65 B | 66 C | 67 D | 68 a | 97 b | 98 c | 99 d | 100 (12 rows)
If you are unsure of how a character will be sorted, you can use the ascii() function to determine the ascii value of the character. This function is described further in the section called Functions. Example 5-3 illustrates a comparative check on the books table, and returns all titles whose first letter would be sorted before the letter D.
The text concatenation operator (||) is an invaluable tool for formatting output results. Like all operators, it may be used anywhere a constant value is allowed in a SQL statement. Values may be repeatedly concatenated in a single statement by simply appending the || operator after each appended string constant or identifier.
As an example, it might be used in the WHERE clause in order to constrain rows by comparing against a dynamically generated character string. Example 5-4 demonstrates the use of this operator.
Example 5-4. Concatenating Strings
booktown=# SELECT 'The Title: ' || title || ', by ' || booktown-# first_name || ' ' || last_name AS book_info booktown-# FROM books NATURAL JOIN authors AS a (author_id) booktown-# LIMIT 3; book_info -------------------------------------------------------- The Title: The Shining, by Stephen King The Title: Dune, by Frank Herbert The Title: 2001: A Space Odyssey, by Arthur C. Clarke (3 rows)
For times when normal equivalence comparisons are inadequate, PostgreSQL has several operators designed to perform pattern matching against regular expressions. A regular expression is similar to any other string to be matched against, with the exception that some characters (such as the square braces, pipe, and backslash) have special meaning in a comparison. If you have used UNIX programs such as sed, grep, or perl, you may already be familiar with this kind of syntax.
For more detailed information on regular expressions in general, refer to O'Reilly's Mastering Regular Expressions, by Jeffrey E. F. Friedl.
When a value is compared against a regular expression, the expression itself (or regex) may match both literal character sequences, as well as several variable character sequences. Both literal and variable sequences may be specified throughout the expression. Example 5-5 illustrates an example of such a sequence. It searches the Book Town authors table for authors whose names begin either with A, or T.
Example 5-5. An Example Regular Expression
booktown=# SELECT first_name, last_name booktown-# FROM authors booktown-# WHERE first_name ~ '^A|^T'; first_name | last_name ---------------+-------------- Ariel | Denham Tom | Christiansen Arthur C. | Clarke Andrew | Brookins Theodor Seuss | Geisel (5 rows)
The ~ symbol is the regular expression operator, within the WHERE clause, and the regular expression sequence itself in Example 5-5 is ^A|^T. The special characters in this sequence are the caret (^), and the pipe (|), while the literal characters are A and T. The special characters used in regular expressions are explained in detail later in this section.
The most important syntactic difference between the use of the like() function and regular expression operators is that like() uses wild-card symbols (e.g., %) at the beginning and end of its expression in order to match a substring. In contrast, unless otherwise instructed (with the beginning and end-line symbols found in Table 5-3) regular expression operators will implicitly look for the regular expression sequence anywhere in the compared character string.
Each of the regular expression operators are listed in Table 5-2. These operators compare a text value (either an identifier, or a constant) to a regular expression. Each operator provides a boolean result, depending on the nature of the operator.
Table 5-2. Regular Expression Comparison Operators
'string' ~ 'regex'
A regular expression comparison, yielding true if the expression matches
'string' !~ 'regex'
A regular expression comparison, yielding true if the expression does not match
'string' ~* 'regex'
A case-insensitive regular expression, yielding true if the expression matches
'string' !~* 'regex'
not equal to regular expression, case insensitive
The special characters available to a regular expression are listed in Table 5-3. These are the characters which may be used in a regular expression string to represent special meaning.
Table 5-3. Regular Expression Symbols
Matches the beginning (^) of the character string
Matches the end ($) of the character string
Matches any single character
Matches any single character which is between brackets (e.g., a, b, or c)
Matches any single character not between brackets, following caret (e.g., not a, b, or c)
Matches any character which is between the range of characters between brackets and separated by the dash (e.g., within a through z)
Matches any characters not between the range of characters between brackets and separated by the dash (e.g., not within a through z)
Matches zero or one instances of the character (or regex sequence) preceding it
Matches zero or more instances of the character (or regex sequence) preceding it
Matches one or more instances of the character (or regex sequence) preceding it
Matches character sequences to the left or right of it (e.g., either expression1, or expression2)
Explicitly groups expressions, to clarify precedence of special character symbols
Note that in order to use a literal version of any of the characters in Table 5-3, they must be prefixed with two backslashes (e.g., \\$ represents a literal dollar sign).
A common use of regular expressions is to search for a literal substring within a larger string. This can be achieved either with the ~ operator, if case is important, or with the ~* operator if the comparison should be case-insensitive. These operators are each demonstrated in Example 5-6.
Example 5-6. A Simple Regular Expression Comparison
booktown=# SELECT title FROM books booktown-# WHERE title ~ 'The'; title ---------------------- The Shining The Cat in the Hat The Velveteen Rabbit The Tell-Tale Heart (4 rows) booktown=# SELECT title FROM books booktown-# WHERE title ~* 'The'; title ----------------------------- The Shining The Cat in the Hat Bartholomew and the Oobleck Franklin in the Dark The Velveteen Rabbit The Tell-Tale Heart (6 rows)
As you can see in Example 5-6, 2 more rows are returned when using the ~* operator, as it matches not just "the" sequence, but modification of case on the same sequence (including the, tHe, ThE, and so on).
The same regular expression sequence can be modified to use the ^ symbol, to match only the character string The when it is at the beginning of the comparison string, as shown in Example 5-7. Additionally, the .* sequence is then appended, to indicate any number of characters may match until the next following grouped expression. In this case, the .* sequence is followed by a parenthetically grouped pair of strings (rabbit and heart), which are separated by the | symbol, indicating that either of the strings will be considered a match.
Example 5-7. A More Involved Regular Expression Comparison
booktown=# SELECT title FROM books booktown-# WHERE title ~* '^The.*(rabbit|heart)'; title ---------------------- The Velveteen Rabbit The Tell-Tale Heart (2 rows)
In Example 5-7, the results should fairly clearly indicate the effect of the regular expression comparison. Translated into English, the expression ^The.*(rabbit|heart) states that a match will be found only if the compared string begins with the character sequence The and, any amount of any characters thereafter, contain either the character sequence rabbit, or heart. The use of the ~* operator (rather than just the ~ operator) makes the comparison case-insensitive.
Example 5-8 executes an even more complicated regular expression comparison.
Example 5-8. A Complicated Regular Expression Comparison
booktown=# SELECT title FROM books booktown-# WHERE title ~* '(^t.*[ri]t)|(ing$|une$)'; title ---------------------- The Shining Dune The Velveteen Rabbit The Tell-Tale Heart (4 rows) booktown=#
The regular expression used in Example 5-8 should make it clear why regular expressions can be intimidating, just by its appearance! Breaking it down an element at a time, you can see that there are two parenthetically grouped expressions, separated by a | symbol. This means that if either of these expressions are found to match the title, the comparison will be considered a match.
Breaking it down further, you can see that the expression to the left of the | symbol consists of, from left to right: a caret (^) followed by the character t, a period (.) followed by an asterisk (*), and a pair of square brackets (), enclosing the characters r and i, followed by the character t. Translated into English, this sub-expression essentially says that in order to match, the compared string must begin with the letter t, and be followed by a sequence of zero or more characters until either the letter r, or i is found, which must be followed immediately by the letter t. If any of these conditions is not found, the comparison will not be considered a match.
The expression to the right of the | symbol is a bit simpler, consisting of two character string sequences (ing and une), each followed by the $ character, and separated by another | symbol. This sub-expression, translated into English, describes a match as one which either ends with the value ing, or une. If either of these are found, the expression is considered a match, because of the | symbol .
PostgreSQL's numeric operator support can be divided into three general groups:
Numeric comparison operators
Binary (or bit string) operators
Mathematical operators affect one or two values, perform a mathematical operation, and return a value of a numeric data type. Numeric comparison operators draw a conclusion based on two numeric values (such as whether one is larger than the other) and returns a value of type boolean, set to either true or false. Binary, or bit string, operators manipulate numeric values at the bit level of zeroes and ones. The following sections address each of these operator groups.
Mathematical operators can be used in the target list, in the WHERE clause of a SELECT statement, or anywhere else a numeric result may be appropriate. This sometimes will include the ORDER BY clause, a JOIN qualifier, or a GROUP BY clause.
Table 5-4 describes each of the mathematical operators available in PostgreSQL, along with example usage.
Table 5-4. Mathematical Operators
a + b
Addition of numeric quantities a and b
a - b
Subtraction of numeric quantity b from a
a * b
Multiplication of numeric quantities a and b
a / b
Division of numeric quantity a by b
a % b
Modulus, or remainder, from dividing a by b
a ^ b
Exponential operator, the value of a to the power of b
Square root of a
Cube root of a
Factorial of a
Factorial prefix, factorial of a, different only in syntactic placement from !
Absolute value of a
As an example of mathematical operators in the target list, the statement in Example 5-9 takes the retail price for each book and divides the cost with the / operator in order to determine the profit margin. This value is then typecast to a truncated numeric value with only two digits of precision. Finally, the integer constant 1 is subtracted from the division result, to yield only the percentage points over 100.
Example 5-9. Using Mathematical Operators
booktown=# SELECT isbn, booktown-# (retail / cost)::numeric(3,2) - 1 AS margin booktown-# FROM stock booktown-# ORDER BY margin DESC booktown-# LIMIT 4; isbn | margin ------------+-------- 0451457994 | 0.35 0760720002 | 0.33 0451198492 | 0.30 0441172717 | 0.29 (4 rows)
Notice that the column name is temporarily aliased to margin by using the AS keyword. Remember that the column name created by the AS keyword is a temporary name, and used only for the duration of the query.
Comparison operators are used to compare values of types such as integer or text to one another, but they will always return a value of type boolean. These operators are most commonly used in the WHERE clause, but may be used anywhere in a SQL statement where a value of type boolean would be valid.
Table 5-5 shows the available comparison operators.
Table 5-5. Comparison Operators
Less-than, returns true if the value to the left is smaller in quantity than the value to the right
Greater-than, returns true if the value to the left is greater in quantity than the value to the right
Less-than or equal-to, returns true if the value to the left is smaller, or equal to, in quantity than the value to the right
Greater-than or equal-to, returns true if the value to the left is greater, or equal to, in quantity than the value to the right
Equal-to, returns true if the values to the left and right of the operator are equivalent
|<> or !=||Not-equal, returns true if the values to the left and right of the operator not equivalent|
The <> operator exists as an alias to the != operator, for functional compatibility with other SQL-capable database implementations. They are effectively identical.
For an example of mathematical comparison operator usage, observe Example 5-10. The query involved uses the <= operator first, to check if the retail value is less-than or equal-to 25. Subsequently, the != operator is employed with the AND keyword to ensure that only books which are in stock (whose stock value are not equal to 0) are returned.
The BETWEEN keyword (sometimes called an operator) provides the ability to check a value for existence within a range of values. For instance, Example 5-11 shows a SELECT statement which looks for books with cost between 10 and 17 dollars.
Example 5-11. Using BETWEEN
booktown=# SELECT isbn FROM stock booktown-# WHERE cost BETWEEN 10 AND 17; isbn ------------ 0394800753 0441172717 0451457994 (3 rows)
You can achieve the same output using the less-than or equal-to operator (<=) in conjunction with the greater-than or equal-to (>=) operator. You can see an example of this in Example 5-12.
Example 5-12. Operator Equivalents to BETWEEN
booktown=# SELECT isbn FROM stock booktown-# WHERE cost >= 10 AND cost <= 17; isbn ------------ 0394800753 0441172717 0451457994 (3 rows)
The BETWEEN syntax simply adds to the readability of an SQL statement. Since both the keyword and operator forms are equally valid to PostgreSQL, it's mostly a matter of user preference.
Binary operators perform bitwise operations on the literal bits of a bit string or integer. These operators may affect integer values, or directly on bit string values. Each of PostgreSQL's binary operators are described in Table 5-6.
Table 5-6. Bit String Operators
a & b
Binary AND between bit string values of a and b (which may be provided as integers)
a | b
Binary OR between bit string values of a and b (which may be provided as integers)
a # b
Binary XOR between bit string values of a and b (which may be provided as integers)
Binary NOT, returns the inverted bit string of b
b << n
Binary shifts b to the left by n bits
b >> n
Binary shifts b to the right by n bits
Example 5-13 demonstrates shifting a numeric value, and its equivalent bit string, two bits to the right with the >> operator. It also demonstrates the use of the bittoint4() function, described in the section called Functions.
Example 5-13. Shifting Bit Strings
booktown=# SELECT b'1000' >> 2 AS "8 shifted right", booktown-# bittoint4(b'1000' >> 2) AS integer, booktown-# 8 >> 2 AS likewise; 8 shifted right | integer | likewise -----------------+---------+---------- 0010 | 2 | 2 (1 row)
When shifting bit strings, the original length of the string does not change, and any digits pushed either to the left or right of the bit string will be truncated. When using &, |, or #, the bit strings operated on must be of equal length in order to properly compare each bit on a one-to-one basis.
The AND, OR and NOT keywords are PostgreSQL's Boolean operators. They are commonly used to join or invert conditions in a SQL statement, particularly in the WHERE clause and the HAVING clause.
Table 5-7 illustrates the boolean values returned for the AND, OR, and NOT keywords, with each possible value for a boolean field (true, false, or NULL).
Table 5-7. The AND, OR and NOT Operators
|a||b||a AND b||a OR b||NOT a||NOT b|
Example 5-14 sequentially uses the OR and AND keywords in two queries to combine a pair of conditions by which rows should be retrieved. In the first query, if a book has either a cost of greater than thirty dollars, or is out of stock, its information will be returned. As you can see from the result set, matching one or both of these conditions causes a row to be returned.
The second query in Example 5-14 uses the same conditions, but combines them with the AND keyword. This results in a stricter condition, as both criteria must be met. As such, only one row is returned, since only one book is found which both has a cost of greater than thirty dollars, and is out of stock..
Example 5-14. Combining Comparisons with Boolean Operators
booktown=# SELECT isbn, cost, stock booktown-# FROM stock booktown-# WHERE cost > 30 booktown-# OR stock = 0; isbn | cost | stock ------------+-------+------- 0394900014 | 23.00 | 0 044100590X | 36.00 | 89 0451198492 | 36.00 | 0 0451457994 | 17.00 | 0 (4 rows) booktown=# SELECT isbn, cost, stock booktown-# FROM stock booktown-# WHERE cost > 30 booktown-# AND stock = 0; isbn | cost | stock ------------+-------+------- 0451198492 | 36.00 | 0 (1 row)
If a table has NULL values in it, there exist a special pair of comparison operators which can be used to include or omit NULL valued fields. You can check for fields set to NULL using the IS NULL keyword phrase. In order to check for a non-NULL value, use the IS NOT NULL keyword phrase.
Example 5-15 uses the IS NULL keyword to check for authors whose first_name column value are set to NULL.
Example 5-15. Comparisons Using IS NULL
booktown=# SELECT last_name, first_name booktown-# FROM authors booktown-# WHERE first_name IS NULL; last_name | first_name -----------+------------ Geisel | (1 row)
Example 5-16. Comparisons Equal to NULL
booktown=# SELECT last_name, first_name booktown-# FROM authors booktown-# WHERE first_name = NULL; last_name | first_name -----------+------------ Geisel | (1 row)
PostgreSQL provides a translation from = NULL to IS NULL, and likewise for the != NULL operation with IS NOT NULL. This is provided only for compatibility with existing client applications (such as Microsoft Access).
When comparing a value to NULL in an expression, it is a good idea to be in the habit of using the IS NULL and IS NOT NULL keyword operators rather than the = or != math-style operators. While this translation is provided for the sake of compatibility with other systems, it may be discontinued in the future, as it is not a standard SQL procedure, and it is certainly not guaranteed to be a portable procedure to other SQL-based database systems for the same reason.
Any as yet undiscussed comparison operator used on a NULL value will return a NULL value, as NULL will never be larger, smaller, or otherwise related to any non-NULL value. A direct query on the result of a comparison against a NULL value will therefore return NULL. You can think of a NULL value as being a sort of SQL black hole, from which no comparison (outside of IS NULL, and its special = translation) may return true, and to which no values may be added, or concatenated.
When utilizing several operators in large expressions, it can be helpful to know in what order PostgreSQL processes operators. It is not, as you might think, strictly from left to right. If not properly understood, the order of execution can introduce potential for accidental side-effects, such as those shown in Example 5-18.
Example 5-18. Operator Precedence
booktown=# SELECT 60 + 12 * 5 AS "sixty plus twelve times five", booktown-# 12 + 60 * 5 AS "twelve plus sixty times five"; sixty plus twelve times five | twelve plus sixty times five ------------------------------+------------------------------ 120 | 312 (1 row)
As you can see by the two column values returned in Example 5-18, the use of several operators without parentheses to enforce precedence can result in very different results, despite the same numbers being manipulated in only a slightly different order. In this example, the multiplication is actually executed first (regardless of the fact that the addition sign (+) precedes it sequentially, from left to right).
Table 5-8 lists, in order of PostgreSQL's execution from the top down, the precedence of each group of operators.
Table 5-8. Operator Precedence
Array element index
Table and column name separator
value ^ power
* / %
value1 * value2
Multiplication, division, and modulus
value1 + value2
Addition and subtraction
value IS boolean
Compares against true or false
Compares against NULL
IS NOT NULL
value IS NOT NULL
Checks for value inequivalent to NULL
Includes all other native and user-defined character operators
value IN set
Checks for membership of value in set
value BETWEEN a AND b
Checks for value in range between values a and b
string LIKE comparison
Checks for matching pattern comparison in string
< > <= >=
Quantity comparisons for less than, greater than, less than or equal to, and greater than or equal to.
value1 = value2
Logical NOT inversion
value1 AND value2
Logical AND conjunction
value1 OR value2
Logical OR conjunction
The operator precedence listed in Table 5-8 applies to user-defined operators that have the same character sequence as built-in operators. For example, if you define the plus symbol (+) operator for your own user-defined data type, it has the same precedence as the built in plus (+) operator, regardless of its function.