This chapter discusses the history and fundamental concepts of SQL, and forms the foundation for the next chapter, which is on applying SQL with PostgreSQL. It addresses the basics of relational databases, object-related database extensions, the structure of a SQL statement, and provides an overview of PostgreSQL-supported data types, operators and functions.
SQL, the Structured Query Language, is a mature, powerful, and versatile relational query language. The history of SQL extends back to IBM research begun in 1970. The next few sections discuss the history of SQL, its predecessors, and the various SQL standards that have been agreed upon over the years.
The relational model, which SQL draws much of its conceptual core from, was first formally defined in 1970 by Dr. E. F. Codd, a researcher for IBM, in a paper entitled A Relational Model of Data for Large Shared Data Banks. This article generated a great deal of interest in both the feasibility and practicality of such a system in a commercial application.
In 1974 IBM began the System/R project, and with the work of Donald Chamberlin and others, defined SEQUEL, a Structured English Query Language. System/R was implemented on an IBM prototype called SEQUEL-XRM in 1974-75. System/R was then re-written completely from 1976-1977 in order to implement multi-table and multi-user features. As the system was revised, it was briefly re-named "SEQUEL/2", before eventually being re-named to "SQL" for legal reasons.
1978 saw the beginning of the methodical testing of System/R at customer test sites. Demonstrating both the usefulness and practicality of the system, this testing proved to be a success for IBM. As a result, IBM began to develop commercial products that implemented SQL based on their System R prototype, including SQL/DS introduced in 1981, and DB2 in 1983.
Several other software vendors accepted the rise of the relational model, and announced SQL-based products. These included Oracle (who actually beat IBM to market by two years by releasing their first commercial RDBMS, in 1979), Sybase, and Ingres (based on the University of California's Berkeley Ingres project).
PostgreSQL's name is, as you might have guessed, a play on the name Ingres. Both PostgreSQL and Ingres trace their roots back to the UC Berkeley's Ingres RDBMS system.
SQL is based largely on tuple relational calculus and relational algebra. Relational algebra, introduced by E. F. Codd in 1972, provided the basic concepts behind computing SQL syntax. It is a procedural way to construct data-driven queries, and it addresses the how logic of a structured query. The tuple relational calculus (TRC), on the other hand, had a large effect on the underlying appearance of SQL. Relational calculus uses declarative expressions, which address the what logic of a structured query.
There are additional features that set SQL apart from merely implementing features that are part of relational algebra or calculus:
A user is allowed to insert, delete, and modify stored data records.
Arithmetic operations such as addition, subtraction, multiplication, and division (e.g., (value1 * 5) + value2) are allowed, as well as comparison operators (e.g., value3 >= value4).
Display query-generated relationships (such as a table's contents).
Assignment provides a user the ability to rename a relation that is computed by a query, rather than forcing the use of the default relationship name, which may be derived from a column or function name, depending on the query.
Aggregate functions allow a user to group related rows together and perform evaluations upon that group of data. These operations can include, but are not limited to: average, sum, count, maximum, and minimum.
SQL has existed as an "official" standard since the adoption of the ANSI (American National Standards Institute) standard (X3.135) in 1986, and the ISO (International Standards Organization) standardization in 1987. The United States government's Federal Information Processing Standard (FIPS) adopted the ANSI/ISO standard. In 1989, a revised standard known commonly as SQL89 or SQL1, was published.
Due partially to conflicting interests from commercial vendors, much of the SQL89 standard was left intentionally incomplete, and many features were labeled implementor-defined. In order to strengthen the standard, the ANSI committee revised its previous work with the SQL92 standard, ratified in 1992 (also called SQL2). This standard addressed several weaknesses in SQL89, as well as set forth conceptual SQL features which at that time exceeded the capabilities of any existing RDBMS implementation. In fact, the SQL92 standard was approximately six times the length of its predecessor. As a result of this disparity, the authors defined three levels of SQL92 compliance: Entry-Level (only the barest improvements to SQL89), Intermediate-Level (a generally achievable set of major advancements), and Full (total compliance with the SQL92 features).
More recently, in 1999, the ANSI/ISO SQL99 standard (also called SQL3) was released. This standard addresses some of the more advanced and previously non-addressed areas of modern SQL systems, such as object-relational database concepts, call level interfaces, and integrity management. SQL99 replaces the SQL92 levels of compliance with its own degrees of conformance: Core SQL99, and Enhanced SQL99.
PostgreSQL presently conforms to most of the Entry-Level SQL92 standard, as well as many of the Intermediate and Full features. Additionally, many of the new SQL99 features are quite similar to many of the object-relational concepts pioneered by PostgreSQL, such as arrays, functions, and inheritance.