Introduction

SQL stands for "Structured Query Language". You have probably seen many different methods used in programming manuals to show how to use a specific programming language. The two most common methods use syntax flow diagrams and what is known as Backus-Naur Form (BNF) which is a formal language for describing a programming language. In this document we use a simplified BNF method that seeks to represent the language in a way that closely matches the way you will code your own SQL statements for your application.

For example, the following SELECT statement:

select sale_name, company, city, state 
 	from salesperson natural join customer;

can be described by this syntax rule:

select_stmt:
SELECT identifier[, identifier]… FROM identifier [NATURAL JOIN identifier]

where select_stmt is the name of the rule (sometimes called a non-terminal); the bold-faced identifiers SELECT, FROM, NATURAL, and JOIN are key words (sometimes called terminal symbols); identifier is like a function argument that stands in place of a user-specified value (technically, it too is the name of a rule that is matched by any user-specified value that begins with a letter followed by any sequence consisting of letters, digits, and the underscore ("_") character). Rule names are identifiers and their definitions are specified by giving the rule name beginning in column 1 and terminating the rule with a colon (":") as shown above.

There are also special meta-symbols that are part of the syntax descriptor language. Two are shown in the above select_stmt syntax rule. The brackets ("[" and "]") enclose optional elements. The ellipsis ("…") specifies that the preceding item can be repeated zero or more times. Other meta-symbols include a vertical bar (i.e., an "or" symbol) that is used to separate alternative elements and braces ("{" and "}") which enclose a set of alternatives from which one must always be matched. All other special characters (e.g., the "," and ";" in the select_stmt rule) are considered to be part of the language definition. Meta-symbols that are themselves part of the language will be enclosed in single quotes (e.g., '[') in the syntax rule.

Rule names can be used in other rules. For example, the syntax for a stored procedure that can contain multiple SELECT statements could be described by the following rule:

create_procedure:
CREATE PROCEDURE identifier AS
select_stmt[; select_stmt]…
END PROC;

In order to make the syntax more readable, any non-bold, italicized name is considered to be matched as an identifier. Thus, the select_stmt rule can also be written as follows…

select_stmt:
        SELECT column_name[, column_name]… FROM table_name NATURAL JOIN[ table_name] ;

where column_name represents identifiers that correspond to table column names and table_name represents identifiers that correspond to table names.

Some italicized terms are used to match specific text patterns. E.g., number matches any text pattern that can be used to represent a number (either integer or decimal) and integer matches any pattern that represents an integer number.

These rules are summarized in the table below.

Table 1. Syntax Description Language Elements
Syntax Element Description
KEYWORD Capital words that identify the special words used in the language that specify actions and usage. Sometimes called reserved words. Examples, SELECT, INSERT, CREATE, USING. These words are not case sensitive.
identifier Italicized word corresponding to an identifier: sequences of letters, digits, and "_" that begin with a letter.
number Any text that corresponds to an integer or decimal number.
integer Any text that corresponds to an integer.
[option1 | option2] A selection in which either nothing or option1 or option2 is specified.
{option1 | option2} Either option1 or option2 must be specified.
element... Repeat element zero or more times.
identifier Normal-faced identifiers correspond to the names of syntax rules. Syntax rules are defined by the name starting in column 1 and ending with a ":".