You are here: Users Guide > Creating an RDM Server Database

An RDM Server database is a collection of related files, which serve as the primary physical storage units. Files are used to group records, which consist of the actual data fields plus any key fields used to specify data access. All database models handle these basic elements in the same way, but differ in their handling of inter-table relationships. Decide on a Database Model below provides more information about the database models supported by RDM Server.

To create your RDM Server application database, you must write a database schema, which is a text file containing the definition of the content and organization of the database. If you will be using SQL in conjunction with your application, you will need to create your schema using the SQL database definition language (DDL).  Otherwise, you create your schema using the Core (C-like) DDL. Details for compiling the schema are described in Executing Your RDM Server Programs.

This chapter explains how to prepare a database for use with your application.  It includes a discussion of database models and instructions for creating both SQL and Core databases.  It also provides information about creating and using database instancing.

Database instancing is an exclusive feature of RDM Server that allows you to create multiple databases that share the same database definition (the schema).  The primary benefit is that mutually exclusive instances of a database can be created without replicating the metadata of the schema in either the system catalog or the server's memory.  (A schema's metadata is the information stored in the system catalog and database dictionary describing the definition of a database.) Using Core Database Instances describes how to create database instances for use in Core applications by using the s_dbClone administrator function. For applications using SQL, Using SQL Database Instances explains how to use the create database instance SQL statement.

By the time you finish this chapter, you should have enough information to implement an RDM Server application database.  However, database technology is a complex subject, requiring outside study for full understanding. 

6.1 Decide on a Database Model 

Before creating the database, you must decide on a database model. 

RDM Server supports the following models: relational, network, hierarchical, and combined.  Each is described in the sections that follow.

In writing your schema, try to use the power of the combined database model.  Use the relational database model when the application needs fast random access to data or needs to sort.  The network model is best for fast, predefined joins between tables.

6.1.1 Relational Database Model 

In the relational database model, access to database records is through a common data field that is defined as a key and located by a fast lookup in a key (index) file.  Database management systems based on the relational model are popular because the simplicity of the underlying data model makes them easy to use.

The relational database model defines a database as a group of two-dimensional tables.  This tabular view of a database is particularly easy to manipulate with standard relational database operations, which are based on mathematical set theory. SQL is a database language that provides many of these relational operations.

As Figure 6-1 shows, the relational model establishes relationships between two tables through a data field that is common to both.  Note that the data field must be defined in both tables and that an index file must exist in order to efficiently access the related occurrences in the second table.


Figure 6-1.  Relational Database Model

A simple but effective analogy can be made between a relational database and a file cabinet, as illustrated in Figure 6-2. A file cabinet contains drawers, and each drawer contains files that are organized around a common theme. For example, one drawer might contain customer information, while another drawer might contain vendor data. In this case, each drawer holds individual file folders for each customer or vendor, sorted in name order.  The file cabinet corresponds to the database, each drawer is similar to a table, and each folder corresponds to a row (record occurrence) in the table.


Figure 6-2.  A File Cabinet as a Relational Database

Using the relational database model, the RDM Server application finds a row for a key supplied by the user. RDM Server maintains and navigates the keys in the order of their arrangement in the index file, regardless of the physical order of the rows to which they point.  During a data search, if a key exactly matches data in the database, the search leads directly to a specific row.  If the key is a near match for data, the search ends at the position just before the first row that contains a key higher in the collating sequence. The application can also search for the first or last key of a given key type, regardless of value.  Once a data search has reached a position, the application can find the previous or next key in the collating sequence. In SQL, this searching is performed automatically during the processing of a query.

6.1.2 Network Database Model 

In the network database model, data access uses navigation of sets that define data relationships among records. The result is a network of interconnected records, for which duplicate fields and key files are not necessary.  Thus, using the network database model can reduce data redundancy.  The primary benefits of a design using the network model are better performance, reduced storage requirements, and greater assurance of data integrity.

In the network database model illustrated in Figure 6-3, the relationships between record types are explicitly defined and directly maintained through linked lists called "sets."  Sets are implemented as lists of database address pointers, so your application can access a related record directly with one database read operation.


Figure 6-3.  Network Database Model

In a network database model, any given record type can be the "owner" of many different sets.  It can also be a "member" of any number of sets. Each time a record type is defined as an owner or member of a set, RDM Server reserves a bucket of pointers in the record slot to hold the set owner or member pointers.

Network structures like the one shown in Figure 6-4 are valid.  In this figure, the arrows originate at owner records and point to member records.  Note that record type C is a member of two sets and an owner of two sets. Records can also own other records through multiple sets, as in the case of record type B owning record type E.  A record type can also own itself, as illustrated for record type E.


Figure 6-4.  Network Database Access Method

Using network database access, your RDM Server application can navigate sets in a number of ways.  It can search from the owner of a set to the first or last member of the set. From a set member, the application can navigate to the next or previous member record, or to the owner record.  During set navigation, position is established on a per-set basis.  In other words, a current owner and a current member are maintained for each set type that is used.

6.1.3 Hierarchical Database Model 

The hierarchical database model is a special variation of the network model that also uses set navigation for data access.  In this model, a record type can own more than one set, but it can only be a member of one set.  The owner is called the "parent" and the member is called the "child."

As in the network model, hierarchical sets are implemented through linked lists.  Figure 6-5 shows an example of a hierarchical schema structure showing the hierarchical database access method.


Figure 6-5.  Hierarchical Database Access Method

6.1.4 Combined Database Model 

The combined database model, on which RDM Server itself is based, combines the access methods of the relational and network database models. Basic relational and network methods combine easily in this model, as they share the concept of the current record.  The current record is simply the record in the database that has been accessed most recently.  It serves as the default object for many of the RDM Server functions. All RDM Server database access methods set the current record.

The network model can be mapped into the relational model.  In RDM Server SQL this is achieved through the create join statement described in "Create Joins for an SQL Database".

6.1.5 Comparing RDM Server Core and SQL Databases 

RDM Server provides the capability to create both RDM Server Core databases, using its own DDL, and RDM Server SQL databases, using SQL. For information about accessing an SQL database from a Core application, see Core API Access to SQL Databases. Which type of database you choose for a particular application will depend upon the specific characteristics of that application. Table 6-1 compares the characteristics of RDM Server Core and RDM Server SQL databases and applications.

Table 6-1.  Comparison of Core and SQL Databases and Applications

Characteristic

RDM Server Core Database

RDM Server SQL Database

Coding

For single table scans, a Core database is simpler.

For complex queries with multi-table joins, multiple conditions, aggregation, or sorting, SQL is simpler.

Database access

Permits Core access only.

Permits access from SQL/ODBC API and the Core API.

Database model

Can use both the relational and network models.

Can use both.  The network model is used through the RDM Server SQL create join statement.

Data types supported

Supports standard C language data types plus database addresses and BLOBs.  For a list, see Standard C Data Types.

Supports SQL data types, including smallint, real, float, char, date, time, decimal, rowid, long varchar, long wvarchar, and long varbinary.

Performance

Better performance.  However, for a large result set, there is no Core equivalent to the SQLExtendedFetch function, short of writing an extension module.

Recommended for the following record operations:

  Insert
  Update
  Delete
  Single instance retrievals

When a large result set must be obtained, SQL provides the SQLExtendedFetch function, which provides higher performance.

Recommended for:

  Dynamic query operations
  Filling lists
  Standards conformance

6.2 Write the Schema for a Core Database 

This section explains how to write a Core database schema. Note that the Core DDL statements used in the programming instructions in this section are detailed in RDM Server Language Guide.

Before you proceed to write the schema, be sure you have planned the following for the application database:

Note that you can intersperse file, record, and set declarations in the schema as long as:

The schema compiler ddlproc can produce a record, field, and set name cross-reference listing to help you locate names contained in a long schema specification.

All record, field, structure, and set names used in the schema must contain at least one lowercase letter so that they do not conflict with the names of the header file constants generated by the schema compiler.  For information about compiling an RDM Server database schema, see "Executing Your RDM Server Programs".

6.2.1 Understanding the Core Database Schema 

The tims database schema is located in the tims.ddl file in the examples installation directory for the server or client.  This schema is written in the Core DDL. The following shows the entire tims database schema.

/*---------------------------------------------------------------- 
Technical Information Management System (TIMS) Database
----------------------------------------------------------------*/
database tims {
  data file "tims.d01" contains keyword, intersect;
  data file "tims.d02" contains author, borrower, info, text;
  key  file "tims.k01" contains id_code;
  key  file "tims.k02" contains name, friend, word;

  record author {        /* author's name: "last, first" */
    char name[32];       /* or editor's name */

    unique key name;
  }

  record info {
    char id_code[16];    /* dewey dec. or own coding tech. */
    char info_title[80]; /* title of book, article, mag. */
    char publisher[32];  /* name of publisher */
    char pub_date[12];   /* date of publication */
    int16 info_type;     /* 0=book, 1=magazine, 2=article */

    unique key id_code;
  }

  record borrower {
    char friend[32];     /* name of borrower */
    int32 date_borrowed; /* dates are stored initially as */
    int32 date_returned; /* numeric YYMMDD (e.g.  870226) */

    key friend;
  }

  record text {
    char line[80];       /* line of abstract text */
  }

  record keyword {       /* subject key words */
    char word[32];

    unique key word;
  }

  record intersect {     /* copy of info_type to save I/O */
    int16 int_type;      /* when searching for, say, books */
  }

  set has_published {
    order sorted;
    owner author;
    member info by info_title;
  }

  set article_list {
    order last;
    owner info;
    member info;
  }

  set loaned_books {
    order last;
    owner info;
    member borrower;
  }

  set abstract {
    order last;
    owner info;
    member text;
  }

  set key_to_info {
    order last;
    owner keyword;
    member intersect;
  }

  set info_to_key {
    order last;
    owner info;
    member intersect;
  }
}

This publications database stores information on books, magazines (journals), and articles contained in a technical library. For each item loaned, it stores the borrower's name, date borrowed, and date returned.  The example application using the database maintains a loan history for each publication. In addition, it provides reporting for all unreturned publications.

The tims database includes the following kinds of data:

  Author names (author)
  Publication information (info)
  Borrower loan records (borrower)
  Abstract text lines (text)
  Subject keywords (keyword)
  Keyword-to-publication intersections (intersect)

6.2.2 Create a Core Database 

To create the database, you simply name it by placing a database statement at the beginning of the schema and enclosing the DDL statements that define the database within braces (see syntax below). Note that you can enter other statements before the database statement if desired.  See Create Records for a Core Database for details.

database dbname [ '[' pgsize ']' ] [on devname] [allow null] [db_addr '(' precision ')'] '{'
        ddl statements ...
'}'

If using the pgsize option with the database statement, always make the page size a multiple of the basic block size for your operating system. A multiple of 512 works for most systems. If you do not specify a page size with the database statement, RDM Server uses a default of 1024 bytes.  For more information about determining file and page sizes, see Guidelines for Creating Files.

The on clause identifies the default device on which the database files will be stored.  If not specified, the default device for the database is the default device of the user who is creating the schema.

If you specify allow null, each record will include a bitmap as an extra field that tracks whether the contents of each field are null.  Each field has one bit in this bitmap, so the amount of space needed for the bitmap depends on the number of fields in the record.

If you specify db_addr(precision), you can change the size of the database address (db_addr) for the database.  A database address is used to identify a particular record and is necessary for all keys and sets. 

The default database address precision is 4, which matches the database address size of all databases prior to RDM Server 8.1.  The problem with a precision of 4 is that it limits the total number of records per file to around 2 billion. RDM Server 8.1 and later allows databases with a much larger number of records per file. To do this, you need to tell the system that the database address is larger, so that there is sufficient space to identify all the records. The valid options for the database address precision are 4, 6, and 8, which yield approximately 4.2 x 10^9 (2^32 - 2), 2.8 x 10^14 (2^48 - 2), and 1.8 x 10^19 (2^64 - 2) records per file, respectively.

6.2.3 Create Records for a Core Database 

6.2.3.1 What is a Record?

A record contains a group of related data fields stored and accessed together as a unit.

  An SQL database uses the term "table" instead of record.  For a discussion of SQL database tables, see Create Table for an SQL Database.

6.2.3.2 Declare a Record Type

The database record type is similar to a C structure.  Each occurrence of a record type in the database contains a value for each of the fields defined for the record type.

For each record type in the application database, place a record statement in the schema.  Within the record, you must first declare the record's regular data fields followed by the compound key field declarations (if any) as shown in the syntax below. See Create Data Fields for a Core Database below for a description of the field declarations and Create Key Fields for a Core Database for a discussion of key creation.

[static] record recname '{'
    [field_stmt]
    ...
    [key_stmt]
    ...
'}'

Let's illustrate how to organize data into records by examining the records of the example schema (see Understanding the Core Database Schema above). The example tims database uses the author record to contain the name of the author.  The info record holds information about a publication written by this author. It includes the fields defined in Table 6-2.

Table 6-2.  info Record Fields

Field

Description

id_code

15-character field containing the ID code.  This field is defined as a unique key, which is used to access a single record occurrence in the database.

info_title

79-character field containing publication title.

publisher

31-character field containing publisher name.

pub_date

11-character field containing publication date.

info_type

Integer code field identifying information type.  Possible values are:
0 = book, 1 = magazine, 2 = article.

Note that the author record in the schema stands alone because there may be many publications written by a single author.  If we stored the author name in the info record, the result would be redundant data: in this case, multiple copies of the same author name.

The example schema includes a borrower record to contain information about a person checking out one of the author's publications. When the library using the example database loans a publication, the application creates a new borrower record.  This record includes the person's name, the date the publication is borrowed, and the date it is returned.  In this example, the application stores the dates in this record as an integer of the form YYYYMMDD (for example, 19991023 is October 23, 1999). Note that this format is not the way dates are stored in RDM Server SQL, since this is a Core example.

The text record stores a text string of up to 80 bytes, including the terminating null byte.  The one-to-many relationship of info to text allows as many lines as necessary for the abstract of the publication. The schema also defines a keyword record that stores a key word, a string field indexed to allow rapid retrieval of individual key word occurrences and retrieval of alphabetized key words.  The intersect record has one field, int_type, that holds a copy of the info record's info_type field. Use of the intersect record eliminates an extra disk read of the info record for non-books.

6.2.3.3 Declare a Static Record

You can declare a static record using the static attribute with the record statement.  Declaring a record static means the record occurrences do not change when the database is being shared by more than one user. The application program need not place locks on static records to access them, thereby improving concurrency. An application can modify these records only when the database is opened in exclusive access mode.

1.  The same result can be achieved dynamically on non-static records by placing a read lock on the record type using the d_rtlock function.
2.  Data files defining static records contain only static data.  Static records can only be used in sets that are modified with the database opened in exclusive access mode.

6.2.4 Create Data Fields for a Core Database 

This section shows how to declare data fields in a record.  The syntax for a field statement declaration is given below.

field_stmt:
            type fldname [ '[' dim ']'...];
or
            struct '{'
                 field_stmt
                 ...
            '}' fldname [ '[' dim ']'...];

type = int8 | uint8 | int16 | uint16 | int32 | uint32 | int64 | uint64 |
           [unsigned] char | wchar | float | double | blob_id | db_addr |
            date | time | timestamp | BCD(prec, scale)

6.2.4.1 About Data Fields

A data field (or field) is the basic unit of information in the database.  It declares a basic item of data that will be stored, usually along with other fields, in a record.  Attributes of a data field include name, data type, and length.  You specify the name of the field in place of the fldname syntax parameter.

Be sure you do not duplicate field names in the same record.  You can duplicate names used in other record types.

SQL uses the term "column" in place of data field.

6.2.4.2 Specify the Data Type

Standard C Data Types

The standard C language base data types supported in RDM Server include the following:

  int8, uint8, int16, uint16, int32, uint32, int64, uint64
  char (signed or unsigned) and wchar
  float and double

Bit fields and enumeration types are not supported.

RDM Server stores all data in the native format of the host computer.  To accommodate any resulting differences between the client and server machines, we recommend using the types specified by RDM Server (for example, int16, uint16, int32) to represent integer data. These types are defined correctly on each architecture and therefore will always be the same size, even if the client and server run on widely different systems.

In the next example, field check_date stores a date in Julian format (i.e., the number of days elapsed since January 1, 1 A.D.).  Its values are stored in the check_date key in integer order. Thus, the application using this schema retrieves keys in check date order through use of the runtime API key retrieval functions described in Key Retrieval Functions.

record mydata {
 date check_date;
 ...
 key check_date;
}

In the following example, the bitmap field stores a bitmap of attribute flags that are tested using binary operators and masks. For example, assuming this field is declared in the rec record type, rec.bitmap & 0x0001 is non-zero if the low-order bit is set.

uint16 bitmap;
Arrays

Your schema can define fields as one-, two-, and three-dimensional arrays of any type.  RDM Server treats one-dimensional character arrays as strings.  Thus, you should always terminate a one-dimensional character array field with a null byte.  The length specified in the field declaration should include the null byte.  If you need a one-dimensional character array that is not to be treated as a string (for example, a byte array), declare it as a two-dimensional array for which the second dimension is 1. The same rule applies if you are using Unicode strings, except that the termination character is the 16-bit zero value.

Byte array fields are implemented as two-dimensional character arrays for which the second dimension is 1.  This declaration forces RDM Server to manipulate all of the bytes of the field (16 in the following definition) rather than stopping at the first null byte, as it does with string fields.

char byte_array[16][1];

In the next example, the coordinates field is defined as an array of three integer variables.  It could be used in a key field to locate an object on a large three-dimensional grid.

int32 coordinates[3];
DB_ADDR Data Type

A database address in RDM Server is the location in the data file where a particular record occurrence is stored.  The database address for a record does not change during the record's lifetime. You can declare a data field of type DB_ADDR, which signifies that the field contains a database address to some other record occurrence.  Use of a DB_ADDR field provides unlimited data organization possibilities. You can directly access database address fields to maintain your own record links using the runtime API currency access functions, described in Currency Access Functions.

You can use a database address field in a record instead of a set to establish a one-to-one relationship. For example, the following record declaration sets up a relationship using a database address field for a person's spouse.  (A person has at most one spouse.)

record person {
 ...
 db_addr  spouse;  /* database address of another person, or NULL_DBA */
 ...
}
BLOB_ID Data Type

A BLOB ID field is a data field declared with type blob_id.  A BLOB (binary large object) contains a large number of bytes, such as document text, graphics, or pictures. RDM Server stores the BLOB data as a linked list of pages in a separate BLOB file (see Declare a BLOB File). The blob_id field is a 32-bit integer that contains the number of the page in the BLOB file where the BLOB data is stored.  A value of 0 indicates a null BLOB ID.

blob_id  pictograph;

  A BLOB ID field cannot be contained in a key.

Generate Auto-Increment Values for Integer Data Fields

The d_recseqno function can be used to acquire a monotonically increasing, non-negative 32-bit integer value associated with the file that contains the specified record type. The function prototype is given below.

int16 d_recseqno(
 int16   rec,         /* record type */
 uint32 *pSeqno,      /* ptr to sequence number variable */
 RDM_DB  hDb);       /* database handle */

Each time d_recseqno is called, a new value is generated; this value is at least one greater than the value returned from the prior call for the specified record type, rec. The next value is stored in page zero of each data file, along with the last slot and deletes chain values.  If a file contains more than one record type, then all of those record types will share the same sequence number. For example, in the schema below, both rec1 and rec2 are contained in the same data file.

database autoincr {
 data file "autoincr.000" contains rec1, rec2;

 record rec1 {
   int32 f1;
 }
 record rec2 {
   int32 f2;
 }
}

The following code fragment shows the sequence of values that will be returned from successive calls to d_recseqno.

#include "autoincr.h"

uint32 lv;
RDM_DB hDb;

d_open("autoincr", "x", hSess, &hDb);

d_recseqno(REC1, &lv, hDb);  /* lv = 1 */
d_fillnew(REC1, &lv, hDb);

d_recseqno(REC1, &lv, hDb);  /* lv = 2 */
d_fillnew(REC1, &lv, hDb);

d_recseqno(REC2, &lv, hDb);  /* lv = 3 */
d_fillnew(REC2, &lv, hDb);

d_recseqno(REC1, &lv, hDb);  /* lv = 4 */
d_fillnew(REC1, &lv, hDb);

d_recseqno(REC2, &lv, hDb);  /* lv = 5 */
d_fillnew(REC2, &lv, hDb);

d_recseqno(REC1, &lv, hDb);  /* lv = 6 */
d_fillnew(REC1, &lv, hDb);

d_recseqno(REC2, &lv, hDb);  /* lv = 7 */
d_fillnew(REC2, &lv, hDb);

Thus, rec1 will contain occurrences with values 1, 2, 4, and 6, and rec2 will contain occurrences with values 3, 5, and 7.

6.2.4.3 Declare a Field as a Structure

If desired, you can declare a data field as a structure, using the struct statement.  Note that a structure field cannot be contained in a key (see Declare a Key).

In the following example, the field declaration defines a structure that stores an array of three complex numbers, each composed of an imaginary part and a real part.

struct {
 double imag;
 double real;
} complex[3];

6.2.5 Create Key Fields for a Core Database 

6.2.5.1 What is a Key?

A key field is a data field used for rapid or sorted access to records.  Key values are stored in a B-tree index in a separate key file. A key can be composed of any combination of data fields from a given record.  You can use a key in your application database to sort on the specified data fields in either ascending (the default) or descending order. Examples of key fields in the tims example schema are id_code in the info record and friend in the borrower record.

SQL indexes are implemented as keys.

Types of keys include unique, non-unique, and optional.  Your application uses a key to locate individual record occurrences directly and to read records in sorted key order.

Your application can perform key scans to produce sorted lists of records and to implement fast searches of a large number of record occurrences (for example, retrieving all checks entered between two dates).  When the application performs a key scan, it reads the keys in sorted order according to the key's base data type.

A compound key is a key that contains more than one data field.

In general, a compound key is used when no single field uniquely identifies a record occurrence, or when records must be sorted or searched on more than one field.

6.2.5.2 Declare a Key

For each key that you want to declare in your schema, use a key statement as shown in the syntax below.  Remember that key declarations must follow all other field declarations for a record.

single_field_key :
            [optional | unique] key fldname [ asc[ending] | desc[ending] ] ;

compound_field_key:  [optional | unique] key keyname '{'
             fldname [ asc[ending] | desc[ending] ];
             ...
 '}'

The first form is used to identify one of the data fields as a key.  Here the fldname must identify a previously declared data field in the record.

In the second form, the name of the key field is keyname.  Each of the fields that comprise the key (from the fldname specifications) identify data fields from the record.  Note that a key cannot contain a structure field.

The order specified for the fields of a compound key determines the major and minor sort sequences.  If you do not specify a sort sequence, the default order is ascending.

You can designate a key field as optional or unique.  If you specify the optional attribute with the key statement, RDM Server stores the key only when your application calls d_keystore.  Otherwise, RDM Server creates the key when it creates the containing record.

When you compile a schema containing key fields, ddlproc creates in the dbname.h file a structure declaration for each key field. Your application can use these structures in conjunction with the key manipulation functions in the runtime API.  For a list, see RDM Server Description. Also included in the dbname.h file are #define constants for each data and key field, which are used as arguments to the Core API functions that access and manipulate data and key fields.  The names of these constants are the uppercase form of the field and key names, prefixed with the record name and an underscore.  For example, the field constant for the name field in the author record is AUTHOR_NAME.

Let's look at a key example.  Suppose that the application is to search a record based on the contents of two fields, where search values for either field are not always available.  If you use a single compound key and a search value for the first field of the key is not available, the application must scan all keys. The use of two compound keys solves the problem.

record combo_search {
 int16 field1;
 int16 field2;
 key f1_1st {
   field1; field2;
 }
 key f2_1st {
   field2; field1;
 }
}

If only field1 data is available for the search, the application scans key f1_1st.  If only field2 data is supplied, the application scans key f2_1st.  If both are supplied, f1_1st is scanned.

Unique Keys

A unique field is a field that cannot have a duplicate value in the database.  When declaring a record in your schema, you can use the unique attribute to define a key field as a unique key. The following kinds of data fields are often defined as a unique key:

  budget code
  inventory item number
  serial number
  employee number
  personal name
  social security number
  vehicle identification number

Optional Keys

Normally, keys are created when a record is created, involving overhead of up to three or four disk I/O operations per key.  If you don't want to have the key stored every time a record is created or updated, add the optional attribute to the key declaration. An optional key is created (that is, inserted in the key file) only when specifically requested by the application in a call to the d_keystore runtime API function.

Your application also can use an optional key when you wish to defer key creation to a time other than when the record is first stored.  Deferring key creation can greatly improve the number of record updates per time period (the transaction rate). Usually, a separate program is written to create the keys during off-peak system load times.

When your application modifies a record containing an optional key (through a d_recwrite or d_crwrite call), the key file is updated only if the current key value exists in the file.  The application can delete an optional key by calling function d_keydel.

  Optional keys cannot be unique.

6.2.6 Create Sets for a Core Database

6.2.6.1 About Sets

A set declaration defines an explicit, one-to-many relationship between record types. A set is implemented as a doubly linked list of member record instances connected to a single owner record instance (which serves as the root or head of the list). When the application creates a database record, it can optionally connect the record to any of the sets in which it has been designated as a member. Each set has a record designated as the owner and one or more member records. One set instance consists of one owner and all of its members. See Network Database Model for an illustration of the use of sets in the network database model.

  An SQL database uses the create join statement to represent a set.  See "Create Joins for an SQL Database" for more information.

6.2.6.2 Declare a Set

The syntax for the set statement is as follows.

set setname '{'
            order {first | last | next | sorted};
            owner recname;
            member recname [by fldname [asc[ending] | desc[ending]] ...];
            ...
'}'

The setname is the identifier for the set that you pass to the set runtime API functions (for example, d_findfm or d_setor). The order clause specifies the order in which member records are inserted into the set.  Connecting member records to sets of order first, last, or next is faster than connecting to sorted sets.  The list of member records in a sorted set must be scanned each time a new member is connected, to find the proper insertion point. Thus, connection is relatively slow for a sorted set with many members.  If you need a large sorted set, explore alternative approaches that use keys to maintain the ordering.

After the order clause in a set declaration comes the owner statement, which defines the set owner.

The member clause specifies a member record for the set. When the set order is sorted, the by clause must be included to indicate one or more field names (fldname) by which to order the member records. For sorted sets having more than one member record type, the sort fields of those record types must have the same type and length, and must be listed in the same order in each member's by clause.

For each set in which a record type participates as owner, RDM Server stores extra information in each record occurrence: a count of the member records and the database addresses of the first and last members of the set occurrence. This amounts to 16 bytes of overhead.

For each set in which a record type participates as member, RDM Server stores in each record instance the database addresses of the owner, the previous member, and the next member (recall that a set is implemented as a doubly linked list). This amounts to a total of 18 bytes of overhead.

To save on overhead, your application can implement a one-to-one relationship using a database address field, as described in DB_ADDR Data Type.

6.2.6.3 Guidelines for Creating Sets

When creating sets, keep the following guidelines in mind:

6.2.6.4 Declare a Many-to-Many Relationship

A many-to-many relationship is one in which an occurrence of either of the two participating record types can be related to more than one occurrence of the other record type.

Let's look at a classic example of the use of a many-to-many data relationship.  Each class has many students and each student takes many classes. Thus, there is a many-to-many relationship between classes and students. Your first impulse might be to prepare a schema like the one below to implement this relationship.

record student {
 int32 id_number;
 char  name[25];
 ...  /* other student data */
 unique key id_number;
}
record class{
 char course_number[16];
 char prof[25];
 int16 room;
 ...  /* other class data */
}
set student_class {
 order last;
 owner student;
 member class;
}
set class_student {
 order last;
 owner class;
 member student;
}

This schema is incorrect, however.  As described in the last section, pointers are reserved in each record slot to hold owner or member pointers.  In the example above, two buckets of pointers are reserved for each record: one for the record's role as set owner and one for its role as set member.  Now assume an instance of the student record type for John Smith, and two instances of the class record type, Algebra 101 and Biology 101. If we connect John Smith to Algebra 101, RDM Server fills John Smith's member pointer bucket with pointers to the owner and to the previous and next members in the set.  However, a subsequent attempt to connect John Smith to Biology 101 (to indicate that the student is also enrolled in this class) fails, because the one and only member pointer bucket is already used for the connection to Algebra 101.

To solve the problem, we introduce an intersect record called student_class, as shown below. An occurrence of this record exists for every instance of a relationship between a student and a class, regardless of which is the owner and which is the member.

record student {
 ...
}
record class {
 ...
}
record student_class {
 ...
}

set student_takes_classes {
 order last;
 owner student;
 member student_class;
}
set class_has_students {
 order last;
 owner class;
 member student_class;
}

Each relationship between student and class now has its own bucket of pointers, so that we never run out of buckets. At this point, we can create the relationship between John Smith and Algebra 101 by creating a student_class record, connecting it as member of set student_takes_classes to student John Smith, and then connecting it as member of set class_has_students to class Algebra 101. After this, we can establish the relationship between John Smith and Biology 101 in the same way.  Using this method, we do not run out of pointer buckets because each relationship has its own bucket.

We referred to the student_class record as an intersect record.  An intersect record sometimes is used only to form the many-to-many connection between two records.  In this case, it will not have any data fields.  (You can declare a record that contains no data fields.) But often the intersect record itself needs to contain useful data. For example, the student_class record might be declared as follows.

record student_class{
 int16  days_absent;
 float  grade;
}

The many-to-many relationship described here is specific to a network database model.  However, the concept of an intersect record also applies to relational databases.  The difference is that the relational intersect record uses a foreign key for each owner/member record.  The foreign keys play the same role as owner/member pointers in the network model.
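As a sketch, the relational counterpart of the intersect record could be declared with foreign keys like this (column names are borrowed from the earlier examples; the exact declarations are hypothetical):

```sql
create table student_class
(
 id_number integer not null references student
     "the enrolled student",
 course_number char(16) not null references class
     "the class the student takes",
 days_absent smallint,
 grade float
);
```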

6.2.6.5 Set Examples

The tims example schema, as described in Understanding the Core Database Schema, defines six sets, which relate the records as shown in Figure 6-6. In this figure, note that records are shown as boxes.  Sets are depicted as arrows, with owner records at the beginning of the arrows and member records at the arrow heads.

As you can see from Figure 6-6, the has_published set links the info member record to the author record as owner.  Set loaned_books connects the borrower member record to the info record as owner.


Figure 6-6. Relationship of Sets in the tims Database

The set named article_list has an info record participating as both owner and member of the same set (legal in RDM Server).  This set is intended to connect article information records to the information record of the journal or magazine in which the article is published.

The abstract set, with info as owner and text as member, forms a one-to-many relationship between info and each line of abstract text. An abstract is text of arbitrary length describing a book or article.  Since the abstract's size is unknown, each line of the paragraph is stored separately and attached to the info record.  Because a set can have many members, an abstract of any number of lines can be stored.

Note that the key_to_info set is used to find the info records corresponding to a particular keyword.  The info_to_key set is used to find the key words associated with a given info record.  The relationship between the key_word record and the info record is many-to-many.

Now look at an example not included in tims.  The comment set illustrated below shows you how to use a multiple-member set to reduce unnecessary overhead space. Each project, task, and work_day record in a project management database can have an optional comment associated with it in the note record.  In this example, the note record is associated with only a single set. Using a single set is preferable to using three separate sets because it requires only one set pointer instead of three.

set comment {
 order first;
 owner note;
 member project;
 member task;
 member work_day;
}

6.2.7 Declare Database Files for a Core Database 

6.2.7.1 About Database Files

An RDM Server database consists of three types of files: data files, key files, and BLOB files.

The maximum number of files in an RDM Server database is 32,767.  This includes all files defined with data file, key file, and blob file statements.

6.2.7.2 Declare a Data File

Data files store the occurrences of the record types contained in them.  You define the data files using the data file statement as follows.

data file [ '[' pgsize ']' ] [on devname]  [fileid =] "filename"
           contains recname[, recname]...  ;

Each data file consists of contiguous fixed-length pages, with each page containing a specific number of fixed-length record slots.  The page is the smallest I/O unit used by RDM Server. The pgsize parameter specifies the size of the file's page in bytes.  If unspecified, the page size for the file is the default database page size (specified in the database statement or 1024 if not specified). For best performance, you should always define a page size that is a multiple of the basic block size for your host server platform (most systems have a basic block size of 512).

  See Guidelines for Creating Files for information concerning best performance, the File Structure Report containing the physical characteristics of a data file, and for advice on creating and placing files.

The on clause can be used to specify the RDM Server device where the file will be stored.  If not specified, the file will be stored on the default device for the database.

RDM Server stores control information, as well as normal record contents, in the record slots of the file.  The size of the slots in a file is based on the size of the largest record contained in that file.  Smaller record types, of necessity, produce unused slot space. Physical characteristics of a data file are reported in the File Structure Report.

Place a data file statement in the schema for each data file you wish to include in the database.  Note that each record type you declare must be contained in a data file.

When specifying the file name (filename parameter), remember that the name does not include path information.  The name you specify will be the base name of the file and can be up to 47 characters long. You identify the location of your files using RDM Server database devices when you register the database, as described in RDM Server Installation / Administration Guide.

You can also optionally provide a file identifier for the data file (fileid parameter).  Specify a file identifier if you need to dynamically initialize individual files with the s_dbInitfile function in the administration API.

When identifying the records contained in the data file (contains clause), note that you must store occurrences of a given record type in a single file.  You can use one file to contain all record types, or you can use a separate file for each record type.
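A couple of hypothetical data file declarations illustrating these options (the device, record, file, and identifier names are made up):

```
/* 2048-byte pages on device "bookdev", with a file identifier
   for later use with s_dbInitfile */
data file [2048] on bookdev bookdata = "books.d01" contains author, info;

/* default page size and device */
data file "books.d02" contains borrower;
```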

6.2.7.3 Declare a Key File

A key file contains the B-tree index for one or more key fields declared in your database schema.  Key files are declared using the key file statement.

key file [ '[' pgsize ']' ] [on devname] [fileid =] "filename"
            contains [recname.]fldname[, [recname.]fldname]...;

Each key file consists of contiguous pages, with each page containing a specific number of fixed-length key slots.  RDM Server stores control information, as well as key contents, in these slots. The size of the slots in a file is based on the size of the largest key field contained in that file.  Smaller key fields, of necessity, have unused slot space when contained in the same file with larger key fields. For information about the physical characteristics of a key file, see Guidelines for Creating Files below.

The on clause can be used to specify the RDM Server device where the file will be stored.  If not specified, the file will be stored on the default device for the database.

Use a key file statement in your schema to specify the name and page size of each required key file.  Each key field that you define for the database must be contained in a key file. The pgsize parameter specifies the size of the file's page in bytes.  If unspecified, the page size for the file is the default database page size (specified in the database statement, or 1024 if not specified). For best performance, you should always define a page size that is a multiple of the basic block size for your host server platform (most systems have a basic block size of 512).

For duplicate key field names, each reference to the key field name in the key file statement must have a prefix showing the name of the record that contains the field.  The syntax is recname.fldname.
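Two hypothetical key file declarations follow (names are made up). In the second statement, a field name that occurs in more than one record is qualified with its record name:

```
key file [1024] "books.k01" contains author.name;

/* id_code is declared in both author and borrower, so it is qualified */
key file "books.k02" contains author.id_code, borrower.id_code;
```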

6.2.7.4 Declare a BLOB File

A BLOB file is used to contain BLOB data.  A BLOB is stored as a linked list of pages, with the start of each page containing header information. BLOB files are declared using the blob file statement as shown below.

blob file [ '[' pgsize ']' ] [on devname] [fileid =] "filename"
            contains [recname.]fldname[, [recname.]fldname]...;

Each of the specified fldname entries must identify a data field declared as type blob_id.  For information about use of a BLOB, see "Working with BLOBs".

The on clause can be used to specify the RDM Server device where the file will be stored.  If not specified, the file will be stored on the default device for the database.

6.2.7.5 Guidelines for Creating Files

When creating files, keep the following guidelines in mind:

6.2.8 Using Core Database Instances  

One principal use for database instancing is a situation in which mutually exclusive data sets have differing archiving requirements. For example, retrieving and then deleting from a database all information related to a particular account or client record can be tedious to program and expensive to process. However, if each client's or account's data is placed in a separate database instance, it is easy both to archive the data (simply copy the database files) and to delete it (simply reinitialize the database or delete it altogether).

Time oriented applications also can benefit from database instancing.  Consider the example of a company that uses a separate instance for each day of the current year.  In this setup, each day's transactions can simply be stored in the instance for that day.

Instancing is also useful in some replication applications.  As an example, assume a large corporation has a mainframe computer that stores all accounts from all its branch offices.  Each branch office performs a daily download of new and modified accounts into a separate database instance for each account. The office can then simply reinitialize an instance before receiving a modified account's new information, or create a new instance for a new account.

Database instancing requires that the database definition be considered as distinct from the database itself, since there can be more than one instance of a schema and each instance has a different name. The original instance has the same name as the schema; subsequent instances have different names. Once a database instance has been created, it can be used in exactly the same manner as any database.

For a Core database, database instances are created by calling the s_dbClone administrator function.  The s_dbClone function prototype is shown below.

int16 s_dbClone(
 char     *sourcedb,  /* in: name of source database */
 char     *destdb,    /* in: name of new database instance */
 char     *devname,   /* in: device where instance is stored */
 int16     copy_flag, /* in: if TRUE then copy source db's files */
 RDM_SESS  hSess)     /* in: session handle */

This function can only be called from an administrator user's session.  The name of an existing database (or instance) is sourcedb. The name of the new database instance is destdb.

The files associated with the destination database are contained in the database device named devname.  Because the names of the database files are the same, each instance of the same schema must be contained in a separate database device. All files associated with destdb will be stored in the devname device.  If you need particular files located in different devices, you must change them after the new database instance is created. You can use the s_dbModify function to change file devices, or you can use the RDM Server Administrator or rdsadm utility.

The data and key files from the source database are copied into the new database when copy_flag is set to 1. If copy_flag is zero, the destination database files are automatically initialized.  The files can be copied only when the source database is not opened by other users (that is, s_dbClone will open the database in exclusive access mode). The S_DBINUSE error code is returned if sourcedb cannot be opened.

6.2.9 Using the Core DDL API Functions 

RDM Server provides a set of low-level API functions that can be used to dynamically create, and in some cases modify, a database schema. Although these functions are available for general application programming, most RDM Server applications will have little or no use for them. They are useful mainly if you want a database definition method other than the Core DDL or SQL DDL languages. The Core DDL API functions are listed in the table below.

Table 6-3.  Core DDL API Functions

Function             Description
ddlAbort             Abort the database definition, discarding all changes made during the transaction.
ddlAddMember         Add a member record to a set definition.
ddlAllowNULLValues   Allow a database to contain nulls.
ddlAlterDatabase     Alter an existing database definition.
ddlAlterField        Alter an existing field definition.
ddlAlterRecord       Alter an existing record definition.
ddlBegin             Begin a schema modification transaction.
ddlCreateBCD         Add a BCD field to a record.
ddlCreateBlob        Add a BLOB field to a record.
ddlCreateDatabase    Create a database schema.
ddlCreateField       Add a data field to a record.
ddlCreateFile        Create a data/key/BLOB file.
ddlCreateKey         Create a key.
ddlCreateRecord      Create a record.
ddlCreateSet         Create a set.
ddlCreateStruct      Add a structure field to a record.
ddlDropDatabase      Drop a database.
ddlDropField         Drop a data field from a record.
ddlDropKey           Drop a key.
ddlDropRecord        Drop a record.
ddlEnd               End the database definition and commit the Core DDL modifications.
ddlEndStruct         Finish the definition of a structure field.
ddlGetField          Get field information.
ddlGetFileById       Get the file handle associated with a file id.
ddlGetFileByName     Get the file handle associated with a file name.
ddlGetRecord         Get record information.
ddlQueryField        Get current information about a field.
ddlRenameField       Rename a field.
ddlRenameRecord      Rename a record.
ddlSchema            Compile a string containing a complete Core DDL specification.
ddlSchemaFile        Compile a file containing a complete Core DDL specification.
ddlSetFieldDefault   Set the default value for new fields in existing record instances.
ddlSetFieldNotNULL   Disable the use of nulls in a field.

Core DDL specifications are encapsulated in a special DDL transaction initiated through a call to function ddlBegin.  When this transaction ends, all of the DDL changes are committed and immediately applied.  For new databases, the commit will initialize the database files.

The API is designed to correspond closely to the API needed for processing a particular RDM Server DDL statement.  For example, the arguments to the ddlCreateDatabase function correspond to the information specified in the database statement.

Note that several sequencing rules must be applied when using the dynamic DDL API.  The rules are listed below.

Note that both the ddlproc and sddlp utilities utilize the low-level Core DDL API to create RDM Server databases.

Detailed descriptions for all of these functions can be found in RDM Server Reference Manual.

6.3 Write the Schema for an SQL Database

This section explains how to write an RDM Server SQL database schema.

Although the syntax is different, RDM Server SQL statements are similar in function and basic form to their RDM Server Core counterparts. However, RDM Server SQL terminology is quite different from RDM Server Core terminology. The following chart shows the Core equivalent for each RDM Server SQL term.

Table 6-4.  SQL Versus Core Terminology

SQL Database Component    Core Database Component
Database/schema name      Database name
File                      Data, key, or BLOB file
Table                     Record
Column                    Field
Index                     Key
Join                      Set

As noted earlier in this chapter, SQL uses the terms primary key and foreign key. These terms are not directly related to indexing; instead they refer to a relationship between two tables. A foreign key in one table forms what is essentially a logical link to a primary key in another table.  This relationship is the basis of the relational database model.

6.3.1 Understanding the RDM Server SQL Schema 

For an RDM Server SQL application, a database schema consists of table, index, and join declarations specified by the following RDM Server SQL statements: create database, create file, create table, create index, and create join. All RDM Server SQL statements used in the programming instructions in this section are detailed in RDM Server Language Guide.

The RDM Server SQL examples in this chapter illustrate two inter-related databases, sales and inventory. These databases are used to manage sales data and product inventory, respectively, for a fictitious company that sells computer systems and supplies nationally. The files for these examples, which have the file extension .sql, can be found in the examples directories of both the server and the client installations.

6.3.1.1 The Inventory Example Schema

The example inventory database is defined in the following schema. This database consists of three tables.  The product table contains information about each product, including an identification code, description, pricing, and wholesale cost. The outlet table identifies all company office and warehouse locations. The on_hand table is used to link the other two tables by defining the quantity of a specific product (through prod_id) located at a particular outlet (through loc_id).

/*---------------------------------------------------------------------
SQL DDL for Inventory Database
---------------------------------------------------------------------*/
create database invntory on sqldev;

create table product
(
 prod_id smallint primary key "product identification code",
 prod_desc char(39) not null  "product description",
 price float                  "retail price",
 cost float                   "wholesale cost"
);
create unique index prod_key on product(prod_id);
create index prod_pricing on product(price desc, prod_id);

create table outlet
(
 loc_id char(3) primary key,
 city char(17) not null,
 state char(2) not null,
 region smallint not null "regional U.S.  sales area"
);
create unique index loc_key on outlet(loc_id);
create optional index loc_geo on outlet(state, city);

create table on_hand
(
 loc_id char(3) not null references outlet(loc_id)
      "warehouse containing this product",
 prod_id smallint not null references product
     "id of product at this warehouse",
 quantity integer not null
     "units of product on hand at this warehouse",
 primary key(loc_id, prod_id)
);
create unique index on_hand_key on on_hand(loc_id, prod_id);
create join inventory order last on on_hand(loc_id);
create join distribution order last on on_hand(prod_id);

6.3.1.2 The Sales Example Schema

The example sales database, shown in the following schema, is more complex than the inventory database. The salesperson table contains specific information about a salesperson, including sales ID code, name, commission rate, etc. The customer table contains standard data identifying each customer.  The sale_id column in this table is a foreign key for the salesperson who services the customer account.  Note that sales orders made by a customer are identified through the cust_id foreign key in the sales_order table.

 /*---------------------------------------------------------------------
SQL DDL for Sales Database
---------------------------------------------------------------------*/
create database sales on sqldev;

create file salesd0;
create file salesd1;
create file salesd2;
create file salesd3;
create file salek0;
create file salek1 pagesize 2048;
create file salek2 pagesize 4096;

create table salesperson
(
 sale_id char(3) primary key,
 sale_name char(30) not null,
 dob date "date of birth",
 commission decimal(4,3) check(commission between 0.0 and 0.15)
     "salesperson's commission rate",
 region smallint check(region in (0,1,2,3))
     "regional U.S.  sales area",
 office char(3) references invntory.outlet(loc_id)
     "location where salesperson works",
 mgr_id char(3) references salesperson
     "salesperson id of sales mgr"
) in salesd0;
create unique index sale_key on salesperson(sale_id) in salek0;
create optional index sales_regions on salesperson(region, office)
   in salek1;
create optional index sales_comms
   on salesperson(commission desc, sale_name) in salek2;
create join manages order last on salesperson(mgr_id);

create table customer
(
 cust_id char(3) primary key,
 company varchar(30) not null,
 contact varchar(30),
 street char(30),
 city char(17),
 state char(2),
 zip char(5),
 sale_id char(3) not null references salesperson
     "salesperson who services customer account"
) in salesd0;
create unique index cust_key on customer(cust_id) in salek0;
create optional index cust_geo on customer(state, city) in salek2;
create join accounts order last on non_virtual customer;

create table sales_order
(
 cust_id char(3) not null references customer
     "customer who placed order",
 ord_num smallint primary key
     "order number",
 ord_date date default today
     "date order placed",
 ord_time time default now
     "time order placed",
 amount double
     "total base amount of order",
 tax real default 0.0
      "state/local taxes if appl.",
 ship_date timestamp default null,
 check(amount >= tax)
) in salesd0;
create unique index order_key on sales_order(ord_num) in salek0;
create index order_ndx on sales_order(ord_date, amount desc, ord_time)
   in salek1;
create index cust_order_key on sales_order(cust_id) in salek0;
create join purchases order last on sales_order;

create table item
(
 ord_num smallint not null references sales_order,
 prod_id smallint not null references invntory.product,
 loc_id  char(3) not null references invntory.outlet,
 quantity integer not null
     "number of units of product ordered",
 check( HaveProduct(ord_num, prod_id, loc_id, quantity) = 1 )
) in salesd1;
create index item_ids on item(prod_id, quantity desc) in salek1;
create join line_items order last on item(ord_num);

create table ship_log
(
 log_num integer default auto primary key,
 ord_date timestamp default now
     "date/time when order was entered",
 ord_num smallint not null
     "order number",
 prod_id smallint not null
     "product id number",
 loc_id char(3)  not null
     "outlet location id",
 quantity integer not null
     "quantity of item to be shipped from loc_id",
 backordered smallint default 0
     "set to 1 when item is backordered",
 check(OKayToShip(ord_num,prod_id,loc_id,quantity,backordered) = 1)
) in salesd0;

create index ship_order_key on ship_log(ord_num, prod_id, loc_id)
   in salek1;

create table note
(
 note_id char(12) not null,
 note_date date not null,
 sale_id char(3) not null references salesperson,
 cust_id char(3) references customer,
 primary key(sale_id, note_id, note_date)
) in salesd2;

create unique index note_key on note(sale_id, note_id, note_date)
   in salek1;
create join tickler order sorted on note(sale_id) by note_date desc;
create join actions order sorted on note(cust_id) by note_date desc;

create table note_line
(
 note_id char(12) not null,
 note_date date not null,
 sale_id char(3) not null,
 txtln char(81) not null,
 foreign key(sale_id, note_id, note_date) references note
) in salesd3;
create join comments order last on note_line;

For the sales database, the item table contains product and quantity data pertaining to the sales order identified through the ord_num foreign key. Notes that serve as a tickler for the salesperson or that indicate actions to be performed for the customer are stored in the note table. Each line of the note text is stored as a row of the note_line table.  An additional table, called ship_log, contains information about sales orders that have been booked but not yet shipped. Your application will create rows in this table through a trigger function, which is a type of user-defined function (UDF) described in Writing and Using Extension Modules.  These relationships are depicted in Figure 6-9.

6.3.2 Create an SQL Database 

The database schema for your RDM Server SQL application starts with a create database statement specifying the name of the database you are creating.  The syntax for this statement is given below.

create {database dbname
        | schema dbname [authorization username]
        | schema authorization dbname}
             [slotsize {4 | 6 | 8}]
             [pagesize pgsize] [on devname]
             [{enable | disable} null values]
             [{enable | disable} references count]

Note that the create database statement must be the first statement in the schema.  The following example shows the use of create database in its simplest form, to create the example sales database.

create database sales;

The pagesize clause specifies the default page size for the database files.  If you use this option, always make the page size a multiple of the basic block size for your operating system; a multiple of 512 works for most systems. If you do not specify a page size with the create database statement, RDM Server uses a default of 1024 bytes. For a complete discussion of the factors governing file and page size, see Guidelines for Creating Files.

As shown in the following example, you can use the on clause with the create database statement to specify the name of the RDM Server database device where you want to store the database files.  Be sure to indicate a database device that exists on the server. If you do not specify a database device, RDM Server stores the database files on the issuing user's database device. For information about creating a database device, see RDM Server Installation / Administration Guide.

create database sales on mis;

When issuing your create database statement, you can use the enable null values or disable null values clause to specify how the database will handle null values. You also can use the enable references count or disable references count clause to specify how the database will handle reference counts. The following example shows how to use these clauses with the create database statement in the schema for the RDM Server SQL system catalog, syscat.

create database syscat on catdev
 disable null values disable references count;

Maintaining reference counts for the tables in a database incurs extra overhead.  With the disable references count clause, you can disable reference counts if they are not necessary for your application.  When the counts are disabled, your application cannot delete rows from a table whose primary key is referenced by a foreign key (that is, a primary key that is not part of a join).  For information about using reference counts, see Declare and Name a Table.

6.3.3 Create Tables for an SQL Database 

In an RDM Server SQL database, tables are the main unit for data storage. For example, the example inventory database includes the outlet, on_hand, and product tables.  The following sections describe factors to keep in mind when you plan the tables for your database, including reducing data redundancy and enforcing referential integrity. 

These sections also provide instructions for naming database tables and information about defining table columns.  They describe how to define table constraints and how to define a table file.

6.3.3.1 Design Tables for a Database

Each SQL database table consists of one or more columns and integrity constraints.  A column corresponds to a data field in a Core database record and each table row corresponds to one record occurrence. The only SQL database column that does not have a counterpart in the definition of a Core database is a foreign key column, for which you specify a create join statement. Information about using the create join statement is provided in Create Joins for an SQL Database.

Figure 6-7 shows a table containing rows for a simple sales order database that keeps track of salespersons and their customers.  The table includes columns for each salesperson's name and commission rate. The customer's company name, city, and state are stored with the data of the salesperson who services the customer's account.  Each salesperson has one or more customer accounts.


Figure 6-7.  RDM Server SQL Database Table

Note that a salesperson's name and commission are repeated in each row that contains one of that salesperson's customers.  Such duplicated data is called redundant data. One of the goals in designing a database is to minimize the amount of redundant data that must be stored.  Your design can minimize redundant data by using a process called normalization to create multiple tables with compact data.

6.3.3.1.1 Using Primary and Foreign Keys to Reduce Redundancy

In Figure 6-8, the table illustrated in Figure 6-7 has been normalized by dividing the data into two tables. The normalized tables eliminate redundant data values for the salesperson's name and commission.  Instead, the data in the tables is linked by using primary and foreign keys to create a relationship between the tables.


Figure 6-8.  SQL Database Table Relationship

In Figure 6-8, the salesperson table includes the sale_id column, which is designated as the primary key.  In the customer table, the sale_id column is designated as a foreign key.  This foreign key references the primary key and creates the table relationship.  It usually (but not necessarily) has the same name as the primary key.

The owner table in a relationship is the table that includes the column designated as the primary key.  A member table contains a column designated as a foreign key.


Figure 6-9.  sales and inventory Database Tables

Figure 6-9 illustrates the table relationships for the example sales and inventory databases. The boxes represent database tables.  Each arrow starts at a table containing a referenced primary key and ends at the table containing its foreign key.  Each arrow label identifies the column name of the foreign key, which usually matches the column name of the primary key.

6.3.3.1.2 Using the References Count to Enforce Referential Integrity

An important concept associated with primary and foreign keys is referential integrity.  ANSI SQL referential integrity checking requires the existence of every table row referenced by each foreign key in the database.  If a database has referential integrity, all primary keys associated with all foreign keys in the database exist. In other words, the foreign keys in the database reference the matching primary key for the owner of a table relationship (join). For example, in the tables shown in Figure 6-9, all the sale_id values in the customer table reference salesperson table rows that have matching sale_id values.

To enforce referential integrity in the processing of an RDM Server SQL delete statement, RDM Server maintains a hidden references count column in each database table that is referenced by a foreign key. Deletions to referenced rows are allowed only when the references count is zero.  You can enable or disable the references count for your application when you create a database.

6.3.3.1.3 Manage Null Column Values

Note that RDM Server SQL also manages null column values through a hidden null value indicator column contained in each table.  You can disable this feature when you create the database if your RDM Server SQL application has no need for null column values. See Define Table Columns below for details.

6.3.3.2 Declare and Name a Table

When you are ready to declare a table by using the create table statement, the first thing to do is to name the table.  Remember that the table name must be distinct from the names of all other tables and views in the database.

The syntax for the create table statement is given below.

create table [dbname.]base_table_name["description"]
            ( column defn [, column defn]...  [, table constraint]...  )
            [in filename]

The name of the table is an identifier given by base_table_name.  The name must be unique within this database.  The "dbname." qualifier is required only when this statement is dynamically submitted by itself, i.e., apart from a prior create database or alter database statement.

You can identify the table name with an associated description string that is stored in the RDM Server SQL system catalog.
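
For example, the following sketch attaches a description string to a table declaration.  The region_info table and its description text are hypothetical and are not part of the example databases:

create table region_info "U.S. sales region descriptions"
(
 region smallint primary key,
 region_name char(20) not null
);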

During database manipulation, you can set up temporary tables using the create temporary table statement.  The use of this statement is described in Manipulating Data.

6.3.3.3 Define Table Columns

A column is the basic unit of information in an SQL database. A table consists of one or more columns.  A column has a name, data type, and may have default values and integrity constraints. The syntax for a column definition is shown in the following syntax specification.

column_defn:
            column_name column_spec ["description"]

column_spec:
        type_spec [default {constant | null}]
              [not null] [unique | primary key]
              [references table_name[(column_name)]
              [check(cond_expr)]

table_name:
           [dbname.]base_table_name

In the create table statement, define each column of the base table (base_table_name parameter) by an identifier (column_name parameter) and a data type. The column name must be distinct from the other columns in this table, but can be used in other tables.  In the schema for the inventory database (see Design Tables for a Database), the product table contains four columns. The prod_id column is a small integer that serves as the primary key of the table. The prod_desc column contains a description of up to 39 characters.  The price and cost columns contain floating-point values that store the retail price and wholesale cost of the product item, respectively.

The data type can be specified as shown in the following type_spec syntax:

type_spec:
            {char | varchar} (length) |
            {wchar | wvarchar} (length) |
            {binary | varbinary} (length) |
            {double [precision] | float} |
            real | bigint | integer | smallint | tinyint | bit |
            {decimal | numeric}[( precision[, scale] )] |
            date | time [(precision)] | timestamp [(precision)] |
            long {varchar | varbinary | wvarchar} |
            rowid | c_data

When you have set up the data type for the column, you can define it further to contain only non-null data or a default value.  The default is automatically stored when RDM Server processes an insert statement that does not specify a value for the column.
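
For example, the following column specification fragment (illustrative only) combines a default value with a not null constraint, so that an insert statement omitting the column stores 0:

 region smallint default 0 not null,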

If you want to designate the column as a unique or primary key, you can use the unique or primary key clause, respectively, to qualify the data type in the column specification.  A unique or primary key definition can include up to eight columns.  Note that a given table can have only one primary key.

You should use the create index statement to index all unique keys and primary keys that do not have the rowid data type.  If you do not index a key, sddlp automatically creates the index for you at compile time.  For information about creating an index, see Create Indexes for an SQL Database.

Note that your column specification can also make the column a foreign key.  To do this, include the references clause to identify the table being referenced.

In the column specification, you can use a check clause to define a check condition that must be true for every value of the column in the database.  This condition can include constants and can call both RDM Server SQL built-in functions and user-defined functions, but can only refer to the single column.  The check condition cannot contain a subquery.
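
For example, a check clause can restrict a column to non-negative values, as in the following illustrative fragment.  Note that the condition may reference only the price column itself:

 price float check(price >= 0.0),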

To illustrate column creation, look at a table example for a typical SQL application that validates the two-character code for a state.  The us_states table shown below defines the primary key as a two-character state abbreviation column called state_abbrev, along with a state_name column that holds the full state name.

create table us_states 
(
 state_abbrev char(2) primary key,
 state_name char(15) not null
);

Some of the rows in the us_states table are shown below.

state_abbrev    state_name
AK              Alaska
AL              Alabama
...
WI              Wisconsin
WY              Wyoming

To validate the state code contained in the state column of the outlet table (example inventory database), we can modify the state column declaration as shown below.  In this example, the references clause identifies that state is a foreign key of us_states.  Any insert or update statement issued for the outlet table will ensure that the corresponding state does indeed exist.

state char(2) not null references us_states,

6.3.3.4 Define Special Data Types for a Column

6.3.3.4.1 rowid Columns

RDM Server SQL provides a special data type called rowid.  A rowid is an integer value, assigned serially by the system, that equals the record number in the database file of a particular row of a table.  You can use rowid columns with primary and foreign keys to provide direct access capabilities and reduce the need for indexes. Thus, your application can use a rowid primary key to find a table row with a single disk access. See the RDM Server Reference Manual for details of the rowid data type.

Some other database management systems refer to rowid keys by other terms, such as "tuple id."
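
For example, the following sketch (the acct table is hypothetical, not part of the example databases) declares a rowid primary key, allowing rows to be fetched directly by record number without an index:

create table acct
(
 acct_num rowid primary key,
 balance double not null
);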

6.3.3.4.2 c_data Columns

You can declare special c_data columns that directly map into arrayed or structured C-language data types.  The c_data type is provided so that RDM Server applications that use the runtime API for data access and modification can utilize RDM Server SQL for queries and reports.

A c_data column can be an array of up to three dimensions.  If the column is a structure, each field can be an array of up to three dimensions. 

Rules governing the use of c_data columns in your RDM Server SQL schema are given in the RDM Server Reference Manual.

An example RDM Server SQL schema containing c_data columns is shown below.

create database cdata 
disable null values
disable references count;

create table cd1
(
 cd1_name char(32),
 cd1_t01  c_data[2][3] float,
 cd1_t02  c_data struct {
    char  v1;
    int16 x1[3];
    int16 y1[3];
    int32 z1;
 },
 cd1_t03  c_data[2] struct {
   float  x2[3];
   double y2[3];
   int32  z2;
   char   descr[3][30];
 },
 cd1_t04  smallint,
 primary key(cd1_name, cd1_t04)
);
create unique index cd1_key on cd1(cd1_name, cd1_t04);

create table cd2
(
 cd2_name char(32),
 cd2_t01  c_data[2][3][2] float,
 cd2_t02  c_data[2][3] uint32
);

create table cd3
(
 cd3_name char(32),
 cd3_t01  c_data[2][3][10] char,
 cd3_t02  c_data struct {
   uint32 iv;
   char   cv[10];
 }
);

The following is the corresponding Core schema, showing a direct mapping from the RDM Server SQL DDL to the Core DDL.

database cdata {
 data file file_cd1 = "cdata.000" contains cd1;
 key  file file_cd1_key = "cdata.001" contains cd1_key;
 data file file_cd2 = "cdata.002" contains cd2;
 data file file_cd3 = "cdata.003" contains cd3;

 record cd1 {
    char    cd1_name[33];
   float   cd1_t01[2][3];
   struct {
     char    v1;
     int16   x1[3];
     int16   y1[3];
     int32   z1;
   } cd1_t02;
   struct {
     float   x2[3];
     double  y2[3];
     int32   z2;
     char    descr[3][30];
   } cd1_t03[2];
   int16   cd1_t04;
   unique key cd1_key {
      cd1_name asc;
      cd1_t04 asc;
   }
 }

 record cd2 {
   char    cd2_name[33];
   float   cd2_t01[2][3][2];
   uint32  cd2_t02[2][3];
 }

 record cd3 {
   char    cd3_name[33];
   char    cd3_t01[2][3][10];
   struct {
     uint32 iv;
     char   cv[10];
   } cd3_t02;
 }
}

See RDM Server Reference Manual for more details on the use of the c_data type.

6.3.3.5 Specify Automatic Increment Integer Columns

You can specify automatic assignment of a monotonically increasing (increases by at least one), non-negative integer value to an integer column.  To do this, specify auto in the default clause of the create table statement's column definition section.

default {constant | null | auto}

For example, the log_num column of the ship_log table is declared with default auto in the following create table statement.

create table ship_log
(
 LOG_NUM integer default auto primary key,
 ORD_DATE timestamp default now
     "date/time when order was entered",
 ORD_NUM smallint not null
     "order number",
 PROD_ID smallint not null
     "product id number",
 LOC_ID char(3)  not null
     "outlet location id",
 QUANTITY integer not null
     "quantity of item to be shipped from loc_id",
 BACKORDERED smallint default 0
     "set to 1 when item is backordered",
 check(OKayToShip(ord_num, prod_id, loc_id, quantity, backordered) = 1)
);

When executing an insert statement, SQL automatically generates a value for log_num if no value has been specified.  For example, in the insert statement below, SQL supplies the value for the log_num column.

insert into ship_log
 values(, date "1998-07-02", 3710, 17419, "SEA", 1, 0);

However, if you supply a value, then that value will be stored.  In the example below, the log_num value stored will be 12.

insert into ship_log
 values(12, date "1998-07-02", 3710, 17419, "SEA", 1, 0);

You should have little reason to assign your own values.  But if you do, be sure to assign a value lower than the most recently auto-generated value.

The automatically generated integer values do not necessarily increase by exactly 1 each time.  If a table's rows are stored in a file that also contains rows from other tables, the next number might exceed the current number by more than 1.

Values from deleted rows are not reused.

The use of auto-increment default values does not incur any additional performance cost.  RDM Server has implemented them as part of the standard file header, which uses special high-performance logging and recovery mechanisms.

6.3.3.6 Define Table Constraints

If you are defining table constraints (the table_constraint parameter) for a table, the associated statements follow the column definitions. The clauses used to declare table constraints mirror the clauses used in the column specification, except that they apply to all the listed columns instead of just one column.  You can use the unique or primary key clause, the foreign key and references clauses, or the check clause. For information about defining table columns, see Define Table Columns.

The following table definition from the example sales database shows a primary key table constraint after the column definitions. Note that you must use a table constraint to declare multiple column primary or foreign keys.

create table note 
(
 note_id char(12) not null,
 note_date date not null,
 sale_id char(3) not null references salesperson,
 cust_id char(3) references customer,
 primary key(sale_id, note_id, note_date)
) in salesd2;

6.3.3.7 Define a Table File

After the column declarations and table constraints, which are enclosed in parentheses, you can finish the table declaration with an in clause, as shown in the example in Define Table Constraints above. Use an in clause to specify the name of the file where the table rows are stored.  If you do not specify a file to contain the table, RDM Server will store it in a separate file with a name derived from the table name and with the default page size. For information about page size, see Guidelines for Creating Files.

6.3.4 Create Indexes for an SQL Database 

6.3.4.1 What is an Index?

An index is a separate file containing a sorted list of the values from one or more columns in a table.  The index can be used to quickly retrieve a row from the table, based on specific values of the indexed columns.  The index can also be used to retrieve the rows of the table in the order of the indexed columns.

You do not use an index directly from SQL; rather, indexes are used by the RDM Server SQL optimizer in selecting an access plan for retrieving data from the database.  More indexes provide the optimizer with more alternatives and can greatly improve select execution performance. Unfortunately, a large number of indexes costs a large amount of storage and slows the performance of insert, update, and delete statements.

Therefore, your selection of table columns to include in an index requires careful consideration.  In general, create an index on the columns through which the table's rows typically will be accessed or sorted.  Do not create an index for every possible sort or query that may be of interest to a user. SQL can sort when the select statement is processed, so it is unnecessary to create all indexes in advance.  Create indexes on the columns you expect will be used most often in order to speed access to the rows or to order the result rows.

During database manipulation, you can set up indexes for temporary tables using the create index statement.  The use of this statement for temporary tables is described in Manipulating Data.

6.3.4.2 Declare an Index

Include a create index statement in your schema for each index you wish to declare.  The syntax for create index is shown in the following syntax specification:

create [optional | unique] index [dbname.]index_name
            on base_table_name( column [asc|desc][, column [asc|desc]]...)
            [in filename]

Each index declared in the database has a unique index_name identifier.  Use the unique attribute with the create index statement if you want to create an index that cannot contain any duplicate values.

The "dbname." qualifier is required only when this statement is dynamically submitted by itself, i.e., apart from a prior create database or alter database statement.

Use the in clause to identify the file that contains the index.  If not specified, the index will be maintained in a separate file using the default page size (1024 bytes).

All unique and primary key columns (except those of the rowid data type) should be indexed.  If you do not specify a create index for a unique or primary key, sddlp will automatically create one for you.

To define an index that can be deactivated, you can include the optional attribute.  When an optional index is deactivated, SQL does not use the index.  This means that insert statements will not store key values and the optimizer will not use the index in determining a query's access plan.  When the optional index is active, it operates just like any other index.

To enable or disable use of an optional index, use the activate index and deactivate index statements.  Initially, an optional index is deactivated.
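
For example, the loc_geo optional index declared in the example below could be switched on and off with statements of the following form.  This is a syntax sketch based on the statement names above; you might deactivate the index before a bulk load and activate it afterward:

deactivate index loc_geo;
activate index loc_geo;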

In the following index example, the outlet table in our inventory database has two indexes.  The loc_key index is the index for the primary key and loc_geo is an optional index.

create table outlet 
(
 loc_id char(3) primary key,
 city char(17) not null,
 state char(2) not null,
 region smallint not null
     "regional U.S.  sales area"
);
create unique index loc_key on outlet(loc_id);
create optional index loc_geo on outlet(state, city);

6.3.5 Create Joins for an SQL Database 

6.3.5.1 About Joins

Using a create join statement, you can declare predefined joins that RDM Server SQL will automatically maintain on each insert, update, and delete statement issued by your application. These predefined joins are implemented by using the RDM Server Core sets described in Create Sets for a Core Database. With this implementation, queries that include a join (that is, an equi-join) between the prejoined tables use direct access methods to retrieve the related rows.  This results in optimal join performance.  However, like an index, a join is used implicitly by the RDM Server SQL optimizer in optimizing data access. This means that no RDM Server SQL data manipulation statement refers directly to the join.

A predefined join provides direct access from the primary key's row to all referencing foreign key rows, as well as from the foreign key rows to the referenced primary key row.  Thus, bi-directional direct access is available without the necessity of an index on the foreign key.  This bi-directional access also provides efficient outer-join processing.

Suppose that the salesperson table illustrated in Figure 6-8 contains rows for newly hired salespersons who do not yet have any customers.  An "inner join" results in a virtual table that includes only the salespersons who have at least one customer (new hires are excluded). In this case, Figure 6-8 corresponds to an inner natural join of the salesperson and customer tables.  An "outer join" created for these tables results in a virtual table that includes all salespersons and their customers.  New hires appear in the table with empty (or null) customer column values, as illustrated in Figure 6-10.


Figure 6-10.  Example of Outer Join Result

Access from the table row containing a foreign key to another table row containing the corresponding primary key entry is always available through the primary key index.  You can simply index the foreign key column to allow quick access to the foreign key row from the primary key table.  However, doing so can use a large amount of disk storage because many foreign keys can be associated with a single primary key.  If you create a join instead, without indexing the foreign key, RDM Server uses direct access methods to form the table relationship.  This strategy results in better performance and saves considerable disk storage.

The foreign key columns used in a create join are virtual columns (that is, columns for which RDM Server does not store the data values).  The application can access a value in a virtual column, just as it does any column.  For virtual columns, RDM Server automatically extracts the data value from the primary key column of the referenced row by using a pointer to that row.  The pointer is associated with the predefined join and maintained by RDM Server.

Since values in a foreign key column come from the corresponding primary key column of the referenced table, no redundant data is required.  However, if an index uses one of the foreign key columns or if you have specified the non_virtual attribute in your create join, the foreign key column values will be stored. In this case, redundant data is maintained in the referencing (foreign key) table.

When all foreign keys that reference a particular primary key are virtual, RDM Server allows the primary key to be modified, even if there are still active foreign keys that reference it.  This is the only case where RDM Server allows a primary key column to be modified with references still active.  Thus, changing the primary key value will instantly change it in all the foreign key rows that reference it.

Using a join in your schema guarantees that only a single logical disk access is necessary to retrieve a row in the referenced table. Thus, performance is optimal for referential integrity checking, for select statement processing, and for locating all rows of the tables with a particular foreign key value.  In addition, the database can use either one-to-many or many-to-one data retrieval.

As with indexes, you should take care in deciding what foreign keys to use in predefined joins.  Since a join is implemented by using database address links stored in each row, RDM Server must use resources to maintain the links during execution of database modification calls. Therefore, you should only use a join for situations in which the application needs to access tables by going from primary key to foreign key, as well as from foreign key to primary key. When the access direction will only be from the foreign key to the table containing the primary key, simply using the primary key index usually achieves acceptable performance. For example, it is not necessary to create a predefined join for the us_states table shown in the example in Define Table Columns.

6.3.5.2 Declaring a Join

The syntax for create join is shown in the following example:

create join join_name order {first | last | sorted}
            on foreign_key [and foreign_key]...

The following example shows how the foreign_key is specified:

[virtual | non_virtual] base_table_name[( colname[, colname]...)]
            [by colname [asc | desc] [, colname [asc | desc]]...]

The specified columns for each base table must already be defined as a foreign key (through either the references or foreign key clause) in the base table's create table. If no columns are listed, then the base table must have a single foreign key or references clause. If more than one base table is specified, each must reference the same table.  Joins cannot be created between tables in different databases.

When you define a join as order sorted, use the by clause with either asc or desc to describe the sort order for each column as ascending or descending, respectively.  Sort orders of mixed ascending and descending columns can be specified but are not yet fully supported. Currently, ordering of all sort columns is based on the ordering of the first sort field.  If you are specifying more than one table for the join, each of the corresponding sort columns must match in type and length.

The performance of an insert or update operation involving a joined foreign key will degrade when a large number of matching foreign key values exist.  This is because predefined joins are implemented as linked lists that must be scanned to locate the proper insertion place.  The larger the list, the longer the time of the scan.

You can use the order clause to specify a sort order for rows with the same foreign key values.  Order can be first, last, or sorted.  In the following example, the join is ordered "last" to maintain foreign key rows in the chronological order in which they are created.

create table salesperson (
 sale_id char(3) primary key,
 ...
);
create table customer (
 cust_id char(3) primary key,
 sale_id char(3) references salesperson,
 ...
);
create join s_c on customer(sale_id) order last;

If cust_id's 123, 456, 789, 234, 567, and 890 are created and attached to sale_id 001 in the order shown, the select statement below,

select sale_id, cust_id from customer where sale_id = "001";

will retrieve the information as follows.

SALE_ID   CUST_ID
001       123
001       456
001       789
001       234
001       567
001       890

Now change the create join statement above to illustrate the use of the sorted attribute with the order clause.  Use of order sorted allows the sorting of referencing rows by the columns that you specify in the by clause.

create join s_c on customer(sale_id) order sorted by cust_id;

For this join, the same select statement results in the table shown below.

SALE_ID   CUST_ID
001       123
001       234
001       456
001       567
001       789
001       890

The RDM Server SQL optimizer uses the index on the sale_id primary key to locate the salesperson row where sale_id equals 001.  It then retrieves the customer rows that reference it through the predefined join, which returns the customer rows in the same order specified in the create join statement.

6.3.6 Create Files for an SQL Database 

Use the create file (or create tablespace) statement in your schema to define database files.  One file can contain the rows of one or more tables or one or more indexes.  The tables or indexes that will be stored in a given file are determined from the in clause of the create table and create index statements, respectively. Review the syntax for the create file statement below.

create {file | tablespace} filename [pagesize bytes] [on devname]

Note that the create file statement is not required in the schema.  If you do not set up a file, RDM Server stores the contents of each table or index in a separate file with the default page size. RDM Server automatically assigns physical file names from the first eight characters of the database name and a three-digit file extension, beginning with 000 and incrementing by one.

See Guidelines for Creating Files for guidance on choosing a page size and on storing a table or index in a separate file.  That section applies to SQL files exactly as it does to Core files (SQL files map directly into their Core counterparts).

The file name you specify for the filename parameter of the create file statement can have a maximum of 32 characters.  The devname is the name of the RDM Server database device where the file will be stored.  If not specified, the file will be stored in the database device specified in the on clause of the create database statement.

Use the pagesize clause with the create file statement to specify the size, in bytes, of each page of the file.
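
For example, the salesd2 file referenced by the in clause of the note table (see Define Table Constraints) might be declared as follows.  The page size and device name shown here are illustrative only:

create file salesd2 pagesize 2048 on mis;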

The file count includes all files for which a create file statement is issued, plus all tables and indexes for which an in clause has not been specified.  In general, it is best to use as few files as possible in your database. By using fewer files, you reduce the amount of file opening and closing that the server needs to do, improving performance.  To minimize storage waste, place only items of similar size in the same file.

6.3.7 Create Triggers on an SQL Database

A trigger is a procedure associated with a table that is executed (i.e., fired) whenever that table is modified by the execution of an insert, update, or delete statement. A non-standard trigger mechanism has long been available in RDM Server SQL through a user-defined function that is called via a check condition specified in the create table statement. The SQL standard now provides the ability to specify triggers using SQL statements. This section describes how standard SQL triggers are implemented in RDM Server SQL.

6.3.7.1 Trigger Specification

The create trigger statement is used to create a trigger on a specified database table. The syntax for this statement is given below.

create trigger trigger_name ["description"]
        {before | after} trigger_event on tablename
        [referencing transition...]
        [for each {row [when (search_condition)] | statement}]
        trigger_stmts

trigger_event:
        insert | delete | update [of colname[, colname]...]

transition:
        {old | new} [row] [as] corname

trigger_stmts:
        sql_stmt | begin [atomic] sql_stmt... end

sql_stmt:
        a dynamically executable SQL statement (with some restrictions, see below)

tablename:
        [dbname.]identifier

The trigger_name is the unique name of the trigger and must conform to a standard identifier. The tablename is the name of the base table with which the trigger is to be associated. An optional string containing a description or comment about the trigger can be specified. This string is stored along with the trigger definition in the system catalog.

The trigger is defined to be fired either before or after the changes are made by the insert, update, or delete as specified in the trigger_event. The firing of an update trigger can be restricted to occur only when the values of the column names specified in the update of clause are updated. If no columns are specified, then an update trigger will be fired upon the execution of every update statement on tablename.

Two types of triggers can be created. A statement-level trigger is created by specifying for each statement in the trigger declaration. If no for each clause is specified, for each statement is the default. A statement-level trigger fires once for each execution of the insert, update, or delete statement as indicated by the trigger_event clause. Thus, for example, an update statement that modifies 100 rows in a table will execute a statement-level update trigger on the table only once.

A row-level trigger is created by specifying for each row in the trigger declaration. Row-level triggers fire once for each table row that is changed by the insert, update, or delete statement. Row-level triggers are the more useful of the two types in that they can reference the old and/or new column values for each row. The referencing clause is used to specify a correlation name for either the old table row values or the new table row values. This clause can only be specified with row-level triggers. The when clause can be used to specify a condition that must evaluate to true in order for the trigger to fire. Note that the only table values that can be referenced in the when search_condition are those accessed through the referencing old and/or new row correlation names.

The new or old column values of each row can be referenced in the trigger’s SQL statements through the correlation names specified in the referencing clause. However, references to blob type columns (long varchar/varbinary/wvarchar) are not allowed. Note that insert triggers only have new column values, delete triggers only have old column values, while update triggers have both old and new column values.

The SQL statement to be executed when the trigger fires is specified last. If more than one statement is needed, they must be placed within a begin [atomic] and end block. The SQL standard offers no explanation as to why it chose the word "atomic," which normally means that a sequence of statements cannot be interrupted. Since the execution of a trigger can cause other data modifications that themselves have triggers (triggers can be nested), that cannot be the meaning here. RDM Server interprets it to mean that either all of the SQL statements succeed or, if any one fails, the state is restored to its pre-trigger execution condition.

There are some restrictions on the kinds of SQL statements that can be included in a trigger. No select, DDL, or create statements are allowed in a trigger. A trigger cannot create another trigger. A stored procedure cannot create a trigger. Also, since it is necessary that any database modifications made by a trigger be included as part of the user’s transaction, no transaction statements are allowed in a trigger definition. While stored procedures and user-defined procedures can be executed within a trigger, great care must be exercised to ensure that no harmful side effects occur from the execution of these procedures inside a trigger.

A trigger takes effect immediately upon the successful execution of the create trigger statement. Thus, create trigger should be considered more of a DDL than a DML statement, since triggers should normally be created immediately after the DDL statements that define the tables with which they are associated. Triggers created on an existing database may require that the conditions and data relationships they maintain be externally established at trigger creation time. See the "Summary Statistics" section below for an example.

6.3.7.2 Trigger Execution

A trigger that has been defined on a table will be executed based on the {before | after} trigger_event specification. Any changes that are made by the SQL statements specified in a before trigger will remain intact even when the triggering data modification statement fails (e.g., with an integrity violation). The triggered SQL statements defined in an after trigger are only executed when the triggering data modification statement succeeds.

A before statement-level trigger will execute before any changes are made by the associated (triggering) insert, update, or delete statement. An after statement-level trigger will execute after all changes have been successfully made by the associated insert, update, or delete statement.

A before row-level trigger will execute prior to each row modification made by the triggering insert, update, or delete statement. An after row-level trigger executes after each row has been successfully modified by the triggering insert, update, or delete statement. If a when clause has been specified with a row-level trigger, the trigger will only fire on those rows where the evaluation of the when's search_condition returns true.

All changes made by the SQL statement(s) defined by the trigger are included as part of the user’s transaction. Thus, the triggered database modifications will be committed when the user subsequently issues a commit statement or they will be rolled back should the user subsequently execute a rollback statement.

There is no limit to the number of triggers that can be defined on a table. There can even be multiple triggers with the same trigger_event on a table. Multiple triggers are executed sequentially in the order in which they were defined.

The SQL trigger_stmts can themselves make changes to tables on which other triggers have been defined. Thus, trigger execution can be nested.

Note that any rows modified by a trigger remain locked until the user either commits or rolls back the transaction.

Any trigger can be disabled and subsequently re-enabled through use of the alter trigger statement.

alter trigger trigger_name {enable | disable}

The altered trigger status takes effect immediately upon successful execution of the alter trigger statement.
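For example, a trigger could be disabled for the duration of a bulk load and re-enabled afterward. (The InsSalesTot trigger is taken from the examples later in this chapter; note that any totals such a trigger maintains would then have to be re-established manually, as shown in the "Summary Statistics" section.)

```sql
alter trigger InsSalesTot disable;
-- ... bulk-load sales_order rows without firing the trigger ...
alter trigger InsSalesTot enable;
```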

6.3.7.3 Trigger Security

The ability for non-administrator users to create triggers is included in the create database command-level privilege. This can be set by executing the following grant statement.

grant create database to user_id[, user_id]...

The create database privilege can be removed by executing the following revoke statement.

revoke create database from user_id[, user_id]...

A user must either be an administrator or have create database command privilege in order to create, alter or drop triggers.

In addition to having the proper command privilege, a non-administrator user must also have been granted trigger privilege on any table on which the user will be creating triggers. Trigger privileges are set using the following grant statement.

grant trigger on table_name to user_id[, user_id]...

Trigger privilege is required for a user to create, alter, or drop a trigger on the specified table. Trigger privileges can be revoked by issuing the following statement.

revoke trigger on table_name from user_id[, user_id]...

Revoking trigger privileges does not affect any triggers that may have already been created by the specified user.

Triggers execute under the authority of the user who created the trigger and not that of the user who executed the original insert, update, or delete statement that caused the trigger to fire. Thus, the user that issues the create trigger statement must have the proper security privileges on any table that is to be accessed or modified by the trigger’s SQL statements. Later changes to the security settings for the user who created the trigger will not affect the execution of the trigger.
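For example, the following statements give a non-administrator user named kirk (the user created in the audit trail example later in this chapter) everything needed to create triggers on the sales_order table:

```sql
grant create database to kirk;        -- command-level privilege
grant trigger on sales_order to kirk; -- table-level trigger privilege
```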

A trigger can be dropped by executing the drop trigger statement.

drop trigger trigger_name

All triggers that have been defined on a particular table are automatically dropped when the table is dropped.

6.3.7.4 Trigger Examples

The use of triggers in a database system necessarily means that modifications made to tables on which triggers are defined have side effects hidden from the user who issued the original SQL modification statement. Generally, side effects are undesirable in a software system. Yet triggers are an important and useful feature for certain kinds of processing requirements, and the examples in this section illustrate two such uses. Triggers are particularly useful for maintaining certain kinds of statistics, such as usage or summary statistics. They are also very useful for maintaining various kinds of audit trails.

Summary Statistics

The query below returns the sales totals for each customer in the sales database.

set double display(12, "#,#.##");
select cust_id, sum(amount) from sales_order group by 1;
cust_id             sum(amount)
ATL                  113,659.75
BUF                  263,030.36
CHI                  160,224.65
	. . .
SEA                   60,756.36
SFF                  112,345.66
TBB                  104,038.25
WAS                   63,039.90

An alternative approach, which avoids running a query that scans the entire sales_order table each time, can be implemented with triggers. A new column named sales_tot of type double is declared in the customer table. The following three triggers, defined on the sales_order table, keep the related customer's sales total amount up to date.

create trigger InsSalesTot after insert on sales_order
    referencing new row as new_order 
    for each row
    update customer 
 	set sales_tot = sales_tot + new_order.amount
    	where cust_id = new_order.cust_id;
create trigger UpdSalesTot after update of amount on sales_order
    referencing old row as old_order new row as new_order 
    for each row
    update customer 
 	set sales_tot = sales_tot + (new_order.amount - old_order.amount)
       where cust_id = new_order.cust_id;
create trigger DelSalesTot before delete on sales_order
    referencing old row as old_order 
    for each row
    update customer 
 	set sales_tot = sales_tot - old_order.amount
 	where cust_id = old_order.cust_id;

The first trigger, InsSalesTot, executes an update on the customer table after each successful insert on the sales_order table, adding the new sales_order's amount (referenced through the correlation name new_order) to the current value of the customer's sales_tot. The second trigger fires only when an update changes the value of the amount column in the sales_order table; when that occurs, the old amount must be subtracted from the customer's sales_tot and the new amount added in. The DelSalesTot trigger fires whenever a sales_order row is deleted, causing its amount to be subtracted from the customer's sales_tot.

Now suppose that, in addition to each customer, you also want to maintain the sales totals for each salesperson. You can add a sales_tot column of type double to the salesperson table and use a trigger to update it along with the customer sales_tot column. The simplest way to do this is to modify the above triggers to also update the row of the salesperson who manages the account of the customer whose sales_order is being modified, as shown below.

create trigger InsSalesTot after insert on sales_order
    referencing new row as new_order 
    for each row
begin atomic
    update customer 
 	set sales_tot = sales_tot + new_order.amount
    	where cust_id = new_order.cust_id;
    update salesperson 
 	set sales_tot = sales_tot + new_order.amount
    	where sale_id = (select sale_id from customer 
 		where cust_id = new_order.cust_id)
end;
create trigger UpdSalesTot after update of amount on sales_order
    referencing old row as old_order new row as new_order 
    for each row
begin atomic
    update customer 
 	set sales_tot = sales_tot + (new_order.amount - old_order.amount)
       where customer.cust_id = new_order.cust_id;
    update salesperson 
	set sales_tot = sales_tot + (new_order.amount - old_order.amount)
    	where sale_id = (select sale_id from customer 
 		where cust_id = new_order.cust_id)
end;
create trigger DelSalesTot before delete on sales_order
    referencing old row as old_order 
    for each row
begin atomic
    update customer 
 	set sales_tot = sales_tot - old_order.amount
 	where customer.cust_id = old_order.cust_id;
    update salesperson 
	set sales_tot = sales_tot - old_order.amount
    	where sale_id = (select sale_id from customer 
 		where cust_id = old_order.cust_id)
end;

Since each trigger now contains two SQL update statements, the statements must be enclosed between begin atomic and end. Also note that the subquery is needed to locate the salesperson row to be updated, through the customer row, based on the cust_id column in the sales_order table.

The same result can also be achieved not by modifying the original triggers but by introducing one new trigger that updates the salesperson's sales_tot whenever a related customer's sales_tot column is updated. Note that the salesperson sales_tot does not need to be updated when a new customer row is inserted (because the sales_tot is initially zero) or when a customer row is deleted (because the sales_order rows associated with the customer must first be deleted, which causes the customer's sales_tot to be updated). The trigger definition is as follows.

create trigger UpdSPSalesTot after update of sales_tot on customer
    referencing old row as old_cust new row as new_cust 
    for each row
    update salesperson
 	set sales_tot = sales_tot + (new_cust.sales_tot - old_cust.sales_tot)
    	where sale_id = new_cust.sale_id;

This trigger fires whenever an update is executed on the sales_tot column in the customer table. That will only occur when one of the earlier triggers fires due to the execution of an insert, delete, or update of the amount column on the sales_order table. Thus, this is an example of a nested trigger—a trigger which fires in response to the firing of another trigger.

The sales database example is delivered with the sales_tot column already declared in the salesperson and customer tables, but without the triggers. Suppose you now want to create the triggers that maintain the sales_tot values for each customer and salesperson, but data already exists in the database, so the sales totals must somehow be initialized at the time the triggers are created. To do this, the database should be opened for exclusive access to ensure that no updates occur between the time the triggers are first installed and the time the sales_tot values in the customer table are initialized. The following rsql script shows how this can be done.

open sales exclusive;
set double display(12, "#,#.##");
select sale_id, sales_tot from salesperson;

sale_id                  sales_tot
BCK                           0.00
BNF                           0.00
BPS                           0.00
	...
WAJ                           0.00
WWW                           0.00
select cust_id, sales_tot from customer;

cust_id                  sales_tot
ATL                           0.00
BUF                           0.00
CHI                           0.00
	...
TBB                           0.00
WAS                           0.00
create trigger InsSalesTot ...
create trigger UpdSalesTot ...
create trigger DelSalesTot ...
create trigger UpdSPSalesTot ...
update customer set sales_tot = 
   query("select sum(amount) from sales_order where cust_id = ?", cust_id);

*** 28 rows affected
select cust_id, sales_tot from customer;
commit;

cust_id                  sales_tot
ATL                     113,659.75
BUF                     263,030.36
CHI                     160,224.65
	...
TBB                     104,038.25
WAS                      63,039.90
select sale_id, sales_tot from salesperson;

sale_id                  sales_tot
BCK                     237,392.56
BNF                     112,345.66
BPS                           0.00
	...
WAJ                     141,535.34
WWW                      49,461.20
close sales;

Note that the update statement that sets the sales_tot values for each row in the customer table uses the query system function (a copy has also been included as an example user-defined function called "subquery").

Audit Trails

Audit trails keep track of certain changes made to a database, along with the identity of the user who initiated each change and a timestamp of when it occurred. Suppose we want to track changes made to the sales_order table. The following statements create a table called orders_log that will contain one row per sales_order change, and grant insert (only) privilege on it to all users.

create table sales.orders_log(
	chg_desc char(30),
	chg_user char(32) default user,
	chg_timestamp timestamp default now
);
commit;
grant insert on orders_log to public;

Six statement-level triggers are needed to track all successful and unsuccessful attempts to change the sales_order table: three before triggers to track all attempts and three after triggers to track only those changes that succeed. Note that should the transaction containing the sales_order change statement be rolled back, the changes to orders_log are also rolled back. Thus, unsuccessful change attempts are logged in the orders_log table only when the containing transaction is subsequently committed. The declarations of the triggers are given below.

create trigger bef_ord_ins before insert on sales_order
	for each statement
	insert into orders_log(chg_desc) values "insert attempted";
create trigger bef_ord_upd before update on sales_order
	insert into orders_log(chg_desc) values "update attempted";
create trigger bef_ord_del before delete on sales_order
	insert into orders_log(chg_desc) values "delete attempted";
create trigger aft_ord_ins after insert on sales_order 
	insert into orders_log(chg_desc) values "insert successful";
create trigger aft_ord_upd after update on sales_order
	insert into orders_log(chg_desc) values "update successful";
create trigger aft_ord_del after delete on sales_order
	insert into orders_log(chg_desc) values "delete successful";

As the above trigger declarations show, the for each statement clause is optional and is the default when no for each clause is specified.

The rsql script below creates two new users, each of whom makes several changes to the sales_order table, in order to see the results of the firing of the associated triggers. Note also that the original row-level triggers are still operative.

create user kirk password "tiberius" on sqldev;
grant all commands to kirk;
grant select on orders_log to kirk;
create user jones password "tough" on sqldev;
grant all commands to jones;
grant select on orders_log to jones;
.c 2 server kirk tiberius
insert into sales_order values "IND",2400,today,now,10000.00,0,null;

*** 1 rows affected
commit;
.c 3 server jones tough
update sales_order set amount = 1000.00 where ord_num = 2400;
*** 1 rows affected
delete from sales_order where ord_num = 2210;
****RSQL Diagnostic 3713: non-zero references on primary/unique key
commit;
select * from orders_log;

chg_desc             chg_user            chg_timestamp
insert attempted     kirk                2009-07-27 11:58:17.9460
insert successful    kirk                2009-07-27 11:58:17.9460
update attempted     jones               2009-07-27 11:59:48.2900
update successful    jones               2009-07-27 11:59:48.2900
delete attempted     jones               2009-07-27 12:00:06.3680
.c 2
*** using statement handle 1 of connection 2
delete from sales_order where ord_num = 2400;

*** 1 rows affected
select * from orders_log;

chg_desc             chg_user            chg_timestamp
insert attempted     kirk                2009-07-27 11:58:17.9460
insert successful    kirk                2009-07-27 11:58:17.9460
update attempted     jones               2009-07-27 11:59:48.2900
update successful    jones               2009-07-27 11:59:48.2900
delete attempted     jones               2009-07-27 12:00:06.3680
delete attempted     kirk                2009-07-27 12:05:10.0710
delete attempted     kirk                2009-07-27 12:05:49.9620
delete successful    kirk                2009-07-27 12:05:49.9620
rollback;
select * from orders_log;

chg_desc             chg_user            chg_timestamp
insert attempted     kirk                2009-07-27 11:58:17.9460
insert successful    kirk                2009-07-27 11:58:17.9460
update attempted     jones               2009-07-27 11:59:48.2900
update successful    jones               2009-07-27 11:59:48.2900
delete attempted     jones               2009-07-27 12:00:06.3680

6.3.7.5 Accessing Trigger Definitions

Trigger definitions are stored in the system catalog, and two predefined stored procedures are available for accessing them. Procedure ShowTrigger returns a result set containing a single char column, with one row for each line of text from the original declaration of the trigger named in the procedure argument. Procedure ShowAllTriggers returns two columns: the trigger name and a line of text from the original declaration. Example calls and their respective result sets are shown below.

exec ShowTrigger("UpdSalesTot");
TRIGGER DEFINITION
create trigger UpdSalesTot after update of amount on sales_order
    referencing old row as old_order new row as new_order
    for each row
    update customer set sales_tot = sales_tot + (new_order.amount - old_order.amount)
        where customer.cust_id = new_order.cust_id;

exec ShowAllTriggers;
NAME                 DEFINITION
InsSalesTot          create trigger InsSalesTot after insert on sales_order
InsSalesTot             referencing new row as new_order
InsSalesTot             for each row
InsSalesTot             update customer set sales_tot = sales_tot + new_order.amount
InsSalesTot             where customer.cust_id = new_order.cust_id;
UpdSalesTot          create trigger UpdSalesTot after update of amount on sales_order
UpdSalesTot             referencing old row as old_order new row as new_order
UpdSalesTot             for each row
UpdSalesTot             update customer 
                        set sales_tot = sales_tot + (new_order.amount - old_order.amount)
UpdSalesTot             where customer.cust_id = new_order.cust_id;
DelSalesTot          create trigger DelSalesTot before delete on sales_order
DelSalesTot             referencing old row as old_order
DelSalesTot             for each row
DelSalesTot             update customer set sales_tot = sales_tot - old_order.amount
DelSalesTot             where customer.cust_id = old_order.cust_id;
UpdSPSalesTot        create trigger UpdSPSalesTot after update of sales_tot on customer
UpdSPSalesTot           referencing old row as oldc new row as newc
UpdSPSalesTot           for each row
UpdSPSalesTot           update salesperson 
                        set sales_tot = sales_tot + (newc.sales_tot - oldc.sales_tot)
UpdSPSalesTot           where sale_id = newc.sale_id;

6.3.8 Using SQL Database Instances 

One principal use for database instancing arises when mutually exclusive data sets have differing archiving requirements. For example, retrieving and deleting from a database all information related to a particular account or client can be tedious to program and expensive to process. However, if each client's or account's data is placed in a separate database instance, it is easy both to archive (simply copy the database files) and to delete (simply reinitialize the database or delete it altogether).

Time oriented applications also can benefit from database instancing.  Consider the example of a company that uses a separate instance for each day of the current year.  In this setup, each day's transactions can simply be stored in the instance for that day.

Instancing is also useful in some replication applications.  As an example, assume a large corporation has a mainframe computer that stores all accounts from all its branch offices, and each branch office performs a daily download of the new and modified accounts into a separate database instance for each account.  For a modified account, the branch office can simply reinitialize the corresponding instance before receiving the new account information; for a new account, it creates a new instance.

Database instancing requires that the database definition be considered distinct from the database itself, since there can be more than one instance of a schema and each instance has a different name.  The original instance has the same name as the schema; subsequent instances have different names.  Once a database instance has been created, it can be used in exactly the same manner as any database.

6.3.8.1 Create an SQL Database Instance

When RDM Server SQL processes an SQL DDL specification, the database name specified in the create database statement names both the schema and the first instance of that schema, which is created automatically.  Other instances can then be created and dropped.

A new instance of a database is created by a successful execution of the create database instance statement shown below.

create database [instance] newdb from sourcedb [with data] on devname

New database instances are created from existing databases.  The name of the new database is given by newdb, which must be unique for all databases on the server.  The existing database instance from which the new instance is created is sourcedb.

The database device name, devname, must be specified and must be a valid RDM Server database device.  In addition, that device cannot have been used to store other instances of the schema from which sourcedb is derived. All database files will be stored on that device and, since the file names for all instances are identical, they must be stored in separate database devices.  If specified, the with data option opens the source database for exclusive access and causes all database files and optimizer statistics from the source database to be copied into the new database. Otherwise, the new database files are automatically initialized.

The create database instance statement can only be executed by administrators or the owner of the schema (that is, the user who issued the original create database statement through sddlp).

The initial instance of a database is created when a database definition is processed.  The name of the instance is specified in the create database statement.  Other instances can then be created from the original database.  All instances share the same database definition information from the system catalog. However, database statistics used by the SQL query optimizer collected during execution of the update stats statement are maintained separately for each database instance.

6.3.8.2 Reference Database Instances

Database instances are referenced just as you reference any database.  You can explicitly open a database instance using the open statement or implicitly open one through a qualified table name.  For example, assume that wa_sales, ca_sales, and mi_sales are each instances of the sales database, containing the sales for Washington, California, and Michigan, respectively.  The following example shows how these instances can be created and populated.

create database instance wa_sales from sales on wadev;
insert into wa_sales.customer from file "customrs.wa" on salesdev;

create database instance ca_sales from sales on cadev;
insert into ca_sales.customer from file "customrs.ca" on salesdev;

create database instance mi_sales from sales on midev;
insert into mi_sales.customer from file "customrs.mi" on salesdev;

update stats on wa_sales, ca_sales, mi_sales;

The next example returns the customers from the Michigan instance of sales.

open mi_sales;
select * from customer;

This same query could have been executed using a single statement as follows.

select * from mi_sales.customer;

You can have any number of instances of the same schema opened at a time.  An unqualified reference to a table in the schema will use the most recently opened instance by default. If you are not sure which instance is open, it is best to explicitly qualify the table name with the database name.

An unqualified reference to a table from a schema on which there is more than one instance will use the oldest instance (usually the original) when none have been opened.
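For example, using the instances created above, an unqualified table name resolves to the most recently opened instance:

```sql
open wa_sales;
open mi_sales;
select * from customer;           -- uses mi_sales (most recently opened)
select * from wa_sales.customer;  -- qualified: always unambiguous
```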

6.3.8.3 Views and Stored Procedures for Database Instances

Views and stored procedure definitions are maintained based on the schema definition and are not dependent on a particular database instance except when the database instance is explicitly referenced in the view or stored procedure declaration. However, the execution plan generated for the view or stored procedure is based on the optimization statistics associated with whatever database instance was open at the time the view or stored procedure was compiled. Thus, if a view or stored procedure will be used with more than one database instance, it is important that the instance used during compilation contain a representative set of data on which an update stats has been run.

The example below creates a view called in_state_by_zip that will list the customers in a database instance in zip code order.  The mi_sales database was opened for the create view because it contained a large number of customers. Thus, the optimizer would be sure to use the index on zip (assuming that in this example zip is indexed).  The subsequent open on wa_sales followed by the select of in_state_by_zip will return the results from the wa_sales database.

-- Lots of customers in Michigan, should provide good stats
open mi_sales; 
create view in_state_by_zip as
select * from customer order by zip;

open wa_sales;
select * from in_state_by_zip;

Note that for views referenced in a select statement qualified with an instance name, the instance name is used to identify the schema to which the view is attached.  It does not specify which instance to use with any unqualified table names in the view definition itself. Thus, in the following example, the result set will contain Washington, not Michigan, customers.

open wa_sales;
select * from mi_sales.in_state_by_zip;

6.3.8.4 Drop a Database Instance

The drop database statement can be used to delete database instances.  The syntax is shown below.

drop database dbname

This statement can only be executed by administrators or the database owner.  Also, database dbname must not be opened by any other users.  The system drops the database instance by removing its instance-specific information from the system catalog and removing all entries from the lookup tables used by the SQLTables, SQLColumns, and SQLSpecialColumns function calls.  The database definition information associated with the schema is not deleted. 

If all instances are dropped (including the original created by sddlp), no new instances can be created without rerunning sddlp and reprocessing the database definition. 

Note that this statement will not delete any database files.
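For example, assuming the wa_sales instance created earlier in this chapter, the following statement removes that instance from the system catalog while leaving both the sales schema definition and the instance's database files in place:

drop database wa_sales;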

6.3.8.5 Restrictions

If more than one instance will be created for a database definition, one restriction applies: no explicitly declared foreign key references can exist between databases. For example, the sample sales database schema provided with RDM Server contains foreign key references to the inventory database.  Any attempt to clone either sales or inventory will return an error. This restriction exists because it is impossible for RDM Server to reliably manage inter-database reference counts for multiple database instances. The reliability of such operations would depend on the correctness of the application's use of those databases, violating the very concept of DBMS-enforced referential integrity.

Inter-database relationships can still be maintained by the application program by using undeclared foreign keys.  Shown below is an excerpt from sales.sql; the declared foreign keys to the inventory database are the references clauses that name invntory. By simply removing those references clauses, it is possible to create multiple instances of both sales and inventory. Referential integrity will not be enforced by SQL, but the inter-database relationships can still exist with no effect on how joins between the databases are processed.

create table salesperson
(
 sale_id char(3) primary key,
 ...
 office char(3) references invntory.outlet(loc_id),
 mgr_id char(3) references salesperson
);
...
create table item
(
 ord_num smallint not null references sales_order,
 prod_id smallint not null references invntory.product,
 loc_id  char(3) not null references invntory.outlet,
 ...
);

6.3.9 Modify an SQL Database Schema 

RDM Server allows the schema for an existing (i.e., populated) database to be modified by adding new tables or indexes or by dropping existing ones. You can add an individual table or index simply by issuing a create table or create index statement with the table or index name qualified by the name of the database to which it will be added. Similarly, you can drop a table or index by issuing an individual drop table or drop index statement, again with the table or index name qualified by the database name.  However, if you will be making more than one change to a database schema, it is best to encapsulate the changes in an alter database transaction.
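For example, a single index could be added to, and later removed from, the sales database with individually qualified statements.  This is only a sketch: the index name contact_key and the contact column are borrowed from the sample sales database used elsewhere in this chapter, and whether an in file clause is also needed depends on your schema's file declarations.

create index contact_key on sales.customer(contact);

drop index sales.contact_key;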

The syntax for the alter database statement is shown below.

alter {database | schema} dbname

The alter database statement is followed by the series of create file, create table, create index, drop table, and drop index statements that describe the changes you wish to make to the database schema.  All of the changes are incorporated when a subsequent commit statement is submitted.

As an example, the following alter database script will add an index on column contact in the customer table, drop the cust_order_key index in sales_order, and add a new table called sales_office.

alter database sales;

create file salesd4;
create file salek3;

create index contact_key on customer(contact) in salek3;
drop index cust_order_key;
create table sales_office(
 office_id char(3) primary key,
 address char(30),
 city char(20),
 state char(2),
 zip char(10),
 phone char(12)
);
create unique index office_key on sales_office(office_id) in salek3;

commit;

Copyright © 2009, Birdstep Technology, Inc. All rights reserved.