GenHelm - db_object Model Help

The db_object model is used to generate a class that can interact with a series of related SQL tables. Its primary purpose is to maintain the integrity of the data in the tables that it manages and to expose the data contained in these tables in a consistent way. While all updates to the underlying SQL tables are handled by the object, in many cases no special code needs to be written for this since most of the required functionality is handled by the base class db_object_base.

Consider the classic example of a set of tables used to store order data. Typically each order is defined using a header row to store attributes such as the order_date, customer_number, order_status, etc.. Usually an order can be for one or more products so the details of the order are stored within separate database rows, one per order item. Now imagine that your system has several web forms that need to process orders. Some forms might be used by customers to enter their own orders, some might be used by your order fulfilment staff, others might be used by your accounting staff, etc..

Suppose your system was designed to allow all of these various forms to access and update whatever SQL tables they wanted to change as depicted in the following diagram:

Chaotic system design

In most cases, a design such as this would soon lead to chaos and data inconsistencies because so many different processes have unrestricted access to the same set of files. If a new requirement for certain field validation emerges, this would likely need to be implemented in many different form handlers. When data corruption is discovered, it would be difficult to determine what process needs to be corrected since there could be dozens of processes that have update access to the files involved.

Now consider the following system design approach:

Improved system design using a database object

We still have several web forms that can process orders but with this architecture each form must go through a common object in order to make changes to the underlying table rows and columns. This common object is a php class generated by the db_object model. With this new design, all of the data integrity related to the Order Header and Order Details tables are managed by the Order Object class. With this architecture, any new validation requirements only need to be implemented in one place, thus guaranteeing that all processes that use the Order Object implement a consistent set of rules.

Think of a db_object as a gatekeeper to a set of related SQL tables.

Sample Object Maintenance Form

It is important to note that generated db_objects are intended to be called programmatically and don't have a specific user interface. Nevertheless, a common use of db_objects is to support web forms. Here we see a sample form used to maintain film information. This form interfaces with the film db_object in order to handle the database interactions.

Film Maintenance Page

Record Locking

Most SQL databases are able to lock records to prevent multiple users from updating the same row of a table and overwriting another user's changes. Such database level locking is really intended to be for very short periods of time, generally a few seconds. This type of database level locking is not suitable for locks that could span several minutes or even hours. Furthermore, the stateless nature of http requests does not lend itself to utilizing database level locking across different web pages.

Rather than relying on database locking, classes generated by the db_object model use a logical, optimistic locking mechanism that relies on checking the value of a designated column to ascertain whether it is safe to update a table row. Locking columns can be defined in one of two ways:

A timestamp column can be used, in this case the timestamp value is updated to the current time whenever an object is inserted or changed.
A counter (integer) column can be used, in this case the counter is incremented by 1 whenever the object is updated.

Consider the following scenario:

User 1 displays the record associated with customer 93122.
User 2 displays the same record associated with customer 93112 and changes the customer's address.
Now if user 1 decides to make a change to the customer, they could overwrite the changes just made by user 2.

In order to prevent this situation from happening, the db_object classes keep track of the value of the logical record locking field at all times. Let's say this was defined as a timestamp field. In our scenario, in step 1 the db_object class would make note of the current timestamp for customer 93122 when it is read by user 1. In step 2, user 2 would cause this timestamp field to be updated to the current time. In step 3, before going ahead with the updates requested by user 1 it would make sure the timestamp value had not changed since the original record was supplied to user 1. If the timestamp had changed, user 1's updates would be rejected. User 1 would need to re-read the object data in order to make changes.

Defining The Object Tables

Most db_object classes will be used to maintain a single SQL table known as the primary table. Additionally, a db_object class can maintain any number of secondary tables. In most cases these secondary tables are related to the primary table with cardinality 1:many. The related tables are generally dependent on the primary table. For example, it is not possible for an Order Line to exist without an Order (header). One of the hallmarks that suggest that a table is dependent on another table is that the dependent table implements a cascading delete constraint. For example, if an Order is deleted, all order lines should also be deleted. This tight relationship suggests that the order table and the order_lines table should be part of the same db_object. Note that there does not have to be a cascading delete constraint defined within the database since the db_object takes care of handling this automatically.

Exposing the Object

One of the benefits of using db_objects is that they hide the complexity of the data model and also the technical details of the underlying database from the clients that use the generated db_objects. The users of the object simply view the data as a series of properties. Secondary tables are represented as arrays of properties. To change the object, the object users simply call methods to change the properties or insert and delete rows from the arrays representing dependent tables.

Shared Validation Rules

HTML 5 offers many useful validation features such as limiting values to dates or numbers, setting ranges, restricting lengths, etc.. While these are certainly useful in trapping errors within forms, these validations cannot be relied upon for protecting the integrity of the data. In some cases, the db_object may be used by "headless" components that cannot take advantage of form validation. Even if all of the access to an object is supposed to be via a form, hacker can easily post requests that bypass the forms in order to pass data to the server that could compromise the data integrity.

In order to take advantage of the rich validation features offered by modern browsers while at the same time protecting the integrity of the data when the browser validation is bypassed, db_objects allow you to specify certain validation rules that will be replicated on both the client (in the browser) and on the server (within the db_object). This approach gives you, the developer, the best of both words. You can implement rich client validation without having to manually keep the server validation is sync with the client validation since both sets are validation are derived from the same set of rules.

Database Constraints

One of the key advantages of using an SQL database to store your data is that the database can enforce certain constraints automatically. For example, utilizing database rules you could make it impossible to store an order record that points to a non-existent customer record. While this is certainly a great feature, one of the challenges of utilizing these constraint rules is that it is very difficult to convey constraint violations to your end users (web form users) because it is often not possible to determine what database constraints caused an error to be raised within the database in order to instruct the user on correcting the problem. For this reason, it is generally necessary to implement the same constraints within your program logic in order to give your users proper information on what they need to correct to save or update an object instance. Even though these rules are implemented within your program logic (inside a db_object) it is still a good idea to have these constraints enforced by the database in case the program rule is bypassed for some reason.

Defining Your Objects

Consider the following partial data model of the sakila database that ships with MySql 8. This database is used to manage a video rental enterprise.

Partial sakila database schema diagram

Let's assume we want to define the film db_object. The Entity Relationship Diagram above tells us:

A film can be associated with any number of categories but does not need to be related to a category. The film_category table is used to associate the film with its categories.
A film can be associated with any number of actors but does not need to have any actors. The film_actor table is used to associate the film with its actors.
A film must be related to exactly one language.
A film can optionally be related to an original language.
Any number of copies of the film can exist in inventory (including zero).
All categories must be pre-defined within the category table.
All actors must be pre-defined within the actor table.
All languages must be pre-defined within the language table.

Implementing all of these rules and constraints along with all of the validation associated with all of the fields (columns) might seem to be a daunting task. In fact, no code needs to be hand-written to implement these rules. Instead, you simply need to declare these rules as part of the db_object model specification.

Identifying Child Objects

The first step in configuring the db_object is to identify the tables to be included as part of the object. As mentioned previously, a good way to identify a child entity is to look for entities that cannot "stand on their own". For example, you can certainly have an actor without a film since many actors are not in any films and some actors appear in many films. This tells us that actors should not be defined as part of the film object. On the other hand, the film_actor table only exists for the purpose of listing the actors within a film, this table would not be needed if we did not have a film table and if you delete a film, you would want to automatically delete all of the associated film_actor rows. This provides a strong indication that the film_actor table should be part of the film object.

One could argue that the film_actor table could instead be part of the actor object. This partly depends on how you plan to maintain the data. By making film_actor a part of the film object you would tend to define the actors associated with the film at the time that you add a new film title to the database. If the film_actor table was part of the actor object you would tend to list the firms that the actor has played in at the time that the actor is added to the database. In terms of cardinality/volumes each option could make sense, most films usually don't have more than a few dozen actors and most actors don't appear in more than a few dozen films.

To help decide which object film_actor belongs to, it is often helpful to think about the life cycle of the objects. Assuming films are not added to the database until after they have been produced, at the time the film object is created we would know all of the actors in the film so there would generally be no need to update the film after it has been inserted into the database. On the other hand, if film_actor was part of the actor object, we could add the actor's film history when we store the actor but each time the actor plays in a new film we would need to update the actor's "actor object". Similarly, when a new film is made, we would need to update dozens of actor objects at the time the new film is inserted into the database. This is all evidence that film_actor really belongs within the film object and not the actor object.

In a similar way, the film_category table only exists to define the categories of films, therefore this table would appear to be part of the film object as well. In this case, the cardinality strongly suggests that film_category belongs in the film object and not in the category object. That is, a film will generally fit into two or three categories. On the other hand, a category could be associated with tens of thousands of films. This would make is unwieldy to put the film_category table inside of the category object.

Based on this analysis we have determined that the film object consists of three tables:

film is the primary table
film_actor is a child table related 1:many (optional) with film
film_category is a child table related 1:many (optional) with film

Defining Child Objects

Dependent (child) objects are defined using the db_object_child model. These must be created before you can complete the creation of the db_object class. The following link provides help on creating db_object_child classes.

Identifying Database Constraints

In most cases constraints arise from foreign keys. By definition, foreign keys must point to a valid entry in another table. Consider the language table, this has one row for each possible language that may exist. It may even include languages for which there are no films. When we add a film, we must indicate the language in which the file is produced. An insert/update constraint must be defined on the film object to prevent the film from referencing an undefined language.

Note that whenever you define an update constraint on a foreign key this generally means that you will also need a delete constraint when you define the primary object being referenced. In this example, we must prevent the deletion of any language codes that are referenced by films. This would be done by adding a delete constraint the the language object.

Creating Audit Trails

db_objects allow you to log all updates to a separate set of files. This is used to create a history of all inserts, updates and deletes associated with the object in order to see historical snapshots of the object and to determine which users have made changes to the object. The log files will generally consist of the same set of columns as the file they are logging with additional columns to contain information such as:

log_event (the event that triggered the log record)
log_timestamp (the date/time that the update occurred)
log_user (the id of the user who performed the update)
log_ip (the IP address of the user who performed the update)

Using the db_object Model

Next we will describe how to fill in the specifications for the db_object model in order to tell it how you want your db_object to behave. We have broken this description into the following sections. Click on the links to learn more.

db_object Model Overview