Chapter 9: Persistent Data on the Client

The key component of smart client applications is the persistent data store. This is the technology that allows you to maintain data on the device, removing the requirement for wireless network coverage. Beyond allowing you to work offline, persistent data also makes it possible to build applications with rich user interfaces that have high-performance data access, transactional capabilities and enterprise integration support.

You have a variety of options to choose from for data storage: You can use the device's file system to store data, build your own data storage mechanism, or purchase a commercial solution. In this chapter we are going to take a closer look at some of the reasons why databases are an important component of smart client applications. We will also look at some of the options you have for implementing this type of solution.

Types of Data Storage

Before delving into the requirements for persistent storage within mobile applications, it is worthwhile to step back and take a look at traditional data storage systems. In general, database systems provide a storage mechanism for data. This can be accomplished in a variety of formats, with different storage techniques and relationships between the data. They also provide a means to access that data by using a defined language, like Structured Query Language (SQL), or other proprietary APIs. A complete system that provides the data storage, access mechanism, and administration tools is referred to as a database management system, or DBMS. This is clearly an oversimplified view of what a database is and does, but for our purposes here, it should suffice.

We are going to look at four main database systems: flat-file, relational, object, and XML databases.

Flat-File Databases

The most basic form of database is a flat-file database. This form of database can store data only as a single set of records of the same kind. Conceptually, flat-file databases are very similar to a card catalog where each record is kept in a distinct location. For example, a flat-file database might contain a list of contacts, each represented by a string whose fields are separated using a common delimiter such as a comma, tab, or other character that is unlikely to be part of the data itself. This type of data storage system is suitable only for simple applications that need basic persistent storage with limited programming capabilities. Because flat-file databases store only a single record type, they do not lend themselves to large or complex data sets. If the amount of data becomes too large, performance suffers considerably.
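To make the contact-list example concrete, here is a minimal sketch in Python of a comma-delimited flat file. The file contents, the field order (name, phone, city), and the function names load_contacts and find_by_name are all assumptions made for illustration; they do not describe any particular product.

```python
import csv
import io

# Hypothetical flat-file contents: one contact per line, fields separated by commas.
FLAT_FILE = "Ada Lovelace,555-0100,London\nGrace Hopper,555-0101,Arlington\n"


def load_contacts(text):
    """Parse each non-empty line into a (name, phone, city) tuple."""
    reader = csv.reader(io.StringIO(text))
    return [tuple(row) for row in reader if row]


def find_by_name(contacts, name):
    """Linear scan: records are examined one by one until a match is found."""
    for record in contacts:
        if record[0] == name:
            return record
    return None


contacts = load_contacts(FLAT_FILE)
```

Every record here has the same shape, and a lookup has to walk the list from the top, which is why this approach degrades as the data set grows.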

One of the major performance limitations of flat-file systems comes from the lack of tuning and optimizing features that are typically found in relational database models. For example, there is no means of database normalization; that is, there is no way to structure information to reduce redundancy and promote the most efficient use of resources. Normalization is accomplished by using separate tables with foreign and primary keys in the relational database model. In addition, flat-file databases do not allow for joins—retrieving related sets of data based on a common element.
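By way of contrast, the following sketch shows what normalization and a join look like in a relational database. It uses SQLite through Python's standard sqlite3 module purely as a convenient stand-in; the customer and orders tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database; the schema and data below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,          -- primary key
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),  -- foreign key
        item        TEXT NOT NULL
    );
    INSERT INTO customer (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (customer_id, item)
        VALUES (1, 'modem'), (1, 'stylus'), (2, 'battery');
""")

# The join retrieves related rows from both tables through the common key.
rows = conn.execute("""
    SELECT customer.name, orders.item
    FROM orders
    JOIN customer ON customer.id = orders.customer_id
    ORDER BY orders.id
""").fetchall()
```

Each customer name is stored once, no matter how many orders refer to it. A flat file would have to repeat the name on every order line.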

To address some of the performance issues, many flat-file databases provide the ability to designate certain fields as keys, allowing the contents of the field to be indexed. Searches for information in a keyed field are much faster than those in fields without keys. For a keyed field, the program only has to consult the index to determine which records meet the search criteria, whereas for a nonkeyed field, every record in the file has to be scanned to achieve the same results.
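The difference between keyed and nonkeyed lookups can be sketched as follows. This is an illustration under simple assumptions, not a description of how any particular flat-file product builds its indexes: the records are held in a Python list, the index is a dictionary, and the names build_index, find_keyed, and find_unkeyed are hypothetical.

```python
# Hypothetical records: (name, phone, city).
records = [
    ("Ada Lovelace", "555-0100", "London"),
    ("Grace Hopper", "555-0101", "Arlington"),
    ("Alan Turing", "555-0102", "London"),
]


def build_index(records, field):
    """Map each value of the keyed field to the positions of matching records."""
    index = {}
    for position, record in enumerate(records):
        index.setdefault(record[field], []).append(position)
    return index


def find_keyed(records, index, value):
    """Keyed lookup: consult the index, then fetch only the matching records."""
    return [records[position] for position in index.get(value, [])]


def find_unkeyed(records, field, value):
    """Nonkeyed lookup: every record has to be examined."""
    return [record for record in records if record[field] == value]


city_index = build_index(records, field=2)
```

Both functions return the same records. The keyed version avoids touching records that do not match, at the cost of building and maintaining the index.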

Flat-file database implementations are rarely found in enterprise applications because of their lack of scalability and poor performance.

Relational Databases

Relational databases have a much more logical structure for storing data than flat-file databases, where you are limited to a single record type. Relational databases allow you to store many records of various data types in a format called a table. Within the table, the data is stored in rows and columns. Each column of data contains elements of the same data type. The table information is related on the basis of a common field (or key), which allows for relationship mapping. In this way it is possible to associate an unlimited number of different record types with one another.

In the late 1970s, Oracle came out with the first commercially available SQL-based relational database; it was soon followed by offerings from IBM and Sybase. Today, relational databases rule the enterprise market. They use SQL for interacting with the database. Using SQL, it is possible to perform complex database interactions using simple commands, and it gives users a standards-based way to interact with databases. It is important to point out, however, that many vendors have extended the SQL standard to create their own proprietary SQL dialects.

Over the years, the capabilities offered in relational databases have evolved. The leading vendors such as Oracle, IBM, Sybase, and Microsoft have added the capability to store a wide variety of data types and data processing features, such as support for transactions. These new features are often aimed at making relational databases easier to use for business applications and e-business development. Stored procedures, for example, enable the user to build business logic into the database using either a SQL-based proprietary language or, as of late, using Java. These types of additions make the relational database more attractive than other competing technologies.
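Transaction support is easiest to appreciate with a small example. The sketch below again uses SQLite through Python's sqlite3 module as a stand-in for a relational database; the account table, its CHECK constraint, and the transfer function are hypothetical. The point is that the two updates either both take effect or neither does.

```python
import sqlite3

# Illustrative schema: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('checking', 100), ('savings', 0)")
conn.commit()


def transfer(conn, source, target, amount):
    """Move funds as one transaction; roll back both updates on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dictionary keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account"))


first_ok = transfer(conn, "checking", "savings", 30)    # succeeds
second_ok = transfer(conn, "checking", "savings", 500)  # would overdraw; rolled back
final = balances(conn)
```

In the second call the credit to savings is applied before the debit fails, yet the rollback undoes it, so the balances are left exactly as they were after the first transfer.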

In the mobile space, we are seeing many of the enterprise relational database vendors retrofit their enterprise-scale offerings for mobile usage. As well, other vendors have introduced products specifically targeted for mobile use. (More details on this are given later in the chapter in the Commercial Relational Database section.) The complete relational database system is commonly referred to as a relational database management system, or RDBMS.

Object Databases

Recently, object databases have attempted to challenge relational databases for the enterprise market, but so far have not succeeded. They differ significantly from relational databases in their approach to data storage. This approach is based on the capability to store persistent objects; so, rather than storing tables, as relational databases do, object databases store a variety of persistent objects, which can then be referenced through persistent identifiers (PIDs). PIDs identify the objects and are used to build relationships between them. Instead of using SQL to access and manipulate the objects, common object-oriented (OO) programming languages such as C++ and Java are used.
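The idea of persistent objects linked by identifiers can be sketched with a toy store. This is not how a real object database is implemented; it is a deliberately simple illustration that writes each object to its own JSON file and uses the file name as the PID. The Contact class, the ObjectStore class, and the colleague_pids attribute are all hypothetical.

```python
import json
import os
import tempfile
import uuid


class Contact:
    """An application object; relationships are held as PIDs, not as table rows."""

    def __init__(self, name, colleague_pids=None):
        self.name = name
        self.colleague_pids = list(colleague_pids or [])


class ObjectStore:
    """Toy persistent store: one JSON file per object, named by its PID."""

    def __init__(self, directory):
        self.directory = directory

    def save(self, obj):
        pid = uuid.uuid4().hex  # generate a persistent identifier
        with open(os.path.join(self.directory, pid), "w") as handle:
            json.dump(vars(obj), handle)
        return pid

    def load(self, pid):
        with open(os.path.join(self.directory, pid)) as handle:
            return Contact(**json.load(handle))


store = ObjectStore(tempfile.mkdtemp())
grace_pid = store.save(Contact("Grace"))
ada_pid = store.save(Contact("Ada", [grace_pid]))

# Follow the relationship by dereferencing PIDs rather than by issuing SQL.
ada = store.load(ada_pid)
colleagues = [store.load(pid).name for pid in ada.colleague_pids]
```

The application works with Contact objects directly in the programming language, and the relationship between the two contacts is expressed through the stored PID.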

The first object databases came to market in the early 1990s, more than 10 years after their relational counterparts. Though they have not caught on in the broad enterprise market, they do excel in certain areas. When, for example, you need to store many forms of data that do not lend themselves to the tabular setup of a relational database, object databases are a good choice. They give you the freedom to create and store any type of information, and provide tight integration with OO programming languages.

Recently, with the proliferation of Web-based applications, object databases are experiencing some surge in popularity because of the many formats of Web content and the requirement for easy access to these data formats. Unfortunately for the object database vendors, the performance overhead and nonstandard design continue to make them unattractive for many business application developers. Whether mobile applications will take advantage of object databases is yet to be determined, but the concept of sending persistent objects, instead of just data, to mobile applications is starting to gain some momentum.

The complete object database system is commonly referred to as an object database management system, or ODBMS. Another term that you may come across for object databases is object-oriented database management system, or OODBMS.

XML Databases

With eXtensible Markup Language (XML) data storage, there are two very different types of databases: one implements XML capabilities on top of another database format; the other uses XML as the storage mechanism itself. It is important to make this distinction, because here, when referring to XML databases, we are referring to the latter. This is because most, if not all, enterprise RDBMS vendors have added some form of XML capabilities to their offerings. These offerings range from the capability to store XML documents in text fields to the capability to interact with the database using XML datagrams. These relational database offerings are not XML databases, as the data storage itself is not XML-based.

The latter type of XML database comprises products that have been designed for use with XML from the ground up. They use XML database engines to store large amounts of data, provide concurrent access, and offer integrated security. The selling feature behind XML databases is that they can interact with XML streams more efficiently than an RDBMS with XML features can, because they do not have the overhead of converting XML to a relational format. In some cases, such as when manipulating an entire document, this can lead to better performance than using an RDBMS. That said, in most cases, storing data in XML formats can be very inefficient, and offers poor performance, due to the lack of indexing capabilities.
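The trade-off can be seen in a small sketch that treats an XML document as the data store itself. It uses Python's standard xml.etree.ElementTree module, which is an XML parser rather than an XML database engine, so it only approximates the behavior described above. The document contents and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data held natively as XML; no conversion to tables takes place.
DOCUMENT = """<contacts>
  <contact id="1"><name>Ada</name><city>London</city></contact>
  <contact id="2"><name>Grace</name><city>Arlington</city></contact>
</contacts>"""

root = ET.fromstring(DOCUMENT)

# Selective lookup: the path expression is evaluated by walking the tree,
# because this simple parser keeps no index over the id attribute.
match = root.find("./contact[@id='2']/name")
grace = match.text

# Whole-document operations are natural: iterate over it or serialize it as is.
names = [contact.findtext("name") for contact in root.iter("contact")]
serialized = ET.tostring(root, encoding="unicode")
```

Handing the entire document to another XML-aware component requires no translation step, whereas a lookup for a single contact has to traverse the elements.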

Another feature of XML databases is the integration with other XML technologies on mobile devices such as synchronization services, transformation engines, and XML query languages such as XQuery. The premise is that, by having all XML-based technologies on the client device, you should be able to reduce system complexity and cost while increasing flexibility, thanks to XML's platform independence. Increased flexibility may in fact be possible, but XML databases also demand more processing power and storage, which are not attractive characteristics for mobile computing.

XML database management systems (XDBMS) are used far less often than relational systems, most likely because relational systems are ingrained in most organizations, and not enough compelling reasons exist for moving to a pure XML model at this time.