Designing a Table in Oracle NoSQL Database Cloud Service

Learn how to design and configure tables in Oracle NoSQL Database Cloud Service.

This article has the following topics:

Table Fields

Learn how to design and configure data using table fields.

An application may choose to use schemaless tables, where a row consists of key fields and a single JSON data field. A schemaless table offers flexibility in what can be stored in a row.

Alternatively, the application can choose to use fixed schema tables, where all of the table fields are defined as specific types.

Fixed schema tables with typed data are safer to use from an enforcement and storage efficiency standpoint. Even though the schema of fixed schema tables can be modified, their table structure cannot easily be changed. A schemaless table is flexible and the table structure can be easily modified.

Finally, an application can also use a hybrid data model approach where a table can have typed data and JSON data fields.

The following examples demonstrate how to design and configure data for all three approaches.

Example 1: Designing a Schemaless Table

You have multiple options to store information about browsing patterns in your table. One option is to define a table that uses a cookie ID as a key and keeps audience segmentation data as a single JSON field.

// schema less, data is stored in a JSON field
CREATE TABLE audience_info (
       cookie_id LONG,
       audience_data JSON,
       PRIMARY KEY(cookie_id))

In this case, the audience_info table can hold a JSON object such as:

{
  "cookie_id": "",
  "audience_data": {
    "ipaddr" : "10.0.00.xxx",
    "audience_segment: {
       "sports_lover" : "2018-11-30",
       "book_reader" :  "2018-12-01"
    }
  }
}

Your application will have a key field and a data field for this table. You have flexibility in what you chose to store as information in your audience_data field. Therefore, you can easily change the types of information available.

Example 2: Designing a Fixed Schema Table

You can store information about browsing patterns by creating your table with more explicitly declared fields:

// fixed schema, data is stored in typed fields.
CREATE TABLE audience_info(
       cookie_id LONG,
       ipaddr STRING,
       audience_segment RECORD(sports_lover TIMESTAMP(9),
                               book_reader TIMESTAMP(9)),
       PRIMARY KEY(cookie_id))

In this example, your table has a key field and two data fields. Your data is more compact, and you are able to ensure that all data fields are accurate.

Example 3: Designing a Hybrid Table

You can store information about browsing patterns by using both typed and JSON data fields in your table.

// mixed, data is stored in both typed and JSON fields.
CREATE TABLE audience_info (
       cookie_id LONG,
       ipaddr STRING,
       audience_segment JSON,
       PRIMARY KEY(cookie_id))

Primary Keys and Shard Keys

Learn the purpose of primary keys and shard keys while designing your application.

Primary keys and shard keys are important elements in your schema and help you access and distribute data efficiently. You create primary keys and shard keys only when you create a table. They remain in place for the life of the table, and cannot be changed or dropped.

Primary Keys

You must designate one or more primary key columns when you create your table. A primary key uniquely identifies every row in the table. For simple CRUD operations, Oracle NoSQL Database Cloud Service uses the primary key to retrieve a specific row to read or modify. For example, consider a table has the following fields:

  • productName

  • productType

  • productLine

From experience, you know that the product name is important as well as unique to each row, so you set the productName as the primary key. Then, you retrieve rows of interest based on the productName. In such a case, use a statement like this, to define the table.

/* Create a new table called users. */
CREATE TABLE if not exists myProducts 
(
  productName STRING,
  productType STRING,
  productLine INTEGER,
  PRIMARY KEY (productName)
)"

Shard Keys

The main purpose of shard keys is to distribute data across the Oracle NoSQL Database Cloud Service cluster for increased efficiency, and to position records that share the shard key locally for easy reference and access. Records that share the shard key are stored in the same physical location and can be accessed atomically and efficiently.

Your Primary and shard key design has implications on scaling and achieving provisioned throughput. For example, when records share shard keys, you can delete multiple table rows in an atomic operation, or retrieve a subset of rows in your table in a single atomic operation. In addition to enabling scalability, well-designed shard keys can improve performance by requiring fewer cycles to put data to, or get data from, a single shard.

For example, suppose that you designate three primary key fields:

PRIMARY KEY (productName, productType, productLine)

Because you know that your application frequently makes queries using the productName and productType columns, specifying those fields as shard keys is appropriate. The shard key designation guarantees that all rows for these two columns are stored on the same shard. If these two fields are not shard keys, the most frequently queried columns could be stored on any shard. Then, locating all rows for both fields requires scanning all data storage, rather than one shard.

Shard keys designate storage on the same shard to facilitate efficient queries for key values. However, because you want your data be distributed across the shards for best performance, you must avoid shard keys that have few unique values.
Note

If you do not designate shard keys when creating a table, Oracle NoSQL Database Cloud Service uses the primary keys for shard organization.

Important factors to consider when choosing a shard key

  • Cardinality: Low cardinality fields, such as a user home country, group records together on a few shards. In turn, those shards require frequent data rebalancing, increasing the likelihood of hot shard issues. Instead, each shard key should have high cardinality, where the shard key can express an even slice of records in the data set. For example, identity numbers such as customerID, userID, or productID are good candidates for a shard key.

  • Atomicity: Only objects that share the shard key can participate in a transaction. If you require ACID transactions that span multiple records, choose a shard key that lets you meet that requirement.

What are the best practices to follow?

  • Uniform distribution of shard keys: When shard keys are uniformly distributed, no single shard limits the capacity of the system.

  • Query Isolation: Queries should be targeted to a specific shard to maximize efficiency and performance. If queries are not isolated to a single shard, the query is applied to all shards which is less efficient and increases query latency.

See Creating Tables to learn how to assign primary and shard keys using the TableRequest object.

Time to Live

Learn how to specify expiration times for tables and rows using the Time-to-Live (TTL) feature.

Many applications handle data that has a limited useful lifetime. Time-to-Live (TTL) is a mechanism that allows you to set a time frame on table rows, after which the rows expire automatically, and are no longer available. It is the amount of time data is allowed to remain in the Oracle NoSQL Database Cloud Service. Data that reaches expiration time can no longer be retrieved, and does not appear in any storage statistics.

By default, every table that you create has a TTL value of zero, indicating no expiration time. You can declare a TTL value when you create a table, specifying the TTL with a number, followed by either HOURS or DAYS. Table rows inherit the TTL value of the table in which they reside, unless you explicitly set a TTL value for table rows. Setting a row's TTL value overrides the table's TTL value. If you change the table's TTL value after the row has a TTL value, the row's TTL value persists.

You can update the TTL value for a table row at any time before the row reaches the expiration time. Expired data can no longer be accessed. Therefore, using TTL values is more efficient than manually deleting rows, because the overhead of writing a database log entry for the data deletion is avoided. Expired data is purged from the disk after expiration date.

Table States and Life Cycles

Learn about the different table states and their significance (table life cycle process).

Each table passes through a series of different states from table creation to deletion (drop). For example, a table in the DROPPING state cannot proceed to the ACTIVE state, while a table in the ACTIVE state can change to the UPDATING state. You can track the different table states by monitoring the table life cycle. This section describes the various table states.

Description of table-state.png follows

Table State Description

CREATING

The table is in the process of being created. It is not ready to use.

UPDATING

Updating the table is in process. Further table modifications are not possible while the table is in this state.

A table is in the UPDATING state when:

  • The table limits are being changed
  • The table schema is evolving
  • Adding or dropping a table index

ACTIVE

The table can be used in the current state. The table may have been recently created, or modified, but the table state is now stable.

DROPPING

The table is being dropped and cannot be accessed for any purpose.

DROPPED

The table has been dropped and no longer exists for read, write, or query activities.
Note

Once dropped, a table with the same name can be created again.

Table Hierarchies

The Oracle NoSQL Database enables tables to exist in a parent-child relationship. This is known as table hierarchies.

The create table statement allows for a table to be created as a child of another table, which then becomes the parent of the new table. This is done by using a composite name (name_path) for the child table. A composite name consists of a number N (N > 1) of identifiers separated by dots. The last identifier is the local name of the child table and the first N-1 identifiers point to the name of the parent.

Characteristics of parent-child tables:
  • A child table inherits the primary key columns of its parent table.
  • All tables in the hierarchy have the same shard key columns, which are specified in the create table statement of the root table.
  • A parent table cannot be dropped before its children are dropped.
  • A referential integrity constraint is not enforced in a parent-child table.

You should consider using child tables when some form of data normalization is required. Child tables can also be a good choice when modeling 1 to N relationships and also provide ACID transaction semantics when writing multiple records in a parent-child hierarchy.