Data organization

One of Datastore’s key features is the ability to create bespoke data structures and to define the methods used to query the data in them. Both are defined by the API specifications and data models that make up your blueprints.

Datastore stores your actual data as properties in objects called documents, where groups of related documents are organized into parent objects called collections.

These collections and documents are analogous to directories and files (respectively) on a file system.

Datastore implements two standards for blueprints:

  • OpenAPI - an initiative to standardize how RESTful APIs are described. This standard is used to create your API specification that defines the API endpoints and their supported HTTP methods. These files are typically defined in YAML format.

  • JSON Schema - a vocabulary for annotating and validating JSON documents, as documented on the JSON Schema website. This standard is used to define the sets of properties (or data models) that each endpoint accepts in requests (to access each property’s data values) and returns in responses. Refer to the json.org website for a summary of JSON syntax.

How collections and documents are organized

From an organizational perspective:

  • A document must always be placed inside a collection.

  • Only collections can be created at the root level. Multiple documents can then be created and accessed within each collection. (Documents cannot be created at the root level.)

  • A document may contain a collection, known as a sub-collection.

  • Multiple collections can be created at the root level.

For example, a collection called comments could store comments submitted through an application. Each comment could have a sub-collection called replies used to store the replies to each comment.

This creates a storage structure like the following:

Example storage structure for comments and their respective replies
/comments
/comments/comment#1
/comments/comment#1/replies
/comments/comment#1/replies/reply#1
/comments/comment#1/replies/reply#2
/comments/comment#2
/comments/comment#2/replies
/comments/comment#2/replies/reply#1
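
The nesting rules above can be sketched in memory with two small classes. This is an illustration only, not Datastore's API: the Document and Collection classes and the list_paths helper are hypothetical names introduced here, and the sketch assumes that a document's sub-collections are keyed by name and a collection's documents are keyed by ID.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """Hypothetical model: properties plus optional named sub-collections."""
    properties: Dict[str, object] = field(default_factory=dict)
    collections: Dict[str, "Collection"] = field(default_factory=dict)


@dataclass
class Collection:
    """Hypothetical model: documents keyed by their ID."""
    documents: Dict[str, Document] = field(default_factory=dict)


def list_paths(collections: Dict[str, Collection], prefix: str = "") -> List[str]:
    """Return every collection and document path, depth first."""
    paths: List[str] = []
    for name, collection in collections.items():
        collection_path = f"{prefix}/{name}"
        paths.append(collection_path)
        for doc_id, document in collection.documents.items():
            document_path = f"{collection_path}/{doc_id}"
            paths.append(document_path)
            # Recurse into any sub-collections held by this document.
            paths.extend(list_paths(document.collections, document_path))
    return paths


# The root level holds only collections, never documents.
root = {
    "comments": Collection(documents={
        "comment#1": Document(collections={
            "replies": Collection(documents={
                "reply#1": Document(),
                "reply#2": Document(),
            }),
        }),
        "comment#2": Document(collections={
            "replies": Collection(documents={"reply#1": Document()}),
        }),
    }),
}

for path in list_paths(root):
    print(path)
```

Running the sketch prints the same eight paths as the example storage structure, in the same order.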

This alternating pattern of collection, document, collection, and so on is used to construct the API endpoint URLs within Datastore, in the form /collection/document/collection/…, where each level of the URL for a given API call is either a collection name or a unique document identifier (ID).
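
As a rough illustration of this alternation, the following sketch labels each segment of a URL path by its position. The classify_segments and targets_collection helpers are hypothetical and are not part of Datastore; they only assume that paths start with a collection and alternate from there.

```python
from typing import List, Tuple


def classify_segments(path: str) -> List[Tuple[str, str]]:
    """Label each path segment as a collection name or a document ID.

    Segments at even positions (0, 2, ...) are collections;
    segments at odd positions (1, 3, ...) are documents.
    """
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError("path must contain at least one collection")
    return [
        ("collection" if index % 2 == 0 else "document", segment)
        for index, segment in enumerate(segments)
    ]


def targets_collection(path: str) -> bool:
    """Return True when the final segment of the path is a collection."""
    return classify_segments(path)[-1][0] == "collection"


print(classify_segments("/comments/comment#1/replies"))
print(targets_collection("/comments/comment#1"))
```

A path with an odd number of segments therefore addresses a collection, and a path with an even number of segments addresses a single document.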

Collections (comments and replies in this example) retain their user-friendly names within a URL. A collection name only needs to be unique at a given URL level.

The documents (the comments and replies themselves) use a unique ID assigned by Datastore. Although a document name only needs to be unique within a given collection, Datastore allocates a version 4 universally unique identifier (UUID) for each document name. A UUID generated by Datastore is a 36-character string separated into five groups by hyphens, for example: 0cad4697-8422-47c0-9d4b-5f16d7f5baf3.
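
To see what this format looks like in practice, the following sketch uses Python's standard uuid module to generate a version 4 UUID and to inspect the example ID above. It shows the format only; how Datastore generates its IDs internally is not covered here.

```python
import uuid

# Generate a random (version 4) UUID and render it in its canonical text form.
document_id = str(uuid.uuid4())
group_lengths = [len(group) for group in document_id.split("-")]

print(document_id)
print(len(document_id))   # 36 characters: 32 hex digits plus 4 hyphens
print(group_lengths)      # [8, 4, 4, 4, 12]

# Parse the example ID from the text and read back its version number.
example = uuid.UUID("0cad4697-8422-47c0-9d4b-5f16d7f5baf3")
print(example.version)    # 4
```

The version is encoded in the first character of the third group, which is why that group in the example ID begins with 4.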

The following URL paths can be generated from the comments and replies example above:

Example of comment and reply endpoint URL paths (with truncated UUIDs)
/comments
/comments/9ff23854-af...
/comments/9ff23854-af.../replies
/comments/9ff23854-af.../replies/ec0a929b-0c...
/comments/9ff23854-af.../replies/b7a6e61e-ab...
/comments/0dc67f6d-3b...
/comments/0dc67f6d-3b.../replies
/comments/0dc67f6d-3b.../replies/0cad4697-84...

Since a document name only needs to be unique within a given collection, your application can assign a unique identifier to a document instead of one allocated by Datastore. This allows you to create friendlier URLs with unique document identifiers.
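
The per-collection uniqueness rule can be sketched as follows. The register_document_id helper, the DuplicateDocumentId exception, and the IDs used (such as welcome-post) are all hypothetical; the sketch only shows that an application-assigned ID may be reused in different collections but not within the same one.

```python
from typing import Dict, Set


class DuplicateDocumentId(ValueError):
    """Raised when an ID is already in use within the same collection."""


def register_document_id(
    ids_by_collection: Dict[str, Set[str]],
    collection_path: str,
    document_id: str,
) -> str:
    """Record an application-assigned ID and return the document's URL path."""
    used_ids = ids_by_collection.setdefault(collection_path, set())
    if document_id in used_ids:
        raise DuplicateDocumentId(
            f"{document_id!r} already exists in {collection_path}"
        )
    used_ids.add(document_id)
    return f"{collection_path}/{document_id}"


ids: Dict[str, Set[str]] = {}
print(register_document_id(ids, "/comments", "welcome-post"))
print(register_document_id(ids, "/comments", "release-notes"))

# The same ID is fine in two different replies collections.
print(register_document_id(ids, "/comments/welcome-post/replies", "first"))
print(register_document_id(ids, "/comments/release-notes/replies", "first"))
```

Registering welcome-post a second time under /comments would raise DuplicateDocumentId, because that ID is already taken in that collection.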