webCOMAND

Shipping containers stacked like building blocks.

Content Modeling Standard

Content management systems (CMS) come in all shapes and sizes, but most provide some way to model content.  Wouldn't it be great if there were a standard way to communicate a content model that any CMS could understand, one that isn't specific to a single product or implementation?

This is a draft proposal for a content modeling standard, motivated by Deane Barker's "Towards a Content Modeling Standard" post on Gadgetopia.

Key Goals

  • Simple - It should be as simple as possible while still accomplishing all key goals listed here.
  • Expressive - It should be able to communicate complex content models and the many content type and attribute options available in CMSes.
  • Progressive Enhancement - Wherever possible, an application should be able to ignore or work around model features it does not implement.  Standard and custom extensions can add to the core feature set to express less common and more specific features.

Proposed Content Modeling Standard

This proposed standard's primary function is to define collections of content types.  In its simplest form, a content type is a collection of attributes that can be used to define a common structure for a set of content items.

Core

Below is an overview of the core structures used to model content and their attributes.  Except for the Content box, each box points to another to indicate which attributes to inherit.  In other words, the Package box inherits all attributes from the Folder box, which in turn inherits all attributes from the Entity box, and so on.  So, Package combines the attributes in the four boxes down the left.  Each box is detailed after the overview diagram.

[Figure: core_content_types_tree.png, overview diagram of the core content types and their inheritance]

These core structures can be used to model content at a basic level, to define data types, content types and their attributes, and even content items of those types.

Content

This content type is required for all content models because it is the base content type that is extended by all other content types in the model.  It is the only content type that does not extend another content type.

  • Key - A string that serves as a unique key for a content item of a specific Type and Package.  Multiple versions and variants of the same content item will share this Key.  Keys must be 1-32 characters, are case-sensitive and must start with a letter and contain only letters (A-Z, a-z), numbers (0-9) and underscores (_).  In regex terms, /^[a-z][a-z0-9_]{0,31}$/i.  These constraints make it easier to reference keys in scripts, templates and various serialization formats.
  • UUID - A universally unique identifier, which will uniquely identify this object across all content items in all models and applications.  UUID v1 should be used whenever possible.
  • Type - A reference to the Content Type this content item is an instance of.  Useful for type introspection.
  • Package - Optional reference to a Package, which serves as a namespace for this content item.
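
As a rough illustration of the Key and UUID rules above, the following Python sketch checks a candidate Key against the stated pattern and generates a version 1 UUID.  The helper names is_valid_key and new_uuid are hypothetical and not part of the proposal.

```python
import re
import uuid

# Pattern taken from the Key rule above: a leading letter followed by up to
# 31 letters, digits or underscores (1-32 characters in total).
KEY_PATTERN = re.compile(r"^[a-z][a-z0-9_]{0,31}$", re.IGNORECASE)


def is_valid_key(key: str) -> bool:
    """Return True if `key` satisfies the Key constraints (hypothetical helper)."""
    return KEY_PATTERN.fullmatch(key) is not None


def new_uuid() -> str:
    """Generate a version 1 UUID, as the UUID attribute recommends."""
    return str(uuid.uuid1())


print(is_valid_key("contacts_model"))  # a Key that follows the rules
print(is_valid_key("1st_contact"))     # rejected: starts with a digit
```

The case-insensitive flag only widens the allowed letters to A-Z and a-z.  Keys themselves remain case-sensitive, so "Contact" and "contact" would be two different Keys.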

Entity

A base content type, intended to be extended by any content type that wants to support named content items.

  • Extends: Content (inherits all attributes from Content above)
  • Title - A title for the content item.
  • Description - A short description for the content item.

Folder

Folders are optional structures that can be used to organize elements of the model into logical groups.  Folders can be nested as content of another folder.

  • Extends: Entity
  • Content - A collection of content items of any content type.

Package

A package is a Folder that represents a logical group of content items, and is meant to be used like a namespace for content items distributed and imported/installed together.  When a content item references a Package, it avoids conflicts with content items that have the same Key in another (or no) Package.  If a package contains a content model, it can be considered a module.  However, this proposed standard does not explicitly differentiate between the two because the difference and interchangeable uses can be fuzzy.

  • Extends: Folder
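
The inheritance chain from Content down to Package can be sketched in Python as follows.  The attribute lists are copied from the descriptions above, while the CORE_TYPES table and the all_attributes helper are hypothetical names used only for illustration.

```python
from typing import List

# Own attributes of each core structure, copied from the descriptions above.
# "extends" names the structure whose attributes are inherited.
CORE_TYPES = {
    "Content": {"extends": None, "attributes": ["Key", "UUID", "Type", "Package"]},
    "Entity": {"extends": "Content", "attributes": ["Title", "Description"]},
    "Folder": {"extends": "Entity", "attributes": ["Content"]},
    "Package": {"extends": "Folder", "attributes": []},
}


def all_attributes(type_name: str) -> List[str]:
    """Collect attributes from the root (Content) down to `type_name`."""
    chain = []
    current = type_name
    while current is not None:
        chain.append(current)
        current = CORE_TYPES[current]["extends"]
    attributes = []
    for name in reversed(chain):  # root first, most specific type last
        attributes.extend(CORE_TYPES[name]["attributes"])
    return attributes


print(all_attributes("Package"))
# ['Key', 'UUID', 'Type', 'Package', 'Title', 'Description', 'Content']
```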

Type

This content type is required for all content models because it is the base content type that is extended by Data Type and Content Type, which are necessary for every content model.

  • Extends: Entity

Data Type

This content type is required for all content models because it is how attribute data types are specified, which is necessary for nearly every content model.

  • Extends: Type
  • Primitive - String that indicates the type of data to be stored by an attribute of this data type.  May be one of the following values.
    • boolean - A boolean value.
    • integer - An integer value (alias for int64).  The following explicit types are also available: int8, int16, int32, int64
    • decimal - A decimal value (alias for float32).  The following explicit types are also available: float8, float16, float32, float64
    • text - UTF-8 text.
    • binary - Binary data.
    • date - A date.
    • time - A time.
    • datetime - A date and time, which may include timezone information.
  • SQL - Optional data type clause that would appear in an SQL create statement for a column of this data type, based on the SQL Data Types.  This enables more specific control, such as text/binary length and decimal accuracy for systems that support it.
  • Index Length - Specifies the number of bytes to index in storage systems that support indexing, such as an SQL database.
  • Binary - Value contains binary data that may not be safe for text-based storage systems.
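
One way an implementation might handle the Primitive aliases listed above is to normalize them to their explicit forms before use.  This is a minimal sketch: the normalize_primitive helper is hypothetical, and ignoring case when matching is an assumption rather than something the proposal states.

```python
# Aliases named in the Primitive list above.
ALIASES = {"integer": "int64", "decimal": "float32"}

# Every other value the list allows, including the explicit sized types.
PRIMITIVES = {
    "boolean", "text", "binary", "date", "time", "datetime",
    "int8", "int16", "int32", "int64",
    "float8", "float16", "float32", "float64",
}


def normalize_primitive(name: str) -> str:
    """Map a Primitive value to its explicit form (hypothetical helper)."""
    lowered = name.strip().lower()  # assumption: matching ignores case
    if lowered in ALIASES:
        return ALIASES[lowered]
    if lowered in PRIMITIVES:
        return lowered
    raise ValueError(f"unknown primitive: {name!r}")


print(normalize_primitive("integer"))  # int64
print(normalize_primitive("decimal"))  # float32
```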

Core Data Types

The following Data Types should be defined as part of the core Package.  Others can be defined in extensions and content models.

  • Boolean
  • Integer
  • Big Integer (int64)
  • Decimal
  • Currency
  • Date
  • Time
  • DateTime
  • Text
  • File Data
  • Image Data

Content Type

This content type is required for all content models because it is how a content type is specified.

  • Extends: Type
  • Extends - Content Type that contains attributes to inherit for this content type (required).
  • Implements - An ordered collection of Content Types that contain attributes to inherit in the order they are specified, after the Extends content type.  If there are any conflicts, the last attribute definition will be used (Extends, Implements, Attributes).
  • Attributes - An ordered collection of attributes for this content type.
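
The conflict rule for Extends, Implements and Attributes can be sketched as a simple merge in which later definitions replace earlier ones with the same Key.  The resolve_attributes function and the sample attributes are hypothetical.  Keeping an overridden attribute in its original position is an assumption, since the proposal only says which definition wins.

```python
from typing import Dict, List


def resolve_attributes(
    extends: List[dict],
    implements: List[List[dict]],
    own: List[dict],
) -> Dict[str, dict]:
    """Merge attribute lists in the order Extends, Implements, Attributes.

    Each attribute is a dict with at least a "Key".  When the same Key
    appears more than once, the last definition is kept.
    """
    resolved: Dict[str, dict] = {}
    for source in [extends, *implements, own]:
        for attribute in source:
            resolved[attribute["Key"]] = attribute
    return resolved


inherited = [{"Key": "Title", "DataType": "Text"}]
mixin = [
    {"Key": "Title", "DataType": "Text", "Options": ["Required"]},
    {"Key": "Slug", "DataType": "Text"},
]
declared = [{"Key": "Slug", "DataType": "Text", "Options": ["Unique"]}]

merged = resolve_attributes(inherited, [mixin], declared)
print(list(merged))  # ['Title', 'Slug']
```

Here Title takes its definition from the implemented type and Slug takes its definition from the content type's own Attributes, because each of those appears last in the merge order.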

Attribute Option

Optional information to indicate how an attribute of specific Data Types and/or Content Types is meant to be used.  See the Options attribute of the Attribute content type below for more information.

  • Extends: Entity
  • Type Filter - The kind of type this option may be applied to: Data Type or Content Type.  If not specified, this option can be applied to either kind of type.
  • Item Filter - The content items of the specified type(s) this option may be applied to.  If none specified, this option can be applied to any item of the specified type(s).

Core Attribute Options

The following attribute options should be defined as part of the core Package.

  • Required - The attribute must contain a value.
  • Optional - The attribute does not need to be assigned a value, and it need not be visible unless added/requested by the content creator.  If an attribute is not Optional, then it should be visible by default.
  • Single Line - Indicates only a single line of text (no line feeds) for Text attributes.
  • Multiple Lines - Indicates multiple lines of text (line feeds) for Text attributes.
  • Unique (Globally) - Value must be unique across all Content.
  • Unique per Package - Value must be unique within a Package (used as a namespace).
  • Unique per Content Type - Value must be unique for all Content of a specific Content Type.
  • Unique per Inherited Content Type - Value must be unique for all Content of a Content Type and all Content Types that inherit it.
  • Unique per Collection - Value must be unique for this attribute across all Content within the same collection.
  • Embed - Treat referenced content like it is embedded within the parent content.  That is, when the parent is deleted, embedded content should also be deleted.
  • Ordered - Maintain the order content is added (newest last) and allow manual re-ordering.
  • HTML - Indicates HTML formatting for Text attributes.
  • Markdown - Indicates Markdown formatting for Text attributes.
  • Calculated - Indicates the value of this attribute is calculated by a calculation script.
  • Validated - Indicates the value of this attribute should be validated by a validation script.

Attribute

This content type is required for all content models because it is how content type attributes are specified.

  • Extends: Entity
  • Collection - TRUE if multiple content items or values are allowed for this field.  If needed, Validation can specify a minimum and/or maximum number of items/values.  If no validation is specified, zero or more should be allowed.
  • Options - A collection of Attribute Options to provide more information about this attribute and how it can be used.  Different implementations may recognize only a subset of the defined options.  Additional implementation-specific options can be defined as well.

Data Attribute

Defines a data attribute.

  • Extends: Attribute
  • Data Type - A reference to a Data Type.
  • Choices - An ordered collection of Choices, which serve as an enumerated list of title/value pairs, where the title is the friendly name of an item and the value is the value that will be stored when selected.

Content Attribute

Defines a content attribute.

  • Extends: Attribute
  • Content Type - A reference to a Content Type.  This attribute will be constrained to content items of this content type and content types that extend it.  An attribute can be further constrained to a set of specific types with Content Attribute Types (see next content type), but those types should be limited to this content type and content types that extend it.  If Content is specified, any content item can be referenced.
  • Reciprocal - Optional reciprocal Attribute to serve as the other side of this relationship in the target content type(s).  Defines the Key, Title and Description of the Attribute that will be added to the target content type.  If the reciprocal Attribute's Collection is not set, then this Content Attribute represents a many-to-one relationship, which should be enforced (only one content item can reference the target).  The combination of a Content Attribute and its optional Reciprocal is used to specify the relationship's cardinality: one-to-one, one-to-many, many-to-one or many-to-many.  The Content Attribute Collection represents the left side of the "x to y" relationship and the Reciprocal represents the right.  If no Reciprocal is set, the reciprocal side is assumed to be "many" with no reciprocal attribute.  The absence of a reciprocal attribute does not mean the relationship cannot be queried/joined from the other side; it just means there won't be a unique attribute within the target to query/join from.
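
The cardinality rules above could be read as follows: the attribute's own Collection flag gives the left side, the Reciprocal's Collection flag gives the right side, and a missing Reciprocal counts as "many".  The sketch below follows that reading.  The cardinality helper is hypothetical, and treating an unset Collection as false is an assumption.

```python
def cardinality(attribute: dict) -> str:
    """Derive "x-to-y" from a Content Attribute definition (one possible reading)."""
    # Left side: this attribute's own Collection flag.
    left = "many" if attribute.get("Collection", False) else "one"

    # Right side: the Reciprocal's Collection flag, or "many" when there
    # is no Reciprocal at all.
    reciprocal = attribute.get("Reciprocal")
    if reciprocal is None:
        right = "many"
    else:
        right = "many" if reciprocal.get("Collection", False) else "one"
    return f"{left}-to-{right}"


parents = {
    "Key": "Parents",
    "Collection": True,
    "Reciprocal": {"Key": "Children", "Collection": True},
}
print(cardinality(parents))  # many-to-many
```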

Content Attribute Types

Constrains the content items that can be referenced from a content attribute to an arbitrary set of content types (limited to the Content Attribute's Content Type and those that extend it).

  • Extends: Content
  • Content Attribute - A reference to the Content Attribute to constrain.
  • Content Types - Collection of Content Types to constrain the content items referenced by the Content Attribute to.
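
A validator might check that every type listed in Content Attribute Types is the attribute's Content Type or extends it.  The following sketch assumes a small hypothetical hierarchy (Person and Company extending Contact).  None of these type names or helpers come from the proposal itself.

```python
from typing import Dict, List, Optional

# Hypothetical hierarchy: each content type Key maps to the Key it extends.
EXTENDS: Dict[str, Optional[str]] = {
    "Content": None,
    "Entity": "Content",
    "Contact": "Content",
    "Person": "Contact",
    "Company": "Contact",
}


def extends_or_is(type_key: str, base_key: str) -> bool:
    """Return True if `type_key` is `base_key` or extends it, directly or not."""
    current: Optional[str] = type_key
    while current is not None:
        if current == base_key:
            return True
        current = EXTENDS.get(current)
    return False


def invalid_constraint_types(base_key: str, constraint_keys: List[str]) -> List[str]:
    """List the constraint types that fall outside the attribute's Content Type."""
    return [key for key in constraint_keys if not extends_or_is(key, base_key)]


print(invalid_constraint_types("Contact", ["Person", "Entity"]))  # ['Entity']
```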

Example

The following example illustrates a content model serialized in JSON.  @import pulls in an external Package and aliases the Package Key (used as a namespace) so the package can be referenced in this JSON, without the externally defined Package's Key.  If no alias is specified, the Package is imported into the current namespace.  If a Type attribute is not specified for an item, the parent attribute's base content type is assumed.  If a string is specified where an object is expected, the string will be used as a key to match on an object with the corresponding key of the attribute's content type (or any that inherit it).  This format is loosely based on cJSON.

{
  "type": "Content Model",
  "version": "1.0",
  "contents": [
    {"@import": {"URL": "https://comand.io/packages/types.js"}},
    {"@import": {
        "Key": "methods",
        "URL": "https://comand.io/packages/methods.js"
    }},
    {
      "Key": "contacts_model",
      "Type": "Package",
      "Title": "Contacts Model",
      "Namespace": "io_comand_contacts",
      "Content": [
        {
          "Key": "Color",
          "Type": "DataType",
          "Title": "Color",
          "Description": "Color in #RRGGBB or rgba(R,G,B,A) format.",
          "Primitive": "text",
          "SQL": "VARCHAR(32)"
        },
        {
          "Key": "Contact",
          "Type": "ContentType",
          "Extends": "Content",
          "Attributes": [
            {
              "Key": "Name",
              "Type": "DataAttribute",
              "DataType": "Text",
              "Options": ["Required", "SingleLine"]
            },
            {
              "Key": "DOB",
              "Type": "DataAttribute",
              "DataType": "Date"
            },
            {
              "Key": "FavColor",
              "Type": "DataAttribute",
              "Title": "Favorite Color",
              "DataType": "Color"
            },
            {
              "Key": "Parents",
              "Type": "ContentAttribute",
              "ContentType": "Contact",
              "Collection": true,
              "Reciprocal": {
                "Key": "Children",
                "Collection": true
              }
            }
          ]
        }
      ]
    }
  ]
}
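
To give a feel for how a consumer might read this format, the Python sketch below parses a trimmed-down copy of the example, records the imports with their optional aliases, and resolves a string reference to a locally defined object.  The index_model and resolve helpers are hypothetical.  Imported packages are not fetched, so a string that does not match a local Key is simply left as a string here.

```python
import json

# A shortened copy of the example above, kept small for illustration.
MODEL_JSON = """
{
  "type": "Content Model",
  "version": "1.0",
  "contents": [
    {"@import": {"URL": "https://comand.io/packages/types.js"}},
    {"@import": {"Key": "methods", "URL": "https://comand.io/packages/methods.js"}},
    {
      "Key": "contacts_model",
      "Type": "Package",
      "Content": [
        {"Key": "Color", "Type": "DataType"},
        {"Key": "Contact", "Type": "ContentType", "Extends": "Content",
         "Attributes": [
           {"Key": "FavColor", "Type": "DataAttribute", "DataType": "Color"}
         ]}
      ]
    }
  ]
}
"""


def index_model(model: dict):
    """Return (imports, items): import (alias, URL) pairs and items by Key."""
    imports = []  # alias is None when the package joins the current namespace
    items = {}
    for entry in model["contents"]:
        if "@import" in entry:
            spec = entry["@import"]
            imports.append((spec.get("Key"), spec["URL"]))
        elif entry.get("Type") == "Package":
            for item in entry.get("Content", []):
                items[item["Key"]] = item
    return imports, items


def resolve(reference, items):
    """Swap a string reference for the local object with that Key, if any."""
    if isinstance(reference, str):
        return items.get(reference, reference)
    return reference


model = json.loads(MODEL_JSON)
imports, items = index_model(model)
fav_color = items["Contact"]["Attributes"][0]
data_type = resolve(fav_color["DataType"], items)
print(data_type["Key"], data_type["Type"])  # Color DataType
```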

Extensions

In addition to the core Package detailed above, extensions (external Packages) can be imported to model additional aspects of content types and their attributes.  The following sections detail common extensions to the core.

Version Extension

Adds Variant and Version to model different variations and historic versions of content items in a standard way.

Variant

This content type is used to communicate multiple versions of content items, such as different language translations of a content item.  Since this is not required, it could be moved from the core to a Version extension.

Content items of this type should not be defined directly.  Instead, it serves as a base content type to be extended by other content types that define a variant type (aka dimension).  For example, a content type with the Key "Language" can be defined to extend Variant.  Content items of the Language type then represent the set of language variants available for content items (e.g. English, French and Spanish).  A content item can then specify that it is a French variant of an object by referencing the French content item of the Language content type from the Variants field within the Version.

  • Extends: Entity

Version

This content type is only required to communicate variants (e.g. translations) and/or revision information (creation/deletion/modification over time) of content items.  Since this is not required, it could be moved from the core to a Version extension.

  • Extends: Content
  • Content - A single (one-to-one) reference to a content item.
  • Variants - A collection (many-to-many) reference to zero or more variants the referenced object implements.  If no variants are specified, the Content is just a historic revision of the base content item.
  • Start - A timestamp that represents when this version of the content first came into existence.  It may represent the time the content item was created or when it was modified to become a new version.  No timestamp indicates the creation or modification time is not known.
  • End - A timestamp that represents when this version of the content expired.  It may represent the time the content item was removed or when it was modified and a new version came into existence.  No timestamp indicates the version is active (no subsequent modification or deletion has occurred).
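
The Start and End semantics above can be sketched with a small Python class.  The Version dataclass, its field names and the covers method are hypothetical.  Treating a missing Start as "in effect since the beginning of time" is an assumption, since the proposal only says the time is not known.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Version:
    content_key: str                   # Key of the referenced content item
    start: Optional[datetime] = None   # None: creation/modification time unknown
    end: Optional[datetime] = None     # None: version is still active

    def is_active(self) -> bool:
        """A version with no End timestamp has not been superseded or deleted."""
        return self.end is None

    def covers(self, moment: datetime) -> bool:
        """Return True if this version was in effect at `moment`."""
        started = self.start is None or self.start <= moment  # assumption for None
        not_ended = self.end is None or moment < self.end
        return started and not_ended


old = Version("contact_1", start=datetime(2019, 1, 1), end=datetime(2019, 6, 1))
current = Version("contact_1", start=datetime(2019, 6, 1))
print(old.is_active(), current.is_active())  # False True
```

Using a half-open interval (Start inclusive, End exclusive) means that when one version ends at the same instant the next begins, exactly one of them covers that instant.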

More Extensions

The following extensions will also be detailed here soon.

  • Defaults - Define default values for attributes.
  • Validations - Restrict collection attributes to a minimum and maximum number of items.  Restrict text attributes to a certain length and patterns with regular expressions.
  • Assets - Define content types to represent images, documents and other files that are often part of content models.
  • Methods - Define content type behaviors and the scripts that implement them, including more elaborate validations and more.
  • DCMI Terms - Definition of Dublin Core Metadata Initiative Terms, which can associate Content Types and Attributes with standardized terminology.  Models that specify DCMI terms can be more easily analyzed for related Content Types and Attributes, even when they do not share the same Key, Type or Package.

Photo by frank mckenna on Unsplash.
