AtomServer, Category Details

Chris Berry, Bryon Jacob. Updated 08/15/08

This document describes some specific details about dealing with AtomServer Categories including information about creating Categories and searching Feeds for Entries based on their Categories.

For a further, detailed description of the actual protocol, either
This document does not explain the underlying concepts behind AtomServer: REST, Atom, and OpenSearch. That information can be found in the AtomServer General Introduction document, and it is highly recommended that you read that document first.

Nor does this document explain the basics of XML, namespaces, syndicated feeds, the GET, POST, PUT, and DELETE requests in HTTP, or HTTP's concept of a "resource." For more information about those things, see the Additional resources section of this document.


General Information

Most data requires categorization to make it more manageable, especially as the data grows in size. Categorization is the grouping of data into collections based on some common attribute. Not surprisingly, the Atom specification provides a built-in mechanism for categorization.

Atom provides the concept of Category Documents, which contain lists of categories described using the "atom:category" element from the Atom Syndication Format [RFC4287]. Categories can also appear in Service Documents, where they describe the categories allowed in a Collection (see Section 8.3.5).

Category Documents are identified with the "application/atomcat+xml" media type (see Section 16).

The root of a Category Document is the "app:categories" element. An app:categories element can contain zero or more "atom:category" elements from the Atom namespace. A Category is simply a way to assign arbitrary attributes to an Entry. Using these Categories, the User can then group Entries into arbitrary collections.

<category> has one required attribute, term, and two optional attributes, scheme and label.

term identifies the category.

scheme identifies the categorization scheme via a URI. While scheme is optional within the Atom spec, AtomServer requires the scheme attribute. You can think of a scheme as a namespace. AtomServer currently does not accept schemes containing "/" characters, although "/" is a valid URI character. You are encouraged to create schemes using a "."-delimited hierarchy (e.g. "urn:foo.widgets.brands").

label provides a human-readable label for display. AtomServer does not currently support the "label" attribute in full.
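
The attribute rules above can be checked on the client before anything is sent to the server. The following is only a minimal sketch: the function name validate_category is hypothetical and is not part of AtomServer, and the server remains the final authority on what it accepts.

```python
def validate_category(scheme, term):
    """Check a (scheme, term) pair against the rules described above.

    Hypothetical client-side helper; it only mirrors the documented rules.
    """
    if not term:
        # term is the one attribute the Atom spec itself requires
        raise ValueError("a category requires a term")
    if not scheme:
        # scheme is optional in Atom, but AtomServer requires it
        raise ValueError("AtomServer requires a scheme")
    if "/" in scheme:
        # "/" is a valid URI character, but AtomServer does not accept it here
        raise ValueError("scheme must not contain '/': %r" % scheme)
    return scheme, term


print(validate_category("urn:foo.widgets.brands", "bar"))
```

Rejecting a bad scheme early keeps the failure close to the code that built it, rather than surfacing later as a server error.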

Example

<?xml version="1.0" ?>
<app:categories
    xmlns:app="http://www.w3.org/2007/app"
    xmlns="http://www.w3.org/2005/Atom">
  <category scheme="urn:foo.widgets.type" term="foo" />
  <category scheme="urn:foo.widgets.brands" term="bar" />
</app:categories>

This Category Document contains two categories, with the terms "foo" and "bar". Neither category uses the 'label' attribute defined in [RFC4287]. Each has its own 'scheme' attribute: "urn:foo.widgets.type" for "foo" and "urn:foo.widgets.brands" for "bar". Therefore, if the "foo" category were to appear in an Atom Entry or Feed Document, it would appear as:

<category scheme="urn:foo.widgets.type" term="foo" />
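
A client will usually want the (scheme, term) pairs rather than the raw XML. As a rough sketch, a Category Document like the one above can be read with Python's standard xml.etree.ElementTree module. The name read_categories is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# The Category Document from the example above.
CATEGORY_DOCUMENT = """<?xml version="1.0" ?>
<app:categories
    xmlns:app="http://www.w3.org/2007/app"
    xmlns="http://www.w3.org/2005/Atom">
  <category scheme="urn:foo.widgets.type" term="foo" />
  <category scheme="urn:foo.widgets.brands" term="bar" />
</app:categories>
"""


def read_categories(xml_text):
    """Return the (scheme, term) pairs found in a Category Document."""
    root = ET.fromstring(xml_text)
    # atom:category elements live in the Atom namespace, not the app namespace
    return [(c.get("scheme"), c.get("term"))
            for c in root.iter("{%s}category" % ATOM_NS)]


for scheme, term in read_categories(CATEGORY_DOCUMENT):
    print(scheme, term)
```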

Managing Categories

Managing categories within AtomServer does not follow the standard Atom technique for managing categories. In Atom you would create a Category for a given Entry by submitting it within that Entry during a PUT or POST. This was determined to be too burdensome for the Entry publishers, each of which would have to be fully "Category-aware" and would have to honor optimistic concurrency restrictions.

Instead, AtomServer has chosen a Separation of Concerns. Within AtomServer, Categories are a completely separate Resource, maintained in a virtual, parallel Workspace. The name of this Workspace is arbitrary; it is assigned within AtomServer's Spring configuration file. By convention, however, we use a "tags:" prefix for all Category Workspaces. Thus, the Categories for "widgets" are edited using the "tags:widgets" Workspace.

Interaction with Category Workspaces follows the same rules as interaction with any other Workspace. For Category Workspaces, however, one must always follow the optimistic concurrency restrictions.

Creating or Modifying Categories for a specific Entry

To create or modify the Categories that apply to a given Entry, send a PUT request and supply a standard Atom Categories document within the <content> element. Note that there is effectively no difference between an Update and an Insert, except that for Updates you must provide the revision identifier. And because the "Categories Workspace" is virtual, this revision number must be the one for the "real" Entry.

Let's look at an example. Imagine that you want to supply Categories for the Entry identified as /widgets/acme/123.xml/2. By the rules governing AtomServer Optimistic Concurrency, you must submit your Categories request to /tags:widgets/acme/123.xml/3. As you can see, we've created an implicit mapping between the "actual" Entry ("widgets/acme/123.xml") and the Category Entry ("tags:widgets/acme/123.xml").
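
The mapping in this example is mechanical, so it can be sketched in a few lines of Python. The function name category_put_path is hypothetical, and the sketch assumes paths of the form /workspace/collection/entry/revision and the conventional "tags:" prefix; a real deployment may configure a different Workspace name.

```python
def category_put_path(entry_path, prefix="tags:"):
    """Map an "actual" Entry path with its current revision to the
    Category Entry path to PUT to, using the next revision.

    Assumes the path has exactly four segments:
    workspace / collection / entry / revision.
    """
    workspace, collection, entry, revision = entry_path.strip("/").split("/")
    next_revision = int(revision) + 1
    return "/%s%s/%s/%s/%d" % (prefix, workspace, collection, entry, next_revision)


print(category_put_path("/widgets/acme/123.xml/2"))
```

A path with a different number of segments raises ValueError, which is deliberate: guessing at the layout would risk writing to the wrong revision.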

NOTE: The User must take great care when editing Categories. It is the User's responsibility to deal with all "merge conflicts". Put simply, when AtomServer receives a list of Categories to apply to a given Entry, it first deletes all existing Categories and then inserts the submitted list. Thus, the User must make certain that the list of Categories they submit is complete and preserves whatever Categories are still needed from a previous GET or PUT. While the rules of optimistic concurrency apply (you cannot write an Entry without supplying the correct next revision Id), these rules do not protect you from negligence.
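
Because a PUT replaces the whole list, a careful client merges its changes into the Categories it last read before submitting. The sketch below shows one way to do that. The name merge_categories is hypothetical, and Categories are modeled simply as (scheme, term) tuples.

```python
def merge_categories(existing, additions):
    """Return the complete list to submit: everything already on the
    Entry, plus any new (scheme, term) pairs, without duplicates."""
    merged = list(existing)
    for pair in additions:
        if pair not in merged:
            merged.append(pair)
    return merged


# Categories returned by a previous GET of the Entry
current = [("urn:foo.widgets.type", "foo"), ("urn:foo.widgets.brands", "bar")]

# Submitting only the new pair would silently drop "foo" and "bar",
# so the full merged list is what goes into the PUT.
to_submit = merge_categories(current, [("urn:foo.widgets.location", "US")])
print(to_submit)
```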

NOTE: The XML namespace qualifier (e.g. <entry xmlns="http://www.w3.org/2005/Atom">) is required by the AtomPub specification!

PUT /v1/tags:widgets/acme/123.xml/3

<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atomserver/v1/tags:widgets/acme/123</id>
   <content type="application/xml" >
     <app:categories xmlns:app="http://www.w3.org/2007/app"         
                     xmlns="http://www.w3.org/2005/Atom" >

         <category scheme="urn:foo.widgets.type" term="foo" />
         <category scheme="urn:foo.widgets.brands" term="bar" />
     </app:categories>

   </content>
</entry> 

The server responds:

200 OK

<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atomserver/v1/tags:widgets/acme/123</id>
   <updated>2007-08-06T22:58:11.030Z</updated>
   <published>2007-08-06T22:58:11.030Z</published>
   <title type="text">AtomServer Feed Entry</title>
   <content type="application/xml" >
     <app:categories xmlns:app="http://www.w3.org/2007/app"
                     xmlns="http://www.w3.org/2005/Atom" >
         <category scheme="urn:foo.widgets.type" term="foo" />
         <category scheme="urn:foo.widgets.brands" term="bar" />
     </app:categories>

   </content>
   <author><name>AtomServer Atom Service</name></author>
   <link href="/atomserver/v1/widgets/acme/123.xml/3" rel="edit" />
   <link href="/atomserver/v1/widgets/acme/123.xml" rel="self" />
</entry>

Note that the client does not need to provide any of the entry elements when creating an Entry except for <id>, which must be provided.
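
The request body shown above can also be generated rather than written by hand. This is a sketch using the standard xml.etree.ElementTree module. The function name build_categories_entry is an assumption, and the exact serialization (attribute order, whitespace) may differ from the hand-written example while remaining equivalent XML.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
APP_NS = "http://www.w3.org/2007/app"


def build_categories_entry(entry_id, categories):
    """Build the <entry> body for a Categories PUT.

    categories is a list of (scheme, term) pairs.
    """
    # Serialize Atom as the default namespace and app: with its usual prefix
    ET.register_namespace("", ATOM_NS)
    ET.register_namespace("app", APP_NS)

    entry = ET.Element("{%s}entry" % ATOM_NS)
    ET.SubElement(entry, "{%s}id" % ATOM_NS).text = entry_id
    content = ET.SubElement(entry, "{%s}content" % ATOM_NS,
                            {"type": "application/xml"})
    cats = ET.SubElement(content, "{%s}categories" % APP_NS)
    for scheme, term in categories:
        ET.SubElement(cats, "{%s}category" % ATOM_NS,
                      {"scheme": scheme, "term": term})
    return ET.tostring(entry, encoding="unicode")


body = build_categories_entry(
    "/atomserver/v1/tags:widgets/acme/123",
    [("urn:foo.widgets.type", "foo"), ("urn:foo.widgets.brands", "bar")],
)
print(body)
```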

Requesting the Categories for a specific entry

You don't have to do anything special to request the Categories associated with a specific, "actual" Entry, because all Categories associated with that Entry are automatically returned when you request that Entry.

So, for example, let's assume that you are requesting the acme Widget 123, and that we've associated Categories with this Entry as in the preceding example. To see it, you would send the following request to the server. Notice that we are requesting the "actual" Entry, and that the Entry's Categories are always returned for that request.

GET /v1/widgets/acme/123.xml

The server responds with:

200 OK

<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atom/v1/widgets/acme/123</id>
   <updated>2007-06-18T23:29:32.000Z</updated>
   <title type="text">Acme Widget (123)</title>
   <link xmlns="http://www.w3.org/2005/Atom" href="/atomserver/v1/widgets/acme/123.xml/4" rel="edit" />
   <category scheme="urn:foo.widgets.type" term="foo" />
   <category scheme="urn:foo.widgets.brands" term="bar" />

   <content type="application/xml" >
     <property xmlns="http://schemas.atomserver.org/widgets/v1/rev0" systemId="acme" id="123" inNetwork="false">
        .....
     </property>
   </content>

</entry>

Note that the Entry contains the two Categories we had assigned to it previously (i.e. foo and bar).
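
A client typically needs two things from such a response: the Categories themselves, and the edit link, whose trailing revision is needed for any later write. The following is a sketch only. The name read_entry is hypothetical, and SAMPLE_ENTRY is a trimmed copy of the response above.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Trimmed version of the Entry returned in the example above.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atom/v1/widgets/acme/123</id>
   <link href="/atomserver/v1/widgets/acme/123.xml/4" rel="edit" />
   <category scheme="urn:foo.widgets.type" term="foo" />
   <category scheme="urn:foo.widgets.brands" term="bar" />
</entry>"""


def read_entry(xml_text):
    """Return (categories, edit_href) for an Atom Entry.

    categories is a list of (scheme, term) pairs; edit_href is None
    when the Entry carries no rel="edit" link.
    """
    entry = ET.fromstring(xml_text)
    categories = [(c.get("scheme"), c.get("term"))
                  for c in entry.findall(ATOM + "category")]
    edit_href = None
    for link in entry.findall(ATOM + "link"):
        if link.get("rel") == "edit":
            edit_href = link.get("href")
    return categories, edit_href


categories, edit_href = read_entry(SAMPLE_ENTRY)
print(categories)
print(edit_href)
```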

Deleting the Categories for a specific entry

To delete the Categories associated with an existing Entry, send a DELETE request, using the Entry's edit URI (as provided by the server in the previous example).

If your firewall does not allow DELETE, then do an HTTP POST and set the method override header as follows:
 
X-HTTP-Method-Override: DELETE

The following example deletes all categories associated with the "actual" Entry widgets/acme/123.xml. Note that, again, we must follow the rules of AtomServer Optimistic Concurrency.

DELETE /v1/tags:widgets/acme/123.xml/4

The server responds:

200 OK 

If the deletion fails, then the server responds with an error code.
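
Both forms of the delete request can be prepared with Python's standard urllib.request module. This sketch only builds the request objects and does not send them. The host http://localhost:8080 and the function name build_delete_request are assumptions made for illustration.

```python
import urllib.request


def build_delete_request(url, use_override=False):
    """Build a DELETE request, or a POST carrying the method override
    header for networks that block DELETE."""
    if use_override:
        request = urllib.request.Request(url, data=b"", method="POST")
        request.add_header("X-HTTP-Method-Override", "DELETE")
    else:
        request = urllib.request.Request(url, method="DELETE")
    return request


# Hypothetical host; the path is the one used in the example above.
url = "http://localhost:8080/v1/tags:widgets/acme/123.xml/4"
plain = build_delete_request(url)
tunneled = build_delete_request(url, use_override=True)
print(plain.get_method(), tunneled.get_method())
```

Passing either object to urllib.request.urlopen would send it; an error status from the server is then raised as urllib.error.HTTPError.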

Requesting all Categories associated with a Workspace or Collection

Atom provides a standard way to ask for all Categories associated with either a Workspace or with a specific Collection within a Workspace. This is provided within the standard "Service Document". The Atom Service Document is how a service tells you what is available. It is the introspection document. Each service represents one or more workspaces, which represent one or more collections. The collections contain the individual resources (Entries).

By definition, a Service Document may contain a "Categories Document" for each Collection. You can request either all available Workspaces (by making the GET request to the "base URL" - e.g. "/v1" ) or a particular Workspace (by making the GET request to the "base Workspace URL" - e.g. "/v1/widgets").

GET /v1/

might return:

<?xml version='1.0' encoding='UTF-8'?>
<service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title type="text">widgets</atom:title>
    <collection href="widgets/acme/">
       <atom:title type="text">acme</atom:title>
       <accept>application/atom+xml;type=entry</accept>
       <categories>
          <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.bar" term="term1" />
          <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.foo" term="term1" />
          <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.foo" term="term3" />
          <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.foo" term="term2" />
       </categories>
    </collection>
  </workspace>
  <workspace>
     <atom:title type="text">other1</atom:title>
  </workspace>
  <workspace>
    <atom:title type="text">other2</atom:title>
  </workspace>
</service>

Or you can request a particular Workspace:

GET /v1/widgets

might return:

<?xml version='1.0' encoding='UTF-8'?>
<service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace><atom:title type="text">widgets</atom:title>
    <collection href="widgets/acme/">
      <atom:title type="text">acme</atom:title>
      <accept>application/atom+xml;type=entry</accept>
      <categories>
        <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.bar" term="term1" />
        <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.foo" term="term1" />
        <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.foo" term="term3" />
        <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.foo" term="term2" />
      </categories>
    </collection>
  </workspace>
</service>
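
To make use of a Service Document like the ones above, a client might collect the Categories listed for each Collection. The following sketch does so with the standard xml.etree.ElementTree module. The name categories_by_collection is hypothetical, and SAMPLE_SERVICE is a shortened copy of the response above.

```python
import xml.etree.ElementTree as ET

APP = "{http://www.w3.org/2007/app}"
ATOM = "{http://www.w3.org/2005/Atom}"

# Shortened version of the Service Document shown above.
SAMPLE_SERVICE = """<service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title type="text">widgets</atom:title>
    <collection href="widgets/acme/">
      <atom:title type="text">acme</atom:title>
      <categories>
        <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.bar" term="term1" />
        <category xmlns="http://www.w3.org/2005/Atom" scheme="urn:widgets.foo" term="term1" />
      </categories>
    </collection>
  </workspace>
  <workspace>
    <atom:title type="text">other1</atom:title>
  </workspace>
</service>"""


def categories_by_collection(service_xml):
    """Map each Collection href to its list of (scheme, term) pairs."""
    service = ET.fromstring(service_xml)
    result = {}
    for collection in service.iter(APP + "collection"):
        result[collection.get("href")] = [
            (c.get("scheme"), c.get("term"))
            for c in collection.iter(ATOM + "category")
        ]
    return result


print(categories_by_collection(SAMPLE_SERVICE))
```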

Modifying a Single Category for a specific Entry

With a standard Atom Categories document, the Categories for an Entry can only be modified as a whole. Instead of replacing all Categories of an Entry, a Single Category document can be used to modify a single Category within that set.

A Single Category document is an extension of the Atom Categories document that adds a namespaced element, category-op. The category-op element specifies the type of operation to perform with the Category in the document. Currently AtomServer supports the "modify" operation, which changes the Categories of the Entry and allows modification of one Category per document.

To modify a single Category that applies to a given Entry, send a PUT request and supply a Single Category document within the <content> element.

A category element in the Single Category document must specify modifyType in addition to the scheme, term, and optional label. modifyType is a namespaced attribute that describes what to do with the Category in the document. A Category can be inserted into the existing Categories, its term can be updated to a new value, or it can be deleted from the existing Categories. The corresponding values for modifyType are insert, update, and delete.

For the update operation, an additional namespaced attribute called oldTerm must be specified in the category element. oldTerm specifies the existing term of the Category with the same scheme. The term of the Category that matches both the scheme and oldTerm attribute values will be updated to the value of the term attribute. If a Category matches the scheme but not the oldTerm, AtomServer treats this as an optimistic concurrency error and returns HTTP status code 409 (Conflict). If no Category matches the scheme, AtomServer returns status code 400 (Bad Request).
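
The update rules are easier to follow as code. The sketch below is a client-side model of the behaviour described above, not AtomServer's implementation. The name apply_category_op is hypothetical. The status codes for update come from the text; how insert treats a duplicate and how delete treats a missing Category are assumptions, since this document does not specify them.

```python
def apply_category_op(categories, scheme, term, modify_type, old_term=None):
    """Model a Single Category operation on a list of (scheme, term) pairs.

    Returns (status_code, resulting_categories).
    """
    current = list(categories)
    if modify_type == "insert":
        # Assumption: inserting an existing pair leaves the list unchanged
        if (scheme, term) not in current:
            current.append((scheme, term))
        return 200, current
    if modify_type == "delete":
        # Assumption: deleting a missing pair is treated as a no-op
        return 200, [c for c in current if c != (scheme, term)]
    if modify_type == "update":
        if not any(s == scheme for s, _ in current):
            return 400, current  # no Category matches the scheme
        if (scheme, old_term) not in current:
            return 409, current  # scheme matches, oldTerm does not
        return 200, [(scheme, term) if c == (scheme, old_term) else c
                     for c in current]
    raise ValueError("unknown modifyType: %r" % modify_type)


cats = [("urn:foo.widgets.type", "foo"), ("urn:foo.widgets.location", "US")]
print(apply_category_op(cats, "urn:foo.widgets.location", "Canada", "update", "UK"))
print(apply_category_op(cats, "urn:foo.widgets.location", "Canada", "update", "US"))
```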

Examples of using a Single Category document follow.

1. Insert a new Category into the Entry with Id 12345.


PUT /v1/tags:widgets/acme/12345.xml/1

<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atomserver/v1/tags:widgets/acme/12345</id>
   <content type="application/xml" >
     <app:categories xmlns:app="http://www.w3.org/2007/app"
                     xmlns="http://www.w3.org/2005/Atom"
                     xmlns:catop="http://atomserver.org/namespaces/1.0/category" >

         <catop:category-op type="modify"/>
         <category scheme="urn:foo.widgets.location" term="US" catop:modifyType="insert" />
     </app:categories>

   </content>
</entry> 

The server responds:

200 OK

<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atomserver/v1/tags:widgets/acme/12345</id>
   <updated>2010-08-06T22:58:11.030Z</updated>
   <published>2010-08-06T22:58:11.030Z</published>
   <title type="text">AtomServer Feed Entry</title>
   <content type="application/xml" >
     <app:categories xmlns:app="http://www.w3.org/2007/app"
                     xmlns="http://www.w3.org/2005/Atom" >
         <category scheme="urn:foo.widgets.type" term="foo" />
         <category scheme="urn:foo.widgets.brands" term="bar" />
         <category scheme="urn:foo.widgets.location" term="US" />
     </app:categories>

   </content>
   <author><name>AtomServer Atom Service</name></author>
   <link href="/atomserver/v1/widgets/acme/12345.en.xml/1" rel="edit" />
   <link href="/atomserver/v1/widgets/acme/12345.en.xml" rel="self" />
</entry>

2. a) Update a Category of the Entry with Id 12345 to a new term, using an incorrect oldTerm.


PUT /v1/tags:widgets/acme/12345.xml/2

<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atomserver/v1/tags:widgets/acme/12345</id>
   <content type="application/xml" >
     <app:categories xmlns:app="http://www.w3.org/2007/app"
                     xmlns="http://www.w3.org/2005/Atom"
                     xmlns:catop="http://atomserver.org/namespaces/1.0/category" >

         <catop:category-op type="modify"/>
         <category scheme="urn:foo.widgets.location" term="Canada" catop:modifyType="update" catop:oldTerm="UK" />
     </app:categories>

   </content>
</entry> 

The server responds:

409 Conflict

<error xmlns="http://incubator.apache.org/abdera">
   <code>409</code>
   <message>Optimisitic Concurrency Error:: /atomserver/v1/tags:widgets/acme/12345.xml/2</message>
   <link xmlns="http://www.w3.org/2005/Atom" href="/atomserver/v1/widgets/acme/12345.xml/2" rel="edit" />
</error>

2. b) Update a Category of the Entry with Id 12345 to a new term, using the correct oldTerm.


PUT /v1/tags:widgets/acme/12345.xml/2

<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atomserver/v1/tags:widgets/acme/12345</id>
   <content type="application/xml" >
     <app:categories xmlns:app="http://www.w3.org/2007/app"
                     xmlns="http://www.w3.org/2005/Atom"
                     xmlns:catop="http://atomserver.org/namespaces/1.0/category" >

         <catop:category-op type="modify"/>
         <category scheme="urn:foo.widgets.location" term="Canada" catop:modifyType="update" catop:oldTerm="US" />
     </app:categories>

   </content>
</entry> 

The server responds:

200 OK

<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atomserver/v1/tags:widgets/acme/12345</id>
   <updated>2010-08-06T22:58:11.030Z</updated>
   <published>2010-08-06T22:58:11.030Z</published>
   <title type="text">AtomServer Feed Entry</title>
   <content type="application/xml" >
     <app:categories xmlns:app="http://www.w3.org/2007/app"
                     xmlns="http://www.w3.org/2005/Atom" >
         <category scheme="urn:foo.widgets.type" term="foo" />
         <category scheme="urn:foo.widgets.brands" term="bar" />
         <category scheme="urn:foo.widgets.location" term="Canada" />
     </app:categories>

   </content>
   <author><name>AtomServer Atom Service</name></author>
   <link href="/atomserver/v1/widgets/acme/12345.en.xml/2" rel="edit" />
   <link href="/atomserver/v1/widgets/acme/12345.en.xml" rel="self" />
</entry>

3. Delete an existing Category from the Entry with Id 12345.


PUT /v1/tags:widgets/acme/12345.xml/3

<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atomserver/v1/tags:widgets/acme/12345</id>
   <content type="application/xml" >
     <app:categories xmlns:app="http://www.w3.org/2007/app"
                     xmlns="http://www.w3.org/2005/Atom"
                     xmlns:catop="http://atomserver.org/namespaces/1.0/category" >

         <catop:category-op type="modify"/>
         <category scheme="urn:foo.widgets.location" term="Canada" catop:modifyType="delete" />
     </app:categories>

   </content>
</entry> 

The server responds:

200 OK

<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns="http://www.w3.org/2005/Atom">
   <id>/atomserver/v1/tags:widgets/acme/12345</id>
   <updated>2010-08-06T22:58:11.030Z</updated>
   <published>2010-08-06T22:58:11.030Z</published>
   <title type="text">AtomServer Feed Entry</title>
   <content type="application/xml" >
     <app:categories xmlns:app="http://www.w3.org/2007/app"
                     xmlns="http://www.w3.org/2005/Atom" >
         <category scheme="urn:foo.widgets.type" term="foo" />
         <category scheme="urn:foo.widgets.brands" term="bar" />
     </app:categories>

   </content>
   <author><name>AtomServer Atom Service</name></author>
   <link href="/atomserver/v1/widgets/acme/12345.en.xml/3" rel="edit" />
   <link href="/atomserver/v1/widgets/acme/12345.en.xml" rel="self" />
</entry>
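
The Single Category documents used in the examples above could be generated along the following lines. This is a sketch: the function name build_single_category_doc is hypothetical, it produces only the app:categories element that goes inside <content>, and the serialized attribute order may differ from the hand-written examples.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
APP_NS = "http://www.w3.org/2007/app"
CATOP_NS = "http://atomserver.org/namespaces/1.0/category"


def build_single_category_doc(scheme, term, modify_type, old_term=None):
    """Build the app:categories element of a Single Category document."""
    if modify_type not in ("insert", "update", "delete"):
        raise ValueError("modifyType must be insert, update, or delete")
    if modify_type == "update" and old_term is None:
        raise ValueError("update requires oldTerm")

    ET.register_namespace("", ATOM_NS)
    ET.register_namespace("app", APP_NS)
    ET.register_namespace("catop", CATOP_NS)

    cats = ET.Element("{%s}categories" % APP_NS)
    # "modify" is the only operation type this document describes
    ET.SubElement(cats, "{%s}category-op" % CATOP_NS, {"type": "modify"})
    attrs = {"scheme": scheme, "term": term,
             "{%s}modifyType" % CATOP_NS: modify_type}
    if old_term is not None:
        attrs["{%s}oldTerm" % CATOP_NS] = old_term
    ET.SubElement(cats, "{%s}category" % ATOM_NS, attrs)
    return ET.tostring(cats, encoding="unicode")


print(build_single_category_doc("urn:foo.widgets.location", "Canada",
                                "update", old_term="US"))
```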


Category Queries

The usefulness of Categories lies, of course, in the ability to query for Feeds (i.e. lists of Entries) that have particular Categories associated with them. Since AtomServer is modeled after GData, we've adopted a scheme for "Category Queries" similar to GData's. Instead of using Query Parameters to specify Category search parameters (e.g. foo/bar/123?category=foo&category=bar), we've adopted GData's URL scheme. We have not adopted it verbatim, however, because the GData scheme employs several characters that are invalid in URLs and that the User must explicitly encode (e.g. "|" must be encoded as %7C). Rather than do that to our Users, and for the sake of readable URLs, we've adopted a slightly different syntax.

To retrieve a Feed of Entries that match a particular Category, say the "bar" Category we implemented earlier, you would submit a request like this:

GET /v1/widgets/acme/-/(urn:foo.widgets.brands)bar

where
        /-/                       is the URL delimiter which indicates that a Category query follows.
        (urn:foo.widgets.brands)  is the Category scheme. Schemes are delimited by parentheses: ( and ).
        bar                       is the Category term.

You can "chain" together a series of "ANDs" by simply requesting a series of Categories. For example:

GET /v1/widgets/acme/-/(urn:foo.widgets.brands)bar/(urn:foo.widgets.type)foo

is the equivalent of asking "give me a Feed of all widgets that have Category {urn:foo.widgets.brands, bar} AND {urn:foo.widgets.type, foo}".
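
Building such a URL path from a list of Categories is straightforward. Here is a minimal sketch; the function name category_query_path is hypothetical, and Categories are again modeled as (scheme, term) tuples.

```python
def category_query_path(collection_path, categories):
    """Build a Category query path in which every listed Category
    must match (the implicit AND described above)."""
    segments = ["(%s)%s" % (scheme, term) for scheme, term in categories]
    return collection_path.rstrip("/") + "/-/" + "/".join(segments)


print(category_query_path(
    "/v1/widgets/acme",
    [("urn:foo.widgets.brands", "bar"), ("urn:foo.widgets.type", "foo")],
))
```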

Complex Category Queries

Category queries also support arbitrarily complex boolean expressions using explicit AND and OR operators. To avoid ambiguity in URL processing, these boolean operators are expressed in Prefix Notation, and the operators are binary only, meaning that each one takes exactly two operands.

You are probably more familiar with Infix Notation, in which the operator is placed between the operands. In Prefix Notation, the operator is placed before its operands.

    INFIX:  x AND y
    PREFIX:  AND x y

The main reason for using Prefix Notation is that it removes any ambiguity between operators that would otherwise need to be resolved with parentheses. For example:

    INFIX:  (x AND y) OR z
    PREFIX:  OR AND x y z

    INFIX:  x AND (y OR z)
    PREFIX:  AND x OR y z

The two expressions in Infix Notation are only distinguishable by their parentheses - if you take them away, it is not clear which one you mean, whereas in the Prefix Notation, there is only one way to interpret each of the expressions, with no parentheses needed for grouping sub-expressions.
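
The claim that Prefix Notation needs no parentheses can be demonstrated in a few lines. In this sketch an expression is modeled as either a plain operand string or a nested tuple (operator, left, right); this representation and the names to_prefix and parse_prefix are assumptions made for illustration.

```python
OPERATORS = ("AND", "OR")


def to_prefix(expr):
    """Write an expression tree in Prefix Notation."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    return "%s %s %s" % (op, to_prefix(left), to_prefix(right))


def parse_prefix(tokens):
    """Rebuild the expression tree from prefix tokens.

    Each operator consumes exactly two operands, so there is only
    one possible reading of any well-formed token list.
    """
    tokens = list(tokens)

    def parse(pos):
        token = tokens[pos]
        if token in OPERATORS:
            left, pos = parse(pos + 1)
            right, pos = parse(pos)
            return (token, left, right), pos
        return token, pos + 1

    tree, end = parse(0)
    if end != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return tree


first = ("OR", ("AND", "x", "y"), "z")    # (x AND y) OR z
second = ("AND", "x", ("OR", "y", "z"))   # x AND (y OR z)
print(to_prefix(first))
print(to_prefix(second))
```

Parsing the printed strings back with parse_prefix recovers the original trees, which is exactly the round trip that Infix Notation cannot make without parentheses.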

In a URL, if you want to get all widgets in brand "bar" that have a type of "foo" or "condo", you would say:

GET /v1/widgets/acme/-/AND/(urn:foo.widgets.brands)bar/OR/(urn:foo.widgets.type)foo/(urn:foo.widgets.type)condo

There is, as discussed at the beginning of this section, an implicit AND at the top level.  That means that you can simplify this query to:

GET /v1/widgets/acme/-/(urn:foo.widgets.brands)bar/OR/(urn:foo.widgets.type)foo/(urn:foo.widgets.type)condo

If you want to do an AND or an OR of more than two things, you simply chain the operators together, for example:

    x OR y OR z  in INFIX becomes OR x OR y z in PREFIX

Or in a URL, if you want all "foo", "condo", or "house" widgets:

GET /v1/widgets/acme/-/OR/(urn:foo.widgets.type)foo/OR/(urn:foo.widgets.type)condo/(urn:foo.widgets.type)house
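
Chaining by hand gets error-prone as queries grow, so a client may prefer to generate these paths. The sketch below folds any number of operands into nested binary operators and then writes the URL segments. The names chain, query_segments, and complex_query_path are hypothetical, a Category is modeled as a (scheme, term) tuple, and an operator node as an (operator, left, right) tuple.

```python
def chain(operator, operands):
    """Fold two or more operands into nested binary expressions,
    e.g. OR x OR y z for three operands."""
    operands = list(operands)
    if len(operands) < 2:
        raise ValueError("an operator needs at least two operands")
    expr = operands[-1]
    for operand in reversed(operands[:-1]):
        expr = (operator, operand, expr)
    return expr


def query_segments(expr):
    """Return the URL segments for an expression, in prefix order."""
    if len(expr) == 2:                      # a (scheme, term) Category
        scheme, term = expr
        return ["(%s)%s" % (scheme, term)]
    operator, left, right = expr            # an operator node
    return [operator] + query_segments(left) + query_segments(right)


def complex_query_path(collection_path, expr):
    return collection_path.rstrip("/") + "/-/" + "/".join(query_segments(expr))


TYPE = "urn:foo.widgets.type"
any_type = chain("OR", [(TYPE, "foo"), (TYPE, "condo"), (TYPE, "house")])
print(complex_query_path("/v1/widgets/acme", any_type))
```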



Additional resources

You may find the following third-party documents useful:

    * Overview of Atom from IBM
    * HTTP 1.1 method definitions; specification for GET, POST, PUT, and DELETE
    * HTTP 1.1 status code definitions
    * Atom Syndication Reference (from Atom-enabled)
    * Getting to know the Atom Publishing Protocol (from IBM)
    * IBM Tip: Organize Documents with Atom Categories