DSS Specifications 0.7

 

What is DSS

DSS is a format for sharing data on the web. It lets you publish data as an XML feed, subscribe to someone else's data feed, post back your modifications, and handle conflicts.

The term DSS is an acronym for Data Syndication Services.

 

Sample feed

You can view the sample DSS feed here.

 

Description

The DSS format is designed to be easy to use. A typical data sharing scenario where DSS is useful looks like this: say you have a website for cooking recipes. Using DSS, the website can publish new recipes and updates as a feed that users consume.

The users of this recipe website may also want to make modifications to the recipes they have received. Through DSS, they can send their modifications back to the website via interfaces such as a Web Service or XML-RPC. The website then distributes the modifications to all other users. However, instead of overwriting other users' data, a modification shows up as a conflict on a data item. Other users can accept or reject the modification, or leave the conflict unresolved indefinitely. This example highlights the kind of collaboration among users that DSS makes possible. DSS is equally usable in a decentralized P2P topology or for sending updates through email.

 

The Real World

DSS aims to solve real-world problems rather than be an abstract theoretical specification. To that end, DSS includes a few technology-specific recommendations that make it usable right out of the box.

Paged Feeds

Real-world business databases can hold thousands of records, and exposing that many records at once in a DSS XML feed is impractical without paged feeds. The DSS specification recommends a way to implement paged feeds, described later in this document.

Reference Application and Platform API

To stay in sync with real-world applications, the DSS specification is released with a fully functional end-to-end application that demonstrates all of its features.

The reference application [source] is a recipe manager that lets you manage your cooking recipes. You can subscribe to the sample recipe feed and receive recipes from the website. Once you have downloaded new recipes through the DSS feed, you can view and modify them. When you are done, you can send your updates back to the website, where they become immediately available to all other users who check the feed.

The application also demonstrates peer-to-peer synchronization through remoting, and shows how you can send your updates to other users simply through email.

All of the functionality for processing DSS feeds, including the sync engine, is encapsulated in a .NET library called Blank (a Java package is coming soon). The single Blank.DLL provides the platform API for working with DSS and synchronization.

 

DSS Schema Description

The following is a brief description of each element and attribute in a DSS feed. It is highly recommended that you view the sample DSS feed, as many elements are self-explanatory.

<dss> element

The DSS document starts with the root element <dss>. The <dss> element can have an optional attribute called version, whose default value is 0.7.

 

<dataset> element

This is an optional sub-element of <dss>.

The <dataset> element roughly corresponds to a physical database instance.

In a feed, for instance, one <dataset> element may contain data about recipes, while another may contain data about concert dates in New York. Typically, only one <dataset> element is recommended within the <dss> element.

Optional attributes

Attribute | Description | Example

globalGuid | Uniquely and globally identifies this dataset. An application may, for instance, use this value to decide where to store the data in this dataset each time it reads the feed. It is recommended that this value be a globally unique URI throughout the lifetime of the feed. | DailyCooking.com/Recipes

title | User-friendly title for this dataset. | Recipe Updates From DailyCooking!

endpointGuid | Identifies the endpoint that publishes this feed. Typically this value is used to determine when the feed was last read from this endpoint. If the endpoint supports paged feeds (the ability to read the feed in chunks of items), this value may be used to decide when to stop requesting more pages. | http://www.DailyCooking.com/schemas/recipefeed

itemSchema | Identifies the logical application schema for this dataset; this is not the schema for the XML markup. It can be any string value that the publisher and subscriber mutually agree on, including a URI. | http://www.DailyCooking.com/schemas/recipefeed

since | All items in this feed were updated on or after this date-time value. If the publisher supports feed pages, the subscriber may use this value to request the next feed page. | 2005-12-06T11:41:41.4460000-08:00

until | All items in this feed were updated on or before this date-time value. | 2005-12-06T11:41:41.4560000-08:00

 

<item> element

This is an optional sub-element of <dataset>.

Each item element corresponds to an instance of some entity. For example, in the recipe feed there would be one <item> element for each recipe.

In the relational database world, the <item> element roughly corresponds to a row in a table.

Optional attributes

Attribute | Description | Example

guid | Uniquely identifies this item among all existing items in the dataset. | 80eee2dc-c29c-41aa-814f-dbb0e79d9062

type | A user-defined value that may identify the type of data this item contains. An application may use this value along with the itemSchema attribute of the <dataset> element to learn more about the type of data in this item. In the relational database world, it roughly corresponds to the name of a table in a database. | Recipe

when | Time stamp indicating when this item was last updated. In a DSS feed, items are typically sequenced in descending order of modification time (i.e. newest items first). | 2005-12-06T11:41:41.4560000-08:00
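
As a rough illustration of how the <dss>, <dataset> and <item> elements nest, the following Python sketch reads a small feed with the standard library XML parser. The feed fragment is hypothetical: it is assembled from the example values in the tables above rather than copied from the official sample feed, and it assumes the elements carry no XML namespace. The function name list_items is likewise only illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment built from the example values in this document;
# it is not the official sample feed and assumes no XML namespace.
SAMPLE_FEED = """
<dss version="0.7">
  <dataset globalGuid="DailyCooking.com/Recipes"
           title="Recipe Updates From DailyCooking!"
           itemSchema="http://www.DailyCooking.com/schemas/recipefeed">
    <item guid="80eee2dc-c29c-41aa-814f-dbb0e79d9062"
          type="Recipe"
          when="2005-12-06T11:41:41.4560000-08:00" />
  </dataset>
</dss>
"""


def list_items(feed_xml):
    """Return (version, items), with one dict per <item> found in the feed."""
    root = ET.fromstring(feed_xml.strip())
    # The version attribute is optional; 0.7 is the documented default.
    version = root.get("version", "0.7")
    items = []
    for dataset in root.findall("dataset"):
        for item in dataset.findall("item"):
            items.append({
                "dataset": dataset.get("globalGuid"),
                "guid": item.get("guid"),
                "type": item.get("type"),
                "when": item.get("when"),
            })
    return version, items
```

Calling list_items(SAMPLE_FEED) yields the feed version together with a list holding one entry for the single recipe item.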

 

<property> sub-element of <item>

Each property element contains the value for a certain aspect of an item. For example, the item for a Chicken Tikka Masala recipe could have properties such as Title, Ingredients, CookingSteps, CookingTime, and so on.

In the relational database world, the <property> element roughly corresponds to a column in a table.

Optional attributes

Attribute | Description | Example

name | Uniquely identifies this property within an item. | Ingredients

currentValueBy | A property may contain values written by different authors. For example, say the recipe for Hasparats has a property called Ingredients, and Dave and Leslie each author their own value for it. DSS retains both values rather than letting one overwrite the other. User interface elements such as a text box can show only one value, so the currentValueBy attribute specifies whose value is treated as the currently agreed-upon one, and therefore whose value to show, while the conflict is still being resolved. | dave@example.com

by | Author who last changed the currentValueBy attribute. | dave@example.com

when | The date-time when the currentValueBy attribute was last changed. | 2005-12-06T11:41:41.4560000-08:00

revision | Incremented each time currentValueBy changes. | 12

 

<value> sub-element of <property>

Each property can have a value. When multiple authors each set their own value, there is one <value> element per author.

For example, the property named CookingTime may contain one <value> element for the author Dave with the value 12 minutes and another <value> element for Leslie set to 20 minutes.

In the relational database world, the <value> element roughly corresponds to the value in a particular row and column of a table, except that a relational database holds only one value per (row, column) combination, while DSS retains the value written by each author for that (row, column).

Optional attributes

Attribute | Description | Example

deleted | true if this value has been deleted by its author. Note that the <value> element itself is never removed, even when the value has been deleted. The default value is false. | false

by | Author to whom this <value> element belongs. | dave@example.com

when | The date-time when the value was last changed. | 2005-12-06T11:41:41.4560000-08:00

revision | Incremented by one each time the author changes their value. | 5

resolvedWhen | A date-time value indicating when the author of this value last performed conflict resolution. | 2005-10-06T11:41:41.4560000-08:00

resolvedReason | If the author gave a reason when performing the conflict resolution, this attribute contains it. | I checked it, 12 minutes is the correct cooking time!
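
To make the relationship between <property>, <value> and currentValueBy more concrete, here is a minimal Python sketch. It rests on several assumptions that this document does not confirm: the value content is carried as the text of the <value> element, a property counts as "in conflict" when more than one non-deleted value exists, and deleted values are never displayed. The address leslie@example.com and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical property fragment. Storing the content as element text is an
# assumption; leslie@example.com is an invented address for illustration.
PROPERTY_XML = """
<property name="CookingTime" currentValueBy="dave@example.com"
          by="dave@example.com" when="2005-12-06T11:41:41.4560000-08:00"
          revision="12">
  <value by="dave@example.com" when="2005-12-06T11:41:41.4560000-08:00"
         revision="5">12 minutes</value>
  <value by="leslie@example.com" when="2005-12-06T11:41:41.4560000-08:00"
         revision="2">20 minutes</value>
</property>
"""


def live_values(prop):
    """Return the <value> elements whose author has not deleted them."""
    return [v for v in prop.findall("value")
            if v.get("deleted", "false") != "true"]


def current_value(prop):
    """Return the text of the value written by the currentValueBy author."""
    chosen_author = prop.get("currentValueBy")
    for value in live_values(prop):
        if value.get("by") == chosen_author:
            return value.text
    return None  # the chosen author has no live value


def in_conflict(prop):
    """Assumed definition: more than one author holds a live value."""
    return len(live_values(prop)) > 1
```

A user interface could show current_value(prop) in its text box and use in_conflict(prop) to decide whether to offer the other authors' values for review.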

 

<syncMark> element

The DSS specification also introduces a concept called SyncMark that lets endpoints figure out where they last left off while reading the feed. This is an optional feature and hence an optional element.

For example, say that on Monday you received updates from a DSS feed and brought your local endpoint up to date. When you read the feed four days later, you want to find the item in the feed after which all remaining updates were already included in Monday's update, so that you do not waste time requesting more pages.

The syncMark element is simply a stamp placed by the publisher. The stamp is an integer value contained in the mark attribute. Each time subscribers read a feed, they note down the mark for that endpoint. The next time they read the feed, they stop processing items once they find the mark they noted last time.

Required attributes

Attribute | Description | Example

endpointGuid | Uniquely identifies the endpoint to which this sync mark belongs. | 80eee2dc-c29c-41aa-814f-dbb0e79d9062

mark | An integer value that decrements along the feed. | 3
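
The subscriber-side bookkeeping described above might look like the following Python sketch. It works on already parsed (item guid, mark) pairs in feed order, because this document does not state where the <syncMark> element sits relative to <item>; pairing one mark with each item is an assumption made only for this example, as is the function name read_new_items.

```python
from typing import Iterable, List, Optional, Tuple


def read_new_items(
    entries: Iterable[Tuple[str, int]],
    last_seen_mark: Optional[int],
) -> Tuple[List[str], Optional[int]]:
    """Collect item guids until the previously noted mark is reached.

    entries holds (item_guid, mark) pairs in feed order, newest first, so
    the marks decrement along the feed. Pairing a mark with every item is
    an assumption of this sketch.
    """
    new_items: List[str] = []
    newest_mark = last_seen_mark
    for guid, mark in entries:
        if last_seen_mark is not None and mark <= last_seen_mark:
            break  # everything from here on was covered by the last read
        new_items.append(guid)
        if newest_mark is None or mark > newest_mark:
            newest_mark = mark
    return new_items, newest_mark
```

The second return value is the mark to note down for this endpoint before the next read. On a first read, pass None as last_seen_mark to process the whole feed.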

 

<related> element

This optional element describes how to retrieve feed items in pages instead of all at once. It also describes how to post the feed back to the publisher. For example:

  <related link="http://www.DailyCooking.com/dss/feed.ashx?after={0}" type="getData" />

By default, the DSS feed might include, for example, only the 10 most recently modified items. The fragment above says that to retrieve more feed items, you access the URL with the query parameter after set to the modified date of the last item in the feed.

  <related link="http://www.DailyCooking.com/BlankWebService.asmx" type="putData" />

The fragment above says that to post data back to the server, you use the Web Service entry point.

Required attributes

Attribute | Description | Example

link | URI that may be used for the purpose specified by type. | http://www.DailyCooking.com/dss/feed.ashx?after={0}

type | Currently allows two values: getData and putData. getData specifies the URL that may be used to access the feed in pages. putData defines the entry point to the Web Service, which must have the interface defined here. | getData
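
As a sketch of how a subscriber might turn a getData link into the URL of the next page, the following Python function substitutes the modified date of the last item for the {0} placeholder. Two points are assumptions rather than part of this specification: that <related> appears as a child of <dataset>, and that the date-time value should be percent-encoded before it is placed in the query string. The function name next_page_url is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def next_page_url(dataset, last_item_when):
    """Build the URL of the next feed page from a getData <related> link.

    dataset is a parsed <dataset> element; treating <related> as its child
    is an assumption of this sketch. Returns None if no getData link exists.
    """
    for related in dataset.findall("related"):
        if related.get("type") == "getData":
            template = related.get("link", "")
            # Percent-encoding the timestamp is an assumption; a publisher
            # may expect the raw value instead.
            return template.replace("{0}", quote(last_item_when, safe=""))
    return None
```

For instance, with the getData link shown above and a last item modified at 2005-12-06T11:41:41.4460000-08:00, the function returns the feed.ashx URL with that date-time percent-encoded in the after parameter.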

 

Synchronization algorithm

All endpoints should use the following algorithm to arrive at globally consistent data:

For each <item>, <property> and <value> element received in a feed, we look up the local data store and decide whether to keep the local value or the value received in the feed. The value we keep is the "winner" and the value we discard is the "loser".

When we receive an <item> with a guid that does not exist in our local data store, we simply add it to our data store.

When we receive an item that already exists in our data store, we look up each <property> of the item and check whether we have it. If we do not, we simply add that property to our data store. Next, we look up each value associated with the <property>. If a value does not exist in our local data store, we simply add it. If it does exist, we compare the revision, when and by attribute values, in that order of preference: the element with the highest revision wins, ties are broken by the highest when, and remaining ties by the highest by.

The reference implementation of this algorithm can be found in the SyncEngine class. Note that this algorithm is designed to be efficient and to request feeds in pages instead of reading potentially thousands of rows at once. Paged feeds are implemented by the SyncMark technique described above.
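
The comparison rule for values can be sketched in Python as follows. This is not the SyncEngine implementation; it is a minimal illustration under stated assumptions: the local store is a nested dictionary (item guid, then property name, then author), each author holds at most one value per property, and when has already been parsed into a timezone-aware datetime. The names Value, rank and merge_value are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict


@dataclass(frozen=True)
class Value:
    by: str          # author of the value
    revision: int    # incremented on each change by that author
    when: datetime   # timezone-aware time of the last change
    content: str


# Assumed store layout: item guid -> property name -> author -> Value
Store = Dict[str, Dict[str, Dict[str, Value]]]


def rank(value: Value):
    """Sort key implementing 'revision, then when, then by'."""
    return (value.revision, value.when, value.by)


def merge_value(store: Store, item_guid: str, prop_name: str,
                incoming: Value) -> Value:
    """Merge one received value into the store and return the winner."""
    # Unknown items and properties are simply added.
    values = store.setdefault(item_guid, {}).setdefault(prop_name, {})
    local = values.get(incoming.by)
    if local is None or rank(incoming) > rank(local):
        values[incoming.by] = incoming  # incoming value wins
        return incoming
    return local  # local value wins; the incoming one is discarded
```

Because the sketch keys values by author, the by comparison never breaks a tie here; it is kept in rank only so that the same key could be reused wherever elements from different authors are compared.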

 

Authorship and licensing

This document is authored and currently maintained by Shital Shah [shital@ShitalShah.com] and released under a Creative Commons ShareAlike license.