1. Introduction
In this tutorial, we'll explore OData, a standard protocol that allows easy access to data sets using a RESTful API.
2. What is OData?
OData is an OASIS and ISO/IEC Standard for accessing data using a RESTful API. As such, it allows a consumer to discover and navigate through data sets using standard HTTP calls.
For instance, we can access one of the publicly available OData services with a simple curl one-liner:
curl -s https://services.odata.org/V2/Northwind/Northwind.svc/Regions

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<feed xml:base="https://services.odata.org/V2/Northwind/Northwind.svc/"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
    xmlns="http://www.w3.org/2005/Atom">
  <title type="text">Regions</title>
  <id>https://services.odata.org/V2/Northwind/Northwind.svc/Regions</id>
  ... rest of xml response omitted
As of this writing, the OData protocol is at its 4th version (4.01, to be more precise). OData V4 reached the OASIS standard level in 2014, but it has a longer history. We can trace its roots to a Microsoft project called Astoria, which was renamed to ADO.NET Data Services in 2007. The original blog entry announcing this project is still available at Microsoft's OData blog.
Having a standards-based protocol to access data sets brings some benefits over standard APIs such as JDBC or ODBC. As end-user level consumers, we can use popular tools such as Excel to retrieve data from any compatible provider. Programming is also made easier by the large number of available REST client libraries.
As providers, adopting OData also has benefits: once we've created a compatible service, we can focus on providing valuable data sets that end-users can consume using the tools of their choice. Since it is an HTTP-based protocol, we can also leverage existing HTTP infrastructure such as security mechanisms, monitoring, and logging.
Those characteristics made OData a popular choice among government agencies when implementing public data services, as we can check by taking a look at this directory.
3. OData Concepts
At the core of the OData protocol is the concept of an Entity Data Model – or EDM for short. The EDM describes the data exposed by an OData provider through a metadata document containing a number of meta-entities:
- Entity types (e.g. Person, Customer, Order), with their properties and keys
- Relationships between entities
- Complex types used to describe structured types embedded into entities (say, an address type which is part of a Customer type)
- Entity Sets, which aggregate entities of a given type
The spec mandates that this metadata document must be available at the standard location $metadata at the root URL used to access the service. For instance, if we have an OData service available at http://example.org/odata.svc/, then its metadata document will be available at http://example.org/odata.svc/$metadata.
The returned document contains a bunch of XML describing the schemas supported by this server:
<?xml version="1.0"?>
<edmx:Edmx xmlns:edmx="http://schemas.microsoft.com/ado/2007/06/edmx" Version="1.0">
  <edmx:DataServices
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
      m:DataServiceVersion="1.0">
    ... schema elements omitted
  </edmx:DataServices>
</edmx:Edmx>
Let's break this document down into its main sections.
The top-level element, <edmx:Edmx>, can have only one child, the <edmx:DataServices> element. The important thing to notice here is the namespace URI, since it allows us to identify which OData version the server uses. In this case, the namespace uses Microsoft's identifiers, which indicates a pre-V4 server (V2 in our case); OData V4 services use OASIS identifiers instead.
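The namespace check described above can be sketched in a few lines of Python. This is only an illustration: the function name and the labels in the lookup table are our own, and the Microsoft namespace is shared by the pre-V4 protocol versions, so it cannot by itself tell V1, V2 and V3 apart.

```python
import xml.etree.ElementTree as ET

# Namespace of the <edmx:Edmx> root element -> protocol generation.
# The labels are ours; the two URIs are the Microsoft and OASIS edmx namespaces.
EDMX_NAMESPACES = {
    "http://schemas.microsoft.com/ado/2007/06/edmx": "pre-V4 (Microsoft identifiers)",
    "http://docs.oasis-open.org/odata/ns/edmx": "V4 (OASIS identifiers)",
}


def odata_generation(metadata_xml: str) -> str:
    """Guess the OData generation from the root namespace of a $metadata document."""
    root = ET.fromstring(metadata_xml)
    # ElementTree exposes qualified names as "{namespace}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    return EDMX_NAMESPACES.get(namespace, "unknown")


SAMPLE = """<edmx:Edmx xmlns:edmx="http://schemas.microsoft.com/ado/2007/06/edmx" Version="1.0">
  <edmx:DataServices
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
      m:DataServiceVersion="1.0"/>
</edmx:Edmx>"""

print(odata_generation(SAMPLE))
```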
A DataServices element can have one or more Schema elements, each describing an available dataset. Since a full description of the available elements in a Schema is beyond the scope of this article, we'll focus on the most important ones: EntityTypes, Associations, and EntitySets.
3.1. EntityType Element
This element defines the available properties of a given entity, including its primary key. It may also contain information about relationships with other schema types. Looking at an example, a CarMaker, we can see that it is not very different from descriptions found in ORM technologies such as JPA:
<EntityType Name="CarMaker">
  <Key>
    <PropertyRef Name="Id"/>
  </Key>
  <Property Name="Id" Type="Edm.Int64" Nullable="false"/>
  <Property Name="Name" Type="Edm.String" Nullable="true" MaxLength="255"/>
  <NavigationProperty Name="CarModelDetails"
      Relationship="default.CarModel_CarMaker_Many_One0"
      FromRole="CarMaker"
      ToRole="CarModel"/>
</EntityType>
Here, our CarMaker has only two properties – Id and Name – and an association to another EntityType. The Key sub-element defines the entity's primary key to be its Id property, and each Property element contains data about an entity's property such as its name, type or nullability.
A NavigationProperty is a special kind of property that describes an “access point” to a related entity.
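To make the structure of an EntityType more concrete, here is a small Python sketch that reads the CarMaker definition and lists its key, properties and navigation properties. It assumes the fragment is parsed on its own, without the schema namespace that a complete metadata document would declare; describe_entity_type is a name we made up for this example.

```python
import xml.etree.ElementTree as ET

# The CarMaker fragment from the article, parsed standalone (no schema namespace).
ENTITY_TYPE = """
<EntityType Name="CarMaker">
  <Key>
    <PropertyRef Name="Id"/>
  </Key>
  <Property Name="Id" Type="Edm.Int64" Nullable="false"/>
  <Property Name="Name" Type="Edm.String" Nullable="true" MaxLength="255"/>
  <NavigationProperty Name="CarModelDetails"
      Relationship="default.CarModel_CarMaker_Many_One0"
      FromRole="CarMaker"
      ToRole="CarModel"/>
</EntityType>
""".strip()


def describe_entity_type(xml_text: str) -> dict:
    """Summarise an EntityType element as plain Python data."""
    node = ET.fromstring(xml_text)
    return {
        "name": node.get("Name"),
        # The Key element lists the properties forming the primary key.
        "keys": [ref.get("Name") for ref in node.findall("Key/PropertyRef")],
        "properties": {p.get("Name"): p.get("Type") for p in node.findall("Property")},
        # Navigation properties are the access points to related entities.
        "navigation": [n.get("Name") for n in node.findall("NavigationProperty")],
    }


print(describe_entity_type(ENTITY_TYPE))
```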
3.2. Association Element
An Association element describes an association between two entities, which includes the multiplicity on each end and optionally a referential integrity constraint:
<Association Name="CarModel_CarMaker_Many_One0">
  <End Type="default.CarModel" Multiplicity="*" Role="CarModel"/>
  <End Type="default.CarMaker" Multiplicity="1" Role="CarMaker"/>
  <ReferentialConstraint>
    <Principal Role="CarMaker">
      <PropertyRef Name="Id"/>
    </Principal>
    <Dependent Role="CarModel">
      <PropertyRef Name="Maker"/>
    </Dependent>
  </ReferentialConstraint>
</Association>
Here, the Association element defines a one-to-many relationship between the CarMaker and CarModel entities: one CarMaker can have many CarModels, and CarModel acts as the dependent party, referencing its maker's Id through the Maker property.
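As a rough illustration, the multiplicities and the referential constraint can be read from the Association element like this. The helper name is hypothetical, and the fragment is again parsed without the schema namespace of a full metadata document.

```python
import xml.etree.ElementTree as ET

ASSOCIATION = """
<Association Name="CarModel_CarMaker_Many_One0">
  <End Type="default.CarModel" Multiplicity="*" Role="CarModel"/>
  <End Type="default.CarMaker" Multiplicity="1" Role="CarMaker"/>
  <ReferentialConstraint>
    <Principal Role="CarMaker">
      <PropertyRef Name="Id"/>
    </Principal>
    <Dependent Role="CarModel">
      <PropertyRef Name="Maker"/>
    </Dependent>
  </ReferentialConstraint>
</Association>
""".strip()


def describe_association(xml_text: str) -> dict:
    """Extract the multiplicity of each end and the foreign-key mapping."""
    node = ET.fromstring(xml_text)
    principal = node.find("ReferentialConstraint/Principal")
    dependent = node.find("ReferentialConstraint/Dependent")
    return {
        "multiplicity": {end.get("Role"): end.get("Multiplicity") for end in node.findall("End")},
        # (role, property) on the referenced side and on the referencing side.
        "principal": (principal.get("Role"), principal.find("PropertyRef").get("Name")),
        "dependent": (dependent.get("Role"), dependent.find("PropertyRef").get("Name")),
    }


print(describe_association(ASSOCIATION))
```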
3.3. EntitySet Element
The final schema concept we'll explore is the EntitySet element, which represents a collection of entities of a given type. While it's easy to think of them as analogous to a table (and in many cases, they're just that), a better analogy is that of a view. The reason is that we can have multiple EntitySet elements for the same EntityType, each representing a different subset of the available data.
The EntityContainer element, which is a top-level schema element, groups all available EntitySets:
<EntityContainer Name="defaultContainer" m:IsDefaultEntityContainer="true">
  <EntitySet Name="CarModels" EntityType="default.CarModel"/>
  <EntitySet Name="CarMakers" EntityType="default.CarMaker"/>
</EntityContainer>
In our simple example, we have just two EntitySets, but we could also define additional views, such as ForeignCarMakers or HistoricCarMakers.
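The "several views over one type" idea can be sketched by grouping entity sets by their EntityType. In the sample below, the ForeignCarMakers entry is hypothetical (it is one of the additional views mentioned above, not part of the actual service), and the xmlns:m declaration is added only so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ForeignCarMakers is a hypothetical extra view; xmlns:m is declared so the fragment parses standalone.
CONTAINER = """
<EntityContainer Name="defaultContainer"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
    m:IsDefaultEntityContainer="true">
  <EntitySet Name="CarModels" EntityType="default.CarModel"/>
  <EntitySet Name="CarMakers" EntityType="default.CarMaker"/>
  <EntitySet Name="ForeignCarMakers" EntityType="default.CarMaker"/>
</EntityContainer>
""".strip()


def entity_sets_by_type(xml_text: str) -> dict:
    """Map each EntityType name to the entity sets that expose it."""
    container = ET.fromstring(xml_text)
    grouped = defaultdict(list)
    for entity_set in container.findall("EntitySet"):
        grouped[entity_set.get("EntityType")].append(entity_set.get("Name"))
    return dict(grouped)


print(entity_sets_by_type(CONTAINER))
```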
4. OData URLs and Methods
In order to access data exposed by an OData service, we use the regular HTTP verbs:
- GET returns one or more entities
- POST adds a new entity to an existing Entity Set
- PUT replaces a given entity
- PATCH replaces specific properties of a given entity
- DELETE removes a given entity
All those operations require a resource path to act upon. The resource path may define an entity set, an entity or even a property within an entity.
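As a sketch of how these verbs map onto requests, the snippet below builds (but does not send) HTTP requests with Python's standard library. The service root http://localhost:8080/odata/ is taken from the sample responses in this article; the JSON payload and its acceptance by the server are assumptions made for illustration.

```python
import json
from urllib.request import Request

SERVICE_ROOT = "http://localhost:8080/odata/"  # assumed local sample service


def build_request(method: str, resource_path: str, payload: dict = None) -> Request:
    """Create an HTTP request for a resource path; nothing is sent here."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    # Assumption: the service accepts JSON request bodies.
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(SERVICE_ROOT + resource_path, data=data, headers=headers, method=method)


# One request per verb; the resource path selects an entity set or a single entity.
requests = [
    build_request("GET", "CarMakers"),
    build_request("POST", "CarMakers", {"Name": "Special Motors"}),
    build_request("PUT", "CarMakers(1L)", {"Id": 1, "Name": "Special Motors"}),
    build_request("PATCH", "CarMakers(1L)", {"Name": "Special Motors Inc."}),
    build_request("DELETE", "CarMakers(1L)"),
]
for request in requests:
    print(request.get_method(), request.full_url)
```

Passing any of these objects to urllib.request.urlopen would actually perform the call against a running service.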
Let's take a look at an example URL used to access our previous OData service:
The first part of this URL, starting with the protocol up to the odata/ path segment, is known as the service root URL and is the same for all resource paths of this service. Since the service root is always the same, we'll replace it in the following URL samples by an ellipsis (“…”).
CarMakers, in this case, refers to one of the declared EntitySets in the service metadata. We can use a regular browser to access this URL, which should then return a document containing all existing entities of this type:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
    xml:base="http://localhost:8080/odata/">
  <id>http://localhost:8080/odata/CarMakers</id>
  <title type="text">CarMakers</title>
  <updated>2019-04-06T17:51:33.588-03:00</updated>
  <author>
    <name/>
  </author>
  <link href="CarMakers" rel="self" title="CarMakers"/>
  <entry>
    <id>http://localhost:8080/odata/CarMakers(1L)</id>
    <title type="text">CarMakers</title>
    <updated>2019-04-06T17:51:33.589-03:00</updated>
    <category term="default.CarMaker"
        scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme"/>
    <link href="CarMakers(1L)" rel="edit" title="CarMaker"/>
    <link href="CarMakers(1L)/CarModelDetails"
        rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/CarModelDetails"
        title="CarModelDetails"
        type="application/atom+xml;type=feed"/>
    <content type="application/xml">
      <m:properties>
        <d:Id>1</d:Id>
        <d:Name>Special Motors</d:Name>
      </m:properties>
    </content>
  </entry>
  ... other entries omitted
</feed>
The returned document contains an entry element for each CarMaker instance.
Let's take a closer look at what information we have available to us:
- id: a link to this specific entity
- title/author/updated: metadata about this entry
- link elements: links that point to the resource used to edit the entity (rel="edit") or to related entities. In this case, we have a link that takes us to the set of CarModel entities associated with this particular CarMaker.
- content: property values of the CarMaker entity
An important point to notice here is the use of the key value to identify a particular entity within an entity set. In our example, the key is numeric, so a resource path like CarMakers(1L) refers to the entity with a primary key value equal to 1. The “L” here just denotes a long value and could be omitted.
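A client could pull this information out of the feed roughly as follows. This is a minimal sketch using Python's standard XML parser on a shortened copy of the feed shown earlier; read_entries is our own helper name, and relative link targets are resolved against the feed's xml:base attribute.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
}
# ElementTree's name for the predefined xml:base attribute.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
    xml:base="http://localhost:8080/odata/">
  <entry>
    <id>http://localhost:8080/odata/CarMakers(1L)</id>
    <link href="CarMakers(1L)" rel="edit" title="CarMaker"/>
    <link href="CarMakers(1L)/CarModelDetails"
        rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/CarModelDetails"
        title="CarModelDetails"
        type="application/atom+xml;type=feed"/>
    <content type="application/xml">
      <m:properties>
        <d:Id>1</d:Id>
        <d:Name>Special Motors</d:Name>
      </m:properties>
    </content>
  </entry>
</feed>"""


def read_entries(feed_xml: str) -> list:
    """Return id, property values and related-entity links for each entry."""
    feed = ET.fromstring(feed_xml)
    base = feed.get(XML_BASE, "")
    entries = []
    for entry in feed.findall("atom:entry", NS):
        properties = entry.find("atom:content/m:properties", NS)
        entries.append({
            "id": entry.findtext("atom:id", namespaces=NS),
            # Strip the "{namespace}" prefix from each property element name.
            "properties": {p.tag.split("}", 1)[1]: p.text for p in properties},
            # Every link except the edit link points to related entities.
            "related": {
                link.get("title"): urljoin(base, link.get("href"))
                for link in entry.findall("atom:link", NS)
                if link.get("rel") != "edit"
            },
        })
    return entries


print(read_entries(FEED))
```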
5. Query Options
We can pass query options to a resource URL in order to modify a number of aspects of the returned data, such as to limit the size of the returned set or its ordering. The OData spec defines a rich set of options, but here we'll focus on the most common ones.
As a general rule, query options can be combined with each other, thus allowing clients to easily implement common functionalities such as paging, filtering and ordering result lists.
5.1. $top and $skip
We can navigate through a large dataset using the $top and $skip query options:
$top tells the service that we want only the first 10 records of the CarMakers entity set. $skip, which is applied before $top, tells the server to skip the first 10 records.
It's usually useful to know the size of a given Entity Set and, for this purpose, we can use the $count sub-resource:
This resource produces a text/plain document containing the size of the corresponding set. Here, we must pay attention to the specific OData version supported by a provider. While OData V2 supports $count as a sub-resource of a collection, V4 allows it to be used as a query parameter. In this case, $count is a Boolean, so we need to change the URL accordingly:
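The paging and counting requests described in this section can be sketched as URL builders. The entity set URL below is an assumption based on the sample responses in this article, the helper names are ours, and the URLs are reconstructed from the description above rather than copied from the service.

```python
from urllib.parse import quote, urlencode

ENTITY_SET_URL = "http://localhost:8080/odata/CarMakers"  # assumed sample URL


def page_url(page: int, page_size: int) -> str:
    """URL for a zero-based page: skip the earlier pages, then take one page."""
    params = {"$skip": page * page_size, "$top": page_size}
    # Keep "$" literal; by default urlencode would emit "%24".
    return ENTITY_SET_URL + "?" + urlencode(params, safe="$", quote_via=quote)


def count_url(odata_version: int) -> str:
    """$count as a sub-resource (V2 style) or as a Boolean query option (V4 style)."""
    if odata_version >= 4:
        return ENTITY_SET_URL + "?$count=true"
    return ENTITY_SET_URL + "/$count"


print(page_url(1, 10))  # second page of ten records
print(count_url(2))
print(count_url(4))
```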
5.2. $filter
We use the $filter query option to limit the returned entities from a given Entity Set to those matching given criteria. The value for the $filter is a logical expression that supports basic operators, grouping and a number of useful functions. For instance, let's build a query that returns all CarMaker instances whose Name attribute starts with the letter 'B':
Now, let's combine a few logical operators to search for CarModels of a particular Year and Maker:
.../CarModels?$filter=Year eq 2008 and CarMakerDetails/Name eq 'BWM'
Here, we've used the equality operator eq to specify values for the properties. We can also see how to use properties from a related entity in the expression.
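When filter expressions are assembled in code, two details need care: string literals and URL escaping. The sketch below shows one way to handle both; the helper names are hypothetical, the startswith expression is our reconstruction of the "starts with B" query, and the ellipsis stands for the service root as elsewhere in this article.

```python
from urllib.parse import quote


def string_literal(value: str) -> str:
    """Quote a string for a filter expression; embedded quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"


def filter_url(entity_set_url: str, expression: str) -> str:
    """Append a $filter option, percent-encoding spaces but keeping the expression readable."""
    return entity_set_url + "?$filter=" + quote(expression, safe="(),'/")


starts_with_b = filter_url(".../CarMakers", f"startswith(Name,{string_literal('B')})")
year_and_maker = filter_url(
    ".../CarModels",
    f"Year eq 2008 and CarMakerDetails/Name eq {string_literal('BWM')}",
)

print(starts_with_b)
print(year_and_maker)
```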
5.3. $expand
By default, an OData query does not return data for related entities, which is usually OK. We can use the $expand query option to request that data from a given related entity be included inline with the main content.
Using our sample domain, let's build a URL that returns data from a given model and its maker, thus avoiding an additional round-trip to the server:
The returned document now includes the CarMaker data as part of the related entity:
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
    xml:base="http://localhost:8080/odata/">
  <id>http://example.org/odata/CarModels(1L)</id>
  <title type="text">CarModels</title>
  <updated>2019-04-07T11:33:38.467-03:00</updated>
  <category term="default.CarModel"
      scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme"/>
  <link href="CarModels(1L)" rel="edit" title="CarModel"/>
  <link href="CarModels(1L)/CarMakerDetails"
      rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/CarMakerDetails"
      title="CarMakerDetails"
      type="application/atom+xml;type=entry">
    <m:inline>
      <entry xml:base="http://localhost:8080/odata/">
        <id>http://example.org/odata/CarMakers(1L)</id>
        <title type="text">CarMakers</title>
        <updated>2019-04-07T11:33:38.492-03:00</updated>
        <category term="default.CarMaker"
            scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme"/>
        <link href="CarMakers(1L)" rel="edit" title="CarMaker"/>
        <link href="CarMakers(1L)/CarModelDetails"
            rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/CarModelDetails"
            title="CarModelDetails"
            type="application/atom+xml;type=feed"/>
        <content type="application/xml">
          <m:properties>
            <d:Id>1</d:Id>
            <d:Name>Special Motors</d:Name>
          </m:properties>
        </content>
      </entry>
    </m:inline>
  </link>
  <content type="application/xml">
    <m:properties>
      <d:Id>1</d:Id>
      <d:Maker>1</d:Maker>
      <d:Name>Muze</d:Name>
      <d:Sku>SM001</d:Sku>
      <d:Year>2018</d:Year>
    </m:properties>
  </content>
</entry>
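In an expanded response like this one, the related entity sits inside the m:inline element of its link, while the entity's own values remain in the top-level content element. A minimal Python sketch of reading both follows, using a shortened copy of the document above and a helper name of our own.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
}

ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices">
  <link href="CarModels(1L)/CarMakerDetails" title="CarMakerDetails"
      type="application/atom+xml;type=entry">
    <m:inline>
      <entry>
        <content type="application/xml">
          <m:properties>
            <d:Id>1</d:Id>
            <d:Name>Special Motors</d:Name>
          </m:properties>
        </content>
      </entry>
    </m:inline>
  </link>
  <content type="application/xml">
    <m:properties>
      <d:Id>1</d:Id>
      <d:Name>Muze</d:Name>
    </m:properties>
  </content>
</entry>"""


def _values(properties) -> dict:
    """Turn the children of an m:properties element into a dict of strings."""
    return {p.tag.split("}", 1)[1]: p.text for p in properties}


def read_expanded(entry_xml: str, relation: str) -> dict:
    """Return the entity's own properties and those of one expanded relation."""
    entry = ET.fromstring(entry_xml)
    result = {"own": _values(entry.find("atom:content/m:properties", NS)), "expanded": None}
    for link in entry.findall("atom:link", NS):
        inline = link.find("m:inline/atom:entry/atom:content/m:properties", NS)
        if link.get("title") == relation and inline is not None:
            result["expanded"] = _values(inline)
    return result


print(read_expanded(ENTRY, "CarMakerDetails"))
```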
5.4. $select
We use the $select query option to inform the OData service that it should only return the values for the given properties. This is useful in scenarios where our entities have a large number of properties, but we're only interested in some of them.
Let's use this option in a query that returns only the Name and Sku properties:
The resulting document now has only the requested properties:
... xml omitted
<content type="application/xml">
  <m:properties>
    <d:Name>Muze</d:Name>
    <d:Sku>SM001</d:Sku>
  </m:properties>
</content>
... xml omitted
We can also see that even related entities were omitted. In order to include them, we'd need to include the name of the relation in the $select option.
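The interplay between $select and related entities can be sketched with a small URL builder. The function name is hypothetical, and the URLs are reconstructed from the description in this section: the relation name is added to $select and also requested through $expand.

```python
def select_url(resource_url: str, properties: list, expand: list = ()) -> str:
    """Build a URL that selects some properties and, optionally, expanded relations."""
    # Relations to include must be named in $select as well as in $expand.
    selected = list(properties) + list(expand)
    query = "$select=" + ",".join(selected)
    if expand:
        query += "&$expand=" + ",".join(expand)
    return resource_url + "?" + query


print(select_url(".../CarModels(1L)", ["Name", "Sku"]))
print(select_url(".../CarModels(1L)", ["Name", "Sku"], expand=["CarMakerDetails"]))
```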
5.5. $orderby
The $orderby option works much like its SQL counterpart, ORDER BY. We use it to specify the order in which we want the server to return a given set of entities. In its simplest form, its value is just a list of property names from the selected entity, each optionally followed by a sort direction:
.../CarModels?$orderby=Name asc,Sku desc
This query will result in a list of CarModels ordered by their names and SKUs, in ascending and descending directions, respectively.
An important detail here is the case of the direction keywords: while the spec mandates that servers must support any combination of upper- and lower-case letters for asc and desc, it also mandates that clients use only lowercase.
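On the client side, the lowercase rule is easy to enforce when the ordering clause is generated. The following sketch uses helper names of our own and writes the option name in lowercase, $orderby, as the OData URL conventions spell it.

```python
from urllib.parse import quote


def orderby_clause(*keys) -> str:
    """Build an ordering clause from (property, direction) pairs, lowercasing directions."""
    parts = []
    for name, direction in keys:
        direction = direction.lower()  # clients should only send lowercase keywords
        if direction not in ("asc", "desc"):
            raise ValueError(f"unknown sort direction: {direction!r}")
        parts.append(f"{name} {direction}")
    return ",".join(parts)


def orderby_url(entity_set_url: str, *keys) -> str:
    # Spaces become %20; commas between sort keys stay readable.
    return entity_set_url + "?$orderby=" + quote(orderby_clause(*keys), safe=",")


print(orderby_url(".../CarModels", ("Name", "ASC"), ("Sku", "desc")))
```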
5.6. $format
The $format option defines the data representation format that the server should use, which takes precedence over any HTTP content-negotiation header, such as Accept. Its value must be a full MIME-Type or a format-specific short form.
For instance, we can use json as an abbreviation for application/json:
This URL instructs our service to return data using JSON format, instead of XML, as we've seen before. When this option is not present, the server will use the value of the Accept header, if present. When neither is available, the server is free to choose any representation – usually XML or JSON.
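The precedence rules just described ($format first, then Accept, then a server default) can be modelled in a few lines. This is a simplified sketch, not how any particular server implements it: the function name and the default value are assumptions, and the Accept header handling deliberately ignores quality values and wildcards.

```python
# Short forms that may be used instead of a full MIME type.
SHORT_FORMS = {
    "json": "application/json",
    "atom": "application/atom+xml",
    "xml": "application/xml",
}


def negotiated_media_type(format_option=None, accept_header=None,
                          server_default="application/atom+xml") -> str:
    """Pick a representation: $format wins over Accept, which wins over the default."""
    if format_option:
        return SHORT_FORMS.get(format_option, format_option)
    if accept_header:
        # Naive parsing: take the first listed type and drop its parameters.
        return accept_header.split(",")[0].split(";")[0].strip()
    return server_default  # assumption: this server prefers Atom/XML


print(negotiated_media_type("json", "application/xml"))
print(negotiated_media_type(accept_header="application/xml;q=0.9, */*"))
print(negotiated_media_type())
```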
Regarding JSON specifically, it's fundamentally schemaless. However, OData 4.01 defines a JSON schema for metadata endpoints as well. This means that we can now write clients that get rid of XML processing entirely, if they choose to do so.
6. Conclusion
In this brief introduction to OData, we've covered its basic semantics and how to perform simple data set navigation. Our follow-up article will continue where we left off and go straight into the Olingo library. We'll then see how to implement sample services using this library.
Code examples, as always, are available over on GitHub.