URI Templates and REST URL Matching

OK, so I haven't posted in a long time. Figure I'll ease in with an issue that I wanted to write about a while back. Oh yeah, and I turned off comments because the spam was out of control :)

For those watching the REST space, I'm sure Joe Gregorio's work on the IETF URI templates draft is interesting. The initial draft (draft-gregorio-uritemplate-01.txt) took a simple approach to describing URIs with variable parts. The idea was that each variable part would appear as {varname} within the URI. A simple example would look like this:

http://foo.com/accounts/{accountID}

where the account ID could be substituted with an appropriate ID in an instance of the REST URI:

http://foo.com/accounts/9991234

This approach was simple enough. It could be used in a straightforward manner when calling a REST service by substituting the variable parts with values prior to the invocation. It could also be used, to some extent, for matching inbound requests by transforming the template into a regular expression, replacing the variables with non-greedy matches.
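To make both uses concrete, here's a minimal Python sketch, assuming only the plain {name} syntax from the first draft. The function names (expand, to_regex, match) are mine for illustration and don't come from the draft or any library.

```python
import re

# Assumption: a variable is a plain {name} with an identifier-like name,
# as in the first draft. No operators, no nesting.
VAR = re.compile(r"\{([A-Za-z_]\w*)\}")

def expand(template, values):
    """Substitute each {name} in the template with values[name]."""
    return VAR.sub(lambda m: str(values[m.group(1)]), template)

def to_regex(template):
    """Build an anchored regex with one non-greedy named group per variable.

    Assumes each variable name appears only once in the template.
    """
    parts = []
    pos = 0
    for m in VAR.finditer(template):
        parts.append(re.escape(template[pos:m.start()]))  # literal text
        parts.append("(?P<%s>.+?)" % m.group(1))           # non-greedy variable
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(parts) + "$")

def match(template, uri):
    """Return the variable bindings for uri, or None if it doesn't match."""
    m = to_regex(template).match(uri)
    return m.groupdict() if m else None

template = "http://foo.com/accounts/{accountID}"
uri = expand(template, {"accountID": 9991234})
bindings = match(template, uri)
```

Here uri comes out as the instance URI shown earlier and bindings maps accountID back to "9991234". Note that a non-greedy group in the last position still has to stretch to the end of the string, so it will happily swallow extra '/' segments. That's one reason the regex trick only works "to some extent".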

In the process of updating the template specification, and despite many questions about matching inbound requests on the mailing list, Joe Gregorio stated that he never intended for URI templates to be used for matching but only for substitution. Indeed, the next version of the specification (draft-gregorio-uritemplate-02.txt) introduced a bunch of new operators aimed at substitution. The primary use cases that Mr. Gregorio was interested in solving were client-side: simplifying the process of making a RESTful request from a JSP input form, for example, which would substitute the input form variables into their respective places in the REST request.

This was and continues to be a shame. Matching of inbound URIs comes up frequently in server-side applications, and there may be middleware processing an inbound request and determining where to dispatch it. In the cases of bridging legacy interfaces to REST-style interactions, this dispatching process is very mechanical, but there needs to be a description that's consumable by a runtime. The original URI templates spec seemed to provide almost enough for a simple approach to specifying the URI to match, but not quite:

  • URI parameters, query parameters, and fragments may have parameters that appear out of order
  • Parameters may have no value at all and simply be present. Their presence acts as a boolean variable
  • The URI templates let variables be anywhere in the template. This doesn't work so well for key-value pairs, when ordering is uncertain

So, applying a few restrictions gives a fairly simple but powerful matching template, which can be automated by infrastructure fairly easily:

  • Any variable that appears in a parameter, query, or fragment slot as a single variable with no key-value pair can be considered a boolean. An example is http://foo.com;{mybool}, where mybool would be true for the match http://foo.com;mybool.
  • Parameter keys can't be variables, just their values. Out of order problem solved.
  • Also, any matching engine should ignore specific delimiters in parameter/query/fragment sections of the URI. This is just robustness.
  • Matching in paths shouldn't be too greedy; a variable's match should stop at the next '/' path delimiter.
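The restrictions above can be sketched as a small matcher. This is just one reading of the rules, not anything from the drafts: the names (split_uri, match) are hypothetical, a path variable has to fill a whole segment, the delimiters ';', '?', '#' and '&' are treated as interchangeable, and extra parameters in the inbound URI are ignored.

```python
import re

# Assumption: a variable occupies a whole path segment or a whole value slot.
VAR = re.compile(r"^\{([A-Za-z_]\w*)\}$")

def split_uri(uri):
    """Split into (path, slots); slots are the tokens after the first ; ? or #."""
    m = re.search(r"[;?#]", uri)
    if m is None:
        return uri, []
    rest = uri[m.start():]
    # Which delimiter separates the slots is deliberately ignored.
    return uri[:m.start()], [t for t in re.split(r"[;?#&]", rest) if t]

def match(template, uri):
    """Return variable bindings if uri matches template, else None."""
    t_path, t_slots = split_uri(template)
    u_path, u_slots = split_uri(uri)
    bindings = {}

    # Path: compare segment by segment, so a variable stops at the next '/'.
    t_segs, u_segs = t_path.split("/"), u_path.split("/")
    if len(t_segs) != len(u_segs):
        return None
    for t, u in zip(t_segs, u_segs):
        var = VAR.match(t)
        if var:
            if not u:
                return None
            bindings[var.group(1)] = u
        elif t != u:
            return None

    # Slots: keys are always literal, so ordering in the URI doesn't matter.
    pairs = dict(s.split("=", 1) for s in u_slots if "=" in s)
    flags = {s for s in u_slots if "=" not in s}
    for slot in t_slots:
        key, sep, value = slot.partition("=")
        if not sep:
            var = VAR.match(key)
            if var:
                # A lone variable is a boolean: true if the bare name is present.
                bindings[var.group(1)] = var.group(1) in flags
            elif key not in flags:
                return None
        else:
            if key not in pairs:
                return None
            var = VAR.match(value)
            if var:
                bindings[var.group(1)] = pairs[key]
            elif value != pairs[key]:
                return None
    return bindings

flag_result = match("http://foo.com;{mybool}", "http://foo.com;mybool")
query_result = match(
    "http://foo.com/accounts/{accountID};{verbose}?sort={order}&page={page}",
    "http://foo.com/accounts/9991234?page=2&sort=asc",
)
```

In this sketch flag_result binds mybool to True, and query_result picks up order and page even though they arrive in the opposite order from the template, with verbose bound to False because the bare parameter is absent. A production matcher would also need percent-decoding and a decision on repeated keys, both of which are left out here.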

With these restrictions, the basic concept of URI templates can be applied to matching. This is powerful enough to handle the majority of RESTful URI constructions that are commonly used. While the second draft of URI templates took the focus in the direction of substitution, there's no reason that these approaches couldn't be combined and strengthened to address the matching problem. This would require either some changes to how substitution operators are specified today, or the introduction of a separate flavor of matching operators.

Make syntax and matching semantics match existing languages?

It'd be cool if the REST parameter URLs got to have a ${variable}-style feel with the PHP/Perl/bash line noise included inline. That way they could be used directly in poor man's languages without having to parse for matching brackets. Similar gearing for languages like Javascript (dunno, maybe bracket variables with a non-URI character like a comma to allow for easy splitting) may also be possible. Seems odd to introduce a variable expansion-like mechanism which makes it difficult in every language. Also seems like matching-syntax mechanisms which are close to the J2EE (or similar) servlet deployment descriptor may be worthwhile.

Rhys Ulerich
rhys (.dot.) ulerich (@at@) gmail.com
http://agentzlerich.blogspot.com

Maybe Rationale For the Precedent?

It would be hard to have a '$' character, since it can appear in a valid URI per the BNF syntax, whereas the { } characters appeared in the unwise set and must be escaped per the URI RFC. Dunno if that was the rationale behind the original choice in syntax.