Updated Aug 18, 2009 by s...@samj.net
Specification  
Technical specification for the HTML/HTTP "shortlink" relation (permalink: http://purl.org/net/shortlink)

To the extent possible under the laws of Australia, Sam Johnston and Australian Online Solutions have waived all copyright and related or neighboring rights to the shortlink specification. Furthermore, the authors neither assert nor are aware of any patent or trademark rights over the specification.

Introduction

The shortlink relation allows webmasters to specify a short link for a resource, so that clients need not obtain one from a potentially unreliable third-party URL shortening service such as tinyurl.com.

Such links are useful for space-constrained applications (e.g. microblogging services such as Twitter, and the mobile Internet) as well as any time URLs need to be entered manually (e.g. when they are printed or spoken).

Details

The shortlink appears in two places: in the HTML document, as a link element with rel="shortlink", and in the HTTP response, as a Link: header.

Implementation

Servers

Servers should implement both HTML and HTTP links; however, exposing the HTTP Link: header in responses to HTTP HEAD requests is strongly preferred for efficiency and performance reasons.
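
As a rough illustration of the two forms a server emits, the Python sketch below builds the HTML link element and the HTTP Link: header for one short URL. The function names shortlink_html and shortlink_header are hypothetical and not part of the specification.

```python
from html import escape


def shortlink_html(short_url: str) -> str:
    """Return the link element to place in the head of the HTML document."""
    # escape() guards against quotes or ampersands in the URL breaking the markup.
    return f'<link rel="shortlink" href="{escape(short_url, quote=True)}">'


def shortlink_header(short_url: str) -> tuple[str, str]:
    """Return the (name, value) pair for the HTTP Link: response header."""
    return ("Link", f"<{short_url}>; rel=shortlink")


if __name__ == "__main__":
    url = "http://example.com/promo"
    print(shortlink_html(url))
    name, value = shortlink_header(url)
    print(f"{name}: {value}")
```

A server would send the header with both HEAD and GET responses, so that clients can discover the short link without downloading the document body.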

The shortlink should default to an automatically generated stable URI based on an existing unique identifier (e.g. http://example.com/123). Such identifiers may be compressed using base32 or similar (e.g. http://example.com/3r). URIs should be case-insensitive and avoid non-ASCII characters or symbols that look or sound similar (e.g. 1 vs l), particularly when manual entry will be required (e.g. printed, spoken).
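
The compression step might look like the following minimal Python sketch. It assumes a base-32 alphabet of the digits 0-9 followed by the letters a-v, which is only one possible choice; the names ALPHABET, encode_id and decode_id are illustrative.

```python
# Assumed alphabet: digits 0-9 followed by a-v (32 symbols, lowercase only).
# The specification does not mandate a particular alphabet.
ALPHABET = "0123456789abcdefghijklmnopqrstuv"


def encode_id(number: int) -> str:
    """Encode a non-negative integer identifier as a base-32 string."""
    if number < 0:
        raise ValueError("identifier must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 32)
        digits.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


def decode_id(text: str) -> int:
    """Decode a base-32 string; int() accepts upper and lower case alike."""
    return int(text, 32)


if __name__ == "__main__":
    print(encode_id(123))   # 3r
    print(decode_id("3R"))  # 123
```

With this alphabet the identifier 123 becomes 3r, matching the example URLs above, and decoding is case-insensitive. The alphabet still contains look-alike symbols such as 1 and l, so a deployment that expects manual entry may prefer an alphabet that omits them.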

Publishers may also be given the option to specify a human-friendly slug (e.g. http://example.com/promo), allowing users to derive information about the resource (path) and its source (domain) from the URL.

Where a shortlink is changed, the previous URL should not be broken, as it may have been stored by users. Typically this requires maintaining a register of previous mappings.
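
One way such a register could be kept is sketched below in Python, as an in-memory structure. The class name ShortlinkRegister, its methods and the example paths are all hypothetical; a real server would presumably persist the mappings.

```python
class ShortlinkRegister:
    """Maps short paths to resources and keeps retired paths resolvable."""

    def __init__(self) -> None:
        self._resource_by_path: dict[str, str] = {}  # every path ever issued
        self._current_path: dict[str, str] = {}      # resource -> newest path

    def assign(self, resource: str, path: str) -> None:
        """Make `path` the current short path for `resource`."""
        key = path.lower()  # short paths are treated as case-insensitive
        owner = self._resource_by_path.get(key)
        if owner is not None and owner != resource:
            raise ValueError(f"{path!r} already points at {owner!r}")
        self._resource_by_path[key] = resource
        self._current_path[resource] = key

    def resolve(self, path: str):
        """Return the resource for a current or previous path, else None."""
        return self._resource_by_path.get(path.lower())

    def current(self, resource: str):
        """Return the short path currently advertised for `resource`."""
        return self._current_path.get(resource)


if __name__ == "__main__":
    register = ShortlinkRegister()
    register.assign("/2009/08/summer-promotion", "/123")
    register.assign("/2009/08/summer-promotion", "/promo")
    print(register.resolve("/123"))  # old path still resolves
    print(register.current("/2009/08/summer-promotion"))
```

Because old paths are never deleted, a link stored by a user before the change continues to resolve, while only the newest path is advertised.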

Clients

Clients that have already retrieved the document (e.g. web browsers, news readers) should parse it to discover the link rel="shortlink" element(s) and extract the href attribute from each.
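
For a client that already holds the document, discovery could be sketched in Python with the standard library's html.parser module. The names ShortlinkParser and find_shortlinks are illustrative only, and the sketch does not check that the link element sits inside the head.

```python
from html.parser import HTMLParser


class ShortlinkParser(HTMLParser):
    """Collects the href of every link element whose rel includes shortlink."""

    def __init__(self) -> None:
        super().__init__()
        self.shortlinks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel is a space-separated list of tokens, compared case-insensitively.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "shortlink" in rel_tokens and href:
            self.shortlinks.append(href)


def find_shortlinks(html_text: str) -> list[str]:
    """Return all shortlink hrefs found in the document, in document order."""
    parser = ShortlinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.shortlinks


if __name__ == "__main__":
    page = (
        '<html><head><link rel="canonical" href="http://example.com/a-long-page">'
        '<link rel="shortlink" href="http://example.com/123" /></head></html>'
    )
    print(find_shortlinks(page))
```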

Clients that have the URL but not the document (e.g. microblogging software) should conduct an HTTP HEAD request and extract any Link: headers from the response. Clients should not retrieve and parse the document unless requested to do so by the user or as a last resort. The sources of a short link, in order of preference, are:

  1. Stored preference information (e.g. domain-based whitelist or user configuration)
  2. HTTP HEAD request (Link: header)
  3. HTTP GET request (only a single request is required for both tests):
     a. Link: header
     b. HTML LINK element
  4. 3rd party URL shortening service (e.g. tinyurl.com)
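
The HEAD-request step can be sketched in Python roughly as follows. The function names are hypothetical, and the Link: header parsing is deliberately simplified: it assumes that parameter values do not contain commas or semicolons inside quotes.

```python
import re
import urllib.request

# One link-value: <target> followed by zero or more ";name=value" parameters.
# Simplification (assumption): no "," or ";" inside quoted parameter values.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]*)*)')
_REL_PARAM = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^\s";,]+))', re.IGNORECASE)


def shortlinks_from_link_headers(header_values):
    """Return the targets of all rel=shortlink entries in Link: header values."""
    found = []
    for value in header_values:
        for match in _LINK_VALUE.finditer(value):
            target, params = match.group(1), match.group(2)
            rel = _REL_PARAM.search(params)
            if rel is None:
                continue
            # rel may hold several space-separated relation types.
            relations = (rel.group(1) or rel.group(2) or "").lower().split()
            if "shortlink" in relations:
                found.append(target)
    return found


def discover_shortlinks(url, timeout=10.0):
    """Issue an HTTP HEAD request and extract shortlinks from the response."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return shortlinks_from_link_headers(response.headers.get_all("Link") or [])


if __name__ == "__main__":
    header = '<http://example.com/123>; rel=shortlink, <http://example.com/a>; rel="canonical"'
    print(shortlinks_from_link_headers([header]))
```

If this returns an empty list, a client would move on to the next source, for example a GET request whose response offers both the Link: header and the HTML link element.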

If there are multiple shortlinks, the client may choose one itself or offer the user the choice (e.g. in a drop-down list). If the client chooses one, it may do so randomly, by order (first vs last) or by some quality of the URL (length, readability, etc.).
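
As one example of choosing by a quality of the URL, the Python sketch below picks the shortest candidate and falls back to document order on ties. The function name choose_shortlink is hypothetical, and length is only one of the criteria a client might apply.

```python
def choose_shortlink(candidates):
    """Pick the shortest candidate; ties go to the one listed first."""
    if not candidates:
        return None
    # min() returns the first minimal element, which preserves document order.
    return min(candidates, key=len)


if __name__ == "__main__":
    links = [
        "http://example.com/promo",
        "http://example.com/3r",
        "http://example.com/9z",
    ]
    print(choose_shortlink(links))  # http://example.com/3r
```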


Comment by BlaM4c, May 18, 2009

Note that there's another "competing" standard suggestion: http://sites.google.com/a/snaplog.com/wiki/short_url

The only difference is that the other one uses "shorturl" instead of "shortlink" as relation identifier.

We should decide on one for 'servers' - BUT: Consuming clients should understand both. Don't be too picky just for indoctrination.

Comment by odin.omdal, May 18, 2009

Oh my, - I implemented this one, but there is also shorturl. Grr. Could you two go together and choose which one you'll both back?

Comment by s...@samj.net, May 24, 2009

Yup, aware of short_url... or was it shorturl... or short-url... or "short url"... or shorturi... or short_uri... or short-uri... or "short uri"...? It started life as short_url remember and uri/url confusion is common.

There is no confusion with shortlink, and no intellectual property issues either (the shorturl spec is not freely available, there is no information about patents and a popular URL shortening service uses it as a trademark so we can't even say "shorturl" without potentially landing ourselves in hot water).

Basically, implementing short[- ]uril? is only marginally less insane than implementing rev=canonical. Oh, and zero effort towards standardisation has been made for shorturl while shortlink has been submitted to both IETF and WHATWG.

Comment by goo...@fliks.net, Jun 21, 2009

Macnotes now moved over to using rel=shorturl

See http://www.macnotes.de/2009/06/18/in-eigener-sache-macnotes-short-urls-laufen-jetzt-uber-macnotes/ for the announcement and they used this reference in the source-code:

<link rev="canonical" rel="shorturl" href="http://macnot.es/10563" />

Comment by s...@samj.net, Aug 18, 2009

WordPress.com have announced that they have adopted rel=shortlink for their 7 million+ blogs. This is a huge win for the standard that puts it well and truly on the radar of client developers - we're now busy evangelising the standard and hope to have more good news soon.

Comment by subbu.allamaraju, Aug 18, 2009

Fantastic. Now, what exactly is the point of having two links for the same resource? Why can't the server use the shortened link in the first place?

Comment by alisonw, Aug 19, 2009

This seems fine except for the "URIs should be case-insensitive" bit. I run an in-house-but-open link shortener but imho need to use both cases in the 'shortened section' as otherwise the overall URI gets too long again!

Comment by stevie.rice, Aug 19, 2009

subbu: You don't think we use long links for the fun of it? They're extremely useful in being a) sometimes human readable, b) full of context, c) search engine friendly. Shortened links for everything would work in theory but they're very unfriendly and impenetrable to humans. I'm sure I'm not the only person to have been rick-rolled through Tiny-URL.

Comment by s...@samj.net, Sep 11, 2009

alisonw: interesting point - the point of this wording is to remind people about usability issues resulting from poorly chosen character sets (e.g. mixed case, confusing characters like l, I and 1) but in reality most services (e.g. bit.ly) want shorter links so they use as many different characters as they can - http://tinyarro.ws takes this to the extreme by using i18n. Perhaps it could be worded differently.

Comment by s...@samj.net, Sep 11, 2009

subbu: actually the discussion on apps-discuss highlighted the need for more than two different types of links:

  • a link to the 'current' or 'latest' version (e.g. ToS in force)
  • a link to a specific version (e.g. an old ToS)
  • a link to an immutable copy (e.g. in an archive)
  • an immutable, permanent link (e.g. a permalink)
  • a human-friendly short link (e.g. ToS linked in advertising)
  • a search engine friendly canonical link (e.g. terms-of-service-tos.html)

I believe most of these needs can be answered with 'current', 'self', 'shortlink' and 'canonical', while relations like 'permalink' and 'static' could be used for the others.

