I’ve been sitting back and watching from the sidelines this week as a lot of smart people [de]ride the hash (or more specifically the #!). Most of this has come from the recent switchover of the Gawker Media properties to a new URL structure based on the #! and a JavaScript app to load content. As someone who’s invested a lot of time in the #, I want to make something clear: as with many things in life, the hash isn’t broken, it’s how you use it.

Deep Linking

Let’s take a little trip down memory lane, shall we? My first interaction with the ‘#’ as a method for getting at specific content (not as an anchor tag) was working in the ad/interactive space and with a lot of Flash developers. On every proposal or statement of work was a line about ‘Deep Linking’. In this case it meant plugging in SWFAddress and wiring it up to your fancy Flash slideshow or fashion showcase. The point of all this was that when you clicked on the 8pt text in the menu that said ‘MENS BLOUSES’ you would be taken to the photo of a puffy shirt and the URL would change to ‘#/items/puffy-shirt’. When you refreshed the page, or sent the link to your friend, you would be taken immediately to the puffy shirt and be able to bask in its frilly glory without navigating through the 8pt menu again.

This was a big thing! Big, as in exciting and awesome, because it meant you could build relatively complex Flash applications that still felt ‘stateful’ somehow. You would navigate somewhere and you’d be there, not in the middle of nowhere.

An important thing to note here is that this was not for Google. It was Flash. Google can’t crawl Flash. The point was allowing people to ‘Skip intro’ and letting them feel that this crazy interactive experience wasn’t all that different from their normal browsing. For crawling and SEO there were a host of other strategies – META tags, ‘ghost’ static HTML sites, etc. – and these were deployed alongside ‘Deep Linking’ as two different strategies trying to accomplish different things.


We’ve moved on (sort of) from the days of 8pt menus and pixel fonts. The old problem is new all over again. Now we’ve built these applications that rely on JavaScript being enabled to construct our site and fetch our data and animate our menus. Awesome! It’s HTML5-tastic. At some point, we discovered the same problems that we had with Flash. Namely, when you’re fetching all your content with AJAX and loading it onto the page based on clicks, you get lost in a sea of asynchronicity. What state are you in? Clearly, there’s an easy way to solve this – use the ‘#’ to route URLs without reloading the page and Deep Link directly into content. I created and continue to work on Sammy.js partly as a solution to this problem. You want your application to feel interactive and fluid, you want to avoid page reloads, but when the user does reload, or copies the link, they should be taken back to the same place.
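To make the idea concrete, here is a minimal sketch of hash-based routing in plain JavaScript. It is not how Sammy.js is implemented; `createHashRouter`, `on` and `dispatch` are hypothetical names made up for this illustration, and the route pattern reuses the ‘#/items/puffy-shirt’ example from above.

```javascript
// Hypothetical mini-router: maps patterns like '/items/:slug' to handlers.
function createHashRouter() {
  const routes = [];
  return {
    // Register a handler for a pattern; ':name' segments become parameters.
    on(pattern, handler) {
      const names = [];
      const source = pattern
        .split('/')
        .map((segment) => {
          if (segment.startsWith(':')) {
            names.push(segment.slice(1));
            return '([^/]+)';
          }
          // Escape regex metacharacters in literal segments.
          return segment.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
        })
        .join('/');
      routes.push({ regex: new RegExp('^' + source + '$'), names, handler });
    },
    // Run the first handler whose pattern matches the given hash.
    // Returns the handler's result, or null when nothing matches.
    dispatch(hash) {
      const path = hash.replace(/^#/, '') || '/';
      for (const route of routes) {
        const match = route.regex.exec(path);
        if (match) {
          const params = {};
          route.names.forEach((name, i) => {
            params[name] = decodeURIComponent(match[i + 1]);
          });
          return route.handler(params);
        }
      }
      return null;
    },
  };
}

// In a browser you would wire it up roughly like this:
//   const router = createHashRouter();
//   router.on('/items/:slug', (params) => showItem(params.slug)); // showItem is hypothetical
//   window.addEventListener('hashchange', () => router.dispatch(location.hash));
//   router.dispatch(location.hash); // restores state on first load or refresh
```

The last commented line is the part that matters for Deep Linking: dispatching once on page load is what takes a refreshed or shared link back to the same place.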

Then, confusion sets in. In a world where the only way people get to Facebook is by Googling Facebook, you want your site to be ‘googleable’. This presents a problem: your application is dependent on JavaScript and ‘#’, and Google doesn’t know about those. Well then, brotha, Google’s got your back. All you have to do is put a little ! after your # and tell Google where to fetch your content and it’s like BLAM: SEO.

Except, no.

This is the part where I want to shake everyone. Like physically shake and be like “YOU! YOU’RE MISSING THE POINT”.

These are two different things. Crawlability and Deep Linking are two different things. They exist for different reasons and serve different goals, and conflating them not only makes it hard on users, but also makes it hard on developers. You end up doing neither of them right.

I’ve been asked about a million times now, “How do you do SEO with Sammy?” The answer is never simple, because it’s the wrong question. The question should really be – “If I’m really concerned about SEO, the crawlability of my site, and the persistence of my links, should I use Sammy?” The answer, without a doubt, is No. I’ll be the first one to say that Sammy and other similar frameworks are not for every site. Period. There’s no reason that your blog or your news site needs to load all its content with AJAX or needs to use ‘#’ to route for state. An important distinction in what I’m trying to say is that it’s “you shouldn’t have to”, not “you never should”. If you’re building a site like Gizmodo you should pretty much always at least START with good ol’ semi-static pages that don’t require JavaScript and load at old-fashioned URLs. This is not to say that you can’t build an application on top of these static pages that is more dynamic and interactive and relies on JavaScript.

Sammy and the ‘#’ are for applications. They provide a way to maintain state in a world where you can require JavaScript and even require the presence of certain browsers. If you’re an application that requires login/signup, you can make a number of demands of your users. You also probably don’t even want the crawlability. You’re using ‘#’ to maintain state for a specific user in a specific session.

Outside of the world of the ‘application’ you really, really shouldn’t rely on JavaScript being there for your site to work (at least at a basic level). Jen Lukas has already talked really eloquently about why that’s the case. I believe that this is where a lot of the recent frustration has come from, and really it should be directed at the use case, not the overall usage of ‘#’ for state. There have been some fingers pointed at Google for making this conflation possible, and I tend to agree.
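For reference, this is roughly what the ‘#!’ convention asks of you, as I understand Google’s AJAX crawling proposal: the crawler rewrites a ‘#!’ URL into one carrying an `_escaped_fragment_` query parameter, and your server is expected to answer that URL with a static snapshot. The sketch below only shows the URL rewrite. `toCrawlerUrl` is a hypothetical name, and the escaping is simplified to a handful of characters, so treat it as an illustration rather than a faithful implementation of the scheme.

```javascript
// Hypothetical helper: the URL a crawler would request for a '#!' URL.
function toCrawlerUrl(url) {
  const bangAt = url.indexOf('#!');
  if (bangAt === -1) return url; // plain URLs and plain '#' URLs are left alone

  const base = url.slice(0, bangAt);
  const state = url.slice(bangAt + 2);

  // Simplified escaping (an assumption): only %, #, &, + and space are encoded.
  const escaped = state.replace(/[%#&+ ]/g, (ch) =>
    '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0')
  );

  // Append to an existing query string if there is one.
  const joiner = base.includes('?') ? '&' : '?';
  return base + joiner + '_escaped_fragment_=' + escaped;
}

// toCrawlerUrl('http://example.com/#!/items/puffy-shirt')
//   gives 'http://example.com/?_escaped_fragment_=/items/puffy-shirt'
```

Notice that the state after the ‘#’ now has to mean something to your server as well as to your JavaScript, which is exactly where Deep Linking and crawlability get tangled together.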

How to do this right

I’m not going to go into detail about what framework you should use or how to lay out your app. Those are fairly complicated and my opinions could take up a couple blog posts. However, the basics are straightforward.

First, you need to determine some things about the structure of what you’re building and what your content is. Are you building an ‘application’ or a site? Does your content need to be searchable and reachable by the entire web? Do your links require true permanence, and will changing them in the future “break” traffic to your site?

If you need searchability and permanence and you’re not building an application, the best approach is to build your site as if JavaScript doesn’t exist. The fact is that there’s still a chunk of users for whom that will be true. Not only because they’re using old browsers, but because they’re on phones, or on assistive devices, or they just turn JavaScript off for ‘security’ reasons. If your site works this way and looks pretty decent (meaning it’s readable and the content is accessible) then basically you’re good. Golden. Your site works as expected, but maybe not all AJAAAAZZAY. It’s at this point that it’s acceptable to build a layer on top of your existing site and sprinkle the magic dust.
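As a rough sketch of what that layer might look like: the links below are ordinary links to real pages, and the script only takes over clicks when it is actually running. The `data-enhance` attribute and the `enhanceLinks` and `loadInto` names are assumptions made up for this example, not part of any library.

```javascript
// `root` is a document or element; `loadInto` is a hypothetical function that
// fetches a URL and swaps the content into the page without a full reload.
function enhanceLinks(root, loadInto) {
  const links = Array.from(root.querySelectorAll('a[data-enhance]'));
  links.forEach((link) => {
    link.addEventListener('click', (event) => {
      // Leave modified clicks (new tab, new window) to the browser.
      if (event.metaKey || event.ctrlKey || event.shiftKey) return;
      event.preventDefault();
      // The href is still a real, working URL; we just load it differently.
      loadInto(link.getAttribute('href'));
    });
  });
  return links.length; // how many links were enhanced
}

// In a browser, once the page has loaded:
//   enhanceLinks(document, (url) => { /* fetch `url` and update the page */ });
```

If the script never runs, nothing is lost: every link still points at an old-fashioned URL that the server can answer on its own.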

Twitter is an interesting example here because they took what was at its core really an information site, not all that different from a blog, and turned it into an application. The questions of URL permanence are really gray here, but I don’t find myself hating it at all.

Web standards are at the core of this debate. Many make a strong argument that the concept of the URL and HTTP is here for a reason, and that URLs have worked and continue to work. I think instead of fueling FUD, we should get back to the work of building a better standard and a better web.

A brief epilogue on pushState/HTML5 History

A lot of people are pointing to the new HTML5 History API as a much better solution to this problem. I agree, with reservations. First, the browser support for this currently isn’t very wide, and certain browsers that do support it have buggy/broken implementations. The bottom line is that it will work in some places, but not most, and this will probably be the case for years to come. That said, it is really exciting and I’m working on built-in support for it in the next version of Sammy.js. I have other reservations, too – namely I think that not all hashes are created equal, and in some cases the state you want to represent is really the state of a specific page, not a separate page that requires a fully separate URL.
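Given the patchy support, one way to hedge is to feature-detect `history.pushState` and fall back to the ‘#’ when it isn’t there. The sketch below is only an illustration and says nothing about how Sammy.js does or will handle this; `navigate` is a hypothetical name, and the window object is passed in as `win` so the function can be tried outside a browser.

```javascript
// Change the address bar to reflect `path` without reloading the page.
// Returns which mechanism was used: 'pushState' or 'hash'.
function navigate(win, path) {
  const supportsPushState = Boolean(
    win.history && typeof win.history.pushState === 'function'
  );

  if (supportsPushState) {
    // Real URL, no reload. pushState itself does not fire a popstate event,
    // so the application still has to render the new state on its own.
    win.history.pushState({ path: path }, '', path);
    return 'pushState';
  }

  // Fallback: keep the state after the '#', which fires 'hashchange'.
  win.location.hash = '#' + path;
  return 'hash';
}

// In a browser: navigate(window, '/items/puffy-shirt');
```

Detection only tells you the function exists, not that the implementation is free of bugs, so a real application would likely need per-browser checks on top of this.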

9 Responses to “#-ish”


I’d been itching to write this response, almost to the letter. Agreed entirely.

Thanks for writing this. I was too lazy to say all the exact same things, and I would not have said it as well, most likely.

The only problem is that you’re too well-reasoned, too rational. This does not carry the fire of linkbait. You should change the title to “Obama and Palin Get In Fist Fight Over Hash Bang Nerd War”

I second this response, and agree entirely.

I think your opinions on this whole hashbang debate are very much in line with my own. Hashbang URLs are *not* breaking the web (contrary to what some may say), it’s people making incorrect usage of them that’s causing that. With a balance between correct application, implementation and graceful deg in mind there’s simply no reason why we can’t offer *some* SEO-friendly experience for users without JS turned on and a richer (Ajaxified) experience for the rest of us.

Web Axe Says:

Completely disagree. Not good to break conventions and standards then blame the user when they complain.

zdennis Says:

@WebAxe, I don’t see anyone blaming the users in this article (or anywhere).

I agree with this article and find its rationale and reasoning well thought out. Great post.

Tim Bray’s concise article on Broken Links seems way too elementary and overly simplified. I agree with your application vs. web-site distinction. It is amusing to me that the hashbang antagonists say that the hashbang is only implemented “because it’s cool”. That argument doesn’t make sense. Maybe they thought it was cool when they thought of using it to solve a particular problem, but I highly doubt the discussion for implementing it was because it was cool; more likely they were trying to solve a problem.

jkulak Says:

Totally agreed, the bottom line would be to use appropriate technologies and approaches, define own goals, and still, have in mind backward compatibility.

By the way, I find hashbang URLs neat and ‘futuristic’.



QuirkeyBlog is Aaron Quint's perspective on the ongoing adventure of Code, Life, Work and the Web.



