static.html

— a blog about web development and whatnot by Steve Webster

Link prefetching should be a relatively simple technique that web developers can use to improve the performance of their pages. Unfortunately, a crippling bug in Chrome means that using this technique will result in your users being served up a broken page.

Update: This bug has been fixed as of Chrome 10. Move along please - nothing to see here :)

Link prefetching is a mechanism by which developers can instruct the browser to pre-cache pages or resources that the user might end up downloading in the near future. From the MDC Link Prefetching FAQ:

Link prefetching is a browser mechanism, which utilizes browser idle time to download or prefetch documents that the user might visit in the near future. A web page provides a set of prefetching hints to the browser, and after the browser is finished loading the page, it begins silently prefetching specified documents and stores them in its cache. When the user visits one of the prefetched documents, it can be served up quickly out of the browser's cache.

Taking advantage of this is as simple as adding code such as the following to the <head> section of your page:

<link rel="prefetch" href="/path/to/resource">

Supporting browsers will fetch and cache the referenced resource in the background, so that when it's used on a subsequent page it can be loaded from the cache rather than having to fire off an HTTP request. The same mechanism can be used to pre-cache images, JavaScript files, or even HTML pages.

At the time of writing only Firefox 3.5+ and Chrome 8+ support link prefetching, and it's the latter of these two browsers that renders the above technique completely unusable.

Chrome, link prefetching and corruption

What can Chrome possibly be doing that means the mere presence of a <link rel="prefetch"> on a page can cause serious rendering issues? I'm glad you asked.

Firstly, Chrome doesn't bother to wait until the browser is idle before downloading these resources; it starts as soon as each prefetch link is processed. If you're using this technique to pre-cache large images and your user's bandwidth is constrained, this could have a detrimental effect on the load time of the current page. There's an open bug to fix this issue, which will hopefully make it into the next version of Chrome.

Far more egregious, however, is that any attempt to use these assets in a subsequent page will fail silently. The file is cached incorrectly, and the browser simply gives up when it notices that. Since the file is cached it's not re-downloaded, but it's cached in such a way that it will always fail to load. It's like a silent 404 error. I told you it was bad.

Digging deeper

To be fair, this is a problem that exists in the WebKit framework rather than Chrome itself. However, Chrome is the only commonly-available browser to be built on a version of WebKit with link prefetching support enabled. Consequently it's the only browser that's exhibiting this behaviour, so I don't feel too bad about heaping scorn on it.

I've been delving through the WebKit source code, and I've found that the resources loaded via link prefetching are added to the memory cache with a type of LinkPrefetch. (Warning, extremely simplified pseudo-code ahead.)

resource = create_resource( LinkPrefetch, url )
resource.load()
cache_resource( url, resource )

When using that resource on subsequent pages as a stylesheet, for example, WebKit goes looking in the cache for a matching URL. It finds the resource in the cache, but WebKit double-checks the type of the resource to make sure it's a CSS resource (CSSStyleSheet) before applying it to the document.

resource = get_cached_resource( url )

if ( not resource ) {
    resource = create_resource( CSSStyleSheet, url )
    resource.load()
    cache_resource( url, resource )
}

if ( resource.type() == CSSStyleSheet ) {
    add_stylesheet_to_document( resource )
} else {
    // Do nothing
}

WebKit never attempts to re-download the resource if the types don't match, so the result is the same as if the CSS file you referenced never existed. This will persist until either:

a. the user closes the browser, thus obliterating the memory cache;
b. it's bumped out of the cache to make room for something else;
c. the user manually clears their browser cache; or
d. the time-to-live on the cached resource expires.
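The lookup logic described above can be made concrete with a small runnable model. This is only a sketch of the behaviour as I understand it, not WebKit code: the `Browser` class, its method names and the `/test.css` URL are all made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ResourceType(Enum):
    LINK_PREFETCH = auto()
    CSS_STYLE_SHEET = auto()


@dataclass
class Resource:
    type: ResourceType
    url: str


class Browser:
    """Toy model of the memory cache lookup (hypothetical, not WebKit's API)."""

    def __init__(self):
        self.memory_cache = {}         # url -> Resource
        self.network_requests = []     # every URL fetched over the network
        self.applied_stylesheets = []  # URLs actually applied to the document

    def _load(self, resource_type, url):
        # Fetch the resource and cache it under the type it was requested as.
        self.network_requests.append(url)
        resource = Resource(resource_type, url)
        self.memory_cache[url] = resource
        return resource

    def prefetch(self, url):
        self._load(ResourceType.LINK_PREFETCH, url)

    def use_stylesheet(self, url):
        resource = self.memory_cache.get(url)
        if resource is None:
            resource = self._load(ResourceType.CSS_STYLE_SHEET, url)
        if resource.type is ResourceType.CSS_STYLE_SHEET:
            self.applied_stylesheets.append(url)
            return True
        return False  # type mismatch: silently do nothing, no re-download


browser = Browser()
browser.prefetch("/test.css")                  # first page
applied = browser.use_stylesheet("/test.css")  # second page
print(applied)                   # False
print(browser.network_requests)  # ['/test.css']
```

The stylesheet is never applied and no second request is made, which matches the silent failure: the only way out in this model is for the entry to leave `memory_cache`.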

Test case

I've put together a simplified test case for you to try. The first page contains a prefetch link for a CSS file, which is in turn loaded as a stylesheet in the following page to style the paragraph green. There's some JavaScript there to programmatically verify the result of the test.

Note: I've seen Chrome get itself into a state where it will steadfastly fail to make the original prefetch request on the first page. You see an error in the WebInspector console saying that the resource could not be loaded, but no network requests are ever made.

WebInspector Resource graph showing failed prefetch

This masks the issues outlined above because the memory cache entry is never created, so when you move to the second step Chrome makes a new request for the resource and the test seems to pass.

WebInspector Resource graph showing test.css request on test page

When this happens, the only recourse seems to be to close Chrome, delete its User Data directory, restart Chrome, clear the browser cache (yes, somehow that makes a difference) and load the first page of the test case again.

If you're using Chrome 8 or 9 and the test seems to pass, check the Web Inspector to confirm that the above issue isn't masking the problem.

I'll work on updating the test case so that it can distinguish between a prefetch that failed and the condition described by this article.

The fix, and platform differences

There are related bugs filed against both Chrome and WebKit, and the fix that has landed in WebCore prevents LinkPrefetch resources from being entered into the memory cache in the first place. They're still stored in the network-level cache so users still get the benefit of link prefetching, but because the network cache is type-agnostic we don't have any of the type-mismatching issues.
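To see why keeping prefetched resources out of the memory cache is enough, here is a minimal sketch of the two cache layers. It is a simplified model under my own assumptions rather than the actual WebCore change: `FixedBrowser`, its attributes and the placeholder response body are all hypothetical.

```python
class FixedBrowser:
    """Toy model of the fix: prefetches only warm the network cache."""

    def __init__(self):
        self.network_cache = {}  # url -> raw bytes, no resource type recorded
        self.memory_cache = {}   # url -> (type name, bytes)
        self.http_requests = []  # URLs that actually hit the network

    def _fetch(self, url):
        # Serve from the type-agnostic network cache when possible.
        if url not in self.network_cache:
            self.http_requests.append(url)
            self.network_cache[url] = b"p { color: green; }"  # placeholder body
        return self.network_cache[url]

    def prefetch(self, url):
        # No memory cache entry is created for a prefetch.
        self._fetch(url)

    def use_stylesheet(self, url):
        entry = self.memory_cache.get(url)
        if entry is None:
            # Memory cache miss: build a CSS resource from the cached bytes.
            entry = ("CSSStyleSheet", self._fetch(url))
            self.memory_cache[url] = entry
        return entry[0] == "CSSStyleSheet"


fixed = FixedBrowser()
fixed.prefetch("/test.css")
styled = fixed.use_stylesheet("/test.css")
print(styled)               # True
print(fixed.http_requests)  # ['/test.css']
```

The stylesheet is applied and only one HTTP request is made, so the benefit of the prefetch survives while the typed lookup never sees a LinkPrefetch entry.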

This fix has been applied to the version of WebKit being used by Chrome 10, which is currently in development. Unfortunately Chrome 8 and 9 both still exhibit this buggy behaviour, so we're stuck in a position where we can't serve up <link rel="prefetch"> elements until Chrome 10 is released and all Chrome users have upgraded. Thankfully Chrome automatically upgrades itself when new versions are released, but with Chrome 10 in development it's still likely to be at least a few months before we can start to use this technique again.
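If waiting isn't an option, one possible stopgap is to leave the prefetch links out only for the affected Chrome versions. The following is a rough server-side sketch, assuming you are happy to sniff the User-Agent header (which is brittle); the function names and the sample User-Agent strings are illustrative only.

```python
import re
from html import escape

# Chrome identifies itself with a "Chrome/<major>.<minor>..." token.
CHROME_VERSION = re.compile(r"\bChrome/(\d+)\.")
FIRST_FIXED_CHROME = 10  # per the article, the fix ships in Chrome 10


def should_emit_prefetch(user_agent):
    """Return False only for Chrome versions older than the fixed one."""
    match = CHROME_VERSION.search(user_agent)
    if match is None:
        return True  # not Chrome; non-supporting browsers ignore the hint
    return int(match.group(1)) >= FIRST_FIXED_CHROME


def prefetch_links(urls, user_agent):
    """Render <link rel="prefetch"> tags, or nothing for affected browsers."""
    if not should_emit_prefetch(user_agent):
        return ""
    return "\n".join(
        f'<link rel="prefetch" href="{escape(url, quote=True)}">' for url in urls
    )


# Illustrative User-Agent strings, not captured from real traffic.
chrome_8 = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.10 Chrome/8.0.552.224 Safari/534.10"
firefox = "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0"

print(prefetch_links(["/test.css"], chrome_8) == "")  # True
print(prefetch_links(["/test.css"], firefox))
# <link rel="prefetch" href="/test.css">
```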

Thoughts on the fix

Even with the fix, I can't help but think that this code should be more forgiving of type-mismatches. At the very least I would have removed the object from the in-memory cache and resubmitted a request to load the resource from the network. The rewritten pseudo-code would then look something like this:

resource = get_cached_resource( url )

if ( not resource or resource.type() != CSSStyleSheet ) {
    // Cache miss or wrong type: reload and replace the cached entry
    resource = create_resource( CSSStyleSheet, url )
    resource.load()
    cache_resource( url, resource )
}

add_stylesheet_to_document( resource )

The outcome would be the same, but this version would also protect against resources being accidentally cached under the wrong type and silently failing to load. There have been attempts to use a print stylesheet to pre-cache JavaScript, but that approach is afflicted by similar type-checking logic when WebKit tries to load the pre-cached JavaScript file via a <script> tag:

This technique has turned out to be dangerous in Chrome. It seems that Chrome will load the JS files into the cache, but then set an implied type="text/css" on them. This means that it will refuse to re-use them as JavaScript on future pages, until they have left the cache. I can no longer recommend this technique, but hope that it was at least interesting.

Refactoring the logic to be a little more forgiving (as above) would neatly solve this problem.
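As a rough illustration of that more forgiving lookup, here is a small runnable sketch applied to the print-stylesheet trick. It models my suggestion rather than anything in WebKit: `ForgivingCache`, the type names and the `/app.js` URL are hypothetical.

```python
class ForgivingCache:
    """Toy cache that reloads a resource when its cached type is wrong."""

    def __init__(self):
        self.entries = {}  # url -> type name the resource was cached under
        self.loads = []    # (type name, url) for every load performed

    def load(self, resource_type, url):
        self.loads.append((resource_type, url))
        self.entries[url] = resource_type

    def get_for(self, expected_type, url):
        if self.entries.get(url) != expected_type:
            # Cache miss or type mismatch: evict and reload under the
            # type that is actually being requested.
            self.entries.pop(url, None)
            self.load(expected_type, url)
        return self.entries[url]


cache = ForgivingCache()
cache.load("CSSStyleSheet", "/app.js")     # pre-cached via a print stylesheet
result = cache.get_for("Script", "/app.js")  # later used in a <script> tag
print(result)       # Script
print(cache.loads)  # [('CSSStyleSheet', '/app.js'), ('Script', '/app.js')]
```

The cost is one extra load when the types disagree, which seems a fair price compared with a script or stylesheet that silently never runs.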

Comments

I've yet to implement comments on this site (one of the failings of a static publishing system is that it makes such things much more difficult), so I've submitted the link to Hacker News for discussion.