AJAX, Google and Screen Scraping

The ability to build applications that are not page oriented, i.e. where a single page is updated using DHTML, CSS, JavaScript and XHR, complicates matters for search engines (not to mention screen scrapers). This is an area that is still evolving, but Backbase (backbase.com) already has some ideas: they've published a whitepaper entitled "Designing Rich Internet Applications For Search Engine Accessibility," which proposes three methods of making your Ajax app searchable:

  • Lightweight Indexing: no structural changes are made to your site; existing tags such as meta, title and h1 are leveraged.
  • Extra Link Strategy: extra links are placed on the site, which search bots can follow and thereby index the whole site.
  • Secondary Site Strategy: a secondary site is created, which is fully accessible to the search engine.
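To make the Extra Link Strategy a little more concrete, here is a minimal TypeScript sketch that renders a plain list of crawlable links, one per application view. The AppView type, the renderExtraLinks function and the "/view/<id>" URL scheme are all hypothetical names chosen for illustration; they are not taken from the Backbase whitepaper, and each linked URL would still have to be served as ordinary HTML for a bot to index it.

```typescript
// Hypothetical description of one view (state) of a single-page app.
interface AppView {
  id: string;    // state identifier used by the client-side code
  title: string; // human-readable link text
}

// Escape text so it is safe to place inside HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one plain <a href> per view. The "/view/<id>" scheme is an
// assumption: the server must answer each of these URLs with static HTML.
function renderExtraLinks(views: AppView[]): string {
  const items = views.map(
    (v) =>
      `  <li><a href="/view/${encodeURIComponent(v.id)}">${escapeHtml(v.title)}</a></li>`
  );
  return "<ul>\n" + items.join("\n") + "\n</ul>";
}
```

The resulting list can sit in a footer or a noscript block, so regular users keep the Ajax experience while bots get something to follow.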

Making your single-page app searchable could be a lot of work.

Dan Klyn has a different take on the matter, citing ROR (an XML site description format), Google Base and Google Sitemaps as options.
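As a rough illustration of the sitemap option, the TypeScript sketch below builds a minimal sitemap document from a list of URLs. It assumes the sitemaps.org 0.9 namespace, which is the commonly used schema today and may differ from what Google Sitemaps accepted when this was written; buildSitemap and escapeXml are hypothetical helper names, and the example URL is made up.

```typescript
// Escape the five characters that are special in XML text and attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a minimal sitemap: one <url><loc> entry per crawlable URL.
// Assumption: every URL returns real HTML for the state it represents.
function buildSitemap(urls: string[]): string {
  const entries = urls.map((u) => `  <url><loc>${escapeXml(u)}</loc></url>`);
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    entries.join("\n") +
    "\n</urlset>"
  );
}
```

Calling buildSitemap(["http://example.com/view/inbox"]) yields a document you could save as sitemap.xml and submit to the search engine.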

Ajax bookmarks tie into this as well: when you search for a particular term on Google, you'd like to be able to navigate directly to the matching state of the app. This article from OnJava.com suggests using the Really Simple History framework.
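One common way to make Ajax state bookmarkable is to keep it in the URL fragment (the part after the #). The TypeScript sketch below shows only that encoding and decoding step; it is not the API of the history framework mentioned above, and ViewState, encodeState and decodeState are hypothetical names.

```typescript
// Hypothetical bag of string values that describes the current view.
type ViewState = { [key: string]: string };

// Serialize the state as "key=value&key=value", with keys sorted so that
// the same state always produces the same bookmark.
function encodeState(state: ViewState): string {
  return Object.keys(state)
    .sort()
    .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(state[k])}`)
    .join("&");
}

// Parse a fragment (with or without the leading "#") back into a state.
function decodeState(fragment: string): ViewState {
  const state: ViewState = {};
  const body = fragment.charAt(0) === "#" ? fragment.slice(1) : fragment;
  if (body === "") {
    return state;
  }
  body.split("&").forEach((pair) => {
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? "" : pair.slice(eq + 1);
    state[decodeURIComponent(key)] = decodeURIComponent(value);
  });
  return state;
}
```

In a browser you would assign encodeState(state) to window.location.hash whenever the view changes, and call decodeState(window.location.hash) on page load to restore the bookmarked view.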

I suspect that we'll see standards evolve around bookmarking, so it's probably too soon to put all of your search and bookmark eggs in one basket. Now, what will the people who live off of screen scraping do? Their task has just become a little bit more difficult.
