Making Sure Your Site is Crawler-friendly
By Jill Whalen - February 10, 2004
Another obstacle to getting your pages thoroughly crawled is using the exact same Title tag on every page of your site. This sometimes happens because of Webmaster laziness, but more often a content management system (CMS) automatically pulls in a default Title tag. If you have this problem, it's well worth taking the time to fix it.
Most CMSs have workarounds that let you add a unique Title tag instead of pulling up the same one for each page. Usually the programmers simply never realized it was important, so it was never done. The cool thing is that with dynamically generated pages you can often set your templates to pull a particular sentence from each page and plug it into your Title field. A nice little "trick" is to make sure each page has a headline at the top that uses your most important keyword phrases. Once you've got that, you can set your CMS to pull it out and use it for your Titles as well.
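To make the headline "trick" concrete, here is a minimal sketch of how a template step might build a unique Title from each page's headline. It assumes the headline is the first <h1> on the page; the names HeadlineExtractor and title_for_page, the " | " separator, and the "Untitled" fallback are all made up for illustration and don't come from any particular CMS.

```python
from html.parser import HTMLParser


class HeadlineExtractor(HTMLParser):
    """Collects the text of the first <h1> element in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <h1> counts; later ones are ignored.
        if tag == "h1" and not self._done:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            self._done = True

    def handle_data(self, data):
        # Text inside nested tags such as <em> is collected as well.
        if self._in_h1:
            self._parts.append(data)

    @property
    def headline(self):
        # Collapse runs of whitespace into single spaces.
        return " ".join("".join(self._parts).split())


def title_for_page(body_html, site_name, fallback="Untitled"):
    """Build a per-page Title from the page's own headline."""
    parser = HeadlineExtractor()
    parser.feed(body_html)
    parser.close()
    headline = parser.headline or fallback
    return f"{headline} | {site_name}"
```

A template would call something like title_for_page(page_html, "Example Site") and write the result into the Title field, so every page gets a Title that matches its own headline.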
Another reason I've seen for pages not being crawled is that they are set to require a cookie when a visitor gets to the page. Well guess what, folks? Spiders don't eat cookies! (Sure, they like beer, but they hate cookies!) No, you don't have to remove your cookies to get crawled. Just don't force-feed them to anyone and everyone. As long as they're not required, your pages should be crawled just fine.
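Here's a rough sketch of what "offered but not required" could look like on the server side. Everything in it is hypothetical: the handle_request function, the session_id cookie name, and the pages dictionary are stand-ins, not the API of any real web framework.

```python
import uuid


def handle_request(path, cookies, pages):
    """Return (status, headers, body) for one request.

    cookies is a dict of whatever the client sent; for a search engine
    spider it will usually be empty.
    """
    body = pages.get(path)
    if body is None:
        return 404, {}, "Not found"

    headers = {}
    if "session_id" not in cookies:
        # Offer a cookie to new visitors, but never redirect or block
        # the request just because the cookie is missing.
        headers["Set-Cookie"] = f"session_id={uuid.uuid4().hex}; Path=/"

    # The page content is served either way, so a cookieless spider
    # still receives the full page.
    return 200, headers, body
```

The point of the design is simply that the cookie check never sits between the visitor and the content. A site that answers a cookieless request with a redirect or an error page is the one that runs into trouble.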
What about the use of JavaScript? We've often heard that JavaScript is unfriendly to the crawlers. This is partly true and partly false. Nearly every site I look at these days uses some sort of JavaScript within the code. It's certainly not bad in and of itself. As a rule of thumb, if you're using JavaScript for mouseover effects and that sort of thing, just check to make sure that the HTML code for the links also uses the traditional <a href> tag. As long as that's there, you'll most likely be fine. For extra insurance, you can repeat any JavaScript-driven links as plain HTML links inside a <noscript> tag, put text links at the bottom of your pages, and create a visible link to a sitemap page that contains links to all your other important pages. It's definitely not overkill to do *all* of those things!
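If you'd rather not eyeball the source of every page, a small script can do the <a href> check for you. The sketch below is only an illustration: LinkAuditor and audit_links are made-up names, and the rule it applies (an href that is empty, "#", or starts with "javascript:" counts as script-only) is my own assumption about what a spider can't follow.

```python
from html.parser import HTMLParser


class LinkAuditor(HTMLParser):
    """Sorts <a> tags into ones with a plain href and ones that rely on script."""

    def __init__(self):
        super().__init__()
        self.crawlable = []
        self.script_only = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Attribute values can be None when an attribute has no value.
        href = (attr_map.get("href") or "").strip()
        if href and href != "#" and not href.lower().startswith("javascript:"):
            # A traditional link: mouseover handlers on it do no harm.
            self.crawlable.append(href)
        else:
            # Nothing here for a spider to follow without running script.
            self.script_only.append(attr_map.get("onclick") or href or "(no href)")


def audit_links(html):
    """Return (crawlable_hrefs, script_only_links) for a chunk of HTML."""
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.crawlable, auditor.script_only
```

Anything that lands in the script-only list is a candidate for a plain text link, a <noscript> alternative, or an entry on your sitemap page.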
There are plenty more things you can worry about where your site's crawlability is concerned, but those are the main ones I've been seeing lately. One day, I'm sure that any type of page under the sun will be crawler-friendly, but for now, we've still gotta give our little arachnid friends some help.
One tool I use to spot potential crawler problems is the Lynx browser. Generally, if your pages can be viewed and clicked through in Lynx (a text-only browser that came before our graphical browsers of today), then a search engine spider should also be able to make its way around. It's not foolproof, but it's at least one way of discovering potential problems that you may be having. For example, I just checked my forum in Lynx and it shows a blank page, yet the forum gets spidered and indexed by the search engines without a problem.
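If you don't have Lynx handy, you can get a very rough approximation of a text-only view with a few lines of code. This sketch is not how Lynx or any search engine spider actually works; it simply throws away script and style blocks and keeps whatever text is left, and the names TextOnlyView and text_only_view are made up for the example.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Keeps the text of a page and drops <script> and <style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore whitespace-only runs and anything inside script or style.
        if not self._skip_depth and data.strip():
            self._chunks.append(" ".join(data.split()))

    def text(self):
        return "\n".join(self._chunks)


def text_only_view(html):
    """Return the text left on a page once scripts and styles are removed."""
    view = TextOnlyView()
    view.feed(html)
    view.close()
    return view.text()
```

If text_only_view comes back empty for a page, that's a hint that the content is being written out by script, which is worth a closer look. As with Lynx itself, treat the result as a clue and not a verdict.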
This is a good time to remind you that when you think your site isn't getting spidered completely, check out lots of things before jumping to any conclusions.
Jill
Jill Whalen is the owner of HighRankings.com and moderator of the free weekly email newsletter, the High Rankings' Advisor. She is also known for her moderation of the critically acclaimed Rank Write Roundtable. Jill specializes in search engine optimization, directory submissions, SEO consultations and workshops. She has obtained hundreds of number 1 and 2 spots for her vast array of clients throughout the years. Clients include multi-million dollar companies, major universities, real estate agencies, attorneys, surgeons, dentists, and small to medium-sized businesses.