SEO and Web Development: Pros and Cons of Sitemap.xml
When designing the architecture and navigational structure of a web site, a sitemap can be extremely helpful. A sitemap can keep the navigation links organized by category, predict how any dynamic links will function, and keep the framework under control. Web developers are not the only parties involved in the process who can benefit from a well-structured sitemap. Many search engine crawlers, including Googlebot, can read an XML file that contains the sitemap data and use it to discover and index the site's content.
What is sitemap.xml?
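A sitemap.xml file lists the URL of each page within a site. It can also carry metadata about each page, such as when it was last modified, how often it changes, and which image and video files it contains.
Here is an example of a sitemap for a simple page with a single image and a single video file: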
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>http://www.example.com/cutekitty.html</loc>
    <lastmod>2012-09-09</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
    <image:image>
      <image:loc>http://example.com/image.jpg</image:loc>
    </image:image>
    <video:video>
      <video:content_loc>http://www.example.com/cutekitty.flv</video:content_loc>
      <video:player_loc allow_embed="yes" autoplay="ap=1">http://www.example.com/videoplayer.swf?video=123</video:player_loc>
      <video:thumbnail_loc>http://www.example.com/thumbs/cutekitty.jpg</video:thumbnail_loc>
      <video:title>Cute Kitty</video:title>
      <video:description>Cute Kitty Playing with Yarn</video:description>
    </video:video>
  </url>
</urlset>
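To see how the pieces of such a file relate to each other, here is a minimal Python sketch that reads a sitemap and lists what a crawler could learn from each entry. It uses only the standard library. The function name read_sitemap and the trimmed SAMPLE string are illustrative assumptions, not part of any sitemap tool.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the example sitemap above.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
    "video": "http://www.google.com/schemas/sitemap-video/1.1",
}

# A trimmed copy of the example above, used only for illustration.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>http://www.example.com/cutekitty.html</loc>
    <lastmod>2012-09-09</lastmod>
    <image:image>
      <image:loc>http://example.com/image.jpg</image:loc>
    </image:image>
    <video:video>
      <video:title>Cute Kitty</video:title>
    </video:video>
  </url>
</urlset>
"""


def read_sitemap(xml_text):
    """Return one dict per <url> entry (hypothetical helper)."""
    # Parse from bytes so the encoding declaration is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    pages = []
    for url in root.findall("sm:url", NS):
        pages.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "images": [e.text for e in url.findall("image:image/image:loc", NS)],
            "video_titles": [e.text for e in url.findall("video:video/video:title", NS)],
        })
    return pages


if __name__ == "__main__":
    for page in read_sitemap(SAMPLE):
        print(page)
```

Each page entry carries its own list of images and videos, which is why the media tags are nested inside the url element rather than listed separately.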
Here are some of the advantages and drawbacks of using sitemap.xml:
PRO: Site Indexing for Flash-Based Sites
As we saw in earlier articles, some search engines are not equipped to crawl through Flash-based content. With the sitemap.xml file, the search engines can examine the relevant content in the XML tags that refer to a Flash file (such as a video). The file can also list the URLs behind a Flash-based navigation system, which would not typically catch the attention of a spider:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>http://www.example.com/pets/?id=cats</loc>
    <lastmod>2012-09-09</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.example.com/pets/?id=dogs</loc>
    <lastmod>2012-09-09</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.example.com/pets/?id=fish</loc>
    <lastmod>2012-09-09</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
PRO: Revisit New and Changing Pages
The <changefreq> tag tells the spiders how often a page's content is expected to change. The sitemap protocol treats this value as a hint rather than a command, so a page that changes often is likely, but not guaranteed, to be revisited more frequently for fresh content. Fresh content also keeps return traffic flowing to a site to find out "what's new".
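One way to choose a value for the tag is to derive it from how often a page has actually been edited. The Python sketch below maps an average edit interval to one of the seven values the sitemap protocol allows. The function name suggest_changefreq and the thresholds are illustrative assumptions, not part of the protocol.

```python
from datetime import timedelta

# The seven values the sitemap protocol allows for <changefreq>.
CHANGEFREQ_VALUES = (
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
)


def suggest_changefreq(average_interval):
    """Map the average time between edits to a <changefreq> value.

    Hypothetical helper: the cut-off points are assumptions chosen for
    illustration. Pass None for archived pages that no longer change.
    """
    if average_interval is None:
        return "never"
    if average_interval <= timedelta(hours=1):
        return "hourly"
    if average_interval <= timedelta(days=1):
        return "daily"
    if average_interval <= timedelta(days=7):
        return "weekly"
    if average_interval <= timedelta(days=31):
        return "monthly"
    return "yearly"


if __name__ == "__main__":
    print(suggest_changefreq(timedelta(hours=12)))
    print(suggest_changefreq(timedelta(days=3)))
```

Reporting a frequency that matches a page's real update pattern keeps the hint credible. Marking every page as changing daily gives the crawler nothing to prioritize.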
PRO: Easy to Create
Developers need not write their sitemap.xml files by hand. Several companies offer freeware and low-cost solutions for creating sitemaps. These tools are available for a wide array of languages and operating systems and can either be downloaded or used through the browser.
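For a sense of what such a tool does under the hood, here is a minimal Python sketch that builds a sitemap from a list of pages using only the standard library. The function name build_sitemap, the tuple layout, and the second URL with its page=2 parameter are assumptions made for illustration. The xml_declaration argument requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as bytes (hypothetical helper).

    `pages` is a list of (loc, lastmod, changefreq, priority) tuples,
    all given as strings.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in pages:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in text for us.
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = priority
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        ("http://www.example.com/pets/?id=cats", "2012-09-09", "daily", "1.0"),
        # Hypothetical URL with two parameters, to show "&" being escaped.
        ("http://www.example.com/pets/?id=dogs&page=2", "2012-09-09", "daily", "1.0"),
    ])
    with open("sitemap.xml", "wb") as f:
        f.write(xml_bytes)
```

Generating the file through an XML library rather than string concatenation avoids a common mistake: URLs with several query parameters contain an ampersand, which must appear as an escaped entity in the file.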
CON: Limited File Size
Under the sitemap protocol, a single sitemap.xml file is limited to 50,000 URLs. Last year, Google announced that the sitemap.xml file must be no larger than 50MB for it to be submitted to its Webmaster Tools service. For retail sites with a large number of products, developers can work within both limits by splitting the URLs across several XML files, listing those files in a sitemap index file, and submitting the index file.
Sitemap Index File Example
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap_cats.xml.gz</loc>
    <lastmod>2004-10-01T18:23:17+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap_dogs.xml.gz</loc>
    <lastmod>2005-01-01</lastmod>
  </sitemap>
</sitemapindex>
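The splitting step can be sketched in a few lines of Python. The example below divides a URL list into groups of at most 50,000 and builds an index that points at one file per group. The function names chunk_urls and build_index, the product URLs, and the sitemap_N.xml file names are all assumptions for illustration. The sketch checks only the URL count, not the file size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit described above


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_locs, lastmod):
    """Return sitemap index XML as a string (hypothetical helper)."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_locs:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical catalog of 120,000 product pages.
    urls = [f"http://www.example.com/product/{i}" for i in range(120_000)]
    chunks = chunk_urls(urls)
    # Hypothetical naming scheme: one numbered sitemap file per chunk.
    locs = [f"http://www.example.com/sitemap_{n}.xml"
            for n in range(1, len(chunks) + 1)]
    print(len(chunks), "sitemap files")
    print(build_index(locs, "2012-09-09"))
```

Each chunk would then be written to its own sitemap file, and only the index file needs to be submitted.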
CON: Not All URLs Crawled
When a developer submits a sitemap, the search engine learns which URLs exist, but submission does not guarantee that every one of them will be crawled and indexed. A sitemap alone will not address any flaws in navigational structure that would prevent a spider from crawling the site and returning the relevant information.
CON: No Rank Improvement
Pages that the search engine fails to crawl will not be entered into its algorithms for calculating page rank. While a sitemap will signal to the search engine that the pages are available, it alone does not guarantee an improvement in search relevance for that content.
As with other SEO tools, a sitemap has a specific role: get these pages and their content indexed into the search engine databases. A sitemap is a tool, not a panacea. The issues that surround the development of high-ranking pages are multi-dimensional and require multiple tools. We will explore more of those tools in future articles.
ABOUT THE AUTHOR
Developer Drive is a quality Web development blog featuring tutorials, tips, news and reviews on things that matter to developers. We cover the latest trends and techniques such as CSS3, HTML5, WordPress, responsive/mobile design and much more.