My previous post, Build a Search Engine SiteMap in C#, covered how to create a sitemap.xml file using the file system. It also provided guidelines on validating the sitemap and submitting it to the major search engines. If you came here for background on search engine sitemaps, go read that post first. If all you care about is the HttpHandler, you may proceed.
An Overview of the HttpHandler
Scenario: There is a page you wish to generate dynamically with code, but you want it to have a file extension that isn't normally dynamic, like .xml. RSS feeds and sitemaps are two examples of XML files that are ideal for an HttpHandler. There are other uses for HttpHandlers, but in this post we are only interested in creating a dynamic sitemap.
Here is a brief overview of how the HttpHandler will work. A request will be made for sitemap.xml, probably by a search engine like Google. Instead of looking for it on the file system, your web application will intercept that request and pass it off to a class which will generate and deliver an XML document.
Step 1: Create SitemapHandler.cs
Inside the App_Code folder create SitemapHandler.cs. This class will implement the IHttpHandler interface. Before we deliver an XML document, let's create a simple HTML test to make sure the HttpHandler is working.
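using System.Web;

namespace HttpExtensions
{
    // Handles requests for the sitemap; registered in web.config.
    public class SitemapHandler : IHttpHandler
    {
        public SitemapHandler()
        { }

        #region IHttpHandler Members

        // The handler keeps no per-request state, so one instance can be reused.
        public bool IsReusable
        {
            get { return true; }
        }

        public void ProcessRequest(HttpContext context)
        {
            HttpResponse response = context.Response;
            response.ContentType = "text/html";
            response.Write("<html><body><h1>HTTP Handler is Working!</h1></body></html>");
        }

        #endregion
    }
}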
Update the Web.Config
Add the following section inside system.web. The sitemap.aspx line is for debugging purposes only and will be removed once everything is working.
<httpHandlers>
<add verb="*" path="sitemap.xml" type="HttpExtensions.SitemapHandler"/>
<add verb="*" path="sitemap.aspx" type="HttpExtensions.SitemapHandler"/>
</httpHandlers>
From Visual Studio 2005, test the site. Now type sitemap.xml into the path of the URL. You should see your HTTP Handler is Working! message. If you type sitemap.aspx, you should see the same message. As long as you view your site through Visual Studio 2005, your handler will work fine for both cases, because its built-in development web server passes every request to ASP.NET. Once you hand that job back to IIS, you'll need one more step.
Map *.XML to the aspnet_isapi.dll
If you run your code without doing this step, you will see your HTTP Handler is Working! message only on the sitemap.aspx request. Any request to sitemap.xml will return a 404 Page Not Found error. Instead of the HTTP Handler intercepting the request, IIS sees that the file extension is not mapped to ASP.NET, so it handles the request itself. It looks on the file system, doesn't find a sitemap.xml, and returns the 404.
The solution is to map the *.xml file extension to the ASP.NET DLL (aspnet_isapi.dll). Once this is done and the server is restarted, the handler should work. For more information on how to do the IIS mapping, go to Protecting Files with ASP.NET and scroll down to Protecting .mdb Files. Substitute .xml for .mdb when following those directions.
Back to the SitemapHandler
Now that we have a working HttpHandler, we can go back and replace the HTTP Handler is Working! message with a real sitemap XML file. I've commented out the loop that adds pages. That is where you would add the code to pull the URLs from some data store, be it a component or a database.
// Requires: using System.IO; using System.Web; using System.Xml;
public void ProcessRequest(HttpContext context)
{
    HttpResponse response = context.Response;
    response.ContentType = "text/xml";

    using (TextWriter textWriter = new StreamWriter(response.OutputStream, System.Text.Encoding.UTF8))
    {
        XmlTextWriter writer = new XmlTextWriter(textWriter);
        writer.Formatting = Formatting.Indented;
        writer.WriteStartDocument();

        // Root element with the sitemap schema declarations
        writer.WriteStartElement("urlset");
        writer.WriteAttributeString("xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance");
        writer.WriteAttributeString("xsi:schemaLocation", "http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd");
        writer.WriteAttributeString("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9");

        // Add Home Page
        writer.WriteStartElement("url");
        writer.WriteElementString("loc", "http://example.com");
        writer.WriteElementString("changefreq", "daily");
        writer.WriteEndElement(); // url

        // Add a loop here that writes one url node per page
        /*
        {
            writer.WriteStartElement("url");
            writer.WriteElementString("loc", url);
            writer.WriteElementString("changefreq", "monthly");
            writer.WriteEndElement(); // url
        }
        */

        writer.WriteEndElement(); // urlset
        writer.WriteEndDocument();
        writer.Flush(); // push everything to the StreamWriter before it is disposed
    }
}
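For what it's worth, here is one way the commented-out loop could be filled in. Treat it as a rough sketch and not the code I actually run: the SitemapBuilder class and the pageUrls parameter are made-up names, and where the URLs come from (database, component, whatever) is left up to you. It also uses XmlWriter.Create instead of the XmlTextWriter shown above, and passes the sitemap namespace to each element so the writer emits the xmlns declaration itself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original handler.
public static class SitemapBuilder
{
    private const string SitemapNs = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Writes a urlset document with one url entry per item in pageUrls.
    // pageUrls is assumed to hold absolute URLs pulled from your own data store.
    public static void Write(TextWriter output, IEnumerable<string> pageUrls)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;

        // CloseOutput defaults to false, so disposing the XmlWriter flushes it
        // but leaves the underlying TextWriter open for the caller.
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("urlset", SitemapNs);

            foreach (string url in pageUrls)
            {
                writer.WriteStartElement("url", SitemapNs);
                // WriteElementString escapes characters such as & in the URL.
                writer.WriteElementString("loc", SitemapNs, url);
                writer.WriteElementString("changefreq", SitemapNs, "monthly");
                writer.WriteEndElement(); // url
            }

            writer.WriteEndElement(); // urlset
            writer.WriteEndDocument();
        }
    }
}
```

Inside ProcessRequest, the body of the using block could then shrink to a single call along the lines of SitemapBuilder.Write(textWriter, urls), where urls is whatever list your data store hands back. Keeping the XML writing separate from HttpContext also makes it easy to try out from a console app.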
Validate and Submit
Details on how to validate and submit your sitemap can be found in Build a Search Engine SiteMap in C#. Don't forget to remove the sitemap.aspx directive inside the web.config file. That was just for debugging.
A Word of Warning
After writing this and patting myself on the back for being so clever, I discovered that other XML files on my site were not displaying. They were throwing errors. Whereas IIS can natively display XML files, the ASP.NET DLL can't.
This leaves you with two possibilities. ONE: Use a different file extension such as .MAP instead of .XML. TWO: Write a second HTTP Handler to catch all other XML file requests. That handler would open the XML file and stream it back to the browser with a ContentType of "text/xml".
<add verb="*" path="sitemap.xml" type="HttpExtensions.SitemapHandler"/>
<add verb="*" path="*.xml" type="HttpExtensions.XMLHandler"/>
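// Requires: using System.IO; using System.Web;
public void ProcessRequest(HttpContext context)
{
    HttpResponse response = context.Response;

    // PhysicalPath is the requested URL mapped to a file on disk.
    // Unlike RawUrl, it does not include any query string.
    string thisXMLFile = context.Request.PhysicalPath;
    if (!File.Exists(thisXMLFile))
    {
        response.StatusCode = 404;
        return;
    }

    response.ContentType = "text/xml";

    // Dispose the reader so the file handle is released after each request.
    using (StreamReader xmlStream = File.OpenText(thisXMLFile))
    {
        response.Write(xmlStream.ReadToEnd());
    }
}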