Digital Colony!

Creating a Filtered RSS Feed Using Yahoo! Pipes

Do you read any blogs that just seem to have too many posts to read? Perhaps you are only interested in reading certain posts. With Yahoo! Pipes you can easily create a custom web page and RSS feed to handle your filtering requests.

1- Go to Yahoo! Pipes. Get an account if you don't have one.
2- Select CREATE A PIPE.
3- On the left side, drag a FETCH FEED module from the Sources onto the grid canvas.
4- Add the URL of the RSS Feed for the Blog you are interested in filtering. For this example, I will be using the financial blog The Big Picture.
5- Expand the Operators section and drag a FILTER module onto the grid canvas.
6- Drag a PIPE to connect the FETCH FEED to the FILTER.
7- Create your FILTER rule. For this example, I am going to PERMIT ALL items that follow the Rule: item.category CONTAINS "Psychology/Sentiment".
8- Drag a PIPE from the bottom of the FILTER to PIPE OUTPUT.
9- Save the PIPE.
10- Run the PIPE. Now you will see a page displaying just posts in the Psychology/Sentiment category. This page will also have its own RSS feed.
11- At this point, you can optionally publish the PIPE to give other users access.
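
For readers who would rather see the filtering logic as code, here is a minimal C# sketch of what the FETCH FEED and FILTER modules do together. It is not how Yahoo! Pipes is implemented. It assumes an RSS 2.0 feed that has already been downloaded into a string, and it assumes a case-insensitive CONTAINS match. The class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class FeedFilter
{
    // Returns the titles of RSS items that have at least one <category>
    // whose text contains categoryText (case-insensitive), similar in
    // spirit to a "permit items where item.category contains ..." rule.
    public static List<string> TitlesInCategory(string rssXml, string categoryText)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(rssXml);

        List<string> titles = new List<string>();
        foreach (XmlNode item in doc.SelectNodes("/rss/channel/item"))
        {
            foreach (XmlNode category in item.SelectNodes("category"))
            {
                if (category.InnerText.IndexOf(categoryText, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    XmlNode title = item.SelectSingleNode("title");
                    titles.Add(title != null ? title.InnerText : "(untitled)");
                    break; // one matching category is enough for this item
                }
            }
        }
        return titles;
    }
}
```

Calling FeedFilter.TitlesInCategory(feedXml, "Psychology") would return only the titles of posts filed under a category containing that word.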



DEMO: pipes.yahoo.com/digitalcolony/bigpicturepsychology


 

Build a Search Engine SiteMap in C#

Sitemaps are XML files that webmasters can create to let search engines know which pages to index and how frequently to check each page for changes. The XML format of the sitemap file is detailed on sitemaps.org. Here is a sample of a sitemap with a single URL.
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com</loc>
    <changefreq>daily</changefreq>
  </url>
</urlset>
My sample differs from the one on sitemaps.org. The more fully defined namespace in my example will validate on Google, Yahoo! and Ask.com. At the time of this writing, theirs doesn't. Per the protocol on sitemaps.org, only loc is required; changefreq, priority and lastmod are optional. This example supplies loc and changefreq.

My advice with the search engines is to treat them like a passport agent. Say only what is required and nothing else, or you could find yourself sent to the end of the line. Once you've determined what URLs will be on the sitemap, the only decision is defining the changefreq of each page. One strategy might be to set the home page to daily, section pages to weekly and content pages to monthly. If your home page has stock tickers or sports scores, you could set the changefreq to always or hourly.
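
The changefreq strategy described above can be written down as a small lookup. This is only a sketch: the PageKind enum, the hasLiveData flag and the class name are hypothetical, and the values returned are simply the ones suggested in the paragraph.

```csharp
using System;

// Hypothetical page categories for illustration.
public enum PageKind { Home, Section, Content }

public static class ChangeFreqPolicy
{
    // Maps a page type to a sitemap changefreq value.
    // hasLiveData stands for pages with stock tickers, sports scores, etc.
    public static string For(PageKind kind, bool hasLiveData)
    {
        if (hasLiveData)
        {
            return "hourly";
        }
        switch (kind)
        {
            case PageKind.Home: return "daily";
            case PageKind.Section: return "weekly";
            default: return "monthly"; // content pages
        }
    }
}
```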

Google prefers the sitemap file to be in the root folder and every example on their site names the file sitemap.xml. What Google wants, Google gets.

Additional Namespaces

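using System.Data;           // CommandType
using System.Data.SqlClient; // SqlConnection, SqlCommand, SqlDataReader
using System.IO;
using System.Xml;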

Sample Code

You will need write access to the sitemap.xml file. Since you don't want to give your entire root folder write access, my advice is to create a dummy sitemap.xml, place it into the root folder and then set write access to that file. Adding a try...catch to the code below will alert you if that write access is not there.
string SiteMapFile = @"~/sitemap.xml";
string xmlFile = Server.MapPath(SiteMapFile);

XmlTextWriter writer = new XmlTextWriter(xmlFile, System.Text.Encoding.UTF8);
writer.Formatting = Formatting.Indented;
writer.WriteStartDocument();
writer.WriteStartElement("urlset");
writer.WriteAttributeString("xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance");
writer.WriteAttributeString("xsi:schemaLocation", "http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd");
writer.WriteAttributeString("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9");

// Add Home Page
writer.WriteStartElement("url");
writer.WriteElementString("loc", "http://example.com");
writer.WriteElementString("changefreq", "daily");
writer.WriteEndElement(); // url

// Add Sections and Articles
SqlConnection con = new SqlConnection(connectionString);
string sql = @"SELECT url, 'weekly' as changefreq FROM Section 
UNION SELECT url, 'monthly' as changefreq FROM Articles ";
SqlCommand cmd = new SqlCommand(sql, con);
cmd.CommandType = CommandType.Text;

try
{
    con.Open();
    SqlDataReader reader = cmd.ExecuteReader();
    while (reader.Read())
    {
        string loc = "http://example.com" + reader["URL"].ToString();
        string changefreq = reader["changefreq"].ToString();
        writer.WriteStartElement("url");
        writer.WriteElementString("loc", loc);
        writer.WriteElementString("changefreq", changefreq);
        writer.WriteEndElement(); // url
    }
    reader.Close();
}
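catch (SqlException err)
{
    // Release the file handle, then rethrow with the original exception attached.
    writer.Close();
    throw new ApplicationException("Data Error (Sections): " + err.Message, err);
}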
finally
{
    con.Close();
}

writer.WriteEndElement();// urlset        
writer.Close();
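
The write-access check suggested before the sample could look something like the sketch below. It tries to open the existing sitemap.xml for writing without changing it and reports what went wrong if that fails. The class and method names are hypothetical, and this is one possible approach rather than a required part of the sample.

```csharp
using System;
using System.IO;

public static class SitemapFileCheck
{
    // Returns true when the existing file can be opened for writing.
    // On failure, problem holds a short description of the cause.
    public static bool CanWrite(string path, out string problem)
    {
        problem = "";
        try
        {
            // FileMode.Open neither creates nor truncates the file.
            using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Write))
            {
                return true;
            }
        }
        catch (FileNotFoundException)
        {
            problem = "File not found: " + path;
        }
        catch (UnauthorizedAccessException)
        {
            problem = "No write permission on: " + path;
        }
        catch (IOException err)
        {
            problem = "I/O error: " + err.Message;
        }
        return false;
    }
}
```

In the page code this would be called with the xmlFile path from Server.MapPath before the XmlTextWriter is created.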

Validate Your Sitemap

Once you've confirmed you have a good looking sitemap.xml file in the root folder of your web site and it contains all the pages you want indexed by the search engines, it is time to validate it. XML-Sitemaps.com has a sitemap validator that you can use to test your new sitemap. Once it passes, move to the next step.
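
Before handing the file to an online validator, a quick local sanity check can catch the obvious mistakes. The sketch below is not a schema validation. It only confirms that the XML is well-formed, that the root is urlset in the sitemap namespace, and that every url entry has a loc. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class SitemapCheck
{
    private const string SitemapNs = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Returns a list of problems; an empty list means the basic checks passed.
    public static List<string> FindProblems(string sitemapXml)
    {
        List<string> problems = new List<string>();
        XmlDocument doc = new XmlDocument();
        try
        {
            doc.LoadXml(sitemapXml);
        }
        catch (XmlException err)
        {
            problems.Add("Not well-formed XML: " + err.Message);
            return problems;
        }

        XmlElement root = doc.DocumentElement;
        if (root.LocalName != "urlset" || root.NamespaceURI != SitemapNs)
        {
            problems.Add("Root element should be <urlset> in the sitemap namespace.");
            return problems;
        }

        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("sm", SitemapNs);

        XmlNodeList urls = root.SelectNodes("sm:url", ns);
        if (urls.Count == 0)
        {
            problems.Add("No <url> entries found.");
        }
        foreach (XmlNode url in urls)
        {
            XmlNode loc = url.SelectSingleNode("sm:loc", ns);
            if (loc == null || loc.InnerText.Trim().Length == 0)
            {
                problems.Add("A <url> entry is missing <loc>.");
            }
        }
        return problems;
    }
}
```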

Update Your robots.txt File

Add the location of your sitemap in your robots.txt file.

Sitemap: http://example.com/sitemap.xml
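
If you maintain robots.txt from code, the line can be added only when it is missing. This is a small sketch that works on the file contents as a string. The class and method names are hypothetical, and reading or writing the actual file is left out.

```csharp
using System;

public static class RobotsTxt
{
    // Returns robotsTxt with a "Sitemap:" line for sitemapUrl appended,
    // or the text unchanged when that exact line is already present.
    public static string EnsureSitemapLine(string robotsTxt, string sitemapUrl)
    {
        string directive = "Sitemap: " + sitemapUrl;
        foreach (string line in robotsTxt.Split('\n'))
        {
            // Trim() also removes a trailing '\r' from Windows line endings.
            if (string.Equals(line.Trim(), directive, StringComparison.OrdinalIgnoreCase))
            {
                return robotsTxt;
            }
        }
        bool needsNewline = robotsTxt.Length > 0 && !robotsTxt.EndsWith("\n", StringComparison.Ordinal);
        return robotsTxt + (needsNewline ? "\n" : "") + directive + "\n";
    }
}
```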

Google Webmaster

Google has a suite of tools for managing the relationship between your web sites and its search engine. They call this suite Google Webmaster Central. It is here that you will register your web sites using validation files. Once they've established that you are the webmaster of your site, they will present you with a screen to submit your sitemap. You will need a Google Account for this process. If you don't have one, follow the link to Create a Google Account.

Yahoo! Site Explorer

Yahoo! has a similar setup called Yahoo! Site Explorer. Using your Yahoo! ID, you will go through the same process of registering your web sites. Once that process has been completed, you can submit your sitemap.xml file. Don't have a Yahoo! ID? Get one.

Ask.com

Ask.com doesn't require any accounts or site validation. Just ping their server with the location of your sitemap.xml file, modeled after the URL example below. Of course, replace example.com with your domain name.

http://submissions.ask.com/ping?sitemap=http%3A//example.com/sitemap.xml
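
Building the ping URL in code avoids hand-encoding mistakes. The sketch below only constructs the URL and does not send the request. Note that Uri.EscapeDataString encodes the slashes as well as the colon, so the result looks slightly different from the example above but decodes to the same sitemap address. The class and method names are hypothetical.

```csharp
using System;

public static class SitemapPing
{
    // Builds the Ask.com ping URL for the given sitemap location.
    public static string BuildPingUrl(string sitemapUrl)
    {
        return "http://submissions.ask.com/ping?sitemap=" + Uri.EscapeDataString(sitemapUrl);
    }
}
```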

Monitoring the Sitemap Crawl

Both Yahoo! Site Explorer and Google Webmaster Tools have reports that provide updated status on the success or failure of the sitemap crawl. To my knowledge, Ask.com doesn't have any such tools. And I couldn't locate a sitemap submission tool at all for MSN.com.

Using an HTTP Handler

This example uses the File System. Another option is to use an HTTP Handler to deliver the sitemap.xml.
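
One way to prepare for the handler approach is to write the sitemap to a stream instead of a file. The sketch below does that with XmlWriter. An ASP.NET handler (a class implementing IHttpHandler) could set context.Response.ContentType to "text/xml" and pass context.Response.OutputStream to it. The handler itself is not shown, and the class name and the URL/changefreq pair list are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class SitemapWriter
{
    private const string SitemapNs = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Writes a sitemap to any stream. Each pair is (absolute URL, changefreq).
    public static void Write(Stream output, IEnumerable<KeyValuePair<string, string>> pages)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.Encoding = new UTF8Encoding(false); // UTF-8 without a byte order mark

        // Disposing the writer flushes it; the stream itself stays open.
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("urlset", SitemapNs);
            foreach (KeyValuePair<string, string> page in pages)
            {
                writer.WriteStartElement("url", SitemapNs);
                writer.WriteElementString("loc", SitemapNs, page.Key);
                writer.WriteElementString("changefreq", SitemapNs, page.Value);
                writer.WriteEndElement(); // url
            }
            writer.WriteEndElement(); // urlset
            writer.WriteEndDocument();
        }
    }
}
```

This version declares only the default sitemap namespace. The xsi:schemaLocation attribute used in the file-based sample is left out to keep the sketch short.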

Final Word

A friend of mine with a low-traffic site saw his page views double after adding a sitemap. If one page of code can potentially double your page views, then it is worth pursuing.


 

Using Yahoo! Maps GeoCoding API in C#

Building a map using Yahoo! Maps or Google Maps requires latitude and longitude points. Until Yahoo! released its GeoCoding APIs, getting the latitude and longitude of an address was neither easy nor free. At this time Yahoo! allows you to GeoCode 50,000 addresses a day. The code below will call the Yahoo! GeoCoding API using C# and ASP.NET 2.0.

Latitude and Longitude Precision

When supplying Yahoo! with an address, it will try to return the highest level of precision. The returned XML document will tell you how precise the latitude and longitude are. If it can't resolve an address, it will return a warning.
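
To give a feel for reading the precision and warning out of the response, here is a rough parsing sketch. The response shape is an assumption made for illustration: a ResultSet root containing a Result element with precision and warning attributes and Latitude and Longitude child elements. Check it against the actual XML you receive. The class names are hypothetical and this is not the code from the download.

```csharp
using System;
using System.Globalization;
using System.Xml;

public class GeoPoint
{
    public double Latitude;
    public double Longitude;
    public string Precision = "";
    public string Warning = "";
}

public static class GeoResponseParser
{
    // Parses the first Result element of an ASSUMED response shape.
    // Returns null when no Result with coordinates is present.
    public static GeoPoint ParseFirstResult(string responseXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(responseXml);

        XmlElement result = Child(doc.DocumentElement, "Result");
        if (result == null) return null;

        XmlElement lat = Child(result, "Latitude");
        XmlElement lon = Child(result, "Longitude");
        if (lat == null || lon == null) return null;

        GeoPoint point = new GeoPoint();
        // Invariant culture so "37.4" parses the same on every server locale.
        point.Latitude = double.Parse(lat.InnerText, CultureInfo.InvariantCulture);
        point.Longitude = double.Parse(lon.InnerText, CultureInfo.InvariantCulture);
        point.Precision = result.GetAttribute("precision"); // "" when absent
        point.Warning = result.GetAttribute("warning");     // "" when absent
        return point;
    }

    // Finds a child element by local name, ignoring any XML namespace.
    private static XmlElement Child(XmlNode parent, string localName)
    {
        foreach (XmlNode node in parent.ChildNodes)
        {
            if (node.NodeType == XmlNodeType.Element && node.LocalName == localName)
            {
                return (XmlElement)node;
            }
        }
        return null;
    }
}
```

A caller might treat a non-empty Warning, or a Precision coarser than it needs, as a signal to ask the user to correct the address.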

Files Included in Download

* GeoCode.aspx - Form to enter address. Also will display response from Yahoo!.
* GeoCode.aspx.cs - Calls class to make geocoding request.
* GeoAddress.cs - An address class.
* GeoAddressAPI.cs - Class which calls Yahoo! and parses response.
* web.config - Holds application ID required by Yahoo!

GeoCode.zip - Don't forget to enter your application ID at the end of the query string in the web.config file. This format supports ASP.NET 2.0. If you are running 1.1, you will need to make some minor adjustments to how you store your API URL in the web.config file.

Working Demo

GeoCoding C# Yahoo Demo


 

Digital Colony Copyright © 1999-2008
Try...Catch Disclaimer: For brevity many examples do not include error handling. That is your responsibility.