
Entries in rss (3)

Monday, Oct 25 2010

Checking RSS Feeds for New Posts

I have previously posted some example iPhone Objective-C code for reading an RSS feed and then on how to parse the XML content of the feed using NSXMLParser. The one topic that I have not covered so far is how to determine the new posts in an RSS feed. A common expectation for RSS readers is that when the user refreshes the feed they are presented with any new posts in a clearly identifiable way so that they do not have to see previously read posts. I had assumed this was trivial by checking the publication date and/or the GUID string for each post. However, once I started to look into it I realised that it was not so straightforward…

Detecting duplicate RSS posts

The GUID element of an RSS feed is intended as a globally unique identifier for each item in the feed. Unfortunately, as with the pubDate element, it is optional so you can never be 100% sure that the feed you will be reading will include a GUID in each entry. A good overview of the various strategies that feed readers use to determine a new post is summarised here. The approach that I will adopt is to use the GUID if it exists and if not fall back to the link item. Note that this will fail for feeds such as this one from Apple that use the link to reference a generic web page rather than the individual post.
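The fallback rule described above can be written down as a tiny function. This is only a sketch: the name PostIdentity and the "guid:"/"link:" prefixes are my own inventions for illustration and are not part of the Feeder project.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the string used to recognise a post,
// or nil when the item carries neither a guid nor a link.
static NSString *PostIdentity(NSString *guid, NSString *link) {
    // Prefer the guid when the feed supplies one.
    if ([guid length] > 0) {
        return [@"guid:" stringByAppendingString:guid];
    }
    // Otherwise fall back to the link.
    if ([link length] > 0) {
        return [@"link:" stringByAppendingString:link];
    }
    // Nothing to identify the post by.
    return nil;
}
```

The prefix keeps a guid from ever being confused with a link that happens to contain the same text. A feed whose items all share one generic link would still collapse to a single identity, which is the failure case mentioned above.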

For some more excellent recommendations on how to parse RSS feeds it is also worth reading the documentation for the Universal Feed Parser by Mark Pilgrim. This is the feed parser used by the open source Planet feed aggregator. In particular the section on HTML sanitization is worth understanding since many of the fields can potentially contain malicious scripting.

Reviewing the Model

To recap for those that have not been following along, my trivial RSS reader currently has the following model classes:

  • Channel: contains details such as the feed title, link and description extracted from the <channel> element of an RSS feed.
  • Post: contains the details of individual posts in a feed such as the title, description, guid. Also records when a post has been read by the user.
  • Feed: a container class to model and process a single RSS feed. A feed object has a single channel and a collection of posts retrieved from the feed. A feed object is initialised with the URL of the required feed and updated by calling the -refresh instance method.

As currently implemented the Feed class makes no attempt to determine which posts have previously been retrieved. This means that the view controller has to implement this logic. I would like to move this functionality into the model and also implement our strategy for determining new RSS posts. To do this I am going to add a new class to our Model which will store the key attributes of a post used when testing if we have previously retrieved a post. When finished our model classes will look like this:

Adding a Feed Index to the Model

The first thing to do is modify the Feed class to include a feed index. The feed index will have a single entry for every item we find in the feed. To store this index I will add an NSMutableArray to the feed class as follows:

@interface Feed : NSObject <NSXMLParserDelegate> {
	
    NSURL *feedURL;
    ASIHTTPRequest *feedRequest;
	
    Channel *feedChannel;
    NSMutableArray *feedPosts;
    NSMutableArray *feedIndex;
	
    id currentElement;
    NSMutableString *currentElementData;

}

The feedIndex instance variable will contain an array of IndexEntry objects which are defined as follows:

@interface IndexEntry : NSObject {
	
    BOOL exists;
    NSString *guid;
    NSString *link;
}

The implementation of IndexEntry is trivial so I will omit it here; you can find the full details in the Xcode project download. Before taking a look at the changes to the implementation of the Feed class there is one final change to the model, which is to add the link field to the Post class:

@interface Post : NSObject {
    BOOL isRead;
    NSString *title;
    NSString *description;
    NSString *guid;
    NSString *link;
}

The nice thing about our XML parsing code is that we do not need to do anything else to our code to implement the link attribute. Defining it in the model is sufficient for it to be populated anytime we find a <link> element in an RSS feed entry.

Implementing the Feed Index

When we initialise a new Feed object we now also need to initialise the array that will hold our feed index (and release it when we dealloc a Feed object):

- (id)initWithURL:(NSURL *)sourceURL {
	
    if (self = [super init]) {
		
        self.feedURL = sourceURL;
        self.feedPosts = [[NSMutableArray alloc] init];
        self.feedIndex = [[NSMutableArray alloc] init];

    }
	
    return self;
}

The basic approach the feed parsing code will take is that each time a post is extracted from the feed we will check the feedIndex to see if this is an old post. If it is an existing post we will not bother storing the post. If, however, this is a new post we will add it to the feedPosts array and update our feedIndex with the post details (guid and link).

To make things easier I have created some helper methods to check and manage the post index as follows:

  • checkExists: search the index to see if a post already exists in the index. This method is what will implement our RSS post duplicate detection strategy.
  • updateIndex: add a post to the index. This method updates the index with the key attributes of a post which are currently just the guid and link elements.
  • resetIndex: this method is called each time we retrieve an RSS feed to reset the exists flag for all posts in the index. As each post is found the exists flag is set to YES indicating that the post has been found in the feed.
  • purgeIndex: this method is used after retrieving a feed to remove old entries in the index that are no longer contained in the feed.

checkExists

The code for the checkExists method is shown below. It takes a single argument, which is the current Post object:

- (BOOL)checkExists:(Post *)post {
	
    NSString *key;
    NSString *value;
	
    if (post.guid) {
        key = @"guid";
        value = post.guid;
    } else if (post.link) {
        key = @"link";
        value = post.link;
    } else {
        return NO;
    }
	
    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"%K == %@", key, value];
    NSUInteger index = [feedIndex indexOfObjectPassingTest:^(id obj, NSUInteger idx, BOOL *stop) {
                        return [predicate evaluateWithObject:obj];
                       }];
	
    if (index != NSNotFound) {
		
        IndexEntry *entry = [feedIndex objectAtIndex:index];
        entry.exists = YES;
        return YES;
    }
	
    return NO;
}

The Post object is checked and if it contains a GUID element we make use of it, otherwise we attempt to fall back to the link element. If the post contains neither a GUID nor a link we give up and return NO to indicate that the post does not exist in the index. To search the index we make use of the indexOfObjectPassingTest: method of NSArray, which takes a block that evaluates an NSPredicate testing for a matching GUID or link. I covered this way of searching arrays with NSPredicate and blocks in a previous post. If we get a match we update the feed index entry by setting the exists flag to YES and return YES to indicate that the post exists in the index.
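To see the search technique on its own, here is a minimal sketch that looks up an entry by an arbitrary key. The function name IndexOfEntry is hypothetical, and the entries are plain NSDictionary objects purely to keep the example self-contained; the real index holds IndexEntry objects.

```objective-c
#import <Foundation/Foundation.h>

// Returns the position of the first entry whose value for `key` equals
// `value`, or NSNotFound when there is no match.
static NSUInteger IndexOfEntry(NSArray *entries, NSString *key, NSString *value) {
    // %K substitutes a key path, %@ substitutes the value to compare against.
    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"%K == %@", key, value];
    return [entries indexOfObjectPassingTest:^BOOL(id obj, NSUInteger idx, BOOL *stop) {
        // The search stops at the first object for which this returns YES.
        return [predicate evaluateWithObject:obj];
    }];
}
```

An entry that has no value for the key simply fails the comparison, so mixing entries with and without a guid is harmless.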

updateIndex

The updateIndex method is responsible for adding an entry to the feed index. It takes a single argument which is the post to be added:

- (void)updateIndex:(Post *)post {
	
    IndexEntry *entry = [[IndexEntry alloc] init];
    entry.exists = YES;
    entry.guid = post.guid;
    entry.link = post.link;
    [feedIndex addObject:entry];
    [entry release];
}

This code is self-explanatory. The main advantage of maintaining a separate feed index is that we are only storing a limited number of Post attributes (just the GUID and link) rather than the whole post. This helps keep our memory requirements under control.

resetIndex

The resetIndex method is trivial: it iterates through all entries in the index, setting the exists flag to NO:

- (void)resetIndex {
	
    for (IndexEntry *entry in self.feedIndex) {
		
        entry.exists = NO;
    }
}

purgeIndex

The purgeIndex method is responsible for cleaning old entries from the index that no longer exist in the feed. It does this by removing all entries where the exists flag is set to NO. This is another example of filtering arrays with predicates:

- (void)purgeIndex {

    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"exists == YES"];
    [feedIndex filterUsingPredicate:predicate];
}

The predicate tests for the exists flag set to YES, the filterUsingPredicate method of NSMutableArray then removes all items from the array which do not match the predicate.
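The purge step can be tried in isolation with a small sketch. The helpers MakeEntry and PurgeStaleEntries are hypothetical, and each entry is a mutable dictionary with an "exists" flag stored as an NSNumber rather than a real IndexEntry object.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for an index entry: a dictionary holding a
// "guid" string and an "exists" flag wrapped in an NSNumber.
static NSMutableDictionary *MakeEntry(NSString *guid, BOOL exists) {
    return [NSMutableDictionary dictionaryWithObjectsAndKeys:
            guid, @"guid",
            [NSNumber numberWithBool:exists], @"exists",
            nil];
}

// Keeps only the entries that were seen during the latest refresh.
static void PurgeStaleEntries(NSMutableArray *index) {
    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"exists == YES"];
    // filterUsingPredicate: modifies the array in place and preserves order.
    [index filterUsingPredicate:predicate];
}
```

Because the filtering happens in place, no temporary copy of the index needs to be assigned back to the instance variable.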

Refreshing the Feed

The whole process is kicked off when the refresh method is called on a Feed object. The refresh method removes all entries from the array of posts and also resets the feed index. It then initiates a request to retrieve and parse the feed:

- (void)refresh {

    [feedPosts removeAllObjects];
    [self resetIndex];

    self.feedRequest = [ASIHTTPRequest requestWithURL:feedURL];
    [feedRequest setDelegate:self];
    [feedRequest startAsynchronous];
	
}

For the details on how the ASIHTTPRequest works you can refer back to the previous post on Reading an RSS Feed. Once the feed data has been received we get a callback to our delegate method requestFinished which is unchanged except that after successfully parsing the feed we purge the index using the helper method we saw previously:

- (void)requestFinished:(ASIHTTPRequest *)request {
	
    NSData *responseData = [request responseData];
	
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:responseData];
    [parser setDelegate:self];
	
    if ([parser parse]) {

        [self purgeIndex];
        [[NSNotificationCenter defaultCenter]
            postNotificationName:kFeederReloadCompletedNotification
            object:nil];
    }
	
    [parser release];
}

Parsing the Posts

To finish up the changes to the Feed implementation we need to adjust the way we parse and store the posts. Previously as we found each post it was added immediately to the array of posts. Now we only want to store the post if it does not exist in the index. To do that we first need to retrieve all of the attributes of the post. The NSXMLParser delegate methods that need changing are didStartElement: and didEndElement:. I will omit the didStartElement method since the only change is to remove the line that stores the post. The didEndElement method now looks like this:

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
               namespaceURI:(NSString *)namespaceURI
               qualifiedName:(NSString *)qName {
	
    if ([elementName isEqualToString:kItemElementName]) {

        if (![self checkExists:currentElement]) {
			
            [feedPosts addObject:currentElement];
            [self updateIndex:currentElement];
        }

        self.currentElement = nil;
        return;	
    }
	
    SEL selectorName = NSSelectorFromString(elementName);
    if ([currentElement respondsToSelector:selectorName]) {
        NSCharacterSet *charSet = [NSCharacterSet whitespaceAndNewlineCharacterSet];
        NSString *value = [currentElementData stringByTrimmingCharactersInSet:charSet];
        [currentElement setValue:value forKey:elementName];
    }
	
    [currentElementData release];
    self.currentElementData = nil;
}

As the NSXMLParser reaches the </item> element it calls didEndElement allowing us to check our feed index using the checkExists helper method we defined earlier. If the post does not exist in the index we store the whole post into our feed posts array and we also update the index.

One other minor change when processing other elements in a feed such as the post title and description is to strip whitespace, tab and newline characters from the beginning and end of the field contents. This ensures that values like the GUID and link we store in the index do not contain any extra characters.
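A quick sketch shows why the trimming matters for duplicate detection. The function name TrimmedElementValue is hypothetical; it just wraps the same two calls used in didEndElement.

```objective-c
#import <Foundation/Foundation.h>

// Strips spaces, tabs and newlines from both ends of an element value.
// Whitespace inside the value is left untouched.
static NSString *TrimmedElementValue(NSString *raw) {
    NSCharacterSet *charSet = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    return [raw stringByTrimmingCharactersInSet:charSet];
}
```

Without this step a guid that arrives wrapped in a newline and some indentation would not compare equal to the same guid stored from an earlier refresh, so the post would be treated as new.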

Updating the Feed View Controller

With all of the model changes completed we can finally update the view controller to make use of the new feed functionality. The first change we will make is to the last row in the table, which currently allows the user to refresh the view. I am going to change this so that when there are unread posts it allows the user to mark all posts as read. With all posts read it will show “Get more items…” to allow new posts to be fetched. The changes are in the table view data source method tableView:cellForRowAtIndexPath:

- (UITableViewCell *)tableView:(UITableView *)tableView 
                     cellForRowAtIndexPath:(NSIndexPath *)indexPath {
    static NSString *postCellId = @"postCell";
    static NSString *moreCellId = @"moreCell";
    UITableViewCell *cell = nil;
	
    NSUInteger row = [indexPath row];
    NSUInteger count = [posts count];
	
    if (row == count) {
		
        cell = [tableView dequeueReusableCellWithIdentifier:moreCellId];
        if (cell == nil) {
            cell = [[[UITableViewCell alloc]
                      initWithStyle:UITableViewCellStyleDefault
                      reuseIdentifier:moreCellId] autorelease];
        }
		
        if ([self countUnreadPosts]) {
            cell.textLabel.text = @"Mark all as read...";
        } else {
            cell.textLabel.text = @"Get more items...";
        }
        cell.textLabel.textColor = [UIColor blueColor];
        cell.textLabel.font = [UIFont boldSystemFontOfSize:16];
		
		
    } else {
        ...
        ...
    }
	
    return cell;
}

 

To implement this changed UI behaviour we also need to change the table view delegate method didSelectRowAtIndexPath:

- (void)tableView:(UITableView *)tableView 
                  didSelectRowAtIndexPath:(NSIndexPath *)indexPath {
	
    NSUInteger row = [indexPath row];
    NSUInteger count = [posts count];
	
    if (row == count) {
		
        if ([self countUnreadPosts]) {
            [self markAllRead];
        } else {
            [self getMoreItems];
            [self.tableView deselectRowAtIndexPath:indexPath animated:YES];
        }
		
    } else {
        ...
        ...
    }
}

 

So when the user selects the last row in the table we check if we have unread posts and, if so, call the local method markAllRead. If there are no unread posts we call getMoreItems to refresh the feed. These two methods are shown below:

- (void)markAllRead {
	
    for (Post *post in self.posts) {
        post.isRead = YES;
    }
	
    [self updateViewTitle];
    [self.tableView reloadData];
}

- (void)getMoreItems {

    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"isRead == NO"];
    [posts filterUsingPredicate:predicate];
    [self.tableView reloadData];
    [feed refresh];
}

 

Note the use of an NSPredicate to remove the read posts from the array of posts stored in the feed view controller. To trigger the retrieval of new posts we call the refresh method on our feed object. When the refresh completes, the resulting notification calls the feedChanged method, which now looks like this:

- (void)feedChanged:(NSNotification *)notification {
	
    NSMutableArray *feedPosts = [feed feedPosts];
    for (Post *feedPost in feedPosts) {
        [posts addObject:feedPost];
    }
	
    [self.tableView reloadData];
    [self updateViewTitle];
}

This method now just copies the new posts from the feed object into our view controller and then refreshes the table view and the view title. Our feed object takes care of ensuring we only get to see new posts.

The Post View

To finish up I have made one minor improvement to the detailed post view to allow the full post to be viewed in Safari. This makes use of the link attribute which we now extract from the RSS feed for each post. A standard system action button is added to the navigation bar in the viewDidLoad method of the PostViewController and wired up to call the method openLink when touched by the user:

- (void)viewDidLoad {

    [super viewDidLoad];
	
    UIBarButtonItem *openButton = [[UIBarButtonItem alloc]
        initWithBarButtonSystemItem:UIBarButtonSystemItemAction
                             target:self
                             action:@selector(openLink)];
    self.navigationItem.rightBarButtonItem = openButton;
    [openButton release];
	
    NSString *postTitle = [NSString stringWithFormat:@"<H1>%@</H1>",
						   post.title];
	
    NSString *html = [postTitle stringByAppendingString:post.description];
    [postBody loadHTMLString:html baseURL:nil]; 
}

 

The openLink method uses the application delegate to open the URL of the post link (assuming it is defined):

- (void)openLink {
	
    if (post.link) {
		
        NSURL *url = [NSURL URLWithString:post.link];
        [[UIApplication sharedApplication] openURL:url];
    }
}

 

Wrapping Up

Another long post so congratulations to anybody who made it all the way to the end. The basic functionality of an RSS reader is now just about complete though there is of course plenty of room for improvement, not least since we only read a single hard-coded feed at the moment. One other change that I may look at in a future post is how to convert the project to make use of Core Data to store the post objects rather than relying on keeping them all in memory. The updated Xcode project can be downloaded from here if you want to see the complete details.

Saturday, Oct 16 2010

Parsing an RSS Feed using NSXMLParser

This is the second of a two part post looking at the reading and parsing of a remote RSS feed. The first post covered the retrieval of the feed data over the network. This part will look at how to parse the resulting XML data to extract the individual posts.

Structure of an RSS feed

Before we get too much into the detail it is worth taking a second to look at the structure of an RSS feed. A typical feed, with the most common elements looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Use Your Loaf</title>
    <link>http://useyourloaf.com/blog/</link>
    <item>
      <title>Reading an RSS Feed</title>
      <pubDate>Thu, 14 Oct 2010 21:09:30 +0000</pubDate>
      <link>http://useyourloaf.com/blog/2010/10/14/reading-an-rss-feed.html</link>
      <guid>538327:6179246:9187069</guid>
      <description><![CDATA[...post goes here...]]></description>
    </item>
    <item>
    ...
    </item>
  </channel>
</rss>

These are more or less the fields that I want to extract from the feed. There are some initial fields such as title and link that describe the channel and then a sequence of items each one containing a title, publication date, link to the original post, a guid that uniquely identifies the item within the feed and then finally the description which contains the actual post data.

The Post Model

I am going to modify the previously created Post model class so that the ivar names match the names of the corresponding XML elements. The reason for this will become clear when we look at the code for parsing the XML. Our Post model interface now looks as follows:

@interface Post : NSObject {
    BOOL isRead;
    NSString *title;
    NSString *description;
    NSString *guid;
}

I have also changed the table view controller code to use these modified field names but I will omit that code here.

The Channel Model

As well as a Post class I will also create a Channel class to contain the RSS feed elements such as the channel title and link. I could store these items directly in the feed class but keeping them in a self contained class actually makes the parsing code easier. The interface for the Channel class is as follows:

@interface Channel : NSObject {

    NSString *title;
    NSString *link;
    NSString *description;
}

The Feed Model

There will be some additional items we need to add to the feed model once we get into the XML parsing code but for now I will add a reference to a channel model and a mutable array to collect the posts that we decode from the RSS feed:

@interface Feed : NSObject {
	
    NSURL *feedURL;
    ASIHTTPRequest *feedRequest;
	
    Channel *feedChannel;
    NSMutableArray *feedPosts;

}

Event Driven Parsing with NSXMLParser

Both Cocoa for Mac OS X and Cocoa Touch for iOS devices provide a class, named NSXMLParser, that takes care of all of the hard work required to parse XML data. The basic approach is to initialise an NSXMLParser object with the XML stream to decode and then implement a number of delegate methods defined by the NSXMLParserDelegate protocol.

There are delegate methods for when the NSXMLParser encounters the start and end of a document, the start of a tag (<channel>,<title>,<item>), the end of a tag (</channel>,</title>,</item>), an attribute or character data. As the NSXMLParser object identifies each element in the XML stream it calls the appropriate delegate method to allow something useful to be done with each piece of data.
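To make the event flow concrete, here is a minimal standalone sketch that does nothing more than collect the <title> of every <item>. The TitleCollector class and the ItemTitlesInFeed function are hypothetical names used only for this illustration, and the code uses manual reference counting like the rest of the project.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate that records the <title> of every <item>.
@interface TitleCollector : NSObject <NSXMLParserDelegate> {
    NSMutableArray *titles;
    NSMutableString *buffer;
    BOOL insideItem;
}
- (NSArray *)titles;
@end

@implementation TitleCollector

- (id)init {
    if ((self = [super init])) {
        titles = [[NSMutableArray alloc] init];
        buffer = [[NSMutableString alloc] init];
    }
    return self;
}

- (void)dealloc {
    [titles release];
    [buffer release];
    [super dealloc];
}

- (NSArray *)titles {
    return titles;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName isEqualToString:@"item"]) {
        insideItem = YES;
    }
    // A new element starts, so discard any text collected so far.
    [buffer setString:@""];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called several times for one element, so append.
    [buffer appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"item"]) {
        insideItem = NO;
    } else if (insideItem && [elementName isEqualToString:@"title"]) {
        // Store a copy, since the buffer is reused for the next element.
        [titles addObject:[NSString stringWithString:buffer]];
    }
    [buffer setString:@""];
}

@end

// Parses the XML and returns the item titles, or nil if parsing fails.
static NSArray *ItemTitlesInFeed(NSData *xmlData) {
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xmlData];
    TitleCollector *collector = [[TitleCollector alloc] init];
    [parser setDelegate:collector];

    NSArray *result = nil;
    if ([parser parse]) {
        result = [NSArray arrayWithArray:[collector titles]];
    }

    [parser release];
    [collector release];
    return result;
}
```

The insideItem flag is what keeps the channel's own <title> out of the results, since the parser reports every <title> element in the same way regardless of where it appears.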

The basic approach we will take is to map the higher level objects in the RSS feed such as the channel and item to one of our model objects (a channel or post object). Each time we encounter an opening tag for one of these objects (<channel>,<item>) we will allocate a new object. The elements of the object will then be populated as we encounter each of the items contained within the object.

To track which object we are currently constructing we need an instance variable in the feed object to track the current element. We also need a temporary instance variable to collect the content of an element as the parser may invoke our delegate multiple times for the same element. So our revised Feed class now looks as follows:

@interface Feed : NSObject {
	
    NSURL *feedURL;
    ASIHTTPRequest *feedRequest;
	
    Channel *feedChannel;
    NSMutableArray *feedPosts;
	
    id currentElement;
    NSMutableString *currentElementData;

}

The array to hold the Post objects can be allocated when we initialise the Feed object:

-(id)initWithURL:(NSURL *)sourceURL {
	
    if (self = [super init]) {
		
        self.feedURL = sourceURL;
        self.feedPosts = [[NSMutableArray alloc] init];

    }
	
    return self;
}

Now to get things started we need to revisit the point in the last blog post where we successfully retrieved an RSS feed over the network. Since we are using ASIHTTPRequest to handle the network request the delegate method of interest is called requestFinished. To start the parsing of the retrieved data we need to create an instance of NSXMLParser, set ourselves as the delegate and then tell it to start parsing the data:

- (void)requestFinished:(ASIHTTPRequest *)request {
	
    NSData *responseData = [request responseData];
	
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:responseData];
    [parser setDelegate:self];
	
    if ([parser parse]) {

        [[NSNotificationCenter defaultCenter]
                 postNotificationName:kFeederReloadCompletedNotification
                 object:nil];
		
    }
	
    [parser release];
}

This is fairly straightforward: once we have an NSXMLParser object we set the delegate and then call the parse instance method. If we get a successful result we send a notification to any observing class to let them know we have updated the feed.

To actually receive delegate callbacks we need to ensure our Feed class implements the NSXMLParserDelegate protocol:

@interface Feed : NSObject <NSXMLParserDelegate> {
    ...
    ...
}

The first delegate method that we need to implement is for when the parser encounters a new element. But first we will define some string constants for the various XML elements we are interested in decoding:

static NSString * const kChannelElementName = @"channel";
static NSString * const kItemElementName = @"item";

 

Now the delegate method:


- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
               namespaceURI:(NSString *)namespaceURI
               qualifiedName:(NSString *)qName
               attributes:(NSDictionary *)attributeDict {
	
    if ([elementName isEqualToString:kChannelElementName]) {
		
        Channel *channel = [[Channel alloc] init];
        self.feedChannel = channel;
        self.currentElement = channel;
        [channel release];
        return;
		
    }
	
    if ([elementName isEqualToString:kItemElementName]) {
		
        Post *post = [[Post alloc] init];
        [feedPosts addObject:post];
        self.currentElement = post;
        [post release];
        return;
		
    }
	
}

The didStartElement method has a number of parameters but we are really only interested in the element name. If we have just found a <channel> tag we allocate a Channel object and store it in the Feed object. Likewise if we find an <item> tag we allocate a Post object and add it to the end of our posts array in the Feed object. In both cases we set our currentElement reference to the newly created object.

For all other elements there is nothing to do at this point; the currentElementData string buffer that collects the data for the current element is created on demand. The next delegate method that we will implement is foundCharacters, which is called each time some content data is encountered:

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
	
    if (currentElementData == nil) {
        self.currentElementData = [[NSMutableString alloc] init];
    }
	
    [currentElementData appendString:string];
	
}

Each time this delegate method is called we check to see if we have our currentElementData buffer allocated and if not we create it. As previously mentioned this method can be called multiple times as a single element is processed so we append the string data to the buffer each time it is called.

Finally we need the delegate method for when we reach the end of an element:

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
               namespaceURI:(NSString *)namespaceURI
               qualifiedName:(NSString *)qName {
	
    SEL selectorName = NSSelectorFromString(elementName);
    if ([currentElement respondsToSelector:selectorName]) {
			
        [currentElement setValue:currentElementData forKey:elementName];
				
    }
	
    [currentElementData release];
    self.currentElementData = nil;
}

Here we make use of the fact that we named our object ivars with the XML element names. So rather than testing each element name and manually deciding which ivar needs to be set we can use some Key-Value Coding magic. First we create a selector from the XML elementName, then we test if the current element we are processing (a channel or post object) responds to the selector. If it does we use the data that has been collected in our currentElementData string buffer to set the value for the ivar whose key is the same as the elementName.

Using Key-Value Coding means that we do not have to hardcode which fields we want to collect for each of our model classes. If I later decide that I want to decode an extra field for the Post object I only need to add that field to the Post class. The XML parsing code remains the same.
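The Key-Value Coding trick can be seen in isolation in the following sketch. SketchPost and StoreElement are hypothetical names invented for this example; the class only imitates the shape of the real Post model.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical model with properties named after RSS elements.
@interface SketchPost : NSObject {
    NSString *title;
    NSString *link;
}
@property (nonatomic, copy) NSString *title;
@property (nonatomic, copy) NSString *link;
@end

@implementation SketchPost
@synthesize title, link;

- (void)dealloc {
    [title release];
    [link release];
    [super dealloc];
}
@end

// Stores the value only when the target has an accessor named after
// the element. Returns YES if the value was stored.
static BOOL StoreElement(id target, NSString *elementName, NSString *value) {
    SEL selector = NSSelectorFromString(elementName);
    if (![target respondsToSelector:selector]) {
        // Unknown element for this model object, so ignore it.
        return NO;
    }
    [target setValue:value forKey:elementName];
    return YES;
}
```

One caveat with this approach: the respondsToSelector: test also succeeds for any method inherited from NSObject, so an element whose name matched such a method but had no corresponding ivar would reach setValue:forKey: and raise an exception. That is unlikely with the standard RSS element names but worth keeping in mind.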

The last thing we do in this delegate method is clear our string buffer by releasing and then setting it to nil to guard against over releasing.

There is one more delegate method that we should implement to handle parsing errors. When the NSXMLParser object encounters an error it stops processing the XML stream and sends the parser:parseErrorOccurred: message to its delegate. I am not going to do anything sensible with the error message in this example but it would generally be a good idea to inform our controller of the error situation:

- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError {
	
    NSString *info = [NSString stringWithFormat:@"Error %ld, Description: %@, Line: %ld, Column: %ld",
                       (long)[parseError code],
                       [[parser parserError] localizedDescription],
                       (long)[parser lineNumber],
                       (long)[parser columnNumber]];
	
    NSLog(@"RSS Feed Parse Error: %@", info);
}

 

To finish up the changes to the Feed class we need to make one minor change to the refresh method that is called by our controller each time it wants to update the feed. To ensure we do not store old objects in the feed we clear out the array holding the posts before we initiate the new network request to retrieve the feed:

- (void)refresh {
 
    self.feedRequest = [ASIHTTPRequest requestWithURL:feedURL];
 
    [feedPosts removeAllObjects];
    [feedRequest setDelegate:self];
    [feedRequest startAsynchronous];
 
}

Updating the Table View Controller

To finish up we need to update the table view controller to interact with our new enhanced feed class. We already have a method named feedChanged that is called when the controller receives a notification from our feed object indicating the feed has been successfully reloaded. We now need to modify that method to actually use the posts we have extracted from the RSS feed:

- (void)feedChanged:(NSNotification *)notification {
	
    BOOL newPost = NO;
    NSMutableArray *feedPosts = [feed feedPosts];
    for (Post *feedPost in feedPosts) {
		
        if (![self postExists:feedPost]) {
            newPost = YES;
            [posts addObject:feedPost];
        }
    }
	
    if (newPost) {
		
        [self.tableView reloadData];
        [self updateViewTitle];
    }
}

This method works its way through the posts stored in the feed object and if they do not already exist in the store of posts that our view controller knows about we add the new post. Then if we have at least one new post we reload the table data and update our view title. The helper method postExists: is defined as follows:

- (BOOL)postExists:(Post *)newPost {
	
    NSString *guid = [newPost guid];
	
    for (Post *post in self.posts) {
		
        if ([post.guid isEqualToString:guid]) {
            return YES;
        }
    }
	
    return NO;
}

This simply iterates through our store of posts comparing the unique guid string for each post to determine if we already have this post. This is almost certainly not the best approach, especially since any posts that we have previously deleted will reappear in the view. A better approach would be to have our Feed object store the publication date of the most recent post it has seen and when we refresh the feed only return more recent posts. Since this is already a long post I will save that for another time but hopefully you get the idea.
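As a rough idea of what the date-based approach might involve, here is a sketch that parses a pubDate string and compares it with the most recent date seen. The function names are hypothetical, and the format string is an assumption that fits the pubDate shown in the sample feed; real feeds vary and may need more lenient handling.

```objective-c
#import <Foundation/Foundation.h>

// Parses an RSS pubDate such as "Thu, 14 Oct 2010 21:09:30 +0000".
// Returns nil if the string does not match the assumed format.
static NSDate *DateFromPubDate(NSString *pubDate) {
    NSDateFormatter *formatter = [[NSDateFormatter alloc] init];
    // A fixed locale keeps day and month names independent of user settings.
    NSLocale *posix = [[NSLocale alloc] initWithLocaleIdentifier:@"en_US_POSIX"];
    [formatter setLocale:posix];
    [formatter setDateFormat:@"EEE, dd MMM yyyy HH:mm:ss Z"];
    NSDate *date = [formatter dateFromString:pubDate];
    [posix release];
    [formatter release];
    return date;
}

// YES when the post is newer than the most recent one seen so far.
// A post without a usable date is treated as new so that it is not lost.
static BOOL IsNewerThan(NSString *pubDate, NSDate *latestSeen) {
    NSDate *date = DateFromPubDate(pubDate);
    if (date == nil || latestSeen == nil) {
        return YES;
    }
    return [date compare:latestSeen] == NSOrderedDescending;
}
```

Since pubDate is optional, treating undated posts as new is a deliberate choice here; it trades the occasional repeated post for never silently dropping one.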

Wrapping Up

Hopefully this post has shown how easy it is to parse XML data using the NSXMLParser class. The example app is still a very poor RSS reader and not just because of the horrible user interface but it is on the way to becoming useful. A good topic to explore in a future post would be how to store the posts in a persistent store such as an sqlite database or using Core Data so that they can be read offline. Using Core Data to store the posts would also be a better choice than keeping an array of posts in memory.

If you want to take a look at the complete Xcode project for the code in this and the previous post you can find it here.

Thursday, Oct 14 2010

Reading an RSS Feed

I previously used the example of an RSS feed reader app to illustrate some table view techniques so I thought it was about time that I looked at actually reading and parsing an RSS feed for real. The problem can be divided into two parts, the first that I cover here is actually retrieving the feed contents over the network. The second part is then parsing the XML feed contents to extract the post information that we actually want to read. I will cover the XML parsing part in a following post.

I will be adding to the version of the Feeder app that I created previously. If you want to follow along you can find the Xcode project here:

ASIHTTPRequest

You can use the standard Core Foundation networking (CFNetwork) or NSURLConnection methods to retrieve the contents of a remote URL. I prefer to use the excellent ASIHTTPRequest wrapper developed by Ben Copsey which does a good job of doing all the hard work for me. To get started you need to download the Objective-C source files and add them to the Xcode project. You also need to add Apple’s Reachability class or the replacement by Andrew Donoho.

As well as adding the source files you also need to add some additional frameworks to the project. The frameworks to add are as follows:

  • CFNetwork.framework
  • SystemConfiguration.framework
  • MobileCoreServices.framework
  • CoreGraphics.framework
  • libz.1.2.3.dylib

If you are not sure how to add the frameworks, right click on the project target in the Xcode Groups & Files window and select Get Info. Then in the General tab use the + button at the bottom of the Linked Libraries window. When you have finished it should look something like this:

I should also note that when building against the iOS 4.1 SDK, version 1.7 of ASIHTTPRequest generates a compiler warning in ASIDownloadCache.m:

’createDirectoryAtPath:attributes:’ is deprecated (declared at /Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator4.1.sdk/System/Library/Frameworks/Foundation.framework/Headers/NSFileManager.h:168)

For the purposes of this example it does not really matter but if you want to fix the warning you can replace the offending line:

[[NSFileManager defaultManager] createDirectoryAtPath:directory attributes:nil];


with the following:

[[NSFileManager defaultManager] createDirectoryAtPath:directory
                          withIntermediateDirectories:NO
                                           attributes:nil
                                                error:nil];

Retrieving the Feed

I already have a model class named Feed that is intended to hide the details of retrieving posts. The first step is to use ASIHTTPRequest to fetch the RSS feed. To do that I will first add some useful ivars to the Feed class:

@class ASIHTTPRequest;

@interface Feed : NSObject {
	
    NSURL *feedURL;
    ASIHTTPRequest *feedRequest;

}

@property (nonatomic, copy) NSURL *feedURL;
@property (nonatomic, retain) ASIHTTPRequest *feedRequest;

// ... remaining declarations unchanged

@end

The feedURL is just used to store the URL of the RSS feed that we will retrieve. The feedRequest ivar will be used to store an ASIHTTPRequest object which is the wrapper object that hides the details of the underlying network libraries.

To initialize a feed object we need a new initializer that takes as an argument the URL we want to retrieve. This method does not do much right now other than store the URL:

-(id)initWithURL:(NSURL *)sourceURL {
	
    if (self = [super init]) {
		
        self.feedURL = sourceURL;
    }
	
    return self;
}

To actually initiate some network communication I have added an instance method named refresh:

- (void)refresh {
	
    self.feedRequest = [ASIHTTPRequest requestWithURL:feedURL];
	
    [feedRequest setDelegate:self];
    [feedRequest startAsynchronous];
	
}

This method creates an ASIHTTPRequest object with the URL of the feed, sets our feed object to be the delegate and then starts an asynchronous request so that we do not block the user interface whilst the feed is being retrieved. All we need to do now is implement two delegate methods to handle success and failure. First the method to handle a successful request:

- (void)requestFinished:(ASIHTTPRequest *)request {
	
    NSData *responseData = [request responseData];
	
    // Parse the response data
    // ...
	
    [[NSNotificationCenter defaultCenter]
        postNotificationName:kFeederReloadCompletedNotification
                      object:nil];
	
}

Since I am not yet doing anything useful with the data this is trivial right now. I know that I need a mechanism to tell the controller class when the download is complete. I could use delegation to do this but I think in this case it is cleaner to send a notification when the download finishes. My controlling class can then register for the notification and will be called when it has some new data to display.

The method to handle a failed request is also simple (again I do not do anything useful with the error message right now other than log it):

- (void)requestFailed:(ASIHTTPRequest *)request {
	
    NSError *error = [request error];
    NSLog(@"requestFailed: %@", [error localizedDescription]);
	
    [[NSNotificationCenter defaultCenter]
        postNotificationName:kFeederReloadFailedNotification
                      object:nil];
	
}
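If the controller needs to show the failure to the user rather than just log it, one option is to pass the NSError along in the notification's userInfo dictionary. The following is a minimal sketch using only Foundation. The kFeederErrorKey constant and the PostReloadFailed helper are hypothetical, and the notification name is redefined here with an assumed string value only so that the snippet stands alone.

```objc
#import <Foundation/Foundation.h>

// Redefined so the sketch is self-contained; the string value is an assumption.
static NSString * const kFeederReloadFailedNotification = @"FeederReloadFailed";
// Hypothetical userInfo key, not part of the example project.
static NSString * const kFeederErrorKey = @"FeederError";

// Posts the failure notification, attaching the error when there is one.
static void PostReloadFailed(id sender, NSError *error) {
    NSDictionary *userInfo = nil;
    if (error != nil) {
        userInfo = [NSDictionary dictionaryWithObject:error
                                               forKey:kFeederErrorKey];
    }
    [[NSNotificationCenter defaultCenter]
        postNotificationName:kFeederReloadFailedNotification
                      object:sender
                    userInfo:userInfo];
}
```

An observer can then retrieve the error with [[notification userInfo] objectForKey:kFeederErrorKey] and decide how to present it.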

Initialising the Feed

Now we have some groundwork prepared we can modify the FeedViewController to make use of the new feed class. The viewDidLoad method of the controller now looks like this:

- (void)viewDidLoad {

    [super viewDidLoad];
	
    static NSString *feedURLString = @"http://useyourloaf.com/blog/rss.xml";
    NSURL *feedURL = [NSURL URLWithString:feedURLString];
	
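    // Assuming feed and posts are retain properties, the setters take
    // ownership, so avoid leaving an extra retain from alloc.
    self.feed = [[[Feed alloc] initWithURL:feedURL] autorelease];
    self.posts = [NSMutableArray array];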
	
    [[NSNotificationCenter defaultCenter] addObserver:self
                                          selector:@selector(feedChanged:)
                                          name:kFeederReloadCompletedNotification 
                                          object:nil];
	
    [feed refresh];
}

 

So when the view controller is first loaded an NSURL object is created with the URL of the RSS feed for this blog. We then initialise a new feed object with the NSURL and add ourselves as an observer to the notification we created previously to signal a successful feed download.

Finally we call the feed refresh method to actually trigger the network communication. The only thing left to do is to implement the method that is called when the notification arrives:

- (void)feedChanged:(NSNotification *)notification {
	
    NSArray *newPosts = [feed newPosts];
	
    if (newPosts) {
		
        [self.posts addObjectsFromArray:newPosts];
        [self.tableView reloadData];
        [self updateViewTitle];
        [newPosts release];
    }
}

Since we do not yet actually parse the downloaded feed there is no real data to display so for now I continue to call the newPosts method to generate some dummy data to display. Once we implement the feed parser we can change this to update the table with actual post data.
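One detail the controller code above leaves out is removing the observer again. The notification center does not retain its observers, so a controller that goes away while still registered can be sent a notification after it has been deallocated. A minimal sketch of the pairing, using a hypothetical FeedObserver class as a stand-in for the view controller and manual reference counting as in the rest of this post (the notification name is redefined with an assumed value so the snippet stands alone):

```objc
#import <Foundation/Foundation.h>

// Redefined so the sketch is self-contained; the string value is an assumption.
static NSString * const kFeederReloadCompletedNotification = @"FeederReloadCompleted";

// Hypothetical stand-in for FeedViewController, reduced to the observer logic.
@interface FeedObserver : NSObject {
    NSUInteger changeCount;
}
@property (nonatomic, readonly) NSUInteger changeCount;
- (void)startObserving;
- (void)stopObserving;
@end

@implementation FeedObserver
@synthesize changeCount;

// Equivalent to the registration done in viewDidLoad.
- (void)startObserving {
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(feedChanged:)
                                                 name:kFeederReloadCompletedNotification
                                               object:nil];
}

// Balances startObserving; safe to call even if not currently registered.
- (void)stopObserving {
    [[NSNotificationCenter defaultCenter] removeObserver:self
                                                    name:kFeederReloadCompletedNotification
                                                  object:nil];
}

- (void)feedChanged:(NSNotification *)notification {
    changeCount++;   // the real controller would reload its table here
}

- (void)dealloc {
    [self stopObserving];
    [super dealloc];   // required under manual reference counting
}
@end
```

In the view controller, since registration happens in viewDidLoad, it would probably make sense to call the equivalent of stopObserving from viewDidUnload as well as dealloc so that a reloaded view does not register twice.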

Next Steps

That is pretty much all that is needed to retrieve the feed data. The simplicity of the code owes a lot to the work that ASIHTTPRequest does under the covers. It takes away the hard work of dealing with background communications and allows us to focus on the useful stuff. In the next post I hope to cover how to use NSXMLParser to extract the post data from the RSS feed.