Fog Creek Software
Discussion Board




Full Content RSS: I was wrong

Both here on this board and in private comments and emails I have been pushing several bloggers towards providing a full content feed.
Those of you accessing weblogs through an aggregator such as http://www.newsgator.com or one of the many free alternatives know why: blogs are ultimately cross-referential, and following that tangled web by visiting each site manually would take more lifetimes than most of us have to spare.

My own little "provide full content feeds" campaign, both here and through blog comments and private communications, was quite successful in getting some very nice blogs to switch.
Thanks, guys and gals. However, now I can only hope you will accept my humble apology. You see, I now believe I was wrong: direct full content feeds are not the answer.

Let me qualify this: full content feeds (FCFs from here on) are an answer, but not "the right thing". Sure, FCFs get the end result: you can read the feeds straight in your aggregator, but at what cost? An FCF is substantially larger than a light notification-only feed, and since RSS is "pull straight from the source", this places a significant load on the components in the system. Imagine you are writing your little blog and get quite successful at it. You get about 1,000 aggregators pulling in that full feed on average 16 times a day, whether something changed or not. How much can you afford to put in your blog? Put in that nice 3K story about your great dinner with Scoble (http://radio.weblogs.com/0001011/)? Let's say you leave it in the feed for 14 days; then that story alone will cost you more than half a gigabyte in upstream bandwidth. Not good.
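To make the arithmetic explicit, here is the back-of-the-envelope check in Python (same numbers as above):

    subscribers = 1000      # aggregators polling the feed
    polls_per_day = 16      # average polls per aggregator per day
    days_in_feed = 14       # how long the story stays in the feed
    story_kb = 3            # size of that one story, in KB

    total_kb = subscribers * polls_per_day * days_in_feed * story_kb
    print(total_kb / (1024 * 1024))   # ~0.64 GB for that single story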

Of course we do not want to give up on getting all the content in the aggregator. But there are far better solutions than pulling all the content every hour of the day through a full feed. If the feed were just a lightweight notification, why couldn't the aggregator act on the notification and grab the full item just once, when it is published?
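A rough sketch of what that aggregator-side logic might look like (Python; the feed URL and the store() hand-off are hypothetical, just to show the shape):

    import urllib.request
    import xml.etree.ElementTree as ET

    FEED_URL = "http://example.org/notify.rss"   # hypothetical notification feed
    seen = set()   # item GUIDs already fetched; a real aggregator persists this

    def poll():
        with urllib.request.urlopen(FEED_URL) as resp:
            feed = ET.parse(resp)
        for item in feed.iterfind(".//item"):
            guid = item.findtext("guid") or item.findtext("link")
            if guid in seen:
                continue                          # already have it: skip
            seen.add(guid)
            with urllib.request.urlopen(item.findtext("link")) as page:
                store(guid, page.read())          # full item fetched exactly once

    def store(guid, content):
        ...   # hand off to the aggregator's local store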

The notification could just contain the URL of the item, if all aggregators agree to pull in the page over HTTP the first time they find its address in the feed. The added benefit for the bloggers might be that the effort they spend on designing nice pages is actually seen by more than just the occasional first-time visitor. Ziv Caspi has another idea (http://radio.weblogs.com/0106548/2003/06/14.html#a112): why not do an RSS cascade, a notification-only feed pointing to a full one-item feed? I'm sure others can improve on this and work out the details. Don (http://www.gotdotnet.com/team/dbox/) and the others meeting regularly at Sam's (http://www.intertwingly.net/blog/) surely have the power to boil this little ocean, don't they?
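To illustrate Ziv's cascade (a sketch only; every name and the file layout are invented): the publisher writes a full one-item feed per post, and the notification feed merely links to it:

    import html, time

    def publish(post_id, title, body):
        # write the full one-item feed for this post
        with open("item-%d.rss" % post_id, "w") as f:
            f.write(
                '<rss version="2.0"><channel>'
                "<title>%s</title>"
                "<item><title>%s</title><description>%s</description>"
                "<pubDate>%s</pubDate></item>"
                "</channel></rss>"
                % (html.escape(title), html.escape(title), html.escape(body),
                   time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime()))
            )
        # notify.rss (not shown) then carries only the title and a link to
        # item-<id>.rss, so the heavy content crosses the wire once per
        # reader instead of 16 times a day.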

Just me (Sir to you)
Monday, June 16, 2003

This has been pretty much done to death. Joel, in fact, talked about the bandwidth-usage issue and got some of it solved himself; most aggregators now use ETags and other standard HTTP caching mechanisms to drastically limit their consumption.
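For reference, that conditional GET looks roughly like this (a sketch; the feed URL is hypothetical):

    import urllib.request, urllib.error

    last_etag = None   # persisted between polls in a real aggregator

    def poll(url="http://example.org/index.rss"):
        global last_etag
        req = urllib.request.Request(url)
        if last_etag:
            req.add_header("If-None-Match", last_etag)
        try:
            with urllib.request.urlopen(req) as resp:
                last_etag = resp.headers.get("ETag")
                return resp.read()       # feed changed: full download
        except urllib.error.HTTPError as e:
            if e.code == 304:
                return None              # Not Modified: only headers crossed the wire
            raise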

Anyhow, full feeds ARE great, especially when you use some of the neat new aggregators that will thread links, show which other blogs you're subscribed to have referenced the same pages, etc. SharpReader and RSSBandit, for example, are great at this.

All that having been said, I agree that some variation on the standard P2P notification concept could work here, but it is probably unnecessary for the problem at hand (it would seem to raise the technical bar too high for even the most basic aggregation systems)...

J
Monday, June 16, 2003

One forgets how fast the online world changes these days. You're offline for three weeks and hey ...
Still, wouldn't ETag solutions fetch the full feed every time a single post was added?

Just me (Sir to you)
Monday, June 16, 2003

Yes, with ETags the aggregator still pulls a full feed every time something NEW is available. Some people are experimenting with using the ETag value the client sends back to customize the content that goes down the wire, so that only new posts are returned.
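A sketch of what that experiment might look like on the server (details invented; the ETag doubles as a "newest item seen" marker):

    def render_feed(items, if_none_match=None):
        # items: newest-first list of (item_id, xml_fragment) tuples
        newest = str(items[0][0])
        if if_none_match == newest:
            return 304, "", newest               # nothing new: headers only
        try:
            last_seen = int(if_none_match)
        except (TypeError, ValueError):
            last_seen = -1                       # unknown client: send everything
        fresh = [xml for item_id, xml in items if item_id > last_seen]
        body = ('<rss version="2.0"><channel>%s</channel></rss>'
                % "".join(fresh))
        return 200, body, newest                 # new ETag = newest item id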

IMO, this is an HTTP problem to be solved. It doesn't have to be solved in the context of RSS.

Brad Wilson (dotnetguy.techieswithcats.com)
Monday, June 16, 2003

Wouldn't the solution be to have individual date-coded files for each post, plus a standard-named index.rss that lists those files?

Aggregator pulls index.rss, reads the index, and pulls whatever files it doesn't have (or pulls the three most recent and shows links for the ones before that).
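Sketched out (all names and the file layout invented, Python just to show the shape):

    import urllib.request
    import xml.etree.ElementTree as ET

    BASE = "http://example.org/blog/"   # hypothetical blog root
    have = set()   # per-post files already fetched; persisted locally

    def sync():
        with urllib.request.urlopen(BASE + "index.rss") as resp:
            index = ET.parse(resp)
        for entry in index.iterfind(".//item/link"):
            name = entry.text            # e.g. "2003-06-16-post.rss"
            if name not in have:
                with urllib.request.urlopen(BASE + name) as post:
                    save(name, post.read())   # each post file downloaded once
                have.add(name)

    def save(name, data):
        ...   # hand off to the local store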

Philo
Monday, June 16, 2003

There's nothing wrong with that solution, Philo, except that it's very different from what we have now (different enough that I wouldn't call it RSS). You have to consider the amount of effort required to change all the tools that generate RSS, as well as those that consume it.

Brad Wilson (dotnetguy.techieswithcats.com)
Monday, June 16, 2003

Bandwidth is a big issue, but I prefer full content, since I read most of the stuff online. (I am on a dial-up connection at 32 Kbps if I am lucky, and net access is expensive.)

Prakash S
Monday, June 16, 2003

DUH!!! I meant offline :-)

Prakash S
Monday, June 16, 2003
