david galbraith's blog
May 12, 2006
JSON and RSS
John Resig has a very nice RSS to JSON converter. We've been working on a JSON output to Solr (Lucene) for Wists' search and I'm now convinced that JSON is the way forward for RSS and webservices in general. John Resig - RSS to JSON Convertor Posted by david galbraith on May 12, 2006
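For readers who have not seen what an RSS to JSON conversion involves, here is a minimal sketch in Python. It is not John Resig's converter; the sample feed, the `rss_to_json` name and the choice of fields are assumptions made purely for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample feed, invented for this example.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>JSON and RSS</title>
      <link>http://example.com/json-and-rss</link>
      <description>A short note.</description>
    </item>
  </channel>
</rss>"""


def rss_to_json(rss_text):
    """Convert the title, link and description of each item to a JSON string."""
    channel = ET.fromstring(rss_text).find("channel")
    items = [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        }
        for item in channel.iter("item")
    ]
    feed = {"title": channel.findtext("title", default=""), "items": items}
    return json.dumps(feed, indent=2)


print(rss_to_json(SAMPLE_RSS))
```

The JSON side is just a dictionary with a list of items, which is roughly why it is attractive for webservices: a client can consume it without an XML parser.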
January 23, 2006
Outline style blogging
Over the last six months I have kept meaning to switch to outline style blogging. For all the hoo ha about OPML as a standard for reading lists - you might reasonably ask why not use RSS - after all RSS is often used for playlists so what could the difference be? The difference is really fairly subtle, but also very important, and the real answer has nothing to do with syndication, but with the process of writing and what people who evangelize outliners have been trying to persuade people of for years. The comments on Anil Dash: Outlining a Blog are a really clear illustration of the problem. Later generation blogging tools were designed with the influence of RSS, which in turn was influenced by news headline syndication. This meant that every post had a headline and, with only one default template for post styles, post templates tended to look like news stories with a big bold headline and text beneath. If you want to post small snippets, the news story style format is a problem. If you put the headline in the body, then what do you use for the headline? If you use the snippet as the headline, the bodyless post looks empty and you can't put links to sites mentioned in the snippet in the headline. Non outline style blogging leads to the type of writing where you feel compelled to make every post a mini essay. This is bad for both writers and readers - since most people don't want to read essays about everything and most bloggers don't really want to write essays about everything. The headlineless style allows people to write more freely and more often, note style - what blogging is about. Reading through the comments on Anil's site my gut feel is that the solutions to this are: 1. Multiple styles for post templates - headline or essay. 2. Dates/times as default headlines in syndicated outline style posts. 3. Named anchors as permalinks for individual entries within an outline style list post. 4. Ability to nest OPML within RSS within OPML namespace or use separately for a pure list.
(BTW- if anyone reading this knows - how are images handled in OPML?).
Posted by david galbraith on January 23, 2006
November 17, 2005
RSS - waiting for the great leap forward
Some people don't seem to like Google Base - I like it a lot and I'm sure that it's a product that will gradually evolve into something truly revolutionary. So far nobody has been able to touch Ebay, but one gets the impression that since they are an effective monopoly Ebay have become very conservative with their product, not wanting to risk innovation which could mess things up. This leaves others who innovate with an opportunity, and Google will innovate here. The people that pay Ebay - the sellers - would switch if they could, but Ebay has the buyers. Only someone like Google could offer a rival marketplace of buyers. As an aside - since the single item Google Base upload allows you to define your own metadata via custom name-value pairs, does that mean that with bulk uploading Google will intelligently parse RSS modules in their own namespaces? If this is the case then RSS has taken one mighty leap forward. Posted by david galbraith on November 17, 2005
July 26, 2005
Microsoft RSS search engine to launch on Monday?
Niall Kennedy has a fair idea that Microsoft may be about to launch RSS search. Posted by david galbraith on July 26, 2005
May 05, 2005
Ads in RSS
"The feeds themselves are ads for the stories they link to, which are revenue-generators. Anything that keeps people from clicking, that confuses them, takes them off course, is going to drop the click-through rate." Hear, hear. There are only three possibilities for ads in RSS: 1. where the feed is an aggregated feed or search result from many sources, then the ad is similar to what the search engines do (but this is a volume game - the individual ad revenue is less than at the destination site). 2. where MOST of the RSS ad revenue is given back to the publisher - so that the publisher can decide whether the ad revenue outweighs the potential revenue from the added traffic. 3. where the RSS feed is full content - although to be honest most people can make more revenue off fancy advertising at the site that the headline links to. But as Dave points out, if you really think it makes sense for 2. then people aren't really reading your stuff anyway. Posted by david galbraith on May 05, 2005
April 11, 2005
Tagging and the Semantic Web
Tagging Linkblogging and bookmarking RSS and metadata. Despite the fact that RSS has been around for nearly 10 years and that extending RSS via modules has been around for 5, there is not a single RSS aggregator that can read a new RSS module or extension on-the-fly, allow results to be filtered by this metadata and display it correctly. Although this sounds complicated, it need not be – an example would be addition of a tag called price - On the publisher side, one of the reasons why modules are not cropping up everywhere is that there are no simple tools for people to create RSS modules. At the moment people have to sit down and agree on a module and draft a spec. For example here is one that I worked on that is used for biographical information: http://vocab.org/bio/0.1/ Top down vs bottom up. The Semantic Web. Tagging and semantics. By allowing people to create metatags and attach these metatags to their own ‘namespace’ you allow for the possibility of formally defining groups of metatags as an RSS module for a specific industry. In theory one can create a marketplace for RSS modules where the people creating the modules need not know or care about the technicalities of what this means. In other words if people involved in apartment rentals start to tag things in the following manner: rooms=3 square_feet=2000 monthly_rent=2000 etc., one has the beginnings of something that could be formalized as a standard module for apartment rentals with elements defined in a standard namespace. It is possible that these early steps in grass roots classification via tagging could evolve into something more along the lines of what the original aims of the semantic web promised.
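To make the rooms=3 square_feet=2000 example concrete, here is a rough sketch in Python of how name=value tags could be turned into namespaced elements on an RSS item. The namespace URL, the `rent` prefix and both function names are hypothetical, invented only for illustration; no such apartment rental module exists as far as this post is concerned.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for an apartment rental module (not a real spec).
RENTAL_NS = "http://example.com/ns/rentals"
ET.register_namespace("rent", RENTAL_NS)


def parse_tags(tag_string):
    """Split whitespace-separated 'name=value' tags into a dict.

    Tokens without both a name and a value are ignored.
    """
    pairs = {}
    for token in tag_string.split():
        name, sep, value = token.partition("=")
        if sep and name and value:
            pairs[name] = value
    return pairs


def item_with_tags(title, link, tag_string):
    """Build an RSS item whose tags become elements in RENTAL_NS."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    for name, value in parse_tags(tag_string).items():
        ET.SubElement(item, "{%s}%s" % (RENTAL_NS, name)).text = value
    return item


item = item_with_tags(
    "Loft for rent",
    "http://example.com/loft",
    "rooms=3 square_feet=2000 monthly_rent=2000",
)
print(ET.tostring(item, encoding="unicode"))
```

The person tagging never sees the namespace; it is attached for them, which is the point about not needing to know or care about the technicalities.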
Posted by david galbraith on April 11, 2005
January 19, 2005
The story of a trademark lawyer and RSS
OK, for a perfect example of the absurdity of not looking before legaling: Lawyer writes a blog about trademarks. Suggestion: perhaps not having syndicated full content RSS would have been simpler.
Posted by david galbraith on January 19, 2005
December 07, 2004
Good RSS article by David Berlind
David Berlind has done his homework in What's wrong with RSS is also what's right with it. The first sensible piece about RSS all year. The reality of RSS is that modules have been a failure, and that leaves RSS as a standard for headlines and links and a miscellaneous catch all called description or content. As Berlind points out you don't always have headlines and links are sometimes ambiguous, so that leaves us looking at a rather naked emperor. But that doesn't mean to say that RSS isn't useful as a meme if not a standard. Posted by david galbraith on December 07, 2004
November 18, 2004
RSS ads
Dave Winer on RSS ads in feeds without full content: "To read the full article you have to click on a link and (listen very carefully now) see an ad as you read the article. In other words, the RSS feed is itself an ad, pulling you in to read a page with a big ad on it." This is true, but then again, Google makes most of its money by serving up pages of links with ads alongside, the links pointing to pages that in turn are often ad supported. If double-dip advertising works for search engines why shouldn't it work for feeds? Posted by david galbraith on November 18, 2004
October 11, 2004
Blogging and Youth Culture
The real growth in blogging and syndication is amongst Xanga and Livejournal users and these systems are walled gardens. To them, RSS and syndication are anathema. Good lowdown on Zephoria: "[young people] use the Profiles in IM to find out if their friends updated their LJs or Xangas, even though they are subscribed by email as well. The only feed they use is the LJ friends list and hyper LJ users have figured out how to syndicate Xangas into LJ." apophenia: a culture of feeds: syndication and youth culture Posted by david galbraith on October 11, 2004
October 07, 2004
RSS is not a space
I've heard three people refer to the 'RSS space' at Web 2.0. This is dangerous hype. RSS is not a space, it's a description of a way to transport links with clean titles. Advertising in RSS feeds will probably be worth $100 - $150 million within the next 18 months, and RSS readers will eventually be baked into all browsers as a fancy bookmarking feature - and that's it. If people wanted to get excited about a piece of geekery that weblogs have helped drive then ping servers would be a better thing to look at. If you become the king of all ping servers then you have something that is a real threat to the core business of search engines. When quantitative information such as price appears in RSS product feeds, then ping servers are hugely valuable and search engines based on crawling are fundamentally broken. Posted by david galbraith on October 07, 2004
April 01, 2004
Kinja - blogroll reading lists for non geeks
Kinja launches, well done Meg and Nick. My first impressions: Categorizing weblogs is difficult because weblogs are not often about one subject, so the Kinja categories are not the thing that interests me most. The single thing that I like most about Kinja is the public digest. Weblogs are about people and Kinja allows me to read what another person is reading arranged how I am used to reading weblogs - as a weblog. In geek terms I can now read the posts from the sites in someone's blogroll as a weblog. I would like a link at the top of my blogroll that says 'get on the same page as me' by reading the sites in my blogroll as a Kinja powered blog. Others could read the daily newspaper that I read and I could read what they do. Posted by david galbraith on April 01, 2004
March 06, 2004
Amazon's RSS feeds show up the format's current weaknesses
When you are organizing things you usually have a miscellaneous category for stuff you are not sure where to put. If the miscellaneous category becomes too big then you haven't really organized things properly. With Amazon's product RSS feeds, has RSS broken 'out of its news-and-weblog-tracking ghetto' as Loosely Coupled suggests? RSS is XML and XML allows you to put things in tags that say what they mean - metadata. News has headlines and products have descriptions, so Amazon logically puts the descriptions in the 'description' tag. Here's the problem: where does Amazon put the price information? Logically, you would think, in a price tag, since RSS is now extensible. The problem is twofold: 1. people are often nervous about creating their own modules or tags for RSS; there is no simple web forms interface, for example, that will build one for you (using your email address as a namespace perhaps). 2. aggregators do not read or display all RSS metadata, so putting the price in a price tag might actually make things worse. With only four things to organize (product name, price, link, product description), Amazon is forced to shoehorn the price of an item into the description tag, the 'miscellaneous' bucket. There are other bits of metadata in the Amazon descriptions, author, publisher etc., and since RSS has taken off because of simplicity, I'm not suggesting that Amazon adopt some hugely complicated committee-driven standard for a book seller module. But price is important, something that really needs to be marked as such to be useful. RSS is a very good way to syndicate links with clean titles (believe me, this solves a big problem for news aggregation), but until it regularly uses fundamentally important metadata such as prices, it hasn't really grown out of the news and weblog ghetto. Posted by david galbraith on March 06, 2004
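As a sketch of what a marked-up price buys you, the Python below reads a price from its own namespaced tag and filters on it, something that is not possible when the price is buried in the description. The feed, the `shop` namespace and the `items_under` function are all invented for this example and do not describe Amazon's actual feeds.

```python
import xml.etree.ElementTree as ET

# Hypothetical product feed; the 'shop' namespace and its price element
# are assumptions for illustration, not part of any real feed.
FEED = """<rss version="2.0" xmlns:shop="http://example.com/ns/shop">
  <channel>
    <title>Products</title>
    <item>
      <title>Book A</title>
      <link>http://example.com/a</link>
      <description>A paperback.</description>
      <shop:price currency="USD">12.50</shop:price>
    </item>
    <item>
      <title>Book B</title>
      <link>http://example.com/b</link>
      <description>A hardback.</description>
      <shop:price currency="USD">30.00</shop:price>
    </item>
  </channel>
</rss>"""

NS = {"shop": "http://example.com/ns/shop"}


def items_under(feed_text, max_price):
    """Return (title, price) for every item priced at or below max_price."""
    results = []
    for item in ET.fromstring(feed_text).iter("item"):
        price_text = item.findtext("shop:price", namespaces=NS)
        if price_text is None:
            continue  # item carries no machine-readable price
        price = float(price_text)
        if price <= max_price:
            results.append((item.findtext("title"), price))
    return results


print(items_under(FEED, 20))
```

An aggregator that ignores the `shop` namespace still shows the title, link and description, so adding the tag costs older readers nothing beyond the price not being displayed.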
February 24, 2004
RSS tracking for marketing
Email marketers have a big problem. HTML email is a better marketing medium and is more trackable than text email. This is because remote content such as images can be loaded as the email is opened and used to monitor impressions. Even if payment is for clicks and these are tracked, the impressions to conversion ratio is a must have. Unfortunately, email clients are soon going to block dynamic content in email, making it impossible to track impressions directly. This, coupled with the fact that up to 40% of HTML email doesn't get to its destination because of spam filters, has led people to look to RSS as a possible savior for email marketing. Aside from the fact that RSS (as implemented currently) is not as good a medium for ads, if RSS is to be successful it needs to be trackable. This means tracking clicks and impressions. At first glance the article below seems to offer this: "When you open up the feed we know it. Every time you refresh the feed we count it" Clickthroughs are simple; however, I don't believe that IMN offer real impressions tracking. The reason is that RSS clients are like search engine crawlers that offer cached results. You don't know whether a hit that registers as an RSS 'pageview' in your stats is an unread feed pulled by a piece of software or a genuine pageview by a human. Posted by david galbraith on February 24, 2004
RSS has nothing to do with push technology
Weekly Read has one of the few articles which point out that people confuse RSS and push, although it stops short of the reality. RSS HAS NOTHING TO DO WITH PUSH. You go to a URL and pull something down. The reason why people are confused is that when you use OTHER weblog inspired publishing methodologies in concert with RSS, such as alerting a ping server, then you can push things to a client. A ping server plus HTML and permalinks gives you much the same thing. RSS gives you clean headlines (and on rare occasions, extra metadata) so in theory you don't have to scrape websites. The reality is that you do have to scrape websites (ask Google News) because the majority of RSS doesn't contain full text for search engine indexing - but that is another story. Posted by david galbraith on February 24, 2004
February 18, 2004
Yahoo search includes RSS features
Jenny Levine points out that the new Yahoo search shows RSS URLs where available and has links to add sites automatically to MyYahoo. Posted by david galbraith on February 18, 2004
January 29, 2004
Kinja weblog aggregator wishlist
I've seen several comments lately about the trend away from the browser and how RSS may contribute to this, since it can be used in an email client etc. But the trend for Usenet was towards the browser, with eGroups and Deja. Likewise, despite lacking in features, people seem to like Bloglines. It is a browser based RSS aggregator and my money is on this model. Bloglines doesn't do what I want, perhaps Kinja will. My aggregator top ten wishlist items: 1. Search 2. Ability to pick a selection of blogs from a limited list of categories, not too many - prob like Google news. 3. Ability to do scoped search within these categories. 4. 'More like this' recommendations. 5. 'People who linked to this blog' button beneath selections. 6. 'People this blog links to' button beneath selections (blogroll plus contextual) 7. Browseable list of blogs ranked alphabetically or by popularity or by rate of increase in popularity in addition to category lists above. 8. Ability to view other users’ public lists of blogs and to clone and modify their lists to add to mine. 9. OPML and javascript export of part or all of my list as a blogroll. 10. Installable browser component that takes my list and makes it like a bookmark list, but with one key difference, it shows the number of new items in parentheses next to the bookmark. Overall I want something that looks like iTunes Music Store but for blogs. Posted by david galbraith on January 29, 2004
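Wishlist items 9 and 10 are small enough to sketch. The Python below exports a list of blogs as an OPML blogroll and builds the bookmark label with the unread count in parentheses. The function names and the tuple layout are assumptions for illustration; the `text`, `type`, `htmlUrl` and `xmlUrl` attributes follow common OPML subscription list practice.

```python
import xml.etree.ElementTree as ET


def blogroll_to_opml(title, blogs):
    """Serialize (name, site_url, feed_url) tuples as an OPML outline."""
    opml = ET.Element("opml", {"version": "1.1"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, site_url, feed_url in blogs:
        ET.SubElement(body, "outline", {
            "text": name,
            "type": "rss",
            "htmlUrl": site_url,
            "xmlUrl": feed_url,
        })
    return ET.tostring(opml, encoding="unicode")


def label_with_count(name, new_items):
    """Bookmark label showing the number of new items in parentheses."""
    return "%s (%d)" % (name, new_items)


# Hypothetical blogroll entry, for illustration only.
blogs = [("Example blog", "http://example.com/", "http://example.com/index.xml")]
print(blogroll_to_opml("My blogroll", blogs))
print(label_with_count("Example blog", 3))
```

Cloning someone else's list (item 8) then amounts to importing their OPML and editing the outline entries.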
December 15, 2003
How to make RSS commercially viable
RSS, or more generally, web based syndication, appears to be hitting critical mass, but where is the money? Despite the promise of metadata enriched syndicated content, RSS is usually no more than a way to syndicate a link and a headline. No large publisher will syndicate their full content in RSS because they would lose traffic and therefore, money. Without full content no aggregator can add much value by categorizing and filtering information, so no purely RSS based aggregator can make much money. Despite all of the interest around web based syndication, people like Lexis Nexis will still make all the money unless this problem is solved. The solution that gives publishers traffic and allows aggregators to add value is to syndicate full content in such a way that it can be searched or categorized, but people still have to go to the publisher's site to read the article. All that is needed to do this is to remove 'stop words' such as 'the', 'and', 'of' and place the tokenized remainder of the full text in the description tag. Persuading publishers to do this would surely be the best way of focusing community efforts to guarantee the success of web based syndication, rather than concentrating on standards minutiae. Posted by david galbraith on December 15, 2003
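The stop word idea is simple enough to sketch in a few lines of Python. The stop word list here is a tiny illustrative one and the function names are made up for this example; a real publisher would use a much longer list.

```python
import re

# A tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "of", "a", "an", "to", "in", "is"}


def tokenize_for_feed(full_text):
    """Lowercase the text, drop stop words, and join the remaining tokens.

    The result is searchable but not pleasant to read, so readers still
    have to visit the publisher's site for the article itself.
    """
    words = re.findall(r"[a-z0-9']+", full_text.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


def matches(tokenized_description, query):
    """True if every non-stop-word in the query appears in the description."""
    tokens = set(tokenized_description.split())
    return all(t in tokens for t in tokenize_for_feed(query).split())


description = tokenize_for_feed("The price of the book and the author")
print(description)
print(matches(description, "the book price"))
```

The tokenized string would go in the description tag, so an aggregator can search or categorize on the full vocabulary of the article without being able to republish it.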
December 11, 2003
New metadata standard for music files
Meta-files proposed for legal music sharing "The Content Reference Forum (CRF), founded by Universal Music Group and backed by technology companies including Microsoft, released the first specifications for the standard this week." Shouldn't this be RSS? Does anyone know the people involved in CRF? Posted by david galbraith on December 11, 2003
October 27, 2003
Bill Gates shows RSS integration in Longhorn desktop demo.
"Among the features shown off were transparent windows, animated windows that pop open and a new taskbar on the righthand side of the screen that displayed a clock, buddy list, and news and other information streamed onto the desktop via an RSS feed." Posted by david galbraith on October 27, 2003
October 09, 2003
(Not) Spam email to Jon Udell - one billion dollars please now, Viagra, enlarged body parts etc.
I wanted to send an email to Jon after having watched the webcast of his aggregators session at BloggerCon but unfortunately can't find his address. In the presentation Jon said that RSS was one possible step towards solving the problems in email, something that was perhaps worth $1Bn. The question I wanted to ask via email was on the very topic of how RSS really offers something different from email, so if Jon reads this through his subscription to this weblog, then perhaps this will illustrate the point. The problem is this: the email channel is too noisy for people like newsletter publishers to use. Assume for a moment that RSS readers are commonplace in email clients. For pure opt-in newsletters, RSS works (Jon is subscribed to this weblog, so he has opted in to read this - and even if he doesn't really read my weblog, perhaps the mention of his name in the headline will help and because it is RSS the mention of Viagra and huge amounts of cash won't matter). For pure opt-in, email works much like RSS - someone could whitelist my email address in much the same way that they may subscribe to my weblog. Not everyone uses whitelist filters, but even fewer people have RSS readers. The problem can be solved technically by both, but RSS would actually need more people to start using new software. A more interesting problem is unsolicited information. This need not all be pernicious - I would hope this email to Jon wouldn't be considered as such, and I would assume that suggested RSS feeds for weblogs based upon a personal profile would also be OK. But if the channel is open for suggested feeds then how does this avoid the spam problem? In short, I am missing something, sorry two things, I forgot the $1Billion. Posted by david galbraith on October 09, 2003
October 03, 2003
Events based weblogs
Libby Miller has an excellent piece on strategies for combining foaf and geo with RDFical. Plan B: Combining foaf, RDFical and geo, and maybe RSS 1.0... Posted by david galbraith on October 03, 2003
October 02, 2003
RSS and schema equals RSS-Data
0xDECAFBAD has a very nice example of RSS 2.0 and RSS-Data alongside. If you use schemas with RSS, you get everything that RSS-Data provides and more to the point, you make the data structure definitions optional. The only downside as far as I can see is that with RSS-Data the structural definitions are inline. Surely, however, since schemas are in XML themselves they can be inline - rather like CSS being inline or referenced? Posted by david galbraith on October 02, 2003
RSS-Data
Jeremy Allaire is having some interesting ideas about RSS. I like his idea of RSS-Data, but isn't the idea of a generic aggregator separate? Rich metadata in RSS isn't happening at the moment, the spec is there but the tools to create and read the content aren't: 1. there are no end user tools to create modules (why not allow people to build their own forms, where each form field is an RSS tag in a namespace that is their email address by default?) 2. there are no aggregators that read extended metadata (there are no aggregators that filter by a MoveableType category, for example). Both these issues are as much to do with UI as data modelling. RSS module builders could use a 'web forms that build forms' approach (the 'metacrap' syndrome would be a problem, but there are hundreds of person-years of work that have already gone into this with EDI standards such as EDIFACT; I had a go at this a few years back, with a proposal for the fields in webforms). An RSS meta-aggregator would have to allow users to preselect which new autodiscovered metadata to display in order to avoid inevitable UI issues such as sparse columns etc. In fact the best interim hack for this would be an Excel import tool that read RSS modules on the fly. Posted by david galbraith on October 02, 2003
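Here is a rough Python sketch of the meta-aggregator idea: discover whatever non-core elements a feed carries, count how often each one appears (so sparse columns can be spotted), then build rows from only the fields the user preselects. The sample feed, the `job` namespace and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed using an invented 'job' module.
SAMPLE = """<rss version="2.0" xmlns:job="http://example.com/ns/jobs">
  <channel>
    <title>Jobs</title>
    <item>
      <title>Editor</title>
      <link>http://example.com/1</link>
      <job:salary>50000</job:salary>
      <job:location>London</job:location>
    </item>
    <item>
      <title>Writer</title>
      <link>http://example.com/2</link>
      <job:salary>40000</job:salary>
    </item>
  </channel>
</rss>"""

CORE_TAGS = {"title", "link", "description"}


def discover_fields(feed_text):
    """Count the items using each non-core element ('{namespace}name' keys)."""
    counts = {}
    for item in ET.fromstring(feed_text).iter("item"):
        for tag in {child.tag for child in item}:
            if tag not in CORE_TAGS:
                counts[tag] = counts.get(tag, 0) + 1
    return counts


def to_rows(feed_text, selected):
    """Return one row per item: the title, then each user-selected field."""
    rows = []
    for item in ET.fromstring(feed_text).iter("item"):
        row = [item.findtext("title", default="")]
        row += [item.findtext(tag, default="") for tag in selected]
        rows.append(row)
    return rows


print(discover_fields(SAMPLE))
print(to_rows(SAMPLE, ["{http://example.com/ns/jobs}salary"]))
```

Writing the rows out with the standard `csv` module would give the spreadsheet import hack, with the discovered field names as column headers.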
October 01, 2003
RSS aggregation business models
Current RSS readers are more or less similar, differentiated on interface features rather than core functionality, and sold, where applicable, as software tools. Over time these features will surely be a commodity and any business model around RSS aggregation will be based upon the value add on top of aggregation. My guess is that this value-add is in efficient searching, categorizing and personalizing rather than discovery and display. Categorization and personalization can be done by adding metadata to existing feeds (the tokenization process of search could arguably be considered metadata; a tokenized content tag would allow local searching). This can be entirely independent of the tool used to view RSS, providing that RSS readers can read this metadata. The time is probably about right to start looking at this from the various initiatives such as FOAF and RSS topics that are out there and building features based upon them into aggregators. Posted by david galbraith on October 01, 2003
FeedDemon
Nick Bradbury's FeedDemon is very nice. The 3 pane interface is clearly the way to go for RSS reading. What's really interesting about FeedDemon, however, is that it is basically an RSS enhanced browser rather than a separate app. Admittedly the distinction is blurred, but seeing FeedDemon does lead me to believe that RSS features could become standard, collapsible components of a browser. When the joint Moreover/Blogger tool Newsblogger launched, it had a similar 3 pane view, but was definitely an online app. Blogger then decided to make it function through an Explorer bar in IE, which is more similar to the path that FeedDemon is going down. There are four types of aggregator: online (Bloglines); separate app (Newzcrawler, Newsmonster, Amphetadesk, Netnewswire); enhanced browser (FeedDemon); and enhanced email app (Newsgator). Until now, I was convinced that the online approach was best, but I'm not so sure. Posted by david galbraith on October 01, 2003
August 18, 2003
Weblog post plus permalink equals RDF
If you fill in a form to create a weblog post that has a permalink then you are creating something that is RDF-like by nature. Subject = the Post itself, which is pointed to by a permalink. The RDF/XML syntax can be hard - but the model is not, and no matter what the disputes surrounding its use are, weblog posts are an almost perfect application of some of the most important ideas behind RDF. An RDF statement is like a form field and its label (e.g. name: david) that are a property and value of something unique, like a person. Conveniently, if there is a URL that is that something, or is the identifier for that something then the properties pertain to that URL. When people post weblog entries, they often attach a unique url to that entry via a permalink. The weblog post, unlike a webpage, which can be a temporary rendering of the output from a database, is an item which contains meaning. Information is being published and retrieved in chunks where meaning, semantics, are being created or stored. The fact that weblog publishing arrived here via a separate route surely validates some of the principal ideas of RDF, even if there is some debate about the specifics. Posted by david galbraith on August 18, 2003
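To show how little is needed to go from a post form to RDF-like statements, here is a minimal Python sketch. The permalink is the subject and each form field becomes a property and value. The vocabulary URL, the field names and both function names are assumptions for illustration, and the output is only N-Triples-like, with simplified escaping.

```python
def post_to_triples(permalink, fields):
    """Turn form fields into (subject, property, value) statements.

    The subject of every statement is the post, identified by its permalink.
    """
    return [(permalink, name, value) for name, value in fields.items()]


def to_ntriples_like(triples, vocab="http://example.com/vocab#"):
    """Render triples in an N-Triples-like syntax (simplified escaping)."""
    lines = []
    for subject, prop, value in triples:
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        lines.append('<%s> <%s%s> "%s" .' % (subject, vocab, prop, escaped))
    return "\n".join(lines)


# Hypothetical post, as it might arrive from a weblog posting form.
triples = post_to_triples(
    "http://example.com/2003/08/18.html#post1",
    {"title": "Weblog post plus permalink equals RDF", "author": "david"},
)
print(to_ntriples_like(triples))
```

Nothing here touches the RDF/XML syntax, which is the hard part; the model itself is just the form, its labels and the URL.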
August 08, 2003
Obey Ebay
Ebay is a site that is full of links to trademarked names - things for sale like 'Nikes'. It is threatening to sue Google advertisers who use the name Ebay in phrases like 'Ebay power seller'. One advertiser puts it in perspective: "How do you say that you repair Volkswagens without saying Volkswagen?". Ebay doesn't have an API, is hostile to third party add-on software and has bought third party products like Paypal that encroached on its value-add and commissions. If Ebay, which has a virtual monopoly on classifieds is so hostile to decentralization, perhaps Ebay is vulnerable to the syndication model? Google ads a threat to eBay trademark? | CNET News.com Posted by david galbraith on August 08, 2003
August 07, 2003
RSS - Readable Syndication Standard
"If you believe in human-readability of your markup and in the power of XML, and your website isn't valid XHTML, you're contradicting yourself." Refined RSS feeds (kottke.org) I'd damn well better shut up then. Although I do have a super lovely Typepad (machine) created XHTML blog in the works. I still think that RSS can be both human readable and match its potential. The problem is not to do with namespaces or the RDF model, but the fact that the RDF syntax shoehorns into XML in such a way that there are double statements that read like legalese. Posted by david galbraith on August 07, 2003
An RSS driven marketplace
CareerBuilder, which is a joint venture by newspapers, outbid a $25 million per annum deal, struck at the peak of the dotcom bubble, between Monster and AOL, where the jobs service pays the portal for exclusivity. The reason: "newspapers' jobs classified ad revenue, which dropped by half from $8.7 billion in 2000, to $4.3 billion in 2002". Imagine if the entire classified ad business not only went online but decentralized and was based around RSS syndication and aggregation? Newspapers: Help wanted in Net ad battle | CNET News.com Posted by david galbraith on August 07, 2003
August 04, 2003
RSS jobs
With things like RSSJobs, it's good to see two things finally happening: RSS being created on-the-fly from searches; RSS being used for things other than news. At the moment, however, aggregators are largely read-only, and do not read RSS modules' metadata on-the-fly. The latter will happen as there is more and more metadata available; for the former, however, weblog APIs and RSS need to merge. In theory, one of the things that Atom may provide is a way to automatically bind to and configure any service such as RSSJobs from within a combined 'meta-aggregator'/weblog publishing client. Posted by david galbraith on August 04, 2003
July 25, 2003
Round-tripping RSS
Most of the RSS feeds that are around are basically feeds from a single source, and few take advantage of metadata within them. However a few more interesting tweaks are happening. One is on-the-fly RSS generated from a search term. Wired now allows this and Moreover has been allowing customers to create on-the-fly RSS feeds based on parametric searches over metadata contained in its database for some time. The interesting thing is that a feed based upon a query over metadata further creates metadata that can enrich the original source. This 'round tripping' of XML metadata potentially allows for enriching information as it flows around the web - this round tripping can be an infinite virtuous circle. As a simple example: Suppose an RSS feed contains the full content of articles and an on-the-fly RSS feed can be created by searching this full content. If you create an RSS feed of articles mentioning the term '802.11b', then all articles returned can have a topic tag attached that labels them as being about 'wireless', based upon a thesaurus lookup. Tagging the articles with this metadata as a topic is enriching the original source with further metadata. Posted by david galbraith on July 25, 2003
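The round trip in the example can be sketched in Python: search a full content feed for a term, look the term up in a thesaurus, and attach the broader topic to each matching item in the feed that comes back. The sample feed, the thesaurus and the `search_feed` name are invented for illustration, and the standard RSS 2.0 `category` element stands in here for the topic tag.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical thesaurus mapping search terms to broader topics.
THESAURUS = {"802.11b": "wireless", "wi-fi": "wireless"}

# Hypothetical full-content source feed.
SOURCE = """<rss version="2.0">
  <channel>
    <title>News</title>
    <item>
      <title>Faster 802.11b cards</title>
      <link>http://example.com/1</link>
      <description>Full text about 802.11b hardware.</description>
    </item>
    <item>
      <title>Gardening tips</title>
      <link>http://example.com/2</link>
      <description>Full text about roses.</description>
    </item>
  </channel>
</rss>"""


def search_feed(feed_text, term):
    """Return a new RSS feed of items mentioning term, each tagged with a topic."""
    out = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(out, "channel")
    ET.SubElement(channel, "title").text = "Search results for %s" % term
    topic = THESAURUS.get(term.lower())
    for item in ET.fromstring(feed_text).iter("item"):
        text = "%s %s" % (item.findtext("title", default=""),
                          item.findtext("description", default=""))
        if term.lower() not in text.lower():
            continue
        enriched = copy.deepcopy(item)
        if topic:
            # The new metadata: a topic derived from the query itself.
            ET.SubElement(enriched, "category").text = topic
        channel.append(enriched)
    return ET.tostring(out, encoding="unicode")


print(search_feed(SOURCE, "802.11b"))
```

If the enriched items are fed back to the source, a later search on the topic 'wireless' finds them without the thesaurus, which is the virtuous circle.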
July 22, 2003
(Not)Atom and RSS
In some ways RSS could be accused of being the Emperor's New Clothes of standards - the acronym has more than one meaning (including the Hindu Nationalist one), there are multiple versions, extending it via modules makes its use as generic as XML itself, and if you normalize the data flowing around in RSS, then the lowest common denominator is some text with a link surrounding it - hardly metadata, more like hypertext links. But to some extent, all standards are like the Emperor's New Clothes in that they are not so much about specs and technical precision, but the virtual mindset that they occupy and the people and tools that use them. To this end, RSS is a winning meme: people outside of the grass roots weblog community are starting to talk about and use it and RSS 2.0 passes the good enough test (with a couple of tweaks IMHO) for applications beyond headline syndication. There are many things that (Not)Atom is doing that are absolutely necessary and very good from a technical perspective, but from a market perspective, surely it would be better if Atom worked around a core of RSS, and if need be the RSS 2.0 core should be amended to include necessary things like the encoding type of content. A year or so ago the name actually wasn't important, but now RSS is: 1. on the radar of the content management companies, Vignette, Interwoven, Documentum; 2. being output within services from hubs such as AOL; 3. being syndicated by media organizations such as the BBC. In as much as weblogs are more important than online journals - that they show the way that information can be published, re-purposed, re-routed and re-formatted to be viewable on any networked device in real-time, anywhere and searchable with SQL-like precision using metadata encoded in XML - then the standards for weblog publishing, syndication and change notification are important for how things will be published and searched on the web.
If everything on the web were published using the emerging weblog method then the web would be searchable like a database and return anything as soon as it was available. Google will never be able to do this by crawling the web. There is a risk that if RSS has captured mind-share outside of the weblog community then Atom, without an RSS payload, may be perceived as a weblog only format, if only because people outside of this community will be confused. Posted by david galbraith on July 22, 2003
July 18, 2003
RSS resolution
For the record, I am on board with Dave's move to take RSS to the next level and to appoint Brent Simmons and Jon Udell to an advisory board. At some point it would be good if this went through a standards organization like the W3C, however. I would suggest that all RSS development focus around a 2.0 core and that the developer community focus on RSS modules on the one hand and a message wrapper for RSS content based upon weblog APIs on the other. With this, any RSS 1.0 community work or Atom (Echo) work should fold into this arena. RSS 2.0 meets the requirement that I see as key (extensibility through modules) and any fragmentation of effort will be counterproductive. There is plenty of work to do with defining modules and message wrappers - and to that I would add 'ping' server architecture, where the value of real-time information will demonstrate what a weblog published/RSS syndicated model can do that current search engines, with nothing fresher than 15 minutes, cannot - a Reuters for everyone. Technology at Harvard Law: Advisory Board Posted by david galbraith on July 18, 2003
July 16, 2003
One small step for Technorati.
Something interesting is happening in the world of online identities. The end goal is clear - a distributed, decentralized identity system where people have control over their own identity online - a people's 'Passport' or what Marc Canter envisages as a people's DNS. The problem is how to get there. Perhaps it will happen, in part, from the ground up through small steps such as personal data in systems such as Technorati or one-line bios as personal RSS headlines? In fact, in true Dave Sifry style, Technorati seems to already be moving along these lines: see Technorati Profiles and check out the picture. Over the longer term, this is perhaps as groundbreaking as what weblogs have done for web publishing and ultimately will leverage the weblog model to its full potential by creating a parry to content through people's interests and requirements, creating a marketplace for RSS. Posted by david galbraith on July 16, 2003
July 10, 2003
What does SOAP give you
Jon Udell nails it with his analysis of wrapper technology for weblog content: "I like its RESTian purity, though I'd also be open to a SOAP variant that could optionally leverage all of the authorization and routing machinery" I wonder if it can be proved that any two-dimensional SOAP message can be represented entirely as a one-dimensional URL of a certain length. I suspect that this is the case, and if so, then the only thing that the REST model does not allow is the creation of a secure login mechanism that blocks access proactively. The REST model requires retroactively blocking access based upon IP address. In which case, perhaps you could have a URL-encoded wrapper for RSS feeds, with a generic SOAP login wrapper if required. Perverse perhaps, but useful. In either case, the specification for the wrapper of weblog content should not specify the format used - i.e. it should be able to be represented in SOAP or as a URL. To allow this means not merely separating the message from the envelope, but the envelope from its routing and security specification. Authentication should not be part of the wrapper specification for weblog content. Posted by david galbraith on July 10, 2003
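As a rough illustration of the "message as URL" idea above, here is a minimal Python sketch that flattens a nested message into dotted key/value pairs and encodes them as a query string. It does not prove the general claim, and everything specific in it is an assumption made for illustration: the dotted-key convention, the function names flatten and to_url, and the example.com addresses. Keys that themselves contain dots, and empty containers, are not handled.

```python
from urllib.parse import urlencode, parse_qsl


def flatten(message, prefix=""):
    """Flatten nested dicts and lists into a list of (dotted_key, value) pairs."""
    pairs = []
    if isinstance(message, dict):
        for key, value in message.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            pairs.extend(flatten(value, child))
    elif isinstance(message, list):
        for index, value in enumerate(message):
            child = f"{prefix}.{index}" if prefix else str(index)
            pairs.extend(flatten(value, child))
    else:
        # Leaf value: record it under the path built so far.
        pairs.append((prefix, str(message)))
    return pairs


def to_url(base, message):
    """Encode the flattened message as the query string of a URL."""
    return base + "?" + urlencode(flatten(message))


# Hypothetical nested message standing in for a structured envelope.
message = {
    "feed": {
        "title": "Example feed",
        "items": [
            {"link": "http://example.com/1"},
            {"link": "http://example.com/2"},
        ],
    }
}
url = to_url("http://example.com/wrapper", message)

# Decoding the query string recovers the same flattened pairs.
decoded = parse_qsl(url.split("?", 1)[1])
```

The nesting survives in the key names (feed.items.0.link and so on), so the structure can be rebuilt on the other side, at the cost of a longer URL.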
July 09, 2003
Behold Echo
I'm not really interested in the politics of Echo; however, no matter what happens, a year or two from now we will have the way that publishing and aggregation works on the web nailed (most probably when Microsoft decide on what to adopt) - a development as important as the adoption of HTML and the web browser. This may include the Metaweblog API and RSS or a combined effort with a new XML schema in a SOAP wrapper. To my mind there is no problem in making RSS as is the default payload for SOAP content. A few tweaks that Echo already has would allow typing - e.g. avoiding the current madness where the mime type of the full content is not specified. So what are the missing bits? On the detailed level: RSS content is so unnormalized as to be almost useless for commercial applications. To build a searchable index of RSS content you need access to the full text of stories - and commercial publications are not going to syndicate the full text of stories - but you don't need to syndicate the full text of stories to index them. Encouraging the use of tokenized full text (i.e. with stop words such as and, or, the, etc. removed) allows machines to index full articles but leaves humans to visit original publishers' sites for the full article. This should be the default content of a 'content' tag and needs to be built into the default output from weblog publishing tools. On the medium scale: because of arguments over the RSS core, not enough focus has been put on tools to create modules and allow extensibility. Forms need to be built into applications such as Userland's, Blogger and Movable Type to allow end user creation of RSS modules within a user's namespace, without users needing to know about the underlying XML. Rapid adoption of modules will take syndicated content beyond the headline/link pair that is the only metadata currently being syndicated in any volume.
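To make the tokenized full text point from the detailed level concrete, here is a small Python sketch. It is only one possible reading: the stop word list is a tiny hypothetical sample, and sorting and de-duplicating the tokens (so the result is indexable but not readable as prose) is my assumption rather than anything specified above. It also throws away word order and frequency, which a real index might want to keep.

```python
import re

# Hypothetical, deliberately tiny stop word list; a real one would be longer.
STOP_WORDS = {"a", "an", "and", "or", "the", "of", "to", "in", "is", "it"}


def tokenize_for_index(text):
    """Lower-case the text, split it into words and drop stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


def content_tag_value(text):
    """Return sorted unique tokens as one string, e.g. for a 'content' tag."""
    return " ".join(sorted(set(tokenize_for_index(text))))


tokens = tokenize_for_index("The cat and the dog")
value = content_tag_value("The dog and the cat or the dog")
```

A search engine can still match an article on its terms, but anyone wanting to read the story has to follow the link back to the publisher.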
On the larger scale: content and the weblog API are two parts of the whole - most important of all perhaps is the ping server and related specs. In order to build personalized aggregators of real-time information, all of a weblog post needs to go to a neutral third party ping server and the ping server needs to have an API that allows clients to be alerted of changes in real time without having to scrape the ping server. Do this and you don't have 15 minute old Google aggregated news but real time news - the stuff that people like Reuters know the value of. Given the importance of the standards for publishing on the web, there needs to be a formal body with founder members such as Userland, Movable Type and Blogger. There is no money to be made out of the standards themselves, but a great deal to be lost if they are not agreed on by everyone. Without a body led by the weblog publishing tools their efforts will be usurped by whatever the big co's decide to use. Posted by david galbraith on July 09, 2003
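The ping server idea above, where clients are alerted of changes instead of scraping, is essentially publish/subscribe. The Python sketch below shows the shape of it in a single process. The class name PingServer, its methods and the topic strings are all hypothetical, and a real service would need a network protocol, persistence and authentication, none of which is shown.

```python
class PingServer:
    """In-memory stand-in for a ping server that pushes posts to subscribers."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a client callback to be alerted about a topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def ping(self, topic, post):
        """Called by a weblog tool on publish; returns how many clients were alerted."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(post)
        return len(callbacks)


server = PingServer()
received = []
server.subscribe("rss", received.append)

# A weblog tool sends the whole post, not just a "something changed" flag.
alerted = server.ping("rss", {"title": "Behold Echo", "link": "http://example.com/echo"})
unheard = server.ping("audio", {"title": "On the road"})
```

Because the full post travels with the ping, the aggregator never has to poll or crawl to find out what changed.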
April 03, 2003
Headlines for audio blogs
Dan Z responds to the challenge of creating meaningful headlines for audio blogs: This really isn't so hard, if you plan ahead. I've already integrated titles into my personal audioblogging software by creating an easily modifiable VXML grammar of titles. Right now mine are set to places (i.e. "At a restaurant", "On the road", "At Bill's house", "Somewhere new") but they could be set to categories (i.e. "Cooking", "Politics", "Antiquing", "Found Audio") or anything, really. As far as accents go, as long as the VXML server isn't overloaded and you don't choose titles that sound alike, this really shouldn't be a problem. But I've numbered mine, too, just in case I'm in a loud environment. Dan Z. Posted by david galbraith on April 03, 2003
March 05, 2003
RSS comes full circle
Over on RSS-DEV: "There are millions upon millions of people who use services like My Yahoo for customized news, stock quotes, etc. Why not start sending them to an RSS aggregator instead?" Funny how some things have their time. I think the battles are over for RSS: version 2.0 gives the majority of people what they want, extensibility through modules, but stays simple, and those that really need RDF can go back to version 1.0. More surprising is how the idea that RSS could rival My Yahoo is seen as something new - this is where it started. Several years ago there was an RSS aggregator: My Netscape. Posted by david galbraith on March 05, 2003
February 26, 2003
Creating RSS modules
Don Park complains that metadata is being stuffed into the description tag. This was the original reason that a modular approach to RSS was necessary. What is needed is a simple, online, forms-based tool to create RSS modules. I guess I should have a go. Posted by david galbraith on February 26, 2003
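As a sketch of what such a forms-based tool might emit, here is a short Python example that takes field names and values (as a form would collect them) and writes them as namespaced elements on an RSS item, rather than stuffing them into the description tag. The namespace URI, the "recipe" prefix, the cookingTime field and the build_item helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical module namespace; a real tool would let the user choose one.
MODULE_NS = "http://example.com/ns/recipe"
ET.register_namespace("recipe", MODULE_NS)


def build_item(title, link, module_fields):
    """Build an RSS <item> with extra metadata as namespaced child elements."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    for name, value in module_fields.items():
        # ElementTree uses {namespace}tag notation for qualified names.
        ET.SubElement(item, f"{{{MODULE_NS}}}{name}").text = value
    return item


item = build_item(
    "Soda bread",
    "http://example.com/soda-bread",
    {"cookingTime": "45 minutes"},
)
xml_text = ET.tostring(item, encoding="unicode")
```

The user only ever sees a form with a "cooking time" box; the namespace declaration and the recipe:cookingTime element are generated for them.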
December 05, 2002
RSS book out in April
Ben Hammersley's eagerly awaited book on RSS is out in April: oreilly.com -- Online Catalog: Content Syndication with RSS Mine's a Guinness, Ben. Posted by david galbraith on December 05, 2002
November 07, 2002
Permalinks and trackback are the key to the semantic web
The commonplace use of permalinks in weblogs has profound implications. In this context a page is irrelevant: weblog pages are transient things, but entries are made permanent through archiving. The difference between weblog publishing and most other web publishing tools is that traditional tools have been geared around producing pages, whereas the application servers and databases that sit behind large sites are more naturally geared around 'nuggets of information'. This is not limited to publishing programs - consider the stats packages which analyze web logs (as opposed to weblogs): by not giving statistics about individual 'postings' they are essentially tied to extracting potentially meaningless information about page hits. So what are the limitations of permalinks as they are currently implemented? 1. They are too conservative in that they reside on (but don't point to) a portion of a web page. It should be possible to set not merely the number of items or duration of time displayed on a weblog, but also the page size, so that a weblog item can be a portion of a web page, all of a web page or span several web pages, thus covering all of the options. This is rather like the 'more...' function of this posting but covering a page instead of an individual item. This would then allow weblogging tools to be used to publish things like conventional news websites. 2. Other links like comments and trackback are creeping in - they are very necessary additions to weblogging but they should be part of the permalink. Posted by david galbraith on November 07, 2002
October 22, 2002
Definitive RSS validator
Mark Pilgrim and Sam Ruby post their validator, which is optimized for 2.0 by default, but validates against: RSS 0.91, 0.92, 0.93, 0.94, 1.0, and 2.0 Posted by david galbraith on October 22, 2002
|