Overview
FeedParser and RSS aggregators do a great job of parsing RSS feeds, but the HTML posts carry additional metadata in the form of comments and trackbacks, and we'd like to parse this out.
Currently all the templating systems use the same general format for div and class names, so it's possible to write a parser for them.
Roughly the structure is:
No Format |
---|
<div class="comments">
<div class="comment-body">
Hey guys... this is my comment.
</div>
</div>
|
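As a rough illustration of how a parser could walk the structure above, the following Java sketch uses the JDK's built-in DOM parser to collect the text of each comment-body div nested inside a comments div. The class name CommentBodyExtractor is hypothetical (it is not part of FeedParser), and the sketch assumes the input is a well-formed XHTML fragment without a DOCTYPE; real-world tag soup would need a more forgiving parser.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration only; not part of FeedParser.
class CommentBodyExtractor {

    /**
     * Returns the trimmed text of every div with class "comment-body" that sits
     * inside a div with class "comments". Markup that is not well-formed yields
     * an empty list instead of an exception.
     */
    static List<String> extract(String xhtml) {
        List<String> bodies = new ArrayList<>();
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xhtml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return bodies; // broken template: report no comments
        }
        NodeList divs = doc.getElementsByTagName("div");
        for (int i = 0; i < divs.getLength(); i++) {
            Element div = (Element) divs.item(i);
            if (hasClass(div, "comment-body") && insideComments(div)) {
                bodies.add(div.getTextContent().trim());
            }
        }
        return bodies;
    }

    /** True if the element's class attribute contains the given token. */
    static boolean hasClass(Element e, String name) {
        for (String token : e.getAttribute("class").trim().split("\\s+")) {
            if (token.equals(name)) {
                return true;
            }
        }
        return false;
    }

    /** True if any ancestor element carries the class "comments". */
    static boolean insideComments(Element e) {
        for (Node n = e.getParentNode(); n instanceof Element; n = n.getParentNode()) {
            if (hasClass((Element) n, "comments")) {
                return true;
            }
        }
        return false;
    }
}
```

Returning an empty list when the markup cannot be parsed is a deliberately coarse choice: it drops every comment on the page, not only the broken one. A production parser would probably want to recover per comment.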
We'd like to push forward an "Open Comments" mechanism that standardizes on an XHTML structure (similar to XOXO) so that RSS/Atom parsers and aggregators like FeedParser can also parse comments and trackbacks from HTML.
- Trackback parsing
- Comment parsing
FAQ
- Q: Can't users just break their templates?
- A: Yes, but they could also remove the autodiscovery tags and so forth. If they break their templates, the parser should ignore the broken comments.
- Q: Why not just have comment feeds?
- A: Comment RSS/Atom feeds are one solution, but there are some additional advantages to using HTML:
  1. Developers don't have to change much.
  2. If you already have the HTML, you can index the comments with no additional I/O. Imagine an email-style threading pane integrated directly within your browser.
  3. Why have a separate XML file when the data is right there in the XHTML file?
Who's Interested?
KevinBurton - Rojo Networks Inc.
Name?
Is "Open Comments" a good name?
Where to organize?
Where should we organize this effort? I don't want to do it on a vendor's site and would rather keep this as open as possible. Any suggestions?
API
No Format |
---|
onTrackback()
...
@param author The author of the comment.
@param weblog The URL of the author's weblog.
@param permalink The permalink to the comment post.
@param content The content of the comment.
onComment( String author,
           String weblog,
           String permalink,
           String content,
           Date date )
|
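One way to read the callbacks listed above is as a listener interface that a parser invokes for each item it finds. The sketch below is an assumption about how that could look in Java: the names OpenCommentsListener and CollectingListener are hypothetical, and because the source does not spell out the arguments of onTrackback(), it is left without parameters here.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical interface name; the source only names the two callbacks.
interface OpenCommentsListener {

    /** Trackback arguments are not specified in the source, so none are assumed. */
    void onTrackback();

    /**
     * @param author    The author of the comment.
     * @param weblog    The URL of the author's weblog.
     * @param permalink The permalink to the comment post.
     * @param content   The content of the comment.
     * @param date      The date of the comment (assumed meaning; undocumented in the source).
     */
    void onComment(String author, String weblog, String permalink, String content, Date date);
}

// Illustrative implementation that simply records what it is told.
class CollectingListener implements OpenCommentsListener {
    final List<String> seen = new ArrayList<>();
    int trackbacks = 0;

    @Override
    public void onTrackback() {
        trackbacks++;
    }

    @Override
    public void onComment(String author, String weblog, String permalink,
                          String content, Date date) {
        seen.add(author + " <" + weblog + ">: " + content);
    }
}
```

A callback style like this lets an aggregator stream comments as they are found, without first building a full in-memory model of the page.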
Existing Template Structure
Typepad
No Format |
---|
<meta name="generator" content="http://www.movabletype.org/" /> |
...
Greymatter
Xanga
pMachine
Blojsom
No Format |
---|
<meta name="generator" content="blojsom v2.24"/>
|
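The generator meta tags shown above suggest one possible way for a parser to guess which template structure to expect. The following Java sketch is only an illustration: GeneratorSniffer and its method names are hypothetical, the regular expression assumes the name attribute precedes content as in both samples, and the family keys it returns are made up for this example.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class GeneratorSniffer {

    // Assumes attribute order name="generator" then content="...", as in the samples.
    private static final Pattern GENERATOR = Pattern.compile(
            "<meta\\s+name=\"generator\"\\s+content=\"([^\"]*)\"\\s*/?>",
            Pattern.CASE_INSENSITIVE);

    /** Returns the generator string, or null when no matching meta tag is present. */
    static String generator(String html) {
        Matcher m = GENERATOR.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    /** Maps a generator string to an illustrative template-family key. */
    static String templateFamily(String generator) {
        if (generator == null) {
            return "unknown";
        }
        String g = generator.toLowerCase(Locale.ROOT);
        if (g.contains("movabletype")) {
            return "movabletype";
        }
        if (g.startsWith("blojsom")) {
            return "blojsom";
        }
        return "unknown";
    }
}
```

Since users can edit or remove the generator tag, a parser would likely treat this only as a hint and fall back to probing the div and class names directly.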
No Format |
---|
<div id="comments">
<h3>Comments on this entry:</h3>
<div class="comment">
<p class="blue">
Left on Wed, 2 Feb 2005 10:19 by Grumpy
(<a href="http://hxr.us/grumpops"
rel="nofollow">http://hxr.us/grumpops</a>) </p>
<p> Very cool. And they'll be playing at a Borders only 4 miles
from my house in late Feb.
</p>
</div>
|
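In the Blojsom sample above, the author, weblog URL, and date are packed into a free-text byline, not separate elements, so a parser has to pick them apart. The Java sketch below shows one way that might be done with a regular expression. BlojsomByline is a hypothetical class, the pattern is derived from this single sample (other Blojsom templates may word the byline differently), and the date is interpreted in the JVM's default time zone because the sample carries no zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the byline wording comes from one Blojsom sample only.
class BlojsomByline {
    final String author;
    final String weblog;
    final Date date;

    // "Left on <date> by <author> (<a href="<weblog>" ..."
    private static final Pattern BYLINE = Pattern.compile(
            "Left on\\s+(.+?)\\s+by\\s+(.+?)\\s*\\(<a\\s+href=\"([^\"]*)\"");

    BlojsomByline(String author, String weblog, Date date) {
        this.author = author;
        this.weblog = weblog;
        this.date = date;
    }

    /** Returns the parsed byline, or null if the markup does not match the sample's shape. */
    static BlojsomByline parse(String commentHtml) {
        Matcher m = BYLINE.matcher(commentHtml);
        if (!m.find()) {
            return null; // broken or unfamiliar template: ignore this comment
        }
        try {
            // Matches dates such as "Wed, 2 Feb 2005 10:19"; default time zone is assumed.
            SimpleDateFormat fmt = new SimpleDateFormat("EEE, d MMM yyyy HH:mm", Locale.ENGLISH);
            Date date = fmt.parse(m.group(1));
            return new BlojsomByline(m.group(2), m.group(3), date);
        } catch (ParseException e) {
            return null;
        }
    }
}
```

The fragility of this kind of scraping is part of the motivation for a standardized structure: with dedicated class names for author, weblog, and date, no byline pattern matching would be needed.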