User:Mike-ZeleaCom/Difference feeder
The difference feed is a Web service providing an aggregate newsfeed of consensus-making discussions as they happen. A harvesting service crawls or subscribes to various discussion media that are compatible with difference bridging (mailing lists, Web forums, chat networks, microblogs and so forth), detects discussions that are focused on concrete differences of position, and collects a summary of the relevant messages into a single, aggregate newsfeed to which its clients may subscribe. The present page is a temporary scratch pad for hashing out the design of the harvester and is not necessarily up to date. The final design might be documented directly in the Java source code as it takes shape.
Here's an example of a message from a mailing list source. If the harvester had been subscribed to that source, then it would have received the message, parsed it and detected the embedded difference URL http://obsidian.reluk.ca:8080/v/w/Diff?b=3860&a=3891. If that URL points to the harvester's local difference bridge, and if the poster is one of the drafters named in the diff (Mike-ZeleaCom or ThomasvonderElbe GmxDe), then the message would have been accepted as relevant and a summary of it incorporated in the feed.
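The detection step in that example could be sketched roughly as follows. This is only an illustration, assuming Java 16 or later; the class name `DiffUrlExtractor`, the record `DiffLink` and the regular expression are hypothetical, and the URL shape is inferred from the single example URL above rather than from any bridge specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that finds difference URLs in a message body. */
final class DiffUrlExtractor {

    /** One detected link, with the values of its b and a query parameters. */
    record DiffLink(String url, int b, int a) {}

    // Assumed URL shape, taken from the one example in the text:
    //   <scheme>://<host and path>/Diff?b=<digits>&a=<digits>
    // Digits are capped at 9 so that Integer.parseInt cannot overflow.
    private static final Pattern DIFF_URL =
        Pattern.compile("(https?://\\S+?/Diff\\?b=(\\d{1,9})&a=(\\d{1,9}))");

    /** Returns every diff link found in the message, in order of appearance. */
    static List<DiffLink> extract(String messageBody) {
        List<DiffLink> links = new ArrayList<>();
        Matcher m = DIFF_URL.matcher(messageBody);
        while (m.find()) {
            links.add(new DiffLink(
                m.group(1),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3))));
        }
        return links;
    }

    public static void main(String[] args) {
        String body = "Please compare our drafts: "
            + "http://obsidian.reluk.ca:8080/v/w/Diff?b=3860&a=3891. Thanks.";
        for (DiffLink link : extract(body)) {
            System.out.println(link.url() + " (b=" + link.b() + ", a=" + link.a() + ")");
        }
    }
}
```

A sentence-ending period directly after the URL is not captured, because the pattern stops at the last digit of the `a` parameter.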
  Discussion               Client
    Source
       * \                  / *
          \                /
           \              /
 - - - - - - - - - - - - - - - - - - - - - - - -
|           \            /                       |
|            \          /                        |
|        1..* \        / 1..*                    |
|             Difference ------------ Difference |
|             harvester  1       1..* bridge     |
|                  \ 1                           |
|                   \                            |
|                    \                           |
|                     \ 1                        |
|                   Pollwiki                     |
|                                          site  |
 - - - - - - - - - - - - - - - - - - - - - - - -
Each harvester service is located at a site (server cluster) anchored by a single pollwiki (bottom of diagram). It reads some of its config from the wiki, and it generates relative URLs to the wiki's poll, position and user pages as part of the feed content. It also works in conjunction with the site's difference bridge (right). It is only through detecting links to the bridge that it discovers relevant messages. The harvester may also be a client of the bridge's own discovery services. (Currently we're prototyping only a single difference bridge. That won't change until we implement free-range drafting across multiple media, in addition to MediaWiki. Until then, the harvester prototype can assume a single bridge.)
Functions
Discovery of message sources
- none at first
  - we keep a page somewhere in the pollwiki that lists our dev-test sources (lists, forums, etc.)
- maybe later the harvester can discover new sources by detecting when they are added to that page
- later still, the discovery might be largely automated with the help of the difference bridge
  - the bridge can provide its own feed of reverse URLs [1] (e.g. to list archives) where people are clicking on its diff links
  - the harvester can then attempt to trace back to the original sources (e.g. lists)
Harvesting of message sources
- manual for now
  - so the admin has to do the work of configuring each mailing list etc.
- we'll improve this later, teaching the harvester how to crawl/subscribe on its own to the various different media
Structure
Input from message sources
- a web archive and its respective web-harvest crawling script (some scripts are provided by us)
Intermediate piping
- scans crawled/incoming messages for diff URLs
  - only URLs to the local difference bridge
  - only where the message is sent by a user whose draft is referenced in the diff
    - so the ID of the message poster must equal or correlate to a drafter ID (email address) obtained by forward tracing of the diff URL
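The two filter conditions above might be sketched like this. Everything named here is hypothetical: `RelevanceFilter` is an illustrative class, `drafterLookup` is a stand-in for whatever forward tracing the bridge actually offers, and "correlate" is reduced to a case-insensitive comparison of email addresses, which is probably simpler than the real rule.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical relevance check for one harvested message. */
final class RelevanceFilter {

    // Absolute URL of the local bridge's diff page, e.g.
    // http://obsidian.reluk.ca:8080/v/w/Diff (assumed to be http or https).
    private final URI bridge;

    // Stand-in for forward tracing: maps a diff URL to the email
    // addresses of the drafters whose drafts it references.
    private final Function<URI, Set<String>> drafterLookup;

    RelevanceFilter(URI bridge, Function<URI, Set<String>> drafterLookup) {
        this.bridge = bridge;
        this.drafterLookup = drafterLookup;
    }

    /** True if the diff URL addresses the same scheme, host, port and path as the bridge. */
    boolean pointsToLocalBridge(URI diffUrl) {
        return bridge.getScheme().equalsIgnoreCase(diffUrl.getScheme())
            && bridge.getHost().equalsIgnoreCase(diffUrl.getHost())
            && bridge.getPort() == diffUrl.getPort()
            && bridge.getPath().equals(diffUrl.getPath());
    }

    /** Accepts the message only if both conditions from the list hold. */
    boolean isRelevant(String posterEmail, URI diffUrl) {
        if (!pointsToLocalBridge(diffUrl)) {
            return false; // link to some other bridge or site
        }
        String poster = normalize(posterEmail);
        return drafterLookup.apply(diffUrl).stream()
            .map(RelevanceFilter::normalize)
            .anyMatch(poster::equals);
    }

    private static String normalize(String email) {
        return email.trim().toLowerCase(Locale.ROOT);
    }
}
```

Checking the bridge first means the (possibly remote) drafter lookup is skipped for links that could never qualify.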
Storage of feed
A summarized version of each message is stored. It contains the related poll and the minimal necessary information, such as the difference URL and the URL of the post.
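As a rough idea of what such a stored summary could look like, here is a minimal in-memory sketch, assuming Java 16 or later. The names `FeedEntry` and `FeedStore` are hypothetical, and the `sent` timestamp is an added assumption (a newsfeed needs some ordering); the real storage might equally be a database or files.

```java
import java.net.URI;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical summary of one relevant message. */
record FeedEntry(
    String poll,        // name of the related poll
    URI differenceUrl,  // diff URL found in the message
    URI postUrl,        // URL of the post itself, e.g. in a list archive
    Instant sent) {}    // assumed field, used only for ordering

/** Hypothetical in-memory store; a placeholder for real persistence. */
final class FeedStore {

    private final List<FeedEntry> entries = new ArrayList<>();

    synchronized void add(FeedEntry entry) {
        entries.add(entry);
    }

    /** Returns up to max entries, newest first. */
    synchronized List<FeedEntry> latest(int max) {
        return entries.stream()
            .sorted(Comparator.comparing(FeedEntry::sent).reversed())
            .limit(max)
            .toList();
    }
}
```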
Output of metadata
- pollwiki URL
  - so clients can construct absolute URLs into the pollwiki
- difference bridge URL
  - ditto
  - or put this in the 'diff' part of the feed, to support multiple bridges in future?
Output of feed
Reading from storage and assembling the feed in response to each request.
- request format
  - HTTP with parameters
- result types
  - feeds
- result formats
  - JSONP
    - allows cross-origin requests despite the browser's "same origin policy" restrictions: http://code.google.com/webtoolkit/doc/latest/DevGuideCodingBasicsJSON.html
    - cross-origin requests are allowed anyway in Firefox 3.5: https://developer.mozilla.org/En/HTTP_access_control
    - see also Cross-Origin Resource Sharing, http://www.w3.org/TR/cors/, but JSONP is more standard
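The JSONP result format amounts to wrapping the JSON body in a call to a function named by the client. A minimal sketch follows; the class `JsonpResponse` and the idea of passing the function name as a request parameter are assumptions, not part of the design above. The callback name is validated because it is echoed back into script that the client's browser will execute.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that turns a JSON body into a JSONP response. */
final class JsonpResponse {

    // Accept only plain JavaScript identifiers, optionally dotted
    // (e.g. "handleFeed" or "app.handleFeed"), so that a request
    // cannot inject arbitrary script through the callback name.
    private static final Pattern SAFE_CALLBACK = Pattern.compile(
        "[A-Za-z_$][A-Za-z0-9_$]*(\\.[A-Za-z_$][A-Za-z0-9_$]*)*");

    /**
     * Wraps the JSON in a call to the named function. If no callback
     * is given, the JSON is returned unchanged.
     */
    static String wrap(String callback, String json) {
        if (callback == null || callback.isEmpty()) {
            return json;
        }
        if (!SAFE_CALLBACK.matcher(callback).matches()) {
            throw new IllegalArgumentException("unsafe callback name: " + callback);
        }
        return callback + "(" + json + ");";
    }

    public static void main(String[] args) {
        System.out.println(wrap("handleFeed", "{\"entries\":[]}"));
    }
}
```

A client page would load the feed URL through a script element and define the named function to receive the data.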
Tasks
- integrate with clients other than Crossforum
  - Atom (or RSS, whichever is best), extended as necessary
    - for compatibility with general newsreader clients
- make the client feed configurable and scalable for a large depot of messages
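If Atom turns out to be the better choice, one feed item might be rendered along these lines. This is only a sketch using the JDK's StAX writer; the class `AtomEntryWriter`, its parameters and the choice of elements (just `id`, `title`, `updated` and a `link`) are assumptions, and any extension elements for poll or diff data are left out.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical renderer of one feed item as an Atom entry element. */
final class AtomEntryWriter {

    private static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    /** The updated argument is expected as an RFC 3339 timestamp. */
    static String toAtomEntry(String id, String title, String updated, String postUrl) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("entry");
            xml.writeDefaultNamespace(ATOM_NS);
            textElement(xml, "id", id);
            textElement(xml, "title", title);
            textElement(xml, "updated", updated);
            xml.writeEmptyElement("link");
            xml.writeAttribute("href", postUrl); // the writer escapes '&' in the URL
            xml.writeEndElement(); // closes entry
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write Atom entry", e);
        }
        return out.toString();
    }

    private static void textElement(XMLStreamWriter xml, String name, String text)
            throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(text);
        xml.writeEndElement();
    }

    public static void main(String[] args) {
        System.out.println(toAtomEntry(
            "http://lists.example.org/post/1",
            "Difference discussed on a list",
            "2010-01-01T00:00:00Z",
            "http://lists.example.org/post/1"));
    }
}
```

Using an XML writer rather than string concatenation keeps message titles and URLs containing `&` or `<` from breaking the feed.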