Harvester Design

Michael Allan mike at zelea.com
Fri Mar 9 00:10:56 EST 2012


Thanks C, this looks good.  I have a bunch of minor suggestions and I
reply to your own suggestion under 3a below.

> > 1. Redact the diagram above and put it in the Javadocs with links
> >    to the namesake classes and other docs (below).

http://whiletaker.homeip.net/votorola/harvester/javadoc/votorola/a/diff/harvest/package-summary.html#package_description
(i) The methods on the kicker (listen and trigger) should probably be
renamed "register" and "raise" for consistency.

(ii) Can the table at bottom link to the named components using
{@link} or something?  That'll make it easier to keep up to date,
because Javadoc will warn you when names change, etc.

(iii) "Cache" is incorrect in the table.

> > 2. Draft the Javadoc API for the various components beginning with
> >    those most pointed at by arrows:
> >
> >      a) Cache

ok

> >      b) Kicker

http://whiletaker.homeip.net/votorola/harvester/javadoc/votorola/a/diff/harvest/Kicker.html#register%28votorola.a.diff.harvest.Kicker.EventHandler,%20java.lang.String%29
(i) I guess archiveFormat should be archiveDesign and should link to
http://zelea.com/w/Property:Archive_design for the meaning.  I guess
the format should be the simple name of the design without namespaces
etc, e.g. just "Pipermail" for http://zelea.com/w/Stuff:Pipermail
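
To illustrate the naming in (i), here is a minimal sketch of reducing a
page name or page URL to its simple name.  PageNames and simpleName are
hypothetical names, not existing Votorola code, and the sketch assumes
the page title itself contains no '/'.

```java
/** Hypothetical helper (not Votorola code): reduces a wiki page name to its simple name. */
final class PageNames {
    private PageNames() {}

    /**
     * Returns the part of a page name or page URL that follows the namespace,
     * e.g. "Pipermail" for "http://zelea.com/w/Stuff:Pipermail".
     */
    static String simpleName(final String page) {
        // Drop everything up to the last '/', which removes scheme, host and path.
        final String title = page.substring(page.lastIndexOf('/') + 1);
        // Drop the namespace prefix ("Stuff:", "Concept:", ...) if there is one.
        return title.substring(title.indexOf(':') + 1);
    }
}
```

A name that is already simple passes through unchanged, so the helper
can be applied to whatever the wiki query happens to return.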

(ii) The forum pages (and forum design, etc) will eventually be in a
dedicated wiki, which usually won't be the admin's local pollwiki.
After your code is running, we'll teach WikiCache how to work with
multiple wikis.  We'll add a config item pointing to the forum wiki.
(Please add this design note to your code and back-link to this post.)

http://whiletaker.homeip.net/votorola/harvester/javadoc/votorola/a/diff/harvest/Kicker.Event.html
(iii) A suggestion: I would break it out as a separate class because
it's a fairly important thing.  I would call it simply Kick, because a
kick is an event (!) by definition.

(iv) I think the documented purpose of the Kick (or Kicker.Event)
should be more specific.  Maybe: "An event signalling that a forum's
local cache of messages might need to be refreshed.  The typical kick
receiver is a harvester.  On receiving a kick, the harvester schedules
a harvest job in which newly added messages are read from the forum
archive and stored in the cache."  Something like that.

(v) I think the forum method should point to
http://zelea.com/w/Concept:Forum for its documentation.  It should
return the simple name of the forum page without namespaces etc,
e.g. "Metagovernment" for: http://zelea.com/w/Stuff:Metagovernment

http://whiletaker.homeip.net/votorola/harvester/javadoc/votorola/a/diff/harvest/Kicker.EventHandler.html
(vi) Again, I would make it a top level class.  Maybe "KickReceiver",
with a method called KickReceiver#receive(Kick)?  Just a suggestion.
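
Taken together, suggestions (iii) to (vi) and the renaming to "register"
and "raise" might look roughly like the sketch below.  It is only an
illustration under assumptions: the parameters of raise are a guess
(only register's parameter types are visible in the Javadoc link above),
and KickDemo exists purely to show usage.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** An event signalling that a forum's local cache of messages might need to be refreshed. */
final class Kick {
    private final String forum;

    Kick(final String forum) { this.forum = forum; }

    /** The simple name of the forum page without namespaces, e.g. "Metagovernment". */
    String forum() { return forum; }
}

/** A receiver of kicks; the typical receiver is a harvester. */
interface KickReceiver {
    void receive(Kick kick);
}

/** Passes kicks to the receivers registered for an archive design. */
final class Kicker {
    private final Map<String, List<KickReceiver>> receivers = new HashMap<>();

    /** Registers a receiver for forums of the given archive design, e.g. "Pipermail". */
    synchronized void register(final KickReceiver receiver, final String archiveDesign) {
        receivers.computeIfAbsent(archiveDesign, k -> new ArrayList<>()).add(receiver);
    }

    /** Raises a kick.  Passing archiveDesign here is an assumption of this sketch. */
    synchronized void raise(final Kick kick, final String archiveDesign) {
        final List<KickReceiver> list = receivers.get(archiveDesign);
        if (list == null) return; // nobody registered for this design
        for (final KickReceiver receiver : list) receiver.receive(kick);
    }
}

final class KickDemo {
    public static void main(final String[] args) {
        final Kicker kicker = new Kicker();
        // A real harvester would schedule a harvest job here instead of printing.
        kicker.register(kick -> System.out.println("harvest scheduled for " + kick.forum()),
            "Pipermail");
        kicker.raise(new Kick("Metagovernment"), "Pipermail");
    }
}
```

With Kick and KickReceiver as top-level types, a harvester can implement
KickReceiver directly, or register a lambda as in KickDemo.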

> >      c) CacheWAP (Javadoc documenting the web API)

I assume these two will be removed later:
http://whiletaker.homeip.net/votorola/harvester/javadoc/votorola/s/wap/DiffCacheSS.html
http://whiletaker.homeip.net/votorola/harvester/javadoc/votorola/s/wap/HarvestCacheSS.html
Leaving only this:
http://whiletaker.homeip.net/votorola/harvester/javadoc/votorola/s/wap/HarvestWAP.html

(i) In the response, I would say "bites in order of parsed date,
newest first" because it's not obvious.

(ii) I would link *all* of the fields to their javadocs, because they
give useful information, and they're what the client is dealing with.
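
For the ordering in (i), a minimal sketch of "parsed date, newest
first".  The Bite class and its fields are hypothetical stand-ins and
not the actual response type; only the comparator matters here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Date;
import java.util.List;

/** Hypothetical stand-in for a bite in the response. */
final class Bite {
    final String subject;
    final Date parsedDate; // date parsed from the archived message

    Bite(final String subject, final Date parsedDate) {
        this.subject = subject;
        this.parsedDate = parsedDate;
    }
}

final class BiteOrder {
    private BiteOrder() {}

    /** Orders bites by parsed date, newest first. */
    static final Comparator<Bite> NEWEST_FIRST =
        Comparator.comparing((Bite b) -> b.parsedDate).reversed();

    /** Returns a sorted copy, leaving the given list untouched. */
    static List<Bite> sorted(final List<Bite> bites) {
        final List<Bite> copy = new ArrayList<>(bites);
        copy.sort(NEWEST_FIRST);
        return copy;
    }
}
```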

> > 3. Document the configuration of the Pipermail harvester.  The
> >    various harvesters should have similar forms of configuration,
> >    but this cannot be required.  There are two major parts to the
> >    configuration:
> >
> >      a) User configuration in pollwiki, such as archive location

ok, if you like my latest: http://zelea.com/w/Special:RecentChanges

I added what you suggested: http://zelea.com/w/Concept:Forum_design

> >      b) Administrative configuration on server

ok, none needed yet

> > 4. Draft the command interface for the Pipermail harvester.  Again
> >    the various harvesters should have similar command interfaces,
> >    but this cannot be required.

http://whiletaker.homeip.net/votorola/harvester/theatre.xht

(i) This isn't really part of the theatre any more, is it?  It's mainly
a standalone service.  Maybe just add "voharvest" to the manual?
http://zelea.com/project/votorola/s/manual.xht#Line

All that needs to be documented is the "voharvest" command itself.
There's already a "vowebharvest" (though just a link), which I guess
will be removed or renamed.
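
For reference, the three forms quoted further down (clear FORUM,
detect, harvest FORUM) could be parsed along these lines.  This is a
sketch only; VoharvestArgs and its usage text are hypothetical and say
nothing about how voharvest is actually implemented.

```java
/** Hypothetical parser for the voharvest command line (not Votorola code). */
final class VoharvestArgs {
    enum Command { CLEAR, DETECT, HARVEST }

    static final String USAGE = "usage: voharvest clear FORUM | detect | harvest FORUM";

    final Command command;
    final String forum; // simple forum name, or null for DETECT

    private VoharvestArgs(final Command command, final String forum) {
        this.command = command;
        this.forum = forum;
    }

    /** Parses the arguments, throwing IllegalArgumentException on malformed input. */
    static VoharvestArgs parse(final String[] args) {
        if (args.length == 0) throw new IllegalArgumentException(USAGE);
        switch (args[0]) {
            case "clear":
                return new VoharvestArgs(Command.CLEAR, forumArg(args));
            case "harvest":
                return new VoharvestArgs(Command.HARVEST, forumArg(args));
            case "detect":
                if (args.length != 1) throw new IllegalArgumentException(USAGE);
                return new VoharvestArgs(Command.DETECT, null);
            default:
                throw new IllegalArgumentException(USAGE);
        }
    }

    /** Returns the FORUM argument, which must be the only argument after the subcommand. */
    private static String forumArg(final String[] args) {
        if (args.length != 2) throw new IllegalArgumentException(USAGE);
        return args[1];
    }
}
```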

> > 5. Code it.

It seems ready to code - after you review my points above!

-- 
Michael Allan

Toronto, +1 416-699-9528
http://zelea.com/


conseo said:
> Hi,
> 
> > 
> > We already have forums.  http://zelea.com/w/Concept:Forum
> > So the candidate's position page will have one or more Forum
> > properties that point to all the forums in which discussions are
> > happening.  http://zelea.com/w/Property:Forum
> > 
> >   Position
> >     -> Forum
> > 
> > Forums have a property defining the archive location:
> > http://zelea.com/w/Property:Archive_URL
> > All we need to add is an archive format:
> > 
> >   Forum
> >     -> Archive URL
> >     -> Archive format
> > 
> 
> Good. I think we need a Forum type (here "Mailman") as well. In fact we don't 
> always need to know the Archive, because if we configure the base-url of 
> Mailman, then we can access the Archive from there. Yet configuring both would 
> be sanest. Detectors only know Forums, while Harvesters need to understand 
> different kinds of archives for one Forum type. Or do you want to hardwire it 
> around archive type ("Pipermail") for now?
>  
> >   voharvest clear FORUM   - clear FORUM from the cache
> >   voharvest detect        - run the harvest detectors
> >   voharvest harvest FORUM - harvest any new messages
> 
> Added to docs. (1)
> 
> > 
> > Whatever the diff feed needs to run, because that's currently your
> > only client.  Later you can extend the API to support the talk track
> > if it needs additional request methods.
> > 
> > So 2 is almost done.  3a is pretty much done if you agree.  That
> > leaves 3b and 4 to document:
> > 
> > > > 3. Document the configuration of the Pipermail harvester.  The
> > > >    various harvesters should have similar forms of configuration,
> > > >    but this cannot be required.  There are two major parts to the
> > > >    configuration:
> > > >
> > > >      a) User configuration in pollwiki, such as archive location
> > > >      b) Administrative configuration on server
> 
> 3 b) For Pipermail (and Irssi) we don't need server side configs, we will 
> fetch all configs out of the Wiki (as already discussed on IRC). 
> 
> 4) see above. 
> 
> Javadocs should reflect all proposed changes now. (2)
> 
> conseo
> 
> (1) http://whiletaker.homeip.net/votorola/harvester/theatre.xht
> 
> (2) http://whiletaker.homeip.net/votorola/harvester/javadoc/votorola/a/diff/harvest/package-summary.html


