--- Log opened Mon Aug 29 00:00:31 2011
00:13 < conseo> can you upstream your changes to lib-gwt-svg?
00:14 < mcallan> the developer is patching them in, he said today
00:16 < conseo> fine
00:41 < mcallan> i've nailed them here, c: http://lists.electorama.com/pipermail/election-methods-electorama.com/2011-August/028285.html
00:43 < mcallan> i daresay, the status quo of democracy is tarred by this simple critique, and these guys have to answer for it
00:43 < mcallan> it's their job
00:43 < mcallan> i'm off for lunch, cu soon
00:44 < conseo> :-D
00:49 < conseo> you like them, don't you? :-*
02:16 < mcallan> conseo: am i too severe?  i hope i'm not rude
11:38 < conseo> no, not at all, i am just not sure of the outcome of this discussion. yet i don't care too much atm., as i need to get that running
11:38 < conseo> mcallan: my problem is that the wrapper jar allows me to access classes from its classpath, but strangely not my classes from votorola.jar
11:39 < conseo> i got to go now, will be back in an hour with more investigation :-)
13:43 < conseo> stupid me. works now
14:20 < conseo> yay, it fills the db properly :-D
16:21 < conseo> mcallan: u there?
17:31 < mcallan> conseo: ah good, it works :-)
17:33 < conseo> atm. i parse the gson code like in c@zelea.com:/home/c/WikiAuthorshipVerifier.java
17:33 < conseo> how do i generate the Property string (with the weird 3A) correctly from the property name?
17:36 < mcallan> looking...
17:38 < mcallan> one thing (you may already know) WikiAuthorshipVerifier is prob. not thread safe - use only from single thread
17:40 < conseo> why?
17:41 < mcallan> is JsonParser thread safe? prob. not, and your class stores it, so class is not thread safe
17:41 < conseo> ok
17:42 < conseo> fixed, but wikicache is marked @ThreadSafe
17:43 < mcallan> wiki cache is thread safe, afaik
17:44 < mcallan> ok, but i don't understand what "weird 3A" you need
17:45 < mcallan> can you paste property string, and point to what you need to parse out of it?
17:46 < mcallan> (BTW i meant that *instances* of WikiAuthorshipVerifier are not TS, but class itself is)
17:48 < conseo> the json looks like this: "s": { "type": "uri" , "value": "http://zelea.com/w/Special:URIResolver/Property-3AIRC_nickname" }
17:49 < mcallan> and you want to read the value: "http://zelea.com/w/Special:URIResolver/Property-3AIRC_nickname" ?
17:50 < conseo> yep i want to read any property value: for example "IRC nickname" which gets this underscore and 3A in it
17:50 < conseo> i guess i have to parse the rdf right
17:50 < conseo> jumping around until i have the right value
17:50 < mcallan> i can't remember, looking at old code...
17:51 < mcallan> GoogleGeocoder has no problem
17:53 < mcallan> ach, GoogleGeocoder is not reading RDF :-)
17:54 < conseo> i guess i have to first find the array where "o": { "type": "literal" , "value": "IRC nickname" }, then i have to find the one where "p": { "type": "uri" , "value": "http://zelea.com/w/Special:URIResolver/Property-3AIRC_nickname" }
17:54 < conseo> and there is the value in o
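[The "weird 3A" is Semantic MediaWiki's URIResolver title escaping: ':' becomes its hex code "-3A" and spaces become '_', so "Property:IRC nickname" turns into "Property-3AIRC_nickname". A stdlib-only sketch of both directions, assuming only ':' and '-' need escaping (the real resolver escapes more characters; class and method names here are hypothetical, not votorola code):]

```java
// Sketch of Semantic MediaWiki URIResolver-style title escaping.
// Hypothetical helper, not part of votorola; gson would still be
// used to pull the "value" strings out of the JSON itself.
public final class WikiTitleCodec {

    // "IRC nickname" -> "IRC_nickname"; ':' and '-' -> "-XX" hex escapes
    public static String encode( String title ) {
        final StringBuilder b = new StringBuilder();
        for( char ch: title.toCharArray() ) {
            if( ch == ' ' ) b.append( '_' );
            else if( ch == ':' || ch == '-' ) b.append( String.format( "-%02X", (int)ch ));
            else b.append( ch );
        }
        return b.toString();
    }

    // Inverse: "-XX" escapes back to characters, '_' back to a space.
    // Assumes every '-' starts a valid two-digit hex escape.
    public static String decode( String title ) {
        final StringBuilder b = new StringBuilder();
        for( int i = 0; i < title.length(); ++i ) {
            final char ch = title.charAt( i );
            if( ch == '-' && i + 2 < title.length() ) {
                b.append( (char)Integer.parseInt( title.substring( i+1, i+3 ), 16 ));
                i += 2;
            }
            else if( ch == '_' ) b.append( ' ' );
            else b.append( ch );
        }
        return b.toString();
    }

    public static void main( String[] args ) {
        System.out.println( encode( "Property:IRC nickname" )); // prints "Property-3AIRC_nickname"
        System.out.println( decode( "Property-3AIRC_nickname" )); // prints "Property:IRC nickname"
    }
}
```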
17:54 < mcallan> the only pertinent examples are in our .js files, which use the wiki cache
17:54 < conseo> ok
17:54 < conseo> thx
17:57 < mcallan> ok, minor point btw: use JsonReader if u easily can, it's more efficient than the JsonParser tree generator
17:57 < conseo> ok
17:57 < mcallan> GoogleGeocoder has example of that much
18:05 < conseo> but if i have to jump around, the tree makes more sense. we have the wiki.jsm and wikiURIResolver.jsm which do some magic to get an array of our wiki properties
18:05 < conseo> hmm, this was confusing
18:06 < mcallan> i know, the w3c guys who dreamed up rdf ought'a be spanked
18:06 < conseo> your ecmascript parser generates a var listing for us and this was used in the config scripts. i'd have to write something similar if we need access to wiki properties elsewhere in the code
18:06 < conseo> :-D
18:07 < mcallan> i thought so, you ok with writing it?
18:07 < conseo> i could use JsonReader there. or i go for a simple code path to just fetch my property
18:07 < conseo> yep
18:07 < conseo> if we need it elsewhere
18:07 < mcallan> ok, JsonReader is an optimization issue, so don't worry too much about it
18:07 < conseo> otherwise i'd just fix that routine and try to push forward
18:07 < mcallan> ok
18:08 < conseo> i guess we will need it
18:08 < conseo> so i'll do it
18:10 < mcallan> btw, whatever code is very similar between .js and .java, use comments to link the files so we can maintain them more easily
18:15 < conseo> ok
18:43 < conseo> i could add a routine getPageProperties to WikiCache or should i create my own class?
18:50 < mcallan> conseo: you decide what's best c.  if i think of an improvement, i'll suggest later
18:50 < conseo> kk
19:37 < conseo> why do you use these "labels": retry: for( int retryCount = 0;; ++retryCount ) ? they don't show up elsewhere in the function, so you don't really use them, do you?
19:37 < conseo> does this improve debugging?
19:54 < conseo> if we drop readRDF_JSON as it is only used to fetch properties in ~/votorola/poll.js and ~/votorola/trust/votrace.js
19:54 < mcallan> usually i do refer to loop labels in break/continue statements. otherwise, i guess they only serve as comments
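[A label only pays off when a nested break or continue refers to it; otherwise it is pure documentation. A tiny self-contained illustration (class and method names hypothetical):]

```java
public final class LabelDemo { // hypothetical demo class

    // Search a grid for a target.  The label lets the inner loop
    // terminate BOTH loops at once; a plain break would only end
    // the inner one and the outer loop would keep scanning rows.
    public static int[] find( int[][] grid, int target ) {
        int[] hit = null;
        search: for( int r = 0; r < grid.length; ++r ) {
            for( int c = 0; c < grid[r].length; ++c ) {
                if( grid[r][c] == target ) {
                    hit = new int[] { r, c };
                    break search; // exits the outer loop too
                }
            }
        }
        return hit; // null if not found
    }

    public static void main( String[] args ) {
        int[] p = find( new int[][] {{ 1, 2 }, { 3, 4 }}, 3 );
        System.out.println( p[0] + "," + p[1] ); // prints "1,0"
    }
}
```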
19:55 < conseo> we can stop using jena. not sure if this benefits us (it is BSD)
19:56 < conseo> i'll leave it for now
20:05 < conseo> does it make sense to parse the page and return a HashMap<String,String> for the properties of the requested page?
20:06 < conseo> or is this too plain for RDF. i could try to use jena itself to return it in its objects, but i am not sure how much work this is
20:07 < conseo> ok, focus... i'll do the minimum and go with a hashmap
20:09 < mcallan> hash maps are expensive to construct.  jena is also expensive (wiki cache only uses jena to populate cache, not to fetch from it).  i don't know exactly what code you are adding, so it's hard to comment.  but if it's not perfect, we can fix it later.
20:11 < mcallan> the easiest thing is to code *inline*, whatever you yourself need.
20:12 < mcallan> don't add it to wiki cache, until some *other* piece of code needs it
20:12 < mcallan> that's simplest, and i often do that myself, when in doubt
20:14 < mcallan> (always a mistake to code infrastructure that is not needed, because we get tangled up in it)
20:17 < conseo> mcallan: then easiest is to fix getProperty in WikiAuthorshipVerifier and stick to it until we need some general property fetcher
20:17 < mcallan> ok
20:20 < mcallan> now, i must *not* reply to election methods till *after* i do my coding work.  i must not, i must not
20:20 < conseo> hehe
20:21 < mcallan> i will add the resources view to the diff bridge, with a link to votespace
20:43 < conseo> cool
20:44 < conseo> i have updated WikiAuthorshipVerifier on c@zelea, could you have a look? it works that way
20:46 < mcallan> only general comments, if you move property field into method params, then you can reuse a single instance of WikiAuthorshipVerifier
20:46 < conseo> yep, ok
20:46 < mcallan> that way you don't offend the garbage collector with lots of little objects that are used only once
20:47 < conseo> i have defined the method in the interface, so i have to change that
20:47 < conseo> ok
20:47 < conseo> well, for each scraper i only need one property, so this is no problem
20:48 < conseo> but this might not work all the time so i change it
20:48 < mcallan> and then mark the class @ThreadSafe, since it is
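[The refactoring mcallan suggests — hoist the per-call property out of a field and into a method parameter, so one instance holds no mutable state and can be shared across threads and calls. A sketch under that assumption (class and method names hypothetical, not the actual WikiAuthorshipVerifier):]

```java
import java.util.Map;

// Sketch of the suggested refactoring: the property name travels as a
// parameter instead of living in an instance field.  With no mutable
// state, a single instance is safe to share between threads, and the
// garbage collector sees one object instead of one per verification.
public final class PropertyVerifier { // hypothetical stand-in

    // @ThreadSafe — no instance fields, nothing to synchronize
    public String verify( Map<String,String> pageProperties, String propertyName ) {
        final String value = pageProperties.get( propertyName );
        if( value == null ) {
            throw new IllegalArgumentException( "no such property: " + propertyName );
        }
        return value;
    }
}
```

[One instance can then be constructed once per scraper run and reused for every message, rather than newed up per call.]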
20:49 < conseo> ok
20:57 < conseo> can i leave the System.out.println in there for the catch block? i can add a logger, but that way the output shows in webharvest output
20:59 < conseo> or does this cause problems if we use this class in our code?
21:00 < mcallan> it should never output on user's console, unless user wants that
21:03 < mcallan> if it's *supposed* to output stuff, then safest is to pass output stream (System.out if u like) into constructor
21:04 < mcallan> usually exceptions should not be printed.  handle them if you can, else throw them and let client handle them if he can, else let it die
21:09 < mcallan> you also have debug print statements in there?  if those are permanent, best to log them at fine, or whatever
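[Passing the stream in, as mcallan suggests, keeps the choice with the caller: webharvest can hand over System.out, embedded use can hand over something else or nothing. A minimal sketch (class name and null-for-silent convention are assumptions, not votorola's actual design):]

```java
import java.io.PrintStream;

// Sketch: the caller decides where (and whether) diagnostics go.
// WebHarvest-driven runs could pass System.out so output shows in
// the webharvest console; embedded use could pass null for silence.
public final class DiagnosticScraper { // hypothetical

    private final PrintStream diag; // may be null

    public DiagnosticScraper( PrintStream diagnosticSink ) { diag = diagnosticSink; }

    public void scrape( String page ) {
        if( diag != null ) diag.println( "scraping " + page );
        // ... actual scraping would go here ...
    }
}
```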
21:13 < conseo> well if the wikicache does not work and therefore the scraper silently parses nothing then it could get frustrating. i have put it at WARNING, but i can degrade it
21:14 < mcallan> why not let it die?
21:15 < mcallan> never (almost never) catch an exception unless you can handle it, and do something useful in response (like retry)
21:15 < mcallan> otherwise it is a bug
21:16 < conseo> throw new Error() then?
21:16 < conseo> or how should i let it die?
21:17 < mcallan> just do nothing
21:17 < mcallan> it will die all by itself :-)
21:17 < conseo> in the catch clause?
21:17 < conseo> sry for n00b question
21:17 < mcallan> no, s'ok.  you gotta add a throws statement to the method
21:17 < conseo> ah, sure. ok
21:17 < mcallan> that's always the safest thing, unless u know what else to do
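["Just do nothing" translates to declaring the exception in a throws clause instead of catching it; the failure then travels up to whoever can actually handle it (retry, report), or kills the thread if nobody can — no silently empty scrape. A minimal sketch (names hypothetical):]

```java
import java.io.IOException;

public final class PropagationDemo { // hypothetical demo class

    // Instead of catch { print and continue }, declare the exception.
    // The caller — and ultimately the runtime — decides what happens
    // on failure; nothing is swallowed.
    public static String fetchProperty( boolean cacheAvailable ) throws IOException {
        if( !cacheAvailable ) throw new IOException( "wiki cache miss" );
        return "IRC nickname";
    }

    public static void main( String[] args ) throws IOException {
        System.out.println( fetchProperty( true )); // prints "IRC nickname"
        // fetchProperty( false ) would let the IOException kill main,
        // with a stack trace on stderr — the "die all by itself" case.
    }
}
```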
21:18 < conseo> ok
21:29 < conseo> does it make sense to define an empty interface? i need to pass the AuthorshipVerifier classes to MessageParser, but the verify method has different parameters now
21:32 < conseo> no it doesn't, hmm
22:50 < conseo> hmm, now i have hit weird behaviour of web-harvest, hopefully they fix it
22:50 < conseo> both scrapers work now, pipermail and irssi for your irc log
22:51 < conseo> mcallan: what about thomasvonderelbe? will he be back soon?
22:51 < mcallan> cool!
22:52 < mcallan> could be...
22:52 < mcallan> he said minimum of 4 weeks, but could be as long as (?) 3 months
22:52 < conseo> hmm, ok
22:53 < mcallan> hope he returns sooner, rather than later
22:53 < conseo> me 2
22:55 < conseo> i have removed the "harvesters" for now and refactored my code btw. as harvesting and message parsing are now two different jobs. i'll remove the ircbot and maildir reader code from the repo for now
22:55 < conseo> ok?
22:56 < mcallan> ok
22:57 < conseo> i have also replaced google-gson-stream by the whole google-gson package and upgraded to 1.7.1. .txt info is up to date
22:58 < conseo> do you have unit test/testing code btw.?
23:00 < mcallan> ok.  no unit testing tho
23:01 < conseo> it is too early for that, right?
23:01 < conseo> or don't you use them?
23:01 < conseo> i had to write some for my blogging code once, as they are mandatory for KDE libs
23:02 < mcallan> we don't have any bugs, so far :-)
23:02 < conseo> how do you know ? :-P
23:03 < conseo> wtf(?): votorola/g/web/web-harvest-2.1-snapshot.jar: up to 41 MB of RAM may be required to manage this file
23:03 < conseo> (use 'hg revert votorola/g/web/web-harvest-2.1-snapshot.jar' to cancel the pending addition)
23:04 < mcallan> what is web-harvest-2.1-snapshot.jar?
23:04 < conseo> the harvest thingie
23:04 < conseo> it contains all its jars
23:04 < mcallan> a jar that contains jars?
23:04 < conseo> which is quite some duplication, yet i don't want to waste time on repackaging
23:04 < conseo> yep
23:05 < mcallan> nooo :-)
23:05 < conseo> i might remove it if it is necessary, but for now can we stick to it?
23:05 < mcallan> i don't even know how that can possibly work
23:06 < conseo> i am fed up with all that scraping stuff and would like to move to crossforum now
23:06 < mcallan> how big is that file?
23:06 < conseo> 14M
23:07 < mcallan> that's a big jar
23:08 < mcallan> almost as big as all of votorola gzipped, with source code and jars
23:08 < mcallan> what's in there?
23:08 < mcallan> why jars in jars, i don't understand
23:09 < conseo> have uploaded it, have a look at c@zelea.com:~
23:10 < mcallan> looking...
23:12 < mcallan> yikes
23:13 < mcallan> if that goes on our classpath, it will clobber other jars
23:13 < mcallan> u dumped other jars in there?
23:14 < mcallan> don't despair c, it's no big deal to fix i'm sure
23:14 < mcallan> i just want to understand where this came from, and what it's used for
23:15 < mcallan> we can skype if you want
23:16 < conseo> ok
23:17 < conseo> http://web-harvest.sourceforge.net
23:17 < mcallan> my skype box is booting, i will call you
23:17 < mcallan> it's their ordinary distro jar?
23:19 < conseo> yep, i have built it from their maven build script as stated in the repo
23:20 < conseo> the released beta has 8M also
23:27 < conseo> vowebharvest config=/usr/src/webharvest/pipermail.xml workdir=/tmp/votorola-mail "#startUrl=http://metagovernment.org/pipermail/start_metagovernment.org/" "#diffBridgeUrl=http://u.zelea.com:8080/v/w/D" "#startDate=2010-05-01T03:10:00.000-0100"
--- Log closed Tue Aug 30 00:00:47 2011