--- Log opened Mon Aug 29 00:00:31 2011
00:13 < conseo> can you upstream your changes to lib-gwt-svg?
00:14 < mcallan> the developer is patching them in, he said today
00:16 < conseo> fine
00:41 < mcallan> i've nailed them here, c: http://lists.electorama.com/pipermail/election-methods-electorama.com/2011-August/028285.html
00:43 < mcallan> i daresay, the status quo of democracy is tarred by this simple critique, and these guys have to answer for it
00:43 < mcallan> it's their job
00:43 < mcallan> i'm off for lunch, cu soon
00:44 < conseo> :-D
00:49 < conseo> you like them, don't you? :-*
02:16 < mcallan> conseo: am i too severe? i hope i'm not rude
11:38 < conseo> no, not at all, i am only not sure of the outcome of this discussion. yet i don't care too much atm, as i need to get that running
11:38 < conseo> mcallan: my problem is that the wrapper jar allows me to access classes from its classpath, but strangely not my classes from votorola.jar
11:39 < conseo> i got to go now, will be back in an hour with more investigation :-)
13:43 < conseo> stupid me. works now
14:20 < conseo> woohoo, it fills the db properly :-D
16:21 < conseo> mcallan: u there?
17:31 < mcallan> conseo: ah good, it works :-)
17:33 < conseo> atm i do the gson parsing like in c@zelea.com:/home/c/WikiAuthorshipVerifier.java
17:33 < conseo> how do i generate the Property string (with the weird 3A) correctly from the property name?
17:36 < mcallan> looking...
17:38 < mcallan> one thing (you may already know) WikiAuthorshipVerifier is prob. not thread safe - use only from single thread
17:40 < conseo> why?
17:41 < mcallan> is JsonParser thread safe? prob. not, and your class stores it, so class is not thread safe
17:41 < conseo> ok
17:42 < conseo> fixed, but wikicache is marked @ThreadSafe
17:43 < mcallan> wiki cache is thread safe, afaik
17:44 < mcallan> ok, but i don't understand what "weird 3A" you need
17:45 < mcallan> can you paste property string, and point to what you need to parse out of it?
17:46 < mcallan> (BTW i meant that *instances* of WikiAuthorshipVerifier are not TS, but class itself is)
17:48 < conseo> the json looks like this: "s": { "type": "uri" , "value": "http://zelea.com/w/Special:URIResolver/Property-3AIRC_nickname" }
17:49 < mcallan> and you want to read the value: "http://zelea.com/w/Special:URIResolver/Property-3AIRC_nickname" ?
17:50 < conseo> yep i want to read any property value: for example "IRC nickname" which gets this underscore and 3A in it
17:50 < conseo> i guess i have to parse the rdf right
17:50 < conseo> jumping around until i have the right value
17:50 < mcallan> i can't remember, looking at old code...
17:51 < mcallan> GoogleGeocoder has no problem
17:53 < mcallan> ach, GoogleGeocoder is not reading RDF :-)
17:54 < conseo> i guess i have to first find the array where "o": { "type": "literal" , "value": "IRC nickname" }, then i have to find the one where "p": { "type": "uri" , "value": "http://zelea.com/w/Special:URIResolver/Property-3AIRC_nickname" }
17:54 < conseo> and there is the value in o
17:54 < mcallan> the only pertinent examples are in our .js files, which use the wiki cache
17:54 < conseo> ok
17:54 < conseo> thx
17:57 < mcallan> ok, minor point btw: use JsonReader if u easily can, it's more efficient than the JsonParser tree generator
17:57 < conseo> ok
17:57 < mcallan> GoogleGeocoder has example of that much
18:05 < conseo> but if i have to jump around, the tree makes more sense. we have the wiki.jsm and wikiURIResolver.jsm which do some magic to get an array of our wiki properties
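
A minimal sketch of the binding walk described above, using gson's tree API. It is not the actual WikiAuthorshipVerifier code: the class and method names are invented, the "results"/"bindings" layout is the standard SPARQL JSON result format and may differ from what the wiki cache actually returns, and the "-3A" is assumed to be Semantic MediaWiki's hex escape for ':', with spaces written as '_'.

    import com.google.gson.JsonElement;
    import com.google.gson.JsonObject;
    import com.google.gson.JsonParser;

    final class PropertyValueSketch
    {
        /** Returns the object ("o") value of the first binding whose predicate ("p")
          * ends with the encoded property name, or null if no binding matches.
          */
        static String readPropertyValue( final String json, final String propertyName )
        {
            // e.g. "IRC nickname" -> "Property-3AIRC_nickname"
            final String encoded = "Property-3A" + propertyName.replace( ' ', '_' );
            final JsonObject root = new JsonParser().parse( json ).getAsJsonObject();
            for( final JsonElement e: root.getAsJsonObject( "results" ).getAsJsonArray( "bindings" ))
            {
                final JsonObject binding = e.getAsJsonObject();
                final String predicate = binding.getAsJsonObject( "p" ).get( "value" ).getAsString();
                if( predicate.endsWith( encoded ))
                {
                    return binding.getAsJsonObject( "o" ).get( "value" ).getAsString();
                }
            }
            return null;
        }
    }
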
18:05 < conseo> hmm, this was confusing
18:06 < mcallan> i know, the w3c guys who dreamed up rdf ought'a be spanked
18:06 < conseo> your ecmascript parser generates a var listing for us and this was used in the config scripts. i'd have to write something similar if we need access to wiki properties elsewhere in the code
18:06 < conseo> :-D
18:07 < mcallan> i thought so, you ok with writing it?
18:07 < conseo> i could use JsonReader there. or i go for a simple code path to just fetch my property
18:07 < conseo> yep
18:07 < conseo> if we need it elsewhere
18:07 < mcallan> ok, JsonReader is an optimization issue, so don't worry too much about it
18:07 < conseo> otherwise i'd just fix that routine and try to push forward
18:07 < mcallan> ok
18:08 < conseo> i guess we will need it
18:08 < conseo> so i'll do it
18:10 < mcallan> btw, whatever code is very similar between .js and .java, use comments to link the files so we can maintain them more easily
18:15 < conseo> ok
18:43 < conseo> i could add a routine getPageProperties to WikiCache or should i create my own class?
18:50 < mcallan> conseo: you decide what's best c. if i think of an improvement, i'll suggest later
18:50 < conseo> kk
19:37 < conseo> why do you use these "labels": retry: for( int retryCount = 0;; ++retryCount ) ? they don't show up elsewhere in the function, so you don't really use them, do you?
19:37 < conseo> does this improve debugging?
19:54 < conseo> if we drop readRDF_JSON, as it is only used to fetch properties in ~/votorola/poll.js and ~/votorola/trust/votrace.js
19:54 < mcallan> usually i do refer to loop labels in break/continue statements. otherwise, i guess they only serve as comments
19:55 < conseo> we can stop using jena. not sure if this benefits (it is BSD)
19:56 < conseo> us. i'll leave it for now
20:05 < conseo> does it make sense to parse the page and return a HashMap for the properties of the requested page?
20:06 < conseo> or is this too plain for RDF. i could try to use jena itself to return it in its objects, but i am not sure how much work this is
20:07 < conseo> ok, focus... i'll do the minimum and go with a hashmap
20:09 < mcallan> hash maps are expensive to construct. jena is also expensive (wiki cache only uses jena to populate cache, not to fetch from it). i don't know exactly what code you are adding, so it's hard to comment. but if it's not perfect, we can fix it later.
20:11 < mcallan> the easiest thing is to code *inline*, whatever you yourself need.
20:12 < mcallan> don't add it to wiki cache, until some *other* piece of code needs it
20:12 < mcallan> that's simplest, and i often do that myself, when in doubt
20:14 < mcallan> (always a mistake to code infrastructure that is not needed, because we get tangled up in it)
20:17 < conseo> mcallan: then easiest is to fix getProperty in WikiAuthorshipVerifier and stick to it until we need some general property fetcher
20:17 < mcallan> ok
20:20 < mcallan> now, i must *not* reply to election methods till *after* i do my coding work. i must not, i must not
20:20 < conseo> hehe
20:21 < mcallan> i will add the resources view to the diff bridge, with a link to votespace
20:43 < conseo> cool
20:44 < conseo> i have updated WikiAuthorshipVerifier on c@zelea, could you have a look? it works that way
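
A minimal sketch of the labelled retry loop asked about at 19:37. It is not taken from the votorola sources; fetchOnce() and the retry limit are invented stand-ins for whatever call can fail. As noted above, the label only does real work once a break or continue names it; otherwise it serves as a comment.

    import java.io.IOException;

    final class RetrySketch
    {
        private static final int RETRY_LIMIT = 3;

        static String fetchWithRetry() throws InterruptedException
        {
            retry: for( int retryCount = 0;; ++retryCount )
            {
                try
                {
                    return fetchOnce();
                }
                catch( final IOException x )
                {
                    if( retryCount >= RETRY_LIMIT ) throw new RuntimeException( x ); // give up
                    Thread.sleep( 1000L * (retryCount + 1) ); // back off, then try again
                    continue retry; // here the label is actually referenced
                }
            }
        }

        private static String fetchOnce() throws IOException { return "ok"; } // placeholder
    }
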
20:46 < mcallan> only general comments: if you move property field into method params, then you can reuse a single instance of WikiAuthorshipVerifier
20:46 < conseo> yep, ok
20:46 < mcallan> that way you don't offend the garbage collector with lots of little objects that are used only once
20:47 < conseo> i have defined the method in the interface, so i have to change that
20:47 < conseo> ok
20:47 < conseo> well, for each scraper i only need one property, so this is no problem
20:48 < conseo> but this might not work all the time so i change it
20:48 < mcallan> and then mark the class @ThreadSafe, since it is
20:49 < conseo> ok
20:57 < conseo> can i leave the System.out.println in there for the catch block? i can add a logger, but that way the output shows in the webharvest output
20:59 < conseo> or does this cause problems if we use this class in our code?
21:00 < mcallan> it should never output on user's console, unless user wants that
21:03 < mcallan> if it's *supposed* to output stuff, then safest is to pass output stream (System.out if u like) into constructor
21:04 < mcallan> usually exceptions should not be printed. handle them if you can, else throw them and let client handle them if he can, else let it die
21:09 < mcallan> you also have debug print statements in there? if those are permanent, best to log them at fine, or whatever
21:13 < conseo> well, if the wikicache does not work and therefore the scraper silently parses nothing, then it could get frustrating. i have put it at WARNING, but i can degrade it
21:14 < mcallan> why not let it die?
21:15 < mcallan> never (almost never) catch an exception unless you can handle it, and do something useful in response (like retry)
21:15 < mcallan> otherwise it is a bug
21:16 < conseo> throw new Error() then?
21:16 < conseo> or how should i let it die?
21:17 < mcallan> just do nothing
21:17 < mcallan> it will die all by itself :-)
21:17 < conseo> in the catch phrase?
21:17 < conseo> sry for n00b question
21:17 < mcallan> no, s'ok. you gotta add a throws clause to the method
21:17 < conseo> ah, sure. ok
21:17 < mcallan> that's always the safest thing, unless u know what else to do
21:18 < conseo> ok
21:29 < conseo> does it make sense to define an empty interface? i need to pass the AuthorshipVerifier classes to MessageParser, but the verify method has different parameters now
21:32 < conseo> no it doesn't, hmm
22:50 < conseo> hmm, now i have hit weird behaviour of web-harvest, hopefully they fix it
22:50 < conseo> both scrapers work now, pipermail and irssi for your irc log
22:51 < conseo> mcallan: what about thomasvonderelbe? will he be back soon?
22:51 < mcallan> cool!
22:52 < mcallan> could be...
22:52 < mcallan> he said minimum of 4 weeks, but could be as long as (?) 3 months
22:52 < conseo> hmm, ok
22:53 < mcallan> hope he returns sooner, rather than later
22:53 < conseo> me 2
22:55 < conseo> i have removed the "harvesters" for now and refactored my code btw., as harvesting and message parsing are now two different jobs. i'll remove the ircbot and maildir reader code from the repo for now
22:55 < conseo> ok?
22:56 < mcallan> ok
22:57 < conseo> i have also replaced google-gson-stream with the whole google-gson package and upgraded to 1.7.1. the .txt info is up to date
22:58 < conseo> do you have unit test/testing code btw.?
23:00 < mcallan> ok. no unit testing tho
23:01 < conseo> it is too early for that, right?
23:01 < conseo> or don't you use them?
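
A minimal sketch of the shape that advice points at, with invented names throughout. The property name is a method parameter so a single instance can be reused, and the fetch declares IOException in a throws clause rather than catching and printing, so a broken wiki cache fails loudly instead of the scraper silently parsing nothing.

    import java.io.IOException;

    final class VerifierSketch
    {
        /** Verifies authorship by reading the named property of the named page.
          * Taking the property name as a parameter lets one instance serve all scrapers.
          */
        String verify( final String pageName, final String propertyName ) throws IOException
        {
            final String json = fetchPageJSON( pageName ); // no catch: a failure propagates to the caller
            return parseProperty( json, propertyName );
        }

        private String fetchPageJSON( final String pageName ) throws IOException
        {
            throw new IOException( "placeholder: real code would hit the wiki cache" );
        }

        private String parseProperty( final String json, final String propertyName )
        {
            return json; // placeholder
        }
    }
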
23:01 < conseo> i had to write some for my blogging code once, as they are mandatory for KDE libs
23:02 < mcallan> we don't have any bugs, so far :-)
23:02 < conseo> how do you know? :-P
23:03 < conseo> wtf(?): votorola/g/web/web-harvest-2.1-snapshot.jar: up to 41 MB of RAM may be required to manage this file
23:03 < conseo> (use 'hg revert votorola/g/web/web-harvest-2.1-snapshot.jar' to cancel the pending addition)
23:04 < mcallan> what is web-harvest-2.1-snapshot.jar?
23:04 < conseo> the harvest thingie
23:04 < conseo> it contains all its jars
23:04 < mcallan> a jar that contains jars?
23:04 < conseo> which is quite some duplication, yet i don't want to waste time on repackaging
23:04 < conseo> yep
23:05 < mcallan> nooo :-)
23:05 < conseo> i might remove it if it is necessary, but for now can we stick to it?
23:05 < mcallan> i don't even know how that can possibly work
23:06 < conseo> i am fed up with all that scraping stuff and would like to move to crossforum now
23:06 < mcallan> how big is that file?
23:06 < conseo> 14M
23:07 < mcallan> that's a big jar
23:08 < mcallan> almost as big as all of votorola gzipped, with source code and jars
23:08 < mcallan> what's in there?
23:08 < mcallan> why jars in jars, i don't understand
23:09 < conseo> have uploaded it, have a look at c@zelea.com:~
23:10 < mcallan> looking...
23:12 < mcallan> yikes
23:13 < mcallan> if that goes on our classpath, it will clobber other jars
23:13 < mcallan> u dumped other jars in there>
23:13 < mcallan> in there?
23:14 < mcallan> don't despair c, it's no big deal to fix i'm sure
23:14 < mcallan> i just want to understand where this came from, and what it's used for
23:15 < mcallan> we can skype if you want
23:16 < conseo> ok
23:17 < conseo> http://web-harvest.sourceforge.net
23:17 < mcallan> my skype box is booting, i will call you
23:17 < mcallan> it's their ordinary distro jar?
23:19 < conseo> yep, i have built it from their maven build script as stated in the repo
23:20 < conseo> the released beta is 8M too
23:27 < conseo> vowebharvest config=/usr/src/webharvest/pipermail.xml workdir=/tmp/votorola-mail "#startUrl=http://metagovernment.org/pipermail/start_metagovernment.org/" "#diffBridgeUrl=http://u.zelea.com:8080/v/w/D" "#startDate=2010-05-01T03:10:00.000-0100"
--- Log closed Tue Aug 30 00:00:47 2011