--- Log opened Mon Mar 26 00:00:09 2012
00:33 < conseo> n8, cu later
00:41 < mcallan> n8 c
12:57 < conseo> mcallan: how would you name marked areas of the time line. i thought of dots and lines, atm. it is markers and slices
12:57 < conseo> ?
13:00 < conseo> the whole thing is a timeline? atm. it is a "history"
18:49 < mcallan> conseo: maybe something like "message sequence" with "un-harvested segments" (or "uncached segments")
18:52 < mcallan> (i guess it's not really a timeline though. it's made of messages not time, and it's a discontinuous sequence of atoms rather than a continuous line)
22:25 < conseo> i have an idea to simplify the marking stuff. since the first step of every crawl is the compilation of the list of pages with messages, we can simply save which of these sub-lists are already crawled
22:25 < conseo> we then only have to save the url for every month (for pipermail for example) which we already have crawled
22:28 < conseo> on a new crawl we omit these sub-lists. the smallest sub-list unfinished and including the present, gets a marker for the newest continuously harvested message
22:30 < conseo> i will sketch that out tomorrow, that could simplify harvesting a lot. the only problem is that we might lose the number represented on one smallest sub-list page if we interrupt a harvest
22:30 < conseo> anyhow, i go to bed :-)
22:31 < conseo> (forget about "smallest" in "smallest sub-list unfinished")
22:33 < conseo> this neither needs a sequential order of the messages by id or by date, only the list has to be static in the past
22:33 < conseo> n8
22:48 < mcallan> sounds interesting, i'll wait to read your design...
22:48 < mcallan> g'n8
--- Log closed Tue Mar 27 00:00:25 2012
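
A rough sketch of the bookkeeping conseo describes at 22:25 to 22:33, not the design he planned to write up. It assumes each sub-list (for example one pipermail month page) is identified by its URL, that a page only ever grows at the end ("static in the past"), and that the marker can be stored as a count of messages already harvested from the open sub-list. `HarvestState`, `plan_crawl`, `harvest` and the `fetch` callback are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HarvestState:
    """Hypothetical persisted state for one archive."""
    # URLs of sub-lists that were crawled completely and can be omitted.
    finished: set[str] = field(default_factory=set)
    # For an unfinished sub-list: how many of its messages were harvested
    # without a gap, counted from the start of the page (the "marker").
    markers: dict[str, int] = field(default_factory=dict)


def plan_crawl(sublists, state):
    """Return (url, start_index) pairs for the sub-lists that still need work."""
    return [
        (url, state.markers.get(url, 0))
        for url in sublists
        if url not in state.finished
    ]


def harvest(sublists, current, fetch, state):
    """Crawl pending sub-lists and update the state.

    `current` is the sub-list that includes the present; it is never marked
    finished because new messages may still arrive on it.
    `fetch(url)` is assumed to return that page's messages in page order.
    """
    harvested = []
    for url, start in plan_crawl(sublists, state):
        messages = fetch(url)
        for i in range(start, len(messages)):
            harvested.append(messages[i])
            # Advance the marker after every message, so an interrupted
            # harvest resumes from the last message actually stored.
            state.markers[url] = i + 1
        if url != current:
            state.finished.add(url)
            state.markers.pop(url, None)
    return harvested
```

With two month pages where only the newer one is still open, a first call harvests everything and marks the older page finished; a later call skips the finished page and picks up only the messages appended after the marker. Nothing here depends on message ids or dates, only on the page order staying fixed for the part already crawled.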