--- Log opened Mon Mar 26 00:00:09 2012
00:33 < conseo> n8, cu later
00:41 < mcallan> n8 c
12:57 < conseo> mcallan: how would you name marked areas of the timeline? i thought of dots and lines. atm it is markers and slices
12:57 < conseo> ?
13:00 < conseo> the whole thing is a timeline? atm it is a "history"
18:49 < mcallan> conseo: maybe something like "message sequence" with "un-harvested segments" (or "uncached segments")
18:52 < mcallan> (i guess it's not really a timeline though.  it's made of messages not time, and it's a discontinuous sequence of atoms rather than a continuous line)
22:25 < conseo> i have an idea to simplify the marking stuff. since the first step of every crawl is the compilation of the list of pages with messages, we can simply save which of these sub-lists are already crawled
22:25 < conseo> we then only have to save the url for every month (for pipermail for example) which we already have crawled
22:28 < conseo> on a new crawl we omit these sub-lists. the smallest sub-list unfinished and including the present gets a marker for the newest continuously harvested message
22:30 < conseo> i will sketch that out tomorrow, that could simplify harvesting a lot. the only problem is that we might lose the number represented on one smallest sub-list page if we interrupt a harvest
22:30 < conseo> anyhow, i go to bed :-)
22:31 < conseo> (forget about "smallest" in "smallest sub-list unfinished")
22:33 < conseo> this needs a sequential order of the messages neither by id nor by date, only the list has to be static in the past
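[editor's note: a minimal sketch of the idea conseo describes above, not his actual design. All names here (HarvestState, pending_sublists, harvest_sublist, the fetch callback) are hypothetical, and it assumes each sub-list page (e.g. one pipermail month) lists its messages in an order that does not change once written.]

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class HarvestState:
    # URLs of sub-lists (e.g. one per pipermail month) already crawled completely.
    finished: set[str] = field(default_factory=set)
    # For unfinished sub-lists: URL -> newest continuously harvested message id.
    marker: dict[str, str] = field(default_factory=dict)


def pending_sublists(sublists: list[str], state: HarvestState) -> list[str]:
    """On a new crawl, omit every sub-list that is already marked as finished."""
    return [url for url in sublists if url not in state.finished]


def harvest_sublist(url: str, messages: list[str], state: HarvestState,
                    is_current: bool, fetch: Callable[[str], object]) -> None:
    """Harvest one sub-list, resuming after its marker if one exists.

    `messages` is the page's own (static) ordering of message ids; no ordering
    by id or date is assumed. `is_current` is True for the sub-list that
    includes the present and may therefore still grow.
    """
    last = state.marker.get(url)
    start = messages.index(last) + 1 if last in messages else 0
    for message_id in messages[start:]:
        fetch(message_id)
        # Advance the marker only after a successful fetch, so an interrupted
        # harvest can resume from the last message it really got.
        state.marker[url] = message_id
    if not is_current:
        # A past sub-list never changes again: remember only its URL.
        state.finished.add(url)
        state.marker.pop(url, None)
```

[in this sketch only the sub-list that includes the present keeps a per-message marker; every older sub-list is reduced to a single saved URL.]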
22:33 < conseo> n8
22:48 < mcallan> sounds interesting, i'll wait to read your design...
22:48 < mcallan> g'n8
--- Log closed Tue Mar 27 00:00:25 2012