Working Group Minutes/EWG 2013-12-09

From OpenStreetMap Foundation

Attendees

IRC nick        Real name
apmon           Kai Krueger
gravitystorm    Andy Allan
pnorman         Paul Norman
shaunmcdonald   Shaun McDonald
TomH            Tom Hughes
zere            Matt Amos

Summary

  • hack events
    • apmon observed that we can reach a different audience, possibly with more new people, by attaching hack events to conferences.
    • At the moment, we don't have the problem of preferring one type of event over another - none of the event funding for 2013 was claimed.
    • Main issue is getting the word out - zere is still on the hook for a blog post, and would appreciate any help.
  • osm2pgsql threading
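    • Threading branch cuts planet import time from 28.3 ksec to 17.8 ksec (about 3 hours saved, described in the meeting as 75% faster) on an SSD-backed machine (i.e. minimal I/O wait). These figures exclude index generation and clustering, which add roughly another 6 ksec.
    • There was a discussion about versioning: apmon reckoned a major version release (1.0.0?) would be appropriate once the partitioning branch is merged and stable, since it is a semi-incompatible change to the database layout.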
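
The benchmark figures pnorman quotes in the log (28.3 ksec before the threading branch, 17.8 ksec with it) can be converted into hours saved and relative change. This is a minimal sketch of that arithmetic; the function and key names are illustrative, and the log does not say how the 75% figure mentioned in the meeting was calculated.

```python
# Wall-clock planet import times quoted in the IRC log, in kiloseconds.
# Index generation and clustering (~6 ksec more) are not included.
BASELINE_KSEC = 28.3  # before the threading branch
THREADED_KSEC = 17.8  # with the threading branch, 6 processes


def import_speedup(before_ksec: float, after_ksec: float) -> dict:
    """Summarise the change between two import times (hypothetical helper)."""
    saved_ksec = before_ksec - after_ksec
    return {
        # Absolute time saved, converted from kiloseconds to hours.
        "hours_saved": saved_ksec * 1000 / 3600,
        # Fraction of the original run time that is no longer needed.
        "time_reduction": saved_ksec / before_ksec,
        # Extra work done per unit time relative to the baseline.
        "throughput_gain": before_ksec / after_ksec - 1,
    }


if __name__ == "__main__":
    for name, value in import_speedup(BASELINE_KSEC, THREADED_KSEC).items():
        print(f"{name}: {value:.3f}")
```

Time reduction and throughput gain are different measures of the same change, so it helps to state which one a percentage refers to when reporting benchmark results.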


IRC Log

17:33:22 <zere> minutes of the last meeting: http://www.osmfoundation.org/wiki/Working_Group_Minutes/EWG_2013-12-02
17:33:29 <gravitystorm> TomH or zere: could you insert the word "formerly" into the repo description for https://github.com/openstreetmap/mapnik-stylesheets please?
17:34:02 <zere> actions from previous meetings. gravitystorm i know added those issues to the code4osm repo, thanks!
17:34:40 <zere> gravitystorm: doesn't look like i'm an admin for that repo, sorry.
17:35:10 <zere> i have not yet started the blog post / diary entry for last weekend. apologies. any help would be most appreciated.
17:35:16 <TomH> gravitystorm: done
17:35:35 <zere> the 2014 plan / budget - i've stuck up a slightly edited version of last year's here: https://hackpad.com/EWG-Plan-2014-oK7RHSfDYN9
17:36:22 <zere> i've tried to include the stuff which got discussed in previous meetings: a widening of the remit to general dev discussion, focussing more on promoting the subsidy for hack events, etc...
17:37:17 <gravitystorm> TomH: muchios gratias
17:38:23 <zere> gravitystorm: any word on the hackday advice PR, OWL docs, etc...?
17:38:44 <gravitystorm> zere: plan/budget looks good. Personally I'll stick to the improving documentation though :-)
17:39:07 <gravitystorm> zere: no, no progress on any of those. :-(
17:39:25 <gravitystorm> I'm not sure if it's worth tracking my lack of effort further until some effort materialises
17:39:46 <apmon> I wonder if it would be sensible to restrict / encourage the use of funds for hackweekends if they are linked to bigger conferences (like e.g. SotM)
17:40:05 <zere> just think of us if you find yourself hanging around with nothing to do over xmas ;-)
17:40:13 <apmon> As those are the times it is most likely to attract new developers
17:40:47 <zere> since the target number is > the number of bigger conferences, i don't think restricting it makes sense.
17:40:54 <zere> but encourage, sure.
17:41:02 <gravitystorm> zere: when I run out of steam on openstreetmap-carto, then code4osm, owl and keepright are all somewhere close behind
17:41:05 <apmon> for the usual hackweekend crowd, it is more likely to find "sponsored venues" i.e. office space of one of the usual devs
17:41:38 <gravitystorm> I'm with zere - when we've supported 11 hack weekends, we can start worrying about overspends :-)
17:42:14 <zere> sure - the point isn't to spend the money, the point is to have the events. the money is there to help that happen. if someone only needs $250, then that's $250 to spend on another event.
17:42:32 <apmon> I would be surprised if there aren't way more than 10 geo / opensource / map relevant conferences
17:42:36 <zere> i think the big hurdle is getting the message out there and encouraging people who haven't done it before to do it.
17:43:22 <zere> apmon: sure. but to start this, it takes people wanting to run these events. that's what we're lacking.
17:44:05 * pnorman waves
17:44:08 <zere> there was one after sotm, one after sotm-us. that's great. i hope there are more, and that they can get some help, if it's not already sponsored.
17:44:25 <zere> i think the $500 is probably of more use when there are no sponsors
17:45:03 <zere> and, as gravitystorm says, when we have 11 applicants (rather than zero) then it's worth having the discussion about which applicants should have higher priority.
17:45:32 <apmon> It is not just about paying though, but also about outreach
17:46:02 <apmon> I.e. trying to encourage people to do these events related to conferences they go to anyway
17:46:16 <shaunmcdonald> There was also a hack day after the sotm Scotland this year.
17:46:41 <zere> shaunmcdonald: any issues arranging it or paying for stuff?
17:47:08 <apmon> If they are direct OSM conferences, finding sponsors for a hackday might not be as difficult.
17:47:13 <shaunmcdonald> No issues as the venue sponsored it, and then someone else sponsored the food/drink.
17:47:28 <zere> apmon: absolutely. that's where we failed this year - the money was available, but there was no outreach, no attempt to encourage people to run the events and take the money.
17:47:51 <apmon> But if you try and attach it to other conferences, e.g. geo conferences, or wikimania, or general OS conferences, or CCC, it might be harder
17:48:10 <zere> shaunmcdonald: awesome. were you organising it, or someone else on the sotm scotland committee?
17:49:07 <shaunmcdonald> zere: It was Bob Kerr who organised. I wasn't on the committee, I just recorded/uploaded the video.
17:50:16 <zere> indeed. but i think it targets different audiences. the "internal" hack days pretty much target a geographic audience - they don't get much publicity outside the city / region they're organised in. attaching to conferences might attract a more geographically diverse crowd. but first it takes someone to step up to run it...
17:50:34 <shaunmcdonald> I'll let the folks know tomorrow at the Edinburgh pub meet about the possibility of funds to run one.
17:50:45 <zere> cool, thanks :-)
17:51:03 * zere wonders how often shaunmcdonald "pops up" to edinburgh for a pint.
17:52:02 <zere> apmon: how many of those conferences do you go to? i must admit, my annual conference schedule doesn't extend much beyond SOTMs of various stripes.
17:52:32 <zere> presumably conferences like wikimania already have attached hack events for wikipedia-related things?
17:52:37 <shaunmcdonald> Also people are unlikely to travel huge distances to hack days or pub meets, unless there's something else happening too.
17:53:06 <zere> you say that, but the one we had last weekend had 2 international travellers.
17:53:07 <shaunmcdonald> zere: it's the first time I've been in a couple of years I think.
17:54:21 <apmon> zere: None, as I don't work in that industry
17:54:39 <shaunmcdonald> zere: did they come for just the hack day, or for other things as well, i.e. I wouldn't go to a Toronto hack day just for the hack day, rather going for something else while I'm there.
17:54:45 <apmon> but if someone knows of conferences in Denver / Colorado that are relevant, I might be able to help
17:55:07 <zere> shaunmcdonald: one for just the hack day, another combined with a short sightseeing break.
17:55:21 <apmon> shaunmcdonald: Yes, that is why I would like to link it to events, where a lot of people travel to anyway
17:55:55 <zere> there's quite a start-up scene in denver, isn't there? might be interesting to see if we can tap into that... but i must admit i don't know what events they organise.
17:56:47 <apmon> Even in Boulder (where I actually live, which is 30 miles from Denver), there are a lot of start-ups
17:57:05 <apmon> but I also don't really know how to tap into that scene
17:57:06 <shaunmcdonald> So relatively low proportions of people traveling, though some will do.
17:58:12 <zere> indeed. i don't think one can compare the number of people travelling for the hack day last weekend with the number who travel for conferences.
17:58:56 <zere> the trick seems to be identifying the conferences which don't already have some sort of competing hack event, and finding someone who's going and willing to organise an OSM one.
17:59:32 <zere> anyone willing to undertake a bit of research?
18:01:17 <apmon> It seems that would be something in general OSMF should do. Try and identify conferences for OSM outreach.
18:01:52 <apmon> Once we have identified relevant conferences, we can try and find volunteers who are willing to "present" OSM or do hackdays
18:02:46 <zere> yup, why do you think we're talking about it at an OSMF EWG meeting? anyone within OSMF who's going to do it is here.
18:03:01 <apmon> I've seen this happen in Germany. Where people for example call for volunteers to man a stand at events like CeBIT or other tradeshows
18:03:42 <apmon> i.e. to try and bring those people together who know about those events and those who are local and can help out, but would otherwise not know about them.
18:04:27 <gravitystorm> apmon: yes. So we need to first identify the relevant conferences, right?
18:04:39 <apmon> Perhaps we can start with a simple email to talk. Ask if anyone knows about good candidates
18:05:21 <gravitystorm> apmon: go for it
18:05:28 <apmon> OK, will do.
18:06:40 * pnorman wants to put osm2pgsql threading on the agenda
18:07:02 <zere> ok. anything more to discuss on the topic of hack events?
18:08:18 <zere> i'll take that as a "no". if we do, then we can circle back at the end.
18:08:21 <apmon> not from my side.
18:08:27 <zere> #topic osm2pgsql threading
18:08:31 <gravitystorm> gah, I had one thing but I'll wait :-)
18:08:40 <zere> pnorman: you have some benchmark results? ;-)
18:09:23 <pnorman> Yes - on top-end hardware, 6 processes, threading branch shaves about 3 hours off of a planet import
18:10:00 <apmon> That sounds like a decent amount :-)
18:10:17 <zere> that's 3 hours off... 18? 24?
18:10:30 <pnorman> 3 off of 7
18:10:33 <gravitystorm> pnorman: do you have any setup for benchmarking updates?
18:10:51 <pnorman> 28.3 ksec to 17.8 ksec
18:11:18 <pnorman> note: does not include index generation or clustering, which adds another ~6 ksec
18:11:31 <zere> cool. 75% faster is pretty exciting.
18:11:38 <apmon> Yes, indeed :-)
18:11:53 <pnorman> as in a way to report the results? not really. I have piles of data though!
18:12:10 <apmon> The improvement is likely much less on spinning rust, but a lot of people are moving over to SSDs now, so it seems well worth it
18:12:29 <pnorman> some of the stuff is going to go into github tickets where it's resulting in a proposed change
18:13:03 <zere> this isn't exactly a crazy machine - hetzner ex40-ssd, 70 EUR per month. it's not cheap, but it's not expensive either
18:13:04 <apmon> I have a bunch of changes done to the clustering / indexing that I will hopefully commit soon to the threading branch based on your results
18:13:47 <pnorman> zere: for osm2pgsql imports this is about the fastest machine money can buy, unless you go excessive and have an even wider RAID0 array
18:14:40 <zere> well... be careful now...
18:14:44 <apmon> I did some tests on the amazon 244GB ram instance and put everything on ramdisks. I think even that wasn't much faster
18:14:55 <pnorman> zere: haswell is measurably faster than SB or IVB, and osm2pgsql doesn't really scale to 16 threads, so the LGA2011 CPUs aren't significantly better
18:15:37 <zere> presumably a machine with 784GB RAM would be faster, or have we reached the level where osm2pgsql is CPU-bound rather than disk-bound?
18:15:58 <apmon> It is pretty much CPU bound if you have a raid of SSDs
18:16:17 <zere> the index generation and clustering are surely disk bound... unless postgres is spending most of its time in geos?
18:16:29 <pnorman> it's CPU bound, more specifically low thread CPU bound
18:16:48 <apmon> and for some reason the threading branch doesn't scale too well beyond 2 - 6 processes.
18:16:55 <pnorman> zere: index generation is single threaded per table, and there are <=8 tables
18:16:56 <zere> any idea what the bottleneck is?
18:17:05 <apmon> but still, a 75% speed is not bad even if it doesn't scale to higher CPU counts
18:17:11 <apmon> no.
18:17:12 <zere> pnorman: theading branch for postgres! ;-)
18:17:21 <apmon> It uses up more CPU time, but doesn't get more stuff done
18:17:23 <zere> s/thead/thread/
18:17:34 <pnorman> apmon: I seem to be getting results that are better when on real hardware, but haven't scaled up to 8 yet
18:17:58 <apmon> So it might be some spin lock wasting CPU time, but no good idea where or how to test or reduce that.
18:18:14 <zere> have you tried running through perf?
18:18:34 <apmon> Ah, good. That would indicate overhead in locks, which are possibly less efficient on virtualised hardware
18:19:00 <pnorman> apmon: in fact, it seems to be *more* efficient on 6 threads than 4 threads when it comes to relations
18:19:01 <zere> that should at least tell you whether you're keeping the CPU busy with real work, or stalls or cache misses or spinlocks.
18:19:47 <apmon> The profiles I have tried, never gave me interpretable results, but I don't really know how to read the lower level stuff. (beyond call graphs)
18:20:03 <apmon> So if anyone else wants to have a look at it, it would be great
18:20:03 <zere> once you've got perf results, then it might be time to instrument some of the code.
18:20:11 <zere> awesome. send me a pastebin or something
18:20:21 <pnorman> Anyways, what everyone cares about, has the output changed
18:20:36 <zere> i'm sadly quite used to digging crap out of gprof output...
18:20:39 <apmon> Yes, that is the big question. I.e. is it safe to merge that branch...
18:20:52 <pnorman> and if you give me 30 seconds i'll copy/paste and compare :)
18:21:25 * zere mumbles something about unit tests being really useful ;-)
18:21:46 <pnorman> passes those, but they're not comprehensive
18:21:47 <apmon> What is the best way to ensure there are no seldom timing / synchronisation issues that corrupt data
18:22:07 <apmon> they also don't work against thread level non-deterministic issues
18:23:13 <apmon> The integration tests aren't perfect, but I would consider them reasonably comprehensive by now.
18:23:19 <zere> yeah, sure. depends on the window for the timing issue. i've written some tests around timing issues, but the window was huge (i.e: insert a sleep(1) to hit).
18:24:12 <pnorman> all table stats check out - this is row counts, sums of way_area, st_area(way), st_perimeter(way) on polygons; st_length on line/roads; and for slim tables, row counts and array length of nodes, tags, parts, and members
18:24:25 <pnorman> md5sum for flat nodes.
18:24:37 <zere> that seems pretty comprehensive
18:24:59 <apmon> have you done any tests on the diff imports?
18:25:46 <apmon> That is where I would guess is still most likely that there are bugs we haven't found yet
18:26:56 <apmon> But perhaps I will just merge it, with a warning to dev of these big changes and a recommendation to stick to the version 0.84 if it is production critical
18:26:57 <pnorman> it's worth noting that the tables are in a different order on disk and this *does* have performance implications for an unclustered table, as well as clustering time, but the performance changes go away with clustering, and the clustering time changes are much much less than the speedups to importing
18:27:18 <pnorman> No checks on diff imports. It's not *supposed* to change those, is it?
18:27:29 <apmon> They are threaded as well
18:27:45 <apmon> as they use the same threaded code paths
18:27:45 <pnorman> ah, hmm. well, time to prepare some diffs
18:27:55 <shaunmcdonald> apmon: isn't that what versioning is for? So that people can hold back releases and know roughly how much change there is?
18:27:57 <pnorman> Open the PR and I'll have some results in 1-2 days
18:28:06 <apmon> shaunmcdonald: Indeed
18:28:19 <apmon> just that osm2pgsql hasn't been great with versioning in the past
18:28:35 <apmon> So I don't know how many people actually use the versioned snapshots vs master
18:28:58 <apmon> switch2osm.org still recommends to simply checkout master if I am not mistaken
18:28:59 <pnorman> osm software, as a general rule, doesn't have useful versioning
18:29:15 <gravitystorm> pnorman: that's not an argument for ignoring versioning
18:30:00 <pnorman> as the developer, no; as a consumer, yes
18:31:05 <gravitystorm> but in this case, the bigger problem is the level of uncertainty around merging things into master. The thing to focus on is to figure out what can improve the situation so that it's more likely to work when merged, or where apmon has more confidence so that he can merge it and tag it as a new release
18:32:07 <gravitystorm> (and as a minor thing, version numbers are free, so I always encourage people to bump the major version. osm2pgsql is still on 0! )
18:33:35 <zere> sure, but then you have to be careful to say to people whether a major-0 state & db is usable as-is with major-1, or if there's some upgrade process, or if it's a case of throwing it away and starting over with a new import.
18:33:41 <apmon> I guess psychologically it matters if it has a version number of 0 or not... ;-)
18:33:43 <gravitystorm> so as a side note, we should action updating switch2osm to say use a release, rather than master
18:34:09 <apmon> yes
18:34:44 <apmon> zere: So perhaps once the partitioning branch goes in and becomes stable, it would be a good candidate to increase the major version number
18:35:03 <apmon> as that is a semi incompatible change to the database layout
18:35:56 <zere> yup. i'd generally consider a major version bump to mean incompatibility of some sort. although the 0 -> 1 bump is always a bit of a grey ara.
18:36:10 <zere> s/ara\./area./
18:36:43 <apmon> Sounds like a potential plan.
18:36:56 <apmon> Merge the threading branch into master soon. With a note to dev
18:37:09 <apmon> Wait a while to see if any reported problems emerge
18:37:20 <apmon> rebase the partitioning branch
18:37:36 <apmon> and once that is declared sufficiently stable, tag a 1.x release
18:37:40 <zere> cool :-) is there anything more on this, or do we have time to quickly circle back to gravitystorm's comment on the hack event?
18:37:54 <apmon> sounds good to me
18:38:05 <zere> #topic hack events part II
18:38:41 <gravitystorm> My only comment re hack events was to make sure that we have somewhere (e.g. on the osmf EWG page) a bit of blurb saying that we provide financial assistance for them
18:39:02 <gravitystorm> make it slightly less if-you've-read-the-minutes-in-the-locked-cabinet wrt getting the word out
18:39:05 <zere> yup
18:39:06 <shaunmcdonald> From 1.0 could switch to z.x.y naming scheme where z is database incompatible. y is significant change that may affect stability etc.
18:39:21 <zere> there's still an action on me to write that. any help would be appreciated.
18:40:13 <zere> was planning to put it out as a blog post attached to something about EWG in general. although now it might be better attached to the hack weekend write up.
18:40:58 <zere> in either case, it's blocking on me. but i'd be happy if anyone wanted to assist or take over.
18:42:09 <pnorman> andy, good to see merges catching up :)
18:44:02 <zere> finally, is there any other business anyone wants to discuss? preferably very quickly ;-)
18:46:02 <zere> awesome. thanks to everyone for coming!
18:46:10 <zere> hope to see you next week :-)
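
The z.x.y naming scheme shaunmcdonald suggests in the log, where a change in z signals a database-incompatible release, could be checked mechanically before an upgrade. The sketch below is an assumption about how such a check might look, not anything osm2pgsql provides: the function names are hypothetical, and the three-part version strings are made up for illustration (the log only mentions version 0.84 and a possible 1.x release).

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'z.x.y' string into three integers (hypothetical helper)."""
    z, x, y = (int(part) for part in version.split("."))
    return z, x, y


def database_compatible(old: str, new: str) -> bool:
    """Under the suggested scheme, only a change in z breaks the database."""
    return parse_version(old)[0] == parse_version(new)[0]


if __name__ == "__main__":
    # Illustrative version numbers only; not real osm2pgsql releases.
    print(database_compatible("1.0.0", "1.2.3"))   # same z: existing database usable
    print(database_compatible("0.84.0", "1.0.0"))  # z changed: expect an upgrade step or a re-import
```

As zere notes in the discussion, a check like this only says that something is incompatible; the release notes would still need to state whether an upgrade path exists or a fresh import is required.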