Working Group Minutes/EWG 2014-02-10

Attendees

IRC nick        Real name
apmon           Kai Krueger
pnorman         Paul Norman
RichardF        Richard Fairhurst
shaunmcdonald   Shaun McDonald
TomH            Tom Hughes
zere            Matt Amos

Summary

  • Routing branch
    • zere reported benchmarking results for OSRM and GraphHopper on the full planet. Some features of the results still need explaining, notably why OSRM's graph has so many more edges than GraphHopper's.
    • ACTION RichardF to finish "a couple of little things [he needs] to finish"
    • There was some discussion of how to present the various routing backends in the UI and in the code, but no consensus was reached.

IRC Log

17:32:05 <zere> minutes of the last meeting: http://www.osmfoundation.org/wiki/Working_Group_Minutes/EWG_2014-02-03 - please let me know if anything needs changing.
17:32:16 <zere> #topic previous actions
17:32:47 <zere> there was one on me to do some benchmarking of OSRM vs GraphHopper - got some interesting results, which i'll talk about in a bit.
17:32:57 <zere> apmon, any word from CM?
17:35:17 * pnorman waves
17:38:35 <zere> umm... ok, i guess apmon can give us an update later.
17:38:41 <zere> #topic routing benchmarking
17:38:55 <zere> some very interesting, and puzzling, stuff here. headline numbers first:
17:39:01 <apmon> zere: No, didn't engage with them yet.
17:39:58 <zere> apmon: ok, thanks. i'll hold it over for next week, then.
17:40:44 <zere> OSRM: "extract" took 160 mins, 16GB. "prepare" took 783 mins, 112GB (note that 783 mins included a lot of time spent swapping).
17:41:08 <zere> graphhopper: 253 mins, 34GB for everything.
17:41:08 <pnorman> How much RAM?
17:41:24 <zere> peak was 112GB for OSRM.
17:41:33 <pnorman> No, how much RAM did the machine have?
17:41:38 <apmon> This is for the full planet?
17:41:44 <zere> yup
17:41:52 <zere> default car routing profiles for both.
17:42:07 <apmon> There goes that theory of Java is memory inefficient and slow! ;-)
17:42:15 <zere> so i thought "ok, we knew OSRM took a lot of memory to generate a very efficient graph"
17:42:35 <zere> so i ran about 21k routes through both, with a profile like:
17:43:23 <zere> 1% distance 3.3km, 5% distance 7km, median 70km, 95% 346km, 99% 802km.
17:43:40 <zere> so these aren't short routes, but hopefully somewhat representative.
17:43:55 <zere> response timings (both with geometry on, instructions off):
17:44:34 <zere> OSRM: 1% 6.5ms, 5% 7.8ms, median 13.6ms, 95% 22.3ms, 99% 27.9ms.
17:44:47 <apmon> Those are percentiles?
17:44:52 <zere> yup
17:45:23 <zere> graphhopper: 1% 1.1ms, 5% 1.7ms, median 5.4ms, 95% 12.1ms, 99% 17.2ms
17:45:47 <apmon> wow, those are some impressive results for graphhopper
17:45:54 <pnorman> what kind of concurrency?
17:46:03 <zere> i did some work there too.
17:46:47 <zere> this is number of client threads: total time (with total number of requests constant)
17:47:27 <zere> OSRM: 1: 296s, 2: 121s, 4: 52s, 8: 27.2s, 16: 24s.
17:47:46 <zere> this is also despite there being only 8 cores available.
17:48:19 <zere> graphhopper: 1: 134s, 2: 41s, 4: 19s, 8: 16s, 16: 18s.
17:48:54 <pnorman> interesting that OSRM gets faster going from 8 to 16 on an 8 thread machine
17:49:15 <zere> interestingly, both show super-linear speedup. so there must be some well-parallelised paths, or possibly advantage from the CPU caches.
17:49:48 <apmon> Those are great numbers
17:49:53 <zere> yes, i suspect 8 clients weren't actually saturating 8 cores, given the networking & client-side overhead.
17:50:07 <pnorman> routing is apparently a problem that's easy to run in parallel, which these results back up
17:50:57 <zere> yup, and on top of that obviously one could run multiple servers in parallel.
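
As a rough check of the "super-linear speedup" remark, the client-thread timings quoted above can be turned into speedup and parallel-efficiency figures. This is only a sketch: the numbers are copied from the log, and the `speedups` helper is a name made up for illustration. An efficiency above 1.0 means the run was more than N times faster than the single-client run.

```python
# Total wall-clock seconds per number of client threads, copied from the log.
timings = {
    "OSRM": {1: 296.0, 2: 121.0, 4: 52.0, 8: 27.2, 16: 24.0},
    "GraphHopper": {1: 134.0, 2: 41.0, 4: 19.0, 8: 16.0, 16: 18.0},
}


def speedups(series):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run."""
    base = series[1]
    return {n: (base / t, base / t / n) for n, t in series.items()}


for engine, series in timings.items():
    for n, (speedup, efficiency) in speedups(series).items():
        flag = " (super-linear)" if efficiency > 1.0 else ""
        print(f"{engine:12s} {n:2d} clients: "
              f"speedup {speedup:5.2f}, efficiency {efficiency:4.2f}{flag}")
```

On these figures both engines stay above an efficiency of 1.0 up to 8 clients and drop below it at 16, which fits the observation that only 8 cores were available.
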
17:51:08 <apmon> Either way, it looks like performance is likely not an issue with either, as long as there is enough ram
17:51:30 <apmon> If the quality of routes is comparable, it looks like Graphhopper is the clear winner though
17:51:50 <apmon> It would be nice to get an overview of OSM tags supported by both
17:52:23 <zere> of the 21k routes, 18918 were found by both, 428 by OSRM only, 772 by graphhopper only and 984 by neither.
17:52:26 <apmon> i.e. do they support turn restrictions, via ways, bollards, maxspeed, access=, country specific defaults, ...
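
The route counts quoted above can be summed to recover the size of the test set and the share of routes each engine found. A minimal sketch using only the figures from the log; the variable and function names are illustrative.

```python
# Route outcomes for the test set, copied from the log.
found_by_both = 18918
osrm_only = 428
graphhopper_only = 772
neither = 984

total = found_by_both + osrm_only + graphhopper_only + neither


def share(count):
    """Percentage of the whole test set represented by `count` routes."""
    return 100.0 * count / total


osrm_found = found_by_both + osrm_only
graphhopper_found = found_by_both + graphhopper_only

print(f"total routes:      {total}")
print(f"OSRM found:        {osrm_found} ({share(osrm_found):.1f}%)")
print(f"GraphHopper found: {graphhopper_found} ({share(graphhopper_found):.1f}%)")
print(f"found by neither:  {neither} ({share(neither):.1f}%)")
```
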
17:52:58 <pnorman> So, graphhopper is about 75% faster, takes less time to prepare on a machine with <128GB RAM, and takes substantially less RAM
17:53:00 <zere> this is pretty readable, i think: https://github.com/DennisOSRM/Project-OSRM/blob/master/profiles/car.lua
17:53:45 <pnorman> OSRM's support is good, and I know RichardF uses a complicated profile for cycle.travel
17:53:47 <zere> equivalent for GH: https://github.com/graphhopper/graphhopper/blob/master/core/src/main/java/com/graphhopper/routing/util/CarFlagEncoder.java
17:54:09 <zere> GH's is (imho) less readable and, quite clearly, less flexible.
17:54:37 <RichardF> yep, OSRM is largely unrestricted in what you can do via a Lua profile, though it doesn't (yet) support relations out-of-the-box.
17:55:03 <pnorman> How is turn restriction support when using ways as via?
17:55:04 <zere> what confuses me is that the graphs are wildly different sizes
17:55:49 <zere> the input OSM file (by a quick script) contains 813 million "edges", of which 158 million are highway edges.
17:56:25 <zere> OSRM claims 487 million, then expands that to 1.1 billion, then to 2.7 billion.
17:56:53 <RichardF> pnorman: OSRM: not implemented yet, promised for "not so distant future". I don't think GH does turn restrictions at all yet, but it's under development in a branch
17:56:54 <zere> (with the extra edges coming from restriction support, CH-contraction, i think)
17:57:07 <RichardF> (obviously I don't really care that much about turn restrictions as I'm a cyclist)
17:57:33 <zere> GH, otoh, starts with 125 million edges and expands to 221 million edges.
17:58:03 <apmon> that is a huge difference in edges
17:58:04 <pnorman> Looks correct about GH not supporting turn restrictions, no relevant matches for 'only' in the file.
17:58:05 <zere> so if OSRM is at a 10x disadvantage in number of edges, then that would explain much of the surprise in the benchmark results.
17:58:30 <zere> so the question i haven't been able to answer yet is: why so many edges, OSRM?
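
To put the edge counts in this discussion side by side, the ratios can be computed directly. This sketch uses the rounded figures quoted in the log, so the ratios are approximate; the dictionary keys and the `ratio` helper are illustrative names only.

```python
# Edge counts quoted in the log (rounded as given there).
edges = {
    "osm_all": 813_000_000,         # all "edges" in the input OSM file
    "osm_highway": 158_000_000,     # highway edges in the input OSM file
    "osrm_initial": 487_000_000,    # what OSRM reports at first
    "osrm_expanded": 1_100_000_000,
    "osrm_final": 2_700_000_000,
    "gh_initial": 125_000_000,      # GraphHopper starting point
    "gh_final": 221_000_000,
}


def ratio(a, b):
    """How many times larger edges[a] is than edges[b]."""
    return edges[a] / edges[b]


print(f"OSRM final vs GraphHopper final:  {ratio('osrm_final', 'gh_final'):.1f}x")
print(f"OSRM initial vs OSM highway edges: {ratio('osrm_initial', 'osm_highway'):.1f}x")
print(f"OSRM initial vs GraphHopper initial: {ratio('osrm_initial', 'gh_initial'):.1f}x")
```
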
17:58:46 <apmon> Could you write up those numbers and post them to the OSRM and Graphhopper mailing lists?
17:58:57 <apmon> To see what their response and explanations are?
17:59:08 <zere> yeah, was planning a blog post about it.
17:59:11 <pnorman> Doesn't osmosis have the ability to turn turn restrictions into fake oneways?
17:59:36 <zere> also wanted to ping DennisL to see if i'd done something stupid with OSRM, but he seems to be away or something.
17:59:53 <pnorman> no, your numbers look reasonable with OSRM. it's a huge memory hog
17:59:59 <RichardF> zere: some commit activity in #osrm today so I guess he might be around
18:00:41 <zere> pnorman: sure, from the number of edges - but i wanted to know why the number of edges starts so high (4x the number i see in the OSM file)
18:01:00 <zere> not really accounted for by the 230k restrictions mentioned in the log.
18:01:25 <zere> unless restrictions end up having some sort of near combinatoric effect
18:03:08 <pnorman> what can we conclude, as it stands, to pass on as advice about server selection?
18:03:13 <apmon> zere did you use a git snapshot of graphhopper, or a "released version"?
18:03:56 <apmon> Do we have a spare server with 64GB of ram lying around?
18:04:59 <pnorman> zere: how much ram did your test machine have?
18:08:23 <zere> 64GB
18:08:38 <zere> apmon: i used git master for both OSRM and GH.
18:09:13 <zere> and i'm renting a 64GB server for this. happy to do some other benchmarking if there's something you think would be interesting.
18:09:31 <pnorman> 1) OSRM requires a machine with 128GB ram for reasonable processing times
18:09:58 <zere> what it means for running our own servers, though, is that OSRM will clearly require something close to 192GB to process future planets. graphhopper might be ok with 96GB or even 64GB.
18:10:07 <apmon> There are commits in GH in the last 18 days of the form "turn restrictions configurable" https://github.com/graphhopper/graphhopper/commit/bd6d68ccc5c89dc1517bfbc1644d3da3b5ca3943
18:10:22 <apmon> So there definitely seems to be work on turn restrictions if they aren't supported already
18:12:12 <zere> i dunno - the next commit renames it to "turnCosts", which could be something quite different
18:12:12 <apmon> I wonder if it is possible to start with graphhopper and a less powerful server, but keep in mind the potential need to upgrade ram
18:12:31 <apmon> turnCosts are a superset
18:12:50 <apmon> i.e. you can configure that turning left is more expensive than right, (or the opposite in the UK)
18:12:58 <apmon> a turn restriction then has infinite cost for that turn
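
The idea that turn costs are a superset of turn restrictions can be sketched as follows. This is not GraphHopper's implementation: the table, the way and node identifiers, and the function names are all hypothetical, chosen only to show a restriction expressed as an infinite turn cost.

```python
import math

# Hypothetical turn-cost table keyed by (from_way, via_node, to_way).
TURN_COSTS = {
    ("a", "n1", "b"): 5.0,       # an ordinary but slow turn, e.g. across traffic
    ("a", "n1", "c"): math.inf,  # a turn restriction: this turn is never allowed
}


def turn_cost(from_way, via_node, to_way, default=0.0):
    """Cost of turning from one way onto another at a node."""
    return TURN_COSTS.get((from_way, via_node, to_way), default)


def path_cost(ways, via_nodes, edge_costs):
    """Sum edge costs plus the turn cost at each junction along the path."""
    total = sum(edge_costs[way] for way in ways)
    for (way_in, way_out), node in zip(zip(ways, ways[1:]), via_nodes):
        total += turn_cost(way_in, node, way_out)
    return total


edge_costs = {"a": 10.0, "b": 20.0, "c": 5.0}
print(path_cost(["a", "b"], ["n1"], edge_costs))  # allowed, but penalised
print(path_cost(["a", "c"], ["n1"], edge_costs))  # restricted, cost is infinite
```

A shortest-path search that minimises this cost will never choose a path with infinite cost while any finite alternative exists, which is how a restriction falls out of the more general cost model.
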
18:13:48 <pnorman> well there's an easy way to test this: test a turn restriction on http://graphhopper.com/maps/
18:16:01 <pnorman> give me a minute and I'll check
18:16:59 <pnorman> http://graphhopper.com/maps/?point=49.222036%2C-122.978398&point=49.221812%2C-122.978967&locale=en-GB happily ignores http://www.openstreetmap.org/relation/1820201
18:18:04 <shaunmcdonald> pnorman: yup, same here http://graphhopper.com/maps/?point=52.056152%2C1.165967&point=52.056495%2C1.164787&locale=en-GB where it ignores the turn restriction.
18:18:08 <pnorman> does anyone know if the plugin for the --induce-ways-for-turnrestrictions osmosis task works?
18:18:32 <shaunmcdonald> I've never tried to use it.
18:20:37 <apmon> Are there funds available to buy a server?
18:21:45 <zere> that rather depends on what the server will be running, and whether there's a need to even buy one.
18:21:58 <pnorman> https://github.com/graphhopper/graphhopper/pull/55#issuecomment-34197911 is the latest info on GH turn restrictions
18:23:26 <pnorman> We seem to be somewhat wandering - what can we wrap up with?
18:28:44 <apmon> Imho, turn restrictions are hugely important, as one of the (my) hopes of adding routing to osm.org is to help promote advanced routing tagging like turn restrictions by making them visible to more mappers
18:29:00 <apmon> So clearly if the router doesn't support them, that goal would not be achieved
18:29:07 <zere> pnorman: i think apmon's suggestion to write all this up and post it to the OSRM & GH mailing lists is a good one. i'll try and get that done. it looks like when GH lands turn restrictions, there'll be some more testing to do.
18:29:27 <pnorman> if you get time and the plugin works, can you try with --induce-ways-for-turnrestrictions?
18:29:29 <apmon> Nevertheless, given that we do seem to have permission from all of the backend providers to use their servers in osm.org
18:29:50 <zere> well, not so far from CM or OSRM.
18:30:04 <apmon> it might not be a bad idea to start with having a graphhopper instance on osmf hardware, include all of the other engines in the interface using their backend
18:30:22 <apmon> plan for the server to be expandable to support osrm (if we need to buy a server)
18:30:33 <apmon> and then reevaluate after a while of practical experience
18:30:34 <pnorman> zere: OSRM gave permission
18:30:43 <zere> first, we need the software. RichardF: what's left for people to help out with?
18:31:00 <zere> pnorman: they did? oh, cool.
18:31:28 <RichardF> zere: permalinks mostly - apmon and TomH were talking about that yesterday.
18:32:20 <apmon> One thing I am not sure is ideal, is the mix of routing backend with routing mode
18:32:36 <apmon> i.e. "Car (OSRM)", "bike (Graphhopper)"
18:32:38 <RichardF> apmon: in code or UI terms?
18:32:46 <apmon> both! ;-)
18:32:52 <RichardF> two different questions ;)
18:33:07 <zere> yeah, perhaps would be better to select the mode (car, bike, etc..) then see a list of available backends for that type.
18:33:09 <RichardF> no
18:33:13 <apmon> indeed, I was going to "complain" about both though ;-)
18:33:17 <RichardF> in UI terms, it's the same as we do with the layer switcher
18:33:32 <RichardF> if we were to add (say) my cycle.travel map to osm.org
18:33:38 <RichardF> (which I have absolutely no ambition to do ;) )
18:33:43 <pnorman> zere: I was about to suggest that
18:33:45 <RichardF> then we wouldn't say "first select cycle map, then select the one you want"
18:34:15 <TomH> it presents the same problem that we had with the map layers though - to a non-osm expert WTF is "OSRM" or "GraphHopper"
18:34:19 <apmon> If we had many different cycle maps, it would be a more logical choice
18:34:35 <zere> i was thinking [car][bike][foot] buttons, then underneath that (and much smaller) "Provider: (drop-down) [what does this mean]"
18:34:37 <apmon> particularly if the differences are not stylistic, but technical that most users don't understand
18:34:39 <TomH> we tried to change the names of the layers to get away from those technical details
18:35:00 <apmon> the engine selection could even be hidden behind an "advanced" link
18:35:01 <zere> then the "what does this mean" can go to a page which explains that OSM distributes open data in a totally awesome way which means that there's more than one choice.
18:35:33 <RichardF> if you want to do it that way, feel free to write a patch. I personally think it'd be a retrograde step but I'm not _that_ fussed
18:35:59 <RichardF> on the second question - code - yes, there should be common OSRM response-parsing code which all the OSRM engines could call, and so on
18:36:02 <zere> apmon: hiding it would be great if we were competing with google, but we want people to wonder why there are multiple options and click the link giving them more information.
18:36:02 <apmon> your selection so far also only has a small subset of combinations
18:36:35 <pnorman> How many combinations of mode-engine do we have?
18:36:35 <TomH> from a UI point of view we also need to find a better way of activating it - the little link above the search box is horrible
18:36:37 <RichardF> it actually says that in a todo comment in the engine code, but I didn't see it as remotely a priority for v1
18:36:48 <apmon> Yes, it is one of those tradeoffs between usability and flexibility. I'd be fine with either
18:36:50 <TomH> and the "where am I" should go back where it is now
18:37:20 <RichardF> TomH: strictly speaking the "where am I" should go to /dev/null because it's confusing functionality that hardly anyone uses ;)
18:37:29 <RichardF> "oh look, a geolocation button... oh no wait"
18:37:59 <pnorman> I thought my log analysis from the last version of the site showed it was used
18:38:56 <apmon> The "where am I" functionality should be moved into the routing parts. Currently if you move the start marker, it only shows the coordinates. It probably should geolocate those coordinates into a human readable address
18:39:15 <RichardF> pnorman: not to any significant degree IIRC. I suspect if you put a "show me my horoscope" button on the front page it'd get as many clicks
18:39:42 <RichardF> apmon: yeah, it should. but I think we're in danger of bikeshedding a v1 into a v10 here
18:39:57 <TomH> RichardF: you can take it away if you promise to deal with all the complaints ;-)
18:40:09 <RichardF> TomH: do you think me dealing with complaints is a good idea? :)
18:41:15 <TomH> depends how much Black Rat you've had ;-)
18:41:46 <apmon> If we can get things deployed faster I don't mind leaving those things to v2 or later
18:41:47 <RichardF> ah, well the Rose & Crown is currently selling Black Dragon (no relation), which is a whole bunch tastier (so I drink more of it) and slightly stronger.
18:42:04 <apmon> But if deployment is blocked on something else, we might as well try and make V1 as good as possible
18:42:47 <RichardF> apmon: as ever, defining "we" is the key. :) I'm not planning to change the scope of the current code in any way, but if others want to add more stuff, go for it
18:42:53 <pnorman> I don't think deployment is blocked on anything either
18:43:15 <apmon> And how well the v2..V10 model works is also up for grabs. E.g. I never went back and improved the Notes code with the various feature improvements people had in mind...
18:44:01 <apmon> It might be safer for me to write patches for those things than to mess around with the permalink code
18:45:38 <apmon> Do we have concrete action items to move things forward?
18:46:31 <zere> the only action so far is on you to contact CM.
18:46:41 <zere> RichardF: do you think the code is ready for review?
18:46:54 <RichardF> no, there's a couple of little things I need to finish
18:47:28 <RichardF> I hope to get to them this week, though I'm crazy busy atm
18:47:33 <zere> ok :-)
18:47:44 <pnorman> I had a quick item
18:47:54 <zere> #action RichardF to finish "a couple of little things [he needs] to finish"
18:48:06 * RichardF laughs
18:48:11 <zere> world's most descriptive action award goes to...
18:48:46 <zere> apmon, TomH: did the permalink discussion end up with a solution, or is it still being discussed?
18:49:46 <TomH> zere: well I don't really look at it as a "permalink" issue, though that is a side effect
18:50:14 <TomH> the point is that the routing code needs to register with OSM.Router like the other stuff does, so that transitions between that and other modes work correctly and the URL bar updates
18:50:23 <TomH> part of that is defining a URL syntax for routing
18:51:26 <zere> ok. i don't want to start that discussion off again, but i take it that it was more complicated than it appears to me right now.
18:51:38 <TomH> not especially
18:51:44 <zere> anyone wanting to take an action to sort out the URL syntax for routing?
18:51:50 <TomH> there's a big comment explaining how OSM.Router works
18:52:18 <zere> oh, yeah, i didn't mean OSM.Router, i meant the syntax.
18:53:03 <zere> i.e: more than /lon1,lat1/lon2,lat2/.../lonN,latN?provider=whatever
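
For illustration, the baseline form mentioned above (waypoints as `lon,lat` path segments plus a `provider` query parameter) could be parsed roughly like this. It is a sketch under stated assumptions, not the syntax that was eventually adopted: `parse_route_url` is a hypothetical helper, and the discussion explicitly says the real syntax needs more than this.

```python
from urllib.parse import parse_qs, urlsplit


def parse_route_url(url):
    """Parse '/lon1,lat1/lon2,lat2/...?provider=name' into (points, provider).

    Hypothetical helper: the path layout and the 'provider' parameter follow
    the example in the log, nothing more.
    """
    parts = urlsplit(url)
    points = []
    for segment in parts.path.strip("/").split("/"):
        lon_text, lat_text = segment.split(",")
        lon, lat = float(lon_text), float(lat_text)
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"coordinate out of range: {segment}")
        points.append((lon, lat))
    if len(points) < 2:
        raise ValueError("a route needs at least a start and an end point")
    # parse_qs returns a list per key; take the first value if present.
    provider = parse_qs(parts.query).get("provider", [None])[0]
    return points, provider


points, provider = parse_route_url(
    "/1.165967,52.056152/1.164787,52.056495?provider=whatever"
)
print(points, provider)
```

Putting longitude first mirrors the order in the log; whichever order is chosen, the range check catches the most obvious mix-ups only when a value exceeds 90.
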
18:53:18 <zere> so... anyone want to take that on and write some code?
18:55:40 <zere> funny how it goes really quiet when we start asking for volunteers.
18:55:49 <zere> anyway, we're over time...
18:56:01 <zere> so thanks, everyone, for coming & hope to see you next week