To navigate you to and from an airport, KDE Itinerary needs to know where that airport actually is. That seems like an easy question, but it is surprisingly hard to answer with the level of precision we need. Since the recent work on public transportation line metadata left me with a local OpenStreetMap database, I tried to see if we can improve our airport coordinates a bit.
Where is an airport?
So far we use a simple mapping from airport IATA codes to a single geographic coordinate that we obtain from the airport’s Wikidata entry. That is, we have a single point somewhere in an area that is often several kilometers wide. Typically that’s somewhere around the center of the overall bounding box.
In some cases, such as Munich (MUC), this happens to be exactly where we want it to be: the “entrance” of the airport, i.e. the place you have to go to in order to enter the airport (which is again usually a much larger area than the term “entrance” would suggest).
In many cases, however, we end up with a location somewhere in the middle of the runway, and thus either with navigation instructions that end with “and now walk 1.5km into an area you are not allowed to enter”, or worse, with the routing engine snapping to the “other side” and leading you to the opposite side of the airport.
Let’s look at Frankfurt (FRA). The following image marks a few relevant parts:
- The red ‘X’ is the bounding box center coordinate we get from Wikidata.
- The blue circle marks the entrance area you’d actually want to go to.
- The magenta circles mark other railway stations in the vicinity of the airport. In the error case, routing attempts to get us to those and then suggests a bus to somewhere closer to the outer perimeter of the airport.
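To get a feeling for how far a bounding box center can be from the place you actually want to reach, here is a minimal Python sketch. The outline and entrance coordinates are made up for illustration and are not real airport data, and the helper names `bbox_center` and `haversine_m` are hypothetical, not part of KDE Itinerary.

```python
import math

def bbox_center(points):
    """Center of the axis-aligned bounding box of (lat, lon) points."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return ((min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2)

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points."""
    earth_radius_m = 6371000.0
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(h))

# Made-up corner points of an airport outline (assumption, not real data).
outline = [(50.00, 8.50), (50.00, 8.60), (50.06, 8.60), (50.06, 8.50)]
# Made-up position of the entrance area (assumption, not real data).
entrance = (50.05, 8.57)

center = bbox_center(outline)
offset_m = haversine_m(center, entrance)
print(f"center={center}, offset to entrance: {offset_m:.0f} m")
```

With these invented numbers the center ends up a couple of kilometers away from the entrance, which is the order of magnitude of error described above.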
Where is the airport entrance?
While it’s usually fairly clear for a human looking at a map where to go to enter an airport, it’s not that easy to determine this from OSM data in code. The following heuristics have proven useful:
- Entrance tags on terminal building nodes. This seems like the obvious thing to look for, but data availability varies a lot. On a large airport you can end up with hundreds of those, or none at all (or worse, just one that is far away from where one would expect it). When available on small airports though, this provides a very precise position.
- Terminal buildings. After all, you have to pass through those at the airport. This turns out to be rather robust on small to mid-sized airports where we don’t have air-side concourses or additional inaccessible terminal buildings (e.g. for government, VIP, military or freight use). To deal with the latter, detailed tags on the terminal buildings help a lot.
- Railway stations inside the airport boundaries, or at least in very close proximity to a terminal building. Those are more common on medium to larger airports. In-airport inter-terminal transport systems can interfere with properly detecting railway stations though; here we again rely on detailed tagging in the OSM data.
Another aspect that helps humans to spot the entrance area is the structure of the road network leading there. That might offer additional hints, but is unfortunately much harder to deal with in code.
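One way to combine these heuristics is a simple fallback chain. The following Python sketch is only an illustration under stated assumptions: the `AirportData` container, the `MAX_USABLE_ENTRANCES` cut-off and the order in which the heuristics are tried are all my own choices for the example, not necessarily what KDE Itinerary does.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude)

@dataclass
class AirportData:
    """Hypothetical container for candidates already extracted from OSM."""
    entrances: List[Coord] = field(default_factory=list)  # entrance nodes on terminals
    terminals: List[Coord] = field(default_factory=list)  # passenger terminal centers
    stations: List[Coord] = field(default_factory=list)   # railway stations in/near the airport
    wikidata_coord: Optional[Coord] = None                 # existing single coordinate

# Arbitrary cut-off (assumption): beyond this many entrance nodes we
# no longer trust them to describe a single entrance area.
MAX_USABLE_ENTRANCES = 4

def centroid(points: List[Coord]) -> Coord:
    """Plain average of coordinates; good enough at airport scale."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def pick_entrance(data: AirportData) -> Optional[Coord]:
    """Try the heuristics in an assumed order, most precise first."""
    if 0 < len(data.entrances) <= MAX_USABLE_ENTRANCES:
        return centroid(data.entrances)
    if data.stations:
        return centroid(data.stations)
    if data.terminals:
        return centroid(data.terminals)
    return data.wikidata_coord  # fall back to what we had before

small = AirportData(entrances=[(1.0, 2.0)], terminals=[(1.1, 2.1)])
print(pick_entrance(small))
```

Filtering out air-side concourses, freight terminals or inter-terminal people movers would have to happen before the candidates are handed to a function like this, based on the detailed tags mentioned above.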
The above approach covers a wide range of airports, and for many of them it provides a significant improvement in navigation precision. It fails, however, on some of the very large airports, namely those with multiple entrance areas. This is typically due to multiple sets of terminals that are largely disconnected from each other, to the extent of looking like two or more airports close to each other that share the same runways.
London Heathrow (LHR) is one of the more extreme examples of this, with three distinct sets of terminals (marked with blue circles below), all with their own railway stations and access roads, and no internal connection between them.
These cases can no longer be modelled by a single coordinate per airport; here we’d need to take the respective terminal into account as well. That is possible, as we have that information in the itinerary data model. What I’m still unsure about, however, is whether we should attempt to cluster terminals automatically using the OSM data, or whether those cases are so few that a manually created table would be more efficient to build and maintain.
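To make the automatic option a bit more concrete, here is a minimal Python sketch of distance-based clustering: terminals end up in the same group if they are connected by a chain of hops that are each shorter than a threshold. The terminal names, coordinates and the 1000 m threshold are invented for illustration, and this is just one possible approach, not something that is implemented.

```python
import math

def distance_m(a, b):
    """Approximate distance in meters (equirectangular, fine at airport scale)."""
    earth_radius_m = 6371000.0
    mean_lat = math.radians((a[0] + b[0]) / 2)
    x = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    y = math.radians(b[0] - a[0])
    return earth_radius_m * math.hypot(x, y)

def cluster_terminals(terminals, max_gap_m=1000.0):
    """Group terminal names whose positions are chained by gaps <= max_gap_m.

    terminals: dict mapping a terminal name to a (lat, lon) tuple.
    Returns a list of sorted name lists, one per cluster.
    """
    unvisited = set(terminals)
    clusters = []
    for name in sorted(terminals):
        if name not in unvisited:
            continue  # already assigned to an earlier cluster
        unvisited.discard(name)
        group, queue = [], [name]
        while queue:
            current = queue.pop()
            group.append(current)
            near = [other for other in sorted(unvisited)
                    if distance_m(terminals[current], terminals[other]) <= max_gap_m]
            for other in near:
                unvisited.discard(other)
                queue.append(other)
        clusters.append(sorted(group))
    return clusters

# Made-up terminal positions (assumption, not real airport data).
example = {
    "A": (10.000, 20.000),
    "B": (10.003, 20.002),  # a few hundred meters from A
    "C": (10.030, 20.000),  # several kilometers away
    "D": (10.000, 20.050),  # several kilometers away
}
clusters = cluster_terminals(example)
print(clusters)
```

Each resulting cluster would then get its own entrance coordinate, selected via the terminal information from the itinerary data model. Whether a pure distance threshold separates real-world terminal groups reliably is exactly the open question; a manual table avoids that uncertainty at the cost of maintenance.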
What about trains?
Airports are particularly affected by this navigation problem, for several reasons:
- They are enormous.
- They have a large area in their center that you cannot actually enter (which is something routing engines don’t like).
- Navigating to/from airports involves changing the mode of transportation (public transport or car), and usually also the transport operator. That typically implies the use of distinct routing systems, and their focus is usually not exactly airports.
For train stations this applies to a much lesser extent. Besides stations being smaller, the public transport routing systems usually produce sensible results no matter which coordinate you pick within the station boundaries (as train stations happen to be a very central concept of those systems).
That doesn’t mean everything is perfect there; the possible error scenarios are just different.
You can help!
In the previous post I wrote about how you can help improve OSM and Wikidata data for public transport lines, and the same applies to airports. Verifying airport codes, tagging terminals, entrances and railway stations with all available details, and cross-linking Wikidata and OSM elements are easy ways to contribute.