# FaPra Algorithms on OSM Data

This repository contains implementations for the Fachpraktikum Algorithms on
OSM Data. The code is written in Rust; a stable Rust compiler and `cargo`, the
Rust build tool and package manager, are required.

# Building

Simply run `cargo build --release` to build release versions of all binaries.

# Tasks

## Task 1

There is no real implementation work for Task 1. Next!

## Task 2 + Task 3 + Task 4

Reading the data from an OSM PBF file and converting it to a graph is done in
`src/bin/generate_grid.rs`.

The spherical point-in-polygon test is implemented in `src/polygon.rs`
by the function `Polygon::contains`.

There is one polygon in the graph for which no valid outside point can be found.
I did not have the time to investigate this further.

### Extracting Coastlines from the PBF file

The code uses the `osmpbfreader` crate.
Sadly, this module uses ~10 GB of memory to extract the data from the PBF file
with all the coastlines.

### Point in Polygon

The test by Bevis and Chatelain is implemented.
Instead of a reference point inside the polygon, a point outside of the
polygon is used, because we can simply specify several "well-known" outside
points that lie somewhere in the ocean and are therefore outside of every polygon.

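
To illustrate why a known outside point is enough, here is a minimal sketch of a spherical point-in-polygon test. It is not the Bevis and Chatelain formulation and not the code in `src/polygon.rs`; it swaps in a plain crossing count: the great-circle arc from the outside point to the query point is intersected with every polygon edge, and an odd number of crossings means "inside". All names (`polygon_contains`, `arcs_intersect`, ...) are made up for this example, every arc is assumed to be shorter than half a great circle, and degenerate cases (the arc passing exactly through a vertex) are ignored.

```rust
type Vec3 = [f64; 3];

/// Converts (latitude, longitude) in degrees to a unit vector.
fn to_vec3(lat_deg: f64, lon_deg: f64) -> Vec3 {
    let (lat, lon) = (lat_deg.to_radians(), lon_deg.to_radians());
    [lat.cos() * lon.cos(), lat.cos() * lon.sin(), lat.sin()]
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

fn dot(a: Vec3, b: Vec3) -> f64 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// True if `t` (a direction in the plane with normal `n = a x b`)
/// lies strictly between `a` and `b` on the minor arc from `a` to `b`.
fn on_minor_arc(a: Vec3, b: Vec3, n: Vec3, t: Vec3) -> bool {
    dot(cross(a, t), n) > 0.0 && dot(cross(t, b), n) > 0.0
}

/// True if the minor arcs a-b and c-d cross each other.
fn arcs_intersect(a: Vec3, b: Vec3, c: Vec3, d: Vec3) -> bool {
    let n1 = cross(a, b);
    let n2 = cross(c, d);
    // The two great circles meet in the directions +t and -t.
    let t = cross(n1, n2);
    let neg_t = [-t[0], -t[1], -t[2]];
    [t, neg_t]
        .iter()
        .any(|&p| on_minor_arc(a, b, n1, p) && on_minor_arc(c, d, n2, p))
}

/// Crossing-count test: `outside` must be a point known to lie outside
/// the polygon. All coordinates are (latitude, longitude) in degrees.
pub fn polygon_contains(polygon: &[(f64, f64)], outside: (f64, f64), point: (f64, f64)) -> bool {
    let o = to_vec3(outside.0, outside.1);
    let p = to_vec3(point.0, point.1);
    let n = polygon.len();
    let crossings = (0..n)
        .filter(|&i| {
            let a = to_vec3(polygon[i].0, polygon[i].1);
            let b = to_vec3(polygon[(i + 1) % n].0, polygon[(i + 1) % n].1);
            arcs_intersect(a, b, o, p)
        })
        .count();
    crossings % 2 == 1
}
```

For a small square around latitude 0, longitude 0 and an outside point at longitude 90 on the equator, the arc to the square's center crosses exactly one edge, so the center is reported as inside. Keeping several outside points around is useful because a single one can be in an awkward position relative to a particular polygon.
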
### Grid Graph

The grid graph is implemented in `gridgraph.rs`.

A regular grid graph can be generated with the `generate_regular_grid` function.
Import from and export to a file can be done with the `from_fmi_file` and `write_fmi_file` functions, respectively.

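
As a rough idea of what a regular grid on the globe looks like, the following sketch places nodes on an evenly spaced latitude/longitude raster and connects each node to its direct neighbors, wrapping around in the east-west direction. This is an assumption about the general approach, not the actual `generate_regular_grid` code; the function names, the cell-center placement and the row-major indexing are made up for this example, and `cols` is assumed to be at least 3. Filtering out nodes on land (with the point-in-polygon test) is left out.

```rust
/// Node coordinates as (latitude, longitude) in degrees for a
/// `rows` x `cols` raster, stored row by row (row-major).
pub fn regular_grid(rows: usize, cols: usize) -> Vec<(f64, f64)> {
    let mut nodes = Vec::with_capacity(rows * cols);
    for r in 0..rows {
        // Cell centers, so no node sits exactly on a pole.
        let lat = -90.0 + 180.0 * (r as f64 + 0.5) / rows as f64;
        for c in 0..cols {
            let lon = -180.0 + 360.0 * (c as f64 + 0.5) / cols as f64;
            nodes.push((lat, lon));
        }
    }
    nodes
}

/// Indices of the west, east, south and north neighbors of `index`.
/// West and east wrap around the antimeridian; the first and last
/// row have no neighbor beyond the pole.
pub fn neighbors(index: usize, rows: usize, cols: usize) -> Vec<usize> {
    let (r, c) = (index / cols, index % cols);
    let mut out = vec![
        r * cols + (c + cols - 1) % cols, // west
        r * cols + (c + 1) % cols,        // east
    ];
    if r > 0 {
        out.push((r - 1) * cols + c); // south
    }
    if r + 1 < rows {
        out.push((r + 1) * cols + c); // north
    }
    out
}
```

The east-west wrap is what allows routes to cross the 180 degree meridian in the first place.
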
## Task 5

### Dijkstra Benchmarks

Dijkstra's algorithm is implemented in `gridgraph.rs` as `GridGraph::shortest_path`.
It uses a heap as the priority queue for the nodes.
For details on how to run the benchmarks, see "Running the benchmarks" below.

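
For readers who want to see the idea in code, here is a minimal sketch of Dijkstra's algorithm on top of `std::collections::BinaryHeap`. It is not the `GridGraph::shortest_path` implementation; the adjacency-list representation and the function name are assumptions made for this example. It also counts heap pops, since that is the number the benchmark reports.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// `adj[u]` holds `(neighbor, edge cost)` pairs.
/// Returns the cost of a shortest path from `source` to `target`
/// (or `None` if unreachable) and the number of heap pops.
pub fn shortest_path_cost(
    adj: &[Vec<(usize, u64)>],
    source: usize,
    target: usize,
) -> (Option<u64>, usize) {
    let mut dist = vec![u64::MAX; adj.len()];
    // BinaryHeap is a max-heap; `Reverse` turns it into a min-heap.
    let mut heap = BinaryHeap::new();
    let mut pops = 0;

    dist[source] = 0;
    heap.push(Reverse((0u64, source)));

    while let Some(Reverse((d, u))) = heap.pop() {
        pops += 1;
        if d > dist[u] {
            continue; // stale entry, a shorter path to `u` was already found
        }
        if u == target {
            return (Some(d), pops);
        }
        for &(v, w) in &adj[u] {
            let nd = d + w;
            if nd < dist[v] {
                dist[v] = nd;
                heap.push(Reverse((nd, v)));
            }
        }
    }
    (None, pops)
}
```

Instead of a decrease-key operation, this variant simply pushes a node again when a shorter path is found and skips outdated entries when they are popped.
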
## Task 6

The UI is a web UI based on Leaflet.
To start it, run `task6` with a `.fmi` file as the graph layout and a set of landmarks
(see Task 7 for details).

The start and end nodes can be placed at arbitrary positions; an algorithm
searches for the closest grid node to each and then runs the routing.

The web server listens on http://localhost:8000

Currently there is a display bug when routes wrap around the globe at 180 degrees
longitude. Instead of continuing across the antimeridian as expected, the line is drawn around
the globe in the "wrong" direction.
This is, however, just a display bug; the route itself is correct.

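
The search for the closest grid node mentioned above could look roughly like the following sketch. It is an assumption, not the code used by `task6`: it does a plain linear scan over all nodes with the haversine distance, using a mean earth radius of 6371 km, and the function names are made up for this example.

```rust
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in meters between two (latitude, longitude)
/// points given in degrees, using the haversine formula.
pub fn haversine_m(a: (f64, f64), b: (f64, f64)) -> f64 {
    let (phi1, phi2) = (a.0.to_radians(), b.0.to_radians());
    let dphi = (b.0 - a.0).to_radians();
    let dlambda = (b.1 - a.1).to_radians();
    let h = (dphi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (dlambda / 2.0).sin().powi(2);
    // Clamp to guard against rounding slightly above 1.
    2.0 * EARTH_RADIUS_M * h.sqrt().min(1.0).asin()
}

/// Index of the node closest to `query`, or `None` for an empty graph.
pub fn nearest_node(nodes: &[(f64, f64)], query: (f64, f64)) -> Option<usize> {
    let mut best: Option<(usize, f64)> = None;
    for (i, &node) in nodes.iter().enumerate() {
        let d = haversine_m(node, query);
        if best.map_or(true, |(_, best_d)| d < best_d) {
            best = Some((i, d));
        }
    }
    best.map(|(i, _)| i)
}
```

Because the haversine formula works on angles, a click at longitude -179 correctly snaps to a node at longitude 179 rather than to one on the other side of the globe.
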
## Task 7

I implemented ALT, as described in [1].
Additionally, A\* is available with a simple, unoptimized haversine distance
as the heuristic.

A\* is implemented in `src/astar.rs`, and the heuristics for ALT are implemented
in `src/alt.rs`.

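
The core of ALT is a lower bound derived from the triangle inequality: for a landmark `L`, the distance from a node `v` to the target `t` is at least `|d(L, t) - d(L, v)|`, and the heuristic takes the maximum over the landmarks in use. The sketch below shows only this bound. It assumes an undirected graph (so one distance array per landmark is enough) and a hypothetical layout where `landmark_dists[i][v]` is the precomputed cost from landmark `i` to node `v`; it is not the code from `src/alt.rs`.

```rust
/// Lower bound on the cost from `node` to `target`, given the
/// precomputed distances from each landmark to every node.
/// Returns 0 if no landmarks are supplied.
pub fn alt_lower_bound(landmark_dists: &[Vec<u64>], node: usize, target: usize) -> u64 {
    landmark_dists
        .iter()
        .map(|d| {
            let (a, b) = (d[node], d[target]);
            // |d(L, target) - d(L, node)| without risking underflow
            if a > b { a - b } else { b - a }
        })
        .max()
        .unwrap_or(0)
}
```

On a path graph with five nodes and landmarks at both ends, the bound between nodes 1 and 3 is 2, which is exactly the true distance. A landmark in the middle of the path would give 0 for the same pair, which hints at why landmarks "behind" the source or target tend to work well.
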
### Landmarks for ALT

Currently, three different landmark generation methods are available:

- random selection
- greedy, distance-maximizing selection
- manual selection (from a GeoJSON file)

These can be generated with the `gen_landmarks_random`, `gen_landmarks_greedy`
and `gen_landmarks_geojson` binaries. The random and greedy methods take
the number of landmarks to select as a parameter.

A handy wrapper for that is the Python script `utils/generate_landmarks.py`, which
generates sets of 4, 8, 16, 32 and 64 landmarks, both greedy and random.

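
One common way to read "greedy, distance-maximizing selection" is farthest-point selection: start with some node, then repeatedly add the node whose distance to the nearest already chosen landmark is largest. The sketch below shows that variant under this assumption; whether `gen_landmarks_greedy` works exactly like this is not stated here. The adjacency-list graph and all function names are made up for this example.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Dijkstra from `source` to all nodes; `adj[u]` holds (neighbor, cost).
/// Unreachable nodes keep the distance `u64::MAX`.
fn distances_from(adj: &[Vec<(usize, u64)>], source: usize) -> Vec<u64> {
    let mut dist = vec![u64::MAX; adj.len()];
    let mut heap = BinaryHeap::new();
    dist[source] = 0;
    heap.push(Reverse((0u64, source)));
    while let Some(Reverse((d, u))) = heap.pop() {
        if d > dist[u] {
            continue; // stale heap entry
        }
        for &(v, w) in &adj[u] {
            let nd = d + w;
            if nd < dist[v] {
                dist[v] = nd;
                heap.push(Reverse((nd, v)));
            }
        }
    }
    dist
}

/// Picks up to `k` landmarks, starting with `first`. Each further
/// landmark is the reachable node that is farthest from all landmarks
/// chosen so far.
pub fn greedy_landmarks(adj: &[Vec<(usize, u64)>], first: usize, k: usize) -> Vec<usize> {
    if k == 0 || adj.is_empty() {
        return Vec::new();
    }
    let mut landmarks = vec![first];
    // min_dist[v] = distance from v to its nearest landmark
    let mut min_dist = distances_from(adj, first);
    while landmarks.len() < k {
        let next = min_dist
            .iter()
            .enumerate()
            .filter(|&(_, &d)| d != u64::MAX && d > 0)
            .max_by_key(|&(_, &d)| d)
            .map(|(i, _)| i);
        let next = match next {
            Some(i) => i,
            None => break, // no further candidates
        };
        landmarks.push(next);
        for (m, d) in min_dist.iter_mut().zip(distances_from(adj, next)) {
            *m = (*m).min(d);
        }
    }
    landmarks
}
```

On a path graph with nodes 0 to 4, starting at node 0 and asking for three landmarks yields nodes 0, 4 and 2: first the far end, then the middle. Each additional landmark costs one full Dijkstra run.
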
# Running the benchmarks

First, a set of queries is needed.
These can be generated with `generate_benchmark_targets --graph <graph> > targets.json`.
By default, this generates 1000 random, distinct source and destination pairs.
The `--amount` parameter adjusts the number of pairs generated.

The actual benchmark is located in `benchmark`.
It needs a set of queries and a graph.

If the `--dijkstra` option is given, it runs Dijkstra's algorithm.
If the `--astar` option is given, it runs A\* with the haversine distance.
If `--landmarks <landmarkfile>` is given, it runs ALT with that landmark file.
By setting `--alt_best_size <number>`, one can select how many of the landmarks
are used to answer each query.

The benchmark prints how many nodes were popped from the heap for
each run and the average time per route.

`utils/run_benchmarks.py` is a wrapper script that runs the benchmarks for a
large set of parameters.

`utils/plot_results.py` generates several plots of the results.

# Results

These are some quick tests; further results will be presented later.
Everything was run on a ThinkPad X260 laptop with an Intel i7-6600U CPU @ 2.60GHz.
Each test used the same 1000 queries.
Rust v1.57.0 was used for all tests.

The ALT variants were used with the 4 best landmarks.
Further tests on the performance with more landmarks will be presented later.
The set of 44 handpicked landmarks was spread around the extremities of the
continents and into "dead ends" like the Mediterranean and the Gulf of Mexico,
with the goal of providing landmarks that are "behind" the source or target
node.

All benchmarks were run on the provided benchmark graph.

## Raw data

```
# name, (avg. heap pops per query, avg. time in seconds)
{'astar': (155019.451, 0.044386497025),
'dijkstra': (423046.796, 0.058129875474999995),
'greedy_32': (42514.751, 0.013299024275000002),
'greedy_64': (35820.461, 0.011887869759),
'handpicked_44': (70868.721, 0.01821366828),
'random_32': (58830.082, 0.016845884717),
'random_64': (51952.261, 0.015234422699)}
```

## Interpretation

Dijkstra needs ~58 ms per route, while the best version, greedy\_64 (that is,
greedy selection with 64 landmarks), needs only ~12 ms, which is ~5 times faster.
We also see that the greedy versions perform better than their
random counterparts with the same number of landmarks
(13.3 ms vs. 16.8 ms with 32 landmarks, 11.9 ms vs. 15.2 ms with 64).
While the 44 handpicked landmarks outperformed A\* and Dijkstra, they are beaten
by both the random and greedy landmark selections, including those with only 32 landmarks.

## Memory Consumption

Each landmark is basically an array holding the cost to every node.
Since the distances are currently calculated with 64-bit integers,
each landmark needs 8 bytes per node in the graph.
With a graph that has about 700k nodes, this leads to ~5.5 MB of memory per
landmark.
So 64 landmarks need ~350 MB of memory.

One could also use 32-bit integers, which would halve the memory requirements.

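
The estimate above is a simple product, spelled out here as a tiny helper. The function name is made up for this example, and the round figure of 700,000 nodes is an assumption standing in for "about 700k".

```rust
/// Bytes needed to store the distance arrays of all landmarks:
/// one distance value per node and landmark.
pub fn landmark_table_bytes(nodes: u64, bytes_per_distance: u64, landmarks: u64) -> u64 {
    nodes * bytes_per_distance * landmarks
}
```

With 700,000 nodes and 8-byte distances this gives 5,600,000 bytes for one landmark and 358,400,000 bytes for 64 landmarks, in line with the rounded figures quoted above. With 4-byte distances, 64 landmarks would need 179,200,000 bytes.
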
# References

[1] A. Goldberg and C. Harrelson, "Computing the Shortest Path: A\* Search Meets Graph Theory", Microsoft Research, Technical Report MSR-TR-2004-24, 2004.