# FaPra Algorithms on OSM Data
This repository contains implementations for the Fachpraktikum Algorithms on
OSM Data. The code is written in Rust; a stable Rust compiler and `cargo`, the
Rust build tool and package manager, are required.
# Building
Simply run `cargo build --release` to build release versions of all binaries.
# Tasks
## Task 1
There is no real implementation work for task 1, next!
## Task 2 + Task 3 + Task 4
Reading the data from an OSM PBF file and converting it to a graph is done in
`src/bin/generate_grid.rs`.
The implementation of the spherical point in polygon test is done in `src/polygon.rs`
with the function `Polygon::contains`.
There is one polygon in the graph, for which no valid outside polygon can be found.
I did not have the time to investigate this further.
### Extracting Coastlines from the PBF file
The code uses the osmpbfreader crate.
Sadly this module uses ~10GB of memory to extract the data from the PBF file
with all the coastlines.
### Point in Polygon
The test by Bevis and Chatelain is implemented.
Instead of using a point that is inside the polygon a point outside of the
polygon is used, because here we can simply specify several "well-known" outside
points that are somewhere in the ocean and therefore are outside of every polygon.
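The crossing-count idea behind this test can be sketched with 3D unit vectors: count how often the great-circle arc from a known outside point to the query point crosses a polygon edge, and treat an odd count as "inside". This is a simplified vector formulation, not the bearing-based formulation of Bevis and Chatelain and not the code in `src/polygon.rs`. All names are hypothetical, and degenerate cases (an arc passing exactly through a vertex, antipodal points) are ignored.

```rust
type Vec3 = [f64; 3];

/// Converts latitude/longitude in degrees to a unit vector on the sphere.
fn to_unit_vector(lat_deg: f64, lon_deg: f64) -> Vec3 {
    let (lat, lon) = (lat_deg.to_radians(), lon_deg.to_radians());
    [lat.cos() * lon.cos(), lat.cos() * lon.sin(), lat.sin()]
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

fn dot(a: Vec3, b: Vec3) -> f64 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// True if the minor great-circle arcs a-b and c-d cross in their interiors.
fn arcs_cross(a: Vec3, b: Vec3, c: Vec3, d: Vec3) -> bool {
    let (n1, n2) = (cross(a, b), cross(c, d));
    // The two great circles meet at t and -t.
    let t = cross(n1, n2);
    let s = [
        dot(cross(a, t), n1),
        dot(cross(t, b), n1),
        dot(cross(c, t), n2),
        dot(cross(t, d), n2),
    ];
    // All positive: t lies on both arcs. All negative: -t lies on both arcs.
    s.iter().all(|&x| x > 0.0) || s.iter().all(|&x| x < 0.0)
}

/// `polygon` holds (lat, lon) vertices in degrees and is implicitly closed.
/// `outside` must be a point known to lie outside the polygon.
fn contains(polygon: &[(f64, f64)], outside: (f64, f64), point: (f64, f64)) -> bool {
    let o = to_unit_vector(outside.0, outside.1);
    let p = to_unit_vector(point.0, point.1);
    let crossings = (0..polygon.len())
        .filter(|&i| {
            let (c, d) = (polygon[i], polygon[(i + 1) % polygon.len()]);
            arcs_cross(o, p, to_unit_vector(c.0, c.1), to_unit_vector(d.0, d.1))
        })
        .count();
    crossings % 2 == 1
}
```

With several known outside points available, one would pick a reference point that avoids the degenerate cases for the query at hand.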
### Grid Graph
The Grid Graph is implemented in `gridgraph.rs`.
A regular grid graph can be generated with the `generate_regular_grid` function.
Import and Export from/to a file can be done with the `from_fmi_file` and `write_fmi_file` functions.
## Task 5
### Dijkstra Benchmarks
Dijkstra's algorithm is implemented in `gridgraph.rs` with `GridGraph::shortest_path`.
It uses a Heap to store the nodes.
For details on how to run benchmarks see the benchmarks section at the end.
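As an illustration, here is a minimal sketch of heap-based Dijkstra that also counts heap pops, the metric reported by the benchmarks. It assumes a plain adjacency-list graph with `u64` edge costs rather than the actual `GridGraph` type; the `Graph` alias and the `dijkstra` function are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Adjacency list: graph[u] holds (neighbor, edge cost) pairs.
type Graph = Vec<Vec<(usize, u64)>>;

/// Returns (shortest distance, number of heap pops), or None if
/// `target` is not reachable from `source`.
fn dijkstra(graph: &Graph, source: usize, target: usize) -> Option<(u64, usize)> {
    let mut dist = vec![u64::MAX; graph.len()];
    let mut heap = BinaryHeap::new();
    let mut pops = 0;

    dist[source] = 0;
    // `Reverse` turns the max-heap into a min-heap on (distance, node).
    heap.push(Reverse((0u64, source)));

    while let Some(Reverse((d, u))) = heap.pop() {
        pops += 1;
        if u == target {
            return Some((d, pops));
        }
        if d > dist[u] {
            continue; // stale entry, a shorter path to u was already settled
        }
        for &(v, w) in &graph[u] {
            let nd = d + w;
            if nd < dist[v] {
                dist[v] = nd;
                heap.push(Reverse((nd, v)));
            }
        }
    }
    None
}
```

Stale heap entries are skipped instead of being removed with a decrease-key operation, which keeps the code short at the cost of a few extra pops.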
## Task 6
The UI is a web UI based on Leaflet.
To start it, run `task6` with a .fmi file as a graph layout and a set of landmarks
(see Task 7 for details).
The start and end nodes can be placed at arbitrary positions; an algorithm
searches for the closest grid node and then runs the routing.
The webserver listens on http://localhost:8000.
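The lookup of the closest grid node can be sketched as a linear scan using the haversine distance. This is an assumption about one simple way to do it, not the code behind `task6`: the node list as `(lat, lon)` pairs in degrees, the Earth radius constant and both function names are hypothetical, and the real implementation may use a faster spatial lookup.

```rust
use std::cmp::Ordering;

/// Mean Earth radius in metres (assumed value).
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in metres between two points given in degrees.
fn haversine(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (phi1, phi2) = (lat1.to_radians(), lat2.to_radians());
    let dphi = (lat2 - lat1).to_radians();
    let dlambda = (lon2 - lon1).to_radians();
    let a = (dphi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (dlambda / 2.0).sin().powi(2);
    // Clamp to guard against rounding slightly above 1.
    2.0 * EARTH_RADIUS_M * a.sqrt().min(1.0).asin()
}

/// Index of the node closest to (lat, lon), or None for an empty node list.
fn nearest_node(nodes: &[(f64, f64)], lat: f64, lon: f64) -> Option<usize> {
    nodes
        .iter()
        .enumerate()
        .map(|(i, &(nlat, nlon))| (i, haversine(lat, lon, nlat, nlon)))
        .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal))
        .map(|(i, _)| i)
}
```

Because the haversine formula works on the sphere, a click at longitude -179 correctly snaps to a node at longitude 179.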
Currently there is a display bug when routes wrap around the globe at 180 degrees
longitude. Instead of continuing the line as expected the line is drawn around
the globe in the "wrong" direction.
This is only a display bug, however; the route itself is correct.
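One possible workaround, sketched here as an assumption rather than something the repository does, is to shift longitudes by multiples of 360 degrees so that consecutive route points never differ by more than 180 degrees before the route is handed to the map. This relies on the map library accepting longitudes outside [-180, 180], which should be checked for the Leaflet version in use. The function name and the `(lat, lon)` route representation are hypothetical.

```rust
/// Rewrites longitudes so that the route is continuous across the
/// antimeridian, e.g. 179, -179 becomes 179, 181.
fn unwrap_longitudes(route: &[(f64, f64)]) -> Vec<(f64, f64)> {
    let mut out = Vec::with_capacity(route.len());
    let mut offset = 0.0;
    let mut prev_lon: Option<f64> = None;

    for &(lat, lon) in route {
        if let Some(prev) = prev_lon {
            let delta = lon - prev;
            if delta > 180.0 {
                offset -= 360.0; // jumped from near -180 to near +180
            } else if delta < -180.0 {
                offset += 360.0; // jumped from near +180 to near -180
            }
        }
        prev_lon = Some(lon);
        out.push((lat, lon + offset));
    }
    out
}
```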
## Task 7
I implemented ALT, as described in [1].
Additionally A\* is available with a simple, unoptimized haversine distance
as the heuristic.
A\* is implemented in `src/astar.rs` and the heuristics for ALT are implemented
in `src/alt.rs`.
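The core of the ALT heuristic is a lower bound derived from the triangle inequality: for a landmark `L`, the distance between a node `v` and the target `t` is at least `|d(L, t) - d(L, v)|`, and the maximum over all landmarks is used. The sketch below assumes an undirected graph (so distances are symmetric) and one precomputed distance array per landmark; the function name and data layout are hypothetical and not taken from `src/alt.rs`.

```rust
/// `landmark_dists[l][v]` is the shortest-path cost between landmark `l`
/// and node `v`. Returns a lower bound on the cost from `node` to `target`.
fn alt_heuristic(landmark_dists: &[Vec<u64>], node: usize, target: usize) -> u64 {
    landmark_dists
        .iter()
        .map(|dist| {
            let (a, b) = (dist[node], dist[target]);
            // Absolute difference without underflow on unsigned integers.
            if a > b { a - b } else { b - a }
        })
        .max()
        .unwrap_or(0) // no landmarks: fall back to the trivial bound
}
```

Nodes that a landmark cannot reach would need special handling (for example, skipping that landmark), which this sketch omits.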
### Landmarks for ALT
Currently three landmark generation methods are available:
- random selection
- greedy, distance-maximizing selection
- manual selection (from a GeoJSON file)
These can be generated with the `gen_landmarks_random`, `gen_landmarks_greedy`
and `gen_landmarks_geojson` binaries. The random and greedy methods take
the number of landmarks to select from a parameter.
A handy Python wrapper script, `utils/generate_landmarks.py`, generates
landmark sets of size 4, 8, 16, 32 and 64, both greedy and random.
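A common way to implement greedy, distance-maximizing selection is farthest-point selection: start with one landmark and repeatedly add the node whose distance to its nearest already chosen landmark is largest. The sketch below follows that scheme under the assumption of an undirected adjacency-list graph; it is not necessarily what `gen_landmarks_greedy` does, and all names are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Adjacency list: graph[u] holds (neighbor, edge cost) pairs.
type Graph = Vec<Vec<(usize, u64)>>;

/// One-to-all Dijkstra; unreachable nodes keep the distance u64::MAX.
fn distances_from(graph: &Graph, source: usize) -> Vec<u64> {
    let mut dist = vec![u64::MAX; graph.len()];
    let mut heap = BinaryHeap::new();
    dist[source] = 0;
    heap.push(Reverse((0u64, source)));
    while let Some(Reverse((d, u))) = heap.pop() {
        if d > dist[u] {
            continue; // stale heap entry
        }
        for &(v, w) in &graph[u] {
            let nd = d + w;
            if nd < dist[v] {
                dist[v] = nd;
                heap.push(Reverse((nd, v)));
            }
        }
    }
    dist
}

/// Selects up to `k` landmarks, starting with the node `first`.
fn greedy_landmarks(graph: &Graph, first: usize, k: usize) -> Vec<usize> {
    let mut landmarks = Vec::new();
    if k == 0 || graph.is_empty() {
        return landmarks;
    }
    landmarks.push(first);
    // min_dist[v] = distance from v to its closest landmark so far.
    let mut min_dist = distances_from(graph, first);
    while landmarks.len() < k {
        let next = (0..graph.len())
            .filter(|&v| min_dist[v] != u64::MAX)
            .max_by_key(|&v| min_dist[v]);
        match next {
            Some(v) if min_dist[v] > 0 => {
                landmarks.push(v);
                let d = distances_from(graph, v);
                for (m, dv) in min_dist.iter_mut().zip(d) {
                    *m = (*m).min(dv);
                }
            }
            _ => break, // every reachable node is already a landmark
        }
    }
    landmarks
}
```

The distance vectors computed along the way are exactly the per-landmark arrays that ALT needs at query time, so a real implementation would keep them instead of discarding them.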
# Running the benchmarks
First a set of queries is needed.
These can be generated with `generate_benchmark_targets --graph <graph> > targets.json`.
This generates 1000 random, distinct source and destination pairs.
The `--amount` parameter adjusts the number of pairs generated.
The actual benchmark is located in `benchmark`.
It needs a set of queries and a graph.
If the `--dijkstra` option is given, it runs Dijkstra's algorithm.
If the `--astar` option is given, it runs A\* with the haversine distance as the heuristic.
If `--landmarks <landmarkfile>` is given, it runs ALT with that landmark file.
By setting `--alt_best_size <number>` one can select how many of the landmarks
are used to answer the query.
The benchmark prints out how many nodes were popped from the heap for
each run and the average time per route.
`utils/run_benchmarks.py` is a wrapper script that runs the benchmarks for a
big set of parameters.
`utils/plot_results.py` generates several plots of the results.
# Results
These are some quick tests; further results will be presented later.
Everything was run on a Thinkpad X260 laptop with an Intel i7-6600U CPU @ 2.60GHz.
Each test used the same 1000 queries.
Rust v1.57.0 was used for all tests.
The ALT variants were used with the 4 best landmarks.
Further tests on the performance of more landmarks will be presented later.
The 44 handpicked landmarks were spread around the extremities of the
continents and into "dead ends" like the Mediterranean and the Gulf of Mexico,
with the goal of providing landmarks that are "behind" the source or target
node.
All benchmarks were run on the provided benchmark graph.
## Raw data
```
# name: (avg. heap pops per query, avg. time per query in seconds)
{'astar': (155019.451, 0.044386497025),
'dijkstra': (423046.796, 0.058129875474999995),
'greedy_32': (42514.751, 0.013299024275000002),
'greedy_64': (35820.461, 0.011887869759),
'handpicked_44': (70868.721, 0.01821366828),
'random_32': (58830.082, 0.016845884717),
'random_64': (51952.261, 0.015234422699)}
```
## Interpretation
Dijkstra needs ~58 ms per route, while the best version, greedy\_64 (that is,
greedy selection with 64 landmarks), needs only ~12 ms, which is ~5 times faster.
We also see that the greedy versions perform slightly better than their
random counterparts with the same number of landmarks.
While the 44 handpicked landmarks outperformed A\* and Dijkstra, they are beaten
by both the random and greedy selections, even those with only 32 landmarks.
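The factors behind this comparison can be recomputed directly from the raw data. The sketch below copies the measured values into a constant and derives, for each variant, the reduction in heap pops and the speedup in time relative to Dijkstra; the names are hypothetical helpers, not part of `utils/plot_results.py`. For greedy\_64 this gives roughly 11.8 times fewer heap pops and a speedup of roughly 4.9.

```rust
/// (name, avg. heap pops per query, avg. time per query in seconds),
/// copied from the raw data.
const RESULTS: [(&str, f64, f64); 7] = [
    ("astar", 155019.451, 0.044386497025),
    ("dijkstra", 423046.796, 0.058129875474999995),
    ("greedy_32", 42514.751, 0.013299024275000002),
    ("greedy_64", 35820.461, 0.011887869759),
    ("handpicked_44", 70868.721, 0.01821366828),
    ("random_32", 58830.082, 0.016845884717),
    ("random_64", 51952.261, 0.015234422699),
];

/// Returns (name, heap pop reduction factor, time speedup) for every
/// variant, each relative to the Dijkstra baseline.
fn relative_to_dijkstra() -> Vec<(&'static str, f64, f64)> {
    let (_, base_pops, base_time) = RESULTS
        .iter()
        .copied()
        .find(|r| r.0 == "dijkstra")
        .expect("dijkstra entry is present");
    RESULTS
        .iter()
        .map(|&(name, pops, time)| (name, base_pops / pops, base_time / time))
        .collect()
}
```

The time speedup is consistently smaller than the reduction in heap pops, which suggests that evaluating the heuristic adds a noticeable cost per node.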
## Memory Consumption
Each landmark is basically an array holding the cost to every node.
Since the distances are currently calculated with 64-bit integers,
each landmark needs 8 bytes per node in the graph.
With a graph that has about 700k nodes this leads to ~5.5MB of memory per
landmark.
So 64 landmarks need ~350MB of memory.
One could also use 32-bit integers, which would halve the memory requirements.
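The estimate can be reproduced with a small helper. It assumes exactly 700,000 nodes (the text only says "about 700k") and one integer per node and landmark; the function name is hypothetical.

```rust
use std::mem::size_of;

/// Bytes needed to store one distance of type `T` per node and landmark.
fn landmark_memory_bytes<T>(nodes: usize, landmarks: usize) -> usize {
    nodes * landmarks * size_of::<T>()
}
```

With these assumptions, `landmark_memory_bytes::<u64>(700_000, 64)` yields 358,400,000 bytes (about 358 MB, in line with the rough figure above), and switching to `u32` yields 179,200,000 bytes.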
# References
[1] A. Goldberg and C. Harrelson, "Computing the Shortest Path: A\* Search Meets Graph Theory", Microsoft Research, Technical Report MSR-TR-2004-24, 2004.