updated README, added benchmark and plotting functions

fixed ALT using more landmarks than available
renamed handpicked landmarks
2022-09-15 19:48:57 +02:00 · 2022-09-15 19:48:30 +02:00 · 2022-09-15 19:47:01 +02:00
5 changed files with 136 additions and 8 deletions
--- a/README.md
+++ b/README.md
@@ -20,7 +20,7 @@ Reading the data from an OSM PDF file and converting it to a graph is done in
 `src/bin/generate_grid.rs`.

 The implementation of the spherical point in polygon test is done in `src/polygon.rs`
-with the function `contains()`.
+with the function `Polygon::contains`.

 There is one polygon in the graph, for which no valid outside polygon can be found.
 I did not have the time to investigate this further.
@@ -29,8 +29,7 @@ I did not have the time to investigate this further.

 The code uses the osmpbfreader crate.
 Sadly this module uses ~10GB of memory to extract the data from the PBF file
-with all the coastlines. So far I did not have time to look into what happens
-there.
+with all the coastlines.

 ### Point in Polygon

@@ -50,9 +49,9 @@ Import and Export from/to a file can be done with the `from_fmi_file` and `write

 ### Dijkstra Benchmarks

-The dijkstras algorithm is implenented in `gridgraph.rs`.
+Dijkstras algorithm is implenented in `gridgraph.rs` with `GridGraph::shortest_path`.
 It uses a Heap to store the nodes.
-On details on how to run benchmarks see the benchmarks session at the end.
+For details on how to run benchmarks see the benchmarks section at the end.

 ## Task 6

@@ -76,6 +75,9 @@ I implemented ALT, as described in [1].
 Additionally A\* is available with a simple, unoptimized haversine distance
 as the heuristic.

+A\* is implemented in `src/astar.rs` and the heuristics for ALT are implemented
+in `src/alt.rs`.
+
 ### Landmarks for ALT

 currently 3 different landmark generation methods are available
@@ -94,7 +96,7 @@ generates landmarks for 4, 8, 16, 32 and 64 landmarks, both greedy and random.
 # Running the benchmarks

 First a set of queries is needed.
-This can be done with the `generate_benchmark_targets --graph <graph> > targets.json`.
+These can be generated with `generate_benchmark_targets --graph <graph> > targets.json`.
 This generates 1000 random, distinct source and destination pairs.
 The `--amount` parameter allows to adjust the number of pairs generated.

@@ -110,4 +112,11 @@ are used to answer the query.
 The benchmark prints out how many nodes were popped from the heap for
 each run and the average time per route.

-[1] Computing the Shortest Path: A* meets Graph Theory, A. Goldberg and C. Harrelson, Microsoft Research, Technical Report MSR-TR-2004-24, 2004
+`utils/run_benchmarks.py` is a wrapper script that runs the benchmarks for a
+big set of parameters.
+
+`utils/plot_results.py` generates several plots of the results.
+
+# References
+
+[1](Computing the Shortest Path: A\* meets Graph Theory, A. Goldberg and C. Harrelson, Microsoft Research, Technical Report MSR-TR-2004-24, 2004)
--- a/landmarks/ocean_handpicked_44.json
+++ b/landmarks/ocean_handpicked_44.json
--- a/src/alt.rs
+++ b/src/alt.rs
@@ -137,7 +137,8 @@ impl LandmarkBestSet<'_> {
        results.reverse();

        self.best_landmarks.clear();
-        for result in results[..self.best_size].iter() {
+
+        for result in results[..(self.best_size.min(results.len()))].iter() {
            self.best_landmarks.push(result.0);
        }
    }
--- a/utils/plot_results.py
+++ b/utils/plot_results.py
@@ -0,0 +1,90 @@
+#!/usr/bin/env python3
+
+from sys import argv, exit
+import os
+from csv import writer
+from typing import Tuple, List
+import re
+import numpy as np
+
+import matplotlib.pyplot as plt
+
+if len(argv) != 2:
+    print(f"Usage: { argv[0] } <results_dir>")
+    exit(1)
+
+path = argv[1]
+
+files = [f for f in os.listdir(path) if os.path.isfile(f"{ path }/{f}")]
+
+files = [f for f in files if re.match(r"greedy_64_.+", f) is not None ]
+
+
+def parse_file(file: str) -> Tuple[float, List[int]]:
+
+    pops = list()
+    time = None
+    with open(file) as f:
+        for line in f.readlines():
+            m = re.match(r"popped\s(?P<pops>\d+)\s.*", line)
+            if m is not None:
+                pops.append(int(m.groupdict()["pops"]))
+                continue
+
+            m = re.match(r"It took\s(?P<time>\S+).*", line)
+            if m is not None:
+                time = float(m.groupdict()["time"])
+
+    if len(pops) == 0:
+        raise Exception(f"Parsing { file } failed, no heap pops found!")
+
+    return time, pops
+
+results = dict()
+
+with open("times.csv", "w+") as times_file:
+    times = writer(times_file)
+
+    with open("pops.csv", "w+") as pops_file:
+        pops = writer(pops_file)
+
+        for file in files:
+            name = file.split(".")[0]
+            full_path = f"{ path }/{ file }"
+            time, pop = parse_file(full_path)
+
+            total_pops = sum(pop)
+
+            results[name] = (total_pops, time)
+
+            times.writerow([name, time])
+
+            pops.writerow([name, *pop])
+rel_pops = list()
+rel_time = list()
+labels = list()
+
+# base_pops = results["dijkstra"][0]
+# base_time = results["dijkstra"][1]
+base_pops = results["greedy_64_1"][0]
+base_time = results["greedy_64_1"][1]
+
+for name, values in results.items():
+    pops, time = values
+    labels.append(name)
+    rel_pops.append(pops/base_pops)
+    rel_time.append(time/base_time)
+
+
+
+x = np.arange(len(labels))  # the label locations
+width = 0.35  # the width of the bars
+
+fig, ax = plt.subplots()
+rects1 = ax.bar(x - width/2, rel_time , width, label='time')
+rects2 = ax.bar(x + width/2, rel_pops, width, label='pops')
+
+ax.legend()
+ax.set_xticks(x, labels)
+
+plt.show()
--- a/utils/run_benchmarks.py
+++ b/utils/run_benchmarks.py
@@ -0,0 +1,28 @@
+#!/usr/bin/env python3
+
+from os import system
+from sys import argv, exit
+
+if len(argv) != 4:
+    print(f"usage: {argv[0]} <graph> <landmark_prefix> <targets>")
+    exit(1)
+
+graph = argv[1]
+landmarks = argv[2]
+targets = argv[3]
+
+print("running A*")
+system(f"cargo run --release --bin=performance -- --graph={graph} --astar --targets {targets} > results/astar.txt")
+print("running Dijkstra")
+system(f"cargo run --release --bin=performance -- --graph={graph} --dijkstra --targets {targets} > results/dijkstra.txt")
+
+print("running ALT with 44 handpicked landmarks")
+system(f"cargo run --release --bin=performance -- --graph={graph} --landmarks={landmarks}_handpicked_44.bin --targets {targets} > results/handpicked_44.txt")
+for i in range(6, 7):
+    num = 2**i
+
+    for best in [12, 24, ]:
+
+        for lm_type in ["greedy", "random"]:
+            print(f"running ALT with {num} {lm_type} landmarks")
+            system(f"cargo run --release --bin=performance -- --graph={graph} --landmarks={landmarks}_{lm_type}_{ num }.bin --targets {targets} --alt-best-size { best} > results/{lm_type}_{num}_{ best }.txt")
Author	SHA1	Message	Date
Johannes Erwerle	a9fd341b45	updated README, added benchmark and plotting functions	2022-09-15 19:48:57 +02:00
Johannes Erwerle	cab977451c	fixed ALT using more landmarks than available	2022-09-15 19:48:30 +02:00
Johannes Erwerle	0b3d2db135	renamed handpicked landmarks	2022-09-15 19:47:01 +02:00
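
Note on the `src/alt.rs` change in cab977451c: slicing with `results[..self.best_size]` panics when `best_size` exceeds the number of available landmarks, so the commit clamps the upper bound with `min`. The sketch below reproduces that pattern in isolation. The function name `select_best` and the `(usize, u64)` tuple type are assumptions made for illustration and are not taken from the repository; only the clamping expression mirrors the diff.

```rust
/// Hypothetical stand-in for the loop in `LandmarkBestSet`: returns the
/// landmark indices of the `best_size` highest-ranked entries, or all of
/// them if fewer are available. `results` is assumed to be sorted already.
fn select_best(results: &[(usize, u64)], best_size: usize) -> Vec<usize> {
    // Clamp the upper bound. Without this, `results[..best_size]` panics
    // whenever `best_size > results.len()`.
    let end = best_size.min(results.len());
    results[..end].iter().map(|result| result.0).collect()
}

fn main() {
    // Three landmarks as (index, score) pairs, best first (made-up values).
    let results = [(7, 900), (2, 640), (5, 310)];

    // Asking for fewer landmarks than available returns a prefix.
    assert_eq!(select_best(&results, 2), vec![7, 2]);

    // Asking for more than available returns all of them instead of panicking.
    assert_eq!(select_best(&results, 12), vec![7, 2, 5]);

    println!("{:?}", select_best(&results, 12));
}
```

An equivalent alternative is `results.iter().take(best_size)`, which stops at the end of the slice without an explicit `min`.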