There's a new parking scraper in town
… just, that new scraper is mine. Other people have been doing this for years longer! So let me instead introduce parkendd.de by Offenes Dresden. It’s all open and the archive is available for download.
This post is similar to some of my other data investigation posts in the sense that i simply start coding and see what comes out of it. For a change, all code is included so you do not need to check the jupyter notebook.
Currently (end of 2021), data from 2015 to 2020 is packaged into a big tar.xz file, which i will convert to tar.gz because it’s faster to read in python (tarfile handles both formats, but xz decompresses more slowly).
wget https://parkendd.de/dumps/Archive.tar.xz
xz -dc Archive.tar.xz | gzip -cf9 > parkapi-2020.tar.gz
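Incidentally, python’s tarfile reads both compressions transparently, so the conversion is purely about decompression speed. A quick self-contained check with a toy in-memory archive (the file name and csv content are made up):

```python
import gzip
import io
import lzma
import tarfile

# build a tiny uncompressed tar archive in memory (made-up csv content)
raw = io.BytesIO()
with tarfile.open(fileobj=raw, mode="w") as tf:
    data = b"2016-01-01T00:00:00,42\n"
    info = tarfile.TarInfo("demo-2016.csv")
    info.size = len(data)
    tf.addfile(info, io.BytesIO(data))

# compress it with gzip and with xz - tarfile auto-detects both on read
names = []
for blob in (gzip.compress(raw.getvalue()), lzma.compress(raw.getvalue())):
    with tarfile.open(fileobj=io.BytesIO(blob)) as tf:
        names.append(tf.getnames())
print(names)  # [['demo-2016.csv'], ['demo-2016.csv']]
```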
The xz compression actually seems to be a good choice, because the filesize grows from 200 to 500 megabytes with gz. Not a problematic number, though. However, the uncompressed tar file is about 3.6 gigabytes. I want to use pandas, and experience tells me that a csv of that size will usually not fit into memory once loaded. Even if it does, every operation that copies data will eventually kill the python kernel.
So, i’ll iterate through all files in the archive - each holding one year of data for one parking lot - resample them to averaged 1-hour buckets and gradually merge them into a single DataFrame. I want to look at the years 2016 to 2020, so that’s about 44,000 hour steps for 100+ parking lots, which should fit into anyone’s memory.
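A quick back-of-the-envelope check (my own numbers, derived from the figures above) that the resampled table really fits into memory:

```python
# Rough estimate of the final DataFrame's memory footprint
# (assumed figures: 5 years of hourly rows, ~127 lot columns, float64 cells)
hours = 24 * 365 * 5            # ~43,800 hourly buckets
lots = 127                      # number of parking lot columns
bytes_total = hours * lots * 8  # 8 bytes per float64 cell
print(f"{bytes_total / 1024**2:.0f} MB")  # 42 MB - fits easily
```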
from pathlib import Path
import tarfile
import codecs
import re
from typing import Generator, Tuple, Union, Optional, Callable
from tqdm import tqdm
import requests
import pandas as pd
import numpy as np
import plotly
import plotly.express as px
from plotly.subplots import make_subplots
pd.options.display.max_columns = 30
pd.options.plotting.backend = "plotly"
plotly.io.templates.default = "plotly_dark"
def iter_archive_dataframes(
filename: Union[str, Path],
resampling: str = "1h",
) -> Generator[Tuple[str, pd.DataFrame], None, None]:
# tarfile does handle the gzip automatically
with tarfile.open(filename) as tfp:
# build map of lot_id to available csv filenames
# i ignore 2015 since it's incomplete
lot_id_filenames = dict()
for filename in sorted(tfp.getnames()):
if "backup" not in filename:
                match = re.match(r"(.*)-(20\d\d)\.csv$", filename)
if match:
lot_id, year = match.groups()
if year != "2015":
lot_id_filenames.setdefault(lot_id, []).append(filename)
# for each lot
for lot_id, filenames in lot_id_filenames.items():
# if we have years 2016 - 2020
if len(filenames) == 5:
# build one DataFrame, resampled to 1 hour
dfs = []
for filename in filenames:
fp = tfp.extractfile(filename)
dfs.append(pd.read_csv(
codecs.getreader("utf-8")(fp),
names=["date", "free"]
))
df = pd.concat(dfs, axis=0)
df["date"] = pd.to_datetime(df["date"])
try:
df = df.set_index("date").resample(resampling).mean()
yield lot_id, df
                except Exception:
                    # skip lots whose data cannot be parsed or resampled
                    pass
archive_file = Path("~/prog/data/parking/parkapi-2020.tar.gz").expanduser()
table_file = Path("~/prog/data/parking/parkapi-2020-1h.csv").expanduser()
if not table_file.exists():
big_df = None
for lot_id, df in tqdm(iter_archive_dataframes(archive_file)):
df["lot_id"] = lot_id
df = df.reset_index().set_index(["date", "lot_id"])
if big_df is None:
big_df = df
else:
# append rows and sort by date
big_df = pd.concat([big_df, df]).sort_index()
# x = lot_id, y = date
big_df = big_df.unstack("lot_id")
# drop the "free" label from columns, just keep lot_id
big_df.columns = big_df.columns.droplevel()
# store
big_df.to_csv(table_file)
else:
# read the file if it was already created
big_df = pd.read_csv(table_file)
big_df["date"] = pd.to_datetime(big_df["date"])
big_df.set_index("date", inplace=True)
big_df.columns.name = "lot_id"
big_df
[DataFrame output, truncated: hourly "free" counts with a date index from 2016-01-01 00:00 to 2020-12-31 23:00 and one column per lot, from aalborgcwobel to zuerichparkgarageamcentral; lots not scraped from the start show NaN]
43848 rows × 127 columns
Don’t mind all the NaNs: the city of Aalborg was not scraped throughout the whole period. But Oldenburg looks good. Without further number crunching, let’s do a quick interactive plot, resampled to 1-week buckets:
(big_df
.resample("1w").mean()
.round() # the round saves about 300 Kb of javascript code
.plot(
title=f"average number of free spaces per week ({big_df.shape[1]} lots)",
labels={"value": "number of free spaces", "date": "week"}
)
)
As usual, you can drag and zoom, and hide the individual lots on the right side (double-click to hide all except one).
Now that i’m actually able to look at parking data predating this stupid covid pandemic, i’ll pose two simple research questions:
- Is the lockdown in Germany at the beginning of 2020 visible in the parking lot occupation data?
- Has anything in the parking behaviour changed significantly compared to before?
First of all, when checking the plots above, a few cities have big chunks of missing data, Aalborg for example. It’s a shame, but i’ll exclude them. Moreover, there are smaller gaps. Sometimes the number of free spaces listed on a website gets stuck, or is not listed at all, while other lots on the same site work fine. So i’ll count the number of times that the daily average stays unchanged over three consecutive days, using data since 2018:
df = (
big_df[(big_df.index >= "2018-01-01")]
.resample("1d").mean()
.replace(np.nan, 0) # treat missing values as zero
)
num_equal_days = ((df == df.shift(1)) & (df == df.shift(2))).astype(int).sum()
num_equal_days.sort_values().plot.bar(
title="Number of times that 3 consecutive days have unchanged number of free spaces",
height=600,
)
By visual inspection and comparison with the plot at the top, i decide to cut everything above 100, and also to remove the Zurich lot because it is missing data exactly at the time in question:
big_df = big_df.loc[:, (num_equal_days <= 100) & (big_df.columns != "zuerichparkgarageamcentral")]
big_df.shape
(43848, 53)
Okay, 53 lots remain. Now it would be great to normalize each lot using the total capacity.
big_df.max()
lot_id
aarhusbusgadehuset 97.166667
aarhussalling 700.000000
dresdenaltmarkt 439.000000
dresdenaltmarktgalerie 9868.000000
dresdenanderfrauenkirche 140.000000
dresdencentrumgalerie 3771.416667
dresdenfrauenkircheneumarkt 296.000000
dresdenkaditz 377.000000
dresdenkongresszentrum 26245.000000
dresdenparkhausmitte 432.333333
dresdenpirnaischerplatz 145.000000
dresdenprohlis 192.250000
dresdenreitbahnstrasse 409.916667
dresdensarrasanistrasse 1360.166667
dresdenschiessgasse 999.000000
dresdenterrassenufer 244.000000
dresdentheresienstrasse 159.000000
dresdenwiesentorstrasse 185.333333
dresdenwoehrlflorentinum 323.583333
dresdenworldtradecenter 314.416667
freiburgambahnhof 242.000000
freiburgbahnhofsgarage 224.000000
freiburgkarlsbau 977.000000
freiburgkonzerthaus 453.000000
freiburgmartinstor 142.000000
freiburgrotteck 312.000000
freiburgschlossberg 440.000000
freiburgschwarzwaldcity 436.250000
freiburgzaehringertor 100.000000
ingolstadtcongressgarage 453.000000
ingolstadthallenbad 661.666667
ingolstadthauptbahnhofost 240.000000
ingolstadtmuenster 750.000000
ingolstadtnordbahnhof 231.083333
ingolstadtreduittilly 356.000000
ingolstadttheaterost 595.000000
ingolstadttheaterwest 514.333333
luebeckbackbord 135.000000
luebeckfalkenstrasse 150.000000
luebeckhaerdercenter 212.000000
luebeckhafenbahnhof 108.833333
luebeckkanalstrasse2 216.000000
luebeckkanalstrasse3 197.000000
luebeckkanalstrasse4 284.000000
luebeckkanalstrasse5 45.000000
luebecklastadiep3 34.000000
luebecklastadiep4 17.000000
luebecklastadiep5 253.916667
luebeckleuchtenfeld 750.000000
luebecklindenarcaden 400.000000
luebeckmitte 420.000000
luebeckmuk 367.000000
luebeckradissonhotel 73.000000
dtype: float64
Ah, well, the congress center in Dresden probably never had 26 thousand spaces. I’ll first clamp the dataframe to, let’s say, 2000, just to remove the most obvious outliers
big_df = big_df.clip(0, 2000)
and then ask the ParkAPI for more precise values. The endpoint is https://api.parkendd.de/<City>, which returns static and live data for each lot of a city:
CITIES = ["Aarhus", "Dresden", "Freiburg", "Ingolstadt", "Luebeck"]
lot_infos = dict()
for city in CITIES:
response = requests.get(f"https://api.parkendd.de/{city}")
for lot in response.json()["lots"]:
lot["city"] = city
lot_infos[lot["id"]] = lot
lot_infos["dresdenkongresszentrum"]
{'address': 'Ostra-Ufer 2',
'coords': {'lat': 51.05922, 'lng': 13.7305},
'forecast': False,
'free': 234,
'id': 'dresdenkongresszentrum',
'lot_type': 'Tiefgarage',
'name': 'Kongresszentrum',
'region': 'Ring West',
'state': 'open',
'total': 250,
'city': 'Dresden'}
Well, 26,000 was only two orders of magnitude above the truth.
lot_infos["luebeckbackbord"]
{'coords': {'lat': 53.970161, 'lng': 10.880241},
'forecast': False,
'free': 0,
'id': 'luebeckbackbord',
'lot_type': 'Parkplatz',
'name': 'Backbord',
'region': 'Parkplätze Lübeck',
'state': 'open',
'total': 0,
'city': 'Luebeck'}
Lübeck does not provide a total value. The website that is scraped can be determined from the geojson file of the Lübeck scraper (or from https://api.parkendd.de/). It actually seems to be offline right now. So i’ll use the official numbers where present and the maximum free value otherwise:
official_capacity = pd.Series(
big_df.columns.map(lambda c: lot_infos[c]["total"] or None),
index=big_df.columns
).dropna()
capacity = big_df.max()
capacity[official_capacity.index] = official_capacity
# lot occupation in range [0, 1]
occupied = 1. - (big_df / capacity).clip(0, 1)
(occupied
.groupby(lambda c: lot_infos[c]["city"], axis=1).mean()
.resample("1m").mean() * 100.
).plot(
title="Average lot occupation per month and city",
labels={"value": "occupation percentage", "date": "month"}
)
Alright. There it is. A pretty obvious dent! With the least occupation during April 2020. That’s how i remember it. Kids skating on empty parking lots, no planes in the sky, no stupid shops selling useless things.
For the interested, here’s the same plot for each lot:
(occupied.resample("1m").mean() * 100.).round().plot(
title="Average lot occupation per month and lot",
labels={"value": "occupation percentage", "date": "month"}
)
There are more ways of looking at the occupation data. Instead of calculating the average for each week we can build a histogram of the occupation values. This shows all levels of occupation during each week:
def plot_histogram(
df: pd.DataFrame,
resample: str = "1w",
bins: int = 48,
range: Optional[Tuple[float, float]] = None,
clip: Optional[Tuple[float, float]] = None,
title: Optional[str] = None,
labels: Optional[dict] = None,
):
if range is None:
df_n = df.replace(np.nan, 0)
range = (np.amin(df_n.values), np.amax(df_n.values))
df = pd.concat(
(pd.Series(np.histogram(group, bins=bins, range=range)[0], name=key)
for key, group in df.resample(resample, level="date")),
axis=1
).replace(0, np.nan)
df.index = np.linspace(*range, bins)
if clip is not None:
df = df.clip(*clip)
return px.imshow(
df, origin="lower",
title=title or "Weekly histogram of occupation per lot",
labels=labels or {"y": "occupation percentage", "x": "week"},
color_continuous_scale=["#005", "#08f", "#8ff", "#fff", "#fff"]
)
# ignore values that are exactly zero or one
# as they are usually *bad data* (see below)
plot_histogram(occupied.replace({0: np.nan, 1: np.nan}) * 100.)
So, starting at the end of March 2020, the most frequently reported lot occupation is between 0 and 15%. The situation more or less normalizes in June, and the pattern kind of returns in November.
What are these small horizontal stripes, you ask? And what happened at the beginning of 2018?
The short 2018 outage is probably some internal server problem. You know, disk full, provider problems. There is no indication in the commit history.
To investigate the stripes, i’ll spend a few more megabytes of generated javascript and look at a few lots in particular:
def plot_lot_data(lot_id: str, filter: Optional[Callable] = None):
fig = make_subplots(
rows=2, cols=1,
vertical_spacing=0.1,
shared_xaxes=True,
subplot_titles=["weekly occupation histogram", "number of free spaces per hour"],
)
filter = filter or (lambda df: df)
df = filter(occupied[lot_id])
histo = plot_histogram(df * 100)
fig.add_trace(histo.data[0], row=1, col=1)
fig.add_trace(
filter(big_df[lot_id]).round().plot().data[0].update(showlegend=False),
row=2, col=1,
)
return fig.update_layout(
coloraxis=histo.layout.coloraxis,
title=f"{lot_id} (capacity: {lot_infos[lot_id]['total']})", height=700
)
plot_lot_data("dresdenparkhausmitte")
Obviously, a horizontal stripe means that the free-spaces counter stood still somehow. Except for the stripes at 0% occupation starting at the end of 2019: they are caused by the reported number of free spaces being larger than the reported lot capacity, which is (at the time of writing this article) 280. This garage must have decreased its capacity in the meantime. It would be helpful if the recorded capacity were published in the archive as well. Otherwise we must trust the maximum value, which is 432 for this recording. However, if you zoom in on Oct 1st to 4th 2016, when this maximum was reached, you’ll notice a completely unrealistic-looking period of 200+ free spaces. Also note that the little free-spaces peaks that occur each day around that period are upside down! It may still be possible that some real-life event caused this, but i find it more likely to be some digital mess-up.
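Such over-capacity stretches can be flagged mechanically. Here is a minimal, self-contained sketch with invented free counts (in the real data, the series would come from the archive and the capacity of 280 from the API):

```python
import pandas as pd

# hypothetical hourly free-space counts for one lot (made-up values)
free = pd.Series(
    [250, 270, 290, 310, 300, 260],
    index=pd.date_range("2019-12-01", periods=6, freq="1h"),
)
capacity = 280  # capacity as reported by the API

# hours where the reported free count exceeds the reported capacity
over = free[free > capacity]
print(len(over))  # 3 suspicious hours
```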
plot_lot_data("freiburgambahnhof")
At first glance, the parking lot in Freiburg looks much more lively compared to the one above. But please zoom in on the flat line in winter 2016/17. There is obviously no real car activity, but the number of reported free spaces still changes between zero and 62 each day in a super-regular pattern reminiscent of opening hours. They only publish free places during opening hours. You know, that might make sense for drivers, but it just makes interpreting the data harder. Since the outage in April 2018 they seem to be open 24/7 and data is published continuously. Still, looking closely at some points, it becomes hard to determine, for myself at least, whether this is real car-in car-out activity. The patterns are so regular at times, e.g. from one weekend to the next, that i find them either creepy or not completely trustworthy.
plot_lot_data("ingolstadtreduittilly")
This one’s interesting. The number of cars in Ingolstadt seems to be growing. Although, once again, zooming in on the data reveals some strange jumps of the occupied spaces during the night from one week to the next which do not look like a reflection of real-world events. Or could this actually be gradual steps back towards working life after the first lockdown?
Changes to the capacity, whether real or digital, do affect the number of free spaces. And i start to realize that it’s actually hard work to infer true car activity just from the published number of free spaces.
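To illustrate with invented numbers: if a lot’s capacity silently changes, the free count shifts even though not a single car moved.

```python
import numpy as np
import pandas as pd

# two days of identical (made-up) occupancy: 40 cars at night, 80 by day
occupied_cars = np.tile([40] * 12 + [80] * 12, 2)

# the capacity silently grows from 100 to 150 on the second day
capacity = np.array([100] * 24 + [150] * 24)

free = pd.Series(
    capacity - occupied_cars,
    index=pd.date_range("2020-05-01", periods=48, freq="1h"),
)

# same hour, same number of parked cars, but the free count differs by 50
print(int(free.iloc[24] - free.iloc[0]))  # 50
```

So a pure capacity step is indistinguishable from 50 cars leaving, if all we see is the free count.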
Gradients have the same problem. I thought: let’s just look at the difference to the previous day or something like that. This will at least mitigate the opening-hours problem and some other automatic or purely digital changes that the free-spaces counter might be subject to. Example:
df = big_df["freiburgambahnhof"]
df = df[(df.index >= "2020-01-01") & (df.index < "2020-06-01")]
fig = make_subplots(
rows=4, cols=1,
vertical_spacing=0.02,
shared_xaxes=True,
subplot_titles=[
"free spaces per hour", "difference to previous hour",
"difference to previous day", "difference to previous week"
],
)
fig.add_trace(df.plot().data[0], row=1, col=1)
fig.add_trace(df.diff(1).plot().data[0], row=2, col=1)
fig.add_trace(df.diff(24).plot().data[0], row=3, col=1)
fig.add_trace(df.diff(24*7).plot().data[0], row=4, col=1)
fig.update_layout(
height=1300, showlegend=False,
title="'freiburgambahnhof' free spaces and gradients (2020/01 - 2020/05)"
)
One can see things in these plots; still, it is hard to interpret this data automatically.
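To see why the daily difference helps against periodic artifacts, here is a toy example with entirely made-up data: a perfectly repeating daily pattern cancels out under diff(24), while a one-off jump survives as a block of spikes.

```python
import numpy as np
import pandas as pd

# made-up series: the same 24-hour pattern repeated for a week ...
pattern = np.concatenate([np.full(12, 60.0), np.full(12, 20.0)])
free = pd.Series(
    np.tile(pattern, 7),
    index=pd.date_range("2020-02-03", periods=24 * 7, freq="1h"),
)
# ... plus a one-off +30 jump starting on day four
free.iloc[24 * 3:] += 30

daily_diff = free.diff(24)

# the repeating pattern cancels out; only the jump remains
print(daily_diff.abs().max())          # 30.0
print(int((daily_diff == 30).sum()))   # 24 - the jump is visible for one day
```

Note that a capacity step would survive the daily difference in exactly the same way, so the gradient view flags it but cannot tell it apart from real activity.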
Fine. I’m not a paid scientist, not even a scientist, but i want to scrutinize question #2 a bit: Has anything in the parking behaviour changed significantly compared to before? I mean, apart from the fact that there is less parking anyway. So i’ll try to look at the occupation per hour of day. In my previous parking post i found that there are some hints whether occupation is driven by work & shopping activity or by more leisurely demands.
But first i need to check the opening-hours problem. If a lot lists zero free spaces at some point, that translates to 1.0 in the occupied DataFrame, so i’ll simply count the number of times that a lot has full occupation for each hour of day:
zero_df = pd.concat([
(occupied[occupied.index.hour == hour] == 1).astype(int).sum()
for hour in range(24)
], axis=1)
zero_df.columns.rename("hour of day", inplace=True)
zero_df
hour of day | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
lot_id | ||||||||||||||||||||||||
aarhusbusgadehuset | 6 | 5 | 5 | 5 | 5 | 6 | 6 | 6 | 6 | 5 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
aarhussalling | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
dresdenaltmarkt | 15 | 15 | 15 | 15 | 10 | 9 | 8 | 7 | 9 | 17 | 58 | 151 | 182 | 168 | 160 | 145 | 138 | 149 | 136 | 66 | 26 | 18 | 15 | 15 |
dresdenaltmarktgalerie | 8 | 8 | 8 | 8 | 6 | 5 | 5 | 4 | 5 | 15 | 97 | 210 | 219 | 182 | 156 | 121 | 93 | 63 | 22 | 8 | 8 | 8 | 8 | 8 |
dresdenanderfrauenkirche | 27 | 27 | 26 | 26 | 26 | 29 | 30 | 28 | 29 | 33 | 41 | 36 | 29 | 24 | 27 | 26 | 25 | 29 | 23 | 22 | 24 | 25 | 26 | 28 |
dresdencentrumgalerie | 30 | 30 | 30 | 30 | 26 | 23 | 21 | 18 | 20 | 26 | 56 | 168 | 193 | 154 | 115 | 77 | 54 | 38 | 30 | 30 | 30 | 30 | 30 | 30 |
dresdenfrauenkircheneumarkt | 42 | 42 | 42 | 42 | 36 | 27 | 26 | 22 | 27 | 28 | 40 | 97 | 115 | 107 | 88 | 88 | 94 | 137 | 156 | 135 | 84 | 50 | 44 | 42 |
dresdenkaditz | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 2 | 1 | 1 | 2 | 0 | 0 | 2 | 1 | 2 | 2 | 0 | 0 | 3 | 1 | 1 | 1 | 1 |
dresdenkongresszentrum | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 17 | 45 | 56 | 59 | 53 | 51 | 41 | 26 | 19 | 15 | 24 | 31 | 28 | 24 | 17 | 12 | 6 |
dresdenparkhausmitte | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 2 | 3 | 3 | 2 | 1 | 1 | 0 |
dresdenpirnaischerplatz | 14 | 6 | 6 | 7 | 8 | 9 | 6 | 2 | 9 | 55 | 223 | 386 | 385 | 327 | 306 | 311 | 369 | 514 | 523 | 332 | 113 | 45 | 26 | 18 |
dresdenprohlis | 17 | 13 | 3 | 4 | 3 | 1 | 2 | 1 | 5 | 8 | 14 | 17 | 15 | 17 | 15 | 15 | 19 | 19 | 18 | 16 | 16 | 17 | 17 | 17 |
dresdenreitbahnstrasse | 33 | 31 | 31 | 31 | 23 | 15 | 16 | 13 | 48 | 210 | 407 | 504 | 497 | 446 | 428 | 425 | 419 | 346 | 183 | 78 | 49 | 40 | 37 | 35 |
dresdensarrasanistrasse | 53 | 52 | 54 | 55 | 38 | 16 | 16 | 17 | 57 | 105 | 119 | 123 | 112 | 109 | 105 | 122 | 124 | 132 | 191 | 221 | 168 | 96 | 58 | 54 |
dresdenschiessgasse | 57 | 54 | 54 | 54 | 49 | 39 | 56 | 86 | 188 | 500 | 864 | 1003 | 849 | 626 | 479 | 454 | 521 | 694 | 725 | 509 | 219 | 92 | 68 | 63 |
dresdenterrassenufer | 27 | 27 | 27 | 27 | 24 | 16 | 14 | 19 | 92 | 249 | 435 | 559 | 525 | 369 | 265 | 205 | 184 | 264 | 294 | 224 | 95 | 47 | 35 | 30 |
dresdentheresienstrasse | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 4 | 1 | 1 | 1 | 2 | 1 | 1 | 0 | 1 | 5 | 10 | 3 | 1 | 1 | 1 | 1 |
dresdenwiesentorstrasse | 92 | 91 | 91 | 92 | 86 | 79 | 76 | 63 | 59 | 59 | 72 | 82 | 96 | 98 | 95 | 104 | 121 | 121 | 138 | 156 | 146 | 107 | 96 | 93 |
dresdenwoehrlflorentinum | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 20 | 40 | 31 | 23 | 6 | 3 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
dresdenworldtradecenter | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 9 | 9 | 9 | 5 | 8 | 9 | 9 | 3 | 1 | 4 | 15 | 7 | 0 | 0 | 0 |
freiburgambahnhof | 491 | 600 | 603 | 606 | 599 | 254 | 8 | 8 | 7 | 6 | 7 | 10 | 10 | 7 | 6 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 190 |
freiburgbahnhofsgarage | 49 | 49 | 49 | 49 | 49 | 49 | 49 | 54 | 67 | 71 | 70 | 84 | 80 | 70 | 65 | 61 | 50 | 50 | 50 | 51 | 49 | 49 | 49 | 49 |
freiburgkarlsbau | 7 | 7 | 7 | 7 | 7 | 7 | 6 | 5 | 5 | 5 | 7 | 6 | 8 | 8 | 8 | 8 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
freiburgkonzerthaus | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 7 | 9 | 7 | 7 | 3 | 3 | 3 | 3 | 5 | 3 | 3 | 3 | 3 | 3 |
freiburgmartinstor | 553 | 550 | 551 | 551 | 543 | 285 | 57 | 38 | 47 | 58 | 69 | 79 | 72 | 66 | 58 | 53 | 48 | 45 | 42 | 39 | 37 | 34 | 207 | 466 |
freiburgrotteck | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 7 | 8 | 11 | 5 | 2 | 3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
freiburgschlossberg | 9 | 9 | 9 | 9 | 9 | 9 | 8 | 8 | 8 | 10 | 30 | 37 | 29 | 21 | 20 | 16 | 12 | 10 | 10 | 9 | 9 | 9 | 9 | 9 |
freiburgschwarzwaldcity | 1543 | 1538 | 1544 | 1544 | 1543 | 838 | 302 | 302 | 301 | 299 | 288 | 289 | 289 | 289 | 291 | 295 | 295 | 295 | 295 | 295 | 423 | 1132 | 1545 | 1545 |
freiburgzaehringertor | 603 | 600 | 603 | 603 | 594 | 249 | 4 | 5 | 4 | 6 | 5 | 5 | 6 | 4 | 2 | 2 | 2 | 2 | 334 | 604 | 604 | 605 | 605 | 604 |
ingolstadtcongressgarage | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 17 | 13 | 16 | 7 | 5 | 7 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
ingolstadthallenbad | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 8 | 3 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
ingolstadthauptbahnhofost | 0 | 0 | 0 | 0 | 1 | 7 | 6 | 32 | 29 | 29 | 27 | 7 | 3 | 1 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
ingolstadtmuenster | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
ingolstadtnordbahnhof | 0 | 0 | 0 | 0 | 0 | 0 | 45 | 53 | 67 | 20 | 5 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
ingolstadtreduittilly | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
ingolstadttheaterost | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
ingolstadttheaterwest | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
luebeckbackbord | 1779 | 1774 | 1779 | 1779 | 1779 | 732 | 11 | 12 | 12 | 15 | 19 | 20 | 25 | 30 | 29 | 24 | 18 | 18 | 23 | 21 | 72 | 992 | 1770 | 1775 |
luebeckfalkenstrasse | 1779 | 1774 | 1779 | 1779 | 1779 | 895 | 293 | 293 | 291 | 288 | 290 | 293 | 297 | 293 | 300 | 305 | 303 | 302 | 304 | 311 | 387 | 1220 | 1776 | 1776 |
luebeckhaerdercenter | 1780 | 1775 | 1780 | 1780 | 1780 | 732 | 20 | 25 | 30 | 73 | 113 | 104 | 94 | 91 | 83 | 62 | 47 | 40 | 36 | 34 | 100 | 1101 | 1775 | 1776 |
luebeckhafenbahnhof | 1780 | 1775 | 1780 | 1780 | 1780 | 733 | 17 | 18 | 17 | 20 | 29 | 26 | 23 | 24 | 23 | 23 | 21 | 21 | 24 | 23 | 72 | 995 | 1770 | 1776 |
luebeckkanalstrasse2 | 1537 | 1532 | 1537 | 1537 | 1537 | 653 | 19 | 21 | 20 | 20 | 20 | 17 | 17 | 18 | 21 | 18 | 20 | 22 | 30 | 19 | 61 | 916 | 1534 | 1535 |
luebeckkanalstrasse3 | 1537 | 1532 | 1537 | 1537 | 1537 | 706 | 89 | 104 | 153 | 279 | 328 | 278 | 232 | 203 | 202 | 222 | 286 | 349 | 351 | 274 | 242 | 997 | 1534 | 1535 |
luebeckkanalstrasse4 | 1546 | 1541 | 1547 | 1547 | 1547 | 677 | 47 | 66 | 97 | 132 | 116 | 93 | 85 | 84 | 95 | 102 | 129 | 162 | 177 | 155 | 178 | 976 | 1543 | 1544 |
luebeckkanalstrasse5 | 1549 | 1544 | 1549 | 1549 | 1549 | 676 | 45 | 134 | 307 | 366 | 286 | 203 | 157 | 150 | 118 | 127 | 124 | 129 | 146 | 98 | 110 | 942 | 1547 | 1548 |
luebecklastadiep3 | 1781 | 1775 | 1780 | 1780 | 1780 | 736 | 22 | 33 | 39 | 52 | 73 | 99 | 111 | 115 | 106 | 101 | 96 | 136 | 186 | 146 | 194 | 1112 | 1774 | 1776 |
luebecklastadiep4 | 1785 | 1779 | 1784 | 1784 | 1784 | 1132 | 413 | 288 | 233 | 237 | 284 | 341 | 360 | 363 | 374 | 403 | 432 | 486 | 547 | 627 | 686 | 1255 | 1783 | 1782 |
luebecklastadiep5 | 1781 | 1775 | 1780 | 1780 | 1780 | 779 | 56 | 59 | 61 | 68 | 73 | 84 | 90 | 94 | 96 | 95 | 124 | 173 | 197 | 129 | 168 | 1086 | 1774 | 1776 |
luebeckleuchtenfeld | 1779 | 1774 | 1779 | 1779 | 1779 | 750 | 36 | 37 | 38 | 49 | 70 | 93 | 110 | 104 | 89 | 71 | 69 | 67 | 67 | 66 | 116 | 1005 | 1770 | 1775 |
luebecklindenarcaden | 1776 | 1771 | 1776 | 1776 | 1776 | 739 | 24 | 27 | 29 | 33 | 30 | 28 | 24 | 23 | 21 | 19 | 19 | 16 | 17 | 16 | 78 | 953 | 1686 | 1772 |
luebeckmitte | 1779 | 1774 | 1779 | 1779 | 1779 | 881 | 292 | 292 | 292 | 289 | 276 | 272 | 276 | 273 | 281 | 279 | 279 | 276 | 282 | 288 | 345 | 1204 | 1773 | 1776 |
luebeckmuk | 1780 | 1775 | 1780 | 1780 | 1780 | 754 | 47 | 52 | 57 | 58 | 56 | 56 | 58 | 64 | 72 | 66 | 66 | 77 | 113 | 125 | 161 | 1111 | 1772 | 1775 |
luebeckradissonhotel | 1780 | 1775 | 1780 | 1780 | 1780 | 730 | 13 | 18 | 20 | 19 | 17 | 17 | 17 | 17 | 17 | 13 | 12 | 13 | 15 | 15 | 82 | 1080 | 1773 | 1775 |
Let’s see. Some Dresden lots seem to be particularly busy during the day, but that could also be because the assumed lot capacity is too small during some periods. All the Lübeck lots and one in Freiburg obviously publish zero free spaces when closed, so those are the ones to be careful about when calculating the occupation per hour. Though we have also seen previously that freiburgambahnhof did the same until 2018, and freiburgmartinstor and freiburgzaehringertor look similar.
Plotting the occupation data of the Lübeck lots hints at another problem:
df = occupied.loc[:, occupied.columns.map(lambda c: c.startswith("luebeck"))]
(df[(df.index >= "2018-03-01") & (df.index < "2018-03-08")] * 100).round().plot(
title="Occupation in Lübeck lots (March 2018)",
labels={"value": "occupation %"}
)
At first it looks like the lots open at 6:00 and close at 22:00, but there are these little edges at the corners. More likely they open at 6:30 and close at 20:30 or 21:30, and the final value is lost in the 1-hour average bucketing done at the beginning. Well, if they are closed, their data does not contribute to the leisure activity anyway, so for all lots with a zero-count of more than 400 i’ll simply cut off the hours that are safely outside opening times: before 7:00 and after 20:00.
occupied_open = occupied.copy()
for lot_id in zero_df[zero_df[0] > 400].index:
df = occupied_open.loc[:, lot_id]
occupied_open.loc[:, lot_id] = df[(df.index.hour >= 7) & (df.index.hour <= 20)]
Just to make sure i plot the sample range again for all lots:
df = occupied_open
(df[(df.index >= "2018-03-01") & (df.index < "2018-03-08")] * 100).round().plot(
title="Occupation during opening times (March 2018)",
labels={"value": "occupation %"}
)
As far as i can determine, there are no regular hard edges any more. So then gimme that occupation per hour-of-day plot, individually for every year:
def hours_year_group(df: pd.DataFrame) -> pd.DataFrame:
df = df.copy()
df["year"] = df.index.year
df["hour"] = df.index.hour
return (
df.reset_index().set_index(["date", "year", "hour"])
.unstack("year")
.groupby(level="hour").mean()
.groupby(level="year", axis=1).mean()
)
(hours_year_group(occupied_open) * 100).plot(
title="mean occupation per hour of day",
labels={"value": "occupation %"},
color_discrete_sequence=["#aa4", "#4a4", "#4aa", "#48a", "#f00"]
)
Amazing, isn’t it? No, not really. And the bump at 20:00 does not make much sense. Let’s plot the mean for each city individually:
def per_city_plot(occupied_open: pd.DataFrame, title: Optional[str] = None):
fig = make_subplots(
rows=len(CITIES), cols=1,
vertical_spacing=0.02,
shared_xaxes=True,
subplot_titles=CITIES,
)
for i, city in enumerate(CITIES):
df = occupied_open.loc[:, occupied_open.columns.map(lambda c: c.startswith(city.lower()))]
for trace in (hours_year_group(df) * 100).round().plot(
labels={"value": "occupation %"},
color_discrete_sequence=["#aa4", "#4a4", "#4aa", "#48a", "#f00"],
).data:
if i != 0:
trace.showlegend = False
fig.add_trace(trace, row=i+1, col=1)
fig.update_layout(
title=title or "mean occupation per hour of day", height=1000,
).show()
per_city_plot(occupied_open)
Obviously, Lübeck has parking lots that closed even before 20:00 at some point. Apart from that, the Lübeck plot actually shows something i am looking for: during working hours, the occupation rate is similar to the years before 2020, while the evenings are clearly less occupied.
Freiburg also shows this little peak at 20:00, which is most likely caused by the closing-hours problem and not by party-goers.
Dresden shows a different picture. Seems like in 2020 more cars are simply left standing in the garage during the night. Dresden is quite a nice town with a lot of cool places to visit during the night–if there is no emergency decree, that is.
And as seen previously, Ingolstadt’s number of parked cars is growing over the years. In 2016 people stayed out longer compared to the other years.
Okay, well, please be aware: these are all just my assumptions. To prove anything, each parking lot would have to be inspected individually, and that is not what i want to do in this post. It already contains a couple of megabytes of javascript. I’ll stick with these average statistics, but remember: if the river is half a meter deep on average, that does not mean the cow will not drown when crossing it.
Finally, i’ll just repeat the above plot but for two particular weekdays: Wednesday and Sunday.
per_city_plot(
occupied_open[occupied_open.index.map(lambda d: d.weekday() == 2)],
title="mean occupation per hour of day on Wednesdays",
)
per_city_plot(
occupied_open[occupied_open.index.map(lambda d: d.weekday() == 6)],
title="mean occupation per hour of day on Sundays",
)
Thanks for reading!
Some applause to the parkenDD people and, really, don’t drink and drive!