These Python examples are intended to help you get started with analysing travel time data or passenger data of Verkehrsbetriebe Zürich (VBZ), the public transport operator in the Swiss city of Zurich. The datasets shown below are published as Open Government Data (OGD) on the Open Data Portal.
The R version of the examples can be found here.
Also have a look at showcases.
Do you want to share your output?
Feel free to contact us via [email protected] or the Open Data Portal contact form.
For additional transportation data covering Switzerland (not just Zurich), take a look at opentransportdata.
- Python (version 3.11)
Script
example_passengerdata.py
Input
To run the example analysis, you need to download a .csv file containing passenger data ("REISENDE.csv") as well as the four matching tables listed below from the Open Data Portal. There you can also find additional descriptions (in German only).
- REISENDE.csv: Main table, contains information about the number of passengers etc.
- LINIE.csv: Matching table, contains information about line numbers etc.
- TAGTYP.csv: Matching table, contains information about the validation of timetables etc.
- HALTESTELLEN.csv: Matching table, contains information about stop names etc.
- GEFAESSGROESSE.csv: Matching table, contains information about vehicle capacity.
Output
For example, the total number of passengers per line for 2019, based on the input tables above and computed as in the example script (data frame "pax_line_year"):
Linien_Id | Linienname | Linienname_Fahrgastauskunft | pax_per_year |
---|---|---|---|
4 | 89 | 89 | 4'587'420.00 |
5 | 75 | 75 | 6'154'492.49 |
... | ... | ... | ... |
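The kind of aggregation behind this output can be sketched with pandas as follows. This is a minimal illustration on made-up numbers, not the logic of example_passengerdata.py: the column name "Einsteiger" (boarding passengers) is an assumption, the toy values are invented, and any extrapolation to a yearly total is left out. Only "Linien_Id", "Linienname" and "Linienname_Fahrgastauskunft" are taken from the output table above.

```python
import pandas as pd

# Toy stand-in for REISENDE.csv. "Einsteiger" (boardings) is an ASSUMED
# column name; check the metadata on the Open Data Portal for the real one.
reisende = pd.DataFrame({
    "Linien_Id": [4, 4, 5, 5, 5],
    "Einsteiger": [10.0, 12.5, 7.0, 3.0, 5.5],
})

# Toy stand-in for LINIE.csv with the line-name columns from the output table.
linie = pd.DataFrame({
    "Linien_Id": [4, 5],
    "Linienname": ["89", "75"],
    "Linienname_Fahrgastauskunft": ["89", "75"],
})

# Sum the boardings per line, then attach the line names via the matching table.
pax_line = (
    reisende.groupby("Linien_Id", as_index=False)["Einsteiger"].sum()
    .rename(columns={"Einsteiger": "pax"})
    .merge(linie, on="Linien_Id", how="left")
    [["Linien_Id", "Linienname", "Linienname_Fahrgastauskunft", "pax"]]
)
```

A left merge keeps every line that appears in the passenger table, even if the matching table has no name for it, which makes missing matches visible as empty name fields instead of silently dropping rows.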
Script
example_traveltimedata.py
Input
To run the example analysis, you need to download a .csv file containing travel time data as well as two matching tables from the Open Data Portal. You'll also find additional descriptions there (in German only).
- Fahrzeiten_SOLL_IST_YYYYMMDD_YYYYMMDD.csv: Main table, contains actual travel time raw data (each file contains one week of data).
- Haltepunkt.csv: Matching table, contains information about the GPS position of each stop point.
- Haltestelle.csv: Matching table, contains information about the full stop names.
Output
For example, the percentage of punctual, too early and delayed rides per line, based on the input tables above and computed as in the example script (data frame "punctuality").
According to the punctuality definition of VBZ, a ride is considered on time (punctual) when the actual arrival at a stop is no more than 2 minutes later than the scheduled arrival (otherwise it counts as "delayed") and the actual departure from a stop is no more than 1 minute earlier than the scheduled departure (otherwise it counts as "too early").
line | punctual | too early | delayed |
---|---|---|---|
2 | 95.61179 | 2.66941139 | 1.718800 |
3 | 95.60730 | 2.52430229 | 1.868402 |
... | ... | ... | ... |
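The punctuality definition above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code of example_traveltimedata.py: the function names, the representation of times as seconds since midnight, the order of the two checks and the sample events are all hypothetical.

```python
from collections import Counter

DELAY_LIMIT_S = 120  # arrival more than 2 minutes late -> "delayed"
EARLY_LIMIT_S = 60   # departure more than 1 minute early -> "too early"


def classify(sched_arr, act_arr, sched_dep, act_dep):
    """Classify one stop event; all times are seconds since midnight."""
    if act_arr - sched_arr > DELAY_LIMIT_S:
        return "delayed"
    if sched_dep - act_dep > EARLY_LIMIT_S:
        return "too early"
    return "punctual"


def punctuality_shares(events):
    """Return the percentage of each category for a list of stop events."""
    counts = Counter(classify(*event) for event in events)
    total = sum(counts.values())
    categories = ("punctual", "too early", "delayed")
    if total == 0:
        return {category: 0.0 for category in categories}
    return {category: 100.0 * counts[category] / total for category in categories}


# Made-up events: (scheduled arrival, actual arrival,
#                  scheduled departure, actual departure)
events = [
    (28800, 28830, 28820, 28850),  # arrives 30 s late -> punctual
    (28800, 28930, 28820, 28950),  # arrives 130 s late -> delayed
    (28800, 28700, 28820, 28750),  # departs 70 s early -> too early
    (28800, 28920, 28820, 28940),  # arrives exactly 120 s late -> punctual
]
shares = punctuality_shares(events)
```

With these four made-up events, half are punctual and a quarter each are too early or delayed. To get figures per line, as in the table above, the same classification would be applied to each line's events separately.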
Q: There is no data in the folder "data"
A: That’s correct. The data is not included in this repository. You can find it on the Open Data Portal of the City of Zurich at https://data.stadt-zuerich.ch/.
Be sure to check the links provided above and within the code for further guidance.
Q: I’ve downloaded the datasets, but the script still isn’t running.
A: Please check the path specified in path_to_data = Path(r".../data"). Ensure it points to the directory where the downloaded files are stored on your hard drive.
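A quick way to check the folder before running a script is sketched below. The helper name missing_files is hypothetical and not part of the example scripts; the file names are the ones listed in the passenger data input section. The demonstration uses a temporary folder so that it runs anywhere; in practice you would pass your own path_to_data.

```python
import tempfile
from pathlib import Path

# File names as listed in the passenger data input section.
EXPECTED_FILES = [
    "REISENDE.csv",
    "LINIE.csv",
    "TAGTYP.csv",
    "HALTESTELLEN.csv",
    "GEFAESSGROESSE.csv",
]


def missing_files(path_to_data, expected=EXPECTED_FILES):
    """Return the expected file names that are not found in path_to_data."""
    folder = Path(path_to_data)
    if not folder.is_dir():
        # The folder itself does not exist, so every file is missing.
        return list(expected)
    return [name for name in expected if not (folder / name).is_file()]


# Demonstration with a temporary folder that contains only LINIE.csv.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "LINIE.csv").touch()
    still_missing = missing_files(tmp)
```

An empty result means all expected files were found; otherwise the returned names tell you which downloads are still missing or stored in a different folder.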
Q: Where can I get more information about the datasets?
A: You can find metadata descriptions at the following links:
Q: What’s the difference between a stop point and a stop?
A: A stop (e.g., "Bellevue") can include several stop points, which are essentially platforms. For instance:
- Tram lines 2, 11, and 8 heading towards "Bürkliplatz"
- Tram lines 4 and 15 heading towards "Helmhaus"
Each platform is considered a separate stop point.
Q: Is it possible to derive route information (origin-destination) from the passenger data?
A: No, the data only provides information about the number of boarding passengers and the number of disembarking passengers. It does not include details about the number of people traveling between specific locations (A to B).
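The following sketch illustrates why boarding and disembarking counts alone cannot be turned back into origin-destination flows. The stops A to D and all passenger numbers are made up for illustration; two different travel patterns on the same line produce identical counts at every stop.

```python
from collections import Counter


def stop_counts(od):
    """Derive boardings and disembarkings per stop from an OD table."""
    boardings, disembarkings = Counter(), Counter()
    for (origin, destination), passengers in od.items():
        boardings[origin] += passengers
        disembarkings[destination] += passengers
    return boardings, disembarkings


# Two different hypothetical OD tables on a line A -> B -> C -> D.
# Keys are (origin, destination), values are passenger counts.
od_1 = {("A", "C"): 10, ("B", "D"): 10}
od_2 = {("A", "D"): 10, ("B", "C"): 10}

# Both tables give 10 boardings at A and at B, and 10 disembarkings at C and at D.
same_counts = stop_counts(od_1) == stop_counts(od_2)
```

Here same_counts is True although od_1 and od_2 describe different journeys, so the per-stop counts do not determine who travelled from where to where.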
If anything is unclear or missing, feel free to let us know about any challenges you encounter while working with the OGD datasets or scripts provided by Verkehrsbetriebe Zürich!
Any feedback?
Feel free to contact us via [email protected]!