Posts tagged Artificial intelligence
Forecasting bus ridership with trip planner usage data: a machine learning application
The public transport sector currently pays much attention to environmental impact, costs and traveler satisfaction. Good short-term demand forecasting models can help improve these performance indicators. They can help prevent denied boarding and overcrowding in buses by detecting insufficient capacity beforehand. They could be used to operate more economically by decreasing the frequency or the size of the bus where there is overcapacity. Moreover, they could help operators plan their buses for incidental occasions such as big public events, for which little information is known. Finally, they could be used to reliably inform travelers about the current crowdedness.
This study investigates the usefulness of a new data source: the usage data of a trip planner. In the Netherlands, multiple trip planners are available to help users find optimal (multimodal) journeys. These trip planners require a date, a time, an origin and a destination, which they use to construct multiple alternative journeys from which the user can choose. For this study the data of 9292 was used, the major trip planner in the Netherlands, which covers all public transport modes.
We developed a model for forecasting the number of people boarding and a model for forecasting the number of people alighting at a certain stop. These forecasts are defined at the vehicle-stop level. By summing the forecast number of people boarding and subtracting the forecast number of people alighting at each stop along the trip, the forecast number of passengers on board after a stop is calculated.
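The load calculation described above is a running sum of net boardings. The following is a minimal sketch of that arithmetic, not the study's implementation; the function name and the example counts are made up for illustration.

```python
from itertools import accumulate


def load_after_each_stop(boardings, alightings):
    """Return the number of passengers on board after each stop of one trip.

    Both arguments hold one forecast value per stop, in stop order.
    """
    if len(boardings) != len(alightings):
        raise ValueError("boardings and alightings need one value per stop")
    # Net change at each stop, then a running total along the trip.
    net_changes = [b - a for b, a in zip(boardings, alightings)]
    return list(accumulate(net_changes))


# Hypothetical forecasts for a trip with four stops.
boardings = [12, 5, 0, 3]
alightings = [0, 4, 6, 10]
loads = load_after_each_stop(boardings, alightings)
print(loads)  # [12, 13, 7, 0]
```

Because the load is accumulated, an error in the boarding or alighting forecast at an early stop carries through to every later stop of the same trip.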
We compare five machine learning models: multiple linear regression, decision tree, random forest, neural network and support vector regression with a radial basis function kernel. We compare these models with two simple rules: (1) predict the same number as last week, and (2) predict the historic average. The models are implemented with the scikit-learn library for Python. The data is stored in a PostgreSQL database.
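The two simple rules can be sketched in a few lines. This is an illustration only: it assumes the history is a list of past counts for the same stop and the same weekly trip, ordered from oldest to newest, and the function names and numbers are hypothetical.

```python
from statistics import mean


def predict_last_week(history):
    """Rule 1: repeat the count observed one week earlier."""
    return history[-1]


def predict_historic_average(history):
    """Rule 2: predict the mean of all earlier observations."""
    return mean(history)


# Hypothetical boardings at one stop for the same weekly trip over four weeks.
history = [8, 10, 9, 13]
print(predict_last_week(history))
print(predict_historic_average(history))
```

Baselines like these give the learned models a reference point: a model that cannot beat "same as last week" adds little value.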
The trip planner datasets and the smart card dataset are merged and preprocessed. The resulting dataset is rather sparse: many stops have zero passengers boarding or alighting, or zero requests suggesting that they would do so. Therefore we investigated whether subsampling is needed. From the datasets, useful data is selected and features are constructed. The features are standardized. Different numbers of features are tested; the features are selected by recursive feature elimination using a simple random forest model. Finally, the hyperparameters of the models are tuned and the optimal configurations are stored. The scores are validated using cross-validation.
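The steps above (standardizing, recursive feature elimination with a random forest, hyperparameter tuning, cross-validation) could be combined in scikit-learn roughly as follows. This is a sketch under stated assumptions, not the configuration used in the study: the data is synthetic, support vector regression stands in for any of the five models, and the grid values, fold count and scoring metric are arbitrary choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data: 200 vehicle-stop rows with 8 made-up features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

pipeline = Pipeline([
    # Standardize features to zero mean and unit variance.
    ("scale", StandardScaler()),
    # Recursive feature elimination driven by a small random forest.
    ("select", RFE(RandomForestRegressor(n_estimators=20, random_state=0))),
    # One of the compared models: SVR with a radial basis function kernel.
    ("model", SVR(kernel="rbf")),
])

# Assumed search space: number of kept features and the SVR penalty C.
param_grid = {
    "select__n_features_to_select": [2, 4],
    "model__C": [1.0, 10.0],
}

# 3-fold cross-validation scores every configuration in the grid.
search = GridSearchCV(pipeline, param_grid, cv=3,
                      scoring="neg_mean_absolute_error")
search.fit(X, y)
best_params = search.best_params_
print(best_params)
```

Placing the scaler and the feature selection inside the pipeline means they are refitted on the training part of each fold, so the validation folds do not leak into preprocessing. Whether the study structured it this way is not stated.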
Find more details in the following contributions by Jop van Roosmalen: Transit Data workshop presentation and MSc thesis
Supervised learning: Predicting passenger load in public transport
For many Public Transport (PT) users, overcrowding in PT vehicles substantially reduces travel comfort. However, most online routing applications still do not take crowding-related comfort into account; instead they provide recommendations based on shortest distance, shortest travel time, or number of interchanges.
Including information on crowdedness requires knowledge of the current and future level of passenger load. The increasing amount and complexity of data describing public transport services allows us to better explore methods for detecting and analysing different phenomena in PT operations. Some countries or operators provide the possibility to use Smart Card (SC) data for occupancy prediction. However, SC data is not available in real time, which makes it hard to incorporate into real-time recommendation models. In this work, we show that it is possible to predict the passenger load via supervised learning, eliminating the need for fare collection data beyond the set needed for training.
Find the CASPT presentation by Léonie Heydenrijk-Ottens HERE