Predicting Trip Duration and Distance in Bike-Sharing Systems Using Dynamic Time Warping

Publisher: TAYLOR & FRANCIS INC
Publication Type: Journal Article
Citation: Applied Artificial Intelligence, 2025, 39, (1)
Issue Date: 2025-01-01
Bike-sharing systems (BSSs) have recently become important in urban transportation due to factors such as their cost-effectiveness and environmental considerations. A BSS records an enormous amount of trip data, and this volume raises both challenges and opportunities. Many studies have used bike-sharing datasets to understand the geographical, social, financial, and behavioral aspects of bike use. While the existing literature primarily focuses on predicting the number of rentals and returns per station, this study addresses the complementary problem of predicting trip duration and trip distance. Accurate prediction of ride duration allows a better estimate of bike availability at stations, while distance predictions assist in maintenance planning based on bike usage patterns. The contribution of this work is twofold. First, the proposed method clusters the BSS dataset into $k$ sub-datasets based on the similarity of dataset instances. A separate predictive model is then trained on each sub-dataset, giving $k$ models for the $k$ sub-datasets. The performance of the proposed method, measured as the average score of the $k$ models, is compared with that of a single model trained on the complete dataset for predicting ride duration and trip distance. The rationale for splitting the dataset into $k$ sub-datasets is to group similar patterns in the same sub-dataset. Second, the dynamic time warping (DTW) algorithm is proposed for the clustering step, as DTW has seen very limited use in the BSS literature. The clustering is based on the similarity of the curves representing the number of trips between each pair of bike stations over the hours of the day, and DTW is used to measure the similarity between these station-pair curves. These two contributions complement existing prediction models for rentals and returns, providing a comprehensive solution for BSS optimization. The proposed method was thoroughly evaluated on two real datasets of different sizes. For the two datasets, the best improvements in the predictive model's accuracy are 30% and 42% on average for predicting trip duration and trip distance, respectively.
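To illustrate the curve-similarity step described in the abstract, the following is a minimal sketch of the textbook DTW distance between two trip-count curves. It is not the authors' implementation: the abstract does not state which DTW variant, local cost, or window constraint was used, so the absolute-difference cost, the function name `dtw_distance`, and the sample curves are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|
    and no warping window is applied.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical trip counts per time slot for three station pairs.
morning_peak = [0, 1, 5, 9, 5, 1, 0, 0]
shifted_peak = [0, 0, 1, 5, 9, 5, 1, 0]  # same shape, one slot later
evening_peak = [0, 0, 0, 1, 1, 5, 9, 5]  # different shape

same_shape = dtw_distance(morning_peak, shifted_peak)
different_shape = dtw_distance(morning_peak, evening_peak)
print(same_shape, different_shape)
```

In this sketch the time-shifted curve warps onto the original at zero cost, while the curve with a different shape stays at a positive distance. That property is why DTW can group station pairs by the shape of their daily demand rather than by the exact hour of the peak.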