Peer-Reviewed

A Stacking-Based Ensemble Model for Prediction of Metropolitan Bike Sharing Demand

Received: 14 March 2023     Accepted: 14 April 2023     Published: 20 April 2023
Abstract

Driven by the climate crisis and the improvement of public transportation networks, countries around the world are strongly advocating low-carbon travel. Bike sharing, as a new business model, has a positive impact on the urban environment and on transportation. The ability to estimate hourly bike-sharing demand with high accuracy is essential for a metropolis to offer stable bike rental services. Data mining and predictive analytics can be used to forecast the hourly demand for shared bicycles. The data used in this article comprise the Seoul rented bike count dataset and weather information. This paper discusses various machine learning models for rental bike demand prediction, including Linear Regression, Ridge Regression, Lasso Regression, K-Nearest Neighbor, Random Forest, Decision Tree Regression, Support Vector Machine, and Gradient Boosting Decision Tree. Different parameter tuning methods are applied to improve the performance of the basic predictive models. In addition, redundant and irrelevant features are removed to improve the performance of each basic model. After the individual basic predictors are evaluated, several competent ones are selected to compose a stacking-based ensemble model. Experimental results show that the stacking-based ensemble model outperforms the basic predictive models on all indicators.
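
The stacking idea summarized in the abstract can be illustrated with a minimal Python sketch built on scikit-learn's `StackingRegressor`. This is not the authors' implementation: the synthetic data from `make_regression` merely stands in for the Seoul bike and weather data, and the three base learners, the Ridge meta-learner, the hyperparameters, and the RMSE and R2 metrics are all assumptions chosen for illustration.

```python
# Illustrative sketch only: synthetic data replaces the Seoul bike/weather
# data, and the base learners and meta-learner are assumed, not the paper's.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for hourly demand with weather-like features.
X, y = make_regression(n_samples=600, n_features=8, n_informative=5,
                       noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Level 0: a few "competent" base predictors (an assumed selection).
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("gbdt", GradientBoostingRegressor(random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))),
]

# Level 1: a meta-learner fitted on out-of-fold predictions of the base models.
stack = StackingRegressor(estimators=base_learners,
                          final_estimator=Ridge(alpha=1.0), cv=5)


def evaluate(model):
    """Fit on the training split and return (RMSE, R2) on the test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    return rmse, float(r2_score(y_test, pred))


scores = {name: evaluate(model) for name, model in base_learners}
scores["stacking"] = evaluate(stack)
for name, (rmse, r2) in scores.items():
    print(f"{name:8s} RMSE={rmse:8.2f}  R2={r2:.3f}")
```

Setting `cv=5` means the meta-learner is trained on cross-validated predictions rather than on in-sample fits, which limits leakage from the base models into the second level. Whether the stacked model beats every base model depends on the data, so the printed scores from this toy setup say nothing about the paper's results.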

Published in American Journal of Information Science and Technology (Volume 7, Issue 2)
DOI 10.11648/j.ajist.20230702.13
Page(s) 62-69
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2023. Published by Science Publishing Group

Keywords

Data Mining, Predictive Analytics, Regression Models, Ensemble Models, Bike Sharing Demand

Cite This Article
  • APA Style

    Xinxue Lin, Chang Lu. (2023). A Stacking-Based Ensemble Model for Prediction of Metropolitan Bike Sharing Demand. American Journal of Information Science and Technology, 7(2), 62-69. https://doi.org/10.11648/j.ajist.20230702.13


    ACS Style

    Xinxue Lin; Chang Lu. A Stacking-Based Ensemble Model for Prediction of Metropolitan Bike Sharing Demand. Am. J. Inf. Sci. Technol. 2023, 7(2), 62-69. doi: 10.11648/j.ajist.20230702.13


    AMA Style

    Xinxue Lin, Chang Lu. A Stacking-Based Ensemble Model for Prediction of Metropolitan Bike Sharing Demand. Am J Inf Sci Technol. 2023;7(2):62-69. doi: 10.11648/j.ajist.20230702.13


  • @article{10.11648/j.ajist.20230702.13,
      author = {Xinxue Lin and Chang Lu},
      title = {A Stacking-Based Ensemble Model for Prediction of Metropolitan Bike Sharing Demand},
      journal = {American Journal of Information Science and Technology},
      volume = {7},
      number = {2},
      pages = {62-69},
      doi = {10.11648/j.ajist.20230702.13},
      url = {https://doi.org/10.11648/j.ajist.20230702.13},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajist.20230702.13},
     year = {2023}
    }
    


  • TY  - JOUR
    T1  - A Stacking-Based Ensemble Model for Prediction of Metropolitan Bike Sharing Demand
    AU  - Xinxue Lin
    AU  - Chang Lu
    Y1  - 2023/04/20
    PY  - 2023
    N1  - https://doi.org/10.11648/j.ajist.20230702.13
    DO  - 10.11648/j.ajist.20230702.13
    T2  - American Journal of Information Science and Technology
    JF  - American Journal of Information Science and Technology
    JO  - American Journal of Information Science and Technology
    SP  - 62
    EP  - 69
    PB  - Science Publishing Group
    SN  - 2640-0588
    UR  - https://doi.org/10.11648/j.ajist.20230702.13
    VL  - 7
    IS  - 2
    ER  - 


Author Information
  • School of Resource and Environmental Sciences, Wuhan University, Wuhan, China

  • School of Urban Design, Wuhan University, Wuhan, China
