Yujin Park, Yoonhee Choi, Kyongwon Kim, Jae Keun Yoo.
Abstract
We investigate regional features near subway stations using a clustering method called funFEM and propose a two-step procedure for predicting subway passenger flow that incorporates the geographical information from the cluster analysis into functional time series prediction. A massive smart card transaction dataset is used to analyze the daily number of passengers at each station of the Seoul Metro. First, we cluster the stations into six categories according to their passenger transport patterns. Then, we forecast the daily number of passengers for each cluster. By comparing our predictions with the actual number of passengers, we demonstrate that prediction based on the clustering results is more accurate than prediction that ignores the regional properties. The results of our data-driven approach can be applied to improve subway service planning and to help limit the spread of infectious diseases, since congestion can be reduced by adjusting train intervals to the passenger flow. Furthermore, the predictions can be used in planning a 'smart city' that seeks shorter commuting times, comfortable ridership, and environmental sustainability.
Year: 2022 PMID: 35177774 PMCID: PMC8854707 DOI: 10.1038/s41598-022-06767-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. The pattern of the averaged daily number of passengers entering the subway station for 224 stations from January 6, 2020, to January 12, 2020 (black lines) with their sample mean (red lines).
Figure 2. The pattern of the averaged daily number of passengers leaving the subway station for 224 stations from January 6, 2020, to January 12, 2020 (black lines) with their sample mean (red lines).
Figure 3. Smoothed curves of the averaged daily number of passengers for four situations at 1-h intervals, using a Fourier basis (gray lines), and their sample mean functions (red lines).
Figure 4. The contour plot of the covariance functions for the daily number of passengers in four situations.
Figure 5. The first two eigenfunctions from the PCA result for four scenarios.
The execution time for FPCA in seconds.
| DN | DF | KN | KF |
|---|---|---|---|
| 0.018 | 0.011 | 0.024 | 0.018 |
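As a rough illustration of the steps behind Figures 3 to 5, the following Python sketch smooths hourly counts with a Fourier basis by least squares and then extracts the leading eigenfunctions from the sample covariance evaluated on the hourly grid. It is not the authors' implementation: the function names, the number of harmonics, the 24-hour period, and the grid-based eigendecomposition (which ignores quadrature weights) are assumptions made for illustration only.

```python
import numpy as np

def fourier_basis(t, n_harmonics, period=24.0):
    """Design matrix: a constant plus sine/cosine pairs (hypothetical helper)."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

def smooth_curves(counts, t, n_harmonics=5):
    """Least-squares Fourier smoothing; counts has shape (n_curves, n_times)."""
    basis = fourier_basis(t, n_harmonics)
    coef, *_ = np.linalg.lstsq(basis, counts.T, rcond=None)
    return (basis @ coef).T

def fpca(curves, n_components=2):
    """Discretized FPCA: eigendecomposition of the sample covariance on the grid."""
    mean = curves.mean(axis=0)
    centered = curves - mean
    cov = centered.T @ centered / (len(curves) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest
    return mean, eigvals[order], eigvecs[:, order].T

# Usage on synthetic data (assumed, not the Seoul Metro dataset).
hours = np.arange(24)
rng = np.random.default_rng(0)
raw = 50.0 + 10.0 * rng.normal(size=(30, 24))
smoothed = smooth_curves(raw, hours)
mean_curve, variances, eigenfunctions = fpca(smoothed)
```

The matrix `cov` corresponds to the kind of covariance surface shown in Figure 4, and the rows of `eigenfunctions` play the role of the curves in Figure 5.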
Figure 6. The results of the clustering analysis for all 224 stations (left) and the mean function for each cluster (right) in four cases.
Figure 7. Scatter plot of the number of passengers getting on and off the subway at 8 A.M.
Figure 8. Scatter plot of the number of passengers getting on and off the subway at 6 P.M.
Figure 9. The location of subway stations in Seoul based on the results of the clustering analysis. (Map data are from Google and SK telecom. The map was annotated using the ggmap package (https://cran.r-project.org/web/packages/ggmap/index.html) in R 4.0.5 [21].)
Figure 10. The composition ratio of clusters by subway lines.
The list of stations for six clusters.
| Cluster | Station |
|---|---|
| 1 | Garak Market, Gyeryong, Gaehwasan, Geoyeo, National Police Hospital, Godeok, Korea University, Gwangheungchang, Gusan, Geumho, Gildong, Gimpo Airport, Namtaeryeong, Noksapyeong, Daecheong, Daeheung, Dogok, Dorimcheon, Dongnimmun (Independence) Gate, Dokbawi, Dongjak, Dunchon-dong, Magok, Majang, Macheon, Mongchon toseong, Muakjae, Banpo, Bangi, Sangwolgok, Seokchon, Songjeong, Singeumho, Singil, Sinnae, Sindap, Sinseoldong, Aeogae, Yaksu, Yangcheon-gu Office, Yangpyeong, Yeokcheon, Yeongdeungpo Market, Ogeum, Olympic Park, Yongdap, Yongdu, Yongmasan, World Cup Stadium, Ichon, Jamwon, Sports Complex, Changsin, Cheonggu, Taereung Entrance, Hangnyeoul, Hanyang University, Haengdang, Hyochang Park |
| 2 | Gangnam, Guro Digital Complex, Seoul National University Entrance, Sindorim, Sillim, Hongdae Entrance |
| 3 | Gasan Digital Complex, Express Bus Terminal, Gwanghwamun, Nambu Bus Terminal, Samsung, Seoul, Seolleung, Seongsu, City Hall, Sinsa, Sinchon, Apgujeong, Yangjae, Yeouido, Yeoksam, Euljiro Entrance, Jamsil, Jonggak, Hakdong, Hyehwa |
| 4 | Gangnam-gu Office, Gyeongbokgung, Gongdeok, Gyodae, Namguro, Naebang, Nonhyeon, Dongdaemun, Dongdaemun History and Culture Park, Dongdaemun Entrance, Dongmyo, Digital Media City, Ttukseom, Mapo, Mangwon, Maebong, Myeongdong, Mullae, Munjeong, Balsan, Bangbae, Sangsu, Seodaemun, Seocho, Suseo, Sookmyung Women’s University, Sinyongsan, Anguk, Anam, Children’s Grand Park, Yeouinaru, Yeongdeungpo-gu Office, Euljiro 3-ga, Euljiro 4-ga, Ewha Womans University, Isu, Itaewon, Irwon, Jamsil Naru, Jangji, Jang-han-pyeong, Jongno 3-ga, Jongno 5-ga, Cheongdam, Chungmuro, Hangangjin, Hoehyeon |
| 5 | Gangdong, Gangbyeon (Dongseoul Bus Terminal), Konkuk University, Guui, Gupabal, Gireum, Kkachisan, Nakseongdae, Nowon, Dangsan, Daerim, Mokdong, Miasageori, Bongcheon, Sadang, Sanggye, Sangbong, Sangil-dong, Sungshin Women’s University, Suyu, Sindaebang, Ssangmun, Amsa, Yeonsinnae, Omokgyo, Eungam, Jamsil saenae, Jegi-dong, Chang-dong, Cheonho, Cheongnyangni, Chongsin University, Hagye, Hapjeong, Hongje, Hwagok |
| 6 | Gangdong-gu Office, Gongneung, Gwangnaru, Gunja, Gubeundari (Gangdong Community Center), Namseong, Nokbeon, Dapsimni, Danggogae, Daechi, Dobongsan, Dolgoch, Ttukseom Resort, Madeul, Mapo-gu Office, Meokgol, Myeonmok, Myeongil, Mia, Bokjeong, Bonghwasan, Bulgwang, Sagajeong, Sangdo, Sangwangsimni, Saejeol, Seokgye, Suraksan, Soongsil University entrance, Sindang, Sindaebangsamgeori, Sinjeong, Sinjeong negeori, Sinpung, Achasan, Ahyeon, Oksu, Onsu, Wangsimni, Ujangsan, Wolgok, Jangseungbaegi, Junggye, Junghwa, Jeungsan, Cheonwang, Hansung University entrance, Hwarang-dae |
Figure 11. A comparison of the predicted and actual daily number of passengers for cluster 1.
Figure 12. A comparison of the predicted and actual daily number of passengers for cluster 2.
Figure 13. A comparison of the predicted and actual daily number of passengers for cluster 3.
Figure 14. A comparison of the predicted and actual daily number of passengers for cluster 4.
Figure 15. A comparison of the predicted and actual daily number of passengers for cluster 5.
Figure 16. A comparison of the predicted and actual daily number of passengers for cluster 6.
Figure 17. A comparison of the predicted and actual daily number of passengers without considering regional properties.
The execution time for prediction in seconds.
| Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 |
|---|---|---|---|---|---|
| 0.078 | 0.064 | 0.069 | 0.070 | 0.066 | 0.067 |
The averaged RMSE of our approach.
| Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Total |
|---|---|---|---|---|---|---|
| 4.889 | 59.347 | 31.449 | 17.446 | 20.488 | 11.521 | 145.14 |
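For readers who want to check the arithmetic, the six per-cluster values in the table above add up to the Total column (145.14). The short Python sketch below reproduces that sum and shows the standard RMSE definition. How the authors averaged the RMSE within each cluster is not stated in this record, so the `rmse` helper is an assumed, generic formula rather than their exact computation.

```python
import numpy as np

def rmse(actual, predicted):
    """Standard root mean squared error (assumed definition)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Per-cluster averaged RMSE values copied from the table above.
cluster_rmse = {1: 4.889, 2: 59.347, 3: 31.449, 4: 17.446, 5: 20.488, 6: 11.521}

# The Total column is the sum of the per-cluster values.
total = round(sum(cluster_rmse.values()), 2)
print(total)
```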
The averaged RMSE for the averaged number of passengers in [22].
| Cluster 1 | Cluster 2 | Cluster 3 | Total |
|---|---|---|---|
| 66 | 48 | 32 | 146 |
The averaged RMSE for the maximum number of passengers in [22].
| Cluster 1 | Cluster 2 | Cluster 3 | Total |
|---|---|---|---|
| 234 | 185 | 115 | 534 |
Figure 19. A comparison of the predicted and actual daily number of passengers for cluster 1 based on the test dataset.
Figure 20. A comparison of the predicted and actual daily number of passengers for cluster 2 based on the test dataset.
Figure 21. A comparison of the predicted and actual daily number of passengers for cluster 3 based on the test dataset.
Figure 22. A comparison of the predicted and actual daily number of passengers for cluster 4 based on the test dataset.
Figure 23. A comparison of the predicted and actual daily number of passengers for cluster 5 based on the test dataset.
Figure 24. A comparison of the predicted and actual daily number of passengers for cluster 6 based on the test dataset.
Figure 18. A diagram of a two-step procedure.
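The two-step idea (cluster the stations first, then forecast within each cluster) can be sketched in Python as follows. This is a deliberately simplified stand-in and not the authors' method: plain k-means on station mean curves replaces funFEM, a cluster mean curve replaces the functional time series forecast, and the data are synthetic. All function names and parameters are hypothetical.

```python
import numpy as np

def farthest_point_init(X, k):
    """Deterministic seeding: start at X[0], then repeatedly add the farthest point."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    return np.array(centers, dtype=float)

def kmeans(X, k, n_iter=20):
    """Plain Lloyd iterations (stand-in for funFEM, which is model-based)."""
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def two_step_forecast(train, k):
    """train has shape (stations, days, hours); returns labels and forecasts."""
    station_mean = train.mean(axis=1)            # step 1 input: one curve per station
    labels = kmeans(station_mean, k)
    forecast = np.empty_like(station_mean)
    for j in np.unique(labels):                  # step 2: naive per-cluster forecast
        forecast[labels == j] = station_mean[labels == j].mean(axis=0)
    return labels, forecast

def rmse(actual, predicted):
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def make_synthetic(seed=0, n_per_group=10, n_days=8):
    """Two made-up station types: a morning-peak and an evening-peak pattern."""
    rng = np.random.default_rng(seed)
    t = np.arange(24)
    morning = 100.0 * np.exp(-0.5 * ((t - 8) / 1.5) ** 2)
    evening = 100.0 * np.exp(-0.5 * ((t - 18) / 1.5) ** 2)
    patterns = np.vstack([np.tile(morning, (n_per_group, 1)),
                          np.tile(evening, (n_per_group, 1))])
    noise = rng.normal(0.0, 2.0, size=(2 * n_per_group, n_days, 24))
    return patterns[:, None, :] + noise

data = make_synthetic()
train, test = data[:, :-1, :], data[:, -1, :]    # hold out the last day
labels, clustered = two_step_forecast(train, k=2)
pooled = np.broadcast_to(train.mean(axis=(0, 1)), test.shape)
print(rmse(test, clustered), rmse(test, pooled))
```

On this toy data the per-cluster forecast has a lower RMSE than the pooled forecast that ignores station type, which mirrors the comparison the abstract describes between clustered prediction and prediction without regional properties.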