Abstract
Nowadays, large volumes of data are stored across many industries. In scientific studies, data mining is one of the most important tools for analyzing data and studying the relationships within it. The urban rail transport industry also uses this tool, and applications of data mining techniques in this industry have increased significantly over time. In this regard, clustering metro stations and extracting and clustering passenger flows play an important role in managing the subway system more efficiently. This thesis first analyzes and characterizes passenger flows in the Tehran subway, then investigates the role of temporal and spatial parameters in the quality of station clustering, and finally applies a practical method to extract passenger flow patterns. First, passenger flows are divided into three categories (working days, Thursdays and Fridays) and compared in terms of the daily patterns of passenger arrivals and departures. The volume and patterns of passenger arrivals and departures in the month of Ordibehesht of 1398, 1399 and 1400 (Solar Hijri calendar) are also compared across the days of the month. In the next step, two parameters are examined: the maximum distance of points of interest from metro stations, and the time interval over which passenger volume is aggregated. Their effect on the final clustering quality of the k-means algorithm and the Gaussian mixture model is investigated in interaction with the number of clusters. Based on both the silhouette score and the Calinski-Harabasz score, the k-means algorithm gave the best clustering quality with 6 clusters, a one-hour aggregation interval and a maximum distance of 1000 m.
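The cluster-count selection described in this paragraph can be illustrated with a minimal sketch. It is not the thesis implementation: the data are synthetic 2-D points standing in for station feature vectors, the helper names `kmeans` and `silhouette` are hypothetical, and only k-means with the silhouette score is shown (the Gaussian mixture model and the Calinski-Harabasz score are omitted for brevity). It uses only the Python standard library.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (between -1 and 1)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == c and j != i]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[labels[i]]  # mean distance within the point's own cluster
        if a is None:             # singleton cluster contributes 0
            continue
        b = min(v for c, v in mean_dist.items() if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Synthetic stand-in for station features (assumption, not thesis data).
rng = random.Random(42)
true_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
profiles = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
            for cx, cy in true_centers for _ in range(15)]

# Score each candidate number of clusters and keep the best one.
scores = {k: silhouette(profiles, kmeans(profiles, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

In a full parameter study the same loop would be repeated for each candidate distance threshold and aggregation interval, with the feature vectors rebuilt for each combination.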
Finally, in the extraction and clustering of passenger flows, clustering has been performed in a crisp and inflexible manner, which can reduce the efficiency of decision-making by urban transportation management in critical situations. After clustering is applied to the origin-destination pairs, any pair that does not belong to a cluster but is a spatial neighbor of an origin-destination pair in a particular cluster is relatively assigned to that cluster, so that in critical situations these neighboring origin-destination pairs are also taken into account in decisions.
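A minimal sketch of this relative assignment is given below. The abstract does not define the spatial neighborhood condition, so the rule used here is an assumption: two origin-destination pairs are neighbors when their origins lie within a radius of each other and their destinations do too. The function names, station names, coordinates and radius are all hypothetical.

```python
import math

def are_neighbors(od_a, od_b, coords, radius_m):
    """Assumed rule: origins and destinations are each within radius_m."""
    (oa, da), (ob, db) = od_a, od_b
    return (math.dist(coords[oa], coords[ob]) <= radius_m
            and math.dist(coords[da], coords[db]) <= radius_m)

def attach_unclustered(labels, coords, radius_m):
    """Map each unclustered OD pair to the clusters of its spatial neighbors.

    labels: {(origin, destination): cluster id, or None if unclustered}
    Returns only the pairs that gain at least one relative membership.
    """
    clustered = {od: c for od, c in labels.items() if c is not None}
    relative = {}
    for od, c in labels.items():
        if c is not None:
            continue  # already a full member of a cluster
        near = {cid for other, cid in clustered.items()
                if are_neighbors(od, other, coords, radius_m)}
        if near:
            relative[od] = near
    return relative

# Hypothetical stations with planar coordinates in metres.
coords = {"A": (0, 0), "B": (300, 0), "C": (5000, 0),
          "D": (5200, 100), "E": (9000, 9000)}
labels = {("A", "C"): 0, ("B", "D"): None, ("E", "A"): None}

relative = attach_unclustered(labels, coords, radius_m=500)
print(relative)  # {('B', 'D'): {0}}
```

Here the pair (B, D) stays outside cluster 0 as a full member but is flagged as related to it, while (E, A) has no nearby clustered pair and remains unassigned.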