The construct of second language (L2) utterance fluency is typically operationalized through individual temporal features. In natural speech, however, fluency (or disfluency) is often characterized by the clustering of multiple temporal features, which together reveal the speaker’s effort in speech production or disfluency recovery. In this study, we explore the co-occurrence patterns of disfluency features in L2 speech and their associations with speakers’ L2 oral proficiency. We first segmented all speech samples into Analysis of Speech units (AS-units). Within each AS-unit, six individual fluency features were manually coded, standardized, and then subjected to a hierarchical-based k-means cluster analysis to examine their co-occurrence patterns. The results revealed four distinct disfluency clusters. A subsequent qualitative analysis of the disfluencies in each cluster showed that the clusters differed in their distributional patterns, disfluency makeup, and communicative functions. Additionally, the proportions of the different disfluency clusters were significantly influenced by speakers’ proficiency level, first language background, and the interaction between the two. These findings carry implications for L2 speaking research in general, shedding light on the intricate nature of speech fluency and presenting an alternative approach to the operationalization of this multidimensional construct.
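
The standardize-then-cluster step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "hierarchical-based k-means" refers to the common two-step procedure in which a Ward-linkage hierarchical solution supplies the starting centroids for k-means. The function names, the toy data, and the six unnamed feature columns are hypothetical; k = 4 simply mirrors the four clusters reported.

```python
import math

def zscore(rows):
    """Standardize each feature column to mean 0 and (population) SD 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has SD 0; fall back to 1.0 to avoid dividing by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def centroid(points):
    return [sum(c) / len(points) for c in zip(*points)]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_seeds(rows, k):
    """Merge groups by Ward's criterion until k remain; return their centroids."""
    groups = [[r] for r in rows]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                ni, nj = len(groups[i]), len(groups[j])
                # Increase in within-group sum of squares if i and j were merged.
                cost = ni * nj / (ni + nj) * sqdist(centroid(groups[i]),
                                                    centroid(groups[j]))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return [centroid(g) for g in groups]

def kmeans(rows, seeds, max_iter=100):
    """Refine the seed centroids with standard k-means iterations."""
    centers = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)), key=lambda c: sqdist(r, centers[c]))
                      for r in rows]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels, centers

# Toy data (invented for illustration): 12 AS-units x 6 coded features,
# built as four prototypes with a small offset added to each.
prototypes = [
    [0, 0, 0, 0, 0, 0],
    [5, 5, 0, 0, 0, 0],
    [0, 0, 5, 5, 0, 0],
    [0, 0, 0, 0, 5, 5],
]
units = [[v + d for v in p] for p in prototypes for d in (-0.2, 0.0, 0.2)]

standardized = zscore(units)
labels, centers = kmeans(standardized, ward_seeds(standardized, k=4))

# Proportion of AS-units falling in each cluster.
proportions = {c: labels.count(c) / len(labels) for c in sorted(set(labels))}
print(labels)
print(proportions)
```

In a real analysis, per-speaker cluster proportions computed this way would be the outcome variable examined against proficiency level and first language background. The number of clusters would also be chosen from the hierarchical solution rather than fixed in advance as it is here.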