
Dialectometry-based classification of the Central–Southern Italian dialects

Published online by Cambridge University Press:  29 April 2024

Antonio Sciarretta*
Affiliation: Independent scholar
*Corresponding author: Antonio Sciarretta; Email: antonio.sciarretta@ifpen.fr

Abstract

This paper provides a new classification of Central–Southern Italian dialects using dialectometric methods. All varieties considered are analyzed and cast in a data set where homogeneous areas are evaluated according to a selected list of phonetic features. Using numerical evaluation of these features and the Manhattan distance, a linguistic distance rule is defined. On this basis, the classification problem is formulated as a clustering problem, and a k-means algorithm is used. Additionally, an ad-hoc rule is set to identify transitional areas, and silhouette analysis is used to select the most appropriate number of clusters. While meaningful results are obtained for each number of clusters, an eight-group classification emerges as the most appropriate. As the results suggest, this classification is less subjective, more precise, and more comprehensive than traditional ones based on selected isoglosses.
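The pipeline summarized in the abstract (numerical feature coding, Manhattan distance, k-means clustering, silhouette analysis) can be illustrated with a minimal Python sketch. It is not the author's implementation: the toy feature vectors, the function names, the mean-based center update, and the explicit choice of initial centers are all assumptions made for illustration, and the ad-hoc rule for transitional areas is not reproduced because the abstract does not specify it.

```python
def manhattan(a, b):
    """Manhattan (city-block) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: manhattan(p, centers[k]))
            for p in points]


def kmeans_manhattan(points, init_centers, max_iter=100):
    """Lloyd-style loop using Manhattan distance for the assignment step.

    Assumption: centers are updated as coordinate-wise means, and an
    empty cluster simply keeps its previous center.
    """
    centers = [list(c) for c in init_centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new_centers.append(
                [sum(col) / len(col) for col in zip(*members)] if members else old
            )
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    return assign(points, centers), centers


def silhouette(points, labels):
    """Mean silhouette coefficient under the Manhattan distance."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(manhattan(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # convention for singleton or single-cluster cases
            continue
        a = sum(own) / len(own)                                # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: four "homogeneous areas", three numerically coded features each.
areas = [[0, 0, 1], [0, 1, 1], [2, 2, 0], [2, 2, 1]]
labels, centers = kmeans_manhattan(areas, init_centers=[areas[0], areas[2]])
score = silhouette(areas, labels)
print(labels, round(score, 3))  # [0, 0, 1, 1] 0.746
```

To choose the number of clusters in the spirit of the abstract, one would repeat the clustering for several values of $K$ and compare the resulting silhouette coefficients, preferring the value with the highest score.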

Information

Type
Articles
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Table 1. Set of phonetic features considered and their possible outcomes


Figure 1. Localization of the homogeneous areas (circles). Each color corresponds to one of the administrative regions. Boundaries between regions are drawn.


Table 2. List of province codes. Regional capital cities in bold


Table 3. Clustering results as a function of the number of clusters $K$


Figure 2. Silhouette coefficient as a function of the number of groups


Figure 3. HA clustered in $K = 8$ groups: schematic representation. Each color corresponds to one group: blue (Perimedian), purple (Median), pink (Abruzzese), red (Campanian-Molisan), orange (Apulian), yellow (Irpino-Lucanian), green (Cosentino), light blue (Salentino-Calabrian)


Figure 4. HA clustered in $K = 8$ groups: a linguistic map with actual group boundaries. Colors of groups correspond to those of Figure 3.


Figure 5. HA clustered in $K = 8$ groups with second-best clusters: schematic representation. Core clusters are identified by the left-half color of the circles; second-best clusters in transitional areas are identified by the right-half color.


Figure 6. HA clustered in $K = 8$ groups with second-best clusters: a linguistic map with actual group boundaries and transitional areas (hatched)

Supplementary material: File

Sciarretta supplementary material (File, 26.4 KB)