Mapping Linguistic Variation with GIS
Since the 1980s, geographers have achieved major advances in two areas: the development of powerful Geographic Information Systems (GIS) software and the development of statistical models for the analysis of geographical patterns of data. Modern GIS and its incorporated spatial analysis tools allow sophisticated and efficient analysis of spatial data by researchers in many fields (ESRI 2015). Although the spatial variation of language has long been of interest to linguists, researchers have made little use of the power of GIS to address hypotheses regarding spatial variation of language and correlated physical and social variables. Linguists have applied GIS technology in constructing language atlases, including recent online atlases; however, the steps of aggregating and analyzing the data using GIS are seldom discussed in detail. In addition, many linguistic studies that incorporate maps created with GIS treat them only as graphics, omitting the spatial aspect of the data. Consequently, they neglect space and spatiality (i.e., characteristics of geographical space and the way people inhabit it), two factors that have been found to be important in language variation and change (Britain 2010).
To address these issues, I have proposed a set of GIS tools and streamlined techniques that researchers can use to study spatial patterns in sociolinguistic data. To date, GIS has been used successfully for perceptual dialectology studies in Evans (2011), Jeon (2012), Cukor-Avila et al. (2012), Montgomery (2012), Montgomery and Stoeckle (2013), and Jeon and Cukor-Avila (2015). I specifically examined the advantages of using GIS for: (1) aggregating and visualizing complex data sets and their geographic distribution; (2) exploring and analyzing subsets of data sets; and (3) transforming linguistic data into user-friendly resources such as maps for publications and presentations. This approach integrates the geographical distribution of linguistic variation with the influence of social factors, while providing a way to assess trends and relationships across linguistic variables. Furthermore, the data can be analyzed across many linguistic variables and subsets of respondents, as well as for individual linguistic variables and speakers. GIS tools enable researchers using various types of sociolinguistic data (perceptual dialectology, sociophonetic, morphosyntactic, lexical, etc.) to validate empirical evidence and improve mapping of dialects as well as to study differences in the geographical distributions of linguistic variables.
I have organized several GIS for Linguistics workshops that give researchers the opportunity to try these methods themselves by applying them to provided datasets. In these workshops, I show how to install and use open-source GIS software, digitize and aggregate map data, explore and stratify results by linguistic variables and other subsets, perform statistical queries, and create composite maps to visualize the spatial patterns in linguistic data using ArcGIS.
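The aggregation and stratification steps can be illustrated with a minimal, library-free Python sketch. It is not the workshop procedure, which relies on GIS software. It only shows the idea behind a composite map: digitized respondent regions are laid over a grid, and each cell records how many regions in a given subset cover it. The function names, the grid size, the "younger"/"older" grouping, and the sample polygons are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices in drawing order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def composite_counts(responses, nx, ny, cell=1.0):
    """Per group, count how many drawn regions cover each grid cell centre.

    `responses` is a list of (group, polygon) pairs; the result maps each
    group to an ny-by-nx grid of overlap counts.
    """
    counts = defaultdict(lambda: [[0] * nx for _ in range(ny)])
    for group, polygon in responses:
        grid = counts[group]
        for row in range(ny):
            for col in range(nx):
                cx, cy = (col + 0.5) * cell, (row + 0.5) * cell
                if point_in_polygon(cx, cy, polygon):
                    grid[row][col] += 1
    return dict(counts)


# Hypothetical digitized regions, stratified by a made-up age group.
responses = [
    ("younger", [(0, 0), (2, 0), (2, 2), (0, 2)]),
    ("younger", [(1, 1), (3, 1), (3, 3), (1, 3)]),
    ("older", [(0, 0), (4, 0), (4, 1), (0, 1)]),
]
counts = composite_counts(responses, nx=4, ny=4)

for group, grid in counts.items():
    print(group)
    for row in grid:
        print(" ", row)
```

Cells with higher counts mark areas where more respondents in that subset drew overlapping regions, which is the quantity a composite map shades. In practice the same overlay would be run on georeferenced polygons inside a GIS rather than on a toy grid.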