A Theory of Linguistic Individuality for Authorship Analysis

Andrea Nini

doi:10.1017/9781108974851

References

Anthonissen, L. and Petré, P. (2019) ‘Grammaticalization and the linguistic individual: New avenues in lifespan research’, Linguistics Vanguard, 5(s2), pp. 20180037. Available at: https://doi.org/10.1515/lingvan-2018-0037.

Antonia, A., Craig, H., and Elliott, J. (2014) ‘Language chunking, data sparseness, and the value of a long marker list: Explorations with word n-grams and authorial attribution’, Literary and Linguistic Computing, 29(2), pp. 147–63. Available at: https://doi.org/10.1093/llc/fqt028.

Argamon, S. (2008) ‘Interpreting Burrows’s Delta: Geometric and probabilistic foundations’, Literary and Linguistic Computing, 23(2), pp. 131–47. Available at: https://doi.org/10.1093/llc/fqn003.

Argamon, S. E. (2018) ‘Computational forensic authorship analysis: Promises and pitfalls’, Language and Law / Linguagem e Direito, 5(2), pp. 7–37.

Barlow, M. (2013) ‘Individual differences and usage-based grammar’, International Journal of Corpus Linguistics, 18(4), pp. 443–78. Available at: https://doi.org/10.1075/ijcl.18.4.01bar.

Beckner, C., Ellis, N. C., Blythe, R., et al. (2009) ‘Language is a complex adaptive system: Position paper’, Language Learning, 59, pp. 1–26.

Biber, D. (1988) Variation across Speech and Writing. Cambridge: Cambridge University Press.

Biber, D. (2009) ‘A corpus-driven approach to formulaic language in English: Multi-word patterns in speech and writing’, International Journal of Corpus Linguistics, 14(3), pp. 275–311.

Biber, D. and Conrad, S. (2009) Register, Genre, and Style. Cambridge: Cambridge University Press.

Bloch, B. (1948) ‘A set of postulates for phonemic analysis’, Language, 24(1), pp. 3–46.

Braun-Blanquet, J. (1932) Plant Sociology: The Study of Plant Communities. New York: McGraw-Hill.

Burrows, J. (2002) ‘“Delta”: A measure of stylistic difference and a guide to likely authorship’, Literary and Linguistic Computing, 17(3), p. 267.

Bybee, J. L. (2006) ‘From usage to grammar: The mind’s response to repetition’, Language, 82(4), pp. 711–33. Available at: https://doi.org/10.1353/lan.2006.0186.

Bybee, J. (2010) Language, Usage and Cognition. Cambridge: Cambridge University Press.

Carne, M. and Ishihara, S. (2021) ‘Feature-based forensic text comparison using a Poisson model for likelihood ratio estimation’, in Proceedings of the 18th Workshop of the Australasian Language Technology Association. Australasian Language Technology Association, pp. 32–42.

Chaski, C. E. (2001) ‘Empirical evaluations of language-based author identification techniques’, Forensic Linguistics, 8(1), pp. 1–65.

Christiansen, M. H. and Chater, N. (2016) ‘The Now-or-Never bottleneck: A fundamental constraint on language’, Behavioral and Brain Sciences, 39, p. e62. Available at: https://doi.org/10.1017/S0140525X1500031X.

Cohen, J. (1960) ‘A coefficient of agreement for nominal scales’, Educational and Psychological Measurement, 20, pp. 37–46.

Cole, L. C. (1949) ‘The measurement of interspecific association’, Ecology, 30, pp. 411–24.

Consonni, V. and Todeschini, R. (2012) ‘New similarity coefficients for binary data’, MATCH Communications in Mathematical and in Computer Chemistry, 68, pp. 581–92.

Coulthard, M. (2004) ‘Author identification, idiolect, and linguistic uniqueness’, Applied Linguistics, 25, pp. 431–47.

Coulthard, M. (2013) ‘On admissible linguistic evidence’, Journal of Law and Policy, 21, pp. 441–66.

Coulthard, M., Johnson, A., and Wright, D. (2017) An Introduction to Forensic Linguistics. Abingdon: Routledge.

Cowan, N. (2001) ‘The magical number 4 in short-term memory: A reconsideration of mental storage capacity’, Behavioral and Brain Sciences, 24(1), pp. 87–114. Available at: https://doi.org/10.1017/S0140525X01003922.

Croft, W. (2001) Radical Construction Grammar: Syntactic Theory in Typological Perspective. Oxford: Oxford University Press. Available at: https://doi.org/10.1093/acprof:oso/9780198299554.001.0001.

Dąbrowska, E. (2012) ‘Different speakers, different grammars’, Linguistic Approaches to Bilingualism, 2(3), pp. 219–53. Available at: https://doi.org/10.1075/lab.2.3.01dab.

Dąbrowska, E. (2015) ‘Individual differences in grammatical knowledge’, in Dąbrowska, E. and Divjak, D. (eds.) Handbook of Cognitive Linguistics. Berlin: De Gruyter, pp. 650–67.

Dąbrowska, E. (2018) ‘Experience, aptitude and individual differences in native language ultimate attainment’, Cognition, 178 (May), pp. 222–35. Available at: https://doi.org/10.1016/j.cognition.2018.05.018.

Dąbrowska, E. (2020) ‘Language as a phenomenon of the third kind’, Cognitive Linguistics, 31(2), pp. 213–29. Available at: https://doi.org/10.1515/cog-2019-0029.

Daelemans, W. (2013) ‘Explanation in computational stylometry’, Computational Linguistics and Intelligent Text Processing, 7817(2), pp. 451–62.

Dasgupta, I. and Gershman, S. J. (2021) ‘Memory as a computational resource’, Trends in Cognitive Sciences, 25(3), pp. 240–51. Available at: https://doi.org/10.1016/j.tics.2020.12.008.

Diessel, H. (2019) The Grammar Network: How Linguistic Structure is Shaped by Language Use. Cambridge: Cambridge University Press.

Divjak, D. (2019) Frequency in Language: Memory, Attention and Learning. Cambridge: Cambridge University Press.

Driver, H. E. and Kroeber, A. L. (1932) ‘Quantitative expression of cultural relationship’, University of California Publications in American Archaeology and Ethnology, 31, pp. 211–56.

Dugar, T. K., Gowtham, S., and Chakraborty, U. Kr. (2022) ‘Comparing word embeddings on authorship identification’, in Borah, S. and Panigrahi, R. (eds.) Applied Soft Computing: Techniques and Applications. Boca Raton, FL: CRC Press, pp. 177–94.

Dunn, J. (2017) ‘Computational learning of construction grammars’, Language and Cognition, 9(2), pp. 254–92. Available at: https://doi.org/10.1017/langcog.2016.7.

Dunn, J. and Nini, A. (2021) ‘Production vs perception: The role of individuality in usage-based grammar induction’, in Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (online), Association for Computational Linguistics, pp. 149–59. Available at: https://aclanthology.org/2021.cmcl-1.19/.

Eder, M. (2015) ‘Does size matter? Authorship attribution, small samples, big problem’, Digital Scholarship in the Humanities, 30(2), pp. 167–82. Available at: https://doi.org/10.1093/llc/fqt066.

Ellis, N. C. (2002) ‘Frequency effects in language processing’, Studies in Second Language Acquisition, 24(02), pp. 143–88. Available at: https://doi.org/10.1017/S0272263102002024.

Ellis, N. C., Römer, U., and O’Donnell, M. B. (2016) Usage-Based Approaches to Language Acquisition and Processing: Cognitive and Corpus Investigations of Construction Grammar. Malden, MA: Wiley-Blackwell.

Erman, B. and Warren, B. (2000) ‘The idiom principle and the open choice principle’, Text, 20(1), pp. 29–62. Available at: https://doi.org/10.1515/text.1.2000.20.1.29.

Evert, S., Proisl, T., Vitt, T., et al. (2015) ‘Towards a better understanding of Burrows’s Delta in literary authorship attribution’, in Feldman, A., Kazantseva, A., Szpakowicz, S., et al. (eds.) Proceedings of the Fourth Workshop on Computational Linguistics for Literature. Denver, CO: Association for Computational Linguistics, pp. 79–88.

Evert, S., Proisl, T., Jannidis, F., et al. (2017) ‘Understanding and explaining Delta measures for authorship attribution’, Digital Scholarship in the Humanities, 32, pp. ii4–ii16. Available at: https://doi.org/10.1093/llc/fqx023.

Fedorenko, E. (2021) ‘The human language system in the mind and brain’, in 5th Usage-Based Linguistics Conference (online), Tel Aviv University. Available at: https://youtu.be/edlY4GbH1tU.

Fonteyn, L. (2021) ‘Constructional change across the lifespan of 20 early modern gentlemen’, in 11th International Conference on Construction Grammar (ICCG11). Antwerp: University of Antwerp. Available at: https://doi.org/10.5281/zenodo.5220179.

Fonteyn, L. and Nini, A. (2020) ‘Individuality in syntactic variation: An investigation of the seventeenth-century gerund alternation’, Cognitive Linguistics, 31(2), pp. 279–308. Available at: https://doi.org/10.1515/COG-2019-0040.

Galbraith, D. (2009) ‘Cognitive models of writing’, German as a Foreign Language, 2–3, pp. 7–22.

Gerlach, M. and Altmann, E. G. (2013) ‘Stochastic model for the vocabulary growth in natural languages’, Physical Review X, 3(2), p. 021006. Available at: https://doi.org/10.1103/PhysRevX.3.021006.

Gobet, F., Lane, P. C. R., Croker, S., et al. (2001) ‘Chunking mechanisms in human learning’, Trends in Cognitive Sciences, 5(6), pp. 236–43. Available at: https://doi.org/10.1016/S1364-6613(00)01662-4.

Goldberg, A. E. (1995) Constructions: A Construction Grammar Approach to Argument Structure. Chicago, IL: University of Chicago Press.

Goldberg, A. E. (2003) ‘Constructions: A new theoretical approach to language’, Trends in Cognitive Sciences, 7(5), pp. 219–24.

Goldberg, A. E. (2006) Constructions at Work: The Nature of Generalization in Language. Oxford: Oxford University Press.

Goldberg, A. E. (2019) Explain Me This: Creativity, Competition, and the Partial Productivity of Constructions. Princeton, NJ: Princeton University Press.

Goodman, L. A. and Kruskal, W. H. (1954) ‘Measures of association for cross classifications’, Journal of the American Statistical Association, 49, pp. 732–64.

Grant, T. (2007) ‘Quantifying evidence in forensic authorship analysis’, International Journal of Speech Language and the Law, 14(1), pp. 1–25. Available at: https://doi.org/10.1558/ijsll.v14i1.1.

Grant, T. (2010) ‘Txt 4n6: Idiolect free authorship analysis’, in Coulthard, M. (ed.) Routledge Handbook of Forensic Linguistics. London: Routledge, pp. 508–23.

Grant, T. (2022) The Idea of Progress in Forensic Authorship Analysis. Elements in Forensic Linguistics. Cambridge: Cambridge University Press. Available at: www.cambridge.org/core/elements/idea-of-progress-in-forensic-authorship-analysis/6A4F7668B4831CCD7DBF74DECA3EBA06.

Grant, T. and MacLeod, N. (2018) ‘Resources and constraints in linguistic identity performance: A theory of authorship’, Language and Law / Linguagem e Direito, 5(1), pp. 80–96.

Grant, T. and MacLeod, N. (2020) Language and Online Identities: The Undercover Policing of Internet Sexual Crime. Cambridge: Cambridge University Press.

Gries, S. T. (2013) ‘50-something years of work on collocations: What is or should be next’, International Journal of Corpus Linguistics, 18(1), pp. 137–66. Available at: https://doi.org/10.1075/ijcl.18.1.09gri.

Grieve, J. (2007) ‘Quantitative authorship attribution: An evaluation of techniques’, Literary and Linguistic Computing, 22(3), pp. 251–70.

Grieve, J., Clarke, I., Chiang, E., et al. (2019) ‘Attributing the Bixby Letter using n-gram tracing’, Digital Scholarship in the Humanities, 34(3), pp. 493–512.

Halliday, M. A. K. and Matthiessen, C. M. I. M. (2004) An Introduction to Functional Grammar. London: Arnold.

Halvani, O., Graner, L., and Regev, R. (2020) ‘Cross-domain authorship verification based on topic agnostic features’, in Cappellato, L., Eickhoff, C., Ferro, N., and Névéol, A. (eds.) Working Notes of CLEF 2020: Conference and Labs of the Evaluation Forum. Available at: https://ceur-ws.org/Vol-2696/.

Hasan, R. (1996) ‘Ways of saying: ways of meaning’, in Cloran, C., Butt, D., and Williams, G. (eds.) Ways of Saying, Ways of Meaning: Selected Papers of Ruqaiya Hasan. London: Cassell, pp. 191–242.

Hasan, R. (2009a) ‘On semantic variation’, in Webster, J. (ed.) The Collected Works of Ruqaiya Hasan Vol. 2: Semantic Variation: Meaning in Society and in Sociolinguistics. London: Equinox, pp. 41–72.

Hasan, R. (2009b) ‘Wanted: A theory for integrated sociolinguistics’, in Webster, J. (ed.) The Collected Works of Ruqaiya Hasan Vol. 2: Semantic Variation: Meaning in Society and in Sociolinguistics. London: Equinox, pp. 5–40.

Hasson, U., Chen, J., and Honey, C. J. (2015) ‘Hierarchical process memory: Memory as an integral component of information processing’, Trends in Cognitive Sciences, 19(6), pp. 304–13. Available at: https://doi.org/10.1016/j.tics.2015.04.006.

Hawkins, R. P. and Dotson, V. A. (1968) ‘Reliability scores that delude: An Alice in Wonderland trip through the misleading characteristics of interobserver agreement scores in interval coding’, in Ramp, E. and Semb, G. (eds.) Behavior Analysis: Areas of Research and Application. Englewood Cliffs, NJ: Prentice Hall.

Hayek, L.-A. C. (1994) ‘Analysis of amphibian biodiversity data’, in Heyer, R. W. et al. (eds.) Measuring and Monitoring Biological Diversity: Standard Methods for Amphibians. Washington, DC: Smithsonian Books, pp. 207–70.

Heaps, H. S. (1978) Information Retrieval: Computational and Theoretical Aspects. Library and Information Science Series. New York: Academic Press.

Herdan, G. (1960) Type-Token Mathematics. Janua linguarum, Series maior, 4. ’s-Gravenhage: Mouton.

Hilpert, M. (2014) Construction Grammar and Its Application to English. Edinburgh: Edinburgh University Press.

Hoey, M. (2005) Lexical Priming: A New Theory of Words and Language. London: Routledge.

Hoover, D. L. (2004) ‘Testing Burrows’s Delta’, Literary and Linguistic Computing, 19(4), pp. 453–75.

Houvardas, J. and Stamatatos, E. (2006) ‘N-gram feature selection for authorship identification’, in Euzenat, J. and Domingue, J. (eds.) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2006, Bulgaria. Berlin: Springer, pp. 77–86. Available at: https://doi.org/10.1007/11861461_10.

Hudson, R. (2010) An Introduction to Word Grammar. Cambridge: Cambridge University Press.

Hudson, R. A. (1996) Sociolinguistics. 2nd ed. Cambridge Textbooks in Linguistics. Cambridge: Cambridge University Press.

Hunston, S. and Francis, G. (2000) Pattern Grammar: A Corpus-Driven Approach to the Lexical Grammar of English. Edited by Francis, G. Studies in Corpus Linguistics, 4. Amsterdam: John Benjamins.

Ishihara, S. (2021a) ‘Score-based likelihood ratios for linguistic text evidence with a bag-of-words model’, Forensic Science International, 327, p. 110980. Available at: https://doi.org/10.1016/j.forsciint.2021.110980.

Ishihara, S. (2021b) ‘The influence of background data size on the performance of a score-based likelihood ratio system: A case of forensic text comparison’, in Proceedings of the 18th Workshop of the Australasian Language Technology Association. ALTA, pp. 21–31. Available at: https://aclanthology.org/volumes/2020.alta-1/.

Jaccard, P. (1912) ‘The distribution of the flora in the alpine zone’, New Phytologist, 11(2), pp. 37–50. Available at: https://doi.org/10.1111/j.1469-8137.1912.tb05611.x.

Jafariakinabad, F. and Hua, K. A. (2021) ‘Unifying lexical, syntactic, and structural representations of written language for authorship attribution’, SN Computer Science, 2(481), pp. 1–14. Available at: https://doi.org/10.1007/s42979-021-00911-2.

Jain, A. K., Ross, A., and Prabhakar, S. (2004) ‘An introduction to biometric recognition’, IEEE Transactions on Circuits and Systems for Video Technology, 14(1), pp. 4–20. Available at: https://doi.org/10.1109/TCSVT.2003.818349.

Jannidis, F., Pielström, S., Schöch, C., and Vitt, T. (2015) ‘Improving Burrows’ Delta: An empirical evaluation of text distance measures’, in Digital Humanities Conference 2015. Sydney, Australia: Alliance of Digital Humanities Organizations.

Johnson, A. and Wright, D. (2014) ‘Identifying idiolect in forensic authorship attribution: An n-gram textbite approach’, Language and Law/Linguagem e Direito, 1(1), pp. 37–69.

Johnstone, B. (1996) The Linguistic Individual: Self-Expression in Language and Linguistics. Oxford: Oxford University Press.

Juola, P. (2008) ‘Authorship attribution’, Foundations and Trends® in Information Retrieval, 1(3), pp. 233–334. Available at: https://doi.org/10.1561/1500000005.

Juola, P. (2012) ‘Large-scale experiments in authorship attribution’, English Studies, 93(3), pp. 275–83.

Jurafsky, D. and Martin, J. H. (2009) Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River, NJ: Pearson/Prentice Hall.

Keller, R. (1994) On Language Change: The Invisible Hand in Language. London: Taylor & Francis.

Kestemont, M. (2014) ‘Function words in authorship attribution: From black magic to theory?’, in Proceedings of the 3rd Workshop on Computational Linguistics for Literature (CLfL) @ EACL 2014. Gothenburg, Sweden: Association for Computational Linguistics, pp. 59–66.

Kestemont, M., Stover, J., Koppel, M., Karsdorp, F., and Daelemans, W. (2016) ‘Authenticating the writings of Julius Caesar’, Expert Systems With Applications, 63, pp. 86–96. Available at: https://doi.org/10.1016/j.eswa.2016.06.029.

Kestemont, M., Manjavacas, E., Markov, I., et al. (2020) Overview of the Cross-Domain Authorship Verification Task at PAN 2020. Available at: https://pan.webis.de/downloads/publications/papers/kestemont_2020.pdf.

Kidd, E., Donnelly, S., and Christiansen, M. H. (2018) ‘Individual differences in language acquisition and processing’, Trends in Cognitive Sciences, 22(2), pp. 154–69. Available at: https://doi.org/10.1016/j.tics.2017.11.006.

Kidd, E., Bidgood, A., Donnelly, S., Durrant, S., Peter, M. S., and Rowland, C. F. (2020) ‘Individual differences in first language acquisition and their theoretical implications’, in Rowland, C. F., Theakston, A., Ambridge, B., and Twomey, K. (eds.) Current Perspectives on Child Language Acquisition: How children use their environment to learn. Amsterdam: John Benjamins, pp. 189–219. Available at: https://doi.org/10.1075/tilar.27.09kid.

Koppel, M. and Schler, J. (2004) ‘Authorship verification as a one-class classification problem’, in Proceedings of the 21st International Conference on Machine Learning. Banff, Alberta, Canada: ACM, pp. 62–7.

Koppel, M. and Winter, Y. (2014) ‘Determining if two documents are written by the same author’, Journal of the Association for Information Science and Technology, 65(1), pp. 178–87.

Koppel, M., Schler, J., and Argamon, S. (2009) ‘Computational methods in authorship attribution’, Journal of the American Society for Information Science and Technology, 60(1), pp. 9–26.

Koppel, M., Schler, J., and Argamon, S. (2011) ‘Authorship attribution in the wild’, Language Resources and Evaluation, 45(1), pp. 83–94. Available at: https://doi.org/10.1007/s10579-009-9111-2.

Koppel, M., Schler, J., and Argamon, S. (2013) ‘Authorship attribution: What’s easy and what’s hard?’, Journal of Law and Policy, 21, pp. 317–31.

Kulczynski, S. (1927) ‘Die Pflanzenassoziationen der Pieninen’, Bulletin International de l’Academie Polonaise des Sciences et des Lettres. Classe des Sciences Mathematiques et Naturelles. Serie B. Sciences Naturelles, Suppl. II(2), pp. 57–203.

Lakoff, G. (1990) ‘The Invariance Hypothesis: Is abstract reason based on image-schemas?’, Cognitive Linguistics, 1(1), pp. 39–74. Available at: https://doi.org/10.1515/cogl.1990.1.1.39.

Lancashire, I. (1997) ‘Empirically determining Shakespeare’s idiolect’, Shakespeare Studies, 25, pp. 171–85.

Lancashire, I. (2010) Forgetful Muses: Reading the Author in the Text. Toronto: University of Toronto Press.

Langacker, R. W. (1987) Foundations of Cognitive Grammar. Stanford, CA: Stanford University Press.

Leeuwen, D. A. van (2015) ROC: Compute Structures to Compute ROC and DET Plots and Metrics for 2-Class Classifiers. R package. Available at: https://rdrr.io/github/davidavdav/ROC/.

Lewis, D. D., Yang, Y., Rose, T. G., and Li, F. (2004) ‘RCV1: A new benchmark collection for text categorization research’, Journal of Machine Learning Research, 5, pp. 361–97.

López-Monroy, A. P., Montes-y-Gómez, M., Villaseñor-Pineda, L., Carrasco-Ochoa, J. A., and Martínez-Trinidad, J. F. (2012) ‘A new document author representation for authorship attribution’, in Mexican Conference on Pattern Recognition. Berlin: Springer, pp. 283–92. Available at: https://doi.org/10.1007/978-3-642-31149-9_29.

McCauley, S. M. and Christiansen, M. H. (2015) ‘Individual differences in chunking ability predict on-line sentence processing’, in Noelle, D. C., Dale, R., Warlaumont, A., et al. (eds.) Proceedings of the 37th Annual Conference of the Cognitive Science Society. Pasadena, CA: Cognitive Science Society, pp. 1553–8.

McMenamin, G. R. (2002) Forensic Linguistics: Advances in Forensic Stylistics. Boca Raton, FL: CRC Press.

Mikros, G. K. and Argiri, E. K. (2007) ‘Investigating topic influence in authorship attribution’, in Stein, B., Koppel, M., and Stamatatos, E. (eds.), Proceedings of the SIGIR 2007 International Workshop on Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection, vol. 276. Amsterdam: CEUR-WS.org. Available at: http://ceur-ws.org/Vol-276.

Miller, G. A. (1956) ‘The magical number seven, plus or minus two: Some limits on our capacity for processing information’, Psychological Review, 63(2), pp. 81–97.

Mollin, S. (2009) ‘“I entirely understand” is a Blairism: The methodology of identifying idiolectal collocations’, International Journal of Corpus Linguistics, 14(3), pp. 367–92. Available at: https://doi.org/10.1075/ijcl.14.3.04mol.

Mosteller, F. and Wallace, D. L. (1963) ‘Inference in an authorship problem’, Journal of the American Statistical Association, pp. 275–309. Available at: https://doi.org/10.2307/2283270.

Mountford, M. D. (1962) ‘An index of similarity and its applications to classificatory problems’, in Murphy, P. W. (ed.) Progress in Soil Zoology. London: Butterworths, pp. 43–50.

Murauer, B. and Specht, G. (2021) ‘Developing a benchmark for reducing data bias in authorship attribution’, in Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems (Eval4NLP 2021). Association for Computational Linguistics, pp. 179–88. Available at: https://aclanthology.org/2021.eval4nlp-1.18.pdf.

Narayanan, A., Paskov, H., Gong, N. Z., et al. (2012) ‘On the feasibility of internet-scale author identification’, in Security and Privacy (SP), 2012 IEEE Symposium on. IEEE, pp. 300–14. Available at: https://ieeexplore.ieee.org/document/6234420.

Nini, A. (2018) ‘An authorship analysis of the Jack the Ripper letters’, Digital Scholarship in the Humanities, 33(3), pp. 621–36.

Nini, A. and Grant, T. (2013) ‘Bridging the gap between stylistic and cognitive approaches to authorship analysis using Systemic Functional Linguistics and multidimensional analysis’, International Journal of Speech Language and the Law, 20(2), pp. 173–202.

Nini, A., Cameron, M., and Murphy, C. (2021) ‘Experimental evidence on the individuality of lexicogrammar’, in International Construction Grammar Conference 11 (ICCG11). Antwerp: University of Antwerp. Available at: https://doi.org/10.5281/zenodo.5227222.

Oakes, M. P. (2014) Literary Detective Work on the Computer. Amsterdam: John Benjamins.

Ochiai, A. (1957) ‘Zoogeographic studies on the soleoid fishes found in Japan and its neighboring regions’, Bulletin of the Japanese Society of Fisheries Science, 22, pp. 526–30.

Pearson, K. and Heron, D. (1913) ‘On theories of association’, Biometrika, 9, pp. 159–315.

Petré, P. and Van de Velde, F. (2018) ‘The real-time dynamics of the individual and the community in grammaticalization’, Language, 94(4), pp. 867–901.

Pinker, S. (1994) The Language Instinct. New York: William Morrow.

Plakias, S. and Stamatatos, E. (2008) ‘Tensor space models for authorship identification’, in Darzentas, J., Vouros, G. A., Vosinakis, S., and Arnellos, A. (eds.) Proceedings of the 5th Hellenic Conference on Artificial Intelligence (SETN’08). Syros, Greece: LNCS, pp. 239–49.

Pokhriyal, N., Tayal, K., Nwogu, I., and Govindaraju, V. (2017) ‘Cognitive-biometric recognition from language usage: A feasibility study’, IEEE Transactions on Information Forensics and Security, 12(1), pp. 134–43. Available at: https://doi.org/10.1109/TIFS.2016.2604213.

Proisl, T., Evert, S., Jannidis, F., Schöch, C., Konle, L., and Pielström, S. (2018) ‘Delta vs. n-gram tracing: Evaluating the robustness of authorship attribution methods’, in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018). Miyazaki, Japan: European Language Resources Association (ELRA), pp. 3309–14.

Renouf, A. and Sinclair, J. (1991) ‘Collocational frameworks in English’, in Aijmer, K. and Altenberg, B. (eds.) English Corpus Linguistics: Studies in Honour of Jan Svartvik. London: Longman, pp. 128–43.

Rogot, E. and Goldberg, I. D. (1966) ‘A proposed index for measuring agreement in test-retest studies’, Journal of Chronic Diseases, 19, pp. 991–1006.

Russell, P. F. and Rao, T. R. (1940) ‘On habitat and association of species of Anopheline larvae in South Eastern Madras’, Journal of the Malaria Institute of India, 3, pp. 153–78.

Sapkota, U., Bethard, S., Montes-y-Gómez, M., and Solorio, T. (2015) ‘Not all character n-grams are created equal: A study in authorship attribution’, in Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the ACL. Denver, CO: ACL, pp. 93–102.

Sari, Y., Vlachos, A., and Stevenson, M. (2017) ‘Continuous N-gram representations for authorship attribution’, in Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pp. 267–73. Available at: https://doi.org/10.18653/v1/e17-2043.

Schmid, H.-J. (2015) ‘A blueprint of the Entrenchment-and-Conventionalization Model’, Yearbook of the German Cognitive Linguistics Association, 3(1), pp. 3–25. Available at: https://doi.org/10.1515/gcla-2015-0002.

Schmid, H.-J. and Mantlik, A. (2015) ‘Entrenchment in historical corpora? Reconstructing dead authors’ minds from their usage profiles’, Anglia, 133(4), pp. 583–623. Available at: https://doi.org/10.1515/ang-2015-0056.

Schmid, H.-J., Würschinger, Q., Fischer, S., and Küchenhoff, H. (2021) ‘That’s cool: Computational sociolinguistic methods for investigating individual lexico-grammatical variation’, Frontiers in Artificial Intelligence, 3, p. 89. Available at: https://doi.org/10.3389/frai.2020.547531.

Schmitt, N. (2004) Formulaic Sequences: Acquisition, Processing, and Use. Amsterdam: John Benjamins.

Seidman, S. (2013) ‘Authorship verification using the impostors method’, in Forner, P., Navigli, R., Tufis, D., and Ferro, N. (eds.) CLEF 2013 Evaluation Labs and Workshop – Working Notes Papers. Valencia, Spain, pp. 23–6. Available at: https://ceur-ws.org/Vol-1179/.

Shannon, C. E. (1948) ‘A mathematical theory of communication’, Bell System Technical Journal, 27, pp. 379–423 & 623–56.

Simpson, G. G. (1943) ‘Mammals and the nature of continents’, American Journal of Science, 241, pp. 1–31.

Sinclair, J. (1991) Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Smet, H. De (2016) ‘The root of ruthless: Individual variation as a window on mental representation’, International Journal of Corpus Linguistics, 21(2), pp. 250–71. Available at: https://doi.org/10.1075/ijcl.21.2.05des.

Smith, P. W. H. and Aldridge, W. (2011) ‘Improving authorship attribution: Optimizing Burrows’ Delta method*’, Journal of Quantitative Linguistics, 18(1), pp. 63–88. Available at: https://doi.org/10.1080/09296174.2011.533591.

Sokal, R. R. and Michener, C. D. (1958) ‘A statistical method for evaluating systematic relationships’, University of Kansas Science Bulletin, 38, pp. 1409–38.

Sokal, R. R. and Sneath, P. H. A. (1963) Principles of Numerical Taxonomy. San Francisco, CA: W.H. Freeman.

Solan, L. M. and Tiersma, P. M. (2005) Speaking of Crime: The Language of Criminal Justice. Chicago, IL: University of Chicago Press.

Sorgenfrei, T. (1958) ‘Molluscan assemblages from the marine middle Miocene of South Jutland and their environments’, Danmarks Geologiske Undersøgelse. Serie 2, 79, pp. 403–8.

Stamatatos, E. (2009) ‘A survey of modern authorship attribution methods’, Journal of the American Society for Information Science and Technology, 60(3), pp. 538–56. Available at: https://doi.org/10.1002/asi.21001.

Stamatatos, E. (2013) ‘On the robustness of authorship attribution based on character n-gram features’, Journal of Law and Policy, 21(2), pp. 421–39.

Svartvik, J. (1968) The Evans Statements: A Case for Forensic Linguistics. Gothenburg: University of Gothenburg Press.

Todeschini, R., Consonni, V., Xiang, H., Holliday, J., Buscema, M., and Willett, P. (2012) ‘Similarity coefficients for binary chemoinformatics data: Overview and extended comparison using simulated and real data sets’, Journal of Chemical Information and Modeling, 52(11), pp. 2884–901. Available at: https://doi.org/10.1021/ci300261r.

Turell, M. T. (2010) ‘The use of textual, grammatical and sociolinguistic evidence in forensic text comparison’, International Journal of Speech Language and the Law, 17(2), pp. 211–50.

Turell, M. T. and Gavaldà, N. (2013) ‘Towards an index of idiolectal similitude (or distance) in forensic authorship analysis’, Journal of Law and Policy, 21, pp. 495–514.

Ullman, M. T. (2004) ‘Contributions of memory circuits to language: the declarative/procedural model’, Cognition, 92(1–2), pp. 231–70. Available at: https://doi.org/10.1016/j.cognition.2003.10.008.

Ullman, M. T. (2013) ‘The role of declarative and procedural memory in disorders of language’, Linguistic Variation, 13(2), pp. 133–54. Available at: https://doi.org/10.1075/lv.13.2.01ull.

Vetchinnikova, S. (2017) ‘On the relationship between the cognitive and the communal: A complex systems perspective’, in Filppula, M., Klemola, J., Mauranen, A., and Vetchinnikova, S. (eds.) Changing English. Berlin: De Gruyter, pp. 277–310. Available at: https://doi.org/10.1515/9783110429657-015.

Warrens, M. J. (2008) ‘Similarity coefficients for binary data’. Unpublished thesis, Leiden University.

Wible, D. and Tsao, N.-L. (2010) ‘StringNet as a computational resource for discovering and investigating linguistic constructions’, in Proceedings of the NAACL HLT Workshop on Extracting and Using Constructions in Computational Linguistics. Los Angeles, California, USA, pp. 25–31. Available at: https://aclanthology.org/W10-0804/.

Wray, A. (2008) Formulaic Language: Pushing the Boundaries. Oxford: Oxford University Press.

Wright, D. (2013) ‘Stylistic variation within genre conventions in the Enron email corpus: Developing a text-sensitive methodology for authorship research’, International Journal of Speech Language and the Law, 20(1), pp. 45–75.

Wright, D. (2017) ‘Using word n-grams to identify authors and idiolects: A corpus approach to a forensic linguistic problem’, International Journal of Corpus Linguistics, 22(2), pp. 212–41. Available at: https://doi.org/10.1075/ijcl.22.2.03wri.

Yule, G. U. (1900) ‘On the association of attributes in statistics’, Philosophical Transactions of the Royal Society, 75, pp. 257–319.

Yule, G. U. (1912) ‘On the methods of measuring association between two attributes’, Journal of the Royal Statistical Society, 75, pp. 579–642.
