Lexical stability of psychiatric clinical notes from electronic health records over a decade

Published online by Cambridge University Press:  25 August 2023

Lasse Hansen*
Affiliation:
Department of Affective Disorders, Aarhus University Hospital – Psychiatry, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Center for Humanities Computing, Aarhus University, Aarhus, Denmark
Kenneth Enevoldsen
Affiliation:
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Center for Humanities Computing, Aarhus University, Aarhus, Denmark
Martin Bernstorff
Affiliation:
Department of Affective Disorders, Aarhus University Hospital – Psychiatry, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
Erik Perfalk
Affiliation:
Department of Affective Disorders, Aarhus University Hospital – Psychiatry, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
Andreas A. Danielsen
Affiliation:
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Psychosis Research Unit, Aarhus University Hospital – Psychiatry, Aarhus, Denmark
Kristoffer L. Nielbo
Affiliation:
Center for Humanities Computing, Aarhus University, Aarhus, Denmark
Søren D. Østergaard
Affiliation:
Department of Affective Disorders, Aarhus University Hospital – Psychiatry, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
* Corresponding author: Lasse Hansen; Email: lasse.hansen@clin.au.dk

Abstract

Objective:

Natural language processing (NLP) methods hold promise for improving clinical prediction by utilising information otherwise hidden in the clinical notes of electronic health records. However, clinical practice – as well as the systems and databases in which clinical notes are recorded and stored – changes over time. As a consequence, the content of clinical notes may also change over time, which could degrade the performance of prediction models. Despite its importance, the stability of clinical notes over time has rarely been tested.

Methods:

The lexical stability of clinical notes from the Psychiatric Services of the Central Denmark Region in the period from January 1, 2011, to November 22, 2021 (a total of 14,811,551 clinical notes describing 129,570 patients) was assessed by quantifying sentence length, readability, syntactic complexity and clinical content. Changepoint detection models were used to estimate potential changes in these metrics.

Results:

We find lexical stability of the clinical notes over time, with minor deviations during the COVID-19 pandemic. Out of 2988 data points, 17 possible changepoints (corresponding to 0.6%) were detected. The majority of these were related to the discontinuation of a specific note type.

Conclusion:

We find lexical and syntactic stability of clinical notes from psychiatric services over time, which bodes well for the use of NLP for predictive modelling in clinical psychiatry.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Scandinavian College of Neuropsychopharmacology
Table 1. Demographic characteristics of the cohort

Table 2. Number and length of notes by note type sorted in descending order based on total number of tokens

Figure 1. Example of novelty calculation with a window size of 2. The figure shows the distribution shift over time of four keywords from the F4 category. The distribution of the four keywords at each time point is represented with a histogram, with the height of the bars indicating word counts and the colour indicating novelty levels. Major shifts such as from time 4 to 5 lead to high novelty, which gradually decreases as the distribution becomes stable. The larger the shift in distribution, the greater the novelty. Novelty cannot be calculated for the first two points as we have defined the window size to be 2.
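
The windowed calculation described in this caption can be sketched in Python. This is a minimal illustration, not the authors' implementation: it assumes novelty at a time point is the mean divergence between that point's keyword distribution and each of the distributions in the preceding window, and it uses Jensen-Shannon divergence as the measure, which the caption does not specify. The function names and the keyword counts are hypothetical.

```python
import math


def normalise(counts):
    """Turn raw keyword counts into a probability distribution (assumes a non-zero total)."""
    total = sum(counts)
    return [c / total for c in counts]


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence (assumed measure; not stated in the caption)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def novelty(count_series, window=2):
    """Mean divergence of each time point from the `window` preceding time points.

    The first `window` points have no full history, so they are returned as None.
    """
    dists = [normalise(c) for c in count_series]
    scores = []
    for j, current in enumerate(dists):
        if j < window:
            scores.append(None)
            continue
        previous = dists[j - window:j]
        scores.append(sum(jsd(current, d) for d in previous) / window)
    return scores


# Hypothetical counts for four keywords: stable, then an abrupt shift at index 4.
old = [10, 5, 3, 2]
new = [2, 3, 5, 10]
scores = novelty([old, old, old, old, new, new, new], window=2)
```

In this toy series the score is undefined for the first two points, zero while the distribution is unchanged, highest at the shift, halved at the next point (one of the two window entries still holds the old distribution), and zero again once the window contains only the new distribution.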

Figure 2. Calculation of dependency distance. Numbers on arrows indicate the distance from the head to the word. Note that ‘experienced’ has no arrows pointing to it, as it is the root of the sentence. The dependency distance of ‘experienced’ is therefore zero. The mean dependency distance of the sentence is (1 + 2 + 1 + 0 + 2 + 1 + 3) / 7 = 1.43.
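
The arithmetic in this caption can be reproduced with a short Python sketch. It assumes each token's dependency distance is the absolute difference between its position and the position of its head, and that the root is marked by pointing at itself so that its distance is zero (the convention used by spaCy's `Token.head`). The `heads` list is hypothetical: it was chosen only to reproduce the seven distances quoted in the caption and is not the parse shown in the figure.

```python
def dependency_distances(heads):
    """Distance from each token to its head; a root that points at itself gets 0."""
    return [abs(position - head) for position, head in enumerate(heads)]


def mean_dependency_distance(heads):
    """Average dependency distance over all tokens in one sentence."""
    distances = dependency_distances(heads)
    return sum(distances) / len(distances)


# Hypothetical head indices for a seven-token sentence; token 3 is the root.
heads = [1, 3, 3, 3, 6, 6, 3]

distances = dependency_distances(heads)   # [1, 2, 1, 0, 2, 1, 3]
mdd = mean_dependency_distance(heads)     # 10 / 7, about 1.43
```

Including the root's zero in the average, as the caption does, means the denominator is the full token count (7) rather than the number of dependency arcs (6).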

Figure 3. Time series of text stability. The figure panels illustrate the automated readability index (ARI), mean dependency distance, the mean number of tokens for the aggregated clinical notes (top three rows) and novelty across all psychopathological keywords. The grey line indicates the estimated changepoint segments (none found for any groups). The left column shows the data with the y-axis going to 0, and the right column shows the data with the y-axes allowed to vary.

Figure 4. Location of changepoints on novelty for terms across diagnostic categories.

Figure 5. The 12 words describing psychopathology with the largest relative change at the 2020 Q2 changepoint for F3 – mood disorders. The x-axis shows the quotient of the mean of the two previous time points (the novelty window) and the time point of interest (1 = no difference). The shape indicates whether there was an increase or decrease in word use. The text next to the label denotes the actual difference in means; that is, a change of 0.012 means that the use of a word rose to comprise an additional 0.012% of all words in the F3 – mood disorder category (the 68 words in the F3 – mood disorder category are available in Supplementary Table 3) in 2020 Q2 compared to the previous two quarters.

Figure 6. Location of the 14 changepoints in the mean number of tokens, dependency distance, automated readability index (ARI) and proportion of total notes for each note type. The colours indicate which metric the changepoint occurred in.

Supplementary material: File

Hansen et al. supplementary material

Figures S1-S13 and Tables S1-S3
