
Professional language in Swedish clinical text: Linguistic characterization and comparative studies

  • Kelly Smith, Beata Megyesi, Sumithra Velupillai and Maria Kvist

This study investigates the linguistic characteristics of Swedish clinical text in radiology reports and doctors' daily notes from electronic health records (EHRs) in comparison to general Swedish and biomedical journal text. We quantify linguistic features through a comparative register analysis to determine how the free text of EHRs differs from general and biomedical Swedish text in terms of lexical complexity, word and sentence composition, and common sentence structures. The linguistic features are extracted using state-of-the-art computational tools: a tokenizer, a part-of-speech tagger, and scripts for statistical analysis. Results show that technical terms and abbreviations are more frequent in clinical text, and that lexical variance is low. Moreover, clinical text frequently omits subjects, verbs, and function words, resulting in shorter sentences. Clinical text differs not only from general Swedish but also internally, across its sub-domains; for example, sentences lacking verbs are significantly more frequent in radiology reports. These results provide a foundation for future development of automatic methods for EHR simplification or clarification.
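As a rough illustration of the kind of features the abstract mentions, the sketch below computes three simple measures from sentences that have already been tokenized and part-of-speech tagged: a type-token ratio (one common proxy for lexical variance), mean sentence length in tokens, and the share of sentences containing no verb. This is not the authors' pipeline. The function name `register_features`, the choice of type-token ratio, the verb tag `VB` (assumed here to follow a SUC-style tagset), and the two toy input sentences are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# A sentence is a list of (token, POS tag) pairs produced by some tagger.
Sentence = List[Tuple[str, str]]

# Assumed verb tag (SUC-style tagset); adjust to the tagger actually used.
VERB_TAG = "VB"


def register_features(sentences: List[Sentence]) -> Dict[str, float]:
    """Compute simple register features (hypothetical helper, not from the paper)."""
    tokens = [tok.lower() for sent in sentences for tok, _ in sent]
    n_tokens = len(tokens)
    n_sents = len(sentences)
    # Count sentences in which no token carries the verb tag.
    verbless = sum(
        1 for sent in sentences if not any(tag == VERB_TAG for _, tag in sent)
    )
    return {
        "type_token_ratio": len(set(tokens)) / n_tokens if n_tokens else 0.0,
        "mean_sentence_length": n_tokens / n_sents if n_sents else 0.0,
        "verbless_share": verbless / n_sents if n_sents else 0.0,
    }


# Toy input invented for this example: one verbless note-style fragment
# and one full sentence.
example: List[Sentence] = [
    [("Ingen", "DT"), ("feber", "NN")],
    [("Patienten", "NN"), ("mår", "VB"), ("bra", "AB")],
]
features = register_features(example)
```

Comparing such values across corpora (clinical, general, biomedical) would be one way to approach the comparison described above; type-token ratio in particular is sensitive to corpus size, so equal-sized samples would be needed for a fair comparison.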



Nordic Journal of Linguistics
  • ISSN: 0332-5865
  • EISSN: 1502-4717
  • URL: /core/journals/nordic-journal-of-linguistics