Segmenting documents by stylistic character

Published online by Cambridge University Press: 10 November 2005

NEIL GRAHAM
Affiliation:
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada M5S 3G4 e-mail: gh@cs.toronto.edu Present address: IBM Canada Ltd.
GRAEME HIRST
Affiliation:
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada M5S 3G4 e-mail: gh@cs.toronto.edu
BHASKARA MARTHI
Affiliation:
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada M5S 3G4 e-mail: gh@cs.toronto.edu Present address: Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA, USA.

Abstract

As part of a larger project to develop an aid for writers that would help to eliminate stylistic inconsistencies within a document, we experimented with neural networks to find the points in a text at which its stylistic character changes. Our best results, well above baseline, were achieved with time-delay networks that used features related to the author's syntactic preferences, whereas low-level and vocabulary-based features were not found to be useful. An alternative approach with character bigrams was not successful.

Type
Papers
Copyright
2005 Cambridge University Press

Footnotes

An earlier version of parts of this paper was presented at the Workshop on Computational Approaches to Style Analysis and Synthesis, Acapulco, August 2003.