Existing software systems for automated essay scoring can provide NLP researchers with opportunities to test certain theoretical hypotheses, including some derived from Centering Theory. In this study we employ the Educational Testing Service's e-rater essay scoring system to examine whether local discourse coherence, as defined by a measure of Centering Theory's Rough-Shift transitions, might be a significant contributor to the evaluation of essays. Rough-Shifts within students' paragraphs often occur when topics are short-lived and unconnected, and are therefore indicative of poor topic development. We show that adding the Rough-Shift based metric to the system improves its performance significantly, better approximating human scores and providing the capability of valuable instructional feedback to the student. These results indicate that Rough-Shifts do indeed capture a source of incoherence, one that has not been closely examined in the Centering literature. They not only justify Rough-Shifts as a valid transition type, but they also support the original formulation of Centering as a measure of discourse continuity even in pronominal-free text. Finally, our study design, which used a combination of automated and manual NLP techniques, highlights specific areas of NLP research and development needed for engineering practical applications.
The topic of mood and modality (MOD) is a difficult aspect of language description because, among other reasons, the inventory of modal meanings is not stable across languages, moods do not map neatly from one language to another, modality may be realised morphologically or by free-standing words, and modality interacts in complex ways with other modules of the grammar, like tense and aspect. Describing MOD is especially difficult if one attempts to develop a unified approach that not only provides cross-linguistic coverage, but is also useful in practical natural language processing systems. This article discusses an approach to MOD that was developed for and implemented in the Boas Knowledge-Elicitation (KE) system. Boas elicits knowledge about any language, L, from an informant who need not be a trained linguist. That knowledge then serves as the static resources for an L-to-English translation system. The KE methodology used throughout Boas is driven by a resident inventory of parameters, value sets, and means of their realisation for a wide range of language phenomena. MOD is one of those parameters, whose values are the inventory of attested and not yet attested moods (e.g. indicative, conditional, imperative), and whose realisations include flective morphology, agglutinating morphology, isolating morphology, words, phrases and constructions. Developing the MOD elicitation procedures for Boas amounted to wedding the extensive theoretical and descriptive research on MOD with practical approaches to guiding an untrained informant through this non-trivial task. We believe that our experience in building the MOD module of Boas offers insights not only into cross-linguistic aspects of MOD that have not previously been detailed in the natural language processing literature, but also into KE methodologies that could be applied more broadly.
With this first issue of Volume 10 of Natural Language Engineering we would like to take this opportunity to reflect on the development of the journal, especially over the last year or two, and to look forward to a number of future developments.
If you wanted to make some money out of natural language processing, an appropriate strategy might be to identify an area of technology that was relatively mature—one where the more fundamental technical problems had been resolved through a significant amount of research activity—and then identify potential applications for that technology.
This paper presents modifications to a standard probabilistic context-free grammar that enable a predictive parser to avoid garden pathing without resorting to any ad-hoc heuristic repair. The resulting parser is shown to apply efficiently to both newspaper text and telephone conversations with complete coverage and excellent accuracy. The distribution over trees is peaked enough to allow the parser to find parses efficiently, even with the much larger search space resulting from overgeneration. Empirical results are provided for both Wall St. Journal and Switchboard test corpora.
style: 1b. the shadow-producing pin of a sundial. 2c. the custom or plan followed in spelling, capitalization, punctuation, and typographic arrangement and display.
—Webster's New Collegiate Dictionary
The syntax of a programming language tells you what code it is possible to write—what machines will understand. Style tells you what you ought to write—what humans reading the code will understand. Code written with a consistent, simple style is maintainable, robust, and contains fewer bugs. Code written with no regard to style contains more bugs, and may simply be thrown away and rewritten rather than maintained.
Attending to style is particularly important when developing as a team. Consistent style facilitates communication, because it enables team members to read and understand each other's work more easily. In our experience, the value of consistent programming style grows exponentially with the number of people working with the code.
Our favorite style guides are classics: Strunk and White's The Elements of Style and Kernighan and Plauger's The Elements of Programming Style. These small books work because they are simple: a list of rules, each containing a brief explanation and examples of correct, and sometimes incorrect, use. We followed the same pattern in this book. This simple treatment—a series of rules—enabled us to keep this book short and easy to understand.
Some of the advice that you read here may seem obvious to you, particularly if you've been writing code for a long time. Other readers may disagree with some of our specific suggestions about formatting or indentation.
A complete treatment of programming principles and software design is clearly beyond the scope of this book. However, this chapter includes some core principles that we have found to be central to good software engineering.
Engineering
Do Not be Afraid to Do Engineering
The ultimate goal of professional software development is to create something useful—an engineering task much more than a scientific one. (Science is more immediately concerned with understanding the world around us, which is admittedly necessary, but not sufficient, for engineering.)
Resist the temptation to write code to model scientific realities that include all theoretical possibilities. It is not a “hack” to write code that has practical limitations if you are confident those limits do not affect the utility of the resulting system.
For example, imagine you need a data structure for tree traversal and choose to write a stack. The stack needs to hold at least as many items as the maximum depth of any tree. Now suppose that there is no theoretical limit to how deep one of these trees can be. You might be tempted to create a stack that can grow to an arbitrary size by reallocating memory and copying its items as needed. On the other hand, your team's understanding of the application may be such that in your wildest imagination you'd be amazed to see a tree with depth greater than 10. If so, the better choice would be to create a fixed-length stack with a maximum of, say, 50 elements.
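The fixed-length stack described above might look something like the following minimal sketch. The class name `FixedStack`, its member functions, and the choice to report a full stack by returning `false` are all illustrative assumptions, not code from the book. The sketch also assumes the element type is default-constructible.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity stack; the name and interface are
// illustrative assumptions. Storage is allocated once and never grows.
template <typename T, std::size_t Capacity>
class FixedStack {
public:
    // Adds an item to the top of the stack.
    // Returns false, leaving the stack unchanged, if it is already full.
    bool push(const T& value) {
        if (size_ == Capacity) {
            return false;
        }
        items_[size_++] = value;
        return true;
    }

    // Removes and returns the top item. Precondition: !empty().
    T pop() {
        assert(size_ > 0 && "pop() called on an empty stack");
        return items_[--size_];
    }

    bool empty() const { return size_ == 0; }
    std::size_t size() const { return size_; }

private:
    std::array<T, Capacity> items_{};  // requires T to be default-constructible
    std::size_t size_ = 0;
};
```

A traversal would declare something like `FixedStack<int, 50> pending;` and treat a `false` result from `push` as a sign that the team's assumption about tree depth no longer holds. No reallocation or copying logic is needed.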
As commercial developers of software components, we always strive to have good, consistent style throughout our code. Since source code is usually included in our final products, our users often study our code to learn not just how the components work, but also how to write good software.
This fact ultimately led to the creation of a style guide for Java™ programming, entitled The Elements of Java Style. The positive reception to that book, coupled with recurring questions about C++ style issues, resulted in this edition for C++.
If you've read The Elements of Java Style (or even if you haven't), much of the advice in this book will probably be familiar. This is deliberate, as many of the programming principles described are timeless and valid across programming languages. However, the content has been reworked and expanded here to address the unique characteristics of the C++ language.
Audience
We wrote this book for anyone writing C++ code, but especially for programmers who are writing C++ as part of a team. For a team to be effective, everyone must be able to read and understand everyone else's code. Having consistent style conventions is a good first step!
This book is not intended to teach you C++, but rather it focuses on how C++ code can be written in order to maximize its effectiveness. We therefore assume you are already familiar with C++ and object-oriented programming.
Developers often forget that the primary purpose of their software is to satisfy the needs of an end user; they often concentrate on the solution but fail to instruct others on the use of that solution.
Good software documentation not only tells others how to use your software, it also acts as a specification of interfaces and behaviors for the engineers who must help you develop the software and those who will later maintain and enhance it. While you should always make every attempt to write software that is self-explanatory, your end users may not have access to the source code; there will always be significant information about usage and behavior that a programming language cannot express.
Good programmers enjoy writing documentation. Like elegant design and implementation, good documentation is a sign of a professional programmer.
Document Your Software Interface for Those Who Must Use It
Document the public interface of your code so others can understand and use it correctly and effectively.
The primary purpose for documentation comments is to define a programming contract between a client and a supplier of a service. The documentation associated with a method should describe all aspects of behavior on which a caller of that method can rely and should not attempt to describe implementation details.
Describe each C++ element that appears in, or forms part of, the interface.
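As a rough illustration of a documentation comment that acts as a contract, consider the sketch below. The function `indexOf` is hypothetical, and the Javadoc-like `@param` and `@return` tags are an assumed convention, not necessarily the one the book prescribes. The comment states only what a caller may rely on and says nothing about how the search is carried out.

```cpp
#include <cstddef>
#include <vector>

/**
 * Finds the first occurrence of a value in a sequence.
 *
 * The sequence is not modified. If the value occurs more than once,
 * the position of the earliest occurrence is reported.
 *
 * @param values  the sequence to search; may be empty.
 * @param target  the value to look for.
 * @return the zero-based index of the first element equal to target,
 *         or values.size() if no element matches.
 */
std::size_t indexOf(const std::vector<int>& values, int target) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] == target) {
            return i;
        }
    }
    return values.size();
}
```

Because the comment promises only the result and the no-modification guarantee, the body could later be replaced by a different search strategy without breaking any caller that relied on the documented behavior.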
Document Your Implementation for Those Who Must Maintain It
Document the implementation of your code so others can maintain and enhance it.