This paper describes the BIOSTATION, a generalized document preparation system, developed to guide an interactive editing of biological sequences by taking into account their semantics. This paper also focusses on the use of a document preparation system as the mediator for a larger application.
Introduction
The BIOSTATION is a generalized document preparation system, developed for the CRBM and in use since May 1985, that guides the interactive editing of biological sequences by taking their semantics into account. These semantics are extracted at editing time from the document itself by an integrated expert system, and are used to express the structure of the document. This paper also focusses on the use of a document preparation system as the mediator for a larger application.
In this approach, genetic sequences are treated as generalized documents. This choice makes it possible to associate convenient, and therefore more legible, visual representations with the abstract aspects of the semantics of biological sequences.
First, we explain how semantic information on the sequences is obtained and used to guide editing. The BIOSTATION architecture is then presented in a second section.
Problem position
The genetic information which allows organic cells to synthesize proteins is kept in genes. These genes are linear strings built from four types of molecules (adenine, thymine, guanine, cytosine) called nucleotides. Non-biologist readers can refer to [Hélène 84]. The studied length of such strings can be up to 30000 atoms.
Biologists can analyse a gene to make its formula explicit as a word over the alphabet {A, T, G, C}, and operations can be performed on the gene (in vitro or in vivo) to modify it by inserting or deleting parts at precise positions.
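As an illustration only, the view of a gene as a word over {A, T, G, C}, edited by insertions and deletions at precise positions, can be sketched in a few lines of Python. The function names (`validate`, `insert_at`, `delete_range`) and the sample sequence are assumptions made for this sketch; they are not taken from the BIOSTATION itself.

```python
# Hypothetical sketch: a gene as a word over the alphabet {A, T, G, C}.
# None of these names come from the BIOSTATION; they are illustrative only.
ALPHABET = frozenset("ATGC")


def validate(seq: str) -> str:
    """Return seq unchanged if every symbol is a nucleotide letter."""
    bad = set(seq) - ALPHABET
    if bad:
        raise ValueError(f"invalid symbols: {sorted(bad)}")
    return seq


def insert_at(seq: str, pos: int, fragment: str) -> str:
    """Insert fragment so that it starts at index pos (0-based)."""
    validate(fragment)
    if not 0 <= pos <= len(seq):
        raise IndexError("position out of range")
    return seq[:pos] + fragment + seq[pos:]


def delete_range(seq: str, start: int, length: int) -> str:
    """Delete length symbols beginning at index start (0-based)."""
    if start < 0 or length < 0 or start + length > len(seq):
        raise IndexError("range out of bounds")
    return seq[:start] + seq[start + length:]


gene = validate("ATGGCC")
edited = insert_at(gene, 3, "TTA")      # insertion at a precise position
restored = delete_range(edited, 3, 3)   # deleting the same span undoes it
```

Strings are immutable in Python, so each operation returns a new word. That is adequate for a sketch, though an editor handling sequences tens of thousands of symbols long would probably use a different representation.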
This paper presents a comprehensive survey of the typographic issues for laying out information within two-dimensional tables. Early typesetting systems formatted tables by coding the table style and layout into the program, and later systems provided a limited range of typographic features. The typographic issues include table structure, alignment of rows and columns simultaneously, formatting styles, treatment of whitespace within a table, graphical embellishments, placement of footnotes, various readability issues, and the problems of breaking large tables. Extending the table formatting problem to both page layout and arrangement of mathematical notation is highlighted, as is the need for interactive design tools for table layout.
Introduction
This paper presents a comprehensive survey of the typographic issues for laying out information within a two-dimensional table. Tables are a concentrated form of the more general layout problem; one can find table formatting analogies in both the larger-scale problem of page makeup, and the smaller-scale problem of aligning notation within a mathematical equation.
Few table formatting tools have addressed all the issues raised by this paper. In fact, it was a challenge to identify the various issues that typographers, compositors, and graphic designers have managed with great skill through the traditional graphic arts processes. Thus this paper provides a checklist for the designs and implementations of new table formatting tools, algorithms, and structures.
Document storage and retrieval systems should possess fast string search capabilities. The access paths needed to reduce the search times require substantial amounts of storage in addition to the very large storage requirements for the documents themselves. In this paper we investigate a technique that supports access paths on compressed documents, so that the total storage requirements for the access paths and the compressed documents are less than that for the original documents.
Introduction
Advances in hardware technology are unlikely to keep pace with the increasing growth of on-line document storage. In an environment where the trend is towards local and wide area networks (there is the promise of an interconnected society around the corner), a large number of documents would be transmitted between nodes. If a satisfactory service is to be provided at reasonable cost, the storage of documents and their communication along network paths and between peripherals and processors require that the documents be held more compactly than at present. Because natural language is highly redundant, a suitable encoding scheme could be used, with any resultant compression reducing both storage and communication costs. In an online environment the compression and decompression schemes must not involve excessive overheads in either time or space; since the documents would need to be compressed only once for storage but decompressed (or retrieved) more often, it is possible to tolerate higher levels of overhead during the compression stage.
Document retrieval requires fast string search capabilities, and it is usual to provide additional access paths to reduce the search times e.g. by providing inverted lists on words. In [Goyal83] a scheme was proposed that made use of inverted indexes associated with compressed documents.
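To make the idea of an inverted list on words concrete, the following minimal Python sketch maps each word to the set of documents containing it. It does not reproduce the [Goyal83] scheme, which is not described here, and it works on uncompressed text; the function names and sample documents are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical sketch of an inverted list on words. It operates on plain,
# uncompressed text and is NOT the scheme proposed in [Goyal83].
WORD = re.compile(r"[a-z0-9]+")


def build_inverted_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each lower-cased word to the ids of the documents containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in WORD.findall(text.lower()):
            index[word].add(doc_id)
    return dict(index)


def search_all(index: dict[str, set[int]], query: str) -> set[int]:
    """Return ids of documents containing every word of the query."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings)


docs = {
    1: "Document storage and retrieval",
    2: "Compressed document storage",
    3: "String search on networks",
}
index = build_inverted_index(docs)
hits = search_all(index, "document storage")
```

A query then costs a few dictionary lookups and a set intersection rather than a scan of every document, at the price of the extra storage that the access path itself occupies.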
It is difficult to make dynamic documents easy to use, but even more difficult to make authorship of dynamic documents simple. This paper outlines a system called GUIDE, which provides users with a modest yet powerful set of facilities for viewing documents on screens. GUIDE aims at a close integration of the author's view with the reader's view. The paper discusses the advantages of this approach, and the problems of adding functionality to a conceptually simple system.
Introduction
Writing good documentation is hard. Hardest of all, perhaps, is to write good user manuals for computer software. The reason why this is so difficult is that there is a great diversity of possible readers, and of modes of perusal. Readers will range from the naive to the expert, and in between there are important special cases of readers with expertise in a related area, such as a FORTRAN expert learning BASIC. Furthermore readers, whatever their background, will want to peruse the user manual in different ways at different stages in their learning process. Early on they will want summaries and tutorial information; later they may want to browse; finally they will want a reference manual. In order to cover this spectrum properly you need a huge range of user manuals. In a few spheres this range exists: there is, for example, a big range of manuals – mostly books – on Pascal and many of these are aimed at specific niches in the market of possible readers.
Integrated Editor/Formatters merge the document editing and formatting functions into a unified, interactive system. A common type of Integrated Editor/Formatter, the Exact-representation Editor/Formatter (also known as WYSIWYG), presents an interactive representation of the document that is identical to the printed document. Another powerful metaphor applied to documents has been to describe the document as abstract objects, that is, to describe the document's logical structure, not its physical makeup. The goal of the research reported here is to merge the flexibility found in the abstract object-oriented approach with the naturalness of document manipulation provided by the Exact-representation Editor/Formatters. A tree-based model of documents that allows a variety of document objects as leaves (e.g., text, tables, and mathematical equations) has been defined. I suggest a template-oriented mechanism for manipulating the document and have implemented a prototype that illustrates the mechanism. Further work has concentrated on handling user operations on arbitrary, contiguous portions of the display.
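A tree-based document model with heterogeneous leaves might look roughly like the Python sketch below. The class names (`Node`, `Text`, `Table`, `Equation`) and the `leaves` helper are hypothetical and chosen for this illustration; the prototype's actual data structures and its template mechanism are not shown here.

```python
from dataclasses import dataclass, field
from typing import Iterator, Union

# Hypothetical sketch of a tree-based document model; the class names are
# illustrative and are not taken from the prototype described above.


@dataclass
class Text:
    content: str


@dataclass
class Table:
    rows: list[list[str]]


@dataclass
class Equation:
    source: str  # some textual notation for the equation (assumed)


Leaf = Union[Text, Table, Equation]


@dataclass
class Node:
    kind: str  # logical role, e.g. "document" or "section"
    children: list[Union["Node", Leaf]] = field(default_factory=list)


def leaves(item: Union[Node, Leaf]) -> Iterator[Leaf]:
    """Yield the leaf objects of the tree in document order."""
    if isinstance(item, Node):
        for child in item.children:
            yield from leaves(child)
    else:
        yield item


doc = Node("document", [
    Node("section", [Text("Introduction"), Equation("E = mc^2")]),
    Node("section", [Table([["a", "b"], ["1", "2"]])]),
])
leaf_kinds = [type(leaf).__name__ for leaf in leaves(doc)]
```

Interior nodes carry only logical structure, while the leaves hold the typed content, so a formatter or editor can walk the tree without knowing in advance which kinds of objects it will meet.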
Motivation and Goals of the Research
The world of text formatters can be divided into two parts. In one group are the pure formatters, which convert a document description, prepared by a separate editing system, into a formatted document suitable for display on an appropriate hardware device. In the other group are the Integrated Editor/Formatters, which merge the editing and the formatting functions into one unified, interactive system: documents are created, viewed, and revised without leaving the editor/formatter.
Two experiments were conducted to investigate the effects of font-styles on legibility and on reading proficiency. The font-styles studied include Letter Gothic, Courier, and DECwriter font. Both upper and lower case letters were studied. The results revealed the significant effects of different font-styles, ambiguities between letters, and method of presentation. Directions for future research were suggested.
Introduction
Advances in computer technology have brought many new methods of conducting research in typesetting and typography. One area which benefits a lot from the use of modern computers is related to type-font design and evaluation. With the aid of a computerized optical scanner, characters in different font styles can be read and converted into digital images. Once these images have been binarized [Suen 1986], they can be stored in matrices and reproduced easily by computers and matrix and laser printers. Hence it is not surprising to see that digital fonts have become more and more widely used in the computer environment. While these binary matrices can be stored and used later for printing, they form a new tool to study several subjects in typography such as legibility of font styles, reading speed/comprehension and font style, spacing of words and texts on the page, line lengths and character sizes. These topics have been of great interest to many psychologists and others (see e.g. Burt 1959, Tinker 1963, Hartley et al 1983). Using modern equipment, these topics can be studied much more rigorously and efficiently than before due to the fact that many parameters such as font styles, character shapes, exposure time, presentation speed, format and spacing, etc.
Grif is an interactive system for editing and formatting complex documents. It manipulates structured documents containing objects of various types: tables, mathematical formulae, programs, pictures, graphics, etc. It is a structure-directed editor which guides the user in accordance with the structure of the document and of the objects being edited; the image displayed on the screen is also constructed from that structure. Flexibility is one of the most interesting characteristics of Grif. The user can define new document structures and new types of objects, as well as specify the way in which the system displays these documents and objects.
Presentation
Existing document manipulation systems may be classified into various categories. There are batch formatters [Furuta] and interactive systems [Meyrowitz]. Some formatters such as Scribe [Reid] or Mint [Hibbard] consider the logical structure of the documents they manipulate. Some others, like TEX [Knuth] or Troff [Kernighan], are more concerned with layout, even if macros allow some structure to be introduced in the document.
Formatters have also evolved towards more friendly tools, that allow the user to see quickly on the screen the result of his work: TEX, for example, has several ‘preview’ systems. Janus [Chamberlin] is an original system that has been developed with the same approach. Although they allow the user to see the final form of the document on the screen, these systems cannot be considered as really interactive, as they do not allow the user to interact directly on the final form of the document. Other extensions to formatters have been proposed, by adding a truly interactive editor [André].
This paper discusses a document retrieval system designed as an integrated part of ICL's networked office product line. The system is designed to use the ISO ODA standard for document interchange and to support a user interface that can be tailored to the needs of particular users. The CAFS-ISP search engine, a special purpose hardware device, is used to locate the document required.
Introduction
This paper describes a project within ICL's Office Business Centre that is designing a new document filing and retrieval system. The system is designed to integrate with ICL's networked office product line and to make maximum use of international standards for Open Systems Interconnection. The project is known internally as Textmaster, and an initial subset of the total system is being delivered to selected customers during 1986 under the name ICLFILE.
The system is designed to allow end-users to find the documents they are interested in by means of simple enquiries. They may then view these documents either directly on the screen, or by requesting a printed copy, or by having the document mailed to them electronically. Throughout this process both the typographical layout and the editability of the document are fully preserved. Thus if the user requests a printed copy this can be produced in high quality on a laser printer if required, while if he wishes to edit the document all the necessary layout directives will be preserved. If the document is viewed on the screen it can be presented in a format as close to the printed layout as the screen characteristics will allow: the popular ‘what-you-see-is-what-you-get’ feature of modern word processors.
The use of a multi-task system seems to open up new perspectives in document preparation. This paper presents such an approach, bringing together the wide possibilities of old markup techniques with the convenience of recently appeared interactive systems. It requires a very clear separation between a document's content and its formatting specification. Furthermore the latter can be favourably expressed with a descriptive formalism based on the document's logical structure.
Introduction
The subject matter of this paper stems from ideas developed in the context of a research contribution made in Lausanne on a document preparation project. The initial goal to produce technical reports has been broadened to solve more general document preparation problems (flexibility, modularity).
As interactive editing systems that include sophisticated typographical features become more fashionable, one might expect traditional formatting techniques to give way. The fact that this is not really the case is due to the advantages and shortcomings inherent in either approach: fast viewing and nice man-machine interface on the WYSIWYG systems, highest typographical quality and greater portability of documents through a variety of textprocessing software on the markup based textfile formatters.
Attempting to combine the strengths of both approaches entails several requirements. First, the formatting process needs a flexible parametrisation that provides a descriptive formatting specification, clearly separated from the document's content. This approach should offer more flexibility and guarantee portability of a document to several systems with different printing devices. Second, a multi-tasking environment should make it possible to blend user comfort with the high typographic quality achieved by sophisticated formatting functions.
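The first requirement, a descriptive formatting specification kept apart from the content, can be illustrated with a small Python sketch. The element kinds, the two style tables and the `render` function are assumptions made for this example and do not reflect the formalism used in the Lausanne project.

```python
# Hypothetical sketch: content tagged by logical structure, with formatting
# held in separate, interchangeable specifications. All names are illustrative.
content = [
    ("title", "Technical Report"),
    ("paragraph", "Formatting is specified apart from content."),
]

# Two formatting specifications for the same content, e.g. for two devices.
PLAIN_STYLE = {
    "title": {"upper": True, "indent": 0},
    "paragraph": {"upper": False, "indent": 4},
}
DRAFT_STYLE = {
    "title": {"upper": False, "indent": 2},
    "paragraph": {"upper": False, "indent": 2},
}


def render(items: list[tuple[str, str]], style: dict[str, dict]) -> str:
    """Format each logical element according to the given specification."""
    lines = []
    for kind, text in items:
        spec = style[kind]
        if spec["upper"]:
            text = text.upper()
        lines.append(" " * spec["indent"] + text)
    return "\n".join(lines)


plain = render(content, PLAIN_STYLE)
draft = render(content, DRAFT_STYLE)
```

Because the content list never mentions indentation or case, the same document can be rendered under either specification without being edited, which is the portability argument made above.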
With the recent development of cheap highly functional laser printers and Raster Image Processors, there has been an upsurge of interest in languages for interfacing to these devices. An approach to the design of such a Page Description Language is described, the primary design requirement being a clean interface which is an easy target for translators from various front-end systems. The design of an actual PDL, the Chelgraph ACE language, based on these principles is described. Finally the ACE language is reviewed in the light of experience gained in its use.
Introduction
This paper discusses some issues relevant to the design of a Page Description Language (PDL). A PDL is a type of language commonly used for communicating page information from a composition system to an intelligent page printer. These languages are usually specified by the printer manufacturer as an input language, but device-independent outputs from some composition packages, for example DI-TROFF, are also PDLs. I also present a particular PDL, the ACE language, which has been designed by us at Chelgraph and implemented on our Raster Image Processor; the features of ACE itself have been described elsewhere [Chel84, Harris84].
What Is a PDL?
There have always been languages for communicating page information to typesetters and printers. Until recently the capabilities of computer output printers have been very limited, so their input languages have been simple ASCII formats modified by escape codes. Typesetters such as the Autologic APS 5, on the other hand, have quite complex input languages with a syntax and tens of commands.
In this paper, a system for interactive creation and browsing of dynamic documents is described. The Concept Browser allows the user to create a semantic network of interrelated concepts and interactively navigate through the network. Outlines and printed documents can be automatically generated from the network of concepts. The Concept Browser has been designed and implemented in a Smalltalk programming environment. An interactive, window-based user interface is provided that allows the user to browse through and modify the network of concepts.
Possible applications for the Concept Browser are in the areas of on line documentation, tutorial systems, document preparation systems and electronic books.
Introduction
The traditional method of storing information in a printed, linear form as it is done with conventional books has been demonstrated to be inadequate both to represent the complexity of information and to offer quick and flexible access to it [8].
The personal computer appears to be the ideal tool to satisfy these requirements, but a simple computer-based transcription of traditional books is not the best way of taking advantage of the new functionalities offered by computers.
The purpose of this article is to describe one experiment in the design of a documentation system that can take advantage of the flexible data structures and advanced user interface provided by a Smalltalk programming environment.
A semantic network was chosen as the best way of representing the complexity of information ([4],[6]) instead of a more traditional tree structure [5].
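As a rough illustration of why a network differs from a tree, the Python sketch below stores concepts with labelled links, allows cycles, and derives an outline by a depth-first walk. The Concept Browser itself was implemented in Smalltalk; the class `ConceptNetwork`, its methods and the sample concepts are assumptions made for this sketch.

```python
# Hypothetical sketch of a semantic network of concepts with labelled links.
# The real Concept Browser is a Smalltalk system; these names are illustrative.


class ConceptNetwork:
    def __init__(self) -> None:
        # concept -> list of (relation, target concept)
        self.links: dict[str, list[tuple[str, str]]] = {}

    def relate(self, source: str, relation: str, target: str) -> None:
        """Add a labelled, directed link between two concepts."""
        self.links.setdefault(source, []).append((relation, target))
        self.links.setdefault(target, [])

    def neighbours(self, concept: str) -> list[str]:
        """Concepts reachable from concept in one step."""
        return [target for _, target in self.links.get(concept, [])]

    def outline(self, root: str) -> list[str]:
        """Depth-first outline; each concept appears once, so cycles are safe."""
        lines: list[str] = []
        seen: set[str] = set()

        def visit(concept: str, depth: int) -> None:
            if concept in seen:
                return
            seen.add(concept)
            lines.append("  " * depth + concept)
            for target in self.neighbours(concept):
                visit(target, depth + 1)

        visit(root, 0)
        return lines


net = ConceptNetwork()
net.relate("Document", "has part", "Section")
net.relate("Section", "has part", "Paragraph")
net.relate("Paragraph", "belongs to", "Document")  # a cycle, unlike a tree
outline = net.outline("Document")
```

The outline is only one linear view of the network: starting from a different concept, or following only certain relations, would yield a different document from the same underlying links.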
Speech act theory has its roots in the work of Wittgenstein, who in Philosophical Investigations proposed an analogy between using language and playing games. His basic point was that language is a form of rule-governed behavior, much the same as game-playing, employing rules and conventions that are mutually known to all the participants.
The field of speech act theory is usually considered to have been founded by Austin (1962) who analyzed certain utterances called performatives. He observed that some utterances do more than express something that is true about the world. In uttering a sentence like “I promise to take out the garbage,” the speaker is not saying anything about the world, but is rather undertaking an obligation. An utterance like “I now pronounce you man and wife” not only does not say anything that is true about the world, but when uttered in an appropriate context by an appropriate speaker, actually changes the state of the world. Austin argued that an account of performative utterances required an extension of traditional truth-theoretic semantics.
The most significant contribution to speech act theory has been made by philosopher John Searle (1969, 1979a, 1979b), who was the first to develop an extensive formulation of the theory of speech acts.
KAMP represents the first step in a very ambitious program of research. It is appropriate at this time to reflect upon this program, how far we have come, and what lies in the future.
KAMP represents not merely an attempt to devise an expedient strategy for getting text out of a computer, but rather embodies an entire theory of communication. The goal of such a theory could be summarized by saying that its objective is to account for how agents manage to intentionally affect the beliefs, desires and intentions of other agents. Developing such a theory requires examining utterances to determine the goals the speakers are attempting to achieve thereby, and in the process explicating the knowledge about their environment, about their audience, and about their language that these speakers must have. Language generation has been chosen as an ideal vehicle for the study of problems arising from such a theory because it requires one to face the problem of why speakers choose to do the things they do in a way that is not required by language understanding. Theories of language understanding make heavy use of the fact that the speaker is behaving according to a coherent plan. Language generation requires producing such a coherent plan in the first place, and therefore requires uncovering the underlying principles that make such a plan coherent.