So far we have studied single-hop networks in which each node is either a sender or a receiver. In this chapter, we begin the discussion of multihop networks, where some nodes can act as both senders and receivers and hence communication can be performed over multiple rounds. We consider the limits on communication of independent messages over networks modeled by a weighted directed acyclic graph. This network model represents, for example, a wired network or a wireless mesh network operated in time or frequency division, where the nodes may be servers, handsets, sensors, base stations, or routers. The edges in the graph represent point-to-point communication links that use channel coding to achieve close to error-free communication at rates below their respective capacities. We assume that each node wishes to communicate a message to other nodes over this graphical network. The nodes can also act as relays to help other nodes communicate their messages. What is the capacity region of this network?
Although communication over such a graphical network is not hampered by noise or interference, the conditions on optimal information flow are not known in general. The difficulty arises in determining the optimal relaying strategies when several messages are to be sent to different destination nodes.
We first consider the graphical multicast network, where a source node wishes to communicate a message to a set of destination nodes. We establish the cutset upper bound on the capacity and show that it is achievable error-free via routing when there is only one destination, leading to the celebrated max-flow min-cut theorem. When there are multiple destinations, routing alone cannot achieve the capacity, however. We show that the cutset bound is still achievable, but using more sophisticated coding at the relays. The proof of this result involves linear network coding in which the relays perform simple linear operations over a finite field.
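To fix ideas, the single-destination statement and its multicast generalization can be written compactly. The notation below is ours, a sketch of the standard formulation: $C_{jk}$ denotes the capacity of edge $(j,k)$, $s$ the source node, and $\mathcal{D}$ the set of destination nodes.

% Max-flow min-cut: capacity to a single destination d equals the
% smallest total edge capacity crossing any cut separating s from d.
C = \min_{S \,:\, s \in S,\; d \notin S} \;\sum_{(j,k)\,:\, j \in S,\; k \notin S} C_{jk}

% Cutset bound for multicast to destination set D; the chapter shows
% this is achievable with linear network coding:
C = \min_{d \in \mathcal{D}} \; \min_{S \,:\, s \in S,\; d \notin S} \;\sum_{(j,k)\,:\, j \in S,\; k \notin S} C_{jk}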
We then consider graphical networks with multiple independent messages. We show that the cutset bound is tight when the messages are to be sent to the same set of destination nodes (multimessage multicast), and is achieved again error-free using linear network coding. When each message is to be sent to a different set of destination nodes, however, neither the cutset bound nor linear network coding is optimal in general.
The source and network models we discussed so far capture many essential ingredients of real-world communication networks, including
• noise,
• multiple access,
• broadcast,
• interference,
• time variation and uncertainty about channel statistics,
• distributed compression and computing,
• joint source–channel coding,
• multihop relaying,
• node cooperation,
• interaction and feedback, and
• secure communication.
Although a general theory for information flow under these models remains elusive, we have seen that there are several coding techniques—some of which are optimal or close to optimal—that promise significant performance improvements over today's practice. Still, the models we discussed do not capture other key aspects of real-world networks.
• We assumed that data is always available at the communication nodes. In real-world networks, data is bursty and the nodes have finite buffer sizes.
• We assumed that the network has a known and fixed number of users. In real-world networks, users can enter and leave the network at will.
• We assumed that the network operation is centralized and communication over the network is synchronous. Many real-world networks are decentralized and communication is asynchronous.
• We analyzed performance assuming arbitrarily long delays. In many networking applications, delay is a primary concern.
• We ignored the overhead (protocol) needed to set up the communication as well as the cost of feedback and channel state information.
While these key aspects of real-world networks have been at the heart of the field of computer networks, they have not been satisfactorily addressed by network information theory, either because of their incompatibility with the basic asymptotic approach of information theory or because the resulting models are messy and intractable. There have been several success stories at the intersection of networking and network information theory, however. In this chapter we discuss three representative examples.
We first consider the channel coding problem for a DMC with random data arrival. We show that reliable communication is feasible provided that the data arrival rate is less than the channel capacity. Similar results can be established for multiuser channels and multiple data streams. A key new ingredient in this study is the notion of queue stability.
Endowing robots with the ability to learn skills enables them to be versatile and skillful in performing various tasks. This paper proposes a neuro-fuzzy-based, self-organizing skill-learning framework, which differs from previous work in its capability of decomposing a skill by self-categorizing it into significant stimulus-response units (SRUs, the fundamental units of our skill representation) and of self-organizing learned skills into a new skill. The proposed framework is realized through skill decomposition and skill synthesis. Skill decomposition aims at representing and acquiring a skill by SRUs, and is implemented in stages with a five-layer neuro-fuzzy network using supervised learning, resolution control, and reinforcement learning, so that robots can identify a sufficient number of significant SRUs to accomplish a given task without extraneous actions. Skill synthesis aims at organizing a new skill by sequentially planning learned skills composed of SRUs, and is realized in stages that establish common SRUs between two similar skills and self-organize a new skill from these common SRUs and additional new SRUs by reinforcement learning. Computer simulations and experiments with a Pioneer 3-DX mobile robot were conducted to validate the self-organizing capability of the proposed skill-learning framework in identifying significant SRUs from task examples, establishing common SRUs between similar skills, and learning new skills from learned skills.
This paper presents an investigation of part-of-speech (POS) tagging for Arabic as it occurs naturally, i.e. unvocalized text (without diacritics). We also do not assume any prior tokenization, although tokenization has previously been used as a basis for POS tagging. Arabic is a morphologically complex language, i.e. there is a high number of inflections per word, and the tagset is larger than the typical tagset for English. Both factors, the second one being partly dependent on the first, increase the number of word/tag combinations for which the POS tagger needs to find estimates, and thus they contribute to data sparseness. We present a novel approach to Arabic POS tagging that does not require any pre-processing, such as segmentation or tokenization: whole-word tagging. In this approach, the complete word is assigned a complex POS tag, which includes morphological information. A competing approach investigates the effect of segmentation and vocalization on POS tagging to alleviate data sparseness and ambiguity. In the segmentation-based approach, we first automatically segment words and then POS tag the segments. The complex tagset encompasses 993 POS tags, whereas the segment-based tagset encompasses only 139 tags. However, segments are also more ambiguous, so there are more possible combinations of segment tags. In realistic situations, in which we have no information about segmentation or vocalization, whole-word tagging reaches the highest accuracy, 94.74%. If gold-standard segmentation or vocalization is available, including this information improves POS tagging accuracy. However, while our automatic segmentation and vocalization modules reach state-of-the-art performance, their output is not reliable enough for POS tagging and actually impairs tagging performance. Finally, we investigate whether a reduction of the complex tagset to the Extra-Reduced Tagset, as suggested by Habash and Rambow (Habash, N., and Rambow, O. 2005. Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), Ann Arbor, MI, USA, pp. 573–80), will alleviate the data sparseness problem. While POS tagging accuracy increases due to the smaller tagset, a closer look shows that using the complex tagset for POS tagging and then converting the resulting annotation to the smaller tagset yields higher accuracy than tagging with the smaller tagset directly.
The European Molecular Biology Open Software Suite (EMBOSS) is a high quality, well documented package of open source software tools for molecular biology. EMBOSS includes extensive and extensible C programming libraries, providing a powerful and robust toolkit for developing new bioinformatics tools from scratch. The EMBOSS Developer's Guide is the official and definitive guide to developing software under EMBOSS. It includes comprehensive reference information and guidelines, including step-by-step instructions and real-world code examples:
• Learn how to write fully-featured tools guided by the people who developed EMBOSS
• Step-by-step guide to writing EMBOSS applications, illustrated with functional, deployed code
• ACD file development – learn how to customise existing tools without coding, or design and write entirely new application interfaces
• EMBOSS API programming guidelines – quickly master application development
• Wrapping and porting applications under EMBOSS – learn how to incorporate third-party tools
The proceedings of the Los Angeles Caltech-UCLA 'Cabal Seminar' were originally published in the 1970s and 1980s. Wadge Degrees and Projective Ordinals is the second of a series of four books collecting the seminal papers from the original volumes together with extensive unpublished material, new papers on related topics and discussion of research developments since the publication of the original volumes. Focusing on the subjects of 'Wadge Degrees and Pointclasses' (Part III) and 'Projective Ordinals' (Part IV), each of the two sections is preceded by an introductory survey putting the papers into present context. These four volumes will be a necessary part of the book collection of every set theorist.
Fortran is one of the oldest high-level languages and remains the premier language for writing code for science and engineering applications. This book is for anyone who uses Fortran, from the novice learner to the advanced expert. It describes best practices for programmers, scientists, engineers, computer scientists and researchers who want to apply good style and incorporate rigorous usage in their own Fortran code or to establish guidelines for a team project. The presentation concentrates primarily on the characteristics of Fortran 2003, while also describing methods in Fortran 90/95 and valuable new features in Fortran 2008. The authors draw on more than a half century of experience writing production Fortran code to present clear succinct guidelines on formatting, naming, documenting, programming and packaging conventions and various programming paradigms such as parallel processing (including OpenMP, MPI and coarrays), OOP, generic programming and C language interoperability.
The authors thank the Numerical Algorithms Group Ltd. (NAG) who provided us with copies of their excellent compiler with which we could test our code. In particular, thanks go to Mr. Malcolm Cohen, Mr. Rob Holmes, Mr. Ian Hounam, Mr. Rob Meyer, Mr. Mike Modica, and Mr. John Morrissey.
The Portland Group provided us with a copy of their compiler. Special thanks go to Ms. Laura Gibon for arranging that.
We thank Mr. Art Lazanoff for the use of his network server system for our CVS repository.
We thank Mr. Dan Nagle who offered vigorous criticism and some good suggestions.
The following persons read over the manuscript; to them we owe our gratitude: Dr. Greg Brown, Dr. Charles Crawford, Mr. Ryan O'Kuinghttons, and Dr. James Hlavka.
Thanks go to Ms. Stacy L. Castillo at the IBM Corporate Archives for arranging permission to use the material for the frontispiece.
It was a great pleasure to work with our editors at Cambridge University Press: Ms. Heather Bergman, Ms. Lauren Cowles and Mr. David Jou.
Typographical Conventions
The following typographical conventions are used in this book:
• medium-weight serif font – normal text
This sentence is written in the font used for normal text.
This appendix provides the user with a complete list of all the guidelines contained in the book. They are listed in the order of their appearance and grouped by chapter. The page on which each rule can be found is given in parentheses following the rule.
Chapter 2. General Principles
1. Write programs that are clear to both the reader and the compiler. (3)
2. Write programs that can be efficiently tested. (4)
3. Write programs that will scale to different problem sizes. (5)
4. Write code that can be reused. (6)
5. Document all code changes, keeping a history of all code revisions. (6)
Chapter 3. Formatting Conventions
6. Always use free source form. (9)
7. Adopt and use a consistent set of rules for case. (10)
7.1 Use lowercase throughout. (11)
7.2 Capitalize the names of all user-written procedures. (11)
7.3 Write all named constants using uppercase letters. (12)
7.4 Begin the name of all data entities using a lowercase letter. (12)
8. Use a consistent number of spaces when indenting code. (13)
9. Increase the indentation of the source code every time the data scope changes. (13)
10. Indent the block of code statements within all control constructs. (14)
Use detailed names for data objects whose scope is global, less detailed names for those whose scope is a module, and simple but clear names for those whose scope is a single procedure.
Symbolic names are used in many places. At the outermost level are the names of modules, the main program, and external procedure program units. Within the confines of a program unit are derived-type definitions, named constants, and variables. In addition, there are internal procedures and interface blocks.
Within individual procedures, there are statement labels for control flow and I/O purposes.
Generally, the more global the name, the longer and more descriptive it should be. Likewise, the more limited the scope of a name, the shorter it should be. For example, a module name should indicate the purpose of the definitions and related procedures it contains, such as Triangular_solver_mod, whereas a simple loop index variable may be called i or j.
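As a sketch of this convention (Triangular_solver_mod appears in the text; the procedure, constant, and variable names below are our own illustration):

module Triangular_solver_mod                 ! global scope: long, descriptive name
   implicit none
   real, parameter :: TOLERANCE = 1.0e-6     ! named constant in uppercase (Rule 7.3)
contains
   subroutine Solve_lower_triangular(a, b, x)   ! procedure name capitalized (Rule 7.2)
      real, intent(in)  :: a(:,:), b(:)
      real, intent(out) :: x(:)
      integer :: i, j                         ! procedure scope: short, simple names
      do i = 1, size(b)                       ! forward substitution
         x(i) = b(i)
         do j = 1, i - 1
            x(i) = x(i) - a(i,j) * x(j)
         end do
         x(i) = x(i) / a(i,i)
      end do
   end subroutine Solve_lower_triangular
end module Triangular_solver_mod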
Name user-written procedures using verbs.
Almost all procedures perform some task. Name them using one or more verbs that succinctly describe the operation carried out. If appropriate, follow each verb with a specific noun that describes the object being used. This method is especially useful when you name functions; it aids in distinguishing them from arrays. (See also Rules 7.2 and 16.)
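For instance (the names here are hypothetical), a verb-based function name is hard to mistake for an array reference at the call site:

! "Compute_average" reads as an action; a noun like "average" alone
! could be confused with an array of the same name.
pure function Compute_average(values) result(average)
   real, intent(in) :: values(:)
   real :: average
   average = sum(values) / real(size(values))
end function Compute_average

! At the call site:  mean = Compute_average(samples)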
Reading data and writing results are fundamental operations common to almost all programs. Fortran provides a built-in set of input/output (I/O) statements. These statements generally operate on data files residing on disk drives and on devices, such as keyboards and console displays, that the operating system can present to a program in a file-like manner. Different devices have different capabilities, so not all operations are supported on each one. The executable I/O statements, divided into two groups, are shown in Table 8.1.
Fortran has always structured its I/O capabilities around the read and write statements. These statements operate on Fortran “units,” which are represented by simple cardinal numbers called “unit numbers.” A unit that is connected to a data file typically occupies system resources, such as buffers and file descriptors, from the time it is connected, when the file is opened, until the time it is disconnected using the close statement.
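A minimal sketch of this unit-based model; the file name and error handling are illustrative, and the newunit= specifier assumes a Fortran 2008 compiler:

program read_example
   implicit none
   integer :: unit_number, io_status
   real :: value

   open (newunit=unit_number, file='input.dat', &      ! connect a unit to the file
         status='old', action='read', iostat=io_status)
   if (io_status /= 0) stop 'cannot open input.dat'

   read (unit_number, *, iostat=io_status) value       ! operate on the unit
   if (io_status == 0) print *, 'read value:', value

   close (unit_number)                                 ! release system resources
end program read_example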
General I/O Operations
Use the named constants in the intrinsic module iso_fortran_env.
The intrinsic module iso_fortran_env provides named constants to help make applications portable. These include the standard input, output, and error unit numbers (INPUT_UNIT, OUTPUT_UNIT, and ERROR_UNIT) and values returned by the iostat specifier, such as IOSTAT_END. The iostat specifier is an argument common to many of the I/O statements; it returns an integer indicating the success or failure of the I/O operation (see Rules 104 and 105).
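A short sketch of how these constants might be used (the program itself is our own illustration):

program portable_io
   use, intrinsic :: iso_fortran_env, only: input_unit, output_unit, &
                                            error_unit, iostat_end
   implicit none
   integer :: io_status
   character(len=80) :: line

   do                                          ! copy standard input to standard output
      read (input_unit, '(a)', iostat=io_status) line
      if (io_status == iostat_end) exit        ! normal end of input
      if (io_status /= 0) exit                 ! any other I/O error
      write (output_unit, '(a)') trim(line)
   end do
   write (error_unit, '(a)') 'done copying standard input'
end program portable_io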
In the world of scientific and engineering computing, applications must often take advantage of all the processing power the system is capable of providing. With the advent of low-cost computers with multiple processing cores, writing programs that employ parallel processing has become common.
Modern computers operate in the digital domain, yet users wish to simulate or control physical processes, which are usually continuous, or analog, in nature. The conversion between the two domains is necessarily imprecise. The closer the digital process can model the real-world one, the better the result.
A numerical model must be fine-grained enough to accurately model the process, yet it must not be too fine, or resource requirements such as CPU utilization and main memory space will be exceeded. A practical choice must often be made between precision of results and the ability to produce results in a reasonable time frame, if at all.
When faced with the requirement to speed up a computation, it is natural to consider parallel processing as a solution. At its inception, Fortran was designed in terms of a scalar processor model. That is, only a single thread of control is present, and code is executed in a serial fashion. However, as computers have evolved, many features have been added to the language that can be executed in parallel. These parallel capabilities are built in. Their use depends on the hardware available and the compiler.
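One such built-in facility is the coarray feature of Fortran 2008. The following sketch is our own, not from the text: each image computes a partial result, and image 1 combines them.

program parallel_sum
   implicit none
   integer :: partial[*]        ! a coarray: one copy of partial on every image
   integer :: i, total

   partial = this_image()       ! stand-in for real per-image work
   sync all                     ! make every image's value visible

   if (this_image() == 1) then
      total = 0
      do i = 1, num_images()
         total = total + partial[i]   ! read image i's copy
      end do
      print *, 'sum over images:', total
   end if
end program parallel_sum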
Over the years, the Fortran standards committee has striven to keep each new standard backward-compatible with previous releases. Fortran is one of the oldest high-level languages, and much old code is still in use. The committee has admirably succeeded, and programmers have known that they could continue to write programs that contained old features and that they could add features of the new standards whenever it was convenient, useful, or appropriate. Each new standard has marked only a few old features as “obsolescent,” defined as “A feature that is considered redundant but that is still in frequent use.” Those so marked in one standard may be “deleted” in a subsequent one. The standard describes a “deleted” feature as “A feature in a previous Fortran standard that is considered to have been redundant and largely unused.” This chapter describes many of these old features and the new ones provided by the modern Fortran standards (meaning Fortran 90 and later) that you can use to replace them. We note the status of each old feature. For further details, see the appropriate language standard: Fortran 90, Reference [40]; Fortran 95, Reference [42]; Fortran 2003, Reference [39]; and Fortran 2008, Reference [43].
Statements
Replace common blocks with modules.
In FORTRAN 66 and FORTRAN 77, programmers stored global data in common blocks. A program could have one unnamed common block and any number of named blocks.
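A minimal sketch of the replacement (the names and values are illustrative):

! FORTRAN 77 style, now better avoided:
!     common /constants/ gravity, density
module constants_mod
   implicit none
   real :: gravity = 9.80665      ! m/s^2
   real :: density = 1000.0       ! kg/m^3, water
end module constants_mod

subroutine report_weight(volume)
   use constants_mod, only: gravity, density   ! replaces the common block
   implicit none
   real, intent(in) :: volume
   print *, 'weight:', density * volume * gravity, 'N'
end subroutine report_weight

Unlike a common block, the module gives every user the same declaration, so the layout cannot silently disagree between program units.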
Program units in Fortran are the main program, external subroutines, modules, and submodules. Placing each unit in a separate file makes the program easier to maintain. Shorter files tend to compile faster. You can locate different program components in files more easily. When a team of programmers is collaborating on a project, smaller files make it less likely that the work of one programmer will conflict with that of others.
It is crucial that you place submodules in files separate from their parent modules. Doing so prevents “compilation cascade,” a phenomenon in which a change in the implementation of a subprogram needlessly causes the recompilation of other program units (see Rule 124).
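A sketch of the two-file arrangement, with hypothetical names:

! File: solver_mod.f03 -- the parent module holds only the interface
module solver_mod
   implicit none
   interface
      module subroutine solve(x)      ! no implementation here
         real, intent(inout) :: x
      end subroutine solve
   end interface
end module solver_mod

! File: solver_smod.f03 -- the submodule holds the implementation
submodule (solver_mod) solver_smod
contains
   module subroutine solve(x)
      real, intent(inout) :: x
      x = x / 2.0                     ! editing this body does not touch solver_mod
   end subroutine solve
end submodule solver_smod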
Whenever possible, use the module name, the type name, the subprogram name, or the program name as the file name.
This rule makes it easier to maintain programs. This is especially true if you code in conformance to Rule 133 and place each derived type in its own module, and you also choose to use either a prefix or suffix attached to a common base name when naming derived types and the modules that contain them. In that case, name the file according to the base name. For instance, you might have a type called pixel_t defined in module pixel_mod, and you can name the file pixel.f03 (see Section 4.2).
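Following the example in the text, the file pixel.f03 might contain (the type components are our own illustration):

! File: pixel.f03 -- file name follows the common base name "pixel"
module pixel_mod
   implicit none
   type :: pixel_t
      integer :: red, green, blue
   end type pixel_t
end module pixel_mod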