The ‘Mabinogion sheep’ problem, originally due to D. Williams, is a nice illustration in discrete time of the martingale optimality principle and the use of local time in stochastic control. The use of singular controls involving local time is even more strikingly highlighted in the context of continuous time. This paper considers a class of diffusion versions of the discrete-time Mabinogion sheep problem. The stochastic version of the Bellman dynamic programming approach leads to a free boundary problem in each case. The most surprising feature in the continuous-time context is the existence of diffusion versions of the original discrete-time problem for which the optimal boundary is different from that in the discrete-time case; even when the optimal boundary is the same, the value functions can be very different.
The partially observed control problem is considered for stochastic processes with control entering into the diffusion and the observation. The maximum principle is proved for the partially observable optimal control. A purely probabilistic approach is used, and the adjoint processes are characterized as solutions of related backward stochastic differential equations in finite-dimensional spaces. Most of the derivation is identified with that of the completely observable case.
A continuous-time, non-linear filtering problem is considered in which both signal and observation processes are Markov chains. New finite-dimensional filters and smoothers are obtained for the state of the signal, for the number of jumps from one state to another, for the occupation time in any state of the signal, and for joint occupation times of the two processes. These estimates are then used in the expectation maximization algorithm to improve the parameters in the model. Consequently, our filters and model are adaptive, or self-tuning.
We present two sufficient conditions for detection of optimal and non-optimal actions in (ergodic) average-cost MDPs. They are easily interpreted and can be implemented as detection tests in both policy iteration and linear programming methods. An efficient implementation of a recently proposed policy iteration scheme is discussed.
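For orientation, here is a minimal sketch of textbook policy iteration for an average-cost MDP, the kind of method into which such detection tests would be inserted. It is not the paper's scheme, and it does not implement the two sufficient conditions, which the abstract does not state. It assumes every stationary policy is unichain. The layout `P[a][i][j]` (transition probabilities) and `c[a][i]` (one-step costs), and all function names, are illustrative assumptions.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for k in range(col, n + 1):
                    M[r][k] -= f * M[col][k]
    return [M[i][n] / M[i][i] for i in range(n)]


def evaluate(policy, P, c):
    """Return (gain g, bias h) solving g + h(i) = c(i) + sum_j P(i,j) h(j), h(0) = 0."""
    n = len(policy)
    # Unknown vector: x[0] = g, x[j] = h[j] for j >= 1 (h[0] is pinned to 0).
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in range(n):
        a = policy[i]
        A[i][0] = 1.0  # coefficient of the gain g
        for j in range(1, n):
            A[i][j] = (1.0 if i == j else 0.0) - P[a][i][j]
        b[i] = c[a][i]
    x = solve_linear(A, b)
    return x[0], [0.0] + x[1:]


def policy_iteration(P, c, max_iter=100):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    n_actions, n = len(P), len(P[0])
    policy = [0] * n
    g, h = evaluate(policy, P, c)
    for _ in range(max_iter):
        new_policy = []
        for i in range(n):
            q = [c[a][i] + sum(P[a][i][j] * h[j] for j in range(n))
                 for a in range(n_actions)]
            best = min(range(n_actions), key=lambda a: q[a])
            # Keep the current action on ties so the iteration cannot cycle.
            if q[policy[i]] <= q[best] + 1e-12:
                best = policy[i]
            new_policy.append(best)
        if new_policy == policy:
            break
        policy = new_policy
        g, h = evaluate(policy, P, c)
    return policy, g, h


# Toy two-state, two-action example (numbers are made up for illustration).
P = [
    [[0.5, 0.5], [0.5, 0.5]],  # action 0
    [[0.9, 0.1], [0.9, 0.1]],  # action 1
]
c = [
    [1.0, 3.0],  # cost of action 0 in states 0 and 1
    [1.5, 3.5],  # cost of action 1 in states 0 and 1
]
policy, gain, bias = policy_iteration(P, c)
```

In this toy instance, action 1 costs more per step but keeps the chain in the cheaper state more often, so it gives the lower long-run average cost. A detection test of the kind the abstract describes would presumably let the improvement loop skip actions already known to be optimal or non-optimal.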
A reference probability is explicitly constructed under which the signal and observation processes are independent. A simple, explicit recursive form is then obtained for the conditional density of the signal given the observations. Both non-linear and linear filters are considered, as well as two different information patterns.
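A recursion of this type can be sketched in a simplified setting. The example below assumes a finite-state signal with transition matrix `P` and scalar observations y_k = c(x_k) + w_k, where w_k is Gaussian with standard deviation `sigma`. The finite state space, the Gaussian noise, and all names are assumptions made for illustration and are not taken from the abstract. Under a reference measure that makes the observations i.i.d. N(0, sigma^2) and independent of the signal, the unnormalized conditional distribution is updated by a one-step prediction followed by multiplication by a likelihood ratio.

```python
import math


def likelihood_ratio(y, level, sigma=1.0):
    """phi(y - level) / phi(y) for a zero-mean Gaussian density phi."""
    return math.exp((2.0 * y * level - level ** 2) / (2.0 * sigma ** 2))


def unnormalized_filter_step(q, y, P, levels, sigma=1.0):
    """One step of the unnormalized recursion: predict with P, then reweight."""
    n = len(q)
    predicted = [sum(P[i][j] * q[i] for i in range(n)) for j in range(n)]
    return [likelihood_ratio(y, levels[j], sigma) * predicted[j] for j in range(n)]


def normalize(q):
    """Turn an unnormalized measure into a probability vector."""
    total = sum(q)
    return [v / total for v in q]


def run_filter(prior, ys, P, levels, sigma=1.0):
    """Run the recursion over all observations and normalize at the end."""
    q = list(prior)
    for y in ys:
        q = unnormalized_filter_step(q, y, P, levels, sigma)
    return normalize(q)


# Toy example: two states observed at levels 0 and 1 (numbers are made up).
P = [[0.9, 0.1], [0.1, 0.9]]
levels = [0.0, 1.0]
posterior = run_filter([0.5, 0.5], [1.0, 1.1, 0.9], P, levels)
```

The unnormalized recursion is linear in `q`, so normalization can be deferred to the end. For long observation records a practical implementation would rescale periodically to avoid overflow or underflow.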
This paper presents a state space and time discretization for the general average impulse control of piecewise deterministic Markov processes (PDPs). By combining several previous results we show that under some continuity, boundedness and compactness conditions on the parameters of the process, boundedness of the discretizations, and compactness of the state space, the discretized problem will converge uniformly to the original one. An application to optimal capacity expansion under uncertainty is given.
We investigate the impact of switching penalties on the nature of optimal scheduling policies for systems of parallel queues without arrivals. We study two types of switching penalties incurred when switching between queues: lump sum costs and time delays. Under the assumption that the service periods of jobs in a given queue possess the same distribution, we derive an index rule that defines an optimal policy. For switching penalties that depend on the particular nodes involved in a switch, we show that although an index rule is not optimal in general, there is an exhaustive service policy that is optimal.
The problem treated is that of controlling a process with values in [0, a]. The non-anticipative controls (µ(t), σ(t)) are selected from a set C(x) whenever X(t–) = x and the non-decreasing process A(t) is chosen by the controller subject to the condition where y is a constant representing the initial amount of fuel. The object is to maximize the probability that X(t) reaches a. The optimal process is determined when the function has a unique minimum on [0, a] and satisfies certain regularity conditions. The optimal process is a combination of ‘timid play’ in which fuel is used gradually in the form of local time at 0, and ‘bold play’ in which all the fuel is used at once.
The problem of estimating the transfer function of a linear system, together with the spectral density of an additive disturbance, is considered. The set of models used consists of linear rational transfer functions and the spectral densities are estimated from a finite-order autoregressive disturbance description. The true system and disturbance spectrum are, however, not necessarily of finite order. We investigate the properties of the estimates obtained as the number of observations tends to ∞ at the same time as the model order employed tends to ∞. It is shown that the estimates are strongly consistent and asymptotically normal, and an expression for the asymptotic variances is also given. The variance of the transfer function estimate at a certain frequency is related to the signal/noise ratio at that frequency and the model orders used, as well as the number of observations. The variance of the noise spectral estimate relates in a similar way to the squared value of the true spectrum.