The LCF system was the first mechanical theorem prover to be user-programmable via a
metalanguage, ML, from which the functional programming language Standard ML has been
developed. Paulson has demonstrated how a modular rewriting engine can be implemented
in LCF. This provides both clarity and flexibility. This paper shows that the same modular
approach (using higher-order functions) allows transparent optimisation of the rewriting
engine; performance can be improved while few, if any, changes are required to code written
using these functions. The techniques described have been implemented in the HOL system,
a descendant of LCF, and some are now in daily use. Comparative results are given. Some
of the techniques described, in particular ones to avoid processing parts of a data structure
that do not need to be changed, may be of more general use in functional programming and
beyond.