Regular-expression derivatives re-examined

  • SCOTT OWENS, JOHN REPPY and AARON TURON

Regular-expression derivatives are an old, but elegant, technique for compiling regular expressions to deterministic finite-state machines. It easily supports extending the regular-expression operators with boolean operations, such as intersection and complement. Unfortunately, this technique has been lost in the sands of time and few computer scientists are aware of it. In this paper, we reexamine regular-expression derivatives and report on our experiences in the context of two different functional-language implementations. The basic implementation is simple and we show how to extend it to handle large character sets (e.g., Unicode). We also show that the derivatives approach leads to smaller state machines than the traditional algorithm given by McNaughton and Yamada.
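
To make the idea concrete, here is a minimal sketch of matching with Brzozowski derivatives, including the boolean operators mentioned above. It is an illustration under stated assumptions, not the authors' implementation: the class names (`Empty`, `Eps`, `Chr`, `Cat`, `Alt`, `And`, `Not`, `Star`) and the functions `nullable`, `deriv`, and `matches` are hypothetical names chosen for this example. It matches a string by taking one derivative per input character and omits the simplification and DFA construction steps that a real scanner generator would need.

```python
from dataclasses import dataclass


class Re:
    """Base class for regular-expression syntax trees (hypothetical names)."""


@dataclass(frozen=True)
class Empty(Re):
    """Matches no string at all."""


@dataclass(frozen=True)
class Eps(Re):
    """Matches only the empty string."""


@dataclass(frozen=True)
class Chr(Re):
    c: str


@dataclass(frozen=True)
class Cat(Re):
    a: Re
    b: Re


@dataclass(frozen=True)
class Alt(Re):
    a: Re
    b: Re


@dataclass(frozen=True)
class And(Re):
    a: Re
    b: Re


@dataclass(frozen=True)
class Not(Re):
    a: Re


@dataclass(frozen=True)
class Star(Re):
    a: Re


def nullable(r: Re) -> bool:
    """Return True if r accepts the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Chr)):
        return False
    if isinstance(r, (Cat, And)):
        return nullable(r.a) and nullable(r.b)
    if isinstance(r, Alt):
        return nullable(r.a) or nullable(r.b)
    if isinstance(r, Not):
        return not nullable(r.a)
    raise TypeError(f"unknown regular expression: {r!r}")


def deriv(r: Re, c: str) -> Re:
    """Derivative of r with respect to c: what r may still match after reading c."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Chr):
        return Eps() if r.c == c else Empty()
    if isinstance(r, Cat):
        left = Cat(deriv(r.a, c), r.b)
        # If the first part can be empty, c may also start the second part.
        return Alt(left, deriv(r.b, c)) if nullable(r.a) else left
    if isinstance(r, Alt):
        return Alt(deriv(r.a, c), deriv(r.b, c))
    if isinstance(r, And):
        return And(deriv(r.a, c), deriv(r.b, c))
    if isinstance(r, Not):
        return Not(deriv(r.a, c))
    if isinstance(r, Star):
        return Cat(deriv(r.a, c), r)
    raise TypeError(f"unknown regular expression: {r!r}")


def matches(r: Re, s: str) -> bool:
    """Take one derivative per character, then test for the empty string."""
    for c in s:
        r = deriv(r, c)
    return nullable(r)


# (a|b)*c
ab_star_c = Cat(Star(Alt(Chr("a"), Chr("b"))), Chr("c"))
print(matches(ab_star_c, "abac"))       # True
print(matches(Not(ab_star_c), "abac"))  # False
```

Note that intersection and complement need only one extra case each in `nullable` and `deriv`. A DFA construction in this style would additionally treat each distinct derivative (up to some notion of equivalence) as a state, which this sketch does not attempt.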

References
Aho, A. V., Hopcroft, J. E. & Ullman, J. D. (1974) The Design and Analysis of Computer Algorithms. Reading, MA: Addison Wesley.
Aho, A. V., Sethi, R. & Ullman, J. D. (1986) Compilers: Principles, Techniques, and Tools. Reading, MA: Addison Wesley.
Aho, A. V. & Ullman, J. D. (1972) The Theory of Parsing, Translation, and Compiling. Vol. 1. Englewood Cliffs, NJ: Prentice Hall.
Appel, A. W. (1998) Modern Compiler Implementation in ML. Cambridge: Cambridge University Press.
Appel, A. W., Mattson, J. S. & Tarditi, D. R. (1994, Oct.) A Lexical Analyzer Generator for Standard ML. Available at:
Baxter, I., Pidgeon, C. & Mehlich, M. (2004) DMS: Program transformations for practical scalable software evolution. In International Conference on Software Engineering.
Berry, G. (1999) The Esterel v5 Language Primer Version 5.21 Release 2.0. Available at:
Berry, G. & Sethi, R. (1986, Dec.) From regular expressions to deterministic automata. Theoret. Comp. Sci. 48 (1), 117-126.
Brzozowski, J. A. (1964) Derivatives of regular expressions. J. ACM 11 (4), 481-494.
English, J. (1999) How to Validate XML. Available at: (Accessed 24 November 2008).
Findler, R. B., Clements, J., Flanagan, C., Flatt, M., Krishnamurthi, S., Steckler, P. & Felleisen, M. (2002) DrScheme: A programming environment for Scheme. J. Funct. Prog. 12 (2), 159-182.
Fischer, C. N. & LeBlanc, R. J., Jr. (1988) Crafting a Compiler. Menlo Park, CA: Benjamin/Cummings.
McNaughton, R. & Yamada, H. (1960) Regular expressions and state graphs for automata. IEEE Trans. Elec. Comp. 9, 39-47.
Rabin, M. O. & Scott, D. (1959) Finite automata and their decision problems. IBM J. Res. Dev. 3 (2), 114-125.
Schmidt, M. (2002) Design and Implementation of a Validating XML Parser in Haskell. Master's thesis, Computer Science Department, University of Applied Sciences Wedel.
Sen, K. & Roşu, G. (2003) Generating optimal monitors for extended regular expressions. In Proceedings of Runtime Verification (RV'03). Boulder, Colorado. Electronic Notes in Theoretical Computer Science, vol. 89, no. 2, pp. 226-245. Elsevier Science.
Thompson, K. (1968) Regular expression search algorithm. Comm. ACM 11 (6), 419-422.
Unicode Consortium. (2003) The Unicode Standard, Version 4. Reading, MA: Addison-Wesley Professional.
Journal of Functional Programming
  • ISSN: 0956-7968
  • EISSN: 1469-7653
  • URL: /core/journals/journal-of-functional-programming