Specialized tools can make larger FSMs more manageable.
For example, the regular expression [a-z]*\.[0-9]$ matches any sequence of zero or more lowercase letters, followed immediately by a period, a digit, and the end of the line.
From C or C++ you can call the regcomp() function to compile a regular expression. Internally, this function (in a typical implementation, at least) builds a representation of an FSM. Then you can call the regexec() function to search for a matching string within a given fragment of text. These functions are not part of Standard C, but they are specified by POSIX (declared in <regex.h>) and are widely available.
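As a rough illustration of the compile-then-search sequence, here is a minimal sketch that assumes a POSIX system providing <regex.h>. The helper name line_matches() is invented for this example; only regcomp(), regexec(), and regfree() come from the library.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a library function): returns 1 if `line`
 * contains a match for `pattern`, 0 if it does not, and -1 if the
 * pattern cannot be compiled.
 */
int line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int rc;

    /* Compile as a POSIX basic regular expression. REG_NOSUB says we
       only want a yes/no answer, not the offsets of the match. */
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;

    /* Search the text using the compiled form of the pattern. */
    rc = regexec(&re, line, 0, NULL, 0);

    /* Release whatever regcomp() allocated. */
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

With the pattern written as the C string "[a-z]*\\.[0-9]$", a call such as line_matches(pattern, "version.7") should report a match, while "version.72" should not, because the digit is not followed by the end of the line. Note that regexec() searches anywhere in the text: since the pattern has no leading ^ and [a-z]* can match zero letters, "ABC.5" also matches.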
Using these and related functions as building blocks, the UNIX world has developed powerful tools such as grep, awk, and lex. If similar functions are not already available for MVS, it would probably be possible to find public-domain versions, port them, and call them somehow from COBOL.
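However, there is a major complication: the difference in character sets between IBM mainframes, which use EBCDIC, and most other systems, which use ASCII. For example, the regular expression [A-Z] normally matches any uppercase letter. In EBCDIC, however, the alphabetic characters are not contiguous: the uppercase letters fall into three separate runs (A-I, J-R, and S-Z), with other characters in between. Code that was designed for ASCII and treats [A-Z] as a single range of character codes will produce some surprises when applied to EBCDIC.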
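The size of the problem can be shown with a short sketch. It hard-codes the EBCDIC code points of the uppercase letters (A-I at 0xC1-0xC9, J-R at 0xD1-0xD9, S-Z at 0xE2-0xE9) so that it can be compiled and run on an ASCII machine. The function names are invented for this illustration.

```c
/* EBCDIC code points that bound the three runs of uppercase letters. */
enum {
    EBCDIC_A = 0xC1, EBCDIC_I = 0xC9,
    EBCDIC_J = 0xD1, EBCDIC_R = 0xD9,
    EBCDIC_S = 0xE2, EBCDIC_Z = 0xE9
};

/* Naive test carried over from ASCII: treats A..Z as one range. */
int naive_is_upper_ebcdic(unsigned char c)
{
    return c >= EBCDIC_A && c <= EBCDIC_Z;
}

/* Correct test: checks each of the three runs separately. */
int is_upper_ebcdic(unsigned char c)
{
    return (c >= EBCDIC_A && c <= EBCDIC_I)
        || (c >= EBCDIC_J && c <= EBCDIC_R)
        || (c >= EBCDIC_S && c <= EBCDIC_Z);
}

/* Counts the code points that the naive test wrongly accepts. */
int count_false_positives(void)
{
    int c, count = 0;

    for (c = 0; c <= 255; c++) {
        if (naive_is_upper_ebcdic((unsigned char)c)
                && !is_upper_ebcdic((unsigned char)c))
            count++;
    }
    return count;
}
```

The single range from A to Z spans 41 code points, of which only 26 are letters, so count_false_positives() returns 15. In code page 037, for instance, the right brace (0xD0) and the backslash (0xE0) both fall inside the range.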
yacc is another code generator -- the name is an acronym for Yet Another Compiler Compiler. It generates C code for further analysis of a stream of tokens. When used together, lex and yacc can quickly produce most of the front end for a non-trivial compiler.
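Free near-equivalents of lex and yacc are available under the names flex and bison, respectively. Bison is a GNU project of the Free Software Foundation; flex is not formally part of GNU, but it is commonly distributed alongside bison.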
Libero is available for Unix, VMS, OS/2, MS-DOS, and various versions of Windows. An MVS version is not yet available. Contact iMatix for details.
Further reading