Rewriting Spaghetti Code
You've inherited a horrible, ugly COBOL program with dozens or hundreds
of GO TOs and PERFORM THRUs. It has many obscure quirks peculiar to
the application. People have patched it for years, and by now nobody
knows how it works, or even if it works.
You have several options:
- Grit your teeth, hold your nose, and keep patching it. If you like
this option, then read no further.
- Rewrite the program from scratch. This option requires that you
completely understand everything the program does. The requirements
are not reliably documented except by the source code itself, which
is such a mess that you don't trust yourself to understand it.
- Use a code-structuring tool to mechanically convert the COBOL
to a formally structured style, i.e. to a style with no GO TOs.
While this approach may be a useful starting point, you should also be
aware of the drawbacks.
- Rewrite the program manually, a little at a time. Read on.
Rewriting Incrementally
You don't have to understand the entire program -- just a little bit
at a time. Wherever you see a safe chance to untangle a little piece
of code, do it. The more you untangle, the easier it will be to
untangle the rest.
Through much of this process you don't have to understand the program
at all. Mechanical rearrangements of the code can systematically
transform tangled logic into equivalent structured logic.
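As a small illustration of such a mechanical rearrangement, here is a
sketch of a complete program that holds a tangled routine and its
structured equivalent side by side. All of the names (UNTANGLE,
ACCT-BALANCE, OLD-CHECK and so on) are invented for this example, and
the sketch assumes that nothing else in the program jumps to
OLD-CHECK-OK. The only change is to invert the test so that both
branches fit inside a single IF.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNTANGLE.
      * Hypothetical example: every name here is made up.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ACCT-BALANCE        PIC S9(5)  VALUE -25.
       01  OVERDRAWN-COUNT     PIC 9(3)   VALUE ZERO.
       01  GOOD-COUNT          PIC 9(3)   VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Run both versions against the same data; they should
      * update the same counter.
           PERFORM OLD-CHECK THRU OLD-CHECK-EXIT.
           PERFORM NEW-CHECK THRU NEW-CHECK-EXIT.
           DISPLAY "OVERDRAWN: " OVERDRAWN-COUNT
                   " GOOD: " GOOD-COUNT.
           STOP RUN.
      * Original style: the two branches are stitched together
      * with GO TOs.
       OLD-CHECK.
           IF ACCT-BALANCE NOT < ZERO
               GO TO OLD-CHECK-OK.
           ADD 1 TO OVERDRAWN-COUNT.
           GO TO OLD-CHECK-EXIT.
       OLD-CHECK-OK.
           ADD 1 TO GOOD-COUNT.
       OLD-CHECK-EXIT.
           EXIT.
      * Rearranged: the condition is inverted and both branches
      * live inside one IF, so no GO TO is needed.
       NEW-CHECK.
           IF ACCT-BALANCE < ZERO
               ADD 1 TO OVERDRAWN-COUNT
           ELSE
               ADD 1 TO GOOD-COUNT
           END-IF.
       NEW-CHECK-EXIT.
           EXIT.
```

The exit paragraph survives in the new version so that any existing
PERFORM ... THRU keeps working. Removing it would be a separate step,
to be taken only once you know that nothing depends on it.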
Of course, it is possible for COBOL to be structured in a formal sense --
but poorly structured. As you wrestle with the code you can become
familiar with small pieces of it, recognize what the code is trying to
do, and further massage it to achieve a good style. Step by step, your
goal is to make the code look normal.
(So what does "normal" mean? Well, er, um, it means "the way it
ought to be." In practice it means, "the way I would have
done it.")
Rewriting a little at a time is easy to fit into your schedule. At
each stage of the rewrite you still have a working program which is
the logical equivalent of the original. You can put it into
production and return to it later, when you have time or opportunity.
Even a partially rewritten program will be less bad than the original
mess.
Discovering Bugs
As you continue your campaign to make the code look normal, you will
likely reach an impasse. At some point, you may be unable to make
the code look normal without changing its behavior.
At that point you have probably discovered a bug. Study the code
carefully, verify that it doesn't make sense, determine what it
was intended to do, and then fix it. Adjust your test plan
so that your regression testing uses the new code as a baseline
instead of the old.
In one of my early rewrite efforts, before I was as careful as I am
today, I rearranged some code to reflect its evident intent. To my
dismay, regression testing showed that the new program behaved
differently from the original, even though my code looked perfectly
sound. Eventually I discovered that I had fixed an unrecognized
bug by accident.
Discipline
The obvious danger of the incremental approach is that your changes
will have unintended consequences. While there is no substitute for
being careful to the point of paranoia, certain measures will
reduce the risks:
- Establish a test plan from the beginning, so that you can compare
the output of the old and new versions.
- Test early and often.
- Make your changes in the smallest possible increments. Stare at the
code and determine whether each change, considered separately, will
preserve the behavior of the original in all circumstances.
- Transform the code in a systematic sequence of
stages. Each stage makes subsequent stages
easier and safer, even if it makes the code uglier in the short run.
The first three points are fairly obvious, and require no further
comment. The last one is tougher. In fact it is the heart of the
problem: How do we transform the code without breaking it?
These pages do not attempt to define a rigorous algorithm, such as
would be needed by the people who write code-structuring tools. They
do, however, outline a series of stages to be applied manually.
Caveat
The techniques described on these pages are based mostly on my own
experience, but some of them are theoretical. For example, I have
never had the misfortune to encounter an ALTER statement. Having
read the manual and thought about it for a while, I think
I know what to do with one, but I've never tried it.
If my recommendations are flawed, please let me know. Send me your
tips, your tricks, and your war stories (and see the
Guidelines for Contributors).
Another Viewpoint
Tony Cahill has written an article about restructuring COBOL code. His
approach emphasizes the transformation of paragraphs into subprograms or
nested programs, which communicate through explicit parameters and, where
necessary, global variables. I wouldn't usually go that far, but the
article is worth reading. It also contains references to a number of
technical articles on similar topics.