Comments and Documentation

This page is aimed at working programmers in large organisations, but much of it applies to anyone producing code that will be read and maintained by other programmers.

There is no excuse for poor documentation in any programme. All modern languages provide a mechanism for commenting and it should be used. What little extra effort it takes is amply repaid by the time saved in maintenance. It is possible to go further and state that any programmer who fails to provide adequate documentation has not met their obligation to provide a finished, usable product.

There are three types of comment that can be identified: production headers, unit descriptions and code explanations. We'll go through each in turn. First though, let's consider general style.

Before we start

Comments are no good if they're not visible and clearly delineated from the code they describe. For this reason, I recommend placing all comments between a top and bottom line, as in...

// ------------------------------------------------
// This comment describes the code which follows...
// ------------------------------------------------

The length of lines is important as legibility is paramount. Restricting comments to 80 columns ensures that they'll be correctly formatted on A4 paper printed in portrait mode (short side at the top). Using 132 columns gives more space but requires that listings are printed in landscape mode (long side at the top). Even today, programmers are often issued with relatively small screens, so a shorter line length is also easier to read on screen.
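The 80-column limit can be checked mechanically. The following is a minimal sketch in Python rather than a finished tool: the function name find_long_lines, the default limit and the assumed tab width of 8 columns are all illustrative choices, not part of any standard.

```python
def find_long_lines(source, limit=80):
    """Return a (line_number, width) pair for every line wider than limit."""
    long_lines = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Expand tabs so that each one counts as the columns it occupies.
        # A tab width of 8 is an assumption; adjust it to suit the site.
        width = len(line.expandtabs(8))
        if width > limit:
            long_lines.append((number, width))
    return long_lines


# Usage: report the offending lines of a small listing.
listing = "// short comment\n// " + "-" * 90 + "\n"
for number, width in find_long_lines(listing):
    print("line", number, "is", width, "columns wide")
```

A check like this could be run over a file before it is printed or submitted for review.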

Production headers

These should be present on every file and provide enough information to enable a competent programmer to understand the purpose of the file at a glance. They should also contain a clear history of changes so that the code may be wound back to a given point...

// ==================================================
// Filename: CustValSupLib.c
//   Author: Harvey Platter (hp@blah.com)
//    Owner: James Smith (js@blah.com)
//  Updated: 23-Oct-2004
// Function: All customer validation routines for
//           the sales tracking system.
//    Notes: Any other project using this file should
//           contact the owner for inclusion on the
//           change notification list.
// --------------------------------------------------
// History...
// --------------------------------------------------
// DATE   WHO CHANGE
// --------------------------------------------------
// 041023 TLP Added support for HTTP transfers as per
//            CRST/34/03
// 040605 JMS Changed functionality of GetCustomerID
//            to reflect updated customer cross ref.
//            CRST/29/01
// 031211 HJP Amended all references to customer
//            representative to 60 characters as per
//            CRST/17/01
// ==================================================

The first thing to note about this example is that it gives just enough information and no more. The example is for a (fictitious) library file, so there is no point in a detailed description of functionality at this level. If this header belonged to a programme file, there would be a strong argument for the 'Function' entry providing a reasonably detailed description of what the programme does and why.

We list both an author and an owner. In many cases the two will not be the same. The author is the person who originally produced the code while the owner is the person who is currently responsible for it.

The Notes area is for things which apply to the whole file, as in the example shown. It should not be used for information which properly belongs with a particular piece of code.

The history is in reverse chronological order so that the latest change is immediately visible. In any properly run shop, no change will ever be made without authorisation, and the authorising document should be quoted as part of the entry, thus tying the two together.

The last thing to note is that we enclose the heading in double lines (equals signs) in order to make it clear that this is an especially important group of comments. We'll see in a moment that we use this same technique for unit descriptions.
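The newest-first rule for the history can also be checked automatically. The sketch below is in Python and assumes the entry layout used in the example header: a six-digit YYMMDD date followed by two or three initials. The names HISTORY_ENTRY, history_dates and is_newest_first are hypothetical, and the comparison only works while every date falls in the same century.

```python
import re

# Matches entries such as "// 041023 TLP Added support for ...".
# The layout (YYMMDD, then initials) is assumed from the example header.
HISTORY_ENTRY = re.compile(r"^//\s+(\d{6})\s+[A-Z]{2,3}\s")


def history_dates(header_lines):
    """Collect the YYMMDD dates of the history entries, in file order."""
    dates = []
    for line in header_lines:
        match = HISTORY_ENTRY.match(line)
        if match:
            dates.append(match.group(1))
    return dates


def is_newest_first(header_lines):
    """Return True if the history entries run from latest to earliest."""
    dates = history_dates(header_lines)
    # YYMMDD strings sort chronologically within a single century.
    return dates == sorted(dates, reverse=True)
```

Continuation lines and the column headings are ignored because they do not begin with a six-digit date.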

Unit descriptions

These identify sections of code such as procedures or functions. Their most important purpose is to describe what the code is supposed to do, which is not necessarily the same as what it actually does, at least before testing.

# ==================================================
# Removes a sales order if the customer has been
# placed on hold. Will NOT remove the order if the
# customer is still active, in which case it will
# write an exception to the log.
# --------------------------------------------------
# $_[0] the customer ID
# $_[1] the sales order number
# $_[2] the operator's work code
# $_[3] the authorisation code for the deletion.
# --------------------------------------------------
# Returns the new ID of the deleted sales order.
# --------------------------------------------------
# Note that the sales order is not actually deleted
# from the work file but the ID is given a 'DEL'
# prefix. Downstream functionality is expected to
# cleanse the work file.
# ==================================================

This example is based on a format used for Perl programmes. The four sections shown are appropriate for many languages but are not the only way of providing clear documentation.

The first section is the functional description referred to earlier. This needs to be as concise as possible while including everything which a newcomer might not already know.

The second section describes the parameters for this function. As this is Perl, they're not named, but for clarity's sake each parameter is identified by its element in the argument array. Once more, we're not leaving anything to chance.

The third section tells us what this function will return. Again, being Perl, we don't have to state an explicit data type but for many other languages such a definition would be entirely appropriate. If the function returns nothing, it is important to say so here.

The last section is for notes which are not appropriate in any of the previous sections. Deciding what goes here has to be done on a case by case basis. In some, perhaps many, cases it will be blank.
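For a language in which parameters are named and types can be stated, the same four sections might be laid out as in the sketch below. It is written in Python and everything in it is hypothetical: the function mark_order_deleted, its parameters and its simplified behaviour are invented for illustration and are not the routine described above. Python programmers would more usually put this text in a docstring; the comment block is used here only to match the format shown earlier.

```python
from typing import List, Optional


# ==================================================
# Marks a sales order as deleted if the customer has
# been placed on hold. Will NOT mark the order if
# the customer is still active, in which case it
# appends an exception message to the log.
# --------------------------------------------------
# order_id (str)       the sales order number
# on_hold  (bool)      True if the customer is on hold
# log      (List[str]) receives any exception message
# --------------------------------------------------
# Returns the new ID (str) of the deleted order, or
# None if the order was left untouched.
# --------------------------------------------------
# Note that nothing is removed here; the ID is only
# given a 'DEL' prefix.
# ==================================================
def mark_order_deleted(order_id: str, on_hold: bool,
                       log: List[str]) -> Optional[str]:
    if not on_hold:
        log.append("Order " + order_id + " kept: customer is active")
        return None
    return "DEL" + order_id
```

Stating the types in the parameter and return sections costs a few characters and saves the reader from having to infer them from the code.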

Note that we don't have a history for each procedure. This is a stylistic choice and some sites may take the view that history should be included for each procedure or function.

If the programmer has problems with writing a clear description, there's absolutely nothing wrong with either getting a more literate colleague to write the description or even cutting and pasting from the functional specification where appropriate. The important thing is that the description is clear and accurate so that anyone coming fresh to the project can swiftly pick up the reins and carry on.

Code explanations

These are what most programmers immediately think of when the word 'comment' is uttered. They may be as short as two or three words or several paragraphs long. Their purpose is to clarify the function of the code.

Now here's a thing: the better written the code, the fewer the explanations that are required. If you've used good names for all the elements in your programme (variables, constants, procedures, etc.) and laid out the code in a clear and readable manner, then you may need no code explanations at all because the code is self-documenting. If you feel you need a lot of explanations, then it's your code that may be at fault.
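As a small illustration of the point about naming, the two Python functions below compute the same value. Both are invented for this example, as are the names and the choice of working in whole pence. The first needs a comment to be understood; the second arguably does not.

```python
# Poorly named: a reader has to guess what a and b represent.
def calc(a, b):
    return a + a * b // 100


# Well named: the code explains itself without further comment.
def price_including_tax(net_pence, tax_percent):
    tax_pence = net_pence * tax_percent // 100
    return net_pence + tax_pence
```

Nothing about the arithmetic has changed between the two versions; only the names carry the extra information.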

This isn't an infallible rule. Take the following snippet...

# --------------------------------
# This must be a data line,
# read it in to the local array...
# --------------------------------
   for( $V_THIS_ELEMENT = 1;
        $V_THIS_ELEMENT <= $G_TOTAL_DEFS;
        $V_THIS_ELEMENT++ )
   {

(etc...)

Without the description, all you'd know is that the code steps through a list. Adding the explanation makes the purpose of the loop much clearer.
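The same kind of explanation can be shown in a complete, if hypothetical, Python routine. The function load_definitions, the comma-separated layout and the convention that remark lines start with an asterisk are all assumptions made for this sketch. The comment records a conclusion the reader would otherwise have to work out: any line that reaches that point must be data.

```python
def load_definitions(lines):
    """Return the fields of every data line as a list of lists."""
    definitions = []
    for line in lines:
        if line.startswith("*"):
            # Remark lines start with '*' (an assumed convention).
            continue
        # --------------------------------
        # This must be a data line,
        # read it in to the local list...
        # --------------------------------
        definitions.append(line.rstrip("\n").split(","))
    return definitions
```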

There's an art to documenting a programme file properly but it's an art worth learning. It makes the finished product much more useful because other people can more readily take on the code and amend it as required.

Good documentation is one of the signs of a professional coder and poor documentation speaks of carelessness and a lack of respect for colleagues.

