HCi Journal of Information Development


Print friendly version

Systems documentation explained - the Data Dictionary

By Stuart Lecky

This is the first of a series of articles explaining the role of all the documents that need to be in place to support the work of the IT staff whose job it is to operate and maintain computer systems.

These documents include:

  • Data Dictionary

  • System Maintenance Manual

  • Operating system configuration

  •  Lifecycle documentation

  • Operations Manual

In this issue we look at the Data Dictionary, which is a key document that allows developers to:

  • plan enhancements to the system

  • determine the effects of planned changes to the system

  •  track bugs

The Data Dictionary is also used by Business Analysts to understand what a system does so that they can specify clearly any changes required.

Preparing the Data Dictionary

The engine behind most information systems is the database.  A Data Dictionary is an essential part of understanding and maintaining the database. This key document describes in detail how the database was designed and all its key characteristics.  This information allows developers and analysts to quickly access information about the tables, fields, procedures, processes and other information in the system.

A Data Dictionary can be produced:

  • automatically using a software tool to interrogate the database and map its structure

  • manually by combing through the code to determine the structure

  • by a combination of automatic and manual processes.

The best approach is the last, which combines automatic and manual processes. A Data Dictionary can be automatically created using CASE (Computer Aided Software Engineering) tools but the problem with this is that the information produced will be limited to the superficial schema. Much of the information that a maintenance programmer actually needs ("What is stored in this field?", "Does this table store data, or is it just used as temporary storage during a transaction?") needs to be drawn from the knowledge of the original development programmers, and from the original design documents. A competent technical writer who can ask the right questions and arrange the information in a logical and well-ordered way is best-qualified to do this.

While CASE tools are useful for drawing a “map” of a system, gaps still need to be filled in manually.  A CASE tool cannot tell you from the database schema, for example, what the business context of each field and table means to the organisation.  This can only be gleaned from the analysts who specified the database and/or the developers who created the database.

The components of the Data Dictionary

The Data Dictionary defines the basic organisation of a database and collects together detailed information about database system components.  This can include:

  • data element definitions (tables, fields, key fields, primary keys, relationships etc)

  • program elements (stored procedures, scripts etc) used by the database to move data about or to manipulate it in some way

  • records (numbers of – not the actual records themselves)

  • system parameters

  • system information

  •  files and other system components

  • user information

  • entity relationship diagrams

  • database schema

  • database security model.

Of these, the most crucial, and the major part of a Data Dictionary, are the table definitions, the data elements within those tables, and the program elements.

The table definitions define the tables used in the database, including a brief description of their use, the key fields, the primary key and a list of the fields.  Each table definition includes:

  • table name

  • table description

  • table owner or database name

  •  data element, or column, or field name definitions (see below):

  • key order for all the elements

  • indexes

  • table organisation

  • duplicate rows allowed or not allowed

  • table size

  • security classification of the table

  • the business context of the table.

The Data Elements can include:

  • field names

  • related elements (which data elements are related to which other data elements?)

  • data type (character, numeric, etc) and size

  • data element description (what data is held in this field?)

  • null value allowed?

  • default value

  • validation rules, if any, between this element and other elements in the Data Dictionary.

  • database table references (where the data element is used and whether the data element is the primary key for the table, or a part of the key)

  • data sources (where the data comes from, including rules used in calculations to producing data element values)

Program elements include:

  • stored procedures, for example, SQL scripts, that are integral to the database

  • external procedures, for example, Unix scripts

Finally, the Data Dictionary includes descriptions of each process carried by the database system, including:

  • where and how data enters the system

  • what is done to the data, at which stage and why

  • what are the outputs (if any) of the system

  • the business context of the process

It takes time and effort to construct a good Data Dictionary but it is well spent because this key document will save the cost of its production many times over in time saved maintaining the system. And when the time comes for a major upgrade or replacement of the database to, say, take advantage of new technology, the Data Dictionary will save an enormous amount of effort when specifying the detailed requirements of the new system.

This article may be reproduced only with the permission of HCi (email HCi ). Copyright HCi, 2001-3.

Back to Journal First Quarter 2003

More articles from the HCi Journal


HCi has formed a new consulting arm called Realisation.  Click here to visit the Realisation site for further information.