CB2Java


Table of Contents

1. Introduction
2. Working with Application Data
CharData
IntegerData
DecimalData
FloatData
Conclusion
3. Working with parsed Copybooks
CopybookParser
Copybook
Element
Group
Leaf

Chapter 1. Introduction

The goal of the CB2Java project is to simplify the lives of developers (e.g. the author) charged with writing Java applications that communicate with COBOL applications. The main motivation for writing this library was that, among the limited number of available (free) libraries, none had been designed around a dynamic approach. It may seem strange to write about dynamic approaches in Java, a statically typed, compiled language, but a dynamic approach solves a lot of issues that arise in an enterprise environment where almost nothing stays the same for very long.

Note

CB2Java is not a standalone tool for editing and viewing COBOL data.

With tools that require class generation (or worse, hand-coded classes) to parse data defined in COBOL copybooks, a lot of changes require regenerating and recompiling the code even when application logic does not change. For example, if an element in a copybook is defined as being a 6-digit integer, you will most likely end up using an int to represent that value in Java. If later that element is increased to 8 digits, your Java code is still correct. An int will still hold the value. But if you generated the code to parse the message, you need to regenerate the classes and recompile. Some readers might be thinking "that's great but it almost never happens because it would break other applications." This is true to some extent, but often a secondary copybook is defined that differs only in one element. With CB2Java, one Java module can use two different copybooks by merely changing the copybook instance. With a generated approach, you need two sets of generated classes.
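The 6-digit versus 8-digit case can be illustrated without any library at all. The following is a minimal sketch, not CB2Java code: the class DynamicLayoutSketch, its layout and readInt methods, and the field names are all hypothetical, and the record is assumed to hold unsigned display-format digits in ASCII. The point is only that when the field width is data rather than generated code, the same application logic reads both layouts.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; these are not CB2Java classes.
final class DynamicLayoutSketch {

    // Stand-in for a parsed copybook: field name -> width in characters, in record order.
    static Map<String, Integer> layout(int amountDigits) {
        Map<String, Integer> fields = new LinkedHashMap<>();
        fields.put("CUSTOMER-ID", 4);
        fields.put("AMOUNT", amountDigits);
        return fields;
    }

    // Reads an unsigned display-format field (one ASCII character per digit) as an int.
    static int readInt(Map<String, Integer> layout, String field, byte[] record) {
        int offset = 0;
        for (Map.Entry<String, Integer> entry : layout.entrySet()) {
            int width = entry.getValue();
            if (entry.getKey().equals(field)) {
                String digits = new String(record, offset, width, StandardCharsets.US_ASCII);
                return Integer.parseInt(digits);
            }
            offset += width;
        }
        throw new IllegalArgumentException("Unknown field: " + field);
    }

    public static void main(String[] args) {
        byte[] sixDigit = "0042001234".getBytes(StandardCharsets.US_ASCII);
        byte[] eightDigit = "004200001234".getBytes(StandardCharsets.US_ASCII);
        // The calling code is identical; only the layout instance changes.
        System.out.println(readInt(layout(6), "AMOUNT", sixDigit));
        System.out.println(readInt(layout(8), "AMOUNT", eightDigit));
    }
}
```

Both calls return the same int value for the same amount, because the largest 8-digit number (99999999) is still well below Integer.MAX_VALUE.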

There are many other ways in which a dynamic approach provides benefits that cannot be realized effectively with a generated approach. As time permits, this documentation will detail more such techniques.

CB2Java is meant to be simple to use and to limit the amount of esoteric knowledge needed to make use of it. COBOL is esoteric enough; there is no need to make it more painful. As such, the documentation here is terse. As time permits it will be expanded to contain more detailed information, but the beginning sections will remain focused on getting started and making use of the tool rapidly.

Chapter 2. Working with Application Data

The first thing that needs to be done in order to work with application data is to create a new Copybook instance using the CopybookParser class. To do this, supply an InputStream (or Reader) to the parse method. This method also takes a String for the name. It is recommended that you supply a unique name at this point, but it is not important exactly what you use, just that it is meaningful and hopefully unique. See the parser documentation for more detail.

Once you have an instance of the Copybook class, you can create a new Record with the createnew method or you can create a Record from existing data using one of the parse methods. There are two versions of the parse method, one taking an InputStream and the other taking a byte array. One key setting to note at this point is the encoding setting on the Copybook class. If no action is taken to modify this setting, it will default to the host system's default encoding. You can also set this encoding using the system property cb2java.encoding or by creating a file on the class path called copybook.props that contains a properly named encoding. You may also programmatically set the encoding on the Copybook instance, overriding any other settings. There is intentionally no parse method that takes a String because conversion to String will corrupt data in many COBOL types in irreversible ways and should be avoided.
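The warning about String conversion can be checked with the JDK alone. The sketch below is an illustration, not part of CB2Java: the class StringRoundTripSketch and its survivesRoundTrip method are hypothetical names, and the three bytes are assumed to be a packed decimal field holding +98765. It decodes the bytes to a String and encodes them back with the same charset, then compares the result with the input.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; this is not a CB2Java class.
final class StringRoundTripSketch {

    // True when bytes -> String -> bytes gives back exactly the original bytes.
    static boolean survivesRoundTrip(byte[] data, Charset charset) {
        byte[] roundTripped = new String(data, charset).getBytes(charset);
        return Arrays.equals(data, roundTripped);
    }

    public static void main(String[] args) {
        byte[] text = "HELLO".getBytes(StandardCharsets.US_ASCII);
        // Assumed packed decimal field: digits 9 8 7 6 5 followed by sign nibble C.
        byte[] packed = {(byte) 0x98, 0x76, 0x5C};

        System.out.println(survivesRoundTrip(text, StandardCharsets.UTF_8));   // true
        System.out.println(survivesRoundTrip(packed, StandardCharsets.UTF_8)); // false
    }
}
```

The byte 0x98 is not valid on its own in UTF-8 or US-ASCII, so decoding replaces it with a substitute character and the original value cannot be recovered afterwards. Plain text fields survive; binary and packed fields generally do not.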

Once you have created a Record object you can programmatically browse its tree and read or modify the data. Each node, including the Record itself, is an instance of the Data class. There are two main types of Data class: groups and values. Groups contain other elements and cannot be modified directly. Values never contain children and are modifiable. The Record class is a special type of GroupData that is always the root of the record tree. The easiest way to distinguish between groups and value objects is to call the isLeaf method. A value is a 'leaf' and a group is not.

Groups mainly have identity and children. When working with a group, the main action is to retrieve its children. A group's children can include values (leaves) and other groups.
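The group and leaf structure described above lends itself to a simple recursive walk. The sketch below uses a hypothetical stand-in type, TreeWalkSketch.Node, rather than the CB2Java Data classes, so the factory methods and field names are assumptions made for illustration. Only the idea of branching on an isLeaf check comes from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; Node stands in for the library's Data classes.
final class TreeWalkSketch {

    static final class Node {
        final String name;
        final Object value;        // null for groups
        final List<Node> children; // null for leaves

        private Node(String name, Object value, List<Node> children) {
            this.name = name;
            this.value = value;
            this.children = children;
        }

        static Node leaf(String name, Object value) {
            return new Node(name, value, null);
        }

        static Node group(String name, Node... children) {
            return new Node(name, null,
                    Collections.unmodifiableList(Arrays.asList(children)));
        }

        boolean isLeaf() {
            return children == null;
        }
    }

    // Collects "path=value" for every leaf, visiting children in declaration order.
    static List<String> describeLeaves(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, root.name, out);
        return out;
    }

    private static void walk(Node node, String path, List<String> out) {
        if (node.isLeaf()) {
            out.add(path + "=" + node.value);
            return;
        }
        for (Node child : node.children) {
            walk(child, path + "." + child.name, out);
        }
    }
}
```

A record with an ID leaf and a nested NAME group would be flattened into one entry per leaf, each prefixed with the names of the groups that contain it.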

There are currently four types of ValueData.

CharData
IntegerData
DecimalData
FloatData

All leaf Data types have Object versions for retrieving and setting the values. In addition, all leaf Data objects take String values and attempt to convert them to the proper type.

The use of these should be fairly intuitive. The basic idea is that any numeric type that has no decimal portion is represented by an IntegerData. All other numeric types except floating-point types are represented by a DecimalData. Anything that is not strictly numeric is represented by a CharData. Boolean types are not currently supported. Floating-point types are not strictly decimal types and have special rules.
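The mapping rule can be sketched as a small classifier over PICTURE strings. This is a deliberately simplified illustration, not how CB2Java decides: the class PictureKindSketch and its Kind enum are hypothetical, and the sketch only recognizes pictures built from an optional S, 9 with optional repeat counts, and an optional V. Everything else, including edited numeric pictures and scaling positions (P), falls through to CHAR here, and floating-point items are out of scope.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; a simplified reading of the rule in the text.
final class PictureKindSketch {

    enum Kind { CHAR, INTEGER, DECIMAL }

    // Optional sign, optional integer digits, optional V with optional fraction digits.
    private static final Pattern NUMERIC = Pattern.compile(
            "S?((?:9(?:\\(\\d+\\))?)+)?(?:V((?:9(?:\\(\\d+\\))?)+)?)?");

    static Kind classify(String picture) {
        String pic = picture.toUpperCase(Locale.ROOT).replaceAll("\\s+", "");
        Matcher m = NUMERIC.matcher(pic);
        if (!m.matches()) {
            return Kind.CHAR; // anything not strictly numeric
        }
        boolean hasIntegerDigits = m.group(1) != null;
        boolean hasFractionDigits = m.group(2) != null;
        if (hasFractionDigits) {
            return Kind.DECIMAL;
        }
        return hasIntegerDigits ? Kind.INTEGER : Kind.CHAR;
    }
}
```

Under these assumptions, X(10) classifies as CHAR, 9(6) as INTEGER, and S9(4)V99 as DECIMAL.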

CharData

CharData objects are used to represent text data. A CharData consists of normal text data and is always described by an AlphaNumeric definition object. The validation rules for alphanumeric data depend on the PICTURE clause, which specifies which characters can appear in which positions and the length of the element. The 'natural' Java type for CharData is String.

IntegerData

IntegerData objects represent numeric types that have no fraction part. The validation for these types is that no fraction is included and that the number of digits is within the range specified by the PICTURE clause of the element's definition. The 'natural' Java type for IntegerData is java.math.BigInteger.
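The digit-count part of that validation is easy to express with BigInteger. The following sketch is an assumption about what such a check might look like, not the library's implementation; IntegerDigitsSketch, fits, and its parameters are hypothetical names. The maxDigits argument corresponds to the number of 9 positions in the PICTURE clause, and signed to the presence of an S.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not the library's validation code.
final class IntegerDigitsSketch {

    // True when the value has at most maxDigits decimal digits and an allowed sign.
    static boolean fits(BigInteger value, int maxDigits, boolean signed) {
        if (!signed && value.signum() < 0) {
            return false; // an unsigned picture cannot hold a negative value
        }
        return value.abs().toString().length() <= maxDigits;
    }

    public static void main(String[] args) {
        // PIC 9(6): six unsigned digits.
        System.out.println(fits(BigInteger.valueOf(999999), 6, false));  // true
        System.out.println(fits(BigInteger.valueOf(1000000), 6, false)); // false
    }
}
```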

DecimalData

DecimalData objects are for numeric data types that do contain a fractional portion. Validation rules for DecimalData are that both the fractional portion and the integer portion are within their respective ranges. The 'natural' Java type for DecimalData is java.math.BigDecimal.
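A range check of this kind can be sketched with BigDecimal's precision and scale. This is an illustrative assumption, not the library's code: DecimalDigitsSketch and fits are hypothetical names, the two digit counts stand for the 9 positions before and after the V in the PICTURE clause, and the sign is not checked. Trailing zeros are stripped first so that 1.500 is treated the same as 1.5.

```java
import java.math.BigDecimal;

// Hypothetical illustration only; not the library's validation code.
final class DecimalDigitsSketch {

    // True when the integer and fraction parts each fit their digit budgets.
    static boolean fits(BigDecimal value, int integerDigits, int fractionDigits) {
        if (value.signum() == 0) {
            return true; // zero fits any picture
        }
        BigDecimal v = value.stripTrailingZeros();
        int usedFraction = Math.max(v.scale(), 0);
        int usedInteger = Math.max(v.precision() - v.scale(), 0);
        return usedInteger <= integerDigits && usedFraction <= fractionDigits;
    }

    public static void main(String[] args) {
        // PIC 9(4)V99: four integer digits, two fraction digits.
        System.out.println(fits(new BigDecimal("1234.56"), 4, 2)); // true
        System.out.println(fits(new BigDecimal("12345.6"), 4, 2)); // false
        System.out.println(fits(new BigDecimal("1.234"), 4, 2));   // false
    }
}
```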

FloatData

FloatData objects are used to represent single- and double-precision floating-point types. Floating point is treated separately from other numeric types because the fractional portion of a floating-point number is not decimal, and the rules for validation are very different for floating-point data. Floating-point validation requires that the number specified is exactly representable in the floating-point representation specified by its definition element. Floating-point representations are hardware specific in COBOL, and not all numbers that can be represented on one platform can be represented on all others. Java uses the IEEE-754 floating-point representation, which presents a problem when data is in another form. Because of this issue, the 'natural' Java type in cb2java is BigDecimal. The reasoning is that all floating-point values should be representable as decimal. However, be aware that most BigDecimal values cannot be converted exactly to floating point. If the underlying representation is IEEE-754, the float and double Java types (depending on the precision of the COBOL type) are absolutely safe to use and are generally preferable. It may also be possible to work safely with Java floats and doubles when the underlying data is not represented as IEEE-754, with careful management of the precision of the integer and (especially) the fraction portion.
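The claim that most BigDecimal values have no exact floating-point equivalent can be tested directly. The sketch below checks representability against Java's own IEEE-754 float and double types only; it says nothing about other host formats. ExactFloatSketch and its two methods are hypothetical names, not part of cb2java. The check converts to the binary type and back and compares the result with the original value.

```java
import java.math.BigDecimal;

// Hypothetical illustration only; checks against Java's IEEE-754 types.
final class ExactFloatSketch {

    // True when the value converts to double and back without any change.
    static boolean isExactDouble(BigDecimal value) {
        double d = value.doubleValue();
        if (Double.isInfinite(d)) {
            return false; // magnitude is out of range for a double
        }
        return new BigDecimal(d).compareTo(value) == 0;
    }

    // True when the value converts to float and back without any change.
    static boolean isExactFloat(BigDecimal value) {
        float f = value.floatValue();
        if (Float.isInfinite(f)) {
            return false; // magnitude is out of range for a float
        }
        return new BigDecimal(f).compareTo(value) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isExactDouble(new BigDecimal("0.5"))); // true
        System.out.println(isExactDouble(new BigDecimal("0.1"))); // false
    }
}
```

Values such as 0.5 and 0.25 are sums of powers of two and pass; 0.1 is not and fails, which is why a decimal type is the safer default.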

Conclusion

The four main types of Data are really the heart of the design behind the CB2Java project. By mapping each COBOL type to one of these four, working with COBOL data is very much simplified. These objects are created dynamically as the data is parsed, which removes the need for brittle generated code and for recompiling after trivial changes to the field layouts, such as the size of an element.

Chapter 3. Working with parsed Copybooks

The Copybook class, while primarily used to parse application data, can also be used to build generators or for any other introspective task related to the copybook. It is expected that most users of this package will not need to worry too much about the Element types and the classes are designed such that very little interaction with the Element types is required.

The base type for the parse tree is the Element class. There are a good number of Element types, one for each unique COBOL type. This means that while there is a separate type for BINARY vs. PACKED types, there is no type for COMP vs. BINARY. The different types and how they relate back to the COBOL types are detailed in the JavaDocs for these classes (or will be anyway).

Like the Data types, the Element types are grouped into two major types: Leaf and Group. Leaf elements are associated with values and Group types with, well, groups. The element tree will essentially mirror the Data tree except that the Elements have much more information relating to the actual COBOL types in the copybook definition.

The basic role of the Element types is translating byte input into Data instances, validating that data matches the specification in the copybook and writing Data objects back to byte format.
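To make the read, validate, and write cycle concrete, here is a minimal sketch for one kind of field, packed decimal. It is not the CB2Java implementation: PackedDecimalSketch and its methods are hypothetical, and it assumes the common packed layout of two digits per byte with the final half-byte holding the sign (C for positive, D for negative, F for unsigned). Other sign codes are rejected here for simplicity.

```java
import java.math.BigInteger;

// Hypothetical illustration only; assumes two digits per byte, sign in the last nibble.
final class PackedDecimalSketch {

    // Translates packed bytes into a value, validating digits and sign on the way.
    static BigInteger decode(byte[] packed) {
        if (packed.length == 0) {
            throw new IllegalArgumentException("Empty field");
        }
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < packed.length; i++) {
            appendDigit(digits, (packed[i] >> 4) & 0x0F);
            if (i < packed.length - 1) {
                appendDigit(digits, packed[i] & 0x0F);
            }
        }
        int sign = packed[packed.length - 1] & 0x0F;
        BigInteger magnitude = new BigInteger(digits.toString());
        switch (sign) {
            case 0x0C:
            case 0x0F:
                return magnitude;
            case 0x0D:
                return magnitude.negate();
            default:
                throw new IllegalArgumentException("Unsupported sign nibble: " + sign);
        }
    }

    private static void appendDigit(StringBuilder digits, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("Invalid digit nibble: " + nibble);
        }
        digits.append(nibble);
    }

    // Writes a value back into a field of the given length, left-padded with zeros.
    static byte[] encode(BigInteger value, int byteLength) {
        int digitCount = byteLength * 2 - 1;
        String digits = value.abs().toString();
        if (digits.length() > digitCount) {
            throw new IllegalArgumentException("Value does not fit in " + byteLength + " bytes");
        }
        StringBuilder nibbles = new StringBuilder();
        for (int i = digits.length(); i < digitCount; i++) {
            nibbles.append('0');
        }
        nibbles.append(digits).append(value.signum() < 0 ? 'D' : 'C');
        byte[] out = new byte[byteLength];
        for (int i = 0; i < byteLength; i++) {
            int high = Character.digit(nibbles.charAt(2 * i), 16);
            int low = Character.digit(nibbles.charAt(2 * i + 1), 16);
            out[i] = (byte) ((high << 4) | low);
        }
        return out;
    }
}
```

Under these assumptions the bytes 0x12 0x34 0x5C decode to 12345, and encoding 12345 into a three-byte field produces the same bytes again.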

This chapter discusses the base classes and classes that are general to the package. Discussion of each of the Element implementations is not included at this time.

CopybookParser

The CopybookParser class is, as the name suggests, a parser for copybooks. Currently the CopybookParser class requires that each stream parsed contains only one record layout. This will likely change in the future to support any number of layouts in a single stream. The CopybookParser class takes the COBOL code and produces instances of the Copybook class to represent the data structures in the stream.

Copybook

The Copybook class represents a parsed copybook definition. It is a special form of the Group class (discussed in a following section). Every parsed copybook layout has a single copybook element that acts as the root node for the parse tree.

Element

The Element class represents a single element in the record. This is generally defined by a PIC clause. The Element class is abstract; the actual instances are specific to the different types in a COBOL data declaration.

Group

The Group class is used to represent group types, i.e. elements in the copybook that are composed of other elements. This class should be fairly straightforward. It provides a List of its children.

Leaf

The Leaf class is used to represent value elements, i.e. elements that contain data not composed of other elements, such as numbers and character data. A value element can never have child elements.