Table of Contents
The CB2Java project's goal is to simplify the lives of developers (e.g. the author) charged with writing Java applications that communicate with COBOL applications. The main motivation for writing this library was that, among the limited number of available (free) libraries, none had been designed around a dynamic approach. It may seem strange to write about dynamic approaches in Java, a statically typed, compiled language, but a dynamic approach solves a lot of issues that arise in an enterprise environment where almost nothing stays the same for very long.
CB2Java is not a standalone tool for editing and viewing COBOL data.
With tools that require class generation (or worse, hand-coded classes) to parse data defined in COBOL copybooks, a lot of changes require regenerating and recompiling the code even when application logic does not change. For example, if an element in a copybook is defined as being a 6 digit integer, you will most likely end up using an int to represent that value in Java. If later that element is increased to 8 digits, your Java code is still correct. An int will still hold the value. But if you generated the code to parse the message, you need to regenerate the classes and recompile. Some readers might be thinking "that's great but it almost never happens because it would break other applications." This is true to some extent but often a secondary copybook is defined that differs only in one element. With CB2Java, one Java module can use two different copybooks by merely changing the copybook instance. With a generated approach, you need two sets of generated classes.
There are a lot of other ways that a dynamic approach provides benefits that cannot be realized effectively with a generated approach. As time permits, this documentation will detail more such techniques.
CB2Java is meant to be simple to use and to limit the amount of esoteric knowledge needed to make use of it. COBOL is esoteric enough; there is no need to make it more painful. As such, the documentation here is terse. As time permits it will be expanded to contain more detailed information, but the beginning sections will remain focused on getting started and making use of the tool rapidly.
The first thing that needs to be done in order to work with application data is to create a new Copybook instance using the CopybookParser class. To do this, supply an InputStream (or Reader) to the parse method. This method also takes a String for the name. It is recommended that you supply a unique name at this point, but it's not important exactly what you use, just that it's meaningful and hopefully unique. See the parser documentation for more detail.
Once you have an instance of the Copybook class, you can create a new Record with the createnew method or you can create a Record from existing data using one of the parse methods. There are two versions of the parse method, one taking an InputStream and the other taking a byte array. One key setting to note at this point is the encoding setting on the Copybook class. If no action is taken to modify this setting, it will default to the host system's default encoding. You can also set this encoding using the system property cb2java.encoding or by creating a file on the class path called copybook.props that contains a properly named encoding. You may also programmatically set the encoding on the Copybook instance, overriding any other settings.
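The precedence described above can be pictured with a small JDK-only sketch. It is not CB2Java code: EncodingChoice and resolve are hypothetical names, the copybook.props lookup is left out because the text does not name the key it uses, and the relative order of the system property and the properties file is not stated. Only "an explicit setting wins" and "the platform default is the fallback" come from the text.

```java
import java.nio.charset.Charset;

// Hypothetical helper illustrating the documented precedence; not part of CB2Java.
class EncodingChoice {

    // explicit: value set programmatically (may be null)
    // systemProperty: value of the cb2java.encoding system property (may be null)
    static Charset resolve(String explicit, String systemProperty) {
        if (explicit != null) {
            return Charset.forName(explicit);        // programmatic setting overrides everything
        }
        if (systemProperty != null) {
            return Charset.forName(systemProperty);  // next: the system property
        }
        return Charset.defaultCharset();             // fallback: host default encoding
    }

    // Convenience overload that reads the property name given in the text.
    static Charset resolve(String explicit) {
        return resolve(explicit, System.getProperty("cb2java.encoding"));
    }
}
```

Whatever the mechanism, setting the encoding explicitly is the safest choice when the data originates on a host whose encoding differs from the machine running the Java code.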
There is intentionally no parse method that takes a String because conversion to String will corrupt data in many COBOL types in irreversible ways and should be avoided.
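To see why, here is a JDK-only sketch that pushes three bytes shaped like a packed-decimal field through a String and back. The class name StringRoundTrip and the choice of UTF-8 are assumptions made for illustration; other encodings damage other byte values.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class; not part of CB2Java.
class StringRoundTrip {

    // Decode raw bytes as text, then encode the text back to bytes.
    static byte[] viaString(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0x98 is not a valid UTF-8 start byte, so decoding replaces it.
        byte[] packed = {(byte) 0x98, 0x76, 0x5C};
        byte[] back = viaString(packed);
        System.out.println("length before: " + packed.length + ", after: " + back.length);
    }
}
```

The first byte is replaced during decoding, so the original value cannot be recovered from the String. Plain ASCII text survives the same round trip, which is why the problem often goes unnoticed until binary or packed fields are involved.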
Once you have created a Record object you can programmatically browse its tree and read or modify the data. Each node, including the Record itself, is an instance of the Data class. There are two main types of Data class: groups and values. Groups contain other elements and cannot be modified directly. Values never contain children and are modifiable. The Record class is a special type of GroupData that is always the root of the record tree.
The easiest way to distinguish between groups and value objects is to call the isLeaf method. A value is a 'leaf' and a group is not.
Groups mainly have identity and children. When working with groups, the main action to be taken is to retrieve its children. A group's children can contain values (leaves) and other groups.
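The group/value split is a classic composite tree. The sketch below uses stand-in classes (Node, GroupNode, ValueNode and TreeWalk are hypothetical names, not the CB2Java types) to show how an isLeaf check drives a walk that collects the path of every value in a record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in for a node in the record tree; not the CB2Java Data class.
abstract class Node {
    final String name;
    Node(String name) { this.name = name; }
    abstract boolean isLeaf();
    List<Node> children() { return Collections.emptyList(); }
}

// A value: holds data, never has children.
final class ValueNode extends Node {
    Object value;
    ValueNode(String name, Object value) { super(name); this.value = value; }
    @Override boolean isLeaf() { return true; }
}

// A group: has identity and children, but no value of its own.
final class GroupNode extends Node {
    private final List<Node> children = new ArrayList<>();
    GroupNode(String name, Node... kids) { super(name); children.addAll(Arrays.asList(kids)); }
    @Override boolean isLeaf() { return false; }
    @Override List<Node> children() { return children; }
}

class TreeWalk {
    // Returns the dotted path of every leaf below (and including) root.
    static List<String> leafPaths(Node root) {
        List<String> out = new ArrayList<>();
        collect(root, root.name, out);
        return out;
    }

    private static void collect(Node node, String path, List<String> out) {
        if (node.isLeaf()) {
            out.add(path);
            return;
        }
        for (Node child : node.children()) {
            collect(child, path + "." + child.name, out);
        }
    }
}
```

Because the walk never needs to know the concrete layout, the same code works for any copybook, which is the point of the dynamic approach.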
There are currently four types of ValueData: CharData, IntegerData, DecimalData and FloatData.
All leaf Data types have Object versions for retrieving and setting the values. In addition, all leaf Data objects take String values and attempt to convert them to the proper type.
The use of these should be fairly intuitive, but the basic idea is that any numeric type that has no decimal portion will be represented in an IntegerData. All other numeric types except floating-point types are represented in DecimalData. Anything that is not strictly numeric is represented in a CharData. Boolean types are not currently supported. Floating-point types are not strictly decimal types and have special rules.
CharData objects are used to represent text data. They consist of normal text data and are always described by an AlphaNumeric definition object. The validation rules for alphanumeric data depend on the PICTURE clause, which specifies which characters can be in what positions and the length of the element. The 'natural' Java type for CharData is String.
IntegerData objects represent numeric types that have no fraction part. The validation for these types is that no fraction is included and the number of digits is within the range specified by the PICTURE clause of the element's definition. The 'natural' Java type for IntegerData is java.math.BigInteger.
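The digit-count rule can be expressed in a few lines. This is a sketch of the check as described, not CB2Java's validation code; IntegerDigits and fits are hypothetical names, and the sign is assumed not to count as a digit.

```java
import java.math.BigInteger;

// Hypothetical helper; illustrates the "number of digits" rule only.
class IntegerDigits {

    // True when the magnitude of value can be written with at most 'digits' decimal digits.
    static boolean fits(BigInteger value, int digits) {
        return value.abs().toString().length() <= digits;
    }
}
```

With this check, widening an element from 6 to 8 digits changes only the limit passed in, not the Java type that holds the value.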
DecimalData objects are for numeric data types that do contain a fractional portion. Validation rules for DecimalData are that both the fractional portion and the integer portion are within their respective ranges. The 'natural' Java type for DecimalData is java.math.BigDecimal.
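A minimal sketch of that two-part range check, using only BigDecimal. The names DecimalDigits and fits are hypothetical, and treating trailing zeros as insignificant (so 12.50 counts as one fraction digit) is an assumption rather than documented CB2Java behavior.

```java
import java.math.BigDecimal;

// Hypothetical helper; illustrates the integer-part / fraction-part rule only.
class DecimalDigits {

    // True when value needs at most integerDigits before and fractionDigits after the point.
    static boolean fits(BigDecimal value, int integerDigits, int fractionDigits) {
        BigDecimal v = value.stripTrailingZeros();
        if (v.signum() == 0) {
            return true; // zero fits any layout
        }
        // scale = digits after the point (negative for values like 1E+2)
        int fraction = Math.max(v.scale(), 0);
        // precision - scale = digits before the point (zero or less for values below 1)
        int integer = Math.max(v.precision() - v.scale(), 0);
        return integer <= integerDigits && fraction <= fractionDigits;
    }
}
```

For example, a layout with three integer digits and two fraction digits accepts 123.45 but rejects both 1234.5 and 1.234.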
FloatData objects are used to represent single and double precision floating-point types. Floating point is treated separately from other numeric types because the fractional portion of a floating-point number is not decimal and the rules for validation are very different for floating-point data. Floating-point validation requires that the number specified is exactly representable in the floating-point representation specified by its definition element. Floating-point representations are hardware specific in COBOL and not all numbers that can be represented on one platform can be represented on all others. Java uses the IEEE-754 floating-point representation, which presents a problem when data is in another form. Because of this issue the 'natural' Java type in cb2java is BigDecimal. The reasoning is that all floating-point types should be representable as decimal. However, be aware that most BigDecimal values cannot be converted exactly to floating point. If the underlying representation is IEEE-754, the float and double Java types (depending on the precision of the COBOL type) are absolutely safe to use and are generally preferable. It may also be possible to safely work with Java floats and doubles when the underlying data is not represented as IEEE-754, with careful management of the precision of the integer and (especially) the fraction portion.
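Whether a given BigDecimal survives conversion to an IEEE-754 float or double can be tested directly with the JDK. The sketch below is an illustration of "exactly representable" for Java's own types only; ExactFloat and its methods are hypothetical names, and it says nothing about non-IEEE host formats.

```java
import java.math.BigDecimal;

// Hypothetical helper; checks exact representability in Java's IEEE-754 types.
class ExactFloat {

    // True when converting to double and back yields the same numeric value.
    static boolean isExactDouble(BigDecimal d) {
        double x = d.doubleValue();
        if (Double.isInfinite(x)) {
            return false; // magnitude too large for a double
        }
        // new BigDecimal(double) gives the exact binary value, with no rounding
        return new BigDecimal(x).compareTo(d) == 0;
    }

    // Same check for single precision; widening float to double is exact.
    static boolean isExactFloat(BigDecimal d) {
        float x = d.floatValue();
        if (Float.isInfinite(x)) {
            return false;
        }
        return new BigDecimal((double) x).compareTo(d) == 0;
    }
}
```

Values such as 0.5 and 0.25 pass, while 0.1 does not, which is the sense in which most decimal values cannot be converted exactly.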
The four main types of Data are really the heart of the design behind the CB2Java project. By mapping each COBOL type to one of these four, working with COBOL data is very much simplified. These objects are created dynamically as the data is parsed, which removes the requirement for brittle generated code and for recompiling code after trivial changes to the field layouts, such as the size of an element.
The Copybook class, while primarily used to parse application data, can also be used to build generators or for any other introspective task related to the copybook. It is expected that most users of this package will not need to worry too much about the Element types, and the classes are designed such that very little interaction with the Element types is required.
The base type for the parse tree is the Element class. There are a good number of Element types, one for each unique COBOL type. This means that while there is a separate type for BINARY vs. PACKED types, there is no type for COMP vs. BINARY. The different types and how they relate back to the COBOL types is detailed in the JavaDocs for these classes (or will be anyway).
Like the Data types, the Element types are grouped into two major types: Leaf and Group. Leaf elements are associated with values and Group types with, well, groups. The element tree will essentially mirror the Data tree except that the Elements have much more information relating to the actual COBOL types in the copybook definition.
The basic role of the Element types is translating byte input into Data instances, validating that data matches the specification in the copybook and writing Data objects back to byte format.
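As an illustration of that bytes-to-value-and-back role, here is a generic sketch for one COBOL storage format, packed decimal (two decimal digits per byte, with the sign in the final half-byte: 0xD for negative, otherwise treated as non-negative). It is not CB2Java's implementation; PackedSketch, decode and encode are hypothetical names.

```java
import java.math.BigInteger;

// Hypothetical codec for packed-decimal fields; not part of CB2Java.
class PackedSketch {

    // Bytes -> value. Rejects half-bytes that are not decimal digits.
    static BigInteger decode(byte[] packed) {
        if (packed.length == 0) {
            throw new IllegalArgumentException("empty field");
        }
        StringBuilder digits = new StringBuilder();
        int last = packed.length - 1;
        for (int i = 0; i <= last; i++) {
            appendDigit(digits, (packed[i] >> 4) & 0x0F);
            if (i < last) {
                appendDigit(digits, packed[i] & 0x0F); // final low half-byte is the sign
            }
        }
        BigInteger magnitude = new BigInteger(digits.toString());
        return (packed[last] & 0x0F) == 0x0D ? magnitude.negate() : magnitude;
    }

    // Value -> bytes of the given field length, left-padded with zero digits.
    static byte[] encode(BigInteger value, int length) {
        String digits = value.abs().toString();
        int capacity = 2 * length - 1; // one half-byte is reserved for the sign
        if (digits.length() > capacity) {
            throw new IllegalArgumentException("value does not fit in " + length + " bytes");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = digits.length(); i < capacity; i++) {
            padded.append('0');
        }
        padded.append(digits);
        byte[] out = new byte[length];
        for (int i = 0; i < length; i++) {
            int hi = padded.charAt(2 * i) - '0';
            int lo = (i == length - 1)
                    ? (value.signum() < 0 ? 0x0D : 0x0C)
                    : padded.charAt(2 * i + 1) - '0';
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    private static void appendDigit(StringBuilder sb, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("not a decimal digit: " + nibble);
        }
        sb.append(nibble);
    }
}
```

The three responsibilities named above map onto the sketch directly: decode translates bytes into a value, the digit and capacity checks validate against the field definition, and encode writes the value back to bytes.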
This chapter discusses the base classes and classes that are general to the package. Discussion of each of the Element implementations is not included at this time.
The CopybookParser class is, as the name suggests, a parser for copybooks. Currently the CopybookParser class requires that each stream parsed contains only one record layout. This will likely change in the future to support any number of layouts in a single stream. The CopybookParser class takes the COBOL code and produces instances of the Copybook class to represent the data structures in the stream.
The Copybook class represents a parsed copybook definition. It is a special form of the Group class (discussed in a following section). Every parsed copybook layout has a single copybook element that acts as the root node for the parse tree.
The Element class represents a single element in the record. This is generally defined by a PIC clause. The Element class is an abstract class. The actual instances are specific to the different types in a COBOL data declaration.
The Group class is used to represent group types, i.e. elements in the copybook that are composed of other elements. This class should be fairly straightforward. It provides a List of its children.