A names file is required for attribute-value data analysis. It is a text file that lists the attributes that describe the cases in the data file to be analyzed.
Each attribute is described on a separate line.
Each line starts with the name of the attribute.
For categorical attributes, the attribute name is followed by a colon (:) and then either the keyword categorical or a comma-separated list of the values allowed for the attribute.
Example:
Department: bakery, dairy, beverages
This specifies that the attribute Department can assume any one of the three values bakery, dairy, or beverages. A case containing any other value is discarded and an error message is generated.
Example:
Department: categorical
This specifies that the attribute Department can assume any value that appears in the data file.
For compatibility with See-5, Magnum Opus also accepts the keyword discrete, which is treated as equivalent to categorical.
Numeric attributes must be divided into sub-ranges. These can be specified in the names file. Alternatively, the names file can simply identify the number of sub-ranges and Magnum Opus will select the sub-ranges for you.
For a numeric attribute with specified sub-ranges, the attribute name is followed by a list of sub-range cut points. These indicate how the numeric values for the attribute are to be subdivided into sub-ranges. Each cut point is introduced by one of the relations < or <= which is followed by the value that terminates the sub-range. If the relation is <, the sub-range includes all values less than the specified value. If the relation is <=, the sub-range includes all values less than or equal to the specified value.
Example:
Spend < 10 <= 100
This specifies that the attribute Spend has three sub-ranges: below the first cut point, between the two cut points, and above the last cut point.
Spend < 10
10 <= Spend <= 100
Spend > 100
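The way a cut-point list divides numeric values can be illustrated with a minimal Python sketch. This is not Magnum Opus code: the function names parse_cut_points and sub_range_index are hypothetical, and the sketch assumes the cut points are listed in ascending order.

```python
def parse_cut_points(spec):
    """Parse a cut-point list such as '< 10 <= 100' into (relation, value) pairs."""
    tokens = spec.split()
    return [(tokens[i], float(tokens[i + 1])) for i in range(0, len(tokens), 2)]


def sub_range_index(value, cuts):
    """Return the zero-based index of the sub-range that contains value.

    Assumes the cut points are in ascending order (an assumption of this sketch).
    """
    for index, (relation, bound) in enumerate(cuts):
        if relation == "<" and value < bound:
            return index
        if relation == "<=" and value <= bound:
            return index
    # Values beyond the last cut point fall into the final sub-range.
    return len(cuts)


cuts = parse_cut_points("< 10 <= 100")
```

With these cut points, a Spend of 10 falls in the middle sub-range because the first relation is a strict <, and a Spend of 100 also falls in the middle sub-range because the second relation is <=.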
To allow Magnum Opus to select sub-ranges, use the keyword numeric, followed by the number of sub-ranges required.
Example:
Spend: numeric 5
For compatibility with See-5, Magnum Opus also accepts the keyword continuous, which is treated as numeric 3.
The keyword ignore instructs Magnum Opus to discard any data for the given attribute. This is useful for handling attributes that may appear in the data but which should not be used, such as record identifiers.
A categorical attribute declaration can be followed by any number of generalization declarations. A generalization declares a new value for an attribute that is equivalent to a group of previously declared values. Each generalization declaration appears on a separate line, starting with a colon (:). This is followed by a name for the generalization, then another colon, and then a list of the values to which the new value is equivalent.
Example:
Amino Acid 1: A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
: small: G,A,S
: medium: C,D,T,N,P,E,V,Q,H,M,I,L,K
: large: W,F,Y,R
In this example, an attribute called Amino Acid 1 is declared with 20 initial values. The next three lines declare three further values, each of which applies whenever one of its listed initial values applies. So, if the data file contains the value A for this attribute, the attribute will be assigned two values, both A and small.
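The effect of the generalization declarations above can be sketched in Python. This is an illustration only, not how Magnum Opus is implemented; the names generalizations and expand are hypothetical.

```python
# Generalizations from the Amino Acid 1 example, mapped to the values they cover.
generalizations = {
    "small": set("G,A,S".split(",")),
    "medium": set("C,D,T,N,P,E,V,Q,H,M,I,L,K".split(",")),
    "large": set("W,F,Y,R".split(",")),
}


def expand(value):
    """Return the value read from the data plus every generalization that covers it."""
    covering = [name for name, members in generalizations.items() if value in members]
    return [value] + covering
```

Here expand("A") yields both A and small, matching the description above, while expand("W") yields W and large.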
Attribute names can consist of any sequence of printable characters other than colon (:) and less than (<).
Attribute value names can consist of any sequence of printable characters other than comma (,). Although names may be any length, Magnum Opus only considers the first 50 characters of the name. The special name ? is reserved for missing values.
The names file must list the attributes in the order that they appear in the data file.
Magnum Opus ignores any line beginning with a comment character (|).
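As a rough illustration of the line-level rules above, the following Python sketch lists the attribute names declared in a names file. It is not the Magnum Opus parser. The function name attribute_names is hypothetical, and the sketch assumes that leading whitespace and blank lines can be ignored and that an attribute name ends at the first colon or less-than sign, since neither character may appear in a name.

```python
import re


def attribute_names(lines):
    """Return the attribute names declared in the lines of a names file, in order."""
    names = []
    for line in lines:
        stripped = line.strip()  # assumption: surrounding whitespace is not significant
        if not stripped or stripped.startswith("|"):
            continue  # blank line (assumed skippable) or comment line
        if stripped.startswith(":"):
            continue  # generalization line; it belongs to the preceding attribute
        # The name runs up to the first ':' or '<', neither of which is allowed in a name.
        name = re.split(r"[:<]", stripped, maxsplit=1)[0].strip()
        names.append(name)
    return names


sample = [
    "| example names file",
    "Department: bakery, dairy, beverages",
    "Spend < 10 <= 100",
    "Amino Acid 1: A,C,D",
    ": small: A",
]
```

Because the names file must list attributes in the same order as the data file, the list returned for sample also gives the expected column order of the data.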