Description
The file command
uses the /etc/magic file in its attempt to identify the type
of a binary file. Essentially, /etc/magic contains templates
showing what different types of files look like.
The
magic file
contains lines describing magic numbers, which identify particular
types of files. Lines beginning with a > or & character represent
continuation lines to a preceding main entry:
- >
- If the file command finds a match on
the main entry line, these additional patterns are checked. Any pattern
that matches is used. This may generate additional output; a single
blank separates each matching line's output if any output exists for
that line.
- &
- If the file command finds a match on
the main entry line, and a following continuation line begins with
this character, that continuation line's pattern must also match,
or neither line is used. Output text associated with any line beginning
with the & character is ignored.
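The interaction between the two continuation characters can be sketched in a few lines of Python. This is a minimal illustration, not the file command's implementation: the function name `render` and the tuple layout are hypothetical, each pattern is assumed to have been evaluated already (the `matched` flags), and "neither line is used" is read here as rejecting the whole entry.

```python
def render(main, continuations):
    """Combine a main entry with its continuation lines.

    main          -- (matched, message) for the main entry line
    continuations -- list of (marker, matched, message), marker is ">" or "&"
    Returns the output text, or None if the entry is not used.
    """
    main_matched, main_message = main
    if not main_matched:
        return None

    parts = [main_message] if main_message else []
    for marker, matched, message in continuations:
        if marker == "&":
            # Required continuation: a failed match discards the entry.
            if not matched:
                return None
            # Output text on an & line is ignored, so nothing is appended.
        elif marker == ">":
            # Optional continuation: contributes output only when it matches.
            if matched and message:
                parts.append(message)
    # A single blank separates the output of each matching line.
    return " ".join(parts)
```

For instance, `render((True, "A"), [("&", True, "ignored"), (">", True, "B")])` yields `A B`, while a failed & line makes the same call return `None`.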
Each line consists of four fields, separated
by one or more tabs:
- (a)
- The first field is a byte offset in the file, consisting of an
optional offset operator and a value. In continuation lines, the offset
immediately follows a continuation character.
If no offset operator
is specified, then the offset value indicates an offset from the beginning
of the file.
The * offset operator specifies that the value
located at the memory location following the operator be used as the
offset. Thus, *0x3C indicates that the value contained
in 0x3C should be used as the offset.
The
+ offset operator specifies an incremental offset, based on the value
of the last offset. Thus, +15 indicates that the
offset value is 15 bytes from the last specified offset.
If
the byte offset has passed the file length limit, the test will not
match.
- (b)
- The second field is the type of the value.
The valid specifiers
are listed below:
- d
- Signed decimal
- u
- Unsigned decimal
- s
- String
u and d can
be followed by an optional unsigned decimal integer that specifies
the number of bytes represented by the type. The supported byte counts
are restricted to the byte lengths of the C-language types char,
short, int, and long. u and d can also be followed by one of the optional size
specifiers listed below:
- C
- char
- S
- short
- I
- int
- L
- long
The C, S, I,
or L specifiers correspond to the number
of bytes in the C-language types char, short, int, or long. All
type specifiers, except for s, can be followed
by a mask specifier of the form &number.
The mask value will be bitwise ANDed with the value of the input
file before the comparison with the value field of the line is made.
By default the mask will be interpreted as an unsigned decimal number.
With a leading 0x or 0X, the mask will be interpreted as an unsigned
hexadecimal number; otherwise, with a leading 0, the mask will be
interpreted as an unsigned octal number.
The long format of
type specifiers is supported. The valid specifiers, and their interpretation,
are listed below:
Specifier | _UNIX03=YES | _UNIX03 is not YES |
byte | dC | uC |
short | dS | uS |
long | dL | uL |
string | s | s |
- (c)
- The next field is a value, preceded by an optional operator.
If
the specifier from the type field is s or string, then interpret the
value as a string. Otherwise, interpret it as a number. If the value
is a string, then the test will succeed only when a string value exactly
matches the bytes from the file. The string value field can contain
at most 127 characters per magic line.
If the value is a string,
it can contain the following sequences:
- \character
The backslash-escape sequences as specified in
the Base Definitions volume of IEEE Std 1003.1-2001, Table 5-1, Escape
Sequences and Associated Actions (\\, \a, \b, \f, \n, \r, \t, \v ).
In addition, the escape sequence \ (the <backslash> character
followed by a <space> character) will be recognized to represent
a <space> character.
- \octal
Octal sequences that can be used to represent characters
with specific coded values. An octal sequence consists of a backslash
followed by the longest sequence of one, two, or three octal-digit
characters (01234567).
By default, any value that is not a string will be interpreted
as a signed decimal number. Any such value, with a leading 0x or 0X,
will be interpreted as an unsigned hexadecimal number; otherwise,
with a leading zero, the value will be interpreted as an unsigned
octal number. To maintain compatibility with other systems, numeric
values are not subject to bounds checking. Use numeric values that
match the specified type.
Operators only apply to nonstring
types: byte, short, and long.
The default operator is = (exact match). The operators are:
- =
- Equal.
- !
- Not equal.
- >
- Greater than.
- <
- Less than.
- &
- All bits in pattern must match.
- ^
- At least one bit in pattern must not match.
- x or ?
- Any value matches (must be the only character in the field). ?
is an extension to traditional implementations of magic.
- (d)
- The rest of the line is the message string to be printed if the
particular file matches the template. Note that the contents of this
field are ignored if the line begins with the & continuation character.
The fourth field may contain a printf()-type
format indicator to output the magic number (see printf for
more details on format indicators). If the field contains a
printf()-type format indicator, the value read from the file
will be the argument to printf.
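The offset, mask, and operator rules above can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: the helper names (`parse_number`, `resolve_offset`, `matches`) are hypothetical, the value fetched for the * operator is assumed to be a 4-byte big-endian integer (the text does not state its width or byte order), and offsets are assumed to follow the same decimal, hexadecimal, and octal conventions as values and masks.

```python
def parse_number(text):
    """Parse a value or mask: 0x/0X means hex, a leading 0 means octal."""
    sign = -1 if text.startswith("-") else 1
    digits = text.lstrip("+-")
    if digits[:2] in ("0x", "0X"):
        return sign * int(digits[2:], 16)
    if len(digits) > 1 and digits[0] == "0":
        return sign * int(digits, 8)
    return sign * int(digits, 10)


def resolve_offset(spec, data, last_offset, ptr_size=4):
    """Turn an offset field (without any continuation character) into a byte offset."""
    if spec.startswith("*"):
        # Indirect: use the value stored at the given location as the offset.
        # ptr_size and big-endian byte order are assumptions of this sketch.
        location = parse_number(spec[1:])
        return int.from_bytes(data[location:location + ptr_size], "big")
    if spec.startswith("+"):
        # Incremental: relative to the last specified offset.
        return last_offset + parse_number(spec[1:])
    return parse_number(spec)  # plain offset from the beginning of the file


def matches(file_value, value_field, mask=None):
    """Apply the optional mask, then the operator from the value field."""
    if value_field in ("x", "?"):
        return True  # any value matches
    if value_field[0] in "=!><&^":
        op, operand = value_field[0], value_field[1:]
    else:
        op, operand = "=", value_field  # default operator is exact match
    if mask is not None:
        file_value &= mask
    pattern = parse_number(operand)
    if op == "=":
        return file_value == pattern
    if op == "!":
        return file_value != pattern
    if op == ">":
        return file_value > pattern
    if op == "<":
        return file_value < pattern
    if op == "&":
        return (file_value & pattern) == pattern  # all pattern bits set
    return (file_value & pattern) != pattern      # "^": some pattern bit clear
```

With these helpers, `matches(0x12FF, "0xFF", mask=0xFF)` is true because the mask is applied before the comparison, and `resolve_offset("+15", data, 10)` returns 25 for any `data`.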
Usage notes
- Characters from a code page other than IBM-1047 should not be
added to the /etc/magic file (the default magic file).
- Characters from a code page other than IBM-1047 can be used in
alternate magic files that are specified by the -m or -M option
on the file command. These characters should
only be used in the third field of the magic file template
when the field type is string. They will only match
files containing these characters when the file command
is invoked in the non-IBM-1047 locale.
Examples
Here are some sample entries:
Offset | Type | Value | Message |
0 | short | 0x5AD4 | DOS executable |
*0x18 | short | 0x40 | |
>*0x3c | short | 0x6584C | OS/2 linear executable |
>*0x3C | short | 0x454e | |
>+54 | byte | 1 | OS/2 format |
>+54 | byte | 2 | Windows format |
0 | short | 0xFDF0 | DOS library |
0 | string | AH | Halo bitmapped font file |
0 | short | 0x601A | Atari ST contiguous executable |
>14 | long | >0 | - not stripped |
0 | byte | 0X1F | |
>1 | byte | 0x1E | Packed file |
>1 | byte | 0x9D | Compressed file |
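As a worked illustration, the last sample entry (a main line testing the byte at offset 0 for 0X1F, followed by two > continuation lines testing the byte at offset 1) could be applied to a buffer roughly as follows. The function name `classify` is hypothetical, and the entry is hand-translated into code rather than parsed from a magic file.

```python
def classify(data):
    """Apply the '0 byte 0X1F' entry and its two '>1 byte' continuation lines."""
    # Main entry: the byte at offset 0 must equal 0x1F.
    if len(data) < 1 or data[0] != 0x1F:
        return None

    parts = []  # the main entry has no message text of its own
    # An offset past the end of the file never matches.
    if len(data) > 1:
        if data[1] == 0x1E:
            parts.append("Packed file")
        if data[1] == 0x9D:
            parts.append("Compressed file")
    return " ".join(parts)
```

A buffer beginning with the bytes 0x1F 0x9D is reported as `Compressed file`, one beginning with 0x1F 0x1E as `Packed file`, and a buffer whose first byte is not 0x1F produces no match from this entry.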
Related information
The file command.