Format
tr [-c | -C] [-s] string1 string2
tr -s [-c | -C] string1
tr -d [-c | -C] string1
tr -ds [-c | -C] string1 string2
Description
tr copies data read from the standard input (stdin) to the standard
output (stdout), substituting or deleting characters as specified by
the options and by string1 and string2. string1 and string2 are
treated as sets of characters. In its simplest form, tr translates
each character in string1 into the character at the corresponding
position in string2.
Note: tr works on a character basis, not on a collation element basis.
Thus, for example, a range that includes the multicharacter collation
element ch in regular expressions does not include it here.
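As a minimal sketch of the positional mapping described above (the
sample input is made up for illustration), each character of string1
is replaced by the character at the same position in string2:

```shell
# Map e -> i and l -> p; every other character passes through unchanged.
result=$(printf 'hello\n' | tr 'el' 'ip')
printf '%s\n' "$result"   # prints: hippo
```

Both strings have the same length here, so the result does not depend
on how a shorter string2 is handled.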
Options
- -c
- If the variable _UNIX03 is unset or is not set to YES, the -c option
complements the set of characters specified by string1: tr constructs
a new set consisting of all the characters not found in string1 and
uses this new set in place of string1.
If _UNIX03 is set to YES, the -c option complements the set of values
specified by string1: tr constructs a new set containing the
complement of the values specified by string1 (all possible binary
values except those actually specified in the string1 operand), placed
in ascending order by binary value. The new set is used in place of
string1.
- -C
- Complements the set of characters specified by string1: tr
constructs a new set containing the complement of the characters
specified by string1 (all characters in the current character set, as
defined by the current setting of LC_CTYPE, except those actually
specified in the string1 operand), placed in ascending collation
sequence, as defined by the current setting of LC_COLLATE. This
behaves the same as -c when the variable _UNIX03 is unset or is not
set to YES.
- -d
- Deletes input characters found in string1 from
the output.
- -s
- tr checks for sequences of a string1 character repeated several
consecutive times. When this happens, tr replaces the sequence of
repeated characters with one occurrence of the corresponding character
from string2; if string2 is not specified, the sequence is replaced
with one occurrence of the repeated character itself. For example:
tr -s abc xyz
translates the input string aaaabccccb into the output string xyzy.
If you specify both the -d and -s options, you must specify both
string1 and string2. In this case, string1 contains the characters to
be deleted, whereas string2 contains the characters that are to have
multiple consecutive appearances replaced with one appearance of the
character itself. For example:
tr -ds a b
translates the input string abbbaaacbb into the output string bcb.
The actions of the -s option take place after all other deletions and
translations.
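The options above can be tried out with a short script. This is only a
sketch: the digit-stripping inputs are invented for illustration, while
the -s and -ds commands repeat the two examples given in the -s
description.

```shell
# -d: delete every digit from the input.
deleted=$(printf 'a1b2c3\n' | tr -d '0123456789')

# -c with -d: delete everything that is NOT a digit or a newline.
kept=$(printf 'a1b2c3\n' | tr -cd '0123456789\n')

# -s: translate a->x, b->y, c->z, then squeeze repeated characters.
squeezed=$(printf 'aaaabccccb\n' | tr -s abc xyz)

# -ds: delete every a, then squeeze runs of b.
both=$(printf 'abbbaaacbb\n' | tr -ds a b)

printf '%s\n' "$deleted" "$kept" "$squeezed" "$both"
# prints: abc, 123, xyzy, bcb (one per line)
```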
String options
You can use the following conventions to represent elements of string1
and string2:
- character
- Any character not described by the conventions that follow represents
itself.
- \ooo
- An octal representation of a character with a specific coded value.
It can consist of one, two, or three octal digits (01234567). Double-byte
characters require multiple, concatenated escape sequences of this
type, including the leading \ for each byte.
- \character
- The \ (backslash) character is used as an escape
to remove the special meaning of characters. It also introduces escape
sequences for nonprinting characters, in the manner of C character
constants: \b, \f, \n, \r, \t,
and \v.
- c1-c2
- In the POSIX locale, as long as neither endpoint is an octal sequence
of the form \ooo, this represents all characters between characters
c1 and c2 (in the current locale's collating sequence), including the
end values. For example, 'a-z' represents all the lowercase letters
in the POSIX locale, whereas 'A-Z' represents all that locale's
uppercase letters. One way to convert lowercase to uppercase is with
the following filter:
tr 'a-z' 'A-Z'
This is not, however, the recommended method; use the [:class:]
construct instead.
If the second endpoint precedes the starting endpoint in the collation
sequence, it causes an error.
If either or both of the range endpoints are octal sequences of the
form \ooo, this represents the range of specific coded values between
the two range endpoints, inclusive.
The c1-c2 construct applies only in the POSIX locale.
Note: The current locale has a significant effect on results when
specifying subranges using this method. If the command is required to
give consistent results irrespective of locale, avoid the c1-c2
construct.
- [c*n]
- This represents n repeated occurrences
of character c. (If n has
a leading zero, tr assumes it is octal;
otherwise, it is assumed to be decimal.) You can omit the number
for the last character in a subset. This representation is valid only
in string2.
- [:class:]
- This represents all characters that belong to the character class
class in the locale indicated by LC_CTYPE. When the class [:upper:] or
[:lower:] appears in string1 and the opposite class, [:lower:] or
[:upper:], appears in string2, tr uses the LC_CTYPE tolower or toupper
mappings in the same relative positions.
- [=c=]
- This represents all characters that belong to the same equivalence
class as the character c in the locale indicated
by LC_COLLATE. Only international versions of the code support
this format.
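A few of the conventions above, sketched as a script. The inputs are
invented for illustration, and the \ooo line assumes an ASCII-based
code page, where octal 040 is the space character; on an EBCDIC code
page the coded values differ.

```shell
# \t escape: turn each tab into a space.
tabs=$(printf 'a\tb\n' | tr '\t' ' ')

# \ooo escape: ASSUMES ASCII, where octal 040 is the space character.
octal=$(printf 'a b\n' | tr '\040' '_')

# [:class:]: convert lowercase letters to uppercase.
upper=$(printf 'Hello World\n' | tr '[:lower:]' '[:upper:]')

# [c*n] in string2: three occurrences of x, one per character of string1.
repeat=$(printf 'abcabc\n' | tr 'abc' '[x*3]')

printf '%s\n' "$tabs" "$octal" "$upper" "$repeat"
```

The class-based form is the one the c1-c2 description recommends for
case conversion, because it does not depend on the collating sequence.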
Usage notes
When string2 is shorter than string1, tr does not pad string2. The
remaining characters in string1 are not translated. For example:
tr '0123456789' 'd'
translates only '0' to 'd'; the characters '123456789' remain
unchanged. Coding the example in the following way:
tr '0123456789' '[d*]'
translates all digits to the letter 'd'.
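A minimal sketch of the explicit-repeat form applied to a sample line
(the input text is made up for illustration). Because '[d*]' spells
out the repetition, the result does not rely on how the command treats
a short string2:

```shell
# '[d*]' repeats d as often as needed to match the length of string1,
# so every digit is translated to d.
masked=$(printf 'tel 555-0100\n' | tr '0123456789' '[d*]')
printf '%s\n' "$masked"   # prints: tel ddd-dddd
```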
Examples
This example creates a list of
all words (strings of letters) found in file1 and puts it in file2:
tr -cs "[:alpha:]" "[\n*]" <file1 >file2
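The same command can be sketched on inline input instead of file1 and
file2 (the sample sentence is an assumption for illustration). Every
character that is not a letter becomes a newline, and -s then squeezes
runs of newlines so that each word lands on its own line:

```shell
# Complement of [:alpha:] -> newline, then squeeze repeated newlines.
words=$(printf 'the quick, brown fox\n' | tr -cs '[:alpha:]' '[\n*]')
printf '%s\n' "$words"
# prints the, quick, brown, fox on four separate lines
```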
Environment variables
tr uses the following environment variable: _UNIX03.
For more information about the effect of _UNIX03 on this command, see
Shell commands changed for UNIX03.
Localization
tr uses the following localization environment variables:
- LANG
- LC_ALL
- LC_COLLATE
- LC_CTYPE
- LC_MESSAGES
- LC_SYNTAX
- NLSPATH
See Localization for more information.
Exit values
- 0
- Successful completion
- 1
- Failure because of unknown command line option, or too few arguments
Portability
POSIX.2, X/Open Portability Guide.
tr is compatible with earlier versions of both the UNIX Version 7 and
System V variants of this command, but with extensions (C escapes,
handling of ASCII NUL, globalization).