diff — Compare two text files and show the differences

Format

diff [–BbefHhimNnrsw] [–C n] [–c[n]] [–Difname] [-M mark] [-W option[,option]...] path1 path2

Description

The diff command attempts to determine the minimal set of changes needed to convert a file whose name is specified by the path1 argument into the file specified by the path2 argument.

Input files must be text files. If either (but only one) file name is –, diff uses a copy of the standard input (stdin) for that file. If exactly one of path1 or path2 is a directory, diff uses a file in that directory with the same name as the other file name. If both are directories, diff compares files with the same file names under the two directories; however, it does not compare files in subdirectories unless you specify the –r option. When comparing two directories, diff does not compare character special files or FIFO special files with any other files.
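The directory behavior described above can be tried with a small scratch tree. The following is a minimal sketch, not a definitive procedure: the directory names old and new, the subdirectory sub, and the file names are hypothetical, and the exact wording of the messages may vary between diff implementations.

```shell
# Hypothetical scratch layout; every name below is illustrative only.
work=${TMPDIR:-/tmp}/diff-dirs.$$
mkdir -p "$work/old/sub" "$work/new/sub"
printf 'same\n'  > "$work/old/top.txt"
printf 'same\n'  > "$work/new/top.txt"
printf 'draft\n' > "$work/old/sub/notes.txt"
printf 'final\n' > "$work/new/sub/notes.txt"

# Without -r, only files directly under the two directories are compared.
flat_status=0
flat_out=$(diff "$work/old" "$work/new") || flat_status=$?

# With -r, files under corresponding subdirectories are compared as well.
deep_status=0
deep_out=$(diff -r "$work/old" "$work/new") || deep_status=$?

printf 'without -r (exit %s):\n%s\n' "$flat_status" "$flat_out"
printf 'with -r (exit %s):\n%s\n' "$deep_status" "$deep_out"
rm -rf "$work"
```

Without -r, the difference in sub/notes.txt is not reported. With -r, diff descends into sub and reports the changed line.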

By default, output consists of descriptions of the changes in a style like that of the ed text editor. A line indicating the type of change is given. The three types are a (append), d (delete), and c (change). The output is symmetric: A delete in path1 is the counterpart of an append in path2. diff prefixes each operation with a line number (or range) in path1 and suffixes each with a line number (or range) in path2. After the line giving the type of change, diff displays the deleted or added lines, prefixing lines from path1 with < and lines from path2 with >.
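To make the default output style concrete, the following sketch builds two five-line files that differ by one change, one delete, and one append. The scratch directory and the file names old.txt and new.txt are assumptions made for illustration.

```shell
# Scratch directory and file names are illustrative assumptions.
work=${TMPDIR:-/tmp}/diff-default.$$
mkdir -p "$work"
printf '%s\n' one two three four five > "$work/old.txt"
printf '%s\n' one TWO three five six  > "$work/new.txt"

# diff exits with 1 when the files differ, so capture the status explicitly.
status=0
output=$(diff "$work/old.txt" "$work/new.txt") || status=$?
printf '%s\n' "$output"
rm -rf "$work"
```

The command displays:

    2c2
    < two
    ---
    > TWO
    4d3
    < four
    5a5
    > six

Line 2 is changed (c), line 4 of old.txt is deleted (d) and would have followed line 3 of new.txt, and line 5 of new.txt is appended (a) after line 5 of old.txt.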

Options

Options that control the output or style of file comparison are:
–B
Disables the automatic conversion of tagged files. This option is ignored if the filecodeset or pgmcodeset options (-W option) are specified.
–b
Ignores trailing blanks and tabs and considers adjacent groups of blanks and tabs elsewhere in input lines to be equivalent.

For example, if one file contained a string of three spaces and a tab at a given location while the other file contained a string of two spaces at the same location, diff would not report this as a difference.

–C n
Shows n lines of context before and after each change. diff marks lines removed from path1 with –, lines added to path2 with +, and lines changed in both files with !.
–c[n]
Is equivalent to –Cn, but n is optional. The default value for n is 3. diff marks lines removed from path1 with –, lines added to path2 with +, and lines changed in both files with !.
–Difname
Displays output that is the appropriate input to the C preprocessor to generate the contents of path2 when ifname is defined, and the contents of path1 when ifname is not defined.
–e
Writes out a script of commands for the ed text editor, which converts path1 to path2. diff sends the output to the standard output (stdout).
–f
Writes a script to stdout that shows the modifications necessary to convert path1 to path2, in the reverse order of that produced by the –e option. However, the script is not in a form that is suitable for use with the ed editor: the commands appear in the reverse order from those produced by –e, and line number ranges are separated by spaces rather than commas. This option conflicts with the –m option.
–H
Uses the half-hearted (–h) algorithm only if the normal algorithm runs out of system resources.
–h
Uses a fast, half-hearted algorithm instead of the normal diff algorithm. This algorithm can handle arbitrarily large files; however, it is not good at finding a minimal set of differences in files with many differences.
–i
Ignores the case of letters when doing the comparison.
–m
Produces the contents of path2 with extra formatter request lines interspersed to show which lines were added (those with vertical bars in the right margin) and deleted (indicated by a * in the right margin).
–M
Is an IBM® internal option and is not supported.
–n
Is an IBM internal option and is not supported.
–N
Is an IBM internal option and is not supported.
–r
Compares corresponding files under the directories, and recursively compares corresponding files under corresponding subdirectories under the directories. You can use this option when you specify two directory names on the command line.
–s
Compares two directories, file by file, and prints messages for identical files between the two directories.
–w
Ignores white space during the comparison process.
-W option[,option]...
Specifies z/OS-specific options. The option keywords are case-sensitive. Possible options are:
filecodeset=codeset
Performs text conversion from one code set to another when reading from the file. The coded character set of the file is codeset. codeset can be a code set name known to the system or a numeric coded character set identifier (CCSID). Note that the command iconv -l lists existing CCSIDs along with their corresponding code set names. The filecodeset and pgmcodeset options can be used on files with any file tag.

If pgmcodeset is specified but filecodeset is omitted, then the default file code set is ISO8859-1 even if the file is tagged with a different code set. If neither filecodeset nor pgmcodeset is specified, text conversion will not occur unless automatic conversion is enabled or the _TEXT_CONV environment variable indicates text conversion. For more information about text conversion, see Controlling text conversion for z/OS UNIX shell commands.

If filecodeset or pgmcodeset is specified, then automatic conversion is disabled for this command invocation and the -B option is ignored if it is also specified. See z/OS UNIX System Services Planning for more information about automatic conversion.

When specifying values for filecodeset, use the values that Unicode Service supports. For more information about supported code sets, see z/OS Unicode Services User's Guide and Reference.

pgmcodeset=codeset
Performs text conversion from one code set to another when reading from the file. The coded character set of the program (command) is codeset. codeset can be a code set name known to the system or a numeric coded character set identifier (CCSID). Note that the command iconv -l lists existing CCSIDs along with their corresponding code set names. The filecodeset and pgmcodeset options can be used on files with any file tag.

If filecodeset is specified but pgmcodeset is omitted, then the default program code set is IBM-1047. If neither filecodeset nor pgmcodeset is specified, text conversion will not occur unless automatic conversion is enabled or the _TEXT_CONV environment variable indicates text conversion. For more information about text conversion, see Controlling text conversion for z/OS UNIX shell commands.

If filecodeset or pgmcodeset is specified, then automatic conversion is disabled for this command invocation and the -B option is ignored if it is also specified. See z/OS UNIX System Services Planning for more information about automatic conversion.

Restriction: The only supported values for pgmcodeset are IBM-1047 and 1047.
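The difference between the –b and –w options is easiest to see from exit values. The following is a minimal sketch: the file names and the helper function report are hypothetical, and it relies only on the exit values 0 (no differences) and 1 (differences found).

```shell
# Scratch directory and file names are illustrative assumptions.
work=${TMPDIR:-/tmp}/diff-blank.$$
mkdir -p "$work"
printf 'one  two \n' > "$work/spaced.txt"   # two blanks inside, one trailing blank
printf 'one two\n'   > "$work/single.txt"   # one blank inside
printf 'onetwo\n'    > "$work/joined.txt"   # no blanks at all

# report is a hypothetical helper: it runs diff with the given arguments
# and prints only the exit value.
report() {
    rc=0
    diff "$@" > /dev/null || rc=$?
    echo "$rc"
}

plain=$(report "$work/spaced.txt" "$work/single.txt")        # blanks compared literally
with_b=$(report -b "$work/spaced.txt" "$work/single.txt")    # groups of blanks are equivalent
b_joined=$(report -b "$work/single.txt" "$work/joined.txt")  # a blank versus none still differs
w_joined=$(report -w "$work/single.txt" "$work/joined.txt")  # all white space is ignored
echo "plain=$plain b=$with_b b_joined=$b_joined w_joined=$w_joined"
rm -rf "$work"
```

With –b, a group of blanks matches another group of blanks, but not the absence of blanks. Only –w treats "one two" and "onetwo" as the same line.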

Examples

  1. To compare two text files containing UTF-8 characters and show the differences, assuming that:
    • The text files are untagged and you do not want to tag them or enable automatic conversion, and
    • You cannot alter the tag (for example, you are comparing untagged public text files or read-only text files):
    diff -W filecodeset=UTF-8,pgmcodeset=IBM-1047 myUtf8File01 myUtf8File02
  2. To compare two text files containing EBCDIC characters and show the differences, assuming that automatic conversion has been enabled but the text files are incorrectly tagged as ASCII:
    diff -B myMisTaggedFile01 myMisTaggedFile02
  3. The following example illustrates the effect of the –c option on the output of the diff command. The following two files, price1 and price2, are compared with and without the use of the –c option.
    The contents of price1 are as follows:
    Company X Price List:
    $  0.39  -- Package of Groat Clusters
    $  5.00  -- Candy Apple Sampler Pack
    $ 12.00  -- Box of Crunchy Frog Chocolates
    $ 15.99  -- Instant Rain (Just Add Water)
    $ 20.00  -- Asparagus Firmness Meter
    $ 25.00  -- Package of Seeds for 35 Herbs
    $ 30.00  -- Child's Riding Hood (Red)
    $ 35.00  -- Genuine Placebos
    $ 45.00  -- Case of Simulated Soy Bean Oil
    $ 75.88  -- No-Name Contact Lenses
    $ 99.99  -- Kiddie Destructo-Bot
    $125.00  -- Emperor's New Clothes
    The contents of price2 are as follows:
    Company X Price List:
    $  0.39  -- Package of Groat Clusters
    $  5.49  -- Candy Apple Sampler Pack
    $ 12.00  -- Box of Crunchy Frog Chocolates
    $ 15.99  -- Instant Rain (Just Add Water)
    $ 17.00  -- Simulated Naugahyde cleaner
    $ 20.00  -- Asparagus Firmness Meter
    $ 25.00  -- Package of Seeds for 35 Herbs
    $ 30.00  -- Child's Riding Hood (Red)
    $ 35.00  -- Genuine Placebos
    $ 45.00  -- Case of Simulated Soy Bean Oil
    $ 75.88  -- No-Name Contact Lenses
    $ 99.99  -- Kiddie Destructo-Bot
    The command:
    diff price1 price2
    results in the following display:
    3c3
    < $  5.00  -- Candy Apple Sampler Pack
    ---
    > $  5.49  -- Candy Apple Sampler Pack
    5a6
    > $ 17.00  -- Simulated Naugahyde cleaner
    13d13
    < $125.00  -- Emperor's New Clothes
    The addition of the –c option, as in:
    diff -c price1 price2
    results in the following display:
    *** price1 Wed Oct 1 13:59:18 1997
    --- price2 Wed Oct 1 14:03:36 1997
    ***************
    *** 1,8 ****
      Company X Price List:
      $  0.39  -- Package of Groat Clusters
    ! $  5.00  -- Candy Apple Sampler Pack
      $ 12.00  -- Box of Crunchy Frog Chocolates
      $ 15.99  -- Instant Rain (Just Add Water)
      $ 20.00  -- Asparagus Firmness Meter
      $ 25.00  -- Package of Seeds for 35 Herbs
      $ 30.00  -- Child's Riding Hood (Red)
    --- 1,9 ----
      Company X Price List:
      $  0.39  -- Package of Groat Clusters
    ! $  5.49  -- Candy Apple Sampler Pack
      $ 12.00  -- Box of Crunchy Frog Chocolates
      $ 15.99  -- Instant Rain (Just Add Water)
    + $ 17.00  -- Simulated Naugahyde cleaner
      $ 20.00  -- Asparagus Firmness Meter
      $ 25.00  -- Package of Seeds for 35 Herbs
      $ 30.00  -- Child's Riding Hood (Red)
    ***************
    *** 10,13 ****
      $ 45.00  -- Case of Simulated Soy Bean Oil
      $ 75.88  -- No-Name Contact Lenses
      $ 99.99  -- Kiddie Destructo-Bot
    - $125.00  -- Emperor's New Clothes
    --- 11,13 ----
    diff –c marks lines removed from price1 with –, lines added to price2 with +, and lines changed in both files with !. In the example, diff shows the default three lines of context around each changed line. One line was changed in both files (marked with !), one line was added to price2 (marked with +), and one line was removed from price1 (marked with –).
    Note: If there are no marks to be shown in the corresponding lines of the file being compared, the lines are not displayed. Lines 11 to 13 of price2 are suppressed for this reason.
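As a further illustration, the –e option can be tried on a pair of three-line files. This is a sketch under stated assumptions: the scratch directory and the file names old.txt and new.txt are hypothetical.

```shell
# Scratch directory and file names are illustrative assumptions.
work=${TMPDIR:-/tmp}/diff-ed.$$
mkdir -p "$work"
printf '%s\n' one two three > "$work/old.txt"
printf '%s\n' one TWO three > "$work/new.txt"

# Capture the ed script; diff exits with 1 because the files differ.
status=0
script=$(diff -e "$work/old.txt" "$work/new.txt") || status=$?
printf '%s\n' "$script"
rm -rf "$work"
```

The command displays:

    2c
    TWO
    .

The script contains no write command, so one is typically appended before the script is given to ed to convert a copy of old.txt.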

Localization

diff uses the following localization environment variables:
  • LANG
  • LC_ALL
  • LC_CTYPE
  • LC_MESSAGES
  • LC_TIME
  • LC_SYNTAX
  • NLSPATH

See Localization for more information.

Environment variables

diff uses the following environment variable:
_TEXT_CONV
Contains text conversion information for the command. The text conversion information is not used when either the -B option or the filecodeset or pgmcodeset option (-W option) is specified. For more information about text conversion, see Controlling text conversion for z/OS UNIX shell commands.

Exit values

0
No differences between the files compared.
1
diff compared the files and found them to be different.
2
Failure due to any of the following:
  • The code set is not valid
  • Could not turn off automatic conversion
  • Could not perform requested text conversion
  • Incorrect command-line argument
  • Inability to find one of the input files
  • Out of memory
  • Read error on one of the input files
4
At least one of the files is a binary file containing embedded NUL (\0) bytes or newlines that are more than LINE_MAX bytes apart.
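A script that calls diff usually needs to tell these exit values apart. The following sketch shows one way to do so. The function name describe_diff and the message strings are hypothetical, and exit value 4 is taken from the list above; other diff implementations may report binary files differently.

```shell
# describe_diff is a hypothetical helper, not part of diff itself.
# It discards diff's output and maps the exit value to a short description.
describe_diff() {
    rc=0
    diff "$1" "$2" > /dev/null 2>&1 || rc=$?
    case $rc in
        0) echo "no differences" ;;
        1) echo "files differ" ;;
        2) echo "diff failed" ;;
        4) echo "binary file" ;;   # as listed above for this implementation
        *) echo "unexpected exit value $rc" ;;
    esac
}

# Example use with scratch files (names are illustrative assumptions).
work=${TMPDIR:-/tmp}/diff-exit.$$
mkdir -p "$work"
printf 'a\n' > "$work/first.txt"
printf 'a\n' > "$work/copy.txt"
printf 'b\n' > "$work/other.txt"
describe_diff "$work/first.txt" "$work/copy.txt"
describe_diff "$work/first.txt" "$work/other.txt"
describe_diff "$work/first.txt" "$work/missing.txt"
rm -rf "$work"
```

Capturing the exit value with || keeps the script running when it is executed under set -e, because a value of 1 is a normal result rather than an error.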

Messages

Possible error messages include:
file filename: no such file or directory
The specified filename does not exist. filename was either typed explicitly, or generated by diff from the directory of one file argument and the basename of the other.
Files file1 and file2 are identical
The –s option was specified and the two named files are identical.
Common subdirectories: name and name
This message appears when diff is comparing the contents of directories, but you have not specified –r. When diff discovers two subdirectories with the same name, it reports that the directories exist, but it does not try to compare the contents of the two directories.
Insufficient memory (try diff –h)
diff ran out of memory for generating the data structures used in the file differencing algorithm. (See Limits.) The –h option of diff can handle any size file without running out of memory.
Internal error—cannot create temporary file
diff was unable to create a working file that it needed. Ensure that you either have a directory /tmp or that the environment contains the TMPDIR environment variable that names a directory where diff can store temporary files. Also, ensure that there is sufficient file space in this directory.
Missing ifdef symbol after -D
You did not specify a conditional label on the command line after the –D option.
Only one file may be –
Of the two input files typically found on the command line of diff, only one can be the standard input (stdin).
Too many lines in filename
A file of more than the maximum number of lines (see Limits) was given to diff.

Limits

The longest input line is 1024 bytes. Except under –h, files are limited to INT_MAX lines. INT_MAX is defined in limits.h.

Portability

POSIX.2, X/Open Portability Guide, UNIX systems.

The –B, –D, –H, –h, –i, –m, –s, –W, and –w options, and the n argument to the –c option, are extensions of the POSIX standard.

Related information

cmp, comm, patch

J. W. Hunt and M. D. McIlroy, An Algorithm for Differential File Comparison, Report 41, from Computing Science, Bell Laboratories, Murray Hill, NJ 07974, (June 1976), 9 pages.