Format
csplit [–Aaks] [–f prefix] [–n number] file arg arg …
Description
csplit takes a text file as input and breaks up its contents into
pieces, based on criteria given by the arg values on the command line.
For example, you can use csplit to break up a text file into chunks of
ten lines each, then save each of those chunks in a separate file. See
Splitting criteria for more information. If you specify – as the file
argument, csplit uses the standard input (stdin).
The files created by csplit normally have names of the form xxnumber,
where number is a 2-digit decimal number that begins at zero and
increments by one for each new file that csplit creates. csplit also
displays the size, in bytes, of each file that it creates.
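As a minimal sketch of the default behavior described above, the following commands build a 25-line sample file and split it at line 10. The scratch directory and the name sample.txt are placeholders chosen for illustration, and the sizes that csplit displays depend on the input.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
cd "$(mktemp -d)" || exit 1

# Build a 25-line sample file; sample.txt is a made-up name.
awk 'BEGIN { for (i = 1; i <= 25; i++) print "line " i }' > sample.txt

# Split before line 10: lines 1-9 go to xx00 and lines 10-25 go to xx01.
# csplit displays the size in bytes of each file as it creates it.
csplit sample.txt 10

# List the pieces that were created.
ls xx*
```

Because only one criterion is given, everything from line 10 to the end of the file ends up in the second piece.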
Options
- –A
- Uses uppercase letters in place of numbers in the number portion
of created file names. This generates names of the form xxAA, xxAB,
and so on.
- –a
- Uses lowercase letters in place of numbers in the number portion
of created file names. This generates names of the form xxaa, xxab,
and so on.
- –f prefix
- Specifies a prefix to use in place of the default xx when
naming files. If the prefix results in a file name longer than
NAME_MAX bytes, an error occurs and csplit exits without
creating any files.
- –k
- Leaves all created files intact when an error occurs. Normally, when
an error occurs, csplit removes the files that it has created.
- –n number
- Specifies the number of digits in the number portion of created
file names. The default is 2 digits.
- –s
- Suppresses the display of file sizes.
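The naming options can be combined, as in the hedged sketch below. The file name letters.txt and the prefix part_ are made up for illustration. Options are typed with an ASCII hyphen on the command line. The –A and –a options are left out because, as extensions, they may not be available in every csplit implementation.

```shell
cd "$(mktemp -d)" || exit 1

# A six-line sample file; letters.txt is a made-up name.
printf '%s\n' a b c d e f > letters.txt

# -f part_ replaces the default xx prefix, -n 3 asks for three digits,
# and -s suppresses the file sizes that csplit would otherwise display.
csplit -s -f part_ -n 3 letters.txt 3 5

# Expect part_000 (a, b), part_001 (c, d), and part_002 (e, f).
ls part_*
```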
Splitting criteria
csplit processes the args on the command line sequentially. The first
argument breaks off the first chunk of the file, the second argument
breaks off the next chunk (beginning at the first line remaining in the
file), and so on. Thus each chunk of the file begins with the first line
remaining in the file and goes to the line given by the next arg.
arg values can take any of the following forms:
- /regexp/
- Takes the chunk as all the lines from the current line up to but
not including the next line that contains a string matching the regular
expression regexp. After csplit obtains
the chunk and writes it to an output file, it sets the current line
to the line that matched regexp.
- /regexp/offset
- Is the same as the previous criterion, except that the chunk goes
up to but not including the line that is a given offset from
the first line containing a string that matches regexp.
The offset can be a positive or negative
integer. After csplit has obtained the chunk
and written it to an output file, it sets the current line to the
line at that offset from the line that matched regexp.
Note: This current line is the first one that was not part of the
chunk just written out.
- %regexp%
- Is the same as /regexp/, except that csplit does
not write the chunk to an output file. It simply skips over the chunk.
- %regexp%offset
- Is the same as /regexp/offset,
except csplit does not write the chunk to
an output file.
- linenumber
- Obtains a chunk beginning at the current line and going up to
but not including line number linenumber.
After csplit writes the chunk to an output
file, it sets the current line to linenumber.
- {number}
- Repeats the previous criterion number times.
If it follows a regular expression criterion, it repeats the regular
expression process number more times. If
it follows a linenumber criterion, csplit splits
the file every linenumber lines, number times,
beginning at the current line. For example,
csplit file 10 {10}
obtains a chunk from line 1 to line 9, then every 10 lines after that,
up to line 109.
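The regular expression criteria above can be tried on a small file, as in this sketch. The file name notes.txt, its contents, and the prefixes sec, skip, and off are all made up for illustration. Each criterion is quoted so that the shell passes it to csplit unchanged.

```shell
cd "$(mktemp -d)" || exit 1

# A small sample with three "# " headings; notes.txt is a made-up name.
printf '%s\n' 'preamble' '# one' 'alpha' '# two' 'beta' '# three' 'gamma' > notes.txt

# /regexp/ with a repeat count: the pattern is applied three times in all,
# so sec00 holds "preamble" and sec01, sec02, sec03 each start at a heading.
csplit -s -f sec notes.txt '/^# /' '{2}'

# %regexp% skips the lines before the match without writing them, so the
# only file created (skip00) starts at "# two".
csplit -s -f skip notes.txt '%^# two%'

# /regexp/offset: +1 moves the break one line past the match, so off00
# ends with "# two" and off01 starts at "beta".
csplit -s -f off notes.txt '/^# two/+1'

ls sec* skip* off*
```

In each run, the lines left over after the last criterion are written to one final file.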
Errors occur if any criterion tries to "grab" lines beyond the end of
the file, if a regular expression does not match any line between the
current line and the end of the file, or if an offset refers to a
position before the current line or past the end of the file.
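The sketch below shows one such error and the effect of –k. It assumes a 25-line input, which is too short for the criteria 10 {5}. The names short.txt and kept are hypothetical.

```shell
cd "$(mktemp -d)" || exit 1

# A 25-line file is too short for "10 {5}", which needs lines up to 59.
awk 'BEGIN { for (i = 1; i <= 25; i++) print "line " i }' > short.txt

# Without -k, the failed run removes the files it had already created.
if csplit -s short.txt 10 '{5}' 2>/dev/null; then
  echo "unexpected success"
else
  echo "csplit reported an error"
fi
ls xx* 2>/dev/null || echo "no xx files remain"

# With -k, the chunks written before the error are left in place.
csplit -s -k -f kept short.txt 10 '{5}' 2>/dev/null || echo "error, but files kept"
ls kept*
```

Using –k this way is a common approach when the number of chunks is not known in advance, at the cost of an error message that has to be ignored.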
Localization
csplit uses the following localization variables:
- LANG
- LC_ALL
- LC_COLLATE
- LC_CTYPE
- LC_MESSAGES
- LC_SYNTAX
- NLSPATH
See Localization for more information.
Exit values
- 0
- Successful completion
- 1
- Failure due to any of the following:
  - csplit could not open the input or output files
  - A write error on the output file
- 2
- Failure due to any of the following:
  - Unknown command-line option
  - The prefix name was missing after –f
  - The number of digits was missing after –n
  - The input file was not specified
  - No arg values were specified
  - The command ran out of memory
  - An arg was incorrect
  - The command found end-of-file before it was expected
  - A regular expression in an arg was badly formed
  - A line offset/number in an arg was badly formed
  - A {number} repetition count was misplaced or badly formed
  - Too many file names were generated when using –n
  - Generated file names would be too long
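A script can branch on the exit value, as in this hedged sketch. The specific nonzero value returned for a given failure may differ between csplit implementations, so the example only distinguishes zero from nonzero. The names in.txt, ok, and bad are made up for illustration.

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' one two three > in.txt

# A valid split should exit with status 0.
ok_status=0
csplit -s -f ok in.txt 2 || ok_status=$?

# END matches no line, so this run fails with a nonzero status.
bad_status=0
csplit -s -f bad in.txt '/END/' 2>/dev/null || bad_status=$?

echo "valid split: $ok_status, unmatched pattern: $bad_status"
```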
Portability
POSIX.2 User Portability Extension, X/Open Portability Guide, UNIX systems.
The –A and –a options are extensions to the POSIX standard.