IBM InfoSphere Streams Version 4.1.0

Housekeeping

As an administrator, you are responsible for running housekeeping tasks regularly to ensure that the Lookup Manager and ITE applications work as expected. Typically, you archive and/or remove only processed input files and consumed output files, so that the file system has enough free disk space for new input and output files.

Run the following housekeeping tasks regularly:

  • Identify processing failures for command and data files.
  • Archive and/or remove input files.
  • Archive and/or remove result files as soon as they are consumed.
  • Archive and/or remove statistics files.
  • Remove debug outputs if debugging features are used.

Procedure

  1. Identify processing failures for command and data files by checking the following directories:
    • <base>/failed, where <base> is the directory specified in the lm.commandsDirectory parameter
    • <base>/failed, where <base> is the directory specified in the ite.ingest.directory.input parameter
    • <base>/invalid, where <base> is the directory specified in the ite.ingest.directory.input parameter
    • <base>/duplicate, where <base> is the directory specified in the ite.ingest.directory.input parameter
    • <base>/rejected, where <base> is the directory specified in the ite.storage.directory.outputs parameter
  2. Optionally, archive the files that you want to keep for a longer period. Archiving is not part of the Streams product and is not implemented by the Lookup Manager or ITE applications.
  3. Remove the files that are no longer needed, unless they were already deleted during the archiving step. You can delete the files with a file explorer or the Linux rm command.
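
Step 1 of the procedure can be scripted. The following Python sketch is one possible way to do it and is not part of the Streams product or of the Lookup Manager and ITE applications. It assumes that you pass in the base directories yourself, keyed by parameter name; the function name find_failure_files and the FAILURE_SUBDIRS table are hypothetical names chosen for this illustration.

```python
from pathlib import Path

# Subdirectories to check under each base directory, as listed in step 1.
# The keys are the parameter names from this topic; the helper itself is
# an illustration and not something the applications provide.
FAILURE_SUBDIRS = {
    "lm.commandsDirectory": ["failed"],
    "ite.ingest.directory.input": ["failed", "invalid", "duplicate"],
    "ite.storage.directory.outputs": ["rejected"],
}


def find_failure_files(bases):
    """Return {directory: [files]} for each failure directory that holds files.

    bases maps a parameter name to the base directory configured for it.
    Parameters that are not supplied and directories that do not exist
    are skipped.
    """
    report = {}
    for parameter, subdirs in FAILURE_SUBDIRS.items():
        base = bases.get(parameter)
        if base is None:
            continue  # parameter not supplied, nothing to check
        for subdir in subdirs:
            directory = Path(base) / subdir
            if not directory.is_dir():
                continue
            # Report regular files only; nested directories are ignored.
            files = sorted(p for p in directory.iterdir() if p.is_file())
            if files:
                report[str(directory)] = files
    return report
```

Call find_failure_files with a dictionary such as {"ite.ingest.directory.input": "<base>"}, where <base> is the value configured for that parameter. An empty result means that none of the checked directories currently contains files.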

The following directories contain files that should be archived and/or removed:

  1. <base>/archive, where <base> is the directory specified in the lm.commandsDirectory parameter
  2. <base>/failed, where <base> is the directory specified in the lm.commandsDirectory parameter
  3. <base>/archive, where <base> is the directory specified in the lm.file.directory parameter
  4. <base>/archive, where <base> is the directory specified in the lm.statisticsDirectory parameter
  5. <base>/archive, where <base> is the directory specified in the ite.ingest.directory.input parameter
  6. <base>/failed, where <base> is the directory specified in the ite.ingest.directory.input parameter
  7. <base>/invalid, where <base> is the directory specified in the ite.ingest.directory.input parameter
  8. <base>/duplicate, where <base> is the directory specified in the ite.ingest.directory.input parameter
  9. <base>/load, where <base> is the directory specified in the ite.storage.directory.outputs parameter

    CAUTION: Delete only load files that have already been consumed.

  10. <base>/rejected, where <base> is the directory specified in the ite.storage.directory.outputs parameter
  11. <base>/archive, where <base> is the directory specified in the ite.storage.directory.statistics parameter

CAUTION: Delete only the files, not the directories themselves.
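
Because archiving is not provided by the product, it has to be done with your own tooling. The following Python sketch shows one possible approach: it packs files older than a given age into a compressed tar archive and then deletes only those files, leaving the directory in place. The function name archive_and_remove, its parameters, and the use of the file modification time as the selection criterion are assumptions made for this illustration. In particular, file age does not prove that a load file was consumed, so confirm consumption separately before you run such a script on a <base>/load directory.

```python
import tarfile
import time
from pathlib import Path


def archive_and_remove(directory, archive_file, min_age_days):
    """Archive regular files older than min_age_days, then delete them.

    Only files directly inside directory are touched. The directory itself
    and any subdirectories are left in place.
    Returns the names of the files that were archived and removed.
    """
    cutoff = time.time() - min_age_days * 86400
    old_files = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )
    if not old_files:
        return []
    # Write the archive first so that nothing is deleted if archiving fails.
    with tarfile.open(archive_file, "w:gz") as tar:
        for path in old_files:
            tar.add(path, arcname=path.name)
    for path in old_files:
        path.unlink()
    return [p.name for p in old_files]
```

Store the archive file outside the directory that is being cleaned, so that a later run does not pick up the archive itself as a file to remove.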