Use the Oozie Workflow Activity stage
to invoke Oozie workflows from the Designer client.
An Oozie workflow is a collection of actions that are arranged in a
control dependency directed acyclic graph (DAG), in which an action cannot
start until the actions that it depends on are complete. These actions are
computation tasks that are written in Jaql, MapReduce, or other frameworks
that you use to write applications to process large amounts of data.
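To illustrate what a control dependency means, the following Python sketch models a workflow as a set of actions, each of which lists the actions that must finish before it can start. The action names are hypothetical, and the sketch uses the Python standard library only; it does not call Oozie or reflect how Oozie schedules actions internally.

```python
from graphlib import TopologicalSorter

# Hypothetical actions. Each key maps to the set of actions that must
# complete before that action is allowed to start.
dependencies = {
    "load_raw_data": set(),
    "clean_records": {"load_raw_data"},
    "aggregate_totals": {"clean_records"},
}


def execution_order(deps):
    """Return the actions in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Because each action in this example depends on the one before it, only one run order is valid.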
Attention: This stage can be installed
on Red Hat Linux 64-bit systems only.
The Oozie
Workflow Activity stage contains the following fields:
- Oozie Server
- Specify the URL for the Oozie server to connect to. For example, http://myserver:8280/oozie,
where myserver is the name of the Oozie server
that you are connecting to. The port number 8280 is
used for InfoSphere BigInsights, but differs depending on the Hadoop
system that you use.
- Workflow Definition Path
- Enter the location of the workflow definition that you want to
run. A workflow definition is a programmatic description of a workflow
in XML format.
- You must include the full path to the workflow definition, which
is defined as hdfs://host:Hadoop_port/HDFS_application_path/workflow_directory.
For example, hdfs://mymachine.host.com:8280/user/biadmin/workflows/aasd-098098098.xml.
- host
- The machine where your Hadoop application is installed.
- Hadoop_port
- The port number that your Hadoop application is listening on.
- HDFS_application_path
- The application path to your Hadoop distributed file system application.
- workflow_directory
- The directory that contains the workflow.xml that
you want to run.
- Do not checkpoint job run
- Select this option if you do not want checkpoint information to be
recorded for this activity. If an activity later in the sequence fails
and the sequence is restarted, this activity then runs again, regardless
of whether it finished successfully in the original run. This option is
available only if checkpoint information is recorded for the entire
sequence job.
- Workflow Parameters
- Enter values for any parameters that the activity requires. The
grid displays all parameters that are expected by the activity.
- Name
- Type a name for your parameter.
- Value Expression
- Type an expression that specifies a value for the parameter. Literal
values must be enclosed in inverted commas.
- You can also click the browse arrow to open a window that displays
all available parameters and arguments that occur in the sequence
job before the current activity. This window shows parameters that
you define for the job sequence on the Parameters page
of the Job Sequence Properties window. Choose
the required parameter or argument and click OK.
You can use this feature to determine control flow through the sequence.
- Enter -pf file_name to
specify a parameter file that the activity reads at run time, where file_name is
the name of the parameter file. All parameters contained in this file
will be added to the workflow configuration.
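The values for the Oozie Server and Workflow Definition Path fields follow fixed URL patterns, so it can help to assemble and check them before you enter them in the stage. The following Python sketch is one way to do that. The function names are hypothetical, the split of the example path into an application path and a workflow directory is an assumption, and the check requires an explicit port only because the examples in this topic include one. The sketch does not connect to Oozie or HDFS.

```python
from urllib.parse import urlparse


def build_workflow_path(host, hadoop_port, hdfs_application_path, workflow_directory):
    """Assemble hdfs://host:Hadoop_port/HDFS_application_path/workflow_directory."""
    # Strip stray slashes so the parts join with exactly one separator.
    app_path = hdfs_application_path.strip("/")
    workflow = workflow_directory.strip("/")
    return f"hdfs://{host}:{hadoop_port}/{app_path}/{workflow}"


def check_url(url, expected_scheme):
    """Return True if the URL has the expected scheme, a host, and a numeric port."""
    parts = urlparse(url)
    try:
        port = parts.port  # raises ValueError for a non-numeric port
    except ValueError:
        return False
    return (
        parts.scheme == expected_scheme
        and bool(parts.hostname)
        and port is not None
    )


# Example values taken from this topic; the way the path is split is assumed.
workflow_path = build_workflow_path(
    "mymachine.host.com", 8280, "user/biadmin", "workflows/aasd-098098098.xml"
)
print(workflow_path)
print(check_url("http://myserver:8280/oozie", "http"))
print(check_url(workflow_path, "hdfs"))
```

A value that fails the check, such as a URL with no port, is worth correcting before you run the sequence job.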