Extended data sources

Extended data sources are modeled objects in the metadata repository that represent different data sources.

By creating and importing extended data sources, you can capture metadata that is not written to disk or that, for some other reason, cannot be imported into the catalog. You can then use the extended data sources in extension mapping documents to track and report on the flow of information to and from the extended data sources and other assets.

You define extended data sources in a comma-separated values (CSV) file. During the import process, the metadata of the extended data source assets is imported into the catalog. However, the CSV file itself is not imported into the catalog.

Types of extended data sources

There are three major types of extended data source assets:
  • Applications
  • Stored procedure definitions
  • Files

You have flexibility to use each type of extended data source to represent the various types of metadata in your enterprise. The following definitions include suggested usage.

Application
Represents a program that performs a specific function directly for the user or, in some cases, for another application.
For example, an application might be a database program, communication program, or SAP program that interacts with a corporate database. You can use an application extended data source to loosely model SAP or other data programs.

Applications can have one or more object types.

Object type
A grouping of methods or a defined data format that characterizes the input and output structures within a single application. For example, an object type could represent a common feature or business process within an application.
An object type belongs to a single application. An object type can have multiple methods. The identity of an object type is application.objectType.
Method
A function or procedure that is defined within an object type to perform an operation. Operations pass or receive information as input parameters or output values. For example, a method can be a specific operation or procedure call for reading or writing data through the application and object type. You could also use a method to represent the equivalent of a database table, while the input parameters and output values represent columns of the table.
A method belongs to a single object type. A method can have multiple input parameters and output values. The identity of a method is application.objectType.method.
Input parameter
Input parameters are the most common way to deliver information from a client to a method. Methods require information from the client to perform their intended function. This information can be in the form of presentation options for a report, selection criteria for data to be analyzed, individual columns, or many other possibilities. For example, MONTH and YEAR could be input parameters for a method that analyzes monthly sales data.
An input parameter belongs to a single method. The identity of an input parameter is application.objectType.method.inputParameter.
Output value
Methods retrieve and return data to the client or application in the form of output values. Output values can represent the returned value for the database column or data file field. For example, JANUARY and 2000 could be output values for a method that analyzes monthly sales data.
An output value belongs to a single method. The identity of an output value is application.objectType.method.outputValue.
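
The identity pattern described above (application.objectType.method, followed by an input parameter or output value) can be illustrated with a minimal Python sketch. The class names, the `identities` helper, and the sample names `SalesApp`, `Reporting`, `MonthlySales`, and `TOTAL` are hypothetical and are not part of the product. The sketch also assumes that an application's identity is simply its name and that names do not contain dots.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Method:
    name: str
    input_parameters: list[str] = field(default_factory=list)
    output_values: list[str] = field(default_factory=list)


@dataclass
class ObjectType:
    name: str
    methods: list[Method] = field(default_factory=list)


@dataclass
class Application:
    name: str
    object_types: list[ObjectType] = field(default_factory=list)

    def identities(self) -> list[str]:
        """Return the dotted identity of every asset in this application."""
        # Assumption: the application itself is identified by its name alone.
        result = [self.name]
        for object_type in self.object_types:
            type_id = f"{self.name}.{object_type.name}"
            result.append(type_id)
            for method in object_type.methods:
                method_id = f"{type_id}.{method.name}"
                result.append(method_id)
                # Input parameters and output values both belong to the method.
                for leaf in method.input_parameters + method.output_values:
                    result.append(f"{method_id}.{leaf}")
        return result


# Hypothetical example: a method that analyzes monthly sales data.
app = Application(
    "SalesApp",
    [ObjectType("Reporting", [Method("MonthlySales", ["MONTH", "YEAR"], ["TOTAL"])])],
)
for identity in app.identities():
    print(identity)
```

Because each child belongs to exactly one parent, every identity is the parent's identity plus one more dotted segment, which is why a simple nested loop is enough to enumerate them.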
Stored procedure definition
Stored procedures are routines that are available to applications that access database systems, and are stored within the database system. Stored procedures consolidate and centralize complex logic and SQL statements, and might update, append, or retrieve data. For example, stored procedures are used to control transactions as condition handlers or programs, and in some cases resemble ETL transactions when they update data.
The extended data source that represents stored procedures is called a stored procedure definition to distinguish it from the stored procedure assets that are saved by IBM® InfoSphere® DataStage® and QualityStage®.

A stored procedure definition can have multiple in parameters, out parameters, inOut parameters, and result columns.

In parameter
An in parameter carries information that is required for the stored procedure to perform its intended function. For example, variables passed to the stored procedure are in parameters.
An in parameter belongs to a single stored procedure definition. The identity of an in parameter is storedProcedureDefinition.inParameter.
Out parameter
An out parameter represents the value or variable returned when a stored procedure executes. For example, a field included in the result set of the stored procedure can be an out parameter.
An out parameter belongs to a single stored procedure definition. The identity of an out parameter is storedProcedureDefinition.outParameter.
InOut parameter
You use inOut parameters when a stored procedure requires information from the client to perform its intended function, and then processes and returns the same information. For example, an inOut parameter could be a variable that the stored procedure processes or aggregates and returns to the calling application.
An inOut parameter belongs to a single stored procedure definition. The identity of an inOut parameter is storedProcedureDefinition.inOutParameter.
Result column
Result columns represent the data values that a stored procedure returns when it queries or processes data in a database.
A result column belongs to a single stored procedure definition. The identity of a result column is storedProcedureDefinition.resultColumn.
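
All four kinds of child asset share the same two-part identity, storedProcedureDefinition.childName. The following Python sketch shows one way to derive those identities. The function `child_identities`, the `CHILD_KINDS` tuple, and the sample names `UpdateInventory`, `itemId`, `quantity`, and `warehouse` are hypothetical, chosen only for illustration, and the sketch assumes that names do not contain dots.

```python
from __future__ import annotations

# The four child asset kinds named in the text above.
CHILD_KINDS = ("inParameter", "outParameter", "inOutParameter", "resultColumn")


def child_identities(
    definition: str, children: dict[str, list[str]]
) -> dict[str, list[str]]:
    """Map each child kind to the dotted identities of its members."""
    unknown = set(children) - set(CHILD_KINDS)
    if unknown:
        raise ValueError(f"unknown child kinds: {sorted(unknown)}")
    return {
        kind: [f"{definition}.{name}" for name in children.get(kind, [])]
        for kind in CHILD_KINDS
    }


# Hypothetical stored procedure definition with three kinds of children.
ids = child_identities(
    "UpdateInventory",
    {
        "inParameter": ["itemId"],
        "inOutParameter": ["quantity"],
        "resultColumn": ["warehouse"],
    },
)
for kind, values in ids.items():
    print(kind, values)
```

Note that the identity itself does not record the kind of child, so the sketch keeps the kinds in separate lists rather than in one flat list.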
File
A file represents a storage area for capturing, transferring, or otherwise reading data. Files are often the source of ETL transactions and can be loaded and moved by using FTP. The extended data source type file represents files that cannot be imported into the catalog by standard means. Files that can be imported into the catalog are called data files, and are not extended data sources.