Implementation relationships

You can set and explore relationships between logical data models, physical data models, and implemented data resources.

InfoSphere® Information Server tools analyze and transform data that comes from databases and data files. The metadata that describes databases and data files and their contents is stored in the metadata repository as a hierarchy of implemented data resources. In this hierarchy, a host computer contains one or more databases, which contain schemas. The database schemas contain tables, which contain columns. A host can also contain a data file, which contains data file structures. Data file structures contain data file fields, which are the equivalent of columns.
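The containment hierarchy described above can be pictured as nested records. The following is a minimal sketch for illustration only: the class names, field names, and the `column_paths` helper are hypothetical and do not correspond to any InfoSphere Information Server API or to the repository's internal storage format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Column:
    name: str


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)


@dataclass
class Schema:
    name: str
    tables: List[Table] = field(default_factory=list)


@dataclass
class Database:
    name: str
    schemas: List[Schema] = field(default_factory=list)


@dataclass
class DataFileField:
    # The data file equivalent of a column.
    name: str


@dataclass
class DataFileStructure:
    name: str
    file_fields: List[DataFileField] = field(default_factory=list)


@dataclass
class DataFile:
    name: str
    structures: List[DataFileStructure] = field(default_factory=list)


@dataclass
class Host:
    # A host contains databases and can also contain data files.
    name: str
    databases: List[Database] = field(default_factory=list)
    data_files: List[DataFile] = field(default_factory=list)


def column_paths(host: Host) -> List[str]:
    """List every database column as host/database/schema/table/column."""
    return [
        "/".join([host.name, db.name, schema.name, table.name, column.name])
        for db in host.databases
        for schema in db.schemas
        for table in schema.tables
        for column in table.columns
    ]


# Hypothetical sample data.
sample_host = Host(
    "host1",
    databases=[Database("SALES", [Schema("APP", [Table("CUSTOMER", [Column("ID"), Column("NAME")])])])],
    data_files=[DataFile("customers.csv", [DataFileStructure("customer_record", [DataFileField("id")])])],
)
```

Calling `column_paths(sample_host)` walks the database branch of the hierarchy from the host down to each column.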

The organization within these data hierarchies is often determined by decisions that data modelers make when they use design tools such as CA ERwin Data Modeler or InfoSphere Data Architect. As a data modeler, you use a modeling tool to create logical data models that capture the business definition of information assets and the relationships between them. By using the same tool, you can then implement logical data models as physical data models, transforming the logical concepts into the design for a database or sometimes the design for a data file. Finally, you can implement the physical data models as schemas in real-world databases, or as data files. You can also implement the logical data models directly as database schemas.

Although database and data file assets are placed in the category of implemented data resources, they do not always have implementation relationships to data model assets. Some databases, for example, are constructed without reference to a logical data model or a physical data model. In other cases, the databases are imported separately from the data models and the relationship between them is not established unless you set it manually in InfoSphere Metadata Asset Manager.
Figure 1. Implementation relationships between types of assets
The image shows that logical data models can be implemented by physical data models and database schemas. Physical data models can be implemented by database schemas and data files.
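The combinations shown in the figure can be written down as a small lookup table. This is a sketch for illustration only: the asset type strings and the `can_implement` function are hypothetical names chosen for this example, not identifiers used by the product.

```python
# Which asset types can implement which, as described for Figure 1.
# Keys are the implemented type; values are the types that can implement it.
ALLOWED_IMPLEMENTERS = {
    "logical_data_model": {"physical_data_model", "database_schema"},
    "physical_data_model": {"database_schema", "data_file"},
}


def can_implement(implementer_type: str, implemented_type: str) -> bool:
    """Return True if the figure allows implementer_type to implement implemented_type."""
    return implementer_type in ALLOWED_IMPLEMENTERS.get(implemented_type, set())
```

For example, `can_implement("database_schema", "logical_data_model")` is true because a logical data model can be implemented directly by a database schema, while `can_implement("data_file", "logical_data_model")` is false because the figure shows data files implementing physical data models only.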

By importing and storing the metadata for logical and physical data models and implemented data resources, InfoSphere Information Server provides a unified view of your data flow, from the logical conception through multiple transformations in jobs.

Setting implementation relationships during bridge imports

When you use a bridge to import a logical data model and a related physical data model from a design tool, the bridge automatically sets implementation relationships between the corresponding assets in the models. For example, entities and attributes in the logical data model are connected by implementation relationships to corresponding design tables and design columns in the physical data model.

An implementation relationship is a two-way relationship:
  • A logical data model is implemented by a physical data model.
  • A physical data model implements a logical data model.
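The two-way nature of the relationship can be pictured as a pair of indexes that are always updated together, so that the relationship can be read from either side. The sketch below is illustrative only: `ImplementationRegistry` and its methods are hypothetical names, not part of any InfoSphere Information Server API.

```python
from collections import defaultdict


class ImplementationRegistry:
    """Records implementation relationships so they can be read in both directions."""

    def __init__(self):
        self._implemented_by = defaultdict(set)  # implemented asset -> its implementers
        self._implements = defaultdict(set)      # implementer -> assets it implements

    def link(self, implemented: str, implementer: str) -> None:
        # A single call records both directions of the relationship.
        self._implemented_by[implemented].add(implementer)
        self._implements[implementer].add(implemented)

    def implemented_by(self, asset: str) -> set:
        """Assets that implement the given asset."""
        return set(self._implemented_by.get(asset, ()))

    def implements(self, asset: str) -> set:
        """Assets that the given asset implements."""
        return set(self._implements.get(asset, ()))
```

After `registry.link(implemented="ldm:Customer", implementer="pdm:CUSTOMER")`, both `registry.implemented_by("ldm:Customer")` and `registry.implements("pdm:CUSTOMER")` report the relationship. The asset identifiers here are made up for the example.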

When you import a physical data model from a design tool, you can choose to create an additional set of implemented data resources, based on identity parameters that you specify during the import. This process transforms the physical data model into a database schema with database tables and database columns that correspond to the design tables and design columns in the model. Both the implemented data resources and the physical data model assets are saved to the metadata repository. The database tables and columns can be used by developers of InfoSphere DataStage® and QualityStage® jobs. The bridge sets implementation relationships between the physical data model assets and the corresponding assets in the database schema.

You can use these implementation relationships to trace the source of your metadata back to its logical origin in a design tool. For example, by using data lineage in InfoSphere Metadata Workbench, you can trace the definition of a database column back to the design column that it implements. For each column, you can view the related entity attribute or design column. You can also view and follow these relationships in InfoSphere Metadata Asset Manager. Following the relationships gives you an end-to-end perspective on your information flow that can alert you to changes in your data structure. For example, if you change the properties of an entity attribute in the logical data model, you can see which database columns are affected downstream by the change. You can also trace implementation relationships to ensure that all the entity attributes in a logical data model are implemented as database columns.
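The three uses described above (tracing a column back to its origin, finding the columns affected by a change, and checking that every entity attribute is implemented) can be sketched over a simple mapping. This is an illustration under stated assumptions, not how the product computes lineage: the function names and asset identifiers are hypothetical, and each asset is assumed to implement at most one other asset.

```python
def trace_to_origin(asset, implements):
    """Follow 'implements' links from an asset back to its logical origin."""
    path = [asset]
    seen = {asset}
    while path[-1] in implements:
        parent = implements[path[-1]]
        if parent in seen:  # guard against accidental cycles in the mapping
            break
        path.append(parent)
        seen.add(parent)
    return path


def affected_columns(attribute, implements, database_columns):
    """Database columns whose trace passes through the given entity attribute."""
    return sorted(
        column for column in database_columns
        if attribute in trace_to_origin(column, implements)
    )


def unimplemented_attributes(attributes, implements, database_columns):
    """Entity attributes that no database column traces back to."""
    covered = set()
    for column in database_columns:
        covered.update(trace_to_origin(column, implements))
    return sorted(a for a in attributes if a not in covered)


# Hypothetical sample data: implementer -> the asset that it implements.
implements = {
    "db.CUSTOMER.NAME": "pdm.CUSTOMER.NAME",   # database column -> design column
    "pdm.CUSTOMER.NAME": "ldm.Customer.name",  # design column -> entity attribute
    "db.CUSTOMER.ID": "ldm.Customer.id",       # database column -> entity attribute
}
database_columns = ["db.CUSTOMER.ID", "db.CUSTOMER.NAME"]
attributes = ["ldm.Customer.id", "ldm.Customer.name", "ldm.Customer.email"]
```

With this sample data, `trace_to_origin("db.CUSTOMER.NAME", implements)` passes through the design column before it reaches the entity attribute, and `unimplemented_attributes(attributes, implements, database_columns)` reports the one attribute that has no database column behind it.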

Additional ways of setting implementation relationships

You can manually set relationships between logical data models, physical data models, and implemented data resources that are stored in the metadata repository. Setting implementation relationships manually is useful when you import logical data models and physical data models separately, or when you import database or data file assets and you want to indicate that they implement logical or physical data model assets.

On the Repository Management tab of InfoSphere Metadata Asset Manager, you can specify that a logical data model is implemented by one or more physical data models or database schemas. You can also specify that a physical data model is implemented by one or more database schemas or data files. You can set additional implementation relationships between the contained assets. For example, if you create an implementation relationship between a logical data model and a database schema, you can specify that a logical entity in the data model is implemented by one or more database tables in the schema.

When you analyze a data source in InfoSphere Information Analyzer, you can use the analysis results to create a physical data model. Creating the model in this way sets implementation relationships between corresponding assets in the physical data model and the database schema that was analyzed. You can then create a rule, associate the physical data model assets with default rule bindings, and use the implementation relationships to view which database tables or database columns you can bind to.