IBM Support

New features and changes in InfoSphere Information Server, Version 11.7

Question/Answer


Question

What new functionality has been added with rollup patches or fix packs since the release of InfoSphere Information Server, Version 11.7?

Answer

Latest information:
Information Server general updates
Governance updates
Data Quality updates
Integration updates
Administration and management updates
Deprecated features


Governance updates:

Information Governance Catalog New
InfoSphere Information Governance Catalog
Information Server Enterprise Search
Information Server Governance Monitor


Information Server general updates


Information Governance Catalog New


New in 11.7 Fix Pack 1 Service Pack 1

  • A graphical view of term-to-term relationships is added to the details page of terms.
  • When you explore asset relationships in the relationship graph, you can use filters to hide asset types and relationship types.
  • When you display the details page for information assets, you can open the data lineage and business lineage viewers.
  • You can edit and delete information assets in the catalog.


New in 11.7 Fix Pack 1

  • You can display collections, and add assets to collections.
  • You can run queries on assets.
  • New cards are available for the home page: assets in a selected collection and assets in a selected query.
  • In addition to customizing the details page of assets, you can also select which asset types are displayed.
  • When you search for assets in the catalog, you can filter search results by using custom attributes.
  • When you view relationships between assets in the graph explorer, you can close the expanded nodes.
  • In the Monitoring tab, you can view Curator and Quality dashboards to track and review important metrics about your data.
  • You can preview source data for database tables and data files from the asset details page.


New in 11.7

  • This version of Information Governance Catalog has a new user interface that improves the user experience.
  • You can search for assets in your entire enterprise by using enhanced search that takes into account factors like text match, related assets, ratings and comments, modification date, quality score, and usage.
  • You can browse the assets in your catalog and narrow down the results by using filters.
  • You can use producers, which are applications that collect relevant data on systems like Db2®, Hive, Hadoop Distributed File System (HDFS), Oracle, or Teradata, to improve the quality of search results.
  • You can use unstructured data sources to enrich your data with information that has no clear structure - emails, instant messages, word-processing documents, or audio or video files.
  • You can view a graphical depiction of relationships between assets in the graph explorer.
  • You can add comments and ratings to assets.
  • On the home page, you can display the most important information, such as the assets with the highest ratings, your collections, or information asset statistics.
  • Administrators can create custom profiles to display customized details pages for assets.


 

Information Server Enterprise Search


New in 11.7 Fix Pack 1

  • When you view relationships between assets in the graph explorer, you can close the expanded nodes.


New in 11.7

  • Information Server Enterprise Search is a stand-alone application which enables you to explore data in your enterprise.
  • You can search for assets in your entire enterprise by using enhanced search that takes into account factors like text match, related assets, ratings and comments, modification date, quality score, and usage.
  • You can use producers, which are applications that collect relevant data on systems like Db2®, Hive, Hadoop Distributed File System (HDFS), Oracle, or Teradata, to improve the quality of search results.
  • When you view basic information about an asset in the search results, you can open the details page in the application of origin such as Information Governance Catalog.
  • You can view a graphical depiction of relationships between assets in the graph explorer.
  • You can add comments and ratings to assets.


InfoSphere Information Governance Catalog


New in 11.7 Fix Pack 1

  • While creating custom attributes, you can add restrictions when you do not want a custom attribute to be available for all assets of the selected type.


New in 11.7

  • You can discover a new asset family: unstructured data sources. These assets are synchronized from IBM StoredIQ and represent data with no clear structure. The new asset types are instances, volumes, infosets, and filters.
  • You can export your own asset types that you added to Information Governance Catalog by using REST API bundles. You can later reimport such assets and merge them with the existing ones.
  • When you use the REST API, two new properties are displayed in the basic information about an asset: asset group and class name.
  • When you search for assets by using the REST API, you can specify context, such as the name of a host, to narrow down search results.
  • When you search for assets by using the REST API, you can use the new parameter IncludeHistory to specify whether the asset history (if applicable) is included in the details of returned assets.
  • You can create, edit, and delete custom attributes by using the REST API.
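
The IncludeHistory parameter mentioned above could be passed along with a search request roughly as follows. This is a minimal sketch that only builds a URL and sends nothing. The host name, the base path, the `search` path segment, the `types` and `text` parameters, and the `true`/`false` value format are all assumptions made for illustration; only the parameter name IncludeHistory comes from the notes above. Check the Information Governance Catalog REST API documentation for the actual request shape.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_search_url(base_url, asset_type, text, include_history=False):
    """Compose a search URL.

    Everything except the IncludeHistory parameter name is a hypothetical
    placeholder: the path, the other parameters, and the value format.
    """
    query = {
        "types": asset_type,   # assumed parameter name
        "text": text,          # assumed parameter name
        "IncludeHistory": "true" if include_history else "false",
    }
    return f"{base_url.rstrip('/')}/search?{urlencode(query)}"

# Hypothetical host and base path, for illustration only.
url = build_search_url(
    "https://iis.example.com:9443/ibm/iis/igc-rest/v1",
    "term",
    "customer",
    include_history=True,
)
print(url)
```

Keeping the flag as an explicit function argument makes it easy to request history only when the caller needs it.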


Information Server Governance Monitor


New in 11.7

  • Information Server Governance Monitor is a tool where you can monitor the status and health of the data in your enterprise.
  • On the Curation Dashboard, you can see whether the data in your enterprise is cataloged, classified and governed.
  • On the Quality Dashboard, you can review the overall quality of the data in your enterprise, including scoring and quality dimensions.



Data Quality updates


InfoSphere Information Analyzer

InfoSphere Information Analyzer

New in 11.7 Fix Pack 1 Service Pack 1

  • You can run data rules on files located in Hadoop by using Spark. Note: This is a technology preview and is not yet supported for use in production environments.
  • You can delete frequency distribution results by using the IAAdmin -deleteFrequencyDistribution option to avoid exceeding the limit on the total number of objects per tablespace.
  • The following GDPR-related classifiers are added for most of the European Union countries: ProvinceCode, ProvinceName, TAXID, and Phone numbers.
  • New data sources are supported: Avro, MySQL, PostgreSQL, and MongoDB.


New in 11.7 Fix Pack 1 Patch JR59564

  • Avro files are supported in Information Analyzer. To download the patch, go to IBM Fix Central.


New in 11.7 Fix Pack 1

  • You can run a primary key analysis on selected data sets in your workspace to identify single or compound primary keys in your data sets.
  • When terms are automatically assigned to information assets, a machine learning mechanism learns continuously, so it gives better suggestions and automatically assigns terms with higher confidence.
  • On an Oracle database, you can generate a single frequency distribution table with the analysis results to manage the number of objects created when you run a column analysis.
  • You can run the column analysis and quality analysis of files located in Hadoop by using Spark. Note: This is a technology preview and is not yet supported for use in production environments.
  • Hive is supported as an analysis database (IADB) for jobs that are executed in Spark.
  • You can run column analysis on the column level in the thin client.
  • You can analyze HDFS files that use ORC and Parquet storage formats.
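
The primary key analysis mentioned above looks for single columns or column combinations whose values identify each row uniquely. The following is a minimal sketch of that general idea on in-memory rows, not Information Analyzer's implementation. The function name, the `max_columns` limit, and the sample data are hypothetical, and a real analysis would also have to account for nulls, sampling, and data volume.

```python
from itertools import combinations

def find_candidate_keys(rows, max_columns=2):
    """Return minimal column combinations whose values are unique across rows.

    rows is a list of dicts that share the same keys. A combination is
    skipped when it contains a smaller key that was already found.
    """
    if not rows:
        return []
    columns = list(rows[0].keys())
    found = []
    for size in range(1, max_columns + 1):
        for combo in combinations(columns, size):
            if any(set(key) <= set(combo) for key in found):
                continue
            values = [tuple(row[c] for c in combo) for row in rows]
            if len(set(values)) == len(values):
                found.append(combo)
    return found

rows = [
    {"order_id": 1, "line_no": 1, "sku": "A"},
    {"order_id": 1, "line_no": 2, "sku": "B"},
    {"order_id": 2, "line_no": 1, "sku": "A"},
]
print(find_candidate_keys(rows))
```

In this sample no single column is unique, so only compound candidates are reported.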


New in 11.7

  • You can create and run automation rules to automate the process of applying and running rule definitions and quality dimensions.
  • Terms can be automatically assigned to data sets and columns when you use automated discovery to import data or when you run a column analysis.
  • You can run automated discovery to import and analyze new data sets from a data connection. With one click you can register data sets and add metadata from the data connection to a default workspace, run column and quality analysis, and assign business terms to imported assets.
  • You can create new data quality dimensions in order to detect if a row, column, or cell contains a particular data quality problem.
  • On a Db2 database, when you run column analysis, a single frequency distribution table can be generated for each project rather than each column in the analysis.
  • You can use different credentials for running analysis jobs than the ones you use to import data by using InfoSphere Metadata Asset Manager. You can set the credentials for a specific project.
  • When you run quality and column analysis, you can update settings for a specific analysis. For example, you can run the analysis with a data sample, specify how you want analysis statistics generated and stored, or apply advanced engine settings.
  • You can run stored procedures for SQL Server and Oracle to get the analysis results from the Information Analysis Database.
  • You can export the results of the overlap analysis.
  • A new data class type, 'Script', is introduced. By using this script classifier, you can classify your data by creating a custom script snippet.


Integration updates


IBM DataStage Flow Designer
InfoSphere DataStage
IBM InfoSphere DataStage Container
Connectivity
InfoSphere Information Server on Hadoop

IBM DataStage Flow Designer

New in patch_July2018_DFD_all_11700

  • New stages: Aggregator, Row Generator, Tail, File connector and Netezza connector.
  • Simplified UI for Transformer stage.
  • When a Kafka connector is added as a source, a single-column table definition is added automatically.
  • Jobs in the jobs dashboard can be grouped by clusters that are formed by using machine learning techniques. Administrators can manage the machine learning thread execution from the projects dashboard.
  • DataStage Flow Designer supports Microsoft Edge browser version 41 and later.
  • Job parameters of the type 'List' are supported.


New in 11.7 Fix Pack 1

  • Jobs, connections, and table definitions can be renamed from the corresponding dashboards.
  • Integration with GitHub allows you to publish jobs and related artifacts to GitHub and load different job versions from GitHub to the canvas. You can create new branches, map a GitHub version to a version in the XMETA repository, and satisfy CI/CD and auditing requirements.
  • You can create, edit, and delete job parameters, such as encrypted, date, integer, float, pathname, date and time, and configuration files.
  • You can create, edit, and delete connections.
  • You can preview a sample of data from relational connectors, by using a live connection, and from sequential files.
  • You can rename links and stages from the Details card.
  • Using machine learning, Smart Palette arranges the most frequently used connectors and stages on top.
  • You can map columns for each column in an output link.
  • The connectors Amazon S3, SQL Server, and Greenplum are supported.
  • The stages Peek, Lookup, Compress, Expand, and Head are supported.
  • You can open a locked job in read-only mode.
  • You can create and delete projects.
  • You can add users to a project, remove users from a project, and assign roles to users.
  • Runtime column propagation is supported. When you define part of your schema, you can specify that, if your job encounters extra columns that are not defined in the metadata when it actually runs, it will adopt these extra columns and propagate them through the rest of the job.
  • You can load columns from table definitions in any stage. You can append or replace existing columns and have them automatically propagated to downstream stages.
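
The runtime column propagation behavior described above can be pictured with a small sketch: a stage works on the columns it knows about and, when propagation is on, passes any extra columns through unchanged. This is only an illustration of the concept under simplified assumptions, not how the DataStage engine is implemented. The names `run_stage`, `propagate_extra`, and `uppercase_name` are hypothetical.

```python
def run_stage(record, defined_columns, transform, propagate_extra=True):
    """Apply transform to the defined columns of one record.

    When propagate_extra is true, columns that are not in defined_columns
    are copied to the output unchanged, which mimics runtime column
    propagation. When it is false, those columns are dropped.
    """
    known = {name: record[name] for name in defined_columns if name in record}
    output = transform(known)
    if propagate_extra:
        for name, value in record.items():
            if name not in defined_columns and name not in output:
                output[name] = value
    return output

def uppercase_name(cols):
    # The stage logic only knows about the id and name columns.
    return {"id": cols["id"], "name": cols["name"].upper()}

record = {"id": 7, "name": "ada", "region": "EMEA"}  # region is not in the schema
print(run_stage(record, ["id", "name"], uppercase_name))
print(run_stage(record, ["id", "name"], uppercase_name, propagate_extra=False))
```

With propagation on, the undeclared `region` column reaches the output. With it off, only the declared columns do.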


New in 11.7

  • You can use DataStage Flow Designer, a web-based tool, to create, edit, load, and run DataStage jobs.
  • You can search for jobs by using built-in search.
  • Metadata is automatically propagated to subsequent stages in a flow.
  • When compilation errors occur, all of them are highlighted and a hover over each stage provides details, which makes it easier to correct the errors.
  • If you are new to the product, you can take a quick tour to learn how to use its features.


 

InfoSphere DataStage


New in 11.7

  • By using DataStage Product Insights, you can connect your InfoSphere DataStage installation to IBM Cloud Product Insights. This connection gives you the ability to review your installation details and metrics like CPU usage, memory usage, active jobs, jobs that failed, and jobs that completed.
  • Data Masking stage supports Optim Data Privacy Providers version 11.3.0.5.

 

IBM InfoSphere DataStage Container

New in 11.7 Fix Pack 1 Service Pack 1

  • A separate NFS server can be designated during an in-place installation when you use the Kubernetes master node as the file server.
  • Ability to upgrade containers from 11.7.0.1-1.0 to 11.7.0.1-2.0.
  • Support for product consumption metering using IBM Cloud Private Metering Service.


Connectivity

New in 11.7 Fix Pack 1

  • Cassandra connector is supported. It has the following features:
    • The DataStax Enterprise (DSE) data platform, which is built on Apache Cassandra, is supported. The DataStax Enterprise Java Driver is used to connect to the Cassandra database.
    • The connector reads data from and writes data to a Cassandra database in parallel and sequential modes.
    • You can provide an SQL statement (SELECT, INSERT, UPDATE, DELETE) to read or write data.
    • Reading and writing data of a single column in JSON format is supported.
    • Custom types of codecs are supported.
    • You can specify a consistency level for each read query and write operation.
    • You can modify data by inserting, updating, or deleting an entire row or a set of specified columns.
  • Azure Storage connector is supported. You can use it to connect to Azure Blob Storage and Azure File Storage and perform the following operations:
    • Read data from or write data to Azure Blob and File Storage.
    • Import metadata about files and folders in Azure Blob Storage and Azure File Storage.
  • HBase connector supports the following features:
    • Metadata can be imported on the table level and higher.
    • You can run data lineage on a table level.
  • ILOG JRules connector supports Decision Engine rules in all engine modes (Core, J2EE, J2SE).
  • Kafka connector supports the following features:
    • The connector supports new secure Kafka connections, including SASL/PLAIN, SASL/SSL, and SSL with user authentication.
    • Apart from String/Varchar, new message types are supported: Integer, Small Integer, Double and Byte array.
    • Kafka partitioning - fetching key and partition number from Kafka - is supported. The partitioning type is preserved for the write mode.
  • Data lineage for the Hive connector is enhanced in the following ways:
    • Data flow to the column level is supported.
    • The following URL formats of JDBC drivers are supported: jdbc:ibm:hive (DataDirect) and jdbc:hive2. For the jdbc:ibm:hive driver version, the database name is 'ibm'. For the jdbc:hive2 driver version, the database name is set by using the entire URL, which is JDBC default behavior.
    • You can use the URL attribute Database to set the database schema.
    • You can use the URL path as the database schema.
  • Db2 connector supports Db2 12 for z/OS.
  • SFDC API 42 is supported.
  • Sybase IQ 16.1 is supported.
  • Greenplum connector supports Greenplum database 5.4.
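
The Hive lineage notes above say that the database schema can come from a URL attribute named Database or from the URL path. A minimal sketch of reading a schema out of the two listed URL formats might look like the following. It is not the connector's lineage logic. The function name, the `;`-separated attribute syntax, the precedence of the attribute over the path, and the example host names are assumptions for illustration.

```python
def schema_from_hive_url(url):
    """Guess the schema from a Hive JDBC URL.

    Assumed rules: a Database=<name> attribute wins if present; otherwise
    the URL path (the part after host:port/) is used; otherwise None.
    """
    for prefix in ("jdbc:ibm:hive://", "jdbc:hive2://"):
        if url.startswith(prefix):
            rest = url[len(prefix):]
            break
    else:
        raise ValueError("unsupported Hive JDBC URL prefix")

    # Attributes are assumed to follow the location, separated by ';'.
    location, *attributes = rest.split(";")
    for attribute in attributes:
        name, _, value = attribute.partition("=")
        if name.strip().lower() == "database" and value:
            return value.strip()

    # Fall back to the path after host:port, ignoring any '?' suffix.
    _, _, path = location.partition("/")
    path = path.split("?")[0]
    return path or None

print(schema_from_hive_url("jdbc:hive2://hive.example.com:10000/sales"))
print(schema_from_hive_url("jdbc:ibm:hive://hive.example.com:10000;Database=finance"))
```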


New in 11.7

  • HBase connector is supported. You can use HBase connector to connect to tables stored in the HBase database and perform the following operations:
    • Read data from or write data to HBase database.
    • Read data in parallel mode.
    • Use HBase table as a lookup table in sparse or normal mode.
    • Kerberos keytab locality is supported.
  • Hive connector supports the following features:
    • Modulus partition mode and minimum maximum partition mode during the read operation are supported.
    • Kerberos keytab locality is supported.
    • Connector supports connection to Hive on Amazon EMR.
  • Kafka connector supports the following features:
    • Continuous mode, where incoming topic messages are consumed without stopping the connector.
    • Transactions, where a number of Kafka messages is fetched within a single transaction. After the record count is reached, an end-of-wave marker is sent to the output link.
    • TLS connection to Kafka.
    • Kerberos keytab locality is supported.
  • IBM MQ version 9 is supported.
  • IBM InfoSphere Data Replication CDC Replication version 11.3 is supported.
  • SQL Server Operator supports SQL Server Native Client 11.
  • Sybase Operator supports unichar and univarchar datatypes in Sybase Adaptive Server Enterprise.
  • Amazon S3 connector supports connecting by using an HTTP proxy server.
  • File connector supports the following features:
    • Native HDFS FileSystem mode is supported.
    • You can import metadata from the ORC files.
    • New data types are supported for reading and writing the Parquet formatted files: Date / Time and Timestamp.
  • JDBC connector is certified to connect to MongoDB and Amazon Redshift.
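
The modulus and minimum maximum partition modes listed for the Hive connector split a read across processing nodes. The sketch below shows the general idea behind each mode by generating one filter predicate per node. It is an illustration under stated assumptions, not the SQL that the connector actually generates. The function names, the integer key column, and the predicate syntax are hypothetical.

```python
def modulus_predicates(column, node_count):
    """One predicate per node: a row goes to node (column value mod node_count)."""
    return [f"{column} % {node_count} = {node}" for node in range(node_count)]

def min_max_predicates(column, minimum, maximum, node_count):
    """One predicate per node: [minimum, maximum] is cut into contiguous slices."""
    span = maximum - minimum + 1
    step = -(-span // node_count)  # ceiling division
    predicates = []
    for node in range(node_count):
        low = minimum + node * step
        high = min(low + step - 1, maximum)
        if low > maximum:
            break  # fewer values than nodes: remaining nodes get nothing
        predicates.append(f"{column} BETWEEN {low} AND {high}")
    return predicates

print(modulus_predicates("id", 3))
print(min_max_predicates("id", 1, 10, 3))
```

Modulus partitioning spreads rows evenly even when key values are clustered, while range slices keep neighboring keys on the same node.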


InfoSphere Information Server on Hadoop


New in 11.7 Fix Pack 1 Service Pack 1

  • When you run a job with checkpointing enabled, you can automatically restart failed jobs from intermediate checkpoint data. Two new options are supported: one for calculating the amount of disk space that is required for such jobs, and the other to delay the automatic restart of the failed jobs.
  • Ambari quick links and statistics for Information Server Service were added in InfoSphere DataStage.
  • Ability to upgrade Docker containers from 11.7.0.1-1.0 to 11.7.0.1-2.0.


New in 11.7 Fix Pack 1

  • When you run InfoSphere Information Server on Hadoop, the size of binaries that are distributed to data nodes is reduced by 500 MB.
  • The deployment of a Kerberos-enabled cluster in Information Server is simplified.
  • You can set a checkpoint so that InfoSphere DataStage jobs automatically restart after a failure. This feature works as long as no data is written to the target.
  • An improved allocation of YARN jobs to containers. YARN jobs no longer fail while waiting to be allocated by the resource manager to new containers.
  • Integration with Ambari to show BigIntegrate log information and monitoring metrics.
  • The Big Data File Stage (BDFS) can now be enabled with Kerberos. Note: This is not supported when running jobs on Hadoop.
  • You can configure the Complex Flat File stage (CFF) so that it can read files from an HDFS system by setting the APT_IMPEXP_HDFS_USER environment variable.
  • Low capacity on a scratch disk (a local disk that is used for the temporary storage of data) is now detected.
  • You can have clusters in both Red Hat Enterprise Linux 6 and 7 versions in your environment.


New in 11.7

  • The following Hadoop distributions are supported: MapR 5.2.2, Cloudera CDH 5.13, and Hortonworks HDP 2.6.2.
  • Parallel engine configuration files on YARN can include nodes that are not part of the Hadoop cluster, so Hadoop jobs can access relational databases outside the cluster.
  • Parallel jobs on YARN can use Hadoop Shuffle space for scratch files.
  • Parallel jobs give clearer messages when they are preempted by the YARN resource manager.
  • Ambari scripts have been enhanced to improve administration.



Administration and management updates


Managing metadata

Managing metadata

New in 11.7

  • You can run automated discovery from the command line to discover the database schemas available for a given database or the files and folders for a given file system.


Deprecated features


Features deprecated in 11.7 Fix Pack 1

  • The following attributes are deprecated in Information Governance Catalog REST API, because they have duplicates:
    • IISMDMCompositeView (Display name: Composite View, REST name: composite_view)
      deprecated attribute: of_Member_for_query (Display name: Member Type, REST name: member_type_for_query)
      duplicate of: of_Member (Display name: Member Type, REST name: member_type)
    • IISMDMEntity (Display name: Entity Type, REST name: entity_type)
      deprecated attribute: of_Member_for_query (Display name: Member Type, REST name: member_type_for_query)
      duplicate of: of_Member (Display name: Member Type, REST name: member_type)
    • IISMDMMember (Display name: Member Type, REST name: member_type)
      deprecated attribute: of_MemberModel_for_query (Display name: MDM Model, REST name: mdm_model_for_query)
      duplicate of: of_MemberModel (Display name: MDM Model, REST name: mdm_model)
    • IISMDMMemberAttribute (Display name: Attribute, REST name: attribute)
      deprecated attribute: of_Member_for_query (Display name: Member Type, REST name: member_type_for_query)
      duplicate of: of_Member (Display name: Member Type, REST name: member_type)
    • IISMDMPhysicalObjectAttribute (Display name: Physical Object Attribute, REST name: physical_object_attribute)
      deprecated attribute: of_PhysicalObject_for_query (Display name: Physical Object, REST name: physical_object_for_query)
      duplicate of: of_PhysicalObject (Display name: Physical Object, REST name: physical_object)
    • IISMDMSegment (Display name: Attribute Type, REST name: attribute_type)
      deprecated attribute: of_MemberModel_for_query (Display name: MDM Model, REST name: mdm_model_for_query)
      duplicate of: of_MemberModel (Display name: MDM Model, REST name: mdm_model)
    • IISMDMSegmentField (Display name: Attribute Type Field, REST name: attribute_type_field)
      deprecated attribute: of_Segment_for_query (Display name: Attribute Type, REST name: attribute_type_for_query)
      duplicate of: of_Segment (Display name: Attribute Type, REST name: attribute_type)
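
Client code that still requests the deprecated REST names can be updated with a simple lookup. The sketch below maps each `_for_query` REST name listed above to the name it duplicates. The mapping values are taken from the list; the helper name and the idea of rewriting a list of requested property names are hypothetical, shown only to illustrate one way to migrate.

```python
# Deprecated REST name -> replacement REST name, taken from the list above.
REPLACEMENTS = {
    "member_type_for_query": "member_type",
    "mdm_model_for_query": "mdm_model",
    "physical_object_for_query": "physical_object",
    "attribute_type_for_query": "attribute_type",
}

def modernize_properties(properties):
    """Swap deprecated property names for their replacements.

    Order is preserved, and a name that appears twice after the swap
    is kept only once.
    """
    renamed = (REPLACEMENTS.get(name, name) for name in properties)
    return list(dict.fromkeys(renamed))

print(modernize_properties(["name", "member_type_for_query", "member_type"]))
```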

Features deprecated in 11.7

  • Using InfoSphere Data Click to move data in IBM InfoSphere BigInsights is no longer supported. In previous releases, InfoSphere Data Click was used to copy selected database tables, data files, data file folders, and Amazon S3 buckets from the catalog to a target distributed file system, such as a Hadoop Distributed File System (HDFS) in IBM InfoSphere BigInsights.
  • The following asset types are deprecated in Information Governance Catalog:
    • Machine Profiles
    • Blueprint Director
    • CDC Mapping Document
    • Warehouse Mapping Document
    • Information Server Reports
    • External Assets
  • Data policy asset type is deprecated in Information Analyzer.

Document information

More support for: InfoSphere Information Server

Software version: 11.7, 11.7.0.1, 11.7.0.1 SP1

Operating system(s): AIX, Linux, Windows

Reference #: 2009615

Modified date: 13 July 2018

