Hortonworks Data Platform V3.0 enhances its suite of solutions, delivering agile data management capabilities and optimized support for deep learning applications to enable faster, smarter, and hybrid data

IBM United States Software Announcement 218-419
August 14, 2018



(Corrected on February 13, 2019)

Revised the "Software Subscription and Support applies" content in the Terms and conditions section.



Overview


Hortonworks Data Platform is a massively scalable, enterprise-ready, and 100% open source platform for storing, processing, and analyzing large volumes of data-at-rest. It is a key component of the modern data architecture and can be deployed both on-premises and on the cloud. Enhancements in Hortonworks Data Platform V3.0 are based on Apache Hadoop 3.1 and include containerization, GPU support, erasure coding, and namenode federation.

Hortonworks Data Platform V3.0 consists of the following essential set of Apache Hadoop projects:

  • Apache Accumulo 1.7.0
  • Apache Atlas 1.0.0
  • Apache Calcite 1.2.0
  • Apache DataFu 1.3.0
  • Apache Hadoop 3.1.0
  • Apache HBase 2.0.1
  • Apache Hive 3.0.0
  • Apache Kafka 1.0.1
  • Apache Knox 1.0.0
  • Apache Livy 0.5
  • Apache Oozie 4.3.1
  • Apache Phoenix 5.0.0
  • Apache Pig 0.16.0
  • Apache Ranger 1.1.0
  • Apache Solr 7 (this component is not supported as part of Hortonworks Data Platform 3.0)
  • Apache Spark 2.3.1
  • Apache Sqoop 1.4.7
  • Apache Storm 1.2.1
  • Apache Tez 0.9.1
  • Apache Zeppelin 0.8.0
  • Apache ZooKeeper 3.4.6

Version 3.0 of Hortonworks Data Platform is available for download from Passport Advantage®.




Key prerequisites


For details, see the Software requirements section.




Planned availability date


September 4, 2018




Description


Hortonworks Data Platform V3.0 is faster, smarter, and hybrid. It provides users with:

  • Agile application deployment through containerization, enabling applications to be launched quickly. This saves time and resources for users and helps them to achieve faster time to market for services and increased developer productivity.
  • Support for deep learning applications, enabling users to run workloads such as machine learning and deep learning that require substantial GPU resources. GPU pooling enables the sharing of GPU resources with more workloads for cost effectiveness.
  • Optimization for the cloud, ensuring automated cloud provisioning to simplify big data deployments while optimizing the use of cloud resources.

New features in Hortonworks Data Platform V3.0 components

Atlas

Core capabilities:

  • Capability to show attributes for time-bound classification or business catalog mapping (ATLAS-2457)
  • Support for business terms and categories
  • Capability to migrate Atlas data in Titan graph DB to JanusGraph DB
  • Atlas HBase hook to capture metadata and lineage
  • Capability to propagate tags from an object to a child object or derivative asset (ATLAS-1821)
  • Storm Atlas hook compatibility with Storm 1.2

Security:

  • Metadata security enhancement that enables authorization based on Classification or Business Catalog mapping (ATLAS-2459)

Druid

Integration:

  • Kafka-Druid ingest. A Kafka topic can now be mapped to a Druid table. The events will be automatically ingested and available for querying in near real-time.

HDFS

Core capabilities:

  • Capability to balance utilization of disks (varied capacities) inside a data node (HDFS-1312)
  • Capability to reduce the storage overhead of HDFS with directory-level Reed-Solomon erasure coding (HDFS-7285)
  • Support for two Standby NameNodes for NameNode High Availability
  • NFS Gateway support in front of ViewFS (file access to the unified namespace)
  • Capability to expose encrypted zones and erasure coded zones by WebHDFS API (HDFS-11394, HDFS-13512)
  • Hive support on Erasure Coded Directories (HDFS-7285)
  • Cloud testing for HDFS; cloud failure modes and availability status
  • Cloud capability to connect or reattach to Elastic Block Volume, enabling the use of centralized block storage for better TCO (versus local disks)
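
The storage saving from erasure coding can be illustrated with simple arithmetic. The following is a minimal sketch, not taken from the announcement: it assumes 3-way replication as the baseline and a Reed-Solomon layout of 6 data units plus 3 parity units, RS(6,3), as the erasure coding schema. The function names and the 100 TB data set are hypothetical, and real clusters can deviate from these figures (for example, with many small files).

```python
def replication_overhead(replicas: int) -> float:
    """Extra storage, as a fraction of the raw data, for n-way replication."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return float(replicas - 1)


def erasure_coding_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage, as a fraction of the raw data, for a Reed-Solomon layout."""
    if data_units < 1 or parity_units < 0:
        raise ValueError("invalid Reed-Solomon layout")
    return parity_units / data_units


def stored_terabytes(raw_tb: float, overhead: float) -> float:
    """Physical capacity needed to hold raw_tb of user data."""
    return raw_tb * (1.0 + overhead)


if __name__ == "__main__":
    raw_tb = 100.0  # hypothetical data set size
    # Baseline: 3-way replication (assumed)
    print(stored_terabytes(raw_tb, replication_overhead(3)))
    # Erasure coding: RS(6,3) layout (assumed)
    print(stored_terabytes(raw_tb, erasure_coding_overhead(6, 3)))
```

Under these assumptions, 100 TB of data occupies 300 TB with 3-way replication and 150 TB with RS(6,3), because the overhead drops from 200% to 50%.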

HBase

Core capabilities:

  • Procedure V2. Procedure V2, or procv2, is an updated framework for executing multistep HBase administrative operations when there is a failure. This framework can be used to implement all of the master operations and remove the need for tools such as HBaseFsck (hbck) in the future. Procedure V2 can be used to create, modify, and delete tables. Other functions, such as the new Assignment Manager, are implemented using Procedure V2.
  • Fully off-heap read/write path. When data is written into HBase through the Put operation, the cell objects do not enter JVM heap until the data is flushed to disk in an HFile. This helps to reduce total heap usage of a RegionServer, and it improves efficiency by copying less data.
  • Use of Netty for RPC layer and Async API. This replaces the previous Java™ NIO RPC server with a Netty RPC server. Netty provides the capability to easily develop an Asynchronous Java client API.
  • In-memory compactions. Periodic reorganization of the data in the Memstore can result in a reduction of overall I/O. Data is written and accessed from HDFS. The net performance increases when more data is stored in memory for a longer period of time.
  • Better dependency management. HBase now internally shades commonly-incompatible dependencies to prevent problems for downstream users. Shaded client jars can be used to reduce the burden on the existing applications.
  • Coprocessor and Observer API rewrite. Minor changes were made to the API to remove ambiguous, misleading, and dangerous calls.

Hive

Core capabilities:

  • Workload management for LLAP. LLAP can now run in a multitenant environment without resource competition.
  • ACID V2, with ACID on by default. ACID V2 improves both the storage format and the execution engine, delivering performance equal to or better than that of non-ACID tables. ACID is enabled by default to provide full support for data updates.
  • Materialized view navigation. The Hive query engine now supports materialized views. The query engine can automatically use a materialized view (when available) to speed up queries.
  • Information schema. Hive now exposes the metadata of the database (tables, columns, and so on) directly through the Hive SQL interface.

Integration:

  • Hive Warehouse Connector for Spark. Hive Warehouse Connector enables you to connect Spark applications with Hive data warehouses. The connector automatically handles ACID tables.
  • JDBC storage connector. Any JDBC database tables can now be mapped into Hive and queried in conjunction with other tables.

Kafka

Core capabilities:

  • Kafka has been upgraded from 1.0.0 to 1.0.1 with some critical bug fixes.

Security:

  • The lastEntry in TimeIndex can be cached to avoid unnecessary disk access (KAFKA-6172).
  • AbstractIndex caches index file to avoid unnecessary disk access during resizing (KAFKA-6175).
  • SSLTransportLayer continues to read from socket until either the buffer is full or the socket has no more data (KAFKA-6258).

Knox

Usability:

  • Admin UI, plus service discovery and topology generation features, for simplifying and accelerating Knox configuration

Security:

  • SSO support for Zeppelin, YARN, MR2, HDFS, and Oozie
  • Knox Proxy support for YARN, Oozie, SHS (Spark History Server), HDFS, MR2, Livy, and SmartSense

Oozie

Core capabilities:

  • Upgraded Oozie baseline from 4.2 to 4.3.1
  • Disabled Oozie Hive Action

Integration:

  • Oozie support for Spark2

Phoenix

Core capabilities:

  • Query log. A new system table, SYSTEM.LOG, captures information about queries that are being run against the cluster (client-driven).
  • Column encoding. This is new to Hortonworks Data Platform. Applying a custom encoding scheme to the data in an HBase table reduces the amount of space that the data occupies. Because less data has to be read, performance improves, and the storage and performance gains are optimized for sparse tables.
  • Support for GRANT and REVOKE commands. Index ACLs are changed automatically if access changes for the data table or view.
  • Support for sampling tables.
  • Support for atomic update (on duplicate key).
  • Support for snapshot scanners for MR-based queries.

Integration:

  • HBase 2.0 support.
  • Python driver for Phoenix Query Server. This provides a Python DB API 2.0 implementation.
  • Hive 3.0 support for Phoenix. This provides an updated phoenix-hive StorageHandler for the new Hive version.
  • Spark 2.3 support for Phoenix. This provides an updated phoenix-spark driver for the new Spark version.
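
Because the Python driver follows the Python DB API 2.0, client code can be written against the generic connection and cursor interface. The following is a minimal sketch under stated assumptions, not the driver's documented usage: the helper name `fetch_all` is hypothetical, and the connection call shown in the comment assumes the `phoenixdb` package and a hypothetical Phoenix Query Server at http://localhost:8765/.

```python
from contextlib import closing


def fetch_all(conn, sql, params=()):
    """Run a query on any DB API 2.0 connection and return all result rows."""
    # closing() releases the cursor even if execute() raises.
    with closing(conn.cursor()) as cur:
        cur.execute(sql, params)
        return cur.fetchall()


# Assumed usage with the Phoenix driver (requires a running Query Server):
#
#   import phoenixdb
#   conn = phoenixdb.connect("http://localhost:8765/", autocommit=True)
#   rows = fetch_all(conn, "SELECT * FROM my_table WHERE id = ?", (1,))
#
# my_table and its id column are hypothetical names.
```

The helper accepts the connection as a parameter, so the same code can be exercised against any other DB API 2.0 driver that uses `?` placeholders.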

Security:

  • Hardening of both types of secondary index, Local and Global

Ranger

Security:

  • Time bound authorization policies
  • Hive UDF execution authorization
  • Hive workload management authorization
  • RangerKafkaAuthorizer to support new operations and resources added in Kafka 1.0
  • Read-only Ranger user roles for auditing purposes
  • Auditing for usersync operations
  • HDFS Federation support
  • Support for metadata authorization changes in Atlas 1.0
  • Capability to specify passwords for admin accounts during Ranger installation
  • Consolidated DB schema script for all supported DB flavors

Ease of use:

  • Actual Hive queries can be displayed in Ranger Audit UI.
  • Policies can be grouped by using labels.
  • Ranger and Atlas can be installed and turned on by default in Hortonworks Data Platform V3.0.

Spark

Core capabilities:

  • Spark 2.3.1 GA on Hortonworks Data Platform V3.0
  • Structured streaming support for ORC
  • Security and ACLs can be enabled in History Server
  • Support for running Spark jobs in a Docker Container
  • Capability to upgrade Spark, Zeppelin, or Livy from Hortonworks Data Platform V2.6 to Hortonworks Data Platform V3.0
  • Cloud capability enabling Spark testing with S3Guard or S3A Committers
  • Certification for the Staging Committer with Spark
  • Capability to integrate with new Metastore Catalog feature
  • Beeline support for Spark thrift server
  • Capability to configure LLAP mode in Ambari

Integration:

  • Support for per-notebook interpreter configuration
  • Livy to support ACLs
  • Knox to proxy Spark History Server UI
  • Structured Streaming support for Hive Streaming library
  • Transparent write to Hive warehouse
  • Spark-LLAP connector GA for Ranger

Storm

Core capabilities:

  • Storm has been upgraded from 1.1.0 to 1.2.1. Storm 1.2.1 now supports all Hortonworks Data Platform V3.0 components including Hadoop/HDFS 3.0, HBase 2.0, and Hive 3.

YARN

Core capabilities:

  • Support for intra-queue preemption for balancing between apps from different users and priorities in the same queue
  • Support for async-scheduling (versus per node-heartbeat) for better response time in a large YARN cluster
  • Support for generalized resource-placement in YARN; option for affinity or anti-affinity
  • Support for application priority scheduling in Capacity Scheduler
  • Capability to expose a framework for more powerful app-to-queue mappings (YARN-8016, YARN-3635)
  • Support for application timeout feature in YARN
  • Support for GPU scheduling/isolation on YARN
  • Support for Docker containers running on YARN
  • Support for assemblies on YARN (YARN-6613)
  • YARN Service framework; Slider functionality in YARN
  • Support for a simplified services REST API on Slider or YARN
  • Simplified discovery of services through the use of DNS
  • Support for Hortonworks Data Platform on (YARN + Slider + Docker/YCloud) through CloudBreak integration
  • NodeManager support for automatic restart of service containers
  • Support for auto-spawning of administrator-configured system services
  • Capability to migrate LLAP-on-Slider to LLAP-on-YARN-Service-Framework
  • Support for dockerized Spark jobs on YARN

Ease of use:

  • Timeline Service V2
  • Capability to enable Resource Manager web UI authorization control (so that users can see only their own jobs)
  • Support for better classpath isolation for users; capability to remove Guava conflicts from user runtime
  • Improved user-friendly and developer-friendly YARN web UI
  • YARN/MapReduce integration with SSO/Proxy (by using Knox)

Enterprise readiness:

  • CS preemption enabled by default
  • Cgroup support for YARN in a non-secure cluster, with Linux® Container Executor always on by default
  • Cgroups and CPU scheduling for YARN containers enabled by default
  • Support for deleting queues without requiring an RM restart
  • Enhancements to STOP queue handling capability
  • Support for side-by-side HDFS tarball-based install of multiple Spark auxiliary services in YARN (YARN-1151)
  • Log aggregation tool (HAR files) to reduce NameNode load (YARN-4086)
  • YARN queue ACL support when doAs=false
  • Capability to provide an API in YARN to get the queue mapping result before an application is submitted

Zeppelin

Core capabilities:

  • Updated to 0.8 release of Zeppelin
  • Capability to change Zeppelin UI across the board to not display stack traces
  • Option to select user name case conversion (ZEPPELIN-3312)

Ease of use:

  • Zeppelin UI features Knox SSO for Quicklinked Web UIs

Accessibility by people with disabilities

A US Section 508 Accessibility Compliance Report containing details on accessibility compliance can be found on the Product accessibility information website.




Reference information


For more information about Hortonworks Data Platform, see Software Announcement 218-187, dated March 20, 2018.




Program number


Program number   VRM     Program name
5737-H46         3.0.0   Hortonworks Data Platform



Offering Information


Product information is available on the IBM® Offering Information website.

More information is also available on the Passport Advantage and Passport Advantage Express® website.




Publications


Technical documentation can be found in IBM Knowledge Center.




Services


Software Services

IBM Software Services has the breadth, depth, and reach to manage your services needs. You can leverage the deep technical skills of our lab-based, software services team and the business consulting, project management, and infrastructure expertise of our IBM Global Services team. Also, we extend our IBM Software Services reach through IBM Business Partners to provide an extensive portfolio of capabilities. Together, we provide the global reach, intellectual capital, industry insight, and technology leadership to support a wide range of critical business needs.

To learn more about IBM Software Services, contact your Lab Services Sales or Delivery Leader.




Technical information


Specified operating environment

Software requirements

Operating System    Version
CentOS (64-bit)     CentOS 7.2, 7.3, and 7.4

Browsers            Version
Internet Explorer   10, 11
Safari              10.0.1 and 10.0.3

JDK                 Version
Oracle JDK          JDK8
OpenJDK             JDK8

Planning information

Packaging

This offering is delivered through the internet as an electronic download. There is no physical media.

This program, when downloaded from a website, contains the applicable IBM license agreement and License Information, if appropriate, which will be presented for acceptance at the time of installation of the program. For future reference, the license and License Information will be stored in a directory such as LICENSE.TXT.




Ordering information


For ordering information, consult your IBM representative or authorized IBM Business Partner, or go to the Passport Advantage website.

This product is only available through Passport Advantage. It is not available as shrinkwrap.

These products may only be sold directly by IBM or by authorized IBM Business Partners for Channel Value Rewards.

More information can be found on the IBM Channel Value Rewards website.

To locate IBM Business Partners for Channel Value Rewards in your geography for a specific Channel Value Rewards portfolio, go to the Find a Business Partner page.

Product: Hortonworks Data Platform (5737-H46)


Passport Advantage

No new part numbers.

Charge metric

Definitions of the charge metric for this licensed product can be found in the following License Information document:

Program name                PID number   License Information document number
Hortonworks Data Platform   5737-H46     L-MLOY-AW5TJY

Select your language of choice and scroll down to the Charge Metrics section.




Terms and conditions


The information provided in this announcement letter is for reference and convenience purposes only. The terms and conditions that govern any transaction with IBM are contained in the applicable contract documents such as the IBM International Program License Agreement, IBM International Passport Advantage Agreement, and the IBM Agreement for Acquisition of Software Maintenance.

This product is only available through Passport Advantage.

Licensing

IBM International Program License Agreement including the License Information document and Proof of Entitlement (PoE) govern your use of the program. PoEs are required for all authorized use. Part number products only, offered outside of Passport Advantage, where applicable, are license only and do not include Software Maintenance.

Software Maintenance

Licenses under the IBM Program License Agreement (IPLA) provide for support with ongoing access to releases and versions of the program. IBM includes one year of Software Subscription and Support (also referred to as Software Maintenance) with the initial license acquisition of each program acquired. The initial period of Software Subscription and Support can be extended by the purchase of a renewal option, if available. Two charges apply: a one-time license charge for use of the program and an annual renewable charge for the enhanced support that includes telephone assistance (voice support for defects during normal business hours), as well as access to updates, releases, and versions of the program as long as support is in effect.

License Information number

The following License Information document applies to the offering in this announcement:

Program identifier   License Information document title   License Information document number
5737-H46             Hortonworks Data Platform V3.0       L-MLOY-AW5TJY

Select your language of choice and scroll down to the Charge Metrics section. Follow-on releases, if any, may have updated terms. See the License Information documents website for more information.

Limited warranty applies

Yes

Limited warranty

IBM warrants that when the program is used in the specified operating environment, it will conform to its specifications. The warranty applies only to the unmodified portion of the program. IBM does not warrant uninterrupted or error-free operation of the program or that IBM will correct all program defects. You are responsible for the results obtained from the use of the program.

IBM provides you with access to IBM databases containing information about known program defects, defect corrections, restrictions, and bypasses at no additional charge. For further information, see the IBM Software Support Handbook.

IBM will maintain this information for at least one year after the original licensee acquires the program (warranty period).

Program technical support

Technical support of a program product version or release will be available for a minimum of five years from the general availability date, as long as your Software Subscription and Support (also referred to as Software Maintenance) is in effect.

This technical support allows you to obtain assistance (by telephone or electronic means) from IBM for product-specific, task-oriented questions regarding the installation and operation of the program product. Software Subscription and Support (Software Maintenance) also provides you with access to updates (modifications or fixes), releases, and versions of the program. You will be notified, through an announcement letter, of discontinuance of support with 12 months' notice. If you require additional technical support from IBM, including an extension of support beyond the discontinuance date, contact your IBM representative or IBM Business Partner. This extension may be available for a fee.

For additional information about the IBM Software Support Lifecycle Policy, see the IBM Software Support Lifecycle Policy website.

Money-back guarantee

If for any reason you are dissatisfied with the program and you are the original licensee, you may obtain a refund of the amount you paid for it, if within 30 days of your invoice date you return the program and its PoE to the party from whom you obtained it. If you downloaded the program, you may contact the party from whom you acquired it for instructions on how to obtain the refund.

For clarification, note that (1) for programs acquired under the IBM International Passport Advantage offering, this term applies only to your first acquisition of the program and (2) for programs acquired under any of IBM's On/Off Capacity on Demand (On/Off CoD) software offerings, this term does not apply since these offerings apply to programs already acquired and in use by you.

Volume orders (IVO)

No

Passport Advantage applies

Yes, information is available on the Passport Advantage and Passport Advantage Express website.

Software Subscription and Support applies

Yes. Software Subscription and Support, also referred to as Software Maintenance, is included with licenses purchased through Passport Advantage and Passport Advantage Express. Product upgrades and Technical Support are provided by the Software Subscription and Support offering as described in the Agreements. Product upgrades provide the latest versions and releases to entitled software, and Technical Support provides voice and electronic access to IBM support organizations, worldwide.

IBM includes one year of Software Subscription and Support with each program license acquired. The initial period of Software Subscription and Support can be extended by the purchase of a renewal option, if available.

While your Software Subscription and Support is in effect, IBM provides you assistance for your routine, short duration installation and usage (how-to) questions, and code-related questions. IBM provides assistance by telephone and, if available, electronic access, only to your information systems (IS) technical support personnel during the normal business hours (published prime shift hours) of your IBM support center. (This assistance is not available to your users.) IBM provides Severity 1 assistance 24 hours a day, 7 days a week. For additional details, see the IBM Software Support Handbook. Software Subscription and Support does not include assistance for the design and development of applications, your use of programs in other than their specified operating environment, or failures caused by products for which IBM is not responsible under the applicable agreements.

Unless specified otherwise in a written agreement with you, IBM does not provide support for third-party products that were not provided by IBM. Ensure that when contacting IBM for covered support, you follow problem determination and other instructions that IBM provides, including in the IBM Software Support Handbook.

For additional information about the International Passport Advantage Agreement and the IBM International Passport Advantage Express Agreement, go to the Passport Advantage and Passport Advantage Express website.

For additional software support information, see Software Support Lifecycle.

Variable charges apply

No

Educational allowance available

Not applicable.




Statement of good security practices


IT system security involves protecting systems and information through intrusion prevention, detection, and response to improper access from within and outside your enterprise. Improper access can result in information being altered, destroyed, or misappropriated or can result in misuse of your systems to attack others. Without a comprehensive approach to security, no IT system or product should be considered completely secure and no single product or security measure can be completely effective in preventing improper access. IBM systems and products are designed to be part of a regulatory compliant, comprehensive security approach, which will necessarily involve additional operational procedures, and may require other systems, products, or services to be most effective.

Important: IBM does not warrant that any systems, products, or services are immune from, or will make your enterprise immune from, the malicious or illegal conduct of any party.




Prices



Business Partner information

If you are an IBM Business Partner acquiring products from IBM, you may link to Passport Advantage Online for resellers where you can obtain Business Partner pricing information. An IBMid and password are required to access the IBM Passport Advantage website.


Passport Advantage

For Passport Advantage information and charges, contact your IBM representative or authorized IBM Business Partner for Channel Value Rewards. Additional information is also available on the Passport Advantage and Passport Advantage Express website.




Order now


To order, contact the IBM Digital Sales Center, your local IBM representative, or your IBM Business Partner. To identify your local IBM representative or IBM Business Partner, call 800-IBM-4YOU (426-4968). For more information, contact the IBM Digital Sales Center.

Phone: 800-IBM-CALL (426-2255)

Fax: 800-2IBM-FAX (242-6329)

For IBM representative: askibm@ca.ibm.com

For IBM Business Partner: pwcs@us.ibm.com

IBM Digital Sales Offices
1177 S Belt Line Rd
Coppell, TX 75019-4642, US

The IBM Digital Sales Center, our national direct marketing organization, can add your name to the mailing list for catalogs of IBM products.

Note: Shipments will begin after the planned availability date.


IBM Channel Value Rewards

This product is available under Channel Value Rewards (CVR), either directly from IBM or through authorized Business Partners who invest in skills and high-value solutions. IBM clients may benefit from the industry-specific or horizontal solutions, skills, and expertise provided by these Business Partners.

Additions to CVR will be communicated through standard product announcements. To determine what IBM software is available under CVR, see the IBM Passport Advantage Online for IBM Business Partners website.

For questions regarding CVR, see the IBM Channel Value Rewards website.

Trademarks

Passport Advantage, IBM, PartnerWorld and Express are registered trademarks of IBM Corporation in the United States, other countries, or both.

Oracle and Java are trademarks of Oracle and/or its affiliates in the United States, other countries, or both.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, and service names may be trademarks or service marks of others.

Terms of use

IBM products and services which are announced and available in your country can be ordered under the applicable standard agreements, terms, conditions, and prices in effect at the time. IBM reserves the right to modify or withdraw this announcement at any time without notice. This announcement is provided for your information only. Additional terms of use are located at Terms of use.

For the most current information regarding IBM products, consult your IBM representative or reseller, or go to the IBM worldwide contacts page.

IBM United States