This document explains how to download the three IBM InfoSphere BigInsights Enterprise Edition Version 2.1 products from the Passport Advantage web site.
IBM InfoSphere BigInsights Enterprise Edition, a platform for the analysis and visualization of Internet-scale data volumes, helps organizations quickly build and deploy custom analytics and workloads to capture insight from big data that can then be integrated into existing database, data warehouse, and business intelligence infrastructures.
InfoSphere BigInsights Quick Start Edition for V2.1
The Quick Start Edition is a non-production edition that supports the features of Enterprise Edition except high availability, GPFS, software bundles, and accelerators. It is not available from Passport Advantage but is available as a free download from the following sites:
- You can download the installable product from https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=swg-ibmibqse.
- You can download the VMware image of the product from https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=swg-ibmibqsevmw.
Downloading IBM InfoSphere BigInsights
- From a browser, navigate to the Find Downloads and Media page on the IBM Passport Advantage web site. You must log on to continue.
- Select the Find by part number search option under Download finder options.
- Search for the following part numbers and download the files.
|Application||Part number|
|Enterprise Edition Software||CIL0XEN|
|Enterprise Edition Quick Start Guide||CIL0WEN|
|Enterprise Edition for Non-Production Environment Software||CIL0ZEN|
|Enterprise Edition for Non-Production Environment Quick Start Guide||CIL0YEN|
|Enterprise Edition Starter Kit Software||CIL11EN|
|Enterprise Edition Starter Kit Quick Start Guide||CIL10EN|
- Unpack the part into a temporary directory on your system.
- To install the product, follow the instructions in Installing IBM InfoSphere BigInsights.
- For the latest information, review the Release notes.
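The unpacking step above can be sketched as a small shell helper. This is only an illustration: the function name unpack_part is hypothetical, and it assumes that the downloaded part is a gzip-compressed tar archive (.tar.gz), which might not hold for every part.

```shell
#!/usr/bin/env bash
# Sketch only: unpack a downloaded part into a fresh temporary directory.
# unpack_part is an illustrative name, not part of the product.

unpack_part() {
  local archive="$1"
  local workdir

  # Stop early if the downloaded file is missing or unreadable.
  [ -r "$archive" ] || { echo "cannot read $archive" >&2; return 1; }

  # mktemp -d creates a unique, empty directory for the extracted files.
  workdir="$(mktemp -d)" || return 1

  # Assumption: the part is a gzip-compressed tar archive.
  tar -xzf "$archive" -C "$workdir" || return 1

  # Print the directory so that the caller can change into it.
  printf '%s\n' "$workdir"
}
```

For example, `cd "$(unpack_part /path/to/downloaded_part.tar.gz)"` changes into the extracted files, where the path is a placeholder for the file that you downloaded.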
Downloading optional software
IBM accelerators apply advanced analytics to help integrate and manage your data. The following table describes the IBM accelerators that are available for InfoSphere BigInsights and InfoSphere Streams.
|IBM Accelerator for Machine Data Analytics||Provides business logic, data processing capabilities, and visualization for use cases related to machine data.||Runs on InfoSphere BigInsights only.|
|IBM Accelerator for Social Data Analytics||Provides business logic, data processing capabilities, and visualization for use cases related to social data.||Requires InfoSphere BigInsights and InfoSphere Streams.|
|IBM Accelerator for Telecommunications Event Data Analytics||Enables you to analyze, leverage, and transform raw telecommunications data into meaningful insight.||Runs on InfoSphere Streams only.|
For the most current information about installing and using IBM accelerators, see the InfoSphere Streams Knowledge Center (http://www.ibm.com/support/knowledgecenter/SSCRJU/SSCRJU_welcome.html) and the InfoSphere BigInsights Knowledge Center (http://www.ibm.com/support/knowledgecenter/SSPT3X/SSPT3x_welcome.html).
|Accelerator for Machine Data Analytics||CIL12EN|
|Accelerator for Social Data Analytics||CIL13EN|
|Accelerator for Telecommunications Event Data Analytics||CIKJ8EN|
The system requirements list the recommended hardware and software for installing and running InfoSphere BigInsights on each computer in a cluster that is dedicated to running BigInsights applications. Additional disk space might be required to store data for processing by InfoSphere BigInsights application workflows. Do not install on an existing Hadoop cluster or an existing Hadoop Distributed File System (HDFS) if you need to retain the original cluster and the HDFS data.
Verify your prerequisites before beginning your installation.
|InfoSphere BigInsights V2.1 system requirements||English||565|
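As one example of verifying a prerequisite, this document asks that the nfsstat command can be run over SSH on each high availability node when you install as a non-root user. A minimal sketch of that check across several nodes follows. The function name check_nfsstat and the SSH variable are assumptions introduced here (the variable only allows the remote call to be replaced for a dry run), and the host names are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: confirm that nfsstat can be run over SSH on each
# high availability node. check_nfsstat and SSH are illustrative names.

check_nfsstat() {
  local ssh_cmd="${SSH:-ssh}"
  local host rc failed=0

  for host in "$@"; do
    # Same check as the documented command: ssh <host> nfsstat
    rc=0
    "$ssh_cmd" "$host" nfsstat >/dev/null 2>&1 || rc=$?

    case "$rc" in
      0)   echo "OK      $host" ;;
      # The remote shell returns 127 when a command is not found.
      127) echo "MISSING $host (nfsstat not found: check nfs-utils and PATH)"
           failed=1 ;;
      # ssh itself returns 255 when it cannot connect or authenticate.
      255) echo "NO SSH  $host"
           failed=1 ;;
      *)   echo "FAILED  $host (exit status $rc)"
           failed=1 ;;
    esac
  done

  return "$failed"
}
```

Run as the cluster administrator from the installation host, for example `check_nfsstat ha1.example.com ha2.example.com`. The function returns a nonzero status if any node fails the check.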
Before you begin
Installing with high availability: If you are installing with high availability or want to deploy your cluster with adaptive MapReduce, complete the following tasks before starting the installation program:
- Enable adaptive MapReduce with high availability.
- Ensure that the shared Network File System (NFS) is mounted on all high availability nodes, and that it is readable and writeable by the cluster administrator.
- Ensure that the reserved IP address for the NameNode is not used by any host.
- If you run the installation program as a non-root user, ensure that you can run the nfsstat command on the high availability nodes through SSH. For example, log in as the cluster administrator and run the following command from the installation host, where server_name.example.com is the name of your high availability server.
ssh server_name.example.com nfsstat
If the output reads bash: nfsstat: command not found, ensure that the nfs-utils RPM package is installed on each of the high availability nodes. In addition, add the location of the nfsstat command to the PATH variable in the .bashrc file for the system administrator on each of the high availability nodes in your cluster. The default location of the nfsstat command is
Installing with GPFS: If you are installing GPFS™ as your distributed file system, ensure that you read and understand the following considerations.
- When installing over an existing version of GPFS, ensure that the InfoSphere BigInsights administrator user has read, write, and execute permissions on the GPFS mount point.
- The installation program supports only GPFS FPO configurations when you install InfoSphere BigInsights and GPFS at the same time. The installation program does not support installing GPFS only. Alternatively, you can install GPFS by using scripts or commands.
- HBase requires that your distributed file system supports the sync call. This call pushes data through the write pipeline and blocks until the data is acknowledged by all three nodes in the pipeline. If you select GPFS as your file system when installing InfoSphere BigInsights, the hbase.fsutil.gpfs.impl property in the hbase-site.xml file is set to org.apache.hadoop.hbase.util.FSGPFSUtils.
- GPFS is not included as part of the InfoSphere BigInsights Quick Start Edition.
To install InfoSphere BigInsights Enterprise Edition:
- Navigate to the directory where you extracted the biginsights-enterprise-linux64_release_number.tar.gz file, where release_number is the release number that you are installing.
- Run the start.sh script.
./start.sh
The script starts WebSphere® Application Server Community Edition on port 8300. The script provides you with a URL to the installation wizard, which is available at:
server_name is the server where you extracted the .tar.gz file. Multiple URLs might be provided if the server has multiple IP addresses; pick one that is accessible from your browser.
- Complete the remaining panels in the installation wizard.
- Review the Welcome panel, and then click Next.
- Review the License Agreement panel, accept the terms in the license agreement, and then click Next.
- On the Installation Type panel, select Cluster installation, and then click Next.
- On the File System panel, make selections for your distributed file system, and then click Next.
- Select the distributed file system that you want to install, either HDFS or GPFS.
Installing with high availability: If you are installing with high availability, select Install Hadoop Distributed File System (HDFS).
- Specify the installation directories for InfoSphere BigInsights.
- Expand MapReduce general settings and specify the MapReduce settings that you want to use.
Table 1. MapReduce general settings
|Directory||Description|
|Cache directory||Directory where the MapReduce intermediate data (map output data) is stored|
|Log directory||Directory where MapReduce logs are written to|
|MapReduce system directory||System directory where Hadoop stores its configuration data|
- On the Secure Shell panel, specify the user that you want to install the product with, and then click Next.
- On the Nodes panel, click Add Node to add single nodes, or Add Multiple Nodes to add several nodes simultaneously. For each node, use the Short host name and Rack ID that you recorded in the InfoSphere BigInsights installation worksheet.
Installing with GPFS: If you are installing GPFS as your distributed file system, on the Add Nodes panel or the Add Multiple Nodes panel, enter the disks that you want to use for GPFS in the Disks to use for GPFS field. You can also click Discover Disks on the Nodes panel to discover all disks that are available.
The installation program overwrites all disks that you specify for each node. Ensure that you specify the correct disks to avoid losing data.
Installing with high availability: If you are installing with high availability, add all nodes in the cluster, including the nodes intended as high availability nodes.
After you finish adding nodes, click Next.
- On the Components 1, Components 2, and Components 3 panels, specify the host names and port numbers for each of the components that you are installing.
Installing with high availability:
- On the Components 1 panel, ensure that no service is assigned to any of the high availability nodes.
- On the Components 2 panel, select the Configure High Availability option.
- Next to the High Availability nodes field, click Assign. Add the high availability nodes by selecting them in the left pane and clicking the right arrow. You must select at least two nodes, but cannot select more than three nodes.
- In the Virtual NameNode FQDN field, enter the fully qualified domain name for the NameNode, which should resolve to the unassigned virtual IP address for the NameNode.
- In the Virtual NameNode IP address field, enter the unassigned virtual IP address for the NameNode.
- In the NFS server information field, enter the server and NFS directory in the following format: server:shared_directory. For example, nfs-server.com:/remote/path.
- In the NFS local mount point field, enter the path to the mount point of the NFS shared directory.
- Next to the NameNode field, click Assign to specify which high availability nodes you want to run the NameNode and JobTracker processes. The Secondary NameNode cannot be one of the high availability nodes.
- On the Security panel, specify the type of authentication that you want to use, and then click Next.
- When you reach the Summary panel, review the information for the settings, nodes, and components.
- Click Install to start the installation. The installation progress is displayed so that you can see how much time remains in the installation process.
- When the installation completes, click Finish to stop the web server. Alternatively, you can run the start.sh shutdown script after the installation completes.
- Optional: To clear disk space, you can remove the extracted installation files from your system.
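Before you open the wizard, it can help to sanity-check the high availability values that the Components 2 panel asks for. The following sketch checks two rules stated in the steps above: the NFS server information uses the server:shared_directory format, and two or three high availability nodes are selected. The function names are hypothetical, and requiring the shared directory to start with a slash is an assumption based on the nfs-server.com:/remote/path example, not a documented rule.

```shell
#!/usr/bin/env bash
# Sketch only: pre-check values for the Components 2 panel.
# parse_nfs_info and valid_ha_node_count are illustrative names.

# Split "server:shared_directory" and print the server on the first
# line and the directory on the second line.
parse_nfs_info() {
  local info="$1"

  case "$info" in
    # Non-empty server, a colon, then a path starting with "/"
    # (the leading slash is an assumption taken from the example).
    ?*:/*) ;;
    *) echo "expected server:shared_directory, got: $info" >&2
       return 1 ;;
  esac

  # Text before the first colon is the server; the rest is the directory.
  printf '%s\n%s\n' "${info%%:*}" "${info#*:}"
}

# Succeed only when two or three high availability nodes are given.
valid_ha_node_count() {
  [ "$#" -ge 2 ] && [ "$#" -le 3 ]
}
```

For example, `parse_nfs_info nfs-server.com:/remote/path` prints the server and the directory on separate lines, and `valid_ha_node_count node1 node2` succeeds because two nodes fall within the allowed range.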
|Installing InfoSphere BigInsights||English||1|
Download from Passport Advantage and select your preferred operating system: http://www.ibm.com/software/howtobuy/passportadvantage/