Hadoop for Enterprise with IBM
Secure, Resilient and High Performance
IBM InfoSphere BigInsights brings the power of Hadoop to the enterprise, accelerating time to value from big data while facilitating rapid big data application development. It enhances Hadoop by adding administrative, discovery, development, provisioning, security, and support capabilities, along with best-in-class analytical capabilities. The result is a solution for complex, large-scale projects.
Data Loading and Integration
Codeless Data Integration
Codeless creation of data integration logic and jobs, reusable across the enterprise through ETL jobs powered by Information Server. Enables data governance, including data lineage, business rule and policy management, and data quality.
Unified View of Data
Unified view of all data-driven information, including data on Hadoop, for a comprehensive, contextually relevant view. Powered by Watson Explorer.
Data Query & Exploration
SQL on Hadoop with Big SQL
Unmatched simplicity, performance and security for SQL on Hadoop. It provides a single point of access and view across all big data, exactly where it lives. Run federated queries on large volumes of structured and unstructured data.
Spreadsheet-style Tool with BigSheets
Web-based analysis and visualization tool with a familiar, spreadsheet-like interface, featuring D3 graphs, that enables analysis of large amounts of data and helps design and manage long-running data collection jobs.
Analytics - Predictive, Prescriptive and Descriptive
Enables the use of R as a query language, so that users can explore, visualize, transform, and model big data directly from their R environment, without any explicit MapReduce or Jaql programming.
Sophisticated text analytics unique to BigInsights with a vast library of extractors enabling actionable insights from large amounts of native textual data.
Familiar, Eclipse-based development environment for building and deploying analytic applications, with a set of developer tools, extractors, and editors for fast adoption and reduced coding and debugging.
Social Data Analytics Accelerator
Provides the capability to ingest and process large volumes of social media data, yielding key insights that can be used to develop programs/applications such as customer retention, customer acquisition, lead generation, brand management, and campaign effectiveness.
Machine Data Analytics Accelerator
Provides the capability to ingest and process large volumes of machine data from sources including IT machines, sensors, meters, GPS devices, and more.
Optimized Workloads with Adaptive MapReduce
Adapts to user needs and system workloads automatically to improve performance and simplify job tuning, while the workload scheduler provides optimization and control of job scheduling based on user-selected metrics.
Fault-tolerant, secure, POSIX file system option
Provides GPFS-FPO, a POSIX-compliant, enterprise-class distributed file system that brings proven distributed file system capabilities to the Hadoop environment. GPFS-FPO lets multiple applications access the data stored on it, which simplifies workflows and allows traditional backup and restore services. It also works well with small block sizes of data (open source HDFS is optimized for 64 MB blocks).
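To make the block-size point concrete, the following is a minimal arithmetic sketch that counts how many blocks a set of files is split into under a fixed 64 MB block size. The function name `blocks_needed` and the file sizes are hypothetical, chosen only for illustration; this is not a model of HDFS or GPFS-FPO internals.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the block size cited in the text
MB = 1024 * 1024


def blocks_needed(file_size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks a file is split into; any non-empty file needs at least one."""
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return -(-file_size_bytes // block_size)


# Hypothetical workloads holding the same total volume of data (10,000 MB).
small_files = [1 * MB] * 10_000      # ten thousand 1 MB files
one_large_file = [10_000 * MB]       # a single 10,000 MB file

small_total = sum(blocks_needed(size) for size in small_files)
large_total = sum(blocks_needed(size) for size in one_large_file)

print("blocks for many small files:", small_total)
print("blocks for one large file:", large_total)
```

The data volume is identical in both cases, but the many small files produce far more block entries to track, which is where a file system tuned for large blocks tends to be less efficient.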
Secure search, discovery and navigation across the enterprise. Rapid development and deployment of search-based applications. Distributed, highly-scalable architecture. Powered by Watson Explorer.
Administration & Security
Establishes a unique identifier for customer information stored in Hadoop with Master Data Management probabilistic matching.
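As a rough illustration of what probabilistic matching means in general, the sketch below scores two customer records by weighted field similarity and treats them as the same customer above a threshold. The field names, weights, threshold, and function names are all assumptions made for this example; this is not the Master Data Management algorithm, which the text does not describe.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; a real matching engine would derive these from data.
WEIGHTS = {"name": 0.5, "email": 0.3, "city": 0.2}
MATCH_THRESHOLD = 0.85  # assumed cut-off, chosen only for this example


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] after trimming whitespace and lower-casing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted sum of per-field similarities."""
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())


def is_same_customer(rec_a: dict, rec_b: dict) -> bool:
    """Treat two records as one customer when the score reaches the threshold."""
    return match_score(rec_a, rec_b) >= MATCH_THRESHOLD


a = {"name": "Jon Smith", "email": "jon.smith@example.com", "city": "Austin"}
b = {"name": "John Smith", "email": "jon.smith@example.com", "city": "austin"}
c = {"name": "Maria Lopez", "email": "mlopez@example.org", "city": "Denver"}

print(is_same_customer(a, b))  # near-duplicate records
print(is_same_customer(a, c))  # unrelated records
```

Records judged to be the same customer could then share one identifier, which is the outcome the description above refers to.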
Management console for managing the cluster, managing jobs, and browsing the file system.
Data Privacy and Masking
Mitigates risk with sensitive data discovery. Maintains an acceptable risk tolerance with data monitoring, within source systems and on Hadoop itself. Auditing helps tighten security and access control while monitoring provides the ability to control all applications from a centralized dashboard.