IBM Intelligent Miner for Text Version 2 Release 3 Supports Sun Solaris

Software Announcement
December 8, 1998
Announcement Letter Number: 298-447


(Corrected on January 6, 1999)

The Ordering Information section has been updated.

At a Glance

IBM Intelligent Miner for Text offers system integrators, solution providers, and application developers a wide range of sophisticated text analysis tools, a full-text search engine enhanced with text mining functions, and a Web Crawler to enrich business intelligence, content management, and knowledge management solutions.

Intelligent Miner for Text V2.3 offers the following key features:

  • Text Analysis Tools
    • Language identification to discover the language of a document
    • Clustering to group related documents by contents
    • Categorization to assign documents to a set of pre-defined categories
    • Summarization of documents
    • Feature extraction to identify key elements of free-text
  • Text Search Engine to search textual information and to uncover related concepts with Java-based samples for GUI application development

  • Web Crawler package consisting of a toolkit and a ready-to-run Web Crawler

  • Flexible and ready-to-use NetQuestion Web-search Solution

  • Full-product, limited-time trial version at no charge
-----------------------------------------
For ordering, contact:
  Your IBM representative, an IBM
  Business Partner, or IBM North America
  Sales Centers at
    800-IBM-CALL  Reference: YE010



Overview

IBM Intelligent Miner (TM) for Text is a knowledge discovery software development toolkit. It contains tools for application programmers who want to build applications that extract key information from very large quantities of documents, e-mails, or Web pages stored online, often on the Internet or on intranets, without having to read them all. With Intelligent Miner for Text, you can:

  • Organize the documents by subject, find the predominant themes in a collection of documents, and summarize them.

  • Search for relevant documents using powerful and flexible queries -- and more.

Intelligent Miner for Text features three major components:
  • IBM Text Analysis Tools: Include a Language Identification tool, comprehensive Clustering tools, a Topic Categorization tool, a Summarization tool, and Feature Extraction tools. These tools identify document language, group conceptually related documents, classify documents by content, generate document summaries, and extract key elements of text.

  • IBM Text Search Engine: Comprehensive search engine which is customizable for either sophisticated full-text search (including text mining functions) or Web-tuned search functions. It is enhanced with Java (TM) and Java Beans samples to help build applications for text search and administrative functions accessible from a Java-enabled browser.

  • IBM Web Crawler Package: Consists of a ready-to-run Web Crawler and a Web Crawler toolkit to build customized Web crawlers.

The Intelligent Miner for Text toolkit also features the IBM NetQuestion Solution, a powerful, drop-in Internet/intranet text-search solution, based on the Text Search Engine and Web Crawler, for searching a local Web server or a multiserver domain.

New in Version 2.3

  • Platform support for Sun Solaris in addition to AIX (R) and Windows NT (R). Support for OS/390 (R) is also available.

  • IBM Text Analysis Tools:
    • Document summarizer
    • Named relations in the Feature Extraction tools
    • Socket input and output
  • IBM Text Search Engine:
    • One all-purpose, customizable search engine
    • Section support
    • Additional language support for Arabic, Hebrew, and Russian
    • Enhanced thesaurus support
    • Improved Java and Java Beans samples for developing text search and administrator applications
  • IBM NetQuestion Solution -- To construct Internet/intranet text-search solutions



Key Prerequisites

Intelligent Miner for Text operates on AIX V4.3, Sun Solaris V2.5.1, and Windows NT V4.0, and requires TCP/IP. The Web Crawler requires DB2 (R) UDB Enterprise Edition which is shipped with the product.



Planned Availability Date

December 29, 1998

------------------------------

This announcement is provided for your information only. For additional information, contact your IBM representative, call 800-IBM-4YOU, or visit the IBM home page at: http://www.ibm.com



DESCRIPTION

IBM Intelligent Miner for Text Version 2 Release 3 is a knowledge discovery software development toolkit providing three major components to build advanced-technology information retrieval and mining applications and a Web-search solution. It consists of:

  1. IBM Text Analysis tools
  2. IBM Text Search Engine
  3. IBM Web Crawler package

It also provides the NetQuestion Solution using the Text Search Engine and the Web Crawler to support your e-business in the form of a drop-in solution.



Text Analysis Tools

The Text Analysis tools introduce a state-of-the-art toolset for text analysis, text mining, and knowledge management. They can be used to identify the language of documents, intelligently classify documents by content, discover clusters of conceptually related documents, summarize documents to automatically create short descriptions or summaries, and extract key elements of free text. The documents need to be provided in plain text format. For other formats, conversion tools can be obtained from third parties.

  • Language Identification

    This tool analyzes a document or string and identifies its language. Testing to date indicates a high rate of accuracy, even on short input. The tool supports 14 languages: Catalan, Danish, Dutch, English, Finnish, French, German, Icelandic, Italian, Norwegian Bokmal, Norwegian Nynorsk, Portuguese, Spanish, and Swedish. The language identifier is extensible: a training tool is included that can be used to add a language not yet recognized.

  • Clustering

    By analyzing key concepts, the Clustering tools automatically find groups of related documents in document collections, such as news feeds, patents, or technical reports. The clusters are created dynamically without requiring a predefined taxonomy. Titles for clusters are generated as short lists of the key concepts that are characteristic for the documents contained in the cluster.

    Intelligent Miner for Text includes two different approaches to clustering: binary relational clustering and hierarchical clustering. Binary relational clustering is a top-down approach that splits the collection into clusters at points of maximal difference, while hierarchical clustering is a bottom-up approach that incrementally puts similar documents together in groups. The two approaches usually produce different results when applied to the same data. A practical way of using the tools is to try both approaches with different parameter settings, review the results, and select the one that is most suitable for the task at hand.

  • Categorization

    The Topic Categorization tool assigns documents to one or more categories from a user-defined taxonomy. One possible application is automatically sorting documents into a Yahoo-like schema. A training tool is included that allows you to define your own taxonomy and build reliable categorizers for many applications.

  • Summarization

    The Summarization tool extracts sentences from a document to create a document summary. It works best with well-edited structured documents.

  • Feature Extraction

    The Feature Extraction tools recognize different kinds of significant items in text, such as proper names, technical terms, relations, or abbreviations.

    The training tool creates a scheme of significant features from a document collection. The document extraction tool identifies features in documents either by using a set of extraction functions (exploration mode) or by looking them up in a scheme created by the training tool (lookup mode). Both the training tool and the document extraction tool use the same set of extraction functions based on heuristics and dictionary information.

    The exploration mode of the document extraction tool should be used for finding significant features in an isolated document while the lookup mode should be used when rating the content of a given document with respect to the contents of a document collection. In lookup mode, the document extraction tool extracts only those features from a document that also occur in the scheme and provides statistical information about the occurrences of these features in the corresponding collection.

    The names extraction function recognizes names even when they occur in different forms, such as "Robert Jordan" versus "Mr. Jordan" and distinguishes between names of persons, organizations, or locations, such as "Houston, Texas" and "Whitney Houston".

    When recognizing terminology, the terminology extraction function automatically finds many multiword terms that have a meaning of their own, for example, "laser printer", and recognizes different forms of the same term, such as "expense account" and "expense accounts".

    The relation extraction function finds information of the type "R. Jordan is_CIO_of XY Corp.", "XY Corp. produces handheld computers", or "R. Jordan has_age 52".

    The abbreviation extraction function finds and links abbreviations introduced in a text together with their full forms, such as "American Bar Association" versus "ABA".

    Other kinds of significant entities are also recognized by other extraction functions, such as dates, numbers, and money amounts (for example, "$50", "50 Dollars", "100 EUR", or "100 Euros").
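
As a rough illustration of what an extraction function for money amounts might look like, the following minimal Python sketch recognizes only the four forms quoted above. It is not the product's extraction function: the regular expression, the name extract_money, and the mapping of "Dollars" to the code USD are assumptions made for this example.

```python
import re

# Hypothetical pattern: covers only "$50", "50 Dollars", "100 EUR", "100 Euros".
MONEY = re.compile(
    r"\$\s?(?P<usd>\d+(?:\.\d+)?)"
    r"|(?P<num>\d+(?:\.\d+)?)\s?(?P<unit>Dollars?|EUR|Euros?)"
)

def extract_money(text):
    """Return (amount, currency) pairs for the money expressions found in text."""
    found = []
    for match in MONEY.finditer(text):
        if match.group("usd") is not None:
            # "$" prefix form; treating "$" as USD is an assumption.
            found.append((float(match.group("usd")), "USD"))
        else:
            # Number followed by a spelled-out or three-letter unit.
            unit = match.group("unit")
            currency = "USD" if unit.startswith("Dollar") else "EUR"
            found.append((float(match.group("num")), currency))
    return found
```

Normalizing the different surface forms to one (amount, currency) pair mirrors the way the names and terminology functions are described as recognizing different forms of the same item.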

Apart from the Language Identification tool, the Text Analysis tools currently work with English text only.
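
The announcement does not say how the Language Identification tool works internally. One simple, well-known way to picture the task is to count language-specific function words, as in the minimal Python sketch below. The word lists, the name identify_language, and the profiles parameter are all hypothetical and far smaller than anything a real identifier would use.

```python
import re

# Tiny hand-picked function-word lists; illustrative assumptions only.
FUNCTION_WORDS = {
    "English": {"the", "and", "is", "of", "to", "it"},
    "German": {"der", "die", "das", "und", "ist", "nicht", "dem"},
    "French": {"le", "la", "les", "et", "est", "dans", "une"},
}

def identify_language(text, profiles=FUNCTION_WORDS):
    """Return the language whose function words occur most often, or None."""
    # Split the input into lowercase alphabetic words.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        language: sum(word in vocabulary for word in words)
        for language, vocabulary in profiles.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

In this sketch, adding a language amounts to passing a larger profiles dictionary, which loosely corresponds to the extensibility that the training tool provides in the product.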

The Text Analysis tools are designed in such a way that the output from one tool can be used as input to another. This allows you to create powerful toolsets to satisfy your requirements.
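
To make the bottom-up idea behind hierarchical clustering concrete, the following minimal Python sketch repeatedly merges the two most similar groups of documents until no pair is similar enough. It is a generic agglomerative procedure, not the algorithm shipped with the product: the word-overlap (Jaccard) similarity, the single-linkage rule, the threshold value, and all function names are assumptions chosen for brevity.

```python
def tokens(document):
    """Represent a document as the set of its lowercase words."""
    return set(document.lower().split())

def jaccard(a, b):
    """Word-overlap similarity between two word sets (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_bottom_up(documents, threshold=0.3):
    """Merge the two most similar clusters until none is similar enough."""
    words = [tokens(d) for d in documents]
    clusters = [[i] for i in range(len(documents))]  # start with singletons

    def linkage(c1, c2):
        # Single linkage: similarity of the closest pair of members.
        return max(jaccard(words[i], words[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        pairs = [
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        ]
        best, a, b = max(pairs)
        if best < threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

Varying the threshold changes how many groups come out, which is one small example of the parameter experimentation recommended in the Clustering description.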



Text Search Engine

The Text Search Engine is an advanced search engine that is able to perform in-depth document analysis during indexing. It allows for sophisticated query enhancement and result preparation in order to supply high-quality information retrieval. The most important components are client/server handling, linguistic support for different languages, and document analysis algorithms. In addition, the Text Search Engine features an online update mechanism that allows you to search while the index is being updated. As soon as the update is complete, the newly indexed documents are available for search.

The Text Search Engine provides two user exits that enable you to:

  • Access document repositories or library systems to get documents for indexing or preprocessing before indexing. This allows you to integrate the Text Search Engine with any document management system. The product supports various document source formats including plain text and text with HTML markup.

  • Convert input formats that are not explicitly supported, by means of conversion or filtering tools that can be obtained from third parties.

The Text Search Engine supports the new Euro code pages and features a broad range of functions accessible through published programming interfaces. Administration can also be performed through command line functions.

The functions available for application programming depend strongly on the natural language of the documents. The functions described below are therefore grouped into three language categories: single-byte character set (SBCS), bi-directional character set (BIDI), and double-byte character set (DBCS) languages.

Functions Supported for SBCS languages: For the following 19 single-byte character set languages -- Brazilian Portuguese, Canadian French, Catalan, Danish, Dutch, Finnish, French, German, Icelandic, Italian, Norwegian Bokmal, Norwegian Nynorsk, Portuguese, Russian, Spanish, Swedish, Swiss German, U.K. English, and U.S. English -- the Text Search Engine features:

  • Multilingual morphological analysis and lemmatization.

  • Advanced relevance ranking.

  • Boolean queries allowing for phrase and proximity searches as well as for front-, middle-, and end-masking using wildcards and nested sub-queries.

  • Free-text queries based on probabilistic logic.

  • IBM's advanced hybrid query that enables mixing Boolean terms with free-text queries.

  • Sophisticated lexical affinities-based ranking for free-text and hybrid queries.

  • Fuzzy searches using an n-gram index.

  • Thesaurus support providing query expansion through a given thesaurus as well as construction of a user-defined thesaurus.

  • Section support for query restriction to certain sections of a document, such as a title or author field.

  • Match information sufficient to develop viewers with highlighting capabilities.

  • Java sample graphical user interface (GUI): a user-friendly sample GUI in English that is written in Java. It runs as an applet on any client machine that hosts a Java-capable browser, as described in the Software Requirements section, and gives you easy access to the search engine from anywhere on the Internet or your intranet. The sample GUI includes a visualization of clustered result lists, which is shipped as compiled Java. All other components of the sample GUI are available as source files with their makefiles; they offer most of the Text Search Engine capabilities, are based on its C-language application programming interface (API), and can easily be modified for your needs.

  • Java Beans samples, based on the Java Beans component model architecture, are offered as two adaptable sample GUIs in English for:
    1. Search functions
    2. Administrative functions
    The samples, provided as source files, show how to develop reusable components for search and administrative tasks in a flexible way and how to combine them into attractive and useful GUIs.

    The sample administration GUI can be used as a standalone Java application on a TCP/IP-connected workstation. You can use the administration GUI to perform some of the Text Search Engine administration functions, such as creating an index, and monitoring and changing the status of an index.
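
The masking options listed above for Boolean queries can be pictured with a small Python sketch that matches a masked term against the terms of an index. The announcement does not give the engine's wildcard syntax, so the use of "*" for any run of characters, the name match_masked, and the sample vocabulary are assumptions; front-, middle-, and end-masking are taken here to mean a wildcard at the start, in the middle, or at the end of the term.

```python
from fnmatch import fnmatchcase

def match_masked(pattern, vocabulary):
    """Return the index terms matching a masked term ('*' = any characters)."""
    wanted = pattern.lower()
    # fnmatchcase also gives '?' and '[...]' a special meaning; only '*' is used here.
    return sorted(term for term in vocabulary if fnmatchcase(term.lower(), wanted))

# Hypothetical index vocabulary used for the examples below.
VOCABULARY = {"color", "colour", "colorful", "information", "formation", "inform"}

end_masked = match_masked("inform*", VOCABULARY)        # wildcard at the end
front_masked = match_masked("*formation", VOCABULARY)   # wildcard at the front
middle_masked = match_masked("col*r", VOCABULARY)       # wildcard in the middle
```

In a Boolean query, each masked term would typically be replaced by the OR of the terms it matches before the query is evaluated against the index.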

In addition, the following features are available for English documents:
  • Clustering of the result list, which eases the user's comprehension of search results.

  • Query refinement methods based on user-assigned relevance.

  • Feature index. Recognized features like proper names, locations, or terms may be used to obtain higher precision of queries. This index allows a search for documents about, for example, President George Washington. The result will not include documents that contain information about the city of Washington, DC.

Functions Supported for BIDI Languages: For the following two BIDI languages -- Arabic and Hebrew -- the Text Search Engine supports:
  • Multilingual morphological analysis and lemmatization
  • Advanced relevance ranking
  • Boolean queries
  • Free-text queries based on probabilistic logic
  • Hybrid queries

The document source format must be in logical format.

Functions Supported for DBCS Languages: For the following four double-byte character set languages -- Japanese, Korean, Simplified Chinese, and Traditional Chinese -- the Text Search Engine supports the following functions through an n-gram index:

  • Boolean queries

  • Precise term search

  • Fuzzy search

  • Thesaurus support providing query expansion through a given thesaurus as well as construction of a user-defined thesaurus

  • Match information

Scope of Text Search Engine Usage: Beyond developing enterprise applications, the Text Search Engine can also be used to build a global Internet search service or a centralized intranet search service in support of your e-business initiative. The Text Search Engine provides functions to optimally handle the large amounts of information that are typically stored on Web sites.

The Text Search Engine offers a choice of base index types: precise, linguistic, and n-gram. The types differ with respect to:

  • Indexing speed
  • Size of index produced
  • Complexity of the queries the end user can perform
  • Target languages the documents are written in

The trade-offs should be considered during application development.

The Text Analysis tools and the Text Search Engine can be combined in applications to satisfy end-user requirements in the knowledge management area.



Web Crawler Package

A Web crawler is a robot that starts at one or more Web sites and follows selected HTML links, which you must define in a customization step prior to execution. In addition to defining the domain you want to crawl, including the types and number of levels of HTML links, you must specify additional parameters, such as selection criteria for objects to be found on the Web and the directory in a file system where you want the crawler to store the retrieved objects. The Web crawler can retrieve objects of any content type and language, such as HTML, text, images, audio, or video, and stores them in the defined directory for further processing. For example, an indexer can use HTML and other text documents to build an index of the documents. After processing, it is the user's responsibility to delete the retrieved objects.
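
The traversal described above, limited to a defined domain and a number of link levels, can be sketched as a breadth-first walk. The Python example below is a conceptual illustration, not the IBM Web Crawler or its toolkit API: the function name crawl, its parameters, and the in-memory links map that stands in for fetching pages and parsing their HTML links are all assumptions.

```python
from collections import deque
from urllib.parse import urlparse

def crawl(start_urls, links, allowed_hosts, max_levels):
    """Breadth-first crawl limited by host and by link depth.

    `links` is a hypothetical in-memory map {url: [linked urls]} that stands
    in for retrieving a page and extracting the HTML links it contains.
    """
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    visited = []
    while queue:
        url, level = queue.popleft()
        visited.append(url)  # a real crawler would store the object here
        if level == max_levels:
            continue  # do not follow links beyond the configured depth
        for target in links.get(url, []):
            in_domain = urlparse(target).hostname in allowed_hosts
            if in_domain and target not in seen:
                seen.add(target)
                queue.append((target, level + 1))
    return visited
```

Keeping a set of already seen URLs prevents the crawl from looping when pages link back to each other, and the level counter implements the limit on the number of link levels.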

The Web Crawler toolkit provided with Intelligent Miner for Text allows you to develop Web crawlers according to your needs.

A ready-to-run implementation, simply called IBM Web Crawler, is included with the product. The Web Crawler:

  • Can run on a single machine and can spawn off a user-specified number of crawler copies that run in parallel.

  • Allows individual crawl results, consisting of data objects and their metadata, to be shared for subsequent processing. The data objects are stored as flat files, whereas the metadata (for example, the URL, size, and last modification date of each data object) and crawler-specific control data are stored in DB2.

  • Allows for controlled restart due to the persistent and safe storage of the metadata in DB2.

  • Provides SOCKS support and, as such, is able to crawl the Web from inside a firewall.

  • Provides a UNIX (R) command line interface.

  • Monitors Web-page activities and changes.

NetQuestion Solution: The NetQuestion Solution is a powerful Internet/intranet text-search solution based on the Text Search Engine and Web Crawler. Although it is a drop-in solution, it allows you enough flexibility to meet specific needs. It provides easy-to-use installation and configuration, and selection of objects located on a local Web server or found on a multiserver domain by means of the Web Crawler.

After installation, a simple configuration step must be completed, and the solution is ready to run. You specify the Web space -- the portion of the Internet or intranet you wish to search. The IBM Web Crawler gathers the pages that will be searched, and the IBM Text Search Engine is able to search the pages after indexing them.

Part of the NetQuestion Solution is a search form and an associated CGI script that allow you to easily define your queries through a Web browser. The search form and the CGI script are also provided as sample C and HTML source code so that you can modify the input definition of your queries and the presentation of the results. You can exploit the full functionality of the Text Search Engine by implementing all functions of the published API. In addition, you can also perform most of the search engine's administration functions through forms, CGI scripts, and a Web browser.

In some circumstances, the NetQuestion Solution even allows you to detect misspellings in documents and expand your search request accordingly. For example, if one of the occurrences of "Toyota" is misspelled as "Toyotta" in a document and someone later tries to search for "Toyota", the solution automatically adds "Toyotta" to the query.
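
The announcement does not describe how this expansion is implemented. One plausible way to picture it, in keeping with the n-gram based fuzzy search mentioned for the Text Search Engine, is to compare the query term with the terms that occur in the index and add those that share most of their character pairs. In the Python sketch below, the bigram similarity measure, the threshold of 0.7, and the names bigrams, similarity, and expand_query are all assumptions.

```python
def bigrams(term):
    """Return the set of adjacent character pairs in a lowercase term."""
    t = term.lower()
    return {t[i:i + 2] for i in range(len(t) - 1)}

def similarity(a, b):
    """Jaccard overlap of the two terms' bigram sets (0.0 to 1.0)."""
    ga, gb = bigrams(a), bigrams(b)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

def expand_query(term, vocabulary, threshold=0.7):
    """Return the query term followed by similar index terms, sorted."""
    variants = sorted(
        word for word in vocabulary
        if word.lower() != term.lower() and similarity(term, word) >= threshold
    )
    return [term] + variants
```

With an index vocabulary that contains both "Toyota" and "Toyotta", expanding "Toyota" yields both spellings, so the expanded query can be run as an OR over the two terms.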

The NetQuestion Solution and associated components provide key technologies to build intelligent Internet or intranet Web sites. They allow you to leverage the use of the Internet and intranets to gain access to relevant information and support your e-business initiative. For those users wishing to tailor this solution, a full range of settings can be configured.



Year 2000

This product is Year 2000 ready. When used in accordance with its associated documentation, it is capable of correctly processing, providing, and/or receiving date data within and between the twentieth and twenty-first centuries, provided that all products (for example, hardware, software, and firmware) used with the product properly exchange accurate date data with it.

The service end date for this Year 2000 ready product is January 31, 2001.



PRODUCT POSITIONING

With Intelligent Miner for Text, IBM extends your technology assets based on the full range of business intelligence solutions available to you, including DB2, Intelligent Miner for Data, and KnowledgeX.

Similar to data mining, text mining discovers patterns in document collections and other unstructured information. The Text Search Engine is not a typical search tool. Together with the Text Analysis tools, it performs categorization and clustering that are necessary when processing large volumes of information.

With the Web Crawler as a complement, you can develop comprehensive solutions to support your e-business initiative.

For more information about the product, refer to the Web page at:



REFERENCE INFORMATION

  • Software Announcement 298-450 dated December 8, 1998 (IBM Intelligent Miner for Text for OS/390 Version 2 Release 3 Enables You to Extract Key Information Efficiently from Large Quantities of Text)

  • Software Announcement 298-379 dated September 22, 1998 (IBM KnowledgeX for Workgroup Edition Version 5.2)

  • Software Announcement 298-349 dated September 22, 1998 (IBM Intelligent Miner for Data Version 2.1.2 and IBM Intelligent Miner for Data for RS/6000 (TM) SP (TM) Version 2.1.2 Availability)

Trademarks
      Intelligent Miner, RS/6000, and SP are trademarks of
      International Business Machines Corporation in the United
      States or other countries or both.
      AIX, OS/390, and DB2 are registered trademarks of International
      Business Machines Corporation in the United States or other
      countries or both.
      Windows NT is a registered trademark of Microsoft Corporation.
      Java is a trademark of Sun Microsystems, Inc.
      UNIX is a registered trademark in the United States and other
      countries exclusively through X/Open Company Limited.
      Other company, product, and service names may be trademarks or
      service marks of others.



SUPPLEMENTAL INFORMATION



RELATED SOLUTIONS

The following solutions from IBM Global Business Intelligence Solutions (GBIS), together with services, can be purchased from IBM.

  • IBM Customer Relationship Intelligence (CRI) PRPQ Order Number 5799-A34

    CRI is a cross-industry text mining solution to manage customer-related information of all kinds (for example, complaint letters, opinion surveys, or requests for information) over e-mail or normal mail. It also handles customer calls to call centers. The solution is able to take individual documents, such as letters, messages, or calls, analyze them linguistically, and cluster them into pre-defined or automatically-defined homogeneous groups.

    CRI allows companies to glean valuable customer information from huge quantities of these documents. By analyzing, organizing, and classifying the information contained in these documents, CRI can reveal new insight into customer relationship management and market trends.

  • IBM Text Knowledge Miner (TKM) PRPQ Order Number 5799-A33

    The TKM provides large- or medium-size enterprises with a focal point for storing and accessing all text information, whatever its source is. The server can then be used by executives, managers, and employees to obtain and update their knowledge of the business environment and anticipate changes. Typical information sources are: newspapers, wire services, trade press reports, scientific publications, or patent databases.

    TKM is powerful enough to analyze tens of thousands of complex, text-based documents in a single iteration. It analyzes and clusters textual data to help companies discover new subject relationships, extract important concepts, and stimulate new ideas.

The above GBIS solutions use Intelligent Miner (TM) for Text as the underlying search and mining technology. For more information, contact:
    IBM Corporation
    GBIS Solutions
    Dr. Michael Hehenberger
    Internet: hehenbem@us.ibm.com

For more information, contact your IBM representative or visit IBM GBIS' home page at:



EDUCATION SUPPORT

An Intelligent Miner for Text Application Programming Workshop (class number DW56), provided by IBM Education and Training, will be available after planned general availability.

Visit the following Web site for additional information:

Descriptions of all classroom and self-study courses are contained in the Catalog of IBM Education and Training.

Call IBM Education and Training at 800-IBM-TEACH (426-8322) for catalogs, schedules, and enrollments.



DEMONSTRATIONS

You can find live solution demonstrations on the Web at:



PUBLICATIONS

The following publications can be ordered after planned availability. To order, contact your IBM representative.

                                                          Order
Title                                                     Number

Intelligent Miner for Text*
  Fact Sheet                                              GC26-9294
  Getting Started                                         SH12-6302
  Text Analysis Tools                                     SH12-6370
  Text Search Engine: Customization and Administration    SH12-6365
  Text Search Engine: Programming Interfaces              SH12-6362
  NetQuestion Solution                                    SH12-6368
  Web Crawler                                             SH12-6371

The following publications will be shipped with Intelligent Miner for Text:

Title                                                      Format

Intelligent Miner for Text*
  Getting Started (SH12-6302 U.S. English)                 Hardcopy
  Getting Started (U.S. English and additional languages)  HTML, PDF
  Text Analysis Tools                                      HTML, PDF
  Text Search Engine: Customization and Administration     HTML, PDF
  Text Search Engine: Programming Interfaces               HTML, PDF
  NetQuestion Solution                                     HTML, PDF
  Web Crawler                                              HTML, PDF

*     All available platforms, AIX (R), Sun Solaris, Windows NT (R),
      and OS/390 (R), are covered by the same publications.

The Intelligent Miner for Text Fact Sheet can be displayed and printed through the Web with an HTML browser from the following URL:

Displayable Softcopy Publications: The displayable manuals listed above with the formats HTML and PDF are part of the basic machine-readable material. They can be displayed with an HTML browser or with Adobe Acrobat Reader, respectively.

Full-text online documentation search is provided for all publications provided in HTML.

The displayable manuals can also be used with an HTML browser or with Adobe Acrobat Reader, respectively, to create unmodified printed copies. Terms and conditions for use of the machine-readable files are shipped with the files.

Intelligent Miner for Text Trial Version

Intelligent Miner for Text can be ordered after planned availability as a full-product, limited-time trial version as one Media Pack containing CD-ROMs for the workstation platforms AIX, Sun Solaris, and Windows NT at no charge. To order, contact your IBM representative.

                                                           Order
Title                                                      Number

Intelligent Miner for Text                                 GK2T-0167
  Version 2.3 60 Day Trial License

The Intelligent Miner for Text servers will cease operation sixty (60) days after installation. To maintain ongoing operation, a full license program package has to be purchased, and the supplied license key must be installed by executing one of the provided license key programs on AIX, Sun Solaris, or Windows NT.



TECHNICAL INFORMATION



Specified Operating Environment

Hardware Requirements: The Intelligent Miner for Text servers (Text Analysis tools, Text Search Engine, and Web Crawler), as well as the Text Search Engine client and the Java (TM) sample GUI, are designed to run on the AIX and Sun Solaris operating systems and on the Windows NT operating environment, supporting the following hardware:

  • On AIX

    A processor of the RS/6000 (TM) System family, including SMP-processor systems, with a minimum of 64 MB random access memory (RAM). Intelligent Miner for Text does not exploit multiple nodes concurrently within POWERparallel (R) Systems (RS/6000 SP (TM)).

  • On Sun Solaris

    A processor of the Sun SPARC System family, including SMP-processor systems, with a minimum of 64 MB RAM.

  • On Windows NT

    A Pentium (TM)-based processor, including SMP-processor systems, with a minimum of 64 MB RAM.

Additional RAM may be needed based on the size of data stored in memory and execution speed required.

The following minimum disk space is required and valid for all supported platforms:

.-------+----------------------------------------------------.
|       |          Intelligent Miner for Text                |
|       |                                                    |
| Disk  |Text     |Text   |TSE    |Web     |Net-     |Online |
| Space |Analysis |Search |Java   |Crawler |question |Docu-  |
| in MB |Tools    |Engine |sample |Package |Solution |ment-  |
|       |         |       |GUI    |        |         |ation  |
|-------+---------+-------+-------+--------+---------+-------|
|Server |  40     |   65  |   10  |   10   |    5    | 100   |
|       |         |       |       |        |         |       |
|Client |  -      |   55  |   10  |   -    |    -    |0 - 10 |
'-------+---------+-------+-------+--------+---------+-------'

Additional disk space needed depends on the amount of data processed per run and developed Intelligent Miner for Text applications.
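
As a worked example of reading the table, the short Python sketch below adds up the server-side minimums for a chosen set of components. The figures are transcribed from the Server row above; treating the per-component values as additive, and the names SERVER_MB and minimum_disk_mb, are assumptions made for this illustration.

```python
# Minimum server disk space in MB, transcribed from the Server row above.
SERVER_MB = {
    "Text Analysis Tools": 40,
    "Text Search Engine": 65,
    "TSE Java sample GUI": 10,
    "Web Crawler Package": 10,
    "NetQuestion Solution": 5,
    "Online Documentation": 100,
}

def minimum_disk_mb(components, table=SERVER_MB):
    """Sum the table's minimums for the named components (assumed additive)."""
    return sum(table[name] for name in components)

# All server components together: 40 + 65 + 10 + 10 + 5 + 100 MB.
full_server_install = minimum_disk_mb(SERVER_MB)
```

Under that assumption, a full server installation needs at least 230 MB before any space for processed data, indexes, or applications is counted.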

Software Requirements: The following software is required to execute Intelligent Miner for Text V2.3:

  • On AIX
    • AIX V4.3
  • On Sun Solaris
    • Sun Solaris V2.5.1
  • On Windows NT
    • Windows NT V4.0 plus Service Pack 3
  • TCP/IP, available with the operating systems, for communication
The Web Crawler requires DB2 (R) UDB Enterprise Edition which is provided with the product (refer to the Terms and Conditions section).

To display or print the online documentation, Adobe Acrobat Reader or any Web browser supporting HTML V2.0 or higher is needed. For online documentation search, an HTML browser is required.

At the present time, you can download the Adobe Acrobat Reader at no charge from the Web at:

Text Search Engine Client/Server Combinations

Applications using the Text Search Engine (TSE) API invoke the TSE client. The TSE client must be installed on the machine where the TSE server is installed, and it can also be installed on any of the client workstations described below. The Java sample GUI is a set of TSE applications. The following software is required:

  • On AIX
.-------------------------------------+--------+------------.
| Software                            | TSE    | TSE Java   |
|                                     | client | sample GUI |
|-------------------------------------+--------+------------|
| AIX V4.3                            |   x    |     x      |
| A C/C++ compiler, such as IBM       |   x (1)|            |
|  CSet++ V3.1.4                      |        |            |
| JDK V1.1.4 (3)                      |        |     x (1)  |
| Java Runtime Environment            |        |     x (2)  |
|  (JRE, included in JDK)             |        |            |
| Any Java V1.1-capable HTML          |        |     x      |
|  browser (3) like                   |        |            |
|  HotJava(tm) V1.0 (3), (4)          |        |            |
| A visual builder for Java Beans,    |        |     x (1)  |
|  such as                            |        |            |
|   IBM VisualAge(rm) for Java or     |        |            |
|   BDK from Sun                      |        |            |
| An HTTP server, such as Apache or   |        |     x (5)  |
|  Lotus(rm) Domino(tm) Go Webserver  |        |            |
|  for AIX                            |        |            |
'-------------------------------------+--------+------------'

  • On Sun Solaris
.--------------------------------------+-------+------------.
| Software                             |TSE    | TSE Java   |
|                                      |client | sample GUI |
|--------------------------------------+-------+------------|
| Sun Solaris V2.5.1                   |  x    |     x      |
| A C/C++ compiler, such as Sun's      |  x (1)|            |
|  Workshop C/C++ Compiler V4.2        |       |            |
| JDK V1.1.6 including native          |       |     x (1)  |
|  Thread Patch (3)                    |       |            |
| Java Runtime Environment (JRE,       |       |     x (2)  |
|  included in JDK)                    |       |            |
| An HTML browser, such as             |       |     x      |
|  Netscape Navigator V4 (3)           |       |            |
|  with the Java Plug-in V1.1.1 (3), or|       |            |
|  any Java V1.1-capable browser (3)   |       |            |
|  like HotJava V1.0 (3), (4)          |       |            |
| A visual builder for Java Beans,     |       |     x (1)  |
|  such as Java Studio or              |       |            |
|  Sun's Beans Development Kit         |       |            |
| An HTTP server, such as Apache or    |       |     x (5)  |
|  Lotus Domino Go Webserver for       |       |            |
|  Sun Solaris                         |       |            |
'--------------------------------------+-------+------------'

  • On Windows NT
.----------------------------------------+-------+-----------.
| Software                               |TSE    |TSE Java   |
|                                        |client |sample GUI |
|----------------------------------------+-------+-----------|
| Windows NT V4.0 plus Service Pack 3    |  x    |    x      |
| A C/C++ compiler, such as              |  x (1)|           |
|  Microsoft(tm) Visual C++ V5.0         |       |           |
| JDK V1.1.6 (3)                         |       |    x (1)  |
| Java Runtime Environment (JRE,         |       |    x (2)  |
|  included in JDK)                      |       |           |
| An HTML browser, such as               |       |    x      |
|  Microsoft Internet Explorer V3.0 (3)  |       |           |
|  or Netscape Navigator V4 (3)          |       |           |
|  both with the Java Plug-in V1.1.1 (3),|       |           |
|  or any Java V1.1-capable browser (3)  |       |           |
|  like HotJava V1.0 (3), (4)            |       |           |
| A visual builder for Java Beans, such  |       |    x (1)  |
|  as IBM VisualAge for Java             |       |           |
| An HTTP server, such as Apache or      |       |    x (5)  |
|  Lotus Domino Go Webserver for         |       |           |
|  Windows NT                            |       |           |
'----------------------------------------+-------+-----------'

(1) Optional, only required for application development
(2) Optional, only required to run standalone Java applications
(3) Or higher
(4) At the present time, you can download HotJava, presently at no charge, from the Web at:
(5) Optional

NetQuestion Solution Client/Server Combinations

The NetQuestion Solution requires an HTTP server, such as Apache or Lotus Domino Go Webserver, installed on the TSE server and an HTML browser, such as:

  • Netscape Navigator V4 for AIX
  • Netscape Navigator V4 for Sun Solaris/SPARC
  • Microsoft Internet Explorer V4 for Windows NT V4.0
  • Netscape Navigator V4 for Windows NT
running on any possible TCP/IP-connected workstation.



Planning Information

Customer Responsibilities: The Intelligent Miner for Text product license includes an entitlement for one server install with one processor. If additional servers or additional processors on one server are used, the customer has to purchase additional entitlements.

The use of DB2 UDB, included in the Intelligent Miner for Text product package, is restricted.

Refer to the Usage Restriction in the Terms and Conditions section for additional information.

Direct Customer Support

  • For AIX and Sun Solaris:

    Direct customer support is provided by the AIX Support Line. This fee service enhances customers' productivity by providing voice and electronic access into the IBM support organization. The AIX Support Line will help answer questions pertaining to usage, "how to", and suspected software defects for eligible products.

    For support charges, additional information on the AIX Support Line, and other available services, call the IBM Support Family of Services Sales Office at 800-IBM-4YOU (800-426-4968) and ask for the IBM Support Family.

  • For Windows NT:

    Direct customer support is provided by the PC Support Line. This fee service enhances customers' productivity by providing voice and electronic access into the IBM support organization. The PC Support Line will help answer questions pertaining to usage, "how to", and suspected software defects for eligible products.

    For support charges, additional information on the PC Support Line, and other available services, call the IBM Support Family of Services Sales Office at 800-IBM-4YOU (800-426-4968) and ask for the IBM Support Family.

  • For all three platforms:

    To obtain information on customer eligibility and registration procedures, refer to HONE SUPPORTINFO using the search word TEXTMINE provided by the support center.

Eligible customers can obtain installation and usage assistance from the IBM Global Business Intelligence Solutions (GBIS) in Paris, France. For information, refer to the GBIS home page:

To obtain information on customer eligibility and registration procedures, refer to HONE SUPPORTINFO using the search word TEXTMINE.

Packaging: The Intelligent Miner for Text V2.3 program package contains:

  • IBM International Program License Agreement in multilanguage booklet

  • License Information in multilanguage booklet

  • Proof of Entitlement for one server with one processor

  • Service contacts

  • Intelligent Miner for Text:
    • Server CD-ROM for AIX, including the TSE client for AIX and online documentation

    • Server CD-ROM for Sun Solaris, including the TSE client for Sun Solaris and online documentation

    • Server CD-ROM for Windows NT, including the TSE client for Windows NT and online documentation

    • CD-ROM with license keys for the above servers

    • Getting Started hardcopy documentation
  • DB2 Universal Database (R) (UDB) Enterprise Edition, Version 5.2:
    • CD-ROM for AIX

    • CD-ROM for Sun Solaris

    • DB2 UDB for UNIX (R) Quick Beginnings hardcopy document

    • CD-ROM for Windows NT

    • DB2 UDB for Windows NT Quick Beginnings hardcopy document

The limited-time trial version of the Intelligent Miner for Text V2.3 Media Pack contains the same Intelligent Miner for Text server CD-ROMs as described above, but without the CD-ROM containing the license keys.

A trial version of DB2 UDB can be obtained through the Web at:

Note: Most of the Intelligent Miner for Text code will be delivered as object code only. Header source files and libraries required for programming against the API, as well as sample command files, are also included on the media. In addition, Java sample source code and sample Java Beans for developing GUIs are provided.



Security, Auditability, and Control

Intelligent Miner for Text uses the security and auditability features of the AIX, Sun Solaris, and Windows NT operating systems, of the network systems used, and of DB2 UDB.

The customer is responsible for evaluation, selection, and implementation of security features, administrative procedures, and appropriate controls in application systems and communication facilities.



ORDERING INFORMATION

The Intelligent Miner for Text V2.3 license includes a CD-ROM for the AIX operating system, a CD-ROM for the Sun Solaris operating system, a CD-ROM for the Windows NT operating environment, and a CD-ROM with the respective license key programs.

The Intelligent Miner for Text is a software development toolkit consisting of three server components. One component, Text Search Engine, provides a client. The client is packaged with the toolkit and may be replicated freely, at no additional charge, for use with the Text Search Engine server component.

On all supported platforms, the toolkits run on uniprocessor and symmetric multiprocessor (SMP) systems. These systems can also be single nodes of multiple parallel processor (MPP) systems, but the toolkits do not exploit multiple nodes in parallel.

Intelligent Miner for Text can be ordered only as a whole package and has two charge units (refer to the Usage Restriction in the Terms and Conditions section):

  • Server (workstation) install with one processor

  • Additional SMP processors in the quantities of 1, 5, and 10, or multiples of these quantities (Note that Windows NT V4.0 can support a maximum of four SMP processors only.)

                                 Order        Feature     Part
Program Name/Description         Type         Number      Number

Intelligent Miner for Text Version 2.3

Program Package (1               5801-AAR     5050        11L3691
 Processor License)

Use Packs

Entitlement to run on a          5807-AAR     0002        22L3455
 machine with 1 Additional
 SMP Processor
Entitlement to run on a          5807-AAR     1286        22L3456
 machine with 5 Additional
 SMP Processors
Entitlement to run on a          5807-AAR     0004        22L3457
 machine with 10 Additional
 SMP Processors



TERMS AND CONDITIONS

Licensing: IBM International Program License Agreement. Proofs of Entitlement (PoE) are required for all authorized use.

Limited Warranty Applies: Yes

Program Services: Available until January 31, 2001

Money-Back Guarantee: Two-month, money-back guarantee

Copy and Use on Home/Portable Computer: No

Usage Restriction: Yes

Additional 1 Processor Server Install Use Authorization

The Intelligent Miner for Text license is restricted to one server install with one processor. If additional servers are used, an entitlement for each additional server has to be purchased.

Additional SMP-Processors Use Authorization

The Intelligent Miner for Text license is restricted to one server install with one processor. If additional processors are used within one server (SMP processor server), an entitlement for each additional processor has to be purchased.

DB2 Universal Database Use Authorization

The Intelligent Miner for Text (Program) also includes DB2 Universal Database components. You may only use these components in association with your licensed use of the Program. For each valid Intelligent Miner for Text PoE, you are entitled to install, from the media provided with the Program, the DB2 Universal Database components required to support Intelligent Miner for Text, regardless of platform and number of processors. For product use other than Intelligent Miner for Text, you must purchase a full version of DB2 Universal Database.

Support Line

  • Personal Systems
  • AIX

Upgrades: Customers can acquire upgrades up to the currently authorized level of use of the qualifying programs.

Volume Orders: Yes, contact your IBM representative

Passport Advantage Applies: Yes

AIX/UNIX Upgrade Protection applies: No

Entitled Upgrade for Current AIX/UNIX Upgrade Protection Licensees: No

Variable Charges Apply: No

Educational Allowance Available: No



CHARGES

The charges provided in this announcement are suggested retail prices for the U.S. only and are provided for your information only. Dealer prices may vary, and prices may also vary by country. Prices are subject to change without notice. For additional information and current prices, contact your local IBM representative.

Intelligent Miner for Text Version 2.3

                                                             Monthly
                                                             Support
                                 Part         One-Time       Line
Program/Description              Number       Charge         Charge

Program Package                  11L3691      $30,000
 (1 Processor License)

Monthly Optional PS                                          $234
 Support Line Charge
Monthly Optional AIX                                          242
 Support Line Charge

Use Packs

Entitlement to run on            22L3455       10,000
 a machine with 1 Additional
 SMP Processor
Entitlement to run on            22L3456       40,000
 a machine with 5 Additional
 SMP Processors
Entitlement to run on            22L3457       60,000
 a machine with 10 Additional
 SMP Processors



Examples

  • For an SMP machine with 8 processors running AIX, the customer would pay:
    • Program Package (1 Processor License), and
    • One entitlement for 5 Additional Processors, and
    • Two entitlements for 1 Additional Processor
  • To upgrade from an SMP machine with 8 processors to 12 processors running AIX, the customer would pay:
    • Four entitlements for 1 Additional Processor, or

    • One entitlement for 5 Additional Processors, whichever is cheaper
  • To operate Intelligent Miner for Text on two one-processor systems, one running AIX and one running Windows NT, the customer would pay:
    • Two Program Packages
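
The arithmetic behind the examples above can be sketched in a few lines of Python. This is an illustration only, not an IBM pricing tool: the function and variable names are hypothetical, the figures are the suggested U.S. retail prices from the Charges section, and the sketch assumes that buying entitlements for more processors than needed is acceptable (as the 5-processor alternative in the upgrade example suggests). It ignores Support Line charges and the Windows NT limit of four SMP processors.

```python
# Suggested U.S. retail prices copied from the Charges section (assumed current).
PROGRAM_PACKAGE = 30_000                          # one server install, one processor
USE_PACKS = {1: 10_000, 5: 40_000, 10: 60_000}    # additional SMP processors -> price


def cheapest_use_packs(additional):
    """Return (cost, packs) for the cheapest Use Pack mix that covers at
    least `additional` extra processors; packs maps pack size -> count."""
    if additional <= 0:
        return 0, {}
    # Covering more than `additional` plus the largest pack is never cheaper.
    limit = additional + max(USE_PACKS)
    # best[n] holds the cheapest (cost, packs) that entitles exactly n processors.
    best = [(0, {})] + [None] * limit
    for n in range(1, limit + 1):
        for size, price in USE_PACKS.items():
            if size <= n and best[n - size] is not None:
                cost, packs = best[n - size]
                candidate = (cost + price, {**packs, size: packs.get(size, 0) + 1})
                if best[n] is None or candidate[0] < best[n][0]:
                    best[n] = candidate
    # Any entitlement of `additional` or more processors is acceptable.
    return min(best[additional:], key=lambda entry: entry[0])


def total_charge(processors):
    """One-time charge for a single server with `processors` processors."""
    cost, packs = cheapest_use_packs(processors - 1)
    return PROGRAM_PACKAGE + cost, packs


if __name__ == "__main__":
    print(total_charge(8))          # new 8-processor SMP machine
    print(cheapest_use_packs(4))    # upgrade from 8 to 12 processors
```

For the 8-processor machine the sketch selects one 5-processor pack and two 1-processor packs, for a one-time charge of $90,000 including the Program Package. For the upgrade from 8 to 12 processors, four 1-processor packs and one 5-processor pack both come to $40,000 at these prices.
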
Note: For Passport Advantage ordering information and charges, contact your IBM Lotus representative or authorized IBM Lotus Business Partner. Additional information is also available on the Passport Advantage URL:



CALL NOW TO ORDER

To order, contact the IBM North America Sales Centers, your local IBM representative, or your IBM Business Partner.

IBM North America Sales Centers, our national direct marketing organization, can add your name to the mailing list for catalogs of IBM products.

 Phone:     800-IBM-CALL
 Fax:       800-2IBM-FAX
 Internet:  ibm_direct@vnet.ibm.com
 Mail:      IBM North America Sales Centers
            Dept. YE010
            P.O. Box 2690
            Atlanta, GA  30301-2690
 Reference: YE010

To identify your local IBM Business Partner or IBM representative, call 800-IBM-4YOU.

Note: Shipments will begin after the planned availability date.

Trademarks

      Intelligent Miner, RS/6000, and SP are trademarks of
      International Business Machines Corporation in the United
      States or other countries or both.
      AIX, OS/390, POWERparallel, DB2, VisualAge, and DB2 Universal
      Database are registered trademarks of International Business
      Machines Corporation in the United States or other countries or
      both.
      Pentium is a trademark of Intel Corporation.
      Microsoft is a trademark of Microsoft Corporation.
      Windows NT is a registered trademark of Microsoft Corporation.
      Java and HotJava are trademarks of Sun Microsystems, Inc.
      UNIX is a registered trademark in the United States and other
      countries exclusively through X/Open Company Limited.
      Domino is a trademark of Lotus Development Corporation.
      Lotus is a registered trademark of Lotus Development
      Corporation.
      Other company, product, and service names may be trademarks or
      service marks of others.