IBM SPSS Analytic Server 1.0 Release Notes


IBM SPSS Analytic Server is a solution for big data analytics that combines IBM SPSS technology with big data systems and allows you to work with familiar IBM SPSS user interfaces to solve problems on a previously unattainable scale.

System requirements and installation

For information on system requirements and installation, see the installation documents.

For information on downloading the product, including the interim fix, see the Download Document.

For information on using the product, see the Information Center.

Known issues for Analytic Server

At the time of publication, the following issues were known.

Problem: Some streams that have a DB2z data source may fail due to an incompatibility with Oracle JRE.
Solution: Use IBM JRE to run Analytic Server 1.0.

Problem: MIT Kerberos v5-1.11.1 does not work with IBM J9 VM SR12.
Solution: An issue has been filed with IBM Java support as PMR 93184,001,866 and APAR IV40109. There are two workarounds:
1. Use Oracle 1.6 JRE to run Analytic Server 1.0.
2. Use an MIT Kerberos 5-1.10.3 server.

Problem: A data source based on an Excel file with a large number of columns can cause out-of-memory errors.
Solution: Save the Excel file as a delimited file and define the data source on the delimited file.

Problem: Kerberos is not supported on BigInsights.
Solution: Use a different supported Hadoop distribution when using Kerberos.

Problem: When a split field is present, tree models built locally in Modeler are slightly different from tree models built by Analytic Server on HDFS, and thus produce different scores.
Solution: The algorithms in both cases are valid; the algorithms used by Analytic Server are simply newer. Because tree algorithms tend to have many heuristic rules, some difference between the two components is normal. In the predicted values, there is only about a 3% difference between the two models.

Problem: When there is a categorical target, a split field is present, and not all categories of the target are represented in each split, the predicted probabilities for categories of the target may be scored incorrectly.
Solution: Ensure that all categories of the target are represented in each split. This may require discarding or merging rare categories.
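The check described above can be run on a sample of the data before building the model. The following is a minimal Python sketch, not part of Analytic Server; the function name and the REGION and CHURN fields are hypothetical and used only for illustration.

```python
from collections import defaultdict

def missing_categories_by_split(rows, split_field, target_field):
    """Return {split value: target categories that never occur in that split}."""
    all_categories = set()
    seen = defaultdict(set)
    for row in rows:
        all_categories.add(row[target_field])
        seen[row[split_field]].add(row[target_field])
    # Report only the splits that lack at least one category.
    return {
        split: all_categories - categories
        for split, categories in seen.items()
        if all_categories - categories
    }

# Hypothetical sample: REGION is the split field, CHURN is the categorical target.
sample = [
    {"REGION": "North", "CHURN": "yes"},
    {"REGION": "North", "CHURN": "no"},
    {"REGION": "South", "CHURN": "no"},
]
print(missing_categories_by_split(sample, "REGION", "CHURN"))  # {'South': {'yes'}}
```

An empty result means every split contains every target category. A non-empty result lists the categories to consider merging or discarding.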

Problem: A stream that has a model nugget built with the "Very large datasets" objective, followed by a modeling node that has the "Very large datasets" objective, may fail to run.
Solution: Put a cache on the model nugget, or separate the scoring and model building into separate streams.

Problem: The RFM Analysis node "Keep in current" method of dealing with ties is not supported.
Solution: Specify "Add to next" as the method of dealing with ties.

Problem: In the Statistics node, the mode is computed on a subsample of the data for continuous fields.
Solution: This is a limitation of the current version.

Problem: In the Analytic Server console, if you encounter an error when creating a data source, you may be unable to create another new data source in the same session.
Solution: If data source creation fails for any reason, click Cancel to exit edit mode; you can then create another new data source.

Problem: Field names containing a backslash may cause jobs run on HDFS to fail.
Solution: Rename these fields prior to using them with Analytic Server.
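If the field names come from the header row of a delimited file, the renaming can be scripted before the file is uploaded. The following is a minimal Python sketch under that assumption; the function name and the replacement character are hypothetical choices, not part of the product.

```python
def rename_backslash_fields(names, replacement="_"):
    """Replace every backslash in each field name with the replacement text."""
    renamed = [name.replace("\\", replacement) for name in names]
    # Refuse to continue if two fields would end up with the same name.
    if len(set(renamed)) != len(renamed):
        raise ValueError("renamed field names are not unique")
    return renamed

# Hypothetical header row containing one field name with a backslash.
print(rename_backslash_fields(["id", "path\\name"]))  # ['id', 'path_name']
```

Choose a different replacement if the renamed field will be used as a value field in a Restructure node, because underscores in those field names cause a separate issue that is described later in this list.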

Problem: When creating a data source from multiple input files, the Preview feature of the Analytic Server console will only show data from the first file listed if:
1. One file contains a header row with field names and the other does not, or
2. You have not yet clicked Save
Solution: Make sure all your files either do or do not have a header row, and click Save before previewing the data.

Problem: Installing Analytic Server into a directory where Analytic Server is already installed will cause an error.
Solution: Either uninstall Analytic Server completely before reinstalling, or install into a different directory.

Problem: A stream that contains an SLRM model nugget built in a version of Modeler prior to version 15 may not run with Analytic Server.
Solution: Rebuild the SLRM model in Modeler 15.

Problem: A stream that contains a Restructure node that uses a value field with an underscore in field name will fail.
Solution: Remove underscores from the names of value fields.

Problem: The performance of Merge and Sort is slow. The Hadoop job for these actions generates only 2 reduce tasks.
Solution: In the {AS_install_root_dir}/ae_wlpserver/usr/servers/aeserver/configuration/ file, change the property:
    mapred.tasktracker.reduce.tasks.maximum=2
to a comment:
    # mapred.tasktracker.reduce.tasks.maximum=2

This will enable the Hadoop jobs to generate the reduce tasks up to the capacity of your Hadoop environment and will improve the performance of Merge and Sort.
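The edit can also be scripted. The following is a minimal Python sketch that comments out a property in the text of a configuration file; it assumes simple key=value lines, and the function name is hypothetical. Reading and writing the actual file is left out because that depends on your installation.

```python
def comment_out_property(text, key):
    """Prefix each active 'key=value' line for the given key with '# '."""
    result = []
    for line in text.splitlines(keepends=True):
        stripped = line.lstrip()
        # Match the exact key followed by '='; lines already commented start with '#'.
        if stripped.startswith(key) and stripped[len(key):].lstrip().startswith("="):
            line = "# " + line
        result.append(line)
    return "".join(result)

sample = "mapred.tasktracker.reduce.tasks.maximum=2\n"
print(comment_out_property(sample, "mapred.tasktracker.reduce.tasks.maximum"), end="")
# prints: # mapred.tasktracker.reduce.tasks.maximum=2
```

Running the function a second time on its own output changes nothing, so it is safe to apply repeatedly. Back up the configuration file before modifying it.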

Problem: The Means node cannot produce a 95% confidence interval.
Solution: This is a limitation of the current version.

Problem: When executing a stream that processes local data, but exports the results to HDFS, the export may fail if the data set is larger than 10 MB.
Solution: Export to a local delimited file, upload to HDFS, and create the data source through the Analytic Server console.

Problem: When using the Select node with the discard option, rows with null values in the selection field are also discarded. For example, if the criterion is to discard rows where OCCUPATION = "Retired", all rows where OCCUPATION = "Retired" or OCCUPATION is null are discarded.
Solution: As a workaround, add "not(field = undef)" to the selection criteria. For example, update the selection criteria to ((OCCUPATION = "Retired") and not(OCCUPATION = undef)). The result set will then contain the rows where the OCCUPATION field is null.
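To verify the workaround on a small sample, it can help to compute the intended result independently. The following is a minimal Python sketch of the intended discard behavior, with None standing in for null; it does not reproduce how Analytic Server evaluates the expression, and the function name is hypothetical.

```python
def discard_matching(rows, field, value):
    """Drop rows whose field equals value; rows where the field is None are kept."""
    return [row for row in rows if row[field] is None or row[field] != value]

# Hypothetical sample rows; None represents a null OCCUPATION.
sample = [
    {"OCCUPATION": "Retired"},
    {"OCCUPATION": None},
    {"OCCUPATION": "Clerk"},
]
print(discard_matching(sample, "OCCUPATION", "Retired"))
# [{'OCCUPATION': None}, {'OCCUPATION': 'Clerk'}]
```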

Problem: The Data Audit node cannot produce the mode for continuous fields.
Solution: This is a limitation of the current version.

Problem: The Sample node will not run with Analytic Server if:

Solution: This is a limitation of the current version.

Known issues with IBM SPSS Analytic Catalyst

At the time of publication, the following issues with SPSS Analytic Catalyst were known. These issues apply only if you are using the SPSS Analytic Catalyst interface.

Problem: Building projects concurrently might cause a project to fail.
Solution: Wait and try to run the project at a later time.

Problem: When using Internet Explorer, adding a data source fails or visualizations do not display properly.
Solution: It is recommended that you use a different browser. Chrome and Firefox are preferred. If you need to continue to use Internet Explorer, try the following:

Problem: When changing default field properties in a data source, the Apply changes button is disabled.
Solution: Click Change default field properties again, and then click Continue. The button is now enabled.

Problem: If a CSV or Excel file contains field values that begin with dollar signs ($), the values might not be read correctly and the unit is not automatically set to Dollar.
Solution: If a CSV file contains dollar signs, remove the dollar signs before uploading or open in Microsoft Excel and save as an Excel file. If an Excel file contains dollar signs, upload the file and then set the unit to Dollar manually.
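Removing the dollar signs from a CSV file can be scripted before uploading. The following is a minimal Python sketch that uses the standard csv module and strips a single leading dollar sign from each value; the function name is hypothetical, and the sketch assumes a comma-delimited file whose contents fit in memory.

```python
import csv
import io

def strip_leading_dollar(csv_text):
    """Return the CSV text with one leading '$' removed from each field value."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([v[1:] if v.startswith("$") else v for v in row])
    return out.getvalue()

# Hypothetical two-line file: a header row and one data row.
print(strip_leading_dollar("item,price\npen,$1.50\n"), end="")
# prints:
# item,price
# pen,1.50
```

After uploading the cleaned file, the unit still has to be set to Dollar manually if it is needed.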

Problem: Cancelling a project might fail.
Solution: If you are building a new project, delete the project and create a new project. If you are re-building an existing project, you might have to delete the latest version and re-build again. If the problem persists, create a new project.

Problem: When the project is re-built, fields with a role of None might be incorrectly included as inputs. As a result, some top key drivers are shown without visualizations.
Solution: Create a new project and make any field properties changes before the project is generated the first time.

Problem: Smoother in field associations might have an unexpected peak or dip.

Problem: If a user of IBM SPSS Modeler updates a data source while you are also updating the data source, you will get an error.
Solution: Cancel your changes and update the data source after the SPSS Modeler user is done.

Problem: If you share during project creation, the sharing settings might be lost.
Solution: Create the project first. Then modify the project to share it with other users.

Problem: Entering a long tag name on a tablet will cause an error.
Solution: Use a shorter tag name.

Problem: You cannot use the keyboard to navigate to statistical terms in tables.
Solution: If you need help for a term, open the help system and search for the term.

Problem: Clicking an entry on the Summary page might show different output on the Explore page than what is shown on the Summary page.
Solution: Clicking an entry on the Summary page always opens the first output in the Explore page. Navigate the output in the Explore page to find the output that you want to view.

Problem: Deleting a data source while another user is creating a project from the data source will cause an error.
Solution: Delete the data source after the project is created.

Modified date: 29 May 2013