Release Notes - IBM SPSS Modeler 15.0
IBM® SPSS® Modeler is a data mining toolset that helps you build predictive models quickly and intuitively. System requirements, installation, and known problems are addressed.
For a list of new features in the latest SPSS Modeler release, see the topic “New and Changed Features in IBM SPSS Modeler 15.0” in the online help.
Refer to the download document for information regarding requirements and installation.
Currently, 64-bit code is not supported on Windows XP Professional SP3. Thus the relevant portion of the system requirements section in the IBM SPSS Modeler 15 Client Installation instructions should say "Microsoft Windows XP Professional with Service Pack 3 x86 and x64 (32-bit code) Editions".
In the IBM SPSS Modeler Server 15 for Windows Installation instructions, the system requirements should also include support for Windows Server 2008 Standard Edition for 32-bit x86 systems, and Windows Server 2008 Enterprise Edition for 64-bit x64 systems.
For integration with IBM SPSS Statistics, this release of SPSS Modeler requires SPSS Statistics 20 or later.
For integration with IBM DB2 on z/OS, use the installation instructions in the SPSS Modeler Adapter DB2 z/OS Program Directory supplied with the adapter, then use the configuration instructions in the IBM SPSS Modeler 15 Scoring Adapter Installation guide supplied with IBM SPSS Modeler Server.
At time of publication, the following issues were known:
SPSS Modeler Professional
Database import and export
- SQL Server support with the Data Access Pack driver.
The ODBC configuration for SQL Server must have the "Enable Quoted Identifiers" ODBC connection attribute set to “Yes” (the default for this driver is "No"). On UNIX this attribute is configured in the system information file (odbc.ini) using the "QuotedId" option.
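As a rough illustration of the UNIX setting, the sketch below writes a sample data source entry to a scratch file and reads the option back. The data source name SQLServerDSN, the scratch file, and the literal value Yes for QuotedId are assumptions made for the example; check the driver documentation for the exact form expected in your odbc.ini.

```shell
# Write a sample DSN entry to a scratch file (not your real odbc.ini).
# "SQLServerDSN" is a hypothetical data source name.
ini_file="$(mktemp)"
cat > "$ini_file" <<'EOF'
[SQLServerDSN]
QuotedId=Yes
EOF

# Read the option back to confirm what the file contains.
quoted_id="$(sed -n 's/^QuotedId=//p' "$ini_file")"
echo "QuotedId=$quoted_id"
```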
- In-database caching with IBM DB2.
When attempting to cache a node in a stream which reads data from a DB2 database, you may see the error message “A default table space could not be found with a pagesize of at least 4096 that authorization ID TEST is authorized to use”. To configure DB2 to enable in-database caching to work properly in SPSS Modeler, the database administrator should create a "user temporary" tablespace and grant access to this tablespace to the relevant DB2 accounts. We recommend using a pagesize of 32768 in the new tablespace, as this will increase the limit on the number of fields that can be successfully cached.
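The tablespace setup described above might look like the following sketch for DB2 on Linux, UNIX, or Windows. The object names BP32K and USERTEMP32K are hypothetical, TEST is the authorization ID from the sample error message, and the buffer pool size is an arbitrary placeholder. The script only prepares the statements in a file; a database administrator would review them and then run them with the DB2 command line processor.

```shell
# Hypothetical object names: BP32K (buffer pool), USERTEMP32K (tablespace).
# TEST is the authorization ID taken from the sample error message.
sql_file="$(mktemp)"
cat > "$sql_file" <<'EOF'
CREATE BUFFERPOOL BP32K SIZE 1000 PAGESIZE 32K;
CREATE USER TEMPORARY TABLESPACE USERTEMP32K PAGESIZE 32K BUFFERPOOL BP32K;
GRANT USE OF TABLESPACE USERTEMP32K TO USER TEST;
EOF

# Count the prepared statements (one per line, each ending in a semicolon).
statement_count="$(grep -c ';$' "$sql_file")"
echo "Prepared $statement_count statements in $sql_file"

# A DBA would then run, while connected to the target database:
#   db2 -tvf "$sql_file"
```

A tablespace with a 32K page size needs a buffer pool with the same page size, which is why the sketch creates one first.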
- Database errors with IBM DB2 for z/OS.
When running streams against DB2 for z/OS, you may experience database errors if the timeout for idle database connections is enabled and set too low. In DB2 for z/OS version 8, the default has changed from no timeout to 2 minutes. The solution is to increase the value of the DB2 system parameter IDLE THREAD TIMEOUT (IDTHTOIN), or reset the value to 0.
- Scoring some models with confidences enabled using generated SQL returns a database error message on DB2 z/OS.
Scoring a subset of algorithms, with confidences enabled, using generated SQL can return an error on execution. The issue is specific to DB2 for z/OS; to fix this, use the Modeler Server Scoring Adapter for DB2 on z/OS.
- Database bulk loaders.
In order to use the bulk loading feature of the Database export node, you need to install Python on the same machine as SPSS Modeler (or if using SPSS Modeler Server, on the same machine as the server). The "python_exe_path" parameter must be set in the options.cfg file. You can install Python from the SPSS Modeler Client, SPSS Modeler Server or SPSS Modeler Solution Publisher product DVDs.
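The following sketch shows one way to find an interpreter and draft the corresponding entry. The "name, value" layout of the printed line is an assumption; copy the style of the entries already present in your options.cfg, and prefer the Python installation supplied on the product DVDs over whatever happens to be on the PATH.

```shell
# Locate a Python interpreter on this machine (empty if none is on PATH).
python_path="$(command -v python || true)"

# Draft a candidate options.cfg line. The "name, value" layout is an
# assumption; match the existing entries in your own options.cfg.
cfg_line="python_exe_path, \"$python_path\""
echo "$cfg_line"
```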
- Some aggregation results can differ between SQL pushback and native modes with Oracle
When running a stream containing an Aggregate node, the values returned for 1st and 3rd Quartiles when pushing back SQL to an Oracle database may differ from those returned in native mode.
- Record ID field.
Passing a non-numeric Record ID field into a modeling algorithm may cause a stream to execute slowly. The Record ID field is not a requirement for modeling, so we recommend filtering out the field.
- Logistic Regression.
Binomial Logistic Regression does not allow strings longer than 8 characters. You can avoid this problem by encoding strings before passing them to the algorithm.
If temporary disk space is low, Binomial Logistic Regression can fail to build, and reports an error. When building from a large data set (10GB or more), the same amount of free disk space is needed. You can use the environment variable SPSSTMPDIR to set the location of the temporary directory.
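On UNIX, the temporary directory could be redirected along these lines before starting the process that builds the model. The directory name is hypothetical; choose a location on a volume with enough free space for your data set.

```shell
# Hypothetical scratch location; pick a volume with enough free space.
spss_tmp="${TMPDIR:-/tmp}/spss_tmp_example"
mkdir -p "$spss_tmp"

# Point SPSS Modeler at it for processes started from this shell.
SPSSTMPDIR="$spss_tmp"
export SPSSTMPDIR

# Show free space (in KB) on the volume holding the directory.
df -k "$SPSSTMPDIR"
```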
- Cox regression.
On scoring a Cox regression model, an error is reported if empty strings in categorical variables are used as input to model building. Avoid using empty strings as input.
- Settings information.
For some models, Settings information may not be displayed in the information sidebar of the model nugget if split fields are used. Settings information can be accessed from the modeling node as follows. For auto modeling nodes, open the modeling node and choose Expert tab > click the Model parameters column > Specify > Expert tab of Algorithm settings dialog box. For single modeling nodes, open the modeling node and choose Build Options tab > Ensembles.
Excel export node
- Memory required increases with number of rows when exporting to Excel 2003.
Exporting a large number of records (tens of thousands) to an Excel 2003 (.xls) file can fail with the message "Insufficient memory for JVM - please increase in jvm.cfg". However, increasing the value of the Java heap size may not cure the problem. Try exporting fewer records at a time. The problem does not occur when exporting to Excel 2007/2010 (.xlsx) format.
SPSS Modeler Server on UNIX
- HP-UX fails to set locale.
On HP-UX, you may see the error message: Failed to set locale to “” when running the modelerrun command, or in the server's messages.log file. This message only occurs when the default locale is "C", does not affect the result of the execution, and can be ignored. To avoid seeing this message, use POSIX for the character type part of the locale, by setting the environment variable LC_CTYPE to “POSIX”.
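A minimal sketch of the workaround, assuming the variable is set in the shell or startup script from which modelerrun or the server is launched:

```shell
# Use the POSIX locale for character classification only.
LC_CTYPE=POSIX
export LC_CTYPE

echo "LC_CTYPE=$LC_CTYPE"
```

Note that if LC_ALL is also set in the environment, it takes precedence over LC_CTYPE.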
- Insufficient virtual memory in multithreaded AIX environments.
In a multithreaded AIX environment, it is possible for workloads that formerly completed successfully to fail with insufficient memory owing to a change to the startup scripts for SPSS Modeler Server, SPSS Modeler Solution Publisher and SPSS Modeler Solution Publisher Runtime Library. To avoid this problem, remove the following line from the startup script:
Integration with IBM SPSS Collaboration and Deployment Services
SPSS Modeler Server adapters no longer support Predictive Enterprise Services 3.5, or SPSS Collaboration and Deployment Services 4.0 or 4.1. However, SPSS Collaboration and Deployment Services 4.2.1 and 5.0 are supported. In addition, SPSS Collaboration and Deployment Services does not support the JBoss and Oracle WebLogic application servers on IBM System i (iSeries) systems. This information supersedes that given in the IBM SPSS Modeler Server Adapter Installation instructions.
In a mixed-architecture environment, use the adapter version that corresponds to the architecture (32-bit or 64-bit) of the Deployment Services application server (the one that was used to install Collaboration and Deployment Services).
- Support for Kerberos credentials
Kerberos credentials are now supported for running jobs and managing user roles.
- Storing SPSS Modeler streams in SPSS Collaboration and Deployment Services.
It is not possible to store SPSS Modeler streams into SPSS Collaboration and Deployment Services 4.n if there are parameters in multiple supernodes with the same name. You can avoid this problem by ensuring that no duplicate parameter names are used anywhere.
No unlock checkbox is available when storing streams in the Collaboration and Deployment Services repository. Unlock is the default when storing streams. To lock or unlock an object, choose Tools->Repository->Explore, navigate to the object, and right-click on its name to display the context menu.
- Incomplete output for Gains graph from Evaluation node.
In Deployment Manager, on running a job containing a stream with an Evaluation node set to produce a Gains graph, the graph output may be incomplete if the system is running under Oracle Weblogic 11g using the Oracle JRockit JRE. To avoid this problem, use the IBM JRE.
- When installing the Modeler Adapter on Collaboration and Deployment Services 5.0 the install returns success even when errors have occurred.
There are some known scenarios where a Modeler Adapter install will fail, but the failure will not be properly reported back to the user performing the install. Review the Modeler Adapter install log and "packageManager" logs in the Collaboration and Deployment Services log directory (<CDS_Home>/log) and take appropriate action.
Integration with IBM SPSS Statistics
- Generating non-English output.
When generating output in languages other than English, it is advisable to specify the language in the Syntax.
- The 'Launch application' option of a Statistics Export node does not open the data set when working in distributed mode.
When running Modeler and Statistics together in Server mode, writing the data out and launching a Statistics session does not automatically open a Statistics client showing the data set read into the active data set. The workaround is to manually open the data file in Statistics client once it is launched.
- Additional requirements for using SPSS Statistics functionality with AIX servers
If you want to use SPSS Statistics functionality with SPSS Modeler Server on AIX, you will need to install XL Fortran Enterprise Edition V13.1 for AIX Runtime Environment, version 184.108.40.206 or higher, in addition to the AIX-specific system requirements listed in the IBM SPSS Modeler Server 15 for UNIX Installation Instructions.
Integration with IBM Netezza Performance Server
- IBM Netezza Performance Server version
To use the Netezza database modeling nodes in SPSS Modeler, a prerequisite is IBM Netezza Performance Server 6.0 P8 or later and IBM Netezza Analytics version 1.1.0 or later, running the IBM SPSS In-Database Analytics package. The other prerequisites are as listed under “Requirements for Integration with IBM Netezza Analytics” in the IBM SPSS Modeler 15.0 In-Database Mining Guide.
- Inserting multibyte data into Teradata from SPSS Modeler Server.
To insert multibyte data into a Teradata database from SPSS Modeler Server, use the following configuration:
1. Run the server in Unicode.
2. Set the Teradata user default character set to UNICODE using tdadmin.
3. Configure CharacterSet=UTF8 (UNIX DSN), or set the Session Character Set to UTF-8 (Windows DSN).
4. Ensure that there are only ASCII characters in the column names.
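Step 4 can be checked before exporting. The sketch below flags names that contain anything other than printable ASCII; the column names are invented for the example, and in practice you would feed in the field names from your own stream or table definition.

```shell
# Hypothetical column names; the second contains a non-ASCII character.
non_ascii=""
for name in customer_id région total_amount; do
  # Delete every printable ASCII byte; anything left over is non-ASCII.
  leftover="$(printf '%s' "$name" | LC_ALL=C tr -d '\040-\176')"
  if [ -n "$leftover" ]; then
    non_ascii="${non_ascii:+$non_ascii }$name"
  fi
done

if [ -n "$non_ascii" ]; then
  echo "Rename these columns before inserting: $non_ascii"
else
  echo "All column names are ASCII"
fi
```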
- String Collation.
In this release, string sorting and comparison use the ICU 4.8.1 collation service for the system locale. Japanese collation does not distinguish half-width from full-width Katakana.
The screen reader is not able to read graphs, so these are not accessible to visually-impaired users.
Modeler Administration Console
- Port number.
The default port number for SPSS Modeler Server to listen on is listed in the IBM SPSS Modeler Administration Console User Guide as 28047; this should be 28052.
Online Help System
- Help system table of contents slow to display on first access
On first accessing the Help system in an SPSS Modeler session, the table of contents can take several minutes to display in the browser. The issue does not reoccur in subsequent browser sessions within the same SPSS Modeler session.
SPSS Modeler Premium
SPSS Modeler Social Network Analysis
- Compatibility with Collaboration and Deployment Services
Full support for SNA features is available in Collaboration and Deployment Services 5.0. Within Collaboration and Deployment Services 4.2, SNA features are supported for storing and retrieval only.
- Administrator access required on 64-bit Windows 7
To use SNA on 64-bit Windows 7, you must either be logged in as an Administrator or right-click the executable file in the Program Files directory and choose the "Run as Administrator" option.
SPSS Modeler Entity Analytics
- Compatibility with Collaboration and Deployment Services
Full support for Entity Analytics features is available in Collaboration and Deployment Services 5.0. Within Collaboration and Deployment Services 4.2, Entity Analytics features are supported for storing and retrieval only.
- Upgrading from Beta version
If you have the Beta version of SPSS Modeler Entity Analytics installed, perform the following steps to upgrade to the Production version:
- Restart the system on which the Beta version is installed.
- From the Control Panel, double-click "Add or Remove Programs" and remove the Beta version (the entry beginning "IBM SPSS Entity Analytics...").
- Remove the entry "IBM SPSS Modeler 15.0".
- Install the Production version of SPSS Modeler 15.0.
- Install the Production version of SPSS Modeler Entity Analytics.
- Delete any remaining repositories that were created under the Beta version. Follow the procedure in the Tasks section of the Entity Analytics User Guide, under "Deleting a repository when unable to connect to it". Note that the delete_repository.bat or delete_repository.sh scripts are actually located in the "<modeler_install_dir>/ext/bin/pasw.entityanalytics" directory and not the ".../tools" directory as stated in the User Guide.
- Attempt to install SPSS Modeler Server Entity Analytics in invalid UNIX directory results in error message
If you specify an invalid directory for installing SPSS Modeler Server Entity Analytics on a UNIX server, an error message is displayed, rather than the product failing to work, as stated in the IBM SPSS Modeler Entity Analytics Installation instructions.
- Extra installation step needed to prevent repository creation failure on Solaris systems.
After installing SPSS Modeler Server Entity Analytics on a Solaris server, run the following command as root to avoid failure of repository creation:
crle -64 -u -s <modelerserver-installation-directory>/ext/bin/pasw.entityanalytics/g2
For example:
crle -64 -u -s /usr/IBM/SPSS/ModelerServer/15/ext/bin/pasw.entityanalytics/g2
- Non-Latin character data not supported.
Data in non-Latin characters is not supported for Entity Analytics in this release. Where the data consists of a mixture of records in Latin (i.e., Western European) and non-Latin character sets, only the entities for the Latin data will be resolved.
- Incorrect number of fields detected when running Streaming EA node
You may see an error message ending with "data model" that reports an incorrect number of fields when running the Streaming EA node. This can happen if you have edited the repository configuration since creating the Streaming EA node. Editing the configuration in these circumstances can change the number and names of the fields output from the node. To resolve the issue, open the Streaming EA node and click the Refresh button. Doing so causes the number and names of the output fields to be recalculated.
- Repositories not removed when Entity Analytics is uninstalled.
Before uninstalling Entity Analytics, note that any repositories that were created are not removed when Entity Analytics is uninstalled. These repositories will be available if Entity Analytics is later reinstalled.
To delete a repository, follow the instructions in the "Entity Analytics tasks" section of the Entity Analytics User Guide. Use the procedure "Deleting an entity repository" or "Deleting a repository when unable to connect to it" as appropriate.
Note that the delete_repository.bat or delete_repository.sh scripts are actually located in the "<modeler_install_dir>/ext/bin/pasw.entityanalytics" directory and not the ".../tools" directory as stated in the User Guide.
SPSS Modeler Text Analytics
- Backward compatibility with previous Text Mining for Clementine versions.
Text models created in Text Mining for Clementine v11.1 cannot be edited or executed in Text Analytics 15.0.
- Cancelling an extraction can take a long time when working with very large data sets.
- Multiple Interactive Workbench sessions can cause sluggish behavior.
Text Analytics and Modeler share a common Java run-time engine when an Interactive Workbench session is launched. Depending on the number of Interactive Workbench sessions you invoke during a Modeler session - even if opening and closing the same session - system memory usage may cause the application to become sluggish. This effect may be especially pronounced if you are working with large data or have a machine with less than the recommended RAM setting of 4GB. If you notice your machine is swapping memory, save all your work, shut down Modeler, and re-launch the application. Running Text Analytics on a machine with less than the recommended memory - particularly when working with large data sets or for prolonged periods of time - may cause Java to run out of memory and shut down. If you work with large data, it is strongly suggested that you upgrade to the recommended memory setting or larger (or use Text Analytics Server).
- The whole record from Excel file is not processed.
Create an .xlsx or a .sav file as a workaround.
- Scoring on rules is different in TA depending on whether you load a TAP from Text Analytics directly or whether you load a TAP from TAfS
Use TAPs made within TA as the ones made in TAfS may be created using a different version of the linguistic resources.
- PDF filter related problem.
To use PDF or Office filters with the product, you must download and install these filters yourself. Starting with Acrobat Reader 8.x, the PDF filter is included with the reader, so no separate download is needed. Versions of Acrobat from 10 onwards no longer include this filter, so use Acrobat 9 or earlier.
- Configuring Text Analytics for Collaboration and Deployment Services 4.x.
Add the path <C&DS_installation_root>/components/modeler/ext/bin/spss.TMWBServer to the system path used by the application server running the Collaboration and Deployment Services.
On Windows this environment variable is PATH
On AIX platform this environment variable is LIBPATH
On other Unix platforms the environment variable is called LD_LIBRARY_PATH
The location of this environment variable differs between application servers.
For WebSphere, the environment variable can be set through the Text Analytics Administration Console for the appropriate server.
For JBoss, the variable can be found in the wrapper.conf file
For Weblogic, the variable can be found in the domain’s start script.
On HP-UX platform, make sure that all the files in the directory <C&DS_installation_root>/components/modeler/ext/bin/spss.TMWBServer/bin have execute permissions.
Note: <C&DS_installation_root> refers to the full installation path for Collaboration and Deployment Services server. For example: /usr/IBM/SPSS/Collaboration and Deployment Services/4.2/Server
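For a UNIX application server whose start script is a shell script, the path change might be sketched as follows. The installation root is the example path from the note above and should be replaced with your own; where exactly the lines belong depends on the application server, as described earlier.

```shell
# Example install root from the note above; replace with your own.
CDS_ROOT="/usr/IBM/SPSS/Collaboration and Deployment Services/4.2/Server"
TMWB_DIR="$CDS_ROOT/components/modeler/ext/bin/spss.TMWBServer"

# Prepend the directory, keeping any existing value.
LD_LIBRARY_PATH="$TMWB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH   # on AIX, set LIBPATH instead

# On HP-UX, also make the files in the bin subdirectory executable:
#   chmod +x "$TMWB_DIR"/bin/*
echo "$LD_LIBRARY_PATH"
```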
- Linux on x86/x64 - openMP support requires the customer to install a separate package.
Linux Red Hat x86/x64 support.
For Red Hat Linux, openMP support requires the package "libgomp-4.4.4-13.el5", which is available from the Red Hat website: https://rhn.redhat.com/network/software/search.pxt
Linux SuSe x86/x64 support.
For SuSe11, openMP support requires the package "libgomp43 4.3.3_20081022", which is available from the SuSe website: http://www.suse.com/LinuxPackages/packageRouter.jsp?product=server&version=11&service_pack=&architecture=i386&package_name=index_group.
The package is for the GNU compiler collection OpenMP runtime library, and is available from the section titled: "Development/Libraries/Parallel".
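Before installing, you may want to check whether an OpenMP runtime package is already present. The sketch below assumes an RPM-based system and queries the two package names mentioned above without checking their versions; the gomp_status variable is only for illustration.

```shell
# Report whether an OpenMP runtime package is installed (RPM-based systems).
# Package names queried: libgomp (Red Hat) and libgomp43 (SuSe 11).
if command -v rpm >/dev/null 2>&1; then
  if rpm -q libgomp >/dev/null 2>&1 || rpm -q libgomp43 >/dev/null 2>&1; then
    gomp_status="installed"
  else
    gomp_status="missing"
  fi
else
  gomp_status="unknown (rpm not available)"
fi
echo "OpenMP runtime package: $gomp_status"
```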
- The Template Editor does not open on Vista and returns an error.
To correct this on Vista (32-bit and 64-bit), as a post-installation step give the Users group Modify permission for the "tmwb_15.db" file located in: <ALLUSERSPROFILE>\IBM\SPSS\TextAnalytics\15\<folder>.
An example of the location of this file is: C:\ProgramData\IBM\SPSS\TextAnalytics\15\tmwb_15.db.
- Client memory exhausted after many repeated Interactive Workbench extractions.
Modeler Client can run out of memory after multiple Text Analytics Interactive Workbench sessions have been run without restarting the application. Monitor the memory usage in the status line and, if running low, close and re-open Modeler Client.
- Concept Model Building.
A new “Optimize for Speed of Scoring” option has been added to Concept Model Building and is enabled by default. To maximize the accuracy of scoring disable this option; this results in the following differences:
- The built Concept model contains all extracted terms for exact same matching/counting during concept building and scoring; this can result in a much larger model and as a result can be much slower to score. The default setting only includes information related to the top most frequent concepts.
- A second pass over the data is completed to perform indexing to count concepts found during extraction. This enables insensitivity to character case (Upper Case insensitive) and it has more accurate matching (hyphen insensitive).
- Text Link Analysis Node.
The Text Link Analysis (TLA) node has a new “Optimize for speed of Scoring” option which is enabled by default. If you want to maximize accuracy of scoring, disable this option; doing so results in the following differences:
- Post processing of results is enabled; this guarantees that each concept is only counted once. For example, if you have created two rules and each rule defines a different type, the TLA result could contain information about the same concept but with two different types (such as: Paris, Noun and Paris, Location).
- During the second pass over the data, indexing is performed to count the concepts found during extraction. This enables insensitivity to character case (upper case insensitive) and has more accurate matching (hyphen insensitive).
- Setting LD_PRELOAD on HP-UX.
To use Text Analytics adapters on the HP-UX platform, LD_PRELOAD must be defined when you start the C&DS server. To do this, add or edit the environment variable in the C&DS startup script:
LD_PRELOAD=/usr/lib/hpux64/libcps.so.1:$LD_PRELOAD; export LD_PRELOAD
- On WebSphere you need to configure the environment entries via the admin console and store them along with the application server configuration.
For example: /usr/lib/hpux64/libcps.so.1:$LD_PRELOAD
- For Weblogic, you need to edit the C&DS server startup scripts called setCDSEnv.sh.
For example, in C&DS 4.x: <Application_Server_Install_Directory>/user_projects/domains/LP177/bin/startWebLogic.cmd
For example, in C&DS 5.0: <Application_Server_Install_Directory>/user_projects/domains/LP177/bin/setCDSEnv.sh
- For JBoss, edit the startup scripts used to start the server process (Tanuki Java Service Wrapper) and add the variable to:
Note that with C&DS 5.0, there is the possibility of a remote Scoring Server environment that is separate from the base C&DS environment. Therefore, it is possible that this setting also needs to be configured in the remote Scoring Server. The Scoring Server is a special deployment of the C&DS scoring service, external to, but loosely coupled to the C&DS server.
Any special settings that are necessary for the native code under C&DS will also need to be made for the Scoring Server. The LD_PRELOAD option needs to be defined differently for the Scoring Server.
For WebLogic, the file that controls the execution environment for the Scoring Server is:
In this case the setting for WebSphere should be added through the WebSphere administration console.
Technical Support is available to maintenance customers. Customers may contact Technical Support for assistance in using IBM Corp. products or for installation help for one of the supported hardware environments. To reach Technical Support, see the IBM Corp. web site at http://www.ibm.com/support. We recommend that you check the support site for updates.
© Copyright IBM Corporation 1994, 2012.