You can use an in-database approach to analytics in the
data warehouse by running the SAS embedded process (SAS EP) on the DB2® database server.
The ability to dynamically score, or to run regression, clustering,
neural net, and other analytic algorithms within the database server
environment-thereby pushing the processing and deep analytics to where
the data lives-reduces the latencies that are associated with moving
data, and increases flexibility in end user analytic capabilities.
SAS Scoring Accelerator for DB2 enables
the scoring process to be done inside the database and does not require
data movement. Prior to Version 4.1, SAS Scoring Accelerator for DB2 translated the models that were
developed by SAS Enterprise Miner into scoring functions that could
be deployed inside the DB2 data
server. The scoring functions could be used in SQL statements like
other DB2 user-defined scalar
functions, providing the following benefits:
- Reduced data movement and storage requirements
- Better data governance (most of the data stays inside the database)
- Greater computational power from the relational database management
system (RDBMS)
- Better productivity through a shorter cycle of ideas to product
New enhancements in SAS Scoring Accelerator for DB2 Version 4.1, which eliminate the need to
register scalar user-defined functions (UDFs), make the process of
deploying and running scoring models in DB2 more
dynamic, and result in better performance when executing models against
large data sets. Scoring models that were developed for SAS Enterprise
Miner can be deployed in a DB2 database
in their native form. The models can be referenced and used in SQL
statements through an analytic expression.
- The ANALYZE_TABLE expression, which you can specify on the table-reference
clause of a subselect, enables you to efficiently execute scoring
models. For more information, see "table-reference clause" or "Analyze
table expressions" in the "subselect" topic.
- The DB2_SAS_SETTINGS registry variable enables
the SAS EP. Use the db2set command to configure
its settings. For more information, see the "Miscellaneous variables" topic.
- The SAS EP library is loaded and run in a fenced-mode process
named db2sasep. In a partitioned database environment, this process
runs on each database partition of the DB2 instance.
For more information, see the "db2ida - Stop or restart the SAS
embedded process command" topic.
- TBFUNC is a new explain operator type.
- When SAS queries are processing, DB2 agents
might have to wait during communications with the SAS EP. This can
occur when sending data to the SAS EP or when receiving data from
the SAS EP. Two new in-database analytics wait times in the time spent
hierarchy (ida_send_wait_time and ida_recv_wait_time) provide visibility
into the impact of these waits on overall system and query performance.