DeveloperWorks Article: Developing, Compiling, and Running a Non-Web Data Source Crawler Plug-in with IBM Content Analytics with Enterprise Search 3.0
This document describes how to create a plug-in for the data source crawlers in IBM Content Analytics with Enterprise Search, and how to use it to extract and modify metadata, security tokens, and content. It also covers how to write code that logs plug-in operations at the info, warn, and error levels, and how to troubleshoot common problems.
A data source crawler plug-in is a Java application that is associated with a data source crawler (a non-web crawler), such as the Windows file system crawler or the seed list crawler.
Using a data source crawler plug-in, you can create, modify, or delete a document's content, metadata, or security tokens before the document is indexed.
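To illustrate the idea, the following is a minimal sketch of what such a plug-in does conceptually. The `CrawledDocument` and `SampleCrawlerPlugin` types below are hypothetical stand-ins, not the product API; the real plug-in extends the product's abstract crawler plug-in class and works with its own document type, as described in the product Javadoc.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the document a crawler hands to the plug-in. */
class CrawledDocument {
    private final Map<String, String> metadata = new HashMap<>();
    private byte[] content;

    Map<String, String> getMetadata() { return metadata; }
    byte[] getContent() { return content; }
    void setContent(byte[] content) { this.content = content; }
}

/** Hypothetical plug-in that enriches each document before indexing. */
class SampleCrawlerPlugin {
    /** Return the updated document; returning null would skip the document. */
    CrawledDocument updateDocument(CrawledDocument doc) {
        // Add a custom metadata field that the crawler did not supply.
        doc.getMetadata().put("reviewed", "true");
        return doc;
    }
}

public class Demo {
    public static void main(String[] args) {
        CrawledDocument doc = new CrawledDocument();
        doc.setContent("sample content".getBytes());
        CrawledDocument updated = new SampleCrawlerPlugin().updateDocument(doc);
        System.out.println(updated.getMetadata().get("reviewed")); // prints "true"
    }
}
```

The key design point carried over from the real API is the single update hook: the crawler calls the plug-in once per crawled document, and the plug-in returns the (possibly modified) document to be indexed.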
For more information about the API used in a crawler plug-in, see the following location on the machine where IBM Content Analytics with Enterprise Search 3.0 is installed.
In this article, we cover plug-ins for non-web data source crawlers only. We do not cover the enhanced crawler framework plug-ins (for the FileNet P8, SharePoint, and Agent for Windows file systems crawlers).