Configuring a SPNEGO filter to allow a Portal Crawler to Authenticate

Technote (troubleshooting)


Problem

Attempts to run a Portal Search crawler result in "EJPJO0046E" errors when SPNEGO is enabled in an IBM WebSphere Portal environment. The crawler fails to authenticate, and no documents can be added to the collection.

Symptom

After you define a new Portal or WCM content source for a search collection, or when you try to run the crawler, it fails with this error message:

PortalCollect E com.ibm.hrl.portlets.WsPse.PortalCollectionsService checkCrawler EJPJO0046E: Failed to connect to content source <b>Portal Content Source</b>. Either a wrong URL is defined, the content source's authentication info is incorrect, or the site is blocked by robot.txt.

Cause

If SPNEGO is enabled, you must define a filter that lets the crawler bypass the SPNEGO challenge so that it can authenticate using BASIC AUTH instead.

Environment

Portal 6.1.x or 7.0.0.x on WebSphere Application Server 7.0.0.x

Diagnosing the problem

Start the crawler, then search the SystemOut.log files for the following error:

[13:14:19:406 GMT] 0000008d PortalCollect E com.ibm.hrl.portlets.WsPse.PortalCollectionsService checkCrawler EJPJO0046E: Failed to connect to content source <b>Portal Content Source</b>. Either a wrong URL is defined, the content source's authentication info is incorrect, or the site is blocked by robot.txt.
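The log search can also be scripted. The following is a minimal sketch, assuming the log is readable as plain text; the function name find_crawler_failures is made up for this example and is not part of any Portal tooling.

```python
import re

# Message ID reported by the crawler check in this technote.
ERROR_ID = "EJPJO0046E"

# The logged message wraps the content source name in <b>...</b> markup.
_BOLD_TAG = re.compile(r"</?b>")


def find_crawler_failures(lines):
    """Return (line_number, cleaned_text) for each line that contains ERROR_ID."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_ID in line:
            # Drop the <b> markup so the content source name reads cleanly.
            hits.append((number, _BOLD_TAG.sub("", line).strip()))
    return hits
```

To use it, open SystemOut.log (for example with open(path, errors="replace")) and pass the file object to find_crawler_failures; each hit gives the line number and the message text.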


Resolving the problem

You will need to define a SPNEGO Web Authentication Filter as described in this InfoCenter page:

http://publib.boulder.ibm.com/infocenter/wasinfo/v7r0/index.jsp?topic=%2Fcom.ibm.websphere.express.doc%2Finfo%2Fexp%2Fae%2Fusec_kerb_SPNEGO_edit.html

You will need to identify the User-Agent HTTP request header that the crawler sends. When you start the crawler, you should see the following message in the SystemOut.log:

"WARNING: Unknown User Browser to WCL DeviceContext. Dump UserAgent: javacrawler/1.1"
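If the log is large, the User-Agent value can be pulled out with a short script. This is a sketch only: it assumes the warning keeps the "Dump UserAgent:" wording quoted above, and the function name extract_user_agents is made up for this example.

```python
import re

# Matches the warning text quoted above. The wording is taken from this
# technote; other releases may log it differently (assumption).
UA_DUMP = re.compile(r"Dump UserAgent:\s*(?P<agent>[^\"\r\n]+)")


def extract_user_agents(lines):
    """Collect the distinct User-Agent values dumped in the log, in first-seen order."""
    seen = []
    for line in lines:
        match = UA_DUMP.search(line)
        if match:
            agent = match.group("agent").strip()
            if agent not in seen:
                seen.append(agent)
    return seen
```

Any value returned here is a candidate for the user-agent condition in the SPNEGO filter.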

Taking this into account, you should set the filter in the following way:

1) Open the Administrative Console for the server

2) Click Security > Global Security

3) From Authentication, expand Web and SIP Security

4) Click SPNEGO Web Authentication

5) Under SPNEGO filters, click New or select an existing one to edit

6) For the filter criteria value, set the following:

user-agent!=JavaCrawler

7) Save and exit
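After saving the filter, one way to check the result is to send a request that looks like the crawler's and see whether the server still answers with a SPNEGO challenge. The sketch below is an illustration under assumptions, not a supported tool: the function names are made up, the User-Agent default is the value shown in the log message above, and it relies on the general behavior that a SPNEGO challenge is an HTTP 401 response with a "WWW-Authenticate: Negotiate" header.

```python
import base64
import urllib.request


def build_crawler_probe(url, user, password, user_agent="javacrawler/1.1"):
    """Build a request that carries the crawler's User-Agent and Basic credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("User-Agent", user_agent)
    request.add_header("Authorization", f"Basic {token}")
    return request


def is_spnego_challenge(status, headers):
    """True when the response is a 401 that asks for Negotiate (SPNEGO still applies)."""
    challenge = headers.get("WWW-Authenticate", "")
    return status == 401 and challenge.strip().lower().startswith("negotiate")
```

To run the probe, pass the request to urllib.request.urlopen with the URL of your own content source. If urlopen raises urllib.error.HTTPError, pass its code and headers attributes to is_spnego_challenge; a True result suggests the filter is not yet excluding the crawler.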


Related information

SPNEGO web authentication filter values
SPNEGO web authentication enablement



Document information


More support for:

WebSphere Portal
Portal Search Engine

Software version:

6.1, 7.0

Operating system(s):

Windows

Reference #:

1569890

Modified date:

2012-11-19
