Extracting Content and Using Metadata Process Overview

About this task

The basic process for extracting and using metadata from an existing or new search collection is as follows:

Procedure

  1. Creating a Search Collection. Create a search collection by clicking the Add icon beside the Search Collection entry in the Watson Explorer Engine administration tool's left-side navigation bar.
    • Click Add a new seed to define the resources that you are going to crawl and index (see Defining Resources to Crawl and Index).
    • Identify the metadata in your source and extract it (see Extracting Metadata). The process of extracting metadata from a source is known as converting. Select the Converting tab, click Add a new converter, select Custom Converter, and click Add to create the new converter. You will need to write XSL parser code to process the input from your source and produce output.
  2. Creating a Fast Index for Metadata. Metadata content elements must be indexed with a fast index in order to be used for sorting and filtering. Select the Indexing tab, click edit, locate the Fast Index section, and enter fast index definitions in the form of a list of name|data-type entries, one per line.
  3. Crawling and Indexing the Sample Files. Select the Overview tab and click start to the right of the Live Status header.
    • Test the index by clicking Search under the search box that appears under the Test with project label in the Watson Explorer Engine administration tool's left-side navigation bar.
  4. Searching Metadata Content. Metadata content can be searched by using appropriately phrased field:value queries. To identify the metadata content elements, you must edit the form for the source used in those queries. Click the list icon to the right of the Sources entry in the Watson Explorer Engine administration tool's left-side navigation bar. Select your source from the list of available sources.
    • Add field mappings to define additional fields that you can specify in your queries to search metadata content elements other than those that are searchable by default. Select the Form tab, and click the edit link to the right of the VSE Source Form entry. Select the Modified radio button, and enter the mappings between fields and the metadata content elements that you want to be able to search, one per line.
    • Define the fields so that they are recognized as valid parts of an input query. Click the list icon to the right of the Syntax entry in the Watson Explorer Engine administration tool's left-side navigation bar, select custom, enter the name of the field you want to define, and click Add Field.
  5. Sorting Results Using Metadata. Sorting on metadata allows you to organize search results. Return to the Form tab of your source. Click edit beside the VSE Source Form entry, and scroll down to the Sorting section. Select the Modified radio button, and enter each entry in the form name:sort-type.
  6. Fine-Tuning the Search Collection. Exclude any extraneous files, such as index.html, from the crawling and indexing process. Open your search collection, select the Configuration tab, click the Crawling sub-tab, and scroll to the Conditional Settings section. Click Add a new condition, select Custom conditional settings, and click Add.
  7. Whenever you make changes to the metadata of your search collection, perform a test search on the collection. This helps you to identify any mistakes made in the configuration process or any unexpected results before the search application is made public.
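
The procedure uses two entry formats: name|data-type for fast index definitions and name:sort-type for sorting entries. Because metadata must be in a fast index before it can be used for sorting, it can help to check the two lists against each other before pasting them into the administration tool. The following Python sketch is one way to do that outside the product. It is not part of Watson Explorer Engine. The helper names, the field names, and the data-type and sort-type values are all hypothetical placeholders, and it assumes sorting entries are listed one per line; consult the product documentation for the values that are actually supported.

```python
def parse_entries(text, separator):
    """Split one-entry-per-line text into (name, value) pairs.

    Blank lines are ignored. A line that lacks the separator, or has an
    empty name or value, raises ValueError with its line number.
    """
    entries = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        name, found, value = line.partition(separator)
        name, value = name.strip(), value.strip()
        if not found or not name or not value:
            raise ValueError(
                f"line {line_number}: expected name{separator}value, got {line!r}"
            )
        entries.append((name, value))
    return entries


def unindexed_sort_fields(fast_index_text, sorting_text):
    """Return sort field names that have no fast index definition."""
    indexed = {name for name, _ in parse_entries(fast_index_text, "|")}
    return [
        name
        for name, _ in parse_entries(sorting_text, ":")
        if name not in indexed
    ]


# Placeholder definitions; the names and types are illustrative only.
FAST_INDEX = """
author|set
pubdate|date
"""

# Placeholder sorting entries; "some-sort-type" is not a real product value.
SORTING = """
pubdate:some-sort-type
title:some-sort-type
"""

if __name__ == "__main__":
    missing = unindexed_sort_fields(FAST_INDEX, SORTING)
    if missing:
        print("Sort fields without a fast index entry:", ", ".join(missing))
```

With the placeholder data above, the check reports title, because it appears in the sorting list but has no fast index definition.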

Results

To proceed to the next section of this tutorial, click Files Used in This Tutorial.