In an XML elements to common analysis structure mapping file, you can use the full range of configuration options for mapping XML elements to UIMA data types.
The following example shows a sample XML document, followed by an XML elements to common analysis structure mapping file for that document.
<report>
<doc>
<crimeType>Car theft</crimeType>
<crimeDate>04/23/05 09:23 pm</crimeDate>
<crimeLocation>27 Main Street, Brynston, Springfield, New Jersey</crimeLocation>
<reportingOfficer rank="Lt">Jakob
<lastName>Collins</lastName>
</reportingOfficer>
<policePrecinct>14th Precinct</policePrecinct>
<suspectDescription>Male, dark haired, dark glasses,
blue jeans with dark, probably black,
jacket</suspectDescription>
<abstract>A Mercedes CLK was stolen on 04/23/2005 from a parking
lot in front of the Blue Lagoon restaurant on
27 Main Street, Brynston.(serial number: 32 2761 50871)</abstract>
<body>A Mercedes CLK was stolen on 04/23/2005 from a parking
lot in front of the Blue Lagoon restaurant on 27 Main Street,
Brynston.(serial number: 32 2761 50871)
It has a black color and wide Michelin tires.
Eyewitnesses in front of the restaurant saw two darkly dressed
males drive away in the car at high speed. The car was
found abandoned on Aliway Ave in Brooklyn. The fuel tank was empty.
The seats were badly stained and the back seat was vandalized.
Nothing was stolen out of the car....</body>
</doc>
<image>
<!-- image of the crime scene as a base64-encoded string -->
</image>
</report>
Mapping file for the sample document:
<?xml version="1.0"?>
<xmlCasInitializerConfiguration
xmlns="http://www.ibm.com/2005/uima/jedii_ci_xml">
<identifier>Default</identifier>
<description>Sample configuration</description>
<contentElements>
<element>/report/doc</element>
</contentElements>
<elementToTypeMappings>
<elementToTypeMapping>
<element>//doc//reportingOfficer</element>
<type>com.ibm.omnifind.types.Person</type>
<featureValueAssignment>
<feature>role</feature>
<basicValue default="Reporting officer">
</basicValue>
</featureValueAssignment>
<featureValueAssignment>
<feature>gender</feature>
<basicValue default="male"
useAttributeValue="sex"/>
</featureValueAssignment>
<featureValueAssignment>
<feature>surName</feature>
<values concatenate="true" delimiter=" ">
<basicValue useAttributeValue="rank" default="Lt"/>
<basicValue useElementContent="lastName"/>
</values>
</featureValueAssignment>
</elementToTypeMapping>
<elementToTypeMapping>
<element>//doc</element>
<type>com.ibm.omnifind.types.PoliceReport</type>
<featureValueAssignment>
<feature>crimeDescription</feature>
<basicValue useElementContent="abstract" trim="true">
</basicValue>
</featureValueAssignment>
</elementToTypeMapping>
</elementToTypeMappings>
</xmlCasInitializerConfiguration>
If you use the content extraction option, the XML elements that are specified in the <elementToTypeMappings> section must be contained within the XML elements that are specified in the <contentElements> section.
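As a minimal sketch of this constraint, assuming the sample document shown earlier: with /report/doc as the only content element, a mapping on //doc/crimeLocation lies inside the extracted content, whereas a mapping on /report/image would lie outside it. The type name com.example.types.Location is hypothetical and is used only for illustration.

```xml
<contentElements>
<element>/report/doc</element>
</contentElements>
<elementToTypeMappings>
<!-- Inside /report/doc, so consistent with the content element above -->
<elementToTypeMapping>
<element>//doc/crimeLocation</element>
<!-- Hypothetical type name, not part of the original sample -->
<type>com.example.types.Location</type>
</elementToTypeMapping>
<!-- A mapping on /report/image would fall outside /report/doc
     and would therefore not satisfy the containment rule -->
</elementToTypeMappings>
```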
To create an XML to the common analysis structure mapping file:
<elementToTypeMapping>
<element>//doc//reportingOfficer</element>
<type>com.ibm.omnifind.types.Person</type>
<featureValueAssignment>
<feature>role</feature>
<basicValue default="Reporting officer"/>
</featureValueAssignment>
<featureValueAssignment>
<feature>gender</feature>
<basicValue default="male" useAttributeValue="sex"/>
</featureValueAssignment>
</elementToTypeMapping>
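Applied to the <reportingOfficer> element of the sample document, this mapping would be expected to behave as sketched below. The sketch assumes that a default is used whenever the named attribute is absent, which the attribute names suggest but this section does not state explicitly. The second element, including its sex attribute and names, is hypothetical.

```xml
<!-- From the sample document: no sex attribute is present -->
<reportingOfficer rank="Lt">Jakob
<lastName>Collins</lastName>
</reportingOfficer>
<!-- Expected features (assumption):
     role   = "Reporting officer"  (default, no source specified)
     gender = "male"               (default, attribute sex is absent) -->

<!-- Hypothetical variant that carries a sex attribute -->
<reportingOfficer rank="Sgt" sex="female">Dana
<lastName>Reyes</lastName>
</reportingOfficer>
<!-- Expected features (assumption):
     role   = "Reporting officer"
     gender = "female"             (taken from the sex attribute) -->
```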
<elementToTypeMapping>
<element>//doc</element>
<type>com.ibm.omnifind.types.PoliceReport</type>
<featureValueAssignment>
<feature>crimeDescription</feature>
<basicValue useElementContent="abstract" trim="true"/>
</featureValueAssignment>
</elementToTypeMapping>
This example results in the following output: the text covered by the element <abstract> in <doc> becomes the value of the feature crimeDescription. All leading and trailing blanks are removed.
<elementToTypeMapping>
<element>//doc//reportingOfficer</element>
<type>com.ibm.omnifind.types.Person</type>
<featureValueAssignment>
<feature>surName</feature>
<values concatenate="true" delimiter=" ">
<basicValue default="Mr."/>
<basicValue useAttributeValue="rank"
default="Lt."/>
<basicValue useElementContent="lastName"/>
</values>
</featureValueAssignment>
</elementToTypeMapping>
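For the <reportingOfficer> element of the sample document, the three values would presumably be joined in the order listed, separated by the single-blank delimiter. The following is a sketch of the expected result, assuming that an attribute value takes precedence over its default. It is not output captured from the product.

```xml
<!-- Input from the sample document -->
<reportingOfficer rank="Lt">Jakob
<lastName>Collins</lastName>
</reportingOfficer>
<!-- Value 1: "Mr."     (default only, no source specified)
     Value 2: "Lt"      (rank attribute is present, so the default "Lt." is unused)
     Value 3: "Collins" (content of the lastName child element)
     Expected surName, assuming in-order concatenation: "Mr. Lt Collins" -->
```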
String feature values are extracted from the mapping file as is. The values retain any leading or trailing blanks. However, names of types and features are trimmed of any blanks. For example, <type> com.ibm.omnifind.types.Person </type> becomes <type>com.ibm.omnifind.types.Person</type>.
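A short sketch of the difference, reusing the role assignment from the earlier example. The extra blanks are deliberate and show which parts of the mapping file are trimmed and which are kept as written.

```xml
<elementToTypeMapping>
<element>//doc//reportingOfficer</element>
<!-- Blanks around the type name are trimmed -->
<type> com.ibm.omnifind.types.Person </type>
<featureValueAssignment>
<!-- Blanks around the feature name are trimmed as well -->
<feature> role </feature>
<!-- The string value keeps its leading and trailing blank -->
<basicValue default=" Reporting officer "/>
</featureValueAssignment>
</elementToTypeMapping>
```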
<elementToTypeMapping>
<element>//suspectDescription</element>
<type>com.ibm.omnifind.types.Person</type>
<condition attribute="armed" value="yes"/>
</elementToTypeMapping>
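The mapping above carries a <condition> element whose semantics are not described in this section. A plausible reading, stated here as an assumption, is that the mapping applies only to elements whose armed attribute has the value yes. Under that assumption, and with a hypothetical armed attribute added to the sample data, the input elements would be treated as follows:

```xml
<!-- Would satisfy the condition: armed equals "yes" -->
<suspectDescription armed="yes">Male, dark haired, dark glasses</suspectDescription>

<!-- Would not satisfy the condition: armed has a different value -->
<suspectDescription armed="no">Male, dark haired, dark glasses</suspectDescription>

<!-- Would not satisfy the condition: the attribute is absent,
     as in the original sample document -->
<suspectDescription>Male, dark haired, dark glasses</suspectDescription>
```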