About the Data Analysis sample

The Data Analysis sample contains an example Data Analysis project that is pre-populated with some example XML data. You can use this Data Analysis project to explore the Data Analysis perspective and its views. In these views, you can explore your pre-analyzed XML data, create a target model, and generate Data Analysis tools, including a subflow. You can use the subflow to transform the input XML data into your new target model for further processing.



Data Analysis workflow

The Data Analysis sample uses the IBM predefined Book Series Data Analysis profile. This profile contains the Book Series schema, BookSeries.xsd, which is listed below. The profile also contains a glossary file lookup service that replaces some XML terms with a more readable form. For example, mediaType="MONO" is displayed as "Monograph" in the Data Analysis views, which makes the data easier to interpret.

The sample Data Analysis project contains several Book Series XML files (in a book.xml directory). These files are pre-analyzed, and the relevant data is already loaded into the Data Analysis project.

BookSeries.xsd

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" 
        targetNamespace="book-ns" xmlns:tns="book-ns">
    <complexType name="Paper">
    	<sequence>
    		<element name="Title" type="xsd:string"></element>
    		<element name="Author" type="xsd:string" maxOccurs="unbounded"
    			minOccurs="1">
    		</element>
    	</sequence>
    </complexType>

    <complexType name="Section" mixed="true">
    	<sequence>
    		<element name="Title" type="xsd:string"></element>
    		<element name="Author" type="xsd:string"
    			maxOccurs="unbounded" minOccurs="0">
    		</element>
    		<element name="Editor" type="xsd:string"
    			maxOccurs="unbounded" minOccurs="0">
    		</element>
    		<element name="Paper" type="tns:Paper" maxOccurs="unbounded"
    			minOccurs="0">
    		</element>
    	</sequence>
    	<attribute name="secType" type="xsd:string" use="required"></attribute>
    </complexType>

    <complexType name="Volume">
    	<sequence>
    		<element name="Title" type="xsd:string"></element>
    		<element name="Editor" type="xsd:string"
    			maxOccurs="unbounded" minOccurs="1">
    		</element>
            <element name="volumeInfo" type="tns:VolumeInfo" maxOccurs="1" minOccurs="0"></element>
            <element name="Section" type="tns:Section"
    			maxOccurs="unbounded" minOccurs="1">
    		</element>
    		<element name="Media" type="tns:Media" maxOccurs="unbounded"
    			minOccurs="0">
    		</element>
    		<element name="Appendix" type="tns:AppendixType"
    			maxOccurs="unbounded" minOccurs="0">
    		</element>
    	</sequence>
    </complexType>
    
    <complexType name="Series">
    	<sequence>
    		<element name="Title" type="xsd:string"></element>
            <element name="seriesInfo" type="tns:SeriesInfo" maxOccurs="1" minOccurs="1"></element>
            <element name="Volume" type="tns:Volume"
    			maxOccurs="unbounded" minOccurs="1">
    		</element>
    	</sequence>
    </complexType>
    
    <element name="BookSeries" type="tns:Series"></element>

    <complexType name="SeriesInfo">
    	<sequence>
    		<element name="Editor" type="xsd:string"></element>
    		<element name="PubDate" type="xsd:int"></element>
    	</sequence>
    </complexType>

    <complexType name="Media">
    	<sequence>
    		<element name="Title" type="xsd:string"></element>
    	</sequence>
    	<attribute name="mediaType" type="xsd:string" use="required"></attribute>
    </complexType>

    <complexType name="AppendixType">
    	<sequence>
    		<element name="Bibliography" type="tns:Bibliography" maxOccurs="1" minOccurs="0"></element>
    		<element name="References" type="tns:RefList" maxOccurs="1" minOccurs="0"></element>
    	</sequence>
    </complexType>

    <complexType name="Bibliography">
    	<sequence>
    		<element name="entry" type="tns:BibEntry" maxOccurs="unbounded" minOccurs="1"></element>
    	</sequence>
    </complexType>
    
    <complexType name="RefList">
    	<sequence>
    		<element name="Reference" type="xsd:string" maxOccurs="unbounded" minOccurs="1"></element>
    	</sequence>
    </complexType>
    
    <complexType name="BibEntry">
    	<sequence>
    		<element name="Name" type="xsd:string"></element>
    		<element name="Author" type="xsd:string"></element>
    	</sequence>
    </complexType>

    <complexType name="VolumeInfo">
    	<sequence>
    		<element name="VolPubDate" type="xsd:string"></element>
    		<element name="VolPubLoc" type="xsd:string"></element>
    	</sequence>
    </complexType>
</schema>
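
To make the schema easier to read, the following is a minimal sketch of an instance document that should validate against BookSeries.xsd. It is not one of the files shipped with the sample: every title, name, and date in it is made up for illustration, and the secType value "chapter" is an assumption, because the schema constrains secType only to be a string. The mediaType value "MONO" is the one mentioned in the glossary example above. Because the schema sets elementFormDefault="qualified", all elements are placed in the book-ns namespace, while the attributes remain unqualified.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance document: all values are illustrative only. -->
<BookSeries xmlns="book-ns">
    <Title>Example Series</Title>
    <!-- seriesInfo is required exactly once; PubDate must be an integer. -->
    <seriesInfo>
        <Editor>Series Editor</Editor>
        <PubDate>2010</PubDate>
    </seriesInfo>
    <!-- At least one Volume is required. -->
    <Volume>
        <Title>Example Volume 1</Title>
        <!-- At least one Editor is required per Volume. -->
        <Editor>Volume Editor</Editor>
        <!-- volumeInfo is optional. -->
        <volumeInfo>
            <VolPubDate>2010-06</VolPubDate>
            <VolPubLoc>Example City</VolPubLoc>
        </volumeInfo>
        <!-- At least one Section is required; secType is a required attribute. -->
        <Section secType="chapter">
            <Title>Example Section</Title>
            <Author>Section Author</Author>
            <!-- Each Paper needs a Title and at least one Author. -->
            <Paper>
                <Title>Example Paper</Title>
                <Author>Paper Author</Author>
            </Paper>
        </Section>
        <!-- Media is optional; mediaType is a required attribute. -->
        <Media mediaType="MONO">
            <Title>Example Monograph</Title>
        </Media>
        <!-- Appendix is optional; Bibliography and References are each optional. -->
        <Appendix>
            <References>
                <Reference>Example reference entry</Reference>
            </References>
        </Appendix>
    </Volume>
</BookSeries>
```

The element order matters: each complex type in the schema is declared as a sequence, so, for example, Title must come before seriesInfo, and Section must come before Media and Appendix within a Volume.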
