IBM InfoSphere Federation Server, Version 10.1

BioRS data source - Example queries

These examples include a comprehensive set of sample queries that you can use to access BioRS data sources and show you the statements that are necessary to create the nicknames that are used in the examples.

These examples show you how to:
The sample queries use the nicknames swiss and swissft.

CREATE NICKNAME statement for the swiss nickname

The parent nickname swiss was registered for the SwissProt databank by using the following CREATE NICKNAME statement:
CREATE NICKNAME swiss
  (
  ID                 CHAR (30) OPTIONS (ELEMENT_NAME '_ID_'),
  EntryDate          VARCHAR (15),
  Update             CLOB (15),
  Description        CLOB (15),
  Crossreference     CLOB (15),
  Authors            CLOB (15),
  Journal            VARCHAR (15),
  JournalIssue       VARCHAR (15),
  PublicationYear    CLOB (15),
  PublicationTitle   CLOB (15),
  Gene               CLOB (15),
  Remarks            CLOB (15),
  RemarkType         VARCHAR (15),
  CatalyticActivity  VARCHAR (15),
  CoFactor           VARCHAR (15),
  Disease            VARCHAR (15),
  Function           CLOB (15),
  Pathway            VARCHAR (15),
  Similarity         CLOB (15),
  Complex            VARCHAR (15),
  FtKey              VARCHAR (15),
  FtDescription      CLOB (15),
  FtLength           VARCHAR (15),
  MolWeight          CHAR (15),
  Protein_Length     VARCHAR (15),
  Sequence           CLOB (15),
  AccNumber          VARCHAR (15),
  Taxonomy           CLOB (15),
  Organelle          VARCHAR (15),
  Organism           VARCHAR (15),
  Keywords           VARCHAR (15),
  Localization       VARCHAR (15),
  FtKey_count        VARCHAR (15),
  AllText            CLOB (15)
  )
    FOR SERVER biors_server
      OPTIONS (REMOTE_OBJECT 'swissprot');

CREATE NICKNAME statement for the swissft nickname

The child nickname swissft was registered for the SwissProt.Features databank by using the following CREATE NICKNAME statement:
CREATE NICKNAME swissft
  (
  ID                VARCHAR (30) OPTIONS (ELEMENT_NAME '_ID_'),
  FtKey             VARCHAR (15),
  FtLength          VARCHAR (15),
  FtDescription     VARCHAR (15),
  Parent            VARCHAR (30) OPTIONS (REFERENCED_OBJECT 'swissprot'),
  AllText           CLOB (15)
  )
    FOR SERVER biors_server 
      OPTIONS (REMOTE_OBJECT 'swissprot.features');

Query structures impact federated server performance

The queries and results in the following table show how you can structure your queries to optimize the workload between the federated system and the BioRS server.

Table 1. Samples of different queries that produce identical results
Query Result
SELECT s.id FROM Swiss s WHERE BIORS.CONTAINS(s.id, '100K_RAT') = 1 FETCH FIRST 3 ROWS ONLY;
ID
---------------
100K_RAT 

1 record(s) selected.
SELECT s.id FROM Swiss s WHERE s.id LIKE '%100K_RAT%' FETCH FIRST 3 ROWS ONLY;
ID
---------------
100K_RAT 

1 record(s) selected.

Both of the queries in this table produce the same results. However, the first query runs much faster than the second query. The first query uses the BIORS.CONTAINS function to specify the input predicate. As a result, the BioRS server selects the data in the SwissProt databank, then passes the selected data to the federated server. In the second query, the LIKE input predicate is specified directly on the swiss nickname. As a result, the BioRS server transfers the entire SwissProt databank to the federated server. After the databank contents are transferred, the federated server then selects the data.

Query performance is typically much better when predicates are sent to the data source for processing.

Queries that use wildcards in the BIORS.CONTAINS custom function

The queries and results in the following table show the use of wildcard characters in the BIORS.CONTAINS custom function. All of the query results are identical, even though different wildcard characters are used.

Table 2. Sample queries that use wildcards in the BIORS.CONTAINS custom function
Query Result
SELECT s.crossreference FROM Swiss s WHERE BIORS.CONTAINS(s.crossreference, 'MEDLINE') = 1 FETCH FIRST 3 ROWS ONLY;
CROSSREFERENCE 
---------------
NCBI_TaxID=1011
NCBI_TaxID=5875
NCBI_TaxID=4081

  3 record(s) selected.
SELECT s.crossreference FROM Swiss s WHERE BIORS.CONTAINS(s.crossreference, '?ED?IN?') = 1 FETCH FIRST 3 ROWS ONLY;
CROSSREFERENCE 
---------------
NCBI_TaxID=1011
NCBI_TaxID=5875
NCBI_TaxID=4081

  3 record(s) selected.
SELECT s.crossreference FROM Swiss s WHERE BIORS.CONTAINS(s.crossreference, '*D*N*') = 1 FETCH FIRST 3 ROWS ONLY;
CROSSREFERENCE 
---------------
NCBI_TaxID=1011
NCBI_TaxID=5875
NCBI_TaxID=4081

  3 record(s) selected

Queries that access BioRS Author data type columns

The queries and results in the following table show how you can access information in BioRS Author data type elements with the BIORS.CONTAINS custom function. The syntax of all of the queries is nearly identical. The only difference is the presence or absence of the first initial in the query term, and the amount of space between the first name and the last initial.

Table 3. Sample queries that access BioRS Author data type columns
Query Result
SELECT s.authors FROM Swiss s WHERE BIORS.CONTAINS(s.authors, 'Mueller') = 1 FETCH FIRST 3 ROWS ONLY;
AUTHORS        
---------------
Mueller D. Rehb
Mayer K.F.X. Sc
Zemmour J. Litt

  3 record(s) selected.
SELECT s.authors FROM Swiss s WHERE BIORS.CONTAINS(s.authors, 'Mueller,D') = 1 FETCH FIRST 3 ROWS ONLY;
AUTHORS        
---------------

  0 record(s) selected.
SELECT s.authors FROM Swiss s WHERE BIORS.CONTAINS(s.authors, 'Mueller ,D') = 1 FETCH FIRST 3 ROWS ONLY;
AUTHORS        
---------------

  0 record(s) selected.
SELECT s.authors FROM Swiss s WHERE BIORS.CONTAINS(s.authors, 'Mueller, D') = 1 FETCH FIRST 3 ROWS ONLY;
AUTHORS        
---------------
Mueller D. Rehb
Zou P.J. Borovo
Davies J.D. Mue

  3 record(s) selected.

Queries that access BioRS Date data type columns

The queries and results in the following table who how you can access information in BioRS Date type elements with the BIORS.CONTAINS custom function.

When a BioRS Date type field contains a sequence of dates, the results can contain extra information, as shown in the second example in the table. BioRS Numeric data type elements (Date and Number) can contain multiple values. Therefore, the results of the queries that are run on BioRS Date or Number elements can also contain multiple values. Multiple values are always separated by spaces.

Table 4. Sample queries that access BioRS Date data type columns
Query Result
SELECT e.entrydate FROM embl e WHERE BIORS.CONTAINS(e.entrydate, date('11/01/1997') ) = 1 FETCH FIRST 3 ROWS ONLY;
ENTRYDATE      
---------------
01-NOV-1997    
01-NOV-1997    
01-NOV-1997    

  3 record(s) selected.
SELECT g.update FROM gen g WHERE BIORS.CONTAINS(g.update, date('11/01/1997') ) = 1 FETCH FIRST 3 ROWS ONLY;
UPDATE         
---------------
01-NOV-1997 11-
01-NOV-1997 12-
01-NOV-1997 06-

  3 record(s) selected.

Queries that use the BIORS.CONTAINS_LE and BIORS.CONTAINS_GE custom functions

The queries and results in the following table show how you can use the BIORS.CONTAINS_LE and the BIORS.CONTAINS_GE custom functions.

Table 5. Sample queries that use the BIORS.CONTAINS_LE and BIORS.CONTAINS_GE custom functions
Query Result
SELECT s.molweight FROM Swiss s WHERE BIORS.CONTAINS_LE(s.molweight, 100368) = 1 FETCH FIRST 3 ROWS ONLY;
MOLWEIGHT      
---------------
100368         
10576          
8523           

  3 record(s) selected.
SELECT s.molweight FROM Swiss s WHERE BIORS.CONTAINS_GE(s.molweight, 100368) = 1 FETCH FIRST 3 ROWS ONLY;
MOLWEIGHT      
---------------
100368         
103625         
132801         

  3 record(s) selected.
SELECT s.journalissue FROM Swiss s WHERE BIORS.CONTAINS_GE(s.journalissue, 172) = 1 FETCH FIRST 3 ROWS ONLY;
JOURNALISSUE   
---------------
172 21         
242            
196            

  3 record(s) selected.      

Queries that use the BIORS.SEARCH_TERM custom function

The queries and results in the following table show how you can use the BIORS.SEARCH_TERM custom function to specify a search term using the BioRS format.

Table 6. Sample queries that use the BIORS.SEARCH_TERM custom function
Query Result
SELECT s.publicationyear FROM Swiss s WHERE BIORS.SEARCH_TERM (s.id, '[PublicationYear EQ number:1997;]')=1 FETCH FIRST 10 ROWS ONLY;
PUBLICATIONYEAR
---------------
1997           
1997 2000      
1988 1991 1997 
1994 1997      
1997 1998      
1994 1995 1997 
1997 1999      
1997           
1994 1994 1995 
1993 1992 1997 

  10 record(s) selected.
SELECT s.molweight FROM Swiss s WHERE BIORS.SEARCH_TERM (s.id, '[MolWeight EQ number:100368;]') = 1 FETCH FIRST 10 ROWS ONLY;
MOLWEIGHT      
---------------
100368         
100368         

  2 record(s) selected.
SELECT s.molweight FROM Swiss s WHERE BIORS.SEARCH_TERM (s.id, '[MolWeight GREATER number:100368;]') = 1 FETCH FIRST 10 ROWS ONLY;
MOLWEIGHT      
---------------
100368         
103625         
132801         
194328         
130277         
287022         
289130         
135502         
112715         
112599         

  10 record(s) selected.

Using relational predicates to form an equijoin between two databanks that have a parent-child relationship

The following query shows how to use relational predicates to form an equijoin between two databanks that have a parent-child relationship:

SELECT s.id, f.id, f.parent FROM Swiss s, Swissft f
    WHERE (f.parent = s.id) FETCH FIRST 10 ROWS ONLY;
In the following query results, the 100K_RAT record is a parent to nine child records (100K_RAT.1 through 100K_RAT.9).
ID                   ID                 PARENT                        
-------------------- ------------------ ------------------------------
100K_RAT             100K_RAT.1         swissprot:100K_RAT            
100K_RAT             100K_RAT.2         swissprot:100K_RAT            
100K_RAT             100K_RAT.3         swissprot:100K_RAT            
100K_RAT             100K_RAT.4         swissprot:100K_RAT            
100K_RAT             100K_RAT.5         swissprot:100K_RAT            
100K_RAT             100K_RAT.6         swissprot:100K_RAT            
100K_RAT             100K_RAT.7         swissprot:100K_RAT            
100K_RAT             100K_RAT.8         swissprot:100K_RAT            
100K_RAT             100K_RAT.9         swissprot:100K_RAT            
104K_THEPA           104K_THEPA.1       swissprot:104K_THEPA          

  10 record(s) selected.


Feedback

Update icon Last updated: 2012-05-18