__mem_delay

Purpose

The __mem_delay built-in function specifies how many delay cycles there are for specific loads. These specific loads are delinquent loads with a long memory access latency because of cache misses.

When you identify a load as delinquent, the compiler uses that information to carry out optimizations such as data prefetching. In addition, when you compile with -qprefetch=assistthread, the compiler uses the delinquent load information to perform analysis and generate prefetching assist threads. For more information, see -qprefetch.

Prototype

void* __mem_delay (const void *address, const unsigned int cycles);

Parameters

address
The address of the data to be loaded or stored.
cycles
A compile-time constant, typically either the L1 miss latency or the L2 miss latency.

Usage

Place the __mem_delay built-in function immediately before the statement that contains the memory reference you want to mark as delinquent.
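The placement rule can be illustrated with a minimal sketch. The wrapper macro MEM_DELAY_HINT, the constant ASSUMED_MISS_CYCLES, and the function gather_sum are hypothetical names introduced only for this illustration. The sketch assumes that the __IBMC__ or __IBMCPP__ macro identifies an XL compiler that provides the built-in; elsewhere the hint expands to a no-op so that the code still compiles. The value 10 is a placeholder, not a measured latency.

```c
#include <stddef.h>

/* Hypothetical wrapper: forwards to __mem_delay only when an IBM XL
   compiler is detected (assumed here via __IBMC__ / __IBMCPP__);
   on any other compiler it expands to a no-op. */
#if defined(__IBMC__) || defined(__IBMCPP__)
#define MEM_DELAY_HINT(addr, cycles) ((void)__mem_delay((addr), (cycles)))
#else
#define MEM_DELAY_HINT(addr, cycles) ((void)(addr))
#endif

/* Placeholder latency in cycles; the second argument must be a
   compile-time constant. */
#define ASSUMED_MISS_CYCLES 10

/* Sums table[index[k]] for k in [0, n). The indirect load from
   table is the one marked as delinquent. */
long gather_sum(const int *table, const size_t *index, size_t n)
{
    long sum = 0;
    size_t k;

    for (k = 0; k < n; k++) {
        /* The hint sits immediately before the statement that
           contains the marked memory reference. */
        MEM_DELAY_HINT(&table[index[k]], ASSUMED_MISS_CYCLES);
        sum += table[index[k]];
    }
    return sum;
}
```

The hint does not change the result of the function; it only gives the compiler information about the load that follows it.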

Examples

Here is how you generate code using assist threads with __mem_delay:

Initial code:
int y[64], x[1089], w[1024];

void foo(void){
  int i, j;
  for (i = 0; i < 64; i++) {
    for (j = 0; j < 1024; j++) {

      /* what to prefetch? y[i]; inserted by the user */
      __mem_delay(&y[i], 10);
      y[i] = y[i] + x[i + j] * w[j];
      x[i + j + 1] = y[i] * 2;
    }
  }
}
Assist thread generated code:
void foo@clone(unsigned thread_id, unsigned version)
{
  if (!1) goto lab_1;

  /* version control to synchronize assist and main thread */
  if (version == @2version0) goto lab_5;
  goto lab_1;

lab_5:
  @CIV1 = 0;
  do { /* id=1 guarded */ /* ~2 */
    if (!1) goto lab_3;
    @CIV0 = 0;
    do { /* id=2 guarded */ /* ~4 */
      /* region = 0 */
      /* __dcbt call generated to prefetch y[i] access */
      __dcbt(((char *)&y + (4)*(@CIV1)))
      @CIV0 = @CIV0 + 1;
    } while ((unsigned) @CIV0 < 1024u); /* ~4 */
lab_3:
    @CIV1 = @CIV1 + 1;
  } while ((unsigned) @CIV1 < 64u); /* ~2 */

lab_1:
  return;
}
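
The generated listing uses compiler-internal names (@CIV0, @CIV1, @2version0) and is not valid C source. As a reading aid only, the following sketch restates its control flow in plain C. It is not compiler output. The names assist_clone_sketch, prefetch_hint, and expected_version are hypothetical; prefetch_hint stands in for the __dcbt call and only counts invocations so that the loop structure can be checked on any compiler. The unused thread_id parameter is omitted, and the factor 4 in the listing is assumed to be sizeof(int).

```c
#define Y_LEN       64    /* outer trip count, matches y[64]  */
#define INNER_TRIPS 1024  /* inner trip count, matches w[1024] */

static int y[Y_LEN];
static unsigned long prefetch_calls = 0;

/* Hypothetical stand-in for __dcbt: records that a prefetch was
   requested instead of touching the cache. */
static void prefetch_hint(const void *addr)
{
    (void)addr;
    prefetch_calls++;
}

/* Mirrors the shape of the assist-thread clone: a version check,
   then the same two nested loops as foo(), with the loop body
   reduced to a prefetch of y[i]. Returns the number of prefetch
   requests issued. */
unsigned long assist_clone_sketch(unsigned version, unsigned expected_version)
{
    unsigned i, j;

    prefetch_calls = 0;

    /* Version control: do nothing if the versions do not match. */
    if (version != expected_version)
        return prefetch_calls;

    i = 0;
    do {
        j = 0;
        do {
            /* Same address computation as the listing:
               base of y plus sizeof(int) times the outer index. */
            prefetch_hint((const char *)&y + sizeof(int) * i);
            j++;
        } while (j < INNER_TRIPS);
        i++;
    } while (i < Y_LEN);

    return prefetch_calls;
}
```

With matching versions the sketch issues one prefetch request per inner iteration, 64 * 1024 in total, all computations on y, x, and w having been stripped from the clone.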
