The mpxsort utility reorders a binary file generated by the bulk cross match (BXM) and incremental cross match (IXM) utilities.
Specifically, mpxsort reorders the bxmlink file when multiple member parts or multiple threads were used to create it. The non-transitive logic requires this sort order to keep the transitive entity sets grouped together, so that members can be removed from a set (and possibly form additional entities) during the non-transitive phase.
The mpxsort utility is run between the second mpxcomp phase and the mpxlink phase. The input to mpxsort is the output of the mpxcomp utility. When using the mpxsort utility, match the number of parts (-mpxparts) with the number of parts specified for the mpxcomp utility. The mpxsort output is then consumed by the mpxlink utility.
A command-line parameter unique to mpxsort is the -{no}radixSort option. A radix sort, also known as a binary sort, is an extremely fast method of sorting binary records. While a radix sort is faster than a quick sort, it consumes twice as much memory. On servers where memory is a constraint, specify the -noradixsort option so that a quick sort is used to conserve memory. On servers where memory is not an issue and maximum performance is required, use the default -radixsort option.
Again, the mpxsort utility supports only bxmlink files, which are the output of the mpxcomp utility.
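The memory trade-off described above can be illustrated with a minimal Python sketch. It assumes a conventional least-significant-digit radix sort over 32-bit keys; the source does not describe how mpxsort implements its sort internally, and the function name `radix_sort_u32` is hypothetical. The point is the auxiliary buffer: each pass scatters records into a second array as large as the input, which is where a doubling of memory would come from, whereas a quick sort can rearrange records in place.

```python
def radix_sort_u32(keys):
    """Sort unsigned 32-bit integers with an LSD radix sort, one byte per pass.

    Illustrative only: this is not the mpxsort implementation.
    """
    src = list(keys)
    dst = [0] * len(src)  # auxiliary buffer, same size as the input

    for shift in (0, 8, 16, 24):
        # Count how many keys fall into each of the 256 byte buckets.
        counts = [0] * 256
        for k in src:
            counts[(k >> shift) & 0xFF] += 1

        # Turn counts into starting offsets (exclusive prefix sum).
        total = 0
        for b in range(256):
            counts[b], total = total, total + counts[b]

        # Stable scatter from src into dst by the current byte.
        for k in src:
            b = (k >> shift) & 0xFF
            dst[counts[b]] = k
            counts[b] += 1

        # The freshly written buffer becomes the input of the next pass.
        src, dst = dst, src

    return src
```

Each pass is linear in the number of records, which is why the approach is fast for fixed-width binary keys, at the cost of the second buffer.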
Usage example:
mpxsort -enttype hh -bxmlink -bxminpdir /bxminp -bxmoutdir /bxmout
This example sorts the mpx_bxmlink_xx.XXX file for the household (hh) entity type.
All options and flags are case-insensitive; option values are case-sensitive.
The -nThreads option defaults to the number of processors on the server.
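As a sketch of how these options combine, the following Python helper assembles an mpxsort command line from the flags documented on this page. The helper name `build_mpxsort_command` and its parameters are hypothetical, and the paths are the ones from the usage example. It only builds the argument list and does not run the utility.

```python
import shlex


def build_mpxsort_command(ent_type, inp_dir, out_dir,
                          low_memory=False, n_threads=None):
    """Return the mpxsort argument list for the given settings."""
    args = [
        "mpxsort",
        "-enttype", ent_type,
        "-bxmlink",              # mpxsort supports only bxmlink files
        "-bxminpdir", inp_dir,
        "-bxmoutdir", out_dir,
    ]
    if low_memory:
        # Trade speed for memory: quick sort instead of radix sort.
        args.append("-noradixsort")
    if n_threads is not None:
        # Omit to let the utility use its default (number of processors).
        args += ["-nthreads", str(n_threads)]
    return args


cmd = build_mpxsort_command("hh", "/bxminp", "/bxmout",
                            low_memory=True, n_threads=4)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a server where mpxsort is installed; leaving `low_memory` at its default keeps the faster radix sort.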
Option | Type | Description | Default |
---|---|---|---|
-entType | name | Entity type name | NONE |
-bxmInpDir | dirName | .bin file input directory | NONE |
-bxmOutDir | dirName | .bin file output directory | NONE |
-nMxmParts | N | Number of maximum partitions. Match this setting to the number of parts specified in the output of the BXM utility used to generate the file being used as input to the mpxsort utility. | 1 |
-nThreads | N | Number of threads | Number of CPUs |
-{no}bxmLink | | Use linkage records from the mpxcomp utility. Currently, the mpxsort utility supports only bxmlink files, which are the output of the mpxcomp utility. Specify -bxmLink to avoid errors. | nobxmLink |
-{no}radixSort | | Use radix sort (-radixSort) or quick sort (-noradixSort). | radixSort |