Sizing memory and partition count calculation
You can calculate the amount of memory and partitions needed for your specific configuration.
Attention: This topic applies when you are not using the COPY_TO_BYTES copy mode.
If you are using the COPY_TO_BYTES mode, then the memory size is much less and the calculation
procedure is different. For more information about this mode, see Tuning the copy mode.
WebSphere eXtreme Scale stores data within the address space of Java™ virtual machines (JVM). Each JVM provides processor space for servicing create, retrieve,
update, and delete calls for the data that is stored in the JVM. In addition, each JVM provides memory space for data entries and replicas. Java objects vary in size, so you must take a measurement
to estimate how much memory you need.
To size the memory that you need, load your
application data into a single JVM. When the heap usage
reaches 60%, note the number of objects that are used. This number is the maximum recommended object
count for each of your Java virtual machines. To get the most accurate
sizing, use realistic data and include any defined indexes in your sizing because indexes also
consume memory. The best way to measure memory use is to enable garbage collection
verbosegc output, because this output gives you the heap usage after garbage
collection. You can query the heap usage at any given point through MBeans or programmatically, but
those queries give you only a current snapshot of the heap. This snapshot might include uncollected
garbage, so using that method is not an accurate indication of the consumed memory.
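As a minimal sketch of the programmatic query mentioned above, the standard java.lang.management API can report the current heap usage as a fraction of the maximum heap. The class and method names are illustrative only. The result is a snapshot that might include uncollected garbage, so treat it as a rough cross-check and not as a replacement for verbosegc output.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Illustrative helper (not part of any product API): reads a heap snapshot.
final class HeapSnapshotSketch {

    /**
     * Returns used heap divided by maximum heap, or -1.0 when the JVM
     * does not report a maximum heap size.
     */
    static double usedHeapFraction() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when the maximum is undefined
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    /** True when the snapshot has reached the target fraction, for example 0.60. */
    static boolean reachedTarget(double fraction, double target) {
        return fraction >= target;
    }

    public static void main(String[] args) {
        double fraction = usedHeapFraction();
        System.out.printf("Heap used: %.1f%% of maximum%n", fraction * 100.0);
        System.out.println("Reached 60% target: " + reachedTarget(fraction, 0.60));
    }
}
```

When the snapshot reports about 60%, confirm the figure against the post-collection numbers in the verbosegc output before you record the object count.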
Tip: Configure the minimum and maximum JVM heap sizes with the same value.
For example, start with a maximum setting of 2048 megabytes. You can set the minimum and maximum JVM
heap sizes by running the following Java command from a command
line:
java -Xms2048M -Xmx2048M
In this example, the Java heap size is 2048 MB.
Scaling up the configuration
Number of shards per partition (numShardsPerPartition value)
To calculate the number of shards per partition, or the
numShardsPerPartition value, add 1 for the primary shard plus the total
number of replica shards you want. For more information about partitioning, see Partitioning.
numShardsPerPartition = 1 + total_number_of_replicas
Number of Java virtual machines (minNumJVMs value)
To scale up your configuration, first decide on the maximum number of objects that need to be stored in total. To determine the number of Java virtual machines you need, use the following formula, where numObjs is the total number of objects to store and numObjsPerJVM is the maximum recommended object count for each JVM that you measured earlier:
minNumJVMs = (numShardsPerPartition * numObjs) / numObjsPerJVM
Round this value up to the nearest integer value.
Number of shards (numShards value)
At the final growth size, use 10 shards for each JVM. As described before, each JVM has one primary shard and (N-1) shards for the replicas, or in this case, nine replicas. Because you already have a number of Java virtual machines to store the data, you can multiply the number of Java virtual machines by 10 to determine the number of shards:
numShards = minNumJVMs * 10 shards/JVM
Number of partitions
If a partition has one primary shard and one replica
shard, then the partition has two shards (primary and replica). The number of partitions is the
shard count divided by 2, rounded up to the nearest prime number. If the partition has a primary and
two replicas, then the number of partitions is the shard count divided by 3, rounded up to the
nearest prime number.
numPartitions = numShards / numShardsPerPartition
Example of scaling
In this example, the number of entries begins at 250 million. Each year, the number of entries grows about 14%. After seven years, the total number of entries is 500 million, so you must plan your capacity accordingly. For high availability, a single replica is needed. With a replica, the number of entries doubles to 1,000,000,000 entries. Testing shows that 2 million entries can be stored in each JVM. Using the calculations in this scenario, the following configuration is needed:
- 500 Java virtual machines to store the final number of entries.
- 5000 shards, calculated by multiplying 500 Java virtual machines by 10.
- 2500 partitions, or 2503 as the next highest prime number, calculated by taking the 5000 shards, divided by two for primary and replica shards.
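The formulas and the example above can be sketched in Java as follows. The class and method names are hypothetical. The sketch also reads "rounded up to the nearest prime number" as "the smallest prime that is greater than or equal to the quotient", which is an interpretation that matches the 2503 figure in this example.

```java
// Illustrative sizing calculator; names are not part of any product API.
final class GridSizingSketch {

    /** One primary shard plus the requested number of replica shards. */
    static int numShardsPerPartition(int totalReplicas) {
        return 1 + totalReplicas;
    }

    /** (numShardsPerPartition * numObjs) / numObjsPerJVM, rounded up. */
    static long minNumJvms(int shardsPerPartition, long numObjs, long numObjsPerJvm) {
        long totalEntries = Math.multiplyExact((long) shardsPerPartition, numObjs);
        return (totalEntries + numObjsPerJvm - 1) / numObjsPerJvm; // ceiling division
    }

    /** Shards at the final growth size, for example 10 shards for each JVM. */
    static long numShards(long minNumJvms, int shardsPerJvm) {
        return Math.multiplyExact(minNumJvms, (long) shardsPerJvm);
    }

    /** numShards / numShardsPerPartition, rounded up to a prime (assumed reading). */
    static long numPartitions(long numShards, int shardsPerPartition) {
        long quotient = (numShards + shardsPerPartition - 1) / shardsPerPartition;
        return nextPrimeAtOrAbove(quotient);
    }

    static long nextPrimeAtOrAbove(long n) {
        long candidate = Math.max(n, 2);
        while (!isPrime(candidate)) {
            candidate++;
        }
        return candidate;
    }

    /** Trial division; adequate for partition counts of this magnitude. */
    static boolean isPrime(long n) {
        if (n < 2) {
            return false;
        }
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int shardsPerPartition = numShardsPerPartition(1);              // one replica
        long jvms = minNumJvms(shardsPerPartition, 500_000_000L, 2_000_000L);
        long shards = numShards(jvms, 10);
        long partitions = numPartitions(shards, shardsPerPartition);
        System.out.println("minNumJVMs    = " + jvms);
        System.out.println("numShards     = " + shards);
        System.out.println("numPartitions = " + partitions);
    }
}
```

With the values from this example (500 million entries, one replica, 2 million entries for each JVM), the sketch yields 500 Java virtual machines, 5000 shards, and 2503 partitions.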
Starting configuration
Based on the previous calculations, start with 250 Java virtual machines and grow toward 500 Java virtual machines over five years. With this configuration, you can manage incremental growth until you arrive at the final number of entries.
In this configuration, about 200,000 entries are stored per partition (500 million entries divided by 2503 partitions).
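To see how this configuration spreads shards as Java virtual machines are added, the following Java sketch computes the shards for each JVM and the entries for each partition. The names are hypothetical, and the sketch assumes that the real shard count is the prime partition count multiplied by the shards per partition (2503 * 2 = 5006), with shards spread evenly across the Java virtual machines.

```java
// Illustrative arithmetic only; names are not part of any product API.
final class ShardDistributionSketch {

    /** Average shards hosted by each JVM, assuming an even spread. */
    static double shardsPerJvm(long numPartitions, int shardsPerPartition, long numJvms) {
        long totalShards = Math.multiplyExact(numPartitions, (long) shardsPerPartition);
        return (double) totalShards / numJvms;
    }

    /** Primary entries stored in each partition, rounded to the nearest entry. */
    static long entriesPerPartition(long primaryEntries, long numPartitions) {
        return Math.round((double) primaryEntries / numPartitions);
    }

    public static void main(String[] args) {
        long partitions = 2503;
        int shardsPerPartition = 2; // one primary and one replica
        for (long jvms : new long[] {250, 500}) {
            System.out.printf("%d JVMs: %.2f shards per JVM%n",
                    jvms, shardsPerJvm(partitions, shardsPerPartition, jvms));
        }
        System.out.println("Entries per partition: "
                + entriesPerPartition(500_000_000L, partitions));
    }
}
```

Under these assumptions, the starting configuration of 250 Java virtual machines hosts about 20 shards for each JVM, and the figure falls to about 10 as the grid grows to 500 Java virtual machines.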
When the maximum number of Java virtual machines is reached
When you reach your maximum number of 500 Java virtual machines, you can still grow your data grid. As the number of Java virtual machines grows beyond 500, the shard count begins to drop below 10 for each JVM, which is below the recommended number. The shards start to become larger, which can cause problems. Repeat the sizing process considering future growth again, and reset the partition count. This practice requires a full data grid restart, or an outage of your data grid.
Number of servers
Attention: Do not use paging on a server under
any circumstances.
A single JVM uses more memory
than the heap size. For example, 1 GB of heap for a JVM
actually uses 1.4 GB of real memory. Determine the available free RAM on the server. Divide the
amount of RAM by the memory per JVM to get the maximum
number of Java virtual machines on the server.
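The division described above can be sketched in Java as follows. The names are hypothetical, and the 16 GB of free RAM is an invented input for illustration. The sketch also assumes that the 1.4 ratio from the 1 GB example applies to other heap sizes, which you should confirm by measuring a real JVM on your platform.

```java
// Illustrative capacity estimate; names and inputs are assumptions.
final class ServerCapacitySketch {

    /**
     * Maximum number of JVMs that fit in the free RAM of one server.
     *
     * @param freeRamMb      free physical memory on the server, in MB
     * @param heapMb         configured heap size of each JVM, in MB
     * @param overheadFactor real memory divided by heap size (1.4 in the example above)
     */
    static long maxJvmsPerServer(double freeRamMb, double heapMb, double overheadFactor) {
        if (freeRamMb < 0 || heapMb <= 0 || overheadFactor < 1.0) {
            throw new IllegalArgumentException("invalid sizing input");
        }
        double realMemoryPerJvmMb = heapMb * overheadFactor;
        // Round down: a partial JVM does not fit, and the server must not page.
        return (long) Math.floor(freeRamMb / realMemoryPerJvmMb);
    }

    public static void main(String[] args) {
        // Hypothetical server: 16 GB free RAM, 2048 MB heap for each JVM.
        long jvms = maxJvmsPerServer(16_384, 2_048, 1.4);
        System.out.println("Maximum JVMs on this server: " + jvms);
    }
}
```

Rounding down keeps the total JVM footprint within the free RAM, which is consistent with the instruction to avoid paging.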