Placement and partitions

You have two placement strategies available for WebSphere® eXtreme Scale: fixed partition and per-container. The choice of placement strategy affects how your deployment configuration places partitions over the remote data grid.

Fixed partition placement

You can set the placement strategy in the deployment policy XML file. The default placement strategy is fixed-partition placement, enabled with the FIXED_PARTITIONS setting. The number of primary shards that are placed across the available containers is equal to the number of partitions that you configured with the numberOfPartitions attribute. If you configured replicas, the minimum total number of shards that are placed is defined by the following formula: ((1 primary shard + minimum synchronous replicas) * partitions defined). The maximum total number of shards that are placed is defined by the following formula: ((1 primary shard + maximum synchronous replicas + maximum asynchronous replicas) * partitions defined). Your WebSphere eXtreme Scale deployment spreads these shards over the available containers. The keys of each map are hashed into assigned partitions, based on the total number of partitions that you defined. The keys hash to the same partition even if the partition moves because of failover or server changes.

For example, if the numberOfPartitions value is 6 and the minSyncReplicas value is 1 for MapSet1, the total number of shards for that map set is 12 because each of the six partitions requires a synchronous replica. If three containers are started, WebSphere eXtreme Scale places four shards per container for MapSet1.
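
For reference, a minimal deployment policy descriptor that matches this example might look like the following sketch. The grid name, map name, and schema location are placeholders; the mapSet attributes carry the values described above.

    <?xml version="1.0" encoding="UTF-8"?>
    <deploymentPolicy xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://ibm.com/ws/objectgrid/deploymentPolicy ../deploymentPolicy.xsd"
        xmlns="http://ibm.com/ws/objectgrid/deploymentPolicy">
        <objectgridDeployment objectgridName="Grid1">
            <!-- Fixed-partition placement is the default, so placementStrategy is not set here.
                 Six partitions with one synchronous replica each give the 12 shards described above. -->
            <mapSet name="MapSet1" numberOfPartitions="6" minSyncReplicas="1" maxSyncReplicas="1">
                <map ref="Map1"/>
            </mapSet>
        </objectgridDeployment>
    </deploymentPolicy>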

Per-container placement

The per-container placement strategy is enabled when you set the placementStrategy attribute to PER_CONTAINER in the map set element in the deployment XML file. With this strategy, the number of primary shards that are placed on each new container is equal to the number of partitions, P, that you configured. The WebSphere eXtreme Scale deployment environment places the replica shards for those partitions on the remaining containers. The numInitialContainers setting is ignored when you are using per-container placement. The total number of partitions grows as more containers are started. The keys for maps are not fixed to a certain partition in this strategy. The client routes to a partition and uses a random primary. If a client wants to reconnect to the same partition that it used, so that it can find its data again, it must use a session handle.

For more information, see SessionHandle for routing.

For failover or stopped servers, the WebSphere eXtreme Scale environment moves the primary shards in the per-container placement strategy if they still contain data. If the shards are empty, they are discarded. In the per-container strategy, old primary shards are not kept because new primary shards are placed for every container.

WebSphere eXtreme Scale allows per-container placement as an alternative to what could be termed the "typical" placement strategy: a fixed-partition approach in which the key of each map entry is hashed to one of those fixed partitions. In a per-container placement, your deployment places the partitions on the set of online container servers and automatically scales them out or in as containers are added to or removed from the server data grid. A data grid with the fixed-partition approach works well for key-based grids, where the application uses a key object to locate data in the grid. The following sections discuss the per-container alternative.

Example of a per-container data grid

Data grids that are configured to use a per-container placement strategy are different. You can specify that the data grid uses the per-container placement strategy by setting the placementStrategy attribute in your deployment XML file to PER_CONTAINER. Instead of configuring how many partitions total you want in the data grid, you specify how many partitions you want per container that you start.

For example, if you set five partitions per container, five new anonymous partition primaries are created when you start that container server, and the necessary replicas are created on the other deployed container servers.
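
As a sketch, the deployment descriptor for such a configuration differs from a fixed-partition one mainly in the placementStrategy attribute and in how numberOfPartitions is interpreted (here, five partitions for each container that starts). The grid and map names are placeholders.

    <?xml version="1.0" encoding="UTF-8"?>
    <deploymentPolicy xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://ibm.com/ws/objectgrid/deploymentPolicy ../deploymentPolicy.xsd"
        xmlns="http://ibm.com/ws/objectgrid/deploymentPolicy">
        <objectgridDeployment objectgridName="Grid1">
            <!-- PER_CONTAINER: numberOfPartitions means partitions per container server.
                 One replica per partition matches the placement sequence that follows. -->
            <mapSet name="MapSet1" numberOfPartitions="5" maxSyncReplicas="1"
                    placementStrategy="PER_CONTAINER">
                <map ref="Map1"/>
            </mapSet>
        </objectgridDeployment>
    </deploymentPolicy>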

The following is a potential sequence in a per-container environment as the data grid grows.

  1. Start container C0 hosting five primaries (P0 - P4).
    • C0 hosts: P0, P1, P2, P3, P4.
  2. Start container C1 hosting five more primaries (P5 - P9). Replicas are balanced on the containers.
    • C0 hosts: P0, P1, P2, P3, P4, R5, R6, R7, R8, R9.
    • C1 hosts: P5, P6, P7, P8, P9, R0, R1, R2, R3, R4.
  3. Start container C2 hosting five more primaries (P10 - P14). Replicas are balanced further.
    • C0 hosts: P0, P1, P2, P3, P4, R7, R8, R9, R10, R11, R12.
    • C1 hosts: P5, P6, P7, P8, P9, R2, R3, R4, R13, R14.
    • C2 hosts: P10, P11, P12, P13, P14, R5, R6, R0, R1.

The pattern continues as more containers are started, creating five new primary partitions each time and rebalancing replicas on the available containers in the data grid.

Note: WebSphere eXtreme Scale does not move primary shards when the per-container placement strategy is used, only replicas.

The partition numbers are arbitrary and have nothing to do with keys, so you cannot use key-based routing. If a container stops, then the partition IDs created for that container are no longer used, so there is a gap in the partition IDs. In the example, there would no longer be partitions P5 - P9 if the container C1 failed, leaving only P0 - P4 and P10 - P14, so key-based hashing is impossible.

A value like five, or more likely 10, partitions per container works best if you consider the consequences of a container failure. To spread the load of hosting shards evenly across the data grid, you need more than just one partition for each container. If you had a single partition per container, then when a container fails, only one container (the one hosting the corresponding replica shard) must bear the full load of the lost primary. In this case, the load is immediately doubled for that container. However, if you have five partitions per container, then five containers pick up the load of the lost container, lowering the impact on each by 80 percent. Using multiple partitions per container generally lowers the potential impact on each container substantially. More directly, consider a case in which a container spikes unexpectedly; the replication load of that container is spread over five containers rather than only one.
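
The following standalone Java sketch (not an eXtreme Scale API, just the arithmetic from this paragraph) shows how the extra load on each surviving container shrinks as the number of partitions per container grows, assuming the replicas of a failed container's primaries are spread across different containers.

    // Relative extra load per replica-hosting container after one container fails.
    public class FailoverImpact {
        public static void main(String[] args) {
            int[] partitionsPerContainer = {1, 5, 10};
            for (int n : partitionsPerContainer) {
                // With n partitions per container, n other containers each promote
                // one replica, so each takes on 1/n of the lost container's load.
                double extraLoadPercent = 100.0 / n;
                System.out.printf("%2d partition(s) per container -> %.0f%% extra load per replica host%n",
                        n, extraLoadPercent);
            }
        }
    }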

A per-container policy

Several scenarios make the per-container strategy an ideal configuration, such as HTTP session replication or application session state. In such a case, an HTTP router assigns a session to a servlet container. The servlet container needs to create an HTTP session and chooses one of the five local partition primaries for the session. The "ID" of the partition that is chosen is then stored in a cookie. The servlet container now has local access to the session state, which means zero-latency access to the data for this request as long as you maintain session affinity, and eXtreme Scale replicates any changes to the partition.
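
A minimal sketch of the web-tier side follows, assuming the servlet API and a client ObjectGrid reference are available, and assuming that calling getSessionHandle() on a new Session binds it to one of the local home primaries (as described in SessionHandle for routing). The cookie name and the Base64 encoding of the serialized handle are illustrative choices, not part of the product API.

    import java.io.ByteArrayOutputStream;
    import java.io.ObjectOutputStream;
    import java.util.Base64;

    import javax.servlet.http.Cookie;
    import javax.servlet.http.HttpServletResponse;

    import com.ibm.websphere.objectgrid.ObjectGrid;
    import com.ibm.websphere.objectgrid.Session;
    import com.ibm.websphere.objectgrid.SessionHandle;

    public class PartitionAffinityHelper {
        // Binds a new eXtreme Scale Session to a partition primary and records the
        // routing handle in a cookie so later requests can return to that partition.
        public static Session bindAndRemember(ObjectGrid grid, HttpServletResponse response)
                throws Exception {
            Session ogSession = grid.getSession();
            SessionHandle handle = ogSession.getSessionHandle(); // chooses a home primary

            // SessionHandle is serializable, so Java serialization plus Base64 is one
            // way to carry it in a cookie value.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(handle);
            }
            response.addCookie(new Cookie("WXS_PARTITION",
                    Base64.getUrlEncoder().encodeToString(buffer.toByteArray())));
            return ogSession;
        }
    }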

In practice, remember the repercussions of a case in which you have multiple partitions per container (say five again). With each new container started, you have five more partition primaries and five more replicas. Over time, more partitions should be created and they should not move or be destroyed. But this is not how the containers would behave. When a container starts, it hosts five primary shards, which can be called "home" primaries, existing on the respective containers that created them. If the container fails, the replicas become primaries and eXtreme Scale creates five more replicas to maintain high availability (unless you disabled auto repair). The new primaries are in a different container than the one that created them, which can be called "foreign" primaries. The application should never place new state or sessions in a foreign primary. Eventually, the foreign primary has no entries and eXtreme Scale automatically deletes it and its associated replicas. The foreign primaries' purpose is to allow existing sessions to still be available (but not new sessions).

A client can still interact with a data grid that does not rely on keys. The client just begins a transaction and stores data in the data grid independent of any keys. It asks the Session for a SessionHandle object, a serializable handle that allows the client to interact with the same partition when necessary. WebSphere eXtreme Scale chooses a partition for the client from the list of home partition primaries. It does not return a foreign primary partition. The SessionHandle can be serialized in an HTTP cookie, for example, and the cookie can later be converted back into a SessionHandle. Then, the WebSphere eXtreme Scale APIs can use the SessionHandle to obtain a Session that is bound to the same partition.
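
The following Java sketch shows that flow, assuming a client-side grid named Grid1 with a map named Map1 and a catalog server at cataloghost:2809 (all placeholder names), and using the Session getSessionHandle and setSessionHandle methods described in SessionHandle for routing.

    import com.ibm.websphere.objectgrid.ClientClusterContext;
    import com.ibm.websphere.objectgrid.ObjectGrid;
    import com.ibm.websphere.objectgrid.ObjectGridManager;
    import com.ibm.websphere.objectgrid.ObjectGridManagerFactory;
    import com.ibm.websphere.objectgrid.ObjectMap;
    import com.ibm.websphere.objectgrid.Session;
    import com.ibm.websphere.objectgrid.SessionHandle;

    public class PerContainerClient {
        public static void main(String[] args) throws Exception {
            ObjectGridManager ogm = ObjectGridManagerFactory.getObjectGridManager();
            ClientClusterContext ccc = ogm.connect("cataloghost:2809", null, null);
            ObjectGrid grid = ogm.getObjectGrid(ccc, "Grid1");

            // Store data; eXtreme Scale picks one of the home partition primaries.
            Session session = grid.getSession();
            session.begin();
            ObjectMap map = session.getMap("Map1");
            map.insert("someKey", "someValue"); // key only needs to be unique within the partition
            session.commit();

            // The serializable handle identifies the partition that was chosen.
            SessionHandle handle = session.getSessionHandle();

            // Later, possibly after round-tripping the handle through a cookie:
            // bind a new Session to the same partition and read the data back.
            Session later = grid.getSession();
            later.setSessionHandle(handle);
            later.begin();
            Object value = later.getMap("Map1").get("someKey");
            later.commit();
            System.out.println(value);
        }
    }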

Note: You cannot use agents to interact with a PER_CONTAINER data grid.

Advantages

The per-container placement strategy is different from a normal fixed partition or hash data grid because the client stores data in a place in the grid. The client gets a handle to it and uses the handle to access it again. There is no application-supplied key as there is in the fixed-partition placement strategy.

Your deployment does not make a new partition for each Session. So in a per-container deployment, the keys that are used to store data in the partition must be unique within that partition. For example, you can have your client generate a unique SessionID and then use it as the key to find information in the Maps in that partition. Multiple client sessions then interact with the same partition, so the application must use unique keys to store session data in each partition.
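
For example, a client might generate its own session ID and use it as the key inside whichever partition its SessionHandle-bound Session routes to. This is a sketch that reuses the placeholder map name from the earlier client example.

    import java.util.UUID;

    import com.ibm.websphere.objectgrid.ObjectMap;
    import com.ibm.websphere.objectgrid.Session;

    public class SessionDataWriter {
        // Stores one client's session state in whichever partition the given
        // (SessionHandle-bound) Session routes to. The key only needs to be unique
        // within that partition; a random UUID satisfies that easily.
        public static String storeSessionState(Session session, Object state) throws Exception {
            String sessionId = UUID.randomUUID().toString();
            session.begin();
            ObjectMap map = session.getMap("Map1"); // placeholder map name
            map.insert(sessionId, state);
            session.commit();
            return sessionId; // keep this key together with the SessionHandle
        }
    }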

The previous examples used five partitions, but the numberOfPartitions attribute in the deployment policy XML file can be used to specify as many partitions as required. With the per-container strategy, the setting is interpreted per container instead of per data grid. (The number of replicas is specified in the same way as with the fixed-partition policy.)

The per-container policy can also be used with multiple zones. If possible, eXtreme Scale returns a SessionHandle to a partition whose primary is located in the same zone as that client. The client can specify the zone as a parameter to the container or by using an API. The client zone ID can be set in the serverproperties or clientproperties file.

The per-container strategy for a data grid suits applications that store conversational-type state rather than database-oriented data. The key to access the data would be a conversation ID and is not related to a specific database record. This strategy provides higher performance (because the partition primaries can be collocated with the servlets, for example) and easier configuration (without having to calculate partitions and containers).