Controlling placement

You can use several options to control when shards are placed on the container servers in your configuration. During startup, you might choose to delay shard placement. After all of your container servers are running, you might need to suspend, resume, or otherwise change placement while you maintain servers.

Procedure

Controlling placement during startup

You can control when shards begin to be placed while your environment is starting. Some control is in place by default. If you do not take any actions to control placement, shards begin to be placed immediately. When shards are placed immediately, the shards might not be placed evenly as subsequent container servers start, and further placement operations run to balance the distribution.

  • Temporarily suspend the balancing of shards to prevent immediate shard placement when your container servers are starting.

    Suspending the balancing of shards prevents uneven shard placement. Before you start your container servers, use the xscmd -c suspendBalancing command to stop the balancing of shards for a specific data grid and map set. After the container servers are started, you can use the xscmd -c resumeBalancing command to begin the placement of shards on the container servers.
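    For example, the following sketch suspends balancing before the container servers start and resumes it after they are started. The -g and -ms options that scope the commands to a data grid named myOG and a map set named myMapSet are assumptions for illustration; substitute your own names and confirm the available options for your release.

      Console> xscmd -c suspendBalancing -g myOG -ms myMapSet
      ... start all of the container servers ...
      Console> xscmd -c resumeBalancing -g myOG -ms myMapSet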

  • Suspend or resume placement and heartbeating.

    A placement bottleneck occurs when the catalog service runs the balance algorithm each time a container server starts. In this case, you can suspend placement, start all of the container servers, and then resume placement so that the balance algorithm runs only once after all of the container servers are started.

    You can also suspend heartbeating to prevent consistency issues while the high availability manager core groups are formed. When many container servers start at the same time, the catalog server sends a newly defined list of container servers to each high availability core group. At the same time, the container servers are trying to establish their own view of the core group. Because the catalog server's list of core group members can differ from a container server's list, it is optimal for the catalog server to process core group health reports from the containers only after all of the containers are started. The result is a final defined set for each core group, and the containers can form a high availability manager view around those core groups. Therefore, there is a benefit to ignoring high availability manager core group events when servers are started concurrently.

    WebSphere® eXtreme Scale provides an administrative option, called suspending heartbeating, that ignores high availability manager core group events. When you suspend heartbeating, the catalog server ignores any failure detection reports from the high availability catalog service. When you resume heartbeating, the default behavior returns, and container failures are again detected and reported by the high availability catalog service. Resuming heartbeating also covers any containers that failed while heartbeating was suspended; those failures are still detected and acted upon after heartbeating is resumed. For more information, see Tuning the heartbeat interval setting for failover detection.
    1. Use the xscmd -c suspend command to stop heartbeating and placement for a shard container or a domain.
      The following example illustrates how heartbeating and placement are suspended with this command for a domain named E13896FA-6141-4435-E000-000C2962AA71. The domain contains three data grids, myOG3, myOG2, and myOG1, and a map set named myMapSet.
      Console> xscmd -c suspend 
      
      [Tue Oct 22 2013 17:38:36] xscmd starting... 
      Starting at: 2013-10-22 17:38:38.012  
      
      CWXSI0068I: Executing command: suspend 
      
      Type      Suspendable            Object Name                          Status Details
      --------- ---------------------- ------------------------------------ ---------------------------
      heartbeat Domain name            E13896FA-6141-4435-E000-000C2962AA71 Heartbeating was suspended.
      placement Grid Name/Map Set Name myOG3/myMapSet                       Balancing was suspended.
      placement Grid Name/Map Set Name myOG1/myMapSet                       Balancing was suspended.
      placement Grid Name/Map Set Name myOG2/myMapSet                       Balancing was suspended.
      
      CWXSI0040I: The suspend command completed successfully.
      When heartbeating and placement are suspended, the following placement actions can still run:
      • Shard promotion can occur when container servers fail.
      • Shard role swapping with the xscmd -c swapShardWithPrimary command.
      • Shard placement that is triggered with the xscmd -c triggerPlacement -g myOG -ms myMapSet command, where myOG and myMapSet are set to values for your data grid and map set.
    2. Start the container servers.
    3. Use the xscmd -c resume -t placement command to resume placement.
      The following example illustrates how placement is resumed for the data grids myOG3, myOG1, and myOG2 and the map set myMapSet.
      Console> xscmd -c resume -t placement
      
      [Tue Oct 22 2013 17:40:28] xscmd starting...
      Starting at: 2013-10-22 17:40:29.509
      
      CWXSI0068I: Executing command: resume
      
      Type      Suspendable            Object Name    Status Details
      --------- ---------------------- -------------- ----------------------
      placement Grid Name/Map Set Name myOG3/myMapSet Balancing has resumed.
      placement Grid Name/Map Set Name myOG1/myMapSet Balancing has resumed.
      placement Grid Name/Map Set Name myOG2/myMapSet Balancing has resumed.
      
      CWXSI0040I: The resume command completed successfully.
      Ending at: 2013-10-22 17:40:34.371
      Important: After placement is resumed, use the xscmd -c showPlacement command until all partitions are shown as placed.
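      For example, the following sketch checks the placement status; the -g and -ms options that limit the output to one data grid and map set are assumptions for illustration.
      Console> xscmd -c showPlacement -g myOG -ms myMapSet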
    4. Optional: Use the xscmd -c balanceShardTypes command to adjust the ratio of primary and replica shards so that it is equitable among the running container servers in the configuration. The ratio is balanced to within one shard on each container server.
    5. Use the xscmd -c resume -t heartbeat command to resume heartbeating after placement has finished.
      The following example illustrates how heartbeating is resumed for the domain E13896FA-6141-4435-E000-000C2962AA71.
      Console> xscmd -c resume -t heartbeat
      
      [Tue Oct 22 2013 17:42:59] xscmd starting...
      Starting at: 2013-10-22 17:42:59.872
      
      CWXSI0068I: Executing command: resume
      
      Type      Suspendable Object Name                          Status Details
      --------- ----------- ------------------------------------ -------------------------
      heartbeat Domain name E13896FA-6141-4435-E000-000C2962AA71 Heartbeating has resumed.
      
      CWXSI0040I: The resume command completed successfully.
      Ending at: 2013-10-22 17:43:04.549
  • [Version 8.6.0.4 and later] Delay placement when you are adding a zone to the environment.

    For example, you might start the first zone with 30 container servers. Then, at a later point in time, you want to add a second zone to the configuration. The default behavior is to place as many replicas as possible in the second zone as soon as it becomes available. So, as soon as one container server is available in the second zone, the replicas for the 30 container servers in the first zone are placed on the newly started container server in the second zone. This action can lead to out-of-memory exceptions in the newly started container server. You can prevent this behavior by setting the allowableShardOverrage property on the catalog server.

    allowableShardOverrage
    Specifies the percentage of container servers that a zone must have, compared to the other zones in a multi-zone deployment, before all the replica shards are placed in that zone. If the percentage of container servers in the zone is below the specified value, only a proportional subset of the available replicas is placed. After the percentage exceeds the specified value, all the replicas are placed. Primary shards are always placed. For example, suppose that the allowableShardOverrage value is set to 0.75 (75 percent). If zone1 has two container servers and zone2 has three container servers, the percentage of container servers between the zones is 2/3 (67 percent). Because this percentage is less than the allowableShardOverrage value of 75 percent, not all of the replicas for the data grid are placed in zone1 until the percentage reaches at least 75 percent; for example, when a third container server is added to zone1.
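    For example, the following catalog server properties entry is a sketch that sets the threshold to 75 percent. The property name comes from this section; the file name is a hypothetical placeholder for whatever server properties file you pass to your catalog server.

      # catalogServer.properties (hypothetical file name)
      # A zone receives all of its replicas only after it has at least 75 percent
      # as many container servers as the other zones.
      allowableShardOverrage=0.75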
  • Configure the placementDeferralInterval property to minimize the number of shard placement cycles on the container servers. Shard placement is triggered at the defined time interval.
    placementDeferralInterval
    Specifies the interval in milliseconds for deferring the balancing and placement of shards on the container servers. Placement does not start until after the time specified in the property has passed. Increasing the deferral interval lowers processor utilization, but the placement of work items is completed over time. A decrease in the deferral interval increases short-term processor usage, but the placement of work items is more immediate and expedited.

    If multiple container servers are starting in succession, the deferral interval timer is reset if a new container server starts within the given interval. For example, if a second container server starts 10 seconds after the first container server, placement does not start until 15 seconds after the second container server started. However, if a third container server starts 20 seconds after the second container server, placement has already begun on the first two container servers.

    When container servers become unavailable, placement is triggered as soon as the catalog server learns of the event so that recovery can occur as quickly as possible.

    Default: 15000 ms (15 seconds)

    You can use the following tips to help determine if your placement deferral value is set to the right amount of time:
    • As you concurrently start the container servers, look at the CWOBJ1001 messages in the SystemOut.log file for each container server. The timestamp of these messages in each container server log file indicates the actual container server start time. You might consider adjusting the placementDeferralInterval property to include more container server starts. For example, if the first container server starts 90 seconds before the last container server, you might set the property to 90 seconds (90000 ms).
    • Note how long after the CWOBJ1001 messages the CWOBJ1511 messages occur. This amount of time can indicate whether the deferral occurred successfully.
    • If you are using a development environment, consider the length of the interval when you are testing your application.
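    For example, the following server properties entry is a sketch that extends the deferral window to 60 seconds (60000 ms) so that more container servers can start before the first placement cycle runs. The file name is a hypothetical placeholder for the server properties file that your catalog server uses.

      # catalogServer.properties (hypothetical file name)
      # Defer placement until 60 seconds after the most recent container server start.
      placementDeferralInterval=60000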
  • Configure the numInitialContainers attribute.

    If you previously used the numInitialContainers attribute, you can continue using it. However, using the xscmd -c suspendBalancing and xscmd -c resumeBalancing commands, followed by the placementDeferralInterval property, is suggested over the numInitialContainers attribute for controlling placement. The numInitialContainers attribute, which is in the deployment policy descriptor XML file, specifies the number of container servers that are required before initial placement occurs for the shards in this mapSet element. If you have both numInitialContainers and placementDeferralInterval set, no placement occurs until the numInitialContainers value is met, regardless of the value of the placementDeferralInterval property.
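    For example, the following deployment policy descriptor fragment is a sketch that requires three container servers before initial shard placement. The grid, map set, and map names, and the partition and replica settings, are placeholders for illustration only.

      <objectgridDeployment objectgridName="myOG">
        <mapSet name="myMapSet" numberOfPartitions="13"
                minSyncReplicas="0" maxSyncReplicas="1"
                numInitialContainers="3">
          <map ref="myMap"/>
        </mapSet>
      </objectgridDeployment>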

Controlling placement after initial startup

  • Force placement to occur.

    You can use the xscmd -c triggerPlacement -g myOG -ms myMapSet command, where myOG and myMapSet are set to values for your data grid and map set, to force placement at a point in time when it might not otherwise occur. For example, you might run this command when the amount of time that is specified by the placementDeferralInterval property has not yet passed or when balancing is suspended.

  • Reassign a primary shard.

    Use the xscmd -c swapShardWithPrimary command to assign a replica shard to be the new primary shard. The previous primary shard becomes a replica.
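    For example, the following sketch promotes the replica shard for one partition of a hypothetical data grid. The -g, -ms, -p, and -ct options that identify the data grid, map set, partition, and the container server that hosts the replica are assumptions for illustration; confirm the option names for your release.

      Console> xscmd -c swapShardWithPrimary -g myOG -ms myMapSet -p 4 -ct myContainerServer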

  • Rebalance the primary and replica shards.

    Use the xscmd -c balanceShardTypes command to adjust the ratio of primary and replica shards so that it is equitable among the running container servers in the configuration. The ratio is balanced to within one shard on each container server.
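    For example, the following sketch rebalances the shard types for a hypothetical data grid. The -g and -ms options that scope the command to a data grid and map set are assumptions for illustration.

      Console> xscmd -c balanceShardTypes -g myOG -ms myMapSet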

  • Suspend or resume placement.
    Use the xscmd -c suspendBalancing command or the xscmd -c resumeBalancing command to stop and start the balancing of shards for a specific data grid and map set. When balancing has been suspended, the following placement actions can still run:
    • Shard promotion can occur when container servers fail.
    • Shard role swapping with the xscmd -c swapShardWithPrimary command.
    • Shard placement that is triggered with the xscmd -c triggerPlacement -g myOG -ms myMapSet command, where myOG and myMapSet are set to values for your data grid and map set.
  • [Version 8.6 and later] Re-enable shard containers that were disabled for shard placement.

    When a problem occurs with placing shards on a particular shard container, the shard container is placed in the disabled for shard placement list. The shard containers in this list cannot be used for placement until you re-enable the shard container or the JVM that is hosting the shard container is recycled. When the JVM is stopped, the shard container is removed. When the JVM is restarted, the container count increments and a new name is used for the shard container for a specified data grid. Problems that might cause a shard container to be disabled include long garbage collection cycles that affect JVM health, DNS or naming configuration problems, intermittent network outages, and other issues. Any shards that were successfully placed on the shard container are not moved off that shard container. It is possible that clients can still access a shard even though communication between shard containers, or between catalog servers and container servers, is not working.

    The shard containers that are in the disabled for shard placement list are designated as UNASSIGNED. Unless the JVM for the shard container is recycled, or another shard container is stopped or started, the shards remain unassigned until you run the xscmd -c triggerPlacement command. The balance cycle does not run automatically when a shard container is disabled because the shard in question, or the data in the shard, might be causing the problem; rebalancing automatically could propagate that shard to other shard containers. You must investigate the issue and run the xscmd -c triggerPlacement command before you make any container lifecycle changes.

    To list the shard containers that are disabled, use the xscmd -c listDisabledForPlacement command.

    Resolve any issues with the shard container, and then run the xscmd -c enableForPlacement -ct <shard_container> command to re-enable it for placement.
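    For example, the following sequence is a sketch that lists the disabled shard containers, re-enables one of them after the underlying problem is corrected, and then triggers placement. The container name and the -g and -ms options are placeholders for your own values.

      Console> xscmd -c listDisabledForPlacement
      ... resolve the problem with the listed shard container ...
      Console> xscmd -c enableForPlacement -ct myHost_myContainer
      Console> xscmd -c triggerPlacement -g myOG -ms myMapSet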

What to do next

You can monitor the placement in the environment with the xscmd -c placementServiceStatus command.
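For example, the following sketch shows the command; the optional -g and -ms filters that limit the output to one data grid and map set are assumptions for illustration and can be omitted.

  Console> xscmd -c placementServiceStatus -g myOG -ms myMapSet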