Appliance topology: collectives, zones, and data grids

A data grid is a storage unit that you create to hold the objects for a specific application or set of applications. A collective groups appliances together for scalability and management purposes. A zone defines the physical location of an appliance and is used to determine the placement of the data in your cache.

Appliance topology

Both collectives and zones are associated with one or more WebSphere® DataPower® XC10 Appliances. Each appliance can be a member of one collective and one zone. Each appliance hosts multiple data grids, which hold the cache data.
Figure 1. Collectives and zones topology
This collective contains two zones, Rack1 and Rack2, each of which contains one appliance. Each appliance runs the catalog service and hosts primary and replica data grids.
Important: Two appliances are required to make your data grid highly available.

Collectives and multimaster replication

Multimaster replication is a technique for ensuring continuous availability across multiple deployment environments. Multimaster topologies can be implemented in the WebSphere DataPower XC10 Appliances by creating multiple collectives and linking them. When you define a collective, the following information is shared among the appliances in the collective: data grids, monitoring information, collective and zone members, and users. When you update any of this information, your changes are persisted to all of the other appliances in the collective.

The catalog service enables the communication between appliances. The catalog service is a group of catalog servers. Each appliance in the collective runs a catalog server, with a limit of three catalog servers for each collective. If you have more than three appliances in a collective, the catalog service runs on the first three appliances that were added to the collective. If you remove an appliance with a catalog server from the collective or an appliance with a catalog server becomes unavailable, the next appliance that you add to the collective runs a catalog server. The catalog server does not fail over to other appliances.

To add an appliance to a collective, copy the host name and secret key that are found on the Collective > Members panel of the appliance that you want to add, and enter them on the same panel of an appliance that is already in the collective. You can make this change from any appliance in the collective because the collective membership is persisted among the collective members. During assimilation, the secret key authenticates the new appliance to the collective.

The appliance secret key is used first for assimilation. When one appliance assimilates another, the assimilating appliance must authenticate to the target of the assimilation. After assimilation occurs, the secret key of the assimilated appliance is set to the value of the secret key of the appliance that initiated the assimilation. Subsequently, this appliance secret key is used to authenticate administrative operations that are processed by internal components of the collective. You cannot configure or modify the appliance secret key; it is randomly chosen when you start the appliance.

The data grid operations secret key is the key that you set on the Collective settings panel. Catalog servers and container servers use this secret key to authenticate to one another for replication and other data grid operations. This key has the same value for all members of a collective. In addition, the data grid operations secret key must have the same value in every multimaster replication (MMR)-linked appliance collective so that MMR replication works correctly. Because each of the MMR-linked collectives must use the same data grid operations secret key, the appliance secret key cannot be used for this purpose; different collectives have different appliance secret keys.

An appliance can belong to only one collective. You cannot add an appliance that is already in a collective to a different collective, and you cannot join two collectives into a single collective. To combine appliances from separate collectives, you must remove each appliance from its respective collective so that each appliance stands alone. You can then create a new collective that includes all of the appliances.

While you can use a collective to make most configuration changes, you must log in to a given appliance to change the settings on the Appliance > Appliance Settings and Appliance > Troubleshooting panels.

Zones

Zones are associated with a physical location of the appliance, such as a city or rack location in a lab. Zones help the catalog service to define where the data in your data grids is stored. For example, if the primary information for your data grid is stored in a given zone, then the replica data is stored in an appliance that is in a different zone. With this configuration, failover can occur from the primary to a replica if the appliance that holds the data grid primary fails.

Differences between multimaster replication, collectives, and zones

One of the most noticeable differences in data replication from an administrative perspective is that all appliances in a collective share the following configuration data: data grids, monitoring information, collective and zone members, and users and groups. This data is not shared across MMR, so all configuration changes must be made separately for each collective. In terms of data replication and failure recovery, collectives and zones are similar in that you have two appliances and each one has one copy of the data. However, MMR replication works differently. MMR replicates changes and existing data, but deletes are not tracked. If the two sides of an MMR link are disconnected and later rejoined, any data that was deleted on one side while the link was down is restored from the copy on the other side.

MMR has the following additional benefits and disadvantages:
  • MMR requires collision arbitration when you make changes on both sides of the MMR link, which might degrade performance.
  • MMR provides higher performance for geographically separated data centers or when the network performance between appliances is unstable.
  • MMR between collectives of multiple appliances might reduce capacity compared to putting all of the appliances in a single collective, because replicas are placed on both sides of the link. The difference in capacity occurs only as you grow to three or more members.
  • MMR between collectives of single appliances might lengthen recovery time if one side fails. In larger scenarios, for example four appliances, each appliance in the MMR scenario (two collectives of two members) holds more data than it would in one collective of four members. Therefore, in this case, recovery time due to replication is longer with MMR because each appliance likely holds more data that must be recovered.
  • With MMR, Java clients must connect to each domain or collective, as shown in the sketch after this list. Load balancing is not done for collectives that are enabled for MMR.
  • MMR might yield higher throughput because you can write against more primary shards. However, load balancing and multidomain or multicollective client routing are not supported.
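
As a minimal sketch, assuming the data grids are accessed with the WebSphere eXtreme Scale Java client (ObjectGridManager API), and with hypothetical host names, port, and grid name, a client opens one connection per MMR-linked collective:

    import com.ibm.websphere.objectgrid.ClientClusterContext;
    import com.ibm.websphere.objectgrid.ObjectGrid;
    import com.ibm.websphere.objectgrid.ObjectGridManager;
    import com.ibm.websphere.objectgrid.ObjectGridManagerFactory;

    public class DualCollectiveClient {
        public static void main(String[] args) throws Exception {
            ObjectGridManager manager = ObjectGridManagerFactory.getObjectGridManager();

            // One connection to the catalog service of each collective.
            // Host names, port, and grid name are hypothetical.
            ClientClusterContext siteA = manager.connect("xc10-siteA.example.com:2809", null, null);
            ClientClusterContext siteB = manager.connect("xc10-siteB.example.com:2809", null, null);

            ObjectGrid gridA = manager.getObjectGrid(siteA, "MyGrid");
            ObjectGrid gridB = manager.getObjectGrid(siteB, "MyGrid");

            System.out.println("Connected to " + gridA.getName()
                    + " in collective A and " + gridB.getName() + " in collective B");
        }
    }

Which connection a given request uses is left to the client or an external routing layer, because load balancing is not done across MMR-linked collectives.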

Data grids

Data grids hold the objects for your applications. By caching objects, you can increase the performance of your application. There are three types of data grids:
Simple data grid
Simple data grids hold data in key-value pairs. For example, you can store the results of a database query in a simple data grid. You use the ObjectMap API to implement a simple data grid. The ObjectMap API works similarly to Java™ Maps; see the sketch after these descriptions.
Session data grid
If you are using WebSphere Application Server sessions, you can configure your application to use a session data grid on the appliance for session management data. You can configure your application to use a session data grid when you are installing a new application. You can also update your existing application or server settings to use the session data grid on the appliance.
Dynamic cache data grid
You can use a dynamic cache data grid on the appliance to store data from your WebSphere Application Server dynamic cache. You can enable applications that are written with the Dynamic Cache API or applications that use container-level caching, such as servlets, to use the appliance as the cache provider. As a result, less memory is used by your application servers. All the cache data is offloaded to the appliance and is no longer stored in application server memory.
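
As a minimal sketch of the simple data grid case, assuming the WebSphere eXtreme Scale Java client and hypothetical host, port, grid, and map names, a client inserts and reads one key-value pair with the ObjectMap API:

    import com.ibm.websphere.objectgrid.ClientClusterContext;
    import com.ibm.websphere.objectgrid.ObjectGrid;
    import com.ibm.websphere.objectgrid.ObjectGridManager;
    import com.ibm.websphere.objectgrid.ObjectGridManagerFactory;
    import com.ibm.websphere.objectgrid.ObjectMap;
    import com.ibm.websphere.objectgrid.Session;

    public class SimpleDataGridClient {
        public static void main(String[] args) throws Exception {
            // Connect to the catalog service of the collective (host name and port are hypothetical).
            ObjectGridManager manager = ObjectGridManagerFactory.getObjectGridManager();
            ClientClusterContext context = manager.connect("xc10.example.com:2809", null, null);

            // "CustomerGrid" is a hypothetical simple data grid created on the appliance.
            ObjectGrid grid = manager.getObjectGrid(context, "CustomerGrid");
            Session session = grid.getSession();

            // An ObjectMap stores key-value pairs, much like a java.util.Map.
            ObjectMap map = session.getMap("CustomerGrid");
            map.insert("customer:1001", "Jane Doe");          // add a new entry
            String value = (String) map.get("customer:1001"); // read it back
            System.out.println("Cached value: " + value);
        }
    }

The connection is typically established once and reused; the sketch keeps everything in a single method for brevity.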

Data grid replicas

You can define a target number of replicas for a given data grid. Replicas are created when you have at least two appliances in your collective; if you have one appliance, no replicas are created. If you have n appliances in your collective, the maximum number of replicas is n-1, because one appliance hosts the primary data grid. For example, with four appliances in a collective, a data grid can have at most three replicas. If your target number of replicas is higher than the current n-1, more replicas can be placed when you add appliances to the collective. Because editing the replica settings requires the data grids to be cleared, consider setting the number of replicas to the highest number that you might want in the future. As new appliances join the collective, additional replicas are created and rebalancing occurs, so that primary and replica data grids are evenly distributed, or striped, across all of the appliances in the collective.

Replicas can be synchronous replicas or asynchronous replicas. Synchronous replicas receive updates as part of the transaction on the primary data grid. Asynchronous replicas are updated after the transaction on the primary data grid is committed. Synchronous replicas guarantee data consistency, but can increase the response time of a request when compared with an asynchronous replica. Asynchronous replicas do not have the same guarantee in data consistency, but can make your transactions complete faster. A data grid has one asynchronous replica by default. A placement algorithm controls where the replicas are located.

Maps

Maps are the data structures that contain the data for the data grid in key-value pairs. A single data grid can have multiple maps, which reside on the primary data grid and the data grid replicas.

You can create additional maps in the data grid by having your client application connect to a specifically named map; a dynamic map is created automatically.
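
As a minimal sketch, assuming the hypothetical CustomerGrid and the WebSphere eXtreme Scale client from the earlier example, connecting to a new map name creates the dynamic map on first use:

    import com.ibm.websphere.objectgrid.ObjectGrid;
    import com.ibm.websphere.objectgrid.ObjectMap;
    import com.ibm.websphere.objectgrid.Session;

    public class DynamicMapExample {
        // The grid argument is an ObjectGrid obtained from a connection such as the one
        // in the earlier sketch. The map name is hypothetical; if the map does not exist
        // yet in the data grid, a dynamic map is created automatically.
        static void writeToDynamicMap(ObjectGrid grid) throws Exception {
            Session session = grid.getSession();
            ObjectMap orders = session.getMap("CustomerGrid.orders");
            orders.insert("order:42", "pending");
        }
    }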

Collective links

A single collective should not span an unreliable network because false positive failure detections might occur. However, you might still want to replicate data grid data across appliances that have unreliable network connectivity. Some common scenarios where you might want to use this type of topology follow:
  • Disaster recovery between data centers where one collective is active and the other is used for backup
  • Geographically distributed data centers where all collectives are active for geographically close clients
After you connect two collectives, any data grids that have the same names are asynchronously replicated between the collectives. These data grids must have the same number of replicas in each collective, and must have the same dynamic map configurations.