Large program support

This topic provides information about using the large and very large address-space models to accommodate programs requiring data areas that are larger than those provided by the default address-space model.

The large address-space model is available on AIX® 4.3 and later. The very large address-space model is available on AIX 5.1 and later.

Note: This discussion applies only to 32-bit processes.

The virtual address space of a 32-bit process is divided into 16 256-megabyte areas (or segments), each addressed by a separate hardware register. The operating system refers to segment 2 (virtual addresses 0x20000000-0x2FFFFFFF) as the process-private segment. By default, this segment contains the user stack and data, including the heap. The process-private segment also contains the u-block of the process, which is used by the operating system and is not readable by an application.
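As a rough illustration of the segment arithmetic described above, the following C sketch maps a 32-bit virtual address to its segment number by taking the top four bits of the address. The name segment_of is hypothetical and is not part of any AIX interface.

```c
#include <stdint.h>

/* Hypothetical helper, not an AIX API: a 32-bit address space split into
 * 16 segments of 256 MB means the top 4 bits select the segment. */
unsigned segment_of(uint32_t addr)
{
    return addr >> 28;
}
```

Under this sketch, segment_of(0x20000000) and segment_of(0x2FFFFFFF) both yield 2, the process-private segment.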

Because a single segment is used for both user data and stack, their maximum aggregate size is slightly less than 256 MB. Certain programs, however, require large data areas (initialized or uninitialized), or they need to allocate large amounts of memory with the malloc or sbrk subroutine. Programs can be built to use the large or very large address-space model, allowing them to use up to 2 GB of data.

It is possible to use either the large or very large address-space model with an existing program, by providing a non-zero maxdata value. The maxdata value is obtained either from the LDR_CNTRL environment variable or from a field in the executable file. Some programs have dependencies on the default address-space model, and they will break if they are run using the large address-space model.

Understanding the large address-space model

The large address-space model allows specified programs to use more than 256 MB of data. Other programs continue to use the default address-space model. To allow a program to use the large address-space model, specify a non-zero maxdata value. You can specify a non-zero maxdata value either by using the ld command when you're building the program, or by exporting the LDR_CNTRL environment variable before executing the program.

When a program using the large address-space model is executed, the operating system reserves as many 256 MB segments as needed to hold the amount of data specified by the maxdata value. Then, beginning with segment 3, the program's initialized data is read from the executable file into memory. The data read begins in segment 3, even if the maxdata value is smaller than 256 MB. With the large address-space model, a program can have a maximum of 8 segments, or 2 GB, of data.
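The number of segments reserved for a given maxdata value can be estimated by rounding the value up to a multiple of 256 MB. The following C sketch shows that arithmetic only; segments_for_maxdata is a hypothetical name, and the real reservation is done by the operating system.

```c
#include <stdint.h>

#define SEGMENT_SHIFT 28            /* 2^28 bytes = 256 MB per segment */
#define SEGMENT_MASK  0x0FFFFFFFu   /* offset bits within a segment    */

/* Hypothetical helper: number of 256 MB segments needed to hold maxdata
 * bytes, rounding any partial segment up to a whole one. */
unsigned segments_for_maxdata(uint32_t maxdata)
{
    return (maxdata >> SEGMENT_SHIFT) + ((maxdata & SEGMENT_MASK) != 0);
}
```

For example, a maxdata value of 0x80000000 gives 8 segments, and a value of 0x68000000 gives 7 segments because the partially used seventh segment is counted in full.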

In the default address-space model, 12 segments are available for use by the shmat or mmap subroutines. When the large address-space model is used, the number of segments reserved for data reduces the number of segments available for the shmat and mmap subroutines. Because the maximum size of data is 2 GB, at least two segments are always available for the shmat and mmap subroutines.

The user stack remains in segment 2 when the large address-space model is used. As a result, the size of the stack is limited to slightly less than 256 MB. However, an application can relocate its user stack into a shared memory segment or into allocated memory.

While the size of initialized data in a program can be large, there is still a restriction on the size of text. In the executable file for a program, the size of the text section plus the size of the loader section must be less than 256 MB. This is required so that these sections will fit into a single, read-only segment (segment 1, the TEXT segment). You can use the dump command to examine section sizes.

Understanding the very large address-space model

The very large address-space model enables large data programs in much the same way as the large address-space model, although there are several differences between them. To allow a program to use the very large address-space model, you must specify a maxdata value and the dynamic segment allocation (dsa) property. Use either the ld command or the LDR_CNTRL environment variable to specify a maxdata value and the DSA option.

If a maxdata value is specified, the very large address-space model follows the large address-space model in that a program's data is read into memory starting with segment 3, and occupies as many segments as needed. The remaining data segments, however, are not reserved for the data area at execution time, but are obtained dynamically. Until a segment is needed for a program's data area, it can be used by the shmat or mmap subroutines. With the very large address-space model, a program can have a maximum of 13 segments, or 3.25 GB, of data. Of these 13 segments, 12 segments, or 3 GB, are available for use by the shmat and mmap subroutines.

When a process tries to expand its data area into a new segment, the operation succeeds as long as the segment is not being used by the shmat or mmap subroutines. A program can call the shmdt or munmap subroutine to stop using a segment so that the segment can be used for the data area. After a segment has been used for the data area, however, it can no longer be used for any other purpose, even if the size of the data area is reduced.

If a maxdata value is not specified (maxdata = 0) with the dsa property, the behaviour varies slightly from that described above. The process will have its data and stack in segment 2, similar to a regular process. The process will not have access to the global shared libraries, so all shared libraries used by the process will be loaded privately. The advantage to running this way is that a process will have all 13 segments (3.25 GB) available for use by the shmat and mmap subroutines.

To reduce the chances that the shmat or mmap subroutines will use a segment that could be used for the data area, the operating system uses a different rule for choosing an address to be returned (if a specific address is not requested). Normally, the shmat or mmap subroutines return an address in the lowest available segment. When the very large address-space model is used, these subroutines will return an address in the highest available segment. A request for a specific address will succeed, as long as the address is not in a segment that has already been used for the data area. This behaviour is followed for all processes that specify the dsa property.

With the very large address-space model, a maxdata value of zero or a value of up to 0xD0000000 can be specified. If a maxdata value larger than 0xAFFFFFFF is specified, a program will not use globally loaded shared libraries. Instead, all shared libraries will be loaded privately. This can affect program performance.
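The maxdata rules for the very large address-space model can be summarized in a small C sketch. It only restates the limits given in this topic. The function and macro names are hypothetical, and the checks assume that the dsa property is set.

```c
#include <stdbool.h>
#include <stdint.h>

#define VLM_MAXDATA_LIMIT  0xD0000000u  /* 13 segments, or 3.25 GB          */
#define GLOBAL_SHLIB_LIMIT 0xAFFFFFFFu  /* above this, libraries go private */

/* Hypothetical helper: is maxdata within the range accepted with dsa? */
bool vlm_maxdata_is_valid(uint32_t maxdata)
{
    return maxdata <= VLM_MAXDATA_LIMIT;
}

/* Hypothetical helper: with dsa set, shared libraries are loaded privately
 * when maxdata is zero or when it exceeds 0xAFFFFFFF. */
bool vlm_loads_libraries_privately(uint32_t maxdata)
{
    return maxdata == 0 || maxdata > GLOBAL_SHLIB_LIMIT;
}
```

With these assumptions, a maxdata value of 0x80000000 keeps the globally loaded shared libraries, while values of 0 and 0xD0000000 both result in private loading.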

Enabling the large and very large address-space models

The large address-space model is used if any non-zero value is specified for the maxdata value, and the dynamic segment allocation (dsa) property is not specified. The very large address-space model is used if any maxdata value, including zero, is given and the dsa property is specified. Use the ld command with the -bmaxdata flag to specify a maxdata value and to set the dsa property.

Use the following command to link a program that will have the maximum 8 segments reserved for its data:
cc -bmaxdata:0x80000000 sample.o 
To link a program with the very large address-space model enabled on the POWER® processor-based platform, use the following command:
cc -bmaxdata:0xD0000000/dsa sample.o 
You can cause existing programs to use the large or very large address-space models by specifying the maxdata value with the LDR_CNTRL environment variable. For example, use the following command to run the a.out program with 8 segments reserved for the data area:
LDR_CNTRL=MAXDATA=0x80000000 a.out
The following command runs the a.out program using the very large address-space model, allowing the program's data area to grow to as many as 8 segments:
LDR_CNTRL=MAXDATA=0x80000000@DSA a.out
You can also modify an existing program so that it will use the large or very large address-space model. To set the maxdata value in an existing 32-bit XCOFF program, a.out, to 0x80000000, use the following command:
/usr/ccs/bin/ldedit -bmaxdata:0x80000000 a.out
If an existing 32-bit XCOFF program, a.out, with a maxdata value of 0x80000000 does not already have the DSA property, you can add the property with the following command:
/usr/ccs/bin/ldedit -bmaxdata:0x80000000/dsa a.out

You can use the dump command to examine the maxdata value, or to determine whether a program has the dsa property.

Some programs have dependencies on the default address-space model. These programs terminate if a non-zero maxdata value has been specified, either by modifying the executable file of the program or by setting the LDR_CNTRL environment variable.

Executing programs with large data areas

When you execute a program that uses the large address-space model, the operating system attempts to modify the soft limit on data size, if necessary, to increase it to match the maxdata value. If the maxdata value is larger than the current hard limit on data size, the outcome depends on the XPG_SUS_ENV environment variable: if it is set to ON, the program will not execute; otherwise, the soft limit is set to the current hard limit.
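The decision described above can be modeled in a few lines of C. This is only a sketch of the rule as stated in this topic, not the operating system's implementation. The function name, parameter names, and return convention are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the start-up adjustment of the soft data limit.
 * Returns 0 and stores the resulting soft limit in *new_soft, or returns
 * -1 for the case in which the program would not execute. */
int startup_soft_limit(uint64_t maxdata, uint64_t soft, uint64_t hard,
                       bool xpg_sus_env_on, uint64_t *new_soft)
{
    if (maxdata <= soft) {          /* no increase is necessary */
        *new_soft = soft;
        return 0;
    }
    if (maxdata <= hard) {          /* raise the soft limit to maxdata */
        *new_soft = maxdata;
        return 0;
    }
    if (xpg_sus_env_on)             /* maxdata exceeds the hard limit */
        return -1;
    *new_soft = hard;               /* otherwise clamp to the hard limit */
    return 0;
}
```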

If the maxdata value is smaller than the size of the program's static data, the program will not execute.

After placing the program's initialized and uninitialized data in segments 3 and beyond, the break value is computed. The break value defines the end of the process's static data and the beginning of its dynamically allocatable data. Using the malloc, brk or sbrk subroutine, a process can move the break value to increase the size of the data area.

For example, if the maxdata value specified by a program is 0x68000000, then the maximum break value is in the middle of segment 9 (0x98000000). The brk subroutine extends the break value across segment boundaries, but the size of the data area cannot exceed the current soft data limit.
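The arithmetic in this example can be checked with a short C sketch. It assumes that the data area starts at the beginning of segment 3 (0x30000000) and that the maxdata value does not exceed 0x80000000, as in the large address-space model. The name max_break_value is hypothetical.

```c
#include <stdint.h>

#define DATA_AREA_START 0x30000000u   /* first byte of segment 3 */

/* Hypothetical helper: highest break value for a given maxdata value.
 * Assumes maxdata <= 0x80000000, so the 32-bit sum cannot overflow. */
uint32_t max_break_value(uint32_t maxdata)
{
    return DATA_AREA_START + maxdata;
}
```

For a maxdata value of 0x68000000, this yields 0x98000000, and shifting that address right by 28 bits gives segment 9.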

The setrlimit subroutine allows a process to set its soft data limit to any value that does not exceed the hard data limit. The maximum size of the data area, however, is limited to the original maxdata value, rounded up to a multiple of 256 MB.
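A process that wants the largest data area its limits allow can raise its soft data limit to the hard limit. The following C sketch uses the standard getrlimit and setrlimit subroutines with RLIMIT_DATA. The wrapper name raise_soft_data_limit is hypothetical, and the sketch does not change the cap imposed by the maxdata value.

```c
#include <sys/resource.h>

/* Hypothetical wrapper: set the soft data limit equal to the hard data
 * limit. Returns 0 on success and -1 on failure. */
int raise_soft_data_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;

    rl.rlim_cur = rl.rlim_max;      /* soft limit may not exceed hard limit */
    return setrlimit(RLIMIT_DATA, &rl);
}
```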

The majority of subroutines are unaffected by large data programs. The shmat and mmap subroutines are the most affected, because they have fewer segments available for use. If a program using the large address-space model forks, the child process inherits the current data resource limits.

Special considerations

Programs with large data spaces require a large amount of paging space. For example, if a program with a 2-GB address space tries to access every page in its address space, the system must have 2 GB of paging space. The operating system terminates processes when paging space runs low. Programs with large data spaces are terminated first because they typically consume a large amount of paging space.

Debugging programs using the large data model is no different from debugging other programs. The dbx command can debug these large programs actively or from a core dump. A full core dump from a large-data program can be quite large. To avoid truncated core files, be sure the coredump resource limit is large enough, and make sure that there is enough free space in the file system where your program is running.

Some application programs might be written in such a way that they rely on characteristics of the default address space model. These programs might not work if they execute using the large or very large address-space model. Do not set the LDR_CNTRL environment variable when you run these programs.

A process using the very large address-space model cannot move its break value by more than 2 GB in a single call, so programs that need larger moves require code changes. This is a limitation of the sbrk subroutine, which takes a signed value as its parameter. As a workaround, a program can call sbrk more than once to move the break value to the desired position.
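The workaround might look like the following C sketch, which advances the break in several steps so that no single sbrk call receives an increment larger than a caller-chosen chunk size. The name grow_break is hypothetical, the sketch assumes that sbrk accepts an intptr_t increment, and it does not undo partial growth if a later call fails.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: move the break forward by `total` bytes using
 * increments of at most `chunk` bytes each. Returns 0 on success and
 * -1 on failure (any growth already made is left in place). */
int grow_break(uint64_t total, intptr_t chunk)
{
    if (chunk <= 0)
        return -1;

    while (total > 0) {
        /* Use a full chunk, or whatever remains if that is smaller. */
        intptr_t step = (total > (uint64_t)chunk) ? chunk : (intptr_t)total;

        if (sbrk(step) == (void *)-1)
            return -1;
        total -= (uint64_t)step;
    }
    return 0;
}
```

A 32-bit program could, for example, pass a chunk size of 0x40000000 (1 GB) so that each increment stays within the positive range of a signed 32-bit value.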