The ScaLAPACK computing environment is modeled as a one-dimensional array of processes (for operations on band or tridiagonal matrices) or as a two-dimensional process grid (for operations on dense matrices). To use ScaLAPACK, you must distribute all global matrices and vectors over this array or grid before calling the ScaLAPACK routines.
ScaLAPACK uses the two-dimensional block-cyclic data distribution as the layout for dense matrix computations. This distribution balances the work well across the available processors and makes it possible to use Level 3 BLAS routines for efficient local computations. The information needed to map each global array to its corresponding process and memory location is contained in the so-called array descriptor associated with each global array. An example of an array descriptor structure is given in the table "Content of the array descriptor for dense matrices".
Array Element # | Name | Definition |
---|---|---|
1 | dtype | Descriptor type ( =1 for dense matrices) |
2 | ctxt | BLACS context handle for the process grid |
3 | m | Number of rows in the global array |
4 | n | Number of columns in the global array |
5 | mb | Row blocking factor |
6 | nb | Column blocking factor |
7 | rsrc | Process row over which the first row of the global array is distributed |
8 | csrc | Process column over which the first column of the global array is distributed |
9 | lld | Leading dimension of the local array |
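To make the role of the descriptor fields more concrete, the following Python sketch maps a global matrix entry to the process that owns it and to its position in that process's local array under a block-cyclic distribution. It is an illustration only, not a ScaLAPACK interface: the names `ArrayDescriptor`, `owner_and_local`, and `locate` are hypothetical, the grid shape (`nprow`, `npcol`) is passed in directly rather than queried from the BLACS context, and indices are 0-based for convenience (the Fortran interface is 1-based).

```python
from typing import NamedTuple


class ArrayDescriptor(NamedTuple):
    """Dense-matrix descriptor; field order follows the table above."""
    dtype: int  # descriptor type (1 for dense matrices)
    ctxt: int   # BLACS context handle (placeholder value in this sketch)
    m: int      # rows in the global array
    n: int      # columns in the global array
    mb: int     # row blocking factor
    nb: int     # column blocking factor
    rsrc: int   # process row holding the first global row
    csrc: int   # process column holding the first global column
    lld: int    # leading dimension of the local array


def owner_and_local(g, block, src, nprocs):
    """Map a 0-based global index along one dimension to
    (process coordinate, 0-based local index)."""
    blk = g // block                      # which block the index falls in
    proc = (src + blk) % nprocs           # blocks are dealt out cyclically
    local = (blk // nprocs) * block + g % block
    return proc, local


def locate(desc, i, j, nprow, npcol):
    """Return ((process row, process column), (local row, local column))
    for the global entry (i, j), all 0-based."""
    prow, li = owner_and_local(i, desc.mb, desc.rsrc, nprow)
    pcol, lj = owner_and_local(j, desc.nb, desc.csrc, npcol)
    return (prow, pcol), (li, lj)


# An 8 x 8 matrix in 2 x 2 blocks on a 2 x 2 process grid.
desc = ArrayDescriptor(dtype=1, ctxt=0, m=8, n=8, mb=2, nb=2,
                       rsrc=0, csrc=0, lld=4)
owner, local = locate(desc, 5, 2, nprow=2, npcol=2)
print(owner, local)  # (0, 1) (3, 0)
```

In this example, each process holds a 4 x 4 local piece, which is why `lld` is set to 4.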
The numbers of rows and columns of a global dense matrix that a particular process in the grid receives after the data is distributed are denoted by LOCr() and LOCc(), respectively. To compute these numbers, you can use the ScaLAPACK tool routine numroc.
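The counting rule behind LOCr() and LOCc() can be sketched in a few lines of Python. This is a pure-Python re-expression of the computation for illustration, assuming the usual block-cyclic rule (full blocks are dealt out cyclically starting at the source process, and the trailing partial block goes to the next process in line). It is not the library routine itself, although the argument order is chosen to mirror numroc.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows or columns of an n-long dimension, split into
    blocks of size nb, that land on process iproc (0-based) when the
    first block is held by process isrcproc."""
    # Distance of this process from the source process, going cyclically.
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb                    # number of full blocks
    count = (nblocks // nprocs) * nb     # every process gets this many
    extrablks = nblocks % nprocs         # leftover full blocks
    if mydist < extrablks:
        count += nb                      # one extra full block
    elif mydist == extrablks:
        count += n % nb                  # the trailing partial block
    return count


# 10 rows in blocks of 3 over 2 process rows, starting at process row 0.
counts = [numroc(10, 3, p, 0, 2) for p in range(2)]
print(counts)  # [6, 4]
```

Summing the result over all processes along one grid dimension recovers the global dimension, which is a convenient sanity check when sizing local arrays.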
After the block-cyclic distribution of the global data is done, you may choose to perform an operation on a submatrix of the global matrix A. This submatrix is contained in the global subarray sub(A), which is defined by the following six values (for dense matrices):
The number of rows of sub(A)
The number of columns of sub(A)
A pointer to the local array containing the entire global array A
The row index of sub(A) in the global array
The column index of sub(A) in the global array
The array descriptor for the global array
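As a rough illustration of how these six values fit together, the Python sketch below takes the sizes, the starting indices, and the descriptor, and returns the global rows and columns that sub(A) covers. The function name `subarray_ranges` is hypothetical, and the parameter names `m`, `n`, `ia`, `ja`, and `desca` are chosen here for illustration. The sketch assumes that the row and column indices are 1-based, so that sub(A) = A(ia:ia+m-1, ja:ja+n-1). The pointer to the local array is omitted because no data is touched.

```python
def subarray_ranges(m, n, ia, ja, desca):
    """Return the 0-based global row and column ranges covered by sub(A).

    ia and ja are 1-based global indices of the first row and column of
    sub(A). desca is the 9-element array descriptor; its elements 3 and 4
    (positions 2 and 3 in a 0-based Python list) hold the global
    dimensions of A.
    """
    global_m, global_n = desca[2], desca[3]
    if ia < 1 or ja < 1 or ia + m - 1 > global_m or ja + n - 1 > global_n:
        raise ValueError("sub(A) does not fit inside the global array")
    return range(ia - 1, ia - 1 + m), range(ja - 1, ja - 1 + n)


# Descriptor for an 8 x 8 global matrix with 2 x 2 blocks.
desca = [1, 0, 8, 8, 2, 2, 0, 0, 4]
rows, cols = subarray_ranges(m=3, n=4, ia=2, ja=5, desca=desca)
print(list(rows), list(cols))  # [1, 2, 3] [4, 5, 6, 7]
```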