For general linear algebraic computations in science, engineering, and mathematics, problem sizes typically grow until constraints imposed by computation rates, I/O, or memory availability result in unreasonably long execution times. From this point of view, computing architectures such as systolic arrays offer the potential to process much larger problems. To realize this potential, however, one must solve the problem of efficiently decomposing large matrices onto a fixed-size systolic array in a numerically stable way. In some cases this is best done by partitioning the algorithm so that it runs as a sequence of operations on input subarrays of the same size as the underlying hardware array. In other cases, such as those involving banded matrices, it suffices to process only the nonzero data. In all cases, however, the algorithms must run on the same hardware array without undue control or communication overhead. In this paper we describe algorithmic and architectural approaches that support a fairly broad range of linear algebraic operations on matrices much larger than the underlying hardware array. All of the techniques described here have been simulated in detail in APL to verify their correctness.
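The partitioning idea mentioned above can be illustrated with a simple sketch. The following is not the paper's algorithm, only a minimal software analogue: a matrix product is decomposed into a sequence of block operations, each no larger than a hypothetical fixed-size hardware array (the `tile` parameter is an assumed stand-in for the hardware array dimension), with partial results accumulated across the blocks.

```python
import numpy as np

def tiled_matmul(A, B, tile=4):
    """Compute A @ B as a sequence of tile x tile block products,
    emulating a fixed-size processor array that can only operate on
    subarrays of that size at a time."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must agree"
    C = np.zeros((m, n))
    for i in range(0, m, tile):          # block rows of A
        for j in range(0, n, tile):      # block columns of B
            for p in range(0, k, tile):  # inner block dimension
                # Each block product fits the (assumed) hardware array;
                # slicing handles ragged edge blocks automatically.
                C[i:i+tile, j:j+tile] += (
                    A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
                )
    return C
```

The outer loops enumerate subproblems of hardware-array size; the running sum in `C` plays the role of accumulating partial results as successive input subarrays stream through the array.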