Stencil codes such as the Jacobi, Gauß-Seidel, and red-black Gauß-Seidel kernels are among the most time-consuming routines in many scientific and engineering applications. The performance of these codes depends critically on efficient use of caches, and can be improved by tiling. Several tiling schemes have been suggested in the literature; this paper gives an overview and comparison. Then, in the main part, we prove a lower bound on the number of cold and capacity misses. Finally, we analyze a particular tiling scheme, and show that it is within a factor of ten of the lower bound. Our results reveal limits on the speedup that future research can achieve.
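To make the terms "stencil code" and "tiling" concrete, the following is a minimal illustrative sketch, not one of the schemes compared in the paper. It assumes a 2D five-point Jacobi stencil and tiles a single sweep in space only; the function names and the `tile` parameter are hypothetical. Tiling schemes for iterative stencils typically also block across time steps, which this sketch does not attempt.

```python
def jacobi_sweep(src):
    """One untiled Jacobi sweep: each interior point becomes the
    average of its four neighbours in the old grid."""
    n, m = len(src), len(src[0])
    dst = [row[:] for row in src]  # boundary values are carried over unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            dst[i][j] = 0.25 * (src[i - 1][j] + src[i + 1][j]
                                + src[i][j - 1] + src[i][j + 1])
    return dst


def jacobi_sweep_tiled(src, tile=4):
    """The same sweep, visiting the interior in tile x tile blocks.

    Only the traversal order changes; since Jacobi reads exclusively
    from the old grid, the result is identical to the untiled sweep.
    The hypothetical `tile` parameter would be chosen so that a block
    and its neighbouring rows fit in cache.
    """
    n, m = len(src), len(src[0])
    dst = [row[:] for row in src]
    for ii in range(1, n - 1, tile):          # loop over tile rows
        for jj in range(1, m - 1, tile):      # loop over tile columns
            for i in range(ii, min(ii + tile, n - 1)):
                for j in range(jj, min(jj + tile, m - 1)):
                    dst[i][j] = 0.25 * (src[i - 1][j] + src[i + 1][j]
                                        + src[i][j - 1] + src[i][j + 1])
    return dst
```

Because both functions perform the same arithmetic per grid point, they can be checked against each other on any grid; any difference between them lies in memory access order, which is the property that the cache-miss analysis is concerned with.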