Abstract

An important component of many scientific computations is the matrix-vector multiplication. An
efficient parallelization of the matrix-vector multiplication would instantly lower computation times in
many areas of scientific computing, and enable solving larger problems than is currently possible. A
major obstacle in development of this parallelization is finding out how to distribute the
... read more
matrix data
over all processors without introducing a large communication cost. This problem can be accurately
modelled by the matrix partitioning problem with an appropriate cost function.
In this thesis, an optimal branch-and-bound algorithm for 2-way matrix partitioning is developed.
Since matrix partitioning is NP-Complete, a polynomial time algorithm is not expected to exist.
However, several lower bounds on the cost of branches of the branching-tree are introduced in
this thesis, reducing the computation time of the optimal algorithm significantly. Comparisons with
Integer Linear Programs that solve the same problem show that the branch-and-bound method is
efficient in solving the 2-way matrix partitioning problem. The algorithm was able to find the optimal
2-way partitioning of 75% of all matrices of the University of Florida Sparse Matrix Collection [12]
with at most 200 rows or columns. Finally, several suggestions for further research on optimal
bipartitioning matrices are given.
In Chapter 3, using information on common characteristics of optimal solutions, a new heuristic
method is developed that attempts to find good solutions in polynomial time. This heuristic is not
based on hypergraphs, as most publicly available partitioners are, but uses a novel partitioning model.
The model assigns individual rows and columns to different processors, ensuring a low computation
time, and allows for 2-dimensional partitionings, ensuring a high solution quality. In its current form,
the heuristic partitions the matrix in p parts directly when solving the p-way partitioning problem,
avoiding the possible negative effects of recursive bisection. Furthermore, a multilevel scheme is
included to enable solving extremely large matrices in reasonable time. Comparison of the solution
quality of the new heuristic with that of the state-of-the-art matrix partitioning software Mondriaan
3.0 [40] shows that for some matrices, the new heuristic performs significantly better.
Finally, in Appendix A, an orthogonal array testing method is applied in order to find better
default values for the parameters of Mondriaan 3.0. This results in values that produce solutions
that have, on average, a 10.4 % smaller cost than the default values would produce. Furthermore,
the results indicate that the quality of of the initial partitioning in the multilevel scheme has a large
impact on the final solution quality.
show less