Sains Malaysiana 39(6)(2010): 1041–1048

 

Improving Pipelined Time Stepping Algorithm for Distributed Memory Multicomputers

(Menambahbaik Algoritma Sela-Masa Penalian Paip bagi Multikomputer Memori Teragih)

 

Ng Kok Fu* & Norhashidah Hj. Mohd. Ali

School of Mathematical Sciences, Universiti Sains Malaysia

11800 Penang, Malaysia

 

Diserahkan: 19 Mei 2009 / Diterima: 1 Jun 2010

 

ABSTRACT

 

Time stepping algorithm with spatial parallelisation is commonly used to solve time dependent partial differential equations. Computation in each time step is carried out using all processors available before sequentially advancing to the next time step. In cases where few spatial components are involved and there are relatively many processors available for use, this will result in fine granularity and decreased scalability. Naturally one alternative is to parallelise the temporal domain. Several time parallelisation algorithms have been suggested for the past two decades. One of them is the pipelined iterations across time steps. In this pipelined time stepping method, communication however is extensive between time steps during the pipelining process. This causes a decrease in performance on distributed memory environment which often has high message latency. We present a modified pipelined time stepping algorithm based on delayed pipelining and reduced communication strategies to improve overall execution time on a distributed memory environment using MPI. Our goal is to reduce the inter-time step communications while providing adequate information for the next time step to converge. Numerical result confirms that the improved algorithm is faster than the original pipelined algorithm and sequential time stepping algorithm with spatial parallelisation alone. The improved algorithm is most beneficial for fine granularity time dependent problems with limited spatial parallelisation.

 

Keywords: Distributed memory multicomputer; parallel time stepping; pipelining iteration; time dependent problem

 

ABSTRAK

 

Algoritma sela-masa dengan penyelarian ruang umumnya digunakan untuk menyelesaikan persamaan pembezaan separa bersandar masa. Pengiraan pada setiap sela masa dilakukan dengan menggunakan kesemua pemproses yang sedia ada sebelum mara ke sela masa berikutnya. Dalam kes dengan sedikit sahaja komponen ruang yang terlibat dan terdapat banyak pemproses untuk digunakan, algoritma ini akan mengakibatkan kegranulan halus dan pengurangan skalabiliti. Lazimnya satu alternatif dalam kes begini adalah menyelarikan domain masa. Beberapa algoritma penyelarian masa telah dicadangkan sepanjang dua dekad yang lalu. Salah satu daripadanya ialah lelaran penalian paip merentasi sela masa. Walau bagaimanapun dalam kaedah sela masa penalian paip ini, komunikasi di antara sela masa berlaku secara meluas sepanjang proses penalian paip. Ini mengakibatkan penurunan prestasi dalam persekitaran memori teragih yang lazimnya mempunyai latensi mesej yang tinggi. Kami mencadangkan satu algoritma sela-masa penalian paip terubahsuai berdasarkan strategi penalian paip tertangguh dan pengurangan komunikasi bagi memperbaiki masa pelaksanaan keseluruhan dalam persekitaran memori teragih menggunakan MPI. Tujuan kami ialah mengurangkan komunikasi antara sela masa di samping menyediakan maklumat yang mencukupi bagi penumpuan pada sela masa berikutnya. Keputusan berangka menunjukkan algoritma yang ditambahbaik ini adalah lebih pantas berbanding dengan algoritma penalian paip asal dan algoritma sela-masa dengan penyelarian ruang sahaja. Algoritma yang ditambahbaik ini adalah paling berguna bagi masalah bersandar masa berkegranulan halus dengan penyelarian ruang yang terhad.

 

Kata kunci: Masalah bersandar masa; multikomputer memori teragih; lelaran penalian paip; sela-masa selari

 

RUJUKAN

 

Ali, N.H.M. & Ng, K.F. 2006. Explicit group PDE solvers in an MPI environment. International Conference on Mathematical Modelling and Computation, June 5-8, Universiti Brunei Darussalam.

Bonomo, J.P. & Dyksen, W.R. 1981. Pipelined iterative methods for shared memory machines. Technical Report 688, Computer Science Department, Purdue University.

Frantziskonis, G., Muralidharan, K., Deymier, P., Simunovic, S., Nukala, P. & Pannala S. 2009. Time-parallel multiscale/multiphysics framework. J. Comput. Phys. 228: 8085-8092.

Gropp, W., Lusk, E. & Skjellum, A. 1999. Using MPI: Portable parallel programming with the message-passing interface. Cambridge, MA: MIT Press.

Horton, G. & Vandewalle, S. 1995. A space-time multigrid method for parabolic PDEs. SIAM J. Sci. Computing 16(4): 848-864.

Lions, J-L., Maday, Y. & Turinice, G. 2001. A “parareal” in time discretization of PDE’s. C. R. Acad. Sci. Paris Ser. I Math. 332: 661-668.

Ng, K.F. & Ali, N.H.M. 2008. Performance analysis of explicit group parallel algorithms for distributed memory multicomputer. Parallel Computing 34(6-8): 427-440.

Pacheco, P.S. 1997. Parallel Programming with MPI. San Francisco: Morgan Kaufmann Publishers.

Pormann, J.B., Board, J.A. Jr. & Rose, D.J. 1998. The implicit pipeline method. Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium. IEEE Computer Society, Washington, DC721-725.

Srinivasan, A. & Chandra, N. 2005. Latency tolerance through parallelization of time in scientific applications. Parallel Computing 31(7): 777-796.

Womble, D.E. 1990. A time stepping algorithm for parallel computers. SIAM J. Sci. Computing 11(5): 824-837.

Zhu, J. 1994. Solving partial differential equations on parallel computers. Singapore: World Scientific Publishing.

 

*Pengarang untuk surat-menyurat; email: ngkokfu@gmail.com

 

 

 

sebelumnya