Automatic Optimization of OpenCL-Based Stencil Codes for FPGAs and Its Evaluation

  • Tsukasa Endo, Tohoku University
  • Hasitha Muthumala Waidyasooriya, Tohoku University
  • Masanori Hariyama, Tohoku University
Keywords: OpenCL for FPGA, performance tuning, stencil computation, code optimization

Abstract

Recently, a C-based OpenCL design environment has been proposed for designing FPGA (field-programmable gate array) accelerators. Although many C programs can be executed on FPGAs, the best C code for a CPU may not be the most appropriate for an FPGA, so users need some knowledge of computer architecture to write good OpenCL code. To solve this problem, we propose an automatic optimization method. We accurately predict kernel performance using the log files generated at the initial stage of compilation, and then find the optimized FPGA architecture by searching all possible design parameters. We implement the proposed method to find the optimized architecture for stencil computation. According to the results, the design time is reduced to 4% to 8% of that of the conventional approach.

Author Biographies

Hasitha Muthumala Waidyasooriya, Tohoku University
Hasitha Muthumala Waidyasooriya is an Assistant Professor in the Graduate School of Information Sciences at Tohoku University. He received the BE degree in information engineering, the MS degree in information sciences, and the PhD degree in information sciences from Tohoku University, Japan, in 2006, 2008, and 2010, respectively. His research interests include reconfigurable computing, processor architectures for big-data applications, and high-level design methodology for VLSIs. He is a member of the IEEE.
Masanori Hariyama, Tohoku University
Masanori Hariyama is a Professor in the Graduate School of Information Sciences at Tohoku University. He received the BE degree in electronic engineering, the MS degree in information sciences, and the PhD degree in information sciences from Tohoku University, Japan, in 1992, 1994, and 1997, respectively. His research interests include real-world applications such as robotics and medical applications, big-data applications such as bioinformatics, high-performance computing, VLSI computing for real-world applications, high-level design methodology for VLSIs, and reconfigurable computing. He is a member of the IEEE.

Published
2017-12-31
Section
Technical Papers (Advanced Applied Informatics)