Issue 1
Jan.  2019
GU Zong-jing, ZHAO Xun-wang, LIU Ying-yu, LIN Zhong-chao, ZHANG Yu, ZHAO Yu-ping. Optimization of parallel method of moments based on KNL many-core processors[J]. Journal of East China Normal University (Natural Sciences), 2019, (1): 105-114. doi: 10.3969/j.issn.1000-5641.2019.01.012

Optimization of parallel method of moments based on KNL many-core processors

doi: 10.3969/j.issn.1000-5641.2019.01.012
  • Received Date: 2017-09-29
  • Publication Date: 2019-01-25
  • The parallel method of moments (MoM) is optimized for the second-generation Intel Xeon Phi many-core processor platform, codenamed Knights Landing (KNL), using a hybrid MPI+OpenMP programming strategy. OpenMP threading raises the utilization of the CPU (central processing unit), so that the computing resources of KNL are fully used. Introducing threads substantially reduces the redundant integrals computed across processes during matrix filling. To exploit KNL's 512-bit vector width, the efficiency of the loop structures is further improved through vectorization. In the matrix solution process, which is computationally intensive and demands high CPU utilization, introducing OpenMP reduces MPI (Message Passing Interface) communication time and accelerates the solution. Numerical results show that these optimizations greatly improve the efficiency of solving complex electromagnetic problems with parallel MoM on the KNL many-core processor platform.
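The abstract's claim that threads reduce inter-process redundant integrals can be illustrated with a small counting model. This is a toy sketch, not the authors' code: it assumes a one-dimensional mesh in which basis function i spans elements i and i + 1, and a block-cyclic distribution of matrix rows and columns over a process grid. The names `owner` and `count_pair_integrals` are hypothetical. A process has to evaluate an element-pair integral whenever it owns at least one matrix entry that the pair contributes to, so pairs that straddle a block boundary are evaluated by more than one process. Threads inside one process share memory and evaluate each pair once.

```python
from itertools import product


def owner(index, block, nprocs):
    """Process that owns a row (or column) under a block-cyclic distribution."""
    return (index // block) % nprocs


def count_pair_integrals(n_basis, block, prow, pcol):
    """Element-pair integrals evaluated, summed over a prow x pcol process grid.

    Toy mesh (an assumption): basis function i spans elements i and i + 1,
    so element e touches basis functions e - 1 and e when they exist.
    """
    n_elem = n_basis + 1

    def basis_on(e):
        return [b for b in (e - 1, e) if 0 <= b < n_basis]

    # Process rows (columns) that need each element as a test (source) element.
    row_procs = [{owner(m, block, prow) for m in basis_on(e)} for e in range(n_elem)]
    col_procs = [{owner(n, block, pcol) for n in basis_on(e)} for e in range(n_elem)]

    # Process (r, c) needs pair (et, es) if r needs et and c needs es.
    return sum(len(row_procs[et]) * len(col_procs[es])
               for et, es in product(range(n_elem), repeat=2))


if __name__ == "__main__":
    # Four cores used as 1 process x 4 threads: every pair is evaluated once.
    print(count_pair_integrals(8, 2, 1, 1))  # 81
    # Four cores used as a 2 x 2 process grid: boundary pairs are repeated.
    print(count_pair_integrals(8, 2, 2, 2))  # 144
```

In this model the redundancy saturates as soon as neighbouring blocks belong to different processes, because each element touches only two basis functions. A real surface mesh and basis numbering would behave differently, so the numbers indicate the mechanism only, not the savings reported in the paper.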
