Abstract: Many empirical studies have shown that a Java software system is a kind of artificial complex network whose in-degree distribution obeys a power law, while its out-degree distribution is lognormal. However, most of these studies examine only Java development tools such as JDK, log4j and Tomcat. The data analyzed in their dependence graphs is also limited: they consider only package and class dependence relationships and omit useful data types such as member variables and local variables. In this paper, we bring both of these data types into the analysis to extend the class dependence relationships, and we refine the granularity to the function level by building function dependence graphs. For both kinds of dependence relationship, we propose a method to transform a system into a weighted directed network. Comprehensive experimental results show that the in-degree and out-degree distributions of 10 types of Java application systems mostly fit power-law distributions, which indicates that the proposed method for detecting the scale-free property of a weighted directed network is effective in analyzing the network structure of Java application systems.
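As an illustration of the weighted directed network mentioned in the abstract, the following is a minimal sketch rather than the authors' tool. It assumes that each dependence record is a (source, target) pair, that the weight of an edge is the number of times the pair occurs, and that a node's weighted in- or out-degree is the sum of the weights on its incoming or outgoing edges. The function names and the sample class names are hypothetical.

```python
from collections import Counter, defaultdict


def build_weighted_digraph(dependencies):
    """Collapse repeated (source, target) records into weighted edges.

    Assumption: the edge weight is the number of occurrences of the pair.
    """
    weights = Counter()
    for source, target in dependencies:
        if source != target:  # self-references are ignored in this sketch
            weights[(source, target)] += 1
    return weights


def weighted_degrees(weights):
    """Return (in_degree, out_degree) dicts, summing edge weights per node."""
    in_degree = defaultdict(int)
    out_degree = defaultdict(int)
    for (source, target), weight in weights.items():
        out_degree[source] += weight
        in_degree[target] += weight
    return dict(in_degree), dict(out_degree)


# Hypothetical dependence records extracted from a small system.
records = [
    ("Order", "Customer"),
    ("Order", "Customer"),
    ("Order", "Item"),
    ("Invoice", "Order"),
    ("Invoice", "Customer"),
]
graph = build_weighted_digraph(records)
in_degree, out_degree = weighted_degrees(graph)
```

The resulting degree sequences are the kind of input on which a power-law fit for the in- and out-degree distributions would be run.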
Tab. 1 CDG and FDG networks for 10 types of Java application systems
| Network | Description | $\lvert V \rvert$ | $\lvert E \rvert$ | $\lvert V_f \rvert$ | $\lvert E_f \rvert$ |
|---|---|---|---|---|---|
| GISToolkit | GIS system | 470 | 2467 | 468 | 1488 |
| OpenGTS | GIS system | 357 | 2474 | 6704 | 7122 |
| OpenJUMP | GIS system | 1369 | 8429 | 6062 | 8815 |
| dotCMS | CMS system | 2412 | 13801 | 2382 | 6680 |
| infoGlue | CMS system | 1371 | 6891 | 1359 | 3375 |
| OpenCms | CMS system | 1572 | 12288 | 1558 | 6452 |
| eConf | LMS system | 61 | 169 | 134 | 103 |
| a-LMS | LMS system | 440 | 1861 | 1988 | 1759 |
| OLMS | LMS system | 130 | 358 | 141 | 97 |
| JSPWiki | Wiki system | 489 | 2056 | 1843 | 1927 |
| JAMWiki | Wiki system | 150 | 744 | 762 | 770 |
| FitNesse | Wiki system | 672 | 2789 | 1957 | 1764 |
| jforum | Forum system | 355 | 1587 | 1290 | 1423 |
| jGossip | Forum system | 263 | 919 | 507 | 611 |
| Yazd | Forum system | 218 | 885 | 891 | 793 |
| jPortlet | Portal system | 133 | 555 | 318 | 273 |
| OpenPortal | Portal system | 84 | 151 | 87 | 92 |
| Pluto | Portal system | 351 | 1189 | 821 | 706 |
| GatorMail | WebMail system | 110 | 324 | 269 | 197 |
| OlivaMail | WebMail system | 60 | 247 | 191 | 215 |
| yawebmail | WebMail system | 76 | 196 | 157 | 139 |
| FocusSNS | Blog and SNS system | 299 | 913 | 283 | 491 |
| Pebble | Blog and SNS system | 655 | 3003 | 2295 | 2875 |
| Roller | Blog and SNS system | 513 | 2843 | 2536 | 3314 |
| ITracker | Bug tracking system | 386 | 1835 | 382 | 736 |
| BugRat | Bug tracking system | 74 | 383 | 364 | 513 |
| Scarab | Bug tracking system | 575 | 2449 | 5947 | 5332 |
| Hipergate | ERP and CRM system | 643 | 2517 | 3181 | 3366 |
| SourceTap | ERP and CRM system | 169 | 737 | 652 | 636 |
| OpenCustomer | ERP and CRM system | 398 | 2019 | 1151 | 1392 |

Tab. 2 The invocation number of the maximum node in each system
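| Network | R1 | R2 | R3 | R4 | R5 |
|---|---|---|---|---|---|
| GISToolkit | 1 | 20 | 155 | 86 | 54 |
| OpenGTS | 3 | 0 | 391 | 335 | 60 |
| OpenJUMP | 1 | 79 | 394 | 64 | 179 |
| dotCMS | 3 | 0 | 3900 | 349 | 243 |
| infoGlue | 4 | 0 | 1850 | 58 | 199 |
| OpenCms | 0 | 107 | 1894 | 292 | 1162 |
| eConf | 0 | 1 | 2 | 5 | 14 |
| a-LMS | 1 | 0 | 309 | 52 | 2 |
| OLMS | 0 | 0 | 0 | 155 | 9 |
| JSPWiki | 1 | 57 | 126 | 52 | 73 |
| JAMWiki | 0 | 74 | 2 | 0 | 4 |
| FitNesse | 2 | 48 | 279 | 103 | 25 |
| jforum | 0 | 56 | 55 | 89 | 89 |
| jGossip | 1 | 0 | 105 | 18 | 2 |
| Yazd | 1 | 19 | 39 | 27 | 14 |
| jPortlet | 4 | 0 | 101 | 1 | 10 |
| OpenPortal | 0 | 0 | 8 | 1 | 6 |
| Pluto | 1 | 7 | 56 | 33 | 14 |
| GatorMail | 0 | 0 | 1 | 0 | 47 |
| OlivaMail | 0 | 1 | 1 | 26 | 27 |
| yawebmail | 1 | 1 | 4 | 0 | 6 |
| FocusSNS | 1 | 9 | 31 | 24 | 8 |
| Pebble | 0 | 32 | 86 | 132 | 147 |
| Roller | 13 | 0 | 98 | 104 | 0 |
| ITracker | 1 | 2 | 10 | 3 | 94 |
| BugRat | 0 | 0 | 250 | 51 | 101 |
| Scarab | 1 | 27 | 266 | 187 | 45 |
| Hipergate | 0 | 3 | 151 | 46 | 39 |
| SourceTap | 0 | 1 | 50 | 45 | 25 |
| OpenCustomer | 0 | 17 | 59 | 100 | 38 |

Tab. 3 Value of $\alpha$ and $x_{\min}$ in all ten types of Java application networks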
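| Network | CDG[5R] $\alpha^{in}$ | CDG[5R] $x_{\min}^{in}$ | CDG[5R] $\alpha^{out}$ | CDG[5R] $x_{\min}^{out}$ | CDG[3R] $\alpha^{in}$ | CDG[3R] $x_{\min}^{in}$ | CDG[3R] $\alpha^{out}$ | CDG[3R] $x_{\min}^{out}$ | FDG $\alpha^{in}$ | FDG $x_{\min}^{in}$ | FDG $\alpha^{out}$ | FDG $x_{\min}^{out}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GISToolkit | 2.01 | 6 | 4.23 | 39 | 1.96 | 3 | 2.72 | 8 | 1.80 | 1 | 3.38 | 14 |
| OpenGTS | 2.01 | 27 | 2.08 | 20 | 2.02 | 8 | 2.26 | 8 | 2.50 | 2 | 3.84 | 3 |
| OpenJUMP | 2.09 | 12 | 4.33 | 42 | 2.19 | 8 | 3.41 | 16 | 2.22 | 2 | 4.25 | 6 |
| dotCMS | 1.81 | 6 | 2.19 | 35 | 2.02 | 24 | 1.96 | 5 | 3.36 | 6 | 2.72 | 6 |
| infoGlue | 1.81 | 5 | 2.03 | 11 | 1.88 | 1 | 2.25 | 6 | 1.79 | 3 | 2.02 | 4 |
| OpenCms | 1.90 | 11 | 2.34 | 36 | 1.94 | 6 | 2.35 | 17 | 1.84 | 4 | 2.29 | 16 |
| eConf | 3.03 | 9 | 3.56 | 10 | 2.70 | 5 | 3.38 | 5 | 3.10 | 1 | 2.82 | 1 |
| a-LMS | 1.96 | 8 | 2.17 | 9 | 1.65 | 1 | 2.19 | 5 | 2.80 | 2 | 4.75 | 2 |
| OLMS | 1.85 | 2 | 2.25 | 6 | 2.51 | 1 | 2.38 | 1 | 2.59 | 1 | 4.89 | 1 |
| JSPWiki | 1.61 | 1 | 3.31 | 33 | 1.91 | 3 | 2.30 | 4 | 2.80 | 4 | 2.70 | 1 |
| JAMWiki | 1.90 | 10 | 2.57 | 23 | 1.63 | 1 | 2.47 | 13 | 2.26 | 2 | 4.62 | 2 |
| FitNesse | 1.99 | 6 | 3.20 | 18 | 1.78 | 1 | 4.03 | 16 | 2.66 | 2 | 2.87 | 1 |
| jforum | 2.15 | 32 | 3.92 | 34 | 1.78 | 1 | 2.92 | 6 | 2.80 | 5 | 5.00 | 4 |
| jGossip | 1.95 | 4 | 3.05 | 6 | 2.13 | 1 | 2.82 | 4 | 2.61 | 1 | 5.00 | 4 |
| Yazd | 1.87 | 7 | 1.96 | 6 | 1.86 | 1 | 2.26 | 5 | 2.30 | 1 | 5.00 | 3 |
| jPortlet | 2.35 | 12 | 2.32 | 8 | 2.12 | 4 | 2.63 | 5 | 2.40 | 1 | 2.61 | 1 |
| OpenPortal | 2.21 | 2 | 2.28 | 2 | 2.52 | 2 | 3.10 | 3 | 2.26 | 1 | 2.83 | 2 |
| Pluto | 2.47 | 8 | 2.66 | 11 | 2.41 | 5 | 2.48 | 4 | 2.92 | 2 | 2.67 | 1 |
| GatorMail | 2.00 | 3 | 3.00 | 5 | 2.10 | 1 | 2.45 | 2 | 2.68 | 1 | 3.49 | 1 |
| OlivaMail | 2.34 | 10 | 2.67 | 7 | 3.51 | 4 | 2.05 | 1 | 2.38 | 1 | 2.37 | 1 |
| yawebmail | 3.35 | 9 | 2.19 | 3 | 1.81 | 1 | 2.78 | 4 | 3.03 | 2 | 3.32 | 2 |
| FocusSNS | 1.71 | 1 | 2.57 | 7 | 1.90 | 1 | 5.00 | 11 | 2.75 | 15 | 4.58 | 8 |
| Pebble | 1.73 | 1 | 2.79 | 14 | 1.89 | 3 | 2.71 | 3 | 2.49 | 5 | 4.03 | 3 |
| Roller | 1.84 | 4 | 2.61 | 20 | 1.77 | 1 | 2.41 | 5 | 2.36 | 2 | 4.54 | 5 |
| ITracker | 2.94 | 83 | 2.57 | 15 | 2.37 | 18 | 2.10 | 2 | 2.31 | 1 | 2.36 | 3 |
| BugRat | 1.42 | 1 | 2.56 | 21 | 2.05 | 5 | 2.77 | 8 | 2.13 | 2 | 2.10 | 1 |
| Scarab | 2.32 | 37 | 2.62 | 41 | 2.56 | 29 | 2.44 | 21 | 2.30 | 1 | 4.25 | 1 |
| Hipergate | 2.19 | 17 | 2.59 | 24 | 2.07 | 5 | 2.72 | 15 | 2.39 | 2 | 3.43 | 3 |
| SourceTap | 1.64 | 1 | 2.11 | 8 | 1.62 | 2 | 2.37 | 5 | 2.45 | 1 | 2.68 | 1 |
| OpenCustomer | 1.94 | 4 | 5.00 | 27 | 2.12 | 3 | 4.13 | 10 | 2.19 | 1 | 2.31 | 1 |

Tab. 4 p-values in all ten types of Java application networks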
| Network | CDG[5R] p-value (in) | CDG[5R] p-value (out) | CDG[3R] p-value (in) | CDG[3R] p-value (out) | FDG p-value (in) | FDG p-value (out) |
|---|---|---|---|---|---|---|
| GISToolkit | 0.0708 | 0.2084 | 0.6376 | 0.0012 | 0.3444 | 0.1764 |
| OpenGTS | 0.6414 | 0.0112 | 0.3520 | 0.8136 | 0.5436 | 0.0276 |
| OpenJUMP | 0.4072 | 0.7664 | 0.9284 | 0.0888 | 0.7412 | 0.5608 |
| dotCMS | 0.0344 | 0.0012 | 0.6640 | 0.0080 | 0.2340 | 0.1680 |
| infoGlue | 0.2188 | 0.1756 | 0.7940 | 0.0416 | 0.3260 | 0.0392 |
| OpenCms | 0.2328 | 0.1444 | 0.2892 | 0.7452 | 0.1276 | 0.5668 |
| eConf | 0.4764 | 0.7568 | 0.4772 | 0.1992 | 0.3396 | 0.8324 |
| a-LMS | 0.0048 | 0.2056 | 0.0040 | 0.2580 | 0.5588 | 0.0352 |
| OLMS | 0.0208 | 0.0556 | 0.3420 | 0.6664 | 0.3828 | 0.4972 |
| JSPWiki | 0.0000 | 0.5936 | 0.5956 | 0.0272 | 0.3424 | 0.0000 |
| JAMWiki | 0.2976 | 0.9028 | 0.0004 | 0.1952 | 0.5488 | 0.3656 |
| FitNesse | 0.0276 | 0.2552 | 0.0192 | 0.9176 | 0.4616 | 0.0000 |
| jforum | 0.7720 | 0.6200 | 0.0000 | 0.0844 | 0.1932 | 0.2412 |
| jGossip | 0.3052 | 0.0596 | 0.0016 | 0.8608 | 0.0796 | 0.6928 |
| Yazd | 0.4956 | 0.0608 | 0.0532 | 0.3020 | 0.0144 | 0.2088 |
| jPortlet | 0.4588 | 0.0172 | 0.7140 | 0.5296 | 0.0032 | 0.1716 |
| OpenPortal | 0.1880 | 0.0200 | 0.0016 | 0.7108 | 0.3608 | 0.2524 |
| Pluto | 0.4372 | 0.8720 | 0.6324 | 0.3680 | 0.3268 | 0.0000 |
| GatorMail | 0.4672 | 0.8184 | 0.0164 | 0.6200 | 0.4424 | 0.1440 |
| OlivaMail | 0.5032 | 0.8004 | 0.1108 | 0.0476 | 0.5120 | 0.1420 |
| yawebmail | 0.6580 | 0.6656 | 0.0048 | 0.5620 | 0.5492 | 0.0752 |
| FocusSNS | 0.0016 | 0.0428 | 0.0804 | 0.9396 | 0.6956 | 0.7408 |
| Pebble | 0.0000 | 0.6136 | 0.6432 | 0.0016 | 0.2640 | 0.0124 |
| Roller | 0.0456 | 0.3532 | 0.0004 | 0.6668 | 0.9160 | 0.0268 |
| ITracker | 0.8172 | 0.0104 | 0.8244 | 0.3636 | 0.2584 | 0.0416 |
| BugRat | 0.0056 | 0.4636 | 0.2288 | 0.7728 | 0.4364 | 0.0000 |
| Scarab | 0.3076 | 0.2292 | 0.0016 | 0.9808 | 0.0000 | 0.0000 |
| Hipergate | 0.5772 | 0.2268 | 0.7416 | 0.2812 | 0.0680 | 0.7228 |
| SourceTap | 0.5284 | 0.0012 | 0.7084 | 0.0448 | 0.6060 | 0.0000 |
| OpenCustomer | 0.2608 | 0.7868 | 0.5608 | 0.7300 | 0.3144 | 0.0000 |
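
To make the quantities in Tab. 3 more concrete, here is a hedged sketch of how a scaling exponent $\alpha$ could be estimated for a fixed $x_{\min}$. It assumes the widely used approximate maximum-likelihood estimator for discrete data from Clauset, Shalizi and Newman, $\alpha \approx 1 + n \left[ \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min} - 1/2} \right]^{-1}$, applied to the $n$ degree values with $x_i \ge x_{\min}$. This excerpt does not state which estimator the authors used, so the choice is an assumption, and the function name and sample data are hypothetical.

```python
import math


def estimate_alpha(degrees, x_min):
    """Approximate discrete MLE of the power-law exponent for a given x_min.

    Only values >= x_min (the tail) enter the estimate. x_min must be >= 1.
    """
    if x_min < 1:
        raise ValueError("x_min must be at least 1")
    tail = [d for d in degrees if d >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    # Each ratio exceeds 1 because d >= x_min > x_min - 0.5, so the sum is positive.
    log_sum = sum(math.log(d / (x_min - 0.5)) for d in tail)
    return 1.0 + len(tail) / log_sum


# Hypothetical weighted in-degree sequence of a small network.
sample_degrees = [1, 1, 1, 2, 2, 3, 5, 8, 20]
alpha_hat = estimate_alpha(sample_degrees, x_min=1)
```

In a complete fit, $x_{\min}$ would itself be selected (for example by minimizing a distance between the data and the fitted model), and a p-value such as those in Tab. 4 would come from a separate goodness-of-fit test; neither step is shown here.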