Generating meaningful assert statements is one of the key challenges in automated test case generation, as it requires understanding the intended functionality of the code under test. Recently, deep learning-based models have shown promise in improving the performance of assert statement generation. However, existing models rely only on the test prefixes and their corresponding focal methods, ignoring developer-written summarizations. Based on our observations, such summarizations usually express the intended program behavior or contain parameters that appear directly in the assert statement. This information can help existing models overcome their current inability to predict assert statements accurately. This paper presents a summarization-guided approach for automatically generating assert statements. To derive generic representations for natural language (i.e., summarizations) and programming language (i.e., test prefixes and focal methods), we adopt a pre-trained language model as the reference architecture and fine-tune it on the task of assert statement generation. To the best of our knowledge, the proposed approach is the first to leverage the summarization of focal methods as guidance for generating more accurate assert statements. We demonstrate the effectiveness of our approach on two real-world datasets in comparison with state-of-the-art models.
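To make the intuition concrete, the following minimal Java sketch is purely illustrative (the class StringUtils, the method capitalize, and the test are hypothetical and do not come from the paper or its datasets): the Javadoc summary of the focal method states the expected behavior for an empty input, and the literal mentioned in that summary is exactly what a summarization-guided model should place in the generated assert statement.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Hypothetical focal method: its Javadoc summary describes the intended
// behavior, including the expected result for an empty input.
public class StringUtils {
    /**
     * Capitalizes the first character of the given text;
     * returns an empty string "" when the input is empty.
     */
    public static String capitalize(String text) {
        if (text.isEmpty()) {
            return "";
        }
        return Character.toUpperCase(text.charAt(0)) + text.substring(1);
    }
}

// Hypothetical test prefix (everything before the assert) followed by the
// assert statement a summarization-guided model is expected to generate:
// the literal "" appears verbatim in the Javadoc summary above.
class StringUtilsTest {
    @Test
    public void testCapitalizeEmptyInput() {
        String result = StringUtils.capitalize("");
        assertEquals("", result); // expected value is stated in the summary
    }
}
```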