Regular Paper

SAGA: Summarization-Guided Assert Statement Generation

Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871, China
School of Computer Science, Peking University, Beijing 100871, China
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract

Generating meaningful assert statements is one of the key challenges in automated test case generation, as it requires understanding the intended functionality of the code under test. Recently, deep-learning-based models have shown promise in improving the performance of assert statement generation. However, existing models rely only on test prefixes and their corresponding focal methods, ignoring the developer-written summarization. Based on our observations, the summarization usually expresses the intended program behavior or contains parameters that appear directly in the assert statement. Such information can help existing models overcome their current inability to predict assert statements accurately. This paper presents a summarization-guided approach for automatically generating assert statements. To derive generic representations for natural language (i.e., the summarization) and programming language (i.e., test prefixes and focal methods), we leverage a pre-trained language model as the reference architecture and fine-tune it on the task of assert statement generation. To the best of our knowledge, the proposed approach makes the first attempt to leverage the summarization of focal methods as guidance for making the generated assert statements more accurate. We demonstrate the effectiveness of our approach against state-of-the-art models on two real-world datasets.
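
To make the observation above concrete, here is a minimal, hypothetical Java illustration (the class, method, and test names are invented, and the sketched model input format is an assumption rather than the paper's exact design): the developer-written summary of a focal method names the intended behavior, and often the very value that the target assert statement must check.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class SummarizationGuidedExample {

    /**
     * Returns the larger of the two given integers.
     */
    // Developer-written summarization: the Javadoc above states the
    // intended behavior ("the larger of the two"), which hints at the
    // value the generated assert should check.
    public static int max(int a, int b) {
        return a > b ? a : b;
    }

    @Test
    public void testMax() {
        // Test prefix: the test body up to, but excluding, the assert
        // statement the model is asked to generate.
        int result = max(3, 7);

        // A summarization-guided model would condition on roughly
        // <summary> + <test prefix> + <focal method> (assumed format)
        // and, guided by "the larger of the two", generate:
        assertEquals(7, result);
    }
}
```

In this sketch the summary's wording points directly at the expected value; this is exactly the kind of signal that models conditioned only on the test prefix and focal method cannot exploit.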

Electronic Supplementary Material

JCST-2209-12878-Highlights.pdf (173.5 KB)

Journal of Computer Science and Technology
Pages 138–157
Cite this article:
Zhang Y-W, Jin Z, Wang Z-J, et al. SAGA: Summarization-Guided Assert Statement Generation. Journal of Computer Science and Technology, 2025, 40(1): 138–157. https://doi.org/10.1007/s11390-023-2878-6


Received: 30 September 2022
Accepted: 30 December 2023
Published: 23 February 2025
© Institute of Computing Technology, Chinese Academy of Sciences 2025