TMH: A Transformer-Markov Hybrid Model for Behavior-Aware Code Summarization
Abstract
Automatic code comment generation is crucial for enhancing code readability, maintainability, and developer efficiency. However, existing models often treat source code as static text, overlooking its dynamic execution behavior. To address this, we propose Transformer with Markov modeling (TMH), a hybrid architecture that combines static lexical embeddings with behavioral signals derived from control-flow semantics. This dual-view representation enables the model to capture both what the code says and how it behaves. To further enhance relevance, we introduce an entropy-guided attention mechanism that prioritizes tokens critical to control logic during decoding. TMH outperforms state-of-the-art baselines (e.g., SeTransformer, ALSI-Transformer) by +1.91 BLEU-4 and +1.37 METEOR on large-scale Java datasets. Human evaluations confirm improved accuracy and contextual fluency, particularly for logic-heavy methods. By unifying static and dynamic code understanding, our approach advances neural code summarization and paves the way for more intelligent, behavior-aware documentation tools in software engineering.
Copyright (c) 2025 ITEGAM-JETIA

This work is licensed under a Creative Commons Attribution 4.0 International License.
