TMH: A Transformer-Markov Hybrid Model for Behavior-Aware Code Summarization

  • Zakarya Benyamina Department of Computer Science, Institute of Science, University Center of Aflou, Laghouat, Algeria https://orcid.org/0009-0008-6362-436X
  • Ahmed Benyamina Department of Computer Science, Faculty of Exact Sciences, Tahri Mohammed University, Bechar, Algeria https://orcid.org/0000-0002-6710-6462

Abstract

Automatic code comment generation is crucial for enhancing code readability, maintainability, and developer efficiency. However, existing models often treat source code as static text, overlooking its dynamic execution behavior. To address this, we propose the Transformer-Markov Hybrid (TMH), an architecture that combines static lexical embeddings with behavioral signals derived from control-flow semantics. This dual-view representation enables the model to capture both what the code says and how it behaves. To further enhance relevance, we introduce an entropy-guided attention mechanism that prioritizes tokens critical to control logic during decoding. TMH outperforms state-of-the-art baselines (e.g., SeTransformer, ALSI-Transformer) by +1.91 BLEU-4 and +1.37 METEOR on large-scale Java datasets. Human evaluations confirm improved accuracy and contextual fluency, particularly for logic-heavy methods. By unifying static and dynamic code understanding, our approach advances neural code summarization and paves the way for more intelligent, behavior-aware documentation tools in software engineering.
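To make the entropy-guided idea concrete, the sketch below is an illustrative (not the paper's actual) implementation: a first-order Markov model over a code-token stream yields a per-token transition entropy, and that entropy is used as an additive bias on attention logits so that tokens with less predictable continuations (typically control-flow keywords) receive more attention mass. The function names, the additive-bias form, and the `alpha` weight are all assumptions for illustration.

```python
import math
from collections import defaultdict, Counter

def transition_entropies(tokens):
    """First-order Markov model over a token stream: for each token,
    the Shannon entropy (bits) of its next-token distribution."""
    successors = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        successors[cur][nxt] += 1
    entropies = {}
    for tok, counts in successors.items():
        total = sum(counts.values())
        entropies[tok] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return entropies

def entropy_biased_attention(scores, tokens, entropies, alpha=1.0):
    """Softmax over raw attention scores, with each token's logit
    shifted by alpha * its Markov transition entropy, so behaviorally
    'surprising' tokens are up-weighted (hypothetical formulation)."""
    logits = [s + alpha * entropies.get(t, 0.0) for s, t in zip(scores, tokens)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]
```

For example, in the stream `["if", "a", "if", "b", "if", "a"]` the token `if` has two distinct successors and therefore nonzero entropy, so with equal raw scores it ends up with a larger attention weight than the deterministic tokens `a` and `b`.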

Published
2025-12-12
How to Cite
Benyamina, Z., & Benyamina, A. (2025). TMH: A Transformer-Markov Hybrid Model for Behavior-Aware Code Summarization. ITEGAM-JETIA, 11(56), 186-199. https://doi.org/10.5935/jetia.v11i56.2687