Toward an automated cross-multimodal verification of mobile app bug fixes integrating user feedback, developer responses, changelogs, and UI visual analysis
Context: Verifying claimed bug fixes in mobile applications is crucial, yet the "fixed but not resolved" phenomenon remains a persistent challenge. Existing bug analysis tools focus on pre-fix tasks such as detection and reproduction, but lack mechanisms to verify a fix holistically after deployment by cross-referencing developer claims, visual UI changes, and subsequent user feedback. This gap leads to persistent bugs, wasted developer effort, and user dissatisfaction.

Objective: This paper introduces BUGFixChecker, the first framework for automated, multimodal cross-verification of mobile app bug fixes. Our primary goal is to determine whether a claimed fix has truly resolved a user-reported issue.

Methods: BUGFixChecker integrates five data sources: the original user bug report, the developer's fix claim, "before" and "after" UI screenshots, and post-fix user reviews. The core methodology employs a Multimodal Large Language Model (MLLM) guided by a Chain-of-Thought prompt to perform a comparative reasoning task. We evaluated the framework on a curated dataset of 53 real-world bug fix cases from Android applications.

Results: BUGFixChecker achieved an overall accuracy of 83.0% and a macro F1-score of 0.805 in correctly verifying the status of bug fixes. It proved particularly effective at identifying discrepancies with strong evidentiary signals, such as "Unresolved Visual Mismatch" (F1-score = 0.865). Most significantly, a rigorous ablation study demonstrated the critical contribution of the visual modality: the full multimodal framework outperformed a text-only baseline by over 19 percentage points in F1-score (0.805 vs. 0.610), showing that visual evidence is indispensable for this task.

Conclusion: BUGFixChecker offers a novel and pragmatic approach to automated bug fix verification. By moving beyond pre-fix analysis to the critical post-fix verification stage, our multimodal framework provides a scalable solution to enhance the integrity of bug tracking systems, reduce developer workload, and ensure higher software quality in rapidly evolving mobile ecosystems.
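
The results above are reported as overall accuracy and macro F1-score. As a minimal sketch of how these two metrics relate, the Python below computes both from a list of true and predicted verdicts. Macro F1 is the unweighted mean of the per-class F1-scores, so a weak minority class pulls it below accuracy. The function names and the six toy cases are invented for illustration and are not the paper's data or code; "Resolved" is an assumed label name, while "Unresolved Visual Mismatch" is taken from the abstract.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted verdict matches the true verdict."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): six cases, one unresolved fix misjudged as resolved.
y_true = ["Resolved"] * 4 + ["Unresolved Visual Mismatch"] * 2
y_pred = ["Resolved"] * 4 + ["Unresolved Visual Mismatch", "Resolved"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy example a single misjudged minority-class case leaves accuracy at 5/6 while macro F1 drops to 7/9, which illustrates why the two figures are reported side by side. By the same arithmetic, an accuracy of 83.0% on 53 cases is consistent with 44 correct verdicts (44/53 ≈ 0.830).
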
| Item Type | Article |
|---|---|
| Identification Number | 10.1016/j.infsof.2025.107996 |
| Additional information | © 2025 Elsevier B.V. This is the accepted manuscript version of an article which has been published in final form at https://doi.org/10.1016/j.infsof.2025.107996 |
| Keywords | bug fix verification, empirical software engineering, mobile ui analysis, multimodal large language models (mllms), software maintenance, user-reported bugs, software, information systems, computer science applications |
| Date Deposited | 16 Mar 2026 09:09 |
| Last Modified | 17 Mar 2026 19:18 |
