Toward an automated cross-multimodal verification of mobile app bug fixes integrating user feedback, developer responses, changelogs, and UI visual analysis

Massenon, Rhodes, Gambo, Ishaya and Khan, Javed Ali (2026) Toward an automated cross-multimodal verification of mobile app bug fixes integrating user feedback, developer responses, changelogs, and UI visual analysis. Information and Software Technology, 191: 107996. ISSN 0950-5849

Context: Verifying claimed bug fixes in mobile applications is crucial, yet the "fixed but not resolved" phenomenon remains a persistent challenge. Existing bug analysis tools focus on pre-fix tasks such as detection and reproduction, but lack mechanisms to holistically verify a fix post-deployment by cross-referencing developer claims, visual UI changes, and subsequent user feedback. This gap leads to persistent bugs, wasted developer effort, and user dissatisfaction. Objective: This paper introduces BUGFixChecker, the first framework for automated, multimodal cross-verification of mobile app bug fixes. Our primary goal is to determine whether a claimed fix has truly resolved a user-reported issue. Methods: BUGFixChecker integrates five data sources: the original user bug report, the developer's fix claim, "before" and "after" UI screenshots, and post-fix user reviews. The core methodology employs a Multimodal Large Language Model (MLLM) guided by a Chain-of-Thought prompt to perform a comparative reasoning task. We evaluated the framework on a curated dataset of 53 real-world bug fix cases from Android applications. Results: BUGFixChecker achieved a high overall accuracy of 83.0% and a macro F1-score of 0.805 in correctly verifying the status of bug fixes. It proved particularly effective at identifying discrepancies with strong evidentiary signals, such as "Unresolved Visual Mismatch" (F1-score = 0.865). Most significantly, a rigorous ablation study demonstrated the critical contribution of the visual modality: the full multimodal framework outperformed a text-only baseline by over 19 percentage points in F1-score (0.805 vs. 0.610), proving that visual evidence is indispensable for this task. Conclusion: BUGFixChecker offers a novel and pragmatic approach to automated bug fix verification. By moving beyond pre-fix analysis to the critical post-fix verification stage, our multimodal framework provides a scalable solution to enhance the integrity of bug tracking systems, reduce developer workload, and ensure higher software quality in rapidly evolving mobile ecosystems.
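
The abstract reports both overall accuracy and a macro F1-score. As a rough illustration of why the two can differ, the sketch below computes each from a list of verdicts using the standard definitions. It is not taken from the paper: the function names are hypothetical, the "Resolved" label is assumed for illustration (only "Unresolved Visual Mismatch" appears in the abstract), and the toy labels are invented rather than drawn from the 53-case dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted verdict equals the true verdict."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy verdicts, invented for illustration only (not the paper's data).
# "Resolved" is an assumed label name.
truth = ["Resolved", "Resolved", "Resolved", "Unresolved Visual Mismatch"]
preds = ["Resolved", "Resolved", "Resolved", "Resolved"]

print(accuracy(truth, preds))
print(macro_f1(truth, preds))
```

In this toy case the classifier never predicts the minority class, so accuracy stays comparatively high while macro F1 drops, because every class contributes equally to the mean regardless of how many cases it has.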

BUGFixChecker_Paper_-_Manuscript_ID_INFSOF-D-25-00884R1_1_.pdf
Submitted Version
Restricted to Repository staff only until 8 December 2026
Available under Creative Commons: BY-NC-ND 4.0