Toward an automated cross-multimodal verification of mobile app bug fixes integrating user feedback, developer responses, changelogs, and UI visual analysis

Massenon, Rhodes, Gambo, Ishaya and Khan, Javed Ali (2026) Toward an automated cross-multimodal verification of mobile app bug fixes integrating user feedback, developer responses, changelogs, and UI visual analysis. Information and Software Technology, 191: 107996. ISSN 0950-5849


Context: Verifying claimed bug fixes in mobile applications is crucial, yet the "fixed but not resolved" phenomenon remains a persistent challenge. Existing bug analysis tools focus on pre-fix tasks such as detection and reproduction, but lack mechanisms to holistically verify a fix post-deployment by cross-referencing developer claims, visual UI changes, and subsequent user feedback. This gap leads to persistent bugs, wasted developer effort, and user dissatisfaction. Objective: This paper introduces BUGFixChecker, the first framework for automated, multimodal cross-verification of mobile app bug fixes. Our primary goal is to determine whether a claimed fix has truly resolved a user-reported issue. Methods: BUGFixChecker integrates five data sources: the original user bug report, the developer's fix claim, "before" and "after" UI screenshots, and post-fix user reviews. The core methodology employs a Multimodal Large Language Model (MLLM) guided by a Chain-of-Thought prompt to perform a comparative reasoning task. We evaluated the framework on a curated dataset of 53 real-world bug fix cases from Android applications. Results: BUGFixChecker achieved an overall accuracy of 83.0% and a macro F1-score of 0.805 in correctly verifying the status of bug fixes. It proved particularly effective at identifying discrepancies with strong evidentiary signals, such as "Unresolved Visual Mismatch" (F1-score = 0.865). Most significantly, a rigorous ablation study demonstrated the critical contribution of the visual modality: the full multimodal framework outperformed a text-only baseline by more than 19 percentage points in F1-score (0.805 vs. 0.610), proving that visual evidence is indispensable for this task. Conclusion: BUGFixChecker offers a novel and pragmatic approach to automated bug fix verification. By moving beyond pre-fix analysis to the critical post-fix verification stage, our multimodal framework provides a scalable solution to enhance the integrity of bug tracking systems, reduce developer workload, and ensure higher software quality in rapidly evolving mobile ecosystems.

Item Type	Article
Identification Number	10.1016/j.infsof.2025.107996
Additional information	© 2025 Elsevier B.V. This is the accepted manuscript version of an article which has been published in final form at https://doi.org/10.1016/j.infsof.2025.107996
Keywords	bug fix verification, empirical software engineering, mobile ui analysis, multimodal large language models (mllms), software maintenance, user-reported bugs, software, information systems, computer science applications
Date Deposited	16 Mar 2026 09:09
Last Modified	08 Apr 2026 21:10

File	BUGFixChecker_Paper_-_Manuscript_ID_INFSOF-D-25-00884R1_1_.pdf
Version	Submitted Version
Access	Restricted to Repository staff only until 8 December 2026
Licence	Available under Creative Commons: BY-NC-ND 4.0
