Vision LLM-Driven Operational Hazard Recognition for Building Fire Safety Compliance Checking

12 pages • Published: August 28, 2025

Abstract

Building fire incidents pose significant risks to human lives and property, making fire safety compliance a critical aspect of building management. Traditional compliance checks are largely manual, relying on expert inspectors to assess and report on fire safety standards. While prior research has explored Automated Compliance Checking (ACC) during the design phase, limited attention has been given to the operational phase, where dynamic risks necessitate continuous monitoring. This study proposes a novel approach that leverages vision Large Language Models (vLLMs) to automate fire safety compliance monitoring in the operational phase. The developed method frames hazard recognition as a Visual Question Answering (VQA) task, enabling the model to analyze visual data and respond to textual queries regarding potential fire hazards. The system employs a Vision Transformer (ViT) for visual encoding and a multimodal fusion process, allowing the vLLM to generate contextually relevant descriptions of observed hazards, along with regulatory references including Occupational Safety and Health Administration (OSHA) standards. Evaluation results demonstrate significant improvements in hazard recognition over a generic vLLM baseline, with an average BLEU score of 0.1355 compared to 0.0410 and higher ROUGE scores reflecting superior precision and coherence. The model’s ability to automatically generate structured hazard description reports has practical implications for assisting expert-driven inspections, offering a comprehensive and effective solution for long-term fire safety management.
This study thus advances ACC research by providing a comprehensive, automated method for continuous fire safety compliance in operational building environments.

Keyphrases: automated compliance checking (acc), computer vision, fire safety compliance, operational phase monitoring, vision large language models (vllm), visual question answering (vqa)

In: Jack Cheng and Yu Yantao (editors). Proceedings of The Sixth International Conference on Civil and Building Engineering Informatics, vol 22, pages 829-840.
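
The abstract reports BLEU scores for generated hazard descriptions. As a rough illustration of what such a score measures, the following is a minimal sketch of a BLEU-style metric (clipped n-gram precision with a brevity penalty). It is not the authors' evaluation code: the function names, the single-reference setup, whitespace tokenization, and the choice of `max_n=2` are all assumptions made for brevity, and the example sentences are hypothetical.

```python
from collections import Counter
import math


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited
    at most as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


def simple_bleu(candidate, reference, max_n=2):
    """Geometric mean of clipped precisions for n = 1..max_n,
    scaled by a brevity penalty for short candidates.
    Assumption: one reference, uniform weights, no smoothing."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical model output and expert-written reference description.
generated = "exit door is blocked by stacked boxes".split()
expert = "the exit door is blocked by storage boxes".split()
score = simple_bleu(generated, expert)
```

Scores closer to 1 indicate more n-gram overlap with the reference text. Published BLEU figures typically use up to 4-grams and corpus-level aggregation, so values from this sketch are not comparable to the numbers in the abstract.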