ICCBEI 2025: Papers with Abstracts

Papers
Abstract. Building Information Modelling (BIM) can significantly improve the efficiency of smart operation and maintenance (O&M) of aging buildings and support advanced equipment maintenance and emergency response. However, many aging buildings lack BIM models because of the era in which they were built, and manual reconstruction requires experts with specialized knowledge and consumes considerable modeling time. Many existing studies use deep learning methods based on image features to extract component information from paper drawings for BIM reconstruction. However, the layering pre-processing of paper drawings, which is essential for improving information extraction accuracy, still requires inefficient manual drafting and line classification. Although existing studies have proposed methods for layering drawings by line classification, these methods perform poorly on construction engineering drawings because of the complexity and line density of such drawings. To fill this gap, we propose a method based on heterogeneous graph neural networks (GNNs) that predicts the category of line elements on paper drawings to achieve automatic layering. It has three modules: 1) paper drawing vectorization by line extraction and duplicate element merging to detect the line elements representing component contours and annotations; 2) graph structure construction that considers the different topological relationships among lines to represent the line properties; 3) a heterogeneous graph node classification model that predicts the line category and realizes automatic layering. The proposed method was tested on an actual engineering drawing dataset, and the results show that it achieves an overall F1 score above 0.74 and exceeds the baseline model by more than 0.1. This research improves the efficiency of paper drawing pre-processing and provides a new solution for information extraction in BIM reconstruction of aging buildings.
Abstract. Japan's construction industry is facing a critical labor shortage due to its aging population. To address this issue, utilizing Building Information Modeling (BIM)/Construction Information Modeling (CIM) to enhance productivity and employing foreign workers are considered effective solutions. However, research on the integration of foreign technical talents in Japan's construction sector remains limited. This paper aims to explore the potential of introducing young Chinese technicians trained in BIM to alleviate the shortage of site management personnel in Japan. First, a comparative study of BIM education in universities in China and Japan is conducted to assess the current state of BIM training in both countries. Following this, the willingness of 187 young Chinese technicians to take on management roles in Japan is investigated and analyzed, along with the satisfaction levels of 10 Chinese individuals currently managing sites in Japan. Finally, by synthesizing the survey results, the paper identifies the challenges in practical implementation and confirms the feasibility of employing BIM-educated young Chinese technicians to address Japan's labor shortage in site management.
Abstract. In Hong Kong, low-rise residential buildings, which are mostly built next to mountains and occupy half of the total residential area, are often seriously damaged by debris flow. The extent of damage to each building depends not only on the intensity of the debris flow but also on the interaction of the neighboring components of the building and their collective performance. With the application of BIM (Building Information Modeling) technology in structural analysis, data conversion and information transfer between the BIM model and the FEM (Finite Element Method) model have become the main factors limiting its efficiency and quality. Unlike frame-structure building models, which can be converted automatically, such models are currently converted semi-automatically: the process relies on manual labor and loses a large amount of BIM information because of non-standardized and confusing information. Therefore, this study proposes an automatic standardized conversion method based on OpenBIM data (IFC) for developing refined FEM models of low-rise residential buildings. It consists of three parts: 1) guaranteeing model refinement and spatial robustness through model geometry transformation and calibration; 2) standardizing the conversion process through the establishment and matching of a material information database; and 3) integrating the output information to achieve data interoperability and provide a reference for debris flow damage assessment. Conversion tests of this method and the traditional conversion method were conducted on the same BIM model, and the results show that the FEM model generated by this method is significantly better than the one generated by the traditional conversion method in terms of spatial closure and mesh quality (i.e., spatial robustness and model refinement). Future research will continue refining spatial robustness detection, standardized conversion, and data interoperability.
Abstract. The primary aim of this research is to identify and address the key barriers to Building Information Modeling (BIM) adoption and implementation within the Mongolian construction industry, serving as a case study for developing economies. BIM offers significant benefits and represents a key technological innovation shaping the future of the Architecture, Engineering, and Construction (AEC) and infrastructure sectors towards digital transformation. However, there are substantial challenges in BIM adoption and implementation in the construction industry. To identify the barriers to BIM adoption in Mongolia’s construction sector, this study conducts a comprehensive analysis of the industry’s current state and the status of BIM adoption, supported by a review of relevant literature and case-specific data. The analysis uses the SWOT (Strengths, Weaknesses, Opportunities, and Threats) and PESTLE (Political, Economic, Social, Technological, Legal, and Environmental) approaches to identify the key challenges, including technical, policy, socio-economic, and human-resource-related factors. A framework to address those challenges is then developed based on the analysis results and data. The framework also draws insight from a review and comparison of global best practices in BIM implementation. This study contributes to filling the knowledge gap regarding BIM adoption in developing economies and offers practical insights that can aid in developing BIM adoption models for countries with similar economic and industry conditions.
Abstract. Quality inspection in outdoor prefabricated storage yards is highly challenging due to the large volume, diversity, and complex management demands of components. Currently, these inspections are conducted manually, which is insufficient to fully meet industry needs. This study proposes an innovative approach to enable intelligent inspection of multiple prefabricated components in large-scale prefabricated storage yards by integrating Building Information Modeling (BIM), LiDAR, Unmanned Aerial Vehicles (UAVs), and Unmanned Ground Vehicles (UGVs). First, an intelligent sensing environment and a stepwise collaboration mechanism are established, in which UAVs are used to reconstruct a comprehensive 3D environment of the prefabrication site, providing a map for planning the optimal scanning path of LiDAR-equipped UGVs. Next, a point cloud-driven integrated geometric quality inspection method is introduced, in which UGVs autonomously collect and process point cloud data to extract precise geometric features of large components within expansive spaces. To verify the effectiveness of the proposed method, an experiment is conducted at a large factory that produces many types of prefabricated components. By integrating point cloud processing results with BIM model design information, this research achieves high-precision, large-scale quality inspections of the non-structural performance of prefabricated components, significantly enhancing inspection efficiency and accuracy.
Abstract. The urgent need to reduce carbon emissions within the building and construction industry has underscored the importance of embodied carbon assessment. Building Information Modeling (BIM) and Discrete Event Simulation (DES) emerge as promising tools for enhancing this assessment process, offering detailed data extraction capabilities and dynamic simulation for energy consumption quantification. While previous research has explored the potential of BIM-DES integration, this paper addresses existing gaps by identifying and incorporating essential information requirements into BIM models for more effective DES-based embodied carbon assessment. This paper thus develops a BIM-based DES method for cradle-to-site embodied carbon assessment by (1) developing an integrated ontology to identify data requirements, (2) enriching BIM models with the necessary information for DES modeling based on the ontology, and (3) building the DES model based on the data extracted from the enriched BIM-related file and implementing scenario-based analysis. This integrated approach facilitates efficient and comprehensive analysis of cradle-to-site embodied carbon. The synergy between BIM and DES enables stakeholders to make informed decisions early in the project lifecycle, optimizing carbon reduction strategies through scenario-based analysis.
Abstract. Automatic generation of Building Information Models (BIMs) from point clouds (i.e., Scan-to-BIM) plays a critical role in bridge maintenance and the development of Digital Twinning (DT). However, incomplete point clouds (e.g., caused by occlusions in laser scanning) significantly reduce Scan-to-BIM accuracy. To overcome this challenge, we propose addressing the occlusion problems in bridge point clouds by introducing an additional point cloud completion task in the Scan-to-BIM process. This new task takes incomplete bridge point clouds, following segmentation, as input and generates complete point clouds as output. A learning-based completion model, the Point Completion Network (PCN), is adopted to validate the proposed strategy and shows robust completion performance for bridge components with varying levels of occlusion. On average, it improves the Chamfer Distance (CD) by 11.92 and the F-score by 11.82 for coarse completion, and the CD by 8.89 and the F-score by 6.95 for dense completion. This study contributes to Scan-to-BIM by refining the Scan-to-BIM framework in bridge engineering and defining a point cloud completion task to facilitate the development of bridge DT systems.
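For readers unfamiliar with the Chamfer Distance reported above, the following is a minimal sketch of one common variant (the sum of the two directional averages of squared nearest-neighbour distances). The abstract does not state which variant or scaling the authors use, so the function name, the squared-distance choice, and the toy point sets are assumptions for illustration only.

```python
from math import dist


def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between two point sets (assumed variant).

    a, b: non-empty sequences of equal-dimension coordinate tuples.
    Uses mean squared nearest-neighbour distance in both directions.
    """
    def one_way(src, dst):
        # For each source point, squared distance to its nearest target point.
        return sum(min(dist(p, q) ** 2 for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


# Hypothetical toy data: a "complete" component and an occluded scan of it.
complete = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
partial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

print(chamfer_distance(partial, complete))
```

A completion model is judged to have improved the scan when the distance between its output and the ground-truth cloud is lower than the distance between the incomplete input and the ground truth.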
Abstract. The surfaces of ancient masonry structures are prone to various defects over time. H-BIM supports digital defect inspection, which improves efficiency and saves labor. However, existing H-BIM models of masonry structures are still limited by low defect complexity and idealized geometric shapes, and the defect information is not integrated with the corresponding masonry units, which prevents accurate and comprehensive prediction in structural analysis. The model developed in this study presents detailed and realistic masonry units fused with defect information and can be used for numerical simulations. A YOLO model is used to detect and segment defects in masonry structures. K-fold cross-validation is employed during model training to mitigate the impact of category imbalance in the dataset. The YOLO model is also employed to segment masonry units and extract their contours. The defect information is integrated with masonry units based on their positions. A case study was carried out on an ancient city wall in Suzhou, China. The generated masonry H-BIM assists the current and future protection of the structures, highlighting the feasibility of the method for the analysis of masonry structures.
Abstract. This study presents a computational workflow for detecting overbreak and underbreak in drill-and-blast tunnels, grounded in the principles of Digital Twin technology, with the objective of enhancing blasting quality control during tunnel construction. Utilizing 3D laser scanning technology, a 3D model of the real-world tunnel is constructed and integrated with the design model for comparative analysis. The differences in the contours of both models are computed, facilitating the automated assessment of overbreak and underbreak volumes in blasting sections. The specific workflow encompasses data acquisition, point cloud processing, model construction and optimization, as well as model integration and analysis, thereby establishing an efficient and precise system for detecting overbreak and underbreak. In comparison to traditional manual section measurement methods, the proposed approach not only significantly reduces labor workload but also substantially enhances detection accuracy. This technology offers reliable technical support for tunnel blasting quality assessment, effectively addressing the challenges of high labor input in drill-and-blast tunnel construction.
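The contour comparison described above can be illustrated with a simplified sketch. It assumes a circular design profile and a scanned contour sampled at uniformly spaced angles around the tunnel axis; neither assumption comes from the abstract, and the function names and sample values are hypothetical. Real tunnel profiles and the authors' point cloud pipeline will be more involved.

```python
import math


def break_areas(scan_radii, design_radius):
    """Return (overbreak_area, underbreak_area) for one cross-section.

    scan_radii: scanned radii at uniformly spaced angles over a full circle.
    design_radius: radius of the (assumed circular) design profile.
    """
    dtheta = 2.0 * math.pi / len(scan_radii)
    over = 0.0
    under = 0.0
    for r in scan_radii:
        # Signed area of the thin sector between scanned and design contours.
        strip = 0.5 * (r ** 2 - design_radius ** 2) * dtheta
        if strip > 0:
            over += strip      # excavated beyond the design line
        else:
            under += -strip    # rock left inside the design line
    return over, under


def break_volume(area, section_spacing):
    """Approximate volume by extruding a section area over the section spacing."""
    return area * section_spacing


# Hypothetical section: half the contour 0.1 m too wide, half 0.1 m too narrow.
radii = [5.1] * 180 + [4.9] * 180
over_area, under_area = break_areas(radii, 5.0)
print(break_volume(over_area, 1.0), break_volume(under_area, 1.0))
```

Keeping overbreak and underbreak as separate totals, rather than one signed sum, avoids the two cancelling each other within a section.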
Abstract. Fire safety inspection is essential for protecting occupants from fire hazards. Traditional inspection methods often rely on subjective assessments, which can lead to errors and inadequate maintenance. This study introduces a visual fire safety inspection tool that integrates self-trained Machine Learning (ML) services to enhance the accuracy and efficiency of documenting Fire Safety Equipment (FSE) using images. The ML services were incorporated into a web-based application built with the React framework, featuring a backend developed using FastAPI and MongoDB for efficient processing and scalability. The tool achieved a mean average precision (mAP) of over 80% on testing datasets. It offers a robust environment for fire safety experts to validate and compare models, and it also provides insights into the impact of active learning algorithms. Despite the tool’s high accuracy, challenges such as slow loading times and application freezing were identified, with proposed solutions focusing on optimizing backend and frontend processes. The integration of ML services demonstrates significant potential for improving fire safety inspections, with future work aimed at refining models, expanding real-time monitoring capabilities, and ensuring compatibility with Building Information Modeling (BIM) systems and conventional smartphones.
Abstract. The construction industry is characterized by physically demanding tasks and the adoption of awkward postures, both of which contribute to a high incidence of work-related musculoskeletal disorders (WMSDs). Despite the significance of these factors, few studies have incorporated external load estimation, which accounts for the actual weights being lifted and carried, into ergonomic assessments. This research aims to enhance the accuracy of WMSD risk evaluations by integrating external load estimation into ergonomic assessments. We utilized skeleton tracking technology to automatically evaluate awkward postures based on the Rapid Upper Limb Assessment (RULA) framework, a method for evaluating the exposure of workers to ergonomic risk factors. Concurrently, we analyzed electromyography (EMG) signals measuring muscle activity to extract pertinent features for estimating external loads, which were subsequently integrated into the overall ergonomic assessment. Experimental results demonstrate that the Multi-Layer Perceptron-Back Propagation algorithm outperforms alternative machine learning classification methods, achieving an accuracy rate of 98.3%.
Abstract. Passive back support exoskeletons (PBSEs) have been promoted as a means of alleviating the physical strain associated with manual tasks in industrial settings. These devices are known to influence the wearer's kinematics, muscle activation, and balance. Slips and trips, which are frequent precursors to falls, often occur during construction tasks. The effects of PBSEs on balance and the ability to recover from slip- and trip-like perturbations during walking have not been thoroughly examined. The present study aimed to investigate the effects of a PBSE on ground reaction forces (GRFs) after slip- and trip-like perturbations during walking on an instrumented treadmill. Nine male participants walked on an instrumented treadmill under two conditions: without wearing a PBSE (WOE) and while wearing a PBSE (WE). Each participant experienced normal walking, slip, and trip events, presented in a random order. GRFs were recorded using a force plate integrated into the treadmill. After slip perturbations, Fx (force in the mediolateral direction) was 18.36% higher (p = 0.003) in WE (mean, 166.32 N) than in WOE (mean, 140.52 N). Fy (force in the anteroposterior direction), Fz (force in the vertical direction), and Fr (the resultant force) did not show statistically significant differences between WE and WOE after slip perturbations. Following trip perturbations, a statistically significant increase was observed in Fz (p < 0.001) and Fr (p < 0.001): Fz and Fr were higher in WE than in WOE by 25.24% and 8.88%, respectively. Wearing a PBSE may alter the GRF in the mediolateral or vertical direction, which may predispose the wearer to falls. Construction workers should be provided with balance training while wearing a PBSE and should then exercise caution while walking on a construction site.
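The reported percentage for Fx follows directly from the two condition means given in the abstract. The short sketch below shows the arithmetic, taking the no-exoskeleton mean as the baseline; the function name is ours.

```python
def percent_increase(baseline, value):
    """Relative increase of value over baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Mean mediolateral force after slip perturbations, from the abstract (N).
fx_without = 140.52  # WOE
fx_with = 166.32     # WE

print(round(percent_increase(fx_without, fx_with), 2))  # 18.36
```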
Abstract. Low back pain (LBP) is a prevalent issue among construction workers, with prevalence rates ranging from 27% to 52%. This significant burden not only affects worker health but also incurs substantial economic costs, exemplified by Brazil's annual expenditure of approximately USD 500 million and Spain's related costs of EUR 8,945.6 million. To mitigate LBP, back exoskeletons have emerged as promising solutions, designed to reduce low back load during repetitive lifting tasks. This study compares the human-exoskeleton interaction forces in three back exoskeletons, namely SV Exosuit (a soft active exosuit integrated with a safety vest), Laevo V2 (a rigid passive exoskeleton), and MATE-XB (also rigid and passive), through experimental tasks involving bending and squatting. Three healthy male participants performed these tasks while wearing each exoskeleton, during which human-exoskeleton interaction forces at the thigh and shoulder/chest were measured. Results demonstrated that SV Exosuit produced higher contact forces at both body regions, attributed to its smaller contact area and less cushioning material. In contrast, Laevo V2 exhibited two peaks in contact forces during a motion cycle due to its torque generation mechanism, highlighting the influence of supporting torque design on user comfort. This research underscores the critical need for optimizing exoskeleton designs to enhance comfort and usability in construction settings. Future studies should investigate a larger sample size and additional body regions to comprehensively assess the relationship between supportive torque, contact forces, joint angle, and user comfort.
Abstract. The rapid growth of urbanization and traffic complexity in smart cities has led to heightened safety risks, particularly for vulnerable road users (VRUs) like pedestrians and cyclists. This study aims to enhance VRU safety by investigating their perceived risk and decision-making processes in various traffic scenarios. Traditional methods, such as surveys and observational studies, have limitations in capturing VRUs' cognitive processes and real-time behavior. This research proposes a multimodal approach combining electroencephalography (EEG) and eye-tracking technology to analyze VRUs' cognitive and physiological responses in simulated urban environments. By integrating data on attention, vigilance, and emotional states, this approach provides a comprehensive assessment of VRUs' risk perception mechanisms. Utilizing the CARLA simulator, a platform for pedestrian-vehicle interaction is developed to simulate realistic urban scenarios, while physiological and behavioral data are collected to establish a model for predicting VRU behavior. The study contributes to theoretical insights on risk perception and practical implications for designing safer urban infrastructure in smart cities, ultimately aiming to reduce VRU accidents and enhance traffic safety.
Abstract. Multi-object tracking in videos is an important task in various domains, such as traffic engineering and construction management. This paper proposes two methods, Grid Mean State and InCo-Skip, to improve multi-object tracking performance, particularly under frame-skipping scenarios. The study focuses on traffic flow counting, using YOLOv8 for vehicle tracking. Initial tests show that while car tracking remains accurate, motorcycles suffer a significant accuracy degradation when homogeneous frame skipping is applied. Grid Mean State addresses the issue by utilizing velocity vectors from earlier frames, and InCo-Skip provides an alternative skipping strategy to balance computational efficiency and accuracy. The combined methods show a substantial enhancement in counting accuracy, achieving up to 28.2% improvement for motorcycles under challenging conditions.
Abstract. In the field of construction engineering, particularly in steel structure construction, welding operations are of great importance. However, this activity carries significant safety risks, including severe injuries such as eye burns. Therefore, safety management and protective measures for these high-risk activities are especially crucial. Although computer vision-based object detection algorithms have already been applied in the construction sector, these algorithms generally lack the capability to process high-level semantic information: they can detect objects but cannot understand the state of those objects. To address this issue, this paper proposes an improved welding safety inspection method based on gaze estimation and object detection. First, from a human-machine interaction perspective, the method combines gaze estimation and object detection to determine the worker's operational status. Second, it performs real-time checks on the personal protective equipment (PPE) used by the welders. Experiments demonstrate the feasibility of this approach, which improves the ability of safety inspection to handle complex scenarios and contributes to improved safety levels in construction projects.
Abstract. This study examines the job-housing balance in Guangzhou, focusing on its implications for new residents. Initially, it calculates the self-sufficiency of employment and residential populations across 11 administrative districts and 170 streets/towns in Guangzhou, along with peak commuting times for weekdays, to evaluate the job-housing relationship of new residents. Subsequently, the study assesses the housing vacancy rate in Guangzhou and analyzes the current state of vacant properties. Finally, by integrating the job-housing dynamics of new residents with the housing vacancy situation, the study proposes strategies aimed at addressing the challenges of vacant housing and job-housing imbalance through the rational allocation of unoccupied properties.
Abstract. This paper presents a lightweight transfer learning approach for water body segmentation by applying Adaptor-based fine-tuning on general image datasets. Traditional deep learning models often require full-scale retraining for each new task, which is both computationally expensive and time-consuming. In contrast, Adaptor networks, which are lightweight modules that selectively fine-tune task-specific layers while retaining most pre-trained model parameters, offer an efficient alternative. Water bodies present unique challenges for segmentation, such as varying lighting, reflections, and seasonal fluctuations. These factors can make it difficult to distinguish water from land, particularly where reflections resemble adjacent features. Adaptor-based fine-tuning helps to reduce computational costs while ensuring the model captures the fine distinctions between similar regions such as shallow water and land. The method was evaluated on a dataset containing lakes, rivers, and wetlands under diverse environmental conditions to test the model's robustness. The results indicate that Adaptor-based fine-tuning achieves performance comparable to that of fully fine-tuned models, with a significant reduction in computational costs and training time. The method also demonstrated high precision in segmenting water bodies under challenging conditions, such as occlusions and reflections. This study highlights the potential of lightweight transfer learning in resource-constrained environments, with applications in environmental monitoring, hydrological modeling, and geographic information systems (GIS). By demonstrating the effectiveness of Adaptor networks, this work contributes to the broader field of efficient transfer learning, showcasing how minimal adjustments to pre-trained models can yield accurate task-specific performance.
Abstract. Periodic bridge damage inspections result in a vast number of image records stored in a database, which can be used as a reference for subsequent damage assessment. However, the content-based image retrieval (CBIR) techniques currently in use are limited by the 'semantic gap': they tend to consider only the low-level visual features in an image and ignore the high-level semantic information. This study proposes a semantic-enriched image retrieval framework (SEIR-Net) for bridge damage assessment. The framework enables the image encoder to extract both low-level visual features and high-level semantic information by fine-tuning a multi-modal image captioning model (CNN-LSTM). The high-dimensional vectors extracted using the fine-tuned encoder are stored in a FAISS vector database, and efficient retrieval is achieved based on L2 (Euclidean) distance. Retrieval was evaluated on a damage dataset constructed from real-world bridge inspection reports, and the proposed method outperforms the commonly used VGG-16 and ResNet-50 models on the mAP and Recall@K (K = 1, 2, 4) metrics. These results suggest that incorporating the semantic content of damage in image retrieval provides more useful references for assessment. In summary, this study enhances the utility of historical image records in bridge damage assessment through semantic-enriched image retrieval techniques.
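The retrieval and evaluation steps above can be sketched in a few lines. This illustration replaces the FAISS index with a brute-force L2 search in plain Python, uses made-up 2-D vectors instead of encoder features, and assumes one common definition of Recall@K (a query counts as a hit if any of its top K results shares its damage label). All names, data, and labels here are hypothetical and not taken from the paper.

```python
from math import dist


def retrieve(query, database, k):
    """Indices of the k database vectors nearest to query by L2 distance."""
    ranked = sorted(range(len(database)), key=lambda i: dist(query, database[i]))
    return ranked[:k]


def recall_at_k(queries, query_labels, database, db_labels, k):
    """Fraction of queries with at least one same-label item in the top k."""
    hits = 0
    for q, label in zip(queries, query_labels):
        if any(db_labels[i] == label for i in retrieve(q, database, k)):
            hits += 1
    return hits / len(queries)


# Hypothetical feature vectors and damage labels for stored inspection images.
database = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
db_labels = ["crack", "crack", "spalling", "spalling"]

queries = [(0.2, 0.0), (5.2, 5.0), (2.0, 2.0)]
query_labels = ["crack", "spalling", "spalling"]

for k in (1, 2, 3):
    print(k, recall_at_k(queries, query_labels, database, db_labels, k))
```

With a fixed search procedure, the quality of the ranking depends entirely on where the encoder places the vectors, which is why the framework concentrates on enriching the encoder with semantic information.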
Abstract. Blockchain (BC) and smart contract (SC), known for their decentralized and secure frameworks, have been successfully applied across industries to enhance security, efficiency, and transparency in technical and engineering aspects such as automated transactions, data management, and supply chain monitoring. In recent years, the construction industry has seen significant advancements in adopting BC and SC. This review paper examines the latest BC and SC prototypes, case studies, and technical applications published over the past four years, focusing on how these academic developments can be transitioned into industry practice. By categorizing the collected studies using Technology Readiness Levels (TRLs) and conducting an in-depth analysis, the paper seeks to identify strategies for accelerating the adoption of these technologies in the construction sector. Reviewing recent advancements, this paper uncovers key trends, benefits, and challenges associated with implementing BC and SC in construction. The findings suggest that these technologies can significantly enhance project efficiency, contract management, and supply chain transparency while fostering trust among stakeholders. While these technologies offer significant potential, critical challenges, including scalability, privacy, and integration complexities, hinder their widespread adoption. Moreover, specific limitations, such as the difficulty in establishing regulatory compliance and technical barriers in data management, remain unresolved. By providing actionable insights and highlighting pathways to bridge the adoption gaps, the paper aims to accelerate the integration of BC and SC into the construction industry, driving innovation, transparency, and operational efficiency.
Abstract. Japan is covered by mountainous and hilly terrain and thus is prone to disasters such as landslides caused by earthquakes and heavy rainfall. Currently, the identification of sediment disaster areas is conducted through visual interpretation of aerial photographs taken after these disasters. To improve the efficiency of this process, research has been advancing in the automatic detection of sediment movement areas using deep learning techniques applied to aerial photographs. In this study, we compare the performance of sediment movement detectors using Unet and Unet++, in order to investigate the impact of changes in the loss function parameters during training on detection performance. Additionally, we evaluate the performance improvement when the sliding partitioning method is used for input images to the detectors. Results show that using sliding partitioning and adjusting the value of the loss function coefficient β can improve detection performance.
Abstract. In Yamaguchi Prefecture, periodic inspections of social infrastructure facilities, such as bridges, have become mandatory every five years due to revisions to the Road Act. However, a significant challenge arises from the lack of personnel available to conduct these inspections, especially in rural municipalities. This shortage of staff, combined with the rapid aging of infrastructure, has heightened the risk of accidents and disasters. This paper introduces the "Voices of Bridges" (VoB) system, which leverages citizen participation to address bridge maintenance through a user-friendly, interactive platform. The system's architecture integrates modern web technologies such as Node.js, MySQL, and Google Maps API to allow efficient data handling and visualization. By utilizing open data from Yamaguchi Prefecture and employing crowdsourcing techniques, VoB enables users to report bridge defects while simultaneously augmenting the efforts of local governments in maintaining infrastructure through community engagement.
Abstract. The monitoring and operation and maintenance (O&M) of urban streetlights is crucial for traffic safety and socio-economic development. However, accurately and robustly detecting streetlights in low-light and high-interference environments remains an open problem. In recent years, deep learning has made remarkable progress in the field of object detection, and single-stage detection algorithms represented by You Only Look Once (YOLO) have shown satisfactory detection performance. This brings a new opportunity to detect streetlights from images collected in complicated street environments. Therefore, this study proposes an improved YOLOv5 model, termed CB-YOLOv5, to accurately and robustly detect streetlights in low-light images with high interference. The proposed model integrates a Convolutional Block Attention Module (CBAM) and a Bidirectional Feature Pyramid Network (BiFPN) to enhance its ability to learn spatial and channel feature information and to promote information fusion and transfer between multi-scale objects. Experimental results show that, compared with the standard YOLOv5 algorithm, the proposed CB-YOLOv5 model achieves significant improvements in accuracy and interference resistance in streetlight detection tasks. The mAP0.5 reached 0.968, which is 23.5% higher than that of the standard YOLOv5 algorithm. In general, the CB-YOLOv5 model provides a new method for detecting small objects in low-light and complex scenes. The developed method is also expected to provide a theoretical basis for automated monitoring and O&M of urban lighting facilities.
Abstract. In an era of increasing uncertainty and rapid change, international contractors are faced with significant challenges in navigating pressures across diverse institutional environments. In response, contractors are adopting Environmental, Social, and Governance (ESG) strategies to manage these institutional challenges and enhance their legitimacy. However, the decision-making process surrounding ESG implementation is complex, particularly under volatile conditions where traditional frameworks fall short in addressing uncertainty. This study proposes a novel approach using a Markov Decision Process (MDP) model to optimize ESG decision-making for international contractors. The results demonstrate that MDP provides a structured, scientific approach to ESG decision-making in uncertain and changing environments, offering actionable insights for contractors operating in global markets.
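As a rough illustration of how an MDP can structure this kind of decision, the following is a minimal value-iteration sketch. The states, actions, transition probabilities, and rewards are hypothetical placeholders chosen for the example; none of them come from the paper, and the paper's actual formulation may differ.

```python
# Toy MDP solved by value iteration.
# ASSUMPTION: every state, action, probability and reward below is hypothetical.
STATES = ["low_pressure", "high_pressure"]
ACTIONS = ["minimal_esg", "proactive_esg"]

# TRANSITIONS[state][action] -> list of (probability, next_state, reward)
TRANSITIONS = {
    "low_pressure": {
        "minimal_esg": [(0.7, "low_pressure", 5.0), (0.3, "high_pressure", 5.0)],
        "proactive_esg": [(0.9, "low_pressure", 3.0), (0.1, "high_pressure", 3.0)],
    },
    "high_pressure": {
        "minimal_esg": [(0.2, "low_pressure", -4.0), (0.8, "high_pressure", -4.0)],
        "proactive_esg": [(0.6, "low_pressure", 1.0), (0.4, "high_pressure", 1.0)],
    },
}


def q_value(state, action, values, gamma):
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + gamma * values[s2])
               for p, s2, r in TRANSITIONS[state][action])


def value_iteration(gamma=0.9, tol=1e-8, max_iter=10_000):
    """Iterate the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    for _ in range(max_iter):
        new_values = {s: max(q_value(s, a, values, gamma) for a in ACTIONS)
                      for s in STATES}
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, values, gamma))
              for s in STATES}
    return values, policy
```

Calling `value_iteration()` returns the state values and a greedy policy that maps each institutional state to an ESG action; with real data, the transition table would be estimated rather than hand-written.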
Abstract. The dimension design of components in reinforced concrete frame structures heavily relies on engineering experience and iterative calculations, leading to significant inefficiencies. Existing intelligent design methods struggle to conduct component size design because it is challenging to accurately and densely represent information such as component layout, dimensions, and design conditions. This study proposes a method for intelligent component size design based on feature space for accurate and dense representation of design information, as well as a diffusion design process constrained by multi-channel masks. Firstly, the method substitutes the feature space for the traditional RGB space to represent component layout, dimensions, and design conditions, thereby enhancing data representation and neural network learning capabilities. Secondly, the study introduces an image-guided diffusion model with multi-channel mask tensors, and the corresponding training method is derived. Experimental results demonstrate that this model exhibits strong feature extraction capabilities and performs well in component dimension design tasks. Lastly, the study discusses the impact of parameters such as multi-channel masks and different dataset construction methods on the final prediction results.
Abstract. Constructability-based optimization design of reinforcing bar (rebar) in concrete structures (RC) has been attracting attention in recent years when aiming towards industrialized sustainable construction. This paradigm enables it to more effectively link the practicality of reinforced concrete designs and their associated material usage and construction cost. The problem itself is multi-objective (MO), and the development of effective optimization algorithmic frameworks to approach its solution is essential. For this purpose, Artificial Intelligence (AI) based optimization with enhanced Meta-heuristic algorithms (MA) has demonstrated to be the key to reduce the computational demand. Particularly for beams, the deployment of Graph Neural Networks (GNN) has proven to be of the most effective AI-based optimization approaches. Nonetheless, its application for these elements has been limited, so far, to single-objective (SO) optimization and not for MO optimization, which entails further considerations to effectively reach optimal Pareto Fronts (PF) in a time-efficient manner. Additionally, the lack of constructability metrics, at this point, for rebar design in RC structures, in the literature, is still evident. Even though some efforts have been made in the last years, for some types of elements, there is still a gap when it comes to elaborate and flexible constructability models that may be used in general, for any project at hand.
This work presents a novel MO optimization framework based on GNN-enhanced metaheuristic algorithms (MA) for rebar design in multi-span beams. To this end, a constructability score (CS) model based on rebar cuts and labor assembly complexity is proposed. The Non-dominated Sorting Genetic Algorithm II (NSGA-II) is the algorithm selected for enhancement. The performance of the non-enhanced and GNN-enhanced variants is analyzed and compared in terms of convergence and time efficiency.
Abstract. Building Information Modeling (BIM) is being increasingly used in the maintenance of buildings. Since 3D models do not exist for old buildings, it is common to create BIM from 3D scanned point clouds. To date, it has become possible to construct simple BIM consisting of major components such as floors, walls, ceilings, and columns almost automatically. However, automatic construction of detailed BIM including building equipment (lighting, air conditioning, fire alarm, etc.) necessary for building maintenance has not yet been achieved. These are often attached to ceilings and walls, and are difficult to recognize because of their small surface area and thickness. In this paper, we propose a method to detect building equipment using laser reflection intensity to automatically construct detailed BIM from point clouds of buildings by mobile laser scanners (MLS). In this method, first, the effects of distance and incident angle included in the reflection intensity are eliminated based on polynomial approximation, and the reflection intensity value of ceilings and walls made of the same material is approached to a constant value. Next, since the corrected intensity follows a normal distribution, a set of points that deviate from the normal distribution is extracted as building equipment candidates by thresholding. Finally, the point cloud is converted into an image representation, and each equipment is extracted using morphological and labeling process. Through various experiments targeting the ceilings and a wall of buildings, the proposed method achieved a high detection rate.
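The thresholding step described above can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be reflection intensity that has already been corrected for distance and incident angle, the function name is hypothetical, and the cut-off of `k` standard deviations is an assumed parameter, since the abstract does not give the threshold used.

```python
from statistics import mean, pstdev


def equipment_candidates(corrected_intensity, k=3.0):
    """Return indices of points whose corrected intensity deviates from the
    sample mean by more than k standard deviations.

    ASSUMPTION: `corrected_intensity` already has distance and incident-angle
    effects removed; `k` is a hypothetical threshold, not the paper's value.
    """
    mu = mean(corrected_intensity)
    sigma = pstdev(corrected_intensity)
    if sigma == 0:
        # All values identical: nothing deviates from the surface material.
        return []
    return [i for i, value in enumerate(corrected_intensity)
            if abs(value - mu) > k * sigma]
```

Because strong outliers inflate the standard deviation, a practical implementation might estimate the mean and spread robustly (for example from the median); that refinement is left out of this sketch.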
Abstract. With the increasing complexity of construction projects, the efficient installation and management of Mechanical, Electrical, and Plumbing (MEP) equipment has become particularly important. Currently, the management of complex MEP equipment faces challenges of multi-disciplinary collaboration, errors, and collisions during installation, as well as inefficient progress management and problem feedback. Therefore, this paper proposes a BIM and Augmented Reality (AR) driven collaborative management method for the installation of complex MEP equipment, including intelligent scene localization, accurate matching of model information, and collaboration of multi-terminal devices to reduce errors and improve efficiency. In this method, the physical built structural scene is firstly collected and accurately aligned with the MEP BIM model, and then the MEP component data is processed and matched using Dynamo and Unity. The human-computer interaction functions of installation guidance and review, information display and filtering, progress simulation, and data management are then realized for the complex MEP equipment. Additionally, the collaborative application and data management of multi-terminal devices utilizing AR headsets and handheld devices significantly improves the efficiency of communication. Through the experiment in a public building project under construction in Shenzhen, the feasibility and application effect of the method are verified, which provides strong support for the intelligent construction and collaborative management of complex MEP equipment, and has a wide range of application prospects.
Abstract. Urban development has driven the widespread proliferation of high-rise buildings, making curtain wall construction a critical aspect of progress management. However, the extensive surface areas of curtain walls and the severe perspective distortions present significant challenges to efficient construction progress monitoring. To address these challenges, a novel method is proposed for tracking curtain wall installation progress in high-rise buildings through integrated 3D reconstruction. Cameras are strategically mounted at the ends of adjacent tower crane booms, with multiple ground control points (GCPs) deployed on-site as hardware anchors. The process begins with capturing multi-view images using the crane-mounted cameras. The COLMAP 3D reconstruction pipeline is enhanced by incorporating GCPs to ensure accurate 3D reconstruction within a real-world coordinate system. The building structure is subsequently extracted from the site model, and rectified facade images are generated using projection techniques. The curtain wall installation progress is assessed at both the floor and overall building levels using the YOLOv8 image segmentation model. The proposed method was validated through a case study on a super high-rise construction site. This approach achieved decimeter-level accuracy in 3D reconstruction and a precision rate of 95.9% in curtain wall progress identification, meeting project requirements. These findings establish a robust framework for managing large-scale outdoor construction progress, particularly for high-rise curtain walls. Additionally, the site modeling methodology enables more refined and timely monitoring practices, offering significant potential for the development of digital twin models.
Abstract. As the demand for digital delivery of construction projects in the Architectural Engineering and Construction (AEC) industry continues to increase, the importance of worksite inspection and supervision is emphasized. Digital twin modeling of construction processes can reflect real-time site conditions, aiding refined management and project delivery. This paper explores the task of 3D layout reconstruction of interior construction sites through inspection by employing a portable 360-degree panoramic camera. The method uses visual simultaneous localization and Mapping (vSLAM) technology to precisely estimate camera poses during inspections, generating a motion trajectory and selecting key panoramic frames through an optimal capture point searching algorithm. Before reconstruction, the system integrates Inertial Measurement Unit (IMU) data to determine positional relationships between panoramic camera viewpoints, aligning multiple panoramic images into a unified coordinate system for accurate spatial reconstruction. Three-dimensional indoor layouts are reconstructed from panoramic images using a deep learning-based algorithm to automatically detect vertices through panoramic geometry calculations from a single panorama. An experiment with the existing floor plan is conducted to demonstrate the validity of the proposed method. This research introduces a novel approach that enhances the real-time capabilities and automation of spatial layout modeling for construction sites, laying the groundwork for intelligent inspections and holding significant engineering potential for rapid spatial layout recovery and object space mapping in future applications.
Abstract. This paper presents an RGB-D SLAM construction dataset collected with a mounted platform and designed to capture the unique challenges encountered on construction sites. An Ouster OS0-128 LiDAR serves as the sensor for LiDAR SLAM, which provides the ground truth for localization. Our dataset records various construction settings with building materials and structures at different stages, such as concrete, brick, plaster, and putty, providing a comprehensive benchmark for training and evaluating SLAM algorithms. By testing current SLAM algorithms, we demonstrate the limitations of traditional approaches in these environments and provide a VINS-based algorithm as the benchmark. This dataset serves as a valuable resource for researchers aiming to enhance SLAM performance in real construction environments. Detailed information on the dataset is available at https://github.com/WenyuLWY/HCIC-Construction-VSLAM-Dataset.git
Abstract. Semantic segmentation of point clouds with deep learning (DL) relies heavily on large training datasets. However, datasets for Mechanical, Electrical, and Plumbing (MEP) scenes are scarce. To address this gap, this study proposes the ray-based laser scanning and intersection algorithm (RBLSIA), a method for automatically generating synthetic MEP point clouds from Building Information Modeling (BIM) models. Based on RBLSIA, this study conducted a total of 25 groups of comparative experiments, investigating semantic segmentation performance on MEP scenes with different training datasets and different generation approaches for synthetic point clouds. The results show that: 1) the mean Intersection over Union (mIoU) with synthetic point clouds produced by the RBLSIA method is on average 3.32% higher than that with the uniform sampling method; 2) increasing the number of synthetic point cloud samples further improved both the overall accuracy (OA) and mIoU for semantic segmentation, even surpassing the training accuracy achieved with real point clouds.
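For readers unfamiliar with the two metrics reported above, here is a small sketch of how overall accuracy and mIoU are commonly computed from per-point labels. The function names and the choice to skip classes absent from both ground truth and prediction are assumptions of this example, not details taken from the paper.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of points whose predicted label equals the ground truth."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_iou(y_true, y_pred, num_classes):
    """Average of per-class intersection-over-union.

    ASSUMPTION: classes that appear in neither labels nor predictions are
    skipped rather than counted as IoU = 0.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

With labels `[0, 0, 1, 1, 2, 2]` and predictions `[0, 1, 1, 1, 2, 0]`, the per-class IoUs are 1/3, 2/3 and 1/2, so the mIoU is 0.5 while the overall accuracy is 4/6.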
Abstract. 3D scene understanding is revolutionising tunnel engineering. However, deep learning algorithms are data-hungry, which means that applying scene understanding to tunnel engineering requires a customized point cloud dataset from the construction field. In this paper, we introduce a new point cloud dataset called HTunnel-HLS, specifically designed for highway tunnel construction environments. HTunnel-HLS aims to establish a new database for developing semantic segmentation and, importantly, for understanding highway tunnel construction scenes. The dataset provides point-level semantic labelling along with a wide range of semantic instance labels categorized into support structures, mechanical facilities, and others. The data were acquired with the Hovermap Hand Laser Scanning (HLS) system and contain 28 scenes and over 1.58 billion 3D points, corresponding to a 9 km long tunnel section. This paper also reports the performance of several representative baseline methods, analyzes the impact of scale on model performance from the perspective of grid size, and outlines potential future work and challenges for fully exploiting this dataset.
Abstract. Since global energy consumption is a critical issue and the building sector plays a significant role in high energy demand, renovating existing buildings is crucial. Building Information Modelling (BIM) represents the physical and functional characteristics of a building or structure with detailed information. It allows all the architecture, engineering, and construction (AEC) industry stakeholders to work collaboratively. As-built BIM models reflect the modifications of existing conditions of buildings and fully automated generation of as-built BIM models remains a major challenge. The Scan-to-BIM process is widely used for the renovation and documentation of existing buildings. This process includes capturing the physical conditions of a building or structure using 3D laser scanning technology and converting it into a BIM model. Originally, 3D laser scanners had a very high cost, however, free 3D scanning applications that use Light Detection and Ranging (LiDAR) technology can be easily compatible with mobile phones or tablets nowadays. This paper proposes to contribute to renovating existing buildings by developing a new approach to scan-to-BIM combining surface reconstruction from point cloud data and object detection with deep learning. A 3D free scanning application generated the point cloud data of the interior room of the building and the surface model was reconstructed. Moreover, the location of the air terminals on the ceiling was investigated by developing the object detection model with deep learning and installing the air terminals on the surface reconstructed model. The finalized surface model with the air terminals was exported to Industry Foundation Classes (IFC) format. The reconstructed IFC model developed in this research can be used for Computational Fluid Dynamics (CFD) analysis with appropriate property sets.
Abstract. This paper presents a virtual scanning method based on Building Information Model (BIM) that can simulate the real scanning process and occlusion conditions. The method converts the BIM geometry into a triangular mesh and a point cloud. Multilevel data encoding is performed for the triangular mesh and point cloud, and an octree is built from the point cloud to accelerate queries. Two-stage intersection tests (ray-octree leaf node and ray-triangle intersection tests) are organized to minimize computational cost. The scanning process of the scanner is parameterized, and each emitted ray returns the nearest point along it together with a component label. The virtual scanning method is validated on a steel structure with over 800,000 triangular mesh faces. The calculations for over 100 million rays are completed within 10 minutes with serial computation. The visibility ratio of every component is obtained by comparing the number of points in the virtually scanned component with that in the complete component at the same point cloud spacing. Furthermore, the method is validated by comparing the consistency between the virtual scan cross-sections and real scan cross-sections. The real component axis extracted by registering virtual scan cross-sections to real scan cross-sections demonstrates better stability.
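The ray-triangle stage of such a virtual scanner can be sketched with the Möller-Trumbore test, a standard algorithm chosen here for illustration; the abstract does not say which intersection test the authors use. The octree acceleration stage is omitted, and the function names and the `(label, v0, v1, v2)` triangle layout are hypothetical.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_distance(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: distance t along the ray to the triangle, or None."""
    e1, e2 = _sub(v1, v0), _sub(v2, v0)
    h = _cross(direction, e2)
    det = _dot(e1, h)
    if abs(det) < eps:           # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = _sub(origin, v0)
    u = inv * _dot(s, h)         # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = _cross(s, e1)
    v = inv * _dot(direction, q)  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = inv * _dot(e2, q)
    return t if t > eps else None  # ignore hits behind the ray origin


def nearest_hit(origin, direction, triangles):
    """Return (distance, label) of the closest triangle hit by the ray, or None.

    ASSUMPTION: `triangles` is a list of (label, v0, v1, v2) tuples.
    """
    best = None
    for label, v0, v1, v2 in triangles:
        t = ray_triangle_distance(origin, direction, v0, v1, v2)
        if t is not None and (best is None or t < best[0]):
            best = (t, label)
    return best
```

Keeping only the nearest hit per ray is what reproduces occlusion: a component hidden behind another one never returns a point. In a full pipeline the octree would first narrow `triangles` down to those in leaf nodes the ray actually crosses.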
Abstract. Robots offer a promising solution to relieve workers from physically demanding tasks and improve safety and productivity in construction. It is critical that the robots on construction sites are coordinated effectively. However, most multi-robot coordination algorithms are designed for planar areas, neglecting the multi-story nature of building construction sites. It is still unclear how construction robots should be coordinated given the constraints of elevators while adhering to construction schedules. To fill the gap, this paper introduced the deployment of commonly used elevator algorithms and robot target allocation strategies in a multi-story construction simulation environment. Through a series of group experiments conducted in a simulated multi-story construction environment, we evaluated the performance of these algorithms and examined the characteristics of robot-elevator coordination. The results reveal that while existing algorithms with strong generalization capabilities are useful, they may be less effective in specialized scenarios like multi-story construction. This research contributes valuable insights into the future of automation in construction, paving the way for enhanced integration of robotic systems and elevator operations.
Abstract. Robotics is expected to enhance productivity and safety in the construction industry, but the real-world application remains limited. Introducing robotics in construction may require humans and robots to work together for the same tasks or in close proximity. While significant attention has been paid to organizational-level robot adoption, little exploration has been done from the perspective of construction workers. This paper aims to provide a comprehensive understanding of the many factors that may influence workers’ attitudinal acceptance of robotics in construction. A case study including observations and interviews with 40 construction workers of a project in the Guangdong-Hong Kong-Macao Greater Bay Area was conducted, coupled with semi-structured interviews with 13 site managers. Various factors influencing workers’ acceptance were identified, including individual differences of workers, technological performance, and output quality of robots. Additionally, external factors including organizational support and social influences can affect workers’ attitudes toward robots. The findings reveal that most workers will passively accept construction robots when their organization mandates their utilization, although changes in income remain a major concern. Strategies are recommended for future research and practice of robots for various stakeholders, such as guaranteeing workers’ income, strategizing practice-based technology, improving multi-level robot interface management, and enhancing government support. This study should encourage different stakeholders to design and adopt construction robots guided by human-centered design principles.
Abstract. In the context of “machine substitution”, advanced automation technologies, such as construction robots, are anticipated to mitigate future labor shortages in the construction industry. However, the feasibility and urgency of automating various construction tasks differ significantly, indicating that not all tasks are suitable for automation. Given limited investment resources, it is essential to prioritize which construction tasks should be automated first. This study focuses on construction task attributes and proposes a method for assessing the automation potential of construction tasks. By comparing and analyzing multiple common construction tasks through the dimensions of safety and technology, we generate a heatmap that illustrates the automation potential of these tasks. The results of this study provide valuable insights for investment decisions in construction robots and the strategic allocation of construction tasks between human workers and robots.
Abstract. Off-site construction has become widely acknowledged for its advantages, such as saving time, enabling faster assembly, and being cost-efficient. The sector's rapid growth has driven the demand for more advanced and effective methods of construction scheduling. Construction scheduling is inherently complicated due to the numerous constraints it involves, including those connected to workforce and resource availability. Conventional approaches, like the Critical Path Method (CPM), fail to account for multiple constraints, which limits their effectiveness in practical project scenarios. This research presents a simulation-based Genetic Algorithm (S-GA) approach to develop optimal construction schedules while accounting for constraints in labour and resources. The objective of the proposed method is to reduce the total project duration. The proposed S-GA framework enhances the ability to manage scheduling across all construction phases. A real-world case study of a prefabricated bridge with 6 spans was conducted to assess the method. For comparison, traditional methods and an evolutionary algorithm (EA) were adopted, and the findings indicated that S-GA not only produced superior construction schedules but also required less computational time. As a result, the proposed approach offers an advanced scheduling method that is applicable to real-world construction projects.
Abstract. The digital transformation in the Architecture, Engineering, and Construction (AEC) sector underscores the growing need for efficient data transmission, especially in computer vision tasks that depend on the transfer of large volumes of images. In this work, a novel method is introduced to enhance data transmission efficiency in an edge-cloud coordinated architecture using Learned Image Compression (LIC). By integrating the LIC model with multiple downstream task models (Mask R-CNN and Faster R-CNN), the proposed framework aligns their respective latent features, resulting in a task-oriented LIC model that optimises compression for specific tasks. The approach increases the proportion of task-relevant information—referred to as information density—in the transmitted bitstream. Experimental results demonstrate that this method significantly reduces data transmission load while concentrating the transmitted bits on regions essential for downstream tasks, all without a notable decrease in task accuracy.
Abstract. In onsite construction, there are numerous delicate tasks that require skilled workers to perform. A key challenge in automating these tasks is developing motor skills in robots trained through reinforcement learning (RL), as manipulating irregular and delicate objects like hammers, scaffolding, and drills remains difficult. To address this issue, this paper proposes an RL-based approach for performing delicate tasks using a robotic arm with grippers. We present a simulation-based policy learning framework utilizing the Critic-Actor algorithm in PyBullet to control the robotic arm. In experimental trials, the learned policy was used to grasp six different types of construction tools, and the results demonstrated the feasibility of training with randomly shaped objects to manipulate the construction tools with a reasonable success rate. This method provides a foundation for enhancing the manipulative skills of construction robots, potentially reducing labor costs in the industry.
Abstract. With the increase of modular integrated construction (MiC) projects, the planning of tower crane layout (TCLP) becomes vitally essential to achieve a balance among multiple goals, such as efficiency, economy, and safety. However, existing TCLP studies are usually formed as single-objective optimization based on total lifting time for conventional construction sites. There should be trade-offs among multiple goals, especially safety. In addition, the heavier components of MiC, requiring cranes with larger lifting capacity, pose a challenge in terms of cost. Therefore, it is necessary to propose a more general and reasonable model to assist managers in making better decisions for TCLP. This study aims to develop a multi-objective optimization model with efficiency, cost, and lifting safety considerations for MiC projects. Firstly, based on literature research and accident statistics, the total transportation time, total cost of tower cranes, and total lifting moment are chosen as the three optimization objectives. Then, the improved three-objective optimization model, considering module positioning time and separate movement features of tower cranes, is proposed. To solve the proposed multi-objective problem, the evolutionary algorithm NSGA-III is used for solutions. The proposed model can provide a series of trade-off solutions for efficiency, cost, and safety, representing different combinations of crane location, supply point location, and orientation. A MiC project in Hong Kong is studied as a case to verify the feasibility and effectiveness of the proposed model. The results show that the proposed model can determine the optimized layout plan with minimum time, cost, and lifting moment by locating the tower crane point, supply point, and supply point orientation. Disregarding the orientation of the supply point would result in an additional 18.2% transportation time, leading to increased costs. 
Compared to the original layout scheme, the developed model can save up to 41.7% in transportation time and improve safety by 27.4%.
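The "series of trade-off solutions" mentioned in this abstract is a Pareto front. The sketch below shows only the non-dominated filtering idea for three minimized objectives (transportation time, cost, lifting moment); it is not NSGA-III itself, and the candidate values in the usage note are hypothetical.

```python
def dominates(a, b):
    """True if layout `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]
```

For example, with hypothetical `(time, cost, moment)` tuples `(10, 5, 7)`, `(8, 6, 7)`, `(12, 6, 8)` and `(9, 9, 3)`, the third layout is dropped because the first is at least as good in every objective, while the remaining three are genuine trade-offs left for the manager to choose between.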
Abstract. In recent years, the construction industry has been expanding its production scale, however the industry is facing problems such as low productivity and frequent safety accidents. With the potential to improve construction quality and safety, construction robots have become an important means of solving the industry's problems. However, given that construction robots are built for specific operating environments, they are not yet able to fully replace manual labor, and thus human-robot collaboration is bound to be the mainstay of construction for quite some time to come. Therefore, this paper focuses on the problem of human-robot collaboration on construction sites, and innovatively proposes a task allocation framework based on the degree of danger. Taking concrete floor slab construction as an example, this paper firstly identifies the dangers in traditional construction scenarios and robotic construction scenarios, and then quantitatively evaluates each danger using the likelihood, exposure, and consequence level evaluation method. Further, this paper designs a task assignment framework that aims to minimize the degree of danger while taking into account the cost-effectiveness. The framework is intended to assist the construction team in making scientific decisions to accurately select the appropriate robot type for introduction and ensure that the overall cost is maintained within an economically feasible interval.
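A danger-based allocation of this kind can be sketched as below, assuming the common form of the likelihood-exposure-consequence method in which the danger score is the product D = L × E × C. The task names, scores, costs, budget, and the exhaustive search are all hypothetical illustrations and are not taken from the paper.

```python
from itertools import product


def danger_score(likelihood, exposure, consequence):
    """Danger score as the product D = L * E * C (assumed LEC form)."""
    return likelihood * exposure * consequence


# ASSUMPTION: hypothetical tasks; each option maps to (L, E, C, cost).
OPTIONS = {
    "pouring": {"manual": (3, 6, 7, 10), "robot": (1, 2, 7, 25)},
    "leveling": {"manual": (3, 6, 3, 8), "robot": (1, 3, 3, 20)},
    "troweling": {"manual": (1, 6, 3, 8), "robot": (1, 2, 3, 30)},
}


def allocate(options, budget):
    """Exhaustively pick one option per task, minimizing total danger while
    keeping total cost within `budget`. Returns (danger, cost, assignment),
    or None if no assignment fits the budget."""
    tasks = list(options)
    best = None
    for choice in product(*(options[t] for t in tasks)):
        cost = sum(options[t][c][3] for t, c in zip(tasks, choice))
        if cost > budget:
            continue
        danger = sum(danger_score(*options[t][c][:3])
                     for t, c in zip(tasks, choice))
        if best is None or danger < best[0]:
            best = (danger, cost, dict(zip(tasks, choice)))
    return best
```

With a budget of 55, the search assigns robots to the two most dangerous tasks and leaves troweling manual, since automating it buys little danger reduction for its cost. Exhaustive search is only practical for a handful of tasks; larger problems would need an integer-programming or heuristic solver.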
Abstract. The construction sector is a major contributor to global CO2 emissions and energy consumption. 3D concrete printing (3DCP) provides sustainable solutions to tackle these environmental challenges. However, the prolonged, continuous operation of robotic arm printers in 3DCP poses critical challenges for energy efficiency. To address these challenges, this study aims to develop an energy consumption (EC) model of a robotic arm in 3DCP. The proposed EC model agrees well with the experimental results, achieving an accuracy of 99.51%. The impact of the proposed EC model is evaluated by printing a pre-designed path at various positions. Results reveal that EC can be reduced by up to 53.72% by varying the position. The findings indicate that the proposed EC model has the potential to reduce EC and improve energy efficiency in 3DCP.
Abstract. Offsite construction (OSC) requires an integrated design and delivery system. Reusing prior OSC knowledge is paramount to the success of new OSC projects. However, a gap remains in managing and reusing this knowledge across the broader industry. The challenge lies in the fragmented nature of project-based organisations with isolated knowledge systems, which often lack integration and advanced capabilities for knowledge-based collaboration. This paper proposes a novel framework that leverages cutting-edge knowledge-based methods and artificial intelligence (AI) technologies to create a container-based knowledge system (CBKS) that can enhance knowledge capture and reuse towards collaborative OSC. Specifically, the semantic web stack is adopted to construct multimodal knowledge containers for both human users and AI agents. In addition, GPT-4o, a large language model (LLM), is embedded into the knowledge system for better knowledge querying, matching and retrieving. By using this framework, constructed modular knowledge units can integrate product, process and organisation factors to address specific OSC problems. To evaluate the technical feasibility of the proposed framework, a prototype is developed and illustrated through a modular connection design. This illustrative case study demonstrates how knowledge is captured for product representation under manufacturing and assembly constraints, enabling its reuse in different projects. Moreover, the usefulness of GPT-4o for enhancing this process is also tested.
Abstract. Risk management is crucial for construction safety, but safety risk assessment often relies on experts' knowledge, which makes automatic risk management in engineering projects still a big challenge. Fortunately, for large-scale infrastructure construction, on-site inspection is required, and the conditions on-site are recorded in text format, which provides an opportunity to learn risk information from inspection reports. To improve document processing efficiency, automatic text classification plays an important role. However, currently, automatic text classification requires large scale training datasets. It is a big challenge for the engineering industry, especially for the fields which heavily rely on the experts’ knowledge, such as risk assessment. Limited data sources, high time and labor costs make it not practical to establish a large-scale dataset. This work proposes a BERT-based ensemble model for small-sample text classification, leveraging the Focal loss function to address data imbalance issues. Concurrently, an ensemble strategy is employed to enhance the model's generalization capabilities, while the learning rate gradient descent method is applied to mitigate the risk of model overfitting. The efficacy of the proposed framework is validated through a four-classification task about identifying risk levels based on the inspection reports of a metro construction project. The BERT-based ensemble model proposed in this paper achieves an accuracy of 96.24% on the test set, surpassing other pre-trained classification models and excelling in automated text classification tasks.
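To make the role of the Focal loss concrete, here is a minimal single-sample sketch of its standard multi-class form, FL = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the softmax probability of the true class. The default `gamma` and `alpha` values are assumptions for illustration; the abstract does not report the settings used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample.

    ASSUMPTION: gamma and alpha defaults are illustrative, not the paper's.
    With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

The `(1 - p_t) ** gamma` factor shrinks the loss of samples the model already classifies confidently, so training gradients are dominated by hard and minority-class samples, which is why the loss is used against class imbalance.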
Abstract. The accurate and efficient calculation of floor areas is a critical aspect of building design processes, yet it remains a challenging task due to the intricate rules and regulations that must be adhered to. Traditional methods, which rely on manual calculations, are not only time-consuming but also prone to human error, highlighting the need for an automated solution. Building Information Modeling (BIM) can support the automated solution by providing digital representations of building designs. The openBIM workflow enables collaborative work among various stakeholders, allowing them to share and integrate project information seamlessly without being locked into proprietary systems. While Industry Foundation Classes (IFC) files serve as a robust digital representation of BIM, facilitating data exchange across different software platforms, the complexity associated with processing these files poses a significant barrier to ordinary users. To address these gaps, we propose an innovative semi-automated floor area calculation system that leverages a state-of-the-art Large Language Model (LLM) agent. This system is designed to efficiently and accurately extract relevant information from IFC files by applying specific filtering conditions and then perform the necessary calculations to determine floor areas according to the required standards. Because LLMs interact through natural language, our approach significantly reduces the technical barriers faced by non-expert users, making the process of floor area calculation more accessible and user-friendly. We validate the proposed method through a simplified floor area calculation for an academic building, offering a promising solution to streamline BIM workflows and enhance productivity in the architecture, engineering, and construction (AEC) industries.
Abstract. Heritage buildings face challenges in documentation due to inconsistent records and complex data from historical documents, archaeological surveys, and materials. Traditionally, converting unstructured data into structured formats required significant expert effort. The advent of large language models (LLMs) has transformed heritage research by enabling the creation and maintenance of knowledge graphs. These graphs integrate diverse data sources, facilitating the preservation and study of heritage buildings. LLMs help extract and organize unstructured data, improving knowledge graph accuracy and consistency. This research proposes a comprehensive framework that integrates multimodal data, including text, images, and videos, into a unified knowledge graph. The framework employs LLMs for extracting information from textual data, the CLIP model for aligning images with corresponding text, and keyword searches for processing video content. The resulting knowledge graph is stored in a Neo4j graph database, providing an interactive platform for users to query and explore detailed information about heritage buildings. This approach not only supports academic research but also contributes to practical applications in cultural heritage conservation, enabling more efficient access to valuable information and enhancing preservation efforts. The proposed method was validated in European 'Gothic' and 'Gothic Revival' architecture by comparing the relationships between components.
Abstract. Using an Unmanned Aerial Vehicle (UAV) in bridge inspections can reduce human involvement in complex and hazardous inspection environments and automate the inspection process. Current practices require human operators to define task objectives, oversee safe flight operations, and evaluate bridge conditions. There is a growing demand for improving the seamless collaboration between UAVs and human inspectors to complete the inspection task efficiently and more safely, especially in post-disaster scenarios where critical bridges and other infrastructure facilities need to be inspected within hours or days. A significant gap exists in enabling UAVs to intelligently perceive and understand the bridge inspection scene according to human instructions. An intuitive human-UAV collaboration system using a multi-modal Vision Language Model (VLM) was proposed to partially fill this gap. This system leverages a few-shot Contrastive Language–Image Pretraining (CLIP)-based model to enable UAVs to visually and semantically understand the bridge inspection environment based on human commands. By incorporating text prompt learning with a cache adapter, the proposed model enhances the ability of CLIP to interpret both textual and visual inputs in the context of bridge inspection. The model was trained and evaluated in a bridge inspection image dataset and achieved an accuracy of 83.33%, outperforming other few-shot image classification methods, demonstrating its effectiveness in the bridge inspection domain. This approach is expected to improve collaboration between AI-empowered UAVs, inspectors, and bridge environments, thereby enhancing the overall efficiency of bridge inspections.
Abstract. Spatial relationships between BIM elements are crucial for various BIM-based analyses (e.g., identifying external building envelope components for energy simulation). However, current BIM information retrieval research mainly focuses on extracting attributes from BIM elements, and a solution for obtaining implicit spatial information from BIM models is needed. This research therefore focuses on how to accurately and effectively obtain spatial relationships between BIM elements and reliably answer spatial BIM queries. To address this issue, we propose an IFC-based spatial relation calculation and query-answering framework, summarized as follows: (1) extract the geometric information of the IFC entities with openBIM standards and obtain the triangulated boundary data of each entity; (2) generate an AABB tree of the entities from the triangulated boundary data to index the entities and improve search efficiency in spatial calculation; (3) use a bSDD-aided LLM workflow to align natural-language queries with the corresponding IFC entities and answer spatial relationship queries. We verify the proposed method on several building cases. The results indicate that the method can accurately understand natural-language queries (92.1% correct rate on query understanding tasks) and efficiently determine elements’ spatial relationships (saving 61.4% of average query time) to answer the original query. With our query method, users with minimal BIM experience (e.g., construction site workers) can still query spatial relationships in a user-friendly way, improving the applicability of BIM.
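Step (2) of the abstract above relies on axis-aligned bounding boxes (AABBs) as a cheap filter before any exact geometric test. The sketch below is a simplified illustration only: it computes a box from triangulated vertices and tests two boxes for overlap, and it does not build the tree index the authors describe. The function names, the tolerance parameter, and the sample coordinates are hypothetical.

```python
def aabb(vertices):
    """Axis-aligned bounding box of a list of (x, y, z) vertices."""
    xs, ys, zs = zip(*vertices)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

def aabbs_overlap(a, b, tol=0.0):
    """True if two boxes overlap or touch on every axis (within tol)."""
    (a_min, a_max), (b_min, b_max) = a, b
    return all(
        a_min[i] <= b_max[i] + tol and b_min[i] <= a_max[i] + tol
        for i in range(3)
    )

# Hypothetical elements: a wall, a slab resting on top of it, a distant column.
wall = aabb([(0, 0, 0), (4, 0, 0), (4, 0.2, 3), (0, 0.2, 3)])
slab = aabb([(0, 0, 3), (4, 5, 3), (4, 5, 3.3), (0, 0, 3.3)])
column = aabb([(10, 10, 0), (10.4, 10.4, 3)])

wall_touches_slab = aabbs_overlap(wall, slab)
wall_touches_column = aabbs_overlap(wall, column)
```

Only pairs whose boxes overlap need the more expensive triangle-level check; a tree over these boxes avoids comparing every pair of elements.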
Abstract. A realistic and informative 3D digital model of historical buildings holds significant value for heritage preservation, public education, and cultural dissemination. Traditional digital representations, such as Heritage Building Information Modeling, panoramic images, LiDAR point clouds, and photogrammetric mesh models, face limitations in user interaction and engagement. The automatic generation of a semantically enriched 3D model requires advanced scene-understanding capabilities. Pre-trained zero-shot methods struggle with domain-specific knowledge in heritage component semantics, while CNN-based approaches demand extensive manual effort for dataset preparation and model training. Therefore, this study proposes an optimized language-embedded 3DGS framework for the digitalization of historical buildings. It involves three steps: (1) data preparation of on-site images and relevant text; (2) component segmentation by the integration of SAM and MLLM; (3) scene reconstruction using the language-embedded 3DGS. The combination of SAM's localization ability and MLLM's in-context learning achieves 95.6% accuracy in the semantic segmentation of historical building components, requiring only a single annotated sample for each component category. Compared with previous methods, our language-embedded 3DGS model accurately captures complex semantics while providing realistic appearance and convenient navigation. The generated 3D model can be further integrated with an LLM-based chatbot assistant to achieve open-vocabulary and vague searches. This framework was validated on the Shishi Sacred Heart Cathedral in Guangzhou, China, offering a novel digital solution for the protection and sustainment of historical buildings.
Abstract. As sustainable development is gaining more and more attention, the construction industry continues to explore this aspect. Both Sustainable Development Goals (SDGs) and Environmental, Social and Governance (ESG) provide management goals for corporate sustainable development and assessment. However, due to the complexity of construction events and multiple data sources, sustainable development management in the construction industry is still hindered by the need for a large amount of labor costs. Therefore, this paper proposes an LLM-based sustainable development data processing framework for construction, which achieves three goals: (1) identifying indicators of SDGs and ESG assessment frameworks for construction projects, (2) mapping sustainable development indicators to construction events and data, and (3) developing an LLM-based localized data processing framework for construction sustainability. The proposed method can achieve rapid data processing of construction projects and provide information and information sources related to sustainable development goals. It realizes automated report generation or correlation traceability of sustainable development in construction projects.
Abstract. Compliance checking of Building Information Modeling (BIM) models is a critical process throughout the construction lifecycle, particularly during the design phase. Building design often involves the integration of multiple disciplines and complex spatial relationships, leading to errors. The growing volume and complexity of information embedded in BIM models have further complicated compliance checking. Traditional manual methods are not only time-consuming but also prone to mistakes. To address these challenges, this study proposes an integrated conceptual framework for automated BIM compliance checking, leveraging knowledge graph (KG) and machine learning. The framework aims to convert unstructured clauses in Chinese building standards into structured, interpretable, and extractable data, enabling the automatic detection of design errors in BIM models. The framework incorporates several key components. First, it constructs a knowledge graph by developing ontologies for Chinese building standards and training semantic role annotation models. Second, a data extraction pipeline is designed using the Dynamo module in Revit to retrieve relevant information from BIM models. Finally, compliance checking logic is defined using Java to establish rules for matching the extracted building standard knowledge with BIM model information. The feasibility of this automated compliance-checking framework was validated using BIM models from two real-world projects, demonstrating its potential to streamline the compliance process and reduce errors in building design.
Abstract. Although building performance simulation using physical models is frequently utilized for performance prediction, its significant computational demands pose challenges to its implementation in the early design stage. Surrogate models have been proposed to replicate computationally expensive physics-based simulation models, but existing surrogate models for sustainable residential block design are limited in scope, focusing on specific cases. Graph neural network (GNN) could be a solution to enhance the generality of the surrogate models for residential block design. However, the optimal architectures of the surrogate model and the time costs compared with physics-based simulation models have not been discussed yet. To fill these gaps, this study explores the development of GNN-based surrogate models for multi-objective sustainable performance predictions of residential blocks. Firstly, we introduce a graph schema to represent the general geometric features and relations, and a regional dataset for training and testing of the surrogate models. Secondly, we propose two kinds of architectures (individual architectures for specific indicators and an integrative architecture) for the surrogate models. Thirdly, we train and optimize the models utilizing the graph schema, regional dataset and architectures. Finally, the optimized surrogate models are evaluated in two aspects: 1) the optimized models using the individual architectures for specific indicators and the ones using the integrative architecture are compared in terms of prediction accuracy and time costs; and 2) the time costs of the optimized model are analyzed by comparing with physics-based simulations. The results showed that surrogate models based on individual architectures outperform the model using the integrative architecture in terms of prediction accuracy and time costs for all sustainable performance indicators. 
Although the model preparation time of the surrogate models exceeds that of the physics-based simulations, the surrogate models reduce the calculation time from 6.346 min to 1.565 ms per case compared with the physics-based simulations.
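The per-case figures reported above can be put on a common unit to make the size of the gain explicit. The sketch below converts 6.346 min to milliseconds and computes the speedup factor. Because the abstract notes that the surrogate needs more preparation time but gives no figure for it, `break_even_cases` takes that extra preparation time as a hypothetical input and returns how many evaluated cases are needed before the surrogate is cheaper overall.

```python
import math

PHYSICS_MIN_PER_CASE = 6.346    # reported physics-based simulation time per case
SURROGATE_MS_PER_CASE = 1.565   # reported surrogate calculation time per case

physics_ms = PHYSICS_MIN_PER_CASE * 60 * 1000   # minutes -> milliseconds
speedup = physics_ms / SURROGATE_MS_PER_CASE    # roughly 2.4e5 times faster per case

def break_even_cases(extra_preparation_min):
    """Cases needed to recover a given extra preparation time (hypothetical input)."""
    saving_min = PHYSICS_MIN_PER_CASE - SURROGATE_MS_PER_CASE / 60000
    return math.ceil(extra_preparation_min / saving_min)
```

For instance, `break_even_cases(63.46)` asks how many cases would offset a purely illustrative extra hour or so of preparation; the real figure depends on preparation times that the abstract does not state.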
Abstract. Accurately predicting the photovoltaic (PV) potential of urban building facades plays a crucial role in the development of photovoltaics. This study proposes an innovative building facade PV potential prediction method based on the Geospatial Graph Attention Neural Network (GGAT). Compared to traditional methods, this approach considers the differences in solar radiation intensity at various heights of the building facade, enabling more precise identification of areas with higher PV potential on the facade. The study focuses on buildings in the Manhattan area of New York City and employs Rhino software and the Ladybug Tools plugin to conduct building solar radiation simulations, obtaining high-quality training data. During the modeling process, the concept of building height stratification is introduced, dividing the building facade vertically into 10 equal-height layers, with each prediction point representing the average solar radiation intensity within that height range. Experimental results indicate that GNN-based algorithms (especially GGAT) outperform traditional machine learning algorithms in predicting solar radiation on building facades. GGAT integrates geospatial features and graph attention mechanisms, enabling more accurate prediction of solar radiation on building facades. Solar radiation intensity exhibits significant differences both in the vertical direction of the building facade and in the horizontal direction (between census tracts). The stratified modeling method can reveal these differences, providing more comprehensive and detailed information for analyzing the PV potential of building facades.
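The height stratification described above can be illustrated with a small sketch: facade sample points are assigned to one of 10 equal-height layers, and each layer is summarized by its mean radiation. The function name, the input layout of (height, radiation) pairs, and the sample values are assumptions for illustration and are not taken from the authors' pipeline.

```python
def stratify_facade(samples, building_height, n_layers=10):
    """Average radiation per equal-height layer.

    samples: iterable of (height_m, radiation) pairs with 0 <= height_m <= building_height.
    Returns a list of n_layers means; a layer with no samples yields None.
    """
    if building_height <= 0:
        raise ValueError("building_height must be positive")
    sums = [0.0] * n_layers
    counts = [0] * n_layers
    for height, radiation in samples:
        if not 0 <= height <= building_height:
            raise ValueError("sample height outside the facade")
        # The top edge belongs to the last layer rather than a layer of its own.
        layer = min(int(height / building_height * n_layers), n_layers - 1)
        sums[layer] += radiation
        counts[layer] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Hypothetical 100 m facade with four simulated sample points.
layers = stratify_facade([(0.0, 100), (5.0, 200), (95.0, 400), (100.0, 600)], 100.0)
```

Each entry of `layers` then corresponds to one prediction point per facade, in the sense of the "average solar radiation intensity within that height range" mentioned in the abstract.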
Abstract. The integration of recycled coarse aggregates (RCA) into construction projects has encountered industry resistance, primarily attributable to apprehensions about variable quality. This paper underscores the imperative need for reliable material quality assessments of RCA to ensure compliance with industry standards. Addressing this concern, we introduce a novel solution: a mobile, containerized sensor-based quality inspection system. This system is furnished with a 3D scanner Gocator, which, through optimized point cloud processing and streamlined segmentation algorithms, ensures rapid extrapolation of particle size distribution (PSD) from the RCA's surface point cloud data, producing outcomes closely aligned with conventional manual sieving techniques. Additionally, the application of laser-induced breakdown spectroscopy (LIBS) within this system has proven effective, consistently producing stable spectral data indicative of the material composition. The effectiveness of LIBS is further enhanced through the adoption of a cluster-based identification algorithm, which provides exceptional accuracy and precision in the spectral analysis. The system also includes conveyor belts capable of processing more than 100 tons of RCA per hour. This synergistic integration of technologies underpins a paradigm shift in RCA assessment, offering a scalable and adaptable model for enhancing the efficiency and reliability of End-of-Life material processing, aligning with global aspirations for sustainable infrastructural development.
Abstract. With the rapid growth of the economy, the problem of plastic pollution in rivers is becoming increasingly severe, particularly in key river basins such as the Taihu Basin. Plastic pollution not only disrupts aquatic ecosystems but also poses a threat to human health and regional economic development. Therefore, it is imperative to take effective measures to reduce plastic pollution in rivers in order to protect the environment and promote sustainable development. This study proposes an efficient river trash detection method by combining unmanned equipment and deep learning technology. A dataset comprising 1,347 RGB images of river trash, captured under diverse environmental conditions, was developed to offer a wealth of diversity for model training. YOLOv10-N is employed for object detection and an mAP@0.5 of 95% on the dataset is achieved. The research results highlight the potential of applying deep learning techniques in environmental monitoring and providing support for ecological protection. In addition, the contribution of this study's dataset provides valuable resources for future model training, with diverse types of images enhancing the model's generalization capabilities and offering possibilities for more effective litter collection.
Abstract. Air pollution on urban sidewalks poses a huge threat to human health. We conducted an air pollution mobile monitoring campaign to measure PM2.5 levels on sidewalks in the Futian CBD area of Shenzhen, using portable air pollution sensors. The campaign involved collecting data during peak commuting times to capture variations in air quality. Calibration of the AirBeam3 devices against a standard instrument ensured data accuracy. Using GIS spatial analysis, we mapped air pollution patterns and identified notable differences between weekdays and weekends, with concentrations ranging from 3-9 μg/m³. Our findings indicate higher pollution levels on weekdays, particularly in the afternoon, correlating with increased traffic and economic activity. The study also highlights spatio-temporal heterogeneity, with morning pollution concentrations more pronounced in the northern financial district and afternoon levels higher in the southern area, characterized by commercial facilities. These patterns suggest that regional industrial distribution and traffic flow significantly influence air quality. By understanding the dynamics of pollution in urban environments, this research contributes to the development of effective strategies for improving air quality and public health in densely populated city centers.
Abstract. The integration of blockchain and smart contracts in the construction industry has the potential to revolutionize the tender phase and enhance waste management practices. The prototype is designed to enhance transparency, efficiency, and trustworthiness in Italian public procurement. To analyze the national public procurement database with a view to identifying common issues, which are often related to the lack of trust among stakeholders, Large Language Models (LLMs) are exploited. Blockchain technology has the potential to facilitate this process by eliminating discrepancies and disputes, providing a decentralized, immutable ledger to notarize data related to digital models submitted during the tender phase, thereby ensuring transparency and tamper-proof data. The automation of bid evaluations based on predefined criteria such as those pertaining to waste management, is a key feature of smart contracts, which are of particular importance in the context of construction sustainability. Such assessments are enabled, allowing for unbiased and transparent evaluations based on quantifiable data, with the results recorded automatically on the blockchain. This results in a more efficient tender process and the promotion of sustainable practices, as projects with superior waste management are given priority. A tailor-made blockchain protocol is put forth to delineate requirements and facilitate data exchanges. It establishes standards and procedures for data submission, verification, and evaluation, ensuring secure and transparent interactions and enhancing stakeholder confidence in a fair and transparent evaluation process. In summary, the use of blockchain and smart contracts in the construction tender phase improves data integrity, transparency, and efficiency. The focus on waste management indicators allows for objective project evaluations and the promotion of sustainable practices. 
This innovative approach has the potential to transform public procurement, establishing a new global standard for the construction industry.
Abstract. The growing frequency and intensity of climate-related events and natural disasters present substantial challenges to the resilience and adaptability of critical infrastructure, particularly electricity transmission and distribution networks. This study provides a review of existing literature and incorporates recent research findings to identify the primary factors influencing resilience and adaptability within these networks. The study emphasizes the importance of key areas including technical design strategies, infrastructure investments, facility design considerations, organizational capabilities, operational strategies, and supply chain factors. The findings offer essential insights for stakeholders in the energy sector aiming to enhance the resilience of transmission and distribution networks against climate change impacts and natural disasters. Additionally, the study underscores the importance of establishing standardized resilience metrics and advocates for future research focusing on cost-benefit analyses and data-driven approaches to predict and mitigate cascading failures and high-impact, low-probability (HILP) events.
Abstract. In response to the global climate crisis, managing building energy usage to achieve energy efficiency and low carbon emissions is becoming increasingly important. With the help of Digital Twin technology, building energy data can be collected, analyzed, and simulated for more efficient building operations. However, sensor registration and updating still lack an integrated process, resulting in ineffective collaboration between equipment providers and building managers, and further data maintenance issues. To address this situation, we propose a streamlined collaboration and management platform, realized through the Application Programming Interfaces (APIs) of Autodesk Platform Services (APS) and Building Information Modeling (BIM). This platform aims to integrate the complex processes of constructing digital twins, including sensor registration, spatial coordinate management, energy data collection, and updates. At the same time, a streamlined user interface is designed to enable a cohesive and integrated workflow, facilitating data updates and maintenance for building managers in the future and reducing the difficulty of developing and maintaining digital twins.
Abstract. Transportation performance is heavily influenced by the overall quality and the effectiveness of road maintenance. However, this remains an expert-dependent activity, despite recent efforts to digitalize road geometries and management processes. Road maintenance knowledge accumulated through addressing relevant enquiries is implicitly learned by experts and turned into experience, which contributes very little to developing maintenance digitalization techniques and automated decision-making processes. Fully utilizing historical maintenance records and turning them into computer-readable knowledge is therefore a crucial task. This paper aims to extract key information from road maintenance request texts and then implement step-by-step thinking to make road maintenance decisions. This chain of thought is first proposed by reviewing the key elements and logical flow of road maintenance decision-making. Then, a cross-attention mechanism based on a transformer architecture is implemented on maintenance record texts and target knowledge element sequences. In the experiment, the proposed method outperforms a pre-trained BERT model and demonstrates valid performance on text-knowledge alignment in the road maintenance domain. The method proposed in this paper provides a solution for reliable and traceable decision-making and shows promising applications in domain-specific knowledge management.
Abstract. Designing complex, multi-occupancy building layouts requires considering complicated spatial arrangements, design constraints, and strict accessibility requirements. Despite advancements in automatic building layout generation, existing methods struggle to address the complexities of these layouts and often overlook critical accessibility features. This paper presents a novel deep learning-based approach for generating complex, multi-occupancy building layouts that meet both architectural and accessibility standards. To improve the efficiency of training, non-corridor rooms are approximated by minimum rotation rectangles, while a graph neural network (GNN) predicts the number of corners for corridors. To address the quadratic complexity of transformers, we incorporate FlashAttention to enhance computational efficiency. Accessibility features are integrated into the model by enforcing geometric requirements, including room size ratios and maximum corner distances to account for travel distance to egress. Additionally, a distance penalty is introduced in the loss function to ensure compliance with wheelchair clearance requirements. Experimental results show that our approach outperforms baseline models in generating realistic, complex layouts while ensuring compliance with design and accessibility constraints, making it a robust solution for generating multi-occupancy building layouts.
Abstract. The automatic design of architectural floor plans using deep learning has been widely studied to assist architectural design. Traditionally, floor plans generated by deep learning have been limited to single floors. Recently, research has explored the use of graph neural networks (GNNs), which apply deep learning to graph data, to generate building volumes that consider the in-building spatial use. Although these studies aim to generate new building volumes, practical architectural design often requires the generation of floor plans within predefined building outlines, constrained by various legal and regulatory requirements. This study proposes a method for generating multi-story floor plans in a given building volume by using a graph convolutional network (GCN), which adapts convolutional operations to graph data representing the given building volume. The implemented GCN model predicted the spatial use class for each node within the graph representing the building with an accuracy of 74.66%. This enables the generation of detailed floor plans across multiple floors. This research contributes to the design support of multi-story floor plans in a given building volume. Moreover, when integrated with the latest 3D generative AI technologies, this approach promises to advance the automatic creation of 3D building models with comprehensive interior designs, starting from scratch in volumes initially devoid of any interior information.
Abstract. The semantic enrichment of building information models (BIMs) has been widely explored, with various approaches utilizing graph neural networks (GNNs) to infer the types of indoor spaces. However, there is a gap in the intermediate process that translates the building information model into a graph model suitable for GNNs. To address this problem, we propose a structured graph model designed to represent the attributes and topological relationships of indoor spaces for indoor space classification. Based on the Industry Foundation Classes (IFC) file format, we define the concept of indoor space and propose an automated method for its extraction. The extracted interior subspaces are analyzed based on their geometric and topological properties, focusing on their relationship to the overall layout of the building's interior spaces. The subspaces are represented as nodes in the graph model, and the edges between the nodes are defined according to the topological relationships between the subspaces. An experiment was carried out to demonstrate the effectiveness of the proposed indoor space extraction algorithm, which provides the basis for the inference of indoor spatial semantic types in building information models.
Abstract. With the rise of residential housing demand worldwide, offsite construction emerges as a possible option to speed up construction while improving the safety of workers. However, offsite construction sites are normally dynamic environments in which workers collaborate with various machinery and large moving objects, resulting in additional safety concerns. Accurate prediction of future trajectories is an important step in building a collision alarm system that can be utilized to mitigate such safety risks. Traditional methods, such as Kalman filters (KF) and Markov processes, rely heavily on past trajectories and hand-crafted features, which fail to account for the dynamic nature of construction sites. With the rising interest in data-driven approaches, several studies have explored different methods of trajectory prediction. The Long Short-Term Memory (LSTM) network is one of the major methods used for forecasting future trajectories by leveraging both past individual and contextual information. However, one of the main limitations of LSTM is error accumulation, which prevents the model from providing accurate results. Inspired by the success of the transformer model in natural language processing, this paper proposes the use of a transformer encoder-decoder architecture with graph attention networks (GATs) to predict worker trajectories on construction sites. The temporal interactions of the workers are captured by the transformer model, while the GAT captures the spatial relationships of the workers, allowing the model to build a more comprehensive view of the workers' behavior. The model takes 8 frames, covering 3.2 seconds, and predicts the next 12 frames, covering 4.8 seconds, with an average displacement error (ADE) of 1.25 m and a final displacement error (FDE) of 2.3 m. The proposed model improves performance compared to traditional methods such as LSTM.
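The two error metrics quoted above have simple definitions: ADE is the mean Euclidean distance between predicted and true positions over all predicted frames, and FDE is that distance at the last predicted frame only. Both reported windows (8 frames in 3.2 s, 12 frames in 4.8 s) imply 0.4 s per frame. The sketch below computes the metrics for a single worker; the function name and the sample trajectories are hypothetical, and the authors' evaluation may average over many workers and scenes.

```python
import math

def ade_fde(predicted, ground_truth):
    """Return (ADE, FDE) in the units of the input coordinates.

    predicted, ground_truth: equal-length lists of (x, y) positions, one per frame.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances), distances[-1]

# Hypothetical 12-frame horizon: the worker walks along x at 0.5 m per frame,
# while the prediction drifts sideways by 0.1 m per frame.
truth = [(0.5 * i, 0.0) for i in range(12)]
drifting = [(0.5 * i, 0.1 * i) for i in range(12)]
ade, fde = ade_fde(drifting, truth)
```

Because the drift grows over time, `fde` is larger than `ade` here, which is the same pattern as the reported 2.3 m FDE against 1.25 m ADE: errors at the end of the horizon are typically the largest.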
Abstract. The integration of occupant data into the management of indoor environment factors is gaining increasing attention for creating intelligent and inclusive built environments. Existing approaches have mostly relied on static models, often failing to account for the ever-changing nature of occupant behavior and environmental factors across time and dimensions. Recent advancements in deep learning, especially deep sequential models capable of capturing both local and global dependencies between time steps, provide an opportunity to overcome these challenges. Accordingly, the authors propose an LSTM-based model framework that utilizes multimodal self-attention-based fusion, real-time occupant data, indoor environmental quality (IEQ) data, and outdoor environmental data to predict future IEQ conditions and preferred IEQ conditions, and to classify current IEQ conditions based on collected occupant feedback. To develop and test the proposed framework, four key steps were followed: (1) collecting IEQ data through smart sensors, (2) collecting perceived occupant feedback, (3) collecting outdoor environmental data, and (4) developing an attention-fusion-based Bi-Directional LSTM (Bi-LSTM) model. The proposed framework was tested at the Virginia Tech Blacksburg campus, showing promising results.
Abstract. Compliance checking and performance evaluation are crucial components of the iterative design and review process. However, traditional manual reviews and refined simulations are often subjective, time-consuming, and require a large number of input parameters. Therefore, automated rule checking (ARC) and performance simulation based on surrogate models have gained increasing attention. Despite this, most existing research focuses on one aspect or the other, without effectively integrating both, leading to gaps in reliability and efficiency when applied in practice. Therefore, this study proposes an integrated framework that combines automated compliance checking and efficient performance evaluation based on surrogate models, enabling rapid design review iterations. The framework comprises three interconnected modules: the NLP-AutoChecking module for rule checking based on IFC (Industry Foundation Classes) and semantic alignment; the DiffEvac module for evacuation performance simulation based on diffusion models; and a BIM design software module that connects both through a unified data interaction approach to modify design. Specifically, the BIM design can be converted into IFC format for rule checking to identify non-compliant elements and specify the violated regulations. Designers can then modify the design within the BIM software accordingly. It is important to note that multiple solutions may meet regulatory requirements, but not all are scientifically or practically optimal, as some may compromise safety or increase costs. Therefore, the modified BIM designs are exported as floor plans, cleaned and annotated, and then fed into the surrogate model for performance simulation, which evaluates and selects the optimal solution from the available options. This iterative cycle of compliance checking, simulation, and design modification continues until the design meets all regulatory standards and achieves optimal performance. 
Case studies demonstrate that this framework enables quick iterations and adjustments throughout both the design and review stages, significantly improving design quality and offering strong potential for widespread practical adoption.
Abstract. Fire safety equipment installed within buildings plays a crucial role in ensuring the safety of personnel and minimizing losses. Nevertheless, if not maintained appropriately, these devices may fail to function optimally in emergency situations. As building sizes continue to grow, traditional manual inspection methods encounter significant challenges, including a heavy workload and complex information recording tasks. To tackle these issues, an advanced emergency equipment detection framework and an accompanying improvement plan are put forward. The framework is specifically designed to overcome the inability of remote inspection to accurately locate objects, which it does by establishing spatial relationships among devices, cameras, and trajectories. First, the improved detection algorithm is used to detect objects of interest. Subsequently, these objects are located through a tracking algorithm and Visual Simultaneous Localization and Mapping (vSLAM). The on-site experimental results show that the framework can effectively solve various types of equipment detection problems in a wide range of complex scenarios and holds great promise for replacing manual labor.
Abstract. Building fire incidents pose significant risks to human lives and property, making fire safety compliance a critical aspect of building management. Traditional compliance checks are largely manual, relying on expert inspectors to assess and report on fire safety standards. While prior research has explored Automated Compliance Checking (ACC) during the design phase, limited attention has been given to the operational phase, where dynamic risks necessitate continuous monitoring. This study proposes a novel approach that leverages vision Large Language Models (vLLMs) to automate fire safety compliance monitoring in the operational phase. The developed method frames hazard recognition as a Visual Question Answering (VQA) task, enabling the model to analyze visual data and respond to textual queries regarding potential fire hazards. The system employs a Vision Transformer (ViT) for visual encoding and a multimodal fusion process, allowing the vLLM to generate contextually relevant descriptions of observed hazards, along with regulatory references including Occupational Safety and Health Administration (OSHA) standards. Evaluation results demonstrate significant improvements in hazard recognition over a generic vLLM baseline, with an average BLEU score of 0.1355 compared to 0.0410 and higher ROUGE scores reflecting superior precision and coherence. The model’s ability to automatically generate structured hazard description reports has practical implications for assisting expert-driven inspections, offering a comprehensive and effective solution for long-term fire safety management. This study thus advances ACC research by providing a comprehensive, automated method for continuous fire safety compliance in operational building environments.
Abstract. Seismic Loss Estimation (SLE) has become a critical aspect of modern building engineering, aiding in mitigation strategies, real-time disaster response, and post-earthquake reconstruction. The FEMA P-58 method, a performance-based earthquake engineering tool, efficiently links component damage states with engineering demand parameters for comprehensive seismic loss assessment. However, managing the extensive data and semantics required for such evaluations poses challenges. This paper proposes a Knowledge Graph (KG)-based solution, integrating object-based information management principles akin to Common Data Environment (CDE) and Building Information Modeling (BIM). By leveraging KG and digital twin technologies, this approach aims to facilitate dynamic seismic loss estimation, providing stakeholders with a comprehensive view of building performance and enabling efficient data access and analysis.
Abstract. High-resolution (HR) imaging devices are now widely used for capturing crack images from civil structures, necessitating the development of algorithms for HR image segmentation. However, the traditional refined segmentation of HR images requires substantial GPU resources, which has led to the adoption of the cost-effective point rendering technique for inference. Traditional rendering techniques rely on coarse masks to guide the rendering points, and these coarse masks typically fail to focus the rendering points on the boundary regions of slender cracks, resulting in ambiguous predictions at crack boundaries. To address this, we introduce a novel rendering point sampling paradigm that enables the network to focus rendering points on crack boundary regions, guided by the probability maps during the inference phase. This approach significantly improves the segmentation accuracy of crack boundary regions in HR images without increasing computational resource dependence. Experiments on an open-source HR crack image dataset consistently show our method's superiority over state-of-the-art approaches.
Abstract. Since the development of public administration management in the 1980s into the New Management Model, the principles of asset management have found their way into the management of public administrations. This enabled the already existing systematic maintenance management of roads to be integrated as an element of asset management. Systematic road maintenance today ensures the provision of a well-functioning infrastructure. The requirements for maintenance management - especially for municipal infrastructure - are becoming increasingly complex. In addition to the purely technical view, the commercial view and life cycle cost considerations have also been incorporated into maintenance management. The available financial resources should be used optimally in such a way as to achieve the useful lives of the assets, ensure the value retention of the fixed assets and implement the new infrastructure requirements. When renewing existing infrastructure today, it is therefore necessary to address issues that will become relevant tomorrow. These include, for example, an ageing society, increased e-mobility, a child and disability-friendly society, flood protection, reduced noise and a reduction in air pollution. These and other aspects are therefore integrated into the existing management system and must also be considered during maintenance. Municipal asset management (AM) is a suitable controlling method for this. This paper intends to show how modern maintenance management of municipal infrastructure as an element of asset management can be achieved by changing the way it is viewed and approached, without generating additional costs in the medium term.
Abstract. With the development of artificial intelligence (AI) technology, human-machine collaboration (HMC) plays an important role in enhancing construction safety. Human trust in AI is the key to the successful implementation of HMC. In this study, the effects of human-related factors (gender, technology acceptance) and AI-related factors (accuracy) on trust were measured and analyzed with controlled experiments simulating a tower crane operation scenario. Twenty-four college students were recruited for the experiment and randomly assigned to two false alarm rate conditions. A remote-controlled toy tower crane model was used to simulate a lifting task. A tablet computer was used to simulate an intrusion alarm monitoring system. Users' initial technology acceptance of the system was assessed via a questionnaire. Subjects' trust scores after each alarm were measured using a trust rating scale. A t-test and two-way ANOVA were used to test for significant relationships between false alarm rate, gender, technology acceptance, and trust score. The results show that the false alarm rate is a key factor affecting trust, while gender, technology acceptance, and their interaction effects with the false alarm rate are not significant. The study emphasizes the importance of reducing false alarms and improving AI accuracy to enhance user trust.
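As a rough illustration of the two-group comparison, the sketch below computes a t statistic for trust scores under two false alarm rate conditions. The abstract does not say which t-test variant was used, so Welch's unequal-variance form is an assumption here, and the scores are invented for illustration rather than taken from the experiment.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical trust scores (not data from the study).
low_false_alarm = [6, 7, 8]
high_false_alarm = [3, 4, 5]
t, df = welch_t(low_false_alarm, high_false_alarm)
print(round(t, 3), round(df, 1))
```

Turning the statistic into a p-value requires the t-distribution, which the Python standard library does not provide; a statistics package would normally be used for that step and for the two-way ANOVA.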
Abstract. To address industry-wide and policy-driven requirements for construction site safety monitoring, this paper develops a virtual assistant agent based on a large vision-language model (VLM), integrated into an on-site surveillance camera system for real-time identification and alerting of unsafe worker behaviors. First, we designed a semi-automatic image-text labeling pipeline, employing in-context learning to enhance data annotation efficiency. Then, we established a two-stage curriculum learning paradigm to deeply embed construction domain knowledge into the VLM, which is in turn embedded into a real-time video analytics engine for safety compliance inspection and interactive visual question answering. The system has been deployed on a real construction site, with around 90% accuracy in identifying violations of work-at-height safety regulations.
Abstract. The increasing emphasis on integrating technological advancements with human-centered and sustainable practices highlights the paradigm shift toward Construction 5.0 (C5.0) in the architecture, engineering, and construction (AEC) sector. Despite its potential, investigations about C5.0’s key pillars, practical implications, and adoption challenges remain limited, with much existing research focusing on conceptual frameworks or literature reviews. This study addresses these gaps through an empirical investigation, incorporating insights from a focus group of 17 industry practitioners to explore C5.0’s key pillars, core features, technological enablers, and implications. The findings highlight three core features of C5.0: human-centricity, sustainability-driven practices, and collaborative intelligence. Seventeen emerging digital technologies were identified as critical enablers of C5.0, with artificial intelligence/machine learning, digital twins, and collaborative robots ranked as the most impactful technologies. These technologies support 31 application domains and enable AEC organizations to achieve enhanced productivity, innovation, sustainability, worker safety and well-being, and competitive advantage. Under the enhanced sustainability category, improved compliance with environmental regulations and increased capacity to meet client demands for sustainable practices were emphasized as key outcomes. This study contributes both theoretically and practically to the understanding of C5.0. Theoretically, it defines the key pillars, core features, and technological enablers of C5.0, bridging gaps in the existing literature and advancing the academic discourse on the evolution to C5.0. Practically, it offers a roadmap for integrating critical technologies with human-centered and sustainability goals, enabling AEC practitioners to prioritize investments effectively. 
Future research should expand empirical studies to conduct a cost-benefit evaluation of C5.0 technologies and explore C5.0’s impact on project management methodologies, stakeholder collaboration, and organizational strategy development.
Abstract. In the design stage of substation projects, designing an optimal equipment layout under functional, geometric, and economic constraints is an enduring task. In traditional practice, professionals need to repeatedly adjust the positions of the objects involved within specialized working spaces to meet the requirements of different projects and to comply with design codes. This process depends heavily on professional skill and an understanding of regulatory documents. To streamline this process, we propose an AI-driven substation layout design approach using a large language model (LLM). This approach exploits the capacity of the large language model by converting the task of generating the full layout plan into generating sequences of equipment positions. We fine-tuned two models from the base model Llama3.1-7B on two datasets that were auto-generated under professional guidance. One dataset consists of input requirements and output layout schemes, while the other is augmented through chain-of-thought (CoT) to help the underlying language model make better use of its capacity when specific design constraints must be considered. To implement the output scheme, an automated procedure is further developed to translate the scheme into the corresponding layout plan and information models.
Abstract. The sustainable and efficient management of the built environment is a crucial challenge in the increasingly digitalized AEC sector. Innovative technologies such as Building Information Modeling (BIM) and Digital Twin (DT) offer significant opportunities to enhance the operational efficiency and sustainability of physical assets. However, digitalization generates vast amounts of Big Data, and their handling through centralized architectures leads to risks of fragmentation, lack of transparency, and vulnerability to manipulation. In response to these challenges, this study presents an innovative Proof of Concept (PoC) that integrates Blockchain (BT), Digital Twin (DT), and Non-Fungible Token (NFT) technologies to promote decentralized and sustainable data management in the construction industry. The application, called dDT (decentralized Digital Twin), was initially deployed on the Solana blockchain and later integrated with Polygon to leverage EVM compatibility and the ERC-721 standard for NFTs. The platform enables the tokenization of data flows generated by physical assets, ensuring traceability, security, and transparency throughout the entire asset lifecycle. The dDT system represents a sustainable innovation as it creates a secondary data market, fostering collaboration among industry stakeholders and financing new developments through the sale of data-linked NFTs. This decentralized solution addresses fragmentation and transparency issues, promoting more secure, resilient, and sustainable data management practices. The PoC demonstrates how the integration of BT, DT, and NFT can accelerate the transition toward more efficient and innovative practices, with positive impacts on sustainability and technological advancement in the AEC sector.
Abstract. Classifying nighttime clouds is crucial for understanding their impact on Earth's radiative balance. This study presents a semantic segmentation model using U-Net with a MobileNetV3 backbone for classification of the following cloud types: Cirrus, Nimbus, Stratus, and Cumulus from nighttime images. Despite challenges from reduced visibility at night, cloud types and coverage were effectively detected, classified and measured. The model results potentially facilitate future research on nighttime radiation analysis.
Abstract. Identification of earth pressures acting on in-service underground structures is critical for their health monitoring and performance prediction. Given that the extensive deployment of sensors on poorly performing structures to measure pressure incurs high costs and presents technical challenges, the inversion of these pressures from easily observed deformation data has become increasingly desirable. However, traditional pressure inversion methods require subjective assumptions about the complexity of the pressure, necessitating extensive engineering judgment that may not be confidently applied in practice. To address this challenge, this paper proposes a trans-dimensional Bayesian method for pressure inversion. This method simultaneously incorporates pressure complexity and quantities into the inversion by parameterizing a set of previously unknown parameters, where the number of parameters itself is unknown. A recorded case study is presented for illustration and verification. It is found that the proposed method yields good inversion results on the pressures on a diaphragm wall, whereas traditional methods lead to poor inversion results due to inadequate assumptions. These outcomes highlight the advancements of the proposed method. Lastly, deficiencies and future extensions are discussed in the conclusion.
Abstract. In Japan, the number of people working in construction has been decreasing yearly since peaking in 1997 because of the declining birthrate and aging population. The Ministry of Land, Infrastructure, Transport and Tourism (MLIT) is promoting i-Construction, which aims to increase productivity by using ICT (Information and Communication Technology) across the entire construction industry. In particular, ICT is expected to be applied to the inspection of reinforcing bars (rebars), which form the basis of all concrete structures, as rebar inspection has become stricter following the structural calculation falsification problem that occurred in 2005. Existing methods use image triangulation to obtain a 3D point cloud of the rebar surface, so depending on the lighting conditions and the arrangement and shape of the rebars, there are some sites where a sufficiently dense point cloud cannot be obtained, and the range of application is therefore limited. This paper proposes a method for using rapidly developing differentiable rendering technology for rebar inspection. Since differentiable rendering makes it possible to reproduce free-viewpoint images, the rebar can be inspected even if the distance to the rebar surface cannot be directly measured, as long as the appearance of the rebar as seen from a viewpoint in 3D space can be reproduced. Image inspection can therefore be expected to become more versatile than before. This paper reports on the evaluation of accuracy and systematization schemes.
Abstract. Construction software’s promise of efficacy has been significantly unfulfilled. Although Information and Communication Technology (ICT) has been commonly used for over thirty years, many problems have existed relatively unchanged. There are several reasons for this, but one is overlooked: the absence of contractor best practices embedded in the industry’s software. The primary cause appears to be developers’ commercial interest. This paper used a literature review and an industry survey to study possible contributing factors. This paper quantifies this lack of software capability to support valuable processes. Currently, surveyed professionals perceive over 40% difference between the value of the practices tested and the enabling ability of the respondent’s software. The industry is in a new age with the emergence of Artificial Intelligence (AI) and Quantum Computing. Will the same problems exist after these advanced technologies are widely adopted? This research adds to the body of knowledge by articulating a gap between industry-accepted practices’ value and commercial software support that enables their execution. The paper asserts that government, associations, researchers and construction organisations should facilitate reengineering to existing ICT to reflect contractor processes. This will make the coming transformation of technology more valuable to all stakeholders.
Abstract. The rapid advancements in Generative Artificial Intelligence (GenAI) have unlocked transformative potential across various industries, including construction. With its ability to generate content, automate processes, and enhance decision-making, GenAI offers significant opportunities to improve the efficiency and accuracy of Construction Risk Management (CRM). However, its integration into CRM also brings a new set of risks and uncertainties that are unprecedented in traditional risk management frameworks. To this end, the purpose of this research is to identify and classify the key risks associated with integrating GenAI into CRM. To achieve this, a three-step systematic literature review was conducted, analysing 48 scholarly articles on GenAI for CRM from Scopus-indexed academic journals published between 2014 and 2024. A total of 25 risk factors associated with GenAI integration in CRM were identified and classified under seven key categories: financial risks, technological adaptability risks, information integrity risks, input quality risks, and ethical and governance risks. This study enhances the understanding of risk factors in GenAI integration by presenting a structured framework that categorises the associated risks of GenAI integration into CRM while highlighting their interconnectedness. It also lays the foundation for interdisciplinary approaches and future empirical research to validate and expand these insights across diverse construction contexts.
Abstract. Digital twins (DT) represent a critical methodology for digitising infrastructures, such as hydropower plants (HPP), enhancing their efficiency, sustainability, and competitive edge. However, despite the potential benefits, there is yet to be a consensus within industrial and academic spheres on the implementation process of digital twins for hydropower plants, a relatively uncommon type of infrastructure. This study explores the feasibility of using DT to provide digital solutions that enhance the flexibility and sustainability of hydropower plants. The paper examines the unique challenges associated with implementing digital twins in this context, including the high demand for water flow real-time updates, the complex logistical geographical dispersion, and environmental considerations. Subsequently, it reviews mainstream digital twin frameworks from other domains to establish foundational references. Building on this analysis, the study proposes a bespoke digital twin development framework specifically tailored for hydropower plants. This proposed framework advances the theoretical understanding by integrating interdisciplinary research and domain-specific insights, offering a robust theoretical foundation for digital twin applications in hydropower settings. Practically, it delivers actionable guidelines and detailed strategies designed to facilitate the construction of digital twins in hydropower plants. By addressing domain-specific challenges and incorporating established best practices, the framework equips engineers and project managers with the tools necessary to enhance operational efficiency and sustainability. This comprehensive framework not only aids in navigating the complexities of hydropower projects but also sets a benchmark for future digital twin implementations.
Abstract. In Japan, the importance of maintenance is growing as civil engineering structures and buildings age. For proper maintenance, it is important to carry out regular inspections at high frequency and to accumulate the resulting data. In recent years, with the spread of 3D scanning technology, it has become effective to use a 3D model obtained by scanning the target structure as a base, and to associate electronic inspection information such as photos and documents with the relevant locations on that model. The advantage is that, because the inspection information carries location information on the 3D model, subsequent searches can be made in intuitive and easy ways. In this paper, we develop a simple inspection tool using the head-mounted device HoloLens 2. The developed tool allows inspectors to perform SLAM-based 3D scans as they move around a structure. When deterioration or damage is found, images can be captured with the device's camera. In addition, the corresponding location on the 3D scanned model can be specified by simply pointing to the relevant location in the real environment with a hand gesture, and inspection information such as captured images and documents can be registered there. Experiments on an actual bridge demonstrated that this tool allows a series of inspection processes, such as 3D scanning, taking photos, and registering the photos on a 3D model, to be carried out at the inspection site with very simple and intuitive operations. In addition, the accuracy of the constructed 3D model was evaluated by comparing it with a TLS point cloud.
Abstract. This article combines the author's past software development and other work practices related to computer-aided design and construction of building enclosure walls, discussing how computer technology can assist in various methods and directions during the design and construction of building enclosure walls. It mainly involves structural design of load-bearing enclosure walls, structural design of non-load-bearing enclosure walls, thermal insulation design of building enclosure walls, waterproof design of building enclosure walls, planning for enclosure wall hoisting, calculation verification for enclosure wall hoisting, and construction of enclosure walls. Whether in the design of the main structure of the building or in the planning and decoration of the building space, designers need to put in a lot of effort for enclosure walls. It is worth studying how to fully utilize the capabilities of computer technology to free up more time for designers. This article has certain reference value for research and application in related fields.
Abstract. For bridge visual inspection and maintenance, a fundamental task is to determine the condition of a bridge component from its appearance as captured in images. The information on the identified defects is then recorded in documents and assessed collectively by bridge practitioners to determine what maintenance activities are required for the component. This engineering practice fits naturally with the idea of IFC components in a BIM workflow. To automate this labour-intensive and time-consuming process, a smart and practical framework is proposed that utilises a BIM component-centred bridge digital twin system. In this system, sensor-equipped robots are integrated with the bridge digital twin to identify defects. Component-wise defect data is linked to the BIM model for detailed and holistic assessment, ensuring that maintenance decisions are fully informed. The framework is validated by demonstrating its key functions, including an RTK-enabled drone for automatic defect localisation, defect quantification by computer vision, defect data storage in SQL, and visualisation of the enriched BIM model in an interactive web-based platform for maintenance decision-making. It is demonstrated that the framework can streamline defect data transfer from on-site inspection to an online bridge digital twin, supporting decision-making processes by referencing relevant industrial standards.
Abstract. Highway agencies face challenges managing scattered asset data across maintenance processes and information systems, obstructing efficient retrieval of dynamic cross-system road information for timely interventions. This paper presents a Digital Twin (DT)-based data federation framework to effectively manage fragmented systems and dispersed data for highway infrastructure operation and maintenance. The framework middleware can decompose users’ queries and requests from different subsystems based on a metadata database and a distributed system architecture. The connected data ecosystem enables dynamic communication between different asset systems, ensuring efficient maintenance planning and process coordination between different users and teams. The presented framework is demonstrated based on datasets and synthesised systems conforming to asset management practices adopted by United Kingdom (UK) National Highways.
Abstract. Timely and accurate identification of workers' intentions in construction scenarios is crucial for seamless worker-robot collaboration. However, variation in workers' behavioral styles and the difficulty of collecting worker action data limit the practical application of existing methods, which rely heavily on extensive worker action data. This paper addresses the dynamic nature of construction environments by proposing a few-shot worker intention recognition method. The proposed approach constructs worker intention query features using randomly sampled frame combinations and then applies metric learning to develop a few-shot worker intention recognition model. To validate the effectiveness of this method, a video dataset of worker scaffolding installation actions was used for experiments on worker intention recognition. Given five categories with five worker action samples each, the method achieved an accuracy of 71% in recognizing workers' intentions. The results demonstrate that the proposed method can effectively learn and detect novel worker actions from a minimal number of labeled action videos, thereby improving model performance while reducing the number of required training videos. This approach not only reduces the labor needed for data labeling but also enhances the practicality of worker-robot collaboration in construction scenarios.
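The idea of metric-based few-shot classification can be sketched with a nearest-prototype rule. The abstract does not state which metric-learning formulation the paper uses, so the prototype-averaging approach below is an assumption. The two-dimensional feature vectors and the class labels are hypothetical stand-ins for learned video features.

```python
import math

def class_prototypes(support):
    """Average the support feature vectors of each class into one prototype."""
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes

def classify(query, prototypes):
    """Return the label whose prototype is nearest to the query (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Hypothetical 2-way, 2-shot support set with made-up features and labels.
support = {
    "lift_tube": [[1.0, 0.0], [0.8, 0.2]],
    "tighten_coupler": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = class_prototypes(support)
predicted = classify([0.9, 0.2], prototypes)
print(predicted)
```

A five-way, five-shot episode like the one reported above would use five labels with five support vectors each; in a trained model the vectors would come from a learned feature extractor rather than being written by hand.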
Abstract. Out-of-Hospital Cardiac Arrest (OHCA) is a critical medical emergency that requires immediate intervention. With technological advancements, integrating drones into Emergency Medical Services (EMS) has emerged as a promising solution. This study examines OHCA cases in Taipei and New Taipei City, Taiwan, from September to December 2019. A two-phase optimization model was employed to identify effective locations for drone bases. Among the scenarios tested, the setting with a maximum flight time of 7 minutes, 90% OHCA case coverage, and 13 drone bases demonstrated the best overall performance in terms of system-wide and area-specific usage rates, response time reduction, and cost-effectiveness. These findings provide valuable insights into the feasibility of drone deployment and offer a practical reference for real-world emergency applications.
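The relationship between flight range, case coverage, and the number of bases can be illustrated with a small coverage heuristic. This is not the paper's two-phase optimization model, whose formulation is not given in the abstract. It is a simple greedy stand-in that assumes a fixed coverage radius (for example, the distance a drone can fly within the time limit) and uses hypothetical candidate sites and case locations.

```python
import math

def greedy_bases(candidates, cases, radius, target):
    """Pick candidate sites greedily until `target` fraction of cases is covered."""
    # For each candidate site, the set of case indices within the radius.
    covers = {
        name: {i for i, c in enumerate(cases) if math.dist(site, c) <= radius}
        for name, site in candidates.items()
    }
    chosen, covered = [], set()
    needed = math.ceil(target * len(cases))
    while len(covered) < needed:
        # Choose the site that adds the most not-yet-covered cases.
        best = max(covers, key=lambda name: len(covers[name] - covered))
        gain = covers[best] - covered
        if not gain:
            break  # remaining cases are out of reach of every candidate
        chosen.append(best)
        covered |= gain
    return chosen, len(covered) / len(cases)

# Hypothetical coordinates in kilometres (illustration only).
candidates = {"A": (0, 0), "B": (10, 0), "C": (5, 5)}
cases = [(0, 1), (1, 0), (9, 0), (10, 1), (5, 5)]
chosen, coverage = greedy_bases(candidates, cases, radius=2, target=0.8)
print(chosen, coverage)
```

Greedy selection is not guaranteed to find the smallest set of bases, which is one reason an exact optimization model is preferable for real planning.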
Abstract. To address the increasing complexity and sustainability challenges facing civil engineering, an educational approach that integrates theory with practical experience is needed. Traditional teaching methods often fail to equip students with the skills the industry needs, which later hampers their job readiness and the development of infrastructure. This study investigates the use of Problem-Based Learning (PBL) integrated with BIM in a civil engineering module to increase engagement, problem-solving skills, and comprehension of complex concepts. The research included pre- and post-class surveys of both the students and the professor. The pre-class surveys examined expectations, perceived difficulties, and awareness of PBL and BIM, while the post-class surveys assessed shifts in perceptions, confidence levels, and satisfaction. The gathered data were analyzed using descriptive statistics and thematic coding. Both PBL and BIM were rated positively, and 62.5% of students identified BIM as the most beneficial tool for deepening their understanding of structural concepts. The instructor highlighted successful partnerships with industry and the sustainability focus, but pointed to permit delays and logistics as major risks. Integrating PBL and BIM into the civil engineering curriculum is found to improve teaching significantly, as students are more motivated to complete course objectives and are more confident when exposed to real-life industry problems. It is proposed that PBL sessions be held in advance and that new digital resources be designed. Further research should examine the long-term effect of such methods on career readiness.
Abstract. This study examines the impact of virtual reality (VR) training on enhancing the skills of tower crane operators in Aerial Construction Factories (ACFs). As ACF operations become increasingly complex with higher safety demands, traditional training methods struggle to simulate high-risk scenarios and provide sufficient hands-on experience. By leveraging VR technology, this study offers an immersive and interactive training environment where operators can practice in risk-free, controlled virtual scenarios. VR-based experiments were conducted to measure performance in task completion time, operational accuracy, self-efficacy, and the frequency of collision warnings during simulated tasks. The results demonstrated significant improvements in both task accuracy and operator confidence, along with reductions in task completion time and collision warnings. These findings suggest that VR training addresses the limitations of conventional methods, offering a more cost-effective, scalable, and safe solution for improving crane operator skills within ACFs. Ultimately, the study highlights the transformative potential of VR technology in enhancing operator proficiency, safety awareness, and accident prevention in the challenging and hazardous environment of ACFs.
Abstract. Artificial intelligence (AI) is playing an increasing role in the construction industry to enhance productivity, reduce safety accidents, and optimize collaboration efficiency. However, attacks on AI systems also introduce cybersecurity threats that could lead to severe consequences, such as equipment damage, financial loss, operational downtime, safety accidents, and potential loss of life. Motivated by the construction industry's limited efforts to defend against AI cybersecurity vulnerabilities, a result of a lack of awareness and IT resources, this paper proposes a cybersecurity-aware decentralized machine learning (CADML) framework that leverages blockchain to protect machine learning (ML) models throughout their life cycle. First, the workflow of the CADML framework is introduced to illustrate the logic of blockchain-ML integration. Second, a new blockchain smart contract algorithm, the ML-embed smart contract (MLSC), is developed to train and apply AI in a decentralized manner. The primary innovation is that the framework extends current "partial" blockchain-ML integration methods so that the ML "lifecycle" (from raw data storage, training, and implementation to model update) operates in a decentralized and secure blockchain environment. The framework is tested on recognizing construction equipment motions. Results show that (1) the ML model could be successfully trained and implemented within a blockchain and (2) the ML performance (accuracy, precision, and recall) is acceptable.
Abstract. Hong Kong faces critical challenges in the maintenance and redevelopment of aged buildings. Recently, advancements in multi-modal generative AI (GenAI) and high-definition urban geospatial data, such as point clouds, have offered new opportunities to the architectural, engineering, and construction industry. This paper defines, assesses, and maps a Building Condition Index (BCI) that represents the condition of aged building fabrics, using GenAI and high-definition geospatial data. First, the BCI is defined as a numerical scale of multi-dimensional factors, including floor area, building age, management quality, and the presence of unauthorized building works. Then, multiple data sources, including building exterior photos, airborne point clouds, and government building datasets, are processed and used to train the BCI model through multiple regression and image embedding with ChatGPT4. Finally, a comprehensive BCI map and focused BCI hot spots can be visualized for an urban area. Experiments with over 1,200 building data points in Kowloon City, Hong Kong, indicated the robustness of the BCI in explaining the exogenous factors that cause building decay while accurately reflecting building condition.
Abstract. Road pavements need to be repaired efficiently in order to extend their service life and reduce life cycle costs (LCC). However, much road maintenance work is the responsibility of small and medium-sized local governments and construction enterprises (SMEs), where ICT-based maintenance management technology is not yet well advanced. The authors applied digital measurement technology and constructed a pavement management system based on digital twin technology, designed so that SMEs can use it. The maintenance cycle of road pavement can be expressed as follows: inspection, diagnosis, action, record, and next inspection. With a digitalized virtual model (digital twin) as the core, sensing or monitoring information and inspection results are acquired from the actual pavement in physical space and are stored and accumulated in the virtual model. We defined a cycle model that predicts future conditions by analyzing and simulating the data using a mathematical model, and aimed to construct a system to realize this model. In this study, we developed a system based on a spatial information infrastructure. The system can display point cloud data generated from images taken by a simple in-vehicle stereo camera, road surface property data from simple laser measurement, and other data in conjunction with spatial information, and its usefulness has been evaluated. The system was developed using QGIS, which is free software, to make it easy for SMEs to use. In the future, it will be necessary to consider introducing a system that can be shared on the cloud for easier use.
Abstract. CONNECTIA is a "Digital Twin" application developed under the "Construction DX" initiative to improve productivity. It combines cloud-based data management with native app rendering using Unity, offering advanced visualization and a user-friendly UI to simplify 3D workflows. It has shown potential time savings of over 10 hours per person per month.
Abstract. With the evolving functional attributes of sports venues in China and the promotion of national fitness strategies, sports venues are increasingly transitioning into comprehensive entertainment and leisure facilities. However, due to spatial constraints and other factors, many of these facilities suffer from suboptimal operational effectiveness and limited spatial vitality. This study employs literature review, case data collection, comparative analysis, and inductive analysis to propose a design strategy for the integration of sports venues and commercial spaces, built on three elements: optimized horizontal space utilization, vertical compound utilization, and scene content creation. Through specific case discussion, the study emphasizes the need for these three strategies to work in synergy and highlights the importance of continuous optimization across the entire building lifecycle, from decision-making and strategic planning through design, construction, operation, and maintenance to feedback optimization. These findings provide a valuable reference for the organic integration of sports venues and commercial spaces in China and beyond, fostering new vitality in sports venues and promoting healthier urban environments.
Abstract. In this paper, we propose a real-time on-site indoor modeling and modeling quality visualization method using a Mixed Reality (MR) device with a Time-of-Flight (ToF) sensor to realize efficient and reliable indoor model generation. The modeling method is based on rule-based fast voxel labeling and consists of two steps, i.e., local modeling and global modeling. In the local modeling, space occupancy labels and attribute labels of ceiling, wall, wall opening, wall candidate, floor, object, and space are assigned to each cell of the local voxel from the point clouds in each frame of laser scanning by the MR device. In the global modeling, the results of the local modeling are integrated into the global voxel considering the label assignment quality based on entropy and probability. The quality of the resulting model is shown to the user through MR visualization so that the modeling results can be checked. For real-time visualization of the quality of the modeling results, a simple textured polygon model is created and used in the MR visualization. The polygon model is efficiently generated from the local and global voxels, and the texture represents the quality of the modeling results. Our method was applied to a real indoor environment, and its performance was evaluated. The modeling frame rate was 2 fps, and the labeling precision was 98%. The polygon model for MR visualization was created and updated within 20 ms, and real-time MR visualization could be achieved. These results showed that real-time on-site indoor modeling and quality checking using the MR device with a ToF sensor can be realized.
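As a rough illustration of how label quality "based on entropy and probability" might be tracked per voxel cell in the indoor-modeling abstract above, the sketch below accumulates per-frame label votes in a global cell and derives a probability distribution, its Shannon entropy, and a quality score. The abstract does not give the authors' actual integration rule, so the vote-counting scheme, the `GlobalCell` class, and the `quality` formula are assumptions made for illustration.

```python
from collections import Counter
from math import log2

# Attribute labels named in the abstract.
LABELS = ("ceiling", "wall", "wall opening", "wall candidate", "floor", "object", "space")


class GlobalCell:
    """One cell of the global voxel grid (hypothetical structure)."""

    def __init__(self):
        self.votes = Counter()  # label -> number of frames that assigned it

    def integrate(self, label):
        """Merge one local-modeling result (one frame) into this cell."""
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        self.votes[label] += 1

    def probabilities(self):
        """Relative frequency of each label observed so far."""
        total = sum(self.votes.values())
        return {lab: n / total for lab, n in self.votes.items()} if total else {}

    def entropy(self):
        """Shannon entropy (bits) of the label distribution; 0 means full agreement."""
        return -sum(p * log2(p) for p in self.probabilities().values())

    def best_label(self):
        """Most frequently assigned label, or None if the cell was never observed."""
        top = self.votes.most_common(1)
        return top[0][0] if top else None

    def quality(self):
        """Assumed score in [0, 1]: 1 minus entropy normalized by its maximum."""
        if not self.votes:
            return 0.0  # an unobserved cell carries no evidence
        return 1.0 - self.entropy() / log2(len(LABELS))


if __name__ == "__main__":
    cell = GlobalCell()
    for frame_label in ["wall", "wall", "wall", "object"]:
        cell.integrate(frame_label)
    print(cell.best_label(), cell.entropy(), cell.quality())
```

A score of this kind could be mapped to a texture colour for on-site display, although how the published method maps quality to texture is not stated in the abstract.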