Hydropower underground engineering faces significant safety management challenges owing to overlapping construction activities, diverse process stages, and dynamic resource flows. Safety management in this setting involves multidisciplinary tasks, such as safety hazard identification and rectification, emergency response, and regulatory compliance checks, all of which require specialized domain knowledge. This knowledge is intricate, comprising expert experience, patterns and characteristics, and management codes, and it is dispersed across multimodal data formats, including text, tables, and images. Efficiently extracting knowledge from these multimodal data sources can significantly enhance data utility and support intelligent safety management. However, owing to the diversity of data formats, the complexity of the knowledge system, and the variety of management scenarios, current research struggles with limited knowledge sources, acquisition difficulties, and poor generalization.
This study proposes a method for constructing a multimodal knowledge graph (KG) for safety management in hydropower underground engineering. (1) A large-scale, high-quality, multisource heterogeneous dataset is built from safety hazard identification and rectification records, regulations, and images. (2) Knowledge modeling combines top-down and bottom-up approaches to define the entities, relationships, attributes, and events pertinent to safety management in hydropower underground engineering. (3) Entities and relationships are extracted from text data using a large language model (LLM) tuned with domain knowledge. To handle small sample sizes, the model is given specific examples for each entity type, and these demonstrations provide it with prior knowledge. (4) Instance segmentation is used to annotate safety hazard images, and the entities identified in the images are converted into vectors. Image and text data are then linked based on semantic similarity, so that image data are integrated into the textual KG and multimodal data are transformed into multimodal knowledge. (5) The multimodal KG is stored in Neo4j, an open-source graph database management system. (6) A scenario-specific knowledge acquisition method addresses the needs of individual safety management scenarios by integrating the KG with LLMs to enable retrieval-augmented generation and interpretable knowledge reasoning.
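As a rough illustration of step (4), the sketch below links image entities to text entities by cosine similarity between their vectors. It is a minimal sketch under stated assumptions, not the study's implementation: the function names, the toy three-dimensional vectors, the entity identifiers, and the 0.8 threshold are all hypothetical, and in practice the vectors would come from an embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_u * norm_v)


def link_images_to_text(image_entities, text_entities, threshold=0.8):
    """Link each image entity to its most similar text entity.

    A link is kept only if the best score reaches the (assumed) threshold.
    Returns a list of (image_id, text_id, score) tuples.
    """
    links = []
    for img_id, img_vec in image_entities.items():
        best_id, best_score = None, -1.0
        for txt_id, txt_vec in text_entities.items():
            score = cosine(img_vec, txt_vec)
            if score > best_score:
                best_id, best_score = txt_id, score
        if best_id is not None and best_score >= threshold:
            links.append((img_id, best_id, best_score))
    return links


# Hypothetical toy vectors standing in for real embeddings.
image_entities = {
    "img_001/region_0": [0.9, 0.1, 0.0],
    "img_002/region_0": [0.0, 0.2, 0.9],
}
text_entities = {
    "unguarded_edge": [1.0, 0.0, 0.0],
    "exposed_cable": [0.0, 0.0, 1.0],
    "poor_ventilation": [0.0, 1.0, 0.0],
}

links = link_images_to_text(image_entities, text_entities)
for img_id, txt_id, score in links:
    print(f"{img_id} -> {txt_id} (similarity {score:.3f})")
```

Each resulting pair could then be written to the graph as an edge between an image node and a text entity node. Keeping only the best match above a threshold is one simple design choice; keeping the top few matches would be another.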
(1) This study collected more than 120 000 safety hazard records, 30 regulatory documents, and 300 000 images of safety hazards. From these data, a large-scale, high-quality, multisource heterogeneous dataset was constructed specifically for safety management in hydropower underground engineering projects. (2) Taking a hydropower underground engineering project as an example, the constructed multimodal KG was applied to intelligent recommendation of safety hazard rectification measures and to compliance checks. (3) To generate intelligent recommendations for rectification measures, users first input safety hazard information; the scene-KG was then extracted from the multimodal KG and fed into an LLM to generate appropriate rectification measures. (4) For compliance checks, an inference retrieval method extended the scene-KG to neighboring nodes to construct an inference-KG. By integrating the inference-KG with an LLM, the system retrieved relevant content from regulatory documents based on user input.
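The scene-KG and inference-KG steps in (3) and (4) can be pictured as subgraph extraction followed by neighbor expansion. The sketch below is illustrative only: it uses an in-memory list of triples as a stand-in for the Neo4j store, a plain breadth-first expansion in place of the study's inference retrieval method, and hypothetical entity names, relation names, and prompt format.

```python
from collections import deque


def extract_subgraph(triples, seeds, max_hops=1):
    """Return the triples whose endpoints lie within max_hops of any seed.

    Edges are treated as undirected for the purpose of expansion.
    """
    neighbors = {}
    for head, _rel, tail in triples:
        neighbors.setdefault(head, set()).add(tail)
        neighbors.setdefault(tail, set()).add(head)

    depth = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if depth[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    return [t for t in triples if t[0] in depth and t[2] in depth]


def build_prompt(question, subgraph):
    """Serialize the subgraph as context for an LLM (format is an assumption)."""
    facts = "\n".join(f"({h}) -[{r}]-> ({t})" for h, r, t in subgraph)
    return f"Known facts:\n{facts}\n\nQuestion: {question}"


# Hypothetical toy graph; a real one would be queried from the graph database.
triples = [
    ("unguarded_edge", "occurs_in", "tunnel_excavation"),
    ("unguarded_edge", "rectified_by", "install_guardrail"),
    ("install_guardrail", "required_by", "regulation_clause_A"),
    ("exposed_cable", "rectified_by", "insulate_cable"),
]

# Scene-KG: the immediate neighborhood of the hazard the user reported.
scene_kg = extract_subgraph(triples, ["unguarded_edge"], max_hops=1)
# Inference-KG: one more hop, reaching the regulatory clause.
inference_kg = extract_subgraph(triples, ["unguarded_edge"], max_hops=2)

print(build_prompt("Which regulation applies to this hazard?", inference_kg))
```

In this toy graph the scene-KG contains the hazard, its location, and its rectification measure, while the extra hop in the inference-KG brings in the regulatory clause, which is the kind of context a compliance check would need.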
The proposed method effectively extracts domain knowledge from multimodal data and applies it to safety management in hydropower underground engineering. The results serve as a reference for transitioning infrastructure construction safety management from a data-driven approach to a knowledge-driven approach.