KEYWORDS: Data modeling, Education and training, Performance modeling, Systems modeling, Head, Transform theory, Reflection, Engineering, System integration, Statistical analysis
A temporal knowledge graph (TKG) abstracts the temporal information of entities and relations in the real world. If unknown temporal quadruples can be inferred, the long-term development of events can be predicted. However, current TKG reasoning methods struggle to model the relative temporal relations between quadruples, and they often lack sufficient reasoning information. Therefore, we propose a TKG reasoning model named TKBK, which combines formalized temporal knowledge and generative background knowledge. TKBK retrieves temporal knowledge from the TKG and generates background knowledge with large language models (LLMs). It uses a masking strategy to train a pre-trained language model, transforming the complex reasoning task into a masked-token prediction task. We evaluate the proposed model on two datasets. The results show that TKBK outperforms the baseline models on most metrics, demonstrating its effectiveness on TKG reasoning tasks.
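A minimal sketch of the masked-token prediction idea described above, using a generic pre-trained masked language model from Hugging Face Transformers; the prompt template, the "bert-base-uncased" checkpoint, and the background-knowledge string are assumptions for illustration, not TKBK's actual implementation.

```python
# Sketch: casting TKG quadruple completion (s, r, ?, t) as masked-token prediction.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def predict_masked_object(subject, relation, timestamp, background=""):
    # Serialize the incomplete quadruple plus optional LLM-generated background
    # text into a single masked sentence.
    prompt = f"{background} On {timestamp}, {subject} {relation} {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the [MASK] position and return the top-scoring tokens as candidates.
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    top_ids = logits[0, mask_pos].topk(5, dim=-1).indices[0]
    return [tokenizer.decode([int(i)]) for i in top_ids]

print(predict_masked_object("Germany", "negotiated with", "2014-06-01"))
```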
Offline reinforcement learning learns policies from static data. However, when deployed in the real world, models trained without safety constraints can behave unsafely. Related work penalizes unsafe behavior with negative rewards or restricts the agent's action space, which can make models overly conservative or overly aggressive and thus fail to balance task performance and safety. In this paper, we propose a safe offline reinforcement learning method based on knowledge constraints, which incorporates prior expert knowledge extracted from the static dataset. First, our algorithm uses an expert model to perform safety checks on state-action pairs, ensuring that the model learns from safe data. Second, we use adaptive adjustment factors to impose safety constraints on the current policy: when the policy is evaluated as unsafe, the constraints are tightened; when it is evaluated as safe, the model instead optimizes task performance. Experiments demonstrate that our algorithm outperforms baseline algorithms in safety, with reduced variance and improved training stability.
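A minimal sketch of the two mechanisms described above, namely expert-based safety filtering of state-action pairs and an adaptive constraint weight; the cost limit, step size, and the toy expert cost model are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: expert safety filtering plus an adaptive safety-penalty weight.
import numpy as np

class AdaptiveSafetyWeight:
    def __init__(self, cost_limit=0.1, lr=0.05, init_weight=1.0):
        self.cost_limit = cost_limit   # maximum tolerated expected cost
        self.lr = lr                   # adaptation step size
        self.weight = init_weight      # multiplier on the safety penalty

    def update(self, estimated_policy_cost):
        # Tighten the constraint when the policy is evaluated as unsafe,
        # relax it (down to zero) when it satisfies the cost limit.
        self.weight = max(0.0, self.weight + self.lr * (estimated_policy_cost - self.cost_limit))
        return self.weight

def filter_safe_transitions(batch, expert_cost, threshold=0.5):
    # Keep only state-action pairs the expert model deems safe.
    return [tr for tr in batch if expert_cost(tr["state"], tr["action"]) < threshold]

# Toy usage with a random batch and a stand-in "expert" cost model.
rng = np.random.default_rng(0)
batch = [{"state": rng.normal(size=4), "action": rng.normal(size=2)} for _ in range(8)]
expert_cost = lambda s, a: float(np.clip(np.abs(a).mean(), 0.0, 1.0))
safe_batch = filter_safe_transitions(batch, expert_cost)
weight = AdaptiveSafetyWeight().update(estimated_policy_cost=0.3)
print(len(safe_batch), round(weight, 3))
```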
In composite tasks with highly sparse rewards, agents often receive no reward feedback within a fixed number of time steps, which traps them in local optima and compromises their ability to explore better strategies. Skill learning is one approach to increasing the density of reward signals, enabling adaptation to multi-stage tasks and expediting learning. However, contemporary skill-acquisition methods rely heavily on online asynchronous training, and although certain intrinsic-motivation approaches handle sparse rewards well, they suffer from low sampling efficiency and limited interpretability of the learned skills. These issues slow model learning and severely impede the reusability of skill policies. In this study, we use expert demonstration data to facilitate the learning of skill policies, accelerating convergence while increasing the utilization of sample data, and then continue with interactive learning in the environment. Additionally, we define an evaluation criterion for skill redundancy that encourages selecting the most cost-effective skill among similar skill policies connecting the same initial and final states, helping the agent accomplish complex tasks efficiently. Our objective is to minimize ineffective and redundant exploration during skill acquisition. We evaluate our approach on the simulated UGV-Pyramid and UGV-Hallway tasks, both implemented in Unity3D. The results demonstrate the superiority of our algorithm over previous skill-learning methods.
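A minimal sketch of the skill-redundancy criterion described above: among skill rollouts that start and end in approximately the same states, only the cheapest one is kept. The distance threshold, the cost field, and the rollout format are illustrative assumptions rather than the paper's definition.

```python
# Sketch: drop redundant skills that connect the same initial and final states
# at higher cost.
import numpy as np

def deduplicate_skills(skill_rollouts, state_tol=0.5):
    """skill_rollouts: list of dicts with 'name', 'start', 'end' (np arrays), 'cost'."""
    kept = []
    for cand in sorted(skill_rollouts, key=lambda r: r["cost"]):
        redundant = any(
            np.linalg.norm(cand["start"] - k["start"]) < state_tol
            and np.linalg.norm(cand["end"] - k["end"]) < state_tol
            for k in kept
        )
        if not redundant:  # no cheaper equivalent kept yet, so keep this skill
            kept.append(cand)
    return kept

rollouts = [
    {"name": "push_long", "start": np.zeros(2), "end": np.ones(2), "cost": 12.0},
    {"name": "push_short", "start": np.zeros(2), "end": np.ones(2), "cost": 7.0},
    {"name": "climb", "start": np.zeros(2), "end": np.array([0.0, 2.0]), "cost": 9.0},
]
print([r["name"] for r in deduplicate_skills(rollouts)])  # ['push_short', 'climb']
```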
In intelligent unmanned ground vehicle (UGV) systems, decision-making algorithms often struggle to adapt to dynamically changing environments, and their generalization can be limited. Most existing decision-making algorithms achieve robust results only in the original scene; when transferred to a new scene under the same task, performance drops sharply. Moreover, when deployed on physical unmanned ground vehicles, they often fail to match the performance observed in simulation. To tackle these challenges, this paper proposes Scene Semantic Reconstruction for Unmanned Ground Vehicle Virtual-Real Integration (S2RU). S2RU decomposes the scene into abstract entities carrying object-level semantic information and then combines these entities using compositional neural radiance fields to enhance the capabilities of the UGV agent. The decision-making process is thus divided into two stages: in the first stage, concrete entities in the raw perceptual information are mapped to abstract entities and transformed into scene semantic maps; in the second stage, decisions are made on the basis of these semantic maps. We validate S2RU in both simulated and real-world environments, demonstrating robust transferability between them, cross-scene transfer for the same task, and the usability, completeness, and stability of the method. Results show that our method improves the success rate of a given task across different scenes by at least 20% compared with other virtual-real integration methods.
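A minimal sketch of the two-stage pipeline described above: stage one maps detected concrete objects to abstract semantic entities on a grid map, and stage two decides from that map. The class names, label mapping, grid layout, and trivial policy are assumptions for illustration, not S2RU's actual representation or decision model.

```python
# Sketch: two-stage decision making over a scene semantic map.
from dataclasses import dataclass
import numpy as np

@dataclass
class Detection:
    label: str          # e.g. "red_traffic_cone"
    position: tuple     # (x, y) in the vehicle frame, in meters

ABSTRACT_LABELS = {"red_traffic_cone": "obstacle", "person": "obstacle", "gate": "goal"}

def build_semantic_map(detections, grid_size=(20, 20), cell=0.5):
    # Stage 1: project object-level semantics onto a 2D map
    # (0 = free, 1 = obstacle, 2 = goal).
    grid = np.zeros(grid_size, dtype=np.int8)
    for d in detections:
        i, j = int(d.position[0] / cell), int(d.position[1] / cell)
        if 0 <= i < grid_size[0] and 0 <= j < grid_size[1]:
            grid[i, j] = {"obstacle": 1, "goal": 2}.get(ABSTRACT_LABELS.get(d.label, ""), 0)
    return grid

def decide(grid):
    # Stage 2: a trivial policy over the semantic map -- steer toward the goal cell.
    goals = np.argwhere(grid == 2)
    return "forward" if len(goals) == 0 else ("left" if goals[0][1] < grid.shape[1] // 2 else "right")

dets = [Detection("red_traffic_cone", (2.0, 3.0)), Detection("gate", (6.0, 8.0))]
print(decide(build_semantic_map(dets)))
```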