Ruohua Li

Visual Story Generation Based on Emotion and Keywords

Jan 07, 2023

Yuetian Chen, Ruohua Li, Bowen Shi, Peiru Liu, Mei Si

Abstract: Automated visual story generation aims to produce stories with corresponding illustrations that exhibit coherence, progression, and adherence to characters' emotional development. This work proposes a story generation pipeline that co-creates visual stories with users. The pipeline lets the user control the events and emotions in the generated content. It has two parts: narrative generation and image generation. For narrative generation, the system generates the next sentence from user-specified keywords and emotion labels. For image generation, diffusion models create a visually appealing image corresponding to each generated sentence. Further, object recognition is applied to the generated images so that objects in these images can be mentioned in future story development.
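The data flow described in the abstract can be sketched as a simple loop. This is only an illustrative sketch, not the authors' implementation: `co_create_story`, `StoryStep`, and the three stub functions are hypothetical names, and the stubs stand in for the language model, the diffusion model, and the object recognizer, none of which are specified here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Sequence, Tuple


@dataclass
class StoryStep:
    sentence: str                      # generated narrative sentence
    image: Any                         # illustration for that sentence
    objects: List[str] = field(default_factory=list)  # objects found in the image


def co_create_story(
    user_turns: Sequence[Tuple[List[str], str]],
    generate_sentence: Callable[[List[str], List[str], str, List[str]], str],
    generate_image: Callable[[str], Any],
    detect_objects: Callable[[Any], List[str]],
) -> List[StoryStep]:
    """Run one sentence/image step per (keywords, emotion) pair from the user."""
    story: List[StoryStep] = []
    known_objects: List[str] = []      # objects seen in earlier images
    for keywords, emotion in user_turns:
        context = [step.sentence for step in story]
        sentence = generate_sentence(context, keywords, emotion, list(known_objects))
        image = generate_image(sentence)
        objects = detect_objects(image)
        # Make newly detected objects available to later sentences.
        for obj in objects:
            if obj not in known_objects:
                known_objects.append(obj)
        story.append(StoryStep(sentence, image, objects))
    return story


# Hypothetical stand-ins for the three models, only to show the data flow.
def stub_sentence(context, keywords, emotion, objects):
    return f"[{emotion}] " + " ".join(keywords + objects)


def stub_image(sentence):
    return {"prompt": sentence}


def stub_detect(image):
    return ["dog"] if "park" in image["prompt"] else []


if __name__ == "__main__":
    turns = [(["park"], "joy"), (["home"], "sadness")]
    for step in co_create_story(turns, stub_sentence, stub_image, stub_detect):
        print(step.sentence, step.objects)
```

Passing the three models in as callables keeps the loop independent of any particular text generator, diffusion model, or detector.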

* 8 pages, 8 figures, AIIDE INT 2022

Learning Sparse Interaction Graphs of Partially Observed Pedestrians for Trajectory Prediction

Jul 19, 2021

Zhe Huang, Ruohua Li, Kazuki Shin, Katherine Driggs-Campbell

Abstract: Multi-pedestrian trajectory prediction is an indispensable safety element of autonomous systems that interact with crowds in unstructured environments. Many recent efforts have developed trajectory prediction algorithms with a focus on understanding the social norms behind pedestrian motions. Yet we observe that these works usually hold two assumptions that prevent them from being smoothly applied to robot applications: the positions of all pedestrians are consistently tracked, and the target agent pays attention to all pedestrians in the scene. The first assumption leads to biased interaction modeling with incomplete pedestrian data, and the second assumption introduces unnecessary disturbances and leads to the freezing robot problem. Thus, we propose the Gumbel Social Transformer, in which an Edge Gumbel Selector samples a sparse interaction graph of partially observed pedestrians at each time step. A Node Transformer Encoder and a Masked LSTM encode the pedestrian features with the sampled sparse graphs to predict trajectories. We demonstrate that our model overcomes the potential problems caused by the assumptions, and our approach outperforms the related works in benchmark evaluation.
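To illustrate the general idea of sampling a sparse set of edges among only the observed pedestrians, here is a minimal sketch of the generic Gumbel-softmax trick with an observation mask. It is not the paper's Edge Gumbel Selector: the function names, the `num_draws` and `temperature` parameters, and the example scores are all assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence


def sample_gumbel(rng: random.Random) -> float:
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(
    logits: Sequence[float],
    mask: Sequence[bool],
    temperature: float,
    rng: random.Random,
) -> List[float]:
    """Relaxed one-hot sample over the candidates whose mask entry is True."""
    if not any(mask):
        raise ValueError("at least one candidate must be observed")
    perturbed = [
        (logit + sample_gumbel(rng)) / temperature if seen else -math.inf
        for logit, seen in zip(logits, mask)
    ]
    peak = max(perturbed)
    exps = [math.exp(p - peak) for p in perturbed]  # masked entries become 0.0
    total = sum(exps)
    return [e / total for e in exps]


def sample_sparse_edges(
    logits: Sequence[float],
    mask: Sequence[bool],
    num_draws: int = 2,
    temperature: float = 0.5,
    seed: int = 0,
) -> List[int]:
    """Pick at most `num_draws` neighbor indices for one target pedestrian."""
    rng = random.Random(seed)
    selected = set()
    for _ in range(num_draws):
        probs = gumbel_softmax(logits, mask, temperature, rng)
        selected.add(max(range(len(probs)), key=probs.__getitem__))
    return sorted(selected)


if __name__ == "__main__":
    scores = [2.0, 0.1, -1.0, 0.5]        # hypothetical edge scores
    observed = [True, True, False, True]  # neighbor 2 is not tracked at this step
    print(sample_sparse_edges(scores, observed))
```

Masked neighbors get zero probability, so an untracked pedestrian can never be selected, and the number of draws caps how many edges the target attends to. In a trainable model the relaxed probabilities would typically carry gradients (for example through a straight-through estimator), which this plain-Python sketch does not attempt.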

* 10 pages, 3 figures