Abstract: A rhythm action game is a music-based video game in which the player is challenged to issue commands at the right timings during a music session. The timings are rendered in the chart, which consists of visual symbols, called notes, flying across the screen. KLab Inc., a Japan-based video game developer, has operated rhythm action games, including a title for the "Love Live!" franchise, which became a hit across Asia and beyond. Before this work, the company generated the charts manually, which made the business operation costly. This paper presents how KLab applied a deep generative model to synthesize charts and shows how it has improved the chart production process, reducing the business cost by half. Existing generative models produced poor-quality charts for easier difficulty modes. We report how we overcame this challenge through a multi-scaling model dedicated to rhythm actions that considers beats, among other things. Our model, named GenéLive!, is evaluated using production datasets at KLab as well as open datasets.
Abstract: In this paper, we propose a preprocessing method that makes conventional image and video encoders content-aware. After our preprocessing, a higher quality parameter can be set on a traditional encoder without increasing the output size. A still frame or image first goes through an object detector. If objects are detected, the properties of the detection result determine the parameters of the subsequent procedures; if no object is detected in the given frame, the system is bypassed. The method uses an adaptive quantization process to determine the portion of data to be dropped. It is primarily based on JPEG compression theory and is best suited to JPEG-based encoders such as JPEG and Motion JPEG encoders. However, other DCT-based encoders such as MPEG part 2 and H.264 can also benefit from it. In the experiments, we compare MS-SSIM at the same bitrate, as well as the bitrate improvement at similar MS-SSIM. Because the method is based on human perception, even at similar MS-SSIM the overall viewing experience is better than with directly encoded output.
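The abstract does not specify the adaptive quantization rule, so the following is only a minimal sketch of the general idea under stated assumptions: the detector output is taken to be a list of pixel bounding boxes, blocks that overlap a box are left untouched, and all other 8x8 blocks are quantized in the DCT domain with the standard JPEG luminance table scaled by a hypothetical factor `background_scale`. The function name `preprocess` and its parameters are illustrative and not taken from the paper.

```python
import numpy as np

# Standard JPEG luminance quantization table (ITU-T T.81, Annex K).
JPEG_LUMA_Q = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
], dtype=np.float64)


def dct_matrix(n=8):
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def preprocess(gray, boxes, background_scale=4.0):
    """Coarsely quantize 8x8 blocks that do not overlap any detected box.

    gray:  2D uint8 array (one luminance plane).
    boxes: list of (x0, y0, x1, y1) pixel boxes, x1/y1 exclusive.
           This detector output format is an assumption of the sketch.
    background_scale: hypothetical multiplier on the JPEG table for
           background blocks; larger values drop more detail.
    """
    if not boxes:
        return gray.copy()  # bypass: no object detected in this frame

    block = 8
    h, w = gray.shape
    out = gray.astype(np.float64)
    d = dct_matrix(block)
    q = JPEG_LUMA_Q * background_scale

    # Only full 8x8 blocks are processed; edge remainders are left as-is.
    for y in range(0, h - h % block, block):
        for x in range(0, w - w % block, block):
            overlaps = any(
                x < x1 and x + block > x0 and y < y1 and y + block > y0
                for x0, y0, x1, y1 in boxes
            )
            if overlaps:
                continue  # keep object regions untouched
            tile = out[y:y + block, x:x + block] - 128.0  # JPEG level shift
            coeffs = d @ tile @ d.T                       # forward 2D DCT
            coeffs = np.round(coeffs / q) * q             # quantize, dequantize
            out[y:y + block, x:x + block] = d.T @ coeffs @ d + 128.0

    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
    result = preprocess(frame, boxes=[(8, 8, 24, 24)])
    print("object region preserved:",
          np.array_equal(result[8:24, 8:24], frame[8:24, 8:24]))
```

Because the background has already lost its high-frequency detail, a downstream DCT-based encoder would be expected to spend fewer bits there, which is the kind of headroom the abstract describes for raising the quality parameter. How the detection properties map to quantization strength in the actual method is not stated in the abstract.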