Jia Jia

Pre-train and Fine-tune: Recommenders as Large Models

Jan 24, 2025

Learning Discriminative Features from Spectrograms Using Center Loss for Speech Emotion Recognition

Jan 02, 2025

Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT

Jan 02, 2025

Skinned Motion Retargeting with Dense Geometric Interaction Perception

Oct 28, 2024

Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization

Oct 16, 2024

VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding

Oct 11, 2024

VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling

Aug 28, 2024

PlacidDreamer: Advancing Harmony in Text-to-3D Generation

Jul 19, 2024

Enhancing Monotonic Modeling with Spatio-Temporal Adaptive Awareness in Diverse Marketing

Jun 20, 2024

DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance

Mar 20, 2024