Jingbo Zhu

Early Exit Is a Natural Capability in Transformer-based Models: An Empirical Study on Early Exit without Joint Optimization

Dec 02, 2024
Via arXiv

Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning

Nov 05, 2024
Via arXiv

Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models

Oct 07, 2024
Via arXiv

LRHP: Learning Representations for Human Preferences via Preference Pairs

Oct 06, 2024
Via arXiv

A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation

Sep 24, 2024
Via arXiv

More Effective LLM Compressed Tokens with Uniformly Spread Position Identifiers and Compression Loss

Sep 22, 2024
Via arXiv

NDP: Next Distribution Prediction as a More Broad Target

Aug 30, 2024
Via arXiv

RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data

Aug 22, 2024
Via arXiv

Cross-layer Attention Sharing for Large Language Models

Aug 04, 2024
Via arXiv

Translate-and-Revise: Boosting Large Language Models for Constrained Translation

Jul 18, 2024
Via arXiv