Picture for Nikolas Gritsch

Nikolas Gritsch

Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts

Add code
Aug 28, 2024
Viaarxiv icon

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts

Add code
Aug 15, 2024
Figure 1 for BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
Figure 2 for BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
Figure 3 for BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
Figure 4 for BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
Viaarxiv icon

Divergent Token Metrics: Measuring degradation to prune away LLM components -- and optimize quantization

Add code
Nov 13, 2023
Viaarxiv icon