Jacob Nielsen

Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models?

Feb 17, 2025

FlexDeMo: Decoupled Momentum Optimization for Fully and Hybrid Sharded Training

Feb 10, 2025

When are 1.58 bits enough? A Bottom-up Exploration of BitNet Quantization

Nov 08, 2024

Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

Dec 07, 2023