auxerta · labs
research note · v1.0 · 2026

Research.

Foundation models from the ground up.

Most large language models train the same way: predict the next token, scale up the data, scale up the model. We don't think that's the whole story.

We're training models on objectives that separate how they learn internal representations from how they produce tokens. The bet is that representations come out cleaner when they aren't bent to fit a specific output format. Three things follow if it works:

  • (i) Representations that don't carry the marks of next-token prediction. Better odds of transferring to reasoning, vision, and audio.
  • (ii) One trunk, many heads. Swap the decoder for a different modality without retraining the base.
  • (iii) Cheaper to change behavior at inference. Fine-tune the output side; leave the foundation alone.
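The "one trunk, many heads" layout above can be sketched in a few lines. This is a toy illustration, not Auxerta's architecture: the names (Trunk, TextHead, AudioHead) and the deterministic encoding are hypothetical stand-ins for the real representation-learning objective and decoder heads.

```python
# Illustrative sketch only: Trunk, TextHead, and AudioHead are invented
# names, and the arithmetic is a toy stand-in for learned components.

class Trunk:
    """Shared foundation: maps raw input to a fixed-size representation."""
    def __init__(self, dim: int = 8):
        self.dim = dim

    def encode(self, tokens: list[int]) -> list[float]:
        # Toy embedding: independent of any particular output format,
        # mimicking a representation not bent to next-token prediction.
        rep = [0.0] * self.dim
        for i, t in enumerate(tokens):
            rep[i % self.dim] += float(t)
        return rep


class TextHead:
    """Token-output decoder; swappable without retraining the trunk."""
    def decode(self, rep: list[float]) -> str:
        return "tok:" + ",".join(f"{x:.0f}" for x in rep)


class AudioHead:
    """A different-modality head reading the same trunk representation."""
    def decode(self, rep: list[float]) -> list[float]:
        return [x / 2.0 for x in rep]


trunk = Trunk()
rep = trunk.encode([1, 2, 3])
text = TextHead().decode(rep)    # fine-tune or replace this side only
audio = AudioHead().decode(rep)  # same trunk, different modality
```

The point of the sketch is the dependency direction: both heads consume the trunk's representation, so changing behavior at inference touches only a head, never the foundation.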

Cite as

Auxerta Labs. Foundation models from the ground up. Research note v1.0, 2026. auxerta.com/research