Epistemic status: an informal note.
It is common to use finetuning on a narrow data distribution, or narrow finetuning (NFT), to study AI safety. In these experiments, a model is trained on a very specific type of data, then evaluated for broader properties, such as a capability or general disposition.
Narrow finetuning is different from the training procedures that frontier AI companies use, like pretraining on the internet or posttraining on a diverse mixture of data and tasks. Here are some ways it is different:
---
Outline:
(00:31) Ways that narrow finetuning is different
(02:08) Anecdote
(03:05) Examples
(03:37) Counterpoints
(04:54) Takeaways
The original text contained 1 footnote which was omitted from this narration.
---
First published:
August 5th, 2025
Source:
https://www.lesswrong.com/posts/7emjxGADozzm7uwKL/narrow-finetuning-is-different
---
Narrated by TYPE III AUDIO.