Mats_9_extension_news_entry
I have been selected for the MATS extension phase in London, where I am continuing my research from the program. My research investigates the representation-level picture of alignment pretraining: it compares models that acquire a preference through post-training with models that have the preference incorporated directly during pretraining, and asks what mechanistic differences each route leaves behind in their internal representations.