What work is Redwood doing on LLM interpretability?

From Stampy's Wiki

Canonical Answer

Redwood is doing some work on interpretability tools, though when this was written we did not know of a published writeup of their interpretability results. As of April, they were focused on getting a complete understanding of nontrivial behaviors of relatively small models. They have released a website for visualizing transformers. Apart from the standard benefits of interpretability, one possibility is that this work might be helpful for solving the eliciting latent knowledge (ELK) problem.

Tags: redwood research

Question Info
Asked by: RoseMcClelland
Origin: Wiki
Date: 2022/09/13