this is actually very interesting.
i take it you’ve heard of the concept of “mechanistic interpretability”? perhaps you could learn something about your networks by implementing some of that methodology. here’s a glossary. i’d also recommend poking around Neel Nanda’s blog if you want to learn more.