Prob("not") as
function of the number of highest-weighted component vectors. 1681
components are required to get to get to 90% of the
full-representation probability 98.8% The encoding is HIGHLY
distributed. There are no simple "grandmother cells" for the "not"
concept. Figure[numComp]
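A minimal sketch of this kind of sweep, assuming the hidden state can be written as a weighted sum of component vectors; `weights`, `components`, `W_U` (the unembedding matrix), and `not_id` are placeholder names for illustration, not the original analysis code:

```python
import numpy as np

def prob_not_vs_k(weights, components, W_U, not_id):
    """Prob("not") as the top-k highest-|weight| components are accumulated.

    weights:    (n,)   scalar weight of each component vector
    components: (n, d) component vectors whose weighted sum is the hidden state
    W_U:        (d, V) unembedding matrix mapping hidden states to logits
    not_id:     vocabulary index of the "not" token
    """
    order = np.argsort(-np.abs(weights))        # largest |weight| first
    partial = np.zeros(components.shape[1])
    probs = []
    for idx in order:
        partial = partial + weights[idx] * components[idx]
        logits = partial @ W_U
        logits -= logits.max()                  # numerical stability
        p = np.exp(logits) / np.exp(logits).sum()
        probs.append(p[not_id])
    return np.array(probs)                      # probs[k-1] = Prob("not") using top-k components
```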
- Pushing the output components individually through the softmax shows that their entropies range tightly from 10.37294 to 10.37344 nats; for reference, the maximum-entropy uniform distribution sits at 10.37349 nats. The per-component token distributions are therefore close to uniform. What is interesting is the fine-scale, highly distributed weighting applied by the network, not the output components on their own (a sketch of this entropy check appears after this list).
- (Note: the analysis above simply adds subsets of the weighted components, sorted by weight magnitude. However, the entropy of a softmax (Boltzmann) distribution is controlled by temperature, i.e. by a constant factor scaling the magnitude. Perhaps the direction of each subset vector is already correct, and it is the lower magnitude from summing fewer terms that produces the more uniform distributions. Redoing the analysis with the weighted-subset hidden vector rescaled to the same magnitude as the full hidden vector leads to similar curves (see the rescaling sketch after this list). The direction of the hidden state is itself highly distributed.)
- Is this a feature or a bug for LLMs? A highly distributed encoding likely makes the model robust, but it makes interpretation of model internals almost impossible. It would be useful to look inside an LLM and find semantically cohesive concepts modeled locally in single vectors; one could then interpret how these clear concepts influence the generated text. An L1 penalty on the output vectors could make the interpretation semantically cleaner (a minimal sketch of such a penalty is included below).
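A sketch of the per-component entropy check, with the same placeholder names; the uniform-distribution maximum is ln(V) nats, which for a vocabulary of roughly 32k tokens lands near the 10.373 nats quoted above:

```python
import numpy as np

def softmax_entropy_nats(logits):
    """Shannon entropy (in nats) of the softmax distribution over a logit vector."""
    logits = logits - logits.max()              # numerical stability
    p = np.exp(logits)
    p /= p.sum()
    return -np.sum(p * np.log(p + 1e-12))

# vocab_size = W_U.shape[1]
# np.log(vocab_size)                            # uniform maximum, ~10.373 nats for ~32k tokens
# entropies = [softmax_entropy_nats(c @ W_U) for c in components]
```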
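The rescaling control from the parenthetical note can be sketched the same way: each partial sum is rescaled to the L2 norm of the full hidden state so that only its direction, not its magnitude, differs (placeholder names as above):

```python
import numpy as np

def prob_not_vs_k_rescaled(weights, components, W_U, not_id):
    """Same cumulative sweep, but each partial hidden vector is rescaled to the
    L2 norm of the full hidden state, removing the temperature effect of
    summing fewer terms."""
    full = (weights[:, None] * components).sum(axis=0)
    full_norm = np.linalg.norm(full)
    order = np.argsort(-np.abs(weights))
    partial = np.zeros_like(full)
    probs = []
    for idx in order:
        partial = partial + weights[idx] * components[idx]
        scaled = partial * (full_norm / (np.linalg.norm(partial) + 1e-12))
        logits = scaled @ W_U
        logits -= logits.max()
        p = np.exp(logits) / np.exp(logits).sum()
        probs.append(p[not_id])
    return np.array(probs)
```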
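And one way the suggested L1 penalty might be wired into a training loss; this is purely illustrative, with `base_loss`, `output_vectors`, and `lambda_l1` as hypothetical names rather than anything from an actual setup:

```python
import torch

def loss_with_l1(base_loss, output_vectors, lambda_l1=1e-4):
    """Add an L1 penalty on the output component vectors to encourage sparser,
    more locally interpretable encodings (hypothetical regularization term)."""
    return base_loss + lambda_l1 * output_vectors.abs().sum()
```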