One of the questions I've been thinking about a lot, looking back at the past year of interpretability research, is just how much of what we are finding is "what we're attuned to find" as opposed to "what's actually there."<p>Are we only measuring the tip of the iceberg, and have we simply converged on getting better at measuring that tip?