Similarity, Transformation and the Newly Found Invariance of Influence Functions

Authors
  • Andrew Yuan Liu (University of Toronto)
  • Gerald Penn (University of Toronto)

Abstract

Ensuring that semantic representations capture the actual meanings of sentences, to the exclusion of extraneous features, remains a difficult challenge despite the strong performance of representations like sBERT. We compare and contrast the semantic-encoding behaviours of sentence embeddings and influence functions, a resurgent method in language model interpretability, using meaning-preserving grammatical transformations. Under two tasks, sentence similarity and a new task called entity invariance, we seek to understand how these two measures of semantics warp under surface-level syntactic changes. Invariance to meaning-preserving transformations is an important respect in which sentence embeddings and influence functions appear to differ. Nevertheless, our experiments find that, across all of our tasks and transformations, sentence embeddings and influence functions are highly correlated. We conclude that there is evidence that influence functions point towards a deeper encoding of semantics.
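To make the embedding side of the comparison concrete, the sketch below (not the authors' code) shows one standard way to measure how cosine similarity between sBERT-style sentence embeddings shifts under a meaning-preserving transformation such as passivization. The model name and the sentence pair are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch, assuming the sentence-transformers library and an
# illustrative sBERT-style encoder; the sentence pair is a made-up example
# of a meaning-preserving (active -> passive) transformation.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice

active = "The committee approved the proposal."
passive = "The proposal was approved by the committee."  # same meaning, new surface form

# Encode both variants and compare; an embedding that is invariant to the
# transformation should yield a cosine similarity close to 1.0.
emb = model.encode([active, passive])
print(f"cosine similarity: {cos_sim(emb[0], emb[1]).item():.3f}")
```

A drop in this score under a surface-level syntactic change would indicate that the embedding is sensitive to form rather than meaning, which is the kind of behaviour the paper contrasts with influence functions.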

Keywords: grammar transformations, influence functions, semantic similarity

How to Cite:

Liu, A. Y. & Penn, G. (2025) “Similarity, Transformation and the Newly Found Invariance of Influence Functions”, Society for Computation in Linguistics 8(1): 9. doi: https://doi.org/10.7275/scil.3141


Published on 2025-06-13

Peer Reviewed