by Grace @gracekind.net
In "How might LLMs detect injected tokens?" I described two methods that LLMs could use to detect injected tokens in their output:
November 01, 2025
Let's say Claude is generating some text in an autoregressive fashion. The output might look something like:
October 08, 2025