Ask HN: How training of LLM dedicated to code is different from LLM of “text”

2 points by transformi over 1 year ago
Does it go beyond exposing the LLM only to "code", or are there extra steps in the training (like giving it the compiler/interpreter rules)? Since programming languages are more structured, I think that using grammars dedicated to those languages might be useful.

1 comment

sp332 over 1 year ago
The tokenizer might need tweaking. Base Llama models, for example, are trained on text that has had consecutive spaces reduced to a single space. This is unhelpful for coding, where specific amounts of whitespace are at least very nice to have and can even be meaningful.

When you talk about a grammar, that's in the decoder, right? You don't need to retrain a model to use one of those.
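A small illustration of the whitespace point (a sketch, not taken from any particular tokenizer's pipeline): if a text-normalization step collapses runs of spaces, Python source can stop parsing entirely, because the indentation is the structure.

```python
import re

# Original code vs. a "normalized" version where runs of spaces are collapsed,
# as a text-oriented preprocessing step might do.
code = "def f(x):\n    if x > 0:\n        return x\n    return -x\n"
normalized = re.sub(r" +", " ", code)

print(repr(code))
print(repr(normalized))

# The original compiles; the normalized version no longer parses.
compile(code, "<orig>", "exec")
try:
    compile(normalized, "<norm>", "exec")
except IndentationError as e:
    print("normalized code no longer parses:", e)
```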
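And a minimal, toy sketch of what "using a grammar in the decoder" can look like (my assumption of the general technique, not tied to any specific library; `VOCAB`, `toy_model_logits`, and `grammar_allows` are hypothetical stand-ins): the model itself is unchanged, and at each step the sampler simply masks out candidate tokens the grammar forbids, which is why no retraining is needed.

```python
import math
import random

VOCAB = ["print", "(", ")", "\"hi\"", "+", "1", "<eos>"]

def toy_model_logits(prefix):
    # Stand-in for a real language model: arbitrary scores over VOCAB.
    random.seed(len(prefix))  # deterministic for the demo
    return [random.uniform(-1, 1) for _ in VOCAB]

def grammar_allows(prefix, token):
    # Hand-written toy "grammar": output must look like print ( <literal> ).
    # Real systems would use a CFG/GBNF-style grammar instead of a table.
    allowed = {
        0: {"print"},
        1: {"("},
        2: {"\"hi\"", "1"},
        3: {")"},
        4: {"<eos>"},
    }
    return token in allowed.get(len(prefix), {"<eos>"})

def constrained_greedy_decode(max_len=10):
    prefix = []
    for _ in range(max_len):
        logits = toy_model_logits(prefix)
        # Mask tokens the grammar forbids, then pick the best remaining one.
        masked = [
            score if grammar_allows(prefix, tok) else -math.inf
            for tok, score in zip(VOCAB, logits)
        ]
        best = VOCAB[masked.index(max(masked))]
        if best == "<eos>":
            break
        prefix.append(best)
    return " ".join(prefix)

print(constrained_greedy_decode())  # e.g. print ( "hi" )
```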