[Discussion/Help/Wish] About Model Output Length Limitation (Is it a bug?)

NNNINE · February 12, 2026, 5:08pm

Recently, it has been found that when using certain APIs, truncation always occurs (for example, truncating at 8129), but other software that allows setting the model output length limit (such as setting it to twenty or thirty thousand) does not have this truncation issue (using the same API). I would like to ask what might be causing this problem? If possible, could this feature be added? (Perhaps it could be included in the token settings? Also, adjustments to the top-t and top-p values.)