Chat Markup Language (CML) 是一种用于描述对话结构的标记语言。它可以帮助大模型和 AI 助手之间的对话更加结构化和清晰。CML 可以描述对话中的各种元素,例如对话的开始和结束、用户和 AI 助手的发言、对话中的问题和回答等等。使用 CML 可以使得对话的处理更加方便和高效,同时也可以提高对话的可读性和可维护性。
DeepMind的相关研究指出,相关研究指出,LLM可以通过选取合适的prompt)来转化为对话代理。这些文本提示通常包含一种所谓的“系统”信息来定义 LLM 的角色,一种更好的结构化方法是ChatML,它对每个对话轮次进行包装,并使用预定义的特殊Token来表示询问或回答的角色。这种方法可以更好地区分对话中不同角色的发言,并且可以更准确地捕捉对话的语境和上下文。相比于简单的插入系统信息和角色信息的方法,ChatML更加灵活和可扩展,可以适应不同类型的对话场景和任务。
1 2 3 4 5 6 7 8 9 10 11
[ {"token": "<|im_start|>"}, "system\nYou are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.\nKnowledge cutoff: 2021-09-01\nCurrent date: 2023-03-01", {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"}, "user\nHow are you", {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"}, "assistant\nI am doing well!", {"token": "<|im_end|>"}, "\n", {"token": "<|im_start|>"}, "user\nHow are you now?", {"token": "<|im_end|>"}, "\n" ]
<|im_start|>system You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible. Knowledge cutoff: 2021-09-01 Current date: 2023-03-01<|im_end|> <|im_start|>user How are you<|im_end|> <|im_start|>assistant I am doing well!<|im_end|> <|im_start|>user How are you now?<|im_end|>
<|im_start|>system Translate from English to French <|im_end|> <|im_start|>system name=example_user How are you? <|im_end|> <|im_start|>system name=example_assistant Comment allez-vous? <|im_end|> <|im_start|>user {{user input here}}<|im_end|>