https://tokenhub-intl.tencentcloudmaas.com/v1 and authenticated using a TokenHub-specific API Key.

Model ID | Type | Reasoning Capability | Visual Capability | Video Capability |
kimi-k2.6 | General Conversation Model | Configurable (Enabled by default) | Supported | Supported |
kimi-k2.5 | General Conversation Model | Configurable (Enabled by default) | Supported | Not supported |
thinking parameter.

Feature | Kimi K2.6 / K2.5 | OpenAI / Claude / GLM, etc. |
Reasoning Capability Switch | Explicitly controlled via the thinking.type parameter | Typically controlled by switching the model or a separate reasoning parameter. |
Reasoning Process Field | Independently returned in the response as reasoning_content | Most models do not expose the reasoning process. |
Access Reasoning Fields via OpenAI SDK | Must use hasattr / getattr | - |
Retaining Reasoning Across Multi-turn Conversations | Controls whether to pass through historical reasoning_content via thinking.keep. | - |
temperature | Fixed at 1.0 in thinking mode and at 0.6 in non-thinking mode. | Freely adjustable between 0 and 2 by default. |
Recommended value for max_tokens | Greater than or equal to 16000 (shared quota for reasoning + response) | Typically 1024 to 4096 is sufficient. |
Multimodal Image Input | Supports two methods: Base64 encoding and direct public URL links | Generally support direct URL links. |
Video Input | Only supported by K2.6. | Most models do not support it. |
Writing back messages in Multi-turn Conversations | When thinking is enabled, both content and reasoning_content must be written back. | Typically only content needs to be written back. |
YOUR_API_KEY with the API Key you created.

curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "kimi-k2.6",
    "messages": [{"role": "user", "content": "Hello, please introduce yourself"}]
  }'
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

response = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[
        {"role": "user", "content": "Hello, please introduce yourself"},
    ],
)

print(response.choices[0].message.content)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const response = await client.chat.completions.create({
  model: 'kimi-k2.6',
  messages: [
    { role: 'user', content: 'Hello, please introduce yourself' },
  ],
});

console.log(response.choices[0].message.content);
// Using the OpenAI-compatible protocol, call the HTTP API directly with OkHttp
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class BasicChat {
    public static void main(String[] args) throws Exception {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "kimi-k2.6");
        body.put("messages", Arrays.asList(
            Map.of("role", "user", "content", "Hello, please introduce yourself")
        ));

        RequestBody requestBody = RequestBody.create(
            new Gson().toJson(body),
            MediaType.parse("application/json")
        );
        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(requestBody)
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	body := map[string]interface{}{
		"model": "kimi-k2.6",
		"messages": []map[string]string{
			{"role": "user", "content": "Hello, please introduce yourself"},
		},
	}
	payload, _ := json.Marshal(body)

	req, _ := http.NewRequest("POST",
		"https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
		bytes.NewBuffer(payload))
	req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		// Do returns a nil response on failure, so stop before touching resp.Body
		panic(err)
	}
	defer resp.Body.Close()

	data, _ := io.ReadAll(resp.Body)
	fmt.Println(string(data))
}
stream: true to obtain responses in a streaming manner, which facilitates the display of a typewriter effect and prevents long responses from triggering gateway timeouts.

curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "kimi-k2.6",
    "stream": true,
    "messages": [{"role": "user", "content": "Please introduce yourself in one sentence"}]
  }'
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

stream = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[{"role": "user", "content": "Please introduce yourself in one sentence"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
import OpenAI from 'openai';const client = new OpenAI({apiKey: 'YOUR_API_KEY',baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',});const stream = await client.chat.completions.create({model: 'kimi-k2.6',messages: [{ role: 'user', content: 'Please introduce yourself in one sentence' }],stream: true,});for await (const chunk of stream) {const content = chunk.choices?.[0]?.delta?.content;if (content) process.stdout.write(content);}
import okhttp3.*;import okhttp3.sse.*;import com.google.gson.Gson;import java.util.*;public class StreamChat {public static void main(String[] args) {Map<String, Object> body = new HashMap<>();body.put("model", "kimi-k2.6");body.put("stream", true);body.put("messages", List.of(Map.of("role", "user", "content", "Please introduce yourself in one sentence")));Request request = new Request.Builder().url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions").header("Authorization", "Bearer YOUR_API_KEY").header("Content-Type", "application/json").post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json"))).build();EventSources.createFactory(new OkHttpClient()).newEventSource(request,new EventSourceListener() {@Override public void onEvent(EventSource es, String id, String type, String data) {System.out.println(data);}});}}
package mainimport ("bufio""bytes""encoding/json""fmt""net/http""strings")func main() {body, _ := json.Marshal(map[string]interface{}{"model": "kimi-k2.6","stream": true,"messages": []map[string]string{{"role": "user", "content": "Please introduce yourself in one sentence"},},})req, _ := http.NewRequest("POST","https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",bytes.NewBuffer(body))req.Header.Set("Authorization", "Bearer YOUR_API_KEY")req.Header.Set("Content-Type", "application/json")resp, _ := http.DefaultClient.Do(req)defer resp.Body.Close()scanner := bufio.NewScanner(resp.Body)for scanner.Scan() {line := scanner.Text()if strings.HasPrefix(line, "data: ") {fmt.Println(strings.TrimPrefix(line, "data: "))}}}
system role messages.

curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "kimi-k2.6",
    "messages": [
      {"role": "system", "content": "You are Kimi, a rigorous physics teaching assistant, and your answers should incorporate simple analogies."},
      {"role": "user", "content": "Please explain what quantum entanglement is"}
    ]
  }'
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

response = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[
        {"role": "system", "content": "You are Kimi, a rigorous physics teaching assistant, and your answers should incorporate simple analogies."},
        {"role": "user", "content": "Please explain what quantum entanglement is"},
    ],
)

print(response.choices[0].message.content)
import OpenAI from 'openai';const client = new OpenAI({apiKey: 'YOUR_API_KEY',baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',});const response = await client.chat.completions.create({model: 'kimi-k2.6',messages: [{ role: 'system', content: 'You are Kimi, a rigorous physics teaching assistant, and your answers should incorporate simple analogies.' },{ role: 'user', content: 'Please explain what quantum entanglement is' },],});console.log(response.choices[0].message.content);
import okhttp3.*;import com.google.gson.Gson;import java.util.*;public class SystemPromptChat {public static void main(String[] args) throws Exception {Map<String, Object> body = new HashMap<>();body.put("model", "kimi-k2.6");body.put("messages", List.of(Map.of("role", "system", "content", "You are Kimi, a rigorous physics teaching assistant, and your answers should incorporate simple analogies."),Map.of("role", "user", "content", "Please explain what quantum entanglement is")));Request request = new Request.Builder().url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions").header("Authorization", "Bearer YOUR_API_KEY").post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json"))).build();try (Response response = new OkHttpClient().newCall(request).execute()) {System.out.println(response.body().string());}}}
package mainimport ("bytes""encoding/json""fmt""io""net/http")func main() {body, _ := json.Marshal(map[string]interface{}{"model": "kimi-k2.6","messages": []map[string]string{{"role": "system", "content": "You are Kimi, a rigorous physics teaching assistant, and your answers should incorporate simple analogies."},{"role": "user", "content": "Please explain what quantum entanglement is"},},})req, _ := http.NewRequest("POST","https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",bytes.NewBuffer(body))req.Header.Set("Authorization", "Bearer YOUR_API_KEY")req.Header.Set("Content-Type", "application/json")resp, _ := http.DefaultClient.Do(req)defer resp.Body.Close()data, _ := io.ReadAll(resp.Body)fmt.Println(string(data))}
messages history in each request. The following example does not enable reasoning, so only the content field needs to be written back. If reasoning is enabled, the reasoning_content must also be written back. For details, see Multi-turn Conversation Writing Back reasoning_content.

curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "kimi-k2.6",
    "messages": [
      {"role": "system", "content": "You are Kimi."},
      {"role": "user", "content": "Recommend a popular science book"},
      {"role": "assistant", "content": "I recommend \"A Brief History of Time\"."},
      {"role": "user", "content": "Recommend another advanced one"}
    ]
  }'
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

messages = [
    {"role": "system", "content": "You are Kimi."},
    {"role": "user", "content": "Recommend a popular science book"},
    {"role": "assistant", "content": "I recommend 'A Brief History of Time'."},
    {"role": "user", "content": "Recommend another advanced one"},
]

response = client.chat.completions.create(model="kimi-k2.6", messages=messages)
print(response.choices[0].message.content)
import OpenAI from 'openai';const client = new OpenAI({apiKey: 'YOUR_API_KEY',baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',});const messages = [{ role: 'system', content: 'You are Kimi.' },{"role": "user", "content": "Recommend a popular science book"},{"role": "assistant", "content": "I recommend 'A Brief History of Time'."},{"role": "user", "content": "Recommend another advanced one"}];const response = await client.chat.completions.create({model: 'kimi-k2.6',messages,});console.log(response.choices[0].message.content);
import okhttp3.*;import com.google.gson.Gson;import java.util.*;public class MultiTurnChat {public static void main(String[] args) throws Exception {Map<String, Object> body = new HashMap<>();body.put("model", "kimi-k2.6");body.put("messages", List.of(Map.of("role", "system", "content", "You are Kimi."),Map.of("role", "user", "content", "Recommend a popular science book"),Map.of("role", "assistant", "content", "I recommend 'A Brief History of Time'."),Map.of("role", "user", "content", "Recommend another advanced one")));Request request = new Request.Builder().url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions").header("Authorization", "Bearer YOUR_API_KEY").post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json"))).build();try (Response response = new OkHttpClient().newCall(request).execute()) {System.out.println(response.body().string());}}}
package mainimport ("bytes""encoding/json""fmt""io""net/http")func main() {body, _ := json.Marshal(map[string]interface{}{"model": "kimi-k2.6","messages": []map[string]string{{"role": "system", "content": "You are Kimi."},{"role": "user", "content": "Recommend a popular science book"},{"role": "assistant", "content": "I recommend 'A Brief History of Time'."},{"role": "user", "content": "Recommend another advanced one"}},})req, _ := http.NewRequest("POST","https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",bytes.NewBuffer(body))req.Header.Set("Authorization", "Bearer YOUR_API_KEY")req.Header.Set("Content-Type", "application/json")resp, _ := http.DefaultClient.Do(req)defer resp.Body.Close()data, _ := io.ReadAll(resp.Body)fmt.Println(string(data))}
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "kimi-k2.6",
    "messages": [{"role": "user", "content": "What is the weather like in Beijing today?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Query the weather for a specified city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string", "description": "City name"}},
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'
import json
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Query the weather for a specified city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What is the weather like in Beijing today?"}]

# Round 1: the model decides whether to call the tool
resp = client.chat.completions.create(
    model="kimi-k2.6", messages=messages, tools=tools, tool_choice="auto",
)
msg = resp.choices[0].message
messages.append(msg.model_dump(exclude_none=True))

# If the model chooses to call a tool, execute the tool and backfill the result
if msg.tool_calls:
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        # Replace with actual business logic here
        result = {"city": args["city"], "temperature": "22°C", "weather": "Sunny"}
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result, ensure_ascii=False),
        })
    # Round 2: return the tool result to the model and obtain the final response
    final = client.chat.completions.create(model="kimi-k2.6", messages=messages, tools=tools)
    print(final.choices[0].message.content)
else:
    print(msg.content)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const tools = [{
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Query the weather for a specified city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string', description: 'City name' } },
      required: ['city'],
    },
  },
}];

const messages = [{ role: 'user', content: 'What is the weather like in Beijing today?' }];

const resp = await client.chat.completions.create({
  model: 'kimi-k2.6',
  messages,
  tools,
  tool_choice: 'auto',
});
const msg = resp.choices[0].message;
messages.push(msg);

if (msg.tool_calls) {
  for (const call of msg.tool_calls) {
    const args = JSON.parse(call.function.arguments);
    // Replace with actual business logic here
    const result = { city: args.city, temperature: '22°C', weather: 'Sunny' };
    messages.push({
      role: 'tool',
      tool_call_id: call.id,
      content: JSON.stringify(result),
    });
  }
  const final = await client.chat.completions.create({
    model: 'kimi-k2.6',
    messages,
    tools,
  });
  console.log(final.choices[0].message.content);
} else {
  console.log(msg.content);
}
import okhttp3.*;
import com.google.gson.*;
import java.util.*;

public class FunctionCallingDemo {
    static final String URL = "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions";
    static final String API_KEY = "YOUR_API_KEY";
    static final OkHttpClient HTTP = new OkHttpClient();
    static final Gson GSON = new Gson();

    static String chat(List<Map<String, Object>> messages, List<Map<String, Object>> tools) throws Exception {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "kimi-k2.6");
        body.put("messages", messages);
        body.put("tools", tools);
        body.put("tool_choice", "auto");
        Request req = new Request.Builder()
            .url(URL)
            .header("Authorization", "Bearer " + API_KEY)
            .post(RequestBody.create(GSON.toJson(body), MediaType.parse("application/json")))
            .build();
        try (Response resp = HTTP.newCall(req).execute()) {
            return resp.body().string();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Map<String, Object>> tools = List.of(Map.of(
            "type", "function",
            "function", Map.of(
                "name", "get_weather",
                "description", "Query the weather for a specified city",
                "parameters", Map.of(
                    "type", "object",
                    "properties", Map.of("city", Map.of("type", "string", "description", "City name")),
                    "required", List.of("city")))));

        List<Map<String, Object>> messages = new ArrayList<>();
        messages.add(Map.of("role", "user", "content", "What is the weather like in Beijing today?"));

        // Round 1: the model decides whether to call the tool
        String r1 = chat(messages, tools);
        JsonObject msg = JsonParser.parseString(r1).getAsJsonObject()
            .getAsJsonArray("choices").get(0).getAsJsonObject()
            .getAsJsonObject("message");
        messages.add(GSON.fromJson(msg, Map.class));

        if (msg.has("tool_calls")) {
            for (JsonElement el : msg.getAsJsonArray("tool_calls")) {
                JsonObject call = el.getAsJsonObject();
                JsonObject argsObj = JsonParser.parseString(
                    call.getAsJsonObject("function").get("arguments").getAsString()
                ).getAsJsonObject();
                // Replace with actual business logic here
                Map<String, String> result = Map.of(
                    "city", argsObj.get("city").getAsString(),
                    "temperature", "22°C",
                    "weather", "Sunny");
                messages.add(Map.of(
                    "role", "tool",
                    "tool_call_id", call.get("id").getAsString(),
                    "content", GSON.toJson(result)));
            }
            // Round 2: return the tool result to the model
            System.out.println(chat(messages, tools));
        } else {
            System.out.println(msg.get("content").getAsString());
        }
    }
}
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

const (
	URL    = "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions"
	APIKEY = "YOUR_API_KEY"
)

func chat(messages []map[string]interface{}, tools []map[string]interface{}) (map[string]interface{}, error) {
	body, _ := json.Marshal(map[string]interface{}{
		"model":       "kimi-k2.6",
		"messages":    messages,
		"tools":       tools,
		"tool_choice": "auto",
	})
	req, _ := http.NewRequest("POST", URL, bytes.NewBuffer(body))
	req.Header.Set("Authorization", "Bearer "+APIKEY)
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	data, _ := io.ReadAll(resp.Body)
	var out map[string]interface{}
	json.Unmarshal(data, &out)
	return out, nil
}

func main() {
	tools := []map[string]interface{}{{
		"type": "function",
		"function": map[string]interface{}{
			"name":        "get_weather",
			"description": "Query the weather for a specified city",
			"parameters": map[string]interface{}{
				"type": "object",
				"properties": map[string]interface{}{
					"city": map[string]string{"type": "string", "description": "City name"},
				},
				"required": []string{"city"},
			},
		},
	}}
	messages := []map[string]interface{}{
		{"role": "user", "content": "What is the weather like in Beijing today?"},
	}

	// Round 1: the model decides whether to call the tool
	r1, _ := chat(messages, tools)
	msg := r1["choices"].([]interface{})[0].(map[string]interface{})["message"].(map[string]interface{})
	messages = append(messages, msg)

	if calls, ok := msg["tool_calls"].([]interface{}); ok {
		for _, c := range calls {
			call := c.(map[string]interface{})
			argsStr := call["function"].(map[string]interface{})["arguments"].(string)
			var args map[string]string
			json.Unmarshal([]byte(argsStr), &args)
			// Replace with actual business logic here
			result, _ := json.Marshal(map[string]string{
				"city":        args["city"],
				"temperature": "22°C",
				"weather":     "Sunny",
			})
			messages = append(messages, map[string]interface{}{
				"role":         "tool",
				"tool_call_id": call["id"],
				"content":      string(result),
			})
		}
		// Round 2: return the tool result to the model
		r2, _ := chat(messages, tools)
		fmt.Printf("%+v\n", r2)
	} else {
		fmt.Println(msg["content"])
	}
}
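When more than one tool is declared, the per-call branch in the examples above is often replaced by a lookup from tool name to local function. The following is a minimal sketch of that idea and is not part of the TokenHub API: TOOL_REGISTRY, get_weather, and run_tool_calls are hypothetical names, and each tool call is assumed to expose id, function.name, and function.arguments as in the Python example above.

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> dict:
    """Hypothetical local tool; replace with real business logic."""
    return {"city": city, "temperature": "22°C", "weather": "Sunny"}


# Hypothetical registry: tool name (as declared in `tools`) -> local callable.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls) -> list:
    """Run each requested tool and build the `role: tool` messages to append."""
    tool_messages = []
    for call in tool_calls:
        func = TOOL_REGISTRY.get(call.function.name)
        if func is None:
            # Report unknown tools back to the model instead of raising.
            result = {"error": f"unknown tool: {call.function.name}"}
        else:
            args = json.loads(call.function.arguments or "{}")
            result = func(**args)
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result, ensure_ascii=False),
        })
    return tool_messages


if __name__ == "__main__":
    # Stand-in for msg.tool_calls from a real response.
    fake_call = SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(name="get_weather", arguments='{"city": "Beijing"}'),
    )
    print(run_tool_calls([fake_call]))
```

In the two-round flow, the returned list would be appended to messages with messages.extend(...) before the second request.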
thinking field, which is its key distinction from models like OpenAI / GLM.

thinking Parameter

The thinking field resides at the top level of the request body and has the following structure:

"thinking": {
  "type": "enabled",
  "keep": "all"
}
Field | Type | Default Value | Value | Description |
type | string | "enabled" | "enabled" / "disabled" | Whether thinking capability is enabled for the current request. |
keep | string or null | null | "all" / not passed | Whether to pass through historical reasoning_content in multi-turn conversations. |
thinking is not a standard OpenAI field. When using the official SDK, you must pass it through extra_body (Python) or directly at the top level (Node.js). For direct HTTP calls, place it at the top level of the request body.

reasoning_content Field

A reasoning_content field, at the same level as content, is added to the response message to carry the model's reasoning process:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "reasoning_content": "First, we need to analyze...",
      "content": "The final answer is ..."
    }
  }]
}
The ChatCompletionMessage / ChoiceDelta types in the official OpenAI SDK do not directly declare the reasoning_content attribute. Therefore, you cannot directly access it via obj.reasoning_content and must use the following method:

# ❌ Error
content = message.reasoning_content

# ✅ Correct
if hasattr(message, "reasoning_content"):
    content = getattr(message, "reasoning_content")
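The hasattr / getattr check above can be folded into a small helper so that call sites stay short. This is a sketch only: get_reasoning is a hypothetical name, and SimpleNamespace stands in for the SDK's message and delta objects so the snippet runs without a network call.

```python
from types import SimpleNamespace
from typing import Any, Optional


def get_reasoning(obj: Any) -> Optional[str]:
    """Return reasoning_content from a message or delta, or None if absent or empty."""
    # The three-argument getattr covers both "attribute missing" and "attribute is None".
    value = getattr(obj, "reasoning_content", None)
    return value if value else None


if __name__ == "__main__":
    with_reasoning = SimpleNamespace(content="42", reasoning_content="First, we need to analyze...")
    without_reasoning = SimpleNamespace(content="42")
    print(get_reasoning(with_reasoning))
    print(get_reasoning(without_reasoning))
```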
stream: true): reasoning_content is always fully output before the content, allowing the UI to distinguish between the thinking and answering states.

curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "kimi-k2.6",
    "max_tokens": 32768,
    "stream": true,
    "thinking": {"type": "enabled"},
    "messages": [{"role": "user", "content": "Explain the Fourier transform in one sentence."}]
  }'
# The response is an SSE stream: each `data:` line contains a chunk,
# delta.reasoning_content always appears before delta.content
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

stream = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[{"role": "user", "content": "Explain the Fourier transform in one sentence."}],
    max_tokens=32768,
    stream=True,
    extra_body={"thinking": {"type": "enabled"}},
)

thinking = False
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # Thinking phase
    if hasattr(delta, "reasoning_content") and getattr(delta, "reasoning_content"):
        if not thinking:
            print("=== Start Thinking ===")
            thinking = True
        print(getattr(delta, "reasoning_content"), end="", flush=True)
    # Answering phase
    if delta.content:
        if thinking:
            print("\n=== Thinking Completed ===")
            thinking = False
        print(delta.content, end="", flush=True)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

// The Node.js SDK does not natively support the thinking / extra_body fields, so you can directly expand them to the top level.
// Note: If using TypeScript, append `as any` after the last object to bypass type checking.
const stream = await client.chat.completions.create({
  model: 'kimi-k2.6',
  messages: [{ role: 'user', content: 'Explain the Fourier transform in one sentence.' }],
  max_tokens: 32768,
  stream: true,
  thinking: { type: 'enabled' },
});

let thinking = false;
for await (const chunk of stream) {
  const delta = chunk.choices?.[0]?.delta;
  if (!delta) continue;
  if (delta.reasoning_content) {
    if (!thinking) { console.log('=== Start Thinking ==='); thinking = true; }
    process.stdout.write(delta.reasoning_content);
  }
  if (delta.content) {
    if (thinking) { console.log('\n=== Thinking Completed ==='); thinking = false; }
    process.stdout.write(delta.content);
  }
}
import okhttp3.*;import okhttp3.sse.*;import com.google.gson.*;import java.util.*;public class ThinkingStream {public static void main(String[] args) {Map<String, Object> body = new HashMap<>();body.put("model", "kimi-k2.6");body.put("max_tokens", 32768);body.put("stream", true);body.put("thinking", Map.of("type", "enabled"));body.put("messages", List.of(Map.of("role", "user", "content", "Explain the Fourier transform in one sentence.")));Request request = new Request.Builder().url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions").header("Authorization", "Bearer YOUR_API_KEY").header("Content-Type", "application/json").post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json"))).build();EventSources.createFactory(new OkHttpClient()).newEventSource(request,new EventSourceListener() {@Override public void onEvent(EventSource es, String id, String type, String data) {if ("[DONE]".equals(data)) return;JsonObject delta = JsonParser.parseString(data).getAsJsonObject().getAsJsonArray("choices").get(0).getAsJsonObject().getAsJsonObject("delta");if (delta.has("reasoning_content")) {System.out.print(delta.get("reasoning_content").getAsString());}if (delta.has("content") && !delta.get("content").isJsonNull()) {System.out.print(delta.get("content").getAsString());}}});}}
package mainimport ("bufio""bytes""encoding/json""fmt""net/http""strings")func main() {body, _ := json.Marshal(map[string]interface{}{"model": "kimi-k2.6","max_tokens": 32768,"stream": true,"thinking": map[string]string{"type": "enabled"},"messages": []map[string]string{{"role": "user", "content": "Explain the Fourier transform in one sentence."},},})req, _ := http.NewRequest("POST","https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",bytes.NewBuffer(body))req.Header.Set("Authorization", "Bearer YOUR_API_KEY")req.Header.Set("Content-Type", "application/json")resp, _ := http.DefaultClient.Do(req)defer resp.Body.Close()scanner := bufio.NewScanner(resp.Body)thinking := falsefor scanner.Scan() {line := scanner.Text()if !strings.HasPrefix(line, "data: ") {continue}data := strings.TrimPrefix(line, "data: ")if data == "[DONE]" {break}var chunk map[string]interface{}if err := json.Unmarshal([]byte(data), &chunk); err != nil {continue}choices, _ := chunk["choices"].([]interface{})if len(choices) == 0 {continue}delta, _ := choices[0].(map[string]interface{})["delta"].(map[string]interface{})if rc, ok := delta["reasoning_content"].(string); ok && rc != "" {if !thinking {fmt.Println("=== Start Thinking ===")thinking = true}fmt.Print(rc)}if c, ok := delta["content"].(string); ok && c != "" {if thinking {fmt.Println("\\n=== Thinking Completed ===")thinking = false}fmt.Print(c)}}}
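The streaming examples above print each delta as it arrives. If the turn will later be written back into messages, the two phases also need to be accumulated separately. A minimal sketch, assuming chunks shaped like the ones in the Python example (choices[0].delta with optional reasoning_content and content); collect_stream is a hypothetical helper name, and SimpleNamespace objects stand in for real SDK chunks:

```python
from types import SimpleNamespace
from typing import Iterable, Tuple


def collect_stream(chunks: Iterable) -> Tuple[str, str]:
    """Accumulate a streamed reply into (reasoning_content, content)."""
    reasoning_parts = []
    answer_parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks may carry no choices
        delta = chunk.choices[0].delta
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            reasoning_parts.append(reasoning)
        content = getattr(delta, "content", None)
        if content:
            answer_parts.append(content)
    return "".join(reasoning_parts), "".join(answer_parts)


def fake_chunk(**delta_fields):
    """Build a stand-in chunk for local experimentation."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(**delta_fields))])


if __name__ == "__main__":
    demo = [
        fake_chunk(reasoning_content="Recall the definition. ", content=None),
        fake_chunk(reasoning_content="Summarize it.", content=None),
        fake_chunk(content="It decomposes a signal "),
        fake_chunk(content="into frequencies."),
    ]
    reasoning, answer = collect_stream(demo)
    print(reasoning)
    print(answer)
```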
thinking.keep controls whether the historical turns' reasoning_content participates in the next round of reasoning:

Value | Meaning | Use Cases |
Not passed / null (default) | The reasoning content from historical rounds is not passed through, resulting in a shorter context and lower cost. | General multi-turn conversation |
"all" | Retains the reasoning process from historical rounds in full, enabling the model to continue its previous line of thought. | Complex multi-step reasoning, Agent tool calling, long-range code tasks |
keep only affects whether the historical thinking is passed to the model, and does not affect whether the current turn generates thinking. It is recommended to use it in conjunction with type: "enabled" in scenarios requiring continuous reasoning.

Write back the reasoning_content and content returned by the previous API call together to the messages; otherwise, the model will lose the reasoning thread in subsequent turns.

curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "kimi-k2.6",
    "stream": true,
    "thinking": {"type": "enabled", "keep": "all"},
    "messages": [
      {"role": "system", "content": "You are Kimi."},
      {"role": "user", "content": "The first question..."},
      {
        "role": "assistant",
        "reasoning_content": "<reasoning_content returned by the previous API call>",
        "content": "<content returned by the previous API call>"
      },
      {"role": "user", "content": "Please continue the deduction based on the previous analysis."}
    ]
  }'
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

messages = [
    {"role": "system", "content": "You are Kimi."},
    {"role": "user", "content": "The first question..."},
    {
        "role": "assistant",
        "reasoning_content": "<reasoning_content returned by the previous API call>",
        "content": "<content returned by the previous API call>",
    },
    {"role": "user", "content": "Please continue the deduction based on the previous analysis."},
]

response = client.chat.completions.create(
    model="kimi-k2.6",
    messages=messages,
    stream=True,
    extra_body={"thinking": {"type": "enabled", "keep": "all"}},
)

for chunk in response:
    # Skip chunks without choices, as in the streaming example
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print(reasoning, end="", flush=True)
    if delta.content:
        print(delta.content, end="", flush=True)
print()
import OpenAI from 'openai';const client = new OpenAI({apiKey: 'YOUR_API_KEY',baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',});const messages = [{ role: 'system', content: 'You are Kimi.' },{ role: 'user', content: 'The first question...' },{role: 'assistant',reasoning_content: '<reasoning_content returned by the previous API call>',content: '<content returned by the previous API call>',},{ role: 'user', content: 'Please continue the deduction based on the previous analysis.' },];const stream = await client.chat.completions.create({model: 'kimi-k2.6',messages,stream: true,thinking: { type: 'enabled', keep: 'all' },});for await (const chunk of stream) {const delta = chunk.choices?.[0]?.delta;if (!delta) continue;if (delta.reasoning_content) process.stdout.write(delta.reasoning_content);if (delta.content) process.stdout.write(delta.content);}console.log();
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class MultiTurnWithThinking {
    // execute() and readLine() throw IOException, so main must declare it
    public static void main(String[] args) throws Exception {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "kimi-k2.6");
        body.put("stream", true);
        body.put("thinking", Map.of("type", "enabled", "keep", "all"));
        body.put("messages", List.of(
            Map.of("role", "system", "content", "You are Kimi."),
            Map.of("role", "user", "content", "The first question..."),
            Map.of("role", "assistant",
                "reasoning_content", "<reasoning_content returned by the previous API call>",
                "content", "<content returned by the previous API call>"),
            Map.of("role", "user", "content", "Please continue the deduction based on the previous analysis.")
        ));

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute();
             java.io.BufferedReader r = new java.io.BufferedReader(
                 new java.io.InputStreamReader(response.body().byteStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                if (line.startsWith("data: ")) System.out.println(line.substring(6));
            }
        }
    }
}
package mainimport ("bufio""bytes""encoding/json""fmt""net/http""strings")func main() {body, _ := json.Marshal(map[string]interface{}{"model": "kimi-k2.6","stream": true,"thinking": map[string]string{"type": "enabled", "keep": "all"},"messages": []map[string]interface{}{{"role": "system", "content": "You are Kimi."},{"role": "user", "content": "The first question..."},{"role": "assistant","reasoning_content": "<reasoning_content returned by the previous API call>","content": "<content returned by the previous API call>",},{"role": "user", "content": "Please continue the deduction based on the previous analysis."},},})req, _ := http.NewRequest("POST","https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",bytes.NewBuffer(body))req.Header.Set("Authorization", "Bearer YOUR_API_KEY")req.Header.Set("Content-Type", "application/json")resp, _ := http.DefaultClient.Do(req)defer resp.Body.Close()scanner := bufio.NewScanner(resp.Body)for scanner.Scan() {line := scanner.Text()if strings.HasPrefix(line, "data: ") {fmt.Println(strings.TrimPrefix(line, "data: "))}}}
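The multi-turn examples above use placeholder strings for the previous turn. In practice the assistant entry is built from the response that was just received. The sketch below shows one way to do that; to_history_message is a hypothetical helper, and the message object is assumed to expose content and, when thinking is enabled, reasoning_content.

```python
from types import SimpleNamespace


def to_history_message(message) -> dict:
    """Convert a returned assistant message into an entry for the next request's messages."""
    entry = {"role": "assistant", "content": message.content or ""}
    # reasoning_content is only present when thinking was enabled for that turn.
    reasoning = getattr(message, "reasoning_content", None)
    if reasoning:
        entry["reasoning_content"] = reasoning
    return entry


if __name__ == "__main__":
    # Stand-in for response.choices[0].message from a thinking-enabled call.
    previous = SimpleNamespace(
        content="The final answer is ...",
        reasoning_content="First, we need to analyze...",
    )
    history = [{"role": "user", "content": "The first question..."}]
    history.append(to_history_message(previous))
    history.append({"role": "user", "content": "Please continue the deduction based on the previous analysis."})
    print(history)
```

Leaving the key out when no reasoning was returned keeps the same helper usable for turns where thinking is disabled and only content needs to be written back.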
# First, read the image as base64
IMAGE_B64=$(base64 -i image.jpg | tr -d '\n')

# Use a temporary file to pass the body, avoiding the "Argument list too long" error triggered by an excessively large base64 string.
cat > /tmp/req.json <<EOF
{
  "model": "kimi-k2.6",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Please describe this picture."},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,${IMAGE_B64}"}}
      ]
    }
  ]
}
EOF

curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d @/tmp/req.json
import base64
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

# Read the local image and encode it as Base64.
with open("image.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please describe this picture"},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
import fs from 'node:fs';
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

// Read the local image and encode it as Base64.
const imageB64 = fs.readFileSync('image.jpg').toString('base64');

const response = await client.chat.completions.create({
  model: 'kimi-k2.6',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Please describe this picture' },
        { type: 'image_url', image_url: { url: `data:image/jpeg;base64,${imageB64}` } },
      ],
    },
  ],
});
console.log(response.choices[0].message.content);
import okhttp3.*;
import com.google.gson.Gson;
import java.nio.file.*;
import java.util.*;

public class ImageBase64Chat {
    public static void main(String[] args) throws Exception {
        // Read the local image and encode it as Base64.
        byte[] bytes = Files.readAllBytes(Paths.get("image.jpg"));
        String imageB64 = Base64.getEncoder().encodeToString(bytes);

        Map<String, Object> body = new HashMap<>();
        body.put("model", "kimi-k2.6");
        body.put("messages", List.of(
            Map.of("role", "user",
                "content", List.of(
                    Map.of("type", "text", "text", "Please describe this picture"),
                    Map.of("type", "image_url",
                        "image_url", Map.of("url", "data:image/jpeg;base64," + imageB64))
                ))
        ));

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
package main

import (
    "bytes"
    "encoding/base64"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
)

func main() {
    // Read the local image and encode it as Base64.
    img, err := os.ReadFile("image.jpg")
    if err != nil {
        panic(err)
    }
    imageB64 := base64.StdEncoding.EncodeToString(img)

    body, _ := json.Marshal(map[string]interface{}{
        "model": "kimi-k2.6",
        "messages": []map[string]interface{}{
            {
                "role": "user",
                "content": []map[string]interface{}{
                    {"type": "text", "text": "Please describe this picture"},
                    {
                        "type": "image_url",
                        "image_url": map[string]string{
                            "url": "data:image/jpeg;base64," + imageB64,
                        },
                    },
                },
            },
        },
    })

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err) // resp is nil on failure, so stop before using it
    }
    defer resp.Body.Close()

    data, _ := io.ReadAll(resp.Body)
    fmt.Println(string(data))
}
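Base64 encoding enlarges a file by roughly one third, which matters because this page states that the body of a single request must not exceed 100 MB. The sketch below estimates the encoded size before a file is read and sent. It assumes that "100 MB" means 100 × 1024 × 1024 bytes and reserves an arbitrary 4096 bytes for the surrounding JSON; `base64_length` and `fits_in_request` are hypothetical helper names.

```python
# Assumption: the documented 100 MB limit is interpreted as 100 MiB.
MAX_BODY_BYTES = 100 * 1024 * 1024


def base64_length(raw_size: int) -> int:
    """Length in bytes of the padded Base64 encoding of raw_size bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((raw_size + 2) // 3)


def fits_in_request(raw_size: int, overhead: int = 4096) -> bool:
    """True if the encoded file plus a rough JSON overhead stays within the limit."""
    return base64_length(raw_size) + overhead <= MAX_BODY_BYTES
```

For example, `fits_in_request(os.path.getsize("image.jpg"))` could be checked before encoding; when it returns `False`, hosting the file and passing a URL is the likely alternative.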
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "kimi-k2.6",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Please describe this picture"},
          {"type": "image_url", "image_url": {"url": "https://www.gstatic.com/webp/gallery/1.jpg"}}
        ]
      }
    ]
  }'
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

response = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please describe this picture"},
                {"type": "image_url", "image_url": {"url": "https://www.gstatic.com/webp/gallery/1.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const response = await client.chat.completions.create({
  model: 'kimi-k2.6',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Please describe this picture' },
        { type: 'image_url', image_url: { url: 'https://www.gstatic.com/webp/gallery/1.jpg' } },
      ],
    },
  ],
});
console.log(response.choices[0].message.content);
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class ImageUrlChat {
    public static void main(String[] args) throws Exception {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "kimi-k2.6");
        body.put("messages", List.of(
            Map.of("role", "user",
                "content", List.of(
                    Map.of("type", "text", "text", "Please describe this picture"),
                    Map.of("type", "image_url",
                        "image_url", Map.of("url", "https://www.gstatic.com/webp/gallery/1.jpg"))
                ))
        ));

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    body, _ := json.Marshal(map[string]interface{}{
        "model": "kimi-k2.6",
        "messages": []map[string]interface{}{
            {
                "role": "user",
                "content": []map[string]interface{}{
                    {"type": "text", "text": "Please describe this picture"},
                    {
                        "type":      "image_url",
                        "image_url": map[string]string{"url": "https://www.gstatic.com/webp/gallery/1.jpg"},
                    },
                },
            },
        },
    })

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err) // resp is nil on failure, so stop before using it
    }
    defer resp.Body.Close()

    data, _ := io.ReadAll(resp.Body)
    fmt.Println(string(data))
}
data:video/<format>;base64,... format:

# First, read the video as base64
VIDEO_B64=$(base64 -i demo.mp4 | tr -d '\n')

# Use a temporary file to pass the body, avoiding the "Argument list too long" error triggered by an excessively large base64 string.
cat > /tmp/req.json <<EOF
{
  "model": "kimi-k2.6",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Please summarize the video content"},
        {"type": "video_url", "video_url": {"url": "data:video/mp4;base64,${VIDEO_B64}"}}
      ]
    }
  ]
}
EOF

curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d @/tmp/req.json
import base64
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

# Read the local video and encode it as Base64.
with open("demo.mp4", "rb") as f:
    video_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please summarize the video content"},
                {"type": "video_url", "video_url": {"url": f"data:video/mp4;base64,{video_b64}"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
import fs from 'node:fs';
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

// Read the local video and encode it as Base64.
const videoB64 = fs.readFileSync('demo.mp4').toString('base64');

const response = await client.chat.completions.create({
  model: 'kimi-k2.6',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Please summarize the video content' },
        { type: 'video_url', video_url: { url: `data:video/mp4;base64,${videoB64}` } },
      ],
    },
  ],
});
console.log(response.choices[0].message.content);
import okhttp3.*;
import com.google.gson.Gson;
import java.nio.file.*;
import java.util.*;

public class VideoChat {
    public static void main(String[] args) throws Exception {
        // Read the local video and encode it as Base64.
        byte[] bytes = Files.readAllBytes(Paths.get("demo.mp4"));
        String videoB64 = Base64.getEncoder().encodeToString(bytes);

        Map<String, Object> body = new HashMap<>();
        body.put("model", "kimi-k2.6");
        body.put("messages", List.of(
            Map.of("role", "user",
                "content", List.of(
                    Map.of("type", "text", "text", "Please summarize the video content"),
                    Map.of("type", "video_url",
                        "video_url", Map.of("url", "data:video/mp4;base64," + videoB64))
                ))
        ));

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
package main

import (
    "bytes"
    "encoding/base64"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
)

func main() {
    // Read the local video and encode it as Base64.
    video, err := os.ReadFile("demo.mp4")
    if err != nil {
        panic(err)
    }
    videoB64 := base64.StdEncoding.EncodeToString(video)

    body, _ := json.Marshal(map[string]interface{}{
        "model": "kimi-k2.6",
        "messages": []map[string]interface{}{
            {
                "role": "user",
                "content": []map[string]interface{}{
                    {"type": "text", "text": "Please summarize the video content"},
                    {
                        "type": "video_url",
                        "video_url": map[string]string{
                            "url": "data:video/mp4;base64," + videoB64,
                        },
                    },
                },
            },
        },
    })

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err) // resp is nil on failure, so stop before using it
    }
    defer resp.Body.Close()

    data, _ := io.ReadAll(resp.Body)
    fmt.Println(string(data))
}
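The image and video examples above hard-code the MIME type (`image/jpeg`, `video/mp4`) and the part type (`image_url`, `video_url`). A possible way to avoid repeating that by hand is sketched below: it guesses the MIME type from the file name with the standard `mimetypes` module and builds the matching content part. `media_part` is a hypothetical helper, and which image and video formats the service actually accepts is not stated here, so the check only distinguishes `image/*` from `video/*`.

```python
import base64
import mimetypes


def media_part(data: bytes, filename: str) -> dict:
    """Build an image_url or video_url content part with a Base64 data URL."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith(("image/", "video/")):
        raise ValueError(f"unsupported or unknown media type for {filename!r}")
    # The part type follows the examples above: image_url for images, video_url for videos.
    kind = "image_url" if mime.startswith("image/") else "video_url"
    url = f"data:{mime};base64,{base64.b64encode(data).decode('ascii')}"
    return {"type": kind, kind: {"url": url}}
```

A message could then be assembled as `{"role": "user", "content": [{"type": "text", "text": "..."}, media_part(open("demo.mp4", "rb").read(), "demo.mp4")]}`. Per the model table, video parts apply only to kimi-k2.6.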
Parameter / Practice | Recommendation | Description |
max_tokens | Greater than or equal to 16000 (32768 recommended) | Reasoning + response share the max_tokens quota. If the quota is too small, the content is prone to truncation. |
temperature | Do not set explicitly. | The K2.6 / K2.5 series applies a fixed temperature (per the comparison table above, 1.0 in thinking mode and 0.6 in non-thinking mode). Passing any other value returns a 400 error such as "invalid temperature: only 0.6 is allowed". Omit this parameter entirely. |
stream | Enable (recommended). | The output is longer in thinking mode. Streaming can improve the experience and avoid gateway timeouts. |
Multimodal priority | URL direct links > Base64 | Prefer URLs for publicly reachable images and use Base64 for local files. The body of a single request must not exceed 100 MB. |
Writing back messages in Multi-turn Conversations | Write back the entire message when thinking is enabled. | Both reasoning_content and content must be written back to messages together. Do not omit either field. |
Access Reasoning via OpenAI SDK | hasattr / getattr | Do not access .reasoning_content directly; if the field is absent, the access raises an attribute error. |
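The last two rows of the table can be combined into one small helper. The sketch below reads `reasoning_content` with `getattr` and a default, then builds the assistant entry to append to `messages` for the next turn. `assistant_turn` is a hypothetical name, and `SimpleNamespace` objects stand in for `response.choices[0].message` so the example runs without a network call.

```python
from types import SimpleNamespace


def assistant_turn(message) -> dict:
    """Build the assistant entry to append to messages for the next request."""
    turn = {"role": "assistant", "content": message.content}
    # getattr with a default avoids an AttributeError when the field is absent.
    reasoning = getattr(message, "reasoning_content", None)
    if reasoning:
        turn["reasoning_content"] = reasoning
    return turn


# Stand-ins for response.choices[0].message (hypothetical sample values).
with_thinking = SimpleNamespace(content="42", reasoning_content="6 * 7 = 42")
without_thinking = SimpleNamespace(content="42")

history = [{"role": "user", "content": "What is 6 * 7?"}]
history.append(assistant_turn(with_thinking))
```

With thinking enabled, the appended entry carries both fields; with thinking disabled, the same helper degrades to writing back `content` only.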