新增国内直连 Base URL： https://api.highwayapi.ai/openai，原域名继续提供服务，详见产品文档

Alibaba

Qwen3 235B A22B Instruct 2507

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following, logical reasoning, math, code, and tool usage. The model supports a native 262K context length and does not implement "thinking mode" (<think> blocks). Compared to its base variant, this version delivers significant gains in knowledge coverage, long-context reasoning, coding benchmarks, and alignment with open-ended tasks. It is particularly strong on multilingual understanding, math reasoning (e.g., AIME, HMMT), and alignment evaluations like Arena-Hard and WritingBench.

文本

Qwen 2.5 72B Instruct

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters.

文本

Qwen MT Plus

Qwen-MT is a large language model optimized for machine translation, built upon the foundation of the Tongyi Qianwen model. It supports translation across 92 languages — including Chinese, English, Japanese, Korean, French, Spanish, German, Thai, Indonesian, Vietnamese, Arabic, and more — enabling seamless multilingual communication.

文本

Qwen2.5 7B Instruct

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and mathematics, thanks to our specialized expert models in these domains. - Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g, tables), and generating structured outputs especially JSON. More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots. - Long-context Support up to 128K tokens and can generate up to 8K tokens. - Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

文本

Qwen2.5 VL 72B Instruct

Qwen2.5-VL, the latest vision-language model in the Qwen2.5 series, delivers enhanced multimodal capabilities including advanced visual comprehension for object/text recognition, chart/layout analysis, and agent-based dynamic tool orchestration. It processes long-form videos (>1 hour) with key event detection while enabling precise spatial annotation through bounding boxes or coordinate points. The model specializes in structured data extraction from scanned documents (invoices, tables, etc.) and achieves state-of-the-art performance across multimodal benchmarks encompassing image understanding, temporal video analysis, and agent task evaluations.

文本

Qwen3 30B A3B

Achieves effective integration of inference and non-inference modes, allowing seamless switching between modes during conversations. Its inference capability matches that of QwQ-32B with a smaller parameter size, and its general capabilities significantly surpass those of Qwen2.5-14B, reaching the state-of-the-art (SOTA) level among models of the same scale.

文本

Qwen3 32B

文本

Qwen3 235B A22B

Achieves effective integration of inference and non-inference modes, enabling seamless switching between modes during conversations. The model's inference capability significantly surpasses that of QwQ, and its general capabilities exceed those of Qwen2.5-72B-Instruct, reaching the state-of-the-art (SOTA) level among models of the same scale.

文本

Qwen3 235B A22b Thinking 2507

The Qwen3-235B-A22B-Thinking-2507 represents the newest thinking-enabled model in the Qwen3 series, delivering groundbreaking improvements in reasoning capabilities. This advanced AI demonstrates significantly enhanced performance across logical reasoning, mathematics, scientific analysis, coding tasks, and academic benchmarks - matching or even surpassing human-expert level performance to achieve state-of-the-art results among open-source thinking models. Beyond its exceptional reasoning skills, the model shows markedly better general capabilities including more precise instruction following, sophisticated tool usage, highly natural text generation, and improved alignment with human preferences. It also features enhanced 256K long-context understanding, allowing it to maintain coherence and depth across extended documents and complex discussions.

文本

Qwen3 Coder 480B A35B Instruct

Qwen3-Coder-480B-A35B-Instruct is a cutting-edge open coding model from Qwen, matching Claude Sonnet’s performance in agentic programming, browser automation, and core development tasks. With native 256K context (extendable to 1M tokens via YaRN), it excels at repository-scale analysis and features specialized function-call support for platforms like Qwen Code and CLINE—making it ideal for complex, real-world development workflows.

文本

Qwen3 Coder Next FP8

Qwen3-Coder-Next is an open-weight language model specifically engineered for coding agents and local development environments. This highly efficient model delivers exceptional performance with only 3B activated parameters out of 80B total parameters, achieving results comparable to models with 10-20x more active parameters while maintaining remarkable cost-effectiveness for agent deployment. Through its sophisticated training methodology, Qwen3-Coder-Next excels in advanced agentic capabilities including long-horizon reasoning, complex tool usage, and robust recovery from execution failures, ensuring reliable performance across dynamic coding tasks. The model's versatility is further enhanced by its 256k context length and adaptability to various scaffold templates, enabling seamless integration with diverse CLI/IDE platforms such as Claude Code, Qwen Code, Qoder, Kilo, Trae, and Cline, making it an ideal solution for comprehensive development environments.

文本

Qwen3 Next 80B A3B Instruct

Qwen3-Next uses a highly sparse MoE design: 80B total parameters, but only ~3B activated per inference step. Experiments show that, with global load balancing, increasing total expert parameters while keeping activated experts fixed steadily reduces training loss.Compared to Qwen3’s MoE (128 total experts, 8 routed), Qwen3-Next expands to 512 total experts, combining 10 routed experts + 1 shared expert — maximizing resource usage without hurting performance. The Qwen3-Next-80B-A3B-Instruct performs comparably to our flagship model Qwen3-235B-A22B-Instruct-2507, and shows clear advantages in tasks requiring ultra-long context (up to 256K tokens).

文本

Qwen3 Next 80B A3B Thinking

Qwen3-Next uses a highly sparse MoE design: 80B total parameters, but only ~3B activated per inference step. Experiments show that, with global load balancing, increasing total expert parameters while keeping activated experts fixed steadily reduces training loss.Compared to Qwen3’s MoE (128 total experts, 8 routed), Qwen3-Next expands to 512 total experts, combining 10 routed experts + 1 shared expert — maximizing resource usage without hurting performance. The Qwen3-Next-80B-A3B-Thinking excels at complex reasoning tasks — outperforming higher-cost models like Qwen3-30B-A3B-Thinking-2507 and Qwen3-32B-Thinking, outpeforming the closed-source Gemini-2.5-Flash-Thinking on multiple benchmarks, and approaching the performance of our top-tier model Qwen3-235B-A22B-Thinking-2507.

文本

Qwen3.5-27B

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of the Qwen3.5-122B-A10B.

文本

Qwen3.5-122B-A10B

The Qwen3.5-122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of overall performance, this model is second only to Qwen3.5-397B-A17B. Its text capabilities significantly outperform those of Qwen3-235B-2507, and its visual capabilities surpass those of Qwen3-VL-235B.

文本

Qwen3.5-35B-A3B

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall performance is comparable to that of the Qwen3.5-27B.

文本

Qwen3.5-397B-A17B

The Qwen3.5 series 397B-A17B native vision-language model is based on a hybrid architecture design that integrates linear attention mechanisms with sparse Mixture-of-Experts (MoE), achieving higher inference efficiency. Across a variety of tasks—including language understanding, logical reasoning, code generation, agentic tasks, image understanding, video understanding, and graphical user interface (GUI) interaction—it demonstrates exceptional performance comparable to current top-tier frontier models. Possessing robust code generation and agentic capabilities, it exhibits strong generalization across various agent scenarios.

文本

Qwen3.5-Plus

The Qwen3.5 native vision-language series Plus models are based on a hybrid architecture design that integrates linear attention mechanisms with sparse Mixture-of-Experts (MoE), achieving higher inference efficiency. Across various task evaluations, the 3.5 series demonstrates exceptional performance comparable to current top-tier frontier models, marking a leap forward in both plain text and multimodal capabilities compared to the 3 series.

嵌入

qwen/qwen3-embedding-0.6b

嵌入

Qwen3 Embedding 8B

Qwen3 Embedding 8B Model is the latest proprietary embedding model from the Qwen family, specifically optimized for text embedding tasks. Built upon the dense foundational architecture of the Qwen3 series, it fully inherits the base model's exceptional multilingual capabilities, long-context comprehension, and advanced reasoning skills. The Qwen3 Embedding series delivers groundbreaking performance across multiple embedding applications, including text retrieval, code search, text classification, document clustering, and bitext mining.

视频

万相 Wan 2.7 视频编辑

万相 Wan 2.7 视频编辑模型，支持多模态输入（文本/图像/视频），可完成指令编辑和视频迁移任务。支持720P和1080P分辨率，时长2~10秒，按秒计费。输出默认包含音频。

视频

万相 Wan 2.7参考生视频

万相 Wan 2.7参考生视频模型，支持多模态输入（文本/图像/视频），可将人或物体作为主角，生成单角色表演或多角色互动视频。支持智能分镜，生成多镜头视频。支持720P和1080P分辨率，时长2~10秒，按秒计费。输出默认包含音频。

视频

万相 Wan 2.7 文生视频

万相 Wan 2.7 文生视频模型，基于文本提示词生成流畅视频。支持音频驱动或自动配音，支持720P和1080P分辨率，时长2~15秒，按秒计费。输出默认包含音频。

视频

万相 Wan 2.7 图生视频

万相 Wan 2.7 图生视频模型，支持多模态输入（文本/图像/音频/视频），可完成首帧生视频、首尾帧生视频、视频续写三大任务。支持720P和1080P分辨率，时长2~15秒，按秒计费。输出默认包含音频。

视频

Wan 2.1 Image to Video

阿里通义万相 Wan 以高画质、强时序一致性与复杂提示词跟随著称，适合规模化商用视频生成。Wan 2.1 强化运动稳定与细节质感，适合电商与广告批量生产。图生视频支持用一张参考图驱动动作与运镜，适合人物舞蹈、产品展示与风格延展。即时推理 API，性能稳定，无需等待，价格亲民

视频

Wan 2.1 Text to Video

阿里通义万相 Wan 以高画质、强时序一致性与复杂提示词跟随著称，适合规模化商用视频生成。Wan 2.1 强化运动稳定与细节质感，适合电商与广告批量生产。文生视频可直接用提示词生成分镜与镜头语言，适合脚本到成片的快速试制。即时推理 API，性能稳定，无需等待，价格亲民

视频

Wan 2.2 Image to Video

阿里通义万相 Wan 以高画质、强时序一致性与复杂提示词跟随著称，适合规模化商用视频生成。Wan 2.2 增强镜头连贯与人物动作自然度，复杂场景更稳。图生视频支持用一张参考图驱动动作与运镜，适合人物舞蹈、产品展示与风格延展。即时推理 API，性能稳定，无需等待，价格亲民

视频

Wan 2.2 Text to Video

阿里通义万相 Wan 以高画质、强时序一致性与复杂提示词跟随著称，适合规模化商用视频生成。Wan 2.2 增强镜头连贯与人物动作自然度，复杂场景更稳。文生视频可直接用提示词生成分镜与镜头语言，适合脚本到成片的快速试制。即时推理 API，性能稳定，无需等待，价格亲民

视频

Wan 2.5 Image to Video Preview

阿里通义万相 Wan 以高画质、强时序一致性与复杂提示词跟随著称，适合规模化商用视频生成。Wan 2.5 在画面清晰度与提示词跟随上进一步提升，预览版便于快速试错。图生视频支持用一张参考图驱动动作与运镜，适合人物舞蹈、产品展示与风格延展。即时推理 API，性能稳定，无需等待，价格亲民

视频

Wan 2.5 Text to Video Preview

阿里通义万相 Wan 以高画质、强时序一致性与复杂提示词跟随著称，适合规模化商用视频生成。Wan 2.5 在画面清晰度与提示词跟随上进一步提升，预览版便于快速试错。文生视频可直接用提示词生成分镜与镜头语言，适合脚本到成片的快速试制。即时推理 API，性能稳定，无需等待，价格亲民

图像

Z Image Turbo LoRA

Z 系列提供稳定的生成能力，适合生产场景。该系列面向生产级调用，强调稳定性与可控输出。适合通用内容生成与工具调用，便于集成到你的生产工作流。即时推理 API，性能稳定，无需等待，价格亲民

图像

Z Image Turbo

视频

Wan 2.6 Video Reference

Wan2.6 系列提供稳定的生成能力，适合生产场景。该系列面向生产级调用，强调稳定性与可控输出。参考视频生成可保留原视频结构并做风格/质感重渲染，适合二创与修复升级。即时推理 API，性能稳定，无需等待，价格亲民

视频

Wan 2.6 Image To Video

Wan2.6 系列提供稳定的生成能力，适合生产场景。该系列面向生产级调用，强调稳定性与可控输出。图生视频支持用一张参考图驱动动作与运镜，适合人物舞蹈、产品展示与风格延展。即时推理 API，性能稳定，无需等待，价格亲民

视频

Wan 2.6 Text to Video

Wan2.6 系列提供稳定的生成能力，适合生产场景。该系列面向生产级调用，强调稳定性与可控输出。文生视频可直接用提示词生成分镜与镜头语言，适合脚本到成片的快速试制。即时推理 API，性能稳定，无需等待，价格亲民