Qwen has open-sourced 8 new models in the Qwen3 series of LLMs.

其中,

Qwen3-235B-A22B, the flagship model, outperforms DeepSeek-R1, OpenAI's o1, o3-mini, Grok 3, and Gemini 2.5-Pro on benchmarks.

Qwen3-30B-A3B, the small MoE model, outperforms QwQ-32B, which has roughly 10x as many activated parameters.

A comparison of the eight models:

| Models          | Layers | Heads (Q/KV) | Tie Embedding | Context Length |
|-----------------|--------|--------------|---------------|----------------|
| Qwen3-0.6B      | 28     | 16/8         | Yes           | 32K            |
| Qwen3-1.7B      | 28     | 16/8         | Yes           | 32K            |
| Qwen3-4B        | 36     | 32/8         | Yes           | 32K            |
| Qwen3-8B        | 36     | 32/8         | No            | 128K           |
| Qwen3-14B       | 40     | 40/8         | No            | 128K           |
| Qwen3-32B       | 64     | 64/8         | No            | 128K           |
| Qwen3-30B-A3B   | 48     | 32/4         | No            | 128K           |
| Qwen3-235B-A22B | 94     | 64/4         | No            | 128K           |

Of the eight, 6 are dense models, meaning all parameters are activated during both training and inference; the other 2 are Mixture-of-Experts (MoE) models:

  • Qwen3-235B-A22B: a 235B-parameter MoE model with 22B activated parameters
  • Qwen3-30B-A3B: a 30B-parameter MoE model with 3B activated parameters (see the toy routing sketch below).
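
To make the "activated parameters" idea concrete, here is a toy top-k MoE layer in PyTorch: each token is routed to only top_k experts, so only a fraction of the layer's parameters participate in any one forward pass. The expert count, sizes, and routing below are purely illustrative and are not Qwen3's actual configuration.

import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    """Toy top-k Mixture-of-Experts FFN (illustrative, not Qwen3's router)."""
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.SiLU(), nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, d_model)
        scores = self.router(x)                 # (tokens, n_experts)
        weights, idx = scores.softmax(-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):          # only top_k experts run per token
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

moe = ToyMoE()
print(moe(torch.randn(5, 64)).shape)  # torch.Size([5, 64])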

Notes:

  • Layers: the number of transformer blocks. Each block contains a multi-head self-attention mechanism, a feed-forward network, positional encoding, layer normalization, and residual connections. For example, Qwen3-30B-A3B has 48 layers, i.e., 48 transformer blocks stacked in sequence.
  • Heads: Transformers use multi-head attention, which splits the attention mechanism into multiple heads, each learning a different aspect of the data. Here, Q/KV means (a grouped-query attention sketch follows this list):

    • Q (query heads) - the total number of attention heads used to generate queries
    • KV (key/value heads) - the number of key/value heads per attention block
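
The uneven Q/KV split (e.g., 32/4 for Qwen3-30B-A3B) is characteristic of grouped-query attention, where several query heads share a single key/value head to shrink the KV cache. Below is a minimal, illustrative PyTorch sketch of that sharing; the tensor shapes are made up for the example and this is not Qwen3's actual attention code.

import torch
import torch.nn.functional as F

batch, seq_len, head_dim = 1, 16, 128
n_q_heads, n_kv_heads = 32, 4             # e.g. the 32/4 split of Qwen3-30B-A3B
group_size = n_q_heads // n_kv_heads      # 8 query heads share each KV head

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Expand each KV head to cover its group of query heads, then run standard attention
k = k.repeat_interleave(group_size, dim=1)   # (1, 32, 16, 128)
v = v.repeat_interleave(group_size, dim=1)
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)                             # torch.Size([1, 32, 16, 128])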

Key Features of Qwen3

1. Pre-training

Pre-training consists of three stages:

  1. Stage 1: the model is trained on about 30 trillion tokens with a 4K-token context length, building basic language skills and general knowledge;
  2. Stage 2: the proportion of knowledge-intensive data (e.g., STEM, coding, reasoning tasks) is increased to improve data quality, and the model is trained on a further 5 trillion tokens;
  3. Stage 3: the context length is extended to 32K tokens using high-quality long-context data, ensuring the model can handle longer inputs effectively.

2. Post-training

To build a hybrid model capable of both step-by-step reasoning and fast responses, a four-stage post-training pipeline was used:

  1. Long chain-of-thought (CoT)
  2. Reasoning-based reinforcement learning (RL)
  3. Thinking mode fusion
  4. General RL

3. Hybrid Thinking Modes

Qwen3 adopts a hybrid approach with two modes (a sketch for switching between them follows this list):

  1. Thinking Mode: the model takes time to break a complex problem down into small, procedural steps;
  2. Non-Thinking Mode: the model returns a fast, direct answer, suitable for simple questions.
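
A minimal sketch of toggling the mode through the Hugging Face chat template, assuming a locally downloadable Qwen3 checkpoint; the enable_thinking flag follows the Qwen3 model card, while the model name, prompt, and generation settings below are placeholders:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # small variant, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,   # set to False for fast, non-thinking responses
)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))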

4. Multilingual Support

Qwen3 supports 119 languages and dialects.

5. Improved Agentic Capabilities

Qwen3 strengthens coding and agentic capabilities and supports MCP (Model Context Protocol); a sketch follows.
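
A minimal sketch of pairing Qwen3 with MCP tools via the qwen-agent library, assuming the model is served behind an OpenAI-compatible endpoint; the endpoint URL, model name, and MCP server entries below are placeholders to adapt to your setup:

from qwen_agent.agents import Assistant

llm_cfg = {
    "model": "Qwen3-30B-A3B",
    "model_server": "http://localhost:8000/v1",  # OpenAI-compatible endpoint (e.g. vLLM/SGLang)
    "api_key": "EMPTY",
}

tools = [
    {
        # MCP servers declared with the standard MCP client config format
        "mcpServers": {
            "fetch": {"command": "uvx", "args": ["mcp-server-fetch"]},
        }
    },
    "code_interpreter",  # built-in tool
]

bot = Assistant(llm=llm_cfg, function_list=tools)

messages = [{"role": "user", "content": "Fetch https://qwenlm.github.io/blog/ and summarize the latest Qwen news."}]
for responses in bot.run(messages=messages):
    pass  # streaming; the final `responses` holds the complete reply
print(responses)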

Building an AI Agent with Qwen3

Sample code:

'''
pip install langchain langchain-community openai duckduckgo-search
'''
from langchain.chat_models import ChatOpenAI
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.tools import DuckDuckGoSearchRun

llm = ChatOpenAI(
   base_url="https://openrouter.ai/api/v1",
   api_key="your_api_key",
   model="qwen/qwen3-235b-a22b:free"
)
# Web search tool (DuckDuckGo)
search = DuckDuckGoSearchRun()


# Tool for the destination agent
def get_destinations(destination):
   return search.run(f"Top 3 tourist spots in {destination}")


DestinationTool = Tool(
   name="Destination Recommender",
   func=get_destinations,
   description="Finds top places to visit in a city"
)

# Tool for the currency agent
def convert_usd_to_inr(query):
   amount = [float(s) for s in query.split() if s.replace('.', '', 1).isdigit()]
   if amount:
       return f"{amount[0]} USD = {amount[0] * 83.2:.2f} INR"
   return "Couldn't parse amount."


CurrencyTool = Tool(
   name="Currency Converter",
   func=convert_usd_to_inr,
   description="Converts USD to inr based on static rate"
)


'''
Build the agent
'''
tools = [DestinationTool, CurrencyTool]

agent = initialize_agent(
   tools=tools,
   llm=llm,
   agent_type="zero-shot-react-description",
   verbose=True
)

def trip_planner(city, usd_budget):
   dest = get_destinations(city)
   inr_budget = convert_usd_to_inr(f"{usd_budget} USD to INR")
   return f"""Here is your travel plan:
*Top spots in {city}*:
{dest}
*Budget*:
{inr_budget}
Enjoy your day trip!"""


'''
Run the agent
'''
# Trip inputs
city = "Delhi"
usd_budget = 8500


# Run the agent on the trip-planning request
response = agent.run(f"Plan a day trip to {city} with a budget of {usd_budget} USD")

Render the output as Markdown, e.g.:

from IPython.display import Markdown, display
display(Markdown(response))

Building a RAG System with Qwen3

Sample code:

'''
!pip install langchain langchain-community langchain-core openai tiktoken chromadb sentence-transformers duckduckgo-search
'''
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

# Load your document
loader = TextLoader("/content/my_docs.txt")
docs = loader.load()

'''
Create embeddings
'''
# Split into chunks
splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# Embed with HuggingFace model
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
db = Chroma.from_documents(chunks, embedding=embeddings)

# Setup Qwen LLM from OpenRouter
llm = ChatOpenAI(
   base_url="https://openrouter.ai/api/v1",
   api_key="YOUR_API_KEY",
   model="qwen/qwen3-235b-a22b:free"
)

# Create RAG chain
retriever = db.as_retriever(search_kwargs={"k": 2})
rag_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

'''
Query the RAG chain
'''
# Ask a question
response = rag_chain.invoke({"query": "How can I use Qwen with MCP? Please give me a stepwise guide along with the necessary code snippets"})

Render the output, e.g.:

from IPython.display import Markdown, display
display(Markdown(response['result']))



Qwen3 Use Cases

  • Automated Coding
  • Education and Research
  • Agent-Based Tool Integration
  • Advanced Reasoning Tasks