> GitHub: langchain-ai/langchain-benchmarks
LangChain's official benchmark suite for evaluating the performance of components and agents
---
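LangChain developers, decision makers choosing components, and teams optimizing LangChain performance
LangChain offers many components whose performance differences are unclear, and there has been no official standard test for evaluating how each component and agent performs
Users outside the LangChain ecosystem, projects that do not need performance comparisons, and teams that already have their own evaluation system
Official standard tests, component performance comparison, agent capability evaluation, continuous regression verification, selection reference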
---
```bash
pip install langchain-benchmarks
```
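```python
# Import path as shown here; confirm it against your installed version.
from langchain_benchmarks import run_benchmark

# `my_agent` is the agent under test and must be defined beforehand.
results = run_benchmark(
    agent=my_agent,
    benchmark="tool-use",
)
print(results.score)
```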
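Continuous regression verification is one of the listed uses, and a score like the one printed above is the natural input for it. The following is a minimal sketch in plain Python, not part of the library: `check_regression`, the tolerance of 0.02, and both scores are hypothetical values chosen for illustration.

```python
def check_regression(current: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Return True when `current` is no more than `tolerance` below `baseline`."""
    return current >= baseline - tolerance


# Hypothetical scores; in practice `current_score` would come from a benchmark run
# and `baseline_score` from a previously stored result.
baseline_score = 0.80
current_score = 0.79

if check_regression(current_score, baseline_score):
    print("OK: score is within tolerance of the baseline")
else:
    print("REGRESSION: score dropped below the allowed tolerance")
```

A check like this could run in CI after each benchmark run, with the tolerance chosen to absorb normal run-to-run variation.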
---
```bash
# Core package
pip install langchain-benchmarks
# Core package plus the optional extras
pip install "langchain-benchmarks[all]"
```
```bash
export OPENAI_API_KEY="your-openai-key"
```
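Because a missing key usually only surfaces once a benchmark starts calling the model, it can help to check the environment first. This is a small sketch using only the standard library; `missing_env_vars` is a hypothetical helper, not part of langchain-benchmarks.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Check the variable exported above before starting a benchmark run.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set")
```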
---
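```python
# Import path as shown here; confirm it against your installed version.
from langchain_benchmarks import AgentBenchmark

benchmark = AgentBenchmark("tool-use-gaia")

# `my_agent` is the agent under test and must be defined beforehand.
result = benchmark.evaluate(
    agent=my_agent,
    num_examples=100,
)
print(f"Success rate: {result.success_rate}")
print(f"Average steps: {result.avg_steps}")
```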
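To make the two reported metrics concrete, here is a sketch of how a success rate and an average step count could be derived from per-example outcomes. The `ExampleRun` record and the `summarize` function are assumptions for illustration; the library's own result objects may be structured differently.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ExampleRun:
    """Outcome of one benchmark example (hypothetical record shape)."""
    success: bool
    steps: int


def summarize(runs):
    """Return (success_rate, avg_steps) over a non-empty list of runs."""
    if not runs:
        raise ValueError("runs must not be empty")
    success_rate = sum(r.success for r in runs) / len(runs)
    avg_steps = mean(r.steps for r in runs)
    return success_rate, avg_steps


# Four made-up runs: three succeed, one fails.
runs = [ExampleRun(True, 3), ExampleRun(False, 7), ExampleRun(True, 5), ExampleRun(True, 5)]
rate, steps = summarize(runs)
print(f"Success rate: {rate:.2f}")
print(f"Average steps: {steps:.1f}")
```

Averaging steps over all runs, including failed ones, is a design choice in this sketch; averaging over successful runs only would answer a different question.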
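```python
# Import path as shown here; confirm it against your installed version.
from langchain_benchmarks import compare

# `retriever1` and `retriever2` are the retrievers to compare and must be
# defined beforehand.
results = compare(
    retrievers=[retriever1, retriever2],
    benchmark="retrieval-qa",
)
results.to_chart()
```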
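As a rough picture of what a side-by-side retriever comparison measures, the sketch below scores two toy retrievers by top-k hit rate on a tiny hand-written query set. Everything here is hypothetical: the retrievers are plain functions, the document ids are invented, and hit rate is only one of several metrics a retrieval benchmark might report.

```python
def hit_rate(retriever, examples, k=3):
    """Fraction of examples whose expected document id is in the top-k results."""
    hits = sum(expected in retriever(query)[:k] for query, expected in examples)
    return hits / len(examples)


# Toy stand-ins for real retrievers: each maps a query to a ranked list of doc ids.
def keyword_retriever(query):
    index = {"install": ["doc-install", "doc-faq"], "agent": ["doc-agent", "doc-faq"]}
    return index.get(query, [])


def reversed_retriever(query):
    # Same candidates as keyword_retriever, but ranked in the opposite order.
    return list(reversed(keyword_retriever(query)))


# (query, expected document id) pairs; the last query has no matching document.
examples = [("install", "doc-install"), ("agent", "doc-agent"), ("missing", "doc-none")]
scores = {
    "keyword": hit_rate(keyword_retriever, examples, k=1),
    "reversed": hit_rate(reversed_retriever, examples, k=1),
}

# Print retrievers from best to worst.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.2f}")
```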
---
> For more details, see the official documentation on GitHub