Website · Docs · Discord · Self-hosting
The platform for LLM evaluations and AI agent testing. We help teams test, simulate, evaluate, and monitor LLM-powered agents end-to-end — before release and in production. Built for teams that need regression testing, simulations, and production observability without building custom tooling.
- **End-to-end agent simulations**: Run realistic scenarios against your full stack (tools, state, user simulator, judge) and pinpoint where your agents break, and why, down to each decision.
- **Eval + observability + prompts in one loop**: Trace → dataset → evaluate → optimize prompts/models → re-test. No glue code, no tool sprawl.
- **Open standards, no lock-in**: OpenTelemetry/OTLP-native. Framework- and LLM-provider agnostic by design.
- **Collaboration that doesn't slow shipping**: Review runs, annotate failures, and ship fixes faster. Let domain experts label edge cases with annotations & queues, keep prompts in Git with the GitHub integration, and link prompt versions to traces.
LangWatch gives you full visibility into agent behavior and the tools to systematically improve reliability, performance, and cost, while keeping you in control of your AI system.
The easiest way to get started with LangWatch.
Create a free account → create a project → copy your API key.
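SDKs and integrations typically pick the key up from an environment variable; the variable name below is an assumption, so check the docs for the exact name your integration expects:

```shell
# Hypothetical sketch: the env var name is an assumption, check the docs.
export LANGWATCH_API_KEY="<your-api-key>"
```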
Get up and running on your own machine using docker compose:
```shell
git clone https://github.com/langwatch/langwatch.git
cd langwatch
cp langwatch/.env.example langwatch/.env
docker compose up -d --wait --build
```

Once running, LangWatch will be available at http://localhost:5560, where you can create your first project and API key.
Run LangWatch on your own infrastructure:
- Docker Compose - Run LangWatch on your own machine.
- Kubernetes (Helm) - Run LangWatch on a Kubernetes cluster using Helm.
- OnPrem - Cloud-specific setups for AWS, Google Cloud, and Azure.
Hybrid (OnPrem data) 🔀
For companies with strict data residency and control requirements that don't need to go fully on-prem.
Read more about it on our docs.
Local Development 👩💻
You can also run LangWatch locally without Docker to develop and contribute to the project.
Start just the databases using Docker and leave them running:
```shell
docker compose up redis postgres opensearch
```

Then, in another terminal, install the dependencies and start LangWatch:
```shell
make install
make start
```

Ship safer agents in minutes. Create a free account, then dive into these guides:
- Run your first agent simulation - Test agents against realistic scenarios before production
- Set up evaluations - Measure quality, performance, and reliability
- Send your first traces - Integrate LangWatch with your stack
- Get started with LangWatch MCP - Use LangWatch in Claude Desktop and other MCP clients
LangWatch builds and maintains several integrations listed below. Our tracing platform is built on top of OpenTelemetry, so we support any OpenTelemetry-compatible library out of the box.
Frameworks:
LangChain ·
LangGraph ·
Vercel AI SDK ·
Mastra ·
CrewAI ·
Google ADK
Model Providers:
OpenAI ·
Anthropic ·
Azure ·
Google Cloud ·
AWS ·
Groq ·
Ollama
and many more…
Are you using a platform that could benefit from a direct LangWatch integration? We'd love to hear from you; please fill out this quick form.
Have questions or need help? We're here to support you in multiple ways:
- Documentation: Our comprehensive documentation covers everything from getting started to advanced features.
- Discord Community: Join our Discord server for real-time help from our team and community.
- X (Twitter): Follow us on X for updates and announcements.
- GitHub Issues: Report bugs or request features through our GitHub repository.
- Enterprise Support: Enterprise customers receive priority support with dedicated response times. Our pricing page contains more information.
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
Please read our Contribution Guidelines for details on our code of conduct, and the process for submitting pull requests.
Please read our LICENSE.md file.
As a platform with access to potentially highly sensitive data, we take security extremely seriously and treat it as a core part of our culture.
| Legal Framework | Current Status |
|---|---|
| GDPR | Compliant. DPA available upon request. |
| ISO 27001 | Certified. Certification report available upon request on our Enterprise plan. |
Please refer to our Security page for more information. Contact us at [email protected] if you have any further questions.
If you need to do a responsible disclosure of a security vulnerability, you may do so by email to [email protected], or if you prefer you can reach out to one of our team privately on Discord.