Website · Docs · Discord · Self-hosting
The platform for LLM evaluations and AI agent testing. We help teams test, simulate, evaluate, and monitor LLM-powered agents end-to-end — before release and in production. Built for teams that need regression testing, simulations, and production observability without building custom tooling.
- **End-to-end agent simulations**: Run realistic scenarios against your full stack (tools, state, user simulator, judge) and pinpoint where your agents break, and why, down to each decision.
- **Eval + observability + prompts in one loop**: Trace → dataset → evaluate → optimize prompts/models → re-test. No glue code, no tool sprawl.
- **Open standards, no lock-in**: OpenTelemetry/OTLP-native. Framework- and LLM-provider agnostic by design.
- **Collaboration that doesn't slow shipping**: Review runs, annotate failures, and ship fixes faster. Let domain experts label edge cases with annotations & queues, keep prompts in Git with the GitHub integration, and link prompt versions to traces.
LangWatch gives you full visibility into agent behavior and the tools to systematically improve reliability, performance, and cost, while keeping you in control of your AI system.
The easiest way to get started with LangWatch.
Create a free account → create a project → copy your API key.
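SDKs and integrations typically pick the key up from an environment variable; the variable name below is an assumption, so check the docs for the exact name your integration expects:

```shell
# Hypothetical sketch: the env var name is an assumption, check the docs.
export LANGWATCH_API_KEY="<your-api-key>"
```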
Get up and running on your own machine using docker compose:
```shell
git clone https://github.com/langwatch/langwatch.git
cd langwatch
cp langwatch/.env.example langwatch/.env
docker compose up -d --wait --build
```

Once running, LangWatch will be available at http://localhost:5560, where you can create your first project and API key.
Run LangWatch on your own infrastructure:
- Docker Compose - Run LangWatch on your own machine.
- Kubernetes (Helm) - Run LangWatch on a Kubernetes cluster using Helm.
- OnPrem - Cloud-specific setups for AWS, Google Cloud, and Azure.
Hybrid (OnPrem data) 🔀
For companies with strict data residency and control requirements that don't need to go fully on-prem.
Read more about it on our docs.
Local Development 👩💻
You can also run LangWatch locally without Docker to develop and contribute to the project.
Start just the databases using Docker and leave them running:
```shell
docker compose up redis postgres opensearch
```

Then, in another terminal, install the dependencies and start LangWatch:
```shell
make install
make start
```

Ship safer agents in minutes. Create a free account, then dive into these guides:
- Run your first agent simulation - Test agents against realistic scenarios before production
- Set up evaluations - Measure quality, performance, and reliability
- Send your first traces - Integrate LangWatch with your stack
- Get started with LangWatch MCP - Use LangWatch in Claude Desktop and other MCP clients
LangWatch builds and maintains several integrations listed below. Our tracing platform is built on top of OpenTelemetry, so we support any OpenTelemetry-compatible library out of the box.
Frameworks:
LangChain ·
LangGraph ·
Vercel AI SDK ·
Mastra ·
CrewAI ·
Google ADK
Model Providers:
OpenAI ·
Anthropic ·
Azure ·
Google Cloud ·
AWS ·
Groq ·
Ollama
and many more…
Are you using a platform that could benefit from a direct LangWatch integration? We'd love to hear from you; please fill out this quick form.
Have questions or need help? We're here to support you in multiple ways:
- Documentation: Our comprehensive documentation covers everything from getting started to advanced features.
- Discord Community: Join our Discord server for real-time help from our team and community.
- X (Twitter): Follow us on X for updates and announcements.
- GitHub Issues: Report bugs or request features through our GitHub repository.
- Enterprise Support: Enterprise customers receive priority support with dedicated response times. Our pricing page contains more information.
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
Please read our Contribution Guidelines for details on our code of conduct, and the process for submitting pull requests.
Please read our LICENSE.md file.
As a platform with access to potentially highly sensitive data, we take security extremely seriously and treat it as a core part of our culture.
| Legal Framework | Current Status |
|---|---|
| GDPR | Compliant. DPA available upon request. |
| ISO 27001 | Certified. Certification report available upon request on our Enterprise plan. |
Please refer to our Security page for more information. Contact us at [email protected] if you have any further questions.
If you need to do a responsible disclosure of a security vulnerability, you may do so by email to [email protected], or if you prefer you can reach out to one of our team privately on Discord.