Open Source AI Coding Tools
Open-source AI coding tools, editors, agents, and MCP servers: 43 curated entries.
Arize Phoenix
An open-source LLM observability and evaluation tool built on OpenTelemetry, runnable anywhere.
AutoGen (AG2)
A multi-agent conversation framework where agents collaborate via message-passing, with group chat and code execution.
BAML
A domain-specific language for type-safe LLM functions, with generated clients and schema-aligned parsing.
Chonkie
A lightweight, fast chunking library for RAG with many splitting strategies in one API.
Chroma
An open-source, Python-first vector database that runs in-process — the fastest path from pip install to a working retrieval prototype.
Codex CLI
OpenAI's open-source terminal coding agent with sandboxed execution and two-layer approval controls.
Continue
An open-source IDE extension for building custom AI coding assistants.
CrewAI
A Python framework for orchestrating role-playing AI agents as collaborating 'crews', plus event-driven flows.
DeepEval
An open-source evaluation framework for LLM apps — 'Pytest for LLMs' with ready-made metrics and CI integration.
DSPy
Program language models instead of prompting them: declare tasks as typed signatures and let optimizers compile the prompts and few-shot examples for you.
FastMCP
A Pythonic framework for building Model Context Protocol servers and clients — decorator-based tools, resources, and prompts, with auth and deployment built in.
Gemini CLI
Google's open-source terminal AI agent powered by Gemini models, with a 1M-token context window and built-in tools.
Goose
Block's open-source, on-machine AI agent that is MCP-native and model-agnostic, with a CLI and desktop app.
Helicone
Open-source LLM observability and AI gateway with one-line integration — logging, tracing, caching, and cost/latency tracking across providers.
Instructor
Get structured, validated output from LLMs using plain type definitions, with automatic retries on validation failure.
LanceDB
An open-source embedded vector database built on the Lance columnar format — serverless, multimodal, and designed to scale on local disk or object storage.
Langfuse
An open-source LLM engineering platform for tracing, evals, prompt management, and metrics.
LangGraph
A low-level library for building stateful, controllable agents as graphs, with checkpointing and human-in-the-loop.
LiteLLM
Call 100+ LLM APIs with one OpenAI-format interface — as a Python library or a self-hosted gateway/proxy.
LLM Guard
An open-source security toolkit from Protect AI with input and output scanners for LLM apps: prompt injection, PII anonymization, secrets, toxicity, and more.
MCP Inspector
The official open-source visual tool for testing and debugging Model Context Protocol servers — connect, list, and call tools, resources, and prompts.
Mem0
A memory layer for AI agents and apps — persistent, personalized long-term memory across sessions.
Milvus
An open-source vector database built for billion-scale similarity search, with a distributed architecture and a wide menu of index types.
NeMo Guardrails
NVIDIA's open-source toolkit for adding programmable guardrails to LLM apps — input, dialog, retrieval, and output rails defined in the Colang language.
Ollama
An open-source tool to run open-weight LLMs locally with a single command, including a local OpenAI-compatible API.
OpenAI Agents SDK
OpenAI's lightweight, open-source framework for agents — handoffs, guardrails, sessions, and built-in tracing.
pgroll
An open-source CLI for zero-downtime, reversible Postgres schema migrations using the expand-contract pattern behind versioned schema views.
pgvector
An open-source Postgres extension that adds a vector type and HNSW/IVFFlat indexes for similarity search inside your existing database.
Pipecat
An open-source Python framework for real-time voice and multimodal conversational AI — it orchestrates streaming STT, LLM, and TTS into composable pipelines.
Playwright MCP
Microsoft's open-source MCP server that gives AI agents structured browser automation via Playwright's accessibility tree.
promptfoo
An open-source CLI for testing, comparing, and red-teaming LLM prompts, models, and apps.
Qdrant
An open-source vector database written in Rust, built for low-latency similarity search at scale.
Qwen3-VL
Alibaba Qwen's open-weights vision-language model family (2B–235B, Apache-2.0): image and document understanding, OCR, visual reasoning, and video.
RAGAS
An open-source framework for evaluating retrieval-augmented generation with reference-free RAG metrics.
Roo Code
A discontinued open-source VS Code agent (a Cline fork); the team has since pivoted away from the IDE extension.
Unsloth
An open-source library that makes LoRA/QLoRA fine-tuning of LLMs roughly 2x faster and far more memory-efficient, so you can fine-tune on a single GPU.
Vercel AI SDK
An open-source TypeScript toolkit for building AI apps — unified model API, streaming, structured output, tool calling, and UI hooks.
vLLM
A high-throughput, memory-efficient inference and serving engine for LLMs, with PagedAttention, continuous batching, and an OpenAI-compatible API server.
Warp
A modern, AI-powered terminal with an agent mode that can run and chain commands across your codebase.
Weaviate
An open-source vector database with built-in hybrid search, pluggable vectorizer modules, and GraphQL/REST/gRPC APIs.