Bring Your Own Agentic Framework with Llama Stack
Goal
Use any agentic framework (LangGraph, AutoGen, CrewAI) with Llama Stack’s OpenAI-compatible APIs.
This tutorial integrates LangGraph with Llama Stack inference and MCP weather tools.
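Because the integration goes through Llama Stack's OpenAI-compatible endpoint, any OpenAI client can talk to the server directly. As a minimal sketch (assuming the local server and model used throughout this tutorial, plus the openai Python package):

# Minimal sketch: point the standard OpenAI client at Llama Stack.
# The API key is a placeholder; a local Llama Stack server does not check it.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1/openai/v1",
    api_key="not-applicable",
)
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)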
Prerequisites
- Llama Stack server running (see: Llama-stack Helloworld)
- MCP weather service running (see: MCP Weather Setup)
- Python environment with virtual environment activated
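Before running the example, you can verify that the Llama Stack endpoint is reachable. The snippet below is a quick pre-flight check using only the standard library; it assumes this tutorial's default URL and that the endpoint also serves the standard OpenAI /models route:

import json
import urllib.request

# List the models the OpenAI-compatible endpoint exposes
with urllib.request.urlopen("http://localhost:8321/v1/openai/v1/models") as resp:
    models = json.load(resp)
print("Available models:", [m["id"] for m in models.get("data", [])])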
Complete LangGraph Integration Example
This example demonstrates all key integration patterns in a single comprehensive script that includes:
- Basic Integration: Connecting LangGraph to Llama Stack’s OpenAI-compatible endpoint
- Simple Agent: A LangGraph StateGraph agent with tool binding
- MCP Tools: Weather service integration in the proper MCP tool format
- Advanced Patterns: Error handling and fallback strategies (sketched after the sample output below)
Install the required dependencies:
pip install langgraph==0.6.7 langchain-openai==0.3.32 langchain-core==0.3.75
This is the complete integration example. Save the script below, then run it to see the LangGraph agent in action!
cat << 'EOF' > langgraph_llama_stack.py
import os
from langgraph.graph import StateGraph, END, START
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages
# Environment variables
INFERENCE_MODEL = os.getenv("INFERENCE_MODEL", "meta-llama/Llama-3.2-3B-Instruct")
INFERENCE_SERVER_OPENAI = os.getenv("LLAMA_STACK_ENDPOINT_OPENAI", "http://localhost:8321/v1/openai/v1")
API_KEY = os.getenv("OPENAI_API_KEY", "not-applicable")
print("๐ LangGraph + Llama Stack Integration")
print(f" LLAMA_STACK_URL: {INFERENCE_SERVER_OPENAI}")
print(f" INFERENCE_MODEL: {INFERENCE_MODEL}")
llm = ChatOpenAI(
    model=INFERENCE_MODEL,
    openai_api_key=API_KEY,
    openai_api_base=INFERENCE_SERVER_OPENAI,
    # Route calls through the Responses API, which the MCP tool binding below relies on
    use_responses_api=True,
)
# Test connectivity
print("\n๐งช Testing basic connectivity:")
response = llm.invoke("Hello")
print(f"โ
Connection successful")
# Bind MCP weather tools
print("\n๐ ๏ธ Setting up MCP weather tools...")
llm_with_tools = llm.bind_tools([
    {
        "type": "mcp",
        "server_label": "weather",
        # The Llama Stack server (not this script) connects to this URL;
        # host.containers.internal reaches the host from inside a container
        "server_url": "http://host.containers.internal:3001/sse",
        "require_approval": "never",
    },
])
print("โ
MCP tools configured")
# Define LangGraph State and Agent
class State(TypedDict):
    messages: Annotated[list, add_messages]

def chatbot(state: State):
    message = llm_with_tools.invoke(state["messages"])
    return {"messages": [message]}
# Build LangGraph StateGraph
print("\n๐๏ธ Building LangGraph agent...")
graph_builder = StateGraph(State)
graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)
graph = graph_builder.compile()
print("โ
LangGraph agent ready")
# Test the integration
print("\n" + "="*50)
print("๐ Testing LangGraph Agent with MCP Tools")
print("="*50)
response = graph.invoke({
    "messages": [{"role": "user", "content": "What is the weather in Seattle?"}]
})
print("Weather Response:")
for message in response['messages']:
    if hasattr(message, 'content'):
        if isinstance(message.content, list):
            for content_block in message.content:
                if content_block.get('type') == 'text':
                    print(content_block.get('text', ''))
        elif isinstance(message.content, str):
            print(message.content)
    else:
        message.pretty_print()
EOF
python langgraph_llama_stack.py
We use http://localhost:8321/v1/openai/v1 instead of the standard http://localhost:8321 because we’re leveraging Llama Stack’s OpenAI-compatible endpoint for seamless integration with existing OpenAI-based frameworks.
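For comparison, here is how the two base URLs differ in practice (a sketch; the native client comes from the llama-stack-client package and is not needed for the LangGraph example above):

# Native Llama Stack client: targets the server root
from llama_stack_client import LlamaStackClient
native_client = LlamaStackClient(base_url="http://localhost:8321")

# OpenAI-based frameworks such as LangChain/LangGraph: target the OpenAI-compatible path
from langchain_openai import ChatOpenAI
compat_llm = ChatOpenAI(
    model="meta-llama/Llama-3.2-3B-Instruct",
    openai_api_key="not-applicable",
    openai_api_base="http://localhost:8321/v1/openai/v1",
)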
When you run the script, you should see output similar to this:
🚀 LangGraph + Llama Stack Integration
 LLAMA_STACK_URL: http://localhost:8321/v1/openai/v1
 INFERENCE_MODEL: meta-llama/Llama-3.2-3B-Instruct
🧪 Testing basic connectivity:
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/responses "HTTP/1.1 200 OK"
✅ Connection successful
🛠️ Setting up MCP weather tools...
✅ MCP tools configured
🏗️ Building LangGraph agent...
✅ LangGraph agent ready
==================================================
🎉 Testing LangGraph Agent with MCP Tools
==================================================
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/responses "HTTP/1.1 200 OK"
Weather Response:
What is the weather in Seattle?
It looks like the weather forecast for Seattle is mostly sunny with a chance of rain showers. Here are the details:
* Temperature: High of 73°F today and tonight, with lows in the mid-50s to low 60s throughout the week.
* Wind: Light breeze blowing at around 5-6 mph most days, with some gusts up to 12 mph on Tuesday afternoon.
* Precipitation: A slight chance of rain showers on most days, with a higher chance on Saturday and Sunday.
You’ve successfully integrated LangGraph with Llama Stack! The agent can now make weather queries using MCP tools while leveraging Llama Stack’s OpenAI-compatible inference API.
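The integration patterns list above mentions error handling and fallback strategies. The tutorial script keeps the happy path short, so here is one minimal sketch of a fallback: if the tool-enabled call fails (for example, the MCP weather service is down), retry with the plain LLM. ask_with_fallback is a hypothetical helper, and graph and llm are the objects defined in langgraph_llama_stack.py:

# Hypothetical fallback helper; retrying without tools is one possible strategy
def ask_with_fallback(question: str):
    try:
        result = graph.invoke({"messages": [{"role": "user", "content": question}]})
        return result["messages"][-1]
    except Exception as err:  # e.g. the MCP server is unreachable
        print(f"Tool call failed ({err}); retrying without tools...")
        return llm.invoke(question)

print(ask_with_fallback("What is the weather in Seattle?"))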
Summary
This tutorial demonstrated how to:
- Integrate any agentic framework with Llama Stack using standard APIs
- Leverage OpenAI compatibility for easy migration from other providers
- Add MCP tools for enhanced agent capabilities
The BYO approach gives you the flexibility to use your preferred framework while selectively leveraging Llama Stack’s powerful APIs.
Next, explore comprehensive deployment options with All-in-One Setup for a complete production-ready environment.