LangGraph 및 MongoDB Atlas 로 AI 에이전트 구축

MongoDB Atlas와 LangGraph 를 통합하여 AI 에이전트를 빌드 할 수 있습니다. 이 튜토리얼에서는 MongoDB 의 샘플 데이터에 대한 질문에 답변하는 AI 에이전트 빌드 방법을 보여줍니다.

구체적으로, 에이전트 통합을 사용하여 에이전트적 RAG 및 에이전트 메모리를 구현 . 시맨틱 검색 및 전체 텍스트 검색 도구를 사용하여 관련 정보를 조회 하고 데이터에 대한 질문에 답변 . 또한 대화 기록과 중요한 상호 작용을 별도의 컬렉션에 저장하여 MongoDB 사용하여 장단기 기억을 모두 구현합니다.

이 페이지의 코드는 전체 샘플 애플리케이션 빌드합니다. 단계별로 학습 하려는 경우 Python 노트북 으로 코드를 통해 작업할 수도 있습니다.

전제 조건

이 튜토리얼을 완료하려면 다음 조건을 충족해야 합니다.

다음 MongoDB cluster 유형 중 하나입니다.
- MongoDB 버전 6.0.11 을 실행 Atlas cluster , 7.0.2 이상입니다. IP 주소 가 Atlas 프로젝트의 액세스 목록에 포함되어 있는지 확인하세요.
- Atlas CLI 사용하여 생성된 로컬 Atlas 배포서버 입니다. 자세히 학습 로컬 Atlas 배포 만들기를 참조하세요.
- 검색 및 벡터 검색이 설치된 MongoDB Community 또는 Enterprise 클러스터.
Voyage AI API 키입니다. 자세한 학습 은 API 키 및 Python 클라이언트를 참조하세요.
OpenAI API 키입니다. API 요청에 사용할 수 있는 크레딧이 있는 OpenAI 계정이 있어야 합니다. OpenAI 계정 등록에 대해 자세히 학습하려면 OpenAI API 웹사이트를 참조하세요.

참고

langchain-voyageai 패키지의 요구 사항을 확인하여 호환되는 Python 버전을 사용하고 있는지 확인하세요.

환경 설정

환경을 설정하려면 다음 단계를 완료하세요.

프로젝트 초기화하고 종속성을 설치합니다.

새 프로젝트 디렉토리를 만든 후, 필요한 종속성을 설치합니다.

mkdir langgraph-mongodb-ai-agent
cd langgraph-mongodb-ai-agent
pip install --quiet --upgrade python-dotenv langgraph langgraph-checkpoint-mongodb langgraph-store-mongodb langchain langchain-mongodb langchain-voyageai langchain-openai pymongo

참고

프로젝트는 다음 구조를 사용합니다.

langgraph-mongodb-ai-agent
├── .env
├── config.py
├── search-tools.py
├── memory-tools.py
├── agent.py
├── main.py

환경 변수를 설정합니다.

프로젝트 에 .env 파일 만들고 다음 변수를 지정합니다. 자리 표시자 값을 유효한 API 키와 MongoDB 클러스터의 연결 문자열 로 바꿉니다.

VOYAGE_API_KEY = "<voyage-api-key>"
OPENAI_API_KEY = "<openai-api-key>"
MONGODB_URI = "<connection-string>"

참고

<connection-string>을 Atlas 클러스터 또는 로컬 Atlas 배포서버의 연결 문자열로 교체합니다.

연결 문자열은 다음 형식을 사용해야 합니다.

mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

자세한 학습은 드라이버를 통해 클러스터에 연결을 참조하세요.

연결 문자열은 다음 형식을 사용해야 합니다.

mongodb://localhost:<port-number>/?directConnection=true

학습 내용은 연결 문자열을 참조하세요.

MongoDB 벡터 데이터베이스로 사용

저장 및 검색을 위한 벡터 데이터베이스 로 MongoDB 구성하려면 다음 단계를 완료하세요.

샘플 데이터를 불러옵니다.

이 튜토리얼에서는 샘플 데이터 세트 중 하나를 데이터 소스 로 사용합니다. 아직 완료하지 않았다면 샘플 데이터를 Atlas cluster 에 로드하는 단계를 완료하세요.

구체적으로 영화의 줄거리 벡터 임베딩을 포함한 영화에 대한 문서를 포함하는 embedded_movies 데이터 세트를 사용하게 됩니다.

참고

자체 데이터를 사용하려면 LangChain 시작하기 또는 벡터 임베딩 생성 방법을 참조하여 Atlas에 벡터 임베딩을 삽입하는 방법을 알아보세요.

벡터 저장 와 인덱스를 설정합니다.

프로젝트 에 config.py 이라는 파일 만듭니다. 이 파일 MongoDB 에이전트 의 벡터 저장 로 구성합니다. 또한 샘플 데이터에 대한 벡터 검색 및 전체 텍스트 검색 쿼리를 활성화 인덱스를 생성합니다.

config.py

다음 코드를 복사하여 config.py 파일에 붙여넣습니다.

import os
from pymongo import MongoClient
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_mongodb.index import create_fulltext_search_index
from langchain_voyageai import VoyageAIEmbeddings
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# Get required environment variables
MONGODB_URI = os.getenv("MONGODB_URI")
if not MONGODB_URI:
    raise ValueError("MONGODB_URI environment variable is required")
# Initialize models
embedding_model = VoyageAIEmbeddings(
    model="voyage-3-large",
    output_dimension=2048
)
llm = ChatOpenAI("gpt-4o")
# MongoDB setup
mongo_client = MongoClient(MONGODB_URI)
collection = mongo_client["sample_mflix"]["embedded_movies"]
# LangChain vector store setup
vector_store = MongoDBAtlasVectorSearch.from_connection_string(
    connection_string=MONGODB_URI,
    namespace="sample_mflix.embedded_movies",
    embedding=embedding_model,
    text_key="plot",
    embedding_key="plot_embedding_voyage_3_large",
    relevance_score_fn="dotProduct",
)
# Create indexes on startup
print("Setting up vector store and indexes...")
try:
    existing_indexes = list(collection.list_search_indexes())
    vector_index_exists = any(idx.get('name') == 'vector_index' for idx in existing_indexes)
    if vector_index_exists:
        print("Vector search index already exists, skipping creation...")
    else:
        print("Creating vector search index...")
        vector_store.create_vector_search_index(
            dimensions=2048,  # The dimensions of the vector embeddings to be indexed
            wait_until_complete=60  # Number of seconds to wait for the index to build (can take around a minute)
        )
        print("Vector search index created successfully!")
except Exception as e:
    print(f"Error creating vector search index: {e}")
try:
    fulltext_index_exists = any(idx.get('name') == 'search_index' for idx in existing_indexes)
    if fulltext_index_exists:
        print("Search index already exists, skipping creation...")
    else:
        print("Creating search index...")
        create_fulltext_search_index(
            collection=collection,
            field="title",
            index_name="search_index",
            wait_until_complete=60  # Number of seconds to wait for the index to build (can take around a minute)
        )
        print("Search index created successfully!")
except Exception as e:
    print(f"Error creating search index: {e}")

검색 도구 정의

프로젝트 에 search_tools.py 파일 만듭니다. 이 파일 에서는 에이전트 에이전트적 RAG를수행하는 데 사용하는 검색 도구를 정의합니다.

search_tools.py

다음 코드를 복사하여 search_tools.py 파일 에 붙여넣습니다.

plot_search: 이 도구는 벡터 저장 객체를 리트리버로 사용합니다. 내부적으로 리트리버는 의미론적으로 유사한 문서를 조회 위해 MongoDB Vector Search 쿼리 실행합니다. 그런 다음 이 도구는 조회된 영화 문서의 제목과 줄거리를 반환합니다.
title_search: 이 도구는 전체 텍스트 검색 리트리버 를 사용하여 지정된 영화 제목과 일치하는 영화 문서를 조회 . 그런 다음 도구는 지정된 영화의 줄거리를 반환합니다.

from langchain.agents import tool
from langchain_mongodb.retrievers.full_text_search import MongoDBAtlasFullTextSearchRetriever
from config import vector_store, collection
@tool
def plot_search(user_query: str) -> str:
    """
    Retrieve information on the movie's plot to answer a user query by using vector search.
    """
    
    retriever = vector_store.as_retriever(
        search_type="similarity",
        search_kwargs={"k": 5}  # Retrieve top 5 most similar documents
    )
    results = retriever.invoke(user_query)
   
    # Concatenate the results into a string
    context = "\n\n".join([f"{doc.metadata['title']}: {doc.page_content}" for doc in results])
    return context
@tool
def title_search(user_query: str) -> str:
    """
    Retrieve movie plot content based on the provided title by using full-text search.
    """
    
    # Initialize the retriever
    retriever = MongoDBAtlasFullTextSearchRetriever(
        collection=collection,            # MongoDB Collection
        search_field="title",             # Name of the field to search
        search_index_name="search_index", # Name of the MongoDB Search index
        top_k=1,                          # Number of top results to return       
    ) 
    results = retriever.invoke(user_query)
   
    for doc in results:
        if doc:
            return doc.metadata["fullplot"]
        else:
            return "Movie not found"
# List of search tools
SEARCH_TOOLS = [ plot_search, title_search ]

참고

특정 작업 수행에 필요한 모든 도구를 정의할 수 있습니다. 하이브리드 검색이나 상위 문서 조회와 같은 다른 검색 방법을 위한 도구도 정의할 수 있습니다.

메모리 도구 정의

프로젝트 에 memory_tools.py 파일 만듭니다. 이 파일 에서는 에이전트 장기 기억을 구현 위해 세션 전반에 걸쳐 중요한 상호 작용을 저장 하고 조회 사용할 수 있는 도구를 정의합니다.

Memory_tools.py

다음 코드를 복사하여 Memory_tools.py 파일 에 붙여넣습니다.

store_memory: 이 도구는 LangGraph MongoDB 저장소 를 사용하여 중요한 상호 작용을 MongoDB 컬렉션에 저장합니다.
retrieve_memory: 이 도구는 LangGraph MongoDB 저장 통해 시맨틱 검색 통해 쿼리 기반으로 관련 상호 작용을 조회 .

from langchain.agents import tool
from langgraph.store.mongodb import MongoDBStore, create_vector_index_config
from config import embedding_model, MONGODB_URI
# Vector search index configuration for memory collection
index_config = create_vector_index_config(
    embed=embedding_model,
    dims=2048,
    relevance_score_fn="dotProduct",
    fields=["content"]
)
@tool
def save_memory(content: str) -> str:
    """Save important information to memory."""
    with MongoDBStore.from_conn_string(
        conn_string=MONGODB_URI,
        db_name="sample_mflix",
        collection_name="memories",
        index_config=index_config,
        auto_index_timeout=60 # Wait a minute for vector index creation
    ) as store:
        store.put(
            namespace=("user", "memories"),
            key=f"memory_{hash(content)}",
            value={"content": content}
        )
    return f"Memory saved: {content}"
@tool
def retrieve_memories(query: str) -> str:
    """Retrieve relevant memories based on a query."""
    with MongoDBStore.from_conn_string(
        conn_string=MONGODB_URI,
        db_name="sample_mflix",
        collection_name="memories",
        index_config=index_config
    ) as store:
        results = store.search(("user", "memories"), query=query, limit=3)
    if results:
        memories = [result.value["content"] for result in results]
        return f"Retrieved memories:\n" + "\n".join(memories)
    return "No relevant memories found."
MEMORY_TOOLS = [save_memory, retrieve_memories]

지속성을 사용하여 에이전트 빌드

프로젝트 에 agent.py 파일 만듭니다. 이 파일 에서는 에이전트의 워크플로를 조정하는 그래프 를 빌드 . 이 에이전트는 MongoDB Checkpointer 구성 요소를 사용하여 단기 메모리를 구현, 별도의 기록으로 여러 개의 동시 대화를 허용합니다.

에이전트 다음 워크플로를 사용하여 쿼리에 응답합니다.

시작: 에이전트 사용자 쿼리 수신합니다.
에이전트 노드: 도구 바인딩된 LLM은 쿼리를 분석하고 도구가 필요한지 결정합니다.
연장 노드 (필요한 경우): 적절한 검색 또는 메모리 연장을 실행합니다.
종료: LLM이 도구의 출력을 사용하여 최종 응답을 생성합니다.

LangGraph-MongoDB 에이전트의 워크플로를 보여주는 다이어그램입니다.

클릭하여 확대

에이전트.py

다음 코드를 복사하여 에이전트.py 파일 에 붙여넣습니다.

에이전트 구현 다음과 같은 여러 구성 요소로 구성됩니다.

LangGraphAgent: 워크플로를 조정하는 메인 에이전트 클래스
build_graph: LangGraph 워크플로를 구성하고 단기 메모리 지속성을 위해 MongoDBSaver 체크포인터를 구성합니다.
agent_node: 메시지를 처리하고 도구 사용량을 결정하는 주요 의사 결정자
tools_node: 요청된 도구를 실행하고 결과를 반환합니다.
route_tools: 워크플로 방향을 결정하는 조건부 라우팅 기능
execute: 대화 스레드 추적을 위한 thread_id 매개변수를 허용하는 기본 진입 점 .

from typing import Annotated, Dict, List
from typing_extensions import TypedDict
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import ToolMessage
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.checkpoint.mongodb import MongoDBSaver
from config import llm, mongo_client
from search_tools import SEARCH_TOOLS
from memory_tools import MEMORY_TOOLS
# Define the graph state
class GraphState(TypedDict):
    messages: Annotated[list, add_messages]
# Define the LangGraph agent
class LangGraphAgent:
    def __init__(self):
        # Combine search tools with memory tools
        self.tools = SEARCH_TOOLS + MEMORY_TOOLS
        self.tools_by_name = {tool.name: tool for tool in self.tools}
        
        # Create prompt template
        self.prompt = ChatPromptTemplate.from_messages([
            (
                "system",
                "You are a helpful AI chatbot."
                " You are provided with tools to answer questions about movies."
                " Think step-by-step and use these tools to get the information required to answer the user query."
                " Do not re-run tools unless absolutely necessary."
                " If you are not able to get enough information using the tools, reply with I DON'T KNOW."
                " You have access to the following tools: {tool_names}."
            ),
            MessagesPlaceholder(variable_name="messages"),
        ])
        
        # Provide the tool names to the prompt
        self.prompt = self.prompt.partial(tool_names=", ".join([tool.name for tool in self.tools]))
        
        # Prepare the LLM with tools
        bind_tools = llm.bind_tools(self.tools)
        self.llm_with_tools = self.prompt | bind_tools
        
        # Build the graph
        self.app = self._build_graph()
    
    def _build_graph(self):
        """Build and compile the LangGraph workflow."""
        # Instantiate the graph
        graph = StateGraph(GraphState)
        
        # Add nodes
        graph.add_node("agent", self._agent_node)
        graph.add_node("tools", self._tools_node)
        
        # Add edges
        graph.add_edge(START, "agent")
        graph.add_edge("tools", "agent")
        
        # Add conditional edge
        graph.add_conditional_edges(
            "agent",
            self._route_tools,
            {"tools": "tools", END: END},
        )
        # Use the MongoDB checkpointer for short-term memory
        checkpointer = MongoDBSaver(mongo_client, db_name = "sample_mflix")
        return graph.compile(checkpointer=checkpointer)
    
    def _agent_node(self, state: GraphState) -> Dict[str, List]:
        """Agent node that processes messages and decides on tool usage."""
        messages = state["messages"]
        result = self.llm_with_tools.invoke(messages)
        return {"messages": [result]}
    
    def _tools_node(self, state: GraphState) -> Dict[str, List]:
        """Tools node that executes the requested tools."""
        result = []
        messages = state["messages"]
        if not messages:
            return {"messages": result}
        last_message = messages[-1]
        if not hasattr(last_message, "tool_calls") or not last_message.tool_calls:
            return {"messages": result}
        tool_calls = last_message.tool_calls
        # Show which tools the agent chose to use
        tool_names = [tool_call["name"] for tool_call in tool_calls]
        print(f"🔧 Agent chose to use tool(s): {', '.join(tool_names)}")
        for tool_call in tool_calls:
            try:
                tool_name = tool_call["name"]
                tool_args = tool_call["args"]
                tool_id = tool_call["id"]
                print(f"   → Executing {tool_name}")
                
                if tool_name not in self.tools_by_name:
                    result.append(ToolMessage(content=f"Tool '{tool_name}' not found", tool_call_id=tool_id))
                    continue
                tool = self.tools_by_name[tool_name]
                observation = tool.invoke(tool_args)
                result.append(ToolMessage(content=str(observation), tool_call_id=tool_id))
            except Exception as e:
                result.append(ToolMessage(content=f"Tool error: {str(e)}", tool_call_id=tool_id))
        return {"messages": result}
    
    def _route_tools(self, state: GraphState):
        """
        Uses a conditional_edge to route to the tools node if the last message
        has tool calls. Otherwise, route to the end.
        """
        messages = state.get("messages", [])
        if len(messages) > 0:
            ai_message = messages[-1]
        else:
            raise ValueError(f"No messages found in input state to tool_edge: {state}")
        
        if hasattr(ai_message, "tool_calls") and len(ai_message.tool_calls) > 0:
            return "tools"
        return END
    
    def execute(self, user_input: str, thread_id: str) -> str:
        """Execute the graph with user input."""
        input_data = {"messages": [("user", user_input)]}
        config = {"configurable": {"thread_id": thread_id}}
        outputs = list(self.app.stream(input_data, config))
        # Get the final answer
        if outputs:
            final_output = outputs[-1]
            for _, value in final_output.items():
                if "messages" in value and value["messages"]:
                    return value["messages"][-1].content
        return "No response generated."

워크플로우 정보

이 섹션을 펼치면 에이전트 에 사용되는 LangGraph 구성 요소에 대한 자세한 내용을 볼 수 있습니다.

이 그래프 에는 다음과 같은 주요 구성 요소가 포함되어 있습니다.

그래프 상태: 워크플로 전체에서 공유 데이터를 유지 관리하여 사용자 쿼리, LLM 응답 및 도구 호출 결과를 포함한 에이전트의 메시지 를 추적합니다.
Nodes:
- 에이전트 노드: 메시지를 처리하고, LLM 을 호출하고, LLM 응답으로 상태 업데이트합니다.
- 도구 노드: 도구 호출을 처리하고 결과로 대화 기록을 업데이트합니다.
노드를 연결하는 에지:
- 일반 에지: 시작에서 에이전트 노드 로, 에이전트 도구 노드 로 라우팅
- 조건부 에지: 도구가 필요한지 여부에 따라 조건부로 라우팅합니다.
지속성: MongoDBSaver 체크포인터를 사용하여 대화 상태 특정 스레드에 저장하여 세션 전반에서 단기 기억을 활성화합니다. checkpoints 및 checkpoint_writes 컬렉션에서 스레드 데이터를 찾을 수 있습니다.

팁

지속성, 단기 기억 및 MongoDB 체크포인터에 대해 자세히 학습 다음 리소스를 참조하세요.

에이전트 실행

마지막으로 프로젝트에 main.py라는 파일을 만듭니다. 이 파일은 에이전트를 실행하고 사용자와 상호 작용할 수 있게 합니다.

main.py

다음 코드를 복사하여 main.py 파일 에 붙여넣습니다.

from agent import LangGraphAgent
from config import mongo_client
def main():
    """LangGraph and MongoDB agent with tools and memory."""
    # Initialize agent (indexes are created during config import)
    agent = LangGraphAgent()
    thread_id = input("Enter a session ID: ").strip()
    print("Ask me about movies! Type 'quit' to exit.")
    try:
        while True:
            user_query = input("\nYour question: ").strip()
            if user_query.lower() == 'quit':
                break
            # Get response from agent
            answer = agent.execute(user_query, thread_id)
            print(f"\nAnswer: {answer}")
    finally:
        mongo_client.close()
if __name__ == "__main__":
    main()

프로젝트를 저장하고 다음 명령을 실행합니다. 에이전트를 실행할 때 다음을 수행합니다.

에이전트 벡터 저장 초기화하고 인덱스가 아직 존재하지 않는 경우 인덱스를 생성합니다.
세션 ID 입력하여 새 세션을 시작하거나 기존 세션을 계속할 수 있습니다. 각 세션은 유지되며 언제든지 이전 대화를 재개할 수 있습니다.
영화에 대해 질문합니다. 에이전트 도구와 이전 상호 작용을 기반으로 응답을 생성합니다.

다음 출력은 샘플 상호 작용을 보여줍니다.

python main.py

Creating vector search index...
Vector search index created successfully!
Creating search index...
Search index created successfully!
Enter a session ID: 123
Ask me about movies! Type 'quit' to exit.
Your query: What are some movies that take place in the ocean?
🔧 Agent chose to use tool(s): plot_search
   → Executing plot_search
Answer: Here are some movies that take place in the ocean:
1. **20,000 Leagues Under the Sea** - A marine biologist, his daughter, and a mysterious Captain Nemo explore the ocean aboard an incredible submarine.
2. **Deep Rising** - A group of armed hijackers board a luxury ocean liner in the South Pacific Ocean, only to fight man-eating, tentacled sea creatures.
... (truncated)
Your query: What is the plot of the Titanic?
🔧 Agent chose to use tool(s): title_search
   → Executing title_search
Answer: The plot of *Titanic* involves the romantic entanglements of two couples aboard the doomed ship's maiden voyage
... (truncated)
Your query: What movies are like the movie I just mentioned?
🔧 Agent chose to use tool(s): plot_search
   → Executing plot_search
Answer: Here are some movies similar to *Titanic*:
1. **The Poseidon Adventure** - A group of passengers struggles to survive when their ocean liner capsizes at sea.
2. **Pearl Harbor** - Focused on romance and friendship amidst the backdrop of a historical tragedy, following two best friends and their love lives during wartime.
... (truncated)
Your query: I don't like sad movies.
🔧 Agent chose to use tool(s): save_memory
   → Executing save_memory
Answer: Got it—I'll keep that in mind. Let me know if you'd like recommendations that focus more on uplifting or happy themes!
(In different session)
Enter a session ID: 456
Your query: Recommend me a movie based on what you know about me.
🔧 Agent chose to use tool(s): retrieve_memories
   → Executing retrieve_memories
Answer: Based on what I know about you—you don't like sad movies—I'd recommend a fun, uplifting, or action-packed film. Would you be interested in a comedy, adventure, or family-friendly movie?
Your query: Sure!
🔧 Agent chose to use tool(s): plot_search, plot_search, plot_search
   → Executing plot_search
   → Executing plot_search
   → Executing plot_search
Answer: Here are some movie recommendations from various uplifting genres that suit your preferences:
### Comedy:
1. **Showtime** (2002): A spoof of buddy cop movies where two very different cops are forced to team up on a new reality-based TV cop show. It's packed with laughs and action!
2. **The Big Bus** (1976): A hilarious disaster film parody featuring a nuclear-powered bus going nonstop from New York to Denver, plagued by absurd disasters.
### Adventure:
1. **Journey to the Center of the Earth** (2008): A scientist, his nephew, and their mountain guide discover a fantastic and dangerous lost world at the earth's core.
2. **Jason and the Argonauts** (1963): One of the most legendary adventures in mythology, brought to life in this epic saga of good versus evil.
### Family-Friendly:
1. **The Incredibles** (2004): A family of undercover superheroes is forced into action to save the world while living in quiet suburban life.
2. **Mary Poppins** (1964): A magical nanny brings joy and transformation to a cold banker's unhappy family.
3. **Chitty Chitty Bang Bang** (1968): A whimsical adventure featuring an inventor, his magical car, and a rescue mission filled with fantasy.

돌아가기

LangGraph

LangGraph.js