Artificial Intelligence has rapidly progressed from single-task models to collaborative networks of specialized agents working in tandem. This new frontier—multi-agent AI systems—mimics the dynamics of human teams, where different members tackle distinct roles, coordinate, delegate, and collectively achieve complex goals. Powered by large language models (LLMs), these systems are now easier than ever to build using modern frameworks.
In this guide, we’ll walk you through the process of creating a fully functional multi-agent AI system using Watsonx.ai and CrewAI, integrating multiple LLMs, assigning distinct tasks, and automating web-based research and content generation. Whether you’re an AI enthusiast or a developer looking to build intelligent automation workflows, this article provides a comprehensive, hands-on blueprint to get started.
Understanding the Building Blocks of Multi-Agent Systems
At the heart of a multi-agent AI system is the concept of agent specialization. Rather than relying on a single, monolithic model, the system consists of several agents—each powered by a specific LLM—assigned with unique roles, tasks, and goals. These agents interact with one another, communicate outputs, and even delegate responsibilities when needed.
The architecture generally includes:
- Core LLMs to handle content generation and reasoning.
- Function-calling LLMs to interface with APIs or tools.
- Agents that encapsulate persona, goals, and domain expertise.
- Tasks assigned to specific agents.
- Crew or Orchestrator that manages execution and communication across agents.
Step 1: Setting Up the Environment
To begin building, we first import key dependencies:
- CrewAI: The orchestrator framework that enables multi-agent coordination.
- Watsonx.ai LLM SDK: To connect to IBM’s hosted language models.
- Langchain tools: For enabling external data access, like web search via Serper.dev.
- OS module: For securely managing API credentials.
You’ll need API keys for both Watsonx.ai and Serper.dev to make your system internet-capable and cloud-integrated.
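As a minimal sketch of the credential setup, the keys can be exported through the OS module; the environment-variable names and placeholder values below are illustrative — check the variable names your installed SDK versions expect.

```python
import os

# Placeholder values shown for illustration; in practice, load these from a
# secrets manager or a .env file rather than hard-coding them in source.
os.environ["WATSONX_APIKEY"] = "your-watsonx-api-key"  # IBM Cloud API key
os.environ["SERPER_API_KEY"] = "your-serper-api-key"   # Serper.dev search key

# Downstream SDK calls can now read the credentials from the environment.
watsonx_key = os.environ["WATSONX_APIKEY"]
serper_key = os.environ["SERPER_API_KEY"]
```

Keeping credentials in environment variables (rather than in code) also makes it easy to swap keys between development and production projects.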
Step 2: Configuring Your Large Language Models (LLMs)
The system uses two different LLMs:
- LLaMA 3 70B Instruct (from Meta, via Watsonx): This is the primary generation model for reasoning and research.
- Merlinite-7B (an IBM model): Handles function calling and is optimized for tasks like summarization and formatting.
These models are configured by setting:
- Model ID: A unique identifier for the selected LLM (e.g., `meta-llama/llama-3-70b-instruct`).
- API URL: Endpoint for the Watsonx deployment.
- Project ID: For tracking and managing workloads.
- Decoding parameters: Such as `greedy` decoding and `max_new_tokens`, which control generation style and output length.
This dual-model approach allows for separation of concerns—one model thinks, the other executes.
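The configuration fields listed above can be captured as plain dictionaries before they are handed to the watsonx LLM client. The parameter names (`decoding_method`, `max_new_tokens`) follow the watsonx.ai text-generation API; the URL, project ID, and Merlinite model ID below are placeholders — substitute the values from your own watsonx project and model catalog.

```python
# Shared decoding parameters for both models.
generation_params = {
    "decoding_method": "greedy",  # deterministic decoding
    "max_new_tokens": 500,        # cap on generated output length
}

llama_config = {
    "model_id": "meta-llama/llama-3-70b-instruct",  # primary reasoning model
    "url": "https://us-south.ml.cloud.ibm.com",     # regional endpoint (placeholder)
    "project_id": "your-project-id",                # placeholder
    "params": generation_params,
}

merlinite_config = {
    "model_id": "ibm/merlinite-7b",  # function-calling model (ID assumed; check your catalog)
    "url": llama_config["url"],
    "project_id": llama_config["project_id"],
    "params": generation_params,
}
```

Because both models share the same endpoint, project, and decoding parameters, only the model ID differs between the two configurations.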
Step 3: Creating the First Agent — The Researcher
The first AI agent you create is a Senior AI Researcher. This agent’s task is to explore the web and identify promising AI research, particularly in the field of quantum computing.
The agent is defined by:
- Role: Senior AI researcher
- Goal: Identify breakthrough trends in quantum AI
- Backstory: A veteran in quantum computing with a strong physics background
- Tools: Connected to Serper.dev to perform live web searches
- LLMs: Uses both LLaMA 3 and Merlinite for generation and function calling
Once the agent is initialized, it is assigned a task:
- Description: Search the internet for five examples of promising AI research.
- Expected Output: A bullet-point summary covering background, utility, and relevance.
- Output File: Saved as a `.txt` file for later use.
The CrewAI framework is used to assign this task to the agent and run the job.
Step 4: Running the First Agent
Upon execution, the researcher agent connects to the web via the integrated Serper.dev tool, fetches relevant articles and papers, processes them using LLaMA 3, and then compiles a structured summary.
This step demonstrates the core capability of an AI agent:
- Independently navigating a knowledge base (the internet)
- Extracting meaningful data
- Organizing it into a coherent output file
At this point, you have a fully functional single-agent AI system. But the goal is to build multi-agent intelligence, so we move to the next phase.
Step 5: Adding the Second Agent — The Speechwriter
The second agent in the system is a Senior Speechwriter, whose job is to turn the research from the first agent into an engaging keynote address.
This agent differs from the first in key ways:
- Role: Expert communicator with experience writing for executives
- Goal: Transform technical content into accessible, compelling speeches
- Backstory: A seasoned science communicator with a flair for narrative
- Tools: This agent doesn’t require web access—it relies solely on internal data
A new task is assigned to the writer agent:
- Description: Craft a keynote speech on quantum computing using the prior research.
- Expected Output: A complete speech with an introduction, body, and conclusion.
- Output File: Saved separately as a text file for review or public use.
Step 6: Orchestrating a Multi-Agent Workflow
The real magic happens when both agents are assigned to the Crew, and tasks are executed in sequence.
- First, the Researcher agent runs and generates `task1_output.txt`.
- Next, the Speechwriter agent picks up the content of `task1_output.txt` and transforms it into a keynote saved as `task2_output.txt`.
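The file handoff that makes this pipeline work can be illustrated with a plain-Python stand-in — no CrewAI or live API required. The two functions below are hypothetical stand-ins for the agents, showing only the sequential write-then-read contract between the tasks.

```python
import tempfile
from pathlib import Path

def research_stage(out_path: Path) -> None:
    # Stand-in for the Researcher agent: writes its findings to a file.
    out_path.write_text("- Quantum Optimization\n- Quantum Neural Networks\n")

def speech_stage(in_path: Path, out_path: Path) -> None:
    # Stand-in for the Speechwriter agent: builds on the researcher's output.
    research = in_path.read_text()
    out_path.write_text("Keynote draft based on:\n" + research)

workdir = Path(tempfile.mkdtemp())
task1 = workdir / "task1_output.txt"
task2 = workdir / "task2_output.txt"

# Sequential execution: stage two runs only after stage one has written its file.
research_stage(task1)
speech_stage(task1, task2)
```

CrewAI's sequential process enforces exactly this ordering, so the second task can rely on the first task's output file already existing.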
This chain illustrates a basic pipeline of intelligent delegation—an LLM-driven research-to-content-production pipeline.
It’s worth noting that the system currently executes tasks in a fixed order, but future versions could allow dynamic delegation, where agents decide among themselves who’s best suited for each task.
Debugging and Execution Insights
During execution, small bugs—such as assigning the wrong agent to a task—can occur. In the demo, the same agent was mistakenly assigned to both tasks initially. This was quickly corrected by specifying the correct agent object in the task definition.
This highlights an important lesson: as multi-agent systems grow in complexity, agent-task mapping and error handling become essential to maintain reliability.
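One lightweight mitigation is a pre-flight sanity check on the agent-task mapping before the crew is kicked off. The helper below is a hypothetical sketch using plain role strings; it catches exactly the "same agent on both tasks" bug described above.

```python
def check_task_mapping(assignments: dict, expected: dict) -> list:
    """Return human-readable mismatches between actual and intended agent roles.

    An empty list means every task is mapped to its intended agent.
    """
    return [
        f"{task}: expected {expected[task]!r}, got {role!r}"
        for task, role in assignments.items()
        if expected.get(task) != role
    ]

# Intended mapping for the two-task pipeline.
expected = {"research": "Senior AI Researcher", "speech": "Senior Speechwriter"}

# The buggy configuration from the demo: one agent assigned to both tasks.
buggy = {"research": "Senior AI Researcher", "speech": "Senior AI Researcher"}
errors = check_task_mapping(buggy, expected)
```

Running the check on the buggy configuration flags the speech task, while the corrected mapping passes cleanly.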
Final Outputs and Results
Once the system runs successfully:
- `task1_output.txt` contains a well-structured list of current AI + Quantum research, including areas like Quantum Optimization, Quantum Neural Networks, and Reinforcement Learning.
- `task2_output.txt` delivers a speech starting with a warm welcome and leading into the transformative power of Quantum Computing in AI, illustrating its potential to redefine innovation.
The ability to go from web-based research to polished, publish-ready content through autonomous AI agents is not only remarkable—it’s incredibly useful.
Expanding the System Further
What was demonstrated is only a minimum viable multi-agent system. This system could be further enhanced by:
- Adding more agents: editors, data analysts, graphic designers
- Enabling delegation logic: where agents choose tasks dynamically
- Introducing memory: to maintain continuity across long projects
- Scaling horizontally: run multiple tasks in parallel
Why Watsonx.ai and CrewAI?
Watsonx.ai provides:
- Access to powerful LLMs like LLaMA 3 and Merlinite
- Enterprise-ready deployment across regions
- Security and project management for data science workflows
CrewAI offers:
- A clean orchestration framework for multi-agent coordination
- Modular agent and task definition
- Integration with external tools like Serper.dev, GitHub, CSV parsers, and more
Together, they create a powerful stack for building complex, distributed AI systems.
Conclusion: Multi-Agent AI Is the Future
Multi-agent systems represent a seismic shift in how we approach problem-solving with AI. By distributing intelligence across roles—just like in human teams—we unlock a new level of automation, flexibility, and performance.
What began as a 15-minute demo ends with a framework that can be applied to enterprise automation, content generation, scientific research, and beyond.
With platforms like Watsonx.ai and CrewAI, the barriers to building advanced multi-agent systems have never been lower. The question isn’t whether you can build one—it’s what kind of team of agents you’ll assemble next.