In today’s fast-paced environment, the ability to swiftly access, understand, and act upon data is no longer a luxury; it’s a necessity. However, many organizations find that while they are rich in data, deriving timely, actionable insights remains a significant challenge, particularly for non-technical business users.
Technical users face a related problem: they need to understand their data well enough to know which queries to construct, which takes significant time and effort. Neither group can simply ask what’s on their mind in plain natural language and get an answer in a single click.
On top of that, the user still needs to spend time making sense of the data, even with visualizations in place. There’s often a question of “why” when data is being presented, and data without the crucial “why” leaves a gap between data presentation and true understanding. In essence, self-service business analytics remains elusive.
What is Polaris?
Polaris is a multi-agent AI-powered conversational interface built to analyze data in our Couchbase Operational database. Polaris leverages a multi-agent architecture that enables users to interact with their enterprise data through an intuitive, conversational interface, transforming complex data analysis into a simple dialogue. For example, if a company has global enterprise sales data across various regions and product lines, and a business analyst wants to understand “Why did Q2 sales decline in the Northeast region for Product X?”, our application can autonomously execute the entire analysis workflow.
It retrieves and filters relevant sales data by region, product, and time period, compares performance trends across comparable regions or products, visualizes key patterns and anomalies, and generates a narrative report summarizing the root causes, such as reduced promotional spend, stock availability issues, or a shift in customer behavior. To take it further, the business analyst could ask a follow-up question to dig into some part of the report, or ask for more visualizations, enabling fast, data-driven decision making.
Now, let us address the elephant in the room – AI Agents.
What are AI Agents and what are their capabilities?
AI agents are autonomous systems powered by artificial intelligence, typically involving Large Language Models (LLMs), that can perform tasks, make decisions, and interact with real-world environments, often without constant human supervision. Unlike traditional chatbots or rule-based programs, AI agents can also learn from their experience. The goal is for an agent to do everything a human operator does, autonomously and automatically. That goal is still far off, but the AI industry is progressing towards it. Now, let us look at the capabilities of AI agents:
Agent Plan: Step-by-Step Problem Solving
AI agents break down complex tasks into clear, manageable steps—identifying the problem, executing each phase, and adjusting as needed. In multi-agent systems, each agent can own a specific task, enabling efficient, coordinated problem-solving.
Context Awareness: Memory Management and State Tracking
Agents maintain context across interactions, remembering past inputs and adapting to ongoing workflows. This state tracking creates more natural, consistent, and intelligent user experiences.
Tool Usage: Extending Agent Capabilities
Agents can interact with external tools—APIs, databases, scripts—to perform real actions, not just offer suggestions. This transforms them from passive assistants into active executors within workflows.
Learn from Past Data: Adapting Over Time
By analyzing historical data and behavior, agents improve over time—anticipating user needs, refining responses, and optimizing workflows based on usage patterns.
What is a multi-agent system (MAS)?
Multi-Agent Architecture is a system design where multiple independent agents work together to solve problems or perform tasks. Each agent has its own role, such as collecting data, analyzing information, or making decisions. These agents communicate and collaborate to achieve a common goal, making the system more organized. It is just like a team where each member does a specific job, but they all work toward the same result. We use a multi-agent architecture for Polaris.
Why the shift from single agent architecture?
A single AI agent operates independently, handling specific tasks autonomously. This works well for straightforward applications, like a Retrieval-Augmented Generation (RAG) system, where an agent answers user queries based on an LLM and a knowledge base. However, in practical applications, user interactions are rarely simple. They often involve complex logic, multi-step reasoning, and the need to work across dynamic data models and evolving business requirements. At this point, single-agent systems begin to hit performance and scalability limits. They may falter when chaining multiple operations, adapting to schema changes, or coordinating nuanced workflows.
Why do Multi-Agent Architectures (MAS) work?
The separation of concerns inherent in MAS design leads to more robust and maintainable systems. Each agent focuses on its specific task, reducing complexity and making it easier to identify and resolve issues. This approach shines in scenarios like autonomous vehicle control, where separate agents handle navigation, obstacle detection, and vehicle dynamics, allowing for focused development and troubleshooting in each area.
Supervisor vs Network Multi Agent Systems

Decentralized (Peer-to-Peer) Multi-Agent Architecture

We picked: Supervisor-Based Architecture
Our application makes use of the LangGraph Supervisor Agent:
- Centralized reasoning, consistency, and coherence
  - Complex reasoning benefits from having a global view of data, user intent, and context. The supervisor can maintain coherent logic across multiple steps.
  - A single decision-making point ensures that outputs are aligned (e.g., chart matches explanation, summary reflects analysis).
- Easier error handling, recovery, and scalability
  - Errors can be centrally detected and managed. The supervisor can retry tasks, reassign roles, or generate fallback responses.
  - Central control allows dynamic allocation of tasks to specialized sub-agents (e.g., chart generator, query agent). This prevents duplication of effort and optimizes resource usage.
  - It is easier to add, replace, or update sub-agents without redesigning the whole system.
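To make the supervisor idea concrete, here is a minimal sketch in plain Python. It does not use LangGraph; the `Supervisor` class, the agent names, and the retry policy are all assumptions made up for illustration, not the Polaris implementation. It only shows the shape of the pattern: one central component dispatches work to sub-agents, retries a failed step, and produces a fallback response when retries run out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical sub-agent signature: takes a task string, returns a result string.
Agent = Callable[[str], str]


@dataclass
class Supervisor:
    agents: Dict[str, Agent]
    max_retries: int = 1  # assumed policy: one retry per step

    def run_step(self, agent_name: str, task: str) -> str:
        """Run one sub-agent, retrying on failure, then fall back to a message."""
        last_error: Optional[Exception] = None
        for _ in range(self.max_retries + 1):
            try:
                return self.agents[agent_name](task)
            except Exception as exc:  # errors are detected centrally
                last_error = exc
        return f"[{agent_name} failed: {last_error}]"

    def run(self, plan: List[str], task: str) -> Dict[str, str]:
        """Execute the plan in order and collect each agent's output."""
        return {name: self.run_step(name, task) for name in plan}
```

Because every sub-agent is just an entry in the `agents` dictionary, adding or replacing one does not touch the dispatch logic, which is the maintainability argument made above.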
Polaris Core
At its core, Polaris makes use of a network of specialized AI agents, each optimized for a different aspect of the data interaction lifecycle. Let us walk through the components and the overall high-level multi-agent architecture of Polaris with the help of an example:
Understanding and orchestration: Supervisor Agent
The Supervisor Agent acts as the central controller and intelligent orchestrator of the multi-agent system.
Functions:
- Intent Parsing: Parses user input and extracts task-related intents and parameters.
  - Example: The user asks: “Why did the total sales for electronics in Q1 2024 go down in the APAC region?” The Supervisor Agent parses this to identify Intent: “Reason (causal analysis for sales drop)”, Product Category: “Electronics”, Time Period: “Q1 2024”, Region: “APAC”.
- Agent Routing Logic: Implements a decision engine or rule-based orchestration layer to route tasks to appropriate agents.
  - Example: Based on the parsed intent “causal analysis for sales drop”, the Supervisor Agent decides to first route the task to the Query Expert to fetch sales data, then to the Charting Expert for visualization, then to the Reasoning Expert for causal identification, and finally to the Report Expert for summarization.
- Context Management: Maintains global conversation context and state.
- Error Handling & Recovery: Monitors task success/failure and can reassign or rephrase sub-tasks based on agent feedback.
  - Example: If the Query Expert reports that a requested column, Product_Type, does not exist in the schema, the Supervisor Agent might re-route the request to the Reasoning Expert to suggest alternative relevant columns or inform the user about the missing data.
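The intent parsing and routing functions above can be sketched as a small data structure plus a lookup table. This is an illustrative sketch only: in Polaris the parsing is done by the Supervisor Agent itself, and the `ParsedIntent` fields, the intent labels, and the `ROUTES` table below are hypothetical names chosen to mirror the example in the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ParsedIntent:
    intent: str            # e.g. "causal_analysis", "trend", "lookup" (assumed labels)
    product_category: str
    time_period: str
    region: str


# Hypothetical routing table: intent -> ordered list of experts to call.
ROUTES: Dict[str, List[str]] = {
    "causal_analysis": [
        "query_expert", "charting_expert", "reasoning_expert", "report_expert",
    ],
    "trend": ["query_expert", "charting_expert"],
    "lookup": ["query_expert"],
}


def route(parsed: ParsedIntent) -> List[str]:
    """Return the ordered experts for an intent; unknown intents get the Query Expert only."""
    return list(ROUTES.get(parsed.intent, ["query_expert"]))


# The APAC example from the text, expressed as a parsed intent.
example = ParsedIntent("causal_analysis", "Electronics", "Q1 2024", "APAC")
plan = route(example)
```

For the example question, `plan` lists the Query, Charting, Reasoning, and Report experts in that order, matching the routing described above.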
Extraction of relevant data: Query Expert
The Query Expert translates natural language questions into SQL++ and fetches the needed data.
Functions:
- Schema Inference and Annotations: Infers the data schema using the SQL++ INFER command, which fetches the column names, the datatypes of the columns, and sample documents. Together with annotations, this helps the agent understand the data, table relationships, data types, and constraints.
  - Example: When SQL++ INFER is run on a collection, it might identify a field simply as “amount”: NUMBER. Without further context, the Query Expert wouldn’t know if this refers to sale_amount, discount_amount, or quantity. However, through annotations, Polaris is explicitly told: the “amount” field in the ‘enterprise_sales’ collection represents the ‘total sales amount’ for a transaction. This annotation is crucial because when the user asks for “total sales”, the Query Expert now confidently maps “sales” to the amount field, correctly generating SUM(amount).
- Input Canonicalization: Transforms the user’s original natural language input into a more verbose, unambiguous, and structured form. This helps the IQ tool better understand the task.
  - Example: User input: “sales last month.” Canonicalized input: “Retrieve total sales amount for the category ‘Electronics’ for the previous 30 days from the current date.”
- NL-to-SQL++ Translation: Calls the IQ tool to convert natural language to SQL++.
- Data Quality Checks and Error Recovery: The agent inspects for null values and other data integrity issues that could affect interpretation. If data quality is poor (e.g., all NULLs in a column), the agent either reformulates the query or returns a warning for user intervention. Based on error diagnostics, the agent auto-adjusts the query (e.g., corrects column names, or limits result sizes) and retries execution intelligently.
  - Example: If the sale_amount column might contain nulls, the Query Expert automatically adds: AND sale_amount IS NOT NULL to the generated query to ensure accurate sum calculations.
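Two of the steps above lend themselves to a short sketch: merging inferred field types with annotations, and adding a NULL guard to a generated query. The function names and the simplified `{field: type}` input are assumptions for illustration; the real INFER output is richer than this, and in Polaris the query text comes from the IQ tool rather than from string templates.

```python
from typing import Dict, List


def describe_schema(inferred: Dict[str, str], annotations: Dict[str, str]) -> List[str]:
    """Merge inferred field types with human-written annotations, one line per field."""
    lines = []
    for field, dtype in sorted(inferred.items()):
        note = annotations.get(field, "no annotation")
        lines.append(f"{field} ({dtype}): {note}")
    return lines


def sum_query(collection: str, field: str, nullable: bool) -> str:
    """Build a SQL++ SUM query, adding an IS NOT NULL filter when the field may hold NULLs."""
    query = f"SELECT SUM(`{field}`) AS total FROM `{collection}`"
    if nullable:
        query += f" WHERE `{field}` IS NOT NULL"
    return query


# Simplified stand-in for what schema inference might report (assumed shape).
inferred = {"amount": "number", "region": "string"}
annotations = {"amount": "total sales amount for a transaction"}

schema_context = describe_schema(inferred, annotations)
query = sum_query("enterprise_sales", "amount", nullable=True)
```

The `schema_context` lines are the kind of text that could be placed in a prompt so that “total sales” maps to the `amount` field rather than to some other numeric column.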
Insight generation: Charting Expert
The Charting Expert is responsible for converting structured query results into meaningful visual representations, tailored to the nature of the data and user query.
Functions:
- Chart Selection Logic: Uses rule-based heuristics to select appropriate chart types based on data characteristics (e.g., dimensions, metrics, time series).
  - Example: Based on the rules given in the prompt and the type of data, the expert chooses an appropriate chart. For example, for sales time series data where a trend needs to be identified, it selects a line chart.
- Dynamic Visualization Generation: Constructs visualizations using libraries like Plotly and Seaborn.
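As a rough illustration of rule-based chart selection, the sketch below picks a chart type from the kinds of columns in a result set. In Polaris the rules live in the Charting Expert’s prompt; the specific rules, the column-kind labels, and the `select_chart` function here are assumptions, with only the “time series plus metric gives a line chart” rule taken from the example above.

```python
from typing import Dict


def select_chart(columns: Dict[str, str]) -> str:
    """Pick a chart type from column kinds.

    `columns` maps column name -> one of "time", "category", "number".
    """
    kinds = list(columns.values())
    n_time = kinds.count("time")
    n_category = kinds.count("category")
    n_number = kinds.count("number")

    if n_time >= 1 and n_number >= 1:
        return "line"        # a metric over time: show the trend
    if n_category >= 1 and n_number >= 1:
        return "bar"         # a metric per category: compare groups
    if n_number >= 2:
        return "scatter"     # two metrics: show their relationship
    if n_number == 1:
        return "histogram"   # one metric: show its distribution
    return "table"           # nothing numeric to plot


chart = select_chart({"month": "time", "sales": "number"})
```

The rules are checked in priority order, so a result with both a time column and a category column is still drawn as a line chart in this sketch.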
Reporting and summarization: Report Expert
The Report Expert compiles insights, visualizations, and context into structured reports.
Functions:
- Content Aggregation: Automatically summarizes query results, embeds visualizations and the methodology, and includes metadata (e.g., data sources, query parameters).
- Versioning & Audit Logs: Optionally integrates version control and logging for compliance and traceability of generated reports.
Explanation and reasoning: Reasoning Expert
The Reasoning Expert provides causal reasoning, trend analysis, and hypothesis generation by interpreting data insights through the lens of domain knowledge and logical inference.
Functions:
- LLM-based Reasoning: Leverages LLMs to reason over data results, uncover latent patterns, and generate explanatory narratives.
- Contextual Augmentation: Utilizes domain-specific knowledge extracted from the user’s database to provide grounded explanations.
Workflow
The Polaris platform is designed to turn natural language questions into intelligent, multi-modal insights by orchestrating a team of specialized agents. Here’s how the workflow unfolds:
- Polaris System Initialization
The user begins by selecting the relevant bucket, scope, collection, and metadata collection. Based on this context, Polaris initializes specialized agents and uses the schema, metadata, and sample data to prompt an LLM, which generates example questions to guide user exploration.
- Natural Language Interaction
- Intelligent Query Processing
High-Level Design Diagram
- The Query Expert handles core data access tasks: inferring the schema, translating the natural language query into SQL++ using a generator tool, and executing the query.
- Tools supporting the Query Expert include the Schema Inference Tool, SQL++ Generator Tool, and Query Execution Tool.
- Multi-Faceted Response Generation
Based on the results of the initial query, the Supervisor coordinates:
- The Charting Expert, which creates data visualizations via a Chart Generator Tool.
- The Report Expert, responsible for generating textual summaries using a Report Generator Tool.
- The Reasoning Expert, which adds context, rationale, or further explanations to enrich the response.
- Comprehensive Insight Delivery
Polaris synthesizes the structured query results, visual outputs, and narrative explanations into a cohesive, user-friendly response. This multi-modal insight is delivered back through the chat interface, combining clarity, depth, and interactivity.
- Iterative Exploration
Users are encouraged to ask follow-up questions. Since the system retains context and state across the session, the agent network can build on previous interactions to support deep, iterative data exploration.
Usage of ReAct agents
What is a ReAct agent?
“A ReAct agent is an AI agent that uses the “reasoning and acting” (ReAct) framework to combine chain of thought (CoT) reasoning with external tool use. The ReAct framework improves the ability of a large language model (LLM) to handle complex tasks and decision-making in agentic workflows.”—Dave Bergmann, IBM

Working of a ReAct Agent
Unlike traditional Artificial Intelligence (AI) systems, ReAct agents don’t separate decision-making from task execution. This framework inherently creates a feedback loop in which the model problem-solves by iteratively repeating this interleaved thought-action-observation process. We use the built-in LangGraph ReAct framework in our application, and each expert is modeled as a ReAct agent.
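The thought-action-observation loop is easier to see in code. The sketch below is a hand-rolled toy, not the LangGraph implementation used in Polaris: `react_loop`, the step dictionary format, and `scripted_model` (a stand-in for an LLM that returns canned steps) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Step = Dict[str, str]
Tool = Callable[[str], str]


def react_loop(
    model: Callable[[List[str]], Step],
    tools: Dict[str, Tool],
    question: str,
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate thought -> action -> observation until the model gives an answer."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = model(transcript)                  # the model reasons over everything so far
        transcript.append(f"Thought: {step['thought']}")
        if "answer" in step:                      # the model decided it is done
            return step["answer"], transcript
        observation = tools[step["action"]](step["input"])  # act through a tool
        transcript.append(f"Action: {step['action']}[{step['input']}]")
        transcript.append(f"Observation: {observation}")    # feed the result back in
    return "No answer within step limit", transcript


def scripted_model(transcript: List[str]) -> Step:
    """Stand-in for an LLM: query first, then answer from the last observation."""
    if not any(line.startswith("Observation:") for line in transcript):
        return {"thought": "I need the sales total.",
                "action": "run_query",
                "input": "SELECT SUM(amount) FROM enterprise_sales"}
    last_observation = transcript[-1].split(": ", 1)[1]
    return {"thought": "I have the data.", "answer": f"Total sales: {last_observation}"}
```

Each observation is appended to the transcript before the next model call, which is the feedback loop described above: the next thought is always conditioned on the result of the previous action.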
The unseen architects: the power of efficient prompts in AI-driven data analysis
In the realm of data analysis, the spotlight often shines on algorithms, statistical models, and visualization techniques. However, behind every insightful chart, every well-structured report, and every data-driven conclusion lies a crucial, unseen aspect: the prompt.
To-Do List Prompting
To-do list prompting gives the model a persistent, structured task list that it refers to at every step. Instead of relying on memory or previous messages, the full plan is injected into each prompt. Therefore, the agent has a clear understanding of all the tasks it has to check off. This prevents drifting, repetition, or skipping steps.
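One way to picture to-do list prompting is a function that rebuilds the prompt on every step from the full task list and the set of completed tasks. The wording of the prompt, the checkbox format, and the `build_prompt` helper are assumptions for illustration; the actual Polaris prompts are not shown in this post.

```python
from typing import List, Set


def build_prompt(tasks: List[str], done: Set[int], latest_observation: str) -> str:
    """Rebuild the prompt each step so the whole plan is always in view.

    `done` holds the zero-based indexes of tasks already completed.
    """
    lines = ["Task list (refer to it at every step):"]
    for index, task in enumerate(tasks):
        mark = "x" if index in done else " "
        lines.append(f"[{mark}] {index + 1}. {task}")
    lines.append(f"Latest observation: {latest_observation}")
    lines.append("Work on the first unchecked task only.")
    return "\n".join(lines)


prompt = build_prompt(
    ["Infer schema", "Generate SQL++", "Run query"],
    done={0},
    latest_observation="schema inferred for enterprise_sales",
)
```

Since the list is regenerated from state rather than recalled from earlier messages, a long conversation cannot push the plan out of the model’s view.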
Identity Prompting
Identity prompting tells the model what it is, not just what it should do. This establishes a consistent role or persona that influences how the model behaves and responds. Prompts like “You are very proficient in data visualization tasks.” can instantly trigger domain-specific behavior—clear, confident, and focused responses.
Self-Reflection Prompting
Self-reflection prompting instructs the model to evaluate its own output after completing a task. This allows the model to introspect and verify whether it has met the user’s goal, and make corrections if needed. In our application, we’ve implemented self-reflection prompting within the Query Expert agent. After the SQL++ query is generated and executed, the agent checks whether all required data points are present.
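A reflection step of this kind might combine a simple programmatic check with a prompt that asks the model to review its own query. The sketch below is one possible shape under that assumption; `missing_data_points`, `reflection_prompt`, and the prompt wording are hypothetical and not taken from the Polaris codebase.

```python
from typing import Any, Dict, List


def missing_data_points(rows: List[Dict[str, Any]], required: List[str]) -> List[str]:
    """Return the required fields that are absent or NULL in every returned row."""
    return [
        field for field in required
        if not any(row.get(field) is not None for row in rows)
    ]


def reflection_prompt(question: str, query: str,
                      rows: List[Dict[str, Any]], required: List[str]) -> str:
    """Build the text that asks the model to check its own output against the goal."""
    missing = missing_data_points(rows, required)
    return "\n".join([
        "Review your previous output before answering.",
        f"User question: {question}",
        f"Query you generated: {query}",
        f"Rows returned: {len(rows)}",
        f"Required fields with no values: {', '.join(missing) or 'none'}",
        "If anything required is missing, rewrite the query; otherwise reply DONE.",
    ])
```

Doing the field check in code and passing the result into the prompt gives the model a concrete fact to reflect on, rather than asking it to rediscover the gap from raw rows.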
Prompting is more of an art than a science—there’s no one-size-fits-all formula. However, by applying proven heuristics and clear task framing, we can guide models toward more accurate, useful, and context-aware outputs. The key is experimentation, iteration, and learning what works best in your specific application.
Demo of Polaris, the multi-agent conversational interface
Challenges and future work
Polaris represents a paradigm shift in how organizations can harness their data assets, especially through natural language interactions – enabling intuitive data discovery and significantly accelerating decision-making. A major advancement has been our development of a dynamic multi-agent architecture that adapts its approach based on the context and can work with diverse datasets.
However, several challenges remain. One key area has been managing data annotations. Ensuring consistent and meaningful annotations across varied columns is critical to maintaining the quality of insights generated by AI agents. We could explore integrating with a global data catalog to make this easier. Another significant challenge is data cleanliness. While we mitigate some of these issues at the query level through conditional clauses and basic data cleaning, there is still room for improvement in upstream data validation and preprocessing.
Additionally, handling large-scale data retrieval has been a technical hurdle. In real-world scenarios, retrieved datasets often exceed the context window limits of current large language models. To address this, we perform aggregation operations and generate visual summaries such as charts to provide high-level insights without overwhelming the model.
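The idea of aggregating before handing data to the model can be sketched as follows. This is an assumption-laden illustration: the `max_rows` budget, the `fit_to_budget` helper, and the client-side grouping are made up for the example, and in practice the aggregation could just as well be pushed into the SQL++ query itself with GROUP BY.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def aggregate(rows: List[Row], group_key: str, value_key: str) -> Dict[Any, float]:
    """Sum `value_key` per `group_key`, skipping rows where the value is NULL."""
    totals: Dict[Any, float] = defaultdict(float)
    for row in rows:
        value = row.get(value_key)
        if value is not None:
            totals[row.get(group_key)] += value
    return dict(totals)


def fit_to_budget(rows: List[Row], group_key: str, value_key: str, max_rows: int) -> List[Row]:
    """Pass small results through unchanged; collapse large ones to one row per group."""
    if len(rows) <= max_rows:
        return rows
    totals = aggregate(rows, group_key, value_key)
    return [{group_key: key, value_key: total} for key, total in sorted(totals.items())]
```

A row count is a crude proxy for context size; a token estimate would be a closer fit to the actual limit, at the cost of depending on a specific tokenizer.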
Looking ahead, future work will focus on enhancing annotation pipelines, improving data quality management, and exploring more efficient methods of summarization and multi-turn agent collaboration to scale Polaris even further.
Conclusion: ushering in a new era of data interaction
Polaris is more than just a new tool. By combining the power of a multi-agent AI system with the simplicity of natural language conversation, Polaris democratizes data access, empowers business users, and accelerates the journey from data to decision. We believe Polaris will unlock significant value for our customers, fostering a more agile, data-informed, and competitive enterprise.