Natural Language Query Tools

Overview

The Natural Language Query tools enable agents to convert natural language questions into SQL queries, execute them against the database, and return results in natural language format. These tools are workspace-context-aware, automatically applying portfolio filters, date ranges, and focus modes from the current user context. The system uses a three-stage pipeline: natural language to SQL translation, SQL validation for safety, and intelligent result formatting.

nl_query

Convert a natural language question to SQL, validate, execute, and return formatted results. Type: Tool Required: Yes Since: v1.0 Status: Production

Description

This tool transforms conversational questions into database queries while respecting workspace context. It performs validation to prevent destructive operations, executes the query, and formats results in natural language with optional SQL transparency. The tool is workspace-context-aware, automatically incorporating:

Selected portfolio filtering
Date and date range constraints
Focus mode (validation, dashboard, visual query builder)
User preferences for data presentation

Parameters

question (string, required) The natural language question about the database. Examples: “Show me all active portfolios”, “What are today’s validation breaches?”, “Which portfolios have the highest VaR?” include_sql (boolean, optional) Include the generated SQL query in the response. Useful for transparency and debugging. Default: true max_results (integer, optional) Maximum number of results to return from the query. Prevents memory issues with large result sets. Default: 100, Range: 1-10000 config (RunnableConfig, optional) Agent runtime configuration containing workspace context and state. Automatically populated by the agent framework.

Returns

Dictionary with the following structure:

{
  "success": boolean,
  "results": [
    {
      "column1": "value1",
      "column2": "value2"
    }
  ],
  "summary": "Natural language summary of results",
  "sql_query": "SELECT ... (if include_sql=true)",
  "row_count": number,
  "generation": {
    "confidence": 0.0-1.0,
    "explanation": "How the SQL was generated",
    "tables_used": ["table1", "table2"]
  },
  "validation": {
    "warnings": [],
    "safety_score": 0.0-1.0
  },
  "context_aware": {
    "context_used": boolean,
    "portfolio_filter": "portfolio_id or null",
    "date_filter": "YYYY-MM-DD or null",
    "focus_mode": "validation|dashboard|visual_query_builder or null"
  }
}

Throws

Generation Error: Failed to convert natural language to SQL
Validation Error: Generated SQL failed safety validation (DROP, DELETE, etc.)
Database Error: Query execution failed
Context Error: Invalid workspace context provided

Processing Pipeline

Extract Workspace Context: Retrieves portfolio, date, and focus mode from agent config
Load Schema: Loads cached database schema with full table and column definitions
Enhance Question: Augments the natural language question with workspace filters
Generate SQL: Uses GPT-4 to convert enhanced question to SQL with full schema context
Validate SQL: Checks for destructive operations and validates syntax
Execute Query: Runs validated query with automatic row limits
Format Results: Converts raw results to natural language summary
Store History: Records query for learning and pattern extraction

Examples

Simple Selection Query

Agent: "Show me all active portfolios"

Returns:
{
  "success": true,
  "summary": "Found 12 active portfolios with total AUM of $2.3B. Top performers are TechFund (15.2% YTD) and BlueChip (12.8% YTD)",
  "sql_query": "SELECT id, name, aum, status, ytd_return FROM investment.portfolios WHERE status = 'active' ORDER BY aum DESC",
  "row_count": 12,
  "generation": {
    "confidence": 0.98,
    "explanation": "Simple filter on active status with order by AUM",
    "tables_used": ["investment.portfolios"]
  }
}

Context-Aware Aggregation

Agent: "What are the holdings by sector?"
(With selected portfolio: "Tech500")

Returns:
{
  "success": true,
  "summary": "Sector breakdown for Tech500: Technology 45% ($1.2B), Healthcare 20% ($530M), Financials 18% ($475M), Other 17% ($450M)",
  "sql_query": "SELECT sector, COUNT(*) as count, SUM(market_value) as total_value FROM investment.holdings WHERE portfolio_id = 'Tech500' GROUP BY sector ORDER BY total_value DESC",
  "row_count": 4,
  "context_aware": {
    "context_used": true,
    "portfolio_filter": "Tech500",
    "focus_mode": "dashboard"
  }
}

Date-Filtered Analysis

Agent: "Show me today's validation failures"
(With selected date: "2024-10-29")

Returns:
{
  "success": true,
  "summary": "3 validation rules failed today affecting 2 portfolios: Portfolio X (rule 7), Portfolio Y (rules 2, 4)",
  "sql_query": "SELECT portfolio_id, rule_id, error_message FROM validation.failures WHERE check_date = '2024-10-29' ORDER BY portfolio_id",
  "row_count": 3,
  "context_aware": {
    "context_used": true,
    "date_filter": "2024-10-29",
    "focus_mode": "validation"
  }
}

Error Handling

Agent: "Show me all portfolios with bad syntax"

Returns:
{
  "success": false,
  "error": "Generated SQL failed validation",
  "issues": ["SQL syntax error: unexpected keyword 'bad'"],
  "warnings": ["Query may not return expected results"],
  "question": "Show me all portfolios with bad syntax"
}

Workspace Context Features

When workspace context is available, the tool automatically:

Filters by selected_portfolio_id if a portfolio is selected
Applies single date filter if selected_date is set
Applies date range if selected_date_range contains start/end dates
Focuses schema hints based on focus_mode:
- validation: Emphasizes validation schema tables
- dashboard: Emphasizes analytics and performance metrics
- visual_query_builder: Suggests tables for visual query construction

Safety & Validation

All generated SQL is validated before execution:

Blocks destructive operations (DROP, DELETE, TRUNCATE, ALTER)
Validates SQL syntax
Enforces row limits to prevent runaway queries
Records all queries in history for audit trail
Returns safety score (0.0-1.0) in response

get_query_history

Retrieve recent natural language queries with pattern analysis and suggestions. Type: Tool Required: No Since: v1.0 Status: Production

Description

This tool retrieves the history of natural language queries executed in the workspace. It analyzes successful queries to extract common patterns and generates suggestions for similar queries. Useful for learning what queries have worked well and improving query generation through examples.

Parameters

limit (integer, optional) Number of recent queries to retrieve. Default: 20, Range: 1-100 user_id (string, optional) Filter history by specific user ID. If not provided, returns all users’ queries in the workspace. Default: null

Returns

Dictionary with the following structure:

{
  "success": boolean,
  "total_queries": number,
  "successful_count": number,
  "failed_count": number,
  "recent_queries": [
    {
      "question": "Natural language question",
      "sql_query": "SELECT ...",
      "success": boolean,
      "row_count": number,
      "error_message": "Error description if failed",
      "created_at": "ISO timestamp"
    }
  ],
  "common_patterns": [
    {
      "pattern": "Aggregation|Multi-table Join|Time-filtered Query|Top N Query|Simple Select",
      "count": number,
      "examples": [
        {
          "question": "Example question",
          "sql": "Generated SQL"
        }
      ]
    }
  ],
  "suggestions": [
    "Try: 'Show me all active portfolios ordered by AUM'",
    "Try: 'What validation rules failed today?'"
  ]
}

Throws

Database Error: Failed to retrieve history from database
Invalid User ID: User ID format invalid or user not found

Pattern Recognition

The tool automatically identifies five query pattern types:

Simple Select: Basic SELECT queries without complex clauses
Aggregation: Queries with GROUP BY and aggregate functions
Multi-table Join: Queries joining multiple tables
Time-filtered Query: Queries filtering by date/timestamp
Top N Query: Queries with ORDER BY and LIMIT

Examples

Retrieve All Recent Queries

Agent: "Show me recent queries"

Returns:
{
  "success": true,
  "total_queries": 15,
  "successful_count": 13,
  "failed_count": 2,
  "recent_queries": [
    {
      "question": "Show me all active portfolios",
      "sql_query": "SELECT id, name, aum FROM investment.portfolios WHERE status = 'active'",
      "success": true,
      "row_count": 12,
      "created_at": "2024-10-29T14:30:00Z"
    },
    {
      "question": "What is the VaR by portfolio?",
      "sql_query": "SELECT portfolio_id, SUM(value_at_risk) as total_var FROM risk.calculations GROUP BY portfolio_id",
      "success": true,
      "row_count": 8,
      "created_at": "2024-10-29T14:15:00Z"
    }
  ],
  "common_patterns": [
    {
      "pattern": "Aggregation",
      "count": 5,
      "examples": [
        {
          "question": "Show me holdings by sector",
          "sql": "SELECT sector, COUNT(*) FROM investment.holdings GROUP BY sector"
        }
      ]
    },
    {
      "pattern": "Time-filtered Query",
      "count": 4,
      "examples": [
        {
          "question": "What happened today?",
          "sql": "SELECT * FROM events WHERE date = CURRENT_DATE"
        }
      ]
    }
  ],
  "suggestions": [
    "Try: 'Show me all active portfolios ordered by AUM'",
    "Try: 'What validation rules failed today?'",
    "Try: 'Which portfolios have the highest VaR?'"
  ]
}

User-Specific Query History

Agent: "Get john.smith's recent queries"

Returns:
{
  "success": true,
  "total_queries": 8,
  "successful_count": 7,
  "failed_count": 1,
  "recent_queries": [ ... user's queries ... ]
}

Learning & Suggestions

The tool improves over time by:

Tracking successful queries as examples
Identifying common query patterns
Extracting schema hints from usage
Suggesting effective queries based on historical success
Recording failed queries for debugging

Integration with Agent Framework

Both tools integrate seamlessly with the agent framework:

from app.agent.tools.nl_query import get_nl_query_tools

# Register tools with agent
tools = get_nl_query_tools()  # Returns [nl_query, get_query_history]

Best Practices

Use Specific Questions: “Show me validation failures for portfolio X today” works better than “What’s wrong?”
Leverage Workspace Context: Select a portfolio/date before querying for automatic filtering
Check Query History: Review past successful queries to learn effective patterns
Enable SQL Transparency: Set include_sql=true when learning SQL patterns
Monitor Safety Scores: Review validation warnings for unexpected query generation

Performance Considerations

Database schema is cached to avoid repeated loading
Query results automatically limited to max_results (default 100)
Queries added to history asynchronously (failures don’t block)
Workspace context filtering reduces result set sizes
Pattern extraction limited to 20 most recent queries by default

Agent System Overview

Available Agents

Integration Patterns

Agent Tools & Capabilities

Natural Language Query Tools

Overview

nl_query

Description

Parameters

Returns

Throws

Processing Pipeline

Examples

Simple Selection Query

Context-Aware Aggregation

Date-Filtered Analysis

Error Handling

Workspace Context Features

Safety & Validation

get_query_history

Description

Parameters

Returns

Throws

Pattern Recognition

Examples

Retrieve All Recent Queries

User-Specific Query History

Learning & Suggestions

Integration with Agent Framework

Best Practices

Performance Considerations

Agent System Overview

Available Agents

Integration Patterns

Agent Tools & Capabilities

​Overview

​nl_query

​Description

​Parameters

​Returns

​Throws

​Processing Pipeline

​Examples

​Simple Selection Query

​Context-Aware Aggregation

​Date-Filtered Analysis

​Error Handling

​Workspace Context Features

​Safety & Validation

​get_query_history

​Description

​Parameters

​Returns

​Throws

​Pattern Recognition

​Examples

​Retrieve All Recent Queries

​User-Specific Query History

​Learning & Suggestions

​Integration with Agent Framework

​Best Practices

​Performance Considerations

Overview

nl_query

Description

Parameters

Returns

Throws

Processing Pipeline

Examples

Simple Selection Query

Context-Aware Aggregation

Date-Filtered Analysis

Error Handling

Workspace Context Features

Safety & Validation

get_query_history

Description

Parameters

Returns

Throws

Pattern Recognition

Examples

Retrieve All Recent Queries

User-Specific Query History

Learning & Suggestions

Integration with Agent Framework

Best Practices

Performance Considerations