Skip to main content

Overview

The Natural Language Query tools enable agents to convert natural language questions into SQL queries, execute them against the database, and return results in natural language format. These tools are workspace-context-aware, automatically applying portfolio filters, date ranges, and focus modes from the current user context. The system uses a three-stage pipeline: natural language to SQL translation, SQL validation for safety, and intelligent result formatting.

nl_query

Convert a natural language question to SQL, validate, execute, and return formatted results. Type: Tool Required: Yes Since: v1.0 Status: Production

Description

This tool transforms conversational questions into database queries while respecting workspace context. It performs validation to prevent destructive operations, executes the query, and formats results in natural language with optional SQL transparency. The tool is workspace-context-aware, automatically incorporating:
  • Selected portfolio filtering
  • Date and date range constraints
  • Focus mode (validation, dashboard, visual query builder)
  • User preferences for data presentation

Parameters

question (string, required) The natural language question about the database. Examples: “Show me all active portfolios”, “What are today’s validation breaches?”, “Which portfolios have the highest VaR?” include_sql (boolean, optional) Include the generated SQL query in the response. Useful for transparency and debugging. Default: true max_results (integer, optional) Maximum number of results to return from the query. Prevents memory issues with large result sets. Default: 100, Range: 1-10000 config (RunnableConfig, optional) Agent runtime configuration containing workspace context and state. Automatically populated by the agent framework.

Returns

Dictionary with the following structure:
{
  "success": boolean,
  "results": [
    {
      "column1": "value1",
      "column2": "value2"
    }
  ],
  "summary": "Natural language summary of results",
  "sql_query": "SELECT ... (if include_sql=true)",
  "row_count": number,
  "generation": {
    "confidence": 0.0-1.0,
    "explanation": "How the SQL was generated",
    "tables_used": ["table1", "table2"]
  },
  "validation": {
    "warnings": [],
    "safety_score": 0.0-1.0
  },
  "context_aware": {
    "context_used": boolean,
    "portfolio_filter": "portfolio_id or null",
    "date_filter": "YYYY-MM-DD or null",
    "focus_mode": "validation|dashboard|visual_query_builder or null"
  }
}

Throws

  • Generation Error: Failed to convert natural language to SQL
  • Validation Error: Generated SQL failed safety validation (DROP, DELETE, etc.)
  • Database Error: Query execution failed
  • Context Error: Invalid workspace context provided

Processing Pipeline

  1. Extract Workspace Context: Retrieves portfolio, date, and focus mode from agent config
  2. Load Schema: Loads cached database schema with full table and column definitions
  3. Enhance Question: Augments the natural language question with workspace filters
  4. Generate SQL: Uses GPT-4 to convert enhanced question to SQL with full schema context
  5. Validate SQL: Checks for destructive operations and validates syntax
  6. Execute Query: Runs validated query with automatic row limits
  7. Format Results: Converts raw results to natural language summary
  8. Store History: Records query for learning and pattern extraction

Examples

Simple Selection Query

Agent: "Show me all active portfolios"

Returns:
{
  "success": true,
  "summary": "Found 12 active portfolios with total AUM of $2.3B. Top performers are TechFund (15.2% YTD) and BlueChip (12.8% YTD)",
  "sql_query": "SELECT id, name, aum, status, ytd_return FROM investment.portfolios WHERE status = 'active' ORDER BY aum DESC",
  "row_count": 12,
  "generation": {
    "confidence": 0.98,
    "explanation": "Simple filter on active status with order by AUM",
    "tables_used": ["investment.portfolios"]
  }
}

Context-Aware Aggregation

Agent: "What are the holdings by sector?"
(With selected portfolio: "Tech500")

Returns:
{
  "success": true,
  "summary": "Sector breakdown for Tech500: Technology 45% ($1.2B), Healthcare 20% ($530M), Financials 18% ($475M), Other 17% ($450M)",
  "sql_query": "SELECT sector, COUNT(*) as count, SUM(market_value) as total_value FROM investment.holdings WHERE portfolio_id = 'Tech500' GROUP BY sector ORDER BY total_value DESC",
  "row_count": 4,
  "context_aware": {
    "context_used": true,
    "portfolio_filter": "Tech500",
    "focus_mode": "dashboard"
  }
}

Date-Filtered Analysis

Agent: "Show me today's validation failures"
(With selected date: "2024-10-29")

Returns:
{
  "success": true,
  "summary": "3 validation rules failed today affecting 2 portfolios: Portfolio X (rule 7), Portfolio Y (rules 2, 4)",
  "sql_query": "SELECT portfolio_id, rule_id, error_message FROM validation.failures WHERE check_date = '2024-10-29' ORDER BY portfolio_id",
  "row_count": 3,
  "context_aware": {
    "context_used": true,
    "date_filter": "2024-10-29",
    "focus_mode": "validation"
  }
}

Error Handling

Agent: "Show me all portfolios with bad syntax"

Returns:
{
  "success": false,
  "error": "Generated SQL failed validation",
  "issues": ["SQL syntax error: unexpected keyword 'bad'"],
  "warnings": ["Query may not return expected results"],
  "question": "Show me all portfolios with bad syntax"
}

Workspace Context Features

When workspace context is available, the tool automatically:
  • Filters by selected_portfolio_id if a portfolio is selected
  • Applies single date filter if selected_date is set
  • Applies date range if selected_date_range contains start/end dates
  • Focuses schema hints based on focus_mode:
    • validation: Emphasizes validation schema tables
    • dashboard: Emphasizes analytics and performance metrics
    • visual_query_builder: Suggests tables for visual query construction

Safety & Validation

All generated SQL is validated before execution:
  • Blocks destructive operations (DROP, DELETE, TRUNCATE, ALTER)
  • Validates SQL syntax
  • Enforces row limits to prevent runaway queries
  • Records all queries in history for audit trail
  • Returns safety score (0.0-1.0) in response

get_query_history

Retrieve recent natural language queries with pattern analysis and suggestions. Type: Tool Required: No Since: v1.0 Status: Production

Description

This tool retrieves the history of natural language queries executed in the workspace. It analyzes successful queries to extract common patterns and generates suggestions for similar queries. Useful for learning what queries have worked well and improving query generation through examples.

Parameters

limit (integer, optional) Number of recent queries to retrieve. Default: 20, Range: 1-100 user_id (string, optional) Filter history by specific user ID. If not provided, returns all users’ queries in the workspace. Default: null

Returns

Dictionary with the following structure:
{
  "success": boolean,
  "total_queries": number,
  "successful_count": number,
  "failed_count": number,
  "recent_queries": [
    {
      "question": "Natural language question",
      "sql_query": "SELECT ...",
      "success": boolean,
      "row_count": number,
      "error_message": "Error description if failed",
      "created_at": "ISO timestamp"
    }
  ],
  "common_patterns": [
    {
      "pattern": "Aggregation|Multi-table Join|Time-filtered Query|Top N Query|Simple Select",
      "count": number,
      "examples": [
        {
          "question": "Example question",
          "sql": "Generated SQL"
        }
      ]
    }
  ],
  "suggestions": [
    "Try: 'Show me all active portfolios ordered by AUM'",
    "Try: 'What validation rules failed today?'"
  ]
}

Throws

  • Database Error: Failed to retrieve history from database
  • Invalid User ID: User ID format invalid or user not found

Pattern Recognition

The tool automatically identifies five query pattern types:
  1. Simple Select: Basic SELECT queries without complex clauses
  2. Aggregation: Queries with GROUP BY and aggregate functions
  3. Multi-table Join: Queries joining multiple tables
  4. Time-filtered Query: Queries filtering by date/timestamp
  5. Top N Query: Queries with ORDER BY and LIMIT

Examples

Retrieve All Recent Queries

Agent: "Show me recent queries"

Returns:
{
  "success": true,
  "total_queries": 15,
  "successful_count": 13,
  "failed_count": 2,
  "recent_queries": [
    {
      "question": "Show me all active portfolios",
      "sql_query": "SELECT id, name, aum FROM investment.portfolios WHERE status = 'active'",
      "success": true,
      "row_count": 12,
      "created_at": "2024-10-29T14:30:00Z"
    },
    {
      "question": "What is the VaR by portfolio?",
      "sql_query": "SELECT portfolio_id, SUM(value_at_risk) as total_var FROM risk.calculations GROUP BY portfolio_id",
      "success": true,
      "row_count": 8,
      "created_at": "2024-10-29T14:15:00Z"
    }
  ],
  "common_patterns": [
    {
      "pattern": "Aggregation",
      "count": 5,
      "examples": [
        {
          "question": "Show me holdings by sector",
          "sql": "SELECT sector, COUNT(*) FROM investment.holdings GROUP BY sector"
        }
      ]
    },
    {
      "pattern": "Time-filtered Query",
      "count": 4,
      "examples": [
        {
          "question": "What happened today?",
          "sql": "SELECT * FROM events WHERE date = CURRENT_DATE"
        }
      ]
    }
  ],
  "suggestions": [
    "Try: 'Show me all active portfolios ordered by AUM'",
    "Try: 'What validation rules failed today?'",
    "Try: 'Which portfolios have the highest VaR?'"
  ]
}

User-Specific Query History

Agent: "Get john.smith's recent queries"

Returns:
{
  "success": true,
  "total_queries": 8,
  "successful_count": 7,
  "failed_count": 1,
  "recent_queries": [ ... user's queries ... ]
}

Learning & Suggestions

The tool improves over time by:
  • Tracking successful queries as examples
  • Identifying common query patterns
  • Extracting schema hints from usage
  • Suggesting effective queries based on historical success
  • Recording failed queries for debugging

Integration with Agent Framework

Both tools integrate seamlessly with the agent framework:
from app.agent.tools.nl_query import get_nl_query_tools

# Register tools with agent
tools = get_nl_query_tools()  # Returns [nl_query, get_query_history]

Best Practices

  1. Use Specific Questions: “Show me validation failures for portfolio X today” works better than “What’s wrong?”
  2. Leverage Workspace Context: Select a portfolio/date before querying for automatic filtering
  3. Check Query History: Review past successful queries to learn effective patterns
  4. Enable SQL Transparency: Set include_sql=true when learning SQL patterns
  5. Monitor Safety Scores: Review validation warnings for unexpected query generation

Performance Considerations

  • Database schema is cached to avoid repeated loading
  • Query results automatically limited to max_results (default 100)
  • Queries added to history asynchronously (failures don’t block)
  • Workspace context filtering reduces result set sizes
  • Pattern extraction limited to 20 most recent queries by default