Analysis Tasks

KCPilot’s analysis tasks are AI-powered diagnostic checks that examine your Kafka cluster’s configuration, performance, and health. Each task is defined as a YAML file containing prompts and rules that guide the AI analysis engine to identify specific issues and provide remediation guidance.

Overview

Analysis tasks leverage Large Language Models (LLMs) to intelligently analyze collected cluster data. Unlike static rule-based checks, these tasks can understand context, identify patterns, and provide nuanced recommendations based on your specific cluster configuration.

Key Features

  • AI-Powered Analysis: Uses OpenAI or compatible LLMs to analyze complex cluster states
  • Configurable Severity: Automatically maps findings to critical, warning, or info levels
  • Data Filtering: Each task specifies which data types it needs (logs, configs, metrics, admin)
  • Actionable Remediation: Provides specific steps to resolve identified issues

How to Use Analysis Tasks

List Available Tasks

To see all available analysis tasks with their descriptions:

# List all tasks
kcpilot task list

# List with detailed information
kcpilot task list --detailed

Execute a Single Task

To run a specific analysis task on collected scan data:

# Test a single task
kcpilot task test <task-id> <snapshot-path>

# Example: Test JVM heap configuration
kcpilot task test jvm_heap_memory_ratio ./scan-2024-01-15

# With debug logging
RUST_LOG=kcpilot=debug kcpilot task test <task-id> <snapshot-path>

Run Full Analysis

To execute all analysis tasks on a snapshot:

# Analyze with terminal and markdown reports
kcpilot analyze <snapshot-path> --report terminal,markdown

# Example
kcpilot analyze ./scan-2024-01-15 --report terminal,markdown --output analysis-report.md

Creating Custom Tasks

Analysis tasks are YAML files stored in the analysis_tasks/ directory. Each task includes:

  1. Metadata: ID, name, description, and category
  2. Prompt: The analysis instruction sent to the LLM
  3. Data Selection: Which data types to include via include_data
  4. Severity Mapping: Keywords that determine finding severity levels

Task Template

id: your_task_id
name: Task Display Name
description: Brief description of what this task checks
category: configuration|performance|security

prompt: |
  Your analysis prompt here with placeholders:
  Configuration: {config}
  Logs: {logs}
  Metrics: {metrics}
  
  Analysis instructions...

include_data:
  - config
  - logs
  - metrics
  - admin

severity_keywords:
  critical:
    - "data loss"
    - "cluster down"
  warning:
    - "performance degraded"
    - "misconfiguration"
  info:
    - "recommendation"
    - "optimization"
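The severity mapping in the template amounts to a keyword scan over the LLM's findings. The following is an illustrative sketch only, not KCPilot's actual implementation: a hypothetical `classify_finding` shell function that checks critical keywords before warning ones and defaults to info.

```shell
# Illustrative sketch only -- not KCPilot's implementation. Maps a finding's
# text to a severity using the keywords from the template above, checking
# critical before warning and defaulting to info.
classify_finding() {
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$text" in
    *"data loss"*|*"cluster down"*)                echo "critical" ;;
    *"performance degraded"*|*"misconfiguration"*) echo "warning"  ;;
    *)                                             echo "info"     ;;
  esac
}
```

For example, `classify_finding "Replication lag risks data loss"` prints `critical`, because the critical keywords are matched before the others.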

Available Analysis Tasks

Below are all available analysis tasks, organized by category:

Configuration - General

Configuration - KRaft

Configuration - ZooKeeper

Environment Configuration

To use AI-powered analysis tasks, you need to configure your LLM API key:

# OpenAI API
export OPENAI_API_KEY=your_openai_api_key_here

# Alternative LLM API
export LLM_API_KEY=your_alternative_llm_api_key

# Enable debug logging
export LLM_DEBUG=true
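As a sanity check before running an analysis, you can verify that one of the key variables above is set. A minimal sketch (the variable names come from the exports above; the guard itself is not part of KCPilot):

```shell
# Sketch: report whether an LLM API key is configured, since analysis
# tasks fail without one. Checks the two variables documented above.
check_llm_key() {
  if [ -n "${OPENAI_API_KEY:-}" ] || [ -n "${LLM_API_KEY:-}" ]; then
    echo "ok"
  else
    echo "missing"
  fi
}
```

A wrapper script could call this and exit early with a clear message instead of letting every task fail mid-run.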

Troubleshooting

Common Issues

  1. No LLM API Key: Tasks will fail without a configured API key
  2. Timeout Issues: Increase the LLM timeout with --llm-timeout <seconds>
  3. Data Not Found: Ensure snapshot contains required data types
  4. Task Not Found: Verify task ID matches file name in analysis_tasks/
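For issue 4 above, a quick way to confirm a task ID has a matching file is to look for it in analysis_tasks/. This sketch assumes task files are named <task-id>.yaml, which is an assumption about the naming convention rather than documented behavior:

```shell
# Sketch: check whether a task file named <task-id>.yaml exists in the
# analysis_tasks/ directory (the .yaml extension is an assumption).
task_file_exists() {
  if [ -f "analysis_tasks/$1.yaml" ]; then
    echo "found"
  else
    echo "missing"
  fi
}
```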

Debug Mode

Enable detailed logging to troubleshoot task execution:

# Debug specific task
RUST_LOG=kcpilot=debug kcpilot task test <task-id> <snapshot>

# Debug LLM interactions
kcpilot analyze <snapshot> --llmdbg