# Monitoring

> View logs and metrics, and track agent performance

## Overview

Monitor your agents through the dashboard, CLI, or API to track performance, debug issues, and optimize execution.

## Viewing Logs

### Via Dashboard

<Steps>
  <Step title="Go to Runs">
    Navigate to **Agents** → **my\_agent** → **Runs**
  </Step>

  <Step title="Select a run">
    Click on any run to view details
  </Step>

  <Step title="View logs">
    See execution timeline, logs, and results
  </Step>
</Steps>

**What you'll see**:

* Real-time execution logs
* Action calls and results
* Memory operations
* Integration API calls
* Errors and warnings
* Final result
* Execution time

### Via CLI

```bash theme={null}
# Stream logs for a specific run
ziet logs run_abc123

# Follow logs in real time
ziet logs run_abc123 --follow

# View recent logs for an agent
ziet logs my_agent --limit 100

# Filter by level
ziet logs my_agent --level error
ziet logs my_agent --level warning

# Filter by time (relative or absolute date)
ziet logs my_agent --since "1 hour ago"
ziet logs my_agent --since "2024-01-15"
```

### Via API

```bash theme={null}
curl https://api.ziet.ai/runs/run_abc123/logs \
  -H "Authorization: Bearer YOUR_API_KEY"
```

**Response**:

```json theme={null}
{
  "logs": [
    {
      "timestamp": "2024-01-15T10:30:00.123Z",
      "level": "info",
      "message": "Starting agent execution"
    },
    {
      "timestamp": "2024-01-15T10:30:01.456Z",
      "level": "info",
      "message": "Calling action: search_flights"
    },
    {
      "timestamp": "2024-01-15T10:30:05.789Z",
      "level": "info",
      "message": "Memory added: key=search_results"
    }
  ]
}
```
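
Once fetched, a response of this shape is easy to post-process. The following is a minimal sketch that works on the sample payload above. The helper names (`parse_timestamp`, `filter_logs`, `elapsed_seconds`) are illustrative and not part of any Ziet SDK, and the sketch assumes every entry carries the `timestamp`, `level`, and `message` fields shown.

```python
import json
from datetime import datetime

# Sample payload copied from the response above.
SAMPLE_RESPONSE = """
{
  "logs": [
    {"timestamp": "2024-01-15T10:30:00.123Z", "level": "info",
     "message": "Starting agent execution"},
    {"timestamp": "2024-01-15T10:30:01.456Z", "level": "info",
     "message": "Calling action: search_flights"},
    {"timestamp": "2024-01-15T10:30:05.789Z", "level": "info",
     "message": "Memory added: key=search_results"}
  ]
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing "Z" (UTC)."""
    # Older Python versions do not accept "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_logs(response: dict, level: str) -> list:
    """Return only the log entries with the given level."""
    return [entry for entry in response["logs"] if entry["level"] == level]


def elapsed_seconds(response: dict) -> float:
    """Seconds between the earliest and latest log entry."""
    times = [parse_timestamp(entry["timestamp"]) for entry in response["logs"]]
    return (max(times) - min(times)).total_seconds()


response = json.loads(SAMPLE_RESPONSE)
info_logs = filter_logs(response, "info")
error_logs = filter_logs(response, "error")

print(f"{len(info_logs)} info entries, {len(error_logs)} error entries")
print(f"Log span: {elapsed_seconds(response):.3f}s")
```

Filtering client-side like this is mainly useful when you already hold the full response. For routine filtering, the CLI `--level` flag shown earlier is simpler.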

## Metrics

### Agent Performance

View metrics in the dashboard for any agent:

**Execution Metrics**:

* Average execution time
* P50, P95, P99 latency
* Success rate
* Error rate

**Resource Metrics**:

* Action calls per run
* Memory operations (reads/writes)
* Integration API calls
* Memory usage

**Volume Metrics**:

* Total runs
* Runs per day/hour
* Active users
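
To make the execution metrics concrete, here is a rough sketch of how such figures can be derived from a list of run records. The record fields (`status`, `duration_s`) and the sample data are hypothetical. The sketch also assumes that every run not marked `completed` counts toward the error rate and uses the nearest-rank percentile method, neither of which is confirmed as the dashboard's own definition.

```python
import math

# Hypothetical run records: nine successful runs and one failure.
runs = [{"status": "completed", "duration_s": float(d)} for d in range(1, 10)]
runs.append({"status": "failed", "duration_s": 10.0})


def percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n)."""
    if not values:
        raise ValueError("percentile() requires at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records):
    """Compute success/error rates and latency statistics for the records."""
    total = len(records)
    durations = [r["duration_s"] for r in records]
    succeeded = sum(1 for r in records if r["status"] == "completed")
    return {
        "total_runs": total,
        "success_rate": succeeded / total,
        # Assumption: anything other than "completed" counts as an error.
        "error_rate": (total - succeeded) / total,
        "avg_s": sum(durations) / total,
        "p50_s": percentile(durations, 50),
        "p95_s": percentile(durations, 95),
        "p99_s": percentile(durations, 99),
    }


summary = summarize(runs)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With small samples, P95 and P99 collapse onto the slowest run, so tail latencies only become informative once an agent has accumulated a reasonable number of runs.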

### Example Dashboard View

```
my_agent
├── Success Rate: 98.5%
├── Avg Execution Time: 12.3s
├── Total Runs: 1,234
├── Error Rate: 1.5%
└── Last 24h: 87 runs
```

## Run Status

### Check Run Status

<Tabs>
  <Tab title="CLI">
    ```bash theme={null}
    ziet status run_abc123
    ```

    Output:

    ```
    Run: run_abc123
    Agent: my_agent
    Status: completed
    Started: 2024-01-15 10:30:00
    Duration: 12.3s
    Result: {"flights_found": 42}
    ```
  </Tab>

  <Tab title="API">
    ```bash theme={null}
    curl https://api.ziet.ai/runs/run_abc123/status \
      -H "Authorization: Bearer YOUR_API_KEY"
    ```
  </Tab>

  <Tab title="Python SDK">
    ```python theme={null}
    from ziet import Client

    client = Client(api_key="your_key")
    status = client.get_run_status("run_abc123")

    print(status["status"])  # "running" | "completed" | "failed" | "cancelled" | "timeout"
    ```
  </Tab>
</Tabs>

### Run Statuses

* **`running`** - Currently executing
* **`completed`** - Finished successfully
* **`failed`** - Encountered an error
* **`cancelled`** - Manually cancelled
* **`timeout`** - Exceeded time limit
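
Every status except `running` is terminal, so a client can poll until one of them appears. The sketch below shows one way to do that. `wait_for_run` and its parameters are illustrative names rather than SDK functions, and the status source is passed in as a callable so the example runs without network access.

```python
import time
from typing import Callable

# Statuses from the list above after which a run no longer changes.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "timeout"}


def wait_for_run(
    fetch_status: Callable[[], str],
    interval_s: float = 2.0,
    max_wait_s: float = 300.0,
) -> str:
    """Poll fetch_status() until it returns a terminal status.

    Raises TimeoutError if the client-side deadline passes first. This is
    separate from the run's own "timeout" status, which is returned normally.
    """
    deadline = time.monotonic() + max_wait_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status!r} after {max_wait_s}s")
        time.sleep(interval_s)


# Usage with a stubbed status source standing in for a real API call.
fake_statuses = iter(["running", "running", "completed"])
final_status = wait_for_run(lambda: next(fake_statuses), interval_s=0.0)
print(final_status)  # completed
```

With the Python SDK call shown earlier, `fetch_status` could be `lambda: client.get_run_status("run_abc123")["status"]`. For long-running agents, webhooks avoid polling altogether.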

## Recent Runs

### List Runs

<Tabs>
  <Tab title="CLI">
    ```bash theme={null}
    # List recent runs
    ziet runs my_agent --limit 10

    # Filter by status
    ziet runs my_agent --status failed
    ziet runs my_agent --status completed

    # Filter by date
    ziet runs my_agent --since "1 day ago"
    ```
  </Tab>

  <Tab title="Python SDK">
    ```python theme={null}
    from ziet import Client

    client = Client(api_key="your_key")

    # List up to 10 completed runs
    runs = client.list_runs(
        agent_id="my_agent",
        limit=10,
        status="completed"
    )

    for run in runs:
        print(f"{run['run_id']}: {run['status']}")
    ```
  </Tab>

  <Tab title="Dashboard">
    Go to **Agents** → **my\_agent** → **Runs** to see a list of all runs
  </Tab>
</Tabs>

## Webhooks

Get notified when agent runs start, complete, fail, or are cancelled.

### Setup Webhook

<Steps>
  <Step title="Add webhook in dashboard">
    1. Go to **Settings** → **Webhooks**
    2. Click **Add Webhook**
    3. Enter webhook URL
    4. Select events: `run.started`, `run.completed`, `run.failed`, `run.cancelled`
    5. Click **Save**
  </Step>

  <Step title="Receive webhook">
    Your endpoint receives POST requests:

    ```json theme={null}
    {
      "event": "run.completed",
      "run_id": "run_abc123",
      "agent_id": "my_agent",
      "status": "completed",
      "result": {
        "flights_found": 42
      },
      "timestamp": "2024-01-15T10:30:00Z"
    }
    ```
  </Step>

  <Step title="Verify signature">
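    Before trusting a payload, compare the `X-Ziet-Signature` header against an HMAC-SHA256 hex digest of the raw request body:

    ```python theme={null}
    import hmac
    import hashlib

    def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
        # A missing header must be rejected; compare_digest raises on None
        if not signature:
            return False

        expected = hmac.new(
            secret.encode(),
            payload,  # raw request body as bytes, not the parsed JSON
            hashlib.sha256
        ).hexdigest()

        # Constant-time comparison avoids leaking timing information
        return hmac.compare_digest(signature, expected)
    ```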
  </Step>
</Steps>

### Webhook Events

* **`run.started`** - Agent run started
* **`run.completed`** - Agent run completed successfully
* **`run.failed`** - Agent run failed
* **`run.cancelled`** - Agent run was cancelled
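
With four event types, a dispatch table tends to stay more readable than a growing `if`/`elif` chain. This is a minimal sketch, assuming the payload shape shown in the setup steps. The handler functions and their return strings are placeholders for your own logic.

```python
from typing import Callable, Dict


def on_started(event: dict) -> str:
    return f"Run {event['run_id']} started"


def on_completed(event: dict) -> str:
    return f"Run {event['run_id']} completed with result {event.get('result')}"


def on_failed(event: dict) -> str:
    return f"Run {event['run_id']} failed"


def on_cancelled(event: dict) -> str:
    return f"Run {event['run_id']} was cancelled"


# One entry per event type listed above.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "run.started": on_started,
    "run.completed": on_completed,
    "run.failed": on_failed,
    "run.cancelled": on_cancelled,
}


def dispatch(event: dict) -> str:
    """Route a webhook payload to its handler; ignore unknown event types."""
    handler = HANDLERS.get(event.get("event"))
    if handler is None:
        return f"Ignored unknown event: {event.get('event')!r}"
    return handler(event)


payload = {
    "event": "run.completed",
    "run_id": "run_abc123",
    "agent_id": "my_agent",
    "status": "completed",
    "result": {"flights_found": 42},
    "timestamp": "2024-01-15T10:30:00Z",
}
message = dispatch(payload)
print(message)
```

Ignoring unknown events rather than raising keeps the endpoint working if new event types are introduced later.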

### Example Webhook Handler

```python theme={null}
from flask import Flask, request
import hmac
import hashlib

app = Flask(__name__)
WEBHOOK_SECRET = "your_webhook_secret"  # In production, load this from a secret store

@app.route('/webhook', methods=['POST'])
def handle_webhook():
    payload = request.data
    signature = request.headers.get('X-Ziet-Signature')
    
    # Verify signature
    if not verify_signature(payload, signature):
        return 'Invalid signature', 401
    
    data = request.json
    event = data['event']
    
    if event == 'run.completed':
        print(f"Run {data['run_id']} completed!")
        # Process result...
    
    elif event == 'run.failed':
        print(f"Run {data['run_id']} failed!")
        # Send alert...
    
    return 'OK', 200

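def verify_signature(payload, signature):
    # Reject requests without a signature header; compare_digest raises on None
    if not signature:
        return False

    expected = hmac.new(
        WEBHOOK_SECRET.encode(),
        payload,  # raw request body (bytes)
        hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(signature, expected)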
```
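
To exercise a handler like this locally, you can sign a test payload yourself. The sketch below assumes the scheme used in the handler above: a hex-encoded HMAC-SHA256 of the raw body, keyed with the webhook secret. The `sign` and `verify` helpers are illustrative names, not SDK functions.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign(payload: bytes, secret: str) -> str:
    """Hex-encoded HMAC-SHA256 of the raw payload bytes."""
    return hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, signature: Optional[str], secret: str) -> bool:
    """Return True only if the signature matches the payload."""
    if not signature:
        return False  # missing header
    return hmac.compare_digest(signature, sign(payload, secret))


secret = "your_webhook_secret"
body = json.dumps({"event": "run.completed", "run_id": "run_abc123"}).encode()
signature = sign(body, secret)

print(verify(body, signature, secret))         # True: untouched payload
print(verify(body + b" ", signature, secret))  # False: body was altered
print(verify(body, None, secret))              # False: header missing
```

The signature covers the exact bytes sent, so verify against the raw request body. Re-serializing the parsed JSON can change whitespace or key order and invalidate an otherwise valid signature.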

## Alerts

### Email Alerts

Get notified via email when agents fail:

1. Go to **Settings** → **Alerts**
2. Enable **Email Alerts**
3. Select events: `run.failed`, `run.timeout`
4. Add email addresses
5. Save

### Slack Alerts

Send notifications to Slack:

1. Go to **Settings** → **Integrations** → **Slack**
2. Connect your Slack workspace
3. Go to **Settings** → **Alerts**
4. Enable **Slack Alerts**
5. Select channel and events
6. Save

## Best Practices

<AccordionGroup>
  <Accordion title="Monitor error rates" icon="chart-line">
    Check your agent's error rate regularly:

    ```bash theme={null}
    # View failed runs
    ziet runs my_agent --status failed --limit 10

    # Check error logs
    ziet logs my_agent --level error
    ```
  </Accordion>

  <Accordion title="Set up webhooks for critical agents" icon="bell">
    Get notified immediately when important agents fail:

    * **Settings** → **Webhooks**
    * Subscribe to `run.failed` events
    * Integrate with your alerting system
  </Accordion>

  <Accordion title="Review performance metrics" icon="gauge">
    Check execution time trends:

    * Dashboard → **Agents** → **my\_agent** → **Metrics**
    * Look for sudden increases in execution time
    * Optimize slow actions
  </Accordion>

  <Accordion title="Use log levels appropriately" icon="layer-group">
    ```python theme={null}
    import logging

    logger = logging.getLogger(__name__)

    # Info for normal operation
    logger.info("Processing started")

    # Warning for concerning but non-fatal conditions
    logger.warning("Rate limit approaching")

    # Error for failures
    logger.error("API call failed")
    ```
  </Accordion>
</AccordionGroup>

## Debugging Failed Runs

### View Error Details

```bash theme={null}
# Get run details
ziet status run_abc123

# View full logs
ziet logs run_abc123

# View error logs only
ziet logs run_abc123 --level error
```

### Common Issues

<AccordionGroup>
  <Accordion title="Timeout errors">
    **Cause**: Agent exceeded time limit

    **Solution**: Increase timeout or optimize actions

    ```python theme={null}
    @Action(
        id="slow_action",
        name="Slow Action",
        description="...",
        timeout=300  # 5 minutes
    )
    def slow_action():
        ...
    ```
  </Accordion>

  <Accordion title="Integration errors">
    **Cause**: API call to integration failed

    **Solution**: Check logs for API errors, verify API keys

    ```bash theme={null}
    # Filter the run's logs for integration-related lines (case-insensitive)
    ziet logs run_abc123 | grep -i "integration"
    ```
  </Accordion>

  <Accordion title="Memory errors">
    **Cause**: Missing memory key

    **Solution**: Ensure actions store data before retrieval

    ```python theme={null}
    # Store first
    memory.add(key="data", value=result)

    # Then retrieve
    data = memory.get("data")
    ```
  </Accordion>
</AccordionGroup>
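
When many runs fail at once, grouping error messages by likely cause can speed up triage. The following is a rough sketch only: the keyword patterns and sample messages are hypothetical, since the exact wording of Ziet error logs is not specified here, so adjust them to match what your logs contain.

```python
import re
from collections import Counter

# Hypothetical keyword patterns for the three issue types described above.
ISSUE_PATTERNS = {
    "timeout": re.compile(r"time(d)?\s?out|time limit", re.IGNORECASE),
    "integration": re.compile(r"integration|api key|api call", re.IGNORECASE),
    "memory": re.compile(r"memory|missing key", re.IGNORECASE),
}


def classify(message: str) -> str:
    """Return the first issue type whose pattern matches, else "other"."""
    for issue, pattern in ISSUE_PATTERNS.items():
        if pattern.search(message):
            return issue
    return "other"


# Made-up error messages for illustration.
error_messages = [
    "Action slow_action exceeded time limit",
    "Integration API call failed: 401",
    "Memory key not found: data",
    "Unexpected error",
]

counts = Counter(classify(message) for message in error_messages)
for issue, count in counts.most_common():
    print(f"{issue}: {count}")
```

The messages could come from the `message` field of the logs API response or from the output of `ziet logs run_abc123 --level error`.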

## Next Steps

<CardGroup cols={2}>
  <Card title="Deploying" icon="rocket" href="/deployment/deploying">
    Deploy your agents
  </Card>

  <Card title="Invoking" icon="play" href="/deployment/invoking">
    Run your agents
  </Card>

  <Card title="API Reference" icon="code" href="/api-reference/introduction">
    Full API documentation
  </Card>

  <Card title="Memory" icon="database" href="/core/memory">
    Debug memory operations
  </Card>
</CardGroup>
