Metadata-Version: 2.4
Name: metricguard
Version: 1.0.1
Summary: Enterprise-grade Shift-Left FinOps static analysis — detects metric cardinality explosion in Python, Go, TypeScript, and Java before it reaches production.
Author-email: Yossi Cohen <Yossi85291@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/Yossi-Cohen19/MetricGuard
Project-URL: Documentation, https://github.com/Yossi-Cohen19/MetricGuard#readme
Project-URL: Repository, https://github.com/Yossi-Cohen19/MetricGuard
Project-URL: Bug Tracker, https://github.com/Yossi-Cohen19/MetricGuard/issues
Keywords: finops,prometheus,observability,cardinality,static-analysis,devops,semgrep,opentelemetry,datadog
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: System :: Monitoring
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: pydantic<3.0.0,>=2.5.0
Requires-Dist: PyYAML<7.0.0,>=6.0.1
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: mypy>=1.7.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Provides-Extra: semgrep
Requires-Dist: semgrep>=1.60.0; extra == "semgrep"

# MetricGuard 🛡️ v1.0.0

> **Enterprise-grade Shift-Left FinOps** — Detect cardinality explosion in Python, Go, TypeScript, and Java *before it reaches production*.

[![Version](https://img.shields.io/badge/version-1.0.0-blue)](pyproject.toml)
[![PyPI version](https://img.shields.io/pypi/v/metricguard.svg?color=blue)](https://pypi.org/project/metricguard/)
[![GitHub Action](https://img.shields.io/badge/GitHub%20Action-v1-blue?logo=github-actions)](action.yml)
[![Python](https://img.shields.io/badge/Python-3.11%2B-blue?logo=python)](pyproject.toml)
[![License: MIT](https://img.shields.io/badge/License-MIT-green)](LICENSE)


> 🏆 **Battle-Tested:** Successfully scanned **100,000+ files** across `Kubernetes`, `Sentry`, `ArgoCD`, `Airflow`, `Loki`, `Terraform`, `Titus-Executor` and `Grafana` monorepos with zero OOM crashes. Caught zero-day cardinality leaks in official `Prometheus` configs.

📦 **Available on PyPI:** [pypi.org/project/metricguard](https://pypi.org/project/metricguard/)

---

## What is Cardinality Explosion?

Every unique combination of metric label values creates a new **time series** in Prometheus, or a new **custom metric** in Datadog. When developers use high-cardinality values like `user_id`, `request_id`, or UUIDs as label values, the number of time series grows unbounded:

```python
# 🔴 DANGER: Creates 1 series per user — millions at scale
REQUEST_COUNTER.labels(user_id=request.user_id).inc()

# ✅ SAFE: Bounded by a known Enum — always exactly 3 series
REQUEST_COUNTER.labels(status=Status.OK).inc()
```
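
To make the growth concrete: the worst-case series count for a single metric is the product of the number of distinct values each label can take. The sketch below uses made-up label counts (assumptions for illustration, not measurements) to show how one unbounded label dominates the total.

```python
from math import prod


def max_series(label_cardinalities: dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of per-label value counts."""
    return prod(label_cardinalities.values())


# Hypothetical label sets; the counts are illustrative assumptions.
bounded = {"status": 3, "method": 5, "region": 4}
unbounded = {**bounded, "user_id": 100_000}

print(max_series(bounded))    # 60
print(max_series(unbounded))  # 6000000
```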

**Impact:**
- 💸 Monitoring costs spike 10x–1000x
- 🐌 Prometheus queries become unusably slow
- 🔥 OOM crashes, missed alerts, degraded reliability

MetricGuard catches these issues **at commit time**, not at 3am when your monitoring system falls over.

<p align="center">
<img width="1289" height="842" alt="Image" src="https://github.com/user-attachments/assets/3c1a55b9-a41f-456d-92fc-25d4b84dc5ed" /></p>

---

## Quick Start

### As a GitHub Action (Recommended)

```yaml
# .github/workflows/metricguard.yml
name: MetricGuard — Cardinality Scan

on: [pull_request]

jobs:
  metricguard:
    runs-on: ubuntu-latest
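    permissions:
      contents: read          # Needed by actions/checkout once permissions are restricted
      security-events: write  # Upload SARIF to GitHub Code Scanning

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0      # Full history, so the diff-base commit exists in the clone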

      - name: MetricGuard Scan
        uses: Yossi-Cohen19/MetricGuard@main
        with:
          scan-path: ./src
          diff-base: ${{ github.event.pull_request.base.sha }}
          format: sarif
          output-file: results.sarif
          fail-on: high

      - name: Upload to GitHub Code Scanning
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif
        if: always()
```

### Local CLI

```bash
# Install globally via pipx (Recommended)
pipx install metricguard

# OR install via pip (in a virtual environment)
pip install metricguard

# Optional: Add multi-language Semgrep support (Go, TypeScript, Java)
pipx install semgrep

# Basic Scan (Human-readable terminal output)
metricguard scan ./src

# ⚡ Enterprise Monorepo Scan (Max parallelism, JSON report)
metricguard scan ./src --workers 8 --format json --output results.json

# CI/CD: Export SARIF report & Fail only on CRITICAL findings
metricguard scan ./infra --format sarif --output results.sarif --fail-on critical

# Developer Workflow: Scan only Git-changed files (Ultra-fast)
metricguard scan ./src --diff origin/main

# FinOps Workflow: Suppress pre-existing debt, alert only on new leaks
metricguard scan ./src --create-baseline --baseline baseline.json
metricguard scan ./src --baseline baseline.json
```

---

## Features

| Feature | Description |
|---------|-------------|
| 🐍 **Python AST Scanner** | Detects dangerous label values in Prometheus, Datadog, OTel API calls |
| 🔍 **Scope-Aware Taint Analysis** | Tracks variables back to high-cardinality origins, per-function scope isolation |
| 📐 **Bounded Value Heuristics** | Enum members, booleans, and constants are automatically safe |
| 🌐 **Multi-Language (Go/TS/Java)** | Semgrep-backed scanner with custom rules for 95% of backend stacks |
| 📄 **YAML / TOML Infrastructure** | Detects dangerous `relabel_configs`, OTel processors, Telegraf global tags |
| 🏃 **Monorepo Safety** | Black-hole dir skip, 1 MiB size gate, symlink protection, lockfile/asset filtering |
| 🎯 **Smart Short-Keyword Regex** | CamelCase + snake_case-aware patterns — `clientIp`, `client_ip` and `ip` all caught |
| 🔕 **Inline Escape Hatch** | `# metricguard:ignore` on a line silences a specific finding |
| 📋 **Baseline Mechanism** | `--create-baseline` snapshots debt; future runs report only *new* issues |
| 📊 **SARIF Output** | First-class GitHub Code Scanning / Advanced Security integration |
| ⚡ **Parallel Scanning** | `ProcessPoolExecutor`-based for fast monorepo analysis |
| 🔀 **Differential Scanning** | Only scan Git-changed files with `--diff` |
| 💡 **Auto-Remediation** | Suggested fixes with code examples for every finding |
| 💰 **Economic Impact** | Estimated series increase and monthly cost per finding |

---

## Monorepo Safety

MetricGuard is hardened for massive monorepos with a **layered file discovery** strategy:

### Layer 1 — Directory Pruning (fastest)

Entire sub-trees are pruned in O(1) via `os.walk` in-place modification. MetricGuard **never stats a single file** inside these directories:

| Category | Directories |
|---|---|
| Hidden | Any directory starting with `.` (`.git`, `.github`, `.vscode`, `.idea`, `.mypy_cache`, …) |
| Test / mock | `tests`, `test`, `testdata`, `testing`, `spec`, `specs`, `__tests__`, `__mocks__`, `fixtures`, `mock`, `mocks`, `e2e`, `integration` |
| Dependencies | `node_modules`, `vendor`, `venv`, `env` |
| Build artifacts | `build`, `dist`, `out`, `target`, `bin`, `obj`, `__pycache__` |
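
The pruning technique relies on a documented property of `os.walk`: in top-down mode, removing names from the `dirnames` list in place stops the walk from descending into them. The following is a minimal sketch of that idea, not MetricGuard's actual code; `PRUNED_DIRS` and `walk_pruned` are hypothetical names, and the set holds only a subset of the directories in the table.

```python
import os
from collections.abc import Iterator

# Illustrative subset of the directories listed above.
PRUNED_DIRS = {"node_modules", "vendor", "venv", "env", "build", "dist", "tests", "__pycache__"}


def walk_pruned(root: str) -> Iterator[str]:
    """Yield file paths under root without descending into pruned or hidden directories."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # Slice assignment mutates the list that os.walk itself holds, so the
        # removed sub-trees are never visited. Rebinding `dirnames = [...]`
        # would have no effect on the traversal.
        dirnames[:] = [d for d in dirnames if d not in PRUNED_DIRS and not d.startswith(".")]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Because the filter runs on directory names only, the cost of skipping a directory does not depend on how many files it contains.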

### Layer 2 — Filename Pattern Filtering

Co-located test files and non-code assets are filtered out by `fnmatch` before path resolution:

| Category | Patterns |
|---|---|
| Co-located tests | `test_*.py`, `*_test.py`, `*_test.go`, `*.spec.ts`, `*.test.ts`, `*.spec.js`, `*.test.js`, … |
| Lockfiles | `*lock.json`, `*.lock`, `go.sum` |
| Non-code assets | `*.md`, `*.csv`, `*.txt`, `*.png`, `*.jpg`, `LICENSE`, `LICENSE.*` |
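
A filename filter of this kind can be sketched with the standard-library `fnmatch` module. The names `IGNORED_PATTERNS` and `is_ignored` are hypothetical, and the tuple is only a subset of the patterns in the table.

```python
from fnmatch import fnmatch

# Illustrative subset of the patterns in the table above.
IGNORED_PATTERNS = (
    "test_*.py", "*_test.py", "*_test.go", "*.spec.ts", "*.test.ts",
    "*lock.json", "*.lock", "go.sum", "*.md", "LICENSE", "LICENSE.*",
)


def is_ignored(filename: str) -> bool:
    """True if the bare filename (no directory part) matches any ignore pattern."""
    return any(fnmatch(filename, pattern) for pattern in IGNORED_PATTERNS)
```

With these patterns, `is_ignored("package-lock.json")` and `is_ignored("handler_test.go")` are true, while `is_ignored("metrics.py")` is false. Matching on the bare filename keeps the check cheap, since no path needs to be resolved first.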

### Layer 3 — Additional Safety Gates

- **Symlink Protection** — `os.walk(followlinks=False)` prevents traversal into circular directory structures or external filesystems.
- **1 MiB Size Gate** — Files larger than 1,048,576 bytes are silently skipped. Auto-generated proto bundles and minified assets never reach the AST parser.
- **YAML Prometheus Heuristic** — `.yml`/`.yaml` files are parsed *only if* they contain at least one Prometheus or OTel indicator string (`relabel_configs`, `scrape_configs`, `target_label`, `receivers:`, `processors:`, etc.). Docker Compose, GitHub Actions, Helm values and other non-monitoring YAMLs are skipped in O(n) without YAML parsing.
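
The size gate and the YAML indicator check can be combined into one cheap pre-filter that runs before any real parsing. This is a sketch under stated assumptions: `should_parse`, `MAX_FILE_BYTES`, and `YAML_INDICATORS` are hypothetical names, and the indicator tuple contains only the strings quoted above.

```python
import os

MAX_FILE_BYTES = 1_048_576  # 1 MiB, as stated above
# Only the indicator strings quoted in the text; the real list is longer.
YAML_INDICATORS = ("relabel_configs", "scrape_configs", "target_label", "receivers:", "processors:")


def should_parse(path: str) -> bool:
    """Apply the size gate, then the YAML indicator check, before any parsing."""
    if os.path.getsize(path) > MAX_FILE_BYTES:
        return False
    if path.endswith((".yml", ".yaml")):
        with open(path, encoding="utf-8", errors="replace") as fh:
            text = fh.read()  # at most 1 MiB because of the gate above
        # Plain substring search: linear in file size, no YAML parser involved.
        return any(indicator in text for indicator in YAML_INDICATORS)
    return True
```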

---

## Smart Short-Keyword Detection

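Naive matching handles short keywords like `ip`, `id`, and `port` badly: a plain substring search flags unrelated names such as `scripted`, while a `\b` word-boundary regex misses `client_ip` and `clientIp`, because underscores and letters both count as word characters. MetricGuard uses a **3-arm pattern** per short keyword: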

| Arm | Example match | Example non-match |
|---|---|---|
| Exact: `^ip$` | `ip` | — |
| Snake suffix: `_ip$` | `client_ip` | `ipAddress` |
| CamelCase suffix: `Ip$` | `clientIp` | `scripted` |

Short keywords covered: `id`, `ip`, `mac`, `url`, `uri`, `port`, `host`, `hash`, `sku`.
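
The three arms can be generated mechanically from each keyword. The sketch below is one way to do it, assuming the arms are exactly those in the table; `three_arm_pattern` and `is_dangerous_name` are hypothetical names, not MetricGuard's API.

```python
import re

SHORT_KEYWORDS = ("id", "ip", "mac", "url", "uri", "port", "host", "hash", "sku")


def three_arm_pattern(keyword: str) -> re.Pattern[str]:
    """Build exact | snake_case-suffix | CamelCase-suffix arms for one short keyword."""
    exact = f"^{keyword}$"               # ip
    snake = f"_{keyword}$"               # client_ip
    camel = f"{keyword.capitalize()}$"   # clientIp
    return re.compile(f"(?:{exact}|{snake}|{camel})")


PATTERNS = [three_arm_pattern(k) for k in SHORT_KEYWORDS]


def is_dangerous_name(name: str) -> bool:
    """True if any short keyword matches the name through one of its three arms."""
    return any(p.search(name) for p in PATTERNS)
```

The matching is deliberately case-sensitive: the capital letter in the CamelCase arm is what separates `clientIp` from an incidental lowercase `ip` inside `scripted`.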

---

## Inline Escape Hatch

Developers can suppress a specific finding without changing configuration files by adding an inline comment on the flagged line or on the line immediately above it:

```python
# Option 1: suppress on the same line
some_metric.labels(user_id=uid).inc()  # metricguard:ignore

# Option 2: suppress on the preceding line
# metricguard:disable-line
some_metric.labels(user_id=uid).inc()

# Option 3: general disable (also accepted)
some_metric.labels(user_id=uid).inc()  # metricguard:disable
```

> [!NOTE]
> The escape hatch applies to both the primary label-value scanner pass and the Cloud SDK second pass.
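
A suppression check along these lines can be sketched as a regex over the flagged line and the line above it. This is an illustration, not MetricGuard's implementation: `SUPPRESS_RE` and `is_suppressed` are hypothetical names, the directive spellings are taken from the three options shown above, and the rule that the preceding line must be a comment-only line is an assumption of this sketch.

```python
import re

# Directive spellings taken from the three options shown above.
SUPPRESS_RE = re.compile(r"#\s*metricguard:(?:ignore|disable-line|disable)\b")


def is_suppressed(source_lines: list[str], lineno: int) -> bool:
    """Check the flagged line (1-indexed) and the line immediately above it."""
    if SUPPRESS_RE.search(source_lines[lineno - 1]):
        return True
    if lineno >= 2:
        previous = source_lines[lineno - 2].strip()
        # Assumption: only a comment-only line above counts. A directive that
        # trails code on the previous line belongs to that line's own finding.
        return previous.startswith("#") and bool(SUPPRESS_RE.search(previous))
    return False
```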

---

## Architecture

```
metricguard/
├── core/
│   ├── scanner.py          # Abstract BaseScanner (Strategy Pattern)
│   └── config.py           # MetricGuardConfig (Pydantic + YAML)
├── models/
│   └── finding.py          # Finding, Severity, EconomicImpact (Pydantic)
├── scanners/
│   ├── python_scanner.py   # Python AST + Scope-Aware Taint + Escape Hatch
│   ├── yaml_scanner.py     # Prometheus/OTel YAML + Indicator Heuristic
│   ├── aws_scanner.py      # CloudWatch Agent JSON + boto3 Python
│   ├── telegraf_scanner.py # Telegraf TOML + StatsD Python
│   └── semgrep_scanner.py  # Multi-language: Go, TypeScript, Java
├── rules/                  # Semgrep rule YAML files
│   ├── go_metrics.yml
│   ├── typescript_metrics.yml
│   └── java_metrics.yml
├── engines/
│   ├── impact_calculator.py   # Economic impact enrichment
│   └── remediation_engine.py  # Code fix suggestions
├── reporters/
│   └── sarif_exporter.py   # SARIF 2.1.0 exporter
├── utils/
│   └── git_diff.py         # Differential scanning utility
└── cli.py                  # CLI entrypoint + 3-layer file discovery
```

**Analysis Pipeline:**
```
Files → [3-Layer Triage] → [Parallel Workers] → [Findings] → [Enrich] → [Reporter]
         Dir pruning          ProcessPoolExecutor   List[Finding]  Impact    SARIF/JSON/Table
         File patterns                                             Remediation
         Size gate
         YAML heuristic
```

---

## What Gets Detected

### Python Code (MG-PY-*)

| Rule | Description | Severity |
|------|-------------|----------|
| `MG-PY-001` | Dynamic expression (attribute/call/subscript) as label value | HIGH |
| `MG-PY-002` | F-string as label value | HIGH |
| `MG-PY-003` | Tainted variable (traces to dynamic source) | HIGH |
| `MG-PY-004` | Variable name matches dangerous keyword (user_id, clientIp, etc.) | CRITICAL |

### Infrastructure YAML (MG-YAML-*)

| Rule | Description | Severity |
|------|-------------|----------|
| `MG-YAML-001` | Prometheus `target_label` matches dangerous keyword | CRITICAL |
| `MG-YAML-002` | Prometheus `source_labels` contains high-cardinality metadata | HIGH |
| `MG-YAML-003` | Prometheus `labelmap` with wildcard regex | HIGH |
| `MG-YAML-004` | OTel attributes processor with dangerous key | HIGH |
| `MG-YAML-005` | OTel spanmetrics dimension is unbounded | CRITICAL |
| `MG-YAML-006` | Generic YAML label key matches dangerous keyword | MEDIUM |
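
As a rough illustration of the first and third checks, the sketch below inspects a Prometheus config that has already been loaded into a dictionary (for example with `yaml.safe_load`). The function name, the key set, and the list of regexes treated as wildcards are all assumptions made for this example, not MetricGuard's actual rules.

```python
# Hypothetical key set and wildcard list, chosen for illustration only.
DANGEROUS_LABEL_KEYS = {"user_id", "session_id"}
WILDCARD_REGEXES = {".*", ".+", "(.*)", "(.+)"}


def check_relabel_configs(config: dict) -> list[str]:
    """Return human-readable problems found in scrape_configs[*].relabel_configs."""
    problems = []
    for job in config.get("scrape_configs") or []:
        job_name = job.get("job_name", "?")
        for rule in job.get("relabel_configs") or []:
            target = rule.get("target_label")
            if target in DANGEROUS_LABEL_KEYS:
                problems.append(f"{job_name}: target_label '{target}' is high-cardinality")
            if rule.get("action") == "labelmap" and rule.get("regex") in WILDCARD_REGEXES:
                problems.append(f"{job_name}: labelmap with a wildcard regex copies every label")
    return problems
```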

### Go (MG-GO-*)

| Rule | Description | Severity |
|------|-------------|----------|
| `MG-GO-001` | Dynamic value in `prometheus.Labels{}` map | HIGH |
| `MG-GO-002` | Dangerous keyword as Prometheus label key | CRITICAL |
| `MG-GO-003` | Dynamic OTel attribute value in Go SDK call | HIGH |

### TypeScript / JavaScript (MG-TS-*)

| Rule | Description | Severity |
|------|-------------|----------|
| `MG-TS-001` | Dynamic value in `prom-client` `.labels()` call | HIGH |
| `MG-TS-002` | Dynamic OTel attribute value in `@opentelemetry/api` span | HIGH |
| `MG-TS-003` | Dynamic tag in `dd-trace` span metadata | HIGH |

### Java (MG-JAVA-*)

| Rule | Description | Severity |
|------|-------------|----------|
| `MG-JAVA-001` | Dynamic tag in Micrometer `Counter.builder().tag()` | HIGH |
| `MG-JAVA-002` | Dangerous keyword as Micrometer tag key | CRITICAL |
| `MG-JAVA-003` | Dynamic OTel attribute value in Java SDK call | HIGH |

---

## Configuration (`metricguard.yml`)

```yaml
version: "1"
fail_on_severity: high          # Fail CI on HIGH+ findings

python:
  enable_taint_analysis: true
  extend_dangerous_keywords:    # ADD to defaults, not replace
    - account_id
    - device_id
  extend_monitored_functions:   # Add custom metric wrappers
    - emit_business_metric

yaml:
  dangerous_label_keys:
    - user_id
    - session_id

# Extend the default ignore lists (ADD to defaults, not replace)
extend_ignored_dirs:
  - generated          # any extra directories to skip
extend_ignored_file_patterns:
  - "*.proto"          # any extra filename patterns to skip

ignored_paths:
  - "**/legacy/**"
  - "**/tests/fixtures/**"
```

> [!TIP]
> Use `extend_ignored_dirs` and `extend_ignored_file_patterns` to add to the built-in black-hole lists without losing the defaults.

---

## Development

```bash
# Install with dev extras
pip install -e ".[dev]"

# Install with Semgrep multi-language support
pip install -e ".[dev,semgrep]"

# Run tests
pytest tests/ -v

# Type checking
mypy metricguard/

# Linting
ruff check metricguard/
```

---

## Author

**Yossi Cohen**
- 🐙 GitHub: [github.com/Yossi-Cohen19](https://github.com/Yossi-Cohen19/)
- 💼 LinkedIn: [linkedin.com/in/yossi-cohen-b302b0372](https://www.linkedin.com/in/yossi-cohen-b302b0372)
- 📧 Email: Yossi85291@gmail.com

---

## License

MIT — see [LICENSE](LICENSE).
