Metadata-Version: 2.4
Name: langchain-konstruct
Version: 0.1.0
Summary: LangChain integration for Konstruct - Typed Knowledge Graph for AI Agents
Author-email: AerwareAI <hello@aerware.ai>
Maintainer-email: Aeryn White <aeryn@aerware.ai>
License: Proprietary
Project-URL: Homepage, https://konstruct.aerware.ai
Project-URL: Documentation, https://docs.aerware.ai/konstruct
Project-URL: Repository, https://github.com/aerwareai/konstruct
Project-URL: Bug Tracker, https://github.com/aerwareai/konstruct/issues
Project-URL: Changelog, https://github.com/aerwareai/konstruct/blob/main/CHANGELOG.md
Keywords: langchain,konstruct,knowledge-graph,ai,agents,llm,rag,typed-relations,causal-reasoning,advisory
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langchain-core>=0.1.0
Requires-Dist: requests>=2.31.0
Requires-Dist: pydantic>=1.10.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=23.7.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Requires-Dist: ruff>=0.0.287; extra == "dev"
Provides-Extra: examples
Requires-Dist: langchain-anthropic>=0.1.0; extra == "examples"
Requires-Dist: python-dotenv>=1.0.0; extra == "examples"
Dynamic: license-file

# Konstruct × LangChain Integration

LangChain retriever classes powered by Konstruct's typed knowledge graph engine.

## Features

- **19 Typed Relations** — `is_a`, `causes`, `requires`, `conflicts_with`, `enables`, `prevents`, and 13 more
- **Symmetry-Based Inference** — Automatic reverse edge generation (e.g., `causes` → `caused_by`)
- **Causal & Structural Reasoning** — Deterministic graph traversal, zero LLM dependency
- **Token-Budgeted Advisory** — 300-token structured guidance by default (key concepts + relationships + frameworks)
- **Framework Triggering** — Context-aware questions, red flags, and green lights
- **Sub-Millisecond Queries** — ~100μs concept retrieval, ~500μs advisory generation

## Installation

```bash
pip install langchain langchain-anthropic python-dotenv
```

Copy `konstruct_retriever.py` to your project:
```bash
cp /path/to/konstruct/integrations/langchain/konstruct_retriever.py .
```

## Quick Start

```python
from konstruct_retriever import KonstructRetriever
from langchain.chains import RetrievalQA
from langchain_anthropic import ChatAnthropic

# Initialize retriever
retriever = KonstructRetriever(
    api_key="your_konstruct_api_key",  # Optional, for unlimited API calls
    max_concepts=10,
    max_depth=2
)

# Load a knowledge pack
pack = {
    "id": "startup_fundamentals",
    "name": "Startup Fundamentals",
    "concepts": [
        {
            "id": "product_market_fit",
            "name": "Product-Market Fit",
            "category": "strategy",
            "importance": 0.95,
            "description": "When product satisfies strong market demand"
        },
        {
            "id": "customer_discovery",
            "name": "Customer Discovery",
            "category": "process",
            "importance": 0.9,
            "description": "Process of validating problem-solution fit"
        }
    ],
    "edges": [
        {
            "source": "customer_discovery",
            "target": "product_market_fit",
            "relation": "enables",
            "weight": 0.9
        }
    ]
}

retriever.load_pack(pack)

# Create QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
    chain_type="stuff",
    retriever=retriever
)

# Query
response = qa_chain.invoke({"query": "How do I validate my startup idea?"})
print(response["result"])
```

## API Reference

### KonstructRetriever

Base retriever class. It implements LangChain's `BaseRetriever` interface.

#### Constructor Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `mcp_url` | `str` | `https://konstruct-api.aerynw.workers.dev/mcp` | Konstruct MCP endpoint URL |
| `api_key` | `Optional[str]` | `None` | API key for authenticated access (bypasses rate limits) |
| `max_concepts` | `int` | `10` | Maximum concepts to return per query |
| `max_depth` | `int` | `2` | Edge traversal depth (1 or 2) |
| `max_advisory_tokens` | `int` | `300` | Token budget for advisory generation |
| `category_filter` | `Optional[str]` | `None` | Filter concepts by category |
| `stage_filter` | `Optional[str]` | `None` | Filter frameworks by stage (design, implementation, testing, etc.) |
| `search_type` | `str` | `"advisory"` | Return type: `advisory` or `concepts` |

#### Methods

**`_get_relevant_documents(query: str) -> List[Document]`**

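Retrieve relevant documents from the Konstruct knowledge graph. LangChain calls this method for you; use the public `get_relevant_documents` (or `invoke` on newer LangChain versions) instead of calling it directly.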

```python
# Via RetrievalQA chain
docs = qa_chain.retriever.get_relevant_documents("What causes technical debt?")

for doc in docs:
    print(doc.page_content)
    print(doc.metadata)
```

**`load_pack_from_file(pack_path: str) -> Dict[str, Any]`**

Load a knowledge pack from a JSON file.

```python
result = retriever.load_pack_from_file("packs/yc-fundamentals.json")
print(f"Loaded {result.get('concepts', 0)} concepts")
print(f"Inferred {result.get('inferred_edges', 0)} reverse edges")
```

Returns:
```python
{
    "concepts": 15,
    "edges": 42,
    "inferred_edges": 27,  # Auto-generated reverse edges
    "frameworks": 3
}
```

**`load_pack(pack_data: Dict[str, Any]) -> Dict[str, Any]`**

Load a knowledge pack from a dict.

```python
pack = {
    "id": "my_pack",
    "name": "My Knowledge Pack",
    "concepts": [...],
    "edges": [...],
    "frameworks": [...]
}

result = retriever.load_pack(pack)
```

**`explore_concept(concept_id: str, depth: int = 2) -> List[Document]`**

Explore relationships outward from a specific concept.

```python
# Explore 2 levels deep from "product_market_fit"
docs = retriever.explore_concept("product_market_fit", depth=2)

print(docs[0].page_content)
# Outputs relationship graph with:
# - Direct edges (depth 1)
# - Second-degree edges (depth 2)
```

**`get_all_concepts(category: Optional[str] = None) -> List[Document]`**

List all loaded concepts, optionally filtered by category.

```python
# All concepts
all_docs = retriever.get_all_concepts()

# Concepts in "strategy" category only
strategy_docs = retriever.get_all_concepts(category="strategy")
```

**`get_stats() -> Dict[str, Any]`**

Get knowledge graph statistics.

```python
stats = retriever.get_stats()
print(f"Concepts: {stats.get('concept_count', 0)}")
print(f"Edges: {stats.get('edge_count', 0)}")
print(f"Frameworks: {stats.get('framework_count', 0)}")
```

### KonstructCausalRetriever

Specialized retriever for causal reasoning queries. Filters and ranks by causal relations (`causes`, `prevents`, `requires`, `enables`, `conflicts_with`, `amplifies`, `diminishes`).

```python
from konstruct_retriever import KonstructCausalRetriever

# Emphasize causal relationships
causal_retriever = KonstructCausalRetriever(
    max_concepts=10,
    max_depth=2
)

causal_retriever.load_pack(compliance_pack)

# Queries automatically prioritize causal edges
docs = causal_retriever.get_relevant_documents("What must be in place before AML compliance?")
# Returns concepts with "requires" edges ranked higher
```
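
To make "prioritize causal edges" concrete, here is a minimal standalone sketch of one possible ranking: causal relations first, then descending weight. The sort key is an assumption for illustration, not Konstruct's actual scoring, and the `fraud_detection` and `onboarding` IDs are hypothetical.

```python
# Causal relation names taken from the description above.
CAUSAL_RELATIONS = {
    "causes", "prevents", "requires", "enables",
    "conflicts_with", "amplifies", "diminishes",
}

def rank_edges(edges):
    """Sort edges so causal relations come first, then by descending weight."""
    return sorted(
        edges,
        key=lambda e: (e["relation"] not in CAUSAL_RELATIONS, -e.get("weight", 0.0)),
    )

edges = [
    {"source": "aml", "target": "kyc", "relation": "requires", "weight": 0.95},
    # Hypothetical non-causal edge with a higher weight.
    {"source": "aml", "target": "fraud_detection", "relation": "similar_to", "weight": 0.99},
    {"source": "kyc", "target": "onboarding", "relation": "enables", "weight": 0.7},
]

ranked = rank_edges(edges)
for edge in ranked:
    print(edge["relation"], edge["weight"])
```

In this sketch the `similar_to` edge sorts last despite its higher weight, because it is not in the causal set.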

## Knowledge Pack Format

Knowledge packs define concepts, relationships, and frameworks in JSON:

```json
{
  "id": "unique_pack_id",
  "name": "Human-Readable Name",
  "source": "Where this knowledge came from",
  "version": "1.0.0",
  "concepts": [
    {
      "id": "concept_id",
      "name": "Concept Name",
      "category": "category_name",
      "importance": 0.85,
      "tags": ["tag1", "tag2"],
      "description": "What this concept means",
      "source": "Reference or citation"
    }
  ],
  "edges": [
    {
      "source": "concept_a",
      "target": "concept_b",
      "relation": "causes",
      "weight": 0.9,
      "metadata": "Optional explanation"
    }
  ],
  "frameworks": [
    {
      "id": "framework_id",
      "name": "Framework Name",
      "source": "Where this framework came from",
      "trigger_concepts": ["concept_a", "concept_b"],
      "questions": [
        "Question 1?",
        "Question 2?"
      ],
      "red_flags": [
        "Warning sign 1",
        "Warning sign 2"
      ],
      "green_lights": [
        "Positive indicator 1",
        "Positive indicator 2"
      ],
      "applicable_stages": ["design", "implementation", "testing"]
    }
  ]
}
```

### Relation Types

Konstruct supports 19 typed relations with automatic inference:

| Relation | Symmetry | Inverse | Description |
|----------|----------|---------|-------------|
| `is_a` | Transitive | `generalizes` | Subtype relationship |
| `part_of` | Transitive | `contains` | Component relationship |
| `causes` | Asymmetric | `caused_by` | Causal relationship |
| `enables` | Asymmetric | `enabled_by` | Enablement relationship |
| `requires` | Asymmetric | `required_by` | Dependency relationship |
| `conflicts_with` | Symmetric | `conflicts_with` | Mutual exclusion |
| `similar_to` | Symmetric | `similar_to` | Similarity relationship |
| `precedes` | Transitive | `follows` | Temporal ordering |
| `supports` | Asymmetric | `supported_by` | Support relationship |
| `contradicts` | Symmetric | `contradicts` | Logical contradiction |
| `amplifies` | Asymmetric | `amplified_by` | Amplification effect |
| `diminishes` | Asymmetric | `diminished_by` | Reduction effect |
| `implements` | Asymmetric | `implemented_by` | Implementation relationship |
| `depends_on` | Transitive | `dependency_of` | Stronger than requires |
| `correlates_with` | Symmetric | `correlates_with` | Statistical correlation |
| `replaces` | Asymmetric | `replaced_by` | Substitution |
| `triggers` | Asymmetric | `triggered_by` | Event causation |
| `validates` | Asymmetric | `validated_by` | Verification relationship |
| `prevents` | Asymmetric | `prevented_by` | Prevention relationship |

**Automatic inference:** When you add `{"source": "A", "target": "B", "relation": "causes"}`, Konstruct automatically creates the reverse edge `{"source": "B", "target": "A", "relation": "caused_by"}`.
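
The inference rule can be sketched in a few lines of plain Python. This is an illustration of the idea only, not Konstruct's implementation; the `INVERSE` map copies a handful of rows from the table above, and the function name is hypothetical.

```python
# Inverse names copied from a few rows of the relation table.
INVERSE = {
    "causes": "caused_by",
    "enables": "enabled_by",
    "requires": "required_by",
    "prevents": "prevented_by",
    "conflicts_with": "conflicts_with",  # symmetric: the inverse is the relation itself
}

def infer_reverse_edges(edges):
    """Return one reverse edge for every edge whose relation has a known inverse."""
    existing = {(e["source"], e["target"], e["relation"]) for e in edges}
    inferred = []
    for e in edges:
        inverse = INVERSE.get(e["relation"])
        if inverse is None:
            continue  # unknown relation: nothing to infer
        key = (e["target"], e["source"], inverse)
        if key in existing:
            continue  # the reverse edge is already present
        existing.add(key)
        # Copy the edge, swapping endpoints and substituting the inverse relation.
        inferred.append({**e, "source": e["target"], "target": e["source"], "relation": inverse})
    return inferred

edges = [{"source": "A", "target": "B", "relation": "causes", "weight": 0.9}]
reverse = infer_reverse_edges(edges)
print(reverse)
```

A misspelled relation such as `"Causes"` has no entry in the map, so no reverse edge is produced for it.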

## Examples

### Example 1: Startup Advisory System

```python
from konstruct_retriever import KonstructRetriever
from langchain.chains import RetrievalQA
from langchain_anthropic import ChatAnthropic

retriever = KonstructRetriever(max_advisory_tokens=500)

# Load Y Combinator fundamentals
pack = {
    "id": "yc_fundamentals",
    "name": "Y Combinator Fundamentals",
    "concepts": [
        {
            "id": "talk_to_users",
            "name": "Talk to Users",
            "category": "discovery",
            "importance": 0.95,
            "description": "Direct conversation with potential customers"
        },
        {
            "id": "build_something_people_want",
            "name": "Build Something People Want",
            "category": "strategy",
            "importance": 1.0,
            "description": "The core YC mantra"
        },
        {
            "id": "premature_scaling",
            "name": "Premature Scaling",
            "category": "pitfall",
            "importance": 0.85,
            "description": "Scaling before product-market fit"
        }
    ],
    "edges": [
        {
            "source": "talk_to_users",
            "target": "build_something_people_want",
            "relation": "enables",
            "weight": 0.95
        },
        {
            "source": "premature_scaling",
            "target": "talk_to_users",
            "relation": "prevents",
            "weight": 0.8
        }
    ],
    "frameworks": [
        {
            "id": "customer_development",
            "name": "Customer Development Framework",
            "trigger_concepts": ["talk_to_users"],
            "questions": [
                "Have you talked to at least 10 potential customers?",
                "What's the biggest pain point they described?",
                "Would they pay for a solution?"
            ],
            "red_flags": [
                "Building in a vacuum without user feedback",
                "Assuming you know what users want",
                "Focusing on features before validating problem"
            ],
            "green_lights": [
                "Users actively asking when they can pay",
                "Clear, consistent pain point across interviews",
                "Users willing to use MVP despite limitations"
            ]
        }
    ]
}

retriever.load_pack(pack)

qa_chain = RetrievalQA.from_chain_type(
    llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
    retriever=retriever
)

response = qa_chain.invoke({"query": "Should I build more features before launching?"})
print(response["result"])
# LLM will see structured advisory with "talk to users" concept + framework triggers
```

### Example 2: Regulatory Compliance Checker

See `examples/compliance_advisor/advisor.py` in the Konstruct repository for the full implementation.

```python
from konstruct_retriever import KonstructCausalRetriever

# Use causal retriever for "if X then Y" reasoning
retriever = KonstructCausalRetriever(max_depth=2)

compliance_pack = {
    "id": "fintech_compliance",
    "concepts": [
        {"id": "kyc", "name": "Know Your Customer"},  # remaining fields omitted for brevity
        {"id": "aml", "name": "Anti-Money Laundering"}  # remaining fields omitted for brevity
    ],
    "edges": [
        {
            "source": "aml",
            "target": "kyc",
            "relation": "requires",
            "weight": 0.95,
            "metadata": "AML programs must include KYC procedures"
        }
    ]
}

retriever.load_pack(compliance_pack)

# Query for prerequisites
docs = retriever.get_relevant_documents("What's required before implementing AML?")
# Konstruct traverses "requires" edges deterministically, no LLM needed
```

### Example 3: Concept Exploration

```python
retriever = KonstructRetriever()
retriever.load_pack(my_pack)

# Explore all relationships from a concept
docs = retriever.explore_concept("product_market_fit", depth=2)

print(docs[0].page_content)
# Output:
# # Exploration from: product_market_fit
# - **enables** (weight: 0.90) → Sustainable Growth
#   Businesses with PMF can scale efficiently...
# - **requires** (weight: 0.85) → Customer Discovery
#   Cannot achieve PMF without understanding customer needs...
```

### Example 4: Category Filtering

```python
# Only retrieve security-related concepts
retriever = KonstructRetriever(
    category_filter="security",
    max_concepts=5
)

retriever.load_pack(infrastructure_pack)

docs = retriever.get_relevant_documents("How do I protect user data?")
# Only concepts with category="security" will be considered
```

### Example 5: Stage-Specific Frameworks

```python
# Design stage advisory
design_retriever = KonstructRetriever(
    stage_filter="design",
    max_advisory_tokens=400
)

design_retriever.load_pack(architecture_pack)

docs = design_retriever.get_relevant_documents("Planning a microservices architecture")
# Only frameworks with "design" in applicable_stages will trigger
```
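
A rough sketch of how stage filtering could combine with `trigger_concepts` is shown below. The matching rule is an assumption, as is the choice to treat a framework with no `applicable_stages` as applying to every stage; the function name and sample framework IDs are hypothetical.

```python
def triggered_frameworks(frameworks, matched_concept_ids, stage=None):
    """Return frameworks triggered by the matched concepts and allowed for the stage."""
    matched = set(matched_concept_ids)
    result = []
    for fw in frameworks:
        # A framework triggers when at least one of its trigger concepts matched.
        if not matched.intersection(fw.get("trigger_concepts", [])):
            continue
        stages = fw.get("applicable_stages")
        # Assumption: a framework without applicable_stages applies to every stage.
        if stage is not None and stages is not None and stage not in stages:
            continue
        result.append(fw)
    return result

frameworks = [
    {"id": "service_boundaries", "trigger_concepts": ["microservices"],
     "applicable_stages": ["design"]},
    {"id": "load_testing", "trigger_concepts": ["microservices"],
     "applicable_stages": ["testing"]},
]

selected = triggered_frameworks(frameworks, ["microservices"], stage="design")
print([fw["id"] for fw in selected])
```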

### Example 6: Custom Advisory Budget

```python
# Detailed advisory (more tokens)
detailed_retriever = KonstructRetriever(
    search_type="advisory",
    max_advisory_tokens=600
)

# Compact advisory (fewer tokens)
compact_retriever = KonstructRetriever(
    search_type="advisory",
    max_advisory_tokens=150
)

# Raw concepts (no advisory formatting)
concept_retriever = KonstructRetriever(
    search_type="concepts"
)
```

## Configuration

### Environment Variables

Create a `.env` file:

```bash
# Required
ANTHROPIC_API_KEY=your_anthropic_key_here

# Optional (for unlimited API calls)
KONSTRUCT_API_KEY=your_konstruct_key_here
```
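
The variables can then be read with the standard library alone, as in this minimal sketch. The `load_settings` helper is hypothetical; if you keep the keys in a `.env` file, calling `load_dotenv()` from `python-dotenv` first populates `os.environ`.

```python
import os

def load_settings(env=None):
    """Read the API keys described above from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    anthropic_key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not anthropic_key:
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    # A missing or empty KONSTRUCT_API_KEY becomes None (unauthenticated access).
    konstruct_key = env.get("KONSTRUCT_API_KEY", "").strip() or None
    return {"anthropic_api_key": anthropic_key, "konstruct_api_key": konstruct_key}

settings = load_settings({"ANTHROPIC_API_KEY": "your_anthropic_key_here"})
print(settings["konstruct_api_key"])
```

The optional key can then be passed as `KonstructRetriever(api_key=settings["konstruct_api_key"])`.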

### Search Types

**`advisory`** (default) — Returns structured guidance with 3 sections:
1. Key Concepts (first 50% of the token budget)
2. Key Relationships (next 25% of the token budget)
3. Framework Guidance (final 25% of the token budget)

```python
retriever = KonstructRetriever(search_type="advisory")
docs = retriever.get_relevant_documents("How do I validate my idea?")
# Returns single Document with token-budgeted advisory text
```
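
One way to read the three-section split above is 50%, 25%, and 25% of the budget. The arithmetic below is a sketch under that reading; the function name and the rounding behavior are assumptions, and the real allocator may differ.

```python
def split_advisory_budget(max_advisory_tokens=300):
    """Split a token budget into 50% concepts, 25% relationships, 25% frameworks."""
    concepts = max_advisory_tokens // 2
    relationships = max_advisory_tokens // 4
    # Give any rounding remainder to the last section so the parts sum to the budget.
    frameworks = max_advisory_tokens - concepts - relationships
    return {
        "key_concepts": concepts,
        "key_relationships": relationships,
        "framework_guidance": frameworks,
    }

budget = split_advisory_budget(300)
print(budget)
```

With the default budget of 300 tokens this yields 150, 75, and 75 tokens for the three sections.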

**`concepts`** — Returns raw concepts and relationships as separate documents:

```python
retriever = KonstructRetriever(search_type="concepts")
docs = retriever.get_relevant_documents("How do I validate my idea?")
# Returns multiple Documents, one per concept with edges/frameworks
```

### Traversal Depth

Controls how many edge hops to follow:

```python
# Shallow (direct connections only)
retriever = KonstructRetriever(max_depth=1)

# Deep (two-hop connections)
retriever = KonstructRetriever(max_depth=2)
```

- **Performance:** Depth 1 is ~2x faster than depth 2.
- **Relevance:** Depth 2 captures more context but may include noise.
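
The effect of the depth setting can be illustrated with a depth-limited breadth-first search over an edge list. This sketch follows outgoing edges only, which is a simplification rather than a description of Konstruct's traversal, and `sustainable_growth` is a hypothetical concept ID.

```python
from collections import deque

def neighbors_within(edges, start, max_depth):
    """Return {concept_id: hops} for concepts reachable from start within max_depth hops."""
    adjacency = {}
    for e in edges:
        adjacency.setdefault(e["source"], []).append(e["target"])

    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in adjacency.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)

    del hops[start]  # report only the neighbors, not the start concept
    return hops

edges = [
    {"source": "customer_discovery", "target": "product_market_fit", "relation": "enables"},
    {"source": "product_market_fit", "target": "sustainable_growth", "relation": "enables"},
]

shallow = neighbors_within(edges, "customer_discovery", 1)
deep = neighbors_within(edges, "customer_discovery", 2)
print(shallow)
print(deep)
```

Depth 1 reaches only `product_market_fit`; depth 2 also reaches `sustainable_growth`, which is where the extra context (and potential noise) comes from.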

### Max Concepts

Controls result size:

```python
# Focused (top 5 concepts)
retriever = KonstructRetriever(max_concepts=5)

# Comprehensive (top 20 concepts)
retriever = KonstructRetriever(max_concepts=20)
```

## Troubleshooting

### "Konstruct MCP error: Rate limit exceeded"

You're hitting the 60 requests/minute rate limit for unauthenticated requests.

**Solution:** Set the `KONSTRUCT_API_KEY` environment variable and pass it to the retriever to bypass rate limits.

```python
import os

retriever = KonstructRetriever(
    api_key=os.getenv("KONSTRUCT_API_KEY")
)
```
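
If you need to stay unauthenticated, a client-side throttle can keep you under the limit. The sliding-window limiter below is a generic sketch, not part of the integration; the class name is hypothetical and the defaults mirror the 60 requests/minute figure above.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most max_calls per window (seconds), sleeping when the window is full."""

    def __init__(self, max_calls=60, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.window - (now - self.calls[0]))
            now = self.clock()
            self.calls.popleft()
        self.calls.append(now)

limiter = SlidingWindowLimiter()
limiter.wait()  # first call never sleeps
print(len(limiter.calls))
```

Call `limiter.wait()` immediately before each retriever request. The `clock` and `sleep` parameters exist so the limiter can be tested without real delays.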

### "No concepts returned"

Possible causes:
1. Knowledge pack not loaded
2. Query context doesn't match any concepts
3. Category/stage filters too restrictive

**Solutions:**
```python
# Check if pack loaded successfully
stats = retriever.get_stats()
print(stats)  # Should show concept_count > 0

# Remove filters temporarily
retriever = KonstructRetriever(
    category_filter=None,
    stage_filter=None
)

# List all concepts to verify loading
docs = retriever.get_all_concepts()
if docs:
    print(docs[0].page_content)
else:
    print("No concepts loaded")
```

### "Advisory is truncated"

Advisory generation hit the token budget limit.

**Solution:** Increase `max_advisory_tokens`:

```python
retriever = KonstructRetriever(max_advisory_tokens=600)
```

Or use `search_type="concepts"` for full details.

### "Reverse edges not appearing"

Reverse edges are inferred only for edges whose relation type Konstruct recognizes; the relation's symmetry profile determines which inverse is generated.

**Check:**
1. Relation type is spelled correctly (case-sensitive)
2. Relation exists in Konstruct's 19 supported types
3. Source/target concept IDs exist in pack

```python
result = retriever.load_pack(pack)
print(f"Inferred {result.get('inferred_edges', 0)} reverse edges")
# Should be > 0 if you have asymmetric edges
```
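
The three checks above can be run locally before loading a pack. This validator is a standalone sketch (the function name is hypothetical); the relation names are copied from the Relation Types table.

```python
SUPPORTED_RELATIONS = {
    "is_a", "part_of", "causes", "enables", "requires", "conflicts_with",
    "similar_to", "precedes", "supports", "contradicts", "amplifies",
    "diminishes", "implements", "depends_on", "correlates_with", "replaces",
    "triggers", "validates", "prevents",
}

def find_edge_problems(pack):
    """Return human-readable problems for edges that may block reverse-edge inference."""
    concept_ids = {c["id"] for c in pack.get("concepts", [])}
    problems = []
    for i, edge in enumerate(pack.get("edges", [])):
        relation = edge.get("relation")
        # Relation names are case-sensitive, so "Causes" fails this check.
        if relation not in SUPPORTED_RELATIONS:
            problems.append(f"edge {i}: unsupported relation {relation!r}")
        for end in ("source", "target"):
            if edge.get(end) not in concept_ids:
                problems.append(f"edge {i}: unknown {end} {edge.get(end)!r}")
    return problems

pack = {
    "concepts": [{"id": "a"}, {"id": "b"}],
    "edges": [
        {"source": "a", "target": "b", "relation": "Causes"},    # wrong case
        {"source": "a", "target": "c", "relation": "requires"},  # unknown target
    ],
}

problems = find_edge_problems(pack)
for problem in problems:
    print(problem)
```

When several packs are loaded into one graph, an edge may legitimately point at a concept defined in another pack, so treat unknown-ID reports as warnings in that case.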

### "Connection timeout"

The MCP endpoint may be unreachable or slow.

**Solutions:**
1. Check internet connection
2. Verify `mcp_url` is correct
3. Self-host Konstruct if reliability is critical (see main README)

## Advanced Usage

### Custom MCP Endpoint

Self-host Konstruct and point to your instance:

```python
retriever = KonstructRetriever(
    mcp_url="https://my-konstruct-instance.com/mcp"
)
```

### Multiple Knowledge Packs

Load multiple packs into the same graph:

```python
retriever = KonstructRetriever()

# Load base knowledge
retriever.load_pack(fundamentals_pack)

# Add domain-specific knowledge
retriever.load_pack(healthcare_pack)

# Add regulatory knowledge
retriever.load_pack(compliance_pack)

# All three packs are merged in the graph
stats = retriever.get_stats()
print(f"Total concepts: {stats.get('concept_count', 0)}")
```
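
How the graph resolves concept IDs that appear in more than one pack is not specified here, so it may be worth checking for collisions before loading. The helper below is a hypothetical pre-flight check, not part of the retriever.

```python
from collections import defaultdict

def find_id_collisions(*packs):
    """Return {concept_id: [pack ids]} for concept IDs defined in more than one pack."""
    owners = defaultdict(list)
    for pack in packs:
        for concept in pack.get("concepts", []):
            owners[concept["id"]].append(pack.get("id", "<unnamed>"))
    return {cid: ids for cid, ids in owners.items() if len(ids) > 1}

# Small stand-in packs for illustration.
fundamentals = {"id": "fundamentals", "concepts": [{"id": "risk"}, {"id": "audit"}]}
compliance = {"id": "compliance", "concepts": [{"id": "audit"}, {"id": "kyc"}]}

collisions = find_id_collisions(fundamentals, compliance)
print(collisions)
```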

### Programmatic Pack Generation

Build knowledge packs from structured data sources:

```python
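import json

# fetch_and_parse, generate_id, extract_title, classify_category,
# calculate_importance, summarize, extract_links, infer_relation, and
# calculate_weight are placeholders: supply your own implementations.
def build_pack_from_docs(doc_urls):
    """Convert documentation into a knowledge pack."""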
    concepts = []
    edges = []

    for url in doc_urls:
        # Scrape documentation
        content = fetch_and_parse(url)

        # Extract concepts via LLM
        concept = {
            "id": generate_id(content),
            "name": extract_title(content),
            "category": classify_category(content),
            "importance": calculate_importance(content),
            "description": summarize(content)
        }
        concepts.append(concept)

        # Extract relationships
        for link in extract_links(content):
            edges.append({
                "source": concept["id"],
                "target": link["target_id"],
                "relation": infer_relation(link),
                "weight": calculate_weight(link)
            })

    pack = {
        "id": "generated_pack",
        "concepts": concepts,
        "edges": edges,
        "frameworks": []
    }

    return pack

pack = build_pack_from_docs(["https://docs.example.com/..."])
retriever.load_pack(pack)
```

### Monitoring Graph Stats

```python
import time

while True:
    stats = retriever.get_stats()

    print(f"Concepts: {stats.get('concept_count', 0)}")
    print(f"Edges: {stats.get('edge_count', 0)}")
    print(f"Packs: {stats.get('packs_loaded', 0)}")
    print(f"Frameworks: {stats.get('framework_count', 0)}")

    time.sleep(60)
```

### Concept-Level Queries

Explore from a known concept ID, list the concepts in a category, or read graph statistics:

```python
# Explore from a known concept
docs = retriever.explore_concept("product_market_fit", depth=2)

# List concepts in a category
docs = retriever.get_all_concepts(category="strategy")

# Get graph statistics
stats = retriever.get_stats()
```

## Performance Characteristics

Based on Cloudflare Workers + Durable Objects deployment:

| Operation | Latency | Description |
|-----------|---------|-------------|
| Concept Query | ~100μs | Retrieve concept by ID |
| Edge Traversal | ~200μs | Follow relationships (depth 1) |
| Advisory Generation | ~500μs | Token-budgeted guidance |
| Pack Loading | ~50ms | Load full knowledge pack |
| Graph Stats | ~50μs | Get counts and metadata |

**Comparison to competitors:**
- Mem0: 200ms retrieval
- Zep: <200ms retrieval
- LangChain VectorStore: 500ms-1s (depends on embedding)

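Konstruct is **2-10x faster** because edge inference is pre-computed and traversal is deterministic, with no embedding or LLM call at query time. Calls made through the retriever also pay network latency to the MCP endpoint, which the microsecond figures in the table likely exclude.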
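
To check latency in your own environment, you can time calls end to end. The harness below is a generic sketch with a hypothetical helper name; it uses a stand-in workload so it runs without a Konstruct endpoint.

```python
import statistics
import time

def measure_latency_ms(fn, runs=20):
    """Call fn repeatedly and return min, median, and max wall-clock latency in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }

# Stand-in workload; swap in something like `lambda: retriever.get_stats()`.
result = measure_latency_ms(lambda: sum(range(1000)))
print(result)
```

Numbers measured this way include network round trips and client overhead, so expect them to be higher than engine-side figures.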

## License

© 2026 AerwareAI - Proprietary

See Konstruct main README for licensing information.
