Metadata-Version: 2.4
Name: symbits
Version: 0.1.1
Summary: Symbolic bitvector arithmetic with three-valued logic for tracking known and unknown bits through bitwise operations
License-Expression: MIT
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Dynamic: license-file

# symbits

Symbolic bitvector arithmetic with three-valued logic for tracking known and unknown bits through bitwise operations.

Each bit in a value is in one of three states: **known-0**, **known-1**, or **unknown (`x`)**. When you perform bitwise operations, `symbits` propagates what's known and what isn't — exactly like a hardware known-bits analysis, but interactive in a Python REPL.

Built for GPU compiler engineers tracing bit-level state through instruction chains, but useful anywhere you need to reason about partial bit-level knowledge.

## Installation

```bash
pip install symbits
```

Requires Python 3.8+. No runtime dependencies.

## Quick start

```python
from symbits import B, B8, B16, B32, B64

# A value where the upper nibble is known, lower nibble is unknown
val = B8("1100xxxx")
print(val)
# 0xcX  (1100 xxxx)  [4/8 known]

# AND with a mask — known-0 bits force the result to 0
print(val & 0x0F)
# 0x0X  (0000 xxxx)  [4/8 known]

# OR with known-1 bits forces those bits to 1
print(val | 0x0F)
# 0xcf  (1100 1111)  [8/8 known]
```

## Creating values

### From integers

Integer inputs produce fully-known values. Default width is 32 bits.

```python
B(0x42)                # 32-bit, all bits known
B(0x42, width=64)      # 64-bit
B(0)                   # 32-bit zero
B(0xFF, width=8)       # 8-bit
```

### Convenience width constructors

```python
B8(0x42)               # 8-bit
B16(0x42)              # 16-bit
B32(0x42)              # 32-bit
B64(0x42)              # 64-bit
```

### From binary strings

Use `0` and `1` for known bits, `x` or `?` for unknowns. Spaces are ignored (use them for readability). Width is inferred from string length unless overridden.

```python
B("0000xxxx")                     # 8-bit: upper nibble known-0, lower unknown
B("1xxx0011")                     # 8-bit: mixed known and unknown
B("1xxx xxxx xxxx xxxx")          # 16-bit: spaces stripped, just for readability
B("1010", width=8)                # 8-bit: upper 4 bits are implicitly known-0
B("0000????")                     # same as "0000xxxx" — ? works too
```
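
The parsing rules above can be sketched in a few lines. This is an illustration only, not the `symbits` implementation: the name `parse_bits` and the `(ones, unknown, width)` mask representation are assumptions made for the example.

```python
def parse_bits(text, width=None):
    """Parse a 0/1/x/? string into (ones_mask, unknown_mask, width)."""
    digits = text.replace(" ", "")          # spaces are only for readability
    if width is None:
        width = len(digits)                 # width inferred from string length
    if len(digits) > width:
        raise ValueError("string is longer than the requested width")
    ones = unknown = 0
    for ch in digits:                       # MSB first
        ones <<= 1
        unknown <<= 1
        if ch == "1":
            ones |= 1
        elif ch in ("x", "X", "?"):
            unknown |= 1
        elif ch != "0":
            raise ValueError(f"bad bit character: {ch!r}")
    # Bits above len(digits) stay 0 in both masks, i.e. known-0.
    return ones, unknown, width


print(parse_bits("1xxx 0011"))       # (131, 112, 8)
print(parse_bits("1010", width=8))   # (10, 0, 8)
```

Keeping one mask for known-1 bits and one for unknown bits means every bit set in neither mask is known-0, which is why padding to a wider width needs no extra work.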

### From hex strings

Use `0x` prefix for hex. `X` or `?` marks an entire nibble (4 bits) as unknown.

```python
B("0xFF")                         # 8-bit, all ones
B("0xFFXX00")                     # 24-bit: FF known, XX unknown, 00 known
B("0xFF??00")                     # same thing with ? syntax
B("0xFF", width=32)               # 32-bit: upper 24 bits are known-0
```
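
One way to picture hex parsing is as expansion to the binary form, four bits per digit. The helper below is a hypothetical sketch (`hex_to_bits` is not part of the documented API) and assumes only `X` and `?` mark unknown nibbles.

```python
def hex_to_bits(text):
    """Expand a 0x-prefixed hex string into a 0/1/x bit string."""
    if not text.lower().startswith("0x"):
        raise ValueError("expected a 0x prefix")
    out = []
    for ch in text[2:]:
        if ch in ("X", "?"):
            out.append("xxxx")                       # whole nibble unknown
        else:
            out.append(format(int(ch, 16), "04b"))   # known nibble
    return "".join(out)


print(hex_to_bits("0xFFXX00"))   # 11111111xxxxxxxx00000000
print(hex_to_bits("0xFF??00"))   # 11111111xxxxxxxx00000000
```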

### All-unknown values

```python
B.unknown(32)                     # 32-bit, every bit is unknown
B.unknown(64)                     # 64-bit, every bit is unknown
```

## Reading the display

Every value prints three things: hex, binary (nibble-grouped), and how many bits are known.

```python
>>> B("0xFFXX00")
0xffXX00  (1111 1111 xxxx xxxx 0000 0000)  [16/24 known]
```

In the hex display:
- A normal hex digit (`0`-`f`) means all 4 bits in that nibble are known
- `X` means all 4 bits in that nibble are unknown
- `?` means the nibble has a mix of known and unknown bits
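
The three rules above amount to a per-nibble classification. A minimal sketch, assuming the width is a multiple of 4 and using hypothetical helper names (`hex_digit`, `hex_display`) that are not part of `symbits`:

```python
def hex_digit(nibble):
    """Map a 4-character 0/1/x string to a hex digit, 'X', or '?'."""
    if "x" not in nibble:
        return format(int(nibble, 2), "x")      # all four bits known
    return "X" if nibble == "xxxx" else "?"     # all unknown vs. mixed


def hex_display(bits):
    """Render a 0/1/x bit string in the hex notation described above."""
    if len(bits) % 4:
        raise ValueError("this sketch assumes a width that is a multiple of 4")
    nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "0x" + "".join(hex_digit(n) for n in nibbles)


print(hex_display("11111111xxxxxxxx00000000"))  # 0xffXX00
print(hex_display("1000xx00"))                  # 0x8?
```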

You can get just the binary or hex string for copy-pasting:

```python
>>> val = B("0xFFXX00")
>>> val.bin()
'11111111xxxxxxxx00000000'
>>> val.hex()
'0xffXX00'
```

## Bitwise operations

All standard Python bitwise operators work and propagate known/unknown state correctly.

### AND (`&`)

A known-0 in either operand forces the result to 0, regardless of the other operand.

```python
>>> a = B8("1100xxxx")
>>> b = B8("1010xx00")
>>> a & b
0x8?  (1000 xx00)  [6/8 known]
```

The key insight: bits 1-0 of `a` are unknown, but those same bits of `b` are known-0. Since `anything AND 0 = 0`, the result bits 1-0 are known-0 even though `a` was unknown there. Bits 3-2 are unknown in both operands, so they stay unknown.
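
The AND rule can be written with two bitmasks per value. This is a sketch of the rule, not the library's code: `and3` and the `(ones, unknown)` pair are assumptions, and the two masks are assumed never to overlap.

```python
def and3(a, b, width=8):
    """Three-valued AND on (ones_mask, unknown_mask) pairs."""
    full = (1 << width) - 1
    a_ones, a_unk = a
    b_ones, b_unk = b
    a_zeros = ~(a_ones | a_unk) & full      # bits known to be 0 in a
    b_zeros = ~(b_ones | b_unk) & full      # bits known to be 0 in b
    ones = a_ones & b_ones                  # 1 only if both are known-1
    zeros = a_zeros | b_zeros               # a known-0 on either side wins
    unknown = ~(ones | zeros) & full        # everything else is unknown
    return ones, unknown


a = (0xC0, 0x0F)    # 1100xxxx
b = (0xA0, 0x0C)    # 1010xx00
print(tuple(hex(v) for v in and3(a, b)))    # ('0x80', '0xc')  i.e. 1000 xx00
```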

### OR (`|`)

A known-1 in either operand forces the result to 1.

```python
>>> a = B8("1100xxxx")
>>> b = B8("1010xx00")
>>> a | b
0xeX  (1110 xxxx)  [4/8 known]
```
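
The OR rule is the mirror image of AND. A minimal sketch under the same assumptions as before (hypothetical `or3`, values held as non-overlapping `(ones, unknown)` masks):

```python
def or3(a, b):
    """Three-valued OR on (ones_mask, unknown_mask) pairs."""
    a_ones, a_unk = a
    b_ones, b_unk = b
    ones = a_ones | b_ones                  # a known-1 on either side wins
    unknown = (a_unk | b_unk) & ~ones       # unknown unless forced to 1
    return ones, unknown


a = (0xC0, 0x0F)    # 1100xxxx
b = (0xA0, 0x0C)    # 1010xx00
print(tuple(hex(v) for v in or3(a, b)))     # ('0xe0', '0xf')  i.e. 1110 xxxx
```

Bits 1-0 stay unknown here: `b` is known-0 in those positions, and `x OR 0` is still `x`.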

### XOR (`^`)

A result bit is known only when both input bits are known. An unknown bit in either operand makes that result bit unknown.

```python
>>> B8("11001100") ^ B8("10101010")
0x66  (0110 0110)  [8/8 known]

>>> B8("1100xxxx") ^ B8("1010xx00")
0x6X  (0110 xxxx)  [4/8 known]
```

Note: `a ^ a` with unknowns does **not** produce all zeros — the unknown bits stay unknown (conservative, since the two `a`s could represent different concrete values in analysis contexts).
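
A sketch of the XOR rule makes the `a ^ a` behavior concrete. As with the other sketches, `xor3` and the `(ones, unknown)` mask pair are illustrative assumptions rather than the library's internals.

```python
def xor3(a, b):
    """Three-valued XOR on (ones_mask, unknown_mask) pairs."""
    a_ones, a_unk = a
    b_ones, b_unk = b
    unknown = a_unk | b_unk                 # any unknown input poisons the bit
    ones = (a_ones ^ b_ones) & ~unknown     # ordinary XOR where both are known
    return ones, unknown


a = (0xC0, 0x0F)    # 1100xxxx
b = (0xA0, 0x0C)    # 1010xx00
print(tuple(hex(v) for v in xor3(a, b)))    # ('0x60', '0xf')  i.e. 0110 xxxx
print(tuple(hex(v) for v in xor3(a, a)))    # ('0x0', '0xf')   i.e. 0000 xxxx
```

The operation sees only the two mask pairs, not the fact that both operands are the same object, so the low nibble of `a ^ a` remains unknown.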

### NOT (`~`)

Flips known bits, leaves unknowns alone.

```python
>>> ~B8("1100xxxx")
0x3X  (0011 xxxx)  [4/8 known]

>>> ~~B8("1100xxxx") == B8("1100xxxx")   # double-NOT is identity
True
```

### Reverse operators

You can put the integer on the left side:

```python
>>> 0xFF & B8("xxxx0000")
0xX0  (xxxx 0000)  [4/8 known]

>>> 0x00 | B8("xxxx0000")
0xX0  (xxxx 0000)  [4/8 known]
```

## Shifts

Shift amounts must be concrete integers (not symbolic). `>>` is logical (fills with 0), `.ashr()` is arithmetic (fills with sign bit).

### Left shift (`<<`)

Bits shift up. Bottom bits become known-0. Top bits are lost.

```python
>>> B8("xxxx0011") << 4
0x30  (0011 0000)  [8/8 known]

>>> B8("xxxxxxxx") << 3
0xX?  (xxxx x000)  [3/8 known]
```

### Logical right shift (`>>`)

Bits shift down. Top bits become known-0. Bottom bits are lost.

```python
>>> B8("11001100") >> 4
0x0c  (0000 1100)  [8/8 known]

>>> B8("xxxxxxxx") >> 3
0x?X  (000x xxxx)  [3/8 known]
```

### Arithmetic right shift (`.ashr()`)

Like `>>`, but the top bits are filled with copies of the sign bit (MSB). If the sign bit is unknown, the fill bits are unknown.

```python
>>> B8("10000000").ashr(4)          # sign bit is 1: fills with 1
0xf8  (1111 1000)  [8/8 known]

>>> B8("01000000").ashr(4)          # sign bit is 0: fills with 0
0x04  (0000 0100)  [8/8 known]

>>> B8("xxxxxxxx").ashr(4)          # sign bit unknown: fills with x
0xXX  (xxxx xxxx)  [0/8 known]
```
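
All three shifts can be modeled on the 0/1/x bit string, which shows where the fill bits come from. The function names below are hypothetical, and the sketch assumes a non-negative shift count and a non-empty string written MSB first.

```python
def shl(bits, n):
    """Left shift: append n known-0 bits, keep the low `width` positions."""
    return (bits + "0" * n)[-len(bits):]


def lshr(bits, n):
    """Logical right shift: prepend n known-0 bits, drop the low ones."""
    return ("0" * n + bits)[:len(bits)]


def ashr(bits, n):
    """Arithmetic right shift: prepend n copies of the sign bit (0, 1, or x)."""
    return (bits[0] * n + bits)[:len(bits)]


print(shl("xxxx0011", 4))    # 00110000
print(lshr("xxxxxxxx", 3))   # 000xxxxx
print(ashr("10000000", 4))   # 11111000
print(ashr("xxxxxxxx", 4))   # xxxxxxxx
```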

### Shift errors

```python
>>> B8(0xFF) << B8(1)
TypeError: Cannot shift by symbolic amount — shift count must be a concrete integer

>>> B8(0xFF) << -1
ValueError: Shift amount must be non-negative, got -1
```

## Mixed-width operations

When two values of different widths interact, the narrower one is **zero-extended** (upper bits become known-0) and the result takes the wider width.

```python
>>> B8(0xFF) & B32(0x0000FFFF)
0x000000ff  (...)  [32/32 known]

>>> B8("xxxxxxxx") & B32(0xFFFFFFFF)
0x000000XX  (...)  [24/32 known]
# The 8-bit value was zero-extended: upper 24 bits became known-0
```

### Explicit width changes

```python
# Zero-extend: upper bits become known-0
>>> B8("1xxx0011").zext(32)
0x000000?3  (0000 0000 0000 0000 0000 0000 1xxx 0011)  [29/32 known]

# Sign-extend: upper bits copy the sign bit (MSB)
>>> B8("1xxx0011").sext(16)
0xff?3  (1111 1111 1xxx 0011)  [13/16 known]
# Sign bit was 1, so upper 8 bits filled with 1

>>> B8("0xxx0011").sext(16)
0x00?3  (0000 0000 0xxx 0011)  [13/16 known]
# Sign bit was 0, so upper 8 bits filled with 0

>>> B8("xxxx0011").sext(16)
0xXXX3  (xxxx xxxx xxxx 0011)  [4/16 known]
# Sign bit unknown, so upper 8 bits are unknown

# Truncate: keep only the low bits
>>> B32(0xDEADBEEF).truncate(8)
0xef  (1110 1111)  [8/8 known]
```
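
The width changes can be sketched on 0/1/x bit strings as well. These helpers are illustrative assumptions, not `symbits` functions, and they check only the obvious width constraints.

```python
def zext(bits, width):
    """Zero-extend: pad the top with known-0 bits."""
    if width < len(bits):
        raise ValueError("target width is narrower than the value")
    return "0" * (width - len(bits)) + bits


def sext(bits, width):
    """Sign-extend: pad the top with copies of the MSB, which may be 'x'."""
    if width < len(bits):
        raise ValueError("target width is narrower than the value")
    return bits[0] * (width - len(bits)) + bits


def truncate(bits, width):
    """Keep only the low `width` bits."""
    if not 1 <= width <= len(bits):
        raise ValueError("target width must be between 1 and the current width")
    return bits[-width:]


print(sext("1xxx0011", 16))   # 111111111xxx0011
print(sext("xxxx0011", 16))   # xxxxxxxxxxxx0011
print(truncate(format(0xDEADBEEF, "032b"), 8))   # 11101111
```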

## Bit extraction

### Single bit

`.bit(n)` returns `'0'`, `'1'`, or `'x'` for bit `n` (0-indexed from LSB).

```python
>>> val = B8("1100xxxx")
>>> val.bit(7)
'1'
>>> val.bit(3)
'x'
>>> val.bit(0)
'x'
>>> val.bit(100)     # out of range returns '0'
'0'
```

### Bit range

`.bits(hi, lo)` extracts an inclusive range as a new `B` value.

```python
>>> val = B8(0xAB)
>>> val.bits(7, 4)                # upper nibble
0xa  (1010)  [4/4 known]
>>> val.bits(3, 0)                # lower nibble
0xb  (1011)  [4/4 known]
```

### Slice syntax

`a[hi:lo]` uses the **MSB:LSB hardware convention** with both ends inclusive (not Python's half-open `start:stop` convention). `a[7:0]` means bits 7 down to 0.

```python
>>> val = B8(0xAB)
>>> val[7:4]                      # upper nibble
0xa  (1010)  [4/4 known]
>>> val[3:0]                      # lower nibble
0xb  (1011)  [4/4 known]
>>> val[7]                        # single bit as width-1 B
0x1  (1)  [1/1 known]
```
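
To see how the inclusive `hi:lo` range maps onto an MSB-first string, here is a small sketch. `bits_range` is a hypothetical helper, and rejecting out-of-range indices is an assumption of the sketch, not documented library behavior.

```python
def bits_range(bits, hi, lo):
    """Extract bits hi..lo (inclusive, LSB is bit 0) from an MSB-first string."""
    width = len(bits)
    if not 0 <= lo <= hi < width:
        raise ValueError("expected 0 <= lo <= hi < width")
    # Bit n lives at string index width - 1 - n.
    return bits[width - 1 - hi: width - lo]


val = format(0xAB, "08b")        # '10101011'
print(bits_range(val, 7, 4))     # 1010
print(bits_range(val, 3, 0))     # 1011
print(bits_range(val, 7, 7))     # 1
```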

## Concatenation

Combine values into wider ones. The first argument becomes the high bits.

```python
>>> hi = B32(0x0000FFFF)
>>> lo = B32(0xFFFF0000)
>>> hi.concat(lo)                 # instance method: self=high, arg=low
0x0000ffffffff0000  (...)  [64/64 known]

>>> B.concat(hi, lo)              # equivalent static-style call
0x0000ffffffff0000  (...)  [64/64 known]

>>> B.concat(B8(0xAA), B8(0xBB), B8(0xCC))   # three parts
0xaabbcc  (...)  [24/24 known]
```

Typical GPU use case — combining two 32-bit scalar registers into a 64-bit address:

```python
>>> s0 = B32("0xXXXX0000")       # low 32 bits, upper half unknown
>>> s1 = B32(0x00007FFF)         # high 32 bits, fully known
>>> addr = s1.concat(s0)
>>> addr
0x00007fffXXXX0000  (...)  [48/64 known]
>>> addr[63:32].as_int()         # extract high word
32767
```

## Inspection methods

```python
val = B8("1100xxxx")

val.known_bits()           # 4 — count of bits that are known
val.unknown_bits()         # 4 — count of bits that are unknown
val.is_fully_known()       # False
val.is_fully_unknown()     # False

val.known_ones_mask()      # 0xC0 — bitmask of bits known to be 1
val.known_zeros_mask()     # 0x30 — bitmask of bits known to be 0
val.unknown_mask()         # 0x0F — bitmask of unknown bits
```

### Converting to int

```python
>>> B8(0xAB).as_int()
171

>>> B8("1100xxxx").as_int()
ValueError: Cannot convert to int: 4 bits are unknown.
```

### Checking compatibility

`.could_equal(val)` returns `True` if the known bits are compatible with a concrete value.

```python
>>> val = B8("1100xxxx")
>>> val.could_equal(0xC0)         # 11000000 — compatible
True
>>> val.could_equal(0xCF)         # 11001111 — compatible
True
>>> val.could_equal(0xFF)         # 11111111: bits 5-4 are known-0, incompatible
False
>>> val.could_equal(0x00)         # 00000000 — bits 7-6 are known-1, incompatible
False
```
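
The compatibility check reduces to one mask comparison: ignore the unknown positions and compare the rest. A sketch with an assumed free function and `(ones, unknown)` masks, not the library's method:

```python
def could_equal(ones, unknown, candidate):
    """True if `candidate` agrees with every known bit of the value."""
    # Clearing the unknown positions must leave exactly the known-1 bits.
    return (candidate & ~unknown) == ones


ones, unknown = 0xC0, 0x0F                 # 1100xxxx
print(could_equal(ones, unknown, 0xC0))    # True
print(could_equal(ones, unknown, 0xCF))    # True
print(could_equal(ones, unknown, 0xFF))    # False
print(could_equal(ones, unknown, 0x00))    # False
```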

## Comparison and equality

`==` is **structural equality** — same width, same known bits, same unknowns.

```python
>>> B8("1100xxxx") == B8("1100xxxx")
True
>>> B8(42) == 42                  # int is converted to B with matching width
True
>>> B8("xxxx0000") == 0           # structurally different (has unknowns)
False
>>> B8(0xFF) == B16(0xFF)         # different widths
False
```

### Refinement

`a.is_refinement_of(b)` returns `True` if `a` knows everything `b` knows, plus possibly more.

```python
>>> B8("11000000").is_refinement_of(B8("1100xxxx"))
True    # a knows all of b's bits, plus the lower nibble

>>> B8("1100xxxx").is_refinement_of(B8("11000000"))
False   # a doesn't know the lower nibble that b knows
```
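
One plausible reading of refinement, sketched with masks: every bit known in `b` must also be known in `a` with the same value. Whether `symbits` requires the values to agree is an assumption here, as are the function name and the `(ones, unknown)` representation.

```python
def is_refinement_of(a, b):
    """True if a knows every bit b knows (with the same value), maybe more."""
    a_ones, a_unk = a
    b_ones, b_unk = b
    no_lost_knowledge = (a_unk & ~b_unk) == 0          # a unknown only where b is
    values_agree = ((a_ones ^ b_ones) & ~b_unk) == 0   # same value where b knows
    return no_lost_knowledge and values_agree


full = (0xC0, 0x00)      # 11000000
partial = (0xC0, 0x0F)   # 1100xxxx
print(is_refinement_of(full, partial))   # True
print(is_refinement_of(partial, full))   # False
```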

### Boolean context

Using a symbolic value as a boolean raises `TypeError` to prevent bugs:

```python
>>> if B8(0xFF):
...     pass
TypeError: Symbolic bitvectors cannot be used as booleans.
```

## Algebraic laws

`symbits` respects the standard bitwise laws listed below even when operands contain unknown bits, which makes it useful for verifying optimizations:

```python
a = B8("1100xxxx")
b = B8("1010xx00")

# De Morgan's laws
~(a & b) == (~a | ~b)            # True
~(a | b) == (~a & ~b)            # True

# Idempotence
(a & a) == a                     # True
(a | a) == a                     # True

# Identity / annihilator
(a & B8(0xFF)) == a              # True (AND identity)
(a & B8(0x00)) == B8(0)          # True (AND annihilator)
(a | B8(0x00)) == a              # True (OR identity)
(a | B8(0xFF)) == B8(0xFF)       # True (OR annihilator)
(a ^ B8(0x00)) == a              # True (XOR identity)

# Double NOT
~~a == a                         # True
```
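
Because each bit has only three states, laws like these can be checked exhaustively per bit. The sketch below models single bits as the characters `0`, `1`, and `x`; it is independent of `symbits`, and the helper names are made up for the example.

```python
from itertools import product


def and1(p, q):
    if p == "0" or q == "0":
        return "0"                       # known-0 dominates
    return "1" if p == q == "1" else "x"


def or1(p, q):
    if p == "1" or q == "1":
        return "1"                       # known-1 dominates
    return "0" if p == q == "0" else "x"


def not1(p):
    return {"0": "1", "1": "0", "x": "x"}[p]


# Check De Morgan's laws and idempotence for all 9 input combinations.
for p, q in product("01x", repeat=2):
    assert not1(and1(p, q)) == or1(not1(p), not1(q))
    assert not1(or1(p, q)) == and1(not1(p), not1(q))
    assert and1(p, p) == p and or1(p, p) == p

print("De Morgan and idempotence hold for all three-valued inputs")
```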

## Practical example: GPU address calculation

Tracing a 64-bit flat address computed from a base pointer and a scaled offset, where the offset comes from an unknown VGPR:

```python
from symbits import B32, B64, B

# Base address: known (from a constant buffer load)
base_lo = B32(0x12340000)
base_hi = B32(0x00007FFF)
base = base_hi.concat(base_lo)

# Offset: VGPR value, completely unknown
offset = B.unknown(32)

# Scale offset by 4 (left shift 2)
scaled = offset.zext(64) << 2
print(scaled)
# 0x0000000?XXXXXXX?  (...)  [32/64 known]
# Bits 1-0 are known-0 from the shift; bits 33-2 are unknown

# Add would go here (symbits doesn't have add yet — just bitwise ops)
# But we can still reason about alignment:
print(f"Low 2 bits of scaled offset: {scaled.bit(1)}, {scaled.bit(0)}")
# Low 2 bits of scaled offset: 0, 0
# The shifted offset is guaranteed 4-byte aligned
```

## API reference

### Constructors

| Constructor | Description |
|-------------|-------------|
| `B(value, width=32)` | From int, binary string, or hex string; width defaults to 32 for ints and is inferred from the string length unless given |
| `B8(value)` | 8-bit shorthand |
| `B16(value)` | 16-bit shorthand |
| `B32(value)` | 32-bit shorthand |
| `B64(value)` | 64-bit shorthand |
| `B.unknown(width)` | All bits unknown |

### Operators

| Operator | Description |
|----------|-------------|
| `a & b` | Bitwise AND |
| `a \| b` | Bitwise OR |
| `a ^ b` | Bitwise XOR |
| `~a` | Bitwise NOT |
| `a << n` | Left shift (n must be int) |
| `a >> n` | Logical right shift |
| `a.ashr(n)` | Arithmetic right shift |
| `a == b` | Structural equality (returns bool) |
| `a != b` | Structural inequality |

### Width manipulation

| Method | Description |
|--------|-------------|
| `a.zext(width)` | Zero-extend to wider width |
| `a.sext(width)` | Sign-extend to wider width |
| `a.truncate(width)` | Keep only low bits |
| `a.concat(b, ...)` | Concatenate (self=high, args=low) |
| `B.concat(a, b, ...)` | Static concatenation |

### Extraction

| Method | Description |
|--------|-------------|
| `a.bit(n)` | Single bit: `'0'`, `'1'`, or `'x'` |
| `a.bits(hi, lo)` | Bit range as new B |
| `a[hi:lo]` | Slice (MSB:LSB convention) |
| `a[n]` | Single bit as width-1 B |

### Inspection

| Method | Description |
|--------|-------------|
| `a.known_bits()` | Count of known bits |
| `a.unknown_bits()` | Count of unknown bits |
| `a.is_fully_known()` | True if no unknowns |
| `a.is_fully_unknown()` | True if all unknown |
| `a.as_int()` | Int value (raises if unknowns) |
| `a.could_equal(val)` | Compatibility check |
| `a.known_ones_mask()` | Bitmask of known-1 bits |
| `a.known_zeros_mask()` | Bitmask of known-0 bits |
| `a.unknown_mask()` | Bitmask of unknown bits |
| `a.is_refinement_of(b)` | True if a knows everything b knows |

### Display

| Method | Description |
|--------|-------------|
| `repr(a)` | Hex + binary + known count |
| `a.bin()` | Binary string (no grouping) |
| `a.hex()` | Hex string with X/? |

## License

MIT
