Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion kmir/src/kmir/kdist/mir-semantics/kmir.md
Original file line number Diff line number Diff line change
Expand Up @@ -565,7 +565,8 @@ Therefore a heuristics is used here:
_SPAN
)
=>
#setTupleArgs(2, getValue(LOCALS, TUPLE)) ~> #execBlock(FIRST)
#setLocalValue(place(local(1), .ProjectionElems), #incrementRef(getValue(LOCALS, CLOSURE)))
~> #setTupleArgs(2, getValue(LOCALS, TUPLE)) ~> #execBlock(FIRST)
// arguments are tuple components, stored as _2 .. _n
...
</k>
Expand Down Expand Up @@ -609,6 +610,11 @@ Therefore a heuristics is used here:
// unpack tuple and set arguments individually
rule <k> #setTupleArgs(IDX, Aggregate(variantIdx(0), ARGS)) => #setTupleArgs(IDX, ARGS) ... </k>

rule <k> #setTupleArgs(IDX, VAL:Value)
=> #setTupleArgs(IDX, ListItem(VAL))
...
</k> [owise]

rule <k> #setTupleArgs(_, .List ) => .K ... </k>

rule <k> #setTupleArgs(IDX, ListItem(VAL) REST:List)
Expand Down
20 changes: 20 additions & 0 deletions kmir/src/kmir/kdist/mir-semantics/rt/data.md
Original file line number Diff line number Diff line change
Expand Up @@ -1051,6 +1051,12 @@ Literal arrays are also built using this RValue.
...
</k>

rule <k> ARGS:List ~> #mkAggregate(aggregateKindClosure(_DEF, _TY_ARGS))
=>
Aggregate(variantIdx(0), ARGS)
...
</k>


// #readOperands accumulates a list of `TypedLocal` values from operands
syntax KItem ::= #readOperands ( Operands )
Expand Down Expand Up @@ -1434,6 +1440,20 @@ which have the same representation `Value::Range`.
Also, casts to and from _transparent wrappers_ (newtypes that just forward field `0`, i.e. `struct Wrapper<T>(T)`)
are allowed, and supported by a special projection `WrapStruct`.

When the source and target types are pointer types with the same pointee type (i.e., differing only in mutability),
the cast preserves the source pointer and its metadata unchanged.

```k
rule <k> #cast(PtrLocal(OFFSET, PLACE, MUT, META), castKindPtrToPtr, TY_SOURCE, TY_TARGET)
=> PtrLocal(OFFSET, PLACE, MUT, META)
...
</k>
requires pointeeTy(lookupTy(TY_SOURCE)) ==K pointeeTy(lookupTy(TY_TARGET))
[priority(45), preserves-definedness] // valid map lookups checked
```

Otherwise, compute the type projection and convert metadata accordingly.

```k
rule <k> #cast(PtrLocal(OFFSET, place(LOCAL, PROJS), MUT, META), castKindPtrToPtr, TY_SOURCE, TY_TARGET)
=>
Expand Down
84 changes: 60 additions & 24 deletions kmir/src/kmir/kdist/mir-semantics/rt/decoding.md
Original file line number Diff line number Diff line change
Expand Up @@ -297,27 +297,26 @@ If there are no fields, the enum can be decoded by using their data as the discr
requires TAG =/=Int DISCRIMINANT
```

#### Enums with two variants
#### Enums with direct tag encoding and fields

Having two variants with possibly zero or one field each is a very common case,
it includes a number of standard library types such as `Option` and `Result`.

The following rules are specialised to the case of encoding an `Option`.
An important distinction here is whether or not the tag is niche-encoded.
If the option holds data that has all-zero as a possible value, a separate tag is used, usually as the first field.
In both cases we expect the tag to be in the single shared field, and the discriminant to be just 0 and 1.
Decodes any enum with `tagEncodingDirect` and a `primitiveInt` tag,
with arbitrary variant counts, discriminant values, and field counts per variant.
The tag is read as an unsigned little-endian integer, matched against the discriminant
list to find the variant index, and then the variant's fields are decoded using the
per-variant layout offsets.

```k
// General entry rule: direct-tag enum with at least one field somewhere.
rule #decodeValue(
BYTES
, typeInfoEnumType(...
name: _
, adtDef: _
, discriminants: discriminant(0) discriminant(1) .Discriminants
, fields: (.Tys : (FIELD_TYPE .Tys) : .Tyss)
, discriminants: DISCRIMINANTS
, fields: FIELD_TYPESS
, layout:
someLayoutShape(layoutShape(...
fields: fieldsShapeArbitrary(mk(... offsets: machineSize(0) .MachineSizes))
fields: _FIELDS
, variants:
variantsShapeMultiple(
mk(...
Expand All @@ -329,7 +328,7 @@ In both cases we expect the tag to be in the single shared field, and the discri
)
, tagEncoding: tagEncodingDirect
, tagField: 0
, variants: _VARIANTS
, variants: VARIANT_LAYOUTS
)
)
, abi: _ABI
Expand All @@ -338,22 +337,59 @@ In both cases we expect the tag to be in the single shared field, and the discri
))
) #as ENUM_TYPE
)
=> #decodeOptionTag01(BYTES, TAG_WIDTH, FIELD_TYPE, ENUM_TYPE)

syntax Evaluation ::= #decodeOptionTag01 ( Bytes , IntegerLength , Ty , TypeInfo ) [function, total]
// --------------------------------------------------------------------------------------
rule #decodeOptionTag01(BYTES, _LEN, _TY, _ENUM_TYPE)
=> Aggregate(variantIdx(0), .List)
requires 0 ==Int BYTES[0] // expect only 0 or 1 as tags, so higher bytes do not matter
[preserves-definedness]
rule #decodeOptionTag01(BYTES, LEN, TY, _ENUM_TYPE)
=> Aggregate(variantIdx(1), ListItem(#decodeValue(substrBytes(BYTES, #byteLength(LEN), lengthBytes(BYTES)), lookupTy(TY))))
requires 1 ==Int BYTES[0] // expect only 0 or 1 as tags, so higher bytes do not matter
=> #decodeEnumDirectFields(
BYTES,
#findVariantIdx(#decodeEnumDirectTag(BYTES, TAG_WIDTH), DISCRIMINANTS),
FIELD_TYPESS,
VARIANT_LAYOUTS,
ENUM_TYPE
)
requires notBool #noFields(FIELD_TYPESS)

// ---------------------------------------------------------------------------
// #decodeEnumDirectFields: given the variant index, decode its fields
// ---------------------------------------------------------------------------
syntax Evaluation ::= #decodeEnumDirectFields ( Bytes , VariantIdx , Tyss , LayoutShapes , TypeInfo ) [function, total]
// --------------------------------------------------------------------------------------------------------------------------
rule #decodeEnumDirectFields(BYTES, variantIdx(IDX), FIELD_TYPESS, VARIANT_LAYOUTS, _ENUM_TYPE)
=> Aggregate(
variantIdx(IDX),
#decodeFieldsWithOffsets(BYTES, #nthTys(FIELD_TYPESS, IDX), #nthVariantOffsets(VARIANT_LAYOUTS, IDX))
)
requires 0 <=Int IDX
[preserves-definedness]
rule #decodeOptionTag01(BYTES, _LEN, _TY, ENUM_TYPE)

// Error cases: variant not found or other failure
rule #decodeEnumDirectFields(BYTES, _, _FIELD_TYPESS, _VARIANT_LAYOUTS, ENUM_TYPE)
=> UnableToDecode(BYTES, ENUM_TYPE)
[owise]

// ---------------------------------------------------------------------------
// #decodeEnumDirectTag: read the tag bytes as an unsigned little-endian int
// ---------------------------------------------------------------------------
syntax Int ::= #decodeEnumDirectTag ( Bytes , IntegerLength ) [function, total]
rule #decodeEnumDirectTag(BYTES, TAG_WIDTH)
=> Bytes2Int(substrBytes(BYTES, 0, #byteLength(TAG_WIDTH)), LE, Unsigned)
requires lengthBytes(BYTES) >=Int #byteLength(TAG_WIDTH)
[preserves-definedness]
rule #decodeEnumDirectTag(_, _) => -1 [owise]

// ---------------------------------------------------------------------------
// List-indexing helpers
// ---------------------------------------------------------------------------

// Index into a Tyss (list of per-variant field type lists)
syntax Tys ::= #nthTys ( Tyss , Int ) [function, total]
rule #nthTys(TYS : _REST, 0) => TYS
rule #nthTys(_TYS : REST, N) => #nthTys(REST, N -Int 1) requires N >Int 0
rule #nthTys(_, _) => .Tys [owise]

// Index into variant layouts and extract field offsets in one step (total)
syntax MachineSizes ::= #nthVariantOffsets ( LayoutShapes , Int ) [function, total]
rule #nthVariantOffsets(layoutShape(fieldsShapeArbitrary(mk(OFFSETS)), _, _, _, _) _REST, 0) => OFFSETS
rule #nthVariantOffsets(_L REST, N) => #nthVariantOffsets(REST, N -Int 1) requires N >Int 0
rule #nthVariantOffsets(_, _) => .MachineSizes [owise]

syntax Int ::= #byteLength ( IntegerLength ) [function, total]
// -----------------------------------------------------------
rule #byteLength(integerLengthI8 ) => 1
Expand Down
16 changes: 14 additions & 2 deletions kmir/src/kmir/kompile.py
Original file line number Diff line number Diff line change
Expand Up @@ -521,13 +521,25 @@ def _functions(kmir: KMIR, smir_info: SMIRInfo) -> dict[int, KInner]:
for ty in smir_info.function_symbols_reverse[item_name]:
functions[ty] = parsed_item_kinner.args[1]

# Add intrinsic functions
# Add intrinsic functions and linked normal symbols that have no local body in `items`.
# Normal symbols must still map to `monoItemFn(..., noBody)` instead of falling back to UNKNOWN FUNCTION.
for ty, sym in smir_info.function_symbols.items():
if 'IntrinsicSym' in sym and ty not in functions:
if ty in functions:
continue
if 'IntrinsicSym' in sym:
functions[ty] = KApply(
'IntrinsicFunction',
[KApply('symbol(_)_LIB_Symbol_String', [stringToken(sym['IntrinsicSym'])])],
)
elif isinstance(sym.get('NormalSym'), str):
functions[ty] = KApply(
'MonoItemKind::MonoItemFn',
(
KApply('symbol(_)_LIB_Symbol_String', (stringToken(sym['NormalSym']),)),
KApply('defId(_)_BODY_DefId_Int', (intToken(ty),)),
KApply('noBody_BODY_MaybeBody', ()),
),
)

return functions

Expand Down
18 changes: 16 additions & 2 deletions kmir/src/kmir/testing/fixtures.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
from __future__ import annotations

import re
import sys
from difflib import unified_diff
from typing import TYPE_CHECKING

Expand All @@ -13,24 +15,36 @@

from pytest import FixtureRequest, Parser

import sys


def pytest_configure(config) -> None:
sys.setrecursionlimit(1000000)


def _normalize_symbol_hashes(text: str) -> str:
"""Normalize rustc symbol hash suffixes that drift across builds/environments."""
# Normalize mangled symbol hashes, including generic names with `$` and `.`.
# Keep trailing `E` when present; truncated variants may omit it.
text = re.sub(r'(_ZN[0-9A-Za-z_$.]+17h)[0-9a-fA-F]+E', r'\1<hash>E', text)
text = re.sub(r'(_ZN[0-9A-Za-z_$.]+17h)[0-9a-fA-F]+', r'\1<hash>', text)
# Normalize demangled hash suffixes (`...::h<hex>`).
text = re.sub(r'(::h)[0-9a-fA-F]{8,}', r'\1<hash>', text)
return text


def assert_or_update_show_output(
actual_text: str, expected_file: Path, *, update: bool, path_replacements: dict[str, str] | None = None
) -> None:
if path_replacements:
for old, new in path_replacements.items():
actual_text = actual_text.replace(old, new)
# Normalize rustc symbol hash suffixes that can drift across builds/environments.
actual_text = _normalize_symbol_hashes(actual_text)
if update:
expected_file.write_text(actual_text)
else:
assert expected_file.is_file()
expected_text = expected_file.read_text()
expected_text = _normalize_symbol_hashes(expected_text)
if actual_text != expected_text:
diff = '\n'.join(
unified_diff(
Expand Down
12 changes: 5 additions & 7 deletions kmir/src/kmir/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -288,8 +288,8 @@ def _extract_alloc_id(operands: KInner) -> int | None:
return None


def _annotate_unknown_function(k_cell: KInner, smir_info: SMIRInfo) -> list[str]:
"""If the k cell is `#setUpCalleeData` for `** UNKNOWN FUNCTION **`, return annotation lines with decoded info."""
def _annotate_nobody_function(k_cell: KInner, smir_info: SMIRInfo) -> list[str]:
"""If the k cell is `#setUpCalleeData` for a `noBody` callee, return annotation lines with decoded info."""

from .alloc import Allocation, AllocId, AllocInfo, Memory
from .linker import _demangle
Expand All @@ -307,16 +307,14 @@ def _annotate_unknown_function(k_cell: KInner, smir_info: SMIRInfo) -> list[str]
KApply(
label=KLabel(name='MonoItemKind::MonoItemFn'),
args=[
KApply(args=[KToken(token=symbol_name)]),
KApply(args=[KToken()]),
KApply(label=KLabel(name='defId(_)_BODY_DefId_Int'), args=[KToken(token=def_id_str)]),
_,
],
),
operands,
KApply(label=KLabel(name='span'), args=[KToken(token=span_str)]),
] if (
symbol_name == '\"** UNKNOWN FUNCTION **\"'
):
]:
def_id = int(def_id_str)
span = int(span_str)
case _:
Expand Down Expand Up @@ -379,7 +377,7 @@ def render_leaf_k_cells(proof: APRProof, cterm_show: CTermShow, smir_info: SMIRI
lines.extend(f' {k_line}' for k_line in k_lines)

if smir_info is not None and k_cell is not None:
annotations = _annotate_unknown_function(k_cell, smir_info)
annotations = _annotate_nobody_function(k_cell, smir_info)
lines.extend(annotations)

if idx != len(leaves) - 1:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@
┃ │
┃ │ (6 steps)
┃ └─ 6 (stuck, leaf)
┃ #setUpCalleeData ( monoItemFn ( ... name: symbol ( "** UNKNOWN FUNCTION **" ) ,
┃ #setUpCalleeData ( monoItemFn ( ... name: symbol ( "_ZN3std7process4exit17h<hash>
┗━━┓ subst: .Subst
┃ constraint:
Expand Down
Original file line number Diff line number Diff line change
@@ -1,69 +1,69 @@
{
"-6": {
"NormalSym": "_ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17hb524060ed0e83bbbE"
"NormalSym": "_ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17h<hash>E"
},
"-5": {
"NormalSym": "_ZN4core3ptr28drop_in_place$LT$$RF$u32$GT$17hb92e31f0aaa3d322E"
"NormalSym": "_ZN4core3ptr28drop_in_place$LT$$RF$u32$GT$17h<hash>E"
},
"-4": {
"NormalSym": "_ZN4core3ops8function6FnOnce40call_once$u7b$$u7b$vtable.shim$u7d$$u7d$17h526a5c5d4d9d3202E"
"NormalSym": "_ZN4core3ops8function6FnOnce40call_once$u7b$$u7b$vtable.shim$u7d$$u7d$17h<hash>E"
},
"-3": {
"NormalSym": "_ZN42_$LT$$RF$T$u20$as$u20$core..fmt..Debug$GT$3fmt17h3a8781eea2f9ddb7E"
"NormalSym": "_ZN42_$LT$$RF$T$u20$as$u20$core..fmt..Debug$GT$3fmt17h<hash>E"
},
"-2": {
"NormalSym": "_ZN3std2rt10lang_start17h17250425291430deE"
"NormalSym": "_ZN3std2rt10lang_start17h<hash>E"
},
"-1": {
"NormalSym": "_ZN8blackbox4main17h56268fefa1135d9eE"
"NormalSym": "_ZN8blackbox4main17h<hash>E"
},
"0": {
"NormalSym": "_ZN3std2rt19lang_start_internal17h018b8394ba015d86E"
"NormalSym": "_ZN3std2rt19lang_start_internal17h<hash>E"
},
"13": {
"NormalSym": "_ZN3std3sys9backtrace28__rust_begin_short_backtrace17h3ac302c9481e885fE"
"NormalSym": "_ZN3std3sys9backtrace28__rust_begin_short_backtrace17h<hash>E"
},
"14": {
"NormalSym": "_ZN54_$LT$$LP$$RP$$u20$as$u20$std..process..Termination$GT$6report17h8eaae5c69aab66e9E"
"NormalSym": "_ZN54_$LT$$LP$$RP$$u20$as$u20$std..process..Termination$GT$6report17h<hash>E"
},
"19": {
"NormalSym": "_ZN4core3ops8function6FnOnce9call_once17haae505bd39892fd2E"
"NormalSym": "_ZN4core3ops8function6FnOnce9call_once17h<hash>E"
},
"20": {
"IntrinsicSym": "black_box"
},
"21": {
"NormalSym": "_ZN4core3fmt3num50_$LT$impl$u20$core..fmt..Debug$u20$for$u20$u32$GT$3fmt17h99c61f0bde9a2afdE"
"NormalSym": "_ZN4core3fmt3num50_$LT$impl$u20$core..fmt..Debug$u20$for$u20$u32$GT$3fmt17h<hash>E"
},
"27": {
"NormalSym": "_ZN4core3fmt3num53_$LT$impl$u20$core..fmt..LowerHex$u20$for$u20$u32$GT$3fmt17hb987357f13dc6cc8E"
"NormalSym": "_ZN4core3fmt3num53_$LT$impl$u20$core..fmt..LowerHex$u20$for$u20$u32$GT$3fmt17h<hash>E"
},
"28": {
"NormalSym": "_ZN4core3fmt3num53_$LT$impl$u20$core..fmt..UpperHex$u20$for$u20$u32$GT$3fmt17h7baa47f3e5cbe44cE"
"NormalSym": "_ZN4core3fmt3num53_$LT$impl$u20$core..fmt..UpperHex$u20$for$u20$u32$GT$3fmt17h<hash>E"
},
"29": {
"NormalSym": "_ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$u32$GT$3fmt17hec74c53b91325b16E"
"NormalSym": "_ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$u32$GT$3fmt17h<hash>E"
},
"30": {
"NormalSym": "_ZN4core3ops8function6FnOnce9call_once17h2d8e0aae13049ae8E"
"NormalSym": "_ZN4core3ops8function6FnOnce9call_once17h<hash>E"
},
"32": {
"NormalSym": "_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17h5c121846c1652782E"
"NormalSym": "_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17h<hash>E"
},
"35": {
"IntrinsicSym": "black_box"
},
"36": {
"NormalSym": "_ZN4core9panicking19assert_failed_inner17h1d286061ca0adfe7E"
"NormalSym": "_ZN4core9panicking19assert_failed_inner17h<hash>E"
},
"43": {
"NormalSym": "_ZN8blackbox7add_one17h19d5f3b41fdf13daE"
"NormalSym": "_ZN8blackbox7add_one17h<hash>E"
},
"44": {
"NormalSym": "_ZN4core4hint9black_box17h0bfc654765aa2ddfE"
"NormalSym": "_ZN4core4hint9black_box17h<hash>E"
},
"45": {
"NormalSym": "_ZN4core9panicking13assert_failed17h9acd0d94a91ca0eaE"
"NormalSym": "_ZN4core9panicking13assert_failed17h<hash>E"
},
"48": {
"NoOpSym": ""
Expand Down
Loading