This is quite a mouthful: [`Operand`] can represent either data stored somewhere in the [interpreter memory](#memory) (`Operand::Indirect`), or (as an optimization) immediate data stored in-line.
60
+
And [`Immediate`] can either be a single (potentially uninitialized) [scalar value][`Scalar`] (integer or thin pointer), or a pair of two of them.
61
+
In our case, the single scalar value is *not* (yet) initialized.
62
+
63
+
When the initialization of `_1` is invoked, the
59
64
value of the `FOO` constant is required, and triggers another call to
60
65
`tcx.const_eval`, which will not be shown here. If the evaluation of FOO is
61
-
successful, 42 will be subtracted by its value `4096` and the result stored in
62
-
`_1` as `ConstValue::ScalarPair(Scalar::Bytes(4054), Scalar::Bytes(0))`. The first
66
+
successful, `42` will be subtracted from its value `4096` and the result stored in
67
+
`_1` as `Operand::Immediate(Immediate::ScalarPair(Scalar::Raw { data: 4054, .. }, Scalar::Raw { data: 0, .. }))`. The first
63
68
part of the pair is the computed value, the second part is a bool that's true if
64
-
an overflow happened.
69
+
an overflow happened. A `Scalar::Raw` also stores the size (in bytes) of this scalar value; we are eliding that here.
65
70
66
71
The next statement asserts that said boolean is `0`. In case the assertion
67
72
fails, its error message is used for reporting a compile-time error.
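
The checked subtraction and the assertion that follows it can be imitated in ordinary Rust. This is only a sketch of the idea, not interpreter code: it assumes `FOO` has type `usize`, and the function names `checked_sub_pair` and `assert_no_overflow` as well as the error string are made up for illustration.

```rust
/// Models the checked subtraction emitted for `FOO - 42`:
/// returns the (wrapped) result together with an overflow flag,
/// mirroring the scalar pair described above.
fn checked_sub_pair(lhs: usize, rhs: usize) -> (usize, bool) {
    lhs.overflowing_sub(rhs)
}

/// Models the assertion on the second component of the pair:
/// if the flag is set, evaluation stops with an error message;
/// otherwise the computed value is passed on.
fn assert_no_overflow(pair: (usize, bool)) -> Result<usize, &'static str> {
    match pair {
        (value, false) => Ok(value),
        // Hypothetical message; the real one comes from the MIR assertion.
        (_, true) => Err("attempt to subtract with overflow"),
    }
}
```

With these definitions, `checked_sub_pair(4096, 42)` yields the pair `(4054, false)`, and `assert_no_overflow` turns it into `Ok(4054)`.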
68
73
69
-
Since it does not fail, `ConstValue::Scalar(Scalar::Bytes(4054))` is stored in the
74
+
Since it does not fail, `Operand::Immediate(Immediate::Scalar(Scalar::Raw { data: 4054, .. }))` is stored in the
70
75
virtual memory that was allocated before the evaluation. `_0` always refers to that
71
76
location directly.
72
77
73
-
After the evaluation is done, the virtual memory allocation is interned into the
74
-
`TyCtxt`. Future evaluations of the same constants will not actually invoke
75
-
miri, but just extract the value from the interned allocation.
76
-
77
-
The `tcx.const_eval` function has one additional feature: it will not return a
78
-
`ByRef(interned_allocation_id)`, but a `Scalar(computed_value)` if possible. This
79
-
makes using the result much more convenient, as no further queries need to be
78
+
After the evaluation is done, the return value is converted from [`Operand`] to [`ConstValue`] by [`op_to_const`]:
79
+
the former representation is geared towards what is needed *during* const evaluation, while [`ConstValue`]
80
+
is shaped by the needs of the remaining parts of the compiler that consume the results of const evaluation.
81
+
As part of this conversion, for types with scalar values, even if
82
+
the resulting [`Operand`] is `Indirect`, it will return an immediate `ConstValue::Scalar(computed_value)` (instead of the usual `ConstValue::ByRef`).
83
+
This makes using the result much more efficient and also more convenient, as no further queries need to be
80
84
executed in order to get at something as simple as a `usize`.
81
85
86
+
Future evaluations of the same constants will not actually invoke
This is mainly the error enum and the `ConstValue` and `Scalar` types. A `ConstValue` can
87
-
be either `Scalar` (a single `Scalar`), `ScalarPair` (two `Scalar`s, usually fat
88
-
pointers or two element tuples) or `ByRef`, which is used for anything else and
99
+
This is mainly the error enum and the [`ConstValue`] and [`Scalar`] types. A `ConstValue` can
100
+
be either `Scalar` (a single `Scalar`, i.e., integer or thin pointer),
101
+
`Slice` (to represent byte slices and strings, as needed for pattern matching) or `ByRef`, which is used for anything else and
89
102
refers to a virtual allocation. These allocations can be accessed via the
90
103
methods on `tcx.interpret_interner`.
104
+
A `Scalar` is either some `Raw` integer or a pointer; see [the next section](#memory) for more on that.
91
105
92
-
If you are expecting a numeric result, you can use `unwrap_usize` (panics on
93
-
anything that can't be representad as a `u64`) or `assert_usize` which results
94
-
in an `Option<u128>` yielding the `Scalar` if possible.
106
+
If you are expecting a numeric result, you can use `eval_usize` (panics on
107
+
anything that can't be represented as a `u64`) or `try_eval_usize` which results
108
+
in an `Option<u64>` yielding the `Scalar` if possible.
95
109
96
-
## Allocations
110
+
## Memory
97
111
98
-
A miri allocation is either a byte sequence of the memory or an `Instance` in
99
-
the case of function pointers. Byte sequences can additionally contain
100
-
relocations that mark a group of bytes as a pointer to another allocation. The
101
-
actual bytes at the relocation refer to the offset inside the other allocation.
112
+
To support any kind of pointers, Miri needs to have a "virtual memory" that the pointers can point to.
113
+
This is implemented in the [`Memory`] type.
114
+
In the simplest model, every global variable, stack variable and every dynamic allocation corresponds to an [`Allocation`] in that memory.
115
+
(Actually using an allocation for every MIR stack variable would be very inefficient; that's why we have `Operand::Immediate` for stack variables that are both small and never have their address taken.
116
+
But that is purely an optimization.)
117
+
118
+
Such an `Allocation` is basically just a sequence of `u8` storing the value of each byte in this allocation.
119
+
(Plus some extra data, see below.)
120
+
Every `Allocation` has a globally unique `AllocId` assigned in `Memory`.
121
+
With that, a [`Pointer`] consists of a pair of an `AllocId` (indicating the allocation) and an offset into the allocation (indicating which byte of the allocation the pointer points to).
122
+
It may seem odd that a `Pointer` is not just an integer address, but remember that during const evaluation, we cannot know at which actual integer address the allocation will end up -- so we use `AllocId` as symbolic base addresses, which means we need a separate offset.
123
+
(As an aside, it turns out that pointers at run-time are [more than just integers, too](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#pointer-provenance).)
102
124
103
125
These allocations exist so that references and raw pointers have something to
104
126
point to. There is no global linear heap in which things are allocated, but each
105
127
allocation (be it for a local variable, a static or a (future) heap allocation)
106
128
gets its own little memory with exactly the required size. So if you have a
107
129
pointer to an allocation for a local variable `a`, there is no possible (no
108
130
matter how unsafe) operation that you can do that would ever change said pointer
109
-
to a pointer to `b`.
131
+
to a pointer to a different local variable `b`.
132
+
Pointer arithmetic on `a` will only ever change its offset; the `AllocId` stays the same.
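
A heavily simplified model may help to picture this. The types below reuse the names `Memory`, `Allocation`, `AllocId` and `Pointer` from the text, but their fields and the methods `allocate`, `read_byte` and `offset_by` are assumptions made for this sketch and do not match the real definitions.

```rust
use std::collections::HashMap;

/// Symbolic base address: identifies one allocation.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct AllocId(u64);

/// A pointer is an allocation plus a byte offset into it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Pointer {
    alloc_id: AllocId,
    offset: u64,
}

/// Simplified allocation: just the bytes (extra data omitted here).
struct Allocation {
    bytes: Vec<u8>,
}

/// Simplified virtual memory: a map from ids to separate allocations.
struct Memory {
    allocs: HashMap<AllocId, Allocation>,
    next_id: u64,
}

impl Memory {
    fn new() -> Self {
        Memory { allocs: HashMap::new(), next_id: 0 }
    }

    /// Every allocation gets a fresh id and exactly the requested size.
    fn allocate(&mut self, size: usize) -> Pointer {
        let id = AllocId(self.next_id);
        self.next_id += 1;
        self.allocs.insert(id, Allocation { bytes: vec![0; size] });
        Pointer { alloc_id: id, offset: 0 }
    }

    /// Reads are bounds-checked against the pointer's own allocation only.
    fn read_byte(&self, ptr: Pointer) -> Option<u8> {
        let alloc = self.allocs.get(&ptr.alloc_id)?;
        alloc.bytes.get(ptr.offset as usize).copied()
    }
}

impl Pointer {
    /// Pointer arithmetic changes the offset, never the `AllocId`.
    fn offset_by(self, delta: u64) -> Pointer {
        Pointer { alloc_id: self.alloc_id, offset: self.offset + delta }
    }
}
```

In this model, a pointer obtained from one `allocate` call can be offset arbitrarily far, yet `read_byte` either returns a byte of the same allocation or `None`; it can never reach the bytes of another allocation.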
133
+
134
+
This, however, causes a problem when we want to store a `Pointer` into an `Allocation`: we cannot turn it into a sequence of `u8` of the right length!
135
+
`AllocId` and offset together are twice as big as a pointer "seems" to be.
136
+
This is what the `relocation` field of `Allocation` is for: the byte offset of the `Pointer` gets stored as a bunch of `u8`, while its `AllocId` gets stored out-of-band.
137
+
The two are reassembled when the `Pointer` is read from memory.
138
+
The other bit of extra data an `Allocation` needs is `undef_mask` for keeping track of which of its bytes are initialized.
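
The following sketch illustrates how the offset bytes, the out-of-band `AllocId` and the initialization tracking could fit together. It is a toy model under stated assumptions: pointers are 8 bytes wide and stored little-endian, the relocations are kept in a `BTreeMap`, the initialization mask is a plain `Vec<bool>` named `init_mask`, and the methods `write_ptr` and `read_ptr` are invented for this example.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct AllocId(u64);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Pointer {
    alloc_id: AllocId,
    offset: u64,
}

/// Assumed pointer width for this sketch.
const PTR_SIZE: usize = 8;

struct Allocation {
    /// The raw bytes; for a stored pointer these hold only its offset.
    bytes: Vec<u8>,
    /// Out-of-band data: position in `bytes` -> allocation pointed to.
    relocations: BTreeMap<usize, AllocId>,
    /// One flag per byte: has this byte been initialized?
    init_mask: Vec<bool>,
}

impl Allocation {
    fn new(size: usize) -> Self {
        Allocation {
            bytes: vec![0; size],
            relocations: BTreeMap::new(),
            init_mask: vec![false; size],
        }
    }

    /// Stores the offset as bytes and records the `AllocId` separately.
    fn write_ptr(&mut self, at: usize, ptr: Pointer) {
        self.bytes[at..at + PTR_SIZE].copy_from_slice(&ptr.offset.to_le_bytes());
        self.relocations.insert(at, ptr.alloc_id);
        for flag in &mut self.init_mask[at..at + PTR_SIZE] {
            *flag = true;
        }
    }

    /// Reassembles a pointer from the bytes and the relocation entry.
    /// Returns `None` for uninitialized bytes or a missing relocation.
    fn read_ptr(&self, at: usize) -> Option<Pointer> {
        if !self.init_mask.get(at..at + PTR_SIZE)?.iter().all(|&init| init) {
            return None;
        }
        let alloc_id = *self.relocations.get(&at)?;
        let mut raw = [0u8; PTR_SIZE];
        raw.copy_from_slice(&self.bytes[at..at + PTR_SIZE]);
        Some(Pointer { alloc_id, offset: u64::from_le_bytes(raw) })
    }
}
```

Writing a `Pointer` at some position and reading it back from the same position returns the original pointer, while reading from a position that was never written returns `None` because its bytes are still marked uninitialized.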
139
+
140
+
### Global memory and exotic allocations
141
+
142
+
`Memory` exists only during the Miri evaluation; it gets destroyed when the final value of the constant is computed.
143
+
In case that constant contains any pointers, those get "interned" and moved to a global "const eval memory" that is part of `TyCtxt`.
144
+
These allocations stay around for the remaining computation and get serialized into the final output (so that dependent crates can use them).
145
+
146
+
Moreover, to also support function pointers, the global memory in `TyCtxt` can also contain "virtual allocations": instead of an `Allocation`, these contain an `Instance`.
147
+
That allows a `Pointer` to point to either normal data or a function, which is needed to be able to evaluate casts from function pointers to raw pointers.
148
+
149
+
Finally, the [`GlobalAlloc`] type used in the global memory also contains a variant `Static` that points to a particular `const` or `static` item.
150
+
This is needed to support circular statics, where we need to have a `Pointer` to a `static` for which we cannot yet have an `Allocation` as we do not know the bytes of its value.
One common cause of confusion in Miri is that being a pointer *value* and having a pointer *type* are entirely independent properties.
160
+
By "pointer value", we refer to a `Scalar::Ptr` containing a `Pointer` and thus pointing somewhere into Miri's virtual memory.
161
+
This is in contrast to `Scalar::Raw`, which is just some concrete integer.
162
+
163
+
However, a variable of pointer or reference *type*, such as `*const T` or `&T`, does not have to have a pointer *value*:
164
+
it could be obtained by casting or transmuting an integer to a pointer (currently that is hard to do in const eval, but eventually `transmute` will be stable as a `const fn`).
165
+
And similarly, when casting or transmuting a reference to some actual allocation to an integer, we end up with a pointer *value* (`Scalar::Ptr`) at integer *type* (`usize`).
166
+
This is a problem because we cannot meaningfully perform integer operations such as division on pointer values.
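
To make the problem concrete, here is a toy version of the distinction. The enum only borrows the variant names `Raw` and `Ptr` from the text; its fields are simplified (the size of a `Raw` scalar is omitted), and `checked_div` with its error strings is a hypothetical helper, not an interpreter function.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Pointer {
    alloc_id: u64,
    offset: u64,
}

/// Simplified scalar: either a concrete integer or a pointer value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Scalar {
    Raw { data: u128 },
    Ptr(Pointer),
}

/// Integer division is only defined when both operands are concrete
/// integers; a pointer value has no integer address to divide.
fn checked_div(lhs: Scalar, rhs: Scalar) -> Result<Scalar, &'static str> {
    match (lhs, rhs) {
        (Scalar::Raw { .. }, Scalar::Raw { data: 0 }) => Err("division by zero"),
        (Scalar::Raw { data: a }, Scalar::Raw { data: b }) => {
            Ok(Scalar::Raw { data: a / b })
        }
        // At least one operand is a `Scalar::Ptr`, whatever its static type.
        _ => Err("cannot divide a pointer value"),
    }
}
```

The static type plays no role in this check: a `usize` that holds a `Scalar::Ptr` is rejected, while a `*const T` that holds a `Scalar::Raw` would be accepted.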