Wednesday, November 25, 2020

A Comparison of Formal and Simulation for a Simple, Yet Non-Trivial Design - Part 1

I've talked a lot about constrained random verification on the blog, but now it's time to branch out to formal verification. As a fun first post on the topic, I thought we could do a comparative study. We can take a design and write two verification environments for it, one using formal verification and the other using simulation, based on UVM. Once we're done, it should be very interesting to be able to look at them side-by-side and to do an analysis.

I thought long and hard about which one of the verification environments should come first. While it might have made more sense to start with simulation, as this is what is probably most familiar to many of you, excitement got the better of me and I decided to go with formal verification.

Description of the Design Under Test

Let's have a look at the design we'll be verifying. My goal was to keep the design as simple as possible, so that it doesn't require a lot of effort to understand, but for it to have a few interesting quirks that make verifying it (especially using simulation) a bit more complicated.

Our design under test (DUT) will be a device that can process shapes by performing various operations on them. It can process circles, rectangles and triangles. As operations, it can compute the perimeter or the area of the shape. For rectangles, it can also figure out if a given one is a square. We can only perform this query on a rectangle, though, so it won't be possible to ask, for example, whether a circle is a square. It can also figure out whether a triangle is equilateral or whether it is isosceles. As before, these queries can only be performed on triangles.

Our design will have a single special function register (SFR), called CTRL. The CTRL register will be used to configure the type of shape to process and the operation to perform on it. It will have the following layout:

CTRL register layout generated using https://github.com/wavedrom/bitfield

The contents of the SHAPE field are one-hot encoded:

Value Shape
'b001 CIRCLE
'b010 RECTANGLE
'b100 TRIANGLE

The coding of the OPERATION field is a bit more complicated:

Value Operation
'b000_0000 PERIMETER
'b000_0001 AREA
'b010_0000 IS_SQUARE
'b100_0000 IS_EQUILATERAL
'b100_0001 IS_ISOSCELES

The uppermost three bits define which shapes can be combined with the operation. Operations that contain 000 in these bits can be used with any shape. In all other cases, these bits will contain the code for the shape on which the operation can be performed.

Writing a reserved value for any of the fields will cause the write to be ignored, in which case the SFR will keep its value. Writing an illegal combination of SHAPE and OPERATION values will also cause the write to be ignored.

To make things more interesting, the DUT provides a way of changing the operation without changing the shape. This is done by writing the special value 111 to the SHAPE field, together with the new value of OPERATION. We'll call this special value KEEP_SHAPE. Having this mechanism makes it easier for software to only update the OPERATION field, without having to do a read-modify-write of the CTRL register. The resulting value of the SFR after this write operation has to contain a legal combination, otherwise the write will be ignored. For example, it is allowed to write KEEP_SHAPE together with IS_ISOSCELES if CTRL.SHAPE is TRIANGLE, because it would result in (TRIANGLE, IS_ISOCELES), which is a legal combination. A write of KEEP_SHAPE together with IS_SQUARE would be ignored in the previous situation, as this would result in (TRIANGLE, IS_SQUARE), which is an illegal combination.

A similar mechanism exists for the OPERATION field, where the special value 111_1111, called KEEP_OPERATION, behaves like KEEP_SHAPE.

A real design would need some more SFRs to configure the shapes and to read out the results. For simplicity, we'll skip those and focus our verification efforts on the CTRL register.

Our design will have a simple bus interface:

module shape_processor(
    input bit rst_n,
    input bit clk,

    input bit write,
    input bit [31:0] write_data,

    input bit read,
    output bit [31:0] read_data
    );

I didn't include an address signal to keep things simple. All bus operations will always target the CTRL register.

You can find the full code here, including the verification environment, if you want to dig deeper into it or to experiment with it yourself.

Formal Verification

Before we start writing anything, let's stop for a second and think about what we want to achieve. It's true that we could write a few assertions about what the design is supposed whenever this or that is written. This would mean focusing on the implementation.

What is the most important thing that the design requirements try to ensure? All of them are there to make sure that the DUT is always in a legal configuration, where it can properly perform its function. Requirements like "writes with reserved values are ignored" or "writes with illegal combinations are ignored" merely serve to support this core requirement. The cool thing about formal verification is that we can focus on these core requirements.

To check these core requirements, we'll use a gray-box verification approach. We'll look directly at the internal DUT SFR. This way we can start writing properties without having to write any kind of infrastructure for the bus interface.

Of course, we could create a separate model of the DUT SFRs and write our properties using that. We could update it based on the bus operations that execute on the interface, similarly to how we would do it in a UVM-based testbench. This would not only require extra effort to implement the model and the monitoring, but it would increase the state space, making it more difficult for the tool to achieve proofs.

I was previously against any kind of white-box at block level, but I have since softened my stance, including when doing simulation. I now find it to be an acceptable trade-off, as long as it is done responsibly and in moderation. It should also only be done on parts of the design that are architecturally "visible" and that are relatively stable.

When accessing the values of the DUT's internal CTRL register, we should try to protect ourselves from any implementation specific aspects that might change unexpectedly and cause a lot of rework on our side. For example, the designer might choose to define a single 32 bit vector, where the bit positions for the fields might match those in the specification. They might decide to have two separate signals, one for each field. There are many possibilities to choose from.

We can define our "references" to the internal signals using the following let statements:

let shape_in_sfr = shape_processor.ctrl_sfr.shape;
let operation_in_sfr = shape_processor.ctrl_sfr.operation;

Our design currently uses a struct for the CTRL register, but if this were to ever change, we would only need to update these let statements.

To make our code more readable, we should model the encodings for the SFR fields using enums:

typedef enum bit [2:0] {
  CIRCLE = 'b001,
  RECTANGLE = 'b010,
  TRIANGLE = 'b100,
  KEEP_SHAPE = '1
} shape_e;

typedef enum bit [6:0] {
  PERIMETER = 'b000_0000,
  AREA = 'b000_0001,
  IS_SQUARE = 'b010_0000,
  IS_EQUILATERAL = 'b100_0000,
  IS_ISOSCELES = 'b100_0001,
  KEEP_OPERATION = '1
} operation_e;

With the infrastructure laid out, it's time to get to work! The most basic check we can do is to ensure that the values of the CTRL SFR fields will never have any of the reserved values. For the SHAPE field we would have the following assertion:

no_reserved_shapes_in_sfr: assert property (
    !is_reserved_shape(shape_in_sfr));

The is_reserved_shape(...) function will return 1 when it is supplied a value that doesn't match any of the values defined in the shape_e type:

function bit is_reserved_shape(bit [2:0] val);
  return !(val inside { KEEP_SHAPE, CIRCLE, RECTANGLE, TRIANGLE });
endfunction

(Ideally, we would have used a $cast(...) to check whether the bit vector argument can be converted to the desired enum type, but I couldn't get this to work.)

If the DUT would not properly validate the value of the SHAPE field before writing it, then the assertion from above would catch it.

We would also need a similar assertion for the OPERATION field:

no_reserved_operations_in_sfr: assert property (
    !is_reserved_operation(operation_in_sfr));

While KEEP_SHAPE and KEEP_OPERATION are defined write values for the CTRL fields, it is not possible for the SFR to contain any of these values. We can check that the DUT doesn't latch KEEP_SHAPE by mistake using the following assertion:

no_keep_shape_in_sfr: assert property (
    shape_in_sfr != KEEP_SHAPE);

Also, for KEEP_OPERATION we would have:

no_keep_operation_in_sfr: assert property (
    operation_in_sfr != KEEP_OPERATION);

We also want to ensure that we only ever see legal combinations in the CTRL SFR. For this, we need an extra assertion:

only_legal_combinations_in_sfr: assert property (
    is_legal_combination(shape_in_sfr, operation_in_sfr));

The is_legal_combination(...) function models the rules that govern which SHAPES can be combined with which OPERATIONS:

function bit is_legal_combination(shape_e shape, operation_e operation);
  if (operation inside { PERIMETER, AREA })
    return 1;
  if (operation == IS_SQUARE)
    return shape == RECTANGLE;
  if (operation inside { IS_EQUILATERAL, IS_ISOSCELES })
    return shape == TRIANGLE;
  return 0;
endfunction

Ideally such modeling code should not be a one-to-one copy of the RTL code. Where the RTL compares bits, the modeling code uses enum values, which also makes it very readable.

It's not the task of the is_legal_combination(...) function to model the behavior of KEEP_SHAPE or of KEEP_OPERATION. It should never be called with any of these values. The fact that it returns 0 if any of these values is passed as an argument is an implementation artifact. Normally, in simulation, I would trigger a fatal error in such cases, to make it clear to the caller that it's doing something wrong. Sadly, I couldn't get the tool to do this.

The assertions we wrote up to now will prove that our DUT can never be put in a faulty configuration. Unfortunately, they will also hold for a DUT that completely ignores writes (assuming, of course, that it comes out of reset with a legal combination of fields). This is because we haven't made it a requirement for the DUT to update its fields based on the write data presented to it. We only said that should it update its fields, the resulting values must satisfy the assertions we wrote.

A very cheap way of making sure that the DUT updates its fields is to write a cover property that proves that each (proper) value of the SHAPE and OPERATION fields can be stored in the SFR. We'll use the term "proper" to refer to all modes other than the two KEEP_* modes.

We can check that a CIRCLE can be seen in the SHAPE field using the following cover:

circle_in_sfr: cover property (
    shape_in_sfr == CIRCLE);

This won't bring us much if CIRCLE is the reset value of the field. Simply coming out reset will be enough to show us a trace. We should also check that we can see a RECTANGLE or a TRIANGLE:

rectangle_in_sfr: cover property (
    shape_in_sfr == RECTANGLE);

triangle_in_sfr: cover property (
    shape_in_sfr == TRIANGLE);

The DUT will have to be able to update the field, otherwise we won't get traces for all of the properties.

We're assuming that the CTRL register can be reset, otherwise the tool could just show us the DUT starting up with any SHAPE value. In this case the covers would be trivially reachable. The starting value could also be a reserved value, though, which would be caught by our assertions. Indirectly, we are ensuring that the SFR is reset. Moreover, if the SFR would not have a reset signal, then the tool would generally warn us about this.

We should give the OPERATION field the same treatment to make sure that all its proper values are reachable. Writing a cover for each possible operation, like we did for shapes, could get pretty tedious, though. To save typing we could use a generate statement and have the tool do the looping for us:

for (genvar operation = 0; operation < 2**$bits(operation_e); operation++) begin
  if (operation inside { PERIMETER, AREA, IS_SQUARE, IS_EQUILATERAL, IS_ISOSCELES }) begin
    operation_in_sfr_seen: cover property (
        operation_in_sfr == operation);
  end
end

The code suffers from some SystemVerilog limitations. Genvars are integers, so the names of the generated blocks won't be very suggestive because they'll contain the numeric values of the enum literals, not the literals themselves. It's also not possible to loop over enums directly, which makes the code longer and invites duplication. (The SystemVerilog syntax for looping over enums is not that great anyway, but at least it can handle changes in the enum type.)

The assertions that we wrote up to now for the internal SFR values, together with the covers, give us a very solid base. They provide enough guidance to implement the design properly and to avoid many non-trivial bugs. If we had high confidence in our designer and a good testing setup at the integration level we could probably stop here. If we were to go for the gold and verify the design completely we would still need a few more properties.

Since we're doing gray-box verification, we should also check that the SFR contents are delivered on the read data bus when a bus read is performed.

When working with the bus data, it would be really annoying to have to fiddle with bit positions to figure out what shape or operation we are getting as a result of the read. To help us extract the fields, we can define a struct with the same layout as the CTRL register:

typedef struct packed {
  bit [12:0] reserved1;
  bit [2:0] SHAPE;
  bit [8:0] reserved0;
  bit [6:0] OPERATION;
} ctrl_sfr_reg;

We can convert the read data to this type, for easy access to the fields using the . operator:

ctrl_sfr_reg read_data_as_ctrl_sfr;
assign read_data_as_ctrl_sfr = read_data;

To make the code that uses them shorter, we can define some let statements for the read values of the fields, like we did for the internal SFR:

let shape_on_read_bus = read_data_as_ctrl_sfr.SHAPE;
let operation_on_read_bus = read_data_as_ctrl_sfr.OPERATION;

Now we can write an assertion for each of the fields:

shape_delivered_as_read_data: assert property (
    read |-> shape_on_read_bus == shape_in_sfr);

operation_delivered_as_read_data: assert property (
    read |-> operation_on_read_bus == operation_in_sfr);

We haven't said anything about the behavior of the KEEP_SHAPE mode. For this, we'll have to look at the data being written on the bus.

Before we look at the write data, though, we can write another assertion. We can check that the SFR is only ever updated as a result of a bus write:

sfr_constant_if_no_write: assert property (
    !write |=> $stable(shape_processor.ctrl_sfr));

If the DUT were to change the value of the SFR when, for example, no access is taking place, this assertion would fail. This is a very powerful check, which is also very simple to write. It's pretty typical to find bugs where the RTL code doesn't properly sample the bus control signals.

We can do the same trick for the write data that we did with the read data, in order to get the values of the fields:

ctrl_sfr_reg write_data_as_ctrl_sfr;
assign write_data_as_ctrl_sfr = write_data;

let shape_on_write_bus = write_data_as_ctrl_sfr.SHAPE;
let operation_on_write_bus = write_data_as_ctrl_sfr.OPERATION;

Now we can check that when we try to write KEEP_SHAPE, the value of the DUT's SFR doesn't change:

shape_constant_if_keep_shape_write: assert property (
    write && shape_on_write_bus == KEEP_SHAPE |=>
        $stable(shape_in_sfr));

We can do the same for KEEP_OPERATION:

operation_constant_if_keep_operation_write: assert property (
    write && operation_on_write_bus == KEEP_OPERATION  |=>
        $stable(operation_in_sfr));

The previous assertions don't say anything about what should happen with the other field whenever a KEEP_* is written. For example, a DUT that completely ignores a write that contains KEEP_SHAPE and keeps the value of OPERATION as well will also satisfy the assertions. We can write a cover property to see that the DUT can update the value of OPERATION in such cases:

operation_updated_when_keep_shape: cover property (
    write && shape_on_write_bus == KEEP_SHAPE ##1 $changed(operation_in_sfr));

This cover will fail if the DUT completely ignores writes with KEEP_SHAPE.

Naturally, we can write a similar cover property for KEEP_OPERATION:

shape_updated_when_keep_operation: cover property (
    write && operation_on_write_bus == KEEP_OPERATION ##1 $changed(shape_in_sfr));

The properties we have don't fully specify what should happen whenever we try to write a reserved value to one of the fields. A DUT that treats a write with a reserved SHAPE value the same way it treats a KEEP_SHAPE will also pass all checks. To solve this, we can add the following assertions:

sfr_constant_if_reserved_shape_write: assert property (
    write && is_reserved_shape(shape_on_write_bus) |=>
        $stable(shape_processor.ctrl_sfr));

sfr_constant_if_reserved_operation_write: assert property (
    write && is_reserved_operation(operation_on_write_bus) |=>
        $stable(shape_processor.ctrl_sfr));

With these assertions, we make sure that writes that contain reserved values are completely ignored.

It's becoming more and more difficult to find a DUT that fulfills these properties, but doesn't fulfill the specification. The kinds of faulty implementations that these properties won't catch are starting to look like intentional mistakes. We've reached another point where we could probably just stop, but still be able to get a good night's sleep, knowing that the design is in very good shape. We could also write a few more properties and leave no stone unturned where bugs could hide.

Let's recall that whenever an illegal combination of SHAPE and OPERATION values is written, the DUT is supposed to ignore it. If, however, the DUT were to write some default combination into the SFR in such a case, for example the reset value of (CIRCLE, PERIMETER), then none of the properties we have would fail. This doesn't sound like the kind of mistake a designer would make when building the design from scratch. It might be an issue if a previous version of the specification used to require this behavior, in which case it would be possible to overlook the new requirement.

Writing a single assertion for this might get a bit more complicated because of the KEEP_* values, but we can split the problem into multiple cases.

First, we should check that illegal combinations of proper modes are ignored:

sfr_constant_if_illegal_combination_write_of_proper_modes: assert property (
    write && shape_on_write_bus != KEEP_SHAPE && operation_on_write_bus != KEEP_OPERATION
        && !is_legal_combination(shape_on_write_bus, operation_on_write_bus) |=>
            $stable(shape_processor.ctrl_sfr));

The fact that we already have the is_legal_combination(...) function which can operate on proper modes will nudge us into thinking about this behavior as an own case.

The next case we have is when we are writing KEEP_SHAPE, but the resulting combination would be illegal. This can also be implemented using the is_legal_combination(...) function, but instead of taking the bus value for the shape, we take the value in the SFR:

sfr_constant_if_illegal_combination_write_of_keep_shape: assert property (
    write && shape_on_write_bus == KEEP_SHAPE
        && !is_legal_combination(shape_in_sfr, operation_on_write_bus) |=>
            $stable(shape_processor.ctrl_sfr));

Of course, we have to do the same for KEEP_OPERATION:

sfr_constant_if_illegal_combination_write_of_keep_operation: assert property (
    write && operation_on_write_bus == KEEP_OPERATION
        && !is_legal_combination(shape_on_write_bus, operation_in_sfr) |=>
            $stable(shape_processor.ctrl_sfr));

Now we can prove that the DUT always ignores illegal writes, regardless of the reason (either reserved value or illegal combination), as we have properties for each of the cases.

I'm having a really difficult time coming up with a faulty implementation that still passes the current set of properties. The best I can come up with at this point is a design that somehow writes a "twisted" version of the data given to it: instead of writing CIRCLE, it writes RECTANGLE and vice-versa, but only when also trying to write PERIMETER (otherwise we might get illegal combination in the SFR).

For this design maybe it isn't that important to check that writes properly update the SFR fields, as there isn't much that much room for error. For a design where one of the shapes can be disabled via a separate SFR bit I could see it becoming very important. Let's imagine, for example, that the CIRCLE shape could be disabled depending on whether the pi computation unit is powered off. If this design also has a bug where it disables the TRIANGLE shape as well, we would not be able to find this if we only cover that the SHAPE field can be TRIANGLE. This is because the tool could always show us such traces when the pi computation unit is turned on.

To weed out any remaining pathological design behaviors we can write one more pair of assertions: that legal writes, that are supposed to update the SFR, do so properly for each of the fields. Here's what the assertion would look like for SHAPE:

legal_write_data_written_to_shape: assert property (
    write && is_legal_ctrl_write_data() |=>
        shape_in_sfr == $past(shape_on_write_bus));

While we were able to write multiple assertions when dealing with the cases where the write had to be ignored, in this case we have to write one assertion. We have to model what it means for the write data to be legal and this means describing all the properties required of it. None of the fields are allowed to contain reserved values and the combination of fields is supposed to be legal:

function bit is_legal_ctrl_write_data();
  return !is_reserved_shape(shape_on_write_bus)
      && !is_reserved_operation(operation_on_write_bus)
      && is_legal_ctrl_write_data_combination();
endfunction

The combination being legal is defined differently, depending on whether we are dealing with a KEEP_* or with a proper mode:

function bit is_legal_ctrl_write_data_combination();
  if (shape_on_write_bus == KEEP_SHAPE)
    return is_legal_combination(shape_in_sfr, operation_on_write_bus);
  if (operation_on_write_bus == KEEP_OPERATION)
    return is_legal_combination(shape_on_write_bus, operation_in_sfr);
  return is_legal_combination(shape_on_write_bus, operation_on_write_bus);
endfunction

If we try to prove this assertion, we'll notice that it fails. It will complain that it can't write KEEP_SHAPE into the SFR. This is because KEEP_SHAPE is not a value that leads to an update of the SHAPE field. We have to exclude this case and then the assertion will pass:

legal_write_data_written_to_shape: assert property (
    write && is_legal_ctrl_write_data() && shape_on_write_bus != KEEP_SHAPE |=>
        shape_in_sfr == $past(shape_on_write_bus));

Now imagine that the design would have contained the same bug. We could have let something slip through, since this assertion would have passed. Fortunately, even if this were the case, we still have the assertions on the internal SFR, which disallow KEEP_SHAPE as a value. This is the power of thinking about the core requirements of the design!

It's not that we wouldn't be able to do something similar in simulation. We could very well write checks on the internal SFR, alongside our register modeling which describes which writes are legal and which are not. It's just that we probably wouldn't, because our thinking is very focused on describing cause/effect relationships. We would probably stop after finishing with our register modeling and move on to the next thing.

For completeness, we also need a property for the OPERATION field:

legal_write_data_written_to_operation: assert property (
    write && is_legal_ctrl_write_data() && operation_on_write_bus != KEEP_OPERATION |=>
        operation_in_sfr == $past(operation_on_write_bus));

Before we end, I would like to revisit illegal writes that are supposed to be ignored. Now that we have a is_legal_ctrl_write_data() function, we could use it to write a single assertion to handle all cases of illegal writes, which are supposed to be ignored:

sfr_constant_if_illegal_write: assert property (
    write && !is_legal_ctrl_write_data() |=>
        $stable(shape_processor.ctrl_sfr));

This could replace the other assertions that we have that handle each illegal case separately.

Depending on the situation, it might be more advantageous to have either a single assertion for a requirement or to split it into multiple assertions. Multiple assertions are easier to debug and might be easier for the tool to prove (at least for big designs), because we are asking it to give us answers to slightly easier questions. Writing a single assertion has the advantage that it requires less code. It's easier to handle changes to the requirements when there are less places to update.

For this design I would go with the single property.

Conclusions

We've seen how formal verification nudges us to think about requirements, not about how to implement a testbench that is a faithful representation of the design.

We made our properties very readable and modular, which allowed us to tackle the verification task in incremental chunks. We should strive to avoid replicating the design and to make the properties read like sentences in the specification. In the end, our property checking environment doesn't have many more lines of code than the design itself.

We also saw that formal verification doesn't have to be an "all or nothing" proposition. Depending on our level of risk aversion we could decide to live with only a sub-set of checks, but still get high design quality. We can focus on the things that are truly important or difficult and not spend too much effort on the parts that are unlikely to have bugs.

Stay tuned for part 2, where we'll build a UVM testbench for the same design.

No comments:

Post a Comment