Tuesday, November 25, 2014

Working with Multiple Instances of vr_ad Registers

The devices that we verify often have multiple instances of a certain register type. These registers are instantiated in a regular structure inside the design. In our tests, we want to be able to access each of them.

In vr_ad, accessing a register is typically done using the write/read_reg family of macros. These macros encapsulate the very flexible access mechanisms that the package provides into one convenient call. For unique registers, they just work, without having to pass in any additional options, but things get a bit more involved if the register type we are trying to access is instantiated multiple times. There are multiple scenarios where this can be the case. Let's have a look at them!

The first scenario that comes to mind is when there are multiple instances of the same register inside a register file. Let's take the example from the last post, with the graphics processing device:

<'
extend vr_ad_reg_file_kind : [ GRAPHICS ];
extend GRAPHICS vr_ad_reg_file {
  keep size == 512;
  post_generate() is also {
    reset();
  };
};


reg_def TRIANGLE {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};


extend GRAPHICS vr_ad_reg_file {
  triangles[3] : list of TRIANGLE vr_ad_reg;
  
  add_registers() is also {
    for each (triangle) in triangles {
      add_with_offset(index * 0x4, triangle);
    };
  };
};
'>

In this case, we have three TRIANGLE registers, each at a different offset. Let's try to use the write_reg macro to access a triangle:

<'
extend MAIN vr_ad_sequence {
  !triangle : TRIANGLE vr_ad_reg;
  
  body() @driver.clock is only {
    write_reg triangle;
  };
};
'>

Trying to use write_reg like this will result in an error message, stating that there are multiple TRIANGLE registers inside the address map. It's impossible for the macro to know exactly which instance we want. We can solve this ambiguity in two ways.

The first way is to use the macro's lesser-known (and optional) operation generation block argument. We get the exact instance of the model register we want and pass that in as the static_item:

<'
extend MAIN vr_ad_sequence {  
  body() @driver.clock is only {
    var static_triangle := driver.addr_map.get_regs_by_kind(TRIANGLE)[0];
    write_reg { .static_item == static_triangle } triangle;
  };
};
'>

This lets the macro know exactly which instance we want to access.

The other way we can do this is by using the static triangle as the macro argument:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    var static_triangle :=
      driver.addr_map.get_regs_by_kind(TRIANGLE)[1].as_a(TRIANGLE vr_ad_reg);
    write_reg static_triangle { .SIDE0 == 1 };
  };
};
'>

In this case, we don't have to pass a static_item anymore, but we have to cast the static triangle to be able to access the TRIANGLE subtype's fields. In both cases, though, we have to write quite a bit of code just to access the register we want.

Instead of always getting the static_item ourselves and using that, why not encapsulate the whole operation inside a macro of our own? Since macros are already the usual way to access vr_ad registers, it shouldn't cause any confusion for anybody. What we need is an extra macro argument for the instance index, so our macro should look something like this: write_triangle_reg <idx'exp> <reg'exp> <reg_gen'block> (let's ignore the <op_gen'block> argument for simplicity).

Such a macro would just fetch the appropriate static_item based on the <idx'exp> argument and forward the other arguments to write_reg. This is what such a macro would look like:

<'
define <write_triangle_reg'action>
  "write_triangle_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  var los : list of string;
  los.add(        "{");
  los.add(        "var static_triangle :=");
  los.add(appendf("driver.addr_map.get_regs_by_kind(TRIANGLE)[%s];", <idx'exp>));
  los.add(appendf("write_reg { .static_item == static_triangle } %s %s;", <reg'exp>, <any>));
  los.add(        "};");
  
  result = str_join(los, "\n");
};
'>

To access the last triangle, we would simply do:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_triangle_reg 2 triangle { .SIDE1 == 1 };
  };
};
'>
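To make the expansion concrete, here is a hand-written sketch of what the call above should roughly turn into, derived by substituting the arguments into the strings the macro builds. It assumes the <any> argument matches the whole `{ .SIDE1 == 1 }` block and that a bare action block is accepted inside body(); the exact text Specman generates may differ in whitespace.

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    // Approximate expansion of:
    //   write_triangle_reg 2 triangle { .SIDE1 == 1 };
    {
      // <idx'exp> (here 2) selects the static register instance
      var static_triangle :=
        driver.addr_map.get_regs_by_kind(TRIANGLE)[2];
      // <reg'exp> and <any> are forwarded to write_reg unchanged
      write_reg { .static_item == static_triangle } triangle { .SIDE1 == 1 };
    };
  };
};
'>

Wrapping the expansion in its own block keeps the static_triangle variable local, so the macro can be used several times in the same method without name clashes.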

Now this is all fine and dandy, but what if we also want to read the registers? We would need a second macro, whose expansion would be very similar to the first one's. We could generalize the macro body code inside a function that can return the result for both access operations. The vr_ad package defines such functions inside the global singleton, so if it's good enough for Cadence, then it's good enough for us:

<'
extend global {
  vgm__access_triangle_reg_body(operation : string,
    idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    los.add(        "{");
    los.add(        "var static_triangles :=");
    los.add(        "driver.addr_map.get_regs_by_kind(TRIANGLE);");
    los.add(appendf("assert %s in [0..static_triangles.size() - 1];", idx));
    los.add(appendf("%s { .static_item == static_triangles[%s] } %s %s;", operation, idx, reg, block));
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>

This function can take any access operation (write_reg, read_reg or write_reg_fields, a lesser-known sibling of the two), together with the macro arguments, and return the macro body. Declaring the macros just means calling this function with the appropriate arguments:

<'
define <write_triangle_reg'action>
  "write_triangle_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_triangle_reg_body("write_reg", <idx'exp>, <reg'exp>, <any>);
};

define <write_triangle_reg_fields'action>
  "write_triangle_reg_fields <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_triangle_reg_body("write_reg_fields", <idx'exp>, <reg'exp>, <any>);
};

define <read_triangle_reg'action>
  "read_triangle_reg <idx'exp> <reg'exp>" as computed
{
  result = vgm__access_triangle_reg_body("read_reg", <idx'exp>, <reg'exp>);
};
'>

Now we can also easily read any TRIANGLE register. Here's a read of the first triangle:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    read_triangle_reg 0 triangle;
  };
};
'>

A pretty useful thing we can also do with the macros is use them inside loops:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    for i from 0 to 2 {
      write_triangle_reg_fields i triangle { .SIDE2 = 1 };
    };
  };
};
'>

What we have up to now is great and all. It works perfectly for triangles, but let's throw circles into the mix as well:

<'
reg_def CIRCLE {
  reg_fld RADIUS : uint(bits : 8);
};


extend GRAPHICS vr_ad_reg_file {
  circles[5] : list of CIRCLE vr_ad_reg;
  
  add_registers() is also {
    for each (circle) in circles {
      add_with_offset(0x20 + index * 0x4, circle);
    };
  };
};
'>

The *_triangle_reg macros won't work on these registers so we're back to where we started. It would be very silly to create a new macro that can handle only circles, because that would become really unmaintainable if we were to add more and more shapes. What we need is a macro that can work with any register, regardless of type.

We already have the information about the register's kind inside the register itself. We just need to use that when calling get_regs_by_kind(...) to get the static register. We'll call this new macro write_graphics_reg and define a new function that implements all three versions of it:

<'
extend global {
  vgm__access_graphics_reg_body(operation : string,
    idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    los.add(        "{");
    los.add(        "var kind : vr_ad_reg_kind;");
    los.add(appendf("if %s == NULL { %s = new };", reg, reg));
    los.add(appendf("kind = %s.kind;", reg));
    los.add(        "var static_regs :=");
    los.add(        "driver.addr_map.get_regs_by_kind(kind);");
    los.add(appendf("assert %s in [0..static_regs.size() - 1];", idx));
    los.add(appendf("%s { .static_item == static_regs[%s] } %s %s;", operation, idx, reg, block));
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>

The three macros will be:

<'
define <write_graphics_reg'action>
  "write_graphics_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg", <idx'exp>, <reg'exp>, <any>);
};

define <write_graphics_reg_fields'action>
  "write_graphics_reg_fields <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg_fields", <idx'exp>, <reg'exp>, <any>);
};

define <read_graphics_reg'action>
  "read_graphics_reg <idx'exp> <reg'exp>" as computed
{
  result = vgm__access_graphics_reg_body("read_reg", <idx'exp>, <reg'exp>);
};
'>

These new macros can work with triangles, circles and any new register types we might add:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_graphics_reg 2 triangle;
    read_graphics_reg 4 circle;
  };
};
'>

Let's look at a different scenario now. Let's assume that our DUT is composed of multiple slices, each of which is able to do some graphics processing independently of the others. An individual slice can only process one triangle and one circle, but there are many such slices. In this case, the register definitions would look like this:

<'
extend vr_ad_reg_file_kind : [ SLICE ];
extend SLICE vr_ad_reg_file {
  keep size == 8;
  post_generate() is also {
    reset();
  };
};


reg_def TRIANGLE SLICE 0x0 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};

reg_def CIRCLE SLICE 0x4 {
  reg_fld RADIUS : uint(bits : 8);
};


extend vr_ad_reg_file_kind : [ GRAPHICS ];
extend GRAPHICS vr_ad_reg_file {
  slices[3] : list of SLICE vr_ad_reg_file;
  
  keep size == 64;
  post_generate() is also {
    reset();
  };
  
  add_registers() is also {
    for each (slice) in slices {
      add_with_offset(index * 0x10, slice);
    };
  };
};
'>

As before, we can access a certain triangle or circle by getting the appropriate register instance from the address map and using that with write_reg, like we did in the previous scenario. What we can also do is get a handle to the appropriate register file and use that as the static_item:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    var static_reg_file := driver.addr_map.get_reg_files_by_kind(SLICE)[0];
    write_reg { .static_item == static_reg_file } triangle;
    write_reg { .static_item == static_reg_file } circle;
  };
};
'>

The same comment as above still applies; we'd have to do this operation every time we want to access a register, which can become tedious. Why not write a new macro, then?

Writing a macro that expands to the new code is pretty straightforward. Here's what the global function for that would look like:

<'
extend global {
  vgm__access_slice_reg_body(operation : string,
    idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    los.add(        "{");
    los.add(        "var static_slices :=");
    los.add(        "driver.addr_map.get_reg_files_by_kind(SLICE);");
    los.add(appendf("assert %s in [0..static_slices.size() - 1];", idx));
    los.add(appendf("%s { .static_item == static_slices[%s] } %s %s;", operation, idx, reg, block));
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>

We'll then wrap calls to this method inside the actual macro declarations, which I won't show here. Using these macros, we can easily access any slice register:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_slice_reg 1 triangle { .SIDE1 == 1 };
    read_slice_reg 1 triangle;
    
    write_slice_reg_fields 1 circle { .RADIUS = 1 };
    read_slice_reg 1 circle;
  };
};
'>
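The macro declarations skipped above would presumably mirror the *_triangle_reg ones. The following is a sketch under that assumption; the macro names are taken from the usage example, and everything else is copied from the earlier pattern rather than from the actual repository code.

<'
// Sketch: thin wrappers around vgm__access_slice_reg_body(...)
define <write_slice_reg'action>
  "write_slice_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_slice_reg_body("write_reg", <idx'exp>, <reg'exp>, <any>);
};

define <write_slice_reg_fields'action>
  "write_slice_reg_fields <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_slice_reg_body("write_reg_fields", <idx'exp>, <reg'exp>, <any>);
};

// The read variant takes no trailing block, as with read_triangle_reg
define <read_slice_reg'action>
  "read_slice_reg <idx'exp> <reg'exp>" as computed
{
  result = vgm__access_slice_reg_body("read_reg", <idx'exp>, <reg'exp>);
};
'>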

We can also, for example, loop over all triangles in the design:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    for i from 0 to 2 {
      write_slice_reg i triangle val 0x010203;
    };
  };
};
'>

Things start to get fun when the design contains a mixture of the two cases we've looked at above. Our registers are organized in slices, but within one slice some registers may be instantiated multiple times:

<'
// TRIANGLE and CIRCLE reg_defs
// ...

reg_def SQUARE SLICE 0x20 {
  reg_fld SIDE : uint(bits : 8);
};


extend SLICE vr_ad_reg_file {
  triangles[3] : list of TRIANGLE vr_ad_reg;
  circles[4] : list of CIRCLE vr_ad_reg;
  
  add_registers() is also {
    for each (triangle) in triangles {
      add_with_offset(index * 0x4, triangle);
    };
    
    for each (circle) in circles {
      add_with_offset(0x10 + index * 0x4, circle);
    };
  };
};


extend vr_ad_reg_file_kind : [ GRAPHICS ];
extend GRAPHICS vr_ad_reg_file {
  slices[3] : list of SLICE vr_ad_reg_file;
  
  keep size == 256;
  post_generate() is also {
    reset();
  };
  
  add_registers() is also {
    for each (slice) in slices {
      add_with_offset(index * 0x40, slice);
    };
  };
};
'>

To access a square it's enough to just pass in the register file as a static_item (this is the same situation as in scenario no. 2):

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    var static_slice := driver.addr_map.get_reg_files_by_kind(SLICE)[0];
    write_reg { .static_item == static_slice } square;
  };
};
'>

To access a triangle (or a circle), though, we need to get the appropriate register instance (similarly to how we did it in scenario no. 1):

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    var static_slice := driver.addr_map.get_reg_files_by_kind(SLICE)[0];
    var static_triangle := static_slice.get_regs_by_kind(TRIANGLE)[0];
    write_reg { .static_item == static_triangle } triangle;
  };
};
'>

This means that our macro has to be able to handle both the register file and register indices. We can only process one square at a time, though, so in that case it doesn't make sense to pass in a register index (seeing as how there is only one instance per register file). The register index argument must therefore be optional. Starting top-down, this is what the definition of the write_graphics_reg macro would look like:

<'
define <write_graphics_reg'action>
  "write_graphics_reg <rf_idx'exp>[ <reg_idx'exp>] <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg", <rf_idx'exp>, <reg_idx'exp>, <reg'exp>, <any>);
};
'>

The global function would then take both of these arguments into account to determine the static_item that gets passed:

<'
extend global {
  vgm__access_graphics_reg_body(operation : string,
    rf_idx : string, reg_idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    los.add(        "{");
    los.add(        "var static_slices := driver.addr_map.get_reg_files_by_kind(SLICE);");
    los.add(appendf("assert %s in [0..static_slices.size() - 1];", rf_idx));
    
    // multiply instantiated reg
    if reg_idx != "" {
      los.add(        "var kind : vr_ad_reg_kind;");
      los.add(appendf("if %s == NULL { %s = new };", reg, reg));
      los.add(appendf("kind = %s.kind;", reg));
      los.add(        "var static_regs :=");
      los.add(appendf("static_slices[%s].get_regs_by_kind(kind);", rf_idx));
      los.add(appendf("assert %s in [0..static_regs.size() - 1];", reg_idx));
    };
    
    los.add(appendf("%s {", operation));
    if reg_idx == "" {
      los.add(appendf(".static_item == static_slices[%s];", rf_idx));
    }
    else {
      los.add(appendf(".static_item == static_regs[%s];", reg_idx));
    };
    los.add(appendf("} %s %s;", reg, block));
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>
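Only the write_graphics_reg declaration is shown above. The other two variants would presumably follow the same shape; this is a sketch under that assumption, reusing the match expression and argument order from write_graphics_reg.

<'
// Sketch: field-write variant with the optional register index
define <write_graphics_reg_fields'action>
  "write_graphics_reg_fields <rf_idx'exp>[ <reg_idx'exp>] <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg_fields", <rf_idx'exp>, <reg_idx'exp>, <reg'exp>, <any>);
};

// Sketch: read variant, which takes no trailing block
define <read_graphics_reg'action>
  "read_graphics_reg <rf_idx'exp>[ <reg_idx'exp>] <reg'exp>" as computed
{
  result = vgm__access_graphics_reg_body("read_reg", <rf_idx'exp>, <reg_idx'exp>, <reg'exp>);
};
'>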

Using the extended version of the macro we can now access both triangles and squares:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_graphics_reg 1 1 triangle;
    read_graphics_reg 2 0 triangle;
    
    write_graphics_reg 1 square;
    read_graphics_reg 2 square;
  };
};
'>

We can also loop over all squares (not shown) or double-loop over all triangles:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    for i from 0 to 2 {
      for j from 0 to 2 {
        read_graphics_reg i j triangle;
      };
    };
  };
};
'>
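For completeness, the loop over all squares mentioned above could look something like this. It is a sketch that assumes a square field of type SQUARE vr_ad_reg is declared in the sequence, in the same way triangle was declared earlier.

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    // One square per slice, so only the register file index is passed
    for i from 0 to 2 {
      read_graphics_reg i square;
    };
  };
};
'>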

As we've seen, by encapsulating the write/read_reg macros inside our own we can easily select the register instance that we want to access. We save a lot of tedious typing and duplicated code to get the appropriate static_item every time. We pay a small price when using macros though, as it becomes more difficult to track down syntax errors, but with proper documentation the disadvantages can be reduced. For more examples and detailed code, check out the SourceForge repository.

If your next project involves a lot of registers with multiple instances, why not try this approach out? See you next time!

Monday, November 17, 2014

Experimental Cures for Flattened Register Definitions in vr_ad

On my current project, I had an issue with my register definitions. Quite a few of my DUT's registers were just instances of the same register type. My vr_ad register definitions were generated by a script, based on the specification, a flow that I'm pretty sure is very similar to what most of you also have. Instead of generating a nice regular structure, this script created a separate type for each register instance. What resulted was a flattened structure where I'd, for example, get one instance each of registers SOME_REG0, SOME_REG1, SOME_REG2, instead of three instances of SOME_REG. I was lucky enough to be able to (partly) change the definitions by patching them by hand.

Someone on StackOverflow had the same problem, but didn't have the luxury of being able to fix it like I did. They weren't allowed to touch the code as I'm guessing it probably belonged to a different team. They probably also had a lot of legacy code that was using those flattened register definitions. This made me want to do an experimental post on how to best cope with such an issue.

Naturally the best thing to do is to fix the underlying problem of the registers getting flattened, but that might not be possible, so let's look at how to fix the symptoms.

To be able to do any kind of serious modeling, we need to be able to program generically. We can't (easily) do this if each register is its own type. I've tried to think of how best to handle this from a maintainability point of view. As a bonus requirement, when the register definitions do get fixed (i.e. the generation flow gets updated), we'd like to have to make as few changes as possible to the modeling code.

Enough with the stories, let's get our hands dirty. As always, we'll start small, but think big. We'll go through a few iterations, look at where we're lacking and gradually refine our approach.

Let's say we have a device that can operate with shapes. Part of its functionality involves doing stuff with triangles. It can process multiple triangles at the same time, where each triangle is described by a register containing the lengths of its sides. Our DUT does computations on the triangles, based on these values. For example, it can compute the areas of the triangles. We want to check that what the DUT writes out is correct so we need to model these computations.

We have a trusty script that can generate the register definitions from the specification (maybe an XML file). This script isn't very well written and it doesn't know that all three TRIANGLE registers are just the same register instantiated 3 times (i.e. a regular structure), or maybe the information got lost in the XML somehow. This is what we get for our register definitions:

<'
extend vr_ad_reg_file_kind : [ GRAPHICS ];
extend GRAPHICS vr_ad_reg_file {
  keep size == 256;
  post_generate() is also {
    reset();
  };
};

reg_def TRIANGLE0 GRAPHICS 0x00 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};

reg_def TRIANGLE1 GRAPHICS 0x10 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};

reg_def TRIANGLE2 GRAPHICS 0x20 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};
'>

Our reference model will contain a pointer to the register file:

<'
struct flattened_graphics_model {
  graphics_regs : GRAPHICS vr_ad_reg_file;
};
'>

The reference model needs to be able to compute the area of each triangle. As a first idea, we create a method for each triangle that implements Heron's formula:

<'
extend flattened_graphics_model {
  get_triangle0_area() : real is {
    var triangle0 := graphics_regs.triangle0;
    var half_per : real = 0.5 *
      (triangle0.SIDE0 + triangle0.SIDE1 + triangle0.SIDE2);
    result = sqrt(
      (half_per - triangle0.SIDE0) *
      (half_per - triangle0.SIDE1) *
      (half_per - triangle0.SIDE2) *
      half_per
    );
  };
  
  get_triangle1_area() : real is {
    var triangle1 := graphics_regs.triangle1;
    var half_per : real = 0.5 *
      (triangle1.SIDE0 + triangle1.SIDE1 + triangle1.SIDE2);
    result = sqrt(
      (half_per - triangle1.SIDE0) *
      (half_per - triangle1.SIDE1) *
      (half_per - triangle1.SIDE2) *
      half_per
    );
  };
  
  get_triangle2_area() : real is {
    var triangle2 := graphics_regs.triangle2;
    var half_per : real = 0.5 *
      (triangle2.SIDE0 + triangle2.SIDE1 + triangle2.SIDE2);
    result = sqrt(
      (half_per - triangle2.SIDE0) *
      (half_per - triangle2.SIDE1) *
      (half_per - triangle2.SIDE2) *
      half_per
    );
  };
};
'>

We can immediately see a problem with this approach. We've implemented the formula in three different places. This means that should something change, we have three places to fix. Now, Heron's formula changing is a pretty unlikely event, but the same argument holds for any other computation we might have to perform here.

What we can do is extract the part that computes the actual area into its own method, which takes the three sides as its arguments:

<'
get_triangle_area(side0 : uint, side1 : uint, side2 : uint) : real is {
  var half_per : real = 0.5 * (side0 + side1 + side2);
  result = sqrt(
    (half_per - side0) *
    (half_per - side1) *
    (half_per - side2) *
    half_per
  );
};
'>

We can simplify the three methods from before to just call this generic method:

<'
get_triangle0_area() : real is {
  var triangle := graphics_regs.triangle0;
  result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
    triangle.SIDE2);
};

get_triangle1_area() : real is {
  var triangle := graphics_regs.triangle1;
  result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
    triangle.SIDE2);
};

get_triangle2_area() : real is {
  var triangle := graphics_regs.triangle2;
  result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
    triangle.SIDE2);
};
'>

At least this way we've centralized the computation part to one location. The number of such methods will grow linearly, though, with the number of TRIANGLE registers. This means that for n triangles we'll need n methods to compute the areas.

Let's add a new requirement: our DUT is also able to compute which triangle is the largest and we need to model that too. We can define a new method to do that based on the areas:

<'
largest() : uint is {
  var areas : list of real;
  areas.add(get_triangle0_area());
  areas.add(get_triangle1_area());
  areas.add(get_triangle2_area());
  
  result = areas.max_index(it);
};
'>

In this method, the number of calls to get_triangleX_area() also grows with the number of triangles. Moreover, if we want to be able to find out which triangle is the smallest, the method for that would have to look like this:

<'
smallest() : uint is {
  var areas : list of real;
  areas.add(get_triangle0_area());
  areas.add(get_triangle1_area());
  areas.add(get_triangle2_area());
  
  result = areas.min_index(it);
};
'>

Pretty much the same as largest(), isn't it? In this setup, adding a single triangle would require adding a new method for the area and changing two others. That's not very maintainable. We can use the same trick we did for the area computation and pull out computing the list of areas to its own method, while simplifying the largest() and smallest() methods:

<'
get_triangle_areas() : list of real is {
  result.add(get_triangle0_area());
  result.add(get_triangle1_area());
  result.add(get_triangle2_area());
};

largest() : uint is {
  var areas := get_triangle_areas();
  result = areas.max_index(it);
};

smallest() : uint is {
  var areas := get_triangle_areas();
  result = areas.min_index(it);
};
'>

Now we only need to update the get_triangle_areas() method when adding a new triangle. Not much of an improvement, but every little thing counts when you're potentially dealing with a large number of triangles.

While we may have things sorted out for areas, we get a new requirement. Our DUT can also compute perimeters and tell us which triangle is the longest and which one is the shortest. This means we'll need to add a similar set of methods to handle this aspect, based on the examples from above:

<'
extend flattened_graphics_model {
  get_triangle_perimeter(side0 : uint, side1 : uint, side2 : uint) : uint is {
    result = side0 + side1 + side2;
  };
  
  get_triangle0_perimeter() : uint is {
    var triangle := graphics_regs.triangle0;
    result = get_triangle_perimeter(triangle.SIDE0, triangle.SIDE1,
      triangle.SIDE2);
  };
  
  get_triangle1_perimeter() : uint is {
    var triangle := graphics_regs.triangle1;
    result = get_triangle_perimeter(triangle.SIDE0, triangle.SIDE1,
      triangle.SIDE2);
  };
  
  get_triangle2_perimeter() : uint is {
    var triangle := graphics_regs.triangle2;
    result = get_triangle_perimeter(triangle.SIDE0, triangle.SIDE1,
      triangle.SIDE2);
  };
  
  get_triangle_perimeters() : list of uint is {
    result.add(get_triangle0_perimeter());
    result.add(get_triangle1_perimeter());
    result.add(get_triangle2_perimeter());
  };
  
  longest() : uint is {
    var perimeters := get_triangle_perimeters();
    result = perimeters.max_index(it);
  };
  
  shortest() : uint is {
    var perimeters := get_triangle_perimeters();
    result = perimeters.min_index(it);
  };
};
'>

Adding just one measly triangle is starting to become a real pain. What would be awesome is being able to just add one line of code every time a new triangle gets added and be done with it. Well, thanks to our good friends, the macros, this is possible.

What we notice is that the code is very regular. Aside from the indices, the method bodies look remarkably similar. This means that for the area aspect we can create the following macro:

<'
define <triangle_area_utils'statement> "triangle_area_utils <num>" as {
  extend flattened_graphics_model {
    get_triangle<num>_area() : real is {
      var triangle := graphics_regs.triangle<num>;
      result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
        triangle.SIDE2);
    };
    
    get_triangle_areas() : list of real is also {
      result.add(get_triangle<num>_area());
    };
  };
};
'>

Adding a new triangle is now as easy as just expanding the macro with the appropriate argument:

<'
triangle_area_utils 0;
triangle_area_utils 1;
triangle_area_utils 2;
'>

We could define a similar macro for the perimeter aspect (I won't show it here). While we have made adding new triangles easier, we've also shot ourselves in the foot. Excessive use of macros is a code smell because it can be very difficult to understand what code gets expanded in the background. Also, it makes the code more difficult to refactor, since we can't rely on fancy IDE features.
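For reference, the perimeter macro mentioned above might look like the following. This is a sketch that simply mirrors triangle_area_utils; the macro name triangle_perimeter_utils is made up for illustration, and like the area version it assumes get_triangle_perimeters() already exists as an (empty) method that each expansion can extend.

<'
// Hypothetical counterpart to triangle_area_utils
define <triangle_perimeter_utils'statement> "triangle_perimeter_utils <num>" as {
  extend flattened_graphics_model {
    get_triangle<num>_perimeter() : uint is {
      var triangle := graphics_regs.triangle<num>;
      result = get_triangle_perimeter(triangle.SIDE0, triangle.SIDE1,
        triangle.SIDE2);
    };
    
    // Each expansion appends its own triangle to the list
    get_triangle_perimeters() : list of uint is also {
      result.add(get_triangle<num>_perimeter());
    };
  };
};
'>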

If we analyze the code up to now, we see that one of our main problems is that each triangle is stored in an individual field. This means that there's no way to access a triangle from a method by just passing in the index of the triangle (0, 1, 2, etc.). If we could do this, we could get rid of all our get_triangleX_area() methods.

A way of doing this is using the reflection API. Reflection allows us, among other things, to get a field of a struct by using only the name of that field, specified as a string. In our case, we know that our register file contains fields named triangle0, triangle1, triangle2, etc. We can use the reflection API to extract the field whose name contains the appropriate index as its suffix:

<'
extend flattened_graphics_model {
  num_triangles : uint;
    keep num_triangles == 3;
  
  get_triangle_reg(idx : uint) : vr_ad_reg is {
    assert idx < num_triangles;
    
    var regs_type := rf_manager.get_exact_subtype_of_instance(graphics_regs);
    
    var triangle_reg_field :=
      regs_type.get_fields().first(it.get_name() == appendf("triangle%d", idx));
    assert triangle_reg_field != NULL;
    
    assert triangle_reg_field.get_type() ==
      rf_manager.get_type_by_name(appendf("TRIANGLE%d'kind vr_ad_reg", idx));
    result =
      triangle_reg_field.get_value(graphics_regs).get_value().unsafe();
  };
};
'>

The way to use the reflection API is to get the representation of our register file from the rf_manager singleton. What we'll end up with is a struct of type rf_struct that understands what fields, methods, etc. the register file has. Out of this we can extract a representation of the field for the triangle that interests us, of type rf_field. Based on this field we can construct our return value. How exactly this happens is explained in the documentation and in this excellent post from the Specman R&D team. Have a look at those resources for more details on how to use the reflection interface.

After we've gotten an instance of our desired register, we can use this to compute the area. We can do away with the get_triangleX_area() methods and replace them with one get_triangle_area_by_index(...) method:

<'
get_triangle_area_by_index(idx : uint) : real is {
  assert idx < num_triangles;
  var reg := get_triangle_reg(idx);
  var reg_type := rf_manager.get_exact_subtype_of_instance(reg);
  
  var side0_field := reg_type.get_fields().first(it.get_name() == "SIDE0");
  assert side0_field != NULL;
  assert side0_field.get_type() == rf_manager.get_type_by_name("uint(bits:8)");
  var side0 : uint = side0_field.get_value(reg).get_value().unsafe();
  
  var side1_field := reg_type.get_fields().first(it.get_name() == "SIDE1");
  assert side1_field != NULL;
  assert side1_field.get_type() == rf_manager.get_type_by_name("uint(bits:8)");
  var side1 : uint = side1_field.get_value(reg).get_value().unsafe();
  
  var side2_field := reg_type.get_fields().first(it.get_name() == "SIDE2");
  assert side2_field != NULL;
  assert side2_field.get_type() == rf_manager.get_type_by_name("uint(bits:8)");
  var side2 : uint = side2_field.get_value(reg).get_value().unsafe();
  
  result = get_triangle_area(side0, side1, side2);
};
'>

Because the return value of get_triangle_reg(...) is of type vr_ad_reg, we can't reference the SIDEx fields directly (as these are defined under when subtypes). We can't cast the value to any of these subtypes, because we would need n cast statements (the very thing we want to avoid). We can use the same method as before to get the values of the sides via the reflection interface. The resulting code isn't pretty, but it works. Can we do better, though?

Of course we can! An essential observation to make here is that all triangle register types contain the same fields, whether they are of type TRIANGLE0 or TRIANGLE1 or TRIANGLE2. We could do all of our operations using only a variable of one of these types, provided that we fill it up with the appropriate values for the sides. That is, a TRIANGLE0 with sides 1, 2 and 3 has the same area as a TRIANGLE1 with the same sides. With this idea in mind, we can do the following:

<'
get_triangle_area_by_index(idx : uint) : real is {
  assert idx < num_triangles;
  var triangle : TRIANGLE0 vr_ad_reg = new;
  triangle.write_reg_rawval(get_triangle_reg(idx).read_reg_rawval());
  result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
    triangle.SIDE2);
};
'>

We can just create a variable of type TRIANGLE0 and fill it up with the contents of our desired register. We can then reference the SIDE fields directly, without the need for all of that messy reflection code. The price we pay for this convenience, however, is essentially a copy operation. Whether this is slower than using the reflection interface I can't say (though I suspect it isn't), but it is in any case cleaner.

Our largest() method becomes pretty trivial to write:

<'
largest() : uint is {
  var areas : list of real;
  for i from 0 to num_triangles - 1 {
    areas.add(get_triangle_area_by_index(i));
  };
  
  result = areas.max_index(it);
};
'>

Not only that, but we can now handle any number of triangles without increasing the number of lines in the code. The only modification we need to make is to set the num_triangles field to the appropriate value.

I'd propose one final refactoring step. Why do we have to define the methods that compute the area and the perimeter inside the reference model? A triangle register contains all of the information required to compute these values. Seeing as how we'll just be using the TRIANGLE0 subtype in our code, we can extend that to contain a get_area() method:

<'
extend TRIANGLE0 vr_ad_reg {
  get_area() : real is {
    var half_per : real = 0.5 * (SIDE0 + SIDE1 + SIDE2);
    result = sqrt(
      (half_per - SIDE0) *
      (half_per - SIDE1) *
      (half_per - SIDE2) *
      half_per
    );
  };
};
'>
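
As a quick sanity check of the formula, here is a minimal sketch that fills a TRIANGLE0 with the sides of a 3-4-5 right triangle. The values are chosen purely for illustration, and extending sys.run() is just a convenient place to put the check. The half perimeter is 6, so the area should come out as sqrt(3 * 2 * 1 * 6) = sqrt(36) = 6:

<'
extend sys {
  run() is also {
    // Illustration values only: a 3-4-5 right triangle
    var triangle : TRIANGLE0 vr_ad_reg = new;
    triangle.SIDE0 = 3;
    triangle.SIDE1 = 4;
    triangle.SIDE2 = 5;
    
    // half_per = 0.5 * (3 + 4 + 5) = 6
    // area = sqrt((6 - 3) * (6 - 4) * (6 - 5) * 6) = sqrt(36) = 6
    print triangle.get_area();
  };
};
'>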

Getting the area of a triangle becomes just:

<'
print graphics_model.get_triangle_reg(0).get_area();
'>

We can also rewrite the largest() method as:

<'
extend flattened_graphics_model {
  get_triangle_regs() : list of TRIANGLE0 vr_ad_reg is {
    for i from 0 to num_triangles - 1 {
      result.add(get_triangle_reg(i));
    };
  };
  
  largest() : uint is {
    var triangles := get_triangle_regs();
    result = triangles.max_index(it.get_area());
  };
};
'>
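
One caveat: get_area() only exists under the TRIANGLE0 subtype. If get_triangle_reg(...) still returns a plain vr_ad_reg, as it did earlier, its result can't be used directly in these snippets. A minimal sketch of one way to bridge the gap is to wrap the copy trick from before in a helper. The name get_triangle_copy(...) is hypothetical and not part of the original code:

<'
extend flattened_graphics_model {
  // Hypothetical helper: returns a TRIANGLE0 copy of the triangle register
  // at index 'idx', assuming get_triangle_reg(...) returns a vr_ad_reg
  get_triangle_copy(idx : uint) : TRIANGLE0 vr_ad_reg is {
    assert idx < num_triangles;
    var triangle : TRIANGLE0 vr_ad_reg = new;
    triangle.write_reg_rawval(get_triangle_reg(idx).read_reg_rawval());
    result = triangle;
  };
};
'>

With such a helper, get_triangle_regs() would add get_triangle_copy(i) to its result instead of the register itself.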

Of course, we can do the same for the perimeter aspect (not shown here). Let's take a moment to see what we've achieved. We've managed to program our computations in a generic way, by relying on methods that take the index of a register as a parameter. This saves us a lot of typing because we don't have to define a method that accesses each field. We've also nicely encapsulated our methods: all methods that refer to a single triangle (get_area() and get_perimeter()) are defined in the triangle register struct, while the methods that refer to all triangles are encapsulated in the reference model struct.
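
For completeness, a get_perimeter() method could look something like the sketch below. The uint return type is an assumption on my part. Since each side is only 8 bits wide, the sum easily fits:

<'
extend TRIANGLE0 vr_ad_reg {
  // Assumed return type: the sum of three 8-bit sides fits in a uint
  get_perimeter() : uint is {
    result = SIDE0 + SIDE1 + SIDE2;
  };
};
'>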

Earlier, I mentioned the bonus requirement that we want our resulting code to look as similar as possible to the case where the register definitions aren't flattened. Let's see how our reference model would look in the ideal case.

First we have to start with our register definitions:

<'
reg_def TRIANGLE {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};


extend GRAPHICS vr_ad_reg_file {
  triangles[3] : list of TRIANGLE vr_ad_reg;
  
  add_registers() is also {
    for each (triangle) in triangles {
      add_with_offset(index * 0x10, triangle);
    };
  };
};
'>

Since there is only one triangle struct, we extend that to add the get_area() method:

<'
extend TRIANGLE vr_ad_reg {
  get_area() : real is {
    var half_per : real = 0.5 * (SIDE0 + SIDE1 + SIDE2);
    result = sqrt(
      (half_per - SIDE0) *
      (half_per - SIDE1) *
      (half_per - SIDE2) *
      half_per
    );
  };
};
'>

Finding the largest and the smallest triangles is easily done by iterating over the triangles list of the register file:

<'
extend compacted_graphics_model {
  largest() : uint is {
    var triangles := graphics_regs.triangles;
    result = triangles.max_index(it.get_area());
  };
  
  smallest() : uint is {
    var triangles := graphics_regs.triangles;
    result = triangles.min_index(it.get_area());
  };
};
'>

Notice that we don't need the get_triangle_regs() method anymore, as we already have our triangles organized in a list. If we were to implement the last proposal, then once our register definitions are fixed, migrating to the new structure would only require some minor search and replace operations. This goes to show that starting off on the wrong foot doesn't mean we're completely out of the dance. With some extra work, we can get very close to the ideal solution, but we have to be willing to compromise a bit on simulation speed. Still, it's better than compromising on maintainability and getting stuck in an endless loop of bad coding style.

I hope you found this post useful. I've posted the code to SourceForge for reference. Stay tuned for more!

Saturday, November 1, 2014

Using indirect_access(...) in vr_ad

I've been working a lot with vr_ad lately. It has a lot of nice features for modeling registers, but unfortunately not all of them are documented. I'm going to do a longer series of posts based on my recent experiences.

Let's start out small. We've all had the case where accesses to one register of the design affect the values of other registers. This might be best illustrated with a concrete example. Let's say we have a status register defined as follows:

<'
reg_def STATUS EXAMPLE 0x0 {
  reg_fld VALID : uint(bits : 1) : R : 0x0;
  reg_fld DONE  : uint(bits : 1) : R : 0x0;
};
'>

We'll also have a control register:

<'
reg_def CONTROL EXAMPLE 0x4 {
  reg_fld SETVALID : uint(bits : 1) : W : 0x0;
  reg_fld CLRVALID : uint(bits : 1) : W : 0x0;
  reg_fld CLRDONE  : uint(bits : 1) : W : 0x0;
  reg_fld START    : uint(bits : 1) : W : 0x0;
};
'>

The status flags are updated by the device. Using the control register's SETVALID and CLRVALID fields, we can affect the value of the VALID status flag. Whenever our VALID flag is set, our DUT can start crunching data. The operation is triggered by writing the START field of the control register and terminates within 5 clock cycles. The DONE status flag is set by the hardware once the operation completes and can be cleared by writing to the CLRDONE control field.

In the past I implemented this by giving the affecting register pointers to the affected registers and implementing the logic inside the post_access(...) method. While this works, vr_ad provides a neater way of doing it.

What we first need to do is to mark the status register as an observer of the control register:

<'
extend EXAMPLE vr_ad_reg_file {
  add_registers() is also {
    control.attach(status);
  };
};
'>

Now, whenever the control register is accessed, a method called indirect_access(...) is called inside the status register. This method receives the direction of the access, together with the observed vr_ad object (in our case it's a register, but it could just as well be a register file, a memory, etc.). Using this information we can model the evolution of the status flags.

Let's start with modeling the VALID flag. As stated above, writing a '1' to SETVALID will set the flag, while writing a '1' to CLRVALID will clear it. Simultaneously writing '1's to both fields doesn't make sense, so what we'll do in that case is leave the flag unchanged. Let's distil this behavior into a method:

<'
extend STATUS vr_ad_reg {
  model_valid(control : CONTROL vr_ad_reg) is {
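    // SETVALID on its own sets the flag, CLRVALID on its own clears it;
    // writing both at once leaves VALID unchanged
    if control.SETVALID == 1 and control.CLRVALID == 0 {
      VALID = 1;
    }
    else if control.CLRVALID == 1 and control.SETVALID == 0 {
      VALID = 0;
    };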
  };
};
'>

We'll call this method from within indirect_access(...), whenever a write to CONTROL happens:

<'
extend STATUS vr_ad_reg {
  indirect_access(direction : vr_ad_rw_t, ad_item : vr_ad_base) is {
    if direction == WRITE {
      var control := ad_item.as_a(CONTROL vr_ad_reg);
      assert control != NULL;
      
      model_valid(control);
    };
  };
};
'>

Let's give it a test drive to make sure that everything works. We'll emulate a monitor updating the register model by calling update(...) for writes and compare_and_update(...) for reads. To make things easier, let's wrap these calls into two handy methods to access the registers:

<'
extend sys {
  reg_file : EXAMPLE vr_ad_reg_file;
  
  event clk is @sys.any;
  
  write_control(data : vr_ad_data_t) @clk is {
    reg_file.update(0x4, pack(packing.high, data), {});
    wait [1];
  };
  
  read_status(data : vr_ad_data_t) @clk is {
    compute reg_file.compare_and_update(0x0, pack(packing.high, data));
    wait [1];
  };
};
'>

The data argument we pass to these methods represents the data seen on the bus. When reading the status register, if the data we pass to compare_and_update(...) doesn't match the model's value, an error message will appear and we'll know we've made a mistake.

Let's start by trying to set VALID:

<'
extend sys {
  run() is also {
    start do_test();
  };
  
  do_test() @clk is {
    // set VALID
    write_control(0b1000);
    
    // expect to read VALID
    read_status(0b10);
  };
};
'>

Let's also try to clear VALID:

<'
extend sys {
  do_test() @clk is also {
    // clear VALID
    write_control(0b0100);
    
    // expect to read not VALID
    read_status(0b00);
  };
};
'>
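
It's also worth checking the corner case where both fields are written at the same time. The following is only a sketch in the same style as the tests above. If the model follows the rule we stated earlier, VALID should keep its previous value:

<'
extend sys {
  do_test() @clk is also {
    // set VALID, so that there is something to preserve
    write_control(0b1000);
    read_status(0b10);
    
    // write SETVALID and CLRVALID together
    write_control(0b1100);
    
    // expect VALID to be unchanged
    read_status(0b10);
    
    // clear VALID again to restore the previous state
    write_control(0b0100);
    read_status(0b00);
  };
};
'>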

Let's tackle the DONE flag now. We'll need to define a clock event to handle the duration of the operation. Also, our modeling method will have to be a TCM:

<'
extend STATUS vr_ad_reg {
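  // Note: this event has to be emitted or bound to a clock (for example
  // the one used by the test) for the wait in model_done(...) to advance
  event clk;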
  
  model_done(control : CONTROL vr_ad_reg) @clk is {
    if control.CLRDONE == 1 {
      DONE = 0;
    };
    
    if control.START == 1 and VALID == 1 {
      wait [5];
      DONE = 1;
    };
  };
};
'>

We'll start this method from indirect_access(...):

<'
extend STATUS vr_ad_reg {
  indirect_access(direction : vr_ad_rw_t, ad_item : vr_ad_base) is {
    if direction == WRITE {
      // ...
      
      start model_done(control);
    };
  };
};
'>

As before, let's test this to make sure it works. First let's set VALID and issue a START command. With an immediate read from the status register we'll only expect to see the VALID flag set. After 5 clock cycles we'll expect to see both VALID and DONE set:

<'
extend sys {
  do_test() @clk is also {
    // set VALID and START
    write_control(0b1001);
    
    // expect to read VALID
    read_status(0b10);
    
    // after 5 clocks, expect to read DONE as well
    for i from 1 to 5 { wait [1] };
    read_status(0b11);
  };
};
'>

After writing CLRDONE, we expect to see the DONE flag cleared:

<'
extend sys {
  do_test() @clk is also {
    // clear DONE
    write_control(0b0010);
    
    // expect to read not DONE
    read_status(0b10);
  };
};
'>
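
The binary literals in these tests are easy to mix up. As an optional readability sketch, we could give them names. The macro names below are hypothetical, and the values are simply inferred from the tests in this post (for example, 0b1000 was used to set VALID and 0b1001 to set VALID and START):

<'
// Hypothetical names; values inferred from the tests in this post
define CONTROL_SETVALID 0b1000;
define CONTROL_CLRVALID 0b0100;
define CONTROL_CLRDONE  0b0010;
define CONTROL_START    0b0001;

define STATUS_VALID 0b10;
define STATUS_DONE  0b01;

extend sys {
  do_test() @clk is also {
    // set VALID and START
    write_control(CONTROL_SETVALID | CONTROL_START);
    read_status(STATUS_VALID);
    
    // after 5 clocks, expect to read DONE as well
    wait [5];
    read_status(STATUS_VALID | STATUS_DONE);
    
    // clear DONE
    write_control(CONTROL_CLRDONE);
    read_status(STATUS_VALID);
  };
};
'>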

And there we have it: a nice, clean way of modeling register interdependencies. This pattern can be extended to however many other registers might depend on the CONTROL register. Best of all, it encapsulates the modeling logic inside the register that is being affected, as opposed to inside the affecting register. This means that we can easily add and remove observer registers as we please, as they are all independent of each other.
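
To illustrate adding another observer, here is a sketch with a made-up IRQ register. The register, its PENDING field and the rule that a START command sets PENDING are all hypothetical and only there to show the pattern. Nothing in STATUS or CONTROL has to change:

<'
// Hypothetical register, for illustration only
reg_def IRQ EXAMPLE 0x8 {
  reg_fld PENDING : uint(bits : 1) : R : 0x0;
};

extend EXAMPLE vr_ad_reg_file {
  add_registers() is also {
    control.attach(irq);
  };
};

extend IRQ vr_ad_reg {
  indirect_access(direction : vr_ad_rw_t, ad_item : vr_ad_base) is {
    if direction == WRITE {
      var control := ad_item.as_a(CONTROL vr_ad_reg);
      assert control != NULL;
      
      // Assumed behavior: a START command raises PENDING
      if control.START == 1 {
        PENDING = 1;
      };
    };
  };
};
'>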

You can find the complete code on SourceForge if you want to try it out.

Have fun with your register modeling!

P.S.

Don't forget to subscribe if you don't want to miss out on any future vr_ad related posts. You can also be notified about new posts via email by using the "Subscribe by Email" box.