Sunday, December 14, 2014

Experimental Cures for Flattened Register Definitions in vr_ad, Part 2

We've already talked about how to handle flattened register definitions from a modeling point of view in this post. This other post also showed us that accessing multiply instantiated registers is a bit of a challenge, even when they are defined properly. Let's add the missing piece of the puzzle now and have a look at how to easily access flattened registers.

Let's apply the same idea from the previous post and use macros. This post is actually less experimental as I've already used this approach on my current project.

Let's start out with the register definitions. We'll use our trusty graphics processing engine that can handle triangles:

<'
extend vr_ad_reg_file_kind : [ GRAPHICS ];
extend GRAPHICS vr_ad_reg_file {
  keep size == 256;
  post_generate() is also {
    reset();
  };
};

reg_def TRIANGLE0 GRAPHICS 0x00 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};

reg_def TRIANGLE1 GRAPHICS 0x4 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};

reg_def TRIANGLE2 GRAPHICS 0x8 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};
'>

Here's how we would access a single triangle:

<'
extend MAIN vr_ad_sequence {
  !triangle0 : TRIANGLE0 vr_ad_reg;
  
  body() @driver.clock is only {
    write_reg triangle0 {
      .SIDE0 == 1;
      .SIDE1 == 2;
      .SIDE2 == 3;
    };
  };
};
'>

If we wanted to access TRIANGLE1, we would need to use a field of type TRIANGLE1 vr_ad_reg, but the code would otherwise stay the same. We don't need to use any static_item, because each register type is unique (which is exactly our problem). You can already see that creating a generic sequence that can handle any triangle is going to become a mess of case statements and duplicated code for the constraints.
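To make that mess concrete, here is a minimal sketch of what a "generic" triangle write would look like without any helper macro. The method name write_triangle and its idx parameter are made up for illustration. Every branch has to repeat the same constraints on a differently typed field:

<'
extend MAIN vr_ad_sequence {
  !triangle0 : TRIANGLE0 vr_ad_reg;
  !triangle1 : TRIANGLE1 vr_ad_reg;
  !triangle2 : TRIANGLE2 vr_ad_reg;

  // Hypothetical helper: one branch per flattened register type
  write_triangle(idx : uint) @driver.clock is {
    case idx {
      0: { write_reg triangle0 { .SIDE0 == 1; .SIDE1 == 2; .SIDE2 == 3 }; };
      1: { write_reg triangle1 { .SIDE0 == 1; .SIDE1 == 2; .SIDE2 == 3 }; };
      2: { write_reg triangle2 { .SIDE0 == 1; .SIDE1 == 2; .SIDE2 == 3 }; };
      default: { error("No such triangle: ", idx); };
    };
  };
};
'>

Adding a fourth triangle, or changing a constraint, means touching every branch.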

We can fix that by using a macro on top of write_reg. We can call this macro on any field of type TRIANGLE0, TRIANGLE1, etc. and pass the desired instance as an argument. This is how we would write to TRIANGLE1:

<'
extend MAIN vr_ad_sequence {
  !triangle : TRIANGLE0 vr_ad_reg;
  
  body() @driver.clock is only {
    write_triangle_reg 1 triangle {
      .SIDE1 == 1;
    };
  };
};
'>

The macro would need to generate triangle with the appropriate constraint and execute an access to TRIANGLE1. Here's the macro body:

<'
define <write_triangle_reg'action>
  "write_triangle_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  var los : list of string;
  los.add(        "{");
  los.add(appendf("var temp_triangle : typeof(%s);", <reg'exp>));
  
  if <any> != "" {
    los.add(appendf("gen temp_triangle keeping %s;", <any>));
  }
  else {
    los.add(        "gen temp_triangle;");
  };
  
  los.add(        "var access_triangle : vr_ad_reg = new with {;");
  los.add( append(".kind = appendf(\"TRIANGLE%d\", ", <idx'exp>, ").as_a(vr_ad_reg_kind);"));
  los.add(        "};");
  los.add(        "write_reg access_triangle val temp_triangle.read_reg_rawval();");
  los.add(        "};");
  
  result = str_join(los, "\n");
};
'>

We generate a temporary register of the same type as the one we get passed in. Since all triangles have the same fields, it's irrelevant if the triangle field we passed in is of type TRIANGLE0, TRIANGLE1, etc. For the call to write_reg we need to use a variable of the appropriate type. We build this type from the <idx'exp> argument by appending it to the string "TRIANGLE" and converting the result to a vr_ad_reg_kind. As the write value we use the contents of the temporary register we just generated.
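To see what this amounts to, here is roughly what the write to TRIANGLE1 from above expands to. It is written out by hand from the macro body, with the constraint block reflowed onto one line and the inner block tidied up, so treat it as a sketch rather than the literal expansion:

<'
extend MAIN vr_ad_sequence {
  !triangle : TRIANGLE0 vr_ad_reg;

  body() @driver.clock is only {
    // Hand-written equivalent of:
    //   write_triangle_reg 1 triangle { .SIDE1 == 1; };
    {
      // Temporary register with the same fields as 'triangle'
      var temp_triangle : typeof(triangle);
      gen temp_triangle keeping { .SIDE1 == 1; };

      // Register of the kind we actually want to access
      var access_triangle : vr_ad_reg = new with {
        .kind = appendf("TRIANGLE%d", 1).as_a(vr_ad_reg_kind);
      };

      // Write the generated contents to TRIANGLE1
      write_reg access_triangle val temp_triangle.read_reg_rawval();
    };
  };
};
'>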

This should remind us that in this form our macro won't work in all cases, particularly when trying to use the val <val> syntax:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_triangle_reg 1 triangle val 0x010101;
  };
};
'>

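This will give us a cryptic compile error, because the macro pastes val 0x010101 after keeping, and the resulting gen temp_triangle keeping val 0x010101; is not a valid constraint. To do away with it we need to fix our macro body: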

<'
define <write_triangle_reg'action>
  "write_triangle_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  var los : list of string;
  var is_val : bool = str_match(<any>, "/^val /");
  
  los.add(        "{");
  los.add(appendf("var temp_triangle : typeof(%s) = new;", <reg'exp>));
  
  if not is_val {
    if <any> != ""{
      los.add(appendf("gen temp_triangle keeping %s;", <any>));
    }
    else {
      los.add(        "gen temp_triangle;");
    };
  };
  
  los.add(        "var access_triangle : vr_ad_reg = new with {;");
  los.add( append(".kind = appendf(\"TRIANGLE%d\", ", <idx'exp>, ").as_a(vr_ad_reg_kind);"));
  los.add(        "};");
  if not is_val {
    los.add(        "write_reg access_triangle val temp_triangle.read_reg_rawval();");
  }
  else {
    los.add(appendf("write_reg access_triangle %s;", <any>));
  };
  los.add(        "};");
  
  result = str_join(los, "\n");
};
'>

We need to filter out the case when using the val <val> syntax. In that case we don't need to generate our temporary register to use it as the write value. "This macro is getting a bit too complicated", you might say and you would be right, but this is the price we pay for not doing things properly from the start (i.e. not having flattened register definitions).

As we've previously seen, the code for the write_reg_fields and read_reg flavors of the macro will be pretty similar, so it makes sense to encapsulate and generalize the macro code inside a global function:

<'
extend global {
  vgm__access_triangle_reg_body(operation : string,
    idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    var is_val : bool = str_match(block, "/^val /");
    
    los.add(        "{");
    los.add(appendf("var temp_triangle : typeof(%s) = new;", reg));
    
    if operation == "write_reg" {
      if not is_val {
        if block != ""{
          los.add(appendf("gen temp_triangle keeping %s;", block));
        }
        else {
          los.add(        "gen temp_triangle;");
        };
      };
    }
    else if operation == "write_reg_fields" {
      los.add(appendf("temp_triangle = new with %s;", block));
    };
    
    los.add(        "var access_triangle : vr_ad_reg = new with {;");
    los.add( append(".kind = appendf(\"TRIANGLE%d\", ", idx, ").as_a(vr_ad_reg_kind);"));
    los.add(        "};");
    
    // writing/reading
    if operation != "read_reg" {
      if operation != "write_reg" or not is_val {
        los.add(        "write_reg access_triangle val temp_triangle.read_reg_rawval();");
      }
      else {
        los.add(appendf("write_reg access_triangle %s;", block));
      };
    }
    else {
      los.add(        "read_reg access_triangle;");
      los.add(appendf("if %s == NULL {", reg));
      los.add(appendf("%s = new;", reg));
      los.add(        "};");
      los.add(appendf("%s.write_reg_rawval(access_triangle.read_reg_rawval());", reg));
    };
    
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>

Our macro bodies will just contain calls to this function:

<'
define <write_triangle_reg'action>
  "write_triangle_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_triangle_reg_body("write_reg", <idx'exp>, <reg'exp>, <any>);
};

define <write_triangle_reg_fields'action>
  "write_triangle_reg_fields <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_triangle_reg_body("write_reg_fields", <idx'exp>, <reg'exp>, <any>);
};

define <read_triangle_reg'action>
  "read_triangle_reg <idx'exp> <reg'exp>" as computed
{
  result = vgm__access_triangle_reg_body("read_reg", <idx'exp>, <reg'exp>);
};
'>

Here's the write_reg_fields flavor in action:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_triangle_reg_fields 1 triangle { .SIDE1 = 1 };
  };
};
'>

And here's the read_reg flavor in action:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    read_triangle_reg 2 triangle;
  };
};
'>

What we have up to now works fine for triangles, but as you may remember, our graphics engine can also process circles:

<'
reg_def CIRCLE0 GRAPHICS 0x10 {
  reg_fld RADIUS : uint(bits : 8);
};

reg_def CIRCLE1 GRAPHICS 0x14 {
  reg_fld RADIUS : uint(bits : 8);
};

reg_def CIRCLE2 GRAPHICS 0x18 {
  reg_fld RADIUS : uint(bits : 8);
};
'>

We need to generalize our macro to handle any type of register. We can use string matching to separate the register type from its instance number. For example, CIRCLE0 is composed of CIRCLE and 0. Once we extract the 0 from the end we can append the appropriate index given as an input to the macro. Here's a snippet that does exactly this:

<'
var reg_kind_str := reg.kind.as_a(string);
assert str_match(reg_kind_str, "/(.*)(\\d+)$/");
reg_kind_str = appendf("%s%d", $1, idx);
'>

We match everything until the end of the string, where we expect to see at least one numeric character. The first match group is stored in $1 (a built-in variable), to which we append the desired index. We integrate this code into the global function that returns the macro body:

<'
extend global {
  vgm__access_graphics_reg_body(operation : string,
    idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    var is_val : bool = str_match(block, "/^val /");
    
    los.add(        "{");
    los.add(appendf("var temp_reg : typeof(%s) = new;", reg));
    
    if operation == "write_reg" {
      if not is_val {
        if block != ""{
          los.add(appendf("gen temp_reg keeping %s;", block));
        }
        else {
          los.add(        "gen temp_reg;");
        };
      };
    }
    else if operation == "write_reg_fields" {
      los.add(appendf("temp_reg = new with %s;", block));
    };
    
    los.add(appendf("if %s == NULL {", reg));
    los.add(appendf("%s = new;", reg));
    los.add(        "};");
    los.add(appendf("var reg_kind_str := %s.kind.as_a(string);", reg));
    los.add(        "assert str_match(reg_kind_str, \"/(.*)(\\d+)$/\");");
    los.add( append("reg_kind_str = appendf(\"%s%d\", $1, ", idx, ");"));
    los.add(        "var access_reg : vr_ad_reg = new with {;");
    los.add(        ".kind = reg_kind_str.as_a(vr_ad_reg_kind);");
    los.add(        "};");
    
    // writing/reading
    if operation != "read_reg" {
      if operation != "write_reg" or not is_val {
        los.add(        "write_reg access_reg val temp_reg.read_reg_rawval();");
      }
      else {
        los.add(appendf("write_reg access_reg %s;", block));
      };
    }
    else {
      los.add(        "read_reg access_reg;");
      los.add(appendf("%s.write_reg_rawval(access_reg.read_reg_rawval());", reg));
    };
    
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>
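The renaming step can also be tried out in isolation. The following is a small sketch, assuming the CIRCLE0 to CIRCLE2 definitions from above are loaded. Hooking it into sys.run() and the local variable names are choices made only for this illustration:

<'
extend sys {
  run() is also {
    var reg_kind_str : string = "CIRCLE0";
    var idx : uint = 2;

    // Split the kind name into its base ($1) and the trailing instance number
    assert str_match(reg_kind_str, "/(.*)(\\d+)$/");

    // Re-attach the instance number we actually want
    reg_kind_str = appendf("%s%d", $1, idx);

    // The string should now name an existing register kind
    print reg_kind_str.as_a(vr_ad_reg_kind);
  };
};
'>

One thing to keep in mind: if the Perl-style .* is greedy, as it is in Perl itself, only the last digit is split off, so names with two-digit instance numbers such as CIRCLE10 would probably need a stricter pattern.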

As before, the macros just call this function:

<'
define <write_graphics_reg'action>
  "write_graphics_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg", <idx'exp>, <reg'exp>, <any>);
};

define <write_graphics_reg_fields'action>
  "write_graphics_reg_fields <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg_fields", <idx'exp>, <reg'exp>, <any>);
};

define <read_graphics_reg'action>
  "read_graphics_reg <idx'exp> <reg'exp>" as computed
{
  result = vgm__access_graphics_reg_body("read_reg", <idx'exp>, <reg'exp>);
};
'>

Here's the macro being used to write to CIRCLE1:

<'
extend MAIN vr_ad_sequence {
  !circle : CIRCLE0 vr_ad_reg;
  
  body() @driver.clock is only {
    write_graphics_reg 1 circle { .RADIUS == 5 };
  };
};
'>

One of the main reasons why we wanted a generic way of accessing the registers is, of course, being able to do loops. Using the macro we can write all triangle registers:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    for i from 0 to 2 {
      write_graphics_reg i triangle {
        .SIDE0 == 3;
        .SIDE1 == 3;
        .SIDE2 == 3;
      };
    };
  };
};
'>

We can also easily read all circle registers:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    for i from 0 to 2 {
      read_graphics_reg i circle;
    };
  };
};
'>

The macros could also be converted to operate in multiple dimensions (i.e. to handle multiple instances of the GRAPHICS register file, etc.) by just updating the extraction of the vr_ad_reg_kind variable. We won't look at that here. If you want to get started with them, you can find the code on SourceForge, including the complete testing harness I used for development.

Using this approach we can easily access flattened registers in a generic way, allowing us to write portable sequences to complement the nice reference models we learned to code last time. We pay for having flattened definitions with more complicated macro code, but the internal implementation of these macros is easy to change once we swap out the flattened model for a correctly defined one.

Happy register accesses and see you next time!

Tuesday, November 25, 2014

Working with Multiple Instances of vr_ad Registers

The devices that we verify often have multiple instances of a certain register type. These registers are instantiated in a regular structure inside the design. In our tests, we want to be able to access each of them.

In vr_ad, accessing a register is typically done using the write/read_reg family of macros. These macros encapsulate the very flexible access mechanisms that the package provides into one convenient call. For unique registers, they just work, without having to pass in any additional options, but things get a bit more involved if the register type we are trying to access is instantiated multiple times. There are multiple scenarios where this can be the case. Let's have a look at them!

The first scenario that comes to mind is when there are multiple instances of the same register inside a register file. Let's take the example from the last post, with the graphics processing device:

<'
extend vr_ad_reg_file_kind : [ GRAPHICS ];
extend GRAPHICS vr_ad_reg_file {
  keep size == 512;
  post_generate() is also {
    reset();
  };
};


reg_def TRIANGLE {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};


extend GRAPHICS vr_ad_reg_file {
  triangles[3] : list of TRIANGLE vr_ad_reg;
  
  add_registers() is also {
    for each (triangle) in triangles {
      add_with_offset(index * 0x4, triangle);
    };
  };
};
'>

In this case, we have three TRIANGLE registers, each at a different offset. Let's try to use the write_reg macro to access a triangle:

<'
extend MAIN vr_ad_sequence {
  !triangle : TRIANGLE vr_ad_reg;
  
  body() @driver.clock is only {
    write_reg triangle;
  };
};
'>

Trying to use write_reg like this will result in an error message, stating that there are multiple TRIANGLE registers inside the address map. It's impossible for the macro to know exactly which instance we want. We can solve this ambiguity in two ways.

The first way is by using the macro's lesser-known (and optional) operation generation block argument. We get the exact instance of the model register we want and pass that in as the static_item:

<'
extend MAIN vr_ad_sequence {  
  body() @driver.clock is only {
    var static_triangle := driver.addr_map.get_regs_by_kind(TRIANGLE)[0];
    write_reg { .static_item == static_triangle } triangle;
  };
};
'>

This lets the macro know exactly which instance we want to access.

The other way we can do this is by using the static triangle as the macro argument:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    var static_triangle :=
      driver.addr_map.get_regs_by_kind(TRIANGLE)[1].as_a(TRIANGLE vr_ad_reg);
    write_reg static_triangle { .SIDE0 == 1 };
  };
};
'>

In this case, we don't have to pass a static_item anymore, but we have to cast the static triangle to be able to access the TRIANGLE subtype's fields. In both cases, though, we have to write quite a bit of code just to access the register we want.

Instead of always getting the static_item ourselves and using that, why not encapsulate the whole operation inside a macro of our own? Since macros are used to access vr_ad registers anyway, it shouldn't cause any confusion. What we need is an extra macro argument, so our macro should look something like this: write_triangle_reg <inst'idx> <reg'exp> <reg_gen'block> (let's ignore the <op_gen'block> argument for simplicity).

Such a macro would just fetch the appropriate static_item based on the <inst'idx> argument and forward the other arguments to write_reg. This is what such a macro would look like:

<'
define <write_triangle_reg'action>
  "write_triangle_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  var los : list of string;
  los.add(        "{");
  los.add(        "var static_triangle :=");
  los.add(appendf("driver.addr_map.get_regs_by_kind(TRIANGLE)[%s];", <idx'exp>));
  los.add(appendf("write_reg { .static_item == static_triangle } %s %s;", <reg'exp>, <any>));
  los.add(        "};");
  
  result = str_join(los, "\n");
};
'>

To access the last triangle, we would simply do:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_triangle_reg 2 triangle { .SIDE1 == 1 };
  };
};
'>

Now this is all fine and dandy, but what if we also want to read the registers? We would need a second macro, whose expansion would be very similar to the first one's. We could generalize the macro body code inside a function that can return the result for both access operations. The vr_ad package defines such functions inside the global singleton, so if it's good enough for Cadence, then it's good enough for us:

<'
extend global {
  vgm__access_triangle_reg_body(operation : string,
    idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    los.add(        "{");
    los.add(        "var static_triangles :=");
    los.add(        "driver.addr_map.get_regs_by_kind(TRIANGLE);");
    los.add(appendf("assert %s in [0..static_triangles.size() - 1];", idx));
    los.add(appendf("%s { .static_item == static_triangles[%s] } %s %s;", operation, idx, reg, block));
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>

This function can take any access operation (write_reg, read_reg or write_reg_fields, a lesser-known sibling of the two), together with the macro arguments, and return the macro body. Declaring the macros just means calling this function with the appropriate arguments:

<'
define <write_triangle_reg'action>
  "write_triangle_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_triangle_reg_body("write_reg", <idx'exp>, <reg'exp>, <any>);
};

define <write_triangle_reg_fields'action>
  "write_triangle_reg_fields <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_triangle_reg_body("write_reg_fields", <idx'exp>, <reg'exp>, <any>);
};

define <read_triangle_reg'action>
  "read_triangle_reg <idx'exp> <reg'exp>" as computed
{
  result = vgm__access_triangle_reg_body("read_reg", <idx'exp>, <reg'exp>);
};
'>

Now we can also easily read any TRIANGLE register. Here's a read of the first triangle:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    read_triangle_reg 0 triangle;
  };
};
'>

A pretty useful thing we can also do with the macros is use them inside loops:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    for i from 0 to 2 {
      write_triangle_reg_fields i triangle { .SIDE2 = 1 };
    };
  };
};
'>

What we have up to now is great and all. It works perfectly for triangles, but let's throw circles into the mix as well:

<'
reg_def CIRCLE {
  reg_fld RADIUS : uint(bits : 8);
};


extend GRAPHICS vr_ad_reg_file {
  circles[5] : list of CIRCLE vr_ad_reg;
  
  add_registers() is also {
    for each (circle) in circles {
      add_with_offset(0x20 + index * 0x4, circle);
    };
  };
};
'>

The *_triangle_reg macros won't work on these registers so we're back to where we started. It would be very silly to create a new macro that can handle only circles, because that would become really unmaintainable if we were to add more and more shapes. What we need is a macro that can work with any register, regardless of type.

We already have the information about the register's kind inside the register itself. We just need to use that when calling get_regs_by_kind(...) to get the static register. We'll call this new macro write_graphics_reg and define a new function that implements all three versions of it:

<'
extend global {
  vgm__access_graphics_reg_body(operation : string,
    idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    los.add(        "{");
    los.add(        "var kind : vr_ad_reg_kind;");
    los.add(appendf("if %s == NULL { %s = new };", reg, reg));
    los.add(appendf("kind = %s.kind;", reg));
    los.add(        "var static_regs :=");
    los.add(        "driver.addr_map.get_regs_by_kind(kind);");
    los.add(appendf("assert %s in [0..static_regs.size() - 1];", idx));
    los.add(appendf("%s { .static_item == static_regs[%s] } %s %s;", operation, idx, reg, block));
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>

The three macros will be:

<'
define <write_graphics_reg'action>
  "write_graphics_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg", <idx'exp>, <reg'exp>, <any>);
};

define <write_graphics_reg_fields'action>
  "write_graphics_reg_fields <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg_fields", <idx'exp>, <reg'exp>, <any>);
};

define <read_graphics_reg'action>
  "read_graphics_reg <idx'exp> <reg'exp>" as computed
{
  result = vgm__access_graphics_reg_body("read_reg", <idx'exp>, <reg'exp>);
};
'>

These new macros can work with triangles, circles and any new register types we might add:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_graphics_reg 2 triangle;
    read_graphics_reg 4 circle;
  };
};
'>

Let's look at a different scenario now. Let's assume that our DUT is composed of multiple slices, each of which is able to do some graphics processing independently of the others. An individual slice can only process one triangle and one circle, but there are many such slices. In this case, the register definitions would look like this:

<'
extend vr_ad_reg_file_kind : [ SLICE ];
extend SLICE vr_ad_reg_file {
  keep size == 8;
  post_generate() is also {
    reset();
  };
};


reg_def TRIANGLE SLICE 0x0 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};

reg_def CIRCLE SLICE 0x4 {
  reg_fld RADIUS : uint(bits : 8);
};


extend vr_ad_reg_file_kind : [ GRAPHICS ];
extend GRAPHICS vr_ad_reg_file {
  slices[3] : list of SLICE vr_ad_reg_file;
  
  keep size == 64;
  post_generate() is also {
    reset();
  };
  
  add_registers() is also {
    for each (slice) in slices {
      add_with_offset(index * 0x10, slice);
    };
  };
};
'>

As before, we can access a certain triangle or circle by getting the appropriate register instance from the address map and using that with write_reg, like we did in the previous scenario. What we can also do is get a handle to the appropriate register file and use that as the static_item:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    var static_reg_file := driver.addr_map.get_reg_files_by_kind(SLICE)[0];
    write_reg { .static_item == static_reg_file } triangle;
    write_reg { .static_item == static_reg_file } circle;
  };
};
'>

The same comment as above still applies; we'd have to do this operation every time we want to access a register, which can become tedious. Why not write a new macro, then?

Writing a macro that expands to the new code is pretty straightforward. Here's what the global function for that would look like:

<'
extend global {
  vgm__access_slice_reg_body(operation : string,
    idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    los.add(        "{");
    los.add(        "var static_slices :=");
    los.add(        "driver.addr_map.get_reg_files_by_kind(SLICE);");
    los.add(appendf("assert %s in [0..static_slices.size() - 1];", idx));
    los.add(appendf("%s { .static_item == static_slices[%s] } %s %s;", operation, idx, reg, block));
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>

We'll then wrap calls to this method inside the actual macro declarations, which I won't show here. Using these macros, we can easily access any slice register:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_slice_reg 1 triangle { .SIDE1 == 1 };
    read_slice_reg 1 triangle;
    
    write_slice_reg_fields 1 circle { .RADIUS = 1 };
    read_slice_reg 1 circle;
  };
};
'>
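For reference, the wrapper macros that were skipped above would presumably mirror the triangle and graphics ones. A sketch, assuming the same argument layout as before:

<'
define <write_slice_reg'action>
  "write_slice_reg <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_slice_reg_body("write_reg", <idx'exp>, <reg'exp>, <any>);
};

define <write_slice_reg_fields'action>
  "write_slice_reg_fields <idx'exp> <reg'exp>[ <any>]" as computed
{
  result = vgm__access_slice_reg_body("write_reg_fields", <idx'exp>, <reg'exp>, <any>);
};

define <read_slice_reg'action>
  "read_slice_reg <idx'exp> <reg'exp>" as computed
{
  result = vgm__access_slice_reg_body("read_reg", <idx'exp>, <reg'exp>);
};
'>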

We can also, for example, loop over all triangles in the design:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    for i from 0 to 2 {
      write_slice_reg i triangle val 0x010203;
    };
  };
};
'>

Things start to get fun when the design contains a mixture of the two cases we've looked at above. Our registers are organized in slices, but within one slice some registers may be instantiated multiple times:

<'
// TRIANGLE and CIRCLE reg_defs
// ...

reg_def SQUARE SLICE 0x20 {
  reg_fld SIDE : uint(bits : 8);
};


extend SLICE vr_ad_reg_file {
  triangles[3] : list of TRIANGLE vr_ad_reg;
  circles[4] : list of CIRCLE vr_ad_reg;
  
  add_registers() is also {
    for each (triangle) in triangles {
      add_with_offset(index * 0x4, triangle);
    };
    
    for each (circle) in circles {
      add_with_offset(0x10 + index * 0x4, circle);
    };
  };
};


extend vr_ad_reg_file_kind : [ GRAPHICS ];
extend GRAPHICS vr_ad_reg_file {
  slices[3] : list of SLICE vr_ad_reg_file;
  
  keep size == 256;
  post_generate() is also {
    reset();
  };
  
  add_registers() is also {
    for each (slice) in slices {
      add_with_offset(index * 0x40, slice);
    };
  };
};
'>

To access a square it's enough to just pass in the register file as a static_item (this is the same situation as in scenario no. 2):

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    var static_slice := driver.addr_map.get_reg_files_by_kind(SLICE)[0];
    write_reg { .static_item == static_slice } square;
  };
};
'>

To access a triangle (or a circle), though, we need to get the appropriate register instance (similarly to how we did it in scenario no. 1):

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    var static_slice := driver.addr_map.get_reg_files_by_kind(SLICE)[0];
    var static_triangle := static_slice.get_regs_by_kind(TRIANGLE)[0];
    write_reg { .static_item == static_triangle } triangle;
  };
};
'>

This means that our macro has to be able to handle both the register file and register indices. We can only process one square at a time, though, so in that case it doesn't make sense to pass in a register index (seeing as how there is only one instance per register file). The register index argument must therefore be optional. Starting top-down, this is what the definition of the write_graphics_reg macro would look like:

<'
define <write_graphics_reg'action>
  "write_graphics_reg <rf_idx'exp>[ <reg_idx'exp>] <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg", <rf_idx'exp>, <reg_idx'exp>, <reg'exp>, <any>);
};
'>

The global function would then take both of these arguments into account to determine the static_item that gets passed:

<'
extend global {
  vgm__access_graphics_reg_body(operation : string,
    rf_idx : string, reg_idx : string, reg : string, block : string = "") : string is
  {
    var los : list of string;
    los.add(        "{");
    los.add(        "var static_slices := driver.addr_map.get_reg_files_by_kind(SLICE);");
    los.add(appendf("assert %s in [0..static_slices.size() - 1];", rf_idx));
    
    // multiply instantiated reg
    if reg_idx != "" {
      los.add(        "var kind : vr_ad_reg_kind;");
      los.add(appendf("if %s == NULL { %s = new };", reg, reg));
      los.add(appendf("kind = %s.kind;", reg));
      los.add(        "var static_regs :=");
      los.add(appendf("static_slices[%s].get_regs_by_kind(kind);", rf_idx));
      los.add(appendf("assert %s in [0..static_regs.size() - 1];", reg_idx));
    };
    
    los.add(appendf("%s {", operation));
    if reg_idx == "" {
      los.add(appendf(".static_item == static_slices[%s];", rf_idx));
    }
    else {
      los.add(appendf(".static_item == static_regs[%s];", reg_idx));
    };
    los.add(appendf("} %s %s;", reg, block));
    los.add(        "};");
    
    result = str_join(los, "\n");
  };
};
'>
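Only the write_graphics_reg definition was shown above. The other two flavors would presumably follow the same pattern, with the optional register index in the same position. This is a sketch under that assumption, not code taken from the repository:

<'
define <write_graphics_reg_fields'action>
  "write_graphics_reg_fields <rf_idx'exp>[ <reg_idx'exp>] <reg'exp>[ <any>]" as computed
{
  result = vgm__access_graphics_reg_body("write_reg_fields", <rf_idx'exp>, <reg_idx'exp>, <reg'exp>, <any>);
};

define <read_graphics_reg'action>
  "read_graphics_reg <rf_idx'exp>[ <reg_idx'exp>] <reg'exp>" as computed
{
  // No block argument for reads, so 'block' keeps its default value
  result = vgm__access_graphics_reg_body("read_reg", <rf_idx'exp>, <reg_idx'exp>, <reg'exp>);
};
'>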

Using the extended version of the macro we can now access both triangles and squares:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    write_graphics_reg 1 1 triangle;
    read_graphics_reg 2 0 triangle;
    
    write_graphics_reg 1 square;
    read_graphics_reg 2 square;
  };
};
'>

We can also loop over all squares (not shown) or double-loop over all triangles:

<'
extend MAIN vr_ad_sequence {
  body() @driver.clock is only {
    for i from 0 to 2 {
      for j from 0 to 2 {
        read_graphics_reg i j triangle;
      };
    };
  };
};
'>
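The loop over all squares only needs the register file index, since there is a single SQUARE per slice. A minimal sketch, assuming a square field is declared in the sequence:

<'
extend MAIN vr_ad_sequence {
  !square : SQUARE vr_ad_reg;

  body() @driver.clock is only {
    // One SQUARE per slice, so only the slice index is passed
    for i from 0 to 2 {
      read_graphics_reg i square;
    };
  };
};
'>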

As we've seen, by encapsulating the write/read_reg macros inside our own we can easily select the register instance that we want to access. We save a lot of tedious typing and duplicated code to get the appropriate static_item every time. We pay a small price when using macros though, as it becomes more difficult to track down syntax errors, but with proper documentation the disadvantages can be reduced. For more examples and detailed code, check out the SourceForge repository.

If your next project involves a lot of registers with multiple instances, why not try this approach out? See you next time!

Monday, November 17, 2014

Experimental Cures for Flattened Register Definitions in vr_ad

On my current project, I had an issue with my register definitions. Quite a few of my DUT's registers were just instances of the same register type. My vr_ad register definitions were generated by a script, based on the specification, a flow that I'm pretty sure is very similar to what most of you also have. Instead of generating a nice regular structure, this script created a separate type for each register instance. What resulted was a flattened structure where I'd, for example, get one instance each of registers SOME_REG0, SOME_REG1, SOME_REG2, instead of three instances of SOME_REG. I was lucky enough to be able to (partly) change the definitions by patching them by hand.

Someone on StackOverflow had the same problem, but didn't have the luxury of being able to fix it like I did. They weren't allowed to touch the code as I'm guessing it probably belonged to a different team. They probably also had a lot of legacy code that was using those flattened register definitions. This made me want to do an experimental post on how to best cope with such an issue.

Naturally the best thing to do is to fix the underlying problem of the registers getting flattened, but that might not be possible, so let's look at how to fix the symptoms.

To be able to do any kind of serious modeling, we need to be able to program generically. We can't (easily) do this if each register is its own type. I've tried to think of how to best handle this from a maintainability point of view. As a bonus requirement, when the register definitions do get fixed (i.e. the generation flow gets updated), we'd like to have to make as few changes as possible to the modeling code.

Enough with the stories, let's get our hands dirty. As always, we'll start small, but think big. We'll go through a few iterations, look at where we're lacking and gradually refine our approach.

Let's say we have a device that can operate with shapes. Part of its functionality involves doing stuff with triangles. It can process multiple triangles at the same time, where each triangle is described by a register containing the lengths of its sides. Our DUT does computations on the triangles, based on these values. For example, it can compute the areas of the triangles. We want to check that what the DUT writes out is correct so we need to model these computations.

We have a trusty script that can generate the register definitions from the specification (maybe an XML file). This script isn't very well written and it doesn't know that all three TRIANGLE registers are just the same register instantiated 3 times (i.e. a regular structure), or maybe the information got lost in the XML somehow. This is what we get for our register definitions:

<'
extend vr_ad_reg_file_kind : [ GRAPHICS ];
extend GRAPHICS vr_ad_reg_file {
  keep size == 256;
  post_generate() is also {
    reset();
  };
};

reg_def TRIANGLE0 GRAPHICS 0x00 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};

reg_def TRIANGLE1 GRAPHICS 0x10 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};

reg_def TRIANGLE2 GRAPHICS 0x20 {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};
'>

Our reference model will contain a pointer to the register file:

<'
struct flattened_graphics_model {
  graphics_regs : GRAPHICS vr_ad_reg_file;
};
'>

The reference model needs to be able to compute the area of each triangle. As a first idea, we create a method for each triangle that implements Heron's formula:

<'
extend flattened_graphics_model {
  get_triangle0_area() : real is {
    var triangle0 := graphics_regs.triangle0;
    var half_per : real = 0.5 *
      (triangle0.SIDE0 + triangle0.SIDE1 + triangle0.SIDE2);
    result = sqrt(
      (half_per - triangle0.SIDE0) *
      (half_per - triangle0.SIDE1) *
      (half_per - triangle0.SIDE2) *
      half_per
    );
  };
  
  get_triangle1_area() : real is {
    var triangle1 := graphics_regs.triangle1;
    var half_per : real = 0.5 *
      (triangle1.SIDE0 + triangle1.SIDE1 + triangle1.SIDE2);
    result = sqrt(
      (half_per - triangle1.SIDE0) *
      (half_per - triangle1.SIDE1) *
      (half_per - triangle1.SIDE2) *
      half_per
    );
  };
  
  get_triangle2_area() : real is {
    var triangle2 := graphics_regs.triangle2;
    var half_per : real = 0.5 *
      (triangle2.SIDE0 + triangle2.SIDE1 + triangle2.SIDE2);
    result = sqrt(
      (half_per - triangle2.SIDE0) *
      (half_per - triangle2.SIDE1) *
      (half_per - triangle2.SIDE2) *
      half_per
    );
  };
};
'>

We can immediately see a problem with this approach. We've implemented the formula in three different places. This means that should something change, we have three places to fix. Now, Heron's formula changing is a pretty unlikely event, but the same argument holds for any other computation we might have to perform here.

What we can do is extract the part that computes the actual area into its own method that takes the three sides as its arguments:

<'
get_triangle_area(side0 : uint, side1 : uint, side2 : uint) : real is {
  var half_per : real = 0.5 * (side0 + side1 + side2);
  result = sqrt(
    (half_per - side0) *
    (half_per - side1) *
    (half_per - side2) *
    half_per
  );
};
'>

We can simplify the three methods from before to just call this generic method:

<'
get_triangle0_area() : real is {
  var triangle := graphics_regs.triangle0;
  result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
    triangle.SIDE2);
};

get_triangle1_area() : real is {
  var triangle := graphics_regs.triangle1;
  result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
    triangle.SIDE2);
};

get_triangle2_area() : real is {
  var triangle := graphics_regs.triangle2;
  result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
    triangle.SIDE2);
};
'>

At least this way we've centralized the computation part to one location. The number of such methods will grow linearly, though, with the number of TRIANGLE registers. This means that for n triangles we'll need n methods to compute the areas.

Let's add a new requirement: our DUT is also able to compute which triangle is the largest and we need to model that too. We can define a new method to do that based on the areas:

<'
largest() : uint is {
  var areas : list of real;
  areas.add(get_triangle0_area());
  areas.add(get_triangle1_area());
  areas.add(get_triangle2_area());
  
  result = areas.max_index(it);
};
'>

In this method, the number of calls to get_triangleX_area() also grows with the number of triangles. Moreover, if we want to be able to find out which triangle is the smallest, the method for that would have to look like this:

<'
smallest() : uint is {
  var areas : list of real;
  areas.add(get_triangle0_area());
  areas.add(get_triangle1_area());
  areas.add(get_triangle2_area());
  
  result = areas.min_index(it);
};
'>

Pretty much the same as largest(), isn't it? In this setup, adding a single triangle would require adding a new method for the area and changing two others. That's not very maintainable. We can use the same trick we did for the area computation and pull out computing the list of areas into its own method, while simplifying the largest() and smallest() methods:

<'
get_triangle_areas() : list of real is {
  result.add(get_triangle0_area());
  result.add(get_triangle1_area());
  result.add(get_triangle2_area());
};

largest() : uint is {
  var areas := get_triangle_areas();
  result = areas.max_index(it);
};

smallest() : uint is {
  var areas := get_triangle_areas();
  result = areas.min_index(it);
};
'>

Now we only need to update the get_triangle_areas() method when adding a new triangle. Not much of an improvement, but every little thing counts when you're potentially dealing with a large number of triangles.

While we may have things sorted out for areas, we get a new requirement. Our DUT can also compute perimeters and tell us which triangle is the longest and which one is the shortest. This means we'll need to add a similar set of methods to handle this aspect, based on the examples from above:

<'
extend flattened_graphics_model {
  get_triangle_perimeter(side0 : uint, side1 : uint, side2 : uint) : uint is {
    result = side0 + side1 + side2;
  };
  
  get_triangle0_perimeter() : uint is {
    var triangle := graphics_regs.triangle0;
    result = get_triangle_perimeter(triangle.SIDE0, triangle.SIDE1,
      triangle.SIDE2);
  };
  
  get_triangle1_perimeter() : uint is {
    var triangle := graphics_regs.triangle1;
    result = get_triangle_perimeter(triangle.SIDE0, triangle.SIDE1,
      triangle.SIDE2);
  };
  
  get_triangle2_perimeter() : uint is {
    var triangle := graphics_regs.triangle2;
    result = get_triangle_perimeter(triangle.SIDE0, triangle.SIDE1,
      triangle.SIDE2);
  };
  
  get_triangle_perimeters() : list of uint is {
    result.add(get_triangle0_perimeter());
    result.add(get_triangle1_perimeter());
    result.add(get_triangle2_perimeter());
  };
  
  longest() : uint is {
    var perimeters := get_triangle_perimeters();
    result = perimeters.max_index(it);
  };
  
  shortest() : uint is {
    var perimeters := get_triangle_perimeters();
    result = perimeters.min_index(it);
  };
};
'>

Adding just one measly triangle is starting to become a real pain. What would be awesome is being able to just add one line of code every time a new triangle gets added and be done with it. Well, thanks to our good friends, the macros, this is possible.

What we notice is that the code is very regular. Aside from the indices, the method bodies look remarkably similar. This means that for the area aspect we can create the following macro:

<'
define <triangle_area_utils'statement> "triangle_area_utils <num>" as {
  extend flattened_graphics_model {
    get_triangle<num>_area() : real is {
      var triangle := graphics_regs.triangle<num>;
      result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
        triangle.SIDE2);
    };
    
    get_triangle_areas() : list of real is also {
      result.add(get_triangle<num>_area());
    };
  };
};
'>

Adding a new triangle is now as easy as just expanding the macro with the appropriate argument:

<'
triangle_area_utils 0;
triangle_area_utils 1;
triangle_area_utils 2;
'>

We could define a similar macro for the perimeter aspect (I won't show it here). While we have made adding new triangles easier, we've also shot ourselves in the foot. Excessive use of macros is a code smell because it can be very difficult to understand what code gets expanded in the background. Also, it makes the code more difficult to refactor, since we can't rely on fancy IDE features.
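For reference, the perimeter counterpart might look roughly like the sketch below. It simply mirrors triangle_area_utils; the macro name triangle_perimeter_utils is made up for illustration. Since "is also" can only extend a method that already exists, the sketch assumes an empty base definition of get_triangle_perimeters() that replaces the hand-written one from before:

<'
// Assumed empty base definition, so that each macro expansion can
// append to it with "is also".
extend flattened_graphics_model {
  get_triangle_perimeters() : list of uint is {};
};

// Hypothetical macro, modeled on triangle_area_utils.
define <triangle_perimeter_utils'statement> "triangle_perimeter_utils <num>" as {
  extend flattened_graphics_model {
    get_triangle<num>_perimeter() : uint is {
      var triangle := graphics_regs.triangle<num>;
      result = get_triangle_perimeter(triangle.SIDE0, triangle.SIDE1,
        triangle.SIDE2);
    };
    
    get_triangle_perimeters() : list of uint is also {
      result.add(get_triangle<num>_perimeter());
    };
  };
};
'>

It would be expanded in the same way, with one triangle_perimeter_utils line per triangle.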

If we analyze the code up to now, we see that one of our main problems is that each triangle is stored in an individual field. This means that there's no way to access a triangle from a method by just passing in the index of the triangle (0, 1, 2, etc.). If we could do this, we could get rid of all our get_triangleX_area() methods.

A way of doing this is using the reflection API. Reflection allows us, among other things, to get a field of a struct by using only the name of that field, specified as a string. In our case, we know that our register file contains fields named triangle0, triangle1, triangle2, etc. We can use the reflection API to extract the field whose name has the appropriate index as its suffix:

<'
extend flattened_graphics_model {
  num_triangles : uint;
    keep num_triangles == 3;
  
  get_triangle_reg(idx : uint) : vr_ad_reg is {
    assert idx < num_triangles;
    
    var regs_type := rf_manager.get_exact_subtype_of_instance(graphics_regs);
    
    var triangle_reg_field :=
      regs_type.get_fields().first(it.get_name() == appendf("triangle%d", idx));
    assert triangle_reg_field != NULL;
    
    assert triangle_reg_field.get_type() ==
      rf_manager.get_type_by_name(appendf("TRIANGLE%d'kind vr_ad_reg", idx));
    result =
      triangle_reg_field.get_value(graphics_regs).get_value().unsafe();
  };
};
'>

The way to use the reflection API is to get the representation of our register file from the rf_manager singleton. What we'll end up with is a struct of type rf_struct that understands what fields, methods, etc. the register file has. Out of this we can extract a representation of the field for the triangle that interests us, of type rf_field. Based on this field we can construct our return value. How exactly this happens is explained in the documentation and in this excellent post from the Specman R&D team. Have a look at those resources for more details on how to use the reflection interface.

After we've gotten an instance of our desired register, we can use this to compute the area. We can do away with the get_triangleX_area() methods and replace them with one get_triangle_area_by_index(...) method:

<'
get_triangle_area_by_index(idx : uint) : real is {
  assert idx < num_triangles;
  var reg := get_triangle_reg(idx);
  var reg_type := rf_manager.get_exact_subtype_of_instance(reg);
  
  var side0_field := reg_type.get_fields().first(it.get_name() == "SIDE0");
  assert side0_field != NULL;
  assert side0_field.get_type() == rf_manager.get_type_by_name("uint(bits:8)");
  var side0 : uint = side0_field.get_value(reg).get_value().unsafe();
  
  var side1_field := reg_type.get_fields().first(it.get_name() == "SIDE1");
  assert side1_field != NULL;
  assert side1_field.get_type() == rf_manager.get_type_by_name("uint(bits:8)");
  var side1 : uint = side1_field.get_value(reg).get_value().unsafe();
  
  var side2_field := reg_type.get_fields().first(it.get_name() == "SIDE2");
  assert side2_field != NULL;
  assert side2_field.get_type() == rf_manager.get_type_by_name("uint(bits:8)");
  var side2 : uint = side2_field.get_value(reg).get_value().unsafe();
  
  result = get_triangle_area(side0, side1, side2);
};
'>
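The three lookups above differ only in the field name, so they could be folded into a helper. The following is just a sketch that reuses the same reflection calls; the method name get_side_by_name is made up for illustration:

<'
extend flattened_graphics_model {
  // Hypothetical helper: reads an 8-bit field of a register via
  // reflection, given the field's name.
  get_side_by_name(reg : vr_ad_reg, name : string) : uint is {
    var reg_type := rf_manager.get_exact_subtype_of_instance(reg);
    var side_field := reg_type.get_fields().first(it.get_name() == name);
    assert side_field != NULL;
    assert side_field.get_type() == rf_manager.get_type_by_name("uint(bits:8)");
    result = side_field.get_value(reg).get_value().unsafe();
  };
};
'>

With it, the body of get_triangle_area_by_index(...) would shrink to fetching the register and calling get_side_by_name(reg, "SIDE0"), get_side_by_name(reg, "SIDE1") and get_side_by_name(reg, "SIDE2").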

Because the return value of get_triangle_reg(...) is of type vr_ad_reg, we can't reference the SIDEx fields directly (as these are defined under when subtypes). We can't cast the value to any of these subtypes, because we would need n cast statements (the very thing we want to avoid). We can use the same method as before to get the values of the sides via the reflection interface. The resulting code isn't pretty, but it works. Can we do better, though?

Of course we can! An essential observation to make here is that all triangle register types contain the same fields, whether they are of type TRIANGLE0 or TRIANGLE1 or TRIANGLE2. We could do all of our operations using only a variable of one of these types, provided that we fill it up with the appropriate values for the sides. That is, a TRIANGLE0 with sides 1, 2 and 3 has the same area as a TRIANGLE1 with the same sides. With this idea in mind, we can do the following:

<'
get_triangle_area_by_index(idx : uint) : real is {
  assert idx < num_triangles;
  var triangle : TRIANGLE0 vr_ad_reg = new;
  triangle.write_reg_rawval(get_triangle_reg(idx).read_reg_rawval());
  result = get_triangle_area(triangle.SIDE0, triangle.SIDE1,
    triangle.SIDE2);
};
'>

We can just create a variable of type TRIANGLE0 and fill it up with the contents of our desired register. We can then reference the SIDE fields directly, without the need for all of that messy reflection code. The price we pay for this convenience, however, is in essence a copy operation. Whether this is slower than using the reflection interface I can't say (though I suspect it isn't), but it is in any case cleaner.

Our largest() method becomes pretty trivial to write:

<'
largest() : uint is {
  var areas : list of real;
  for i from 0 to num_triangles - 1 {
    areas.add(get_triangle_area_by_index(i));
  };
  
  result = areas.max_index(it);
};
'>

Not only that, but we can now handle any number of triangles without increasing the number of lines in the code. The only modification we need to make is to set the num_triangles field to the appropriate value.
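The same goes for smallest(). As a sketch, we could rebuild get_triangle_areas() on top of the index-based method and let both largest() and smallest() use it, as we did earlier:

<'
// Loop-based replacement for the hand-written list of areas.
get_triangle_areas() : list of real is {
  for i from 0 to num_triangles - 1 {
    result.add(get_triangle_area_by_index(i));
  };
};

smallest() : uint is {
  var areas := get_triangle_areas();
  result = areas.min_index(it);
};
'>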

I'd propose one final refactoring step. Why do we have to define the methods that compute the area and the perimeter inside the reference model? A triangle register contains all of the information required to compute these values. Seeing as how we'll just be using the TRIANGLE0 subtype in our code, we can extend that to contain a get_area() method:

<'
extend TRIANGLE0 vr_ad_reg {
  get_area() : real is {
    var half_per : real = 0.5 * (SIDE0 + SIDE1 + SIDE2);
    result = sqrt(
      (half_per - SIDE0) *
      (half_per - SIDE1) *
      (half_per - SIDE2) *
      half_per
    );
  };
};
'>

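Getting the area of a triangle becomes just the following, assuming get_triangle_reg(...) is changed to return a TRIANGLE0 vr_ad_reg filled with the raw value of the requested register (as written above it returns a plain vr_ad_reg, which has no get_area() method):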

<'
print graphics_model.get_triangle_reg(0).get_area();
'>
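One way to get a TRIANGLE0 vr_ad_reg out of the index-based lookup is to wrap the copy trick from before in a helper. This is only a sketch; the name get_triangle_copy is made up for illustration, and the snippets in this part of the post would use it (or a get_triangle_reg(...) rewritten along the same lines) to obtain their registers:

<'
extend flattened_graphics_model {
  // Hypothetical helper: returns a TRIANGLE0 register holding the same
  // raw value as the triangle register with the given index.
  get_triangle_copy(idx : uint) : TRIANGLE0 vr_ad_reg is {
    assert idx < num_triangles;
    result = new;
    result.write_reg_rawval(get_triangle_reg(idx).read_reg_rawval());
  };
};
'>

Keep in mind that the returned register is a snapshot: it doesn't follow later updates of the register file, so it should be fetched again whenever the values might have changed.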

We can also rewrite the largest() method as:

<'
extend flattened_graphics_model {
  get_triangle_regs() : list of TRIANGLE0 vr_ad_reg is {
    for i from 0 to num_triangles - 1 {
      result.add(get_triangle_reg(i));
    };
  };
  
  largest() : uint is {
    var triangles := get_triangle_regs();
    result = triangles.max_index(it.get_area());
  };
};
'>

Of course, we can do the same for the perimeter aspect (not shown here). Let's take a moment to see what we've achieved. We've managed to program our computations in a generic way, by relying on methods that take the index of a register as a parameter. This saves us a lot of typing because we don't have to define a method that accesses each field. We've also nicely encapsulated our methods: all methods that refer to a single triangle (get_area() and get_perimeter()) are defined in the triangle register struct, while the methods that refer to all triangles are encapsulated in the reference model struct.
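For the sake of completeness, the perimeter aspect could be sketched as follows, assuming the same get_triangle_regs() method as above:

<'
extend TRIANGLE0 vr_ad_reg {
  get_perimeter() : uint is {
    result = SIDE0 + SIDE1 + SIDE2;
  };
};

extend flattened_graphics_model {
  longest() : uint is {
    var triangles := get_triangle_regs();
    result = triangles.max_index(it.get_perimeter());
  };
  
  shortest() : uint is {
    var triangles := get_triangle_regs();
    result = triangles.min_index(it.get_perimeter());
  };
};
'>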

Earlier, I mentioned the bonus requirement that we want our resulting code to look as similar as possible to the case where the register definitions aren't flattened. Let's see how our reference model would look in the ideal case.

First we have to start with our register definitions:

<'
reg_def TRIANGLE {
  reg_fld SIDE0 : uint(bits : 8);
  reg_fld SIDE1 : uint(bits : 8);
  reg_fld SIDE2 : uint(bits : 8);
};


extend GRAPHICS vr_ad_reg_file {
  triangles[3] : list of TRIANGLE vr_ad_reg;
  
  add_registers() is also {
    for each (triangle) in triangles {
      add_with_offset(index * 0x10, triangle);
    };
  };
};
'>

Since there is only one triangle struct, we extend that to add the get_area() method:

<'
extend TRIANGLE vr_ad_reg {
  get_area() : real is {
    var half_per : real = 0.5 * (SIDE0 + SIDE1 + SIDE2);
    result = sqrt(
      (half_per - SIDE0) *
      (half_per - SIDE1) *
      (half_per - SIDE2) *
      half_per
    );
  };
};
'>

Finding the largest and the smallest triangles is easily done by iterating over the triangles list of the register file:

<'
extend compacted_graphics_model {
  largest() : uint is {
    var triangles := graphics_regs.triangles;
    result = triangles.max_index(it.get_area());
  };
  
  smallest() : uint is {
    var triangles := graphics_regs.triangles;
    result = triangles.min_index(it.get_area());
  };
};
'>
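The compacted_graphics_model struct itself isn't shown in this post. Presumably it mirrors flattened_graphics_model, so a minimal definition (an assumption on my part) would be:

<'
// Assumed definition, modeled on flattened_graphics_model.
struct compacted_graphics_model {
  graphics_regs : GRAPHICS vr_ad_reg_file;
};
'>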

Notice that we don't need the get_triangle_regs() method anymore, as we already have our triangles organized in a list. If we implement the last proposal for the flattened registers, then once the register definitions are fixed, migrating to the new structure only requires some minor search and replace operations. This goes to show that starting off on the wrong foot doesn't mean we're completely out of the dance. With some extra work, we can get very close to the ideal solution, but we have to be willing to compromise a bit on simulation speed. Still, it's better than compromising on maintainability and getting stuck in an endless loop of bad coding style.

I hope you found this post useful. I've posted the code to SourceForge for reference. Stay tuned for more!