Tuesday, September 22, 2015

Of Copies and Clones

Let's take a walk down memory lane and remember the fun times we had in college. For a few weeks at the end of the semester, though, things had to get serious, because exam season was starting. I wasn't a really big note taker, so I found myself having to borrow notes from my friends. While studying, I couldn't well start scribbling on my friends' notes, as that would have made them really mad. The solution was to make copies on which I could then write to my heart's content. This way I could give the originals back to their owners in the same state as I got them.

In programming we're often in the same situation. We get some object passed via a function call from another collaborator in the program. For example, inside a scoreboard we get a transaction from a monitor. We might need to fiddle with this object, but before we do this, it's wise to copy it so that the original stays untouched. Much has been written on object copying, but I want to touch on how this is handled in SystemVerilog and, in particular, when using UVM.

According to section 8.12 Assignment, renaming, and copying of the IEEE 1800-2012 standard, it's possible to use the new keyword to create a shallow copy of an object, like so:

some_class obj = new();
some_class copy = new obj;

This will create a new object whose fields are identical to those of the first one. Let's look at a concrete example. Assume we have an APB transfer class, which extends uvm_sequence_item:

package vgm_apb;

class transfer extends uvm_sequence_item;
  rand direction_e direction;
  rand bit [31:0] address;
  rand bit [31:0] data;
  rand int unsigned delay;

  // ...
endclass

Copying a transfer would be conceptually equivalent to:

copy = new();
copy.direction = orig.direction;
copy.address = orig.address;
copy.data = orig.data;
copy.delay = orig.delay;

The compiler would "create" all of this code for us under the hood, which would save us a lot of typing. It would take all of the fields that our transfer class has and create such an assignment statement for each and every one of them. When I say all fields, I mean all fields. Notice that transfer extends uvm_sequence_item, which means it will contain all fields defined in that class and in all other classes in the inheritance hierarchy. The parent class of uvm_sequence_item is uvm_transaction, which contains the following field definitions:

class uvm_transaction extends uvm_object;
  const uvm_event_pool events = new;
  uvm_event begin_event;
  uvm_event end_event;

  // ...
endclass

This means that our long list of assignments that the copy expands to would also contain:

copy.events = orig.events;
copy.begin_event = orig.begin_event;
copy.end_event = orig.end_event;

Notice that events, begin_event and end_event are objects themselves. An assignment like this wouldn't result in new objects being created inside the copy, but in the same objects getting referenced. Does this make sense? Not really. A transfer should have its own private events, totally independent of the events of other transfers. Unfortunately, it's impossible to exclude such fields from the copy procedure. In C++, it's possible to define a custom copy constructor to handle such special situations, but in SystemVerilog, when using the new operator, it's either all or nothing.
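To see the handle sharing in isolation, here's a minimal sketch using plain classes (no UVM). The payload and packet classes are made up for illustration and aren't part of the APB UVC:

```systemverilog
module shallow_copy_demo;

  // Hypothetical classes, used only to illustrate the shallow copy
  class payload;
    int value;
  endclass

  class packet;
    int id;
    payload p = new();
  endclass

  initial begin
    packet orig;
    packet dup;

    orig = new();
    orig.id = 1;
    orig.p.value = 10;

    // Shallow copy: 'id' is copied by value, 'p' is copied as a handle
    dup = new orig;

    dup.id = 2;        // affects only the copy
    dup.p.value = 20;  // also visible through orig.p, since both handles
                       // refer to the same payload object

    $display("orig.id = %0d, dup.id = %0d", orig.id, dup.id);
    $display("orig.p.value = %0d, dup.p.value = %0d", orig.p.value, dup.p.value);
    $display("same payload object: %0b", orig.p == dup.p);
  end
endmodule
```

Changing dup.id leaves orig.id alone, but the write through dup.p also shows up in orig.p, which is exactly what would happen with the events of a shallow-copied transfer.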

Because the language doesn't allow for flexibility when implementing copying, UVM introduced its own functions to handle this task, called copy(...) and clone(). Let's look at copy(...) first. The way copy(...) is supposed to work is that it updates the fields of the caller object with the values contained in the object passed as its argument:

vgm_apb::transfer orig_transfer = new();
vgm_apb::transfer copied_transfer = new();
copied_transfer.copy(orig_transfer);

Notice that we had to first create the object in which we stored the copy. After the call, some of the fields of copied_transfer will be set to the values contained in orig_transfer. Which fields get copied can be configured using the `uvm_field_*(...) macros:

class transfer extends uvm_sequence_item;
  // ...

  `uvm_object_utils_begin(transfer)
    `uvm_field_enum(direction_e, direction, UVM_ALL_ON)
    `uvm_field_int(address, UVM_ALL_ON)
    `uvm_field_int(data, UVM_ALL_ON)
    `uvm_field_int(delay, UVM_ALL_ON)
  `uvm_object_utils_end
endclass

The macros define which fields from the transfer class get copied. It's the responsibility of each class to define which of its fields participate in the copy. For example, uvm_transaction doesn't allow the events, begin_event and end_event fields to get copied. I don't want this post to be a tutorial on how to implement copying, as there are plenty of excellent resources already available, but a bit of introduction will be useful.

The use model for clone() is that, when called on an object, it will return a new object which is a copy of it:

vgm_apb::transfer orig_transfer = new();
vgm_apb::transfer copied_transfer;
$cast(copied_transfer, orig_transfer.clone());

Notice that this time we didn't need to create a transfer to store the copy. The clone() function did this for us and subsequently called copy(...) on it to update its fields. We had to cast the return value, though, because the function's prototype is:

virtual function uvm_object clone();

Since the method allocated a vgm_apb::transfer, the cast will be successful.

What I want to investigate in this post is what happens when we try to copy and clone across the inheritance tree. For this purpose, we'll need a class that extends vgm_apb::transfer. The APB2 protocol defines an extra PSTRB signal which enables sparse write accesses. We may have a mixture of APB and APB2 slaves in our system and we want our UVC to be available in both flavors. The best way to do this would be to have another separate package for APB2. The APB2 transfer class would extend the previous one:

package vgm_apb2;

class transfer extends vgm_apb::transfer;
  rand bit strobe[4];

  // ...
endclass

To have the new strobe field get copied we could use a field macro for it. At the same time, the uvm_object class provides a hook for users to add their own code to extend the copy(...) function. This do_copy(...) hook is called after the code the field macros expand to:

class transfer extends vgm_apb::transfer;
  // ...

  virtual function void do_copy(uvm_object rhs);
    transfer rhs_cast;
    if (!$cast(rhs_cast, rhs))
      `uvm_fatal("CASTERR", "Cast error")
    this.strobe = rhs_cast.strobe;
  endfunction
endclass

The function is defined as virtual in uvm_object and it takes a uvm_object as an argument. This means we need to cast to our class to be able to access the strobe field. The copy(...) function is not virtual, so users are instructed to extend the hook. They're also not supposed to call the do_copy(...) function (or any of the other hooks for printing, recording, packing and unpacking) directly. It's kind of silly that the developers didn't define these methods as protected. One of the core tenets of OOP is encapsulation. According to this mantra, you know what's better than telling the user to not call certain functions directly? Not allowing the user to call certain functions directly (at least from outside of the class itself).
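As a sketch of what that could look like, here is the same "public method plus hook" pattern in plain SystemVerilog, with the hook declared protected. The base_item and ext_item classes are hypothetical and only mimic the structure of uvm_object; this isn't how UVM itself is implemented:

```systemverilog
package hook_demo_pkg;

  // Hypothetical base class: public, non-virtual copy() and a protected hook
  class base_item;
    int id;

    function void copy(base_item rhs);
      if (rhs == null)
        return;
      id = rhs.id;
      do_copy(rhs);  // the hook is only reachable from inside the hierarchy
    endfunction

    // Empty by default, subclasses extend it
    protected virtual function void do_copy(base_item rhs);
    endfunction
  endclass

  // Hypothetical subclass that copies its own field in the hook
  class ext_item extends base_item;
    int extra;

    protected virtual function void do_copy(base_item rhs);
      ext_item rhs_cast;
      if (!$cast(rhs_cast, rhs))
        $fatal(1, "do_copy: rhs is not an ext_item");
      extra = rhs_cast.extra;
    endfunction
  endclass

endpackage
```

With this structure, a call like item.do_copy(other) from outside the class hierarchy should be rejected by the compiler, while item.copy(other) still works as before.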

Copying a vgm_apb2::transfer works exactly as for the previous one. It's transparent to us whether the copy is performed via the field macros or the do_copy(...) hook:

vgm_apb2::transfer orig_transfer = new();
vgm_apb2::transfer copied_transfer = new();
copied_transfer.copy(orig_transfer);

Now that we have a small inheritance hierarchy, let's see what happens when we try to mix copies and clones of different classes. Since any vgm_apb2::transfer is also a vgm_apb::transfer, it means that we can copy the former into the latter:

vgm_apb::transfer apb_trans_copy;
vgm_apb2::transfer apb2_trans = new("apb2_trans");

apb_trans_copy = new("apb_trans_copy");
apb_trans_copy.copy(apb2_trans);

At the same time, it's also possible to clone a vgm_apb2::transfer into a vgm_apb::transfer variable:

vgm_apb::transfer apb_trans_copy;
vgm_apb2::transfer apb2_trans = new("apb2_trans");

$cast(apb_trans_copy, apb2_trans.clone());

Now let's try and break stuff. Let's try and copy a vgm_apb::transfer into a vgm_apb2::transfer:

vgm_apb::transfer apb_trans = new("apb_trans");
vgm_apb2::transfer apb2_trans_copy;

apb2_trans_copy = new("apb2_trans_copy");
apb2_trans_copy.copy(apb_trans);

Remember that the argument of copy(...) is passed to our do_copy(...) function, where it gets cast to vgm_apb2::transfer. Since a vgm_apb::transfer isn't a vgm_apb2::transfer, the cast will fail and cause a nice run time error. If, however, we had implemented the copy of the strobe field using the field macros, we wouldn't be so lucky. We wouldn't get any fatal error (not even a warning). The vgm_apb::transfer fields would get copied to the target object, but strobe would be left untouched, leaving us with a pseudo-copy. This isn't nice at all.

Let's also try to clone a vgm_apb::transfer into a vgm_apb2::transfer variable:

vgm_apb::transfer apb_trans = new("apb_trans");
vgm_apb2::transfer apb2_trans_copy;

$cast(apb2_trans_copy, apb_trans.clone());

This time, the $cast(...) from this code snippet will fail, since clone() will return a vgm_apb::transfer. As we can see, misuse of copy(...) or clone() will cause either funky behavior in the worst case or run time errors in the best case. These are errors that could have been easily caught at compile time, but we need to make a few small changes to our classes.
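A side note on these run time failures: when $cast(...) is called as a task, a failed cast results in a run time error, but it can also be called as a function, in which case it returns 0 and leaves the destination variable unchanged. Here's a minimal sketch of this, using two hypothetical classes instead of the transfers:

```systemverilog
module cast_check_demo;

  // Hypothetical classes standing in for the APB and APB2 transfers
  class base_t;
  endclass

  class derived_t extends base_t;
  endclass

  initial begin
    base_t b;
    derived_t d;

    b = new();

    // Function form of $cast: returns 0 on failure instead of
    // triggering a run time error, 'd' stays null
    if (!$cast(d, b))
      $display("cast failed: 'b' doesn't hold a derived_t");
  end
endmodule
```

This only makes the failure easier to handle gracefully. It's still a run time check, so it doesn't move the error to compile time.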

Let's go back to a time when there wasn't any UVM and people wrote vanilla SystemVerilog code (and dinosaurs roamed the Earth). Let's forget about UVM objects and sequence items and let's implement our transfer as a stand-alone class:

package vgm_apb;

class transfer;
  rand direction_e direction;
  rand bit [31:0] address;
  rand bit [31:0] data;
  rand int unsigned delay;

  // ...
endclass

If we want to implement a copy(...) method, the argument it's going to take will be of class vgm_apb::transfer (instead of uvm_object like in the previous case):

class transfer;
  // ...

  function void copy(transfer rhs);
    this.direction = rhs.direction;
    this.address = rhs.address;
    this.data = rhs.data;
    this.delay = rhs.delay;
  endfunction
endclass

The clone() method will also return a vgm_apb::transfer directly, instead of a uvm_object that needs to be downcast:

class transfer;
  // ...

  virtual function transfer clone();
    clone = new();
    clone.copy(this);
    return clone;
  endfunction
endclass

Since our copy(...) and clone() functions are designed to work with objects of this class (and its subclasses), we should have much more compile time safety. Moreover, when we're cloning, we don't need to do any more casting. Let's implement the APB2 transfer as well:

package vgm_apb2;

class transfer extends vgm_apb::transfer;
  rand bit strobe[4];

  // ...
endclass

Let's implement the copy(...) method to also be strongly typed:

class transfer extends vgm_apb::transfer;
  // ...

  function void copy(transfer rhs);
    super.copy(rhs);
    this.strobe = rhs.strobe;
  endfunction
endclass

I hope no one needs convincing that it's still possible to copy a vgm_apb2::transfer into a vgm_apb::transfer. Let's try to do it the other way around again and see what happens:

vgm_apb::transfer apb_trans = new();
vgm_apb2::transfer apb2_trans_copy;

apb2_trans_copy = new();
apb2_trans_copy.copy(apb_trans);

Since we're calling copy(...) on a vgm_apb2::transfer, where the argument is supposed to also be of type vgm_apb2::transfer, we'd expect to get a compile error here because we're passing it an object of type vgm_apb::transfer, right? Well, yes and no. Remember that vgm_apb::transfer, which is the base class of vgm_apb2::transfer, also defines a copy(...) function that takes a vgm_apb::transfer as an argument. What happened to that method when we overrode it in the sub-class? The LRM isn't really clear about what to expect when overriding methods and changing their arguments. In my simulator it seems that both versions of copy(...) exist and that the old one gets called. Since this is a gray area in the standard, other simulators might fail during compile, either when compiling the vgm_apb2 package (because they won't allow methods to be overridden like this) or when compiling the code snippet from above (because they will only keep the most recent definition of a method when it's overridden). I've tried it in a second simulator and there the latter case happened. Since we have three major EDA vendors, wouldn't it be really awesome if each one of them had their own interpretation here and we could see all three behaviors I described above? If you ask me, I say the third option should be the legal one, since SystemVerilog doesn't explicitly support function overloading (in contrast to C++ or Java).

We've also got a clone() to implement. Luckily, SystemVerilog supports covariant return types, so it's possible to refine the clone() function to return a vgm_apb2::transfer:

class transfer extends vgm_apb::transfer;
  // ...

  virtual function transfer clone();
    clone = new();
    clone.copy(this);
    return clone;
  endfunction
endclass

This way, we don't need to do any casting of the return value for this class either. At the same time, compilers should be able to flag the following code as problematic:

vgm_apb::transfer apb_trans = new();
vgm_apb2::transfer apb2_trans_copy;

apb2_trans_copy = apb_trans.clone();

My tool merrily compiles it, but flags a run time error, even though it should have figured it out earlier while parsing the code. Other simulators will throw an error during compilation.

We did this exercise to show you that it's possible to write our copying routines in such a way that their improper use will result in a compile time error (that is, if the simulator allows). Now let's apply this knowledge about method overrides to our UVM sequence items. For both our transfers, we can override the copy(...) function to take an argument of the respective transfer type. Since I've called both classes transfer (but placed them in different packages), the code looks the same for both:

class transfer extends uvm_sequence_item;
  // ...

  function void copy(transfer rhs);
    super.copy(rhs);
  endfunction
endclass

For clone(), I initially tried to call the base implementation (from uvm_object) and cast the result to the appropriate transfer class:

class transfer extends uvm_sequence_item;
  // ...

  virtual function transfer clone();
    void'($cast(clone, super.clone()));
    return clone;
  endfunction
endclass

As a side note, for those of you who don't know, it's possible to use the name of a value returning function as a temporary variable whose type is the return type. This is why we're trying to cast the result into clone. While I was developing the code, I made a mistake and forgot to add the `uvm_object_utils(...) macro to the vgm_apb2::transfer class. It was very surprising to see that the cast was always failing, until I looked in the implementation of uvm_object's clone() function. It seems that it calls a virtual method called create(...), which is defined for us when using the utils macro. Since I had forgotten to add the macro, the vgm_apb2::transfer class didn't have its own create(...) function and the one from the base class was getting called. This returned an object of type vgm_apb::transfer, which is why the cast wasn't working. Since we're overriding clone() anyway, why not keep things simple and just create a new object using the class's constructor:

class transfer extends uvm_sequence_item;
  // ...

  virtual function transfer clone();
    clone = new(get_name());
    clone.copy(this);
    return clone;
  endfunction
endclass

This way it doesn't matter if we use the utils macro at all. As we can see, we don't need much more code to achieve compile time safety when copying. The code we do need to write, though, is all boilerplate. It could just as well be added to the utils macros, since the only thing that varies is the name of the class, which is already passed as an argument.
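To make the boilerplate point concrete, here's a rough sketch of how such a helper macro could look. The vgm_typed_copy_clone macro and the vgm_demo package are hypothetical names I made up for illustration; nothing like this exists in UVM. It also relies on the same method redeclaration behavior discussed above, so whether it compiles cleanly may depend on the simulator:

```systemverilog
`include "uvm_macros.svh"

// Hypothetical helper macro, not part of UVM: generates the strongly typed
// copy(...) and clone() for class T
`define vgm_typed_copy_clone(T) \
  function void copy(T rhs); \
    super.copy(rhs); \
  endfunction \
  \
  virtual function T clone(); \
    clone = new(get_name()); \
    clone.copy(this); \
    return clone; \
  endfunction

package vgm_demo;
  import uvm_pkg::*;

  class transfer extends uvm_sequence_item;
    rand bit [31:0] address;
    rand bit [31:0] data;

    `uvm_object_utils_begin(transfer)
      `uvm_field_int(address, UVM_ALL_ON)
      `uvm_field_int(data, UVM_ALL_ON)
    `uvm_object_utils_end

    // The only thing that varies is the class name
    `vgm_typed_copy_clone(transfer)

    function new(string name = "transfer");
      super.new(name);
    endfunction
  endclass
endpackage
```

The macro assumes the class has a constructor that takes the object name as its first argument, as is customary for UVM objects.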

I've uploaded the code I used to prototype these ideas on SourceForge. Feel free to try it out in your simulator and see what you get. If it doesn't work to your liking (i.e. it doesn't flag compile errors when you try to do improper copies), then be sure to write to your friendly application engineer. Tools are supposed to help us catch such silly errors early, without having to resort to time consuming debugging.

8 comments:

  1. covariant return types are a -2012 only feature. before that they were illegal. thats why UVM doesn't utilise them in various functions such as copy().

    ps: actually IUS implemented functions return types as covariant in its initial implementation - however we had to drop that in the light of LRM compliance.

    1. Take it as a suggestion for when you decide to migrate to SV-2012 (I guess 10 years from now :P).

    2. there are several places where new language features would be beneficial for the UVM implementation. unfortunately this is a big task and would affect lots of API (so maybe you are right)

  2. Great post, as always!
    If I may, though, a couple of comments:
    1. For creation of new instances you use "new". The UVM recommendation is to use "create" for factory overriding - this would also help you find out faster that you forgot the factory registration.
    2. Just the same as the overloading of the "copy" in the non-uvm classes that you wrote, that is not "good" because SV does not (supposed to) support overloading, also implementing "clone" function to a class that inherits uvm_sequence_item doesn't seem to be a good idea, as it is an overloading of that function (since the return type is changed), isn't it so?

    1. 1. I guess you're talking about 'clone()'. That's explicitly what we want. When we're cloning a class, we want to create an object of exactly that class, otherwise it's not an identical clone. This is why we use 'new()' instead of 'create()'. There are also cases where you explicitly might not want to do factory registration (virtual classes, for example).

      2. Overloading 'copy()' is the right thing to do precisely because SV doesn't support function overloading. This way the compiler should flag an error when we're misusing the function. The same goes for 'clone()', plus here we have the added benefit that when we restrict the return type we can do away with casting.

  3. there is no overloading in SV. all you get is the ability to redeclare a function in a derived class with the same signature (then it can be virtual), with a different signature (then it can't be virtual) or since lrm-2012 with the same signature+virtual and only the return type more specialized (a covariant return type). at no point of inheritance you see more than one function with a given name.
