Friday, March 25, 2016

An Overview of UVM End-of-Test Mechanisms

A lot of traffic coming from Google to this blog is from searches about setting the UVM drain time. That's because one of my first posts was about how to set the drain time before going into the run phase. At the time of writing, it was the third most viewed post.

End-of-test handling in UVM seems to be a topic a lot of people are interested in. In this post we’re going to look at different ways of implementing it.

End-of-test relies on objections. Each component can raise objections during the run phase, signaling that it's not yet ready to let the test finish. We typically raise an objection in the test when starting our root sequence:

class test extends uvm_test;
  virtual task run_phase(uvm_phase phase);
    phase.raise_objection(this);
    seq.start(sequencer);
    phase.drop_objection(this);
  endtask

  // ...
endclass

This means that the test will keep going while the sequence is running. Once we've finished pushing all of our traffic into the DUT, it will stop. This works great for designs without any latency: if our design processes data in the same clock cycle it receives it, then it's fine to just stop the simulation at that point. This isn't usually the case, though. Due to the sequential nature of today's designs, the effect of any kind of transaction fed to the DUT can only be seen one or more clock cycles later. If we stop the simulation at the time the transaction was accepted by the design, we won't be able to check what happens as an effect of that transaction.

As an example, let’s take a very boring design. Our DUT will have two APB interfaces, one slave and one master. Whatever comes in on the north (master) interface is going to come out of the south (slave) interface 16 clock cycles later. We're going to use the AMIQ APB UVC to talk to our design.

We'll need to instantiate two agents:

class env extends uvm_env;
  amiq_apb_master_agent master_agent;
  amiq_apb_slave_agent slave_agent;

  // ...
endclass

I'll spare you the code for actually instantiating and configuring the agents, since it's pretty much boilerplate.

What every testbench needs is a scoreboard to check that the DUT is doing what it's supposed to do. In this case, the scoreboard is pretty trivial. Whenever an item comes from the master agent, we should expect another item with identical characteristics to come from the slave agent.

class scoreboard extends uvm_scoreboard;
  `uvm_analysis_imp_decl(_north)
  `uvm_analysis_imp_decl(_south)

  uvm_analysis_imp_north #(amiq_apb_mon_item, scoreboard) north_aimp;
  uvm_analysis_imp_south #(amiq_apb_mon_item, scoreboard) south_aimp;

  // ...
endclass

Since it can be a while until a south side item comes out, we'll need to buffer the north side items in a queue in the meantime. The APB UVC sends out two items per transfer through its analysis port: one for the setup phase and another for the access phase. I don't particularly like this approach, since it forces us to implement logic to throw away the setup phase item (two analysis ports would have been better):

class scoreboard extends uvm_scoreboard;
  protected int unsigned num_seen_north_items;

  protected amiq_apb_mon_item item_stream[$];


  virtual function void write_north(amiq_apb_mon_item item);
    num_seen_north_items++;
    if (num_seen_north_items % 2 == 1)
      return;

    `uvm_info("WRNORTH", "Got a north item", UVM_NONE)
    item_stream.push_back(item);
  endfunction

  // ...
endclass

When a south side item comes, we'll need to compare it with the first item in the queue:

class scoreboard extends uvm_scoreboard;
  protected int unsigned num_seen_south_items;

  protected amiq_apb_mon_item item_stream[$];


  virtual function void write_south(amiq_apb_mon_item item);
    num_seen_south_items++;
    if (num_seen_south_items % 2 == 1)
      return;

    `uvm_info("WRSOUTH", "Got a south item", UVM_NONE)
    if (!item.compare(item_stream.pop_front()))
      `uvm_error("DUTERR", "Mismatch")
  endfunction

  // ...
endclass

What we absolutely need to check is that at the end of the simulation there aren't any outstanding north side items that didn't yet make it to the south side. This means our queue must be empty. A great place to put this check is the check_phase(...) function:

class scoreboard extends uvm_scoreboard;
  virtual function void check_phase(uvm_phase phase);
    if (item_stream.size() != 0)
      `uvm_error("DUTERR", "There are still unchecked items")
  endfunction

  // ...
endclass

Here's where graceful test termination becomes important. If we just stop the simulation once the last north side item was sent, we're going to have at least one item left in our queue, which will cause the test to fail. This means we can't simply start our sequence like this:

class test extends uvm_test;
  virtual task run_phase(uvm_phase phase);
    apb_pipeline_tb::pipeline_sequence seq =
      apb_pipeline_tb::pipeline_sequence::type_id::create("seq", this);

    phase.raise_objection(this);
    seq.start(tb_env.master_agent.sequencer);
    phase.drop_objection(this);
  endtask

  // ...
endclass

We need to make sure that the objection gets dropped once the last item comes out through the south side APB interface. The naïve approach would be to add a delay inside the test between the sequence finishing and dropping the objection:

class test_delay extends test;
  virtual task run_phase(uvm_phase phase);
    apb_pipeline_tb::pipeline_sequence seq =
      apb_pipeline_tb::pipeline_sequence::type_id::create("seq", this);

    phase.raise_objection(this);
    seq.start(tb_env.master_agent.sequencer);

    #(16 * 2);

    phase.drop_objection(this);
  endtask

  // ...
endclass

This is going to work, though it might need an extra time step to avoid any race conditions when stopping the simulation (because the south side monitor might not get a chance to publish its item). There are a few drawbacks, though:

  1. We're going to have to add such a delay to each test we write. Once our designers decide that they need a 17-cycle-deep pipeline, we're going to have to modify each and every one of these tests. This can, of course, be solved by writing a function that applies the delay and then drops the objection.
  2. We've implemented the delay in terms of simulation steps, when we're actually interested in clock cycles (hence the multiplication by 2 - a clock cycle takes two simulation time steps). The same argument also applies if we were to wait for a certain number of time units. If someone decides that we need a longer clock, we're going to have to update the delays. This can also be solved by passing the APB clock to the test and using it for the delay. That's easier said than done in SystemVerilog, since it entails defining an interface, instantiating it, putting it into the config DB and getting it in the test.
  3. For complicated designs it might be difficult, if not impossible, to figure out how much time to wait before dropping the objection.
  4. It's very easy to forget to add the delay, leading to wasted debug time.
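The helper function mentioned in the first point could look something like this (a minimal sketch; `drain_delay` and `drop_objection_after_drain(...)` are illustrative names, not part of the original code):

```systemverilog
class test_base extends uvm_test;
  // Illustrative constant: pipeline depth (16) times the number of
  // simulation time steps per clock cycle (2).
  protected int unsigned drain_delay = 16 * 2;

  // Waits out the DUT latency, then drops the phase objection. If the
  // pipeline gets deeper, only 'drain_delay' needs to change.
  virtual task drop_objection_after_drain(uvm_phase phase);
    #(drain_delay);
    phase.drop_objection(this);
  endtask

  // ...
endclass
```

A test would then call `drop_objection_after_drain(phase)` after its sequence finishes, instead of dropping the objection directly.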

UVM also provides a "drain time" mechanism. After all objections have been dropped, the simulation end is delayed by the drain time configured by the user. The nice thing about it is that it can be set once in the base test, and other tests don't need to take care of it anymore. A good place to do this is before the run phase starts, in either the end_of_elaboration_phase(...) or the start_of_simulation_phase(...) function:

class test_drain_time extends test;
  virtual function void end_of_elaboration_phase(uvm_phase phase);
    uvm_phase run_phase = uvm_run_phase::get();
    run_phase.phase_done.set_drain_time(this, 16 * 2);
  endfunction

  // ...
endclass

The drawback here is, as in the previous case, that we're specifying the duration in simulation steps, not clock cycles. Moreover, in this case the actual delay is performed by code inside the UVM package. This means the timescale settings used when compiling UVM are the ones that count, so things might get really funky when working with a pre-compiled library from a vendor (which is usually the case).

The best thing would be if the scoreboard itself could decide when to allow the test to stop. What it could do is raise an objection whenever a north side item is received. This means that the DUT is processing something. Once a south side item comes out, it can drop an objection. Since (ideally) the number of north and south side items should match, once the DUT is done processing everything the scoreboard should drop all of its objections:

class scoreboard_with_objection extends apb_pipeline_tb::scoreboard;
  virtual function void write_north(amiq_apb_pkg::amiq_apb_mon_item item);
    uvm_phase run_phase;

    super.write_north(item);
    if (num_seen_north_items % 2 == 1)
      return;

    run_phase = uvm_run_phase::get();
    run_phase.raise_objection(this);
  endfunction


  virtual function void write_south(amiq_apb_pkg::amiq_apb_mon_item item);
    uvm_phase run_phase;

    super.write_south(item);
    if (num_seen_south_items % 2 == 1)
      return;

    run_phase = uvm_run_phase::get();
    run_phase.drop_objection(this);
  endfunction

  // ...
endclass

The great thing about this approach is that it works regardless of the pipeline depth. The only reason someone might not want to implement a scoreboard like this is if they hang out too much on Verification Academy. The guys at Mentor Graphics say that raising objections anywhere other than the test is a performance killer, particularly if it's done on a per-item basis, like we have here. This is because objections have to propagate throughout the hierarchy, which can take a significant toll on the simulator. In a toy example like this one it's probably not going to make much of a dent, but I can imagine that things can go overboard fast when dealing with complicated designs with many interfaces. With the (rather) new UVM 1.2 release, objections have gotten leaner, so the argument might not hold up anymore.

If you have a really big design and you're stuck using UVM 1.1, don't despair! There is a way to leave the scoreboard in control of when to end the test, without having to raise and drop objections for each item it gets. Each uvm_component has a phase_ready_to_end(...) function that is called before the phase is stopped. If our scoreboard still has items queued when the test sequence finishes, it can raise an objection to delay the end of the simulation. Once the queue becomes empty, it can drop the objection and allow the test to end:

class scoreboard_with_phase_ready_to_end extends apb_pipeline_tb::scoreboard;
  virtual function void phase_ready_to_end(uvm_phase phase);
    if (phase.get_name() != "run")
      return;

    if (item_stream.size() != 0) begin
      phase.raise_objection(this);
      fork
        delay_phase_end(phase);
      join_none
    end
  endfunction


  virtual task delay_phase_end(uvm_phase phase);
    wait (item_stream.size() == 0);
    phase.drop_objection(this);
  endtask

  // ...
endclass

This combines the best of both worlds. It works regardless of pipeline depth, since we don't have to specify any kind of delay. It's also very efficient in terms of performance, since we don't need to execute anything for each item the scoreboard receives. We only need to fork the drain task in the last stage of the simulation, which should have a negligible impact on the run time. There is one caveat, though. In more complicated testbenches, multiple components might want to delay the end of the test. This could lead to situations where all objections for the run phase (for example) are dropped, phase_ready_to_end(...) gets called and a component decides to prolong the phase by raising another objection, eventually drops it, phase_ready_to_end(...) gets called again, another component wants to prolong the phase, and so on. If this process repeats too many times, a fatal error is flagged, as mentioned in this thread. Such a situation shouldn't happen very often in practice.
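One way to reduce the risk of this back-and-forth is to make the scoreboard's phase_ready_to_end(...) idempotent, so that it delays the phase end at most once (a sketch; the `end_delayed` flag is an illustrative addition, not from the code above):

```systemverilog
class scoreboard_guarded extends apb_pipeline_tb::scoreboard;
  protected bit end_delayed;  // set once we've already asked for more time

  virtual function void phase_ready_to_end(uvm_phase phase);
    if (phase.get_name() != "run")
      return;

    // Delay the phase end only on the first call; later calls caused by
    // other components prolonging the phase are ignored.
    if (!end_delayed && item_stream.size() != 0) begin
      end_delayed = 1;
      phase.raise_objection(this);
      fork
        begin
          wait (item_stream.size() == 0);
          phase.drop_objection(this);
        end
      join_none
    end
  endfunction
endclass
```

This keeps each component's contribution to the iteration count bounded to one, at the cost of not re-arming the delay if new items arrive after the queue has drained.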

These are the ways of handling end-of-test that currently come to mind. If I missed anything, do let me know in the comments section. If you want to experiment, the code can be found on GitHub. Of all the outlined methods, using phase_ready_to_end(...) seems to be the best by far. I'll definitely be using it in future projects.

14 comments:

  1. Great article!!! Keep going.
    I would like to see an article on the phase.raise_objection mechanism.

    Thanks
    Taahir

  2. Hi Tudor,

    Raising objections in the scoreboard comes with some drawbacks.
    The most obvious one is that if there is a bug in the RTL in which a packet is not output, then the test will never end.
    Another thing one has to take care of is dropping the relevant number of objections when there is a reset in the middle of the test (e.g. scoreboard not empty), or if there is some "clear FIFO" functionality in the design.

    But if these little corner cases are handled I think that this is a great way to control when a test is ended.

    Another place where I would raise and drop objections is in the monitor, at the beginning and end of an item. This will prevent stopping the test in the middle of a transfer.

    Cristi

    Replies
    1. You're right, you're at the mercy of DUT bugs in that case. This is why it's important to set up a global timeout (UVM already has one by default) or use the heartbeat mechanism to decide if the simulation is hanging.

      Making sure you don't stop during an item is also important. Even there, in case of performance problems with objections, you could use 'phase_ready_to_end(...)'.

    2. Alternatively, you can raise the objection in write_north and drop it immediately in the same function, with a drain time set for that objection. Do the same in write_south, setting the drain time to the worst-case transaction delay.

      What will happen now is that after drop_objection, the objection count isn't decremented until the drain time elapses. If the same objection is raised again in the meantime, the timer resets.

      Keep a check on the number of packets in the report phase. If no packets were received, it will error out at the end, but the test won't hang.

      You won't need a heartbeat either; this will effectively serve as one.

  3. Hey Tudor,

    Generally, I would recommend sending your stimulus during the main phase and have your "drain" time during the shutdown phase.

    Also, using a watchdog to handle Cristi's comment about DUT bugs is not always going to be sufficient. For one thing, some tests may just generally take longer due to random constraints selecting sub-optimal rates. In other words, you'll find cases where 100,000ns just isn't long enough, and then you'll be pushing it out to 150, and later 200, and so on. That's the tail wagging the dog.

    Instead, I highly recommend taking a look at the deadlock checker in Chapter 6 of my book, Advanced UVM. This checker will cause your environment to phase jump to the check phase if a sufficient amount of time has gone by without seeing anything coming out of the device. The book also includes a lean objection mechanism for that UVM 1.1 problem you mentioned.

    We've been using some form of this deadlock checker for many years now and it's saved us tons of simulation time.

    Cheers,
    Brian


  4. Another great post, Tudor! I use a very similar mechanism in the scoreboard, and I agree with the other posts as well - this only breaks down when there is a complete DUT mess-up in terms of interface protocol.
    It might be an idea for a subsequent post...

  5. At my previous job I actually did some benchmarking with Mentor's own simulator, testing various objection scenarios. Switching from per-test objections to per-transaction objections led to only a few percentage points' increase in run time. If you want to speed up your simulations, there are probably bigger fish to fry...

  6. Hi, could you please let me know how the boot process of an ARM-based mobile SoC works, from the hardware perspective of a verification engineer? Thanks.

  7. How can delay_phase_end, which is a task, be called from the function void phase_ready_to_end?

    Replies
    1. 'delay_phase_end()' is started in parallel. It's not called from within 'phase_ready_to_end()'; it's scheduled to start once the currently running process (which includes 'phase_ready_to_end()' and any other functions called after it) gets blocked by a waiting statement.

  8. It's an interesting article to read, but I have a few doubts about the scalability of this scoreboard when we wish to use it at a higher level (IP level to sub-module and maybe SoC). Is the assumption here that this is the highest-level scoreboard present, and that it will be the deciding component when we move it to another env?
    If that hasn't been considered, what happens if the test in the bigger or other env doesn't want to end on a condition met by and monitored by this scoreboard component? Isn't it wiser to keep the control in the test writer's hands, monitoring the "done" mechanisms of multiple components using some global variable, object or event wait mechanism, and then concluding when to end? Would this approach not make it less reusable? Isn't it good practice to control this from the test case?
    Of course, one way would be to have every component decide for itself, but won't that obstruct the overall goal of keeping all the components alive based on some condition which serves the higher purpose? As I see it, once a component drops its objection it's no longer doing anything in that simulation, unless there is a way to bring it back to life without adding a more complex phase jumping mechanism than I can think of.

    Would like to know the opinion of many intelligent and smart people out there.
    I seek to learn. This debate of how to and where to drop the final objection to end the simulation gracefully haunts me.

  9. Hi Tudor,
    I have two threads -
    one running in the configure phase, and one running in the run phase.
    Due to some events, objections are raised in the run phase thread while the configure phase is running.
    These objections are on the configure phase. Will the configure phase wait for these objections to drop before finishing?

    Replies
    1. If I understand you correctly, in run_phase(...) you're calling something like configure_phase.raise_objection(...). The configure phase should wait for these objections to drop, meaning that post_configure, pre_main, main and so on will be stalled.

    2. In run_phase, it's something like phase.raise_objection(this), while the configure phase is running in parallel.
