Many industries developing FPGA-based electronic products carry out little or no simulation verification, resorting instead to ad-hoc lab testing for debugging, verification and integration testing. Common reasons given for this attitude towards simulation include: "there is no budget allocated in the project for simulation verification", "there is no time left for simulation in the project", and "the business is willing to take the risk of not carrying out simulation verification".
In this article, the term firmware development is used for FPGA development, and the term lab testing refers to unstructured, undocumented, unplanned and infrequent ad-hoc hardware testing of some or all of the functionality of the firmware under development, i.e. quick-and-dirty prototyping, proof-of-concept demonstrators and the like.
This article assumes that the reader is familiar with industry standard FPGA design methodology. Furthermore, it is assumed that the reader understands what it means to synthesise a VHDL/Verilog RTL statement such as C = A&B through a synthesis tool, in contrast with compiling it with a 'C' compiler to create executable code for a processor. If not, please read my post fpga-fundas-0-vhdl-is-not-a-programming-language on the same.
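For illustration only, a minimal VHDL sketch of what such a statement describes is shown below; the entity and port names are arbitrary and chosen purely for this example.

    -- The statement C <= A and B describes a physical AND gate that exists
    -- permanently in silicon; the C-language statement c = a & b instead
    -- compiles to an instruction executed sequentially by a processor.
    library ieee;
    use ieee.std_logic_1164.all;

    entity and_gate is
      port (
        A : in  std_logic;
        B : in  std_logic;
        C : out std_logic
      );
    end entity and_gate;

    architecture rtl of and_gate is
    begin
      C <= A and B;  -- synthesised into a gate/LUT, not "executed" line by line
    end architecture rtl;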
Back Then
In the past, when digital designs implemented in a single chip were much smaller, FPGAs were mainly used to implement simple glue logic in order to reduce the number of discrete components on an electronic board. In those days, a typical digital electronic system mainly comprised discrete chips on a PCB. This type of system enabled designers to comprehensively probe the functionality and timing of signals at the board/PCB level using external logic analysers, whilst the FPGA was merely used to glue signals between the discrete chips. For example, an older DSP system built from discrete chip multipliers and adders may have required signals to be buffered or delayed. Such buffers and delay elements (flip-flops) were typically implemented in an FPGA in order to save board space by reducing the number of discrete chips. Therefore, in those days, FPGA designers chose not to perform any simulation verification of such simple glue logic blocks in order to "save" time.
This attitude and approach worked in the past because glue logic designs were far less complex than modern digital logic designs. A state-of-the-art modern FPGA typically comprises multiple ARM processor cores, high speed on-chip interconnects such as AXI, DSP blocks, memories, high speed interfaces and so on; even the smallest Artix-7 device in the Xilinx 7 series comprises 12,800 logic cells. Industries are increasingly using FPGAs to implement designs far more complex than glue logic in order to make efficient and cost effective use of the real estate provided by the FPGA device in the form of silicon area. The testing attitude and approach that was taken back then therefore no longer works. If the same approach is taken, a great deal of time can be spent debugging such a complex design at board level: skipping simulation verification and performing only lab testing of the actual chip/FPGA on the actual hardware/board provides almost no direct visibility into the internal behaviour of a modern complex digital design.
The FPGA designers of the past are likely to be leading or managing teams of FPGA engineers today. Many of these managers may still hold the same attitude and approach towards verifying and testing modern, complex FPGA designs: they may still prefer to skip simulation verification and advocate moving directly to lab testing. This may be because they do not adequately appreciate how much the complexity of systems implemented in FPGAs has risen since the glue logic days, or because they simply do not have the relevant industrial experience to comprehend why simulation verification is essential for stable and reliable FPGA-based product development.
Simulation Verification
A simulation testbench environment has virtually 100% visibility into the design, and the design is completely isolated from any board/hardware related issues. Most potential functional and timing bugs can therefore be identified well before the design goes into the chip for lab testing; the remaining timing related bugs can be identified by performing static timing analysis and/or gate level timing simulation and/or CDC linting. Investing the time to develop a decent testbench environment therefore pays for itself many times over in time saved debugging the design in the lab at board level. Simulation testbenches can be developed at various levels of design abstraction. An FPGA system is typically composed of integrated functional blocks, each performing its own function, and a testbench is typically required at each level of design abstraction: unit or module level, integrated module level, and top or system level. The unit level testbench is written during the development of the fundamental units of the FPGA system. The integrated module level testbench verifies a few units once they are integrated to perform some more complex combined functionality within the FPGA system. The top or system level testbench verifies the entire system once all the functional blocks have been designed and integrated. Tests carried out at unit level are not sufficient to guarantee functionality at system level; unit level and top level testbenches are therefore not mutually exclusive but complementary, and both are essential to develop a stable and reliable product.
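As a hedged illustration only, a minimal unit level, self-checking testbench for a trivial DUT (the and_gate sketch shown earlier, assumed to be compiled into the work library) might look like the following; the stimulus values, delays and names are purely illustrative.

    library ieee;
    use ieee.std_logic_1164.all;

    entity tb_and_gate is
    end entity tb_and_gate;

    architecture sim of tb_and_gate is
      signal A, B, C : std_logic;
    begin
      -- Device Under Test
      dut : entity work.and_gate
        port map (A => A, B => B, C => C);

      -- Directed stimulus with self-checking asserts
      stim : process
      begin
        A <= '0'; B <= '0'; wait for 10 ns;
        assert C = '0' report "0 and 0 should be 0" severity error;
        A <= '0'; B <= '1'; wait for 10 ns;
        assert C = '0' report "0 and 1 should be 0" severity error;
        A <= '1'; B <= '0'; wait for 10 ns;
        assert C = '0' report "1 and 0 should be 0" severity error;
        A <= '1'; B <= '1'; wait for 10 ns;
        assert C = '1' report "1 and 1 should be 1" severity error;
        report "Unit level test completed" severity note;
        wait;  -- end of simulation
      end process;
    end architecture sim;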
A top level testbench is essential alongside the unit level testbenches because, most of the time, a unit level testbench does not model the functional and timing behaviour of the surrounding blocks as accurately as the real blocks behave once the DUT is integrated with them. Functional and timing related interfacing issues only become apparent once the various units are integrated into a complete system, and the top level testbench mainly focuses on verifying the intercommunication between the units once they are integrated. Additionally, a simulation testbench environment provides a means of quantifying the verification effort. For example, most modern logic simulators inherently provide code coverage metrics, and free libraries such as OSVVM provide functional coverage metrics. These metrics give confidence in the design and the tests, enable designers to answer the question "when is it done?", and enable managers to make a quantifiable assessment of risk when deciding when to stop verification.
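As a sketch of how functional coverage might be collected with the OSVVM CoveragePkg (the coverage model name, bin ranges and the commented-out stimulus call below are assumptions made purely for illustration):

    library osvvm;
    use osvvm.CoveragePkg.all;

    entity tb_coverage_demo is
    end entity tb_coverage_demo;

    architecture sim of tb_coverage_demo is
      shared variable LenCov : CovPType;  -- functional coverage model
    begin
      process
      begin
        -- One coverage bin for each packet length we intend to stimulate
        LenCov.AddBins(GenBin(1, 8));

        -- Stimulus loop: record each value actually driven to the DUT
        for len in 1 to 8 loop
          -- drive_packet(len);  -- hypothetical stimulus procedure
          LenCov.ICover(len);
        end loop;

        -- Report coverage holes and help answer "when is it done?"
        LenCov.WriteBin;
        assert LenCov.IsCovered
          report "Functional coverage not yet complete" severity warning;
        wait;
      end process;
    end architecture sim;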
DUT Reference Model
A testbench comprising a DUT reference model and/or golden test vectors reflects the requirements as interpreted by the test team which, owing to the team's independence, may differ from the design team's interpretation. Any difference in the expected behaviour of the DUT therefore becomes apparent during testing, and it is through this difference that a bug in the DUT, the testbench or the requirement is identified and corrected where appropriate. Without a simulation testbench comprising a DUT model and/or golden test vectors and/or functional checkers, bugs can go unnoticed when the design is implemented in the chip.
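A minimal sketch of this idea is shown below, assuming a hypothetical 8-bit adder DUT named adder8 with ports a, b and sum; the reference model is simply an independent behavioural computation of the expected result inside the testbench.

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    entity tb_adder8 is
    end entity tb_adder8;

    architecture sim of tb_adder8 is
      signal a, b, dut_sum : unsigned(7 downto 0);
    begin
      -- Hypothetical DUT: an 8-bit adder
      dut : entity work.adder8
        port map (a => a, b => b, sum => dut_sum);

      checker : process
        variable ref_sum : unsigned(7 downto 0);  -- reference model result
      begin
        for i in 0 to 255 loop
          a <= to_unsigned(i, 8);
          b <= to_unsigned(255 - i, 8);
          wait for 10 ns;
          -- Reference model: independently computed expected behaviour
          ref_sum := a + b;
          assert dut_sum = ref_sum
            report "DUT disagrees with reference model" severity error;
        end loop;
        report "Reference model comparison completed" severity note;
        wait;
      end process;
    end architecture sim;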
Integration Testing
Integration testing is not merely lab testing. It is a planned and structured testing activity of the integrated FPGA/firmware system. Firmware integration testing proves the functionality of the firmware when integrated with the hardware: it functionally tests whether the various programmable logic blocks implemented in an HDL have been integrated, synthesised and configured in the FPGA correctly. Integration testing does not have the granularity to test directly for potential FPGA signal timing issues, although these may be flagged indirectly through a failing functional test at a higher level of abstraction. Unlike lab testing, integration testing has prerequisites, which include the following:
- The design has been adequately and appropriately constrained, and it meets the synthesis, place and route area and timing goals.
- Every CDC boundary in the design has been adequately and appropriately synchronised, and a CDC linting tool reports no issues (a minimal synchroniser sketch is shown after this list).
- The design has been successfully verified in simulation at unit, module and top level.
- The design has been lab tested for any fundamental issues (clock generation, timing of external IOs etc.) in the FPGA on the board.
- The design passes its Power-On-Self-Test, if any, in the FPGA on the board.
- The FPGA board has separately undergone design verification testing (DVT), and all other components and peripherals on the PCB are functionally proven to be working.
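As a hedged illustration of the CDC prerequisite above, a classic two flip-flop synchroniser for a single-bit signal is sketched below; the entity and signal names are illustrative, and buses or pulses crossing clock domains need other structures such as handshakes or asynchronous FIFOs.

    library ieee;
    use ieee.std_logic_1164.all;

    -- Two flip-flop synchroniser for a single-bit signal crossing into
    -- the dst_clk domain.
    entity sync_2ff is
      port (
        dst_clk  : in  std_logic;
        async_in : in  std_logic;   -- driven from another clock domain
        sync_out : out std_logic    -- safe to use in the dst_clk domain
      );
    end entity sync_2ff;

    architecture rtl of sync_2ff is
      signal meta, stable : std_logic := '0';
      -- Vendor attributes (e.g. ASYNC_REG in Xilinx tools) are typically
      -- added here so the two registers are kept together and recognised
      -- by CDC/timing analysis; omitted in this sketch.
    begin
      process (dst_clk)
      begin
        if rising_edge(dst_clk) then
          meta   <= async_in;  -- first stage may go metastable
          stable <= meta;      -- second stage gives it time to resolve
        end if;
      end process;
      sync_out <= stable;
    end architecture rtl;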
Hardware Accelerated Simulation
Ad-hoc lab testing or prototyping is not hardware accelerated simulation; the two are entirely different things. Hardware accelerated simulation is still simulation: it typically uses a standard simulator such as Modelsim or ActiveHDL as the front end and automatically downloads the testbench/test vectors along with the design from the simulator to an emulator (which may comprise arrays of FPGAs) that accelerates the testbench and the simulation. The simulation outputs/results from the DUT plus testbench inside the emulator are still returned to the simulator front end, so the visibility of the internal signals of the DUT remains the same as in normal simulation.
Conclusion
Simulation verification and integration testing are not mutually exclusive; on the contrary, they are complementary, and both are essential and required to develop a stable and reliable product.
Many managers and leaders argue that by skipping testbenching and simulation they are saving time; it is difficult to see how that is the case, no matter what stage the project is in. If the product is shipped with hidden bugs, chances are it will come back later, and more time will have to be invested to fix the bugs in the future, for many reasons including lost project knowledge, at the additional cost of reputational damage.
Independent references and empirical evidence can be provided for many of the claims made in this article.
References:
NASA, ESA Lessons Learned from FPGA Developments, a report by Gaisler Research
http://microelectronics.esa.int/asic/fpga_001_01-0-2.pdf
Best FPGA Development Practices
http://www.irtc-hq.com/wp-content/uploads/2015/04/Best-FPGA-Development-Practices-2014-02-20.pdf