1. Introduction
RISC-V [1,2] is a modern Instruction Set Architecture (ISA) that has gained significant momentum in recent years. A key factor that drives the RISC-V success story is its free and open nature, combined with a lightweight and modular architecture. Moreover, RISC-V is designed from the ground up to enable the integration of custom instruction set extensions in order to build highly application-specific solutions. These properties push the adoption of RISC-V and strengthen its potential to become a game changer in the Internet of Things (IoT) era. As such, great interest can be observed around RISC-V in industry and academia.
In line with RISC-V’s popularity, the extensive RISC-V ecosystem is continuously growing to include a broad set of software and hardware development tools and libraries. Recently, virtual prototyping solutions have been introduced into the RISC-V ecosystem to lay the foundation for advanced system-level use-cases in the RISC-V context. A
Virtual Prototype (VP) is essentially an abstract model of the entire hardware platform and enables early software development in the design flow. An industry-proven standard to create VPs is using the SystemC language in combination with the
Transaction-Level Modeling (TLM) style for modeling abstract communication interfaces [
3]. A key property of VPs is their binary compatibility with the hardware platform, i.e., from the software perspective, the VP provides the same interface as the hardware platform and hence the software can be executed unmodified on the VP and hardware. Besides a functional validation, VPs also enable design space exploration by evaluating different design decisions early in the design flow.
The RISC-V VP is a representative, advanced open source VP tailored for RISC-V and available at GitHub [
4], and has been described in [
5]. It provides an extensive feature set, such as support for the 32- and 64-bit RISC-V ISA with all standard instruction set extensions, several operating systems (such as Zephyr and Linux), advanced debugging capabilities and configurations to create different platforms, such as the HiFive1 board from SiFive [
6]. The main benefit of the RISC-V VP is, however, its extensibility, which ranges from custom RISC-V instructions with dynamic dataflow analysis extensions [
7] to a symbolic execution engine [
8]. Like other VPs in the RISC-V context, the RISC-V VP nevertheless lacks an effective methodology for designing and integrating models that capture the interaction of the VP
with the environment, such as other components on a PCB besides the processor chip.
In this paper, we propose such an extension to broaden the application domain for virtual prototyping in the RISC-V context. We provide a set of building blocks for the environment, which includes buttons, LEDs and a display. The main idea of our approach is to separate the hardware model from the world behavior (see
Figure 1,
VP Environment vs.
RISC-V VP). This allows for the parallel development of software and hardware within the intended environment, speeding up the design process. For visualization of the environment, we designed a
Graphical User Interface (GUI) using the Qt C++ library. To ease the environment setup, we provide a configuration-file-based approach that enables the designer to specify the desired components and appropriate connections to the VP in a simple way. The communication channel between the VP and environment GUI is established through a TCP connection (which also enables us to distribute the simulation to different computers, e.g., a simulation server and a user’s desktop PC). We designed appropriate libraries to transfer several hardware communication interfaces, such as GPIO, SPI or CAN (via SPI), on top of the TCP channel. This allows us to transparently map these interfaces between the VP, which models the SoC, and the environment GUI, which displays and simulates the behavior of external components. Furthermore, the communication was optimized to avoid performance impacts on the VP simulation. Our approach was designed to be integrated with SystemC-based VPs that leverage a
Transaction Level Modeling (TLM) communication system. In addition, our setup provides the foundation to even attach external real hardware components to perform a VP-driven hardware-in-the-loop simulation. To facilitate the environment model design, we provide a set of building blocks, such as buttons, LEDs and an OLED display. Moreover, for rapid prototyping purposes, we provide a modeling layer that leverages the dynamic Lua scripting language to design components and integrate them with the VP-based simulation. For evaluation purposes, we provide two case-studies with different virtual environments. In all case-studies, we used the RISC-V VP in the HiFive1 configuration, which is a model of the RISC-V HiFive1 board from SiFive [
6]. Besides the two virtual environments, we also built the corresponding two real physical systems. We can observe that both the virtual and physical systems behave identically in these case-studies, which demonstrates that our approach provides suitable virtual models to enable early software development in the design flow. We also believe that the combined VP platform can be very beneficial for education purposes in lectures and also for further research projects.
Besides our own positive experience in using our VP platform for teaching lectures on system-level design and virtual prototyping, we are also already aware of another academic group that has leveraged our RISC-V VP infrastructure for teaching an embedded systems lecture with laboratory sessions in the RISC-V context [
9]. This further underlines the applicability of our VP platform with environment modeling capabilities for educational purposes. To further spread its adoption, we provide the VP platform with the environment interaction in combination with the case-studies as open source [
4].
1.1. Paper Structure
This journal paper includes and extends published material from our previous conference paper [
10]. We start by outlining the new contributions of this paper in the next paragraph. Then, we continue with a discussion of related work (
Section 2) and provide relevant background information (
Section 3). Next, we present the VP-driven environment modeling methodology, including the communication interfaces and configuration features, in more detail (
Section 4). We follow up with our rapid prototyping approach using the dynamic Lua scripting language (
Section 5). Then, we present our modeling case-studies with two different environment configurations (
Section 6.1). Afterwards, we describe the results of our performance evaluation (
Section 6.2). Finally, after a discussion on future work (
Section 7), we conclude the paper (
Section 8).
1.2. New Contributions
In comparison to our previous conference paper [
10], we have implemented a significant set of extensions to our VP-driven environment modeling platform. The extensions include a new rapid prototyping approach for modeling components using the dynamic Lua scripting language with a set of dedicated interfaces to enable an integration with a SystemC-based simulation, updated communication protocols to enable a better performance and more reliable communication and an updated set of component building blocks to speed up the development process of new environments. We also updated and extended the related work, preliminaries and future work sections to reflect the new developments, as well as provided a more detailed and reformulated description of the VP-driven environment modeling and added case-studies in the experiment section. To accommodate these new contributions, the paper has also undergone significant editorial changes.
The complete implementation of our VP-driven environment modeling platform, including all new extensions and case-studies, is available open source on GitHub.
2. Related Work
The extensive ecosystem of RISC-V comprises several simulators that differ in their implementation technique and intended purpose in order to cover different use-cases. SPIKE is the reference simulator that is mainly designed for pure CPU simulations with a basic set of peripherals [
11]. RV8 is a high-speed simulator that employs just-in-time compilation techniques to boost the execution performance, but also mainly covers pure CPU simulations [
12]. R2VM also targets CPU simulations by utilizing binary translation techniques [
13]. It can switch between fast and accurate simulations in order to cover different use-cases. QEMU enables a full system simulation that covers a complete platform and employs advanced binary-level optimization techniques to achieve a high performance [
14]. Building on that, [
15] proposed an approach to efficiently simulate
Translation Lookaside Buffer (TLB) behaviors in a QEMU setting. Gem5 is also a full-system simulator that puts a stronger emphasis on architectural exploration aspects, but has a significantly reduced performance as a trade-off [
16,
17]. Going beyond that, the
Renode simulation system supports multi-node networks of embedded systems in a distributed simulation [
18]. Recently, SystemC-based processor simulation solutions have been introduced into the RISC-V ecosystem as well. Besides the RISC-V VP, which we have covered in the introduction, viable alternatives are the DBT-RISE [
19] framework, ETISS [
20,
21] and the RISC-V-TLM [
22] instruction set simulator, which are also designed with a SystemC integration in mind and provide RISC-V support. However, they lack a way to model external devices, e.g., via SPI or GPIO. Regarding DBT-RISE, an example VP platform that integrates a RISC-V instruction set simulator and is implemented in SystemC TLM is provided [
23]. Another SystemC TLM simulator for RISC-V is RISC-V-TLM [
22], which is currently under active development to increase the supported core feature set. A recent approach that has built upon the RISC-V VP is a proposed visualization of the internal VP execution state for debugging purposes [
24]. It offers a live view into the execution state of the SystemC peripherals but lacks an interactive modeling platform for the environment interaction. In general, the freely available VP-based frameworks for RISC-V currently lack an effective methodology for designing and integrating configurable environment models. Advanced environment modeling capabilities in a configurable framework with extensive graphical capabilities, as proposed in this paper, are, to the best of our knowledge, not yet available in any of the open RISC-V virtual prototyping approaches. Finally, there are commercial VP tools such as Synopsys Virtualizer [
25] that might support RISC-V in combination with extensive environment modeling capabilities, but their implementation is proprietary.
Looking beyond RISC-V, existing simulators such as
simavr [
26] and
PICSimLab [
27] (using
simavr in the background) can be cycle-accurate but are limited to a certain family of AVR processors, and are fairly computationally expensive. In contrast to our approach, which offers an interface to a SystemC VP and hence is able to incorporate custom in-house chips and IPs, these simulators are not designed with advanced industry-proven SystemC-based virtual prototyping in mind.
5. Rapid Prototyping Using Lua Scripting
To increase the usability of our VP environment modeling tool, we added a device scripting engine. It allows developers to focus on the actual behavior of a device without having to understand the whole system, removes the need to re-build the framework after each change to a device, and increases modularity, which eases building a community-driven library of devices.
Such a scripting engine has to be fast, memory efficient and easy to learn. We chose Lua as the scripting language for practical rather than scientific reasons: it is widely used in games and other applications where execution speed and a low memory footprint are key. We also considered Python, which is widely used nowadays in more high-level applications, but embedding its interpreter into C/C++ programs is comparatively laborious and (slightly) slower.
For the interface between Lua and C/C++, we kept the dynamically typed language style and activate the offered interfaces in a “duck typing” way. This means that, when a script is loaded, the framework checks whether it implements certain expected functions. The detected interfaces can then be used by the configuration mechanism to enable and connect the device. The currently implemented interfaces are SPI, Pin input/output, Configuration change, Button/Mouse input, and Graphics.
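The duck-typing check described above can be illustrated with a minimal C++ sketch. It is not the framework's code: a loaded script is modeled simply as the set of function names it defines, and the rule table, `detectInterfaces` and the interface labels are hypothetical names chosen for illustration. Only the Lua function names are taken from the text, and which of them are mandatory for each interface is an assumption here.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a loaded Lua chunk: the names of the functions it defines.
using ScriptFunctions = std::set<std::string>;

// One interface and the functions a script must define to be granted it.
struct InterfaceRule {
    std::string interface_name;
    std::vector<std::string> required_functions;
};

// Rule table (an assumption for illustration; the function names follow the text).
std::vector<InterfaceRule> interfaceRules() {
    return {
        {"pin", {"getPinLayout"}},
        {"spi", {"receiveSPI"}},
        {"config", {"setConfig"}},
        {"keypress", {"onKeypress"}},
        {"click", {"onClick"}},
        {"graphics", {"getGraphbufferLayout"}},
    };
}

// Returns the interfaces whose required functions are all present in the script.
std::set<std::string> detectInterfaces(const ScriptFunctions& script) {
    std::set<std::string> detected;
    for (const InterfaceRule& rule : interfaceRules()) {
        bool complete = true;
        for (const std::string& name : rule.required_functions) {
            if (script.count(name) == 0) {
                complete = false;
                break;
            }
        }
        if (complete) detected.insert(rule.interface_name);
    }
    return detected;
}
```

In this sketch, a script that defines `getPinLayout` and `getGraphbufferLayout` but no `receiveSPI` would be registered for the pin and graphics interfaces only, without any explicit type declaration on the script side.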
In
Figure 3, a brief overview of the interface registry is shown. On the right side, in the
Lua tab, the functions to be implemented are grouped by their interfaces (colored). Highlighted in bold are the functions that each interface requires (the Graphics interface is special, as discussed in the following paragraphs) in order to be recognized by the
Device wrapper (central tab). Upon instantiation, the device wrapper checks for the existence of these functions and adds the corresponding interfaces to the C++ world. The
Platform then instantiates and stores all devices listed in the
configuration file (bottom left). For each of the interfaces, a central registry of all devices is kept to speed up the lookups performed for each frame.
A Lua-scripted device may implement a set of functions and must have at least a member classname, which is used to identify and instantiate this device (see Figure 3). For pin input/output, it has to implement at least the function getPinLayout(), in which it defines the number of input or output pins during the instantiation setup phase (see also Figure 6). The host system will then periodically call getPin(num) (if implemented) to request updates, and setPin(num, val) if registered and connected pins are updated from outside the device. By default, a pin is updated asynchronously (i.e., only when the environment model updates; see Section 4.2.2), unless the environment configuration sets it as synchronous. This saves bandwidth and improves performance. Normal, asynchronous pins are registered in the readingPINs and writingPINs data structures, whereas synchronous pins are handled in the registeredPINchannels data structure, as they are treated as IO-functions internally.
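The difference between asynchronous and synchronous pins can be sketched in C++ as follows. This is a simplified illustration under stated assumptions, not the framework's implementation: `PinDispatcher`, `markSynchronous`, `onPinChanged` and `flush` are hypothetical names, the callback stands in for a device's `setPin(num, val)`, and keeping only the latest value of an asynchronous pin between two environment updates is our assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <utility>

// Forwards pin changes to a device, either immediately (synchronous pins)
// or batched at the next environment update (asynchronous pins, the default).
class PinDispatcher {
public:
    using SetPinFn = std::function<void(int pin, bool value)>;

    explicit PinDispatcher(SetPinFn set_pin) : set_pin_(std::move(set_pin)) {}

    // Marks a pin as synchronous, as an environment configuration might do.
    void markSynchronous(int pin) { synchronous_.insert(pin); }

    // Called whenever the simulated hardware changes a pin level.
    void onPinChanged(int pin, bool value) {
        if (synchronous_.count(pin) != 0) {
            set_pin_(pin, value);    // delivered right away
        } else {
            pending_[pin] = value;   // only the latest value is kept
        }
    }

    // Called once per environment update; returns the number of pins delivered.
    std::size_t flush() {
        const std::size_t delivered = pending_.size();
        for (const auto& entry : pending_) set_pin_(entry.first, entry.second);
        pending_.clear();
        return delivered;
    }

private:
    SetPinFn set_pin_;
    std::set<int> synchronous_;
    std::map<int, bool> pending_;
};
```

Batching trades latency for bandwidth: several changes of the same asynchronous pin between two updates cost a single call, which is acceptable for an LED but not for a pin whose timing relative to an SPI transfer matters.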
SPI connections are handled by implementing receiveSPI(byte), which is called synchronously when the device is connected to an SPI port and receives a byte. Note that the function may return a value, which is passed back to the processor unless the port is configured in SPI_NORESPONSE mode. If the device implements the setConfig(list) function, it may receive configuration updates in the form of a key-value list during setup from the JSON config file. Additionally, it may implement getConfig(), through which the (default) settings can be viewed and reconfigured in the GUI. For GUI interactions (Button/Mouse input), the device may implement onKeypress(keycode, press_release) or onClick(press_release). Note that onClick requires a graphical representation.
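The effect of the SPI_NORESPONSE mode on a byte transfer can be sketched as follows. The names `SpiPort`, `SpiMode` and `transfer` are hypothetical, the callback stands in for a device's `receiveSPI(byte)`, and returning 0 for a discarded answer is an assumption. Note also that this sketch still calls the device and then drops its answer, whereas a real implementation would presumably not wait for the reply at all.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// NoResponse mirrors the SPI_NORESPONSE mode mentioned in the text.
enum class SpiMode { Response, NoResponse };

// A port that forwards each byte to the connected device callback.
struct SpiPort {
    std::function<std::uint8_t(std::uint8_t)> receive_spi;  // stands in for receiveSPI(byte)
    SpiMode mode;

    // Sends one byte to the device and returns the byte seen by the processor.
    std::uint8_t transfer(std::uint8_t out) const {
        if (!receive_spi) return 0;                    // nothing connected
        const std::uint8_t answer = receive_spi(out);
        return mode == SpiMode::NoResponse ? 0 : answer;
    }
};
```

Discarding the answer is only safe for write-only devices such as a display; a device that reports data back to the firmware needs the bidirectional mode.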
The interface for graphics is slightly more interesting, as the environment GUI offers functions to the device once it defines the getGraphbufferLayout() function. During setup (Figure 3), the GUI calls this function and reserves a memory region with the requested image size and format (currently only RGBA8888), and inserts the callback functions (Figure 3) get- and setGraphbuffer(x,y,Pixel), which directly access the internal image buffer. A Pixel is a custom data type that combines red, green, blue and alpha values. These functions may be called by the device during all callbacks (Figure 3).
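A per-device image buffer in RGBA8888 format with pixel-wise accessors could look like the following sketch. It only illustrates the idea of a reserved memory region that is accessed through get and set callbacks. The class name `Graphbuffer`, the row-major byte layout and the bounds checks are assumptions, not details taken from the framework.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Combines red, green, blue and alpha values, 8 bits each.
struct Pixel {
    std::uint8_t r, g, b, a;
};

// Image buffer reserved for one device; 4 bytes per pixel, row-major (assumed).
class Graphbuffer {
public:
    Graphbuffer(unsigned width, unsigned height)
        : width_(width), height_(height),
          data_(static_cast<std::size_t>(width) * height * 4, 0) {}

    // Writes one pixel; returns false if (x, y) lies outside the buffer.
    bool set(unsigned x, unsigned y, Pixel p) {
        if (x >= width_ || y >= height_) return false;
        const std::size_t i = index(x, y);
        data_[i] = p.r;
        data_[i + 1] = p.g;
        data_[i + 2] = p.b;
        data_[i + 3] = p.a;
        return true;
    }

    // Reads one pixel; out-of-range reads yield a transparent black pixel.
    Pixel get(unsigned x, unsigned y) const {
        if (x >= width_ || y >= height_) return Pixel{0, 0, 0, 0};
        const std::size_t i = index(x, y);
        return Pixel{data_[i], data_[i + 1], data_[i + 2], data_[i + 3]};
    }

private:
    std::size_t index(unsigned x, unsigned y) const {
        return (static_cast<std::size_t>(y) * width_ + x) * 4;
    }

    unsigned width_;
    unsigned height_;
    std::vector<std::uint8_t> data_;
};
```

A device that requests a 1-by-1 layout, such as a simple LED, would then only ever touch the pixel at (0, 0).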
For technical reasons, all scripted devices run in one single Lua interpreter state as scoped chunks, which gives the best memory footprint and execution speed. This means that each script is loaded into a table, where it may only access pre-defined global functions and has no access to the functions of other scripts. All devices may call
setGraphbuffer(..), but they may only access their own buffer. To enable this, we opted for prefixed global C functions (e.g.,
button1_setGraphbuffer(..),
Figure 4, Line 9). This is a technical limitation of the
LuaBridge3 library that we use, where C functions may only be global. These functions are inaccessible to the scoped device scripts (
chunks) until they are inserted, without the prefix, into the respective Lua meta-table (
Figure 4, Line 20).
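The prefixing scheme can be illustrated without Lua by treating the global namespace and each device scope as plain lookup tables. In this sketch, `FunctionTable` and `scopeFor` are hypothetical names and the callbacks are reduced to `int(int)` for brevity. The actual mechanism works on Lua tables and meta-tables, which this sketch does not reproduce.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

using Callback = std::function<int(int)>;
using FunctionTable = std::map<std::string, Callback>;

// Builds the scope of one device: every global named "<device_id>_<name>"
// becomes visible to that device as "<name>". Other globals stay hidden.
FunctionTable scopeFor(const std::string& device_id, const FunctionTable& globals) {
    const std::string prefix = device_id + "_";
    FunctionTable scoped;
    for (const auto& entry : globals) {
        // True only if the global name starts with the device prefix.
        if (entry.first.compare(0, prefix.size(), prefix) == 0) {
            scoped[entry.first.substr(prefix.size())] = entry.second;
        }
    }
    return scoped;
}
```

With globals `button1_setGraphbuffer` and `led2_setGraphbuffer`, the scope built for `button1` contains a single entry named `setGraphbuffer`, so both scripts can use the same unprefixed name while each one reaches only its own buffer.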
5.1. Configuration
Our VP environment loads a
json-formatted configuration file on start-up for ease of customizing the user interface. An example is shown in
Figure 5. In the
window section, a background image (Line 3) and a desired window size (Line 4) can be defined (which defaults to the background image size).
After that, all implemented/loaded device classes may be referenced and instantiated in the
devices section (Line 6). A
device entry must have a
class and an
id (Lines 9 and 10). The
class references the building block’s
classname (see
Section 5), while the
id must be a name that is unique to the instance. Further items depend on the implemented interface of the specific building block (see
Figure 3). For example, a Lua-implemented button
button_lua offers the
graphics (Line 11),
onKeypress (Line 16) and
pin (Line 17) interfaces. The OLED device (Line 44) was implemented in both Lua and C++, with the latter being instantiated in this example.
5.2. Scoping Layers
To increase the modularity in the whole HW stack from the device to SoC peripheral, the
environment model consists of four layers: the
device layer, the
environment layer, the
platform layer and the
GPIO layer in the GPIO peripheral of the VP (see
Figure 6).
The
device layer is scoped to each individual device, which defines its pin and other protocol descriptions according to the respective interfaces (see
Section 5). In the
environment layer, all instantiated devices are connected to the global pin numbers. In hardware, this would normally be carried out via a prototyping breadboard or a PCB. Pins may be left unconnected. The
platform layer translates between the labeled “global” connector pins of a platform (such as the HiFive1) and the chip’s GPIO register offsets. Lastly, in the GPIO module that resides in the VP, the actual pin states are set/read according to
Section 4.2.2 and can either contain per-pin managed digital levels (see
Section 4.2.1) or pass through to an IO-function such as SPI.
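The translation of a device-local pin through the environment and platform layers amounts to two consecutive lookups, as the following sketch shows. The type and function names are hypothetical, the layers are reduced to plain maps, and returning -1 for an unconnected pin is an assumption. Any pin numbers used with it are illustrative and do not reflect the real HiFive1 mapping.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Environment layer: (device id, device-local pin) -> global connector pin.
using EnvironmentLayer = std::map<std::pair<std::string, int>, int>;
// Platform layer: global connector pin -> GPIO register offset of the chip.
using PlatformLayer = std::map<int, int>;

// Resolves a device-local pin to a GPIO offset, or -1 if it is not connected.
int resolveGpioOffset(const EnvironmentLayer& environment,
                      const PlatformLayer& platform,
                      const std::string& device_id, int local_pin) {
    const auto wired = environment.find(std::make_pair(device_id, local_pin));
    if (wired == environment.end()) return -1;   // pin left unconnected
    const auto mapped = platform.find(wired->second);
    if (mapped == platform.end()) return -1;     // no such connector pin
    return mapped->second;
}
```

Keeping the two maps separate means that a device description never needs to know which board it is attached to, and the same environment wiring can be reused with a different platform table.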
5.3. Example Devices
To explain the concept better, we will show two of the currently implemented devices in more detail: a simple red LED (see
Figure 7) and a more complex OLED display (
Figure 8).
5.3.1. LED
The LED implementation in
Figure 7 uses the
pin,
config and
graphic interfaces. In Lines 3–6, the module defines only one pin, with the number
1 as an input pin and the description string of “
led_on”. Lines 8–11 request only one color pixel from the graphics system, which is accessed later via
setGraphbuffer(0,0, ⋯) on Lines 32 and 34. Lines 13–15 define local variables for the displayed color, which are read and set, either from the configuration file or at runtime, via
getConfig() and
setConfig(conf) in Lines 17 and 23, respectively. The actual display update happens when the input pin (
1) changes (Lines 29–37). Because the callback supports multiple pins,
setPin(⋯) receives the pin number and a boolean value that indicates whether the pin is
HIGH or
LOW.
5.3.2. SSD1306 OLED Display
In
Figure 8, a more sophisticated example is given. It implements the already known
pin interface, but also the
SPI interface with the function
receiveSPI(byte_in) (Lines 56–81). Note that, for brevity, some of the internal logic is omitted (Lines 44, 58, 77). In Lines 36–41, the most common operator bytes are defined. The omitted function
getMask(op) determines the value bits of an input command byte, which is then used by the
match(cmd) function (Lines 47–54) to decode incoming raw bytes. Lastly, in
receiveSPI(byte_in) (Lines 56–81), the actual drawings to the frame buffer are performed when the incoming SPI byte is detected as
data (if the
data_command pin was set
HIGH). In Lines 57–65, the translation from 1-bit-pixel rows to the pixelwise frame buffer is carried out, including the increment of the current column pointer. Some of the command handling is shown in Lines 65–79, where internal state variables are changed.
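The translation from packed 1-bit pixels to a pixel-wise frame buffer can be sketched as follows. This is a simplified C++ illustration, not the Lua code of Figure 8: it assumes that each data byte covers eight vertically stacked pixels of the current page with the least significant bit on top, that the column pointer wraps at the display width, and that the height is a non-zero multiple of eight. `MonoDisplay` and its member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Monochrome display model with one bool per pixel.
class MonoDisplay {
public:
    // The height is assumed to be a non-zero multiple of 8.
    MonoDisplay(unsigned width, unsigned height)
        : width_(width), height_(height),
          pixels_(static_cast<std::size_t>(width) * height, false),
          column_(0), page_(0) {}

    // Selects the page (a band of 8 pixel rows) for the following data bytes.
    void setPage(unsigned page) { page_ = page % (height_ / 8); }

    // Unpacks one data byte into 8 pixels and advances the column pointer.
    void writeData(std::uint8_t byte) {
        for (unsigned bit = 0; bit < 8; ++bit) {
            const std::size_t y = static_cast<std::size_t>(page_) * 8 + bit;
            pixels_[y * width_ + column_] = ((byte >> bit) & 1u) != 0;
        }
        column_ = (column_ + 1) % width_;
    }

    bool pixel(unsigned x, unsigned y) const {
        return pixels_[static_cast<std::size_t>(y) * width_ + x];
    }

    unsigned column() const { return column_; }

private:
    unsigned width_;
    unsigned height_;
    std::vector<bool> pixels_;
    unsigned column_;
    unsigned page_;
};
```

A command byte would take a different path and only change state such as the page or the column pointer, which is why the level of the data/command pin has to be known at the moment the byte arrives.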
6. Evaluation
In this section, we show some use-cases for our VP environment by modeling two example environments along with their interacting software (
Section 6.1), evaluate the performance of different modeling strategies (in comparison to the baseline RISC-V VP,
Section 6.2) and, lastly, briefly describe how we used it in our own lectures (
Section 6.3).
6.1. Modeling Case-Studies
We implemented our proposed approach for VP-driven environment modeling and interaction in the RISC-V context using the open source RISC-V VP as foundation, which is available at GitHub [
4]. To demonstrate the effectiveness of our approach in building feature-rich environments, we designed two example environments in combination with different firmware applications as case-studies. In the following, we present both case-studies in more detail (
Section 6.1.1 and
Section 6.1.2).
6.1.1. Breadboard Environment
To demonstrate the usability of our approach as a rapid prototyping methodology, we designed a breadboard environment and configured a button, an LED and a seven-segment display, as well as the built-in RGB-LED of the HiFive1 board. An excerpt of the corresponding configuration file can be found in
Figure 5. The corresponding graphical display of the environment is shown in
Figure 9. Besides the already mentioned components, the environment also displays the connection between the respective GPIO pins of the HiFive1 and the breadboard. During the VP-driven simulation, the environment GUI was updated accordingly to reflect the current execution state of the VP.
The firmware is kept simple in this example: it counts seconds using the core local interrupter (CLINT) timer and renders the count on the seven-segment display. Whenever the button is pressed, the count direction is reversed. The single LED changes its state every second. Because the segments of the built-in RGB LED are always connected to certain GPIO pins that also drive the seven-segment display, its color changes and mixes as well.
6.1.2. OLED Display Shield with Buttons
For a more sophisticated example, we modeled a hand-held “gaming” device with seven input buttons and a 64-by-128-pixel-wide OLED screen. An overview of this system is shown in
Figure 10, with the left side showing the virtual environment and the right side the corresponding real physical device. The screen is connected via SPI and demonstrates the bytewise I/O functions of our GPIO and environment model. The buttons are connected to ground and require pull-up resistors on the input pins to work, while the OLED screen is interfaced via an SSD1306 (
https://cdn-shop.adafruit.com/datasheets/SSD1306.pdf, last accessed 4 September 2022) compatible protocol, consisting of the usual SPI pins Master In Slave Out (MISO), Clock (CLK) and a Data Command (DC) input. This interface also demonstrates the requirement of synchronicity between the abstracted byte-wise SPI transmissions and the GPIO-handled (software-driven) Data/Command (DC) pin, where even a small timing jitter between the data and the DC pin would result in a glitchy or inoperable display. An excerpt of the Lua implementation can be found in
Figure 8.
To test the interaction between the user input and output, we programmed a demo snake game that listens to the up, down, left and right buttons in an interrupt routine and draws a gaming field on the screen. With the key mapping of our environment model, it can be played by clicking on the on-screen buttons or via the arrow keys on the keyboard. As the RISC-V VP is binary-compatible with the HiFive1 board, the same program can be run and played on the real board once the PCB has been manufactured (see
Figure 10a). For demonstration purposes, we built another firmware, besides the snake game, that displays a
Mandelbrot visualization on the same device (as shown on
Figure 10b).
6.2. Performance Evaluation
For performance evaluation purposes, we designed a non-interactive test program that first calculates 40 frames of a
Mandelbrot set visualization (see
Figure 10a) that uses software floating-point arithmetic to render the fractal. It then fades the display using the background illumination command and draws 1000 characters of pre-defined, randomized text to the screen. After this, the program exits with a special RISC-V exit sequence that is handled in the RISC-V VP. This is, of course, not handled by the real processor. While the
Mandelbrot set visualization is computationally intensive, as every pixel is calculated individually, the text stream is limited only by the SPI bandwidth, as it uses lookup buffers for the font and addresses the display’s native format of eight pixel rows per byte.
In addition, we modeled the SPI OLED display three times:
1. in the SystemC VP, communicating directly with the SPI device over TLM, sharing only the screen buffer over memory-mapped I/O to the GUI;
2. in the VP environment GUI as a C++ device, using our GPIO protocol; and
3. in the VP environment GUI as a Lua device, using our GPIO protocol (see
Figure 8).
The results of this experiment can be found in
Table 1. The first column describes the Test type:
Baseline (unmodified RISC-V VP with non-functional mock-up GPIO peripheral);
Disconnected (our modified RISC-V VP with the display modeled in SystemC, but no connection to the environment); and
GUI-connected (our modified RISC-V VP with the connected environment GUI actively displaying the execution state). The connected tests were built in four different set-ups:
SystemC-Device, where the OLED display driver is directly connected to the SPI peripheral in SystemC sharing the screen buffer with our GUI;
Bidirectional C++-Device, where the driver is modeled in our environment GUI and the SPI peripheral awaits the answer byte via the protocol;
Unidirectional C++-Device, where the device’s answer is discarded for speedup; and, finally,
Unidirectional Lua-Device, where the logic of the display driver is modeled in our Lua scripting engine.
The next column, Time, reports the wall-clock time of the whole simulation, as reported by the time program, with an already started GUI (if applicable). # Exec. Instr. refers to the number of native machine instructions (not pseudo-instructions) executed until the end of the test. Note that the number differs slightly for the same binary because the behavior differs depending on whether the GPIO memory-mapped region is mock-up memory (Baseline), correct but disconnected (Disconnected) or responding to actual SPI devices (GUI-connected). Lastly, we calculated the number of Million Instructions Per Second (MIPS) to offer a comparison to other simulation approaches. All tests were conducted on a desktop-grade AMD Ryzen 3700G processor with 32 GiB RAM and outperformed the real HiFive1 setup, especially in memory-intensive tests, which is usually not possible with RTL models (note that the RISC-V VP can lock the CLINT (internal timer) to either simulation time or wall-clock time).
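For reference, the MIPS figure follows directly from the two measured columns: the number of executed instructions divided by the wall-clock time in seconds and by one million. A minimal sketch of this calculation is given below. The function name is hypothetical, and returning 0 for a non-positive time is an assumption made to keep the function total.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// MIPS = executed instructions / (wall-clock seconds * 1,000,000).
double millionInstructionsPerSecond(std::uint64_t executed_instructions,
                                    double wall_seconds) {
    if (wall_seconds <= 0.0) return 0.0;  // guard against division by zero
    return static_cast<double>(executed_instructions) / wall_seconds / 1e6;
}
```

As an illustrative example with made-up numbers (not taken from Table 1), 50,000,000 executed instructions in 2.0 s correspond to 25 MIPS.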
As can be observed, a connected and running VP environment has a minimal impact on the execution speed of the VP. Besides the asynchronous communication scheme, the minimal overhead could be achieved through the use of a multi-core processor, as the RISC-V VP uses the single-threaded SystemC reference implementation. Thus, the RISC-V VP and the environment GUI can be executed in parallel with little to no interference. Secondly, it can be noted that the implementation of a high-throughput device (such as the OLED display) in the
Lua scripting language does
not add a significant run-time overhead to the simulation speed, as long as the response is discarded. Note, however, that the refresh rate of the environment GUI drops slightly, as a Lua device’s accesses to the frame buffer are generally slower because of the C-wrapper (see
Figure 3). The refresh rate varied between 10 and 20 Hz in all tests, with an upper limit of 20 Hz.
The overall impact of our approach on execution speed can be observed against a baseline version of the GPIO peripheral, where any accesses to the memory mapped IO-interface are ignored (pass-through to memory). This reveals only a runtime overhead on average for the benefit of a functioning, interactive GPIO interface.
6.3. Educational Tool for Teaching
Among other courses, we offer a system-level design lecture that also covers programming embedded systems. During the COVID-19 pandemic, the students had no possibility to interact with physical prototype boards such as the SiFive HiFive1. As the students had covered implementing their own small VPs, it was easy to show them the principles of the more complex RISC-V VP. The students could then use and program our digital version of the HiFive1 board to understand the basic concepts of interrupt handling and how embedded systems interact with their environment. As the RISC-V VP can be analyzed using normal software-based debuggers such as GDB, we can show the detailed steps of different control flows at runtime, and how software and hardware modules interact with each other. The small exercises were laid out in incremental steps to program an interrupt-triggered blinking LED while reacting to button presses. One year, the final lectures could be held in person, where the students could test their own programs on real HiFive1 boards supplied by the university.
Overall, we noted that the RISC-V VP with the environment model extension, while posing an initial learning curve, was very helpful during remote teaching and remained useful in in-person teaching, as every student could build and test their programs at home without having to be supplied with real hardware. We suppose that it will also be beneficial for more practice-oriented embedded programming courses, especially when using hardware that is either too costly or complex to be supplied to every student or hardware that requires special programming devices.