Article

The Ubimus Plugging Framework: Deploying FPGA-Based Prototypes for Ubiquitous Music Hardware Design

by Damián Keller 1,*,†, Aman Jagwani 2,† and Victor Lazzarini 2,†

1 Amazon Centre for Music Research (NAP), Federal University of Acre, Rio Branco 69920-900, Brazil
2 Department of Music, Maynooth University, W23 F2H6 Maynooth, Co. Kildare, Ireland
* Author to whom correspondence should be addressed.
Current address: Ubiquitous Music Group, Rio Branco 69915-900, Brazil.
Computers 2025, 14(4), 155; https://doi.org/10.3390/computers14040155
Submission received: 14 March 2025 / Revised: 27 March 2025 / Accepted: 16 April 2025 / Published: 21 April 2025

Abstract: The emergent field of embedded computing presents a challenging scenario for ubiquitous music (ubimus) design. Available tools demand specific technical knowledge—as exemplified in the techniques involved in programming integrated circuits of configurable logic units, known as field-programmable gate arrays (FPGAs). Low-level hardware description languages used for handling FPGAs involve a steep learning curve. Hence, FPGA programming offers a unique challenge to probe the boundaries of ubimus frameworks as enablers of fast and versatile prototyping. State-of-the-art hardware-oriented approaches point to the use of high-level synthesis as a promising programming technique. Furthermore, current FPGA system-on-chip (SoC) hardware with an associated onboard general-purpose processor may foster the development of flexible platforms for musical signal processing. Taking into account the emergence of an FPGA-based ecology of tools, we introduce the ubimus plugging framework. The procedures employed in the construction of a modular-synthesis library based on field-programmable gate array hardware, ModFPGA, are documented, and examples of musical projects applying key design principles are discussed.

1. Introduction

Ubiquitous music (ubimus) approaches have been applied in various contexts, showing strong potential to support music-making and tool design both as expanded forms of legacy practices and as cutting-edge explorations of emergent techniques. In particular, fast-prototyping tools provide experienced and novice stakeholders with a path toward participation in technically demanding tasks. Despite promising results in software design, multimodal installations, and DIY artistic initiatives, advances in hardware prototyping have been slow. Of the three activities encompassed by ubimus practice—ways of thinking, ways of designing, and ways of deploying—we delve into the second area, emphasising the documentation of methods and also providing some insights into the impact of design choices on the conceptual and sonic results (there is a growing literature on ubimus practice, including the extensive coverage provided in [1], plus several topic-oriented volumes [2]).
Ubimus projects geared toward the development of hardware infrastructure have unveiled an emerging set of demands, highlighting the limitations of genre-oriented views [3,4,5]. Difficulties encountered while deploying widespread resources and heterogeneous infrastructure have also highlighted specific local needs. Without excluding the support of strongly timed strategies, ubimus design frameworks have tended to adopt flexible temporalities (beyond the implications for hardware design, the adoption of flexible temporalities releases musical interaction from the straitjacket of metre-centric and orchestra-centric design thinking, which may suit legacy support but may limit prospective frameworks) based on an expanded set of audio processing and synthesis techniques.
Possibly one of the most complex challenges faced by ubimus frameworks is the demand for supporting both legacy practices and prospective exploratory initiatives. Replicability to ensure the construction of community-oriented knowledge is among the key targets of ubimus design. Fundamental to this is the adoption of Free, Libre, and Open Source (FLOSS) approaches to both software and hardware while also fostering wider knowledge-sharing. Replicable techniques do not mean the enforcement of similar artistic results or the imposition of cultural uniformity. In fact, multicultural ubimus deployments suggest the need to cater to diverse and innovative musical practices while preserving the local ways of doing. A fragile balance between replicability and diversity may be achieved through design methods compatible with aesthetic pliability.
How do these requirements translate into specific choices when designing hardware infrastructure? Standardisation seems to be a partial solution to ensure the support of legacy practices. The push toward the incorporation of IoT resources through the development of an Internet of Musical Things reinforces this direction. Another thread toward more sustainable infrastructures is enabled through ubimus archaeologies, with interesting implications for future designs. Prospective frameworks may demand approaches that resist standardisation, particularly for endeavours in which music-theoretical groundings are in flux [6] or when the framework is geared toward contingency as a creative factor. In these contexts, just inheriting a concept-plus-tool package may be too limiting (see the caveats of the adoption of “the thing” as a working unit of musical thinking in [6]). Therefore, aesthetic pliability calls for an overhaul of the extant approaches to hardware design, not only because of the current technical caveats but also to encourage frameworks amenable to the expansion of musical endeavours.
In this paper, we introduce ModFPGA, a modular synthesis system based on field-programmable gate arrays (FPGAs). The remainder of this paper is organised as follows. We briefly introduce FPGAs and discuss their usage in state-of-the-art audio programming. We then introduce our proposed design and its components. This is followed by two audio-synthesis studies, including comparisons with other embedded systems through tests and performance measurements. The ideas implemented in ModFPGA highlight the specific contributions of our ubimus framework for accessible audio-hardware prototyping. Our results showcase how our plugging framework fosters hardware scalability and composability while encouraging the deployment of flexible temporalities and aesthetic pliability in music-making.

2. Field-Programmable Gate Arrays

The functionality of a modular audio environment is grounded on two characteristics. The components provide a certain level of abstraction that ensures their behaviour is not obscured by complex parametric procedures. There is also some flexibility in the architecture, a characteristic known as composability [7]. Within the realm of reconfigurable audio systems, modules act as building blocks to enable unique information-processing tasks. Musical information is routed through modules’ inputs and outputs by means of a signal-processing architecture. The processes of constructing and using modular hardware resources come together through the action of plugging. Thus, throughout this paper, we will refer to patching or plugging to describe a composite set of activities that enables the investigation and deployment of reconfigurable hardware architectures for creative music-making.
The current state of affairs of hardware prototyping indicates opportunities for exploratory designs enabled by the increased miniaturisation and computational power provided by field-programmable gate arrays (FPGAs) and system-on-chip resources. FPGAs may foster a path to achieve the aims of ubiquitous music ecosystems. These integrated circuits consist of a large number of configurable logic units that can be programmed (and reprogrammed) to enable complex computational tasks [8]. FPGAs are well-suited for audio due to their high throughput, ultra-low latency, and high sampling rates [9]. High throughput allows complex tasks that remain difficult to handle in standard CPU-based embedded systems (for example, spectral processing) to be carried out in embedded environments. Ultra-low latency enables the seamless processing of synchronous audio, fostering the exploration of active acoustic control. Within the recent expansion of ubiquitous music frameworks, high sampling rates enable the development of DSP techniques such as higher-order FM [10]. Additionally, modern FPGAs often incorporate a processor system on the same chip, fostering hardware-software co-design practices [11]. Through the use of MIDI, OSC, or networking protocols, system-on-chip (SoC) configurations allow for flexible and efficient musical signal processing. The integrated CPU can also run embedded Linux, providing operating system-based techniques to exploit FPGA-based resources for acceleration [12,13].
Programming FPGAs typically requires low-level hardware description languages, such as Verilog and VHDL [8], along with specialised hardware design knowledge. These tools require a specific skill set and intensive development investment. In contrast, audio processing algorithms are typically developed using high-level programming environments like C/C++, Python, or MATLAB [9]. An alternative approach can be based on high-level synthesis (HLS) tools. HLS supports C/C++ programming of intellectual property (IP) cores to handle the processing blocks inside FPGAs. Thus, IP cores can be used as components of a framework based on the plugging metaphor.

2.1. FPGAs and Hardware Prototyping

The increased availability of fast, affordable microcontrollers and single-board computers, along with the popularisation of open-source hardware like Arduino [14] or PlatformIO [15], has led to the emergence of a number of programmable embedded-audio platforms such as the ElectroSmith Daisy [16], Bela [17], and Teensy [18]. These platforms can foster flexibility for setups used in audio installations, performances, and other artistic productions, with an emphasis on the design of custom devices [19]. High-level programming is supported by means of music-specific languages such as PureData [20] for the Daisy platform [16], Csound for the Bela platform [17], or Faust for the Teensy platform [18]. However, these tools are often limited in terms of their computational power, latency constraints, or access to special-purpose resources, making them suitable for niche applications but presenting some caveats when aiming for larger deployability and scalability.
FPGAs offer several advantages, including ultra-low latency, high throughput, and the ability to handle high sampling rates. In typical software-based audio processing systems, audio samples are often handled in chunks (or buffers) to conserve computational resources. Processing audio in vectors or buffers reduces the computational load by avoiding the need to handle each sample individually, which can be taxing on traditional CPUs. However, in the context of FPGA-based systems, buffering becomes optional. Due to their high throughput and fast clock speeds, FPGAs are capable of processing each audio sample synchronously. FPGA clock rates, which run in the range of 100 MHz, are significantly higher than standard audio sampling rates (e.g., 44.1 kHz or 48 kHz), providing ample processing time for each sample within the sample-processing period. For example, at a 100 MHz clock and a 48 kHz sampling rate, roughly 2083 clock cycles are available per sample. This allows FPGA implementations to handle complex operations on a per-sample basis, resulting in extremely low latency.

2.2. State of the Art in FLOSS FPGA

Complementing the development of ubimus frameworks, a number of FLOSS projects have targeted FPGA hardware. Verstraelen et al. (2014) implemented a scalable and low-latency parallel programming platform for audio DSP called WaveCore [21]. Vannoy et al. (2019) report a model-based open audio processing platform for FPGAs using Simulink and auto-generated VHDL [22]. Focusing on hardware–software co-design, Vaca et al. (2022) demonstrate an open audio processing platform for Zynq-based chips [23]. Through onboard CPU and SoC development techniques, other projects explore audio processing in the areas of physical modelling, spatialisation, and audio effects on FPGAs [12,24,25]. A pioneering study employing HLS for FPGA programming was reported in [26], where CPU-only SoCs were compared with FPGA-based SoCs in sound-synthesis tasks. This approach was later adopted by the Syfala project [9], providing a toolchain to generate FPGA designs from FAUST or C++ [27]. These projects showcase the ways in which FPGAs can employ different levels of programming abstractions. However, the process of creating custom, pluggable IP cores with HLS has not been demonstrated so far.

3. ModFPGA: System Architecture and Implementation

ModFPGA is a modular sound synthesis system for FPGA-based SoCs, emulating analogue modular synthesisers [28] in an embedded context. Our library provides a set of interconnectable audio processing IP core modules, thus supporting reconfigurable sound synthesis systems that meet specific musical and technical requirements. ModFPGA employs HLS as a key technique for the development of custom IP cores. The system includes a base audio infrastructure for inter-module communication and audio input and output management, and a processing system application for configuration and control, enabling the efficient integration of FPGA-based modules (The HLS IP code, hardware platforms, drivers, processing system application code, and build scripts for ModFPGA can be found in the following repository: https://github.com/amanjagwani/ModFPGA, accessed on 10 April 2025).

3.1. HLS Application Development in ModFPGA

FPGAs are commonly programmed using low-level hardware description languages such as VHDL or Verilog. This creates a significant difference in the level of abstraction between FPGA programming and the typical design workflow of audio programs and applications. A workaround is a ready-made IP block-based design methodology, in which pre-built IP blocks are combined using tools such as System Generator and HDL Coder [29]. While this approach could be conducive to a ubimus plugging paradigm, it does not provide enough control over the functionality of the IP blocks.
HLS techniques mitigate these issues. HLS allows developers to design IP cores in C/C++, granting both a high-level programming environment and full access to the internal workings of the IP cores. This is particularly advantageous in a modular, reconfigurable system such as ModFPGA, where each individual IP core functionality must be finely tuned to specific audio processing requirements. At the same time, HLS IP cores can be combined and connected through various block designs, providing a good balance between granularity and adherence to the ubimus plugging paradigm.
The ModFPGA system has been implemented on the Digilent Zybo Z7020 development board featuring a Xilinx Zynq 7000 SOC [30]. The FPGA fabric (or programmable logic, PL) and a dual-core ARM Cortex-A9 processing system (PS) are based on the same chip. The combination of FPGA and ARM CPU furnishes a flexible architecture for hardware–software co-design. The FPGA fabric handles the parallel, low-latency DSP tasks while the processing system manages control tasks, such as configuring the IP cores and data exchanges with external devices.
HLS programming and RTL generation employ the Vitis HLS tool [31]. IP core rendering is primarily based on a top-level function, acting as a wrapper. All of the processing takes place within this function. Its arguments are used to define the inputs and outputs of the IP core. The top-level function also defines the interfaces, enabling communication with other IP cores through communication protocols and configurations of data types and widths. Additionally, high-level pragmas or directives can be used to define interfaces, optimise the IP, or describe the use of parallelism in the generated hardware. Lastly, after synthesising or generating an IP core, Vitis HLS presents a report of resource utilisation and latency metrics for further analysis. Below is an excerpt from the FourPole module of ModFPGA, which highlights the HLS programming process:
Listing (Computers 14 00155 i001): top-level HLS function of the FourPole module.
FourPole is a four-pole low-pass filter module with resonance. The excerpt shows the top-level function, which acts as a wrapper for the IP. The HLS INTERFACE pragma is used to define the streaming interfaces for the input and output audio samples and an AXILITE interface for the coefficients passed from the processing system (ARM CPU) to the filter IP. The function body contains the reading of the input stream, the filter processing, and the writing of the output stream. Lastly, an HLS PIPELINE pragma is used to turn off automatic pipelining for this IP. Pipelining is the process of starting a new loop iteration or process before the previous one has completed. By default, Vitis HLS pipelines processes so that the next iteration starts at every clock cycle. However, this is not suitable for this filter because there are state dependencies across iterations of the filter processing. Moreover, pipelining may reduce latency, but it increases the resource consumption of IPs. At 48,000 Hz, without pipelining, the latency of the IP is already below our tolerance limit (Table 1). Thus, we reduced resource consumption and, in turn, freed resources for additional voices in a polyphonic output. An example where pipelining is beneficial is presented in Section 3.4.
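The following minimal sketch illustrates the structure just described. It is a hypothetical reconstruction rather than the actual ModFPGA source: the function and argument names, the coefficient layout, and the use of floating-point streams are assumptions.

// Hypothetical sketch of a FourPole-style top-level HLS function (not the ModFPGA source).
// Assumes floating-point samples carried over AXI4-Stream and five coefficients
// (four stage coefficients plus a resonance-feedback gain) exposed over AXI4-Lite.
#include <hls_stream.h>

void fourpole(hls::stream<float> &in, hls::stream<float> &out, float coef[5]) {
#pragma HLS INTERFACE axis port=in
#pragma HLS INTERFACE axis port=out
#pragma HLS INTERFACE s_axilite port=coef
#pragma HLS INTERFACE s_axilite port=return
#pragma HLS PIPELINE off // state is carried across samples, so automatic pipelining is disabled

    static float s[4] = {0.f, 0.f, 0.f, 0.f}; // filter state, one value per pole

    float x = in.read();          // read one sample from the input stream
    float u = x - coef[4] * s[3]; // apply resonance feedback from the last stage

    for (int i = 0; i < 4; i++) { // four cascaded one-pole low-pass stages
        s[i] += coef[i] * (u - s[i]);
        u = s[i];
    }

    out.write(u);                 // write the processed sample to the output stream
}

In the actual module, the coefficients are computed on the processing system and written to the AXI4-Lite registers whenever the filter parameters change (see Section 3.5).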
Table 1 shows the key resource and latency metrics from the Vitis HLS synthesis report for the IP. In an implementation of a polyphonic synthesiser (as shown in Section 4.1), several instances of this IP fit comfortably on the FPGA board. The latency is presented as the number of clock cycles for one iteration of the IP to run, that is, for one sample to be processed. All ModFPGA IPs are designed to run at 125 MHz, with a clock period of 8 ns, so 423 cycles at 8 ns indicate a latency of 3.384 µs, which is only 16.24% of the 20.8333 µs sampling period at 48 kHz.
The HLS toolchain (Figure 1) also supports high-level testing of hardware designs through the C simulation feature [31]. Test benches can be created in C/C++ that call the top-level function and compare its results with the expected outcomes, routing the output to the console or to a file for persistent storage. Once an IP core has been generated and tested, it can be imported into the block-design environment provided by the Vivado tool. As a result, a complete FPGA-based hardware architecture is created by instantiating and connecting different IP cores generated from HLS. Furthermore, the system employs pre-built utility IP cores—such as I2S transmitters, clock dividers, and other components—to complement the design. The outcome is a bitstream that can be loaded into the FPGA hardware.
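Returning to the C simulation step mentioned above, a minimal test bench for the hypothetical fourpole sketch could look as follows (again an illustration under the same assumptions, not the ModFPGA test code):

// Hypothetical C-simulation test bench: feeds a unit impulse through the fourpole sketch
// and prints the impulse response, which can be compared against a reference offline.
#include <cstdio>
#include <hls_stream.h>

void fourpole(hls::stream<float> &in, hls::stream<float> &out, float coef[5]);

int main() {
    hls::stream<float> in, out;
    float coef[5] = {0.1f, 0.1f, 0.1f, 0.1f, 0.5f}; // placeholder coefficients

    for (int n = 0; n < 64; n++) {
        in.write(n == 0 ? 1.0f : 0.0f);   // unit impulse
        fourpole(in, out, coef);          // one call per sample, as in hardware
        printf("%d %f\n", n, out.read()); // route results to the console or a file
    }
    return 0;
}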
A final step in the HLS development process involves creating a software application in Vitis IDE, the third software tool in the ubimus plugging framework toolchain (Figure 2). This application runs on the processing system (PS) and provides an interface for managing IP cores. It also handles board-based tasks like parametric settings and configurations of the I2C-based audio codec. By leveraging the processing system alongside the FPGA, the Zynq SoC enables an efficient distribution of tasks: control and peripheral management are handled in software, and intensive DSP operations are carried out on the FPGA fabric.

3.2. Base Audio System

To smooth the integration and operation of the audio processing IP core modules, the plugging framework features a comprehensive base audio system that acts as a foundation of a structured environment for patching the modules. Peripheral tasks such as communication protocols, audio codec configuration, and clock management are handled by the base system, hence allowing the stakeholders to focus exclusively on audio processing and module patching without the need to manage low-level details.
  • Communication protocols: In a modular synthesiser, signal routing between components is critical. Similarly, in the ModFPGA environment, appropriate data communication between the IP core modules is important. Apart from facilitating inter IP-core communication, the base system also handles communication between the FPGA and the essential peripherals, including the onboard audio codec and external controllers, such as MIDI devices. These communications require specific protocols. The base audio system abstracts this complexity. Below are the key protocols managed by the base system.
    AXI4 Stream—Advanced Microcontroller Bus Architecture (AMBA) Advanced eXtensible Interface 4 (AXI4) is the standard used for communication within system-on-chip designs. AXI4 is suitable for high-performance, high-bandwidth, high-frequency, and low-latency designs [32]. The streaming version of this standard, AXI4 Stream, is used for inter-IP communication in our system [33].
    AXI4-Lite—AXI4-Lite is a simplified version of the complete AXI4 protocol, and it is used for communication between the modular IP cores and the on-chip processing system [34].
    I2S—The Inter-Integrated Circuit Sound (I2S) protocol is a serial audio interface used to transfer audio data. I2S communication is carried out between the FPGA modular synthesis system and the onboard audio codec, facilitating the streaming of audio data to and from the codec [35].
    I2C—The Inter-Integrated Circuit (I2C) protocol is used to configure the onboard audio codec in Zybo Z7020. Unlike I2S, which transmits audio data, I2C is used to send control commands, allowing the processing system to configure the codec’s internal settings, such as sampling rate and audio level, thus enabling the proper playback and reception of audio signals [36].
    UART—The Universal Asynchronous Receiver Transmitter (UART) protocol is useful for interfacing with control devices such as MIDI controllers. In ModFPGA, UART is used to connect MIDI controllers to the processing system, which accepts MIDI messages and assigns controls to them. UART is accessible via the USB port or MIO pins on Zybo Z7020 and the Xilinx PSUart driver [37].
  • Audio input and output: In embedded digital audio systems, the audio codec is a peripheral device responsible for interfacing the system with the external analogue components. We use the Analog Devices SSM2603 audio codec [38], which is present on the Zybo Z7020 board. This codec converts digital audio data from the FPGA into an analogue signal that can be output to speakers, and conversely, it digitises analogue input for further processing. The ModFPGA base system manages this interface, taking care of configuring the codec via I2C and handling audio data transmission through I2S, ensuring the IP cores can interact with external audio devices. The modular IP cores communicate using AXI4 Stream, and audio input/output is managed via the I2S protocol. To bridge these two communication standards, the system uses intermediary IP cores. Xilinx provides pre-packaged I2S Receiver and Transmitter IP cores within Vivado [39], which handle the conversion between I2S serial audio data from the codec and AXI4 streams for use by the audio processing IP cores. The I2S Receiver converts incoming serial audio data into AXI4 streams, and the I2S Transmitter converts AXI4 streams back to I2S, enabling seamless communication between the modular system and the codec. Since these cores come pre-integrated with Vivado, it makes sense to use them rather than developing custom ones, as they are easy to integrate and ensure compatibility across any Xilinx design and device.
  • Port definition: For the Zynq 7000 chip to be able to communicate with external devices, such as the audio codec, the appropriate ports need to be defined on particular GPIO pins of the chip. For example, the I2C port is mapped to the pins that connect the chip to the I2C input of the audio codec. Additionally, input ports are defined, for example, to receive a clock source that drives the FPGA design.
  • Clock management: The base system also provides clock management to drive the IP cores and the I2S audio, ensuring synchronised operations across all components.

3.3. Audio Processing IP Core Modules

Given the way that the ModFPGA base audio system handles peripheral and low-level tasks, the design and integration of audio processing IP core modules becomes streamlined. Each IP core is responsible for a specific audio processing function and can be easily connected to other cores and integrated into the base audio system. We provide a short description of the functionality of the implemented modules.
  • Band-limited oscillator bank: The oscillator is one of the most fundamental units of sound synthesis, acting as a sound-generating source that is able to produce a plethora of different waveforms both in the analogue modular synthesis realm and in the digital audio synthesis of ModFPGA. The ModFPGA oscillator bank module contains pre-computed, high-resolution, band-limited, and interpolating wavetables for sawtooth, square, and triangle waves. This module can be used to create rich, stacked monophonic synthesisers or polyphonic synthesisers. The module has versions containing 4, 8, 12, or 16 voices. The provided waveforms can be selected for each voice. Additionally, each voice contains two oscillators that can be detuned to create thick sounds with phasing effects. This module receives gain, waveform type, and frequency values for each voice as input arrays from the processing system via the AXI4-Lite protocol and outputs separate AXI4 streams containing the audio data for each voice.
  • Four-pole filter: This module is a port of a virtual analogue four-pole low-pass filter from Victor Lazzarini’s Aurora C++ library [40] (a lightweight, header-only C++ library that features multiple musical signal processing components). The port demonstrates how existing audio code can be adapted into HLS modules within the design constraints of HLS programming. The modifications made while porting to HLS involved reorganising the filter processing algorithm into a single top-level function (the IP core), with the coefficient calculations handled by the processing system. These coefficient calculations involve a large number of floating-point mathematical operations that would consume a large amount of logic resources on the FPGA. Therefore, the coefficients are calculated on the processing system and sent to the IP core via AXI4-Lite whenever the frequency changes.
  • FM operator: This module contains two sine wave oscillators, a carrier and a modulator, representing one voice of a frequency modulation (FM) synthesiser. An optional modulation input enables stacking operators to create denser modulation networks. The operator receives the carrier and modulator frequencies, modulation index, and amplitude values from the processing system.
  • ReverbSC: This module is a port of Csound’s reverbsc opcode [41] and consists of an eight-delay-line feedback network based on the work carried out by Sean Costello. This module supports parameters such as feedback amount, damping frequency, and mix amount routed from the processing system. Sample-by-sample processing in an embedded system provides rich, dense, and lush reverberation textures. This is the most complex module that has been developed so far in ModFPGA; consequently, it consumes a considerable amount of logical resources (around 20,000 lookup tables, whereas all the other modules consume less than 3000 lookup tables).
Additional ModFPGA modules include chorus effect, single oscillators, mixers, and phase generators. The programming framework developed as part of this project enables porting implementations from C++ libraries such as Aurora. A list of the current modules is shown in Table 2.

3.4. Leveraging Parallelisation and HLS Optimisation Techniques

As mentioned in Section 3.1, optimisations and parallelisation can be applied to an HLS IP core using pragmas—high-level directives that instruct Vitis HLS to generate hardware in a specific way. The use of these pragmas and their impact on resource utilisation and latency is demonstrated through the ModFPGA OscBank module.
OscBank is an oscillator bank that can be configured to produce multiple voices within a single IP core. By consolidating multiple voices into a single IP, each voice can share the same wavetables, thereby conserving on-chip memory utilisation. Within the IP, the parallelisation of voice processing can be adjusted using pragmas.
Listing (Computers 14 00155 i002): OscBank voice-processing loop with the HLS PIPELINE pragma.
The excerpt above shows a loop that iterates over the voices, calling the process_phase_input function for each voice. This function computes the phase increment based on the current frequency of the voice, performs linear interpolation on the wavetable, and generates the corresponding audio samples. There are several ways in which this loop can be translated into hardware. The HLS PIPELINE pragma can be used to control this behaviour.
  • No pragma provided: If no pipeline directive is specified, Vitis HLS will attempt to pipeline the loop automatically with an initiation interval (II) of 1. This means a new iteration of the loop will begin processing on each clock cycle. This results in maximum parallelisation and the lowest possible latency, as all voices are processed in fewer clock cycles. However, this approach demands large FPGA logical resources.
  • HLS pipeline disabled: If pipelining is turned off, each iteration of the loop will only start after the previous one has been completed. This leads to sequential execution, increasing latency but reducing resource utilisation: the same hardware resources can be reused across iterations.
  • HLS pipeline with a custom pipeline factor: As shown in the excerpt, the pipeline pragma can take a custom initiation interval. This allows for a trade-off between latency and resource utilisation. In the example, #pragma HLS PIPELINE II=10 ensures that each voice is processed every 10 clock cycles. This maintains low latency while avoiding excessive FPGA resource consumption.
Table 3 summarises the Vitis HLS synthesis report of the latency and resource utilisation metrics of the IP under these three configurations.
Thus, for an IP like OscBank, parallelisation and pipelining are advantageous in achieving optimal performance. Furthermore, since parallelisation occurs between independent voices, there is no state dependency across iterations in the pipelined for loop. In contrast, other IPs, such as ReverbSC or FourPole, involve state dependencies or recursive processing across iterations, which prevents effective pipelining. Therefore, careful consideration of optimisation strategies and pragma usage is crucial in designing HLS-based modules since different algorithms impose varying requirements in terms of sequential and parallel audio processing.
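To make these trade-offs concrete, the sketch below shows what a voice loop with a custom initiation interval could look like. The names (oscbank, process_phase_input, NUM_VOICES) follow the description above but are assumptions, and the interpolation details are simplified; it is not the ModFPGA source.

// Hypothetical OscBank-style voice loop pipelined with a custom initiation interval.
#include <hls_stream.h>

#define NUM_VOICES 8
#define TABLE_LEN 4096

static float wavetable[TABLE_LEN]; // shared by all voices to conserve on-chip memory
static float phase[NUM_VOICES];    // per-voice phase state

// one sample for one voice: phase increment plus linear interpolation on the wavetable
static float process_phase_input(float freq, float gain, int v, float sr) {
    float pos = phase[v];
    int i0 = (int)pos;
    int i1 = (i0 + 1) % TABLE_LEN;
    float frac = pos - i0;
    float sample = wavetable[i0] + frac * (wavetable[i1] - wavetable[i0]);
    pos += freq * TABLE_LEN / sr;
    if (pos >= TABLE_LEN) pos -= TABLE_LEN;
    phase[v] = pos;
    return gain * sample;
}

void oscbank(float freq[NUM_VOICES], float gain[NUM_VOICES],
             hls::stream<float> voice_out[NUM_VOICES]) {
#pragma HLS INTERFACE s_axilite port=freq
#pragma HLS INTERFACE s_axilite port=gain
#pragma HLS INTERFACE axis port=voice_out
#pragma HLS INTERFACE s_axilite port=return

VOICE_LOOP:
    for (int v = 0; v < NUM_VOICES; v++) {
#pragma HLS PIPELINE II=10 // start processing a new voice every 10 clock cycles
        voice_out[v].write(process_phase_input(freq[v], gain[v], v, 48000.0f));
    }
}

Removing the pragma lets Vitis HLS pipeline the loop with an initiation interval of 1 (lowest latency, highest resource usage), while disabling pipelining forces fully sequential execution, as summarised in Table 3.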

3.5. Processing System Software Application

While the modular IP cores handle the real-time audio processing on the FPGA, the processing system software application, programmed in C, manages system configuration, parameter control, and asynchronous tasks such as the routing of controller data (Figure 3).
  • Configuration: Essential for initialising the system, configuration tasks involve setting up the IP cores and peripherals, such as the audio codec or the UART interface (for MIDI), as required by the design. This setup is similar to init-time operations in domain-specific audio processing languages like Csound.
  • Coefficient calculation: The processing system offloads complex mathematical computations, like the four-pole filter coefficient calculations, which often require floating-point operations and do not need to be executed at audio rates. This helps conserve FPGA resources for audio rate tasks.
  • Control: Through AXI4-Lite communication, the processing system can adjust various parameters of the IP cores, such as gain and modulation indexes. This allows control rate operations to be carried out through different interfaces, including GUIs, MIDI, OSC, or standalone programs, enhancing the system’s modularity. For instance, our examples utilised MIDI and a standalone sequencer program for controlling sound synthesis parameters.
  • Envelopes: Envelope generators that modulate parameters such as modulation index or filter cutoff frequency can run on the processing system; this approach conserves FPGA resources.
Below is an excerpt of the PS application code for the subtractive synthesis ModFPGA example.
Listing (Computers 14 00155 i003): processing system application excerpt for the subtractive synthesis example.
This excerpt presents a high-level view of the role of the PS in the overall system, including peripheral configuration (I2S transmitter, UART, and audio codec), coefficient calculation for the FourPole filter, parametric settings for the reverb and mixer IPs, and MIDI control. The implementations of each of these function calls can be seen in the ModFPGA GitHub repository.
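The C-style sketch below conveys the same high-level flow. All helper names (configure_audio_codec_i2c, fourpole_set_frequency, and so on) and the midi_event_t type are hypothetical stand-ins for the functions found in the repository, not the actual ModFPGA code.

/* Hypothetical sketch of the PS application flow; helper names are illustrative
   (prototypes of these hypothetical helpers are omitted for brevity). */
#define NUM_VOICES 8

int main(void)
{
    /* 1. Peripheral configuration: I2S transmitter, UART (MIDI), and audio codec */
    configure_i2s_transmitter();      /* set up the pre-built I2S transmitter IP */
    configure_uart_midi(31250);       /* MIDI baud rate over the PS UART */
    configure_audio_codec_i2c(48000); /* sampling rate and levels via I2C */

    /* 2. Parametric settings for the reverb and mixer IPs (AXI4-Lite writes) */
    reverbsc_set_params(0.7f, 8000.0f, 0.3f); /* feedback, damping frequency, mix */
    mixer_set_gains(0.125f);                  /* equal gain for the eight voices */

    /* 3. Control loop: MIDI parsing, voice allocation, envelopes, filter coefficients */
    while (1) {
        midi_event_t ev;
        if (midi_poll(&ev)) {            /* non-blocking read from the UART */
            int v = allocate_voice(&ev); /* simple voice allocation */
            oscbank_set_voice(v, ev.frequency, ev.velocity);
        }
        update_envelopes();              /* amplitude and filter-cutoff envelopes */
        /* FourPole coefficients involve floating-point maths, so they are computed
           here on the ARM core and written to the filter IPs via AXI4-Lite. */
        for (int v = 0; v < NUM_VOICES; v++)
            fourpole_set_frequency(v, envelope_cutoff(v));
    }
    return 0;
}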

4. Deploying ModFPGA: Case Studies

This section presents two case studies showcasing the deployment of ModFPGA applied to two widely used synthesis techniques: subtractive synthesis and frequency modulation.

4.1. Subtractive Synthesis

Subtractive synthesis is a widely used technique, especially in dedicated hardware synthesisers. It involves a sound source that typically covers a broad frequency spectrum and a filter that acts as a spectrum modifier, shaping the sound produced by the source. In this approach, the sound from the source contains a wide range of frequencies, and the filter selectively removes unwanted frequencies, yielding only the desired components.
Typical sound sources for subtractive synthesis include analogue-emulation or virtual analogue oscillators, which produce waveforms commonly found in analogue hardware synthesisers, such as sawtooth, pulse, and triangle waves, as well as noise generators or sampled sound sources. In our case, we focus on virtual analogue sources. The three waveform types mentioned above provide a wide range of sounds after filtering. When transitioning from analogue to digital, it is important that these sources are band-limited. This ensures that the components remain within the Nyquist frequency (half the sampling rate) to prevent aliasing artefacts.
The ModFPGA band-limited oscillator bank IP core addresses this by offering sawtooth, pulse, and triangle waveforms while also implementing band-limiting. Single-cycle waveforms are stored as wavetables in two-dimensional arrays, with the appropriate number of harmonics for each frequency range relative to the system’s 48 kHz sampling rate. The IP core selects the appropriate wavetable for the current frequency and outputs a signal within the Nyquist limit.
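The band-limiting strategy can be pictured as follows; the sketch is hypothetical (table sizes, frequency ranges, and names are assumptions, not the ModFPGA implementation). One wavetable is stored per octave range, each pre-computed with only the harmonics that fit below the Nyquist limit, and the oscillator picks the table matching its current frequency before interpolating.

// Hypothetical band-limited table lookup (not the ModFPGA IP).
#define NUM_TABLES 10  // octave-spaced tables covering roughly 20 Hz to 20 kHz
#define TABLE_LEN  4096

static float saw_tables[NUM_TABLES][TABLE_LEN]; // pre-computed at build time

// pick the wavetable whose harmonic count keeps all partials below Nyquist
static int select_table(float freq) {
    int idx = 0;
    float base = 20.0f; // lowest fundamental covered by table 0
    while (idx < NUM_TABLES - 1 && freq > base * 2.0f) {
        base *= 2.0f;
        idx++;
    }
    return idx;
}

// one output sample: choose the table by frequency, then interpolate linearly
static float sample_bandlimited(float freq, float phase /* 0..TABLE_LEN */) {
    const float *tab = saw_tables[select_table(freq)];
    int i0 = (int)phase;
    int i1 = (i0 + 1) % TABLE_LEN;
    float frac = phase - i0;
    return tab[i0] + frac * (tab[i1] - tab[i0]);
}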
Various filter types can be used for spectral shaping, including low-pass filters that allow frequencies below a cutoff frequency, high-pass filters that do the opposite, and band-pass filters that yield a specific frequency range while removing everything outside of it. In our example, we use the ModFPGA four-pole filter, a low-pass filter, as the spectral modifier. Filters like four-pole also often have the capability of resonating, increasing the volume of frequencies near their cutoff frequency. Thus, they offer additional spectral-shaping possibilities.
Another key aspect of subtractive synthesis is dynamic spectral modulation. When the frequency content of a sound source changes over time, it opens up possibilities for elaborate sonic processes. In ModFPGA, the current choices for modulation sources include envelope generators and oscillators. In our example, a filter frequency envelope generator is used to modulate the cutoff frequency of the filter.
Audio effects can add texture and variation to synthesised sounds. The ModFPGA reverbsc module is used in this case to create spatial effects, adding depth and subtle pitch and phase modulations to the sound.
These, then, are the main modules interconnected through the plugging of IP core modules to create our subtractive synthesis examples. This can be carried out in monophonic or single-voice configurations, as shown in Figure 4, as well as in polyphonic or multi-voice configurations, as shown in Figure 5. In the monophonic version, the outputs of a four-voice oscillator bank are mixed and then processed by the four-pole filter. In the polyphonic version, each voice of an eight-voice oscillator bank has its own four-pole filter. The outputs of all the voices are then mixed into the final output.
Both versions feature envelope generators running on the processing system, modulating the amplitude and filter cutoff frequencies. In the case of the polyphonic design, each voice has its own amplitude and filter envelope. To control and play the synthesiser, MIDI can be used with the help of the Xilinx PS UART driver, as mentioned above. MIDI messages are received via the UART peripheral of the board. Message parsing, voice allocation, and control change/parameter processing are carried out in the C application on the processing system, providing support for MIDI controllers and keyboards. The complete polyphonic design employs about 76% of the available lookup tables of Zybo Z7020, leaving room for additional voices or effect processors in the design.
Polyphonic subtractive synthesis is demonstrated in this video: https://youtu.be/HEGCWv5DMnw, accessed on 10 April 2025.
In addition to the plugging of IP cores in the block design, this subtractive synthesis example highlights a lower layer of plugging in ModFPGA: porting existing audio processing algorithms and plugging them into the ModFPGA, HLS, and FPGA environment. This can be seen in the four-pole filter, which is ported from the Aurora library, as well as in the reverbsc module, which was adapted from Csound. The IP cores form the basis of the plugging environment for creating different sound synthesis possibilities on the FPGA, and the lower layer involves the plugging of sources into the IP cores themselves, enabling the expansion of the higher plugging layer.

4.2. FM Synthesis

Frequency modulation (FM) synthesis involves the creation of complex harmonic spectra by modulating the frequency of sine waves at a high rate in the audio range and by a large amount [42]. FM synthesis uses a single fundamental unit, the sine wave oscillator, which serves both as a sound source and a spectral modifier. A basic FM synthesis setup consists of two sine wave oscillators: a carrier and a modulator. A pair of carrier and modulator oscillators is referred to as an operator.
FM synthesis allows for complex configurations of carriers and modulators, enabling stacked arrangements. For instance, a second modulator can modulate the first modulator, which in turn may modulate the carrier. This capability to form various architectures using the same basic unit makes FM synthesis a versatile technique. Among the available parameters for FM synthesis, we highlight the following: carrier frequency, modulator frequency, and modulation index.
  • Carrier frequency: This will be the fundamental frequency of the sound produced by the synthesiser.
  • Modulator frequency: This should be within the audio range (above 20 Hz), and it determines the harmonic content of the resulting sound. The ratio between the carrier and modulator frequencies defines the harmonic structure; whole-number ratios result in harmonic spectra, and non-integer ratios produce inharmonic spectra.
  • Modulation index: This parameter relates to the amount of modulation or frequency deviation caused by the modulator. It translates to the number of harmonics produced in the resulting sound. This parameter is particularly important as a modulation target in creating dynamic spectra. In our design, an envelope generator modulates the modulation index to achieve this.
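To make the operator structure concrete, here is a minimal per-sample formulation of a single carrier–modulator pair (a hypothetical sketch, not the ModFPGA fmosc source; classic FM is expressed here as phase modulation):

// Hypothetical per-sample FM operator: one modulator driving one carrier.
// fc = carrier frequency, fm = modulator frequency, index = modulation index,
// amp = amplitude, n = sample number, sr = sampling rate.
#include <cmath>

static float fm_operator(float fc, float fm, float index, float amp, long n, float sr) {
    const float two_pi = 6.283185307f;
    float t = (float)n / sr;
    float mod = sinf(two_pi * fm * t);                // modulator output
    return amp * sinf(two_pi * fc * t + index * mod); // phase-modulated carrier
}

With an integer fc/fm ratio such as 2:1 the spectrum is harmonic, and sweeping the index with an envelope generator produces the dynamic spectra described above.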
ModFPGA supports both monophonic and polyphonic FM synthesis designs. Figure 6 shows a monophonic FM synthesiser with a dual operator design made from individual fmosc modules used as carriers and modulators. In this case, there are more than enough available FPGA resources, so ADSR (attack, decay, sustain, and release) envelope generator modules are run on the programmable logic itself instead of the processing system.
Figure 7 shows a polyphonic FM synthesiser that uses eight separate FM operator IP cores (each containing a carrier and a modulator). In this design, reverbsc is also added; hence, to conserve FPGA resources, the envelope generators are run on the processing system. Similarly to the previous example, MIDI over UART can be used on the processing system to control or play the polyphonic FM synthesiser.
Polyphonic FM synthesis is demonstrated in this video: https://youtu.be/-JtMQfLILuk, accessed on 10 April 2025.

4.3. Performance Evaluation

This section presents an evaluation of key performance metrics as well as a comparison with two other established embedded audio computing platforms: Bela [17] and Daisy [16].
Bela is a single-board computer (SBC) audio platform based on the BeagleBone Black. It is a Linux-based system optimised for low audio latency; it is open-source and supports programming in multiple open-source audio languages, such as Csound, SuperCollider, and PureData, as well as in C++.
Daisy is a bare-metal platform based on the STM32 microcontroller. It features a compact form factor and various breakout boards that facilitate integration into different environments, such as effects pedals or Eurorack modules.
To enable a direct comparison, an equivalent subtractive synthesis program was implemented on each platform, replicating the structure of the ModFPGA subtractive synthesis case. Each implementation consists of eight voices, each containing two oscillators, a FourPole filter, and a ReverbSC reverberator. The Bela implementation was developed using Csound, while the Daisy implementation was written in C++ using the DaisySP library (the source code for the Bela and Daisy implementations is available in the ModFPGA repository).

4.3.1. FPGA Resource Utilisation

Table 4 highlights the resource utilisation of both synthesis cases in the programmable logic (PL) section of the Zynq-7000 SoC.
These values indicate that while both designs utilise more than half of the available resources on the Zynq PL, a substantial amount of resources remain available for further expansion. The available LUTs and FFs suggest that additional voices could be added beyond the current eight-voice configuration. The remaining BRAM capacity allows for the inclusion of additional wavetables or the use of higher-resolution wavetables for improved synthesis fidelity. One significant factor to consider is the impact of the ReverbSC module on resource utilisation, as shown in Table 5.
Due to the modular nature of ModFPGA, users can make trade-offs between effects processing and synthesis complexity. For instance, the ReverbSC IP alone occupies a significant portion of the available resources. Users could choose to replace it with additional voices, introduce more oscillators per voice for richer timbres, or prioritise alternative effects. In the current examples, eight voices were sufficient to demonstrate polyphony, making it feasible to include the ReverbSC IP within the available resources.
Furthermore, it is important to note that at this stage, ModFPGA HLS IPs are not fully optimised, as they rely on floating-point operations. While floating-point arithmetic facilitates the direct transfer of existing audio algorithms from other platforms to ModFPGA—as demonstrated by the FourPole module—fixed-point representation could reduce resource consumption and improve performance when deployed in hardware. However, as demonstrated in this section, even without fixed-point optimisation, the current implementation of ModFPGA achieves an acceptable level of performance and latency.

4.3.2. Computational Load Comparison

For Bela and Daisy, performance metrics are typically expressed in terms of CPU load. The Bela platform provides real-time CPU usage monitoring within its online IDE, while the Daisy platform reports CPU load via the CpuLoadMeter helper class in the vendor-supplied libDaisy hardware abstraction library. With all eight voices engaged, the measured CPU usage for both platforms is shown in Table 6.
These results indicate that both platforms operate under considerable computational load, with Bela nearing its maximum processing capacity. In contrast, on the Zynq platform, all audio processing occurs in the programmable logic, with resource utilisation directly quantified through implementation reports, as shown in the tables above. Additionally, only a minimal portion of the processing workload is handled by a single core of the dual-core Cortex-A9 processor, as discussed in Section 3.5. This leaves significant headroom for additional processing tasks, offering greater scalability compared to CPU-based platforms.

4.3.3. Latency Comparison

To measure and compare latency across the three platforms, a MIDI note-on message was sent from a computer to each device, and the corresponding audio output was recorded back into the computer. The latency was measured as the time elapsed between the onset of the note-on message and the first audio sample returning to the computer. Additionally, a commercial synthesiser, the Make Noise 0-Coast semi-modular synthesiser, was added to this comparison to provide an industry benchmark for these measurements.
The measurement setup consisted of a computer running a Digital Audio Workstation (DAW) for sending MIDI messages and receiving audio data, an audio/MIDI interface, and the four devices. The baseline round-trip latency of the system—excluding the audio processing on the boards—was measured at 2.5 ms. Since only the audio input latency into the system is relevant to the final measurements, half of this round-trip latency (1.25 ms) was subtracted from the reported values. Each measurement represents an average of 10 trials, taken separately for monophonic and polyphonic note-on messages on each platform. The 0-Coast is a monophonic synthesiser, so only monophonic readings are included for that device.
The results in Table 7 indicate that the FPGA-based system achieves lower latency than all the other platforms. Moreover, most of the latency in the FPGA implementation can be attributed to MIDI message processing and audio output transmission from the audio codec rather than the actual audio processing itself.
This was further confirmed by a control measurement of MIDI latency, where MIDI note-on messages were sent to a software instrument within the DAW. The time elapsed between the message and the first generated audio sample was recorded. The software instrument used was a stock synthesiser included with the DAW, Ableton Live, which is rated to have zero additional latency, ensuring an accurate MIDI latency measurement. This test yielded a latency of approximately 4 ms. Subtracting this value from the total measured FPGA platform latency results in effective onboard latencies of 2.55 ms (monophonic) and 4.75 ms (polyphonic), attributable to the audio processing on the PL, the software application on the PS, the I2S transmitter IP, and the audio output from the onboard audio codec.
For ModFPGA-specific audio processing latency in the subtractive synthesis case, the total latency can be determined by summing the latencies of the ModFPGA IPs, which are connected in series for each voice. Since all voices are processed in parallel within the Vivado block design, the per-voice processing time represents the longest sequential path any single audio sample must traverse. As a result, the sum of latencies for a single voice can be considered the effective total latency of the entire subtractive synthesiser system. The breakdown of per-module processing latency is presented in Table 8.
Given a clock period of 8 ns, this per-voice processing latency amounts to 6.008 µs for the complete subtractive synthesiser. Since all voices operate in parallel, this is also the overall system latency for processing an incoming audio sample. This is significantly lower than the 20.83 µs sampling period corresponding to a sample rate of 48 kHz, confirming that the FPGA performs real-time sample-by-sample processing. In contrast, both Bela and Daisy process audio in blocks of eight samples to prevent glitches or dropouts, which inherently introduces additional latency. The headroom between the total FPGA processing latency and the sampling period also suggests that the system could operate at higher sample rates, further reducing overall latency. However, the achievable performance is currently constrained by the sampling rate limitations of the onboard audio codec on Zybo Z7020.

4.4. Build Replication

Both complete synthesis cases, as well as each individual IP, can be built and programmed onto the Zybo Z7020 board from the command line using the ModFPGA GitHub repository. The Xilinx Software Command-Line Tool (XSCT) [43] is utilised to automate the build process via TCL scripts.
In terms of hardware setup, both examples respond to MIDI control via the UART protocol. The UART RX signal is mapped to MIO 14 on PMOD JF Pin 9 of the board, and the TX signal is mapped to MIO 15 on Pin 10 of the same PMOD.
To safely interface with these pins and control the synthesiser designs, a standard MIDI controller can be connected using a five-pin DIN cable and an opto-isolated MIDI-to-UART interface, such as a breakout board. Detailed instructions for building, programming the board, and the hardware setup are provided in the repository’s README file.

4.5. Discussion of Results

We described the development of a modular synthesis system on FPGA hardware. Our ModFPGA library takes advantage of the platform’s key features, such as low latency, parallelism, and strong processing power, to provide a portable embedded environment for musical applications. The system is designed to be programmed in C/C++ using HLS to facilitate and speed up the prototyping cycle.
A key idea behind our ubimus plugging approach involves the implementation of IP cores as independent modules for synthesis, inspired by analogue modular audio synthesisers. ModFPGA includes a growing repository of source code for these modules, tailored for FPGA hardware. In the block diagrams of the FPGA development cycle, IP cores resemble modular elements within a chip, featuring input and output ports with predefined communication protocols. Their signal processing capabilities are established through HLS programming. Instantiating and interconnecting these cores within the block designs reflects the support of ModFPGA for modularity. Therefore, if each of the IP cores is programmed to perform a specific audio signal processing task, it can be thought of as a synthesiser module. Its inputs and outputs can be thought of as points for plugging actions that yield multiple configurations. By approaching hardware design with a consistent methodology empowered by HLS, our ubimus plugging framework aims to provide access points to users of various skill levels.
  • High level: Pre-built patches are offered for various types of synthesiser and processor designs that are ready to use. Few or no requirements are imposed besides the ability to connect the hardware and flash the application.
  • Medium level: Pre-built IP cores are plugged together to set up custom designs. Stakeholders may require some added knowledge on how each IP core interacts and how cores can be connected while handling the tools to build an FPGA architecture.
  • Low level: A C++ framework is provided for the development of custom IP cores through HLS. Users are required to have audio programming and signal processing skills, as well as a practical understanding of the HLS toolchain.
ModFPGA supports the creation of audio-synthesis systems with the ability to cater to diverse musical requirements using a relatively small set of IP core modules, hence fostering aesthetic pliability, scalability, and composability.
  • Scalability—In modular synthesisers, modules can be added or removed based on performance needs, space constraints, and artistic vision. This means that artists can have a large system with numerous modules in their home setup and a smaller portable system containing their core modules that can be used for performances while supporting their artistic needs. In the FPGA-based ubimus plugging framework, this translates to being able to select FPGA resources. On a smaller chip with fewer resources, fewer modules are instantiated to create a ‘portable’ version of the system. A bigger chip with more resources enables a more comprehensive set. Since both designs are based on the same platform, synergy and continuity across different scenarios are supported.
  • Composability—Interconnecting modules to create unique signal chains, a key activity in the plugging framework, fosters the use of multiple synthesis techniques and sonic transformations without requiring an expansion of components. The deployment of just a few FPGA-based modules can result in multiple sonic results on a single chip. Due to being composable, our framework provides a complement to fixed-circuit synthesisers that tend to be tailored to limited functionality [44].
  • Modularity—Being ‘field-programmable’, FPGAs allow different hardware systems to be created by reprogramming their logic units. Our modular IP core synthesis platform embraces this concept at a high level. By reconfiguring the interconnections between IP core modules, new audio synthesis systems can be created on the fly. This modularity mirrors the adaptive nature of FPGAs, allowing the practitioners to reprogram both the hardware and the signal-processing architecture without having to deal with the underlying complexities. For example, in the polyphonic FM synthesis design, each operator is processed in parallel by instantiating separate cores in the block design. Similarly, in the polyphonic subtractive synthesis example, the eight separate filters are processed in parallel. This modular approach not only harnesses both parallelism and high throughput, but it also features greater flexibility in designing complex audio processing systems.
  • Flexible temporalities—Latency can be further reduced by introducing parallelism through HLS pragmas. For instance, with the OscBank module, we have already observed the impact of pipelining on latency with different pipeline factors. However, latency can be further minimised by unrolling processing loops. The band-limited oscillator bank module already utilises pipelining, but it can also be configured to process each voice on separate, replicated hardware using the HLS UNROLL pragma. This directive unrolls processing loops, effectively duplicating the hardware for each voice, allowing all voices to be processed simultaneously rather than sequentially.
    In our design, we chose not to unroll the loop in order to conserve FPGA resources, demonstrating the flexibility of managing timing, sequencing, and parallelism as needed (a short sketch of an unrolled voice loop is given after this list). Unrolling increases FPGA resource utilisation while significantly reducing latency and enhancing the system’s throughput. By processing multiple streams in parallel, the system can take full advantage of the FPGA capacity to handle a large volume of data in a very short time, thus maximising throughput. Furthermore, although the audio sample rate is limited to 48 kHz due to the onboard audio codec of the Zybo board, each IP core operates at a much higher clock speed (125 MHz). In the examples, each oscillator takes around 200 clock cycles to process one sample, giving a minimum sampling period of 1.6 microseconds and allowing the core to achieve an effective sampling rate of 625 kHz. With the use of unrolling and parallelisation, as explained above, this sampling rate could be further increased, enabling the implementation of advanced synthesis techniques such as higher-order FM or virtual analogue emulations.
  • Aesthetic pliability—The musical examples of the two implemented prototypes highlight the ability of ModFPGA-based synthesis techniques to deliver high-quality and diverse sonic outcomes. However, this is, of course, just a small fraction of the range of applications supported by the library. In the future, the exploratory nature of fast hardware prototyping may support emerging ubimus frameworks that comprise human–computer interaction techniques and audio synthesis and processing know-how. Consider, for instance, struck-string interaction. This approach to piano-like timbre fosters the deployment of generative techniques that explore the boundaries between tuning and sonic qualities. At the time of writing, implementations of timbre-oriented tunings based on synchronous tracking of sonic information are still works in progress. ModFPGA could be deployed to enable alternative architectures involving fairly short hardware-design cycles.
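As mentioned under flexible temporalities, the loop unrolling alternative can be expressed with a single pragma. The sketch below mirrors the OscBank voice loop from Section 3.4 with hypothetical names (not the ModFPGA source); each iteration is replicated as separate hardware so that all voices are computed concurrently, trading FPGA resources for minimum latency.

// Hypothetical fully unrolled voice loop (illustrative names only).
#include <hls_stream.h>

#define NUM_VOICES 8

float process_one_voice(float freq, float gain, int v); // per-voice phase/table computation

void oscbank_unrolled(float freq[NUM_VOICES], float gain[NUM_VOICES],
                      hls::stream<float> voice_out[NUM_VOICES]) {
#pragma HLS INTERFACE s_axilite port=freq
#pragma HLS INTERFACE s_axilite port=gain
#pragma HLS INTERFACE axis port=voice_out
#pragma HLS INTERFACE s_axilite port=return

VOICE_LOOP:
    for (int v = 0; v < NUM_VOICES; v++) {
#pragma HLS UNROLL // replicate the loop body NUM_VOICES times in hardware
        voice_out[v].write(process_one_voice(freq[v], gain[v], v));
    }
}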

4.6. Caveats and Future Developments

There are a few issues that need to be noted at this stage:
  • While the use of HLS and the availability of ready-made ModFPGA components simplify the process of creating musical applications for a highly specialised form of embedded hardware, the development process still requires a significant amount of programming experience.
  • Additionally, new module development is facilitated by a well-documented path from standard computer code to HLS sources. Nevertheless, many FPGA-specific conditions need to be met, and a detailed understanding of platform-specific matters is required.
  • Our approach has foregrounded the use of separate IP cores, which supports our study of pluggability. However, using several individual IP cores slightly increases FPGA logic resource utilisation compared with single-IP schemes. Individual IP cores are better suited to plugging, reconfiguration, and per-module optimisation, so there is a trade-off between resource utilisation and flexibility; the added resource utilisation is not significant enough to render our approach unviable or inefficient.
There are several avenues for expanding ModFPGA library support. The Xilinx FPGA tools are vast and complex and still impose a steep learning curve. We plan to create high-level plugging metaphors that enable the design of sound-synthesis algorithms either through visual programming strategies [20] combined with multimodal techniques (including flexible temporalities), or through computational-thinking frameworks tailored to casual usage, such as lite coding [45]. The FPGA bitstream and the software control application would then be auto-generated and flashed to the board.
Additionally, we plan to use HLS to develop and implement a delta-sigma DAC directly on the FPGA, as outlined in [46]. This would reduce the dependency on the sample-rate and latency limitations imposed by the pre-packaged I2S transmitter and onboard audio codec. We would then be able to use the system to study very high sampling-rate DSP methods, as employed in higher-order FM synthesis [10] and similar techniques. Lastly, the control system on the CPU may also be expanded with OSC, web-server, and analogue sensor-based control.
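As a rough indication of what such a converter involves, the sketch below implements a first-order delta-sigma modulator in HLS C++. It is purely illustrative and not the design planned from [46]: the module name dsm_dac, the fixed-point format, and the stream interfaces are assumptions, and a practical DAC would use a higher-order modulator with substantial oversampling. The 1-bit output would drive a single FPGA pin followed by an analogue low-pass filter.

#include <ap_fixed.h>
#include <ap_int.h>
#include <hls_stream.h>

typedef ap_fixed<24, 2> pcm_t;   // assumed oversampled PCM format, |x| <= 1 expected

void dsm_dac(hls::stream<pcm_t> &pcm_in, hls::stream<ap_uint<1>> &bit_out,
             int nsamples) {
#pragma HLS INTERFACE axis port=pcm_in
#pragma HLS INTERFACE axis port=bit_out
#pragma HLS INTERFACE s_axilite port=nsamples
#pragma HLS INTERFACE s_axilite port=return

    static pcm_t acc = 0;        // integrator state
    static pcm_t feedback = 0;   // previous 1-bit decision mapped to +/-1

LOOP:
    for (int i = 0; i < nsamples; i++) {
#pragma HLS PIPELINE II=1
        pcm_t x = pcm_in.read();
        acc += x - feedback;                        // integrate the quantisation error
        ap_uint<1> bit = (acc >= 0) ? 1 : 0;        // 1-bit quantiser
        feedback = bit ? pcm_t(1.0) : pcm_t(-1.0);  // feed the decision back
        bit_out.write(bit);
    }
}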

5. Conclusions

Emergent hardware prototyping methods have been fostered by a trend toward miniaturisation and increased access to open-source hardware platforms. Despite these tendencies, given the steep learning curve demanded by programmable logic circuits, artists and practitioners have been slow to incorporate hardware design as a means of expanding their musical practices. FPGA programming remains an area for the technically savvy.
Two factors may help to change this state of affairs. On the one hand, technological convergence has yielded reconfigurable hardware components that bring together field-programmable gate arrays and system-on-chip resources. On the other hand, an ecology of tools featuring abstraction and encapsulation as strategies to reduce the complexity of hardware-design procedures is emerging; high-level synthesis is a case in point.
Our approach incorporates design principles based on knowledge gathered through various ubimus research initiatives. As a first incursion into the realm of fast hardware prototyping, this paper documents the development of the ModFPGA library, a platform geared toward the exploration of audio processing and synthesis hardware architectures. We demonstrate two ModFPGA-based implementations of classic synthesis techniques: subtractive synthesis and frequency modulation.
Modularity and composability are two characteristics inherited from a well-established tradition of analogue synthesis practice. Modularity is exemplified by the support of software abstractions that emulate multiple hardware architectures. Composability ensures the functionality of interconnected components, enabling the exploration of both established and prospective designs. Scalability encompasses mechanisms to enable complex and dynamic configurations. As a strategy to avoid brittle solutions based on proprietary tools or on heterogeneous computational resources, multiple alternatives may be tested before local or remote deployments (see the caveats section for some limitations in this regard; sustainability is also relevant, and it has been addressed across several dimensions of the ModFPGA design process).
Two design qualities seem to be at the core of multiple ubimus frameworks: flexible temporalities and aesthetic pliability. Flexible temporalities foster diverse creative strategies to avoid the imposition of centralised or fixed-time management of musical events. Rather than imposing a mono-temporal regime, ubimus frameworks encourage the coexistence of multiple temporal layers, including quasi-synchronous and asynchronous relational properties among components. In ModFPGA, this is materialised through tailorable computational resources that may feature parallelism, the unrolling of iterated operations, or strict synchronisation at the subsample level. The reprogrammability of FPGAs leaves room for dynamic adaptability within different scenarios. For instance, with the same IP core modules, an FPGA can be programmed for ultra-low latency in one application and for minimal resource usage in another setting. This adaptability maximises the utility of FPGAs across various applications to meet diverse design requirements, thus enhancing sustainability. By using HLS pragmas, trade-offs between latency and resource consumption can be managed, adjusting FPGA performance to specific needs and contexts.
Lastly, and arguably one of the most important design characteristics, aesthetic pliability—or the ability to shape the resources according to the cultural needs of the stakeholders—is supported by keeping the design transparent. It may be argued that any form of abstraction implies constraining the decision-making processes. While this is true, the three-level design approach of our ubimus plugging framework fosters increased tailoring as a way to improve knowledge-sharing. Following this research path, we plan to study the limitations and potentials of the ubimus plugging framework as an environment for developing collective music-making involving reconfigurable hardware resources.

Author Contributions

All authors contributed to the writing; A.J. contributed the experimental work and coding for FPGAs. All authors have read and agreed to the published version of the manuscript.

Funding

We would like to thank the editorial team of Computers for encouraging the submission of this paper. D.K. acknowledges the support from CNPq [308790/2021-9]. Aman Jagwani’s work is funded by the Hume Scholarship from Maynooth University.

Data Availability Statement

The data and software are available from the git repository https://github.com/amanjagwani/ModFPGA, accessed on 10 April 2025.

Acknowledgments

Aman Jagwani would like to acknowledge the support of Maynooth University through the Hume Scholarship programme.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Keller, D.; Lazzarini, V.; Pimenta, M.S. (Eds.) Ubiquitous Music, 1st ed.; Springer: Berlin, Germany, 2014. [Google Scholar] [CrossRef]
  2. Bridges, B.; Lazzarini, V.; Keller, D. Editorial: Ecologically grounded creative practices and ubiquitous music—Interaction and environment. Organ. Sound 2023, 28, 321–327. [Google Scholar] [CrossRef]
  3. Brown, A.R.; Ferguson, J. DIY musical instruments: From Handmade Electronic Circuits to Microcontrollers and Digital Fabrication. J. Ubiquitous Music 2024, 1, 8–22. [Google Scholar]
  4. Mikolajczyk, K.; Ferguson, S.; Candy, L.; Dias Pereira dos Santos, A.; Bown, O. Space shaping in the design process for creative coding: A case study in media multiplicities. Digit. Creat. 2024, 35, 31–51. [Google Scholar] [CrossRef]
  5. Timoney, J.; Lazzarini, V.; Keller, D. DIY electronics for ubiquitous music ecosystems. In Ubiquitous Music Ecologies, 1st ed.; Routledge: London, UK, 2020; p. 19. ISBN 9780429281440. [Google Scholar]
  6. Messina, M.; Keller, D.; Freitas, B.; Simurra, I.; Gómez, C.; Aliel, L. Disruptions, technologically convergent factors and creative activities: Defining and delineating musical stuff. Digit. Creat. 2024, 35, 13–30. [Google Scholar] [CrossRef]
  7. Attie, P.; Baranov, E.; Bliudze, S.; Jaber, M.; Sifakis, J. A general framework for architecture composability. Form. Asp. Comput. 2016, 28, 207–231. [Google Scholar] [CrossRef]
  8. Kastner, R.; Matai, J.; Neuendorffer, S. Parallel Programming for FPGAs. arXiv 2018, arXiv:1805.03648. [Google Scholar]
  9. Popoff, M.; Michon, R.; Risset, T.; Cochard, P.; Letz, S.; Orlarey, Y.; de Dinechin, F. Audio DSP to FPGA Compilation: The Syfala Toolchain Approach. Univ Lyon, INSA Lyon, Inria, CITI, Grame, Emeraude, Tech. Rep. RR-9507, May 2023. Available online: https://inria.hal.science/hal-04099135 (accessed on 1 May 2023).
  10. Lazzarini, V.; Timoney, J. Theory and practice of higher-order frequency modulation synthesis. J. New Music Res. 2023, 52, 186–201. [Google Scholar] [CrossRef]
  11. Xilinx. Zynq-7000 SoC. Available online: https://www.xilinx.com/products/silicon-devices/soc/zynq-7000.html (accessed on 7 May 2023).
  12. Wegener, C.; Stang, S.; Neupert, M. FPGA-accelerated real-time audio in Pure Data. In Proceedings of the 19th Sound and Music Computing Conference, Saint-Étienne, France, 5–12 June 2022. [Google Scholar]
  13. Cochard, P.; Popoff, M.; Fraboulet, A.; Risset, T.; Letz, S. A programmable Linux-based FPGA platform for audio DSP. In Proceedings of the Sound and Music Computing Conference, Stockholm, Sweden, 12–17 June 2023; Royal College of Music and KTH Royal Institute of Technology: Stockholm, Sweden, 2023; pp. 110–116. Available online: https://hal.archives-ouvertes.fr/hal-04394035 (accessed on 24 April 2024).
  14. Arduino. Available online: https://www.arduino.cc/ (accessed on 16 February 2024).
  15. PlatformIO. Available online: https://platformio.org/ (accessed on 16 February 2024).
  16. Electro-Smith. Daisy. Available online: https://www.electro-smith.com/daisy (accessed on 8 June 2023).
  17. McPherson, A. Bela: An embedded platform for low-latency feedback control of sound. J. Acoust. Soc. Am. 2017, 141, 3618. [Google Scholar] [CrossRef]
  18. Michon, R.; Orlarey, Y.; Letz, S.; Fober, D. Real-time audio digital signal processing with faust and the teensy. In Proceedings of the Sound and Music Computing Conference, Malaga, Spain, 28–31 May 2019; pp. 325–332. [Google Scholar]
  19. Jagwani, A. Creative Possibilities and Customizability of Live Performance Systems with Open Source Programming Platforms. In Proceedings of the 13th International Symposium on Ubiquitous Music (Ubimus23), Derry–Londonderry, UK, 16 December 2023. [Google Scholar]
  20. Puckette, M. Pure data: Another integrated computer music environment. In Proceedings of the Second Intercollege Computer Music Concerts, Tachikawa, Japan, 7 May 1997. [Google Scholar]
  21. Verstraelen, M.; Kuper, J.; Smit, G.J.M. Declaratively programmable ultra-low-latency audio effects processing on FPGA. In Proceedings of the 17th International Conference on Digital Audio Effects, Erlangen, Germany, 1–5 September 2014; p. 8. [Google Scholar]
  22. Vannoy, T.C.; Davis, T.B.; Dack, C.A.; Sobrero, D.; Snider, R. An open audio processing platform using SoC FPGAs and model-based development. J. Audio Eng. Soc. 2019, 147, 1–8. Available online: https://api.semanticscholar.org/CorpusID:210965405 (accessed on 16 April 2024).
  23. Vaca, K.; Jefferies, M.M.; Yang, X. An open audio processing platform with Zynq FPGA. In Proceedings of the 2019 IEEE International Symposium on Measurement and Control in Robotics (ISMCR), Houston, TX, USA, 19–21 September 2019. [Google Scholar]
  24. Singhani, A.; Morrow, A. Real-time spatial 3D audio synthesis on FPGAs for blind sailing. In Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, ACM, Seaside, CA, USA, 23–25 February 2020. [Google Scholar]
  25. Merah, L.; Lorenz, P.; Ali-Pacha, A.; Hadj-Said, N. A guide on using Xilinx System Generator to design and implement real-time audio effects on FPGA. Int. J. Future Comput. Commun. 2021, 10, 38–44. [Google Scholar] [CrossRef]
  26. Fitzgerald, L. Sound Synthesis Using Programmable System-on-Chip Devices. Master’s Thesis, Maynooth University, Co. Kildare, Ireland, October 2019. [Google Scholar]
  27. GRAME. Fast: Fast Audio Signal-Processing Technologies on FPGA. Available online: https://fast.grame.fr/ (accessed on 12 April 2023).
  28. Teboul, E.J.; Kitzmann, A.; Engström, E. (Eds.) Modular Synthesis: Patching Machines and People, 1st ed.; Focal Press: London, UK, 2024; p. 500. ISBN 9781003219484. [Google Scholar] [CrossRef]
  29. MathWorks. FPGA, ASIC, and SoC Development with Xilinx. Available online: https://in.mathworks.com/solutions/fpga-asic-soc-development/xilinx.html (accessed on 12 July 2023).
  30. Digilent. Zybo Z7 Reference Manual. Available online: https://digilent.com/reference/programmable-logic/zybo-z7/start (accessed on 24 April 2023).
  31. AMD. Vitis High-Level Synthesis User Guide. Available online: https://docs.amd.com/r/en-US/ug1399-vitis-hls (accessed on 20 August 2024).
  32. ARM. AXI Protocol Overview. Available online: https://developer.arm.com/documentation/102202/0300/AXI-protocol-overview (accessed on 18 August 2023).
  33. Xilinx. AXI4 Stream User Guide. Available online: https://docs.xilinx.com/r/en-US/ug1399-vitis-hls/Introduction (accessed on 18 February 2023).
  34. ARM. AXI4 and AXI4-Lite Interfaces. Available online: https://developer.arm.com/documentation/dui0534/b/Parameter-Descriptions/Interface/AXI4-and-AXI4-Lite-interfaces (accessed on 18 August 2023).
  35. Philips Semiconductors. I2S Bus Specification. Available online: https://www.nxp.com/docs/en/user-manual/UM11732.pdf (accessed on 18 August 2023).
  36. I2C Info. I2C Bus Specification. Available online: https://i2c.info/i2c-bus-specification (accessed on 29 August 2024).
  37. Xilinx. PS UART. Available online: https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842340/PS+UART (accessed on 10 February 2024).
  38. Analog Devices. SSM2603: Low Power Audio Codec. Available online: https://www.analog.com/media/en/technical-documentation/data-sheets/SSM2603.pdf (accessed on 29 August 2024).
  39. Xilinx. I2S Transmitter/Receiver Subsystem. Available online: https://docs.xilinx.com/r/en-US/pg308-i2s?tocId=9FEzOdB8yo2s6RvSeJdRkA (accessed on 18 August 2023).
  40. Lazzarini, V.; Walsh, R. Aurora-Lattice: Rapid Prototyping and Development of Music Processing Applications. In Proceedings of the 13th International Symposium on Ubiquitous Music (Ubimus23), Derry–Londonderry, UK, 16 December 2023. [Google Scholar]
  41. Csound. Reverbsc Opcode. Available online: https://www.csounds.com/manual/html/reverbsc.html (accessed on 29 August 2024).
  42. Chowning, J.M. The synthesis of complex audio spectra by means of frequency modulation. J. Audio Eng. Soc. 1973, 21, 526–534. [Google Scholar]
  43. AMD. Xilinx Software Command-Line Tool (XSCT) Reference Guide, UG1208, Version 2018.2. Available online: https://docs.amd.com/v/u/2018.2-English/ug1208-xsct-reference-guide (accessed on 18 January 2025).
  44. J-UBIMUS. Journal of Ubiquitous Music. Available online: https://periodicos.ufes.br/j-ubimus/issue/view/1504 (accessed on 29 August 2024).
  45. Keller, D.; Lazzarini, V. Building blocks for lite coding: Temporalities in litePlay.js. In Proceedings of the Ubiquitous Music Symposium, Macau, China, 31 October–2 November 2024. [Google Scholar]
  46. Michon, R.; Sourice, J.; Lazzarini, V.; Timoney, J.; Risset, T. Towards high sampling rate sound synthesis on FPGA. In Proceedings of the 26th International Conference on Digital Audio Effects (DAFx23), Copenhagen, Denmark, 4–7 September 2023. [Google Scholar]
Figure 1. HLS programming structure.
Figure 2. Complete HLS development flow and toolchain.
Figure 3. Zynq 7000 architecture, highlighting the tight integration of the processing system and programmable logic along with shared resources and peripherals.
Figure 4. Monophonic subtractive synthesis.
Figure 5. Polyphonic subtractive synthesis.
Figure 6. Monophonic FM synthesis.
Figure 7. Polyphonic FM synthesis.
Table 1. FourPole Vitis HLS synthesis report.
                     | BRAM | FF   | LUT  | DSP | Latency (Clock Cycles)
Value                | 0    | 1699 | 2724 | 8   | 423
% Used on Zybo Z7020 | 0%   | 1%   | 5%   | 3%  | 16.24%
Table 2. List of ModFPGA modules.
Module      | Description
ADSR        | Attack, decay, sustain, and release envelope
Butterworth | Bilinear transformation second-order Butterworth filter
Chorus      | Variable-delay chorus effect
Delay       | Variable-time delay line
FMOsc       | Frequency modulation single oscillator
Operator    | Frequency modulation carrier and modulator pair
FourPole    | Bilinear transformation four-pole ladder low-pass filter
Mixer       | Audio signal mixer
OscBank     | Bank of oscillators
PhaseGen    | Phase signal generator
Resonator   | Second-order resonator
ReverbSC    | Feedback delay network reverb
Tone        | First-order low-pass filter
Table 3. OscBank resource utilisation pipelining comparison.
Case             | FF   | LUT  | DSP | Latency (Clock Cycles)
Pipeline II = 1  | 7466 | 9790 | 50  | 121
Pipeline off     | 3619 | 5800 | 23  | 811
Pipeline II = 10 | 3963 | 5067 | 18  | 189
Table 4. Resource utilisation for subtractive and FM designs.
Case        | LUT | FF  | BRAM | DSP
Subtractive | 73% | 47% | 57%  | 68%
FM          | 76% | 47% | 70%  | 49%
Table 5. ReverbSC Vitis HLS synthesis report.
                     | BRAM | FF     | LUT    | DSP
Value                | 128  | 12,855 | 22,554 | 61
% Used on Zybo Z7020 | 45%  | 12%    | 42%    | 27%
Table 6. CPU load metrics.
Platform | CPU Usage
Bela     | 84%
Daisy    | 67%
Table 7. Latency comparison.
Platform          | Single-Note Latency (ms) | Polyphonic Latency (ms)
Bela              | 10.25                    | 11.5
Daisy             | 11.35                    | 14.75
Zynq FPGA SoC     | 6.55                     | 8.75
Makenoise 0-Coast | 10.75                    | N.A.
Table 8. Subtractive synthesiser IP latencies from Vitis HLS synthesis reports.
Module   | Latency (Clock Cycles)
OscBank  | 189
FourPole | 422
Mixer    | 19
ReverbSC | 121
Total    | 751
