FPGA and SoC Devices Applied to New Trends in Image/Video and Signal Processing Fields

Field-programmable gate arrays (FPGAs) and, recently, System on Chip (SoC) devices have been applied in different areas and fields for the past 20 years. [...]


Introduction
Field-programmable gate arrays (FPGAs) and, recently, System on Chip (SoC) devices have been applied in different areas and fields for the past 20 years. The roadmaps initially planned for deploying them in electronic devices and applications have been widely exceeded thanks to the high performance they can achieve. Other improvements, such as scalability, reconfigurability and affordability, have broadened the range of designers using these devices. Nowadays, embedded processors are available in FPGA/SoC devices, ready to be used in signal processing applications, video analysis algorithms, etc. Specific modules can be developed using the reconfigurable hardware and combined with a standard processor system, all on one die/circuit.
Currently, many applications and algorithms executed on conventional computing architectures are being redefined to fully exploit the parallelism of hardware systems co-designed with pieces of software executed on one or more standard processors. Today the most important applications based on FPGA/SoC devices are focused on the image and signal processing areas and related topics, such as embedded smart video systems and networks on a chip oriented to signal/image processing. Besides new trends in the image/video processing area, new signal modulation/codification techniques and industrial applications aimed at reducing execution time are being researched. Moreover, these kinds of devices have enlarged the portfolio of products that can be used in smart sensors to collect, process and send processed data. All of the above is made possible by new high-level languages and novel tools, which greatly improve algorithm processing performance and speed up the development process.
This Special Issue is centered on the aforementioned areas directly linked to FPGA/SoC-related topics. The received works have been handled and reviewed according to the rigorous criteria of the journal, in a blind peer-review process with three reviews for each received paper, and a paper was accepted only if every reviewer was positive about it. As a result of this rigorous selection, only around 25% of the submitted works have been accepted, preserving and guaranteeing the quality of the publication.

Summary of the Special Issue
The research lines related to speeding up image processing algorithms executed on FPGA/SoC devices have been growing in recent years. Thus, the different proposals to apply and enhance their use have evolved significantly. In this Special Issue, the published works include novel and relevant topics such as: real-time hardware implementation of motion detection for vision-based automated surveillance; traffic sign detection based on color segmentation combined with a SURF (speeded-up robust features) feature extractor and a nearest-neighbor classifier; and color edge detection using a dedicated geometric algebra (GA) co-processor.
On the other hand, generic signal processing systems have been proposed to process data in different research fields such as speech recognition, speaker identification/verification, musical instrument identification, biomedical signal classification, bio-acoustic signal recognition, coincidence event detection, etc. Sometimes signal processing requires the analysis of multiple signal features, so a typical pattern recognition system is composed of two main blocks: feature extraction and classification. Other times, signal processing requires analyzing physical events associated with time, which is a continuous parameter. This Special Issue includes works related to bio-acoustic signals, event detection and sound recognition.
This editorial summarizes the contents of three works related to the topic of image processing and another three related to signal processing systems.

Relevant Contributions Related to Image Processing
In the field of video surveillance, besides the development of high-level computer vision algorithms, VLSI (Very Large Scale Integration) designers focus their research on efficient hardware design to address the real-time implementation of low-level motion detection algorithms. There are different hardware approaches to motion detection, and they can be classified according to the design methodologies and tools used. In terms of design methodology, we find approaches such as: digital signal processors, complex programmable logic devices, ASICs, FPGA-based hardware design, FPGA-based hardware/software co-design, etc. In terms of design tools, the differences lie in the use of: VHDL/Verilog RTL languages, Handel-C or SystemC high-level languages, MATLAB-Simulink simulation/implementation software, etc.
Another trending topic is related to video systems for car driver assistance. Emerging technologies are supporting a huge change in car equipment and in driver assistance, as we are beginning to see in the development of self-driving cars. One of the most important components of advanced driver assistance systems is traffic sign recognition, which enables the car to recognize road signs in real-world environments. The main challenges for robust detection stem from environmental conditions, such as lighting, weather, background colors and occlusions. Moreover, real-time operation is another big challenge for traffic sign recognition systems. In recent years, several solutions for this task have been proposed using different types of hardware (standard computers, ASICs, FPGAs, DSPs, GPUs, etc.), and different traffic sign recognition algorithms have been implemented.
Nowadays, there is an increasing demand for faster and more efficient image/video processing systems. Many algorithms are based on feature extraction, such as edge detection. This kind of low-level image processing is too slow if it is carried out in software, so different hardware processor systems have been proposed. For example, a geometric algebra co-processor that can be applied to this particular edge detection task has recently been proposed. Edge detection is one of the most basic operations in image processing and can be applied to both gray-scale and color images. For gray-scale images, edges are defined as discontinuities in the brightness function, whereas in color images they are defined as discontinuities along adjacent regions in the RGB color space.
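The gray-scale definition of an edge as a brightness discontinuity can be illustrated with a minimal software sketch. It uses the conventional Sobel gradient operator, not the geometric algebra method discussed later in this editorial, and the function name and test image are assumptions made for illustration only.

```python
def sobel_edge_strength(image):
    """Gradient magnitude of a gray-scale image given as a 2-D list.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal brightness change
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical brightness change
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    p = image[r + dr - 1][c + dc - 1]
                    gx += kx[dr][dc] * p
                    gy += ky[dr][dc] * p
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


# Hypothetical test image: a vertical step from dark (0) to bright (255).
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
strength = sobel_edge_strength(image)
```

The response is non-zero only in the two columns that straddle the step. Every output pixel depends only on its own 3×3 neighborhood, which is why this kind of low-level operation maps well onto parallel hardware.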

Motion Detection
Motion detection is one of the most complex components of an automated video surveillance system. When it is intended for a standalone system, a motion detection technique must be not only accurate and robust but also economical in its use of hardware computational resources, because other complex algorithms must run on the same hardware. In order to reduce the computational complexity, several researchers have proposed a block-based clustering solution, with very-low-complexity processing, to detect motion in the video sequence. This scheme is robust enough to handle the pseudo-stationary nature of the background, and it also lowers the computational complexity notably, so it is well suited to designing standalone real-time systems. The work developed in [1] focuses on implementing this algorithm for real-time motion detection in automated surveillance systems.
The authors propose a dedicated VLSI architecture for a clustering-based motion detection scheme, with the aim of designing a complete real-time standalone motion detection device. The implemented prototype includes an input camera interface, an output display interface and a motion detection VLSI architecture, forming a ready-to-use automated video surveillance system. The developed test platform (implemented on an ML510 board from Xilinx, San Jose, CA, USA) detects relevant motion in real time for standard video streams (720 × 576 pixels at 25 frames per second) coming directly from a standard camera. The system uses only a small share of the FPGA resources (0.6% to 4%), except for internal block memories (57.38%). The test results show that the system is robust enough to detect only the relevant motion in a live video scene, and that it eliminates the continuous undesirable movements in the video background.
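The block-based idea can be illustrated with a minimal Python sketch. It is not the clustering scheme of [1]: it swaps in plain per-block differencing against a fixed background frame, and the function name, block size and threshold are assumptions chosen for illustration.

```python
def block_motion_mask(background, frame, block=4, threshold=20.0):
    """Flag each block whose mean absolute difference from the background
    exceeds the threshold. Frames are 2-D lists of gray-scale values."""
    rows, cols = len(frame), len(frame[0])
    mask = []
    for r0 in range(0, rows, block):
        mask_row = []
        for c0 in range(0, cols, block):
            total = 0
            count = 0
            for r in range(r0, min(r0 + block, rows)):
                for c in range(c0, min(c0 + block, cols)):
                    total += abs(frame[r][c] - background[r][c])
                    count += 1
            mask_row.append(total / count > threshold)
        mask.append(mask_row)
    return mask


# Hypothetical 8x8 scene: a bright object appears in the lower-right block.
background = [[10] * 8 for _ in range(8)]
frame = [row[:] for row in background]
for r in range(4, 8):
    for c in range(4, 8):
        frame[r][c] = 200

mask = block_motion_mask(background, frame)
```

Working on blocks rather than individual pixels reduces the number of decisions per frame by a factor of the block area, which is the kind of saving that makes a standalone real-time design feasible.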

Traffic Signs Recognition
Traffic sign recognition (TSR) systems generally comprise two parts: sign detection and sign recognition/classification. Many approaches to sign detection transform the image to be processed into an alternative color space such as normalized RGB, hue saturation, hue saturation enhancement, etc. For sign recognition, several feature extraction methods have been applied: Canny edge detection, scale-invariant feature transform (SIFT), speeded-up robust features (SURF) and histogram of oriented gradients (HOG). Typically, image features are extracted for the subsequent machine learning stage, which is used for sign classification. Support vector machines (SVMs) and neural networks (NNs) are popular classifiers. In [2], a new TSR algorithm flow is proposed that is highly robust against environmental challenges such as partially obscured, rotated and skewed traffic signs. Another critical component is the embedded implementation of the algorithm on a programmable logic device, which can enable real-time operation. The proposed work is based on previous research by the same authors, which introduced a programmable hardware platform for TSR. This study shares the sign detection steps, but introduces a new sign recognition algorithm based on feature extraction and classification steps, together with its corresponding hardware implementation.
The system shows robust detection performance even for rotated or skewed signs. False classification rates can be reduced to less than 1%, which is very promising. The proposed TSR system, which follows a hardware/software co-design, is implemented on Xilinx's ZedBoard. An ARM CPU framework based on the AXI interconnect is developed for custom IP design and testing. Overall, the system throughput is eight times that of the authors' previous design based on the Virtex 5 FPGA, when considering both IP hardware execution and the algorithms implemented in software. The current execution time of 992 ms may not be sufficient for real-time operation, but the FPGA resource usage is very low, leaving room to implement more complex algorithms in hardware and improve their computation times.
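As an illustration of why detection stages move to an alternative color space, the following minimal Python sketch converts pixels to normalized RGB, where each channel is divided by the sum of the three. It is not the detection stage of [2]; the function names and the 0.5 red threshold are assumptions for illustration only.

```python
def normalized_rgb(pixel):
    """Map an (R, G, B) pixel to chromaticity coordinates that sum to 1.

    A black pixel has no defined chromaticity; (0.0, 0.0, 0.0) is returned
    for it by convention in this sketch.
    """
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (r / total, g / total, b / total)


def is_red_candidate(pixel, min_red_share=0.5):
    """Hypothetical segmentation rule: keep pixels dominated by red."""
    return normalized_rgb(pixel)[0] >= min_red_share


# The same red surface under bright and dim lighting (hypothetical values).
bright = (200, 50, 50)
dim = (100, 25, 25)
gray_road = (90, 90, 90)
```

Both the bright and the dim pixel map to the same normalized coordinates, so a single threshold keeps working when the overall illumination changes, which addresses one of the environmental challenges listed above.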

Geometric Algebra Methods in Edge Detection
Convolution and correlation techniques are quite common in image processing algorithms for scalar fields. Techniques that identify the edges and critical features of an image using the rotational and curvature properties of vector fields are becoming popular. A combination of the scalar and vector field techniques has been extended to vector fields for signal analysis. Geometric algebra (GA) methods were introduced for image processing when it was shown that hyper-complex convolution is in fact a subset within GA. Recent works have shown that the GA-based convolution technique obtains outstanding results, and also that GA is an effective method for edge detection and can provide a metric for edge strength.
In [3], the above ideas have been extended further by introducing color vector transformations and a different subspace within GA. The paper presents an overview of the geometric algebra fundamentals and of convolution operations involving rotors for image processing applications. The discussion shows that the convolution operation with rotor masks within GA belongs to a class of linear vector filters and can be applied to image or speech signals. The use of the ASIC GA co-processor for rotor operations is described, along with its potential for other applications in computer vision and graphics. The proposed hardware architecture is tailored to image processing applications and provides acceptable application performance. The usefulness of this new approach was shown by analyzing and implementing three different edge detection algorithms. The qualitative analysis of the edge detection algorithms shows the usefulness of GA-based computations within image processing applications and details the improvement for color edge detection. The analysis describes not only the execution times, but also the trade-offs that can be made in terms of resources, area and timing. Overall, the proposed GA co-processor is approximately an order of magnitude faster than previously published hardware implementations of edge detection.

Relevant Contributions Related to Signal Analysis
In underwater bio-acoustic signal processing, feature extraction is the process that transforms the pattern signal from its original high-dimensional representation into a lower-dimensional space, while the classification task tries to associate the signal pattern with one of several predefined classes. To carry out the feature extraction phase, different works use algorithms such as: discrete Fourier transform, discrete wavelet transform, time-frequency contours, etc. On the other hand, the following algorithms have been used for the recognition/classification task: artificial neural networks (ANNs), support vector machines (SVMs), Gaussian mixture models (GMMs), Fisher's discriminant analysis (FD), k-nearest neighbor, etc.
Another important research topic is the acquisition of bio-impedance data. Bio-impedance measurement has become a popular method to determine the characteristics of a particular tissue. This method has been extensively applied in the determination of body composition. By performing time-resolved measurements, it is further possible to measure the impedance changes that happen during the arrival of a pulsed wave. Knowing the bio-impedance at various points around an object under analysis, the internal conductivity distribution can be estimated. This advanced method is known as electrical impedance tomography (EIT). The first clinical applications of EIT were respiration monitoring and breast cancer detection.
Measuring a continuous parameter, such as time, requires some analog-to-digital conversion, which introduces quantization errors. Technically, time coincidence refers to the occurrence of two events within a defined time span, called the coincidence window. The detection of certain physical events requires a very high resolution in time. A good example is the emission of gamma-ray pairs from medical radioisotopes. These radioisotopes are injected into a human body. Detecting a pair of the emitted gamma rays allows conclusions to be drawn about a patient's medical status. For this task, positron emission tomography (PET) is an approach that detects these rays. The timing of the detection is crucial for determining the origin of the gamma rays, and subsequent processing stages can reconstruct the position of the event. These post-processing stages are usually implemented in software, and produce colored images that provide information about that specific area of the body.
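The definition of a coincidence window can be made concrete with a minimal Python sketch that pairs timestamps from two detector channels. This is a software illustration only, not the hardware detector discussed later; the function name, the picosecond units and the sample timestamps are assumptions.

```python
def find_coincidences(times_a, times_b, window):
    """Pair events from two sorted timestamp lists that lie within
    `window` of each other. Each event is used in at most one pair."""
    pairs = []
    i = j = 0
    while i < len(times_a) and j < len(times_b):
        dt = times_a[i] - times_b[j]
        if abs(dt) <= window:
            pairs.append((times_a[i], times_b[j]))
            i += 1
            j += 1
        elif dt < 0:
            i += 1      # event on channel A is too early, skip it
        else:
            j += 1      # event on channel B is too early, skip it
    return pairs


# Hypothetical timestamps in picoseconds from two detector channels.
channel_a = [100, 500, 900]
channel_b = [102, 650, 899, 1500]
pairs = find_coincidences(channel_a, channel_b, window=5)
```

With a 5 ps window only the events near 100 ps and 900 ps are paired. Widening the window accepts more pairs but also more accidental coincidences, which is why the achievable window width matters for image quality.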

Automatic Blue Whale Calls Recognition
Analysis of underwater bio-acoustic signals has been the object of numerous studies, where several approaches have been proposed to recognize and classify these signals. Different animal species can be recognized by their specific sounds. Marine mammals are highly vocalizing animals; among them, the blue whales produce regular and powerful low-frequency (<100 Hz) vocalizations that can propagate over distances exceeding 100 km. The North Atlantic blue whales produce different specific vocalizations: the stereotypical infrasonic calls (15-20 Hz) and more variable audible calls (35-120 Hz). From blind recordings at a given location, these calls can be used to identify the species.
To date, no hardware architecture has been proposed in the literature to recognize bio-acoustic signals.
In [4], the authors propose a hardware implementation of an automatic blue whale call recognition system based on the short-time Fourier transform (STFT) and the well-known multilayer perceptron (MLP) neural network. Based on a previous algorithm developed in Matlab, the proposed architecture has been optimized and implemented on an FPGA using the Xilinx System Generator (XSG) and a Nexys-4 Artix-7 FPGA board. This architecture takes advantage of the native parallelism of FPGA chips, instead of using sequential processors implemented on these chips. The classification performance of the fixed-point XSG/FPGA implementation is compared with that obtained using floating-point values. The XSG- and Matlab-based implementations performed exactly the same, obtaining a 100% true-match. The paper's main contributions are, on the one hand, the optimization of the MLP-based classifier by reducing the number of its hidden neurons and, on the other hand, the development of XSG-based models of the mathematical equations describing the characterization/classification algorithm to be implemented on the FPGA, with a significant reduction of the required resources achieved by shortening the fixed-point data format.
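The trade-off between fixed-point word length and accuracy can be sketched in a few lines of Python. This is not the XSG model of [4]: the helper names, the 16-bit word, the fractional widths and the weights are all hypothetical values chosen to show how the weighted sum of a single neuron changes when its weights are quantized.

```python
def to_fixed_point(value, frac_bits, total_bits=16):
    """Quantize to a signed fixed-point grid with `frac_bits` fractional
    bits, saturating at the limits of a `total_bits`-wide word."""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    raw = max(lowest, min(highest, round(value * scale)))
    return raw / scale


def weighted_sum(weights, inputs):
    """Pre-activation output of one neuron (bias omitted for brevity)."""
    return sum(w * x for w, x in zip(weights, inputs))


# Hypothetical weights; the inputs are exactly representable in both formats.
weights = [0.7312, -0.2541, 0.1187]
inputs = [0.5, 0.25, -0.75]

reference = weighted_sum(weights, inputs)
error_8 = abs(reference - weighted_sum(
    [to_fixed_point(w, 8) for w in weights], inputs))
error_4 = abs(reference - weighted_sum(
    [to_fixed_point(w, 4) for w in weights], inputs))
```

Dropping from 8 to 4 fractional bits enlarges the error of the weighted sum in this example. A hardware design can shorten the data format, and so save resources, only as long as such errors do not change the classification result.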

Complex Bio-Impedance Measurements
In bio-impedance imaging, reconstructing an image of the conductivity distribution requires several different transfer-impedance measurements. Since the impedances change over time, it is fundamental to measure all the transfer impedances needed for a single image within a short fraction of time, for example by implementing identical measurement circuits for each of the transfer impedances. Alternatively, by using multiplexers, all transfer impedances of interest can be measured successively. If the measurement of a complete image is much faster than the impedance changes, the influence of the multiplexing time lags can be neglected. Multiple research works have used this kind of multiplexed EIT system. With common multi-frequency detection, it is possible to measure the magnitudes and phases of the impedances over time, usually using one chosen excitation frequency or the sum of a few sinusoidal waves with different frequencies. However, the EIT system introduced in [5] measures the complete spectrum ranging from 10 to 380 kHz for each transfer impedance over time, providing an additional dimension of information for the subsequent image reconstruction.
In [5], the authors present an FPGA-based multi-frequency EIT system for time-resolved measurements of conductivity distributions in living tissue. The system acquires complex impedance values very precisely. To demonstrate its characteristics, the system was verified against an impedance phantom, and real physiological measurements were carried out. In addition, the system can use different excitation signal forms, such as sinusoidal, chirp or rectangular, with a frequency range of up to 500 kHz. The excitation currents are adjustable from 10 µA to 5 mA. Each impedance measurement can be carried out with up to 16 multiplexed current and voltage channels. The achieved frame rate is sufficient to resolve the slow breathing of a subject. Nevertheless, to analyze faster events such as heart rate-related quantities, the frame rate should be increased.
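How a magnitude and a phase are obtained at one excitation frequency can be sketched with a single-bin Fourier correlation in Python. This is a generic demodulation sketch, not the signal chain of [5]; the function names, the 1 MHz sample rate and the 500 Ω, -30° test impedance are assumptions for illustration.

```python
import cmath
import math


def phasor(samples, freq, sample_rate):
    """Complex amplitude of the `freq` component of a sampled signal.

    Exact when the record spans a whole number of periods of `freq`.
    """
    acc = 0j
    for k, s in enumerate(samples):
        acc += s * cmath.exp(-2j * math.pi * freq * k / sample_rate)
    return 2 * acc / len(samples)


def measure_impedance(voltage, current, freq, sample_rate):
    """Complex impedance Z = V / I at the excitation frequency."""
    return phasor(voltage, freq, sample_rate) / phasor(current, freq, sample_rate)


# Hypothetical test case: 1 mA excitation at 50 kHz through a 500 ohm
# impedance with a -30 degree phase, sampled at 1 MHz for 10 periods.
freq, sample_rate, n = 50e3, 1e6, 200
i_amp, z_mag, z_phase = 1e-3, 500.0, math.radians(-30.0)
current = [i_amp * math.cos(2 * math.pi * freq * k / sample_rate)
           for k in range(n)]
voltage = [z_mag * i_amp * math.cos(2 * math.pi * freq * k / sample_rate + z_phase)
           for k in range(n)]

z = measure_impedance(voltage, current, freq, sample_rate)
magnitude, phase_deg = abs(z), math.degrees(cmath.phase(z))
```

Repeating the same correlation at each frequency of interest yields an impedance spectrum. The multiply-accumulate loop is regular and independent per frequency and per channel, which suits a parallel FPGA implementation.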

Precision Events Coincidence Detector
In gamma-ray event detection, it must be considered that the rays travel at the speed of light, so the timing requirements are very hard to meet. The quality of the resulting image depends directly on the available time resolution, which is about 1 ns or below in state-of-the-art time-of-flight PET systems. Since these resolutions are still quite challenging for software-based systems, the PET field still sees significant activity in the development of hardware-based coincidence detectors. A common characteristic of the existing approaches is that they all use logic gates, e.g., AND gates, as their basic processing elements. It is well known that these gates process input signals based on their actual voltage levels rather than on signal transitions.
In [6], the authors propose a different approach based on a new type of RS latch. The novel latch processes the voltage changes of the input signals and is thus edge-triggered in a certain sense. The new circuit is called the coincidence detector latch (CDL); it is able to store the edge event rather than losing it, as happens with an AND gate, whose output goes low once its inputs go low. This special latch structure is able to operate as a coincidence detector with a coincidence window in the range of a few hundred picoseconds; it achieves a coincidence window width as short as 115 ps (more than 10 times better than the AND-gate-based detectors reported in recent research works). Its advantages are its low complexity, full synthesizability and high scalability. The proposed CDL was implemented in a field-programmable gate array (FPGA) and thoroughly tested in a physical laboratory setup. The CDL can be described with standard VHDL constructs and is fully synthesizable by common synthesis tools. Despite utilizing six logic gates, it consumes only two configurable logic elements of an off-the-shelf FPGA. Even a vintage Cyclone II FPGA development board can host tens of thousands of CDLs.
The integration of the proposed CDLs into a commercial system requires, for example, cooperation with a selected PET system developer. Such a system integration would include long-term stability tests in order to evaluate the effects of temperature changes, radiation exposure, the event detector's precision, etc.