2.1. Foundational Principles
The structures of and within biological organisms have inspired a range of computational principles that are increasingly applied in collision detection. Optic flow refers to the apparent motion of the visual scene as perceived by a vision system. In translational motion, nearby objects induce large optic flow amplitudes, whereas distant objects appear to move more slowly [13]. EMDs are computational structures typically used to estimate optic flow. Motion pattern recognition is another principle commonly used in robotic systems that can be computationally modelled on biological structures. EMDs act as directionally selective neurons (DSNs), as they are capable of detecting translational movement [7]. To detect the motion of objects towards or away from the observer, LGMDs employ looming detection [5]. This pronounced selectivity to particular motion features is characteristic of biological visual systems [8].
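The distance dependence of translational optic flow can be illustrated with a short numerical sketch. It assumes a standard geometric simplification that the text does not state explicitly: for an observer translating at speed v, a static point at distance d and bearing θ from the heading direction moves across the visual field at angular velocity (v/d)·sin θ. The function name and the example values are hypothetical.

```python
import math


def translational_flow(speed, distance, bearing_rad):
    """Angular image velocity (rad/s) of a static point seen by an observer
    translating at `speed`, under the assumed (v / d) * sin(theta) model."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return (speed / distance) * math.sin(bearing_rad)


# Two hypothetical objects seen side-on (bearing 90 degrees) at 1 m/s.
near = translational_flow(speed=1.0, distance=0.5, bearing_rad=math.pi / 2)
far = translational_flow(speed=1.0, distance=5.0, bearing_rad=math.pi / 2)
print(f"near: {near:.2f} rad/s, far: {far:.2f} rad/s")
```

Under this model the object at 0.5 m produces ten times the flow of the object at 5 m, which is the cue that flow-based collision detectors exploit.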
The biological principle of parallel processing, for instance of luminance increments (ON) and decrements (OFF) in separate pathways, is also built into EMDs and LGMDs. These parallel pathways map onset and offset responses to perceived changes in light intensity [8] and process the translational or directional motion captured by photoreceptors in parallel. They are valued for their computational efficiency, particularly with respect to mass and power, for their robustness in complex environments, for their biological plausibility and for their prominence in early-stage visual signalling. Biological ephemeral vision pathways are applied to dynamic vision sensors (DVS), which are used in event cameras to react asynchronously to variations in light intensity [8]. DVS benefit from inherently low latency, high dynamic range, and high temporal resolution when processing high-speed movements in dynamic environments [8].
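The ON/OFF separation and the event-based response described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the cited works: it assumes that the ON and OFF channels are obtained by half-wave rectifying the frame-to-frame luminance change, and that a DVS-like pixel emits an event when its log-intensity change exceeds a contrast threshold. The function names, the threshold value and the pixel values are hypothetical.

```python
import math


def on_off_split(prev_frame, curr_frame):
    """Half-wave rectify the luminance change into ON and OFF channels."""
    on, off = [], []
    for p, c in zip(prev_frame, curr_frame):
        delta = c - p
        on.append(max(delta, 0))    # luminance increments
        off.append(max(-delta, 0))  # luminance decrements
    return on, off


def dvs_events(prev_frame, curr_frame, threshold=0.2):
    """Emit (pixel_index, polarity) where the log-intensity change exceeds
    the assumed contrast threshold. Intensities must be positive."""
    events = []
    for i, (p, c) in enumerate(zip(prev_frame, curr_frame)):
        delta = math.log(c) - math.log(p)
        if delta >= threshold:
            events.append((i, 1))    # ON event
        elif delta <= -threshold:
            events.append((i, -1))   # OFF event
    return events


# Hypothetical three-pixel frames: pixel 0 brightens, pixel 2 darkens.
previous = [10, 50, 90]
current = [60, 50, 10]
on, off = on_off_split(previous, current)
events = dvs_events(previous, current)
print(on, off, events)
```

The unchanged middle pixel produces no event, which illustrates why event-based output is sparse for mostly static scenes.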
Insects such as locusts and flies are considered to be experts in motion perception, as they possess complex visual systems that can effectively segment visual stimuli and facilitate a unified response [14,15]. Photoreceptors [2] that detect visual stimuli initiate transduction, converting the stimuli into electrical signals to be interpreted by the insects' nervous systems, which comprise circuits and neural networks dedicated to the comprehension of motion. This capability allows insects to discern, prioritise, and respond appropriately to different potential collision scenarios based on their type, the risk they pose and their relevance to the task at hand. Broadly, research has established a relationship between neuroethology and computing, specifically between the perception evaluation circuits found in organisms and the robotic control systems that use biological principles to enhance robot navigation, collision avoidance and reactive decision-making.
2.4. Comparison, Hybridisation and Future Directions
The biological grounding and computational implementation of LGMDs, EMDs and SNNs differ. While LGMDs are high-fidelity physiological models of a biological structure, EMDs are better described as computational structures inspired by biological ones. LGMD and EMD models differ notably in the types of motion they detect: looming in the case of LGMDs and lateral motion in the case of EMDs [8]. In real-world robotics applications, this often allows the models to be deployed in a complementary fashion [7], such that impending-collision and translational motion signals can be perceived as a unit of computation. Tangentially, the neuroethological, spike-based nature that underpins information communication in SNNs [17] may extend the capabilities of SNNs themselves and of other bio-inspired computational models when SNNs provide the underlying mechanism for them.
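The directional selectivity of EMDs to lateral motion can be made concrete with a small sketch. It assumes the classic Hassenstein-Reichardt correlator form, which the text does not specify, and replaces the usual low-pass filter with a pure delay of one sample for simplicity. The function name and the two-photoreceptor stimuli are hypothetical.

```python
def reichardt_emd(left, right, delay=1):
    """Opponent correlator over two neighbouring photoreceptor signals.

    Each half multiplies the delayed signal of one receptor with the
    undelayed signal of the other; subtracting the mirrored half gives a
    signed, direction-selective output per time step.
    """
    output = []
    for t in range(delay, len(left)):
        rightward = left[t - delay] * right[t]
        leftward = right[t - delay] * left[t]
        output.append(rightward - leftward)
    return output


# A bright spot passing the left receptor first, then the right one.
rightward_response = sum(reichardt_emd([0, 1, 0, 0], [0, 0, 1, 0]))
# The same spot travelling in the opposite direction.
leftward_response = sum(reichardt_emd([0, 0, 1, 0], [0, 1, 0, 0]))
print(rightward_response, leftward_response)
```

The sign of the summed output encodes the direction of lateral motion. A purely looming stimulus, whose edges expand symmetrically, would drive mirrored detectors in opposite directions, which is one way to see why the two model families are complementary.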
The computational models share biological principles, including parallel processing of ON/OFF pathways, directional selectivity and feed-forward processing. However, as shown in Table 1, the models diverge in their movement processing selectivity, neural computation mechanisms, learning strategies and network complexity [8,17]. SNNs are more complex in these regards than LGMDs and EMDs. Relatively simple bio-inspired models such as LGMDs and EMDs, while inspired by complex biological neurons, are often computationally simplified to make them practical for real-world applications, including embedded neuromorphic systems [5]. Their strength lies in their resource efficiency for dedicated tasks [8]. However, limited accuracy in complex dynamic scenes remains a pervasive challenge.
SNNs appear to provide a promising framework [17] within which computational models such as LGMDs and EMDs may be combined and developed to communicate visual motion signals in potentially more complex, hardware- and resource-constrained environments [2,8], while maintaining reasonable proximity to both biology and computing [20]. Numerous studies support the integration of LGMDs and EMDs into SNNs to enhance robustness in real-world, dynamic environments. Research has tested the implementation of LGMDs and EMDs on real-time embedded systems for movement and perception tasks [5]. Such hybridisation has already been shown to improve accuracy in cluttered scenes and enhance responsiveness to varied motion cues [24].
A hybrid LGMD-SNN model could embed the core architecture of the LGMD (photoreceptors, lateral inhibition, and spike generation) within a spiking neural framework. Each layer of the LGMD could be implemented using spiking neurons such as LIF units, enabling the system to maintain biological fidelity while also supporting integration into larger SNN pipelines. One design might treat the LGMD structure as a feature extractor within a broader SNN, feeding collision-related spiking events into downstream decision-making or control layers. Alternatively, LGMD and EMD streams could function as parallel pathways within a spiking architecture, with integration or competition occurring in higher layers. Such an architecture would allow for improved selectivity (e.g., LGMD1 for light and dark objects, LGMD2 for dark objects only), modularity, and online learning in environments with fluctuating complexity [3]. Another design could combine EMDs with deep SNNs, introducing the potential for hierarchical motion feature extraction [22], in which EMD outputs are processed by deeper spiking layers to support complex visual tasks such as trajectory estimation or obstacle tracking. This could significantly expand the utility of EMDs beyond simple motion detection, especially when paired with event-based sensors or neuromorphic cameras [6]. Similarly, integrating feedback mechanisms into LGMD-based SNNs (as in F-LGMD variants) can enhance adaptability and robustness. Feedback can help modulate gain control, suppress irrelevant motion, and support context-sensitive collision detection. These hybrid systems more closely mirror biological circuits and reduce the need for hand-tuned thresholds or hard-coded logic, making them suitable for deployment in real-world, reactive control tasks [17].
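The idea of an LGMD front end driving a spiking unit can be sketched as follows. This is a heavily simplified illustration under stated assumptions, not any of the cited models: the input is one-dimensional, the photoreceptor layer is an absolute frame difference, inhibition is the previous step's excitation spread to the two nearest neighbours, and the output cell is a single leaky integrate-and-fire (LIF) neuron updated with a forward-Euler step. All function names, weights, time constants and thresholds are hypothetical.

```python
def frame(width, dark):
    """Hypothetical 1D frame: 1.0 is bright background, 0.0 is dark object."""
    return [0.0 if i in dark else 1.0 for i in range(width)]


def lgmd_drive(frames, w_inhib=0.5):
    """Return one summed, inhibition-corrected excitation per frame transition."""
    drive = []
    prev_exc = None
    for prev, curr in zip(frames, frames[1:]):
        # Photoreceptor layer: absolute luminance change per pixel.
        exc = [abs(c - p) for p, c in zip(prev, curr)]
        n = len(exc)
        if prev_exc is None:
            inhib = [0.0] * n
        else:
            # Inhibition layer: last step's excitation spread to the two
            # nearest neighbours (zero-padded at the borders).
            padded = [0.0] + prev_exc + [0.0]
            inhib = [(padded[i] + padded[i + 2]) / 2 for i in range(n)]
        # Summation layer: rectified excitation minus weighted inhibition.
        drive.append(sum(max(e - w_inhib * h, 0.0) for e, h in zip(exc, inhib)))
        prev_exc = exc
    return drive


def lif_spikes(currents, tau=5.0, v_thresh=3.0, v_reset=0.0, dt=1.0):
    """Forward-Euler LIF neuron: leak, integrate, spike and reset."""
    v = 0.0
    spikes = []
    for current in currents:
        v += dt * (-v / tau + current)
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return spikes


# A dark object whose width grows by 1, 2, then 4 pixels (accelerating expansion).
looming = [frame(9, set()), frame(9, {4}), frame(9, {3, 4, 5}),
           frame(9, set(range(1, 8)))]
# The same object held at a constant size.
static = [frame(9, {4})] * 4

looming_drive = lgmd_drive(looming)
looming_spikes = lif_spikes(looming_drive)
static_spikes = lif_spikes(lgmd_drive(static))
print(looming_drive, looming_spikes, static_spikes)
```

With these assumed parameters the LIF unit stays silent for the static object and fires only once the expansion accelerates. Because the output is a spike train, it could in principle be passed to downstream spiking layers, as the designs above suggest.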
The implementation and evaluation of these biologically inspired models contribute meaningfully to the advancement of adaptive and efficient robotic systems across a range of applications. LGMD-based architectures, for example, are particularly effective for rapid frontal obstacle detection, making them highly suitable for UAVs and micro-robots, where low-latency collision avoidance is essential [4]. In parallel, EMD-inspired models provide robust perception of translational motion and optic flow, which are foundational for navigation tasks such as obstacle avoidance, visual odometry, and landing control on both aerial and ground-based robotic platforms [8]. The integration of these vision models within energy-efficient SNNs, especially when deployed on neuromorphic hardware [9], enables scalable, real-time control under power constraints, offering a compelling solution for autonomous vehicles and swarm robotics, where efficiency is paramount [25]. Moreover, incorporating EMD or LGMD modules into multi-agent robotic systems supports decentralised coordination by leveraging local visual cues for inter-agent collision avoidance, thus enhancing group-level intelligence without the need for centralised oversight [3].
LGMDs and EMDs demonstrate computational efficiency and robustness, particularly in low-resource or cluttered settings, but are inherently limited by fixed architectures and threshold dependencies. SNNs can extend these capabilities by supporting sparse, event-based communication and hierarchical processing of spatiotemporal information. Comparing LGMD, EMD, and a hybrid SNN that integrates their mechanisms makes it possible to evaluate trade-offs in accuracy, responsiveness, and deployability. Such a comparative study is motivated by the models' shared biological inspiration, their differing sensitivities to motion types, and the increasing demand for scalable, neuromorphic solutions in real-world autonomous navigation.
This study aims to critically investigate the extent to which these biologically inspired models (LGMD, EMD, and a hybrid SNN) can contribute to robust and efficient collision detection in visually dynamic environments. By systematically evaluating their performance on shared datasets and comparing their outputs across key metrics such as responsiveness, accuracy, and resource efficiency, the research explores not only the relative strengths and limitations of each approach but also the potential advantages of integrating biologically grounded architectures within trainable, spike-based neural frameworks. In doing so, the study seeks to establish how well such models scale under realistic visual conditions and what role hybridisation plays in advancing biologically inspired vision in dynamic scenarios.