Article

Ball-Catching System Using Image Processing and an Omni-Directional Wheeled Mobile Robot

Department of Engineering Science, National Cheng Kung University, Tainan 701401, Taiwan
* Author to whom correspondence should be addressed.
Sensors 2021, 21(9), 3208; https://doi.org/10.3390/s21093208
Submission received: 26 February 2021 / Revised: 30 April 2021 / Accepted: 30 April 2021 / Published: 5 May 2021
(This article belongs to the Special Issue Perceptual Deep Learning in Image Processing and Computer Vision)

Abstract

The ball-catching system examined in this research was composed of an omni-directional wheeled mobile robot and an image processing system that included a dynamic stereo vision camera and a static camera, and it was used to capture a thrown ball. The thrown ball was tracked by the dynamic stereo vision camera, and the omni-directional wheeled mobile robot was navigated using the static camera. A Kalman filter with deep learning was used to reduce the visual measurement noise and to estimate the ball’s position and velocity. The ball’s future trajectory and landing point were predicted from the estimated position and velocity. Feedback linearization was used to linearize the omni-directional wheeled mobile robot model and was then combined with a proportional-integral-derivative (PID) controller. The visual tracking algorithm was first verified through numerical simulation, and the performance of the designed system was then verified experimentally. We verified that the designed system was able to precisely catch a thrown ball.

1. Introduction

The technique of using visual information in a feedback control loop to precisely control the motion, position, and posture of a robot is called visual servoing. Visual servoing is well known and commonly used in both dynamic and unstructured environments. Visual servoing tasks in which a robot must catch a flying ball present a particular challenge.
In robotic ball-catching, there are several variations in the system configurations, methods of implementation, trajectory prediction algorithms, and control laws. In [1], the stereo camera system (two PAL cameras) was placed above the work area to catch a flying ball. In [2], two cameras were placed at the top and left of the work area for ping-pong ball catching and ball juggling. In [3], the cameras were located behind the robot for catching in-flight objects with uneven shapes. In [4], a high-speed vision system with two cameras was used for a ball juggling system.
In the above research, fixed cameras were used to predict the trajectory of the target object in the work area. The advantage is that the background image was fixed, and the image processing was relatively simple; however, it is difficult to achieve the desired results in an open space due to the field of view (FoV) constraints associated with the camera system and the limited workspace of a robotic arm.
In [5], an autonomous wheelchair mounted with a robotic arm used two vision sensors to accurately determine the location of the target object and to pick up the object. In [6], a robotic system endowed with only a single camera mounted in eye-in-hand configuration was used for the ball catching task. In [7], a robot with an eye-in-hand monocular camera was used for autonomous object grasping. The methods of [5,6,7] can resolve the stated problem. The eye-in-hand concept enables the camera to maneuver along with the robotic arm, which improves the trajectory prediction precision in an open space, especially when the object is near the robot. Another method by which to catch a ball in a wide-open space is to combine a static stereo vision system with a mobile robot to complete the ball-catching task, as described below.
In [8], the ball was tracked using two ceiling-mounted cameras and a camera mounted on the base of the robot. Each camera performed ball detection independently, and the 3D coordinates of the ball were then triangulated from the 2D image locations. In [9], a robotic system consisting of a high-speed hand-arm, a high-speed vision system, and a real-time control system was presented. A method was proposed to estimate the 3D position and orientation by fusing the information observed by a vision system with the contact information observed by tactile sensors.
In [10], the robot maintained a camera fixation that was centered on the image of the ball and kept the tangent of the camera angle rising at a constant rate. The performance advantage was principally due to the higher gain and effectively wider viewing angle when the camera remained centered on the ball image. In [11], a movable humanoid robot with an active stereo vision camera was designed and implemented for a ball-catching task. Two flying balls were caught by mobile humanoid robots at the same time in a wide space.
The prediction of the trajectory is an important factor for the effective capture of a flying ball. Most methods assumed that a free-flying object’s dynamic model was known. A parabola in 3D space was used to model the trajectory of a flying ball, and the least-squares method was used to estimate the model parameters.
In [10], the flying ball trajectory was predicted using a non-linear dynamic model that included air resistance with different parameter estimation algorithms. In [6], an estimate of the catching point was initially provided through a linear algorithm. Then, additional visual measurements were acquired to constantly refine the current estimate by exploiting a nonlinear optimization algorithm and a more accurate ballistic model with the influence of air resistance, visual measurement errors, and rotation caused by ground friction. One of the typical uses of the Kalman filter is in navigation and positioning technology.
In [12], an automated guided vehicle was combined with a Kalman filter with deep learning. The system was found to have good adaptability to the statistical properties of the noise system, which improved the positioning accuracy and prevented filter divergence. In our previous work [13], we provided a brief overview of an effective ball-catching system with an omni-directional wheeled mobile robot. In this paper, we extend the ball trajectory estimation method to improve the system’s ball-catching performance. Furthermore, the experiments and application are described in detail.
In this research, we developed a combined omni-directional wheeled mobile robot and a multi-camera vision system to catch a flying ball in a large workspace. Figure 1 provides a schematic diagram of the proposed system.
To maneuver the robot, an omni-directional wheeled platform was selected for its superior mobility. This robot can perform translational or rotational movements or any combination of the two. An active stereo vision system, consisting of two cameras on a pan-and-tilt platform, performed the visual tracking of the flying ball. A static vision camera was used to navigate the mobile robot to the ball’s touchdown point. Real-time execution of the image processing algorithms and control laws is necessary to accomplish these visual servoing tasks.
Therefore, digital signal processors (DSPs) were used to ensure real-time execution of these tasks. Noise from the environment or from other sources, such as the measurement noise of the vision systems, can degrade the performance of the proposed visual servoing system. In this work, the position and velocity of the ball were estimated using the Kalman filter and a linear dynamic model of a flying ball. The use of deep learning in the Kalman filtering improved both the accuracy and the robustness of the results. In this paper, the experimental setup is presented, and simulation and experimental results are provided to demonstrate the performance of the proposed system.
The remainder of this paper is organized as follows: Section 2 describes the relationship between the coordinate systems used in this work (specifically, the world and image coordinates) and introduces the image processing algorithms. Section 3 describes the active stereo vision camera. Section 4 describes the trajectory estimation and prediction method for a flying ball. Section 5 discusses the control law of the omni-directional wheeled mobile robot. Section 6 presents the implementation of the designed system. Section 7 presents the simulation and experimental results. Finally, Section 8 provides our concluding remarks.

2. Image Processing and Visual Measurement

Vision systems are used to obtain the location of the ball and robot in a three-dimensional (3D) Cartesian coordinate system. The position determination of an object from an image is based on a pinhole camera model of the vision system [14,15]. The position of an object can be given in the vector form as
\lambda \mathbf{p} = K_I \left[ R_E \mid T \right] P_o
where the vector p = [x_im  y_im  1]^T represents the 2D homogeneous coordinates of an image point in the image coordinate system, and the vector P_o = [X_W  Y_W  Z_W  1]^T represents the 3D homogeneous coordinates of a target object point in the world coordinate system. The 3-by-3 matrix R_E and the 3D vector T are the external camera parameters that define the rotation and translation between the world frame and the camera frame, respectively. λ is a scaling factor, and K_I is the internal parameter matrix of the camera. It is given by
K_I = \begin{bmatrix} f s_x & f s_\theta & o_x \\ 0 & f s_y & o_y \\ 0 & 0 & 1 \end{bmatrix}.
In the matrix K_I, (o_x, o_y) is the principal point in the image coordinate system in pixels; f s_x and f s_y are the focal length expressed in horizontal and vertical pixel units, respectively; and f s_θ is the skewness factor of the pixels. The camera calibration procedure in [14] was used to calibrate the internal camera parameters beforehand.
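As a concrete illustration, the projection in Equation (1) can be evaluated numerically as in the short sketch below. The calibration values, rotation, and translation used here are placeholders chosen only to make the example run; they are not the parameters of the cameras used in this work.

import numpy as np

# Illustrative internal parameter matrix K_I = [[f*s_x, f*s_theta, o_x],
#                                               [0,     f*s_y,     o_y],
#                                               [0,     0,         1  ]]
K_I = np.array([[800.0,   0.5, 320.0],
                [  0.0, 800.0, 240.0],
                [  0.0,   0.0,   1.0]])

R_E = np.eye(3)                          # rotation from world to camera frame
T = np.array([[0.0], [0.0], [0.5]])      # translation from world to camera frame (m)
P_o = np.array([[0.2], [0.1], [3.0], [1.0]])   # homogeneous world point

# lambda * p = K_I [R_E | T] P_o  (Equation (1))
p_h = K_I @ np.hstack((R_E, T)) @ P_o
x_im, y_im = p_h[:2, 0] / p_h[2, 0]      # divide out the scale factor lambda
print(x_im, y_im)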
In the image plane, to obtain the precise position of a target object, a series of image processing algorithms are used to process the images captured from the camera. The source image with a complex background acquired from the camera is shown in Figure 2a. In this study, the template matching approach [16] is used to find the image of the target object that matches a template object image in the whole image. The template matching process compares the sub-image of the source image and template object image, from left to right and from top to bottom, to obtain the correlation between these two images.
Additionally, to detect the target object in three-dimensional space, the template object image is resized during the comparison process. Figure 2b shows the normalized cross-correlation image, in which the brightest point indicates the greatest similarity between the two images.
The simulation result of the template matching method was compared with the one obtained by the color matching method. The ball detection result obtained using the template matching method is shown in Figure 3a, and the result based on the color matching method is shown in Figure 3b. For a complex background, comparing the object extraction step in Figure 3a with the thresholding process in Figure 3b shows that template matching ball detection is not disturbed by other objects, which makes it suitable for the environment used in this study.
The object location in the processed image was computed as the centroid of the object pixels as follows:
(x_c, y_c) = \left( \frac{1}{n_1} \sum x_{im}, \; \frac{1}{n_1} \sum y_{im} \right)
where (x_c, y_c) are the center coordinates of the object, (x_im, y_im) are the coordinates of a white pixel, the sums are taken over all white pixels, and n_1 is the number of white pixels. The actual centroid of the ball in Figure 3, obtained by manual image segmentation, is (x_c, y_c) = (336, 368). The centroid of the ball obtained from the template matching method is (334, 365), and the centroid obtained from the color matching method is (322, 355). The template matching method therefore provided the more accurate result.
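The detection pipeline described above can be sketched with OpenCV’s normalized cross-correlation template matching followed by the centroid computation of Equation (3). The image file names below are hypothetical, and the scale search (resizing the template) described in the text is omitted for brevity.

import cv2
import numpy as np

frame = cv2.imread("frame.png")              # source image with a complex background
template = cv2.imread("ball_template.png")   # template image of the ball

# Normalized cross-correlation map (brightest point = best match), as in Figure 2b.
ncc = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(ncc)

h, w = template.shape[:2]
roi = frame[max_loc[1]:max_loc[1] + h, max_loc[0]:max_loc[0] + w]

# Centroid of the segmented (white) pixels inside the matched region, Equation (3).
gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
ys, xs = np.nonzero(mask)
x_c = max_loc[0] + xs.mean()
y_c = max_loc[1] + ys.mean()
print("ball centre (pixels):", x_c, y_c, "match score:", max_val)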

3. Active Stereo Vision

In this study, an active stereo vision camera was used to locate and trace the flying ball. The active stereo vision camera included two cameras mounted on a pan-and-tilt platform, and the cameras were parallel to each other. The coordinate systems of this active vision camera are described in Figure 4, for which the parameters are listed below.
P: the target object in the world coordinate system.
r: the distance between the stereo coordinate frame and the geodetic coordinate frame.
d: the distance between the centers of camera A and camera B.
ϕ_s: the angle of the pan axis.
θ_s: the angle of the tilt axis.
(O_A, X̂_A, Ŷ_A, Ẑ_A): coordinate system of camera A.
(O_B, X̂_B, Ŷ_B, Ẑ_B): coordinate system of camera B.
(O_S, X̂_S, Ŷ_S, Ẑ_S): stereo coordinate system located at the midpoint between camera frames A and B and pointing toward the target.
(O_G, X̂_G, Ŷ_G, Ẑ_G): geodetic coordinate system fixed relative to the ground, with the pan axis aligned with the Ŷ_G axis and the tilt axis aligned with the X̂_G axis, where both axes intersect at the origin. The three axes form a right-handed coordinate system.
The pan angle ϕ_S rotates about the Ŷ_G axis, and the tilt angle θ_S rotates about the X̂_G axis. The position vector of the target object P relative to camera A is denoted as [x_A  y_A  z_A]^T. In this study, we assumed that cameras A and B were identical. Based on (1), after obtaining the image coordinates of the object in camera A and camera B, respectively, the position of the target object in the coordinate frame of camera A can be written as:
x_A = \frac{d}{x_{A\_im} - x_{B\_im}} \left[ x_{A\_im} - o_x - \frac{f s_\theta}{f s_y} (y_{A\_im} - o_y) \right], \quad
y_A = \frac{d \, f s_x}{f s_y} \cdot \frac{y_{A\_im} - o_y}{x_{A\_im} - x_{B\_im}}, \quad
z_A = \frac{f s_x \, d}{x_{A\_im} - x_{B\_im}}
where (x_A_im, y_A_im) and (x_B_im, y_B_im) are the image coordinates of the target object obtained from cameras A and B, respectively. According to Figure 4, the relationship between the stereo coordinate frame and the camera A coordinate frame is given by:
\begin{bmatrix} x_S \\ y_S \\ z_S \end{bmatrix}
= \begin{bmatrix} x_A \\ y_A \\ z_A \end{bmatrix}
- \begin{bmatrix} d/2 \\ 0 \\ 0 \end{bmatrix}
where [x_S  y_S  z_S]^T is the position vector of target object P relative to the stereo coordinate frame. The position vector of target object P is converted to the geodetic frame using a homogeneous transformation matrix H_S^G, which is given by:
\begin{bmatrix} x_G \\ y_G \\ z_G \\ 1 \end{bmatrix}
= H_S^G \begin{bmatrix} x_S \\ y_S \\ z_S \\ 1 \end{bmatrix}
where [x_G  y_G  z_G]^T is the position of target object P relative to the geodetic coordinate system and
H_S^G = \begin{bmatrix}
\cos\phi_s & \sin\phi_s \sin\theta_s & \sin\phi_s \cos\theta_s & r \sin\phi_s \cos\theta_s \\
0 & \cos\theta_s & -\sin\theta_s & -r \sin\theta_s \\
-\sin\phi_s & \cos\phi_s \sin\theta_s & \cos\phi_s \cos\theta_s & r \cos\phi_s \cos\theta_s \\
0 & 0 & 0 & 1
\end{bmatrix}.
Using Equations (4)–(6), the pixel position of the target object in the image frames of the two cameras can be used to determine the position vector of target object P in the world frame.
Figure 5 illustrates the position of target object P in the geodetic frame. From Figure 5 using simple geometry, the angular displacement of the pan-axis can be determined by
\phi_S = \sin^{-1} \frac{x_G}{\sqrt{x_G^2 + z_G^2}}
and the angular displacement of the tilt-axis is:
\theta_S = \sin^{-1} \frac{y_G}{\sqrt{x_G^2 + y_G^2 + z_G^2}}.
Direct current (DC) servo motors drive the pan-and-tilt platform to maintain continuous tracking. The angular commands θ_S and ϕ_S are sent in real time to keep the target object in the FoV of the cameras.
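A minimal sketch of Equations (4)–(8) is given below. It follows the sign conventions of the transform H_S^G as reconstructed above, and the parameter names mirror the text; the function is illustrative, not the DSP firmware used in the experiments.

import numpy as np

def ball_position_and_gimbal_angles(xA_im, yA_im, xB_im, d, r,
                                    f_sx, f_sy, f_stheta, ox, oy,
                                    phi_s, theta_s):
    # Equation (4): position of the ball in the camera-A frame
    disparity = xA_im - xB_im
    z_A = f_sx * d / disparity
    x_A = d / disparity * (xA_im - ox - (f_stheta / f_sy) * (yA_im - oy))
    y_A = (d * f_sx / f_sy) * (yA_im - oy) / disparity
    # Equation (5): shift to the stereo frame centred between the two cameras
    p_S = np.array([x_A - d / 2.0, y_A, z_A, 1.0])
    # Equation (6): homogeneous transform from the stereo to the geodetic frame
    cp, sp = np.cos(phi_s), np.sin(phi_s)
    ct, st = np.cos(theta_s), np.sin(theta_s)
    H = np.array([[ cp, sp * st, sp * ct,  r * sp * ct],
                  [0.0,      ct,     -st,      -r * st],
                  [-sp, cp * st, cp * ct,  r * cp * ct],
                  [0.0,     0.0,     0.0,          1.0]])
    x_G, y_G, z_G, _ = H @ p_S
    # Equations (7)-(8): pan and tilt commands that keep the ball centred
    pan = np.arcsin(x_G / np.hypot(x_G, z_G))
    tilt = np.arcsin(y_G / np.sqrt(x_G**2 + y_G**2 + z_G**2))
    return np.array([x_G, y_G, z_G]), pan, tilt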

4. Trajectory Estimation and Prediction

Visual tracking is challenging due to image variations caused by camera noise, background changes, illumination changes, and object occlusion. The above-mentioned problems will deteriorate the tracking performance and may even cause the loss of target objects. In this work, a Kalman filter [17,18] was applied to enhance the robustness of the designed visual tracking system for the purpose of estimating the target object’s position and velocity. The projectile motion trajectory for the flying ball was used to predict the touchdown point. The mobile robot was commanded to catch the ball at the appropriate location. A brief introduction to the Kalman filter is given below.
A state-space system model is described by
x_{k+1} = A x_k + B u_k + \varepsilon_k, \qquad y_k = C x_k + \omega_k
where x_k is the state of the system, u_k is the input, and y_k is the measurement. Matrices A, B, and C are the state transition matrix, input matrix, and output matrix, respectively. The state and measurement noises are denoted as ε_k and ω_k; both are assumed to be zero-mean Gaussian white noise with covariances Q_k and R_k, respectively.
The Kalman filter involves two major procedures: the time updating step and the measurement updating step [17,18]. The time updating step predicts the state at the next time step, and the measurement updating step corrects the predicted state using the measured information. The updated state is then used in the time updating step of the next cycle. The Kalman filter operates recursively, and the recursive formulas are
  • Time updating step:
    \hat{x}_k^- = A \hat{x}_{k-1} + B u_{k-1}, \qquad P_k^- = A P_{k-1} A^T + Q_k.
  • Measurement updating step:
    K_k = P_k^- C^T ( C P_k^- C^T + R_k )^{-1}
    \hat{x}_k = \hat{x}_k^- + K_k ( y_k - C \hat{x}_k^- )
    P_k = ( I - K_k C ) P_k^-.
In (10)–(13), x̂_k^- denotes the a priori predicted state, and x̂_k is the optimal estimated state after the measurement update. P_k^- and P_k are the a priori and a posteriori estimate error covariances, respectively. K_k is known as the Kalman gain, and y_k − C x̂_k^- is called the measurement residual. Based on (11), the a posteriori estimate error covariance P_k is minimized by K_k, and x̂_k is, hence, optimized.
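For reference, one recursive cycle of Equations (10)–(13) can be written compactly in NumPy as in the sketch below; this is a plain restatement of the standard filter equations, not the real-time DSP implementation used in this work.

import numpy as np

def kalman_step(x_est, P, y, u, A, B, C, Q, R):
    # Time updating step: a priori prediction of the state and covariance
    x_pred = A @ x_est + B @ u
    P_pred = A @ P @ A.T + Q
    # Measurement updating step: correct the prediction with the measurement y
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)              # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)            # corrected state
    P_new = (np.eye(P.shape[0]) - K @ C) @ P_pred    # corrected covariance
    return x_new, P_new

Here x_est and P are the previous estimate and covariance, y is the current measurement, and u is the input vector (in the ball model of the following paragraphs, the single input g).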
Now, we consider the dynamics of a flying ball. The position and velocity vector of the ball are denoted as [ x W y W z W ] T and [ v x v y v z ] T in the world coordinate system, respectively. We assumed that the flying trajectory of the ball is not interfered with by other objects. The forces considered in this work are the gravitational force, the buoyancy force, and the drag force. Other non-stated forces, such as the Magnus force [19], are ignored.
The buoyancy force vector is denoted as [0  (4/3)πR³ρg  0]^T, where R (m) is the radius of the ball, ρ (kg/m³) is the air density, and g (m/s²) is the gravitational acceleration at sea level. The drag force is assumed to be proportional to the velocity. According to Stokes’s law [19], the drag force is 6πμR v_w, where μ (kg/(m·s)) is the dynamic viscosity of the air and v_w = [v_x  v_y  v_z]^T. Let the mass of the ball be denoted as m (kg). The equation that governs the motion of a flying ball can be written as
m \frac{d \mathbf{v}_w}{dt} = \begin{bmatrix} 0 \\ -m' g \\ 0 \end{bmatrix} - b \mathbf{v}_w
where
b \triangleq 6 \pi \mu R
m' \triangleq m - \frac{4}{3} \pi R^3 \rho.
By discretizing (14) with the sampling period Δ t , we obtain the dynamic model of the system as shown below:
x_W(t) = v_x(t-\Delta t)\, \Delta t + x_W(t-\Delta t)
y_W(t) = v_y(t-\Delta t)\, \Delta t + y_W(t-\Delta t)
z_W(t) = v_z(t-\Delta t)\, \Delta t + z_W(t-\Delta t)
v_x(t) = \left( 1 - \frac{b}{m} \Delta t \right) v_x(t-\Delta t)
v_y(t) = -\frac{m'}{m} g \Delta t + \left( 1 - \frac{b}{m} \Delta t \right) v_y(t-\Delta t)
v_z(t) = \left( 1 - \frac{b}{m} \Delta t \right) v_z(t-\Delta t).
From Equations (17)–(22), a discrete-time linear state-space form (9) can be further written for the system model as
x_{k+1} = \begin{bmatrix} x_W(t) & y_W(t) & z_W(t) & v_x(t) & v_y(t) & v_z(t) \end{bmatrix}^T, \quad
x_k = \begin{bmatrix} x_W(t-\Delta t) & y_W(t-\Delta t) & z_W(t-\Delta t) & v_x(t-\Delta t) & v_y(t-\Delta t) & v_z(t-\Delta t) \end{bmatrix}^T, \quad
u_k = g
A = \begin{bmatrix}
1 & 0 & 0 & \Delta t & 0 & 0 \\
0 & 1 & 0 & 0 & \Delta t & 0 \\
0 & 0 & 1 & 0 & 0 & \Delta t \\
0 & 0 & 0 & 1 - \frac{b}{m}\Delta t & 0 & 0 \\
0 & 0 & 0 & 0 & 1 - \frac{b}{m}\Delta t & 0 \\
0 & 0 & 0 & 0 & 0 & 1 - \frac{b}{m}\Delta t
\end{bmatrix}
B = \begin{bmatrix} 0 & 0 & 0 & 0 & -\frac{m'}{m}\Delta t & 0 \end{bmatrix}^T
C = I_{6 \times 6} \;\; \text{(the 6-by-6 identity matrix)}.
Based on the above model, the Kalman filter is applicable for visual tracking to estimate the ball’s trajectory. Therefore, for the visual tracking of the flying ball, the Kalman filter and a constant acceleration model were used in this work.
Using the estimated position, the estimated velocity, and the projectile motion formulas (17)–(22), the future trajectory and touchdown point of the flying ball were predicted. The uncertainty of the initial value P_0 affects the estimation accuracy and convergence of the Kalman filter; a poor initial value can cause the active stereo vision camera to fail to track the ball, so that the ball leaves the FoV. An inaccurate initial value leads to an unreasonable estimated flight trajectory, such as one in which the ball does not appear to have been thrown into the air.
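The sketch below assembles the discretized model and propagates the current estimate forward, following Equations (17)–(22), until the predicted height reaches the catching plane. The physical constants (ball radius, air density, viscosity) and the catching height are illustrative assumptions rather than the values used in the simulations reported later.

import numpy as np

def ball_model(dt, m=0.07, R_ball=0.04, rho=1.2, mu=1.8e-5):
    # State [xW, yW, zW, vx, vy, vz]; constants are illustrative assumptions.
    b = 6.0 * np.pi * mu * R_ball                      # Stokes drag coefficient
    m_eff = m - (4.0 / 3.0) * np.pi * R_ball**3 * rho  # buoyancy-corrected mass m'
    A = np.eye(6)
    A[0:3, 3:6] = dt * np.eye(3)                       # position update
    A[3:6, 3:6] = (1.0 - b * dt / m) * np.eye(3)       # velocity decay from drag
    B = np.zeros((6, 1))
    B[4, 0] = -(m_eff / m) * dt                        # gravity acts on vy
    C = np.eye(6)                                      # all states are measured
    return A, B, C

def predict_touchdown(x_est, dt, g=9.81, catch_height=0.0, max_steps=1000):
    # Forward-simulate Equations (17)-(22) from the current estimate x_est.
    A, B, _ = ball_model(dt)
    x = np.asarray(x_est, dtype=float).copy()
    for _ in range(max_steps):
        x = A @ x + (B * g).ravel()
        if x[1] <= catch_height and x[4] < 0.0:        # falling through the plane
            break
    return x[0], x[2]                                  # (xW, zW) of the touchdown point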
In this work, deep learning was used to obtain the initial value of the state estimation error covariance P_0. Figure 6 describes the application of the Kalman filter combined with deep learning. After the camera obtains the position and velocity of the flying ball, the Kalman filter estimates the next position and commands the active stereo vision camera to track the ball.
Figure 7 describes the deep learning architecture. The input layer receives the inputs x_k and x_{k-1}, which are the data learned by the neural network. The network is based on AlexNet [20], a deep convolutional neural network. These inputs represent the position and velocity vectors of the flying ball. The last layer is the output layer, which outputs an initial estimate error covariance P_0 as the neural network’s result. The hidden layers lie between the input and output layers. Deep learning helps to attenuate the initial value deviation in the filtering process. Once the ball’s predicted touchdown point is obtained, point-to-point path planning is used to command the mobile robot to move toward the touchdown point in advance to catch the ball. This ball-catching strategy is discussed in a later section.
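As an illustration of the idea, the following is a greatly simplified stand-in for the AlexNet-based network of Figure 7: a small fully connected network that maps two consecutive ball states to a diagonal initial covariance P_0. The layer sizes and the restriction to a diagonal P_0 are assumptions made for brevity and differ from the network actually used in this work.

import torch
import torch.nn as nn

class P0Net(nn.Module):
    # Maps (x_k, x_{k-1}) to the diagonal of an initial error covariance P_0.
    def __init__(self, state_dim=6, hidden=64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, x_k, x_km1):
        z = torch.cat([x_k, x_km1], dim=-1)
        diag = nn.functional.softplus(self.layers(z))  # keep variances positive
        return torch.diag_embed(diag)                  # batched diagonal P_0

After training on recorded trajectories, the predicted P_0 = P0Net()(x_k, x_km1) would seed the Kalman filter in place of a hand-tuned initial covariance.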

5. Controller Design of the Omni-Directional Wheeled Mobile Robot

Figure 8 shows a schematic top view of an omni-directional wheeled mobile robot. This mobile robot consists of a rigid body and three omni-directional wheels labeled 1, 2, and 3. The wheels are arranged at an equal distance from the center of the robot platform at 120° intervals. This arrangement allows the robot to move freely in any direction. In this study, (O_W, Ẑ_W, X̂_W) is the world coordinate system, and (O_M, Ẑ_M, X̂_M) is the body coordinate system with its origin attached to the center of mass of the robot. The direction of the X̂_M-axis is aligned with wheel 1, as shown in Figure 8. We assumed that the wheels roll without slipping.
The parameters for an omni-directional mobile robot are listed as follows:
M_t: mass of the mobile robot.
L_c: radius of the mobile robot.
f_1, f_2, f_3: the reaction forces applied by the ground to omni-directional wheels 1, 2, and 3, respectively, directed perpendicular to the wheel axes.
R_w: radius of the omni-directional wheels.
δ_c: the angle between wheel 2 and the Ẑ_M axis of the mobile coordinate system; the value is fixed at 30°.
ϕ: the rotation angle of the omni-directional mobile robot.
I_y: moment of inertia of the omni-directional mobile robot about the Ŷ_M axis.
θ_1, θ_2, θ_3: the rotation angles of omni-directional wheels 1, 2, and 3, respectively.
ω_1, ω_2, ω_3: the angular velocities of omni-directional wheels 1, 2, and 3, respectively.
[z_W  x_W]^T: the position of the center of mass of the robot relative to the world coordinate system.
According to [21], the dynamics and kinematics of the mobile robot can be described by:
\begin{bmatrix} \ddot{z}_W \\ \ddot{x}_W \\ \ddot{\phi} \end{bmatrix}
= \begin{bmatrix} M_t & 0 & 0 \\ 0 & M_t & 0 \\ 0 & 0 & I_y \end{bmatrix}^{-1}
\begin{bmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} -1 & \frac{1}{2} & \frac{1}{2} \\ 0 & \frac{\sqrt{3}}{2} & -\frac{\sqrt{3}}{2} \\ L_c & L_c & L_c \end{bmatrix}
\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix}
\begin{bmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{bmatrix}
= \frac{1}{R_w}
\begin{bmatrix} -1 & 0 & L_c \\ \frac{1}{2} & \frac{\sqrt{3}}{2} & L_c \\ \frac{1}{2} & -\frac{\sqrt{3}}{2} & L_c \end{bmatrix}
\begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \dot{z}_W \\ \dot{x}_W \\ \dot{\phi} \end{bmatrix}.
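The inverse kinematics of Equation (25) can be sketched as follows, using the reconstructed wheel geometry above; the function is illustrative and assumes the wheel numbering of Figure 8.

import numpy as np

def wheel_speeds(z_dot, x_dot, phi_dot, phi, R_w, L_c):
    # World-frame velocity (z_dot, x_dot) and yaw rate phi_dot -> wheel speeds.
    world_to_body = np.array([[ np.cos(phi), np.sin(phi), 0.0],
                              [-np.sin(phi), np.cos(phi), 0.0],
                              [         0.0,         0.0, 1.0]])
    J = np.array([[-1.0,                0.0, L_c],
                  [ 0.5,  np.sqrt(3) / 2.0, L_c],
                  [ 0.5, -np.sqrt(3) / 2.0, L_c]])
    return (J @ world_to_body @ np.array([z_dot, x_dot, phi_dot])) / R_w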
In this study, the brushed DC servo motors drove the omni-directional wheels. We also assumed that the motor’s electrical time constant was smaller than the mechanical time constants and that the motor friction was negligible. The model of a DC motor is, thus, reduced to
\tau_m = \frac{K_t}{R_a} u - \frac{K_t^2}{R_a} \omega_m.
In (26), τ m , u, K t , ω m , and R a represent the motor torque, control voltage, motor torque constant, angular velocity of the motor, and armature resistance, respectively. The traction force f of the wheel is given by
f = \frac{n}{R_w} \tau_m
where n is the gear ratio. The three motors used in this mobile robot were assumed to be identical. Therefore, combining (26) and (27), the relationship between f, u, and ω is given by
\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix}
= \frac{n K_t}{R_w R_a} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}
- \frac{n^2 K_t^2}{R_w R_a} \begin{bmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{bmatrix}.
From (24), (25), and (28), the dynamics of the robot can be presented as
\ddot{P}_w = A_w \dot{P}_w + B_w(\phi) U_C
where
P_w = [z_W \;\; x_W \;\; \phi]^T, \quad U_C = [u_1 \;\; u_2 \;\; u_3]^T
A_w = \begin{bmatrix} -a_1 & 0 & 0 \\ 0 & -a_1 & 0 \\ 0 & 0 & -a_2 \end{bmatrix}
B_w(\phi) = \begin{bmatrix}
-2 b_1 \cos\phi & b_1 \cos\phi - \sqrt{3}\, b_1 \sin\phi & b_1 \cos\phi + \sqrt{3}\, b_1 \sin\phi \\
-2 b_1 \sin\phi & b_1 \sin\phi + \sqrt{3}\, b_1 \cos\phi & b_1 \sin\phi - \sqrt{3}\, b_1 \cos\phi \\
b_2 & b_2 & b_2
\end{bmatrix}
with
a_1 = \frac{3 n^2 K_t^2}{2 R_w^2 M_t R_a}, \quad
a_2 = \frac{3 n^2 K_t^2 L_c^2}{R_w^2 I_y R_a}, \quad
b_1 = \frac{n K_t}{2 R_w M_t R_a}, \quad
b_2 = \frac{n K_t L_c}{R_w I_y R_a}.
We assume that the predicted touchdown point of the ball in the world coordinate system is [z_bw  x_bw]^T. The position reference for the mobile robot is set to:
P_{bw}(t) = \begin{bmatrix} z_{bw}(t) & x_{bw}(t) & \phi \end{bmatrix}^T
where the rotation angle ϕ is assumed to be 0. The tracking error is defined as follows:
e(t) = P_{bw}(t) - P_w(t).
This gives
\ddot{e} = \ddot{P}_{bw} - \left[ A_w \dot{P}_w + B_w(\phi) U_C \right].
A new control input U [22] is defined as follows:
U \triangleq \ddot{P}_{bw} - \left[ A_w \dot{P}_w + B_w(\phi) U_C \right].
From (35) and (34), we obtain
\frac{d}{dt} \begin{bmatrix} e \\ \dot{e} \end{bmatrix}
= \begin{bmatrix} 0 & I \\ 0 & 0 \end{bmatrix} \begin{bmatrix} e \\ \dot{e} \end{bmatrix}
+ \begin{bmatrix} 0 \\ I \end{bmatrix} U.
From (34), the feedback control U_C can be written as
U_C = B_w(\phi)^{-1} \left[ \ddot{P}_{bw} - A_w \dot{P}_w - U \right].
In the form used in (36), the system is decoupled into a linear system, and a PID control algorithm is used for tracking control. In this case, the following PID control law is used:
\dot{\varepsilon} = e, \qquad U = -K_p e - K_d \dot{e} - K_i \varepsilon
where K_d, K_p, and K_i are 3-by-3 diagonal PID gain matrices equal to diag{k_di}, diag{k_pi}, and diag{k_ii}, respectively, for i = 1, 2, 3. From (36) and (37), it follows that the closed-loop tracking error system is given by:
\frac{d}{dt} \begin{bmatrix} \varepsilon \\ e \\ \dot{e} \end{bmatrix}
= \begin{bmatrix} 0 & I & 0 \\ 0 & 0 & I \\ -K_i & -K_p & -K_d \end{bmatrix}
\begin{bmatrix} \varepsilon \\ e \\ \dot{e} \end{bmatrix}.
According to the Routh–Hurwitz criterion [23], the PID gain values k_pi, k_ii, and k_di must satisfy:
k_{ii} < k_{di} k_{pi}, \quad i = 1, 2, 3.
To obtain closed-loop stability, the PID control gain values were chosen based on the control design method proposed in [24]. The phase margin and the gain margin were set to 45° and 6.0206 dB, respectively, in this work.
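The complete control law, the outer PID loop defined above followed by the feedback-linearizing inversion through B_w(ϕ), can be sketched as follows. The gains and model constants are assumed to be supplied by the design procedure of [24]; this is an illustrative sketch, not the DSP implementation.

import numpy as np

def control_voltages(e, e_dot, eps, P_bw_ddot, P_w_dot, phi,
                     Kp, Kd, Ki, a1, a2, b1, b2):
    # Outer PID loop on the decoupled double integrator
    U = -Kp @ e - Kd @ e_dot - Ki @ eps
    # Model matrices A_w and B_w(phi) as defined after Equation (29)
    A_w = np.diag([-a1, -a1, -a2])
    c, s = np.cos(phi), np.sin(phi)
    r3 = np.sqrt(3.0)
    B_w = np.array([[-2 * b1 * c, b1 * c - r3 * b1 * s, b1 * c + r3 * b1 * s],
                    [-2 * b1 * s, b1 * s + r3 * b1 * c, b1 * s - r3 * b1 * c],
                    [         b2,                   b2,                   b2]])
    # Feedback linearization: solve B_w(phi) U_C = P_bw_ddot - A_w P_w_dot - U
    U_C = np.linalg.solve(B_w, P_bw_ddot - A_w @ P_w_dot - U)
    return U_C

Here e, e_dot, and eps are the tracking error, its derivative, and its integral, and Kp, Kd, and Ki are the 3-by-3 diagonal gain matrices.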

6. Implementation of the Designed System

Figure 9 shows a block diagram of the proposed ball-catching system. This system consisted of an omni-directional wheeled mobile robot and an image processing system that included an active stereo vision camera and a static vision camera.
Figure 10a shows the active stereo vision camera. The MT9P001 complementary metal-oxide-semiconductor (CMOS) image sensor was used. It can capture 640 × 480 pixels in the quantized RGB format at 60 frames per second (FPS). The cameras were attached to a field-programmable gate array (FPGA) board, which was used to configure the cameras and to acquire images. For real-time processing of the acquired images, a DSP (TMS320DM6437) board was used.
An optical encoder with a resolution of 500 pulses/rev was used to measure the motors’ angular displacement in the pan-and-tilt platform. Another DSP board (TMS320F2812), with two quadrature encoder pulse (QEP) units and one pulse width modulation (PWM) signal generator unit, was used to acquire the angular displacement and rotational direction of the motors from the quadrature encoders and to control the pan-and-tilt platform motors. In addition, the Kalman filter was implemented to mitigate the measurement noise and to predict the motion of the ball.
The static vision camera was mounted above the work area, where the FoV of the camera covered the entire work area (length: 2.5 m, width: 2.5 m, and height: 3 m) as shown in Figure 1. This camera was used to locate and navigate the omni-directional wheeled mobile robot. As with the active stereo vision camera, the Kalman-filter-based vision tracking and image processing algorithms were implemented in the DSP board (TMS320DM6437).
The mobile robot, as stated in Section 5, used three brushed DC motors to drive the omni-directional wheels. A DSP board (TMS320F2812) was used for PID control of the motors and for the touchdown point prediction of the ball. To obtain each wheel’s angular displacement, an optical encoder with a 500 pulse/rev resolution was mounted on the shaft of each wheel. These optical encoders send quadrature encoder signals to the QEP circuit on an FPGA board for decoding. The wheels’ angular velocities were obtained by differentiating the angular displacement over the sampling time, and a low-pass filter was then applied to attenuate the high-frequency noise, as sketched below.
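A sketch of this velocity estimation is given below: the encoder count difference over one sampling period is converted to an angular velocity and then smoothed with a first-order low-pass filter. The quadrature multiplier, sampling period, and filter constant are illustrative assumptions, not the firmware settings.

import math

COUNTS_PER_REV = 500 * 4        # 500 pulses/rev read in quadrature (x4), assumed
DT = 0.001                      # sampling period in seconds, assumed
ALPHA = 0.1                     # low-pass smoothing factor, assumed

def update_wheel_velocity(count_now, count_prev, omega_filt_prev):
    theta_step = 2.0 * math.pi * (count_now - count_prev) / COUNTS_PER_REV
    omega_raw = theta_step / DT                          # finite-difference velocity
    # First-order IIR low-pass filter to attenuate high-frequency noise
    omega_filt = ALPHA * omega_raw + (1.0 - ALPHA) * omega_filt_prev
    return omega_filt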
The robot’s position and orientation were determined with the static vision camera and a dead reckoning algorithm based on the motor encoder measurements. The PWM signals were generated according to the designed PID control laws to drive each of the motors. Figure 10b shows a basket 0.16 m in diameter mounted on the top of the robot for the purpose of catching the ball.
All of the sub-systems described above communicated through wireless communication modules, as shown in Figure 9. The active stereo vision camera obtained the position and velocity of the ball and sent them to the mobile robot for touchdown point prediction. The static vision camera obtained the position and orientation of the mobile robot and sent them to the mobile robot for navigation and positioning.

7. Simulation and Experimental Results

7.1. Touchdown Point Prediction

The prediction of the touchdown point of the target ball was first verified through a numerical simulation using MATLAB/Simulink. The mass and radius of the ball were set to 0.07 kg and 0.004 m, respectively. The initial position of the ball in meters was [3.9  1.1  5.15]^T, and the initial velocity in meters per second was [2.1  5.2  1.9]^T. The sampling period was 0.0167 s. Although the dynamics of a flying ball were modeled, it was impossible to precisely model the ball’s rotation and the airflow field conditions during flight.
Q_k mainly represents these unmodeled effects. Since they have little effect on the flight of the ball, Q_k can be reasonably obtained from experimental measurements. R_k models the noise from illumination variation, which cannot be removed by the camera calibration method; however, this variation was not significant in this study, and a reasonable value of R_k can also be obtained through experimental measurements. Thus, the covariance matrices Q_k and R_k were assumed to be constant during the motion; the values used in this simulation are given below.
Q_k = \mathrm{diag}(0.01, \; 0.01, \; 0.01, \; 0.005, \; 0.005, \; 0.005)
R_k = \mathrm{diag}(0.8, \; 0.8, \; 0.8, \; 0.16, \; 0.16, \; 0.16).
The initial estimate error covariance matrix was
P_0 = \begin{bmatrix} 1.6 \times 10^{-9}\, I_3 & 3.2 \times 10^{-7}\, I_3 \\ 3.2 \times 10^{-7}\, I_3 & 6.4 \times 10^{-5}\, I_3 \end{bmatrix},
where I_3 denotes the 3-by-3 identity matrix.
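For use with the Kalman filter sketch given earlier, these matrices can be written directly in NumPy; the block form below reproduces the values stated above.

import numpy as np

Q_k = np.diag([0.01, 0.01, 0.01, 0.005, 0.005, 0.005])
R_k = np.diag([0.8, 0.8, 0.8, 0.16, 0.16, 0.16])
P_0 = np.block([[1.6e-9 * np.eye(3), 3.2e-7 * np.eye(3)],
                [3.2e-7 * np.eye(3), 6.4e-5 * np.eye(3)]])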
The simulated results of the estimated, measured, and actual trajectories of the flying ball are shown in Figure 11. We observed that the initial estimated and actual trajectories were significantly different. After several iterations, the estimated trajectory overlapped the actual trajectory with reasonable accuracy. The predicted touchdown point obtained using Equations (17)–(22) in each step of the estimation is plotted in Figure 12. The initially predicted location was quite far from the actual one; however, it converged to the real touchdown point with additional iterations.

7.2. Improvement of Projectile Prediction

Deep learning was used to improve the projectile prediction. The training data sets are shown in Figure 13 and comprise a total of 80 projectile trajectories. The deep learning network was trained on the projectile trajectory data in Figure 13 so that a reasonable P_0 value could be obtained from the ball’s initial flight states. The covariance matrices Q_k and R_k were the same as in the simulation above; the initial estimate error covariance matrix P_0 was obtained through the deep learning network after two iterations of measurements, and the Kalman filter then updated P_k during the ball’s motion. Figure 14 shows the result of predicting the touchdown point using the P_0 obtained by the deep learning network. As shown in the figure, the use of deep learning increased the accuracy of the touchdown point prediction.

7.3. Free Falling Ball Experiment

For further validation, the proposed visual servoing system for ball catching was developed and tested in the experimental setup shown in Figure 15. The active stereo vision camera, with its two cameras on a pan-and-tilt platform, performed the visual tracking of the flying ball, and the omni-directional wheeled mobile robot was navigated to catch the ball.
To verify the performance of the active stereo vision camera, an orange ball was dropped a suitable distance in front of the system. The visual tracking system was expected to track the ball and keep it in the FoV of both cameras. Figure 16 shows a series of images taken by the active stereo vision camera; in both image series (image A and image B), the ball was kept within the frame. The visual tracking results for a free falling ball in this study and in our previous work are given in Figure 17 and Figure 18.
Figure 17 shows the estimated trajectory of the ball, and Figure 18 shows the command and time responses of the pan and tilt angles, respectively. In the case of a free-drop object, the object moves only in the vertical direction with no horizontal displacement. As shown in Figure 18a, the command and time response of the tilt angle changed over time, while the pan angle remained at 0° the entire time. Both the pan and tilt motors were able to follow the commands precisely. Figure 18b shows that, compared with our previous results, the tracking response was improved in this study. From these results, we concluded that the active stereo vision camera was able to successfully perform visual tracking.

7.4. Catching a Flying Ball Experiment

Next, the active stereo vision camera was tested in tracking a flying ball. In this test, we threw the target ball at a high initial velocity. Figure 19 shows a series of images captured by the active stereo vision camera. Similar to the free-drop ball case, the visual tracking system was able to keep the ball in the FoV of the cameras. The visual tracking results for a flying ball in this study and in our previous work are given in Figure 20 and Figure 21.
The estimated trajectory of the ball is shown in Figure 20, and the command and time responses of the pan and tilt angles are shown in Figure 21, respectively. The results are similar to those obtained for the free-drop ball case, where both motors were able to respond precisely to the commands. These experimental results indicate that the active stereo vision camera can track a target and keep it in the FoV even at a high initial velocity.
The effectiveness of the touchdown point prediction was investigated through experiments. In this test, the ball was thrown toward the stereo vision camera, and the system estimated the ball’s position and velocity. With the estimated position and velocity, the future trajectory and touchdown point of the ball were predicted using the projectile motion formulas (17)–(22). Figure 22 shows the experimental results. In this figure, the predicted trajectories converge to the measured and estimated trajectories, and the predicted and real touchdown points agree with a reasonable degree of accuracy. For comparison, the touchdown point prediction results of the previous work are given in Figure 22a; the present results show a clear improvement.
Finally, we tested the complete ball-catching task. In this test, the ball was thrown from a distance of 2 m from the mobile robot. The robot moved to the predicted point as soon as possible after receiving the command, in order to catch the ball before it touched the ground. The ball-catching results of this study and of our previous work are given in Figure 23 and Figure 24.
Figure 23 (three-dimensional view) and Figure 24 (view in the X–Z plane) show the estimated and measured trajectories of the ball and the robot’s path of movement. The results show that the final predicted touchdown point and the location of the robot match very well. The residual difference between the measured trajectory and the estimated trajectory is the measurement residual. The mean value and standard deviation of the measurement residual in our previous work were 0.0529 m and 0.034 m, while those of this study were 0.0372 m and 0.019 m, respectively.
The methods proposed in this paper provided a smaller residual and variance and, therefore, more accurate measurements and a more precise prediction of the touchdown point. This indicates that the robot could accomplish the ball-catching task effectively. A video clip of the developed system in action is available online (https://youtu.be/En-6XcmkeBs, accessed on 30 April 2021). To determine the success rate, we performed 50 throws with random initial positions. The robot successfully caught the ball 44 times out of the 50 trials, giving an overall success rate of 88%. The success rate of our previous work was about 74%.

8. Concluding Remarks

In this research, a robotic ball-catching system was presented. This system consisted of multi-camera vision systems, an omni-directional wheeled mobile robot, and wireless communication. In the multi-camera vision system, the ball’s motion was tracked with an active stereo vision camera while a static vision camera navigated a mobile robot. Using a Kalman filter and the motion governing equations of a flying ball, the ball’s touchdown point was predicted with reasonable accuracy. For Kalman filtering, we found that the use of deep learning improved the accuracy and robustness.
The robot was controlled to move toward the predicted point to catch the ball before it hit the ground. The performance of the sub-systems and the proposed algorithms was verified through simulations and experiments. The results of the simulation matched the experimental results well. The experimental results confirmed that the developed robotic system combined with multi-camera vision systems could catch a flying ball.
The main contribution of this paper is to present the main issues and technical challenges in the design and system integration of a vision-based ball-catching system. Compared with existing vision-based ball-catching systems, by combining an omni-directional wheeled mobile robot with an active stereo vision system, the ball-catching system proposed in this paper can operate in a large workspace.
In future research, the image capture system used in this paper can be improved through the use of better Kalman filtering methods to suppress noise. Deep learning will be applied to image pre-processing and will be tested on complex backgrounds with balls of different colors to verify the recognition accuracy. Different sensors (such as a laser range finder or RGB-D cameras) will be used for experimental comparisons. Different control laws will be applied to the omni-directional wheeled mobile robot in an attempt to increase the speed and accuracy of movement. In addition, worst-case conditions (long-distance movement or disturbances during movement) will be applied to verify the robot’s abilities. In the simulations and experiments, different ball conditions (speed, height, etc.) and disturbances during the flight will be used to verify the system robustness.

Author Contributions

Conceptualization, M.-T.H.; Methodology, S.-T.K. and M.-T.H.; Software, S.-T.K.; Validation, S.-T.K.; Formal analysis, M.-T.H.; Data curation, S.-T.K.; Writing-original draft, S.-T.K.; Writing-review and editing, S.-T.K. and M.-T.H.; Supervision, M.-T.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Ministry of Science and Technology of Taiwan under Grant No. MOST 103-2221-E-006-184.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data collected through research presented in the paper are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bauml, B.; Wimbock, T.; Hirzinger, G. Kinematically Optimal Catching a Flying Ball with a Hand-Arm-System. In Proceedings of the International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 18–22 October 2010; pp. 2592–2599.
  2. Rapp, H.H. A Ping-Pong Ball Catching and Juggling Robot: A Real-Time Framework for Vision Guided Acting of an Industrial Robot Arm. In Proceedings of the 2011 International Conference on Automation, Robotics and Applications, Wellington, New Zealand, 6–8 December 2011; pp. 430–435.
  3. Kim, S.; Shukla, A.; Billard, A. Catching Objects in Flight. IEEE Trans. Robot. 2014, 30, 1049–1065.
  4. Oka, T.; Komura, N.; Namiki, A. Ball juggling robot system controlled by high-speed vision. In Proceedings of the IEEE International Conference on Cyborg and Bionic Systems, Beijing, China, 17–19 October 2017.
  5. Karuppiah, P.; Metalia, H.; George, K. Automation of a wheelchair mounted robotic arm using computer vision interface. In Proceedings of the IEEE International Instrumentation and Measurement Technology Conference, Houston, TX, USA, 14–17 May 2018.
  6. Cigliano, P.; Lippiello, V.; Ruggiero, F.; Siciliano, B. Robotic Ball Catching with an Eye-in-Hand Single-Camera System. IEEE Trans. Control Syst. Technol. 2015, 23, 1657–1671.
  7. Tongloy, T.; Boonsang, S. An image-based visual servo control system based on an eye-in-hand monocular camera for autonomous robotic grasping. In Proceedings of the International Conference on Instrumentation, Control and Automation, Bandung, Indonesia, 29–31 August 2016.
  8. Carter, E.J.; Mistry, M.N.; Carr, G.P.K.; Kelly, B.A.; Hodgins, J.K. Playing catch with robots: Incorporating social gestures into physical interactions. In Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication, Edinburgh, UK, 25–29 August 2014.
  9. Namiki, A.; Itoi, N. Ball catching in kendama game by estimating grasp conditions based on a high-speed vision system and tactile sensors. In Proceedings of the IEEE-RAS International Conference on Humanoid Robots, Madrid, Spain, 18–20 November 2014.
  10. Sugar, T.G.; McBeath, M.K.; Suluh, A.; Mundhra, K. Mobile robot interception using human navigational principles: Comparison of active versus passive tracking algorithms. Auton. Robot. 2006, 21, 43–54.
  11. Bauml, B.; Birbach, O.; Wimbock, T.; Frese, U. Catching Flying Balls with a Mobile Humanoid: System Overview and Design Considerations. In Proceedings of the IEEE International Conference on Humanoid Robots, Bled, Slovenia, 26–28 October 2011; pp. 513–520.
  12. Wang, R.; Liu, M.S.; Zhou, Y.; Xun, Y.Q.; Zhang, W.B. A deep belief networks adaptive Kalman filtering algorithm. In Proceedings of the IEEE International Conference on Software Engineering and Service Science, Beijing, China, 26–28 August 2016.
  13. Kao, S.T.; Wang, Y.; Ho, M.T. Ball catching with omni-directional wheeled mobile robot and active stereo vision. In Proceedings of the IEEE 26th International Symposium on Industrial Electronics, Edinburgh, UK, 19–21 June 2017.
  14. Faugeras, O. Three Dimensional Computer Vision; MIT Press: Cambridge, MA, USA, 1993.
  15. Ma, Y.; Soatto, S.; Kosecka, J.; Sastry, S. An Invitation to 3-D Vision: From Images to Geometric Models; Springer: New York, NY, USA, 2003.
  16. Brunelli, R. Template Matching Techniques in Computer Vision: Theory and Practice; John Wiley & Sons: New York, NY, USA, 2009.
  17. Welch, G.; Bishop, G. An Introduction to the Kalman Filter; Department of Computer Science, University of North Carolina at Chapel Hill: Chapel Hill, NC, USA, 2006.
  18. Grewal, M.S.; Andrews, A.P. Kalman Filtering: Theory and Practice with MATLAB; Wiley: New York, NY, USA, 2014.
  19. Elger, D.F.; Williams, B.C.; Crowe, C.T.; Roberson, J.A. Engineering Fluid Mechanics; Wiley: New York, NY, USA, 2012.
  20. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
  21. Watanabe, K.; Shiraishi, Y.; Tzafestas, S.G.; Tang, J.; Fukuda, T. Feedback Control of an Omnidirectional Autonomous Platform for Mobile Service Robots. J. Intell. Robot. Syst. 1998, 22, 315–330.
  22. d’Andrea Novel, B.; Bastin, G.; Campion, G. Dynamic feedback linearization of nonholonomic wheeled mobile robots. In Proceedings of the IEEE International Conference on Robotics and Automation, Nice, France, 12–14 May 1992; pp. 2527–2532.
  23. Golnaraghi, F.; Kuo, B.C. Automatic Control Systems; Wiley: New York, NY, USA, 2009.
  24. Ho, M.T.; Wang, H.S. PID Controller Design with Guaranteed Gain and Phase Margins. Asian J. Control 2003, 5, 374–381.
Figure 1. Schematic overview of the proposed system.
Figure 2. (a) Image with a complicated background captured by the camera and (b) the normalized cross-correlation image.
Figure 3. Simulation results of the (a) template matching method and (b) color matching method.
Figure 4. Coordinate frames.
Figure 5. Angular motor displacements.
Figure 6. Kalman filter combined with deep learning.
Figure 7. The architecture of the deep learning.
Figure 8. Top view of the omni-directional mobile robot.
Figure 9. Block diagram of the proposed system.
Figure 10. Omni-directional wheeled mobile robot.
Figure 11. Comparison of the actual trajectory, estimated trajectory, and measured trajectory.
Figure 12. Prediction of the touchdown points.
Figure 13. The training data sets.
Figure 14. Prediction of the touchdown points improved by deep learning.
Figure 15. The experimental setup.
Figure 16. A free falling ball: the sequence of images captured.
Figure 17. A free falling ball: estimated trajectory of the ball.
Figure 18. A free falling ball: tracking response of (a) the pan-axis motor and (b) the tilt-axis motor.
Figure 19. A flying ball: the sequence of images captured.
Figure 20. A flying ball: estimated trajectory of the ball.
Figure 21. A flying ball: tracking response of (a) the pan-axis motor and (b) the tilt-axis motor.
Figure 22. Prediction of touchdown points: (a) the previous work and (b) this study.
Figure 23. The ball’s trajectories of estimation and measurement, and the robot’s path of movement: (a) the previous work and (b) this study.
Figure 24. Moving path of the robot to catch the ball: (a) the previous work and (b) this study.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

