Films are normally played at 24 frames per second, and TV programs are broadcast at a standard frame rate of 30 Hz. In the Multimedia Internet of Things (IoT) in particular, the limited bandwidth of wireless channels requires a lower frame rate when encoding video sequences. A low frame rate can meet basic entertainment needs, but motion blur occurs when a video sequence contains a large amount of fast motion. Video sequences at a high frame rate are known to contain less blur and fewer block artifacts and to provide a better visual experience. Therefore, at the receiver of Multimedia IoT, the frame rate of video sequences should be increased to improve the viewing experience. To meet this need, Motion-Compensated Frame Rate Up-Conversion (MC-FRUC) is often used to convert low-frame-rate videos into high-frame-rate ones.
MC-FRUC, which has gained extensive attention from scholars in recent years [1], is a video processing technique that interpolates several new frames between two adjacent original frames. Its standard flow consists of Motion Estimation (ME), Motion Vector Smoothing (MVS), Motion Vector Mapping (MVM) and Motion-Compensated Interpolation (MCI): the first three are combined to provide the Motion Vector Field (MVF) of the intermediate frame, and MCI interpolates the new frame according to this MVF [6].
The quality of the interpolated frames is heavily influenced by the accuracy of the MVF, so much research has focused on ME, MVS and MVM. ME is the process of predicting the MVF between two adjacent original frames [7]. The block matching algorithm (BMA), the typical method among the various ME algorithms, has the advantage of lower complexity than pixel-wise ME [8]. The size of one standard block is much smaller than that of a frame, and the pixels of most objects are distributed over several contiguous blocks. In light of this, 3D Recursive Search (3DRS) was proposed based on spatiotemporal correlation [10]. To track true motion as closely as possible, MVS imposes smoothness constraints on BMA [11], so that more MV outliers can be effectively suppressed. MVS can also be implemented explicitly by median filtering and penalty terms [12], but this explicit approach increases the computational complexity. After MVS, MVM is used to deduce the MVF of the intermediate frame from the MVF between adjacent original frames [13]. Forward MVM is a common strategy that maps halved MVs along their directions to the blocks at which they point [14]. Little temporal mismatch occurs with forward MVM, but some blocks in the intermediate frame may receive multiple MVs or none, which introduces overlaps and holes. Based on the assumption of temporal symmetry, bilateral MVM performs Bilateral ME (BME) [16] directly on the intermediate frame, which avoids block artifacts. However, because the statistics of video sequences vary, MV outliers always exist, and they cause edge blurring and block artifacts during MCI. Some advanced MCI approaches, e.g., Overlapped Block Motion Compensation (OBMC) [17], can reduce the adverse effects of MV outliers. Fractal interpolation can also be performed to predict the pixels at fractional coordinates; it effectively reduces blurring and block artifacts by providing a pleasant zoom and slow motion [18]. The various results on ME, MVS, MVM and MCI can be combined flexibly within MC-FRUC to obtain different levels of performance. Several state-of-the-art methods have been presented recently: Li et al. [19] proposed an MC-FRUC method using a patch-based sparseland model, Tsai et al. [20] constructed a hierarchical motion field and an MV mapping stage to improve the performance of MC-FRUC, and Li et al. [21] used multiple ME schemes to jointly interpolate the frames. However, these performance improvements come at the cost of computational complexity. Like natural images, the MVF of video frames also has locally stationary statistics [22], which can help Motion-Compensated Frame Interpolation (MCFI) reduce the computational complexity.
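To make the bilateral strategy concrete, the following is a minimal sketch of block-based BME followed by simple averaging MCI. It is an illustration under stated assumptions and not the method of any cited work or of the proposed algorithm: frames are grayscale lists of rows whose dimensions are multiples of the block size, the search is exhaustive with the sum of absolute differences (SAD) as the matching cost, ties are broken toward smaller motion, and all function names and parameters (`bilateral_me`, `interpolate_middle`, `size`, `radius`) are hypothetical.

```python
def block_inside(top, left, size, height, width):
    """Return True if the size x size block at (top, left) lies inside the frame."""
    return 0 <= top and top + size <= height and 0 <= left and left + size <= width


def bilateral_sad(prev, nxt, top, left, size, dy, dx):
    """SAD between the block shifted by -v in prev and by +v in nxt."""
    total = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            total += abs(prev[y - dy][x - dx] - nxt[y + dy][x + dx])
    return total


def bilateral_me(prev, nxt, size=4, radius=2):
    """Estimate one symmetric MV per block of the intermediate frame."""
    height, width = len(prev), len(prev[0])
    mvf = {}
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            best_key, best_mv = None, (0, 0)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Skip candidates whose displaced blocks leave the frame.
                    if not (block_inside(top - dy, left - dx, size, height, width)
                            and block_inside(top + dy, left + dx, size, height, width)):
                        continue
                    cost = bilateral_sad(prev, nxt, top, left, size, dy, dx)
                    key = (cost, abs(dy) + abs(dx))  # prefer small motion on ties
                    if best_key is None or key < best_key:
                        best_key, best_mv = key, (dy, dx)
            mvf[(top, left)] = best_mv
    return mvf


def interpolate_middle(prev, nxt, mvf, size=4):
    """Average the two motion-compensated blocks to build the intermediate frame."""
    height, width = len(prev), len(prev[0])
    mid = [[0.0] * width for _ in range(height)]
    for (top, left), (dy, dx) in mvf.items():
        for y in range(top, top + size):
            for x in range(left, left + size):
                mid[y][x] = (prev[y - dy][x - dx] + nxt[y + dy][x + dx]) / 2
    return mid


def make_frame(square_left, height=12, width=12):
    """Toy frame: a bright 4 x 4 square on a black background."""
    frame = [[0] * width for _ in range(height)]
    for y in range(4, 8):
        for x in range(square_left, square_left + 4):
            frame[y][x] = 100
    return frame


# The square moves two pixels to the right between the two original frames.
prev_frame = make_frame(3)
next_frame = make_frame(5)
mvf = bilateral_me(prev_frame, next_frame)
mid_frame = interpolate_middle(prev_frame, next_frame, mvf)
```

Because every block of the intermediate frame receives exactly one MV, this sketch produces no overlaps or holes. Blocks at the edge of the moving square may still receive unreliable MVs, which is the kind of outlier that MVS and advanced MCI aim to suppress.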
Existing works spend a large amount of computation on suppressing MV outliers, yet the improvement in MV precision is far from satisfactory. To strike a good balance between computation and MVF accuracy, this paper proposes a Spatial Prediction-based Motion-Compensated Frame Interpolation (SP-MCFI) algorithm. The contributions of SP-MCFI are listed as follows:
Experimental results demonstrate that the proposed SP-MCFI algorithm generates visually pleasing up-converted video while maintaining low computational complexity.