Open Access Article
Future Internet 2018, 10(12), 124; https://doi.org/10.3390/fi10120124

A Method for Filtering Pages by Similarity Degree based on Dynamic Programming

Z. Deng 1,2,* and T. He 2
1 College of Economics and Trade, Changsha Commerce & Tourism College, Changsha 410116, China
2 National Supercomputing Center in Changsha, Hunan University, Changsha 410116, China
* Author to whom correspondence should be addressed.
Received: 20 October 2018 / Revised: 8 December 2018 / Accepted: 11 December 2018 / Published: 13 December 2018

Abstract

To extract target webpages from a large collection of webpages, we propose a Method for Filtering Pages by Similarity Degree based on Dynamic Programming (MFPSDDP). The method relies on one of three "same relationships" between two nodes, which we define. The main innovation of MFPSDDP is that it does not need to know the structure of the webpages in advance. First, we describe the design, which uses a queue and two threads. Then, we present a dynamic programming algorithm for computing the length of the longest common subsequence and a formula for computing similarity. To obtain the detailed-information webpages from 200,000 webpages downloaded from the website “www.jd.com”, we choose the Completely Same Relationship (CSR) and set the similarity threshold to 0.2. The Recall Ratio (RR) of MFPSDDP ranks in the middle of the four filtering methods compared. When the number of webpages filtered is nearly 200,000, the Precision Ratio (PR) of MFPSDDP is the highest of the four methods, reaching 85.1%, which is 13.3 percentage points higher than the PR of a Method for Filtering Pages by Containing Strings (MFPCS).
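
As a rough illustration of the idea in the abstract, the sketch below computes the length of the longest common subsequence with the standard dynamic programming recurrence and turns it into a similarity score that is compared against the 0.2 threshold. Several parts are assumptions, since the abstract does not give them: pages are represented as flat lists of node labels, plain equality of labels stands in for the Completely Same Relationship, the normalisation `2 * LCS / (len(a) + len(b))` is a common choice and not necessarily the paper's formula, and the names `lcs_length`, `similarity`, and `keep_page` are hypothetical.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of a and b (standard DP)."""
    # prev[j] is the LCS length of the already processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]  # LCS with an empty prefix of b is always 0
        for j, y in enumerate(b, start=1):
            if x == y:
                # Assumption: plain equality stands in for the paper's
                # "Completely Same Relationship" between two nodes.
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed normalisation in [0, 1]; not taken from the paper."""
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def keep_page(candidate: Sequence[str], reference: Sequence[str],
              threshold: float = 0.2) -> bool:
    """Keep a page whose similarity to a reference page meets the threshold.

    Whether the comparison is >= or > is an assumption.
    """
    return similarity(candidate, reference) >= threshold


if __name__ == "__main__":
    reference = ["html", "body", "div", "h1", "span"]
    candidate = ["html", "body", "div", "p", "span"]
    print(lcs_length(candidate, reference))
    print(keep_page(candidate, reference))
```

The two-row formulation keeps memory proportional to the shorter dimension of the table, which matters when many pages are compared; a full table is only needed if the subsequence itself must be recovered.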
Keywords: method for filtering pages; similarity degree; dynamic programming; combination method
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Deng, Z.; He, T. A Method for Filtering Pages by Similarity Degree based on Dynamic Programming. Future Internet 2018, 10, 124.

Future Internet, EISSN 1999-5903. Published by MDPI AG, Basel, Switzerland.