Feature Engineering for Machine Learning

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (28 February 2021)

Special Issue Editor


Prof. Dr. Sejong Oh
Guest Editor
Department of Software Engineering, Dankook University, Yongin-si, Korea
Interests: data analysis; feature engineering; machine learning

Special Issue Information

Dear Colleagues,

Features are the base material from which a learning model is built, and at the same time they are the most important factor affecting the model's performance. Thus, we cannot think of machine learning without feature engineering. There is no golden rule for feature engineering; despite a large body of research, we still need more efficient methodologies for finding effective features for a specific learning model.

This Special Issue on “Feature Engineering for Machine Learning” aims to present recent research on feature engineering and to give insight into building high-performance learning models. Submissions are expected to focus on the theoretical aspects of feature engineering as well as its applications. Review papers are also welcome.

Topics of interest include but are not limited to the following areas:

  • Feature generation/selection/extraction
  • Feature synthesis
  • Dimension reduction
  • Feature importance in a learning model
  • Feature interaction
  • Visualization of a feature-related task
  • Automation of feature engineering
  • Development of a feature engineering tool

I hope this Special Issue serves as a roadmap for all researchers in feature engineering and developers of learning models.

Prof. Dr. Sejong Oh
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (1 paper)


Research

17 pages, 3805 KiB  
Article
Feature-Weighted Sampling for Proper Evaluation of Classification Models
by Hyunseok Shin and Sejong Oh
Appl. Sci. 2021, 11(5), 2039; https://doi.org/10.3390/app11052039 - 25 Feb 2021
Cited by 1
Abstract
In machine learning applications, classification schemes are widely used for prediction tasks. Typically, to develop a prediction model, the given dataset is divided into training and test sets; the training set is used to build the model and the test set is used to evaluate it. Traditionally, random sampling is used to divide the dataset. The problem, however, is that the model's measured performance varies depending on how the training and test sets are divided. Therefore, in this study, we proposed an improved sampling method for the accurate evaluation of a classification model. We first generated numerous candidate train/test splits using the R-value-based sampling method. We then evaluated how closely the distribution of each candidate matches that of the whole dataset, and the candidate with the smallest distribution difference was selected as the final train/test split. Histograms and feature importance were used to evaluate the similarity of distributions.
The proposed method produces more appropriate training and test sets than previous sampling methods, including random and non-random sampling.
(This article belongs to the Special Issue Feature Engineering for Machine Learning)
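
The abstract describes the selection procedure only at a high level. Below is a minimal Python sketch of that idea, with two loudly labeled assumptions: plain random splits stand in for the paper's R-value-based candidate generation, and an importance-weighted per-feature histogram distance stands in for the paper's exact similarity measure. It illustrates the general approach, not the authors' implementation.

    # Sketch: generate candidate train/test splits, score how closely each
    # split's test-set feature distributions match the whole dataset, and
    # keep the closest one.
    # ASSUMPTIONS (not the authors' method): random splits instead of
    # R-value-based sampling; L1 distance between normalized histograms,
    # weighted by random-forest feature importance, as the similarity score.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    def histogram_distance(col_full, col_test, bins=10):
        """L1 distance between normalized histograms of one feature."""
        lo, hi = col_full.min(), col_full.max()
        h_full, _ = np.histogram(col_full, bins=bins, range=(lo, hi))
        h_test, _ = np.histogram(col_test, bins=bins, range=(lo, hi))
        p = h_full / h_full.sum()
        q = h_test / h_test.sum()
        return np.abs(p - q).sum()

    def select_split(X, y, n_candidates=50, test_size=0.3, seed=0):
        # Feature importance from a random forest fit on the whole dataset,
        # used to weight the per-feature histogram distances.
        rf = RandomForestClassifier(n_estimators=100, random_state=seed)
        weights = rf.fit(X, y).feature_importances_

        best, best_score = None, np.inf
        for i in range(n_candidates):
            split = train_test_split(X, y, test_size=test_size,
                                     random_state=seed + i)
            X_tr, X_te, y_tr, y_te = split
            # Weighted sum of distribution differences between the test
            # set and the full dataset, one histogram per feature.
            score = sum(w * histogram_distance(X[:, j], X_te[:, j])
                        for j, w in enumerate(weights))
            if score < best_score:
                best, best_score = split, score
        return best

    # Usage: X_train, X_test, y_train, y_test = select_split(X, y)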