Abstract
Pre-trained models play an important role in many tasks, such as domain adaptation and out-of-distribution generalization, by transferring learned knowledge. In this paper, we study Neural Architecture Search (NAS) at the feature level and observe that the low-level features of NAS-based networks (networks generated from a NAS search space) stabilize early in training. In addition, these low-level features are similar to those of hand-crafted networks such as VGG, ResNet, and DenseNet, and this phenomenon is consistent across different search spaces and datasets. Motivated by these observations, we propose a new architectural method for NAS, called Knowledge-Transfer NAS, which reuses the features of a pre-trained hand-crafted network. Specifically, we replace the first few cells of NAS-based networks with pre-trained, manually designed blocks, freeze them, and train only the remaining cells. We perform extensive experiments with various NAS algorithms and search spaces, and show that Knowledge-Transfer NAS achieves higher or comparable performance with a smaller memory footprint and less search time, offering a new perspective on how pre-trained models can improve NAS algorithms.
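To make the core idea concrete, the following is a minimal PyTorch-style sketch of replacing the first few cells with a frozen, pre-trained hand-crafted stem and training only the remaining cells. The choice of ResNet-18 as the stem, the placeholder cell modules, the channel widths, and the optimizer settings are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models


class KnowledgeTransferNet(nn.Module):
    """Sketch: a frozen, pre-trained hand-crafted stem followed by
    trainable searched cells (placeholder modules here)."""

    def __init__(self, searched_cells: nn.ModuleList, num_classes: int = 10):
        super().__init__()
        # Reuse the early layers of a pre-trained hand-crafted network
        # (ResNet-18 here, chosen only for illustration) in place of the
        # first few NAS cells, and freeze their parameters.
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.frozen_stem = nn.Sequential(
            backbone.conv1, backbone.bn1, backbone.relu,
            backbone.maxpool, backbone.layer1,   # 64 output channels
        )
        for p in self.frozen_stem.parameters():
            p.requires_grad = False

        # Only the remaining (searched) cells and the classifier are trained.
        self.cells = searched_cells
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():  # frozen stem: no gradient bookkeeping needed
            x = self.frozen_stem(x)
        for cell in self.cells:
            x = cell(x)
        return self.head(x)


# Usage: placeholder "cells" stand in for the searched architecture; only
# parameters with requires_grad=True are handed to the optimizer.
cells = nn.ModuleList(nn.Conv2d(64, 64, 3, padding=1) for _ in range(4))
model = KnowledgeTransferNet(cells)
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.025, momentum=0.9
)
```

Because the frozen stem is excluded from the optimizer and runs without gradient tracking, both the trainable parameter count and the backward-pass memory shrink, which is where the reported savings in memory footprint and search time would come from.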