The Effect of Focal Length Variations on Convolutional Neural Network-Based Fabric Classifications

Gutierrez, Jhamil; Villaverde, Jocelyn

doi:10.3390/engproc2026134057

Open Access Proceeding Paper

The Effect of Focal Length Variations on Convolutional Neural Network-Based Fabric Classifications^†

by Jhamil Gutierrez ^1,2,* and Jocelyn Villaverde ^1

^1 School of Graduate Studies, Mapúa University, Manila 1002, Philippines

^2 School of Engineering and Technology, National University, Manila 1008, Philippines

^* Author to whom correspondence should be addressed.

^† Presented at the 7th Eurasia Conference on IoT, Communication and Engineering 2025 (ECICE 2025), Yunlin, Taiwan, 14–16 November 2025.

Eng. Proc. 2026, 134(1), 57; https://doi.org/10.3390/engproc2026134057

Published: 16 April 2026

(This article belongs to the Proceedings of The 7th Eurasia Conference on IoT, Communication and Engineering 2025 (ECICE 2025))


Abstract

This study investigated the impact of image capture distance on the performance of convolutional neural networks (CNNs) in classifying fabrics. Unlike previous works that rely solely on digital zoom and data augmentation to simulate multi-scale variations, this research explores the use of images physically captured at far, mid-range, and near focal lengths using a camera fitted with a varifocal lens. Fabric samples from three categories (cotton, linen, and silk) were imaged under consistent lighting to create a dataset of 1350 images, which was used to train CNN models via transfer learning with MobileNetV2 and ResNet50 as the base architectures. Classification performance was evaluated separately on each focal subset and on the combined dataset to test the generalization capability of the trained models. Training on the combined dataset yielded an average absolute accuracy gain of 20.57 percentage points for MobileNetV2 and 9.78 percentage points for ResNet50, with average accuracies of 98.42% for MobileNetV2 and 96.30% for ResNet50.

Keywords: fabric classification; convolutional neural network; computer vision; Raspberry Pi; MobileNetV2; ResNet50

1. Introduction

CNNs have revolutionized many domains in computer vision, including object recognition, facial detection, and texture classification [1]. One emerging area where CNNs show strong promise is the classification, detection, and identification of textiles, specifically fabrics [2,3,4,5,6,7,8,9,10,11,12], which contributes to textile manufacturing, automated quality inspection, and smart material recognition. With growing interest in identifying and classifying fabrics, image-based approaches can provide a non-invasive, low-cost, and scalable solution. However, current implementations overlook the physical image acquisition process, particularly the effect of the focal length and distance at which images are captured. This study is based on the premise that real-world image capture conditions, such as focal length variation, could significantly influence CNN performance by diversifying the level of texture detail captured in fabric images.

Recent studies in fabric classification have largely used CNNs either to classify and identify fabrics or to detect faults, defects, and damage. On the classification side, the researchers in Ref. [3] classified different blends of Abaca fabrics, while pineapple fabrics and pineapple-cotton fabric blends were the focus of the CNN-based classification systems in Refs. [6,7]. Similarly, barong Tagalog textiles were identified using CNNs in Ref. [8]. Other research aimed to improve textile manufacturing processes by developing systems to detect fabric faults and defects: Ref. [4] detected fabric anomalies to enhance textile quality assurance, Ref. [5] applied detection in a real-world manufacturing setup, Ref. [9] explored the classification of woven fabric faults, and Ref. [10] detected fabric defects using deep learning. Although these works demonstrate the effectiveness of CNNs in fabric classification and identification tasks, they do not address hardware-based image acquisition techniques or methodologies. Most rely heavily on digital zoom and scaling as proxies for image diversity, while treating all images as uniformly representative of the target material.

While such methods have yielded promising results in certain contexts, the results in Ref. [8] indicate limited accuracy: the system trained to identify barong Tagalog textiles achieved an accuracy of only 71.10%. This highlights the need for improved image acquisition strategies to enhance model generalization. Moreover, prior work has not examined how optical variations at the point of capture, particularly differences in camera distance and focus, influence classification performance. Evidence from other domains, such as medical imaging and remote sensing, shows that physical variations in image scale and focus significantly affect CNN learning [12,13]. However, similar investigations in fabric recognition remain scarce.

Brown et al. [14] analyzed the impact of camera choice on image classification tasks by evaluating six different cameras used to capture dataset images. Nevertheless, they did not consider the potential effects of lens attachments. Therefore, it is necessary to explore the role of physical capture configurations, such as focal length variation, in augmenting or improving fabric classification performance. While augmentation techniques simulate scale diversity, they cannot fully replicate the optical effects of genuine focal changes, such as depth of field, resolution shifts, and real-world detail gradients [15,16].

2. Methodology

2.1. Experimental Workflow

The research process flowchart presents the structured experimental workflow designed to investigate the influence of varying focal lengths on CNN-based fabric classification performance (Figure 1). The process begins with image acquisition, in which fabric images are captured under three focal length configurations: far, mid, and near.

This is followed by dataset organization, where captured images are grouped according to their focal length category, and dataset preparation, which involves splitting the organized data into a training and validation subset and a testing subset. The two CNN architectures are then trained separately on each focal length-based dataset. Afterward, the trained models are evaluated using validation and inference accuracy to determine their classification performance on each dataset variant. Finally, the results are compared to identify performance trends across focal lengths and between the two architectures, allowing a detailed analysis of how focal length variation affects CNN-based fabric classification systems.

2.2. Image Acquisition Device

An image acquisition prototype, identical to that described in Ref. [17] as part of the authors' previous work, was designed and developed to facilitate fabric image collection. The annotated design of the device is shown in Figure 2a. The device measures 17.78 × 17.78 × 33.02 cm (length × width × height).

The camera lens is positioned 8.94 cm above the sample platform, providing the optimal focal distance required for high-detail fabric image capture. Additionally, a 1.27 cm insertion gap is integrated at the base of the device to facilitate easy placement and alignment of fabric samples. Image acquisition is performed using a 4-megapixel high-definition USB camera with a native resolution of 2560 × 1440 to enable detailed capture of fabric textures. Illumination is provided by USB-powered LED strip lights, delivering stable 5 V power directly from the Raspberry Pi without an external supply. For user interaction, a 7-inch touchscreen LCD (800 × 480 resolution) is used, eliminating the need for peripheral input devices. The camera is equipped with a 5–50 mm varifocal lens, which supports adjustable focal lengths to enhance image sharpness and emphasize fine fabric details.

The experimental image acquisition setup was fabricated using 3D printing (Figure 3a). The device accommodates a minimum fabric sample dimension of 5.08 × 5.08 cm, defined by the lens' maximum zoom capability and field-of-view constraints. For improved handling and alignment, larger samples extending beyond the device width are recommended. Additionally, the prototype enclosure incorporates a front access opening, facilitating precise lens adjustments without requiring system disassembly. A custom-designed GUI for the image acquisition device is presented in Figure 3b. The interface features a capture button alongside progress indicators that display the progress of the current focal length configuration batch and the total number of images captured per class during each data collection session. Each focal length configuration batch is set to capture exactly 100 images, and the system automatically halts further image storage once the target count of 300 images for that class is reached.

2.3. Data Gathering

An image dataset of 1350 fabric images was prepared by capturing random partial views of each fabric sample. Figure 4 shows representative images from each class and focal length configuration.

The dataset is split into a combined training and validation subset and a dedicated testing subset. The training and validation subset contains 900 images: 300 per fabric class, evenly split into far, mid, and near views (100 each) according to the applied focal length configuration. The testing subset contains 450 images: 150 per class, split 50:50:50 across the three focal length configurations. It is used specifically to test the model trained on the combined training and validation subset.
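The split arithmetic above can be checked with a short sketch. The constant and function names are illustrative assumptions; only the counts come from the text.

```python
# Per-class, per-focal-length image counts as stated in the text.
CLASSES = ["Cotton", "Linen", "Silk"]
FOCAL_CONFIGS = ["Far", "Mid", "Near"]
TRAIN_VAL_PER_CELL = 100  # training + validation images per (class, focal length)
TEST_PER_CELL = 50        # testing images per (class, focal length)


def subset_size(per_cell):
    """Total images in a subset holding `per_cell` images per (class, focal) pair."""
    return per_cell * len(CLASSES) * len(FOCAL_CONFIGS)


train_val_total = subset_size(TRAIN_VAL_PER_CELL)
test_total = subset_size(TEST_PER_CELL)
dataset_total = train_val_total + test_total

print(train_val_total, test_total, dataset_total)
```

The three totals should agree with the 900, 450, and 1350 images reported for the training and validation subset, the testing subset, and the full dataset.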

2.4. Model Architectures

To comprehensively evaluate the impact of focal length variations on classification performance, this study employs two widely recognized CNN architectures: MobileNetV2 and ResNet50. These models were selected to represent two distinct architectural philosophies, lightweight networks and deep residual networks, for a diverse performance comparison under the same experimental conditions. MobileNetV2 is designed for computational efficiency and optimized for deployment on resource-constrained devices. It utilizes inverted residual blocks and depthwise separable convolutions, reducing the number of parameters while maintaining competitive accuracy, making it suitable for lightweight yet effective feature extraction [18]. ResNet50 employs residual connections, a technique that allows the network to train effectively even at considerable depth by mitigating the vanishing gradient problem. Its 50-layer architecture is structured using bottleneck residual blocks, which improve computational efficiency while enabling the model to learn highly abstract and complex feature hierarchies. This capability is advantageous for texture recognition tasks, where variations in focal length subtly alter spatial detail and feature distribution. By maintaining representational strength across layers, ResNet50 remains robust to such variations, improving the model's ability to generalize to images captured at different distances and zoom levels [19].
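To illustrate why depthwise separable convolutions reduce the parameter count, the following sketch compares the weights of a standard convolution with those of a depthwise convolution followed by a 1 × 1 pointwise convolution. The layer sizes (3 × 3 kernel, 32 input channels, 64 output channels) are arbitrary illustrative values, not taken from the study, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (biases ignored)."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes only; not taken from the paper.
standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
print(standard, separable, round(standard / separable, 2))
```

For this hypothetical layer the separable form needs roughly one eighth of the weights, which is the kind of saving that makes MobileNetV2 attractive on resource-constrained devices.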

2.5. Model Training

The process flow of the model training is shown in Figure 5. The process begins with the image dataset as input, followed by preparation for augmentation and preprocessing. To enhance model generalization and mimic real-world variability, several augmentation techniques were applied, including random horizontal and vertical flips, minor rotations, shifts in width and height, brightness adjustments, and shear transformations, while excluding zoom variations to align with the study’s objectives. All augmentations employed nearest-neighbor filling to preserve edge integrity. Model reconstruction via transfer learning varied only in the preprocessing configurations specific to each architecture.
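The augmentation settings described above could be expressed as a configuration along the following lines. This is a sketch under assumptions: the keys follow the parameter names of Keras's `ImageDataGenerator`, which the source does not name as the tool used, and every numeric value is a hypothetical placeholder because the paper does not report the ranges.

```python
# Hypothetical augmentation settings. Keys mirror Keras ImageDataGenerator
# parameter names (an assumption); numeric values are illustrative only.
AUGMENTATION_KWARGS = {
    "horizontal_flip": True,         # random horizontal flips
    "vertical_flip": True,           # random vertical flips
    "rotation_range": 10,            # minor rotations, in degrees (assumed value)
    "width_shift_range": 0.1,        # fraction of image width (assumed value)
    "height_shift_range": 0.1,       # fraction of image height (assumed value)
    "brightness_range": (0.8, 1.2),  # brightness adjustment (assumed value)
    "shear_range": 0.1,              # shear intensity (assumed value)
    "zoom_range": 0.0,               # zoom excluded, per the study's objectives
    "fill_mode": "nearest",          # nearest-neighbor filling
}

for name, value in AUGMENTATION_KWARGS.items():
    print(f"{name} = {value!r}")
```

Leaving `zoom_range` at zero keeps scale variation purely optical, so any multi-scale diversity in training comes from the varifocal lens rather than from synthetic zoom.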

For MobileNetV2, the model was initialized with pretrained ImageNet weights, with its depthwise separable convolutional backbone initially frozen to retain robust low-level and mid-level feature extraction capabilities. Input images were resized to 224 × 224 pixels and normalized to the [0, 1] range to align with MobileNetV2's expected preprocessing. ResNet50 was likewise initialized with pretrained ImageNet weights, with its deep residual convolutional layers initially frozen to preserve the pretrained hierarchical feature mappings. Input images were resized to 224 × 224 pixels and processed using the ResNet50-specific input preprocessing function.
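The difference between the two preprocessing routes can be sketched on a single RGB pixel. The [0, 1] scaling follows the text. The ResNet50 route assumes the Keras convention (RGB to BGR reordering and subtraction of the ImageNet channel means, without scaling); the source does not name the framework, so treat that part as an assumption. Function names are illustrative.

```python
# ImageNet channel means in BGR order, as used by Keras "caffe"-style preprocessing.
IMAGENET_BGR_MEAN = (103.939, 116.779, 123.68)


def scale_to_unit_range(pixel_rgb):
    """Scale 8-bit RGB values into [0, 1], as described for MobileNetV2."""
    return tuple(channel / 255.0 for channel in pixel_rgb)


def caffe_style_preprocess(pixel_rgb):
    """Assumed ResNet50 route: reorder RGB to BGR, then subtract channel means."""
    r, g, b = pixel_rgb
    bgr = (b, g, r)
    return tuple(channel - mean for channel, mean in zip(bgr, IMAGENET_BGR_MEAN))


pure_red = (255, 0, 0)
print(scale_to_unit_range(pure_red))
print(caffe_style_preprocess(pure_red))
```

The same pixel ends up in very different numeric ranges under the two routes, which is why each pretrained backbone must be paired with its own preprocessing.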

2.6. Cross-Dataset Evaluation

To assess the cross-generalization capability of CNNs trained on fabric images captured at varying focal distances, four distinct datasets were used: Far, Mid, Near, and Combined, each containing three fabric classes. For each dataset, the network was trained exclusively on its corresponding training split and validated on its respective validation set to monitor convergence. Following training, the model underwent cross-domain evaluation, in which it was tested on the remaining datasets, with no overlap between training and test images, to simulate real-world shifts in image acquisition conditions. This ensured that a model trained on, for example, the Far dataset was never tested on the Far test split, but instead on the Mid, Near, and combined testing sets.
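The evaluation pairing can be sketched as a small plan builder. It assumes, following the rows of Table 2, that each single-focal-length model is tested on the other two focal sets plus the dedicated CombinedTesting set, and that the Combined model is tested on all four. The function name is illustrative.

```python
FOCAL_SETS = ["Near", "Mid", "Far"]


def evaluation_plan():
    """Map each training dataset to the datasets it is tested on (no self-testing)."""
    plan = {}
    for train_set in FOCAL_SETS:
        others = [s for s in FOCAL_SETS if s != train_set]
        plan[train_set] = others + ["CombinedTesting"]
    # The Combined model is tested on every focal set and the dedicated test set.
    plan["Combined"] = FOCAL_SETS + ["CombinedTesting"]
    return plan


plan = evaluation_plan()
for train_set, test_sets in plan.items():
    print(train_set, "->", test_sets)
print("total train/test pairs:", sum(len(t) for t in plan.values()))
```

The plan produces 13 train/test pairs per architecture, matching the 13 numbered rows of Table 2.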

3. Results and Discussion

MobileNetV2 and ResNet50 were trained and evaluated under varying focal length conditions in the collected image datasets. Each model was trained on one dataset and evaluated on the remaining datasets, enabling cross-domain generalization assessment. The classification performance, measured primarily through accuracy, is presented and compared to identify the effects of focal distance variation and architectural differences on model performance. The Combined dataset refers to the union of the Far, Mid, and Near datasets, while CombinedTesting is the dedicated testing dataset that represents all three focal length variations. The models were trained under uniform conditions to ensure fair and unbiased performance evaluation. No target accuracy was set and no fine-tuning stage was applied, providing a consistent performance benchmark across architectures. Training was limited to a maximum of 10 epochs, with an early stopping patience of 3 epochs to halt training if no improvement in validation accuracy was observed.
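The stopping rule can be sketched in plain Python as follows. The validation accuracies are made-up numbers used only to exercise the logic, and the function name is illustrative; the 10-epoch cap and patience of 3 come from the text.

```python
def run_early_stopping(val_accuracies, max_epochs=10, patience=3):
    """Return (best_epoch, best_accuracy, last_epoch_run) for a validation history.

    Training stops after `patience` consecutive epochs without improvement in
    validation accuracy, or after `max_epochs`, whichever comes first.
    """
    best_acc = float("-inf")
    best_epoch = 0
    last_epoch = 0
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_accuracies[:max_epochs], start=1):
        last_epoch = epoch
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_acc, last_epoch


# Hypothetical validation accuracies, for illustration only.
history = [0.80, 0.90, 0.95, 0.94, 0.95, 0.93, 0.99]
print(run_early_stopping(history))
```

With this made-up history, training halts at epoch 6 and the best weights are those of epoch 3, so the later 0.99 is never reached. This is the trade-off a short patience window makes.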

Table 1 presents the validation accuracy of the trained models, based on an 80/20 training/validation split. All models demonstrated strong performance in classifying fabric images captured under consistent focal length and zoom configurations. However, such conditions rarely hold in practical applications.

More realistic results are obtained under cross-dataset evaluation, as shown in Table 2. MobileNetV2, as a lightweight architecture, demonstrated an advantage when trained on the combined dataset, achieving near-perfect accuracy across all test sets: Far (98.67%), Mid (99.67%), Near (99.33%), and CombinedTesting (96.00%), with an overall average accuracy of 98.42%. In contrast, models trained on single focal length datasets exhibited reduced performance when tested on different focal lengths, indicating limited adaptability. ResNet50 also improved when trained on the combined dataset. Its single focal length models generally performed well, with the notable exception of the Mid-trained model, which averaged only 74.22%. These results suggest that incorporating diverse focal lengths in the training set enhances generalization capability, and they highlight the importance of focal length diversity in training data for real-world fabric classification tasks, specifically when using MobileNetV2 and ResNet50 as the CNN base architecture. Equation (1) was used to determine the absolute accuracy gain of each model, with the results summarized in Table 3.

G_{\mathrm{train}}^{(T)} = A_{\mathrm{Combined}}^{(T)} - A_{\mathrm{train}}^{(T)} \qquad (1)

The MobileNetV2-based models showed an average absolute accuracy gain of 20.57 percentage points, while the ResNet50-based models showed 9.78 percentage points.
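As a worked check of Equation (1), the sketch below recomputes the per-model gains and their averages from the baseline and combined average accuracies reported in Table 3. The dictionary layout and function names are illustrative assumptions; the numbers are those in the table.

```python
# Average accuracies (%) from Table 3.
BASELINE = {
    "MobileNetV2": {"Near": 63.48, "Mid": 84.40, "Far": 85.67},
    "ResNet50": {"Near": 92.22, "Mid": 74.22, "Far": 93.11},
}
COMBINED = {"MobileNetV2": 98.42, "ResNet50": 96.30}


def accuracy_gains(model):
    """Equation (1): combined accuracy minus each single-focal baseline, in points."""
    return {
        train_set: round(COMBINED[model] - baseline, 2)
        for train_set, baseline in BASELINE[model].items()
    }


def average_gain(model):
    """Unweighted mean of the three per-training-set gains."""
    gains = accuracy_gains(model)
    return round(sum(gains.values()) / len(gains), 2)


for model in BASELINE:
    print(model, accuracy_gains(model), average_gain(model))
```

The recomputed values should agree with the gains listed in Table 3, including the average absolute gains of 20.57 and 9.78 percentage points.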

4. Conclusions and Recommendations

Incorporating focal length variations in the dataset acquisition setup significantly influences the overall performance of CNN-based classification. Using a varifocal lens image acquisition technique, datasets were generated at multiple focal distances and evaluated with two CNN architectures, MobileNetV2 and ResNet50. Cross-dataset evaluations revealed that the combined focal length dataset consistently delivered higher results, with MobileNetV2 achieving 98.42% average testing accuracy and ResNet50 reaching 96.30%. The improvement was most noticeable for MobileNetV2, which recorded an average absolute accuracy gain of 20.57 percentage points, while ResNet50 achieved an average absolute accuracy gain of 9.78 percentage points. These results indicate that integrating focal length-based image diversity in datasets improves model generalization capabilities. Future research should extend this approach to other material recognition domains, investigate additional optical and hardware techniques to enrich dataset variability, and evaluate focal length-based datasets across a broader range of CNN architectures. Furthermore, validating the scalability of varifocal lens methods in diverse computer vision tasks and expanding fabric dataset diversity could further improve CNN classification performance.

Author Contributions

Conceptualization, J.G.; methodology, J.G.; software, J.G.; resources, J.G.; writing—original draft preparation, J.G.; writing—review and editing, J.G.; supervision, J.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset created and used is publicly archived in Kaggle and is accessible through https://doi.org/10.34740/kaggle/ds/9263699.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CNN	Convolutional Neural Network
USB	Universal Serial Bus
GUI	Graphical User Interface
LCD	Liquid Crystal Display

References

1. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 6999–7019.
2. Pribowo, P.; Syarif, A.; Lumbanraja, F.R. Implementation of Convolutional Neural Network (CNN) Approach for Classification of Lampung Textile (Tapis). In Proceedings of the 5th International Conference on Applied Sciences, Mathematics, and Informatics (ICASMI 2024); Springer: Berlin/Heidelberg, Germany, 2025; pp. 69–80.
3. Cinco, C.D.; Dominguez, L.M.R.; Villaverde, J.F. Abaca Blend Fabric Classification Using YOLOv8 Architecture. Eng. Proc. 2025, 92, 42.
4. Murugan, K.; Vigneesh, A.H.; Sree, N.U. Enhancing Textile Quality Assurance with TensorFlow: Detecting Fabric Anomalies. In Proceedings of the 2nd International Conference on Applied Artificial Intelligence and Computing (ICAAIC), Salem, India, 4–6 May 2023; pp. 1548–1552.
5. Nasim, M.; Mumtaz, R.; Ahmad, M.; Ali, A. Fabric defect detection in real world manufacturing using deep learning. Information 2024, 15, 476.
6. Villaverde, J.F.; Ferrer, M.D.; Macabeo, J.A.T.; Masilungan Manuel, J.T. Classification of Cotton Fabric, Pineapple Fabric and Cotton Pineapple Blend Fabric with VGG16 using Keras. In Proceedings of the IEEE 15th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM), Coron, Philippines, 19–23 November 2023; pp. 1–6.
7. Ounyoung, N.; Mettripun, N. Classification of Pineapple Fiber Woven Fabrics Based on Convolutional Neural Network. In Proceedings of the Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON), Phuket, Thailand, 22–25 March 2023; pp. 389–392.
8. Totesora, J.B.; Torralba, E.C.; Manlises, C.O. Identifying Barong Tagalog textile using convolutional neural network and support vector machine with structural pattern segmentation. Eng. Proc. 2025, 92, 29.
9. Ashraf, R.; Ijaz, Y.; Asif, M.; Haider, K.Z.; Mahmood, T.; Owais, M. Classification of Woven Fabric Faulty Images Using Convolution Neural Network. Math. Probl. Eng. 2022, 2022, 2573805.
10. Liu, Q.; Wang, C.; Li, Y.; Gao, M.; Li, J. A Fabric Defect Detection Method Based on Deep Learning. IEEE Access 2022, 10, 4284–4296.
11. Ohi, A.Q.; Mridha, M.F.; Hamid, M.A.; Monowar, M.M.; Kateb, F.A. FabricNet: A Fiber Recognition Architecture Using Ensemble ConvNets. IEEE Access 2021, 9, 13224–13236.
12. Sabottke, C.F.; Spieler, B.M. The Effect of Image Resolution on Deep Learning in Radiography. Radiol. Artif. Intell. 2020, 2, e190015.
13. Song, J.; Gao, S.; Zhu, Y.; Ma, C. A survey of remote sensing image classification based on CNNs. Big Earth Data 2019, 3, 232–254.
14. Brown, J.; Nguyen, A.; Raj, N. Effect of Camera Choice on Image-Classification Inference. Appl. Sci. 2025, 15, 246.
15. Alomar, K.; Aysel, H.I.; Cai, X. Data Augmentation in Classification and Segmentation: A Survey and New Strategies. J. Imaging 2023, 9, 46.
16. Kar, O.F.; Yeo, T.; Atanov, A.; Zamir, A. 3D Common Corruptions and Data Augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 18941–18952.
17. Gutierrez, J.G.; Villaverde, J.F. Classification and Identification of Natural Biodegradable Fabrics using Convolutional Neural Network. In Proceedings of the 9th International Conference on Electrical, Telecommunication and Computer Engineering (ELTICOM), Medan, Indonesia, 6–7 November 2025; pp. 168–173.
18. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–24 June 2018; pp. 4510–4520.
19. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.

Figure 1. Research process flow.

Figure 2. Image acquisition device design: (a) Annotated design. (b) Components’ block diagram.

Figure 3. Experimental setup: (a) Image acquisition setup. (b) Graphical user interface (GUI) of the setup.

Figure 4. Data collection: (a) Samples of the captured fabric images. (b) Dataset structure diagram.

Figure 5. Model training flowchart.

Table 1. Validation accuracy of trained MobileNetV2 and ResNet50 models.

Training Dataset	MobileNetV2 Validation Accuracy ¹	ResNet50 Validation Accuracy ¹
Near	100%	100%
Mid	100%	100%
Far	96.67%	100%
Combined	98.33%	93.89%

¹ Accuracy is the highest validation accuracy reached during training, taken from the restored best weights.

Table 2. Cross-dataset evaluation results of the MobileNetV2 and ResNet50 models.

Number	Train Dataset	Test Dataset	MobileNetV2 Accuracy	MobileNetV2 Average	ResNet50 Accuracy	ResNet50 Average
1	Near	Mid	61.00%	63.48%	91.00%	92.22%
2	Near	Far	58.33%		91.67%	
3	Near	CombinedTesting	71.00%		94.00%	
4	Mid	Near	76.33%	84.40%	80.33%	74.22%
5	Mid	Far	91.33%		73.00%	
6	Mid	CombinedTesting	85.56%		69.33%	
7	Far	Near	76.00%	85.67%	91.00%	93.11%
8	Far	Mid	96.33%		97.67%	
9	Far	CombinedTesting	84.67%		90.67%	
10	Combined ²	Near	99.33%	98.42%	99.67%	96.30%
11	Combined ²	Mid	99.67%		99.67%	
12	Combined ²	Far	98.67%		94.33%	
13	Combined ²	CombinedTesting	96.00%		91.16%	

² The Combined dataset was not used to test the Near-, Mid-, and Far-trained models, to avoid overlap between training and test images.

Table 3. Accuracy gains of the trained MobileNetV2 and ResNet50 models.

Training Set	MobileNetV2 Baseline Accuracy	MobileNetV2 Combined Accuracy	MobileNetV2 Accuracy Gain ³	ResNet50 Baseline Accuracy	ResNet50 Combined Accuracy	ResNet50 Accuracy Gain ³
Near	63.48%	98.42%	34.94	92.22%	96.30%	4.08
Mid	84.40%	98.42%	14.02	74.22%	96.30%	22.08
Far	85.67%	98.42%	12.75	93.11%	96.30%	3.19
Average absolute gain			20.57			9.78

³ Accuracy gains are represented in percentage points (absolute).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
