- Article
Cytological Image-Finding Generation Using Open-Source Large Language Models and a Vision Transformer
- Atsushi Teramoto,
- Yuka Kiriyama and
- Hiroshi Fujita
- + 3 authors
In lung cytology, screeners and pathologists examine many cells in cytological specimens and describe the corresponding imaging findings. To support this process, our previous study proposed an image-finding generation model based on convolutional neural networks and a transformer architecture. However, the accuracy of the generated findings still needed improvement. In this study, we developed a cytology-specific image-finding generation model using a vision transformer and open-source large language models. In the proposed method, a vision transformer pretrained on large-scale image datasets and multiple open-source large language models were introduced and connected through an original projection layer. Experimental validation using 1059 cytological images demonstrated that the proposed model achieved favorable scores on language-based evaluation metrics and good classification performance when cells were classified based on the generated findings. These results indicate that a task-specific model is an effective approach for generating imaging findings in lung cytology.
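The abstract does not specify how the projection layer is built, so the following is only a minimal sketch of one common design, assuming a single linear map from the vision transformer's patch-embedding size to the language model's token-embedding size. The function names (`make_projection`, `project`, `build_llm_input`), the dimensions, and the random initialization are all hypothetical and are not taken from the paper.

```python
import random


def make_projection(d_vision, d_llm, seed=0):
    """Create weights for a linear projection (hypothetical, randomly initialized)."""
    rng = random.Random(seed)
    scale = 1.0 / d_vision ** 0.5
    # weight has shape (d_llm, d_vision); bias has shape (d_llm,)
    weight = [[rng.uniform(-scale, scale) for _ in range(d_vision)]
              for _ in range(d_llm)]
    bias = [0.0] * d_llm
    return weight, bias


def project(patch_embeddings, weight, bias):
    """Map each vision-encoder patch embedding into the LLM embedding space."""
    return [
        [sum(w * x for w, x in zip(row, patch)) + b
         for row, b in zip(weight, bias)]
        for patch in patch_embeddings
    ]


def build_llm_input(visual_tokens, text_token_embeddings):
    """Prepend projected visual tokens to the text token embeddings.

    Assumption: the language model consumes one sequence of embeddings in
    which the image tokens come first, followed by the prompt tokens.
    """
    return visual_tokens + text_token_embeddings
```

In a sketch like this, the projected patch embeddings act as extra "tokens" that the language model attends to while it generates the finding text; in practice the projection weights would be learned during training rather than left at their random initial values.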
8 February 2026







