1. Introduction
The Panel Assignment Problem (PAP) is a subclass of the General Assignment Problem (GAP), which involves the allocation of tasks (e.g., reviews) to reviewers while satisfying multiple constraints [1,2]. The PAP is a combinatorial optimization problem commonly found in peer review systems, workforce allocation, and decision-making tasks [3,4]. The primary objectives of solutions to the PAP include fairness, workload balance, expertise maximization, and conflict avoidance. The PAP is NP-hard (nondeterministic polynomial-time hard), meaning that no known algorithm is guaranteed to find an optimal solution in polynomial time. Such NP-hard problems are widespread in operations research and manufacturing industries, where they arise in various operational and strategic decision-making scenarios. Developing and implementing effective solution approaches, including optimization algorithms, heuristic strategies, and Artificial Intelligence (AI)-based techniques, can therefore provide substantial advantages. By utilizing the methods proposed in this study, industries can enhance operational efficiency and sustain competitiveness in an increasingly dynamic environment. Several optimization and heuristic techniques have been proposed to solve the PAP, including the following: (1) greedy algorithms [5], which make locally optimal choices at each step but do not guarantee global optimality; (2) optimization-based Integer Linear Programming (ILP) [6], which provides globally optimal solutions but can be computationally expensive for larger problem sizes; (3) Constraint Programming (CP) [7], which handles complex constraints efficiently but can be slow for large problems; (4) heuristic methods (e.g., simulated annealing) [8], which are computationally efficient but may not yield optimal results; (5) bipartite matching (e.g., the Hungarian algorithm) [9], which can provide an optimal assignment in polynomial time; (6) Branch and Bound [10], which guarantees an optimal solution but is slow for large-scale problems; (7) genetic algorithms [3,11,12], which are search-based optimization techniques inspired by natural selection that can explore a large parametric space but can be slow to converge; and (8) hybrid approaches [4,8,13], which combine the techniques listed above to leverage their complementary strengths.
1.1. Related Work and Motivation
Natural Language Processing (NLP) and, within it, Large Language Models (LLMs) [2,14,15,16] present an alternative paradigm to optimization- and heuristic-based approaches by leveraging data-driven capabilities to enhance efficiency, scalability, and flexibility in solving the PAP. Since panel assignments often depend on document similarity, reviewer expertise, and content matching, NLP techniques can be highly effective in automating this process [17,18,19,20,21,22,23,24,25]. One of the primary challenges in the PAP is determining the relevance between a document and a reviewer’s expertise. In traditional methods, this is usually determined by obtaining a preference matrix from the reviewers, who are tasked with indicating their interest in reviewing each proposal/document [26]. These entries are called rankings and are denoted by the integer values 0, 1, 2, and 3, where 0 indicates a conflict of interest (COI), meaning the reviewer cannot be assigned that proposal, and 1, 2, and 3 indicate high, medium, and low preference, respectively [26]. A different scale may also be used (e.g., 0 to 10) without changing the logic or methodology.
However, obtaining rankings/preferences from reviewers can introduce bias and compliance issues, potentially compromising the fairness and efficiency of the panel assignment process. One major concern is subjectivity and bias in self-reported rankings/preferences. Reviewers may intentionally or unintentionally inflate or deflate their preferences based on personal biases, unofficial collaborations, or other motivations. This can lead to preference misrepresentation, where certain reviewers favor specific proposals while avoiding others, even when they may be well-suited for an objective evaluation. Another challenge is inconsistent and incomplete ranking/preference submissions, where reviewers may fail to provide submissions on time, leaving assignment coordinators with missing or incomplete data. In such cases, the absence of preference scores can lead to suboptimal assignments. Furthermore, time constraints and high workload demands may lead some reviewers to submit arbitrary or rushed preferences, further reducing the reliability of user-preference-based assignments. Given these challenges, automated similarity-based matching using NLP and Large Language Models (LLMs) offers a promising alternative by reducing dependence on subjective rankings, potentially resulting in a more objective, scalable, and consistent panel assignment process. NLP-based document similarity methods include TF-IDF (term frequency-inverse document frequency) combined with cosine similarity, which are traditional approaches that measure lexical similarity but may struggle with synonyms and contextual meaning. Other NLP-based methods include word embeddings (e.g., Word2Vec, GloVe, FastText), which can capture semantic relationships between words, offering more nuanced similarity comparisons. Transformer-based models (e.g., BERT, RoBERTa, GPT) are also part of the NLP suite of methods, and these can provide deep contextual embeddings that enable highly accurate semantic similarity detection [27,28,29,30,31,32,33].
1.2. Research Objectives
In this study, we present a novel approach to the PAP by leveraging NLP-based document similarity and AI-driven decision making to augment the optimization-based solution of the Panel Assignment Problem. Panel assignments in organizations such as the U.S. National Science Foundation (NSF) and other proposal review agencies require careful alignment between reviewer expertise and proposal content whilst also adhering to multiple constraints such as workload balance, conflict of interest (COI), fairness, and minimization of bias. The complexity of the PAP is further increased when additional reviewer roles, such as the lead reviewer (who introduces the proposal) and the scribe reviewer (who takes notes and writes up the summary), must be assigned optimally within a panel alongside the ordinary reviewer role, because the workload for a lead or scribe is higher than that of an ordinary reviewer. Ensuring that assignments are optimal and feasible also presents a significant computational challenge, especially for large problem sizes. We propose an alternative approach to solving the multi-level PAP using NLP-based similarity detection in combination with integer programming (optimization) rather than relying on manually provided preference rankings. We then evaluate how well NLP techniques can automatically determine reviewer–proposal relevance while maintaining fairness and reducing biases.
2. Methods
This section outlines the methodology used for NLP-based similarity detection and optimization for automating optimal panel assignments. The process involves multiple stages such as document processing, text extraction, preprocessing, embedding generation, tokenization, similarity computation, optimization, and assignment visualization.
2.1. Text Extraction and Preprocessing
To accurately match proposals with reviewers, the first step is extracting meaningful textual data from uploaded documents. Since the documents are typically in PDF format, we employ two primary text-extraction techniques. For regular text extraction, the PyMuPDF library (version 1.26.3) is used to extract text from the PDFs directly. PyMuPDF is a high-performance library for Python (version 3.11.7 was used) for data extraction, analysis, conversion, and manipulation of PDF files. PyMuPDF provides a robust framework for reading, processing, and manipulating PDFs while preserving their original structure and formatting. This method is effective for digitally created PDFs where text is embedded in a machine-readable format. If successful, the extracted text is passed to the preprocessing step.
If direct extraction fails (e.g., due to scanned documents or image-based content), we use optical character recognition (OCR) to extract text from the documents. The document pages are first converted into images with enhanced contrast to improve text recognition. The OCR engine extracts text from each page, ensuring that even non-machine-readable PDFs are processed. The extracted text is then structured into a format suitable for NLP-based similarity analysis. To perform the OCR, we use EasyOCR, a Python-based OCR module. EasyOCR is optimized for speed while maintaining high accuracy, making it suitable for real-time document processing. The library automatically enhances contrast and sharpness, improving text recognition without extensive pre-processing. To determine when to switch from regular text extraction to OCR-based text extraction, we compute the stopword ratio of the document. Stopwords are commonly used words in a language (e.g., ‘the’, ‘is’, ‘and’, ‘of’, ‘in’, ‘at’) that typically carry minimal or no meaning in text analysis. These words are frequently filtered out in NLP tasks to improve computational efficiency and focus on informative terms. The stopword ratio is defined as the ratio of the number of stopwords in the extracted text to the total number of words in the extracted text. A higher stopword ratio indicates that the extracted text is natural and complete, as stopwords are expected to be present in well-formed sentences. A very low stopword ratio suggests incomplete or faulty extraction (missing words, fragmented text, and/or garbled characters), indicating that the regular text extraction was erroneous. In this study, we use a threshold stopword ratio of 0.05: if the stopword ratio falls below 0.05, the code switches from regular text extraction to OCR-based extraction. It is important to attempt regular text extraction first, as OCR-based text extraction is computationally more expensive. Aside from stopword removal, the extracted text also undergoes additional preprocessing steps to enhance accuracy. These include whitespace and formatting normalization, where line breaks and redundant spaces are eliminated to create a clean text representation.
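As a concrete illustration of this fallback logic, the following minimal sketch attempts direct extraction with PyMuPDF and switches to EasyOCR when the stopword ratio falls below the 0.05 threshold; the helper names are ours, the stopword list is abridged, and the contrast enhancement step is omitted.

```python
# Minimal sketch of the extraction fallback described above (illustrative helper
# names; abridged stopword list; contrast enhancement omitted).
import fitz          # PyMuPDF
import easyocr

STOPWORDS = {"the", "is", "and", "of", "in", "at", "a", "to", "for", "on", "with"}
STOPWORD_RATIO_THRESHOLD = 0.05   # below this, assume direct extraction failed

def stopword_ratio(text: str) -> float:
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in STOPWORDS for w in words) / len(words)

def extract_text(pdf_path: str) -> str:
    doc = fitz.open(pdf_path)
    # 1) Try direct (machine-readable) extraction first; it is much cheaper than OCR.
    text = "\n".join(page.get_text() for page in doc)
    if stopword_ratio(text) >= STOPWORD_RATIO_THRESHOLD:
        return " ".join(text.split())              # normalize whitespace
    # 2) Fall back to OCR: rasterize each page and read it with EasyOCR.
    reader = easyocr.Reader(["en"], gpu=False)
    ocr_lines = []
    for page in doc:
        pix = page.get_pixmap(dpi=200)             # render the page as an image
        ocr_lines.extend(reader.readtext(pix.tobytes("png"), detail=0))
    return " ".join(ocr_lines)
```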
2.2. Text Similarity Computation
To determine the relevance of proposals to reviewers, we compute similarity scores using two different NLP-based methods. In the first method, a pre-trained deep learning model from SentenceTransformers called ‘all-mpnet-base-v2’ is utilized [34]. Each pre-processed document (proposal or reviewer profile) is converted into a numerical vector (embedding) that captures the semantic meaning of the text. Cosine similarity is then computed between the embeddings of the proposals and reviewers to determine relevance. Unlike traditional keyword-based methods, transformers understand context and meaning; they can detect subtle similarities between documents even when different words are used to express similar ideas. The cosine similarity score ranges from −1 to 1. A score of 1 indicates perfect similarity, meaning the texts are identical in meaning, while a score of 0 indicates no similarity, suggesting the texts are completely unrelated. A score of −1 indicates complete dissimilarity, where the texts are maximally opposite in meaning. Since research proposals and reviewer profiles typically belong to related academic domains, we expect the similarity scores to generally be positive or, at most, weakly negative. Higher similarity values are expected when comparing proposals and reviewers within the same field, while lower scores are anticipated when comparing across different disciplines.
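A minimal sketch of this step is shown below, assuming short inputs that fit within the model’s token limit (the chunking strategy for longer documents is described later in this section); the placeholder texts are illustrative.

```python
# Minimal sketch: embed two documents and compute their cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-mpnet-base-v2")

proposal_text = "Catalytic conversion of biomass to renewable fuels ..."             # placeholder
reviewer_text = "Research interests: heterogeneous catalysis, reaction engineering"  # placeholder

embeddings = model.encode([proposal_text, reviewer_text], normalize_embeddings=True)
score = util.cos_sim(embeddings[0], embeddings[1]).item()   # value in [-1, 1]
print(f"cosine similarity = {score:.3f}")
```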
The ‘all-mpnet-base-v2’ model is a pre-trained sentence-embedding model developed as part of the SentenceTransformers library. It is based on MPNet (Masked and Permuted Pre-training for Language Understanding), a state-of-the-art transformer architecture introduced by Microsoft. MPNet combines the strengths of BERT (Bidirectional Encoder Representations from Transformers) and XLNet, making it highly effective for capturing contextual semantics, syntactic dependencies, and long-range relationships in textual data. While the ‘all-mpnet-base-v2’ model does not generate text like full-scale LLMs, it still processes and understands natural language in a deep, contextual way. Given its transformer-based architecture, large-scale pre-training, and ability to encode rich semantic representations, it can be considered a mini LLM specialized for text similarity and retrieval tasks rather than open-ended text generation.
Transformer-based models, such as SentenceTransformer’s ‘all-mpnet-base-v2’, impose strict token length limitations, typically capping input sequences at 512 tokens due to hardware constraints and model architecture. This poses a significant challenge when processing long documents, as exceeding the token limit leads to truncation and the loss of important semantic information. Given that research proposals and reviewer profiles often contain thousands of tokens, processing them in a single pass is infeasible. Furthermore, truncating the document to fit the token limit is not a sound strategy, as key information is often distributed across different sections of the document. To address the token limit, we implement a chunking strategy that ensures complete document representation while maintaining the model’s efficiency and accuracy. The document is first tokenized using the same tokenizer employed by the ‘all-mpnet-base-v2’ model. It is then split into sequential 512-token chunks, ensuring that each segment remains within the model’s processing limits. Each 512-token chunk is individually passed through the transformer model, generating an embedding vector for that specific portion of the text. The final document embedding is then computed as the mean of all the chunk embeddings, which helps to smooth out variability between chunks and ensures a stable document representation. After the cosine similarity scores are computed, they are converted to rankings to standardize the input for the subsequent panel assignment optimization. The conversion is an inverse linear mapping of the form $R_{ij} = 3 - 2\,S_{ij}$, where $S_{ij}$ is the cosine similarity between proposal i and reviewer j and $R_{ij}$ is the corresponding ranking. This ensures that a higher similarity score results in a lower numerical rank (a better match) and that, for similarity scores between 0 and 1, rankings are bounded between 1 (highest preference) and 3 (lowest preference). Conflicts of interest are assigned a ranking of 0. The final output is a rankings matrix in which each proposal is ranked against every reviewer based on the similarity analysis, as opposed to manually obtaining preferences from reviewers.
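The chunking, mean-pooling, and rank-conversion steps can be sketched as follows; the helper names are ours, and the conversion follows the inverse linear mapping stated above.

```python
# Sketch of the chunked document embedding and the similarity-to-ranking conversion.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")
MAX_TOKENS = 512   # model input limit

def document_embedding(text: str) -> np.ndarray:
    # Tokenize with the model's own tokenizer, split into sequential 512-token
    # chunks, embed each chunk, and average the chunk embeddings.
    token_ids = model.tokenizer(text, add_special_tokens=False)["input_ids"]
    chunks = [token_ids[i:i + MAX_TOKENS] for i in range(0, len(token_ids), MAX_TOKENS)]
    chunk_texts = [model.tokenizer.decode(c) for c in chunks]
    chunk_embeddings = model.encode(chunk_texts, normalize_embeddings=True)
    return np.mean(chunk_embeddings, axis=0)

def similarity_to_rank(s: float, coi: bool = False) -> float:
    # Inverse linear mapping: s = 1 -> rank 1 (best), s = 0 -> rank 3 (worst);
    # negative similarities yield ranks above 3, and COIs are coded as 0.
    return 0.0 if coi else 3.0 - 2.0 * s
```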
For this study, we use journal publications (which are publicly available through a valid license if subscription-based or freely available if open-access) as a replacement for proposals since proposal documents are proprietary. Journal publications are a valid replacement as they share contextual similarity with a proposal and have a similar length and similar amount of domain-specific information that can be used for similarity analysis. For reviewer profile documents (to compare against the journal publications), we use (1) Google Scholar profiles of faculty and (2) CVs of faculty, both of which are considered publicly available documents and contain sufficient technical information to be accurately analyzed against the journal publications for similarity matching and analysis.
2.3. Google Scholar Title Extraction
A Google Scholar profile provides a comprehensive summary of a researcher’s academic contributions, citation metrics, and research impact. When printed as a PDF, the profile typically contains key information such as the full name, affiliation, research areas, total citations, H-index, i10-index, citation trends, the title of each paper, author names, journal/conference names, publication years, keywords, and profile URLs. These elements make the Google Scholar profile valuable for reviewer selection, but some of the information contained may be unnecessary and may skew the similarity analysis. As a result, we developed a method to accurately extract only the research publication titles from the Google Scholar profiles. This was performed by leveraging PyMuPDF to detect and retrieve blue-colored text elements. Since Google Scholar consistently formats publication titles in blue, identifying and extracting these elements provides a reliable method for title retrieval.
Each text span in a PDF document is associated with an RGB (red–green–blue) color value. To detect blue text, we ensured that the blue component was greater than both the red and green components. Consecutive blue text spans were merged to capture complete titles, even when they spanned multiple lines, to ensure accurate title extraction. Since only blue-colored text is considered, we avoid capturing irrelevant sections such as author names, affiliations, or citation counts.
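The blue-text rule can be sketched with PyMuPDF’s ‘dict’ text output, in which each span carries a packed sRGB integer color; the function name is ours, and the merging of multi-line titles is only indicated in a comment.

```python
# Sketch: keep only text spans whose blue component dominates red and green.
import fitz  # PyMuPDF

def extract_blue_titles(pdf_path: str) -> list[str]:
    titles = []
    doc = fitz.open(pdf_path)
    for page in doc:
        for block in page.get_text("dict")["blocks"]:
            for line in block.get("lines", []):
                blue_text = []
                for span in line["spans"]:
                    c = span["color"]                      # packed sRGB integer
                    r, g, b = (c >> 16) & 255, (c >> 8) & 255, c & 255
                    if b > r and b > g:                    # "blue" per the rule above
                        blue_text.append(span["text"])
                if blue_text:
                    titles.append(" ".join(blue_text))
    # Consecutive blue lines belonging to one multi-line title would still be merged here.
    return titles
```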
2.4. CV Publication Title Extraction
A researcher’s Curriculum Vitae (CV) provides a detailed overview of their academic background, professional experience, and research contributions. It is typically structured into several key sections containing personal information, education, professional affiliations, research interests, publications, grants and funding, awards and honors, teaching experience, mentorship and supervision, etc. The length of a CV can range from 1 to over 50 pages. As with the title extraction from Google Scholar profiles, we developed a methodology to extract journal paper titles from the CV, eliminating redundant information that may skew the similarity results when a CV is used for the analysis.
The first step is to parse the text of the CV PDF using PyMuPDF, generating structured output that maintains line breaks and spacing to preserve document integrity. The extraction method also ensures compatibility with both single-column and multi-column CV formats. CVs often contain headers, footers, and marginal content that do not contribute to the extraction of research publications. To address this, we set margin thresholds to exclude text near the top and bottom of each page, preventing the inclusion of headers/footers; page numbers and institutional affiliations, commonly found in these regions, are automatically discarded. The first page of a CV typically contains an introductory section, and since publication lists typically appear after this introductory content, we employed a character-based truncation strategy whereby the first 400 characters of the extracted text were skipped to avoid capturing general information. This ensures that only relevant publication entries, which are formatted as numbered lists or bulleted items, are retained. After processing all pages, the extracted text is concatenated into a single structured string, with page breaks preserved to maintain content organization.
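A minimal sketch of the margin filtering and character-based truncation is given below; the margin values are illustrative assumptions, while the 400-character skip follows the text above.

```python
# Sketch of header/footer filtering and the 400-character introductory skip.
import fitz  # PyMuPDF

TOP_MARGIN = 50      # points excluded at the top of each page (headers); illustrative
BOTTOM_MARGIN = 50   # points excluded at the bottom of each page (footers); illustrative

def extract_cv_body(pdf_path: str) -> str:
    doc = fitz.open(pdf_path)
    pages = []
    for page in doc:
        height = page.rect.height
        kept = []
        for x0, y0, x1, y1, text, *_ in page.get_text("blocks"):
            # Keep only blocks that lie fully inside the vertical margins.
            if y0 > TOP_MARGIN and y1 < height - BOTTOM_MARGIN:
                kept.append(text)
        pages.append("\n".join(kept))
    full_text = "\f".join(pages)      # preserve page breaks
    return full_text[400:]            # skip the introductory first 400 characters
```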
Subsequently, the LLaMA-3-8B-8192 model, accessed through the Groq API, was used to extract journal paper titles from the CV document. The LLaMA-3-8B-8192 model is a cutting-edge, large-scale language model from Meta’s LLaMA 3 series, optimized for high-performance natural language processing (NLP) tasks [35]. The model features 8 billion parameters and supports an extended context length of 8192 tokens, making it well-suited for processing long documents such as CVs, research papers, and proposal reviews. The Groq API is a cloud-based AI inference service designed to provide ultra-low latency and high throughput for large language models (LLMs) [36]. It enables the seamless deployment and execution of models like LLaMA-3-8B-8192, making it ideal for NLP tasks that require real-time or high-speed processing [37,38].
A specialized prompt was created to provide the model with examples of correctly extracted journal/research publications. This example-driven few-shot prompting enhanced the model’s ability to generalize across different layouts, and the prompt explicitly guided the model to focus on numbered/bulleted publication lists whilst ignoring other sections of the CV. LLMs have input length constraints, limiting the amount of text that they can process in a single pass. Hence, we implemented an iterative segmentation strategy: the text was split into 13,000-character segments, each segment was processed separately while maintaining contextual continuity, and the results were merged into a structured output array to ensure completeness. We also specified a rigid output format within the prompt, enabling the development of a robust parsing function in which extracted titles are systematically added to an array that can then be used for the post-processing similarity analysis.
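The segmented, few-shot extraction call can be sketched as follows, assuming the Groq Python SDK and a GROQ_API_KEY environment variable; the prompt wording is illustrative rather than the exact prompt used in this study.

```python
# Sketch of segmented, few-shot title extraction via the Groq API (illustrative prompt).
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])
SEGMENT_CHARS = 13_000

FEW_SHOT_PROMPT = (
    "Extract ONLY the titles of journal/research publications from the CV text below.\n"
    "Return one title per line, with no numbering or commentary.\n"
    "Example input: '12. J. Doe, A. Smith, \"Deep learning for process control\", AIChE J., 2021.'\n"
    "Example output: Deep learning for process control\n\n"
    "CV text:\n"
)

def extract_titles(cv_text: str) -> list[str]:
    titles = []
    for start in range(0, len(cv_text), SEGMENT_CHARS):
        segment = cv_text[start:start + SEGMENT_CHARS]
        response = client.chat.completions.create(
            model="llama3-8b-8192",
            messages=[{"role": "user", "content": FEW_SHOT_PROMPT + segment}],
            temperature=0.0,
        )
        # Rigid one-title-per-line output format makes parsing straightforward.
        titles.extend(
            line.strip() for line in response.choices[0].message.content.splitlines()
            if line.strip()
        )
    return titles
```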
2.5. Optimization
To assign reviewers to proposals (or journal papers, as used in this study), we employ an optimization-based Integer Linear Programming (ILP) approach that minimizes the total preference score across assignments and is based on the authors’ prior work [39] and related works [18,40]. Hence, only the pertinent details are reported here; for complete details, please refer to Appendix A. The preference score is determined by a rankings matrix, where each entry represents a reviewer’s suitability for a given proposal. However, the rankings matrix is now based not on preferences obtained from reviewer input but on a document similarity analysis (Section 2.1, Section 2.2, Section 2.3 and Section 2.4), whereby a similarity score is computed by comparing a ‘proposal’ with a reviewer document. In this study, we use a journal paper as the ‘proposal’ (henceforth denoted as ‘proposal’) and either a Google Scholar profile document or a CV as the reviewer document. Higher similarity scores are hypothesized to result in the optimization prioritizing those matches in the order of lead, scribe, and ordinary reviewer, while lower similarity scores will tend to result in unassigned matches. Conflicts between a ‘proposal’ and a reviewer document are entered a priori so that the match is excluded from the optimization. For the optimization, the similarity scores are scaled to a range of 1 to 3 to ensure consistency with the traditional method of obtaining preferences (1 to 3) from reviewers. A similarity score of 1 is scaled to a ranking of 1 and a similarity score of 0 is scaled to a ranking of 3, with the inverse linear rule described in Section 2.2 applied to in-between similarity scores. When two documents are semantically divergent, negative similarity scores can occur, which result in rankings greater than 3. This does not pose any problem for the optimization, which simply treats rankings greater than 3 as less preferred matches.
The optimization problem consists of the objective function (fval), which can be represented as follows:

$fval = \min \sum_{i=1}^{n} \sum_{j=1}^{m} R_{ij}\, X_{ij}$

where n = number of proposals, m = number of reviewers, R is the rankings matrix, where R_ij represents the ranking or preference of assigning reviewer j to proposal i, and X is the assignment matrix, where X_ij = 1 if reviewer j is assigned to proposal i and X_ij = 0 otherwise. The constraints ensure that (1) each proposal is assigned the required number of reviews, (2) reviewers are assigned within minimum and maximum workload limits, and (3) conflicts of interest (COI) are avoided by ensuring reviewers are not assigned proposals for which their ranking is 0.
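For completeness, the three constraints can be written formally as follows, using our own symbols: k denotes the required number of reviews per proposal, and w_min and w_max denote the per-reviewer workload limits.

```latex
% (1) Each proposal i receives exactly k reviews
\sum_{j=1}^{m} X_{ij} = k, \qquad i = 1, \dots, n
% (2) Each reviewer j stays within the workload limits
w_{\min} \le \sum_{i=1}^{n} X_{ij} \le w_{\max}, \qquad j = 1, \dots, m
% (3) Conflicts of interest are excluded
X_{ij} = 0 \quad \text{whenever } R_{ij} = 0
```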
The lead and scribe assignment problems follow the same optimization-based approach, whereby leads are assigned based on reviewer workloads and preference scores, using a round-robin approach for balanced distribution. The optimization is performed using PuLP/CPLEX. PuLP is an open-source Python library for formulating and solving Integer Linear Programming (ILP) and Mixed-Integer Linear Programming (MILP) problems. CPLEX is a commercial solver with a Python interface that handles ILP using a combination of simplex methods, barrier methods, branch-and-bound, and branch-and-cut. The optimization method also handles both the case where the lead and scribe reviewer are the same and the case where they are different.
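A minimal PuLP sketch of the reviewer-assignment ILP is shown below; it covers ordinary reviewer assignments only (the lead/scribe stage and CPLEX solver options are omitted), and the workload limits are illustrative defaults rather than values from the study.

```python
# Minimal PuLP sketch of the assignment ILP (ordinary reviewers only).
import pulp

def assign_reviewers(R, reviews_per_proposal=4, min_load=1, max_load=6):
    n, m = len(R), len(R[0])                          # proposals x reviewers
    prob = pulp.LpProblem("panel_assignment", pulp.LpMinimize)
    X = pulp.LpVariable.dicts(
        "x", [(i, j) for i in range(n) for j in range(m)], cat="Binary"
    )

    # Objective: minimize the total ranking score over all assignments.
    prob += pulp.lpSum(R[i][j] * X[(i, j)] for i in range(n) for j in range(m))

    for i in range(n):                                # required reviews per proposal
        prob += pulp.lpSum(X[(i, j)] for j in range(m)) == reviews_per_proposal
    for j in range(m):                                # per-reviewer workload limits
        load = pulp.lpSum(X[(i, j)] for i in range(n))
        prob += load >= min_load
        prob += load <= max_load
    for i in range(n):                                # conflicts of interest (rank 0)
        for j in range(m):
            if R[i][j] == 0:
                prob += X[(i, j)] == 0

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [(i, j) for i in range(n) for j in range(m) if X[(i, j)].value() == 1]
```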
2.6. Implementation and Deployment
The complete procedure is implemented as an interactive web application using Streamlit (an open-source Python framework for delivering interactive data applications). This allows users to upload proposal-based documents (journal publications) and reviewer portfolio documents (Google Scholar profiles or CVs) as PDFs. To ascertain whether a document is a Google Scholar profile, the highest text line on the first page is examined to confirm whether the phrase ‘Google Scholar’ is present. The system then processes the documents and generates similarity scores and corresponding rankings, and the rankings matrix is used to perform the panel assignment via optimization.
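A minimal Streamlit sketch of this front end is shown below; the widget labels and the processing stub are illustrative and do not reproduce the deployed application.

```python
# Minimal Streamlit sketch of the upload-and-assign workflow (illustrative labels).
import streamlit as st

st.title("NLP-Based Panel Assignment")

proposals = st.file_uploader("Upload proposal PDFs", type="pdf", accept_multiple_files=True)
reviewers = st.file_uploader("Upload reviewer PDFs (Google Scholar profile or CV)",
                             type="pdf", accept_multiple_files=True)

if st.button("Run assignment") and proposals and reviewers:
    # The text extraction, similarity, ranking, and ILP functions from the
    # preceding sections would be called here.
    st.write(f"Processing {len(proposals)} proposals and {len(reviewers)} reviewer documents ...")
```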
3. Results and Discussion
All results were computed using Python. The Python version used was Python 3.10.4, with key libraries such as NumPy, pandas, and scikit-learn used for data preprocessing, along with PuLP for the optimization. The simulations were performed on a 12th Gen Intel(R) Core(TM) i7-1255U CPU @ 1.70 GHz with 16 GB RAM, running the Windows 11 64-bit operating system.
3.1. Document Similarity
We first compared the general text extraction method for Google Scholar (GS) and CV documents against the specialized text extraction methods for the same documents (Section 2.3 and Section 2.4). Results showed no discernible difference in the similarity scores between the two approaches. This can be attributed to the robustness of the sentence transformer model (‘all-mpnet-base-v2’), which effectively captures overall semantic similarity even in the presence of non-technical or less relevant text. These models are designed to focus on the core meaning of the content, making them resilient to extraneous information such as author names, affiliations, or formatting inconsistencies. Given the negligible impact of specialized extraction and the added computational overhead it introduces, we opted to proceed with the general text extraction method.
We then conducted a series of tests to evaluate the robustness of our NLP-based document similarity method using the general extraction method. We created a controlled environment where the ‘proposal’ documents were the journal papers of the corresponding reviewers. This means that in the similarity matrix S, where S_i,j represents the similarity score between proposal i and reviewer j, the diagonal elements S_i,i correspond to the similarity between a reviewer’s own papers and their professional profile documents. Since each ‘proposal’ is authored by the corresponding reviewer (who is one of the authors of the ‘proposal’), we expect the diagonal elements S_i,i for i = 1, 2, …, N to exhibit significantly higher similarity scores than the off-diagonal elements S_i,j where i ≠ j. This experimental design enables us to validate the method’s ability to detect semantic relationships accurately. The diagonal elements represent the self-similarity scores, and the off-diagonal elements represent the dissimilarity scores.
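The diagonal-versus-off-diagonal check can be sketched as follows; the toy similarity matrix is purely illustrative and not drawn from the study’s data.

```python
# Sketch of the validation check: self-similarity (diagonal) entries of S should
# dominate the cross (off-diagonal) entries.
import numpy as np

def self_vs_cross_similarity(S: np.ndarray) -> tuple[float, float]:
    diag = np.diag(S)
    off_diag = S[~np.eye(S.shape[0], dtype=bool)]
    return float(diag.mean()), float(off_diag.mean())

# Toy 3x3 similarity matrix (values are illustrative only).
S = np.array([[0.80, 0.10, 0.05],
              [0.12, 0.75, 0.02],
              [0.03, 0.08, 0.70]])
print(self_vs_cross_similarity(S))   # -> approximately (0.75, 0.067)
```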
For a comprehensive analysis, we consider 40 researchers from two distinct disciplines: Chemical and Biochemical Engineering (20 researchers) and Philosophy (20 researchers). To ensure a balanced evaluation, we analyze 10 ‘proposals’ from each of the 40 researchers. We structure our evaluation into three cases:
Case 1: Examines similarity by comparing proposals and reviewer documents within the Chemical Engineering discipline, with reviewer information derived from (1) Google Scholar profiles and (2) CVs.
Case 2: Examines similarity within the Philosophy discipline, using reviewer information from (1) Google Scholar profiles and (2) CVs.
Case 3: Assesses both similarity and dissimilarity by pairing a mix of ‘proposals’ from Chemical Engineering and Philosophy with reviewer documents (Google Scholar profiles and CVs) from the respective disciplines.
This structured evaluation allows us to test the robustness of our method across distinct subject areas and data sources.
3.1.1. Average Similarity Results
The average self-similarity score across the Philosophy and Chemical Engineering departments is 0.655 when using reviewer data from Google Scholar profiles and 0.672 when using data from CVs. This difference is expected, as Google Scholar profiles primarily include published works and citation data, which may not fully capture the breadth of a researcher’s contributions. In contrast, CVs provide a more comprehensive and structured overview, including unpublished research, professional activities, and detailed descriptions of academic work, leading to a slightly higher self-similarity score. Meanwhile, the average dissimilarity score is −0.01 for Google Scholar-based reviewer data and −0.02 for CV-based reviewer data. These near-zero values indicate that proposals paired with reviewers from different disciplines exhibit minimal to no semantic alignment. The slightly more negative dissimilarity scores from CV-based reviewer data suggest that the additional details captured in CVs may accentuate disciplinary differences more effectively than Google Scholar profiles, reinforcing the method’s ability to distinguish between unrelated fields. Ultimately, however, there is minimal difference between Google Scholar profiles and CVs, as both offer rich semantic information suitable for similarity analysis.
3.1.2. Similarity Results for Chemical Engineering ‘Proposals’ and Reviewers
As shown in Figure 1, similarity scores for Chemical Engineering (ChE) reviewers (R) demonstrate strong self-similarity, with median values predominantly ranging between 0.65 and 0.85. Reviewer R9_ChE achieved the highest median similarity score of approximately 0.89, followed closely by reviewers R1_ChE, R14_ChE, and R16_ChE, with scores ranging from 0.82 to 0.85. Notably, R1_ChE had the highest lower fence with a score of 0.79, indicating a robust consistency in similarity across their proposals and reviewer profile documents. Additionally, R1_ChE exhibited the tightest distribution, with scores spanning from 0.78 to 0.86, further confirming their strong internal consistency. However, R12_ChE exhibited significant outliers at the lower end, with similarity scores of 0.05 and 0.15, suggesting that these two proposals were semantically distant from the reviewer’s expertise, despite an otherwise moderate median similarity score of 0.65. Conversely, R6_ChE had the lowest median score of 0.60, along with the second-lowest mean similarity score (averaged across both CV and Google Scholar methods) of 0.56, as seen in Figure 2. In contrast, the highest average similarity scores were achieved by R9_ChE and R14_ChE, with averages of 0.84 and 0.82, respectively. From Figure 3, we observed a high degree of consistency between the Google Scholar profiles and CVs as document sources. Both approaches yielded similar median similarity scores of 0.72, with comparable interquartile ranges. However, the CV-based approach demonstrated a slightly tighter distribution, suggesting a more consistent alignment of proposals to reviewer profiles when using CVs. This consistency validates our methodology of extracting relevant information from different document types, highlighting the flexibility in source documents while maintaining reliable similarity assessments.
3.1.3. Similarity Test Results for Philosophy ‘Proposals’ and Reviewers
Building upon our evaluation of Chemical Engineering documents, we extended our analysis to the Philosophy discipline, maintaining the same controlled design where proposals consisted of research papers from the corresponding reviewers. This cross-disciplinary comparison allowed us to assess whether our NLP-based similarity method performs consistently across distinct academic fields, each with its own linguistic and semantic characteristics. As seen in Figure 4, similarity scores for Philosophy reviewers (R1_Phil through R20_Phil) exhibited strong self-similarity, albeit with slightly different patterns than the Chemical Engineering test cases. The median values predominantly ranged between 0.55 and 0.75, with tighter distributions observed for most reviewers. Reviewer R13_Phil achieved the highest median similarity score of 0.76, followed closely by R10_Phil (0.74), R5_Phil (0.73), and R7_Phil (0.71). Compared to Chemical Engineering, the Philosophy reviewers exhibited fewer extreme outliers. The most notable outlier was R10_Phil, with an isolated point at approximately 0.30, suggesting that this reviewer’s document may have had less semantic alignment with the corresponding proposals. However, this point appeared to be an anomaly, as R10_Phil’s median score remained high at 0.74, indicating overall consistency in similarity scores. In Figure 5, we observe that the anomaly in R10_Phil’s document significantly impacted the reviewer’s overall mean similarity score, which dropped to 0.55. The highest average similarity scores were achieved by R5_Phil (0.72) and R13_Phil (0.75), while R16_Phil and R9_Phil had the lowest scores at 0.54 and 0.55, respectively. Notably, the overall range of average similarity scores (0.54 to 0.75) was narrower than in the Chemical Engineering test cases, suggesting a more consistent alignment across Philosophy documents. From Figure 6, we observed similar consistency between Google Scholar profiles and CVs as sources for Philosophy reviewers, with both document types yielding median similarity scores of approximately 0.65. The interquartile ranges for both sources were nearly identical, indicating that both sources provide equally reliable semantic information in the Philosophy domain. Although the median similarity scores in Philosophy were slightly lower than those for Chemical Engineering (0.65 versus 0.73), the distributions for Philosophy were tighter, suggesting more homogeneity within the discipline across both document types.
Overall, the consistently strong self-similarity scores in both disciplines validate the effectiveness of our approach across different academic domains. While Philosophy documents showed slightly lower absolute similarity values than Chemical Engineering, they exhibited greater consistency across reviewers. These results demonstrate that our transformer-based document similarity method successfully captures semantic relationships within related documents across distinct academic fields and writing styles.
3.1.4. Similarity Test Results for Mix of Chemical Engineering and Philosophy ‘Proposals’ and Corresponding Mix of Reviewers
To evaluate the effectiveness of our algorithm in distinguishing semantically unrelated documents, we conducted cross-disciplinary tests between Chemical Engineering and Philosophy. For these tests, we processed 200 proposal documents per reviewer, which provided a robust dataset to measure dissimilarity scores across disciplines. This cross-disciplinary analysis is particularly important for real-world panel assignment scenarios, where reviewers from one discipline should not be assigned proposals from unrelated fields.
When evaluating Chemical Engineering reviewers against Philosophy ‘proposals’, we consistently observed low similarity scores, confirming the model’s effectiveness in identifying dissimilar documents. As shown in Figure 7, the median similarity scores for Chemical Engineering reviewers when paired with Philosophy proposals ranged from −0.14 to 0.08 using the Google Scholar method and from −0.21 to 0.067 when using CVs as the source for reviewers’ professional information. These scores signify minimal semantic overlap between the two disciplines. The highest average similarity score of 0.013 was exhibited by R7_ChE. Figure 7 further highlights that R7_ChE demonstrated the highest median value of approximately 0.006, while R8_ChE, R14_ChE, and R19_ChE showed the lowest median scores, around −0.04. These consistently low values across all reviewers indicate that our similarity method effectively identifies cross-disciplinary mismatches. The interquartile ranges for these scores were narrow, typically spanning from −0.10 to 0.10, reflecting consistent dissimilarity between Chemical Engineering and Philosophy documents.
Similarly, when Philosophy reviewers were evaluated against Chemical Engineering ‘proposals’, the results showed a comparable pattern of significant dissimilarity. As illustrated in Figure 8, the similarity scores for Philosophy reviewers using the Google Scholar method ranged between −0.12 and 0.11, while the CV-based approach yielded scores between −0.15 and 0.09. These ranges align with the expected outcome given the semantic differences between the two disciplines. Among the Philosophy reviewers, R18_Phil exhibited the highest average similarity score of −0.004, while R9_Phil had the lowest average score of −0.057. Figure 8 shows that R1_Phil had the highest median similarity at approximately 0.005, while R14_Phil demonstrated the lowest median at about −0.06. The interquartile ranges for Philosophy reviewers were slightly wider than those observed for the Chemical Engineering reviewers, spanning from −0.08 to 0.05. This suggests marginally more variability in dissimilarity patterns within the Philosophy group.
The consistently negative or near-zero median similarity scores across both cross-disciplinary tests validate the method’s ability to accurately distinguish between semantically relevant and irrelevant document matches. This capability is particularly crucial for panel assignment tasks, where the algorithm must avoid assigning reviewers to proposals outside their domain of expertise.
3.2. Panel Assignment Optimization
We present three cases to demonstrate the results for the panel assignment optimization via NLP-based document similarity:
Case 1: 10 ‘proposals’, represented by journal publications, and 10 ‘reviewers’, represented by reviewers’ Google Scholar profiles or CVs, where the reviewer documents are from the Chemical Engineering discipline.
Case 2: 10 ‘proposals’, represented by journal publications, and 10 ‘reviewers’, represented by reviewers’ Google Scholar profiles or CVs, where the reviewer documents are from the Philosophy discipline.
Case 3: a mixed scenario in which half of the set (documents 1 to 5) is from Chemical Engineering and the other half (documents 6 to 10, Google Scholar profiles and CVs) is from Philosophy.
For each case, we consider the instance where the lead and scribe are the same and the instance where they are different. The number of reviews per proposal is set at four for all cases. In each case, ‘proposals’ 1 to 10 represent the journal papers from the corresponding reviewers 1 to 10. Hence, we expect the diagonal elements in the panel assignment matrix to have the highest similarity scores (and correspondingly, lower ranking scores) due to the self-similarity of the ‘proposals’ to the reviewers. As a result, the diagonal elements should ideally be assigned as “LS” when the lead and scribe are the same and “L” when the lead and scribe are different, since leads (L) are prioritized in the optimization formulation to be assigned their highest-rated ‘proposals’.
3.2.1. Case 1: Chemical Engineering ‘Proposals’ and Related Google Scholar Profile and CV Documents
Results from Table 1, where Chemical Engineering ‘proposals’ are compared with Chemical Engineering Google Scholar (GS) profiles, show that the similarity scores for the diagonal elements are significantly higher than those of the off-diagonal elements. The average self-similarity score for the diagonal elements is 0.78 and the average (dis-)similarity score for the off-diagonal elements is 0.39. The corresponding average rankings are 1.42 and 2.21, respectively. This confirms that the document similarity method sufficiently captures the semantic similarity (or dissimilarity) between documents from the same reviewer and documents from different reviewers. Based on the rankings, panel assignment optimization was performed, and the results are illustrated in Table 2. Here, it can be seen that each diagonal pairing (proposal–reviewer) was assigned as lead (L)/scribe (S) (when the lead and scribe are the same) or lead (when the lead and scribe are different), confirming that the optimizer ensured that each proposal was assigned to the reviewer with the most related expertise, which is confirmed by the similarity scores and known a priori due to the self-similarity aspect of this assignment. R indicates reviewer.
Results from Table 3 and Table 4 (where CVs are used instead of GS documents) follow the same trends as the results in Table 1 and Table 2. Here, we note that when CVs are used, the average of the self-similarity scores for the diagonal elements increases from 0.78 to 0.79, and the average of the (dis-)similarity scores for the off-diagonal elements increases from 0.39 to 0.41. The negligible differences can be attributed to the difference in semantic information captured in a GS document versus a CV, but it is noted that both offer sufficient semantic context.
3.2.2. Case 2: Philosophy ‘Proposals’ and Corresponding Google Scholar Profile and CV Documents
Results from Table 5, where Philosophy ‘proposals’ are compared with Philosophy Google Scholar (GS) profiles, show that the similarity scores for the diagonal elements are significantly higher than those of the off-diagonal elements. The average self-similarity score for the diagonal elements is 0.67, and the average (dis-)similarity score for the off-diagonal elements is 0.41. The corresponding average rankings are 1.64 and 2.17, respectively. This confirms that the document similarity method sufficiently captures the semantic similarity (or dissimilarity) between documents from the same reviewer and documents from different reviewers. Based on the rankings, the panel assignment optimization was performed, and the results are illustrated in Table 6. Here, it can be seen that each diagonal pairing (proposal–reviewer) was assigned as lead/scribe (when the lead and scribe are the same) or lead (when the lead and scribe are different), confirming that the optimizer ensured that each proposal was assigned to the reviewer with the most related expertise, which is confirmed by the similarity scores and known a priori due to the self-similarity aspect of this assignment.
Results from Table 7 and Table 8 (where CVs are used instead of GS documents) follow the same trends as the results in Table 1 and Table 2. Here, we note that when CVs are used, the average of the self-similarity scores for the diagonal elements increases from 0.67 to 0.69, and the average of the (dis-)similarity scores for the off-diagonal elements increases from 0.41 to 0.43. The negligible differences can be attributed to the difference in semantic information captured in a GS document versus a CV, but it is noted that both offer sufficient semantic context even in a different discipline. It is also interesting to note that the difference between the self- and dis-similarity scores in the Philosophy discipline is smaller than that in the Chemical Engineering discipline, which can be attributed to the reviewers in the Philosophy discipline having more diverse research areas than those in Chemical Engineering, for the sample set used in this study.
3.2.3. Case 3: Mix of Chemical Engineering and Philosophy ‘Proposals’ and Their Corresponding Google Scholar Profile and CV Documents
Here, we consider a case where half of the set (1 to 5) is represented by Chemical Engineering documents and the other half by Philosophy documents. Results from Table 9 and Table 10 show that the self-similarity scores of the diagonal elements within each set are high, confirming the same trends as seen in the earlier cases. Similarly, lower similarity scores are reflected within each set for the off-diagonal elements, again confirming the earlier trends. Correspondingly, from Table 11 and Table 12, we can see that the optimized panel assignment ensures that the diagonal pairings receive the lead/scribe or lead assignments. As expected, the optimizer ensured a clear separation of all assignments, whereby review assignments are only made within each set, consistent with the expectation that ‘proposals’ of a specific discipline should be reviewed by reviewers from that discipline. The same trends are seen whether GS or CV documents are used.
3.3. Implications of Research Results
The findings of this study have important implications for enhancing the efficiency, fairness, and scalability of panel assignment processes across diverse domains. By demonstrating that NLP-based similarity detection combined with optimization can reliably match reviewers to proposals without relying on subjective preferences, this research paves the way for more objective, bias-resistant assignment frameworks. The consistent performance across disciplines and document sources underscores the method’s adaptability in real-world scenarios, including large-scale peer review systems. Additionally, the ability to differentiate between semantically unrelated documents ensures discipline-aware assignments, reducing the risk of mismatches that compromise review quality. The success of automatic lead and scribe assignments further highlights the method’s practical value in managing complex role distributions within panels. Beyond academic peer review, these results can inform broader applications in workforce allocation, project-team formation, and other expert-task matching contexts where fairness, expertise alignment, and efficiency are critical. The proposed framework offers a scalable and data-driven alternative to traditional preference-based approaches.
4. Conclusions
This study demonstrates the effectiveness of an NLP-based document similarity approach in optimizing panel assignments by ensuring that proposals are matched with the most semantically relevant reviewers. The three cases examined provide clear evidence of the method’s robustness across different disciplines and document sources. For single-discipline cases (Cases 1 and 2), where proposals and reviewers belonged to the same field (Chemical Engineering or Philosophy), the self-similarity scores of diagonal elements were significantly higher than off-diagonal elements. This confirmed that the method accurately captured semantic alignment between reviewers and proposals. Additionally, the panel assignment optimizer successfully assigned lead/scribe roles to the most relevant reviewers, reinforcing the reliability of the similarity-based optimization approach. Minor variations in similarity scores between Google Scholar profiles and CVs were observed, with CVs offering slightly higher self-similarity scores due to their broader coverage of academic contributions. The cross-disciplinary analysis (Case 3) further validated the method’s capability to distinguish semantically unrelated documents. When Chemical Engineering reviewers were matched with Philosophy proposals (and vice versa), the similarity scores were consistently near zero or negative, highlighting minimal semantic overlap. Despite the interdisciplinary nature of some research areas, the algorithm effectively separated proposals and reviewers into their respective disciplines, preventing cross-disciplinary mismatches. Additionally, the study revealed that Philosophy reviewers exhibited slightly more variability in similarity scores compared to Chemical Engineering reviewers, possibly due to the broader thematic diversity within the Philosophy discipline. Nonetheless, the optimizer ensured that proposals were only assigned to reviewers within the same discipline, further demonstrating the method’s practical applicability in real-world panel assignment scenarios. Overall, these findings confirm that our NLP-based similarity approach, combined with panel assignment optimization, provides a robust, scalable, and discipline-aware framework for assigning reviewers to research proposals.