Smart Money, Greener Future: AI-Enhanced English Financial Text Processing for ESG Investment Decisions
Abstract
1. Introduction
- RQ1: How can an autoregressive generation framework be designed to jointly extract sustainability-related entities and relations from English financial texts with high fidelity, particularly for emerging market contexts?
- RQ2: To what extent can domain-specific knowledge, such as green finance regulations and ESG reporting standards, be integrated into a generative model to improve the accuracy and compliance of the extracted information?
- To address the complex sustainability entities and environmental relationships prevalent in emerging-market financial texts, FinATG introduces a domain-adaptive span representation method fine-tuned on sustainability-focused financial corpora. Refined span representation learning improves the extraction of green finance entities, carbon emission metrics, and ESG relationships, capturing the environmental entity boundaries and sustainability type information needed for emerging-market regulatory compliance and alignment with international ESG standards.
- To ensure compliance with the diverse sustainability reporting standards and environmental regulations of emerging markets, FinATG implements a constrained decoding mechanism based on green finance domain rules and sustainability reporting frameworks. A state transition framework incorporating carbon accounting standards, ESG disclosure requirements, and emerging-market environmental regulations guides the generation process, so that extracted information meets both local compliance needs and the international sustainability standards essential for attracting green investment.
- FinATG integrates the financial-domain pre-trained model FinBERT with a sustainability-aware autoregressive generation mechanism, creating an end-to-end framework optimized for emerging-market contexts. This integration strengthens the model’s grasp of region-specific sustainability terminology, green finance instruments, and carbon credit mechanisms, making it particularly effective for processing ESG reports, green bond documentation, and environmental impact assessments that are increasingly critical for emerging-market access to international sustainable finance.
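As a rough, hypothetical illustration of the span enumeration that underlies span representation learning (the function name and token sample below are invented for illustration and are not the authors' implementation), candidate spans up to a maximum length K (the experiments use K = 12) can be enumerated as follows:

```python
def enumerate_spans(tokens, max_len=12):
    """Yield all contiguous (start, end) spans up to max_len tokens, end inclusive.

    Each candidate span would then be scored against the entity-type inventory;
    the scoring model itself (FinBERT-based in the paper) is omitted here.
    """
    n = len(tokens)
    for start in range(n):
        for end in range(start, min(start + max_len, n)):
            yield start, end

# Illustrative sentence (6 tokens): every contiguous span is a candidate entity.
tokens = "Company A issued a green bond".split()
spans = list(enumerate_spans(tokens, max_len=12))
print(len(spans))  # 6 + 5 + 4 + 3 + 2 + 1 = 21 candidate spans
```

Bounding the span length keeps the candidate set linear in sentence length (at most n·K spans) rather than quadratic, which is why K appears as a sensitivity parameter in Section 4.4.1.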
2. Related Work
2.1. English Financial Information Extraction
2.2. Generative Information Extraction
2.3. Applications of Financial Text Analysis in Finance
3. Methodology
3.1. Task Definition
3.2. Model Architecture
3.3. Span Representation Learning
3.3.1. Boundary Detection
3.3.2. Type-Specific Representation
3.3.3. Handling Nested Entities
3.3.4. Financial Context Integration
3.4. Constrained Decoding
3.4.1. State Transition Framework
- S0: Initial state. Generation starting point, allowing only the <START> token.
- S1: Entity generation state. Allows generation of entity spans (start position, end position, and type).
- S2: Relation generation state. Permits generation of relation types and corresponding entity triples.
- S3: Terminal state. Generates the <END> token, marking sequence completion.
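The four states above can be sketched as a small transition table. This is a minimal sketch under the simplifying assumption that the only legal order is START, then entities, then a separator, then relations, then END; the paper's actual decoder constrains token logits and applies the richer financial domain rules of Section 3.4.2. The token categories ("ENTITY", "RELATION") are placeholders for whole generated sub-sequences.

```python
# Minimal sketch of the S0-S3 state machine; categories are illustrative.
TRANSITIONS = {
    "S0": {"<START>": "S1"},
    "S1": {"ENTITY": "S1", "<SEP>": "S2"},    # emit entity spans, then separator
    "S2": {"RELATION": "S2", "<END>": "S3"},  # emit relation triples, then end
    "S3": {},                                 # terminal: nothing follows <END>
}

def is_valid_sequence(tokens):
    """Accept a pre-categorized token sequence iff it ends in the terminal state."""
    state = "S0"
    for tok in tokens:
        nxt = TRANSITIONS[state].get(tok)
        if nxt is None:
            return False  # illegal move from the current state
        state = nxt
    return state == "S3"

print(is_valid_sequence(["<START>", "ENTITY", "ENTITY", "<SEP>", "RELATION", "<END>"]))  # True
print(is_valid_sequence(["<START>", "RELATION", "<END>"]))  # False: relation before any entity
```

Encoding the constraints as a state machine means invalid structures are unreachable by construction, rather than being filtered after generation.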
3.4.2. Financial Domain Rules
3.4.3. Rule Implementation Details
3.4.4. Implementation Example
Listing 1. Pseudocode for the constrained decoding process.

def constrained_decode(input_text):
    # Initialize state
    current_state = S0
    generated_sequence = []

    # Start generation
    generated_sequence.append("<START>")
    current_state = S1

    # Entity generation: Company A
    entity_span = generate_entity_span("Company A", "ORG")
    validate_entity_constraints(entity_span)
    generated_sequence.append(entity_span)

    # Transition to relation generation
    generated_sequence.append("<SEP>")
    current_state = S2

    # Generate Acquire relation
    relation = generate_relation("Acquire", entity_span)
    validate_relation_constraints(relation)
    generated_sequence.append(relation)

    # Continue with remaining entities and relations...

    # End generation
    generated_sequence.append("<END>")
    return generated_sequence
4. Experiments
4.1. Experimental Setup for English Financial Language Analysis
4.1.1. Datasets
4.1.2. Evaluation Metrics
4.1.3. Implementation Details
4.2. Main Results
4.2.1. Overall Performance Comparison
4.2.2. Ablation Study
4.2.3. Case Study
4.2.4. Evaluation on Real-World ESG Report Data
4.3. Explainability Analysis
4.4. Sensitivity Analysis
4.4.1. Span Length (K) Sensitivity
4.4.2. English Sentence Augmentation Parameter (B) Impact
4.4.3. Confusion Matrix Analysis
4.5. SAFE Evaluation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Pseudocode for Model Embedding and Constrained Decoding
Algorithm A1. FinATG Model: Embedding and Constrained Decoding Process
Appendix B. Computational Resource and Consumption Analysis
Component | Specification |
---|---|
GPU | NVIDIA A100 (40 GB HBM2, 312 TFLOPS FP16) |
CPU | 2 × Intel Xeon Gold 6230 (20 cores each, 2.1 GHz) |
Memory | 128 GB DDR4 |
Storage | 1 TB NVMe SSD |
Deep Learning Framework | PyTorch 1.9.0 + CUDA 11.1 |
Operating System | Ubuntu 20.04 LTS |
Batch Size | 16 |
Maximum Sequence Length | 512 tokens |
Number of Samples | Epoch Time (min) | Total Time (h, 30 Epochs) | Peak GPU Memory (GB) |
---|---|---|---|
250 | 12.3 | 6.15 | 28.4 |
500 | 15.4 | 7.70 | 29.1 |
1000 | 21.7 | 10.85 | 30.3 |
2000 | 32.1 | 16.05 | 31.8 |
4000 | 55.8 | 27.90 | 33.5 |
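The epoch times in the table above grow close to linearly with sample count. A quick least-squares fit over the reported pairs (numbers taken directly from the table; this is a consistency check, not part of the paper) suggests a fixed per-epoch overhead of roughly 9 to 10 minutes plus about 11.5 minutes per 1000 samples:

```python
# Least-squares line through the (samples, epoch-minutes) pairs reported above.
samples = [250, 500, 1000, 2000, 4000]
minutes = [12.3, 15.4, 21.7, 32.1, 55.8]

n = len(samples)
mx = sum(samples) / n
my = sum(minutes) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(samples, minutes))
         / sum((x - mx) ** 2 for x in samples))
intercept = my - slope * mx

print(f"~{intercept:.1f} min overhead + {slope * 1000:.1f} min per 1000 samples")
```

The near-linear scaling is consistent with per-sample decoding cost dominating training time once the dataset is large.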
Appendix C. Frequency of Conflicts Between the Domain-Specific Financial Rules
Dataset | RCF (%) | APG |
---|---|---|
CoNLL04 | 3.2 | 0.04 |
FiNER-ORD | 4.5 | 0.05 |
Financial PhraseBank | 4.3 | 0.05 |
FIRE | 6.1 | 0.07 |
Dataset | Domain | # Instances |
---|---|---|
CoNLL04 | General News | 1441 |
FiNER-ORD (Custom) | Financial Text | 250 |
Financial PhraseBank | Financial News | 4840 |
FIRE | Financial | 3025 |
Parameter | Value |
---|---|
Encoder layers | 12 |
Decoder layers | 6 |
Hidden dimension | 768 |
Attention heads | 8 |
Learning rate | |
Batch size | 16 |
Maximum sequence length | 512 |
Maximum span length (K) | 12 |
Dropout rate | 0.1 |
Weight decay | 0.01 |
Training epochs | 30 |
Early stopping patience | 5 |
Warm-up steps | 1000 |
Model | ENT F1 | Δ | REL F1 | Δ | REL+ F1 | Δ | AUC | Δ | RGR | Δ |
---|---|---|---|---|---|---|---|---|---|---|
CoNLL 2004 |||||||||||
FinATG (Ours) | 88.5 | - | 80.2 | - | 75.3 | - | 0.93 | - | 0.89 | - | |
SciBERT-based | 88.0 | −0.5 | 80.5 | +0.3 | 75.0 | −0.3 | 0.92 | −0.01 | 0.90 | +0.01 | |
Domain-adapted GPT-based | 87.0 | −1.5 | 79.0 | −1.2 | 74.0 | −1.3 | 0.94 | +0.01 | 0.88 | −0.01 | |
FinBERT+Pipeline | 87.8 | −0.7 | 79.5 | −0.7 | 74.8 | −0.5 | 0.93 | 0 | 0.88 | −0.01 | |
TANL | 87.2 | −1.3 | 79.1 | −1.1 | 74.8 | −0.5 | 0.92 | −0.01 | 0.88 | −0.01 | |
Tab-Seq | 86.9 | −1.6 | 77.5 | −2.7 | 72.4 | −2.9 | 0.91 | −0.02 | 0.87 | −0.02 | |
DyGIE++ | 85.3 | −3.2 | 75.4 | −4.8 | 70.1 | −5.2 | 0.89 | −0.04 | 0.85 | −0.04 | |
FiNER-ORD (Custom) |||||||||||
FinATG (Ours) | 85.7 | - | 78.6 | - | 73.1 | - | 0.93 | - | 0.89 | - | |
SciBERT-based | 84.1 | −1.6 | 78.9 | +0.3 | 72.8 | −0.3 | 0.91 | −0.02 | 0.90 | +0.01 | |
Domain-adapted GPT-based | 83.8 | −1.9 | 77.7 | −0.9 | 72.3 | −0.8 | 0.90 | −0.03 | 0.88 | −0.01 | |
FinBERT+Pipeline | 84.5 | −1.2 | 78.0 | −0.6 | 72.2 | −0.9 | 0.92 | −0.01 | 0.88 | −0.01 | |
TANL | 84.0 | −1.7 | 77.8 | −0.8 | 72.0 | −1.1 | 0.91 | −0.02 | 0.87 | −0.02 | |
Tab-Seq | 83.6 | −2.1 | 76.3 | −2.3 | 71.0 | −2.1 | 0.90 | −0.03 | 0.86 | −0.03 | |
DyGIE++ | 82.1 | −3.6 | 74.8 | −3.8 | 69.5 | −3.6 | 0.88 | −0.05 | 0.84 | −0.05 | |
Financial PhraseBank | |||||||||||
FinATG (Ours) | 86.0 | - | 79.0 | - | 74.0 | - | 0.93 | - | 0.89 | - | |
SciBERT-based | 84.5 | −1.5 | 78.2 | −0.8 | 73.5 | −0.5 | 0.92 | −0.01 | 0.89 | 0 | |
Domain-adapted GPT-based | 84.0 | −2.0 | 78.0 | −1.0 | 73.0 | −1.0 | 0.91 | −0.02 | 0.88 | −0.01 | |
FinBERT+Pipeline | 84.8 | −1.2 | 78.5 | −1.0 | 73.8 | −1.0 | 0.93 | 0 | 0.88 | −0.01 | |
TANL | 84.7 | −1.3 | 78.0 | −1.0 | 73.2 | −0.8 | 0.92 | −0.01 | 0.88 | −0.01 | |
Tab-Seq | 84.3 | −1.7 | 77.8 | −1.2 | 72.9 | −1.1 | 0.91 | −0.02 | 0.87 | −0.02 | |
DyGIE++ | 83.0 | −3.0 | 75.0 | −4.0 | 70.0 | −4.0 | 0.89 | −0.04 | 0.85 | −0.04 | |
FIRE | |||||||||||
FinATG (Ours) | 84.5 | - | 77.0 | - | 72.0 | - | 0.91 | - | 0.87 | - | |
SciBERT-based | 83.0 | −1.5 | 77.2 | +0.2 | 71.5 | −0.5 | 0.90 | −0.01 | 0.87 | 0 | |
Domain-adapted GPT-based | 82.5 | −2.0 | 76.5 | −0.5 | 71.0 | −1.0 | 0.90 | −0.01 | 0.86 | −0.01 | |
FinBERT+Pipeline | 83.2 | −1.3 | 76.8 | −0.2 | 71.2 | −0.8 | 0.91 | 0 | 0.87 | 0 | |
TANL | 82.8 | −1.7 | 76.5 | −0.5 | 70.8 | −1.2 | 0.90 | −0.01 | 0.87 | 0 | |
Tab-Seq | 82.5 | −2.0 | 76.0 | −1.0 | 70.5 | −1.5 | 0.89 | −0.02 | 0.86 | −0.01 | |
DyGIE++ | 81.0 | −3.5 | 73.8 | −3.2 | 68.5 | −3.5 | 0.87 | −0.04 | 0.84 | −0.03 |
Configuration | ENT F1 | Δ | REL F1 | Δ | REL+ F1 | Δ | AUC | Δ |
---|---|---|---|---|---|---|---|---|
Full Model (FinATG) | 88.5 | - | 80.2 | - | 75.3 | - | 0.93 | - |
w/o Type-aware Span Encoding | 86.9 | −1.6 | 78.4 | −1.8 | 73.8 | −1.5 | 0.92 | −0.01 |
w/o Financial Rule Constraints | 87.1 | −1.4 | 79.5 | −0.7 | 74.1 | −1.2 | 0.93 | 0 |
w/o Sentence Augmentation | 87.8 | −0.7 | 79.8 | −0.4 | 74.7 | −0.6 | 0.93 | 0 |
w/o Architecture & Rules | 85.6 | −2.9 | 76.3 | −3.9 | 71.4 | −3.9 | 0.90 | −0.03 |
Configuration | ENT F1 | Δ | REL F1 | Δ | REL+ F1 | Δ | AUC | Δ |
---|---|---|---|---|---|---|---|---|
Full Model (FinATG) | 85.7 | - | 78.6 | - | 73.1 | - | 0.93 | - |
w/o Type-aware Span Encoding | 83.4 | −2.3 | 76.2 | −2.4 | 70.9 | −2.2 | 0.91 | −0.02 |
w/o Financial Rule Constraints | 84.2 | −1.5 | 77.1 | −1.5 | 71.5 | −1.6 | 0.92 | −0.01 |
w/o Sentence Augmentation | 84.9 | −0.8 | 77.8 | −0.8 | 72.6 | −0.5 | 0.92 | −0.01 |
w/o Architecture & Rules | 82.5 | −3.2 | 74.0 | −4.6 | 68.9 | −4.2 | 0.89 | −0.04 |
Configuration | ENT F1 | Δ | REL F1 | Δ | REL+ F1 | Δ | AUC | Δ |
---|---|---|---|---|---|---|---|---|
Full Model (FinATG) | 86.0 | - | 79.0 | - | 74.0 | - | 0.93 | - |
w/o Type-aware Span Encoding | 84.2 | −1.8 | 77.8 | −1.2 | 72.8 | −1.2 | 0.91 | −0.02 |
w/o Financial Rule Constraints | 84.8 | −1.2 | 78.5 | −1.0 | 73.2 | −0.8 | 0.92 | −0.01 |
w/o Sentence Augmentation | 85.2 | −0.8 | 78.8 | −0.2 | 73.6 | −0.4 | 0.92 | −0.01 |
w/o Architecture & Rules | 83.5 | −2.5 | 76.5 | −2.5 | 71.5 | −2.5 | 0.90 | −0.03 |
Configuration | ENT F1 | Δ | REL F1 | Δ | REL+ F1 | Δ | AUC | Δ |
---|---|---|---|---|---|---|---|---|
Full Model (FinATG) | 84.5 | - | 77.0 | - | 72.0 | - | 0.91 | - |
w/o Type-aware Span Encoding | 82.8 | −1.7 | 75.8 | −1.2 | 70.8 | −1.2 | 0.90 | −0.01 |
w/o Financial Rule Constraints | 83.2 | −1.3 | 76.2 | −0.8 | 71.2 | −0.8 | 0.90 | −0.01 |
w/o Sentence Augmentation | 83.8 | −0.7 | 76.5 | −0.5 | 71.5 | −0.5 | 0.90 | −0.01 |
w/o Architecture & Rules | 81.9 | −2.6 | 74.8 | −2.2 | 69.8 | −2.2 | 0.88 | −0.03 |
Model | ENT F1 | REL F1 | REL+ F1 |
---|---|---|---|
SciBERT-based | 81.2 | 73.5 | 68.1 |
FinATG (Ours) | 84.6 | 76.8 | 71.5 |
Relation Type | Precision (%) | Recall (%) | F1-score (%) | Support |
---|---|---|---|---|
Acquire | 92.4 | 93.8 | 93.1 | 65 |
Invest_In | 93.8 | 90.0 | 91.8 | 50 |
Work_For | 94.1 | 96.0 | 95.0 | 50 |
Interest_In | 100.0 | 100.0 | 100.0 | 50 |
Others | 98.0 | 98.0 | 98.0 | 49 |
Accuracy | | | 96.9 | 264 |
Macro Avg | 95.7 | 95.6 | 95.6 | 264 |
Micro Avg | 96.9 | 96.9 | 96.9 | 264 |
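The macro averages in the table above can be reproduced directly from the per-class rows (a quick consistency check using only numbers from the table):

```python
# Per-class (precision, recall, F1) from the relation classification table above.
rows = {
    "Acquire":     (92.4, 93.8, 93.1),
    "Invest_In":   (93.8, 90.0, 91.8),
    "Work_For":    (94.1, 96.0, 95.0),
    "Interest_In": (100.0, 100.0, 100.0),
    "Others":      (98.0, 98.0, 98.0),
}

# Macro average: unweighted mean over classes, per metric column.
macro = [round(sum(col) / len(rows), 1) for col in zip(*rows.values())]
print(macro)  # [95.7, 95.6, 95.6] -- matches the Macro Avg row
```

Because the class supports are nearly balanced (49 to 65 instances), the macro and micro averages differ only slightly; the micro average equals the overall accuracy of 96.9%.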
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Fan, J.; Wang, D.; Zheng, Y. Smart Money, Greener Future: AI-Enhanced English Financial Text Processing for ESG Investment Decisions. Sustainability 2025, 17, 6971. https://doi.org/10.3390/su17156971