Commit 07f3442 by Eghbal (verified) · 1 Parent(s): d18e162

Update README.md

Files changed (1)
  1. README.md +75 -30
README.md CHANGED
@@ -1,51 +1,96 @@
  ---
- title: README
- emoji: 💻
  colorFrom: gray
  colorTo: blue
  sdk: static
  pinned: false
  ---

- # README 💻
  <div style="text-align: center; padding: 20px;">
- <h1 style="font-size: 2em; font-weight: bold;">
- FinText: A Specialised Financial LLM Repository
- </h1>
  </div>

- <div style="padding: 10px; border-radius: 10; box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);">
- <h2 style="font-size: 1.5em; text-align: center;">🚀 **Stage 1 Release** 🚀</h2>

- We are thrilled to introduce a specialised suite of **68 large language models (LLMs)**, meticulously designed for accounting and finance. The FinText models have been **pre-trained** on high-quality, domain-specific historical data, addressing challenges such as **look-ahead bias** and **information leakage**. These models are crafted to elevate the accuracy and depth of financial research and analysis.

- 💡 **Key Features:**
- - **Domain-Specific Training:** FinText utilises diverse financial datasets including news articles, regulatory filings, transcripts, IP records, key information, board information, speeches (ECB, FED), and major Wikipedia articles.
- - **Time-Period Specific Models:** Separate models are pre-trained for each year from **2007 to 2023**, ensuring the utmost precision and historical relevance.
- - **RoBERTa Architecture:** The suite includes both a **base model** with **125 million parameters** and a **smaller variant** with **51 million parameters**.
- - **Two distinct pre-training durations:** We also introduce a series of models to explore the impact of further pre-training.
- - **Accessibility:** The models are pre-trained using **BF16**, but are released in **FP32** format to ensure they are accessible to a broader community, including those without high-end GPUs.
- - **Sustainability:** All electricity used was fully traceable and sourced exclusively from renewable energy.

- **For further details on this and citation, please refer to the paper, which is accessible from [here](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4963618).**
- <div style="font-style: italic; font-size: 0.9em; padding: 10px; border-left: 4px solid #003366;">
- <strong>Rahimikia, Eghbal and Drinkall, Felix, *Re(Visiting) Large Language Models in Finance* (September 21, 2024). Available at SSRN: <a href="https://ssrn.com/abstract=4963618">https://ssrn.com/abstract=4963618</a> or <a href="http://dx.doi.org/10.2139/ssrn.4963618">http://dx.doi.org/10.2139/ssrn.4963618</a></strong>
- </div>

- Stay tuned for upcoming updates and new features for FinText. We expect to launch stages 2 and 3 within the coming months. 🎉

- <div style="text-align: center; margin-top: 10px;">
- <strong>UPDATE (5th February): The models are no longer publicly available. We are currently working on the new models and expect to release them in the coming months.</strong>
- </div>

- </div> <div style="font-size: 0.8em; margin-top: 20px; text-align: justify;"> This project is supported by several key resources and institutions. We would like to acknowledge the invaluable assistance provided by Research IT and the use of the Computational Shared Facility at The University of Manchester. The project also benefited from the resources of the N8 Centre of Excellence in Computationally Intensive Research (N8 CIR), supported by the N8 research partnership and the Engineering and Physical Sciences Research Council (EPSRC) under Grant No. EP/T022167/1. The N8 CIR is coordinated by the Universities of Durham, Manchester, and York. Additionally, we are grateful for the financial support provided by Digital Futures at The University of Manchester, the Alan Turing Institute, and the Alliance Manchester Business School (AMBS). We are grateful to AMBS and the Oxford-Man Institute of Quantitative Finance for providing internal server access, which was essential for completing this project successfully. </div>

- <div style="text-align: center; margin-top: 20px; display: flex; flex-direction: column; align-items: center;">
- <p style="font-weight: bold; font-size: 1.2em; margin-bottom: 5px;">Developed by:</p>
- <img src="https://fintext.ai/UoM-logo.svg" alt="Logo" style="width: 250px; height: auto; margin: 0;">
- <p style="font-size: 0.8em; margin: 0; line-height: 1;">Alliance Manchester Business School</p>
  </div>


- </div>

  ---
+ title: FinText-TSFM
+ emoji: 📈
  colorFrom: gray
  colorTo: blue
  sdk: static
  pinned: false
  ---

+ # FinText-TSFM 📈
  <div style="text-align: center; padding: 20px;">
+ <h1 style="font-size: 2em; font-weight: bold;">Time Series Foundation Models for Finance</h1>
  </div>

+ <div style="padding: 12px; border-radius: 10px; box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);">

+ ## 🚀 Stage 1 Release

+ We are pleased to introduce **FinText-TSFM**, a comprehensive suite of **time series foundation models (TSFMs)** developed for financial forecasting and quantitative research.
+ This release accompanies the paper
+ **[*Re(Visiting) Time Series Foundation Models in Finance*](https://ssrn.com/abstract=4963618)**
+ by *Eghbal Rahimikia, Hao Ni, and Weiguan Wang (2025)*.

+ ---

+ ### 💡 Key Highlights

+ - **Finance-Native Pre-training:**
+ Models are pre-trained **from scratch** on large-scale financial time series datasets, including daily excess returns across **89 markets** and **over 2 billion observations**, to ensure full temporal and domain alignment.
+
+ - **Bias-Free Design:**
+ Training strictly follows a **chronological expanding-window setup**, avoiding any **look-ahead bias** or **information leakage**.
+
+ - **Model Families:**
+ This release includes variants of **Chronos** and **TimesFM** architectures adapted for financial time series:
+   - Chronos-Tiny / Mini / Small
+   - TimesFM-8M / 20M
+   - Parameter counts range from **8M to 200M+**.
+
+ - **Performance Insights:**
+ Our findings show that **off-the-shelf TSFMs** underperform in zero-shot forecasting, while **finance-pretrained models** achieve large gains in both predictive accuracy and portfolio Sharpe ratios.
+
+ - **Evaluation Scope:**
+ Models are benchmarked across **U.S. and international equities**, using rolling windows (5, 21, 252, 512 days) and **18M+ out-of-sample forecasts**.
+
+ - **Open Science Commitment:**
+ All released models are available in **FP32** format for full transparency and reproducibility.
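
As a rough illustration of the chronological expanding-window setup described in the highlights, the sketch below generates training and test periods in which every training year strictly precedes the test year. The yearly granularity, the function name `expanding_window_splits`, and the example year range are assumptions made for illustration; the README does not state how the authors implement their splits.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    years: List[int], first_test_year: int
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training_years, test_year) pairs in chronological order.

    Each training set holds only years strictly before the test year and
    grows by one year per step, so no future data reaches the model.
    """
    ordered = sorted(years)
    for test_year in ordered:
        if test_year < first_test_year:
            continue
        train_years = [y for y in ordered if y < test_year]
        if train_years:  # skip a test year that has no history before it
            yield train_years, test_year


# Hypothetical year range, chosen only to keep the output short.
splits = list(expanding_window_splits(list(range(1990, 1996)), first_test_year=1993))
for train_years, test_year in splits:
    print(f"train {train_years[0]}-{train_years[-1]} -> test {test_year}")
```

Filtering on `y < test_year` rather than slicing by position keeps the no-look-ahead condition explicit, which makes it easy to check in a test.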
+
+ ---
+
+ ### 🧠 Technical Overview
+
+ - **Architecture:** Transformer-based TSFMs (Chronos & TimesFM)
+ - **Training Regime:** Pre-training from scratch, fine-tuning, and zero-shot evaluation
+ - **Objective:** Mean squared error (MSE) for continuous returns; cross-entropy for tokenized sequences
+ - **Compute:** >50,000 GPU hours on NVIDIA GH200 Grace Hopper clusters
+ - **Data Sources:** CRSP, Compustat Global, JKP factors, and proprietary merged panels (1990–2023)
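
The two objectives listed above can be illustrated with a small, dependency-free sketch: MSE on continuous returns, and cross-entropy on a return that has been mapped to a discrete token. The uniform binning in `to_token`, the bin range, and the number of bins are hypothetical choices for illustration only; they are not claimed to match the tokenization used by Chronos or by the authors.

```python
import math
from typing import List


def mse(predicted: List[float], actual: List[float]) -> float:
    """Mean squared error between predicted and realised returns."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def to_token(value: float, low: float, high: float, n_bins: int) -> int:
    """Map a value to a bin index in [0, n_bins - 1] (assumed uniform binning)."""
    clipped = min(max(value, low), high)
    index = int((clipped - low) / (high - low) * n_bins)
    return min(index, n_bins - 1)  # the upper edge falls into the last bin


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability of the target token under softmax(logits)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


# Continuous objective: compare predicted returns with realised returns.
print(mse([0.01, -0.02], [0.0, 0.0]))

# Tokenized objective: bin a return, then score logits against that token.
token = to_token(0.0, low=-0.1, high=0.1, n_bins=4)
print(token, cross_entropy([0.0, 0.0, 0.0, 0.0], token))
```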
+ ---
+
+ ### 📚 Citation
+
+ Please cite the accompanying paper if you use these models:
+
+ > **Rahimikia, Eghbal; Ni, Hao; Wang, Weiguan.**
+ > *Re(Visiting) Time Series Foundation Models in Finance.*
+ > University of Manchester, UCL, Shanghai University, November 2025.
+ > SSRN: [https://ssrn.com/abstract=4963618](https://ssrn.com/abstract=4963618)
+ > DOI: [10.2139/ssrn.4963618](http://dx.doi.org/10.2139/ssrn.4963618)
+
+ ---

+ ### 🔋 Acknowledgments

+ This project was made possible through computational and institutional support from:
+ - **Isambard-AI National AI Research Resource (AIRR)**
+ - **The University of Manchester** (Research IT & Computational Shared Facility)
+ - **N8 Centre of Excellence in Computationally Intensive Research (N8 CIR)**, EPSRC Grant EP/T022167/1
+ - **University College London** and **Shanghai University**
+ - **The Alan Turing Institute**
+ - **Alliance Manchester Business School (AMBS)**
+
+ ---
+
+ <div style="text-align: center; margin-top: 20px;">
+ <p style="font-weight: bold; font-size: 1.2em;">Developed by:</p>
+ <img src="https://fintext.ai/UoM-logo.svg" alt="Logo" style="width: 240px; height: auto;">
+ <p style="font-size: 0.8em; margin: 0;">Alliance Manchester Business School</p>
  </div>

+ </div>
+
+ ---

+ **🗓️ Update (November 2025):**
+ Public models for *Stage 1* are available. Future stages will introduce larger-scale TSFMs, multivariate extensions, and diffusion-based financial forecasting models.