Article | Open Access

Transformer-Based Tabular Foundation Models: Outperforming Traditional Methods with TabPFN

R Anand Babu, Vishwa Priya V, Manoj Kumar Mishra, Inakoti Ramesh Raja, Surya Kiran Chebrolu, B Swarna

Abstract


Scientific research and commercial applications rely heavily on tabular data, yet modelling it efficiently has remained a persistent challenge. For over twenty years, traditional models, above all gradient-boosted decision trees (GBDTs), have been the standard machine learning approach for this data. Despite recent advances in deep learning, neural networks often fail to deliver satisfactory results on compact tabular datasets due to overfitting, insufficient data, and intricate feature relationships. This study presents the Tabular Prior-data Fitted Network (TabPFN), a transformer-based foundation model developed by meta-learning on more than one million sequentially generated synthetic datasets, designed to overcome these limitations. Drawing inspiration from the success of GPT-like models in natural language processing, TabPFN learns to predict solutions for new tabular problems without retraining or hyperparameter optimization. On small to medium-sized datasets, it achieves state-of-the-art inference speed and accuracy, outperforming traditional methods. TabPFN redefines efficient and scalable tabular data modelling, offering generative capabilities, few-shot learning, and rapid adaptation.
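As a rough illustration of the in-context paradigm described above, in which the labelled training rows and the unlabelled query rows are consumed together in a single forward pass with no per-dataset training loop, the sketch below mimics that interface with a trivial nearest-neighbour stand-in. This is not the actual TabPFN model or the `tabpfn` package API; the real model replaces the distance rule with a transformer pre-trained on synthetic datasets, and all names here are illustrative.

```python
import math

def in_context_predict(context_X, context_y, query_X):
    """Toy stand-in for a prior-data fitted network: classify each query
    row from the labelled context alone, with no gradient-based training.
    (TabPFN replaces this distance rule with a transformer forward pass.)"""
    predictions = []
    for q in query_X:
        # Nearest labelled context row by Euclidean distance.
        nearest = min(
            range(len(context_X)),
            key=lambda i: math.dist(q, context_X[i]),
        )
        predictions.append(context_y[nearest])
    return predictions

# "Fit" and "predict" collapse into one call: the labelled context acts
# as the prompt, much like examples in a GPT-style prompt.
X_train = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.9, 5.2]]
y_train = [0, 0, 1, 1]
X_test = [[0.1, 0.05], [5.1, 5.0]]
print(in_context_predict(X_train, y_train, X_test))  # → [0, 1]
```

The point of the sketch is the calling convention, not the predictor: because nothing is fitted per dataset, swapping in a new table requires only passing a new context, which is what makes the hyperparameter-free, single-pass inference described in the abstract possible.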


Keywords


Tabular Data, Deep Learning, Gradient-Boosted Decision Trees, Generative Capability, Few-Shot Learning, Transformer.

References


C. Picard and F. Ahmed, “Fast and Accurate Zero-Training Classification for Tabular Engineering Data,” Jan. 2024.

D. Xu, O. Cirit, R. Asadi, Y. Sun, and W. Wang, “Mixture of In-Context Prompters for Tabular PFNs,” May 2024.

Y. Mao, “TabTranSELU: A transformer adaptation for solving tabular data,” Appl. Comput. Eng., vol. 51, pp. 81–88, Mar. 2024, doi: 10.54254/2755-2721/51/20241174.

G. Govinda Rajulu et al., "Cloud-computed solar tracking system," in Computer Communication, Networking and IoT: Proceedings of 5th ICICC 2021, vol. 2, Singapore: Springer Nature Singapore, 2022, pp. 75–85.

K. V. Katariya, R. Yadav, S. Kumar, A. K. Pradhan and I. Kamwa, "Wide-Area-Measurement-System-Based Event Analytics in the Power System: A Data-Driven Framework for Disturbance Characterization and Source Localization in the Indian Grid," in IEEE Power and Energy Magazine, vol. 23, no. 1, pp. 35-46, Jan.-Feb. 2025, doi: 10.1109/MPE.2024.3446737.

A. Lourenço, J. Gama, E. P. Xing, and G. Marreiros, “In-context learning of evolving data streams with tabular foundational models,” Feb. 2025.

Y. Yang, Y. Q. Wang, G. Liu, L. Wu, and Q. Liu, "UniTabE: A Universal Pretraining Protocol for Tabular Foundation Model in Data Science," in 12th International Conference on Learning Representations (ICLR), 2024.

"Toward Robust, Reliable, and Generalizable Models for Tabular Data." [Online]. Available: https://www.proquest.com/openview/9cc4f4c12499fa9cb718e61e19fe1d7f/1?cbl=18750&diss=y&pq-origsite=gscholar.

W. Jiang et al., “Coverage Prediction in Mobile Communication Networks: A Deep Learning Approach With a Tabular Foundation Model,” Internet Technol. Lett., vol. 8, May 2025, doi: 10.1002/itl2.70034.

R. Yadav, A. K. Pradhan and I. Kamwa, "Spectral Continuity and Subspace Change Detection for Recovery of Missing Harmonic Features in Power Quality," in IEEE Transactions on Power Delivery, vol. 39, no. 1, pp. 180-191, Feb. 2024, doi: 10.1109/TPWRD.2023.3328470.

J.-P. Jiang, S.-Y. Liu, H.-R. Cai, Q. Zhou, and H.-J. Ye, “Representation Learning for Tabular Data: A Comprehensive Survey,” Apr. 2025.

J. Ma et al., “TabDPT: Scaling Tabular Foundation Models,” Oct. 2024.

M. Schambach, "Towards Tabular Foundation Models." [Online]. Available: https://hal.science/hal-04440710/.

V. Thomas et al., "Retrieval & Fine-Tuning for In-Context Tabular Models," in Advances in Neural Information Processing Systems (NeurIPS), 2024. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2024/hash/c40daf14d7a6469e65116507c21faeb7-Abstract-Conference.html.

H.-J. Ye, S.-Y. Liu, and W.-L. Chao, “A Closer Look at TabPFN v2: Strength, Limitation, and Extension,” Feb. 2025.

J. Qu, D. Holzmüller, G. Varoquaux, and M. Le Morvan, “TabICL: A Tabular Foundation Model for In-Context Learning on Large Data,” Feb. 2025.

M. Jayawardhana et al., “Transformers Boost the Performance of Decision Trees on Tabular Data across Sample Sizes,” Feb. 2025.

N. Hollmann, S. Müller, K. Eggensperger, and F. Hutter, "TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second," in 11th International Conference on Learning Representations (ICLR), 2023.

P. Singh, R. Yadav, A. K. Pradhan, and I. Kamwa, "Fundamental factors influencing bus coherency in distribution networks with distributed energy resources," International Journal of Electrical Power & Energy Systems, vol. 147, 2023, Art. no. 108778, ISSN 0142-0615, doi: 10.1016/j.ijepes.2022.108778.

M. Praseptiawan, A. N. Che Pee, M. H. Zakaria, and A. Noertjahyana, "Advancing the Measurement of MOOCs Software Quality: Validation of Assessment Tools Using the I-CVI Expert Framework," International Journal of Engineering, Science, and Information Technology (IJESTY), vol. 5, no. 3, 2025, doi: 10.52088/ijesty.v5i3.911.

M. Majidah, W. Widiyanto, A. Purwinarko, K. Harto, F. Fridiyanto, and A. Mukminin, "The Radio Frequency Identification Implementation Design for INLISLite Library Management System," International Journal of Engineering, Science, and Information Technology (IJESTY), vol. 5, no. 3, 2025, doi: 10.52088/ijesty.v5i3.902.




DOI: https://doi.org/10.52088/ijesty.v5i3.1146



Copyright (c) 2025 R Anand Babu, Vishwa Priya V, Manoj Kumar Mishra, Inakoti Ramesh Raja, Surya Kiran Chebrolu, B Swarna

International Journal of Engineering, Science, and Information Technology (IJESTY) eISSN 2775-2674