Orthogonal Adaptive Networks for Structured Word Embedding Compression

Authors

  • Amruddin Nekpai, Department of Information Technologies, Friendship University of Russia, Moscow, Russia

DOI:

https://doi.org/10.64229/cwexr598

Keywords:

Data Science, Machine Learning, Dimensionality Reduction, Data Analysis, PCA, t-SNE, UMAP

Abstract

This comparative study presents the Orthogonal Adaptive Neural Network (OA-NN), a novel dimensionality reduction method that combines orthogonal linear projections with adaptive nonlinear transformations to preserve both local and global structure in high-dimensional data. The dataset used in this paper, GloVe (Global Vectors for Word Representation), provides 400,000 pre-trained word vectors. We curated a subset of 1,000 word vectors from the full 400,000 to reduce running time while balancing comprehensiveness against computational efficiency. The proposed method is rigorously compared against three established techniques: Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP). We evaluated OA-NN using quantitative metrics (MSE, trustworthiness, continuity) and qualitative visual analysis; on the GloVe 300D dataset, OA-NN achieved a 2-5% better continuity rate than PCA and t-SNE and matched UMAP's continuity.
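To make the comparison pipeline concrete, below is a minimal sketch of one plausible reading of the abstract: an autoencoder-style OA-NN whose encoder adds an orthogonally constrained linear projection to a small adaptive nonlinear branch, evaluated against PCA, t-SNE, and UMAP on a 1,000-vector GloVe subset. The architecture, layer widths, training schedule, and the file name glove.6B.300d.txt are illustrative assumptions rather than the paper's exact configuration; trustworthiness is scikit-learn's implementation, and continuity is obtained by the standard device of swapping the roles of the original and embedded spaces.

# Hedged sketch of the OA-NN comparison described in the abstract.
# Model details below are assumptions, not the paper's exact setup.
import numpy as np
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, trustworthiness

def load_glove_subset(path="glove.6B.300d.txt", n_words=1000):
    """Read the first n_words vectors from a standard GloVe text file (assumed path)."""
    vecs = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i >= n_words:
                break
            vecs.append(np.asarray(line.split()[1:], dtype=np.float32))
    return np.stack(vecs)

class OANN(nn.Module):
    """Orthogonal linear projection plus an adaptive nonlinear correction,
    with a linear decoder so the model can be trained on reconstruction MSE."""
    def __init__(self, d_in=300, d_out=2, hidden=64):
        super().__init__()
        # The parametrization keeps the projection rows orthonormal during training.
        self.linear = orthogonal(nn.Linear(d_in, d_out, bias=False))
        self.nonlinear = nn.Sequential(
            nn.Linear(d_in, hidden), nn.Tanh(), nn.Linear(hidden, d_out))
        self.decoder = nn.Linear(d_out, d_in)

    def forward(self, x):
        z = self.linear(x) + self.nonlinear(x)  # orthogonal + adaptive parts
        return z, self.decoder(z)               # embedding, reconstruction

def fit_oann(X, d_out=2, epochs=500, lr=1e-3):
    x = torch.as_tensor(X)
    model = OANN(X.shape[1], d_out)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        _, x_hat = model(x)
        nn.functional.mse_loss(x_hat, x).backward()  # reconstruction MSE
        opt.step()
    with torch.no_grad():
        return model(x)[0].numpy()

def continuity(X, Z, n_neighbors=5):
    # Continuity is trustworthiness with original and embedded spaces swapped.
    return trustworthiness(Z, X, n_neighbors=n_neighbors)

if __name__ == "__main__":
    X = load_glove_subset()
    embeddings = {
        "OA-NN": fit_oann(X),
        "PCA": PCA(n_components=2).fit_transform(X),
        "t-SNE": TSNE(n_components=2, init="pca").fit_transform(X),
    }
    try:  # UMAP lives in the separate umap-learn package
        import umap
        embeddings["UMAP"] = umap.UMAP(n_components=2).fit_transform(X)
    except ImportError:
        pass
    for name, Z in embeddings.items():
        print(f"{name:6s} trustworthiness={trustworthiness(X, Z, n_neighbors=5):.3f} "
              f"continuity={continuity(X, Z):.3f}")

Under this reading, the orthogonal parametrization preserves global geometry through an orthonormal linear map, while the small nonlinear branch adapts to local neighborhood structure that a purely linear projection would miss.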

References

[1] Greenacre, M., Groenen, P. J. F., Hastie, T., & Iodice D’Enza, A. (2022). Principal component analysis. Nature Reviews Methods Primers, 2(1), 100. https://doi.org/10.1038/s43586-022-00184-w.

[2] Huroyan, V., Navarrete, R., Hossain, M. I., & Kobourov, S. G. (2022). Embedding neighborhoods simultaneously t-SNE (ENS-t-SNE). arXiv preprint. https://doi.org/10.48550/arXiv.2205.11720.

[3] Ravuri, A., & Lawrence, N. D. (2024). Towards one model for classical dimensionality reduction: A probabilistic perspective on UMAP and t-SNE. arXiv preprint. https://doi.org/10.48550/arXiv.2405.17412.

[4] Gettler, E., Warren, B. G., III, Fils-Aime, G., Barrett, A., Graves, A. M., Anderson, D. J., & Smith, B. A. (2025). Where does the glove fit? Examining the effect of hand hygiene timing on healthcare personnel glove contamination. Open Forum Infectious Diseases, 12(Supplement_1), ofae631.529. https://doi.org/10.1093/ofid/ofae631.529.

[5] Exploring Word Vectors: A Tutorial for "Reading and Writing" Word Embeddings.

[6] Jolliffe, I. T. (2002). Principal Component Analysis (2nd ed.). Springer Series in Statistics.

[7] Pearson, K. (1901). On Lines and Planes of Closest Fit to Systems of Points in Space. Philosophical Magazine, 2(11), 559-572.

[8] Kurita, T. (2020). Principal Component Analysis (PCA). In Computer Vision. Springer, Cham.

[9] Hodson, T. O., Over, T. M., & Foks, S. S. (2021). Mean squared error, deconstructed. Journal of Advances in Modeling Earth Systems, 13(12), e2021MS002681. https://doi.org/10.1029/2021MS002681.

[10] Velliangiri, S., Alagumuthukrishnan, S., & Joseph, S. I. T. (2020). A review of dimensionality reduction techniques for efficient computation. Procedia Computer Science, 171, 79-88. https://doi.org/10.1016/j.procs.2020.01.079.

[11] Kutumov, V. (2015). Introduction to t-SNE: A Method for Visualizing Multidimensional Data. Habr.

[12] van der Maaten, L., & Hinton, G. (2008). Visualizing Data Using t-SNE. Journal of Machine Learning Research, 9, 2579-2605.

[13] McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv preprint arXiv:1802.03426.

[14] Saxe, A. M., McClelland, J. L., & Ganguli, S. (2014). Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint arXiv:1312.6120.

[15] Martinis, J. M., & Geller, M. R. (2014). Fast adiabatic qubit gates using only σz control. Physical Review A, 90(2), 022307.

[16] Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.

Published

2025-09-28

Section

Articles