OPTIMIZATION OF METADATA VOLUME IN MODERN INFORMATION SYSTEMS: METHODS, TOOLS AND ALGORITHMIC APPROACHES

Olha TKACHENKO; Andriy LEMESHKO

doi:10.17721/AIT.2025.1.01

Authors

Olha TKACHENKO, DSc (Engin.), Prof., Taras Shevchenko National University of Kyiv
Andriy LEMESHKO, PhD, Assoc. Prof., Taras Shevchenko National University of Kyiv, Kyiv, Ukraine

Keywords:

metadata, optimization, management, information, dynamism, adaptability, relevance, performance, quality, algorithm, compression

Abstract

Background. Metadata volume optimization seeks a balance between descriptive sufficiency and system efficiency. Excessive metadata can overload systems, while insufficient metadata complicates data access. At the same time, growing data volumes make metadata harder to manage, since creating, storing and processing it requires significant resources. Optimizing metadata volume has therefore become an important task for organizations seeking effective information management. The main challenges associated with metadata volume are redundancy, insufficiency, duplication, data dynamics and non-compliance with standards.

Methods. The paper considers the theoretical foundations, practical methods and tools (Collibra, Apache Atlas, Talend Metadata Manager, AI algorithms) of metadata volume optimization, along with its benefits: cost reduction, improved productivity, improved data quality, flexibility and scalability, and improved analytics.

Results. The paper proposes a comprehensive metadata optimization strategy adapted for the IT environment. It shows that a systematic approach combining analysis, standardization, automation and the integration of the latest technologies significantly reduces costs and improves data management. An algorithm for optimizing metadata volume is presented that can be adapted to various application areas, such as databases, content management systems or big data. The algorithm comprises five stages: (1) metadata usefulness assessment, using metric normalization to unify evaluation scales and determine the usefulness of each metadata element; (2) metadata selection (filtering, clustering); (3) metadata compression; (4) automatic optimization using machine learning models and dynamic tuning; (5) verification and adaptation. The algorithm can be expanded or modified depending on the specifics of the task.
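The usefulness assessment and filtering stages can be illustrated with a minimal sketch. The abstract does not specify which metrics, normalization formula, weights or threshold the authors use, so everything here is an assumption made for illustration: the three metrics (`access_count`, `fill_rate`, `size_bytes`), min-max normalization, the weighted sum and the 0.5 cutoff are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MetadataField:
    name: str
    access_count: int   # hypothetical metric: how often the field is read
    fill_rate: float    # hypothetical metric: share of records with a value
    size_bytes: int     # hypothetical metric: average storage cost


def min_max(values):
    """Scale values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def usefulness_scores(fields, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of normalized metrics; storage cost counts against a field."""
    access = min_max([f.access_count for f in fields])
    fill = min_max([f.fill_rate for f in fields])
    cost = min_max([f.size_bytes for f in fields])
    w_access, w_fill, w_cost = weights  # assumed weights, not from the paper
    return {
        f.name: w_access * a + w_fill * fl + w_cost * (1.0 - c)
        for f, a, fl, c in zip(fields, access, fill, cost)
    }


def select_fields(fields, threshold=0.5):
    """Filtering step: keep fields whose score reaches the assumed threshold."""
    scores = usefulness_scores(fields)
    return [f.name for f in fields if scores[f.name] >= threshold]


fields = [
    MetadataField("title", 1000, 1.0, 40),
    MetadataField("legacy_code", 0, 0.0, 200),
    MetadataField("checksum", 500, 1.0, 120),
]
scores = usefulness_scores(fields)
kept = select_fields(fields)
```

Normalizing each metric to a common [0, 1] scale before weighting is what lets counts, ratios and byte sizes be combined into one score; without it, the metric with the largest raw range would dominate.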

Conclusions. The proposed algorithm covers metadata usefulness assessment, using metric normalization to unify evaluation scales and determine the usefulness of each metadata element; metadata selection (filtering, clustering); metadata compression; automatic optimization using machine learning models and dynamic tuning; and testing and adaptation. It can be expanded or modified depending on the specifics of the task.
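The compression and testing stages might look like the following sketch. The abstract does not name a compression method, so DEFLATE via Python's standard `zlib` module is an assumption, as are the helper names `compress_metadata` and `decompress_metadata` and the sample records. The round-trip comparison stands in for the testing step.

```python
import json
import zlib


def compress_metadata(records):
    """Serialize records to compact, key-sorted JSON and deflate the bytes."""
    payload = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return zlib.compress(payload.encode("utf-8"), 9)  # 9 = highest compression level


def decompress_metadata(blob):
    """Inverse of compress_metadata; used to verify that nothing was lost."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical metadata records with highly repetitive values.
records = [{"owner": "analytics", "format": "parquet", "id": i} for i in range(200)]

blob = compress_metadata(records)
raw_size = len(json.dumps(records).encode("utf-8"))
restored = decompress_metadata(blob)
```

Metadata often repeats the same keys and values across records, which is the kind of redundancy a dictionary-based compressor exploits; comparing `restored` with `records` confirms the step is lossless before the original is discarded.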
