Bandizip: an approach to lossless compression of database files

Authors

  • Fabricio Marcillo, Instituto Superior Universitario Japón
  • Javier Guaña, Instituto Superior Universitario Japón
  • Yamileth Arteaga, Instituto Superior Universitario Japón
  • Lucía Begnini, Instituto Superior Universitario Japón

DOI:

https://doi.org/10.18779/ingenio.v8i1.889

Keywords:

Lossless compression, file formats, Bandizip

Abstract

Data compression is a core process in computer science and information theory: it reduces the size of information by applying specialized techniques that eliminate statistical or structural redundancy. Compression can be lossless, preserving data integrity, or lossy, sacrificing accuracy for greater compression. Its applications range from optimizing storage to improving data transfer performance, at a computational cost that must be weighed against those gains. The choice of compression format can significantly affect the performance of database and storage operations. Notable formats include ZIP, ZIPX, 7Z and XZ, each with characteristics that suit it to particular contexts. For example, ZIPX and EXE perform well in terms of compression rate, while XZ excels in Unix and Linux environments. In this study, Bandizip was used to demonstrate versatility in managing compressed files, with a focus on compression efficiency and compression rate. A comparative analysis of compression rates showed that ZIPX, EXE and XZ are the best choices for compressing databases, maximizing size reduction without compromising data integrity. These findings underscore the importance of selecting the compression format strategically to optimize the storage and transmission of large volumes of information, especially in database environments.
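
The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the authors' procedure: the study used Bandizip, whereas this example uses Python's standard-library codecs (DEFLATE as used by ZIP, bzip2, and LZMA as used by XZ and 7Z). ZIPX and self-extracting EXE archives are not available in the standard library and are left out. The metric (space saving as a percentage of the original size), the function names, and the synthetic SQL dump are all assumptions made for illustration.

```python
import bz2
import lzma
import zlib


def space_saving(original: bytes, compressed: bytes) -> float:
    """Percentage of the original size removed by compression (assumed metric)."""
    return 100.0 * (1 - len(compressed) / len(original))


def compare_codecs(data: bytes) -> dict:
    """Compress data with each codec, verify the round trip, report space saving."""
    codecs = {
        "deflate (ZIP)": (zlib.compress, zlib.decompress),
        "bzip2": (bz2.compress, bz2.decompress),
        "lzma (XZ/7Z)": (lzma.compress, lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        # Lossless check: decompressing must reproduce the input byte for byte.
        if decompress(packed) != data:
            raise ValueError(f"{name} did not round-trip the data")
        results[name] = space_saving(data, packed)
    return results


# Hypothetical stand-in for a database dump: repetitive SQL INSERT statements.
sample = b"".join(
    b"INSERT INTO customers VALUES (%d, 'name_%d', 'city_%d');\n" % (i, i % 50, i % 7)
    for i in range(5000)
)

if __name__ == "__main__":
    for codec, saving in compare_codecs(sample).items():
        print(f"{codec:15s} {saving:6.2f}% smaller")
```

The figures this prints depend entirely on the input data and the codec settings, so they say nothing about the results reported in the article; the sketch only shows how a compression rate can be measured while confirming that no data is lost.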

Published

2025-01-14

How to Cite

Marcillo, F., Guaña, J., Arteaga, Y., & Begnini, L. (2025). Bandizip: an approach to lossless compression of database files. InGenio Journal, 8(1), 137–146. https://doi.org/10.18779/ingenio.v8i1.889

Section

Articles