Volume 27, Issue 3


Garbage Collection and Data Recovery for N2DB

Shiyu Cai, Kang Chen, Mengxing Liu, Xuyang Liu, Yongwei Wu, Weimin Zheng
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Abstract

Non-Volatile Memory (NVM) offers byte-addressability and persistence. Because NVM can be plugged in as main memory and provides low latency, it creates an opportunity to build database systems with a single-layer storage design. A single-layer NVM-Native DataBase (N2DB) provides zero copy and log freedom: all data are stored in NVM, with no extra data duplication or logging during execution. N2DB thereby avoids the complex data synchronization and logging overhead of the two-layer storage design used by disk-oriented and in-memory databases. Garbage Collection (GC) is critical in such an NVM-based database because memory leaks on NVM are durable. Data recovery is equally essential to guarantee the Atomicity, Consistency, Isolation, and Durability (ACID) properties. Without logging, restoring data to a consistent state after a crash is a great challenge for N2DB. This paper presents the GC and data recovery mechanisms for N2DB. Evaluations show that the overall performance of N2DB is up to 3.6× higher than that of InnoDB. Enabling GC reduces performance by up to 10% but saves up to 67% of storage space. Moreover, our data recovery requires only 0.2% of the time and half of the storage space that InnoDB requires.

Keywords: Non-Volatile Memory (NVM), Garbage Collection (GC), data recovery


Publication history

Received: 04 February 2021
Accepted: 15 February 2021
Published: 13 November 2021
Issue date: June 2022

Copyright

© The author(s) 2022

Acknowledgements

This work was supported by the National Key Research & Development Program of China (No. 2016YFB1000504) and the National Natural Science Foundation of China (Nos. 61877035, 61433008, 61373145, and 61572280).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
