Unified programming models can effectively improve program portability on various heterogeneous high-performance computers. Existing unified programming models put a lot of effort to code portability but are still far from achieving good performance portability. In this paper, we present a preliminary design of a performance-portable unified programming model including four aspects: programming language, programming abstraction, compilation optimization, and scheduling system. Specifically, domain-specific languages introduce domain knowledge to decouple the optimizations for different applications and architectures. The unified programming abstraction unifies the common features of different architectures to support common optimizations. Multi-level compilation optimization enables comprehensive performance optimization based on multi-level intermediate representations. Resource-aware lightweight runtime scheduling system improves the resource utilization of heterogeneous computers. This is a perspective paper to show our viewpoints on programming models for emerging heterogeneous systems.
- Article type
- Year
- Co-author

Multi-Clock Snapshot Isolation (MCSI) is a concurrency control mechanism that implements snapshot isolation on a single-layer Non-Volatile Memory (NVM) database. It stores a single copy of data by using multi-version storage to ensure durability and runtime access. With multi-clock transaction timestamp assignment, MCSI can efficiently generate snapshots with vector clocks and use per-thread transaction status arrays to identify uncommitted versions in NVM. For evaluation, we compared MCSI with the PostgreSQL-style concurrency control used in the single-layer NVM database N2DB. The maximum transaction throughput of MCSI is 101%–195% higher than that of N2DB for the YCSB workloads, and 25%–49% higher for the TPC-C workloads. Moreover, the transaction latency of MCSI remains relatively stable as the thread count increases. With 18 worker threads, the average transaction latency of MCSI is 65%–84% lower than that of N2DB for the YCSB workloads and 16%–43% lower for the TPC-C workloads.

Non-Volatile Memory (NVM) offers byte-addressability and persistency. Because NVM can be plugged into memory and provide low latency, it offers a new opportunity to build new database systems with a single-layer storage design. A single-layer NVM-Native DataBase (N2DB) provides zero copy and log freedom. Hence, all data are stored in NVM and there is no extra data duplication and logging during execution. N2DB avoids complex data synchronization and logging overhead in the two-layer storage design of disk-oriented databases and in-memory databases. Garbage Collection (GC) is critical in such an NVM-based database because memory leaks on NVM are durable. Moreover, data recovery is equally essential to guarantee atomicity, consistency, isolation, and durability properties. Without logging, it is a great challenge for N2DB to restore data to a consistent state after crashes and recoveries. This paper presents the GC and data recovery mechanisms for N2DB. Evaluations show that the overall performance of N2DB is up to

Brain-inspired computing refers to computational models, methods, and systems, that are mainly inspired by the processing mode or structure of brain. A recent study proposed the concept of "neuromorphic completeness" and the corresponding system hierarchy, which is helpful to determine the capability boundary of brain-inspired computing system and to judge whether hardware and software of brain-inspired computing are compatible with each other. As a position paper, this article analyzes the existing brain-inspired chips’ design characteristics and the current so-called "general purpose" application development frameworks for brain-inspired computing, as well as introduces the background and the potential of this proposal. Further, some key features of this concept are presented through the comparison with the Turing completeness and approximate computation, and the analyses of the relationship with "general-purpose" brain-inspired computing systems (it means that computing systems can support all computable applications). In the end, a promising technical approach to realize such computing systems is introduced, as well as the on-going research and the work foundation. We believe that this work is conducive to the design of extensible neuromorphic complete hardware-primitives and the corresponding chips. On this basis, it is expected to gradually realize "general purpose" brain-inspired computing system, in order to take into account the functionality completeness and application efficiency.