Recent DaMRL Projects


EVOLVING FLASH-BASED STORAGE

Abstract:

Flash-based solid-state drives (SSDs) are widely used to accelerate applications because of their superior overall performance compared to hard-disk drives (HDDs). With SSDs, the storage stack overhead imposed by the operating system (OS), rather than device speed, is now the bottleneck, and addressing it is a key research priority. It is critical to develop new techniques that take full advantage of the unique characteristics of flash memory and flash-based persistent storage. However, existing operating systems cannot take advantage of such techniques because they are designed generically to support a broad class of storage devices. There is thus a critical need to rethink our system infrastructure to exploit the best and potentially unique aspects of flash-based memory and NVMe SSDs as persistent storage. The primary objective of our research is to design new system infrastructures that take advantage of the unique flash characteristics exposed by new storage devices to accelerate various applications.

Publications:

  1. Mahsa Bayati, Janki Bhimani, Ronald Lee, Ningfang Mi. Exploring Benefits of NVMe SSDs for Big Data Processing in Enterprise Data Centers. International Conference on Big Data Computing and Communication (BIGCOM19), Qingdao, China, 2019.

Acknowledgments:

NSF


DATACENTER SCHEDULING AND RESOURCE MANAGEMENT

Abstract:

In the era of big data and cloud computing, large amounts of data are generated by user applications and need to be processed in the datacenter. High-performance, scalable frameworks have become essential for data-intensive processing and analytics in both industry and academia. More and more applications are using new parallel-data computing frameworks such as TensorFlow and Apache Spark. Maximizing resource utilization while minimizing big data processing time is an interesting research problem. However, given the limited resources in a cluster and the complex dependencies in data flow, it is challenging to design scheduling and resource management techniques. Therefore, the primary focus of our research is to develop new schemes for job scheduling and resource management for evolving parallel-data computing frameworks and applications.

Publications:

  1. Danlin Jia, Janki Bhimani, Son Nam Nguyen, Bo Sheng, and Ningfang Mi, ATuMm: Auto-tuning Memory Manager in Apache Spark, 2019 International Performance Computing and Communications Conference (IPCCC19), London, UK, 2019. Acceptance Rate: 29.2%.

Public Software:

  1. https://github.com/DanlinJia/spark_core_ATMM

Acknowledgments:

NSF


I/O BEHAVIOR MODELING & PERSISTENT STORAGE DEVICE CONFIGURATION

Abstract:

This project makes empirical contributions to storage systems by addressing challenges posed by large-scale data-intensive applications. Specifically, it advances (1) how to analyze the impact of various system components while running multiple workloads on emerging storage systems; (2) how to design interactive frameworks that allow users to modify the internal algorithms and parameters of modern storage devices; (3) how to enable novices to configure storage systems with respect to their workloads and data processing requirements; and (4) how to derive I/O models to predict future I/O workload patterns and accordingly configure storage systems in advance for better performance.

This project will enable the design of better storage systems with high performance and reliability. The outcome of this project will have a significant impact on many areas that depend on processing large amounts of data. This project will share the findings with undergraduate and graduate students through computer science and engineering programs and open up career opportunities to female students, underrepresented minorities, and first-generation college students. This project will disseminate the proposed techniques to industry and foster technology transfer through new industrial collaborations. The developed infrastructure will be available to the research community through a web-based portal.

Publications:

  1. Danlin Jia, Manoj Pravakar Saha, Janki Bhimani, and Ningfang Mi, “Performance and Consistency Analysis for Distributed Deep Learning Applications”, 2020 International Performance Computing and Communications Conference (IPCCC20), Virtual using Zoom, 2020. Acceptance Rate: 29.3%
  2. Janki Bhimani, Ningfang Mi, Miriam Leeser, and Zhengyu Yang, New Performance Modeling Methods for Parallel Data Processing Applications, ACM Transactions on Modeling and Computer Simulation (TOMACS), 2019. DOI 10.1145/3309684.
  3. Janki Bhimani, Rajinikanth Pandurangan, Ningfang Mi, and Vijay Balakrishnan, Emulate Processing of Assorted Database Server Applications on Flash-Based Storage in Datacenter Infrastructures, 2019 International Performance Computing and Communications Conference (IPCCC19), London, UK, 2019. Acceptance Rate: 29.2%

Public Software:

  1. https://github.com/bhimanijanki/FiM
  2. https://github.com/bhimanijanki/KMeans

Acknowledgments:

NSF


IMPACT OF ENVIRONMENTAL FACTORS ON FLASH STORAGE

Abstract:

Understanding the reliability of components is an important criterion for building robust systems. Data storage is one of the most critical components and is at the center of all emerging technologies. Thus, studying reliability and the different types of faults in storage components is important. Moreover, with rapidly emerging flash-based storage technologies such as Solid State Drives (SSDs), the fault tolerance understanding developed for Hard Disk Drives (HDDs) is not directly applicable. We study the impact of various environmental factors, such as vibration, temperature, and humidity, on the performance of SSDs in data center infrastructures. We investigate the “short-term” and “long-term” impacts of such exposure on SSDs. We also analyze how these impacts vary across different types of application workloads.

Publications:

  1. Janki Bhimani, Tirthak Patel, Ningfang Mi, and Devesh Tiwari, What does Vibration do to Your SSD?, 2019 Design Automation Conference (DAC19), Las Vegas, NV, 2019. Acceptance Rate: 24.3%.

Public Software:

  1. https://github.com/bhimanijanki/SSD_Vibration

Acknowledgments:

NSF


IMPROVING FLASH ENDURANCE IN DATA CENTERS

Abstract:

With the capital expenditure of SSDs declining and their storage capacity increasing, all-flash data centers are evolving to serve cloud services better than SSD-HDD hybrid data centers. During this transition, the biggest challenge is to reduce the Write Amplification Factor (WAF) and improve the endurance of SSDs, since these devices have a limited number of program/erase cycles. Specifically, storing data with different lifetimes in a single SSD can cause high WAF, reduce endurance, and degrade performance. Here, data with a similar lifetime means I/O streams with similar temporal access patterns, such as re-access frequency. Motivated by this, multi-stream SSDs have been developed to enable data with different lifetimes to be stored in different SSD regions. The rationale is to reduce internal data movement: when garbage collection is triggered, there is a high chance that a data block contains pages that are either all invalid or all valid. However, the limitation of this technology is that the system needs to manually assign the same streamID to data with a similar lifetime. We are working towards designing systems that perform this data placement automatically to improve flash endurance in data centers running multi-tenant applications.
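
The effect of lifetime-aware placement on WAF can be illustrated with a toy model. The sketch below is only an illustration under simplifying assumptions and is not the simulator or the method from the publications listed here: the block size, the two lifetime labels ("short" and "long"), and the helper names are all hypothetical. It assumes every short-lived page has already been invalidated, that garbage collection reclaims every block holding at least one invalid page, and that WAF is (host writes + pages relocated by garbage collection) / host writes.

```python
from typing import List

PAGES_PER_BLOCK = 4  # assumed toy block size, not a real device parameter


def fill_blocks(pages: List[str]) -> List[List[str]]:
    """Pack pages into blocks in arrival order."""
    return [pages[i:i + PAGES_PER_BLOCK]
            for i in range(0, len(pages), PAGES_PER_BLOCK)]


def place_multi_stream(pages: List[str]) -> List[List[str]]:
    """Give each lifetime label its own stream, so blocks hold one label only."""
    blocks: List[List[str]] = []
    for stream in ("short", "long"):
        blocks.extend(fill_blocks([p for p in pages if p == stream]))
    return blocks


def write_amplification(blocks: List[List[str]]) -> float:
    """WAF after all "short" pages became invalid and GC reclaimed their blocks."""
    host_writes = sum(len(block) for block in blocks)
    relocated = 0
    for block in blocks:
        invalid = block.count("short")
        if invalid > 0:
            # Still-valid ("long") pages must be copied before the erase.
            relocated += len(block) - invalid
    return (host_writes + relocated) / host_writes


# Sixteen host writes that alternate between the two lifetimes.
writes = ["short", "long"] * 8
waf_mixed = write_amplification(fill_blocks(writes))
waf_streams = write_amplification(place_multi_stream(writes))
```

In this toy setting the mixed layout leaves every block half valid, so garbage collection has to copy the long-lived pages, while the per-stream layout yields blocks that are either fully invalid (erased without copies) or fully valid (never selected).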

Publications:

  1. Janki Bhimani, Zhengyu Yang, Jingpei Yang, Adnan Maruf, Ningfang Mi, Rajinikanth Pandurangan, Changho Choi, Vijay Balakrishnan. Automatic Stream Identification to Improve Flash Endurance in Data Centers. ACM Transactions on Storage (TOS) 2021.
  2. Janki Bhimani, Ningfang Mi, Zhengyu Yang, Jingpei Yang, Rajinikanth Pandurangan, Changho Choi, and Vijay Balakrishnan, FIOS: Feature-Based I/O Stream Identification for Improving Endurance of Multi-Stream SSDs, 2018 IEEE International Conference on Cloud Computing (CLOUD’18), San Francisco, CA, 2018. Acceptance Rate: 15%. (Best Paper Award)
  3. Janki Bhimani, Jingpei Yang, Zhengyu Yang, Ningfang Mi, NHV Krishna Giri, Rajinikanth Pandurangan, Changho Choi, and Vijay Balakrishnan. Enhancing SSDs with multi-stream: What? why? how? IEEE International Performance Computing and Communications Conference (IPCCC17), San Diego, CA, 2017. (Short Paper)

Public Software:

  1. https://github.com/bhimanijanki/ms_ssds_sim

Acknowledgments:

Samsung, NSF


UNDERSTANDING FLASH-BASED STORAGE I/O BEHAVIOUR OF GAMES

Abstract:

Computer games are an extremely popular but overlooked workload. Cloud gaming was one of the biggest buzzwords in the gaming industry throughout 2020. The rapid growth of the video gaming industry and the diverse set of popular video games available today make it increasingly important to understand their I/O characteristics in order to improve game performance and design better gaming servers and consoles. We attempt to systematically measure, quantify, and characterize the organization of game data into files, back-end storage access patterns, and the performance of gaming workloads. We explore the I/O behavior of recent, popular games and report a series of observations from measurements on a real setup.

Publications:

  1. Adnan Maruf, Zhengyu Yang, Bridget Davis, Daniel Kim, Jeffrey Wong, Matthew Durand, and Janki Bhimani, Understanding Flash-Based Storage I/O Behavior of Games, 2021 IEEE International Conference on Cloud Computing (CLOUD’21), Online Virtual Congress, 2021. Acceptance Rate: 23.8%.

Acknowledgments:

Samsung, NSF


EMERGING KEY-VALUE BASED FLASH MEMORIES

Abstract:

An increasing concern that curbs the widespread adoption of key-value SSDs (KV-SSD) is whether offloading host-side operations to the storage device changes device behavior and negatively affects the overall performance of applications. In this work, we systematically measure, quantify, and understand the performance of KV-SSD by studying the impact of its distinct components, such as indexing, data packing, and key handling, on I/O concurrency, garbage collection, and space utilization. Our experiments and analysis reveal that KV-SSD’s behavior differs from the well-known idiosyncrasies of block SSDs. A proper understanding of its characteristics will enable us to achieve better performance for random, read-heavy, and highly concurrent workloads.

Publications:

  1. Manoj Pravakar Saha, Bryan Kim, and Janki Bhimani, KV-SSD: What is it Good For?, 2021 Design Automation Conference (DAC’21), San Francisco, CA, 2021. Acceptance Rate: 23%.

Public Software:

  1. https://support.cis.fiu.edu/ftp/damrl/
  2. https://damrl.cs.fiu.edu/wp-content/uploads/sites/59/2021/12/RHIK_TR.pdf

Acknowledgments:

Samsung, NSF


MULTI-CLOCK: DYNAMIC TIERING FOR HYBRID MEMORY SYSTEMS

Abstract:

The rapid growth of in-memory computing by data-intensive applications today has increased the demand for DRAM in servers. However, a DRAM-based system can be limiting for modern workloads because of its capacity, cost, and power consumption characteristics. Hybrid memory systems, which consist of different types of memory, such as DRAM and persistent memory, can help address many of these limitations. Persistent memory devices are byte-addressable like DRAM but are also larger in capacity and consume less power relative to DRAM. One promising direction that has been explored in the recent literature is to introduce these persistent memory devices as a second memory tier that is directly exposed to the CPU. The resulting tiered memory design must address the fundamental challenge of placing the right data in the right memory tier at the right time with minimal real-system overhead. We present MULTI-CLOCK, an efficient, low-overhead hybrid memory system that relies on a unique page selection technique. MULTI-CLOCK’s careful page selection captures both page access recency and frequency, and moves pages to appropriate tiers at the right time within hybrid memory systems. We implemented a Linux-based, NUMA-aware version of MULTI-CLOCK that is entirely transparent and backward compatible with any existing application. Our evaluation with diverse real-world applications such as graph processing and key-value stores shows that MULTI-CLOCK can improve the average throughput by as much as 352% when compared with several state-of-the-art techniques for tiered memory.
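
To make the idea of combining recency and frequency concrete, here is a minimal CLOCK-style sketch. It is not the MULTI-CLOCK implementation and makes no claim about the actual kernel data structures: the class names, the `promote_after` threshold, and the rule "promote a page found referenced in two consecutive scans" are hypothetical choices made only for illustration. A per-page reference bit records recent use, and a count of consecutive scans that found the bit set approximates frequency.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Page:
    referenced: bool = False  # set on access, cleared by each scan (recency)
    hot_scans: int = 0        # consecutive scans that saw the bit set (frequency)


class ToyLowerTier:
    """Pages resident in the slower tier, scanned periodically (hypothetical)."""

    def __init__(self, promote_after: int = 2) -> None:
        self.pages: Dict[int, Page] = {}
        self.promote_after = promote_after  # assumed threshold

    def access(self, page_id: int) -> None:
        """Record an access by setting the page's reference bit."""
        self.pages.setdefault(page_id, Page()).referenced = True

    def scan(self) -> List[int]:
        """One sweep: update counters, clear bits, return pages to promote."""
        promote: List[int] = []
        for page_id, page in self.pages.items():
            if page.referenced:
                page.hot_scans += 1
            else:
                page.hot_scans = 0  # an idle interval resets the streak
            page.referenced = False
            if page.hot_scans >= self.promote_after:
                promote.append(page_id)
        for page_id in promote:
            del self.pages[page_id]  # the page now lives in the faster tier
        return promote
```

With this rule, a page touched once stays in the lower tier, while a page touched again in the following scan interval becomes a promotion candidate; a real system would also need a demotion path and would have to bound scanning overhead.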

Publications:

  1. Adnan Maruf, Ashikee Ghosh, Janki Bhimani, Daniel Campello, Andy Rudoff, Raju Rangaswami, MULTI-CLOCK: Dynamic Tiering for Hybrid Memory Systems, 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA’22), Seoul, South Korea, 2022. Acceptance Rate: 30%.

Public Software:

  1. https://github.com/bhimanijanki/Multi-Clock

Acknowledgments:

NSF


TOWARDS LEVERAGING IN-STORAGE INDEXING DEVICES (ISIDs)

Abstract:

The overarching goal of this research project is to advance the capabilities of ISIDs to promote their widespread adoption in storage systems with better performance. The specific research objectives of the project are organized into four thrusts.
The first thrust focuses on modeling ISID performance and reliability by developing novel queuing models that capture dependencies among internal features and proposing solutions for dynamic model calibration. The second thrust aims to design new elastic index management techniques that effectively utilize the limited on-device resources, taking into account flash-specific constraints to optimize endurance and latency. The third thrust addresses the challenges of adaptive indexing in multi-tenant environments, including interference mitigation, optimized index updates, and reevaluation of wear-leveling techniques. Lastly, the fourth thrust aims to establish a host-device interface, conduct comprehensive case studies, develop productivity tools, and integrate programmable board-based SSDs into an open-source Linux code base. Through these research endeavors, the project aims to drive significant advancements in ISID technology and contribute to the overall improvement of performance and efficiency in storage systems. 

Publications:

  1. Janki Bhimani, Jingpei Yang, Ningfang Mi, Changho Choi, and Manoj Pravakar Saha, Fine-grained Control of Concurrency within KV-SSDs, 2021 14th ACM International Systems and Storage Conference (SYSTOR’21), Virtual. Acceptance Rate: 29.9%.
  2. Ziyang Jiao, Janki Bhimani, Bryan S. Kim, Wear Leveling in SSDs Considered Harmful, 2022 ACM Workshop on Hot Topics in Storage and File Systems (HotStorage ’22), Virtual. (Best Paper Award)
  3. Manoj Saha, Danlin Jia, Janki Bhimani and Ningfang Mi, MoKE: Modular Key-value Emulator for Realistic Studies on Emerging Storage Devices, 2023 IEEE International Conference on Cloud Computing (CLOUD’23), Hybrid Event, Chicago, IL, 2023.
  4. Manoj P. Saha, Omkar Desai, Bryan S. Kim, Janki Bhimani. “Leveraging Keys In Key-Value SSD for Production Workloads”, The International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC’23), Orlando, FL, 2023. (Short Paper)
  5. Adnan Maruf, Daniel Carlson, Ashikee Ghosh, Manoj Saha, Janki Bhimani, Raju Rangaswami. “Allocation Policies Matter for Hybrid Memory Systems”, The International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC’23), Orlando, FL, 2023. (Short Paper)
  6. Manoj P. Saha, Bryan S. Kim, Haryadi S. Gunawi, Janki Bhimani. “RHIK – Re-configurable Hash-based Indexing for KVSSD”, The International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC’23), Orlando, FL, 2023. (Short Paper)

Public Software:

Acknowledgments: