Prashant Pandey

GitHub: prashantpandey   Email: ppandey@cs.stonybrook.edu

Research Interests

My research interests lie at the intersection of Systems and Algorithms. I design and build theoretically well-founded data structures for big data problems in computational biology, databases, and file systems.

My current research focuses on building efficient approximate membership query data structures, specifically counting filters, and on their applications. I am also working on compact representations of large DNA sequencing and transcriptome datasets to support large-scale sequence search and de Bruijn graph traversal and assembly. I am also a member of the team that developed BetrFS, an in-kernel file system built on write-optimized indexes.

While interning at Intel Labs, I worked on an encrypted FUSE file system using Intel SGX. At Google, I designed and implemented an extension to the ext4 file system for cryptographically ensuring file integrity. Google is currently working to integrate this extension into Android and the mainline Linux kernel. While at Google, I also worked on the core data structures of Spanner, Google’s geo-distributed database.

I am co-advised by Prof. Michael Bender and Prof. Rob Johnson at Stony Brook University, where I am currently pursuing my Ph.D. in Computer Science.

Recent News

  1. I recently received the Catacosinos fellowship for excellence in computer science at Stony Brook University.   [link]

  2. Our counting quotient filter paper is one of eight ACM SIGMOD 2017 Reproducible Papers.   [link]

  3. Our computational biology research was mentioned on the VMware Research blog.   [link]

  4. The counting quotient filter data structure was featured on The Morning Paper.   [link]

Publications

In reverse chronological order:

  1. Buffered Count-Min Sketch on SSD: Theory and Experiments (ESA 2018)
    Mayank Goswami, Dzejla Medjedovic, Emina Mekic, Prashant Pandey

  2. Mantis: A Fast, Small, and Exact Large-Scale Sequence-Search Index (RECOMB 2018) (Cell Systems 2018)
    Prashant Pandey, Fatemeh Almodaresi, Michael A. Bender, Michael Ferdman, Rob Johnson, and Rob Patro

  3. Rainbowfish: A Succinct Colored de Bruijn Graph Representation (WABI 2017) [biorxiv]
    Fatemeh Almodaresi, Prashant Pandey, and Rob Patro

  4. deBGR: An Efficient and Near-Exact Representation of the Weighted de Bruijn Graph (ISMB 2017) (Bioinformatics 2017)
    Prashant Pandey, Michael A. Bender, Rob Patro, and Rob Johnson

  5. Squeakr: An Exact and Approximate k-mer Counting System (Bioinformatics 2017) [biorxiv]
    Prashant Pandey, Michael A. Bender, Rob Patro, and Rob Johnson

  6. A General-Purpose Counting Filter: Making Every Bit Count (SIGMOD 2017)
    Prashant Pandey, Michael A. Bender, Rob Patro, and Rob Johnson

  7. A Fast x86 Implementation of Select (arXiv 2017)
    Prashant Pandey, Michael A. Bender, and Rob Johnson

  8. Writes Wrought Right, and Other Adventures in File System Optimization (ACM Transactions on Storage (TOS) - Special Issue USENIX FAST 2016)
    Jun Yuan, Yang Zhan, William Jannen, Prashant Pandey, Amogh Akshintala, Kanchan Chandnani, Pooja Deo, Zardosht Kasheff, Michael Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter

  9. Optimizing Every Operation in a Write-optimized File System (FAST 2016) [Awarded Best Paper]
    Jun Yuan, Yang Zhan, William Jannen, Prashant Pandey, Amogh Akshintala, Kanchan Chandnani, Pooja Deo, Zardosht Kasheff, Michael Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter

  10. BetrFS: Write-Optimization in a Kernel File System (ACM Transactions on Storage (TOS) - Special Issue USENIX FAST 2015)
    William Jannen, Jun Yuan, Yang Zhan, Amogh Akshintala, John Esmet, Yizheng Jiao, Ankur Mittal, Prashant Pandey, Phaneendra Reddy, Leif Walsh, Michael A. Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter

  11. BetrFS: A Right-Optimized Write-Optimized File System (FAST 2015) [Runner up for best paper]
    William Jannen, Jun Yuan, Yang Zhan, Amogh Akshintala, John Esmet, Yizheng Jiao, Ankur Mittal, Prashant Pandey, Phaneendra Reddy, Leif Walsh, Michael A. Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter

Talks

  1. Fast and Space-Efficient Maps: Shrinking Big Data Down to Size   [pdf]
    Venue: Proposal defense, Stony Brook University, NY [June 2018]

  2. Mantis: A Fast, Small, and Exact Large-Scale Sequence-Search Index   [pdf]
    Venue: RECOMB 2018, Paris, France [April 2018]

  3. Scheduling Problems in Write-Optimized Key-Value Stores   [pdf]
    Venue: New Challenges in Scheduling Theory 2018, Aussois, France [April 2018]

  4. Compact Representation of Annotated de Bruijn Graphs   [pdf]
    Venue: Berkeley Lab, Berkeley, CA [January 2018]

  5. deBGR: An Efficient and Near-Exact Representation of the Weighted de Bruijn Graph [Extended talk]   [pdf] [Talk]
    Venue: VMware Research, Palo Alto, CA [August 2017] and Google Research, NY [September 2017]

  6. deBGR: An Efficient and Near-Exact Representation of the Weighted de Bruijn Graph   [pdf] [Talk]
    Venue: ISMB 2017, Prague, Czech Republic [July 2017]

  7. A General-Purpose Counting Filter: Making Every Bit Count   [pdf] [Talk]
    Venue: SIGMOD 2017, Chicago, IL [May 2017]

  8. Intel Software Guard Extensions (SGX)   [pdf]
    Venue: Sandia National Laboratories, Livermore, CA [August 2015]