About

My name is Ubaid Ullah Hafeez. I am a Ph.D. student at Stony Brook University, where I work in the PACE lab. I am advised by Dr. Anshul Gandhi. My research interests lie at the intersection of Distributed Systems and Machine Learning. When I am not in my lab, I can be found playing video/board games. I also like to travel.

Technologies I Interact With

  • Memcached
  • TensorFlow
  • Kubernetes
  • Network Simulator (NS)

Publications

List of my most recent publications.

ElMem: Towards an Elastic Memcached System

Ubaid Ullah Hafeez, Muhammad Wajahat, Anshul Gandhi. Proceedings of the 2018 IEEE International Conference on Distributed Computing Systems (ICDCS '18). Best Student Paper Award.

Memory caches, such as Memcached, are a critical component of online applications as they help maintain low latencies. However, memory caches are expensive, both in terms of power and operating costs. It is thus important to dynamically scale such caches in response to workload variations. Unfortunately, stateful systems, such as Memcached, are not elastic in nature. The performance loss that follows a scaling action can severely impact latencies. I contributed to developing an elastic Memcached system that mitigates post-scaling performance loss by proactively migrating hot data between cache nodes. Our experimental results on OpenStack, across several workload traces, show that the elastic Memcached system scales while reducing post-scaling performance degradation by about 90%.

Realising an Elastic Memcached via Cached Data Migration

Ubaid Ullah Hafeez, Deepthi Male, Sharath Kumar Naeni, Muhammad Wajahat, Anshul Gandhi. Proceedings of the 2017 ACM/IFIP/USENIX Middleware Conference (Middleware '17), Posters and Demos Track.

Mitigating Datacenter Incast Congestion Using RTO Randomization

Ubaid Ullah Hafeez, Aqsa Kashaf, Qurat ul ain Bajwa, Aisha Mushtaq, Hassan Mujtaba Zaidi, Ihsan Ayyub Qazi. Proceedings of the 2015 IEEE GLOBECOM (GLOBECOM '15).

TCP incast congestion happens in many-to-one communication patterns that frequently arise in large-scale datacenter applications such as web search and social networks. Incast congestion can severely degrade the performance of applications. To mitigate successive timeouts, I designed algorithms for randomizing the TCP retransmission timeout (RTO). These algorithms rely on (a) successive timeouts, (b) explicit knowledge of the level of multiplexing, and (c) knowledge of flow sizes. Results show that these algorithms improve goodput by 1.5x-11x for up to 64 senders and provide greater improvement for larger numbers of senders.

Ubaid Ullah Hafeez