SWE Tea - Year 1

1. Infra and Networking
2. Distributed Systems
3. System Management
4. Virtualization and Memory
5. Algorithms and Theory

About a year ago, my buddy Casper and I started a paper reading group. We ended up covering 45 papers in the first year, which was a blast. Some papers were duds and some were absolutely brilliant, so here's a list of the ones that really made us think. You can find the full list on the history page: http://malloc.dog/swetea

1. Infra and Networking

Jupiter Rising
- Singh, Arjun, Joon Ong, Amit Agarwal, Glen Anderson, Ashby Armistead, Roy Bannon, Seb Boving, et al. "Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network." In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 183–97. London, United Kingdom: ACM, 2015. https://doi.org/10.1145/2785956.2787508
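- This paper was incredible - it shows how Google built its datacenter fabric out of Clos topologies of merchant-silicon switches, run by a centralized control plane, and kept scaling that design for a decade. (The optical switching fabric that routes traffic with actual mirrors is from the 2022 follow-up, Jupiter Evolving.)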
FBOSS
- Choi, Sean, Boris Burkov, Alex Eckert, Tian Fang, Saman Kazemkhani, Rob Sherwood, Ying Zhang, and Hongyi Zeng. "FBOSS: Building Switch Software at Scale," 2018
- A peek into the internals of FBOSS, Facebook's switch software. Probably outdated compared to what's running now, but the architecture is fascinating.
Slicer
- Adya, Atul, Daniel Myers, Jon Howell, Jeremy Elson, Colin Meek, Vishesh Khemani, Stefan Fulger, et al. "Slicer: Auto-Sharding for Datacenter Applications," n.d.
- Slicer argues that consistent hashing on its own balances load poorly at scale, and that a central coordinator that can see actual load does way better at resource utilization.
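To make the load-balancing point concrete, here's a minimal sketch comparing a plain consistent-hash ring against a coordinator that can see per-key load. This is not Slicer's actual algorithm: `ring_assign`, `central_assign`, and `imbalance` are hypothetical helpers, the greedy "heaviest key to least-loaded server" policy is a stand-in, and the Zipf-like load is synthetic.

```python
import bisect
import hashlib
import heapq


def _h(s: str) -> int:
    """Stable 64-bit hash of a string (md5 used only for determinism)."""
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")


def ring_assign(keys, servers, vnodes=1):
    """Consistent hashing: each key goes to the next server point on the ring."""
    points = sorted((_h(f"{s}#{v}"), s) for s in servers for v in range(vnodes))
    hashes = [p for p, _ in points]
    out = {}
    for k in keys:
        i = bisect.bisect_right(hashes, _h(k)) % len(points)  # wrap around
        out[k] = points[i][1]
    return out


def central_assign(load, servers):
    """Hypothetical coordinator: place heaviest keys first on the least-loaded server."""
    heap = [(0.0, s) for s in servers]
    heapq.heapify(heap)
    out = {}
    for k in sorted(load, key=load.get, reverse=True):
        total, s = heapq.heappop(heap)
        out[k] = s
        heapq.heappush(heap, (total + load[k], s))
    return out


def imbalance(assignment, load, servers):
    """Max server load divided by mean server load (1.0 is perfectly even)."""
    totals = {s: 0.0 for s in servers}
    for k, s in assignment.items():
        totals[s] += load[k]
    mean = sum(totals.values()) / len(servers)
    return max(totals.values()) / mean


servers = [f"srv{i}" for i in range(8)]
# Synthetic skewed load (an assumption): key i has weight 1/(i+1).
load = {f"key{i}": 1.0 / (i + 1) for i in range(1000)}

ring = ring_assign(load.keys(), servers)
central = central_assign(load, servers)
print("ring imbalance:   ", round(imbalance(ring, load, servers), 2))
print("central imbalance:", round(imbalance(central, load, servers), 2))
```

The ring never looks at load, so a hot key lands wherever its hash says and drags its neighbors along. The coordinator pays for a global view, but it can at least keep the hottest key on a server by itself.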

2. Distributed Systems

RocksDB
- Dong, Siying, Andrew Kryczka, Yanqin Jin, and Michael Stumm. "RocksDB: Evolution of Development Priorities in a Key-Value Store Serving Large-Scale Applications." ACM Transactions on Storage 17, no. 4 (November 30, 2021): 1–32. https://doi.org/10.1145/3483840
- This paper is massive, but it's worth it. Great deep dive into why RocksDB works so well and where it fits in modern architectures.
Ceph
- Weil, Sage A, Scott A Brandt, Ethan L Miller, Darrell D E Long, and Carlos Maltzahn. "Ceph: A Scalable, High-Performance Distributed File System," n.d.
- A fantastic "big ideas" paper. CRUSH, metadata/object store separation, and dynamic metadata management - all ideas that were way ahead of their time.
DTrace
- Cantrill, Bryan M, Michael W Shapiro, and Adam H Leventhal. "Dynamic Instrumentation of Production Systems," n.d.
- The OG tracing paper. Modern tools are still trying to catch up to the dynamic instrumentation of live production systems that DTrace pulled off years ago.

3. System Management

Dynamo
- Wu, Qiang, Qingyuan Deng, Lakshmi Ganesh, Chang-Hong Hsu, Yun Jin, Sanjeev Kumar, Bin Li, Justin Meza, and Yee Jiun Song. "Dynamo: Facebook's Data Center-Wide Power Management System," n.d.
- The efficiency numbers aren't mind-blowing, but it's a great look into power management at warehouse scale.
Build Systems à la Carte
- Mokhov, Andrey, Neil Mitchell, and Simon Peyton Jones. "Build Systems à La Carte." Proceedings of the ACM on Programming Languages 2, no. ICFP (July 30, 2018): 1–29. https://doi.org/10.1145/3236774
- This is an excellent paper. It pokes at your brain by actually asking you what you consider a build system to be.
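For a taste of the question, here's a rough Python sketch of the framing as we understood it: a task is just a function that computes a value by calling `fetch` on its dependencies, and a build system is whatever decides how `fetch` gets answered. The paper does this in Haskell. The `build` function below, its names, and the spreadsheet-style cells are our own illustrative assumptions, not the paper's code.

```python
from typing import Callable, Dict

Fetch = Callable[[str], int]
Task = Callable[[Fetch], int]

# Spreadsheet-style tasks: B1 = A1 + A2, B2 = B1 * 2. A1 and A2 are inputs.
tasks: Dict[str, Task] = {
    "B1": lambda fetch: fetch("A1") + fetch("A2"),
    "B2": lambda fetch: fetch("B1") * 2,
}


def build(tasks: Dict[str, Task], target: str, store: Dict[str, int]) -> Dict[str, int]:
    """Bring `target` up to date, running each task at most once per build."""
    store = dict(store)  # leave the caller's store untouched
    done = set()

    def fetch(key: str) -> int:
        task = tasks.get(key)
        if task is not None and key not in done:
            store[key] = task(fetch)  # dependencies are discovered as the task runs
            done.add(key)
        return store[key]  # inputs come straight from the store

    fetch(target)
    return store


result = build(tasks, "B2", {"A1": 10, "A2": 20})
print(result["B1"], result["B2"])  # 30 60
```

Swap in a different `fetch` strategy (skip tasks whose inputs haven't changed, pull results from a shared cache) and you get a different build system over the same tasks. This sketch does no cycle detection and remembers nothing between builds.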

4. Virtualization and Memory

Xen
- Barham, Paul, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. "Xen and the Art of Virtualization." Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles - SOSP '03, 2003, 164. https://doi.org/10.1145/945445.945462
- Another classic "big ideas" paper that laid out what good virtualization should look like.

5. Algorithms and Theory

ANS
- https://kedartatwawadi.github.io/post--ANS/
- ANS (asymmetric numeral systems) is wild stuff, and this write-up explains how it works and what compression could be if we really pushed it.
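If you want to poke at the core idea, here's a toy range-ANS (rANS) coder. It's a minimal sketch under loud assumptions: the whole message is packed into a single unbounded Python integer (real coders stream bits out and keep the state in a machine word), the frequency table is built from the message itself and would have to be shipped alongside, and `build_table`, `encode`, and `decode` are our own names, not taken from the linked post.

```python
from collections import Counter


def build_table(message: str):
    """Symbol frequencies and cumulative starts; the total equals len(message)."""
    freq = Counter(message)
    start, total = {}, 0
    for sym in sorted(freq):
        start[sym] = total
        total += freq[sym]
    return freq, start, total


def encode(message: str, freq, start, total) -> int:
    state = 0
    # Encode in reverse so that decoding pops symbols in forward order.
    for sym in reversed(message):
        f, c = freq[sym], start[sym]
        state = (state // f) * total + c + (state % f)
    return state


def decode(state: int, n: int, freq, start, total) -> str:
    out = []
    for _ in range(n):
        slot = state % total  # which symbol's range the state falls in
        sym = next(s for s in freq if start[s] <= slot < start[s] + freq[s])
        out.append(sym)
        state = freq[sym] * (state // total) + slot - start[sym]
    return "".join(out)


message = "abracadabra" * 20
freq, start, total = build_table(message)
state = encode(message, freq, start, total)
assert decode(state, len(message), freq, start, total) == message
print(state.bit_length(), "bits vs", 8 * len(message), "bits raw")
```

Frequent symbols grow the state by less than rare ones, which is where the compression comes from. The decoder needs the symbol count because the state alone doesn't say when to stop.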
