Ask HN: Stanford CS 153 help

65 points by anjneymidha 4 days ago

hi hn - i'm volunteering at Stanford next quarter to co-teach cs 153 (infrastructure at scale) - a course i wish had existed during my undergrad years. rather than pure theory, it's focused on how large-scale systems actually work in production

the format combines hands-on projects with a speaker series. we've confirmed some solid speakers (Jensen Huang from NVIDIA, Matthew Prince from Cloudflare etc), but i'm also keen to bring in perspectives from folks who don't fit the standard mold. tbh, many of the best systems eng/devs/infra ppl i've worked with are pretty weird - they think differently, take unconventional paths, and often learn by obsessively building and breaking things rather than following traditional routes. i think it would be cool for the students to realize it's a feature, not a bug, to be weirdly obsessive

if you're interested in this kind of stuff, i'd value your thoughts on:

1/ who are the fascinating/unsung heroes in infra/systems eng that students should learn from? especially interested in people who've solved hard scaling problems through unconventional thinking or unique approaches

2/ what kind of projects do you think would be fun and meaningfully demonstrate real-world infrastructure challenges while still being achievable in an academic quarter?

prerequisites are CS106/CS111 level programming. draft syllabus here: https://explorecourses.stanford.edu/search?view=catalog&filt...

email: anjney at alumni dot stanford edu if you prefer to share thoughts privately. thank you in advance for any and all help

spenczar5 4 days ago

Rachel by the Bay (https://rachelbythebay.com/) has long impressed me as someone who clearly is deep in the actual work of systems, day in and day out, and can write well about it.

Julia Evans has a wonderful approach as well, and has amazing talent for teaching: https://jvns.ca/

Kellan Elliott-McCrea (https://laughingmeme.org/) has given the world some of the better advice on the hardest parts of software scaling, which is of course scaling the human organizations. New grads are virtually always underestimating that part of the work; eventually you realize the hard problems are usually social and not technical.

  • anjneymidha 4 days ago

    i've followed Rachel and Julia for a long time, but didn't know about Kellan - thanks so much for that.

    re: human org scaling - true and this was the most surprising thing for me when i was running the platform org at discord. companies ship their org charts whether they like it or not. and refactoring org charts correctly, at scale, is essentially untested in the modern era

slaucon 4 days ago

A progression of projects that comes to mind:

1) CI and IaC that deploy a web app running in a container

2) Add horizontal scaling and a load balancer

3) Add long running tasks / scheduled task support

4) Deploys will likely break long running tasks. Implement blue/green or rolling deploys or some other sort of advanced deployment scheme

5) Implement rollbacks
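For steps 4 and 5, a minimal sketch of a rolling deploy with automatic rollback, simulated in plain Python. The `Instance` type, the `rolling_deploy` function, and the `is_healthy` callback are hypothetical names made up for illustration; a real version would call whatever deploy and health-check tooling the project uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Instance:
    # Hypothetical stand-in for one container/VM behind the load balancer.
    name: str
    version: str


def rolling_deploy(
    instances: List[Instance],
    new_version: str,
    is_healthy: Callable[[Instance], bool],
) -> bool:
    """Upgrade one instance at a time; roll everything back on the first failure."""
    previous: Dict[str, str] = {}  # instance name -> version before this deploy
    for inst in instances:
        previous[inst.name] = inst.version
        inst.version = new_version
        if not is_healthy(inst):
            # Rollback: restore every instance touched so far, including this one.
            for touched in instances:
                if touched.name in previous:
                    touched.version = previous[touched.name]
            return False
    return True


fleet = [Instance("web-1", "v1"), Instance("web-2", "v1"), Instance("web-3", "v1")]
# Assumed failure mode for the demo: web-2 never comes up healthy on v2.
ok = rolling_deploy(fleet, "v2", lambda i: not (i.name == "web-2" and i.version == "v2"))
print(ok, [i.version for i in fleet])
```

Because only one instance changes at a time, the rest of the fleet keeps serving traffic, and the `previous` map is the only state the rollback needs.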

  • dirtbag__dad 4 days ago

    This! This is what I’ve seen at my companies and is super salient to today’s real life work ~

  • anjneymidha 4 days ago

    Love this. Easy to Advanced, with 5 for extra credit. Thank you

  • lizzas 4 days ago

    6) Feature flags, telemetry, soaking

    7) Alarms

JohnMakin 4 days ago

2/

Build a multi-cloud architecture. And by this, I mean connect two clouds' networks without traversing the public internet to connect two applications running in each respective cloud. And then, put that into IaC. It sounds like not much, but the issues you uncover are pretty illuminating and it is a fantastic interview question to give to senior-ish infra guys to see how they approach it and the challenges they expect.

And you're right, we're all weird.

  • revskill 3 days ago

    I am curious how to connect without the public internet. You mean a VPN?

    • stogot 3 days ago

      AWS Direct Connect and Azure ExpressRoute

  • orionblastar 4 days ago

    We are all nerds because we love the technology, science, and math behind it.

  • anjneymidha 4 days ago

    this is exactly the type of pointer i was hoping for, thank you

zerr 7 hours ago

Are there any downloadable materials and lecture videos?

joschi03 4 days ago

At multiple points in my career I stumbled upon stuff from Brendan Gregg. He is highly skilled in large-scale distributed computing but also down to the nitty-gritty details (bits).

qm2crossing 4 days ago

kyle kingsbury/aphyr of jepsen seems like an exemplar of #1

  • anjneymidha 4 days ago

    this is an awesome rec thank you

WobblyTyre 4 days ago

I don't have recommendations like others here. But as a junior engineer still coming up to speed with real engineering, I'd really appreciate it if this course were made open (in terms of lectures, assignments etc) to help folks like me audit & learn

huevosabio 4 days ago

1) you should reach out to the Convex.dev folks. They have built a solid infra platform, and their backend is open sourced(ish). They are ex-Dropbox as well. And finally they love to share!

2) I think multiplayer games could be interesting! Lots of meat while still having a lot of space to calibrate the scope.

  • anjneymidha 4 days ago

    convex is really elegant and now that you mention it, multiplayer games like their ai-town agent sim are such a great fit for the class - thank you

mavelikara 4 days ago

Not unsung, but Jay Kreps has made original contributions to the practice of building large scale systems. He also built a big business around it, so that perspective might also be interesting to students.

romanhn 4 days ago

Charity Majors (https://charity.wtf) is a great writer and speaker, and her work on observability is directly relevant to infra at scale.

majke 4 days ago

Quite a strong cast of presenters back in Jan 2024 https://cs153.stanford.edu/syllabus.html

  • anjneymidha 4 days ago

    thanks for noticing! this is the first time we're expanding it from 'security at scale' to 'infra at scale', but we've taught this course 2 yrs in a row now

    • kyawzazaw 4 days ago

      curious to learn how many undergrads took this?

tayo42 4 days ago

couldn't find the syllabus

deploy something like cassandra and make a system that can update the kernel on the servers running the databases without downtime or losing data
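a rough sketch of the control loop for that, as a pure simulation. the `Node` type and `rolling_kernel_update` function are made up for illustration, and it assumes quorum reads/writes where each node holds at most one replica of any piece of data. the two flag flips stand in for the real drain/reboot/rejoin steps:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    # Hypothetical stand-in for one database server.
    name: str
    kernel: str
    up: bool = True


def quorum(replication_factor: int) -> int:
    return replication_factor // 2 + 1


def rolling_kernel_update(
    nodes: List[Node], target_kernel: str, replication_factor: int
) -> List[str]:
    """Update one node at a time so every replica set keeps a quorum."""
    # With a single node down, every replica set still has at least RF - 1 live copies.
    if replication_factor - 1 < quorum(replication_factor):
        raise ValueError("taking any node down would break quorum")
    updated: List[str] = []
    for node in nodes:
        if node.kernel == target_kernel:
            continue  # already updated, so re-running the loop is safe
        if not all(n.up for n in nodes):
            raise RuntimeError("another node is down; refusing to continue")
        node.up = False               # stand-in for: drain, then reboot into new kernel
        node.kernel = target_kernel
        node.up = True                # stand-in for: rejoin and catch up on missed writes
        updated.append(node.name)
    return updated


cluster = [Node("db-1", "5.15"), Node("db-2", "5.15"), Node("db-3", "5.15")]
print(rolling_kernel_update(cluster, "6.1", replication_factor=3))
```

the interesting part in practice is everything the flag flips hide: waiting for the node to actually be caught up before moving on is where the "without losing data" guarantee comes from.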

or come up with some distributed blob store thing/cdn for worldwide users

my whole career has been automating updates for software or operating systems lol

jjoe 4 days ago

Maybe reach out to Netflix's live streaming dept. since we all learn so much more from our own failures.

Cheers!

randomcatuser 4 days ago

i didn't know you could do that! how does one volunteer to teach a course?

dirtbag__dad 4 days ago

Infrastructure for gov cloud is another beast and might make a fun case study

  • dirtbag__dad 4 days ago

    Also the folks at a company like Render, Railway, or even Supabase might be fascinating - what it takes to write an infra abstraction at scale