peter_d_sherman 2 days ago

This is a very interesting OS design:

>"1.1 High level overview

Barrelfish is “multikernel” operating system [3]: it consists of a small kernel running on each core (one kernel per core), and while rest of the OS is structured as a distributed system of single-core processes atop these kernels. Kernels share no memory, even on a machine with cache-coherent shared RAM, and the rest of the OS does not use shared memory except for transferring messages and data between cores, and booting other cores."
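To make the quoted structure concrete, here is a minimal sketch of the idea, not Barrelfish's actual code. It assumes each "core" can be modeled as a thread that owns a private replica of some OS state and is reachable only through a message inbox. The names `CoreKernel`, `broadcast_update` and `demo` are hypothetical, and Python's `queue.Queue` stands in for the inter-core message channel (in Python the threads do share an address space, so the isolation is by convention only).

```python
import queue
import threading


class CoreKernel:
    """Hypothetical per-core kernel: private state, reachable only via its inbox."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.inbox = queue.Queue()  # the only thing other cores are allowed to touch
        self.replica = {}           # private replica of OS state, never read remotely

    def run(self):
        while True:
            msg = self.inbox.get()
            if msg is None:         # shutdown sentinel
                break
            key, value, ack = msg
            self.replica[key] = value  # apply the update to the local replica
            ack.put(self.core_id)      # acknowledge by message, not by a shared flag


def broadcast_update(kernels, key, value):
    """Send one update to every core and wait until all have acknowledged it."""
    ack = queue.Queue()
    for k in kernels:
        k.inbox.put((key, value, ack))
    return sorted(ack.get() for _ in kernels)


def demo(num_cores=4):
    kernels = [CoreKernel(i) for i in range(num_cores)]
    threads = [threading.Thread(target=k.run) for k in kernels]
    for t in threads:
        t.start()
    acked = broadcast_update(kernels, "mapping:0x1000", "unmapped")
    for k in kernels:
        k.inbox.put(None)
    for t in threads:
        t.join()
    return acked, [k.replica for k in kernels]


if __name__ == "__main__":
    acked, replicas = demo()
    print("acknowledged by cores:", acked)
    print("replicas agree:", all(r == replicas[0] for r in replicas))
```

The point of the sketch is that global state is replicated per core and kept consistent by explicit messages, so the only memory that crosses a core boundary is the message channel itself.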

  • jdefr89 2 days ago

    Was I the only one confused by this? It wasn't just me right? I love when I see things like this. "The cool thing about our kernel is that you cannot share memory! It's super secure. Except for, you know, ..." then list nearly everything. What were they trying to provide/gain with this proposal?

transpute 2 days ago

Systems research genealogy:

  Xen [U of Cambridge, XenSource, Citrix]
    KVM [Qumranet, RedHat]
    EC2 [AWS]
      Nitro [Annapurna Labs, AWS]
    Barrelfish [ETH Zurich, Microsoft]
      Snitch RISC-V (many)core
    uXen ("micro" Xen, CoW memory) [Bromium, HP]
      firecracker
    AX x86 ("atto" Xen) [Bromium, HP]
      pKVM Arm [Google, Android, Linux]

  • jamesblonde 2 days ago

    I don't know how you got from Barrelfish (a message-passing OS) to a RISC-V CPU. Bit of a stretch. Just because they are both message-passing distributed systems?

    • transpute 2 days ago

      Apologies, mistake in my notes. Should be Enzian (48-core Arm + FPGA).

      From Mothy (Barrelfish researcher) profile, https://people.inf.ethz.ch/troscoe/ & https://enzian.systems/why-enzian/

      > Building and using a research computer called Enzian for experimentation with hardware/software codesign for servers.. If academics can’t do relevant, impactful, and medium-to-long-term system software research using commodity platforms, and they can’t do it using someone else’s cost-optimized application-specific custom hardware, what can they do? Our response is to build Enzian: a computer.. optimized for exploring the design space for custom hardware/software co-design.. over-engineered relative to any off-the-shelf hardware.. optimized for flexibility and configurability rather than unit cost, efficiency, or performance along any particular dimension.

  • yencabulator 15 hours ago

    How do you manufacture a connection between a hypervisor and a kernel that does nothing at all with virtualization? Did you just want to mention Xen?

  • kfreds 2 days ago

    Interesting. Do you know of any good SoK papers or articles that summarize the current state of the art, or explain this genealogy?

    • transpute 2 days ago

      A longer history would start with IBM mainframes. More recently, IBM Ultravisor shipped in OpenPower firmware, mediating KVM VMs, https://www.youtube.com/watch?v=6qjrqn3ug0g & https://github.com/open-power/ultravisor

      A 2018 video by Ian Pratt covers Xen, uXen and AX (2005-2015), https://news.ycombinator.com/item?id=44135977#44141164. Citrix acquired XenSource. Pratt left to work at Bromium, which was later acquired by HP (which had previously acquired a BIOS company from a Bromium co-founder). The former CTO of XenSource co-founded Qumranet (KVM), which was acquired by Red Hat.

      AWS began with Xen, then migrated to a subset of KVM. Nitro used Arm hardware to virtualize I/O (storage, network) paths, leaving KVM responsible for x86 CPU and memory virtualization, https://www.youtube.com/watch?v=e8DVmwj3OEs & https://news.ycombinator.com/item?id=24515019#24516523. Parallels could be drawn to the Apple T2 enclave (Arm) coprocessor being used for disk encryption on x86 Apple Macbooks.

      Under the "Confidential Computing" umbrella, Intel has TDX and a new (closed?) hypervisor on servers, using SGX and new hardware privilege levels.

      Apple recently added Secure eXclaves to iOS, and Apple Silicon hardware supports nested virtualization, which is what Google pKVM uses on Pixel (and upcoming ChromeOS?) devices, https://news.ycombinator.com/item?id=43314657

      For production code, pKVM deserves attention because it's open (upstreamed to mainline Linux), exists in the real world (Pixel phones), stands in stark contrast to Apple's neutered iPads and has the potential to improve upon TrustZone security, https://news.ycombinator.com/item?id=41523758.

      Finally, to bring this thread back to Barrelfish, Google OpenTitan open silicon root of trust (OCP servers, Chromebooks) is partly under Pulp Platform research, alongside Snitch (descended from Barrelfish research) open hardware from ETH Zurich. So progress is being made in both mainstream-compatible systems software and greenfield hardware cores.

      (hopefully readers can correct any errors or gaps above)

      • kfreds 2 days ago

        The virtualization of I/O is fascinating, as is VirtIO's progress from the Linux kernel to hardware implementations. My only wish is that Linux would support inter-VM shared memory as a VirtIO transport in addition to PCI and MMIO.

        Thanks for the pKVM tip, and the connection between OpenTitan and Barrelfish.

        Speaking of security and open-source hardware, shameless plug of stuff I work on:

        - dev.tillitis.se (FPGA-based OSHW RoT)

        - system-transparency.org (related to CC, TDX, SNP)

        - sigsum.org

      • kfreds 2 days ago

        Thank you! I realize now that I was thinking about a different aspect of systems research, but failed to say so.

        Barrelfish (multikernel) and your username made me think of manycore systems and the scheduling challenges we will surely face as systems become more heterogeneous. I'm in a period of trying to learn more about that. Any and all recommendations are much appreciated.

        • transpute 2 days ago

          Jim Keller's Tenstorrent ($1B funding to date) is shipping $1K PCIe manycore accelerators, with open-but-immature software, https://www.theregister.com/2024/08/27/tenstorrent_ai_blackh...

          > compute.. is handled by 140 of Tenstorrent's Tensix cores, each of which is composed of five "Baby RISC-V" cores, a pair of routers, a compute complex, and some L1 cache.. Tensix cores account for 700 of the 752 so-called baby RISC-V cores on board.. TT-Metalium low-level programming model.. kernels themselves are plain C++ with APIs.. Tenstorrent aims to support running any AI model on its accelerators using commonly used runtimes like PyTorch, ONNX, JAX, TensorFlow, and vLLM.

          Legion, from the Stanford research team that led to CUDA, https://legion.stanford.edu/ & https://elliottslaughter.com/2024/02/legion-paper-history

          > A novel mapping interface provides explicit programmer controlled placement of data in the memory hierarchy and assignment of tasks to processors in a way that is orthogonal to correctness, thereby enabling easy porting and tuning of Legion applications to new architectures.. Legion is developed as an open source project, with major contributions from LANL, NVIDIA Research, SLAC, and Stanford.

          • kfreds 2 days ago

            It seems we read the same stuff. :)

            I assume you're also aware of the Oxide and Friends podcast, and the Microarch Club podcast?

            • transpute 2 days ago

              Yes on Oxide, will check out Microarch Club, thanks!

          • bionsystem 2 days ago

            So far when Jim starts something it's a massive success, can't wait to see how this one goes.

andsoitis 2 days ago

and you can download and run it: https://barrelfish.org/download.html

  • transpute 2 days ago

    Ten years of OS research, supporting x86, ARMv7 and ARMv8 devices, leading to a 2021 talk about hardware and the subsequent design of new hardware (RISC-V).

  • binarycrusader 2 days ago

    sadly, last release 2020-03-23:

    The Barrelfish project is no longer active. See https://systems.ethz.ch/ for information about our current research activities.

    It is still interesting though.

davemp 2 days ago

I find these types of efforts somewhat disappointing. So much OS research boils down to “We’ll handle scheduling and rudimentary peripheral multiplexing; good luck with the rest”. These basics are so far from a useful system that you’d have to slap Linux on top and immediately lose most or all of the benefits of the new architecture.