Removing nommu feels wrong to me. Being able to run Linux on hardware simple enough that anybody sufficiently motivated could write an emulator for it helps us, as individuals, remain in control. The more complex things are, the less freedom we have.
It's not a well-argued thought, just a nagging feeling.
Maybe we need a simple POSIX OS that would run on simple, open, dedicated hardware that can be comprehended by a small group of human beings. A system that would allow communication, simple media processing and productivity.
These days it feels like we are at a tipping point for open computing. It feels like being a frog in hot water.
We need accessible open hardware, not shoehorned proprietary hardware made to work with generic standards it never actually followed.
Open source is one thing, but open hardware - that’s what we really need. And not just a Framework laptop or a System76 machine. I mean a standard 64-bit open source motherboard, peripherals, etc. that aren’t locked down with binary blobs.
I would love to work with hardware; if you can foot my bill I'd be happy to do it. Open source software is one thing, but open source hardware needs considerable investment that you can't ignore from the start, unlike software, where we at least have AI to help us.
> I mean a standard 64-bit open source motherboard, peripherals, etc that aren’t locked down with binary blobs.
The problem here is scale. Having fully-open hardware is neat, but then you end up with something like that Blackbird PowerPC thing which costs thousands of dollars to have the performance of a PC that costs hundreds of dollars. Which means that only purists buy it, which prevents economies of scale and prices out anyone who isn't rich.
Whereas what you actually need is for people to be able to run open code on obtainium hardware. This is why Linux won and proprietary Unix lost in servers.
That might be achievable at the low end with purpose-built open hardware, because then the hardware is simple and cheap and can reach scale because it's a good buy even for people who don't care if it's open or not.
But for the mid-range and high end, what we probably need is a project to pick whichever chip is the most popular and spend the resources to reverse engineer it, so we can run open code on the hardware which is already in everybody's hands. That makes it easier to do it again: the second time it's not reverse engineering every component of the device, it's noticing that v4 is just v3 with a minor update, or that the third most popular device shares 80% of its hardware with the most popular one, so adding it is only 20% as much work as the first. Which is how Linux did it on servers and desktops.
Until we have affordable photolithography machines (which would be cool!), hardware is never really going to be open.
> affordable photolithography machines
We'll likely never have "affordable" photolithography, but electron beam lithography will become obtainable in my lifetime (and already is, DIY, to some degree.)
depends on what one means by affordable, but DIY versions have been successfully attempted
https://www.youtube.com/watch?v=IS5ycm7VfXg
The next 3D print revolution, photolithography your own chip wafers at home. Now that would be something!
I doubt anyone here has a clean enough room.
Jeri Ellsworth has that covered.
https://www.youtube.com/watch?v=PdcKwOo7dmM
Insane… I thought I was smart, she proves me wrong.
> I doubt anyone here has a clean enough room.
Jordan Peterson has entered the building...
"Clean your rooms, men!" Starts sobbing
Maybe if he cleaned his own room, he’d find his copy of the Communist Manifesto in time to read it for a scheduled debate.
https://www.youtube.com/watch?v=qsHJ3LvUWTs
>> Until we have affordable photolithography....
If that comes to pass we will want software that runs on earlier nodes and 32-bit hardware.
We kinda have this with IBM POWER 9. Though that chip launched 8 years ago now, so I'm hoping IBM's next chip can also avoid any proprietary blobs.
Indeed with the OpenPOWER foundation.
Let’s hope some of that trickles down to consumer hardware.
nommu is a neat concept, but basically nobody uses it, and I don't see that as likely to change. There's no real use case for using it in production environments. RTOSes are much better suited for use on nommu hardware, and parts that can run "real" Linux are getting cheaper all the time.
If you want a hardware architecture you can easily comprehend - and even build your own implementation of! - that's something which RISC-V handles much better than ARM ever did, nommu or otherwise.
There are plenty of use cases for Linux on microcontrollers that will become impossible if nommu is removed. The only reason we don't see more Linux on MCUs is the lack of RAM. The RP2350 is very close! Running Linux makes development much easier than a plain RTOS.
A "plain RTOS" is the better idea most of the time.
Those operating systems already exist. You can run NetBSD on pretty much anything (it currently supports machines with a Motorola 68k CPU, for example). Granted, many of those machines still have an MMU IIRC, but everything is still simple enough to be comprehended by a single person with some knowledge of systems programming.
NetBSD doesn't support any devices without an mmu.
I think people here are misunderstanding just how "weird" and hacky trying to run an OS like linux on those devices really is.
Yeah, a lot of what defines "operating system" for us nowadays is downstream of having memory isolation.
Not having an MMU puts you more into the territory of DOS than UNIX. There is FreeDOS but I'm pretty sure it's x86-only.
Mmm... I would beg to differ. I have ported stuff to NOMMU Linux and almost everything worked just as on a "real" Linux. Threads, processes (except only vfork, no fork), networking, priorities, you name it. DOS gives you almost nothing. It has files.
The one thing different to a regular Linux was that a crash of a program was not "drop into debugger" but "device reboots or halts". That part I don't miss at all.
That's fair. If so, then you can still have things like drivers and a HAL and so on too. However, there are no hard security barriers.
How do multiple processes actually work, though? Is every executable position-independent? Does the kernel provide the base address(es) in register(s) as part of vfork? Do process heaps have to be constrained so they don't get interleaved?
There are many options. Executables can be position-independent, or relocated at run-time, or the device can have an MPU or equivalent registers (for example 8086/80286 segment registers), which is related to an MMU but much simpler.
Executables in a no-MMU environment can also share the same code/read-only segments between many processes, the same way shared libraries can, to save memory and, if run-time relocation is used, to reduce that overhead.
The original design of UNIX ran on machines without an MMU, and they had fork(). Andrew Tanenbaum's classic book which comes with Minix for teaching OS design explains how to fork() without an MMU, as Minix runs on machines without one.
For spawning processes, vfork()+execve() and posix_spawn() are much faster than fork()+execve() from a large process in no-MMU environments though, and almost everything runs fine with vfork() instead of fork(), or threads. So no-MMU Linux provides only vfork(), clone() and pthread_create(), not fork().
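A minimal sketch of that spawning model, purely illustrative (the program path and arguments are arbitrary): posix_spawn() looks the same on MMU and no-MMU Linux because it never has to duplicate the parent's address space; on no-MMU targets the C library implements it on top of vfork()/clone().

```cpp
// Illustrative only: spawning /bin/ls with posix_spawn(), which avoids fork()
// entirely and therefore works unchanged on no-MMU Linux.
#include <spawn.h>
#include <sys/wait.h>
#include <cstdio>

extern char **environ;

int main() {
    pid_t pid;
    char *argv[] = { (char *)"ls", (char *)"-l", nullptr };
    // The C library does the vfork()/clone() + execve() dance internally.
    int rc = posix_spawn(&pid, "/bin/ls", nullptr, nullptr, argv, environ);
    if (rc != 0) {
        std::fprintf(stderr, "posix_spawn failed: %d\n", rc);
        return 1;
    }
    int status;
    waitpid(pid, &status, 0);   // reap the child as usual
    return 0;
}
```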
FWIW, Linux is not the only OS looking into dropping 32bit.
FreeBSD is dumping 32 bit:
https://www.osnews.com/story/138578/freebsd-15-16-to-end-sup...
OpenBSD has this quote:
>...most i386 hardware, only easy and critical security fixes are backported to i386
I tend to think that means the days of 32-bit, at least on x86, are numbered.
https://www.openbsd.org/i386.html
I think DragonFly BSD dropped 32-bit (i386) support years ago.
For 32bit, I guess NetBSD may eventually be the only game in town.
If you want a POSIX OS, nommu Linux already isn't it: it doesn't have fork().
Just reading about this...turns out nommu Linux can use vfork(), which unlike fork() shares the parent's address space. Another drawback is that vfork's parent process gets suspended until the child exits or calls execve().
Typically you always call vfork() + execve(); vfork is pretty useless on its own.
Think about it like CreateProcess() on Windows. Windows is another operating system which doesn't support fork(). (Cygwin did unholy things to make it work anyway, IIRC.)
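For concreteness, a minimal sketch of the vfork()+execve() idiom described above (illustrative only; the program being exec'd is arbitrary). The key constraint is that the child must do essentially nothing but execve() or _exit(), since it is borrowing the parent's address space.

```cpp
// Illustrative sketch of vfork() + execve(): the child borrows the parent's
// address space, and the parent stays suspended until the child execs or exits.
#include <unistd.h>
#include <sys/wait.h>

int main() {
    pid_t pid = vfork();
    if (pid == 0) {
        // Child: touch as little as possible before exec; modifying parent state
        // here, other than calling execv()/_exit(), is undefined behaviour.
        char *const argv[] = { (char *)"echo", (char *)"hello from the child", nullptr };
        execv("/bin/echo", argv);
        _exit(127);             // only reached if execv() failed
    }
    int status;
    waitpid(pid, &status, 0);   // parent resumes once the child has exec'd or exited
    return 0;
}
```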
Removing nommu makes the kernel simpler and easier to understand.
Supporting 32-bit is not 'simple', and the difference between 32-bit hardware and 64-bit hardware is not big.
The industry has a lot of experience doing so.
In parallel, the old hardware is still supported, just not by the newest Linux kernel. Which should be fine, because either you are not changing anything on that system anyway, or you have your whole tool stack available to patch it yourself.
But the benefit would be an easier-to-maintain and smaller Linux kernel, which would probably benefit a lot more people.
Also, if our society is no longer able to produce chips commercially and we lose all the experience people have, we probably have much bigger issues as a society.
That said, I don't want to deny that having the simplest possible way of making a small microcontroller yourself (it doesn't have to be fast or super easy, just doable) would be very cool and could already solve a lot of issues if we ever needed to restart society from Wikipedia.
The comment you're responding to isn't talking about 32 vs 64 bit, but MMU vs no MMU.
ELKS can still run on systems without an mmu (though not microcontrollers afaik).
ELKS runs 16bit x86, including 8086.
Note ELKS is not Linux.
There's also Fuzix.
Nothing prevents you from maintaining nommu as a fork. The reality of things is, despite your feelings, people have to work on the kernel, daily, and there comes a point where your tinkering needs do not need to be supported in main. You can keep using old versions of the kernel, too.
Linux remains open source, extendable, and someone would most likely maintain these ripped out modules. Just not at the expense of the singular maintainer of the subsystem inside the kernel.
I don't think it makes sense to run Linux on most nommu hardware anymore. It'd make more sense to have a tiny unikernel for running a single application, because on nommu, you don't typically have any application isolation.
> on nommu, you don't have any application isolation
That isn't necessarily the case. You can have memory protection without an MMU - for instance, most ARM Cortex-M parts have an MPU which can be used to restrict a thread's access to memory ranges or to hardware. What it doesn't get you is memory remapping, which is necessary for features like virtual memory.
Virtual memory as in swap is one thing, but IMO the bigger one is memory-mapped files.
It is amazing that big endian is almost dead.
It will be relegated to the computing dustbin like non-8-bit bytes and EBCDIC.
Main-core computing is vastly more homogeneous than when I was born almost 50 years ago. I guess that's a natural progression for technology.
Big endian will stay around as long as IBM continues to put in the resources to provide first-class Linux support on s390x. Of course if you don’t expect your software to ever be run on s390x you can just assume little-endian, but that’s already been the case for the vast majority of software developers ever since Apple stopped supporting PowerPC.
> It is amazing that big endian is almost dead.
I wish the same applied to written numbers in LTR scripts. Arithmetic operations would be a lot easier to do that way on paper or even mentally. I also wish that the world would settle on a sane date-time format like the ISO 8601 or RFC 3339 (both of which would reverse if my first wish is also granted).
> It will be relegated to the computing dustbin like non-8-bit bytes and EBCDIC.
I never really understood those non-8-bit bytes, especially the 7-bit byte. If you consider the multiplexer and demux/decoder circuits that are used heavily in CPUs, FPGAs and custom digital circuits, the only number that really makes sense is 8: it's what you get for a 3-bit selector code, with the nearby values being 4 and 16. Why did they go for 7 bits instead of 8? I assume that it was a design choice made long before I was even born. Does anybody know the rationale?
> I also wish that the world would settle on a sane date-time format like the ISO 8601
IIRC, in most countries the native format is D-M-Y (with varying separators), but some Asian countries use Y-M-D. Since those formats are easy to distinguish, that's no problem. That's why Y-M-D is spreading in Europe for official or technical documents.
There's mainly one country which messes things up...
YYYY-MM-DD is also the official date format in Canada, though it's not officially enforced, so outside of government documents you end up seeing a bit of all three formats all over the place. I've always used ISO 8601 and no one bats an eye, and it's convenient since YYYY-DD-MM isn't really a thing, so it can't be confused for anything else, unlike the other two formats.
YMD has caught on, I think, because it allows for the numbers to be "in order" (not mixed-endian) while still having the month before the day which matches the practice for speaking dates in (at least) the US and Canada.
I don't know that 7-bit bytes were ever used. Computer word sizes have historically been multiples of 6 or 8 bits, and while I can't say as to why particular values were chosen, I would hypothesize that multiples of 6 and 8 work well for representation in octal and hexadecimal respectively. For many of these early machines, sub-word addressability wasn't really a thing, so the question of 'byte' is somewhat academic.
For the representation of text of an alphabetic language, you need to hit 6 bits if your script doesn't have case and 7 bits if it does have case. ASCII ended up encoding English into 7 bits and EBCDIC chose 8 bits (as it's based on a binary-coded decimal scheme which packs a decimal digit into 4 bits). Early machines did choose to use the unused high bit of an ASCII character stored in 8 bits as a parity bit, but most machines have instead opted to extend the character repertoire in a variety of incompatible ways, which eventually led to Unicode.
On the DEC-10 the word size is 36 bits. There was (an option to include) a special set of instructions to enable any given byte size with bytes packed. Five 7-bit bytes per word, for example, with a wasted bit in each word.
I wouldn’t be surprised if other machines had something like this in hardware.
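To make the arithmetic concrete, here is a hypothetical C++ sketch of packing five 7-bit bytes into a 36-bit word held in a uint64_t, with one bit left over. It only illustrates the layout; it does not reproduce the actual PDP-10 byte-pointer instructions or their bit ordering.

```cpp
// Hypothetical sketch: five 7-bit "bytes" packed into a 36-bit word (kept in a
// uint64_t), leaving 36 - 5*7 = 1 bit unused, as on the DEC-10 example above.
#include <cstdint>
#include <cstdio>

uint64_t pack5x7(const uint8_t b[5]) {
    uint64_t word = 0;
    for (int i = 0; i < 5; ++i)
        word |= (uint64_t)(b[i] & 0x7F) << (i * 7);  // bits 0..34 used, bit 35 wasted
    return word;
}

uint8_t unpack7(uint64_t word, int i) {
    return (word >> (i * 7)) & 0x7F;
}

int main() {
    const uint8_t text[5] = { 'H', 'E', 'L', 'L', 'O' };
    uint64_t w = pack5x7(text);
    for (int i = 0; i < 5; ++i)
        std::putchar(unpack7(w, i));   // prints HELLO
    std::putchar('\n');
    return 0;
}
```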
I believe that 10- and 12-bit bytes were also attested in the early days. As for "why": the tradeoffs are different when you're at the scale that any computer was at in the 70s (and 60s), and while I can't speak to the specific reasons for such a choice, I do know that nobody was worrying about scaling up to billions of memory locations, and also using particular bit combinations to signal "special" values was a lot more common in older systems, so I imagine both were at play.
Computers never really used 7-bit bytes, much as 5-bit bytes were uncommon, but both 6-bit and 8-bit bytes were common in their respective eras.
Now just UTF-16 and non-'\n' newline conventions remain to go
UTF-16 arguably is Unicode 2.0+. It's how the code point address space is defined. Code points are either 1 or 2 16-bit code units. Easy. Compare w/ UTF-8 where a code point may be 1, 2, 3, or 4 8-bit code units.
UTF-16 is annoying, but it's far from the biggest design failure in Unicode.
Of the two UTF-16 is much less of a problem, it's trivially[1] and losslessly convertible.
[1] Ok I admit, not trivially when it comes to unpaired surrogates, BOMs, endian detection, and probably a dozen other edge and corner cases I don't even know about. But you can offload the work to pretty well-understood and trouble-free library calls.
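For the happy path the conversion really is mechanical; here is a small illustrative sketch (not from any particular library) of decoding UTF-16 code units into code points, including surrogate pairs. The footnote's caveats (unpaired surrogates, BOMs, endianness) are exactly the parts this toy version glosses over, which is why real code should lean on a library.

```cpp
// Sketch: decoding UTF-16 code units into code points, i.e. the
// "1 or 2 16-bit code units per code point" rule described above.
#include <cstdint>
#include <cstdio>
#include <vector>

std::vector<uint32_t> utf16_to_codepoints(const std::vector<uint16_t>& units) {
    std::vector<uint32_t> out;
    for (size_t i = 0; i < units.size(); ++i) {
        uint16_t u = units[i];
        if (u >= 0xD800 && u <= 0xDBFF && i + 1 < units.size()
            && units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            // High surrogate followed by low surrogate: one code point, two units.
            out.push_back(0x10000 + ((uint32_t)(u - 0xD800) << 10)
                                  + (units[i + 1] - 0xDC00));
            ++i;
        } else {
            // BMP code point (or an unpaired surrogate, passed through here;
            // a strict converter would reject it or emit U+FFFD instead).
            out.push_back(u);
        }
    }
    return out;
}

int main() {
    // "A" followed by U+1F600, which needs a surrogate pair in UTF-16.
    std::vector<uint16_t> s = { 0x0041, 0xD83D, 0xDE00 };
    for (uint32_t cp : utf16_to_codepoints(s))
        std::printf("U+%04X\n", cp);
    return 0;
}
```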
UTF-16 will be quite the mountain as Windows APIs and web specifications/engines default to it for historical reasons.
We'll have to deal with it forever in network protocols. Thankfully that's rather walled off from most software.
As well as in a number of widely used cryptographic algorithms (e.g. SHA-2), which use BE for historical reasons.
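A small illustrative sketch of what that means in practice: both network protocols and SHA-2's message schedule read 32-bit words in big-endian byte order, and portable code handles that with an explicit byte-wise load rather than assuming host endianness.

```cpp
// Sketch: loading a 32-bit big-endian word from a byte buffer, the way network
// protocols and SHA-2 do; works the same on little- and big-endian hosts.
#include <cstdint>
#include <cstdio>

inline uint32_t load_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

int main() {
    const uint8_t buf[4] = { 0xDE, 0xAD, 0xBE, 0xEF };
    std::printf("0x%08X\n", load_be32(buf));  // prints 0xDEADBEEF on any host
    return 0;
}
```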
just call it 2-AHS and you're done :)
Good callout; I just removed some endianness #ifdefs from my engine.
I have some places in some software where I assume little endian for simplicity, and I just leave in a static_assert(std::endian::native == std::endian::little) to let future me (or future someone else) know that a particular piece of code must be modified if it is ever to run on a not-little-endian machine.
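A sketch of that guard in context, with a hypothetical struct and field names: the code assumes the host's byte order matches the serialized little-endian layout, and the static_assert turns a future big-endian port into a compile error instead of silent data corruption.

```cpp
// Sketch of the static_assert guard described above (hypothetical struct names).
#include <bit>
#include <cstdint>
#include <cstring>

static_assert(std::endian::native == std::endian::little,
              "This code assumes little-endian; add byte-swapping before porting.");

struct WireHeader {        // hypothetical on-wire record, stored little-endian
    uint32_t magic;
    uint32_t length;
};

WireHeader parse_header(const unsigned char *bytes) {
    WireHeader h;
    std::memcpy(&h, bytes, sizeof h);  // valid only because host order == wire order
    return h;
}

int main() {
    const unsigned char raw[8] = { 'M', 'A', 'G', 'C', 0x10, 0, 0, 0 };
    WireHeader h = parse_header(raw);
    return h.length == 0x10 ? 0 : 1;   // length decodes correctly on little-endian
}
```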
Another good call. Thank you.
> One other possibility is to drop high memory, but allow the extra physical memory to be used as a zram swap device. That would not be as efficient as accessing the memory directly, but it is relatively simple and would make it possible to drop the complexity of high memory.
Wild, like some kind of virtual cache. Reminds me a bit of the old Macintosh 68k accelerators; sometimes they included their own (faster) memory and you could use the existing sticks as a RAM disk.
Aren't 32-bit systems more power-efficient? It costs less energy to switch 32 transistors than 64.
Not just more power-efficient, but also a little more memory-efficient, because pointers are only half as big and so don't take up as much space in the cache. Lower-bit chips are also smaller (which could translate into a faster clock and/or more functional units per superscalar core and/or more cores per die).
Part of the problem with these discussions is that when people say "64-bit" vs "32-bit" they are often also counting all the new useful instructions that were added in the newer instruction-set generation. A true apples-to-apples comparison between "32-bit" and "64-bit" should compare almost identical designs whose only difference is the datapath and pointer size.
I feel that the programs and games I run shouldn't really need more than 4GB of memory anyway, and the occasional instance where the extra precision of 64-bit math is useful could be handled by emulating it, with the compiler adding a couple of extra 32-bit instructions.
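For anyone curious what "a couple of extra 32-bit instructions" looks like, here is an illustrative sketch of a 64-bit addition built from 32-bit halves plus a carry, roughly what a 32-bit target's code generator emits for uint64_t arithmetic.

```cpp
// Illustrative sketch: a 64-bit add composed of two 32-bit adds and a carry.
#include <cstdint>
#include <cstdio>

struct u64pair { uint32_t lo, hi; };   // a 64-bit value held as two 32-bit halves

u64pair add64(u64pair a, u64pair b) {
    u64pair r;
    r.lo = a.lo + b.lo;                 // low 32-bit add
    uint32_t carry = (r.lo < a.lo);     // detect wrap-around of the low half
    r.hi = a.hi + b.hi + carry;         // high 32-bit add, plus the carry
    return r;
}

int main() {
    u64pair a = { 0xFFFFFFFFu, 0x00000001u };   // 0x00000001FFFFFFFF
    u64pair b = { 0x00000001u, 0x00000000u };   // 0x0000000000000001
    u64pair s = add64(a, b);
    std::printf("0x%08X%08X\n", s.hi, s.lo);    // prints 0x0000000200000000
    return 0;
}
```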
Applications don't get 4GB with a 32-bit address space. The practical split between application and kernel was usually 1-3 or 2-2 with 3-1 being experimental and mooted with the switch to 64-bit. Nowadays with VRAM being almost as large as main RAM, you need the larger address space just to map a useful chunk of it in.
When you factor in memory fragmentation, you really only had a solid 0.75-1.5GB of space that could be kept continuously in use. That was starting to become a problem even when 32-bit was the only practical option. A lot of games saw a benefit to just having the larger address space, such that they ran better in 64-bit with only 4GB of RAM despite the fatter 64-bit pointers.
It depends on the kernel architecture. 4G/4G kernels weren't the most common thing, but also weren't exactly rare in the grand scheme of things. PowerPC macOS (and x86 macOS before they officially released Intel-based Mac hardware) were 4G/4G, for example. The way that works under x86 is that you just reserve a couple of kernel pages mapped into both address spaces to do the page table swap on interrupts and syscalls. A little expensive, but less than you'd think, and having the kernel and user space not fight for virtual address space provided its own efficiencies to partially make up the difference. We've been moving back to that anyway with Kernel Page Table Isolation for Meltdown mitigations.
And 3-1 wasn't really experimental. It was essentially always that way under Linux, and had been supported under Windows since the late 90s.
I believe that's an accident of the evolutionary path chosen with syscalls. If we'd instead gone with a ring buffer approach to make requests, then you'd never need to partition the memory address space; the kernel has its memory and userspace has its and you don't need the situation where the kernel is always mapped.
Hmm. I don't understand how that would work.
I think it would be possible for e.g. microkernels to greatly reduce the size of the reservation (though not to eliminate it entirely). However, I can't imagine how you would handle the privilege escalation issue without having at least some system code in the application's virtual address space that's not modifiable by the application.
I'm not sure how privilege escalation would be an issue since you'd never escalate privilege in the first place (I'm assuming you're talking about CPU ring privileges and not OS privileges). You'd just enqueue into the shared kernel/user space ring buffer your operations and the kernel would pick them up on its side, but you'd never jump between rings.
Such a design may require at least one processor dedicated to running the kernel at all times, so it might not work on a single processor architecture. However, single processor architectures might be supportable by having the "kernel process" go to sleep by arming a timer and the timer interrupt is the only one that's specially mapped so it can modify the page table to resume the kernel (for handling all the ring buffers + scheduling). As you note, there's some reserved address space but it's a trivial amount just to be able to resume running the kernel. I don't think it has anything to do with monolithic vs microkernels.
True, you don't have to go full microkernel just to have messages passed through a buffer. However, if the buffer is shared by all processes, it does need to have some protection. I guess you could assign one buffer per process (which might end up using a lot of physical RAM), and then just crash the process if it corrupts its own buffer. The bigger issue with this approach might be adapting to asynchrony, though.
It wouldn't be by all processes. One per process just like with io_uring. Not sure how it would end up being all that much physical RAM - you get a lot more memory mapped when you just start a process. Page faults might be another tricky corner case.
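That per-process ring model already exists in miniature as io_uring, so a minimal liburing sketch may help ground the discussion (illustrative only; it needs liburing installed and a reasonably recent kernel). Note that io_uring_submit() here still performs one io_uring_enter() syscall; with the SQPOLL flag a kernel thread polls the submission ring and even that can be avoided.

```cpp
// Minimal liburing sketch: a request goes into a submission ring shared with the
// kernel, and its completion is read back from a completion ring.
#include <liburing.h>
#include <cstdio>

int main() {
    struct io_uring ring;
    if (io_uring_queue_init(8, &ring, 0) < 0) {   // 8-entry SQ/CQ rings
        std::fprintf(stderr, "io_uring_queue_init failed\n");
        return 1;
    }

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    io_uring_prep_nop(sqe);                       // simplest possible request: a no-op
    io_uring_sqe_set_data(sqe, (void *)0x1234);   // tag so we can match the completion
    io_uring_submit(&ring);                       // one io_uring_enter() syscall here

    struct io_uring_cqe *cqe;
    io_uring_wait_cqe(&ring, &cqe);               // wait for the completion entry
    std::printf("completed request tagged %p, result %d\n",
                io_uring_cqe_get_data(cqe), cqe->res);
    io_uring_cqe_seen(&ring, cqe);                // mark the CQE as consumed

    io_uring_queue_exit(&ring);
    return 0;
}
```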
I wish x32 had taken off; better performance and lower memory use.
https://wiki.debian.org/X32Port
On anything but the smallest implementations, the 32- vs 64-bit ALU cost difference is pretty tiny compared to everything else going on in the core to get performance. And that assumes the core doesn't support 32-bit ops by leaving the rest of the ALU idle, or do something like double pumping.
Really the ALU width is an internal implementation detail/optimisation, you can tune it to the size you want at the cost of more cycles to actually complete the full width.
It's the MMU width, not the ALU width, that matters.
Lots of machines are capable of running with 32-bit pointers and 64-bit integers ("Knuth mode" aka "ILP32"). You get a huge improvement in memory density as long as no single process needs more than 4GB of core.
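A quick illustrative sketch of the density argument, with a hypothetical pointer-heavy node: compile the same source for an LP64 target and for an ILP32 one (for example the x32 ABI via -mx32, toolchain permitting) and compare the printed sizes.

```cpp
// Sketch: a pointer-heavy node shrinks substantially when pointers are 4 bytes
// (ILP32 / x32) instead of 8 (LP64), while 64-bit integers remain available.
#include <cstdint>
#include <cstdio>

struct TreeNode {
    TreeNode *left;     // 4 bytes under ILP32, 8 under LP64
    TreeNode *right;
    TreeNode *parent;
    int64_t   key;      // still a full 64-bit integer under ILP32
};

int main() {
    std::printf("sizeof(void*)    = %zu\n", sizeof(void *));
    std::printf("sizeof(TreeNode) = %zu\n", sizeof(TreeNode));  // roughly 20-24 vs 32 bytes
    return 0;
}
```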
What makes you think that a 32-bit system only switches 32 transistors? For example, off the top of my head, the Pentium Pro had a 36-bit address bus and a 64-bit data bus.
Sometimes you gotta run real fast and go right to bed to save power.
Wait until you hear about 8 bit systems
Yes and no. A problem with 8-bit and 16-bit for desktops and servers is the limited memory address space, so the compiler has to insert extra instructions to deal with things like updating the segment registers. And likewise, if you need to do higher-precision math, the compiler again has to insert extra instructions. Those extra instructions clog up the pipeline, but they aren't needed if your largest program's working set and the largest-precision math you generally need fit within the ISA's bit size. Unless you are doing scientific computing or other large-memory tasks like Blender (which dropped 32-bit support), 32-bit really is good enough.
I couldn't tell if your comment was a joke, but it is worth mentioning the 8-bit microcontrollers like TinyAVR still fill a niche where every joule and cent counts.
Linux has become the dominant operating system for a wide range of devices, even though other options like FreeRTOS or the BSD family seem more specialized. The widespread adoption of Linux suggests that a single, versatile operating system may be more practical than several niche ones. However, the decision to drop support for certain hardware because it complicates maintenance, as seen here, would seem to contradict the benefit of a unified system. I wouldn't be surprised if it really just results in more Linux forks - Android is already at the point of not quite following mainline.
Funny, I remember 32 bits being 'the future'; now it is the distant past. I think they should keep it all around, and keep it buildable. Though I totally understand the pressure to get rid of it, I think having at least one one-size-fits-all OS is a very useful thing to have. You never know what the future will bring.
Just because support would be removed from current and new versions doesn't mean the old code and tarballs are just going to disappear. You can dust off an old 32-bit kernel whenever you want.
There's always NetBSD. I'm pretty sure that supports x86 as far back as the 80486, and 32-bit SPARC as far back as... something I wouldn't want to contemplate.
Technologies have lifecycles. Film at 11.
the netbsd team agrees! more users for us.
The Apple Watch has 32-bit memory addressing (and 64-bit integer arithmetic -- it's ILP32). Granted it doesn't run Linux, but it's a very very modern piece of hardware, in production, and very profitable.
Same for WASM -- 32-bit pointers, 64-bit integers.
Both of these platforms have a 32-bit address space -- both for physical addresses and virtual addresses.
Ripping out support for 32-bit pointers seems like a bad idea.
i do miss being able to read and memorize hex addresses. 64 bits is a little too long to easily 'see' at a glance. or see at all for that matter.
On the userland side, there is some good progress of using thunking to run 32-bit Windows programs in Wine on Linux without the need for 32-bit libraries (the only edge case remaining is thunking 32-bit OpenGL which is lacking needed extensions for acceptable performance). But the same can't be said for a bunch of legacy 32-bit native Linux stuff like games which commonly have no source to rebuild them.
Maybe someone can develop such thunking for the legacy Linux userland.
How many of those legacy applications where the source is not available actually need to run natively on a modern kernel?
The only thing I can think of is games, and the Windows binary most likely works better under Wine anyways.
There are many embedded systems like CNC controllers, advertisement displays, etc... that run those old applications, but I seriously doubt anyone would be willing to update the software in those things.
Yeah, games I'd guess is the most common case or at least one enough people would care about.
It shouldn’t be difficult to write a binary translator to run 32-bit executables on a 64-bit userspace. You will take a small performance hit (on top of the performance hit of using the 32-bit architecture to begin with), but that should be fine for anything old enough to not be recompiled.
In some ways, Windows already does that too - the 32-bit syscall wrappers [switch into a 64-bit code segment](https://aktas.github.io/Heavens-Gate) so the 64-bit ntdll copy can call the 64-bit syscall.
I would guess so, but I haven't seen anyone developing that so far.
most of those games would have windows builds?
that said, i sometimes think about a clean-room reimplementation of e.g. the unity3d runtime -- there are so many games that don't even use native code logic (which still could be supported with binary translation via e.g. unicorn) and are really just mono bytecode but still can't be run on platforms for which their authors didn't think to build them (or which were not supported by the unity runtime at the time of the game's release).
> most of those games would have windows builds?
Yeah, that's a reasonable workaround, as long as it doesn't hit that OpenGL problem above (now it mostly affects DX7 era games, since they don't have Vulkan translation path). Hopefully it can be fixed.
In practice, the path for legacy software on Linux is Wine.
I have heard people say the only stable ABI on Linux is Win32.
Win32S but the other way around.
Win64S?
Perhaps a new compatibility layer, call it LIME -- LIME Is My Emulator
LIME Isn’t Merely an Emulator
I can’t help but wonder if kernel devs realize how much this discussion sounds like something you’d expect from Apple. They are talking about obsoleting hardware not because it’s fundamentally broken, but because it no longer fits neatly into a roadmap. Open source has always been about making hardware outlive commercial interest and let it run long after the hardware vendor abandons it.
I'm pretty shocked to see comments like "the RAM for a 32-bit system costs more than the CPU itself", but open source isn’t supposed to be about market pricing or what’s convenient for vendors; it’s about giving users the freedom to decide what’s worth running.
I understand that maintainers don’t want to drag around unmaintained code forever, and that testing on rare hardware is difficult. But if the code already exists and is working, is it really that costly to just not break it? The kernel's history is full of examples where obscure architectures and configs were kept alive for decades with minimal intervention. Removing them feels like a philosophical shift, especially when modern hardware is more locked down and has a variety of black box systems running behind it like Intel ME and AMD PSP.
> But if the code already exists and is working, is it really that costly to just not break it?
It depends on the feature, but in many cases the answer is in fact 'yes.' There's a reason why Alpha support (defunct for decades) still goes on but Itanium support (defunct for years) has thoroughly been ripped out of systems.
What's the Venn diagram of people stuck with 32-bit hardware and people needing features of newer kernels? Existing kernels will keep working. New devices probably wouldn't support that ancient hardware; seen any new AGP graphics cards lately?
There's not a compelling reason to run a bleeding edge kernel on a 2004 computer, and definitely not one worth justifying making the kernel devs support that setup.
security isn't a compelling reason?
The bulk of CVEs that crossed my desk in the last couple of years were in things that wouldn’t matter on a 32-bit system, like problems in brand new graphics cards or fibre channel or 10G Ethernet, or KVM hosting, or things like that. There wasn’t a huge flood of things affecting older, single-user type systems.
But in any case, I’m sure Red Hat etc would be happy to sell backports of relevant fixes.
It’s not that I’m unsympathetic to people with older systems. I get it. I’ve got old hardware floating around that I’ve successfully kept my wife from ecycling. It’s that I’m also sympathetic to the kernel devs who only have so many hours, and don’t want to use them supporting ancient systems that aren’t still widely used.
If I'm running a living museum of computer history, to let the youth of today experience 15" CRTs and learn the difference between ISA, PCI and AGP slots - I'm probably not connecting my exhibits to the internet.
"perfection"
> open source isn’t supposed to be about market pricing or what’s convenient for vendors; it’s about giving users the freedom to decide what’s worth running.
Ehhh, it's about users having the ability to run whatever they like. Which they do.
If a group of users of 32 bit hardware care to volunteer to support the latest kernel features, then there's no problem.
If no one does, then why should a volunteer care enough to do it for them? It's not like the old kernel versions will stop working. Forcing volunteers to work on something they don't want to do is just a bad way to manage volunteers.
> If a group of users of 32 bit hardware care to volunteer to support the latest kernel features, then there's no problem.
It's not just the case that you need people to support 32bit/nommu; you also have to account for the impact on other kernel devs working on features that are made harder.
This is called out in the article around keeping highmem support.
That is a fair point! I do think though that it would make sense for maintainers to at least put out an open call to users and developers before dropping something as fundamental as 32 bit support. The reality is that not all users are going to be kernel developers, and even many developers today aren’t familiar with the workflow kernel development requires. Mailing lists, patch submission processes, and the cultural expectations around kernel work are all a pretty steep barrier to entry, even if someone does care about the removal and also happens to be a developer.
The other dynamic here is that the direction in Linux does come from the top. When you have maintainers like Arnd Bergmann saying they would "like" to remove support for hardware (like the ARM boards), that sets the tone, and other contributors will naturally follow that lead. If leadership encouraged a philosophy closer to "never break existing hardware" the same way we’ve had "never break userspace" for decades, we probably wouldn’t even be debating removing 32 bit.
I’m not saying kernel devs need to carry the weight alone, but it would be nice if the community’s baseline stance was towards preservation rather than obsolescence. :(