Is there such a thing as a technology historian? I have some odd questions about the history of computers.

by NotRoryWilliams

My specific question, on the off chance such a person exists and follows this forum, is how operating systems were able to accommodate so many diverse hardware configurations among early computers, specifically Apple II and classic Macintosh platforms, where there existed a host of “upgrade cards” that added memory, graphics, and even processor upgrades through expansion bus products. I understand the basic principle of “device driver” software components, but it’s just boggling my mind how it was possible for software to run with acceptable and even sometimes impressive performance with things like coprocessors and expansion boards before the invention of modern “threading” tech - or was that concept actually in place all along and just not discussed in the same terms?

I feel like this may be more of a “tech” question than a classic “history” question in the spirit of this subreddit, though, so I would be happy if someone could just point me to the right place to ask this question.

tokynambu

The moderators have had a day here, so I am somewhat nervous about answering :-)

Buses on early microcomputers were just an extension of the address and data lines on the processor. S-100, ISA: in both cases, the backplane was carrying the same signals that were present on the CPU. This had to be the case: given the component densities of the era, memory was inevitably going to be a separate board, rather than connected to the processor by a (by implication) faster bus on the motherboard. Indeed, such machines did not have motherboards as we now understand them: the processor and its support logic were just another board on the bus.

Devices on the bus were therefore usually memory-mapped. A given device occupied a range of addresses, and was accessed as though it were memory: an address was put out on the address lines, and then the appropriate control lines were strobed to "write" the data lines to that location, or to "read" that location onto the data lines. Devices appeared either as a block of memory (say, a video card) or as a set of registers. So a hypothetical RS232 card would contain a register which was "the byte to be written", and in order to send a byte down the serial line you'd write it to that memory location and the board would do the rest. On the read side there would also be an interrupt line which was strobed to say "look at me! I have something", and then the registers would be read to find out what. So the same RS232 card would, on receipt of a byte, place it in a register and strobe the interrupt line, which would cause the processor to read it.
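To make that concrete, here's a rough sketch in C of what talking to such a memory-mapped serial card looks like. The base address, register layout and status bit are made up for illustration (a real card's manual would give you the actual ones), but the mechanism is exactly the one above: the card's registers are just locations in the address space, and ordinary reads and writes on the bus are what drive the hardware.

```c
/* A sketch (mine, with made-up addresses and bit layout) of memory-mapped
 * I/O for something like the RS232 card described above.  The card's two
 * registers simply sit at fixed addresses in the processor's address space;
 * ordinary bus read/write cycles are what actually talk to the hardware. */
#include <stdint.h>

#define SERIAL_BASE   ((uintptr_t)0xC0A0)                         /* hypothetical */
#define SERIAL_DATA   (*(volatile uint8_t *)(SERIAL_BASE + 0))    /* TX/RX byte   */
#define SERIAL_STATUS (*(volatile uint8_t *)(SERIAL_BASE + 1))    /* status bits  */
#define TX_READY      0x01   /* "transmitter can accept a byte"                   */

/* Send one byte: wait until the card reports it is ready, then store the
 * byte into the mapped data register; the write cycle on the bus is what
 * hands the byte to the card. */
void serial_putc(uint8_t byte)
{
    while (!(SERIAL_STATUS & TX_READY))
        ;                                /* spin on the status register */
    SERIAL_DATA = byte;
}

/* Called from the interrupt handler: the card strobed the interrupt line
 * to say "I have received something", so read it out of the register. */
uint8_t serial_getc_from_irq(void)
{
    return SERIAL_DATA;
}
```

The `volatile` is the modern-C way of saying "this location is hardware, don't cache or reorder accesses to it"; on an 8-bit machine of the era you'd have written the equivalent directly in assembler as plain loads and stores to those addresses.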

But there is a history lesson here: the history of RAM speeds. Until relatively recently, RAM latency was comparable to processor clock speeds. Consult this table: https://en.wikipedia.org/wiki/CAS_latency

In the 1970s and 1980s, RAM was asynchronous: it was controlled directly by the processor, and was as fast as it needed to be. So a 1MHz processor, typical of the era, could be confident that if it went to RAM to fetch some data, it could do so within a clock cycle (a 1µs cycle being far longer than the delays inherent in the silicon). There was no concept of cache on the CPU die (beyond a handful of registers), because the RAM was "fast enough".

By the 1990s, with PC100, we see what the table calls "First Byte" (the delay between issuing a read request and the first byte of that block arriving) of 20ns. That's equivalent to 50MHz (ie, a single thread doing dependent, one-byte-at-a-time reads tops out around 50MB/sec), and machines of the era had "roughly" that sort of clock speed. So going to RAM to fetch some data would cost you one, at most two, clock cycles during which you couldn't do anything else. Registers become more valuable, and perhaps caching or pre-fetching instructions gets you something, but it's not the end of the world. 1992, right?

Now scan to the bottom of that table for 2022's new hotness. Note how the CAS latency, measured in cycles, steadily increases as you go down. Look at the very fastest of the new hotness, DDR5-6400 CAS32. That has a "first byte" delay of 10ns, only half what you had in 1992. That's equivalent to 100MHz. Right, but processors now clock at 3200MHz, not 50MHz, and you have a lot of them on a die. Now you're going to need to wait a _lot_ of clock cycles for anything to happen.
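If you want to check my arithmetic, here's a trivial C snippet doing the conversions (the 3200MHz core clock is just a representative modern figure, not anything from the table):

```c
/* Back-of-the-envelope check of the numbers above (purely illustrative). */
#include <stdio.h>

int main(void)
{
    double first_byte_1992_ns = 20.0;    /* PC100-era "first byte" latency */
    double first_byte_2022_ns = 10.0;    /* DDR5-6400 CAS32 "first byte"   */
    double core_clock_mhz     = 3200.0;  /* representative modern core     */

    /* A latency of t ns is a rate of 1000/t MHz: that many million
     * dependent one-byte reads per second. */
    printf("1992: %.0f ns -> %.0f MHz equivalent\n",
           first_byte_1992_ns, 1000.0 / first_byte_1992_ns);
    printf("2022: %.0f ns -> %.0f MHz equivalent\n",
           first_byte_2022_ns, 1000.0 / first_byte_2022_ns);
    printf("at %.0f MHz, a %.0f ns trip to RAM is ~%.0f idle clock cycles\n",
           core_clock_mhz, first_byte_2022_ns,
           first_byte_2022_ns * core_clock_mhz / 1000.0);
    return 0;
}
```

Run it and you get 50MHz-equivalent for the 1992 figure, 100MHz-equivalent for 2022, and roughly 32 core clock cycles of stall for every trip to RAM at 3.2GHz.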

Hence the need to thread, and pipeline, and cache, and speculate, and all the rest: because if you imagine a processor which followed the von Neumann pattern of "read an instruction from memory, read an operand from memory, write the result to memory, repeat", that's three trips to memory at 10ns each, roughly 30ns per instruction, and your processor may as well be clocked at 33MHz because that's as fast as it's going.
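You can see this on any modern machine with a toy benchmark: chase pointers through a big shuffled array (every step is a dependent load that caches and prefetchers can't help with) and compare it with walking the same memory sequentially. This is my own illustrative sketch, not anything canonical, and exact numbers will vary by machine, but the dependent-load case comes out at something close to raw DRAM latency per step, while the sequential case is far cheaper per element.

```c
/* Illustrative only: compare dependent ("pointer chasing") reads, which pay
 * full memory latency on every step, with a sequential walk over the same
 * memory, which caches and prefetchers handle well.  Build with e.g.
 * cc -O2 chase.c */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N ((size_t)1 << 24)          /* 16M entries (~128MB), bigger than any cache */

static uint64_t rng_state = 88172645463325252ULL;

static uint64_t xorshift64(void)     /* small PRNG, good enough for shuffling */
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    size_t *next = malloc(N * sizeof *next);
    size_t i, j, tmp, p, sum;
    double t0, chase, seq;

    if (!next) return 1;

    /* Sattolo's shuffle of the identity gives a single N-long cycle, so
     * following next[] repeatedly touches every entry in a random order. */
    for (i = 0; i < N; i++) next[i] = i;
    for (i = N - 1; i > 0; i--) {
        j = (size_t)(xorshift64() % i);
        tmp = next[i]; next[i] = next[j]; next[j] = tmp;
    }

    /* Latency-bound: each load depends on the result of the previous one. */
    t0 = now_sec();
    for (p = 0, i = 0; i < N; i++) p = next[p];
    chase = now_sec() - t0;

    /* Cache/prefetch-friendly: read exactly the same memory, but in order. */
    t0 = now_sec();
    for (sum = 0, i = 0; i < N; i++) sum += next[i];
    seq = now_sec() - t0;

    printf("dependent loads : %6.1f ns per step    (p=%zu)\n", chase / N * 1e9, p);
    printf("sequential loads: %6.2f ns per element (sum=%zu)\n", seq / N * 1e9, sum);
    free(next);
    return 0;
}
```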

That RAM is barely faster, in latency terms, in 2022 than it was in 1992 is the dirty secret of modern computer architectures, and most of the complexity springs from that and that alone.