tasty_freeze 20 hours ago [-]
Northstar made an S-100 card which did FP math, using BCD arithmetic. It had a ucode ROM and a 4b (single digit) ALU, and a few small RAMs to hold the digits. If I remember correctly you could program it to select how many digits you wanted in your representation, up to 14 digits. It did everything one digit at a time, and it had a 256 byte ROM to carry out any digit*digit product in one cycle. For normalization no data was moved -- just the pointer to the appropriate digit was incremented or decremented.
That's a very interesting board! It came out in 1976 (four years before the 8087) and cost $499 assembled, equivalent to $2900 in current dollars, so it was expensive. It was really a decimal processor built from simple TTL parts, and had four microcoded instructions: add, subtract, multiply, and divide. Arithmetic used the 74LS181, the very popular ALU chip. (It did multiplication with repeated addition; there's no ROM with digit products, unless that was a later version.) The "small RAM" was very small by modern standards: four 4-bit registers that each held 16 digits. Each register was implemented with a 74S189 chip.
The microcode is available, so it would be a fun project to write a simulator that runs the microcode.
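For anyone who wants to see the digit-at-a-time idea concretely, here is a minimal Python sketch of BCD addition one digit per step, plus multiplication by repeated addition as described above. It is not the board's microcode and does not model the 74LS181 or the 74S189 registers. The function names (`to_digits`, `from_digits`, `bcd_add`, `bcd_mul`) and the least-significant-digit-first layout are assumptions made purely for illustration.

```python
def to_digits(n, width):
    """Split a non-negative integer into `width` decimal digits, least significant first."""
    return [(n // 10**i) % 10 for i in range(width)]


def from_digits(digits):
    """Inverse of to_digits."""
    return sum(d * 10**i for i, d in enumerate(digits))


def bcd_add(a, b):
    """Add two equal-length digit lists one digit at a time.

    Returns (sum_digits, carry_out). Each step is a small binary add
    followed by a decimal adjust, which is all a 4-bit ALU needs to do.
    """
    assert len(a) == len(b)
    result, carry = [], 0
    for da, db in zip(a, b):
        s = da + db + carry
        if s > 9:          # decimal adjust: fold the digit back into 0..9
            s -= 10
            carry = 1
        else:
            carry = 0
        result.append(s)
    return result, carry


def bcd_mul(a, b, width):
    """Multiply two integers using only digit-serial adds (repeated addition).

    For each multiplier digit d at position i, the multiplicand shifted
    by i digits is added d times. Digits beyond `width` are dropped.
    """
    acc = [0] * width
    a_digits = to_digits(a, width)
    for i, d in enumerate(to_digits(b, width)):
        shifted = ([0] * i + a_digits)[:width]   # multiplicand * 10**i
        for _ in range(d):
            acc, _carry = bcd_add(acc, shifted)
    return from_digits(acc)
```

With an 8-digit width, `bcd_mul(123, 45, 8)` performs 5 + 4 = 9 digit-serial additions, which gives a feel for why multiplication by repeated addition is slow compared with a digit-product lookup ROM.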
My mistake -- what I wrote was from memory so I got the bit wrong about the multiplier ROM. I must have confused that detail with the design of the Wang 2200 computer, which had double precision BCD float math and did in fact have a 4b x 4b multiplier ROM.
Lovely page, I enjoyed it lots. Especially this: “The first time I programmed a computer was in the fall of 1978 at LTHS, Lyons Township High School, in La Grange, IL. It changed my life. For a long time I couldn't think of anything else but programming computers, and it hasn't yet completely worn off.”
Calms me down and gives hope. Over the past decade (working for money), I've felt more and more that losing the programming spark is just around the corner, yet the decade and some before that was so exhilarating. But now I think: you started 20 years before I was born!
kjs3 14 hours ago [-]
I didn't know about that board; very cool. Northstar had an S-100 'math board' based on an AMD 9511 FP chip that was popular in some very niche markets. Quite a bit more capable, but probably not as intrinsically interesting.
trollbridge 21 hours ago [-]
Must…resist…clicking link… I’ve got a lot to do today and this is like carefully crafted bait to tie me up for the next 4 hours. :-)
QuadrupleA 10 hours ago [-]
Always amazed how spoiled we are with modern hardware! The 8087 was $500 in today's dollars, and delivered around 50 kFLOPS of performance (0.00005 GFLOPS).
A cheap mobile phone CPU+GPU costs the manufacturer maybe $20, and typically does 50 GFLOPS on the CPU and 500+ on the GPU. So 10 million times the performance for 1/25th the cost.
Humbling too how "worthless" all the incredible ingenuity of the 8087 circuits and die designs now is, although I'm sure many of those innovations live on in modern chips.
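As a quick sanity check on the arithmetic, the figures quoted above can be plugged into a few lines of Python. The numbers are the commenter's rough estimates, not measurements, and the variable names are just illustrative.

```python
# Rough figures taken from the comment above (estimates, not benchmarks).
flops_8087 = 50e3                  # ~50 kFLOPS
cost_8087 = 500.0                  # USD, in today's dollars
flops_phone = (50 + 500) * 1e9     # ~50 GFLOPS CPU + ~500 GFLOPS GPU
cost_phone = 20.0                  # USD, manufacturer cost

gflops_8087 = flops_8087 / 1e9               # 50 kFLOPS expressed in GFLOPS
perf_ratio = flops_phone / flops_8087        # raw performance ratio
cost_ratio = cost_phone / cost_8087          # phone cost as a fraction of 8087 cost
perf_per_dollar_ratio = perf_ratio / cost_ratio

print(f"8087: {gflops_8087:.5f} GFLOPS")
print(f"performance ratio: {perf_ratio:.3g}")
print(f"cost ratio: 1/{1 / cost_ratio:.0f}")
print(f"performance per dollar ratio: {perf_per_dollar_ratio:.3g}")
```

The performance ratio works out to about 11 million, consistent with the "10 million times" estimate, and the cost ratio is 1/25, so performance per dollar differs by a factor in the hundreds of millions.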
SyzygyRhythm 16 hours ago [-]
Why isn't the shifter built with a log2 arrangement, shifting 32-16-8-4-2-1 bits? Takes fewer sub-stages and doesn't require a separate decoder for the input.
The article mentions it already has a two-stage design, shifting bits and then bytes, so it can't be about shifting more than one bit at a time. Anyone know why?
kens 15 hours ago [-]
Yes, you can use a "logarithmic shifter". The CDC 6600 supercomputer (1964) used that approach. The tradeoff is that you need more stages with the logarithmic approach (six versus two for 64 bits).
If you're using MOS pass transistors for each stage, you lose some voltage at each stage, which limits the number of stages. I think this is why the 8087 (and the 386) used two-stage shifters rather than logarithmic shifters. I don't know how the circuit area compares between the two approaches--two more complex stages vs six simpler stages--but I suspect the two-stage approach wins.
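To make the stage counts concrete, here is a small Python model of the two approaches for a 64-bit right shift. It is only a behavioral sketch under my own assumptions (right shift only, shift amounts 0 to 63, a bit stage followed by a byte stage), not a description of the 8087's actual circuit. The names `log_shift_right`, `two_stage_shift_right`, and `one_hot` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def log_shift_right(x, n):
    """Logarithmic shifter: six stages, each shifting by a fixed power of two.

    Each stage is controlled directly by one bit of n, so no decoder is needed.
    """
    stages = 0
    for k in (32, 16, 8, 4, 2, 1):
        if n & k:
            x >>= k
        stages += 1
    return x & MASK64, stages


def one_hot(value, size):
    """Model a decoder: exactly one of `size` select lines is active."""
    return [i == value for i in range(size)]


def two_stage_shift_right(x, n):
    """Two-stage shifter: shift by 0-7 bits, then by 0-7 whole bytes.

    Each stage picks one of eight shifted copies, so each needs a 3-to-8 decoder.
    """
    bit_select = one_hot(n & 7, 8)
    byte_select = one_hot(n >> 3, 8)
    stage1 = next(x >> i for i, sel in enumerate(bit_select) if sel)
    stage2 = next(stage1 >> (8 * i) for i, sel in enumerate(byte_select) if sel)
    return stage2 & MASK64, 2
```

Both functions return the same shifted value for every shift amount from 0 to 63. The difference is that a signal passes through six stages in the logarithmic version and two in the two-stage version, at the cost of wider selection logic per stage.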
xenadu02 14 hours ago [-]
Wasn't this also one of the last chips laid out by hand (literally the masks were cut and laid out physically)? Or am I thinking of something else?
I sometimes wonder if some design decisions were made on that basis.
kens 11 hours ago [-]
CAD was a very incremental process. Early chips were drawn by hand and the Rubylith masks were cut by hand with the help of a Coordinatograph. Later, Intel used a Xynetics plotter to cut the Rubylith. By 1974, layouts were digitized with a Calma GDS I so repeated cells could be handled automatically. By the time of the 8087, there was a lot of automation.
You might think that the 8087's shifter would be a regular grid, easy to lay out by hand. It turns out to be very optimized and irregular. (I traced it out by hand and it was a pain.)
bell-cot 22 hours ago [-]
Closely related, 8 days ago, 138 points & 28 comments:
Yes - I was just trying to give things a "this is interesting, so upvote & discuss!" kick, in the absence of Ken popping up with a good "Author here for your 8087 questions" comment.
elpocko 20 hours ago [-]
I guess he didn't pop up because the article is 6 years old.
https://s100computers.com/Hardware%20Folder/NorthStar/FP%20B...
Manual and schematics are here if anyone is looking for them: https://bitsavers.org/pdf/northstar/boards/North_Star_Floati...
https://www.wang2200.org/
(I'm the guy behind the wang2200.org domain)
https://news.ycombinator.com/item?id=48519011 (about the 8087's adder)
https://www.righto.com/2020/05/die-analysis-of-8087-math-cop...
https://www.righto.com/2026/06/intel-8087-adder-reverse-engi...
https://news.ycombinator.com/item?id=23362673