/aig/ Alternative ISA General

From InstallGentoo Wiki
Revision as of 22:45, 4 October 2021 by Gene (talk | contribs) (expand m68k and some tweaks)
Alternative as in choice, choice as in freedom

Alternative ISA General is a discussion thread about non-x86 hardware. "Alternative" doesn't mean "unpopular"; it means "alternative to x86". While there have been such threads in the past, they were usually sporadic and poorly connected with one another, so whatever transpired in one thread wasn't carried over to the next.

Due to the rise of desktop-class ARM chips, interest in alternative hardware has risen, with many Anons even coming up with projects of their own. Therefore, a centralised place was needed, where we could keep track of the development and goals of the community.

While discussion of Intel or AMD hardware is not absolutely prohibited (and even if it were, who is gonna enforce this? LOL), due to the ubiquity of x86 hardware, it is assumed that whatever concerns such architecture can be discussed in any of the other gorillion threads on the board.

Old threads are available on Desuarchive.

Ongoing projects

SOON™

Anons are interested in porting several open source projects to the PowerPC architecture. So far the following proposals have been made:

Grand Theft Auto III

Re3 is a homebrew engine intended to replace proprietary RenderWare with an open source implementation. Anons have been discussing making a port for the 32-bit PowerPC version of Mac OS X.

The Elder Scrolls III: Morrowind

OpenMW is a free and open source modern re-implementation of the Morrowind engine, which was built on NetImmerse, the predecessor of Gamebryo.

Tomb Raider

OpenLara is a Classic Tomb Raider open-source engine.

Resources

Anon has been kind enough to put together a small reference library.

A collection of brief infocards for many different processors is available on TextFiles, which also holds huge archives relating to programming and microcomputers, especially 8-bit processors. WikiChip and CPU Shack also have a lot of information on alternative ISAs, even though their front pages are dominated by mainstream processors.

Wikipedia has a Comparison of instruction set architectures. There was also a List of instruction sets, which was meant to be merged into the Comparison article, but in typical Wikipedia fashion a redirect was made without anyone doing the real work of merging the articles. Thus the archived List of instruction sets is still worth a review.

ISA Overview

ISA simply means Instruction Set Architecture. This is what the programmer sees from the outside, which these days is very different from the microcode and state machines operating inside the processor, normally inaccessible to ordinary programmers. The mid-1970s saw a Cambrian explosion of architectures that later fossilised into what we see today. Assembly programmers and academics such as Hennessy and Patterson agree that x86 stinks, but as usual inertia and money trump speed, efficiency and elegance. Those three qualities are what we instead celebrate in this general.

The same ISA can be implemented by many different internal architectures and microcode. The ISA is the main topic but since many architectures are DIY we discuss both. The best way is to illustrate by examples of processors. An ISA is defined by a fairly large set of parameters.

Much can be summarised as CISC (Complex Instruction Set Computer) or RISC (Reduced Instruction Set Computer), that is, complex or simple. The RISC definition has drifted over the decades, from "simple" to register-to-register operations with load-store memory handling and greatly reduced and simplified addressing modes.

ISA Features

Registers

The question is simply few or many. Few registers are good for low latency, many registers are good for lazy programmers and poor compilers. The 6502 managed with just one accumulator plus a handful of other registers. Compiler writers prefer at least 16 registers. One cannot avoid noting that modern processors have tons of registers but still seem sluggish.

Register Use

Operations are performed on registers of some form.

Accumulator based where all processing is via one (or 2, rarely more) main register, examples are most early 8-bit processors;

Stack based where everything is performed on a stack, though TOS (top of stack) is in a register for fast access and operations, possibly also next on stack (NOS), examples are Novix NC4000 and many virtual processors; or

Register file where many registers can be used in similar ways, examples are 68K and many modern processors.
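The difference between the first two styles is easiest to see by computing the same expression, (2 + 3) * 4, on each. The following is only a toy sketch: the mnemonics LDA, ADD, MUL, STA and PUSH are made up for illustration and do not belong to any real processor.

```python
def run_accumulator(program, memory):
    """Toy accumulator machine: every operation goes through register A."""
    a = 0
    for op, addr in program:
        if op == "LDA":      # load A from memory
            a = memory[addr]
        elif op == "ADD":    # A = A + memory operand
            a += memory[addr]
        elif op == "MUL":    # A = A * memory operand
            a *= memory[addr]
        elif op == "STA":    # store A to memory
            memory[addr] = a
        else:
            raise ValueError("unknown op: " + op)
    return a


def run_stack(program):
    """Toy stack machine: operands are popped from TOS/NOS, result is pushed."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":
            stack.append(instr[1])
        elif op in ("ADD", "MUL"):
            tos = stack.pop()   # top of stack
            nos = stack.pop()   # next on stack
            stack.append(nos + tos if op == "ADD" else nos * tos)
        else:
            raise ValueError("unknown op: " + op)
    return stack[-1]
```

Note that the accumulator program needs a memory address in every instruction, while the stack program only names operands when pushing them.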

Register Types

Accumulator that is the default destination(s) of operations, usually tightly coupled to the ALU (Arithmetic Logic Unit)

Data Register similar to accumulator, but on processors with register files such as 68K that had 8 data registers

Address Registers used for addressing into memory, usually tightly coupled to the data address generators. Stack pointers can be a form of an address register

Index Registers used for indexing into memory from an offset that may come from an address register

Operands

2-op instructions of the form A += X; or

3-op instructions of the form A = B + C.

The former requires a little extra thought, since the destination is overwritten, but 3-op is simple for lazy programmers and poor compilers. One cannot avoid noting that modern processors drift towards 3-op instructions.
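A rough sketch of what the "extra thought" amounts to: with 2-op instructions the destination is also a source, so keeping both inputs intact costs an extra move. The register file is modelled as a plain list, and the helper names (mov, add2, add3) are invented for this example.

```python
def mov(regs, dst, src):
    regs[dst] = regs[src]                # dst = src


def add2(regs, dst, src):
    regs[dst] = regs[dst] + regs[src]    # 2-op form: dst += src


def add3(regs, dst, lhs, rhs):
    regs[dst] = regs[lhs] + regs[rhs]    # 3-op form: dst = lhs + rhs


def sum_two_op(regs):
    """r0 = r1 + r2 using only 2-op instructions; returns instruction count."""
    mov(regs, 0, 1)
    add2(regs, 0, 2)
    return 2


def sum_three_op(regs):
    """r0 = r1 + r2 using one 3-op instruction; returns instruction count."""
    add3(regs, 0, 1, 2)
    return 1
```

Both versions leave the same register contents; the 2-op version simply spends one more instruction to preserve r1.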

Modes

Addressing modes

Mode | Description | Example
Accumulator | The instruction operates on the accumulator (and not, say, memory) | 6502: ROL A
Absolute | The instruction operates on memory defined by a full width address | 6502, 16-bit address: LDA $FF00
Absolute, X | The instruction operates on memory defined by a full width address, with offset defined by index register X | 6502, 16-bit address: LDA $FF00, X
Absolute, Y | The instruction operates on memory defined by a full width address, with offset defined by index register Y | 6502, 16-bit address: STA $FF00, Y
Immediate | The instruction uses data in program memory subsequent to the instruction | 6502: LDX #$FD
Implied | Data is implied by the instruction | 6502: SEI
Indirect | Data is accessed indirectly via a pointer | 6502: JMP ($F000)
Indexed Indirect | Data is accessed indirectly via a pointer where the pointer is offset by the X register | 6502: LDA ($C0, X)
Indirect Indexed | Data is accessed indirectly via a pointer where the target of the pointer is offset by the Y register | 6502: LDA ($D0), Y
Relative | Data is accessed or program counter is accessed by an offset from present position | 6502: BNE $F300
Zero Page | Data is accessed from the zero page, addressed by a single byte | 6502: LDA $A0
Zero Page, X | Data is accessed from the zero page, addressed by a single byte, with offset defined by index register X | 6502: LDA $A2, X
Zero Page, Y | Data is accessed from the zero page, addressed by a single byte, with offset defined by index register Y | 6502: LDA $A4, Y
Post Increment | Data is accessed via an address register that is incremented after access | 68K: MOVE.L (A0)+,D3
Pre Decrement | Data is accessed via an address register that is decremented before access | 68K: MOVE.W -(A7),D4

Common modes are implied, immediate, absolute, absolute indexed, zero page, zero page indexed, stack relative, indexed indirect and indirect indexed, all with or without pre/post increment/decrement. Much can be combined. Stack relative addressing is very helpful for C programming, where variables are often transferred to the stack when calling a function. Relative addressing makes it possible to write relocatable code, which is useful when running several programs without an MMU.
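To make the indexed and indirect modes more concrete, here is a small sketch of how the effective address could be computed for a few of the 6502 modes. The function name and the mode strings are made up for this example; memory is modelled as a 64 KB bytearray, and pointers are stored little-endian as on the 6502.

```python
def effective_address(mode, operand, mem, x=0, y=0):
    """Return the address a 6502-style instruction would access."""
    if mode == "absolute":
        return operand & 0xFFFF
    if mode == "absolute_x":
        return (operand + x) & 0xFFFF
    if mode == "absolute_y":
        return (operand + y) & 0xFFFF
    if mode == "zero_page":
        return operand & 0xFF
    if mode == "zero_page_x":
        return (operand + x) & 0xFF          # wraps around inside page zero
    if mode == "indexed_indirect":           # e.g. LDA ($C0, X)
        ptr = (operand + x) & 0xFF           # index first, then follow pointer
        return mem[ptr] | (mem[(ptr + 1) & 0xFF] << 8)
    if mode == "indirect_indexed":           # e.g. LDA ($D0), Y
        base = mem[operand & 0xFF] | (mem[(operand + 1) & 0xFF] << 8)
        return (base + y) & 0xFFFF           # follow pointer first, then index
    raise ValueError("unknown mode: " + mode)
```

The only difference between the last two modes is the order of indexing and pointer lookup, which is why their names are so easy to mix up.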

Zero page was an addressing mode where the address was a single byte and it was implied this related to addresses 0x00 - 0xff where the high byte of the address (the "page") was set to zero. This saved space and time, typically 30 percent. This was later improved by Direct page wherein the page was set by a separate 8-bit register, such as on the 6809. This allowed moving the active page around and reduced the zero page pressure enormously. Lately zero page and direct page have fallen out of fashion.
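A short sketch of where the space saving comes from, using the 6502 LDA instruction as the example: the zero page form (opcode $A5) takes two bytes, the absolute form (opcode $AD) takes three. The helper names are invented for illustration, and the direct page function only models the address calculation with an 8-bit page register as described above.

```python
def encode_lda(addr):
    """Encode a 6502 LDA, picking the shorter zero page form when possible."""
    if addr <= 0xFF:
        return bytes([0xA5, addr])                    # LDA zero page, 2 bytes
    return bytes([0xAD, addr & 0xFF, addr >> 8])      # LDA absolute, 3 bytes


def direct_page_address(page, offset):
    """Full address for a direct page access: page register supplies the high byte."""
    return ((page & 0xFF) << 8) | (offset & 0xFF)
```

With the page register set to zero, direct page addressing behaves exactly like zero page addressing.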

Memory Architecture

Von Neumann which is a unified memory architecture for program and data.

Harvard Architecture where program and data are located in different memory spaces.

Recent designs are often hybrids, in that the ISA seen by programmers is von Neumann, while at the level 1 cache there is a Harvard architecture with separate instruction and data caches. This means self-modifying code can fail to work unless the caches are kept in sync.
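A toy model of why self-modifying code gets into trouble on such a hybrid: stores go through the data path, while instruction fetches are served from a separate instruction cache that may still hold the old contents. The class below is purely illustrative and does not model any specific processor; real designs differ in whether hardware keeps the two sides coherent or software has to synchronise them explicitly.

```python
class SplitCacheCPU:
    """Unified memory with a separate, non-coherent instruction cache."""

    def __init__(self, memory):
        self.memory = memory     # unified backing store (von Neumann view)
        self.icache = {}         # instruction-side cache (Harvard view)

    def fetch(self, addr):
        """Instruction fetch: served from the icache once a line is loaded."""
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def store(self, addr, value):
        """Data store: updates memory but never touches the icache."""
        self.memory[addr] = value

    def invalidate_icache(self):
        """Explicit synchronisation step that discards stale instructions."""
        self.icache.clear()
```

In this model, code that patches itself keeps executing the old instruction until the instruction cache is invalidated.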

Overall Design

CPU that we are all familiar with

DSP are Digital Signal Processors that tend to use Harvard or super Harvard architecture, with separate memory buses for X-, Y-, and program memory. These are optimised for processing long series of numbers, typically sampled signals from an ADC, low power consumption and hard real-time requirements. Typically a DSP is an accumulator based design tightly coupled to the MAC (Multiply and Accumulate) unit, which is the heart of the DSP. In a typical clock tick, the DSP loads a parameter from X-memory and a parameter from the Y-memory, multiplies these and sums this with the accumulator, while incrementing the pointers into X- and Y-memory. Optionally there is also a shift and rounding in every tick. The pointer incrementing typically takes place in the data address generators that serve to feed the MAC at maximum speed. A well known C equivalent is

A += *x++ * *y++
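The same multiply-accumulate loop can be spelled out step by step. This is a plain Python sketch of the idea, not a model of any particular DSP: the two lists stand in for X- and Y-memory, and the two indices play the role of the pointers that the data address generators would increment.

```python
def mac(x_mem, y_mem, taps):
    """Multiply-accumulate over `taps` samples, one pair per simulated tick."""
    acc = 0          # the accumulator tied to the MAC unit
    xi = 0           # pointer into X-memory
    yi = 0           # pointer into Y-memory
    for _ in range(taps):
        acc += x_mem[xi] * y_mem[yi]   # multiply and sum into the accumulator
        xi += 1                        # post-increment both pointers
        yi += 1
    return acc
```

For example, with samples in one list and filter coefficients in the other, one call computes a single output value of a FIR filter.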

GPUs are a more recent design where competition drives towards all-out performance no matter the thermal issues.

ISA Implementations

Processors

4 Bit Processors

These arrived in the early 1970s with the Intel 4004, but were soon overtaken by 8-bit processors. The format still exists and is used in huge volume markets that are extremely cost sensitive; consequently these chips are remarkably cheap.

8 Bit Processors

These usually have 8 bit registers and a 16 bit program counter or instruction pointer (terminologies vary) and can access 64 KB of memory. Most are accumulator based, which worked well in the 1980s since in this era memory and CPUs were equally slow.

RCA 1802

This is a weird and wonderful processor implemented as a bit serial architecture, which made it slow. It was popular for machines such as the COSMAC ELF. The fabrication process made it radiation resistant, so it was also popular for satellites, and it is still in production. A modern and compact machine is the 1802 Membership Card in the credit card form factor. Later the RCA 1805 was introduced, which used the single previously undefined opcode 0x68 as a prefix code to add several new instructions.

CHIP-8 was a virtual processor or very low level language, popular on the 1802 platform, and fast enough for making games. It has been extended and ported to many platforms.
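To give a feel for how small such a virtual processor is, here is a sketch of a CHIP-8 fetch-decode-execute step covering only three opcodes: 6XNN (set VX to NN), 7XNN (add NN to VX) and 1NNN (jump to NNN). Opcodes are two bytes, stored big-endian, and programs conventionally start at address 0x200. The function names are made up for this example, and everything else in the instruction set is left unimplemented.

```python
def decode(opcode):
    """Split a 16-bit CHIP-8 opcode into its usual fields."""
    return {
        "op": opcode >> 12,          # top nibble selects the instruction group
        "x": (opcode >> 8) & 0xF,    # first register index
        "y": (opcode >> 4) & 0xF,    # second register index
        "nn": opcode & 0xFF,         # 8-bit immediate
        "nnn": opcode & 0xFFF,       # 12-bit address
    }


def step(v, pc, memory):
    """Execute one instruction; v is the list of registers V0..VF. Returns new pc."""
    opcode = (memory[pc] << 8) | memory[pc + 1]   # big-endian fetch
    f = decode(opcode)
    pc += 2
    if f["op"] == 0x6:                            # 6XNN: VX = NN
        v[f["x"]] = f["nn"]
    elif f["op"] == 0x7:                          # 7XNN: VX += NN, 8-bit wrap
        v[f["x"]] = (v[f["x"]] + f["nn"]) & 0xFF
    elif f["op"] == 0x1:                          # 1NNN: jump
        pc = f["nnn"]
    else:
        raise NotImplementedError(hex(opcode))
    return pc
```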

6502

The 6502 was introduced in 1975, and has one accumulator (A), two index registers (X and Y), a stack pointer (S) and a processor status (P), all 8 bits wide; plus a 16 bit program counter (PC). It also has a zero page that can be used as address registers. It entered the market at a much lower price than the 6800 and quickly won a following. It was used in many popular computers of that era including the Apple II, BBC Micro and Commodore 64. For all its limitations it was powerful enough in the hands of skilled programmers to power the first spreadsheet (VisiCalc), which was also the first killer application, as well as 3D space games with hidden line wireframe graphics such as Elite.

The 6502 still has many loyal fans, hugely active communities and dozens of implementations. Complete development platforms, simulators, debuggers, operating systems, libraries and more are available, most for free. It is still supported commercially by The Western Design Center, founded by the original designer. An estimated 200 million chips are made annually for an installed base estimated at 2 billion. Not bad for a nearly 50-year-old design. This time span also means it is proven, and it is therefore used in applications such as pacemakers, where lifetime guarantees take on an entirely new meaning. It is also seen in robots and the occasional terminator. The 6502 has an extremely low transistor count, which makes it interesting for new opportunities such as a flexible version.

The 6502 has two weaknesses. First of all it is awkward for 16 bit pointer handling, which The Woz overcame by making the SWEET-16 virtual processor. The second is that the 6502 is not suited for stack intensive languages such as C. This has been overcome by other virtual processors such as the p-code for the UCSD p-System and VTL-2 (source), both of which exist for several ISAs. A more recent virtual CPU for the 6502 is AcheronVM, self described as the successor to SWEET-16.
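The awkwardness with 16 bit pointers comes from having to handle them one byte at a time. As a rough illustration, incrementing a pointer kept in two consecutive memory bytes (low byte first) follows the familiar pattern of INC low byte, branch if not zero, INC high byte. The Python below only models that pattern; the function name is invented for the example.

```python
def inc16(mem, lo_addr):
    """Increment a little-endian 16 bit pointer stored at lo_addr and lo_addr + 1."""
    mem[lo_addr] = (mem[lo_addr] + 1) & 0xFF              # INC low byte
    if mem[lo_addr] == 0:                                 # low byte wrapped around
        mem[lo_addr + 1] = (mem[lo_addr + 1] + 1) & 0xFF  # INC high byte


def read16(mem, lo_addr):
    """Combine the two bytes into the 16 bit pointer value."""
    return mem[lo_addr] | (mem[lo_addr + 1] << 8)
```

What is a single instruction on a machine with 16 bit registers becomes a short sequence with a conditional branch, repeated for every pointer operation.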

Several operating systems have been made for the 6502, including LUnix (Little Unix), Minikernel, GeckOS/A65 and many more. GEOS was an add-on OS for the C64 that provided a windowing system plus many applications such as text processing, spreadsheets and more, with all of this complex system fitting in 64 KB RAM. GEOS was not multitasking; that extension came with Wheels, which also had a web browser, but increased the RAM requirement to a whopping 128 KB.

6800

This was introduced in 1974 and was thus an early design. It has dual accumulators and one 16-bit index register.

6809

This was the peak of 8-bit architectures with dual accumulators (A and B) that could be merged to a 16 bit accumulator (D), and even featured an opcode for multiplication. Hitachi got a license and made the 6309 variant that includes more registers including another set of dual accumulators (E and F) that could be merged to a 16 bit accumulator (W).
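A small sketch of the register pairing described above: A forms the high byte and B the low byte of D, and the multiply instruction (MUL) takes the unsigned product of A and B and leaves the 16 bit result in D. The class is only an illustration of these two points, not a 6809 emulator.

```python
class Regs6809:
    """Accumulators A and B, viewed together as the 16 bit register D."""

    def __init__(self):
        self.a = 0
        self.b = 0

    @property
    def d(self):
        return (self.a << 8) | self.b       # A is the high byte, B the low byte

    @d.setter
    def d(self, value):
        self.a = (value >> 8) & 0xFF
        self.b = value & 0xFF

    def mul(self):
        """MUL: unsigned A times B, 16 bit result placed in D."""
        self.d = self.a * self.b
```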

Z-80

This is an offshoot of the 8080 by Zilog and was hugely popular in business applications thanks to CP/M. Zilog played evil games and won evil prizes.

While the chip may be old, people are still making new multitasking windowing operating systems for it.

16-bit Processors

Typical 16-bit architectures support 20- or 24-bit addressing and 16-bit data. Typical clock speeds are in the megahertz to low tens of megahertz range.
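The address widths translate directly into memory sizes: n address bits reach 2^n bytes. The sketch below works this out, and also shows one well-known way a processor with 16-bit registers can form a 20-bit address, namely the segment:offset scheme of the 8086 (segment shifted left by four bits plus offset, wrapping at 1 MB). The function names are made up for this example.

```python
def address_space_bytes(address_bits):
    """Number of bytes reachable with the given address width."""
    return 1 << address_bits


def physical_address_8086(segment, offset):
    """8086 real-mode style address: segment * 16 + offset, kept to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF
```

So 16 bits give 64 KB, 20 bits give 1 MB and 24 bits give 16 MB.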

Intel x86-16 (8086, 80186, 80286)

16-bit offerings from Intel included the 8086, 80186, and 80286.

WDC 65816 (65C816)

The '816 is essentially a 16-bit 6502 with some additional enhancements, such as a relocatable zero page. This processor was used in the Apple IIgs and the Super Famicom (SNES). It retains significant compatibility with the 6502: on reset, the processor is in emulation mode, wherein it behaves substantially like a 65C02. The processor is not pin-compatible with the 6502, however.

Zilog Z8000 (Z8001, Z8002, Z8003, Z8004)

Introduced in 1979. Sixteen 16-bit general purpose registers that can be used in 32-bit or 64-bit combinations. Not compatible with the earlier Z80.

32-bit Processors

The 16-bit generation had a short reign before being overtaken by 32-bit processors.

Motorola 68k series (68000, 680x0)

Motorola's evolution of the 6800, introduced in 1979. The first generation processors (68000, 68010, 68012) are generally described as being mixed 16-/32-bit CPUs (the 68008 is described as mixed 8-/32-bit). This is due to the width of their data and address ALUs, and internal and external data buses. Later generations are all fully 32-bit.

The first Apple Macintosh computers used the 68000. Apple continued to use m68k CPUs until transitioning to the PowerPC in the mid-1990s. It is said that WDC was designing a 32-bit successor to the 65C816, but Apple chose to go with Motorola chips and the rest is history. The Amiga, Atari ST, and Sega Genesis also used m68k CPUs.

This ISA brought high performance with many registers (8 data registers and 8 address registers) and numerous addressing modes. The complexity might at first glance seem overwhelming; nevertheless it was very popular with assembly programmers and performed well in their hands.

VAX

This is probably the peak of CISC. It powered VAX computers, typically running the VMS operating system with such reliability that uptimes were measured in 10+ years. VAX can be simulated by SimH, see below.

Home Made Processors

Making a CPU chip requires a lot of work and infrastructure. Thankfully there are alternatives. The first is to build the processor from several chips; TTL (Transistor-Transistor Logic) chips were popular for this, and were also used to prototype processors. Later, FPGAs (Field Programmable Gate Arrays) made things even simpler and faster.

TTL Processors

These can be wire wrap monsters but work surprisingly well. A well known example is the Home Brew CPU complete with an adapted C-compiler and a port of Minix. It is accessible from the net. Other home built processors can be found at the Homebuilt CPUs WebRing.

A very recent and interesting case is the Gigatron TTL Computer, which has a microcode system that can emulate a 6502 processor and a 16-bit processor, at a speed sufficient for simple games.

Soft Cores

Not to be confused with pr0n, a softcore is a description (typically in languages such as VHDL or Verilog) that is compiled and then downloaded into an FPGA. In the raw state an FPGA is a large collection of primitive components such as adders, MUXes etc. The bitstream from the compiler connects these together, turning the FPGA into nearly any kind of digital device such as a CPU, DSP, GPU, state machine or similar. A large collection of open source designs can be found on Github and OpenCores. These tend to be a lot faster than TTL processors, both to build and program and in operation.

It should be noted that the FPGA companies also provide softcores, such as Picoblaze for the Xilinx range of products, and Mico8 for the Lattice range.

Making your own ISA

This is where things get exciting!

Design

Start simple. Tempting as it may be, making the definitive ISA that will once and for all kick Intel off the market is not a good first project. And face it, if you are here reading this you are fairly new to ISA design. Start simple and get a feel for how it works. Like C or assembly programming, this is about skill and elegance that only come from experience. And if you don't want to make it elegant, well, Intel has shown even that can have utility. So start simple, perhaps 8-bit or even 4-bit. Reimplementing an existing ISA is a good exercise; the 6502 is very popular in this respect.

Implementation

Going for a TTL design on breadboard or wirewrap is an exercise in patience. An FPGA might be simpler and avoids short circuiting pins, especially if you use a development board with an FPGA and some auxiliary parts such as a display, switches and LEDs. You may have done software debugging using printf, now you might have to do debugging using an LED...

FPGAs are configured using designs in VHDL or Verilog. More information on that can be found in this thread over in Anycpu.org, including links to books.

Alternative OS for Alternative ISA

Many cross platform operating systems are available. Contiki OS is available for 6502, AVR and more. Microware OS-9 is available for 6809, 68K and more. FUZIX is a UNIX-like OS available for many 8-bit processors and 68K.

Simulators

Often it can be impractical to run the actual hardware in order to test old software, such as ordering a large VAX to test VMS. The solution is a simulator, such as SimH, which is capable of simulating a large number of architectures.

Links

The following is mostly a list of bookmarks.

Amiga (Motorola 68k)

amigaXfer, an easy-to-use GUI tool for lightning fast disk/file transfers on the serial port with the Amiga

Amiga 3000 running PPC software on KillerNIC NPU.

Amiga Hardware Database

Amiga Wiki

Compiled list of free/open sources related to classic Commodore Amiga computers

FS-UAE (Amiga emulator) released for Apple Silicon arm architecture

Amiberry (Emulate an Amiga on your Raspberry Pi)

Amiberry how-to


Atari (Motorola 68k)

Atari Museum

FireBee Atari-compatible computer

FreeMiNT Project Website

ARAnyM (Atari Running on Any Machine) VM Software


Other Motorola 68k Links

Motorola 68000 computer

News around Motorola 680x0 CPU computer systems

Motorola 680x0 Resources


MIPS

Emulate Windows NT 4.0 MIPS version (translated)

A guide to running IRIX 6.5.22 in MAME

IRIX Introduction

MIPS is back (translated)

Windows NT 3.50/MIPS installation on QEMU/MIPS


NMOS 6502

C64 Resources

Commodore 64 Preservation Project

Commodore 64 Resources

Commodore Computer Club - USA

Developing for the 6502 microprocessor and its relatives

The 6502 microprocessor Resource

Interactive in-browser course

"SWEET 16" Explainer

6502 ISA Code Chart


POWER/PowerPC

About Power9

Compiling for Powerpc, how to?

Evolution of PowerPC

Fixing Radeon Linux graphics on PowerPC

GNU/Linux Open Hardware PowerPC notebook

More about Spectre and the PowerPC (or why you may want to dust that G3 off)

http://bgafc.t-hosting.hu/oses4ppc.php Operating Systems for PowerPC

Power Macintosh


RISC-V

Haiku RISC-V port progress

SiFive: The Direction and Magnitude of SiFive Intelligence


SPARC

The Resurgence of SPARC/Solaris (Okamoto Rikiya)

SPARC International, Inc.


SuperH

Hitachi SuperH RISC Engine (Kawasaki Ikuya et al)

SuperH RISC engine Family Features


VAX

The Computer History Simulation Project

VAX MP: SIMH VAX simulator able to execute OpenVMS (VAX/VMS)

VAX in FPGA

VAX extended to 64 bits


Z80

Sharp PC-1500 (TRS-80 PC-2) resource page

ZX Spectrum Next

Symb-OS is a multitasking windowing OS for Z-80


Hardware Reimplementation

An interesting fpga handheld is being crowdfunded, with a focus on security

FPGA-related repositories on GitHub

Homebrew Cray-1A

MiSTer wiki


Raspberry Pi (arm)

Raspberry Pi hardware


Apple Silicon (arm)

Apple M1 SoC Tech Specs

Check if an app is native Apple Silicon or not yet

Apple MacBook Air (2020, M1)

SSD wear "issue" is FUD

Apple fixes incorrect wear reporting


Virtual Processors

DCPU-16 was a virtual processor intended for the game 0x10c.

The p-code machine was a stack based virtual processor used by UCSD Pascal on Apple 2 and other machines.


Other

Developing for all sorts of CPU

ForwardCom Instruction Set

FrogFind search engine

High-energy Electron Beam Lithography for Nanoscale Fabrication

Transputer Instruction Set

To Do

This document is in need of a lot more material. Information and references for ARM, AVR, RISC-V and MIPS are missing.