The M1 speed-up is not an overnight success story, nor the concentrated effort of a few engineers over the course of a year. It is the result of more than 10 years of research and development by Apple, driven by the sense that Intel was holding it back. That cumulative effort produced an ultraportable laptop (the MacBook Air) that can edit 3 streams of 4K video without a single fan on board. How did this happen? It’s a long story …

Back Story

The year is 2005 and Steve Jobs is in the midst of transitioning Apple’s products away from PowerPC to Intel. The iPod has been a runaway success and Apple is actually seeing money come in instead of burning it away. So the next plan is to make a revolutionary handheld computer for the masses: enter the iPhone.

At this point, Apple is looking for chips for its new devices and actually asks Intel, its new partner and supplier, whether it can make a low-power, high-performance chip for the new handheld computer. Intel, in what would later be called its biggest misstep ever, says no because it does not expect a big market for such a chip.

The iPhone and eventually the iPad Pro

The first iPhone launched in 2007 with a Samsung ARM chip inside. Realizing how important the phone was and how capable the device could be, Steve Jobs set out to ensure that the iPhone would be the best handheld device it could be. This meant packing a lot of computational power into a device that runs mainly on battery and has no fan to cool its CPU. This set off a series of developments that pushed the ARM architecture in general, and the A-series in particular, to the next level.


a4 to a11 evolution
The evolution of the A-series chips. Each step builds toward an eventual desktop-class chip that is designed for efficiency from the ground up.

Apple’s first release was the A4 chip for the iPad and, eventually, the iPhone 4. But that was a reworked ARM Cortex design. The first chip with a CPU core that Apple designed from the ground up is the A6 for the iPhone 5. Over the years, new features were added to the A-series processors, such as the Secure Enclave, an image processor and 64-bit support on the A7, performance and efficiency cores on the A10, and an Apple-designed GPU and a Neural Engine for neural networks on the A11.

The backstory to those features is that Apple designs both the hardware that processes the information and the software that manipulates it. This tight integration ensures that the overall solution is both highly efficient and highly performant. Hardware accelerators like the image processor keep video processing fast, and the software tightly controls which codecs it supports (like HEVC / H.265).


intel vs a-series
Intel vs A-series processors. Apple improves, eventually catches up with Intel and then exceeds its performance. And mind you, this is for mobile phones!

The tight combination of highly optimized hardware and software means that the performance of Apple’s A-series chips has increased at a higher rate than Intel’s over the years, eventually catching up with Intel’s top-of-the-line processors. Not only did they catch up on raw performance, they did it within a lower thermal budget.

Getting smaller

The transistor is the main component of any CPU. It acts as a miniature electrical switch. From that switch, you can create logic gates. From logic gates, you can eventually create adders, memory storage and the many other parts that make up a computer. The size of the circuit is a major factor in how fast it can run. A smaller circuit means a shorter distance for electrical signals to travel to turn a switch on and off, so it takes less time for information to move around.
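The step from switches to logic gates to an adder can be sketched in a few lines of Python. This is only an illustration, not how any real CPU is described: it treats a single NAND gate as the basic building block (an assumption made here for simplicity), and all function names are hypothetical.

```python
def nand(a: int, b: int) -> int:
    """The one primitive gate in this sketch: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


# Every other gate below is built purely out of NAND gates.
def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three single bits and return (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out


def add(x: int, y: int, width: int = 8) -> int:
    """Ripple-carry adder: chain one full adder per bit, lowest bit first."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result  # wraps around at 2**width, like a fixed-width register


print(add(5, 3))  # 8
```

Each call to nand stands in for a handful of transistors, so an 8-bit adder here is already a chain of well over a hundred switches. That is why the distance between switches matters so much.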

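The M1 is built on a 5 nm process, while Intel’s chips of the same period are built on 10 nm or larger. Taking those node names at face value (they are partly marketing labels), each feature on the M1 is half as wide and half as tall. Because of this, Apple can theoretically put 4 times more transistors than Intel can in a given area. This by itself gives a performance boost. Furthermore, signals travel half the distance, and so take roughly half the time, to get from one transistor to another. This gives a further performance boost.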
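The 4x and half-distance figures follow from simple geometry when one process has half the feature size of the other. Here is a minimal sketch, assuming a 10 nm versus 5 nm comparison and taking node names at face value. Real node names do not map exactly onto physical feature sizes, so treat the result as an upper-bound illustration. The function names are hypothetical.

```python
def density_ratio(old_nm: float, new_nm: float) -> float:
    """How many times more transistors fit in the same area.

    Shrinking both the width and the height of each transistor by the same
    factor shrinks its area by that factor squared.
    """
    linear_shrink = old_nm / new_nm
    return linear_shrink ** 2


def relative_distance(old_nm: float, new_nm: float) -> float:
    """Distance between neighbouring transistors, relative to the old process."""
    return new_nm / old_nm


# Assumed comparison: a 10 nm process versus a 5 nm process.
print(density_ratio(10, 5))      # 4.0
print(relative_distance(10, 5))  # 0.5
```

The same arithmetic applied to a 14 nm versus 5 nm comparison gives a ratio of about 7.8, which shows how sensitive the headline number is to which two processes are compared.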

System on a chip

For typical x86–64 designs from Intel or AMD, this is the usual layout on your computer’s motherboard. We have pretty much used this layout for 30–40 years because it works.


10th gen core block diagram

To show the evolution, this is the architecture layout back in the Pentium 4 days (2000s).


pentium 4 block diagram

This layout has lasted for more than 30 years because it works, it’s flexible and it’s good enough for the most part. Now here’s the layout of the Apple M1 chip:


m1 block diagram

The first thing to notice is that the RAM (DRAM) sits in the same package as the chip itself. Another thing to notice is that the GPU is on the chip. The Neural Engine, specialized hardware for running machine learning algorithms, is also on the chip. This is something that Intel doesn’t even have. So the main parts needed to run a computer (CPU, GPU, RAM) are already together in one place. With this design, the input/output controllers for USB ports, audio, networking and display are also on the chip. All you need is a connection to a storage drive to hold the data.


system on chip vs discrete chips
A system on a chip like the M1. Since everything sits on a single path, there is less switching cost, which translates to lower latency.

Mindset

Intel and Apple have two different mindsets when building chips. Intel makes CPUs for a lot of markets, from consumer machines to the highly lucrative server / data center market, while Apple makes chips for a very targeted and segmented market: wearables, phones and consumer computers.

Intel has multiple customers around the globe that demand different things. Intel has to make chips that cater to all of them and often can’t afford to ignore any of them. Apple’s chip division has only one customer: Apple. So it makes chips that are highly optimized to meet whatever vision Apple has. Because of this, the chips are highly optimized for the things that need to be done, but not that flexible for anything outside that vision.

Take the calculator app as an example. The calculator app is available on the iPhone, Watch and Mac. Because Apple couldn’t design a good-looking calculator app for the iPad, it decided not to make one at all. The closest thing you can get to a calculator on the iPad is asking Siri to do calculations.

So, the M1 chip is a highly optimized replacement for the Intel chip, built on the ARM instruction set, that works really well in the macOS environment: the only environment that Apple cares about.

Conclusion

The M1 manages to maximize performance because of the development and optimization that Apple has done on its A-series chips for over 10 years. The tight integration between hardware and software ensures that, for the tasks it was designed to perform, the M1 runs them with exceptional efficiency.

Plug

Help grow my family and this site. Get your Apple computers with M1 chips through my Amazon affiliate links.

  • Apple Mac Mini M1 - link
  • Apple MacBook Air M1 - link
  • Apple MacBook Pro M1 - link