M1 Pro and M1 Max: Everything That We Know

Apple’s newest chips have arrived. We study in detail what the M1 Pro and M1 Max are all about and compare them to contemporary rivals from Intel, AMD and Nvidia. Posted by erwinkarim on 6:33 PM, October 21, 2021. Last updated on 9:17 AM, November 3, 2021. Filed in: m1 pro, m1 max, deep dive, analysis, benchmarks

According to Tim Cook, Apple is at the end of the first year of its two-year transition from Intel to its own chip solution, Apple Silicon. To mark the end of that first year, Apple has finally expanded the M1 family with Systems-on-Chip (SOCs) that are fit for professionals: the M1 Pro and M1 Max. Apple spent a good chunk of the presentation explaining the features and benefits of the Pro and Max chips. After all, they will be the beating heart of its future professional Macs.

We will dissect everything that Apple said in the presentation and research everything that is known from the wild to give you the most complete information on the new M1 Pro and M1 Max SOCs. This will be a long one.

Preamble

The original M1, the first of the Apple Silicon chips, has several functions. First and foremost, it powers the lower-end consumer Macs. It has to be easy and cheap to manufacture, but powerful enough to run almost anything that users throw at it. Most importantly, the chip is used to field test the Apple Silicon architecture and weed out the bugs that a laboratory environment usually does not catch. It also lets developers test and update their programs to leverage the new Apple Silicon. One year out, the experiment has been a resounding success. A lot of people praise the M1 chip and are surprised by its performance, and there are no major issues to speak of.


The OG M1 is a technology validator for Apple Silicon. The M1 Pro and M1 Max are the payoff.

Confident, Apple has now updated the M-series lineup to include SOCs that can handle serious professional tasks. While the M1 itself has amazed people with how well it handles heavy work like 4K RAW editing, playing games and beating high-end Intel chips, everyone was left wondering what kind of professional chip Apple would field. Enter the M1 Pro and M1 Max.

On the surface, the M1 Pro and M1 Max just look like beefed-up versions of the M1 SOC. This is wildly inaccurate. Firstly, the M1 Pro and M1 Max are performance oriented instead of efficiency oriented. They are meant to compete with the high-end, high-priced laptop (and some desktop) chips that are on the market. Apple also added features that are important for professionals. We will discuss each of the features in depth.


Apple M1 Max is here to kick ass and take names

Compute Cores

To see how the M1 Pro and M1 Max do against Intel’s and AMD’s finest, read here.

Compute cores are the general purpose cores that handle instructions from your programs. In the M1, there are 4 high performance cores and 4 efficiency cores. Both kinds of cores run the same instructions, except that the efficiency cores have a smaller cache and run at a lower clock speed to save power. The formula is flipped on the M1 Pro and M1 Max. First, they have more cores than the normal M1. Both the M1 Pro and M1 Max come with up to 10 compute cores. Second, they have more high performance cores than efficiency cores. In the pro SOCs, the efficiency cores have been reduced from 4 to 2, while the performance cores have been increased from 4 to 8.


An example of binning. All the chips come from the same wafer. The percentages are the yield rates.

Binning is a practice in the semiconductor industry to increase production yield and lower overall manufacturing costs. You see, the die size (chip size) of the pro SOCs is much larger than that of the M1 chip, so you can make fewer SOCs from the same wafer. The bad news is that any contaminant, like a speck of dust on the wafer, can destroy a chip. The good news is that most chip designs are now multi-core, so you can disable a few of the cores and sell the same chip in different configurations. Everybody does it and Apple is no exception.

Apple offers the M1 Pro with 8 or 10 compute cores and the M1 Max with 10 cores only. Reports show that both versions run at the same clock speed, but the M1 Max has a “high-power” mode that lets it sustain peak performance for longer, loosely akin to Turbo Boost in Intel’s implementation.
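To see why a larger die hurts yield and why binning helps, here is a minimal sketch using a simple Poisson defect model. The defect density and die areas below are illustrative assumptions, not Apple or TSMC figures, and the model optimistically assumes that every defect lands in a core that can be switched off.

```python
import math

def perfect_yield(die_area_mm2, defects_per_cm2):
    """Share of dies with zero defects under a simple Poisson model."""
    expected_defects = defects_per_cm2 * die_area_mm2 / 100.0
    return math.exp(-expected_defects)

def binned_yield(die_area_mm2, defects_per_cm2, tolerated_defects=1):
    """Share of dies with at most `tolerated_defects` defects.

    Assumes each defect only kills a core that can be disabled.
    """
    lam = defects_per_cm2 * die_area_mm2 / 100.0
    return sum(
        math.exp(-lam) * lam**k / math.factorial(k)
        for k in range(tolerated_defects + 1)
    )

DEFECT_DENSITY = 0.1  # defects per cm^2 (assumed for illustration)
DIE_AREAS_MM2 = {"small die": 120, "large die": 430}  # illustrative sizes

for name, area in DIE_AREAS_MM2.items():
    print(
        f"{name}: fully working {perfect_yield(area, DEFECT_DENSITY):.0%}, "
        f"sellable with binning {binned_yield(area, DEFECT_DENSITY):.0%}"
    )
```

Under these assumptions the large die loses far more chips to defects than the small one, and selling dies with one disabled core recovers most of that loss.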

Graphic Cores

We also did a comparison of the M1 Pro and M1 Max GPUs against Nvidia’s and AMD’s finest: the RTX 3080 Laptop and Radeon RX 6600M. You can read about it here.

In a typical PC that runs on Intel or AMD chips, the CPU comes with weak integrated graphics, or the PC manufacturer pairs it with a more powerful discrete GPU. This setup has several advantages: it allows flexibility in the design and gives more options to end customers. It also has several disadvantages. Cost is higher, for example, since integrated solutions are always going to be cheaper than discrete ones.


At 16 graphic cores, the M1 Pro has double the graphic core count of the M1, which makes the GPU respectable.

Apple decided to flip the script by proposing an integrated solution. And they don’t play around with their integrated solution. GPU cores account for half to two thirds of die real estate. This is because Apple decided to put a huge GPU together with the compute and other cores on the M1 Pro and M1 Max. On the M1 Pro, you have the choice between 14 or 16 graphic cores. The M1 Max doubles that by providing 24 or 32 graphic cores.


And the M1 Max ups the ante by having up to 32 cores, which starts to challenge products from graphics specialists like Nvidia.

Furthermore, in a discrete solution like a PC setup, the graphics and compute chips each have their own memory. In Apple’s setup, compute and graphic cores share the same memory space. Instead of going back and forth between graphics memory and compute memory, the compute and graphic cores all work on the same unified memory space. This greatly increases performance, as copying from one memory pool to another is expensive.
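To get a feel for what a copy between separate memory pools costs, here is a rough back-of-the-envelope sketch. The link speed is an assumed round number in the ballpark of a fast PCIe connection, and the frame format (8K, 4 bytes per pixel) is chosen only for illustration.

```python
def frame_bytes(width, height, bytes_per_pixel=4):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel

def copy_time_ms(num_bytes, link_gb_per_s):
    """Time to move num_bytes across a link, ignoring latency and overhead."""
    return num_bytes / (link_gb_per_s * 1e9) * 1000

ASSUMED_LINK_GB_S = 32.0  # assumed CPU-to-GPU link speed, not a measured value

frame = frame_bytes(7680, 4320)  # one 8K frame at 4 bytes per pixel
one_way = copy_time_ms(frame, ASSUMED_LINK_GB_S)

print(f"Frame size: {frame / 1e6:.0f} MB")
print(f"Discrete GPU, copy in and copy back: {2 * one_way:.1f} ms per frame")
print("Unified memory: no copy, both sides read the same buffer")
```

At 60 frames per second there are only about 16.7 ms to spend on each frame, so a few milliseconds of pure copying is a noticeable slice of the budget.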

Media Engine

Another advantage of making your own chip to power your own computers is that you can have a unique solution on the market. Of course, this means that you need to create your own ecosystem, which is very difficult to do without support, but the reward is very high if it’s perfectly executed. In Apple’s case, they are pushing the ProRes standard and provide hardware support for it via the A15 chip on the phone and the M1 Pro and M1 Max on the Macs.


The media engine accelerates decoding and encoding of common video codecs. It has a dedicated encoder/decoder for ProRes, which works perfectly if you are in the Apple ecosystem.

The media cores are specialized cores that help decode and encode common codecs like H.264 and HEVC. Apple took an extra step by providing accelerators that can decode and encode ProRes streams. Furthermore, the M1 Max has two ProRes encode/decode cores, which allow it to handle 7 streams of 8K or 30 streams of 4K video. To put that in perspective, Apple does provide such hardware acceleration on the Mac Pro, but it’s a $2,000 option that takes up a PCIe card slot and can only handle 6 streams of 8K. And it does not even provide hardware accelerated ProRes encoding. It is an understatement to say that a 2021 14” M1 Max laptop is more capable than a $6,000 2019 Mac Pro in almost every way.


The $2,000 optional Afterburner card for the Mac Pro

Other features


Apple added its own bells and whistles to serve its own ecosystem. The Neural Engine, which is conspicuously absent from Intel and AMD solutions, is present on the M1 Pro and M1 Max. It is the same 16-core Neural Engine found in the base M1 chip. The same goes for the Secure Enclave, whose duties used to be handled by the T2 chip in Intel-based Macs.

There is some speculation that Apple added a second 16-core Neural Engine module to the M1 Max. Although the design looks identical at a glance, it’s not confirmed that the M1 Max has an additional Neural Engine. If this analysis is true, then the M1 Max should have double the ML processing capacity.


Display engine allows the MacBook Pro laptop to connect up to 4 displays

Apple added the display engine to handle display output. One may presume that this also handles the Thunderbolt 4 protocol, which is the main way the Macs will be connected to an external display. With the M1 Pro, you can connect up to two 6K monitors (Apple’s 32” Pro Display XDR, which starts at $4,999, is one), while with the M1 Max you can connect three 6K monitors plus a 4K TV. That’s roughly 69 million pixels, 60 times per second.

On the 16” MacBook Pro, with its larger unibody that can handle higher thermal loads and its 140W charger, the M1 Max has a high power mode, which means it can sustain higher loads for a longer time before it starts to thermal throttle. It does not increase the performance of the M1 Max. It just allows the M1 Max to stay in high performance mode longer.
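The pixel count for the external displays can be tallied quickly. This sketch assumes 6016 x 3384 for a 6K monitor (the Pro Display XDR resolution) and 3840 x 2160 for a 4K TV, and it leaves out the laptop’s built-in screen.

```python
# Assumed resolutions: 6K as on the Pro Display XDR, 4K as standard UHD.
PIXELS_6K = 6016 * 3384
PIXELS_4K = 3840 * 2160

def external_pixels(num_6k, num_4k=0):
    """Total pixels across the attached external displays."""
    return num_6k * PIXELS_6K + num_4k * PIXELS_4K

m1_pro_pixels = external_pixels(2)     # two 6K monitors
m1_max_pixels = external_pixels(3, 1)  # three 6K monitors plus one 4K TV

print(f"M1 Pro: {m1_pro_pixels / 1e6:.1f} million pixels")
print(f"M1 Max: {m1_max_pixels / 1e6:.1f} million pixels")
```

Under these assumptions the M1 Max configuration comes to a little over 69 million pixels per refresh.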


Matrix multiplication. Matrix operations are useful in a lot of graphics and machine-learning applications, like applying a blur filter to an image. The M1 Pro and M1 Max have a co-processor to speed up such operations.

Some users have discovered that Apple has an undocumented co-processor called Apple Matrix, or AMX for short. The co-processor speeds up matrix manipulation tasks. A matrix is a mathematical concept where numbers are stored in columns and rows, like in an Excel spreadsheet. Matrices are useful for representing large data sets in a single structure. Matrix operations are integral to tasks like image and video manipulation, face recognition and machine learning. People have claimed that the AMX co-processor is larger in the M1 Pro than in the base M1, and that the M1 Max has two AMX co-processors. Tests have indeed shown that the M1 Pro is twice as fast at matrix manipulation tasks as the base M1, and the M1 Max is faster still.

Above: a die shot interpretation from Twitter user Locuza

Matrix multiplication tests show that the M1 Pro does the job faster than the base M1, which is itself faster than a Ryzen 7 chip
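To make the idea concrete, here is a small sketch of a blur expressed as a matrix multiplication, written in plain Python. It only illustrates the kind of multiply-and-add work a matrix co-processor speeds up. It does not use AMX in any way, and the function names are made up for this example.

```python
def matmul(a, b):
    """Naive matrix multiplication: rows of a times columns of b."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]

def box_blur_matrix(n):
    """n x n matrix that averages each pixel with its left and right neighbours."""
    m = [[0.0] * n for _ in range(n)]
    for j in range(n):
        neighbours = [k for k in (j - 1, j, j + 1) if 0 <= k < n]
        for k in neighbours:
            m[k][j] = 1.0 / len(neighbours)
    return m

# A tiny 2 x 5 grayscale "image" with one bright column in the middle.
image = [
    [0, 0, 90, 0, 0],
    [0, 0, 90, 0, 0],
]

blurred = matmul(image, box_blur_matrix(5))
for row in blurred:
    print([round(value) for value in row])
```

The bright column gets smeared across its neighbours, which is exactly what a horizontal blur does. A real image has millions of pixels, so the number of multiply-and-add steps grows quickly, and that is where dedicated matrix hardware pays off.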

Memory Controller

Another aspect of the new pro SOCs that Apple excitedly mentioned but did not go into detail about is the memory controller. Apple threw around numbers like 200GB/s on the M1 Pro and 400GB/s on the M1 Max, but did not give any perspective on what those numbers mean. Here are some numbers for you. Intel’s top-of-the-line laptop processor, the i9-11980HK, has around 51.2GB/s of memory bandwidth. Intel’s high-end workstation processor, the Xeon W-3375, can only do 171.5GB/s. That is a workstation processor which costs around $4,500, and here the M1 Max, a processor meant for a laptop, has more than double the memory bandwidth.


The M1 Pro has two memory modules. Each module has 2 memory channels, for a total of 4.

How does Apple achieve this? First, it’s confirmed that Apple uses the new LPDDR5 memory, which provides higher bandwidth at 6.4GB/s per 8 bits of memory interface. So each 64-bit memory channel on the M1 Pro delivers 51.2GB/s. It’s also confirmed that the memory interface is 256 bits wide, so the M1 Pro has 4 memory channels, which brings the memory bandwidth to around 204.8GB/s. The M1 Max goes a step further with a 512-bit memory interface. So the M1 Max has 8 memory channels, which is something that you normally only see in a server chip. With 8 memory channels, the bandwidth goes up to 409.6GB/s.
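The arithmetic above can be checked in a few lines. This sketch assumes LPDDR5 running at 6400 mega-transfers per second on every data pin, which is where the 6.4GB/s per 8 bits comes from. The function name is made up for this example.

```python
def bandwidth_gb_s(bus_width_bits, transfers_per_second_m=6400):
    """Peak bandwidth in GB/s for a memory bus of the given width.

    Each transfer moves bus_width_bits / 8 bytes.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second_m / 1000

print(f"One 64-bit channel: {bandwidth_gb_s(64):.1f} GB/s")
print(f"M1 Pro, 256-bit:    {bandwidth_gb_s(256):.1f} GB/s")
print(f"M1 Max, 512-bit:    {bandwidth_gb_s(512):.1f} GB/s")
```

These are theoretical peaks. Apple rounds them down to 200GB/s and 400GB/s in its marketing.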


M1 Max has double everything. 4 memory modules and 8 memory channels.

M1 Pro and M1 Max Against Peers


CPU performance claims by Apple. They set TDP per chip at around 30W

They set TDP per chip around 60W

So now that we have the details laid out, how do the M1 Pro and M1 Max compare with their competitors? Not only does Apple have to compete with Intel and AMD on general purpose processors, it now also has to compete with graphics specialists like Nvidia. As far as theoretical benchmarks go, the M1 Pro and M1 Max are at the very top. At press time, we have to go with what Apple claims and some questionable data from the internet, as most of the MacBook Pros have yet to arrive in the hands of end users.

As general purpose processors doing general purpose number crunching, the M1 Pro and M1 Max excel thanks to having more performance cores than before. However, based on tests that have leaked on the internet so far, the single core performance is identical to that of the base M1 processor. What this means is that the M1 Pro and M1 Max are built on the M1 design, which itself is based on the A14 processor.

As a GPU, the M1 Max is an excellent mobile graphics processor. What is surprising is that, on paper, the M1 Max edges out the Nvidia RTX 3080, a dedicated desktop graphics card that costs around $1,700 and has a TDP of 320W, while the M1 Max is a laptop graphics chip with a TDP of 60W. Of course, since the laptops have not yet reached users’ hands, we could not perform benchmarks on them yet.


Apple M1 Pro and M1 Max GPUs compared with contemporary desktop GPUs. These are calculated figures only.

Conclusion or How The New Chips Fall In The Family


Current state of the M1 family. The M1 is for most consumers, while the M1 Pro and M1 Max are for power users on the go.

Now that Apple has unveiled the M1 Pro and M1 Max, Apple’s plan for the M-series chips has become clear. Believe it or not, the M1 Max, while being a very powerful laptop chip, is not even the best that Apple is making. That would be the chip that is set to go into the higher end iMac and eventually the Mac Pro.

So in Apple’s mind, theoretically there will be 3 performance tiers. The base M1 is for general consumers. The mid-tier, which the current M1 Pro and M1 Max occupy, balances performance against power consumption, as it is geared toward high performance laptops. And finally, we have the all-out, high performance, high power consumption SOCs that will go into the high end iMac and Mac Pro.

If the rumors are to be believed, and so far they have been borne out by the arrival of the M1 Pro and M1 Max, there will be upscaled versions of the M1 Max dubbed Jade-2C and Jade-4C. These will combine two and four M1 Max dies respectively in a single chip. The Jade-2C will have 20 compute cores and 64 graphic cores, while the Jade-4C will have 40 compute cores and 128 graphic cores. The performance and thermal envelope of such SOCs will be insane.


The eventual family of the M-series chip. The Jade-2C / 4C should make an appearance in the Mac Pro. Will the Jade-2C appear on the new iMac 32-inch? Time will tell.
