M4 Mac minis in a cluster is cool, but not massively effective


A cluster of M4 Mac minis – Image credit: Alex Ziskind/YouTube



There is a way to use a collection of M4 Mac minis in a cluster, but the benefits only really exist when you use high-end Macs.

While most people think that having a more powerful computer means buying a single expensive device, there are other ways to perform large amounts of number crunching. In one concept that has been around for decades, you could use multiple computers to handle processing on a project.

The concept of cluster computing revolves around a task with lots of calculations being shared between two or more processing units. Because the units work together to complete tasks in parallel, the time to process can be shortened dramatically.
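As a rough illustration of that idea, here is a minimal Python sketch that splits one job into chunks, hands each chunk to a worker, and combines the partial results. The function names are hypothetical, and threads on one machine stand in for cluster nodes, so this shows the split-and-combine pattern rather than any real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_workers):
    """Split items into n_workers contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        # The first `extra` chunks each take one leftover item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The work one node would do on its share of the data."""
    return sum(x * x for x in chunk)


def clustered_sum_of_squares(items, n_workers):
    """Distribute the chunks to workers, then combine the partial sums."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    # Five workers, echoing the five Mac minis in the video.
    print(clustered_sum_of_squares(list(range(1000)), n_workers=5))
```

In a real cluster, the chunks and partial results would travel over the network, which is why the speed of the link between nodes matters so much.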

In a video published to YouTube on Sunday, Alex Ziskind demonstrates a cluster computing setup using the M4 Mac mini. Using a collection of five Mac minis stacked in a plastic frame, he sets a task that is then distributed between them for processing.

While typical home cluster computing setups rely on Ethernet networking for communications between the nodes, Ziskind is instead taking advantage of the speed of Thunderbolt by using Thunderbolt Bridge. This speeds up the communications between the nodes considerably, and also allows larger packets of data to be sent, which reduces the processing overhead of handling them.

Ethernet can run at 1Gb/s normally, or up to 10Gb/s if you paid for the Ethernet upgrade in some Mac models. The Thunderbolt Bridge method can instead run at 40Gb/s for Thunderbolt 4 ports, or 80Gb/s on Thunderbolt 5 in M4 Pro and M4 Max models when run bi-directionally.
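To put those figures in perspective, the following Python sketch estimates how long a payload would take to cross each link at its headline rate. The 10GB payload is a hypothetical example, and real-world throughput will be lower than these theoretical peaks because of protocol overhead.

```python
# Headline link rates from the article, in gigabits per second.
LINK_SPEEDS_GBPS = {
    "Gigabit Ethernet": 1,
    "10Gb Ethernet": 10,
    "Thunderbolt 4": 40,
    "Thunderbolt 5": 80,
}


def transfer_seconds(payload_gigabytes, link_gbps):
    """Best-case transfer time: 8 bits per byte, no protocol overhead."""
    gigabits = payload_gigabytes * 8
    return gigabits / link_gbps


if __name__ == "__main__":
    payload_gb = 10  # hypothetical payload size, not a figure from the video
    for name, speed in LINK_SPEEDS_GBPS.items():
        print(f"{name}: {transfer_seconds(payload_gb, speed):.1f} s")
```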

Better than GPU processing

Ziskind points out that there can be benefits to using Apple Silicon rather than a PC using a powerful graphics card for cluster computing.

For a start, processing using a GPU relies on having considerable amounts of video memory available. On a graphics card, this could be 8GB on the card itself, for example.

Apple’s use of unified memory on Apple Silicon means that the Mac’s memory is shared by the CPU and the GPU. The Apple Silicon GPU therefore has access to a lot more memory, especially when it comes to Mac configurations with 32GB or more.

Then there’s power draw, which can be considerable for a graphics card. High power usage translates into a higher ongoing cost of operation.

By contrast, the Mac minis were found to use very little power, and a cluster of five Mac minis running at full capacity used less power than one high-performance graphics card.

MLX, not Xgrid

To get the cluster running, Ziskind uses a project we’ve already talked about. It uses MLX, an Apple open-source project described as an “array framework designed for efficient and flexible machine learning research on Apple Silicon.”

This is vaguely reminiscent of Xgrid, Apple’s long-dead distributed computing solution, which could control multiple Macs for cluster computing. That system also allowed a Mac OS X Server to take advantage of workgroup Macs on a network to perform processing when they weren’t being used for anything else.

However, while Xgrid worked for large-scale operations that were very well funded at a corporate or federal level, as AppleInsider’s Mike Wuerthele can attest, it didn’t translate well to smaller projects. Under perfect and specific conditions, and with specific code, it worked fantastically, but home-made clusters tended not to perform very well, and were sometimes slower than a single computer doing the work.

MLX does change that quite a bit, as it uses the standard MPI distributed computing methodology to work. It is also possible to get it running on a few Macs of varying performance, without necessarily shelling out for hundreds or thousands of them.

Unlike Xgrid, MLX seems to be geared a lot more towards smaller clusters, which suits the crowd that wanted to use Xgrid but kept running into trouble.

A useful cluster for the right reasons

While adding together the performance of multiple Mac minis in a cluster seems attractive, it’s not something that everyone can benefit from.

For a start, you’re not going to see benefits for typical Mac uses, like running an app or playing a game. This is intended for processing massive data sets or for high intensity tasks that benefit from parallel processing.

This makes it ideal for purposes like creating LLMs for machine learning research, for example.

It’s also not exactly easy for the typical Mac user to set up.

Also, the performance gains aren’t necessarily going to be that beneficial for the usual Mac owner. Ziskind found in tests that simply buying an M4 Pro model offers more performance than two M4 units working together when using LLMs.

Clusters can be really worth it when using multiple high-spec Macs together

Where a cluster like this comes into play is when you need more performance than you can get from a single powerful Mac. If a model is too big to work on a single Mac, for example because of memory constraints, a cluster can offer more total memory for the model to use.
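The memory argument can be sketched as a back-of-the-envelope calculation in Python. The function names, the 70-billion-parameter model, and the assumption that only about 75% of each Mac's unified memory is usable for the model are all illustrative choices, not figures from Ziskind's tests.

```python
import math


def model_memory_gb(params_billions, bytes_per_param):
    """Approximate weight size: 1 billion parameters at 1 byte each is 1GB."""
    return params_billions * bytes_per_param


def nodes_needed(model_gb, memory_per_node_gb, usable_fraction=0.75):
    """Number of Macs whose pooled usable memory could hold the model.

    usable_fraction is an assumed allowance for the OS and other
    overheads; it is not a measured value.
    """
    usable_gb = memory_per_node_gb * usable_fraction
    return math.ceil(model_gb / usable_gb)


if __name__ == "__main__":
    # Hypothetical model: 70 billion parameters at 2 bytes each (16-bit).
    size_gb = model_memory_gb(70, 2)
    print(f"Model weights: about {size_gb}GB")
    for memory in (16, 64, 128):
        print(f"{memory}GB Macs needed: {nodes_needed(size_gb, memory)}")
```

This only counts the model weights. Working memory during inference or training adds to the total, so real requirements would be higher.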

Ziskind offers that, at this stage, a high-end M4 Max Mac with vast amounts of memory is better than a cluster of lower-performance machines. But even so, if your requirements somehow go beyond the highest single Mac configuration, a cluster can help out here.

However, there are still some limitations to consider. While Thunderbolt is fast, Ziskind had to resort to using a Thunderbolt hub to connect the nodes to the host Mac, which reduced the available bandwidth.

Directly connecting the Macs together solved this, but that approach runs into its own problems, such as the limited number of Thunderbolt ports available on each Mac for connecting to the others. This can make scaling the cluster problematic.
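One way to see the scaling problem is to count the connections, assuming every Mac is cabled directly to every other Mac in a full mesh. That topology and the three-port figure used below are assumptions for illustration, not details confirmed from the video.

```python
def full_mesh_cables(n_nodes):
    """Cables needed so that every node links directly to every other node."""
    return n_nodes * (n_nodes - 1) // 2


def ports_needed_per_node(n_nodes):
    """Each node needs one port for each of the other nodes."""
    return n_nodes - 1


def max_full_mesh_nodes(ports_per_node):
    """Largest full mesh possible with a given number of ports per node."""
    return ports_per_node + 1


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} Macs: {full_mesh_cables(n)} cables, "
              f"{ports_needed_per_node(n)} ports per Mac")
    # Assuming three Thunderbolt ports per Mac:
    print("Largest full mesh:", max_full_mesh_nodes(3), "Macs")
```

Under those assumptions, a five-Mac cluster would need four ports per machine, which is why a hub or a different wiring layout becomes necessary.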

He also ran into thermal oddities, where the host Mac mini was running especially hot, while nodes ran at a more reasonable level.

Ultimately, Ziskind found the Mac mini cluster tower experiment was interesting, but he doesn’t intend to use it long-term. However, it’s still relatively early days for the technology, and in cases where you use multiple high-end Macs for a sufficiently tough model, it can still work very well.


