How to run Tensorflow in Docker on an Apple Silicon Mac
This post explains how to run Tensorflow in Docker on Apple Silicon Macs. Some familiarity with Docker is required. For the reasons outlined below, we will compile Tensorflow from source with ARM64 as the target architecture. Apart from the long compilation time, it should be straightforward to follow. I have created a sample GitHub repository that can be used as a working example.
The release of the M-series Macs has seen a lot of hype, and the performance benchmarks have been impressive. However, since the CPU architecture has changed from AMD64 (also known as x86_64) to ARM64 (also known as AArch64), this can create problems for some developer toolchains. In particular, Google does not release an ARM64 build of the popular Tensorflow framework, which is necessary in many machine learning stacks. This is not normally a problem in production systems, since production servers are likely to have an AMD64 architecture. However, it is good practice to have a development environment that is similar to the production environment, and Docker is a very common tool to ensure consistency between development and production. In other words, it is a common workflow to use the same Dockerfile for production and development. Somewhere in the Dockerfile you are likely to have something along the lines of
RUN pip install -r requirements.txt
and tensorflow==X.Y.Z in the requirements.txt file. When building the image locally on an M-series Mac, pip cannot find a compatible version of tensorflow, since Google does not release one for ARM64, and thus the build will fail with
ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow
Unfortunately, building a Docker container with --platform=linux/amd64
does not work either, as the AMD64 build of tensorflow requires AVX instructions, which are not emulated by the Docker daemon on an M1. This leaves two options:
- Find a hosted version of tensorflow compiled for Apple Silicon Macs
- Compile Tensorflow for ARM64 yourself
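To confirm that you are hitting this architecture mismatch rather than some other pip problem, you can check what the machine reports (a quick sketch; nothing here is specific to this project):

```shell
# Print the CPU architecture. An Apple Silicon Mac reports "arm64" on macOS,
# while a Linux container running on the same machine reports "aarch64".
host_arch=$(uname -m)
echo "host architecture: $host_arch"
```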
Obviously the first option is the easiest. And people have of course already compiled Tensorflow for ARM64. So if you are just playing around and want something quick you could do that. However, chances are that if you are in a corporate setting your local friendly security guy/gal will not look kindly on you installing python wheels from unknown sources as that is a major security risk (especially if you run as root in your docker image — but you wouldn’t do that, right?). There are no reputable sources providing Tensorflow for ARM64 so that leaves us with option 2. Let’s get compiling!
The easiest approach is to perform the compilation of Tensorflow in a Docker container locally on your Mac. Expect a compilation time of 4–8 hours, so let it run overnight. Since the build process is dockerized it is trivial to run it on a cloud VM as well. On a c6g.4xlarge (16 cores, 32 GB RAM) AWS instance, for example, it took me 45 minutes to compile.
Below I have pasted a sample Dockerfile, and I will go through the steps in detail (and yes, I do realize that I run as root here despite what I just said #yolo). If you copy this Dockerfile you should be left with a tensorflow wheel for python 3.10 in the /wheels/tensorflow directory in your container. If you run the container as-is, it will pip install tensorflow as a proof of concept.
FROM --platform=linux/aarch64 python:3.10
ARG TENSORFLOW_VERSION=v2.9.1
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update && apt install -y build-essential python3-dev pkg-config zip zlib1g-dev unzip curl wget git htop openjdk-11-jdk liblapack3 libblas3 libhdf5-dev npm
RUN npm install -g @bazel/bazelisk
RUN pip install six numpy grpcio h5py packaging opt_einsum wheel requests
RUN git clone https://github.com/tensorflow/tensorflow.git && mkdir -p /wheels/tensorflow
WORKDIR tensorflow
RUN git checkout $TENSORFLOW_VERSION
RUN bazel build -c opt --verbose_failures //tensorflow/tools/pip_package:build_pip_package
RUN ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /wheels/tensorflow
CMD ["pip", "install", "--extra-index-url", "/wheels", "tensorflow"]
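Assuming the Dockerfile above is saved in your current directory, a build-and-run session could look like the sketch below. The image name tf-arm64 is just an example, and the commands are guarded so they fail gracefully outside the project directory:

```shell
image="tf-arm64"   # example image name, not from the post
if command -v docker >/dev/null 2>&1; then
  # Build the image from the Dockerfile above; expect this to take hours.
  docker build -t "$image" . || echo "build failed (run this from the Dockerfile directory)"
  # Running the container pip-installs the freshly built wheel as a proof of concept.
  docker run --rm "$image" || echo "run failed (build the image first)"
else
  echo "docker is not installed; commands shown for reference only"
fi
```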
The FROM statement explicitly targets linux/aarch64, although this is redundant when running on an M-series Mac. The python version is important, so if you want to compile for a different version of python you can simply change it here. The ARG TENSORFLOW_VERSION instruction tells Docker which Tensorflow version to compile; this should correspond to a tag in the official Tensorflow repository. You can change it by providing --build-arg TENSORFLOW_VERSION=<tag> to the docker build command.
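For example, overriding the tag at build time could look like this (v2.9.2 and the image name are example values, not from the post):

```shell
# Any tag from the official Tensorflow repository works here; this one is an example.
tf_version="v2.9.2"
# The build command with the version override; shown via echo so it is safe to copy.
echo "docker build --build-arg TENSORFLOW_VERSION=${tf_version} -t tf-arm64:${tf_version} ."
```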
The compilation has several dependencies on various external libraries. There is nothing out of the ordinary here, but do note that the node package manager npm is installed. This makes it easy to install bazel, the build tool we use to build Tensorflow.
After npm installing bazel we install various python packages. Frustratingly, compilation can start without these, only to fail late in the build process if they are not present (looking at you, packaging!).
Next we git clone the official Tensorflow repository and check out the version to build. Now comes the fun part: the actual compilation. The command
RUN bazel build -c opt --verbose_failures //tensorflow/tools/pip_package:build_pip_package
will build the binary build_pip_package, which is what is used to build the python wheel. The -c opt option tells bazel to use optimization. This step will take a long time to complete, but it scales almost linearly with the number of cores allocated. I had 4 cores allocated to Docker when I tested this, and the build took an entire night. I experimented with 8 cores but it ran out of memory during compilation (I have 16 GB) :sadpanda:, so I reduced it to 4 for a better memory/core ratio.
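If you hit the same memory wall, an alternative to shrinking Docker's CPU allocation is to cap bazel's parallelism directly with its standard --jobs flag (the value 4 below is an example), by amending the build line in the Dockerfile:

```shell
# A job-capped variant of the build command; a lower --jobs value trades
# build speed for lower peak memory use. Shown via echo so it is safe to copy.
build_cmd="bazel build -c opt --jobs=4 --verbose_failures //tensorflow/tools/pip_package:build_pip_package"
echo "$build_cmd"
```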
After this step has completed, the command
RUN ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /wheels/tensorflow
will output a python wheel into /wheels/tensorflow. You can then install the wheel by using the --extra-index-url option and pointing it to that directory, as shown in the CMD instruction. You would probably be better off hosting the tensorflow wheel online (for example in an S3 bucket), but that is the topic for a future post. For now, let us just assume that you have hosted the wheel on http://mydomain.com/tensorflow.
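Either way, you can sanity-check the output by looking at the wheel's filename, which encodes the python version and platform tag. The exact filename below is an assumed example of what the build produces:

```shell
# A wheel built this way should carry the cp310 (python 3.10) and
# linux_aarch64 tags in its name. This filename is an assumed example.
wheel="tensorflow-2.9.1-cp310-cp310-linux_aarch64.whl"
case "$wheel" in
  *-cp310-*linux_aarch64.whl) echo "ARM64 wheel for python 3.10: $wheel" ;;
  *) echo "unexpected tags in: $wheel" ;;
esac
```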
Remember the discussion about the line
RUN pip install -r requirements.txt
in your development container above? If you now change this to
RUN pip install -r requirements.txt --extra-index-url=http://mydomain.com --trusted-host=mydomain.com
it will work for both development and production. pip will consult both pypi.org and mydomain.com (note that pip does not guarantee which index is checked first, so don't shadow a package that already has a working version on pypi.org). In production everything works as before: all packages resolve from pypi.org. But in the local development container, where no compatible tensorflow package is available from pypi.org, the wheel from mydomain.com will be used instead.
And that is it! You should have a version of tensorflow that you can install and use inside a docker container on M-series Macs. Unfortunately it only comes with CPU support.