A Docker container is useful for building a data analysis environment with fixed versions of software (e.g., Ubuntu, JupyterLab, or Anaconda).
Dockerfile
|
FROM ubuntu:latest
RUN apt-get update && apt-get install -y \
    sudo wget \
    vim
WORKDIR /opt
RUN wget https://repo.anaconda.com/archive/Anaconda3-2023.07-2-Linux-x86_64.sh
|
Build image
Because my machine is an M1 Mac (Apple Silicon) and the Anaconda installer used here is an x86_64 build, I needed to add the --platform linux/amd64 option.
|
docker build --platform linux/amd64 .
|
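The build above yields only an image ID, which changes whenever the image is rebuilt. The following is a minimal sketch of tagging the image at build time so that later commands can refer to it by name. The tag ds-python:anaconda-2023.07 and the helper names build_image and run_shell are hypothetical, chosen for illustration only.

```shell
# Hypothetical image tag (an assumption, not used elsewhere in this post).
IMAGE_TAG="ds-python:anaconda-2023.07"

# Build for linux/amd64 and attach the tag.
# Run this from the directory that contains the Dockerfile.
build_image() {
    docker build --platform linux/amd64 -t "$IMAGE_TAG" .
}

# Start an interactive shell in the tagged image instead of using an image ID.
run_shell() {
    docker run -it "$IMAGE_TAG" bash
}
```

Calling build_image once and then run_shell should behave like the ID-based commands in this post, without having to copy the ID from the build output.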
Run a container
|
docker run -it 571f59ade236 bash
|
Output
|
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
root@a1234b549e43:/opt#
|
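The warning appears because the image was built for linux/amd64 while the host is arm64, and docker run was not told which platform to use. One way to make the choice explicit, sketched below, is to pass --platform to docker run as well. The helper names platform_flag and run_amd64_shell are hypothetical, and the mapping assumes that uname -m reports arm64 or aarch64 on ARM hosts.

```shell
# Print the extra flag needed on ARM hosts, and nothing on other architectures.
# $1 is the machine architecture, e.g. the output of `uname -m`.
platform_flag() {
    case "$1" in
        arm64|aarch64) echo "--platform linux/amd64" ;;
        *) echo "" ;;
    esac
}

# Run the image built above with an explicit platform.
run_amd64_shell() {
    # The command substitution is left unquoted on purpose so that
    # "--platform linux/amd64" is split into two separate arguments.
    docker run $(platform_flag "$(uname -m)") -it 571f59ade236 bash
}
```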
Run Anaconda installer
|
sh Anaconda3-2023.07-2-Linux-x86_64.sh
|
When the installation finishes, the installer asks:
|
Do you wish the installer to initialize Anaconda3
by running conda init? [yes|no]
|
I entered “yes”. After a few minutes, you will see:
|
Thank you for installing Anaconda3!
|
Add Anaconda bin directory to PATH
In my case, Anaconda is installed in /opt/anaconda3.
In the container:
|
$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# Export path
export PATH=/opt/anaconda3/bin:$PATH
$ echo $PATH
/opt/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
|
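An export typed into the container's shell lasts only for that shell session. As a small sketch of the same PATH change, the hypothetical helper prepend_path below adds a directory to the front of PATH only when it is not already listed, so running it repeatedly does not create duplicate entries.

```shell
# Prepend a directory to PATH unless it is already one of the entries.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

prepend_path /opt/anaconda3/bin
prepend_path /opt/anaconda3/bin      # second call changes nothing
echo "$PATH"
```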
Edit Dockerfile
To avoid interactive prompts, we need to run Anaconda3-xxxx-Linux-x86_64.sh in batch mode.
Running the installer with the -h option prints the options it accepts:
|
-b           run install in batch mode (without manual intervention),
             it is expected the license terms (if any) are agreed upon
-f           no error if install prefix already exists
-h           print this help message and exit
-p PREFIX    install prefix, defaults to /root/anaconda3, must not contain spaces.
-s           skip running pre/post-link/install scripts
-u           update an existing installation
-t           run package tests after installation (may install conda-build)
|
We can use -b (run the install in batch mode) and -p (set the install prefix). The following command installs Anaconda into /opt/anaconda3 without manual intervention.
|
sh /opt/Anaconda3-2023.07-2-Linux-x86_64.sh -b -p /opt/anaconda3
|
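According to the option list, the installer treats an existing install prefix as an error unless -f or -u is given. The sketch below wraps the batch command so that it can be re-run safely: it skips the install when conda is already present under the prefix. The helper name install_anaconda and the PREFIX and INSTALLER variables are hypothetical. Their defaults are the paths used in this post.

```shell
# Install prefix and installer path, defaulting to the values used above.
PREFIX="${PREFIX:-/opt/anaconda3}"
INSTALLER="${INSTALLER:-/opt/Anaconda3-2023.07-2-Linux-x86_64.sh}"

install_anaconda() {
    # Skip the install if a conda executable already exists under the prefix.
    if [ -x "$PREFIX/bin/conda" ]; then
        echo "conda already present in $PREFIX, skipping install"
        return 0
    fi
    # Fail early with a clear message if the installer was not downloaded.
    if [ ! -f "$INSTALLER" ]; then
        echo "installer not found: $INSTALLER" >&2
        return 1
    fi
    # -b: batch mode (no prompts), -p: install prefix
    sh "$INSTALLER" -b -p "$PREFIX"
}
```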
Edit Dockerfile
|
FROM ubuntu:latest
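# Install basic tools; wget is needed to download the installer
RUN apt-get update && apt-get install -y \
    sudo wget \
    vim
WORKDIR /opt
# Download the installer, run it in batch mode, then delete it
RUN wget https://repo.anaconda.com/archive/Anaconda3-2023.07-2-Linux-x86_64.sh && \
    sh Anaconda3-2023.07-2-Linux-x86_64.sh -b -p /opt/anaconda3 && \
    rm -f Anaconda3-2023.07-2-Linux-x86_64.sh
# Put the Anaconda binaries first on PATH
ENV PATH=/opt/anaconda3/bin:$PATH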
RUN pip install --upgrade pip
WORKDIR /
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--allow-root", "--LabApp.token=''"]
|
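- The ENV instruction sets an environment variable; here it adds /opt/anaconda3/bin to PATH.
- The CMD instruction starts JupyterLab when the container runs. It listens on all network interfaces (--ip=0.0.0.0) so that it can be reached from outside the container, is allowed to run as root (--allow-root), and has token authentication disabled.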
Rebuild the Docker image and run JupyterLab in a Docker container.
|
docker build --platform linux/amd64 .
|
Then run a container.
|
docker run e6ff3baff3db1
|
Without the -p option, JupyterLab cannot be reached from a browser on the host.
Instead, we need to publish the container's port:
|
docker run -p 8888:8888 e6ff3baff3db1
|
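The -p option takes the form HOST_PORT:CONTAINER_PORT. As a sketch, assuming port 8888 on the host is already in use, the container's port 8888 can be published on a different host port. The value 8889 and the helper name run_lab are hypothetical.

```shell
HOST_PORT=8889        # hypothetical free port on the host
CONTAINER_PORT=8888   # port JupyterLab listens on inside the container
IMAGE=e6ff3baff3db1   # image ID from the build step above

# Publish the container port on the chosen host port.
run_lab() {
    docker run -p "${HOST_PORT}:${CONTAINER_PORT}" "$IMAGE"
}

# URL to open in the host browser while run_lab is running.
LAB_URL="http://127.0.0.1:${HOST_PORT}/lab"
echo "$LAB_URL"
```

Only the host side of the mapping changes, so the CMD in the Dockerfile stays the same.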
Then we can access http://127.0.0.1:8888/lab.
Share file system between host and container
First, create a directory on the host to share with the container (in my case):
|
mkdir /Users/tato/repo/dhub/ds_python
|
Then run the Docker container with the -v option, which mounts the host directory at /work in the container.
|
docker run -p 8888:8888 -v /Users/tato/repo/dhub/ds_python:/work \
    --name my-lab e6ff3baff3db1
|
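The -v value has the form HOST_PATH:CONTAINER_PATH. Below is a sketch of the same step with the host directory kept in a variable. HOST_DIR defaults to a hypothetical $HOME/ds_python, the helper name run_lab_with_volume is also hypothetical, and --rm is an addition not used in this post: it removes the container on exit so that the name my-lab can be reused on the next run.

```shell
# Host directory to share; override HOST_DIR to use another location.
HOST_DIR="${HOST_DIR:-$HOME/ds_python}"

# Bind mount specification passed to -v: HOST_PATH:CONTAINER_PATH
MOUNT_SPEC="$HOST_DIR:/work"

run_lab_with_volume() {
    mkdir -p "$HOST_DIR"    # create the shared directory if it is missing
    docker run --rm -p 8888:8888 \
        -v "$MOUNT_SPEC" \
        --name my-lab e6ff3baff3db1
}

echo "$MOUNT_SPEC"
```

Files saved under /work in JupyterLab should then appear in the host directory and remain there after the container is removed.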