The host server runs Ubuntu 18.04. Note that Docker does not currently support Ubuntu 19.10. To use Docker on 19.10, set the Ubuntu 18.04 release codename, bionic, when configuring the Docker apt source:
deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable
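The codename substitution can be sketched in Python. The helper name and the fallback table below are illustrative assumptions, not part of any Docker tooling; only the mapping from 19.10 (codename eoan) to bionic comes from the text above.

```python
# Hypothetical helper: build the apt source line for the Docker repository.
# FALLBACK_CODENAMES is an assumption for illustration; it maps a release
# without Docker packages to the release whose packages should be used.
FALLBACK_CODENAMES = {"eoan": "bionic"}  # Ubuntu 19.10 -> Ubuntu 18.04


def docker_apt_source(codename: str, arch: str = "amd64") -> str:
    """Return the deb line, substituting a fallback codename if one is known."""
    effective = FALLBACK_CODENAMES.get(codename, codename)
    return (
        f"deb [arch={arch}] https://download.docker.com/linux/ubuntu "
        f"{effective} stable"
    )


if __name__ == "__main__":
    print(docker_apt_source("eoan"))
```

Releases that are not in the table pass through unchanged, so the same helper also covers 18.04 itself.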
Installing Docker
Installing Docker is straightforward; run the following commands:
sudo apt remove docker docker-engine docker.io
sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt install docker-ce
sudo usermod -aG docker $USER # add the current user to the docker group
echo "DOCKER_OPTS=\"--registry-mirror=https://registry.docker-cn.com\"" | sudo tee -a /etc/default/docker # switch to a registry mirror in China
sudo service docker restart
Installation on CentOS 7:
# Remove old Docker versions
sudo yum remove docker docker-common docker-selinux docker-engine
# Install dependencies
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
# Add a yum repository mirror in China:
sudo yum-config-manager --add-repo https://mirrors.ustc.edu.cn/docker-ce/linux/centos/docker-ce.repo
# Install Docker CE
sudo yum makecache fast
sudo yum install docker-ce
# Enable and start Docker CE
sudo systemctl enable docker
sudo systemctl start docker
# Add the current user to the docker group
sudo usermod -aG docker $USER
# Registry mirror
# sudo echo "DOCKER_OPTS=\"--registry-mirror=https://registry.docker-cn.com\"" >> /etc/default/docker # switch to a registry mirror in China
# Edit /etc/docker/daemon.json so that it contains the JSON below
sudo nano /etc/docker/daemon.json
{
  "registry-mirrors": ["http://hub-mirror.c.163.com"]
}
# Restart the service
sudo systemctl daemon-reload
sudo systemctl restart docker
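If /etc/docker/daemon.json already holds other settings, overwriting it by hand can lose them. The following is a minimal sketch of merging the mirror into existing content; the function name is hypothetical, and it works on text only, so reading and writing the file (with root permissions) is left to the caller.

```python
import json


def add_registry_mirror(daemon_json_text: str, mirror: str) -> str:
    """Return daemon.json text with `mirror` present in "registry-mirrors".

    Existing keys are preserved; an empty input is treated as an empty config.
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    mirrors = config.setdefault("registry-mirrors", [])
    if mirror not in mirrors:
        mirrors.append(mirror)
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(add_registry_mirror("", "http://hub-mirror.c.163.com"))
```

Calling the function twice with the same mirror leaves the list unchanged, so it is safe to rerun.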
# Check that Docker is installed correctly
sudo docker run hello-world
# Install Docker Compose
# See https://github.com/docker/compose/releases/latest for the latest docker-compose version
sudo curl -L https://github.com/docker/compose/releases/download/1.25.5/docker-compose-`uname -s`-`uname -m` -o /usr/bin/docker-compose
sudo chmod +x /usr/bin/docker-compose
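The two `uname` substitutions in the curl command select the release asset for the current platform. As a rough illustration, the same URL can be built in Python; the function name is hypothetical, and on Linux `platform.system()` and `platform.machine()` report the same values as `uname -s` and `uname -m`.

```python
import platform


def compose_download_url(version: str, system: str = "", machine: str = "") -> str:
    """Build the docker-compose release asset URL used by the curl command."""
    system = system or platform.system()     # e.g. "Linux", like `uname -s`
    machine = machine or platform.machine()  # e.g. "x86_64", like `uname -m`
    return (
        "https://github.com/docker/compose/releases/download/"
        f"{version}/docker-compose-{system}-{machine}"
    )


if __name__ == "__main__":
    print(compose_download_url("1.25.5"))
```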
Installing multi-user JupyterHub
1. Pull the required images:
Do not pull the latest tag: at the time of writing it has a bug and does not run properly after installation, as I learned the hard way.
docker pull jupyterhub/jupyterhub:1.0.0
docker pull jupyterhub/singleuser:1.0.0
2. Create the jupyterhub_network network
docker network create --driver bridge jupyterhub_network
3. Create the JupyterHub volume
sudo mkdir -pv /data/jupyterhub
sudo chown -R root /data/jupyterhub
sudo chmod -R 777 /data/jupyterhub
4. Create the jupyterhub_config.py file and copy it to the volume
cp jupyterhub_config.py /data/jupyterhub/jupyterhub_config.py
File contents:
# Configuration file for JupyterHub
c = get_config()
# spawn with Docker
c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
# Spawn containers from this image
c.DockerSpawner.image = 'qw/jupyter_lab_singleuser:latest'
c.DockerSpawner.extra_create_kwargs = {'user': 'root'}
c.DockerSpawner.environment = {
    'GRANT_SUDO': '1',
    'UID': '0',  # workaround https://github.com/jupyter/docker-stacks/pull/420
}
# c.JupyterHub.base_url = '/jupyterhub'
# JupyterHub requires a single-user instance of the Notebook server, so we
# default to using the `start-singleuser.sh` script included in the
# jupyter/docker-stacks *-notebook images as the Docker run command when
# spawning containers. Optionally, you can override the Docker run command
# using the DOCKER_SPAWN_CMD environment variable.
c.DockerSpawner.extra_create_kwargs.update({'command': "start-singleuser.sh --SingleUserNotebookApp.default_url=/lab"})
# Connect containers to this Docker network
network_name = 'jupyterhub_network'
c.DockerSpawner.use_internal_ip = True
c.DockerSpawner.network_name = network_name
# Pass the network name as argument to spawned containers
c.DockerSpawner.extra_host_config = {'network_mode': network_name}
# Explicitly set notebook directory because we'll be mounting a host volume to
# it. Most jupyter/docker-stacks *-notebook images run the Notebook server as
# user `jovyan`, and set the notebook directory to `/home/jovyan/work`.
# We follow the same convention.
notebook_dir = '/home/jovyan'
# notebook_dir = '/home/jovyan/work'
c.DockerSpawner.notebook_dir = notebook_dir
# Mount the real user's Docker volume on the host to the notebook user's
# notebook directory in the container
c.DockerSpawner.volumes = {'jupyterhub-user-{username}': notebook_dir, 'jupyterhub-shared': {"bind": '/home/jovyan/shared', "mode": "rw"}}
# volume_driver is no longer a keyword argument to create_container()
# c.DockerSpawner.extra_create_kwargs.update({'volume_driver': 'local'})
# Keep containers after they are stopped (set to True to remove them)
c.DockerSpawner.remove_containers = False
# For debugging arguments passed to spawned containers
c.DockerSpawner.debug = True
# The docker instances need access to the Hub, so the default loopback port doesn't work:
# from jupyter_client.localinterfaces import public_ips
# c.JupyterHub.hub_ip = public_ips()[0]
c.JupyterHub.hub_ip = 'jupyterhub'
# IP Configurations
c.JupyterHub.ip = '0.0.0.0'
c.JupyterHub.port = 8000
# OAuth with GitLab
import os
c.JupyterHub.authenticator_class = 'oauthenticator.gitlab.GitLabOAuthenticator'
os.environ['OAUTH_CALLBACK_URL'] = 'http://10.101.14.13:8000/hub/oauth_callback'
os.environ['GITLAB_CLIENT_ID'] = 'd89d76ef002100f217f4a7c1fc73011ca4d9eee7bb5ff8ce3e9532ba7721e29e'
os.environ['GITLAB_CLIENT_SECRET'] = '05075caea4f3cb63a0cebc5d65e446df4dfc9598932cf3ddc751deb8eee5baf3'
c.GitLabOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
c.GitLabOAuthenticator.client_id = os.environ['GITLAB_CLIENT_ID']
c.GitLabOAuthenticator.client_secret = os.environ['GITLAB_CLIENT_SECRET']
c.Authenticator.whitelist = whitelist = set()
c.Authenticator.admin_users = admin = set()
# Each userlist line is "<name>" or "<name> admin"
here = os.path.dirname(__file__)
with open(os.path.join(here, 'userlist')) as f:
    for line in f:
        parts = line.split()
        if not parts:
            # Skip blank lines
            continue
        name = parts[0]
        whitelist.add(name)
        if len(parts) > 1 and parts[1] == 'admin':
            admin.add(name)
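The userlist handling at the end of the configuration can be tried out on its own. This is a small sketch of the same parsing logic as a standalone function; the function name and the sample user names are made up for illustration.

```python
from typing import Iterable, Set, Tuple


def parse_userlist(lines: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Split userlist lines into (whitelisted users, admin users).

    Each non-blank line is "<name>" or "<name> admin".
    """
    whitelist: Set[str] = set()
    admins: Set[str] = set()
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # blank line
        name = parts[0]
        whitelist.add(name)
        if len(parts) > 1 and parts[1] == "admin":
            admins.add(name)
    return whitelist, admins


if __name__ == "__main__":
    # Hypothetical sample content
    print(parse_userlist(["alice admin\n", "\n", "bob\n"]))
```

Every listed user is whitelisted; only users whose line carries the second word `admin` also become administrators.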
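5. Create the userlist file and copy it to the volume
File contents:
qwadmin
Only one admin account needs to be listed here, because other accounts can be added later from the web interface. The configuration above reads each line as a user name, optionally followed by the word admin.
cp userlist /data/jupyterhub/userlist
6. Build the JupyterHub image
The Docker build uses pip, so it is worth pointing pip at a mirror. Create a pip.conf file with the following contents:
[global]
trusted-host = mirrors.aliyun.com
index-url = http://mirrors.aliyun.com/pypi/simple/
Create a Dockerfile with the following contents: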
ARG BASE_IMAGE=jupyterhub/jupyterhub:1.0.0
FROM ${BASE_IMAGE}
ADD pip.conf /etc/pip.conf
RUN pip install --no-cache --upgrade jupyter
RUN pip install --no-cache dockerspawner
RUN pip install --no-cache oauthenticator
ENV GITLAB_HOST=http://git.domain.com
EXPOSE 8000
Then run:
docker build -t qw/jupyterhub .
7. Build the singleuser image (multi-user support)
Create a Dockerfile with the following contents:
ARG BASE_IMAGE=jupyterhub/singleuser:1.0.0
FROM ${BASE_IMAGE}
# Use mirrors to speed up downloads
ADD pip.conf /etc/pip.conf
RUN conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
RUN conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
RUN conda config --set show_channel_urls yes
# Install jupyterlab (--yes avoids the interactive confirmation during the build)
RUN conda install --yes -c conda-forge jupyterlab
# RUN pip install jupyterlab
RUN jupyter serverextension enable --py jupyterlab --sys-prefix
ENV GITLAB_HOST=http://git.domain.com
USER jovyan
Then run:
docker build -t qw/jupyter_lab_singleuser .
8. Start the container
docker run -d --name jupyterhub -p 8000:8000 --network jupyterhub_network -v /var/run/docker.sock:/var/run/docker.sock -v /data/jupyterhub:/srv/jupyterhub qw/jupyterhub
If the following error is reported:
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.40/images/json?all=1: dial unix /var/run/docker.sock: connect: permission denied
then run:
sudo chmod 666 /var/run/docker.sock
9. Other notes
The JupyterHub configuration file sets up a shared directory, but in practice using it raises a permission error. The fix:
sudo chmod -R 777 /var/lib/docker/volumes/jupyterhub-shared
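GPU support for multi-user JupyterHub in Docker containers
After completing the installation above you run into a problem: the GPU cannot be used inside Docker, which is fatal for JupyterHub. Let's work through how to solve it.
nvidia-docker
I first assumed nvidia-docker was the best solution: install nvidia-docker, then add the --gpus all option when running Docker so that the container can use the GPU. However, that only gives the JupyterHub container itself GPU access. In multi-user JupyterHub each user gets a separate container, and these single-user containers are created by DockerSpawner through the create_container API, which does not support the --gpus parameter: https://github.com/docker/docker-py/issues/2395
Solution:
1. Uninstall Docker 19.03 and downgrade to Docker 18.09:
systemctl stop docker
rm -rf /var/lib/docker # deletes all local images, containers and volumes
yum remove docker*
yum -y install docker-ce-18.09.0 docker-ce-cli-18.09.0 containerd.io
2. Install the older nvidia-docker, namely nvidia-docker2
If nvidia-docker 1.0 was installed previously, remove it first: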
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
yum remove nvidia-docker
Add the repository and install:
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | tee /etc/yum.repos.d/nvidia-docker.repo
yum install -y nvidia-docker2
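The `distribution` value is simply the ID and VERSION_ID fields of /etc/os-release joined together. A minimal Python sketch of the same computation follows; the function name is hypothetical, and the sample text is an assumed CentOS 7 excerpt rather than a full file.

```python
def distribution_id(os_release_text: str) -> str:
    """Return ID + VERSION_ID from os-release content, e.g. "centos7"."""
    values = {}
    for line in os_release_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be quoted with single or double quotes
        values[key] = value.strip().strip('"').strip("'")
    return values.get("ID", "") + values.get("VERSION_ID", "")


if __name__ == "__main__":
    sample = 'NAME="CentOS Linux"\nVERSION_ID="7"\nID="centos"\n'  # assumed excerpt
    print(distribution_id(sample))
```

The result is the path segment inserted into the repository URL in the curl command.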
Configure nvidia-docker2 so that the default runtime is nvidia.
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia"
}
Add the content above to /etc/docker/daemon.json, then restart dockerd.
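Since daemon.json must be strict JSON (a trailing comma after the last key, for example, makes it invalid), it can help to check the content before restarting dockerd. The following is a sketch with a hypothetical function name; it assumes Docker's built-in default runtime name runc when no default is configured.

```python
import json


def default_runtime(daemon_json_text: str) -> str:
    """Return the default runtime named in daemon.json content.

    Raises ValueError if the text is not valid JSON or if the default
    runtime is not declared under "runtimes".
    """
    try:
        config = json.loads(daemon_json_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"daemon.json is not valid JSON: {exc}") from exc
    runtime = config.get("default-runtime", "runc")
    if runtime != "runc" and runtime not in config.get("runtimes", {}):
        raise ValueError(f"default runtime {runtime!r} is not defined in 'runtimes'")
    return runtime


if __name__ == "__main__":
    text = """
    {
      "runtimes": {
        "nvidia": {"path": "nvidia-container-runtime", "runtimeArgs": []}
      },
      "default-runtime": "nvidia"
    }
    """
    print(default_runtime(text))
```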
jupyterhub/singleuser
The jupyterhub/singleuser image itself does not have any GPU driver installed; the solution is to rebuild it.
From the Dockerfile of jupyterhub/singleuser we can see that its BASE_IMAGE is jupyter/base-notebook.
# Build a jupyterhub/singleuser
# Run with the DockerSpawner in JupyterHub
ARG BASE_IMAGE=jupyter/base-notebook
FROM $BASE_IMAGE
MAINTAINER Project Jupyter<jupyter@googlegroups.com>
ADD install_jupyterhub /tmp/install_jupyterhub
ARG JUPYTERHUB_VERSION=master
# install pinned jupyterhub and ensure notebook is installed
RUN python3 /tmp/install_jupyterhub &&\
python3 -m pip install notebook
Next, look at the Dockerfile of jupyter/base-notebook:
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
# Ubuntu 18.04 (bionic) from 2019-10-29
# https://github.com/tianon/docker-brew-ubuntu-core/commit/d4313e13366d24a97bd178db4450f63e221803f1
ARG BASE_CONTAINER=ubuntu:bionic-20191029@sha256:6e9f67fa63b0323e9a1e587fd71c561ba48a034504fb804fd26fd8800039835d
FROM $BASE_CONTAINER
LABEL maintainer="Jupyter Project <jupyter@googlegroups.com>"
ARG NB_USER="jovyan"
ARG NB_UID="1000"
ARG NB_GID="100"
USER root
# Install all OS dependencies for notebook server that starts but lacks all
# features (e.g., download as all possible file formats)
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update \
&& apt-get install -yq --no-install-recommends \
wget \
bzip2 \
ca-certificates \
sudo \
locales \
fonts-liberation \
run-one \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
RUN echo "en_US.UTF-8 UTF-8" > /etc/locale.gen && \
locale-gen
# Configure environment
ENV CONDA_DIR=/opt/conda \
SHELL=/bin/bash \
NB_USER=$NB_USER \
NB_UID=$NB_UID \
NB_GID=$NB_GID \
LC_ALL=en_US.UTF-8 \
LANG=en_US.UTF-8 \
LANGUAGE=en_US.UTF-8
ENV PATH=$CONDA_DIR/bin:$PATH \
HOME=/home/$NB_USER
# Add a script that we will use to correct permissions after running certain commands
ADD fix-permissions /usr/local/bin/fix-permissions
RUN chmod a+rx /usr/local/bin/fix-permissions
# Enable prompt color in the skeleton .bashrc before creating the default NB_USER
RUN sed -i 's/^#force_color_prompt=yes/force_color_prompt=yes/' /etc/skel/.bashrc
# Create NB_USER with name jovyan user with UID=1000 and in the 'users' group
# and make sure these dirs are writable by the `users` group.
RUN echo "auth requisite pam_deny.so" >> /etc/pam.d/su && \
sed -i.bak -e 's/^%admin/#%admin/' /etc/sudoers && \
sed -i.bak -e 's/^%sudo/#%sudo/' /etc/sudoers && \
useradd -m -s /bin/bash -N -u $NB_UID $NB_USER && \
mkdir -p $CONDA_DIR && \
chown $NB_USER:$NB_GID $CONDA_DIR && \
chmod g+w /etc/passwd && \
fix-permissions $HOME && \
fix-permissions "$(dirname $CONDA_DIR)"
USER $NB_UID
WORKDIR $HOME
ARG PYTHON_VERSION=default
# Setup work directory for backward-compatibility
RUN mkdir /home/$NB_USER/work && \
fix-permissions /home/$NB_USER
# Install conda as jovyan and check the md5 sum provided on the download site
ENV MINICONDA_VERSION=4.7.10 \
MINICONDA_MD5=1c945f2b3335c7b2b15130b1b2dc5cf4 \
CONDA_VERSION=4.7.12
RUN cd /tmp && \
wget --quiet https://repo.continuum.io/miniconda/Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh && \
echo "${MINICONDA_MD5} *Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh" | md5sum -c - && \
/bin/bash Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh -f -b -p $CONDA_DIR && \
rm Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh && \
echo "conda ${CONDA_VERSION}" >> $CONDA_DIR/conda-meta/pinned && \
$CONDA_DIR/bin/conda config --system --prepend channels conda-forge && \
$CONDA_DIR/bin/conda config --system --set auto_update_conda false && \
$CONDA_DIR/bin/conda config --system --set show_channel_urls true && \
if [ ! $PYTHON_VERSION = 'default' ]; then conda install --yes python=$PYTHON_VERSION; fi && \
conda list python | grep '^python' | tr -s ' ' | cut -d '.' -f 1,2 | sed 's/$/.*/' >> $CONDA_DIR/conda-meta/pinned && \
$CONDA_DIR/bin/conda install --quiet --yes conda && \
$CONDA_DIR/bin/conda update --all --quiet --yes && \
conda clean --all -f -y && \
rm -rf /home/$NB_USER/.cache/yarn && \
fix-permissions $CONDA_DIR && \
fix-permissions /home/$NB_USER
# Install Tini
RUN conda install --quiet --yes 'tini=0.18.0' && \
conda list tini | grep tini | tr -s ' ' | cut -d ' ' -f 1,2 >> $CONDA_DIR/conda-meta/pinned && \
conda clean --all -f -y && \
fix-permissions $CONDA_DIR && \
fix-permissions /home/$NB_USER
# Install Jupyter Notebook, Lab, and Hub
# Generate a notebook server config
# Clean up temporary files
# Correct permissions
# Do all this in a single RUN command to avoid duplicating all of the
# files across image layers when the permissions change
RUN conda install --quiet --yes \
'notebook=6.0.0' \
'jupyterhub=1.0.0' \
'jupyterlab=1.2.1' && \
conda clean --all -f -y && \
npm cache clean --force && \
jupyter notebook --generate-config && \
rm -rf $CONDA_DIR/share/jupyter/lab/staging && \
rm -rf /home/$NB_USER/.cache/yarn && \
fix-permissions $CONDA_DIR && \
fix-permissions /home/$NB_USER
EXPOSE 8888
# Configure container startup
ENTRYPOINT ["tini", "-g", "--"]
CMD ["start-notebook.sh"]
# Add local files as late as possible to avoid cache busting
COPY start.sh /usr/local/bin/
COPY start-notebook.sh /usr/local/bin/
COPY start-singleuser.sh /usr/local/bin/
COPY jupyter_notebook_config.py /etc/jupyter/
# Fix permissions on /etc/jupyter as root
USER root
RUN fix-permissions /etc/jupyter/
# Switch back to jovyan to avoid accidental container runs as root
USER $NB_UID
Solution: change the base image (the BASE_CONTAINER argument) of jupyter/base-notebook to nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
The files referenced in the Dockerfile are available from https://github.com/jupyter/docker-stacks/tree/master/base-notebook.
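Once the images are rebuilt, a quick check from a notebook cell can confirm whether the single-user container actually sees a GPU. This is a sketch, assuming the NVIDIA runtime makes the `nvidia-smi` tool available inside the container; the function name is hypothetical, and an empty list only means no GPU was reported.

```python
import shutil
import subprocess
from typing import List


def visible_gpus() -> List[str]:
    """Return the GPU lines printed by `nvidia-smi -L`, or [] if none are visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # tool not present, so the runtime is probably not active
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(gpus if gpus else "No GPU visible in this container")
```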
Docker debugging commands
sudo service docker stop # stop the Docker service
sudo rm -rf /var/lib/docker/ # delete all Docker data, including every image
sudo service docker start # start the Docker service
docker images -a # list all Docker images
docker rmi qw/jupyterhub # delete the given Docker image
docker container ls -a # list all Docker containers
docker container stop jupyterhub # stop the given Docker container
docker container start jupyterhub # start the given Docker container
docker container rm jupyterhub # delete the given Docker container
docker logs --details jupyterhub # show the logs of the given container
docker exec -it jupyterhub bash # open a shell in the given container (press Ctrl+D to exit)
docker container prune -f # delete all stopped containers
docker stop $(docker ps -aq) # stop all containers



