name: inverse layout: true class: center, middle, inverse --- name: centered layout: true class: center, middle --- layout: false class: title # Frequently Answered Queries # from StackOverflow .left-column[ .pic-50[![BoxBoat](img/boxboat-logo-color.png)] ] .right-column[.align-right[.no-bullets[ - Brandon Mitchell - Twitter: @sudo_bmitch - StackOverflow: bmitch ]]] ??? - This talk is a stripped down version of a longer talk. - It's a series of common questions and challenges people have with Docker from the perspective of answering them on SO. - There's a mix of common answers we all know here with some more advanced tricks mixed in so hopefully everyone can learn something. - I'm BMitch, and I'm a solutions architect with BoxBoat who provides consulting services around everything container related. - I've answered almost 1k questions on the docker tag on SO. People wonder how I learned everything I know about docker, and where I find the time. --- # How Do We Learn? ??? - Everyone has their own preferred ways to learn a new skill... -- - RTFM - Training - Practice - Drills - Teaching ??? - Some just read the documentation, or take a class, and others want to be more interactive. - For me, I learn the topic best when I teach others that skill. It forces me to understand it enough to answer others questions. - With SO, many times I don't know the answer to a question when I first read it. The path is the reverse that many expect. - I don't answer questions to share my knowledge about Docker, I gain knowledge about how Docker works by answering questions. Sharing knowledge is just an awesome side effect. --- # Typical StackOverflow User Background - Mostly developers - Often more comfortable with an IDE than a CLI - DevOps is shifting those Devs into more Ops tasks - Pro: devs no longer depend on ops to manage their app runtime environment -- - Con: devs no longer depend on ops to manage their app runtime environment - Devs are now learning OS/Linux/distributions, scripting, package managers, networking, and storage. ??? - The result: lots of smart developers making common mistakes. --- template: inverse # General Docker Questions --- # Q: How do containers differ from VM's - Containers have a shared kernel, application isolation vs hardware isolation - How do we change the mindset of people using containers as a lightweight VM? ??? - Namesapces and cgroups, not virtualized hardware. - Yet people still cram 5 different apps into a single container and upgrade in place, how do we change that? -- - Who likes uptime? ??? - Everyone with their hands down are probably texting their boss to explain why everything is down equals "working as designed". -- - Who wants to maintain a server that hasn't been rebooted for 3 years, and the original admin has left? ??? - It's got uptime, probably vulnerable to spectre/meltdown and lots of other exploits, along with being a really old framework no one likes. -- - Uptime quickly becomes a ticking time bomb. ??? - Upgrades are out of the question since you don't know what could break or how to fix it or how to revert. There's a transition point where uptime goes from desired to risky. - That transition point is as soon as you get state drift. -- - What we want is availability, not uptime. We want a LB pointing to replicas spread across multiple AZ's so we can have __low uptime__ and __high availability__. ??? - The fix is continuous delivery, infrastructure as code, immutability and disposability. - People using containers as lightweight VM's are making pets, they'll end up a 3 year old container that no one knows how it works. Those doing it right will manage a farm of cattle. --- # Q: How do containers differ from VM's Practical differences: - Don't ssh into containers (exec, and only in dev) - Don't upgrade containers in place (replace them) - Don't install multiple apps inside a single container (compose files) - Don't give containers static IP's (LB/reverse proxies) - Don't backup containers (backup volumes) - Don't export containers to make new images (use a Dockerfile) ??? - Treat containers as immutable and disposable. - Learn from others mistakes: - The worst answer I had to give to someone was that they wasted months of work and had to start all over. - They had been maintaining their container without a Dockerfile, applying updates in place, and periodically running an export / import. - One day they lost their image, maybe they ran a bad rmi, maybe filesystem corruption, maybe they hit the layer limit. - A Dockerfile and versioned images would give them not only a quick way to revert, but also to completely recreate without some of the day 1 cruft. --- template: inverse # Dockerfile --- # Q: Why doesn't build use the cache? .left-column[ Cache requires: - Same command to be run - Same checksum on all files - Same previous layer - Image was built locally ] ??? - Note that the value of external resources can't be tested. If you have a RUN command that curl's an artifact into your image, docker doesn't know when that artifact changes if your command is identical. -- .right-column[ Cache can be broken by: - Changing a build `ARG` value - Changing a timestamp - The previous layer being rebuilt ] -- .no-column[ To trust images pulled from a registry, use: ``` docker build --cache-from my_image ... ``` ] --- # Q: Why is my image so large? How big are the layers resulting from this Dockerfile: ``` FROM busybox RUN mkdir /data RUN dd if=/dev/zero bs=1024 count=1024 of=/data/one RUN chmod -R 0777 /data RUN dd if=/dev/zero bs=1024 count=1024 of=/data/two RUN chmod -R 0777 /data RUN rm /data/one CMD ls -alh /data ``` ??? - 5 commands: - Create 1M /data/one - Change permissions - Create 1M /data/two - Change permissions - Remove /data/one - This could easily be: - Download code from git - Extract code - Compile app - Make app executable - Remove download and source --- # Q: Why is my image so large? - Running the image you see the 1MB file: ``` -rwxrwxrwx 1 root root 1.0M May 12 00:14 two ``` - Each `dd` command adds a 1MB layer. -- - Each `chmod` command will change permissions and copy the entire 1MB file to the next layer. -- - What does the `rm` command do to the image size? --- class: small # Q: Why is my image so large? The `rm` command only changes directory metadata in the next layer: ``` Step 6/7 : RUN chmod -R 0777 /data ---> Running in 038bd2bc5aea * ---> 77793bf30d5f Step 7/8 : RUN rm /data/one ---> Running in 504c6e9b6637 * ---> 9fe0e2f18893 ... $ docker image ls -a | grep 77793bf30d5f REPOSITORY TAG IMAGE ID CREATED SIZE
77793bf30d5f 10 minutes ago 6.37MB $ docker image ls -a | grep 9fe0e2f18893 REPOSITORY TAG IMAGE ID CREATED SIZE
9fe0e2f18893 10 minutes ago 6.37MB ``` --- # Q: Why is my image so large? - Resulting 1MB file has become 4MB on disk and over the network - Compare the two resulting images to see the added disk space: ``` REPOSITORY TAG IMAGE ID CREATED SIZE busybox latest 54511612f1c4 8 months ago 1.13MB test-layers latest 757ce49dd12f 10 minutes ago 6.37MB ``` - Subtracting the two you get the expected ~5MB --- # Q: Why is my image so large? - 5MB? Not 4MB? Where did the extra 1MB come from? ``` FROM busybox RUN mkdir /data RUN dd if=/dev/zero bs=1024 count=1024 of=/data/one RUN chmod -R 0777 /data RUN dd if=/dev/zero bs=1024 count=1024 of=/data/two RUN chmod -R 0777 /data RUN rm /data/one CMD ls -alh /data ``` -- - A `chmod` or `chown` changes a timestamp on the file _even when there is no permission or ownership change made_. ??? - The recursive `chmod` and `chown` commands you run to fix permissions and run will always include every file in that directory in the new layer. --- # Q: Why is my image so large? How can we examine layers? Build with `docker build --rm=false .` ``` ... Step 2/7 : RUN mkdir /data * ---> Running in 04c5fa1360b0 ---> 9b4368667b8c Step 3/7 : RUN dd if=/dev/zero bs=1024 count=1024 of=/data/one * ---> Running in f1b72db3bfaa 1024+0 records in 1024+0 records out 1048576 bytes (1.0MB) copied, 0.006002 seconds, 166.6MB/s ---> ea2506fc6e11 ``` ??? - This leaves the temporary containers around used during the build - Docker used to leave them around by default, but they took up space and people rarely used them - Until now... --- # Q: Why is my image so large? Check each temp image with `docker diff ${cid}` .small[ ``` $ docker diff 04c5fa1360b0 # mkdir /data A /data *$ docker diff f1b72db3bfaa # dd if=/dev/zero bs=1024 count=1024 of=/data/one C /data *A /data/one *$ docker diff 81c607555a7d # chmod -R 0777 /data C /data *C /data/one $ docker diff 1bd249e1a47b # dd if=/dev/zero bs=1024 count=1024 of=/data/two C /data A /data/two *$ docker diff 038bd2bc5aea # chmod -R 0777 /data *C /data/one *C /data/two $ docker diff 504c6e9b6637 # rm /data/one C /data *D /data/one ``` ] ??? - The `/data/one` file is created once with `A`, changed twice (`C`), and then it's deleted (`D`) which also changes the metadata on `/data`. - Each of these `docker diff` commands is a view into the layered filesystem, and every layer is shipped with an image, it's not squashed automatically. --- # Q: Why is my image so large? ### Reduce the image size by merging RUN lines: .small[ ```no-highlight FROM busybox RUN mkdir /data \ && dd if=/dev/zero bs=1024 count=1024 of=/data/one \ && chmod -R 0777 /data \ && dd if=/dev/zero bs=1024 count=1024 of=/data/two \ && chmod -R 0777 /data \ && rm /data/one CMD ls -alh /data ``` ] ``` REPOSITORY TAG IMAGE ID CREATED SIZE busybox latest 54511612f1c4 8 months ago 1.13MB test-layers2 latest 951252cf34ed 25 seconds ago 2.18MB ``` ??? - Merge the `RUN` commands by joining each command with a `&&` and span multiple lines with a `\`. - The previous 5MB of layers is now just an added 1MB. - Do the same when downloading source or temporary files, same with apt-get update/clean, do those in the same RUN line. --- template: inverse # Run --- # Q: What does "invalid reference format" mean? - A reference is a pointer to an image. - The docker command line is order dependent: ``` docker ${docker_args} run ${run_args} image ${cmd} ``` - Frequently happens when an invalid arg gets parsed as the image name or invalid characters from copy/pasting from a source that changes dashes and quotes. - What does docker interpret as the image name here: ``` my project$ docker run -it —rm -v $(pwd):/data “my_image” ``` ??? - References can be the image name and tag, name and default to latest, image id, with or without a registry, or it can be pinned to a sha256 hash. - The first thing after the run command that isn't a valid arg is treated as the image reference. - Commonly see with wrong type of `-`, "smart" quotes, unquoted paths with spaces. - In the example, there are three different errors, `--rm` has an em-dash, the volume path isn't quoted and contains a space, and left/right quotes are used around my_image. - First: `—rm` - Next: `project:/data` - Then: `“my_image”` (with left/right quotes) --- # Q: Why do I get "executable not found"? - Did you run the intended command? ```no-highlight docker run --rm my_image -it echo hello world ``` - Is docker trying to run a json string? - Does the file exist... in the path and inside the container? - If it is a shell script, check the first line ```no-highlight #!/bin/bash ``` - Check for windows linefeeds on linux shell scripts - If it is a binary, there is likely a missing library ??? - The `-it` is not an executable, everything after my_image is a command to run - If docker is trying to run your "json" CMD/ENTRYPOINT, there's likely a typo or other invalid json causing it to be processed as a string. - People try to run commands on their laptop inside the container. We have filesystem namespacing, which means containers cannot see the laptop directories (unless you have a volume mount). The path for exec may not have everything an interactive shell in the container has, login scripts aren't run. - If it's not executable, fix it with a `chmod`, preferably in the same layer where the file was created to reduce layer size. - Alpine, busybox, and some other images don't ship with bash out of the box. - Statically compiled go libraries still have some library dependencies without a few extra flags. If you compile on ubuntu and run on alpine, the musl libc will not work. --- template: inverse # Networking --- #
Q: Networking issues between containers?
- Make sure app is listening on 0.0.0.0, not 127.0.0.1 - Use a user generated network - Use DNS: container id, container name, service name, or network alias - Connect to the container port, not the host published port - With overlay networking, open 7946/both, 4789/tcp, and protocol 50. ??? - When moving an app from host to container, networking needs to change to use multiple namespaced networks instead of a global host network. --- #
Q: Networking issues between containers?
## Follow-up Q: Do I need to expose the port? - Nope, expose is documentation. ## Follow-up Q: Do I need to publish the port? - Nope, that only makes the container accessible from outside of docker. ## Follow-up Q: Do I need links? - Nope, links are deprecated, use user created networks. ??? - Any container with a common user created network can talk to every port of the other container, no expose, publish, or linking needed. - Linking is what we had before user generated networks, you still need that if you don't create a network for your container, but it's not recommended. --- #
Q: Networking issues accessing published port?
- Make sure app is listening on 0.0.0.0, not 127.0.0.1. - Connect to an IP on the host, not the 0.0.0.0 listener wildcard. - Verify the publish command. `-p 8080:80` maps host port 8080 to container port 80. - With overlay networking, open 7946/both, 4789/tcp, and protocol 50. -
Verify the docker host you are using with `echo $DOCKER_HOST`. If this is set, connect to that IP instead.
??? - Docker doesn't control how the app listens, only where the forwarded traffic goes - Every container can listen internally on the same port without conflict, that's what network namespacing gives you. People often reverse the port order. My way of remembering is you can publish to a specific IP on the host, and if so, that's `host_ip:host_port:container_port` - Overlay networking needs ports opened between docker nodes in the swarm. Protocol 50 is not tcp or udp, it's a different protocol, for IPSec only. - IP `0.0.0.0` is only for servers listening, not clients connecting. The web server listens on all network interfaces, but the web browser only connects to a single IP address. --- # Networking Tips ## Follow-up Q: How to check if the app is listening on 0.0.0.0? ``` docker run -it --rm --net container:${cid} \ nicolaka/netshoot netstat -lnt ``` ??? - There are several network types: host, overlay, bridge, and none. There's also using the network namespace of another container. -- ## Follow-up Q: Why doesn't localhost work? - IPv6 is the likely cause, use 127.0.0.1 instead. ??? - For commands run inside the container (including healthcheck), the container /etc/host may have an IPv6 entry even when not enabled in docker. --- template: inverse # Volumes --- # Q: Build isn't updating a directory? Typically caused by a volume: - A volume attached to the container persists the old state - A volume defined in the image prevents changes to that directory ??? - For volumes defined on the container, the image is being updated. - Images are created with temporary containers, the temp containers get a temporary volume attached, only changes to the container filesystem are saved, changes to the temp volume are discarded. - You can `COPY` and `ADD` content into a directory defined as a `VOLUME`. --- # PSA: Remove VOLUME in Dockerfiles - Users cannot extend the image - Anonymous volumes clutter the filesystem - Not required for creating volumes at runtime ??? - VOLUME in the Dockerfile came from a time when we had data containers, and mounted volumes from other containers. All of that is deprecated, and there's no need to include it in your image anymore. - Adding initial data for an image is a common extension that volumes block - Anonymous volumes are where useless data goes fill your disks up -- - Solution: define volumes in a compose file --- #
Q: How do I handle UID/GID and permission issues with host volumes?
- Option 1: `chmod 777` - Option 2: Update image user to match host uid/gid - Option 3: Use named volumes an manage data with containers - Option 4: Correct permissions with entrypoint ??? - Opt 1 is a cop-out - Opt 2 is not portable - Opt 3 works as long as you don't need host volumes - Opt 4 requires suid command or starting container as root --- ##
Q: How do I handle UID/GID and permission issues?
Option 2: Update image to match host uid/gid: ``` FROM debian:latest ARG UID=1000 ARG GID=1000 RUN groupadd -g $GID cuser \ && useradd -m -u $UID -g $GID -s /bin/bash cuser USER cuser ``` ``` $ docker build \ --build-arg UID=$(id -u) --build-arg GID=$(id -g) . ``` ??? - Only need to match uid/gid, username and groupname can be different - Not portable if UID/GID isn't common across all docker hosts --- ##
Q: How do I handle UID/GID and permission issues?
Option 4: Entrypoint to correct uid/gid: ``` FROM jenkins/jenkins:lts *USER root RUN apt-get update \ * && wget -O /usr/local/bin/gosu "https://github.com/..." \ && chmod +x /usr/local/bin/gosu \ && curl -sSL https://get.docker.com/ | sh \ && usermod -aG docker jenkins *COPY entrypoint.sh /entrypoint.sh *ENTRYPOINT ["/entrypoint.sh"] ``` ??? - To use an entrypoint I've: - run the container as root - installed gosu - created the user/group - copied in the entrypoint - I've removed error handling and cleanup steps to fit on the slide --- ##
Q: How do I handle UID/GID and permission issues?
Option 4: Entrypoint to correct uid/gid: ```no-highlight #!/bin/sh # if image and volume gid do not match, modify container user SOCK_DOCKER_GID=$(ls -ng /var/run/docker.sock | cut -f3 -d' ') CUR_DOCKER_GID=$(getent group docker | cut -f3 -d: || true) if [ "$SOCK_DOCKER_GID" != "$CUR_DOCKER_GID" ]; then * groupmod -g ${SOCK_DOCKER_GID} docker fi # drop access to jenkins user and run jenkins entrypoint *exec gosu jenkins /bin/tini -- /usr/local/bin/jenkins.sh "$@" ``` ??? - TIME CHECK - I've removed error handling and cleanup steps to fit on the slide - The last line: - `exec` removes the shell from pid 1, replaces with `gosu` - `gosu` drops access from root to jenkins, and does an internal exec to `tini` - `tini` is a small init to reap zombie processes, and launches `jenkins.sh` - `jenkins.sh` runs the app - Done this way, the entrypoint is transparent to the user and container process, they only notice the file permissions in the container always match where ever they run the container - A `docker exec` will still run as root, but app will run as jenkins --- # Q: How to initialize a host volume from image? - Option 1: Don't. Initialize outside of docker, before starting the container - Option 2: Copy with an entrypoint from a saved location in the image. ??? - Opt 1: pay attention to uid/gid's. - Opt 2: if the image is run without a host vol, then use a symlink inside the image from /data to /data-save -- - Option 3: Define a named volume that's a bind mount. ```no-highlight $ docker volume create --driver local \ --opt type=none \ --opt device=/home/user/test \ --opt o=bind \ test_vol ``` ??? - Opt 3: Docker will initialize named volumes, but the directory must exist in advance for the bind mount to succeed (unlike host volumes) --- ## Q: How to initialize a host volume from image? Walk-through of example 3 - Dockerfile: ```no-highlight FROM busybox:latest RUN adduser --home /home/user --uid 5001 \ --disabled-password user USER user *COPY --chown=user sample-data/ /home/user/data/ ``` ??? - We've made a user, and copied in some sample data to the image. --- ## Q: How to initialize a host volume from image? Walk-through of example 3 - Sample data: ``` $ ls -al sample-data/ total 24 drwxr-xr-x 3 bmitch bmitch 4096 Jan 22 2017 . drwxr-xr-x 30 bmitch bmitch 4096 May 14 09:41 .. drwxr-xr-x 2 bmitch bmitch 4096 Jan 22 2017 dir -rw-r--r-- 1 bmitch bmitch 14 Jan 22 2017 file2.txt -rw-r--r-- 1 bmitch bmitch 12 Jan 22 2017 file.txt -rw-r--r-- 1 bmitch bmitch 214 Jan 22 2017 tar-file.tgz ``` ??? - This gets loaded into the image --- ## Q: How to initialize a host volume from image? Walk-through of example 3 - create volume: ```no-highlight *$ mkdir test-vol *$ ls -al test-vol total 8 drwxr-sr-x 2 bmitch bmitch 4096 May 14 09:40 . drwxr-xr-x 30 bmitch bmitch 4096 May 14 09:33 .. *$ docker volume create --driver local --opt type=none \ * --opt device=$(pwd)/test-vol --opt o=bind test-vol test-vol ``` ??? - We have to create the directory on the host first, same for any bind/nfs/etc mount. - The volume create could also be done in a compose.yml file. Slightly different syntax. --- ## Q: How to initialize a host volume from image? Walk-through of example 3 - Run the container: ```no-highlight $ docker run -it --rm -v test-vol:/home/user/data test-vol \ /bin/sh -c "\ echo hello world >/home/user/data/inside-container.txt \ && ls -l /home/user/data" total 20 drwxr-xr-x 2 user user 4096 May 14 13:43 dir -rw-r--r-- 1 user user 12 Jan 23 2017 file.txt -rw-r--r-- 1 user user 14 Jan 23 2017 file2.txt *-rw-r--r-- 1 user user 12 May 14 13:43 inside-container.txt -rw-r--r-- 1 user user 214 Jan 23 2017 tar-file.tgz ``` ??? - Run the container, using the named volume mount - Volume was initialized even though it's a "host" volume - Created the "inside-container.txt" file by the container - Note the container owns everything as "user" --- ## Q: How to initialize a host volume from image? Walk-through of example 3 - Show the local directory from the host: ``` $ ls -al test-vol/ total 28 drwxr-sr-x 3 5001 5001 4096 May 14 09:43 . drwxr-xr-x 30 bmitch bmitch 4096 May 14 09:41 .. drwxr-xr-x 2 5001 5001 4096 May 14 09:43 dir -rw-r--r-- 1 5001 5001 14 Jan 22 2017 file2.txt -rw-r--r-- 1 5001 5001 12 Jan 22 2017 file.txt -rw-r--r-- 1 5001 5001 12 May 14 09:43 inside-container.txt -rw-r--r-- 1 5001 5001 214 Jan 22 2017 tar-file.tgz ``` ??? - Note the uid matches the image uid, this doesn't solve the uid mapping issue - Combine this with previous uid/gid entrypoint fix for best of both --- layout: false class: title # Thank You ### Slides: https://github.com/sudo-bmitch/dc2018 .left-column[ ![QR](img/github-qr.png) ] .right-column[.align-right[.no-bullets[ - Brandon Mitchell - Twitter: @sudo_bmitch - StackOverflow: bmitch ]]] ??? - With that, I challenge everyone to go teach others - If you don't know all the answers, that's ok, learn it while you're answering - All the slides, along with the much longer talk are up on github. - If you have questions, look me up in a hallway track, "at" me on twitter, open an issue on this github repo.