Dockerfile Best Practices

Dockerfiles provide a simple syntax for building images. The following are a few tips and tricks to help you get the most out of Dockerfiles.

1. Use the cache

Each instruction in a Dockerfile commits its changes to a new image, and that image then serves as the base for the next instruction. If an image with the same base image and instruction (except for ADD) already exists, docker will use that image instead of executing the instruction again; this is the build cache.
To use the cache effectively, you need to keep your Dockerfiles consistent and only add new instructions at the end of the file. All of my Dockerfiles start with the same 5 lines.

FROM ubuntu
MAINTAINER Michael Crosby <michael@crosbymichael.com>

RUN echo "deb http://archive.ubuntu.com/ubuntu precise main universe" > /etc/apt/sources.list
RUN apt-get update
RUN apt-get upgrade -y

Change the MAINTAINER instruction and docker will be forced to re-run the apt-get RUN instructions instead of hitting the cache.

Keep common instructions at the top of your Dockerfile to take advantage of the cache.
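As a sketch of this ordering (the app.conf file and setup script below are hypothetical), keep the stable instructions first and the frequently-changing ones last, so an edit only invalidates the tail of the cache:

```Dockerfile
FROM ubuntu
MAINTAINER Michael Crosby <michael@crosbymichael.com>

# Stable setup: these lines rarely change, so they stay cached
RUN echo "deb http://archive.ubuntu.com/ubuntu precise main universe" > /etc/apt/sources.list
RUN apt-get update
RUN apt-get upgrade -y

# Frequently-changing steps go last; editing app.conf only
# re-runs the build from this ADD onward
ADD ./app.conf /etc/app.conf
RUN /usr/bin/setup-app
```

With this layout, a rebuild after changing app.conf reuses every cached layer above the ADD line.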

2. Use a tag

Unless you are just experimenting with docker, you should always pass the -t option to docker build so that the resulting image is tagged. A simple, human-readable tag will help you manage every image you create.

docker build -t="crosbymichael/sentry" .

Always tag the images you build with -t.
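A tagged image can also be re-tagged later, which makes cutting releases easy (the image name here is just the example from above):

```shell
# Build and tag the image in one step
docker build -t crosbymichael/sentry .

# Re-tag the same image for a release
docker tag crosbymichael/sentry crosbymichael/sentry:v1

# A meaningful tag makes the image easy to find later
docker images | grep sentry
```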

3. Expose ports

Two of docker's core concepts are repeatability and portability. Images should be able to run on any host, as many times as needed. With Dockerfiles you have the ability to map both the private and the public port, but you should never map the public port in a Dockerfile. If you do, the host can only run a single instance of that image, because the public port is already taken.

# private and public mapping
EXPOSE 80:8080

# private only
EXPOSE 80

If the consumer of the image cares about which public port the container maps to, they can pass the -p option at runtime; otherwise, docker will automatically assign a public port for the container.

Never map the public port in a Dockerfile.
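For example, with only the private port exposed in the Dockerfile, the consumer chooses the public mapping at run time (crosbymichael/webapp is a hypothetical image name):

```shell
# Map the container's private port 80 to public port 8080 on the host
docker run -d -p 8080:80 crosbymichael/webapp

# Or let docker assign a random public port, so several
# instances can run on the same host without colliding
docker run -d -p 80 crosbymichael/webapp
```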

4. CMD and ENTRYPOINT syntax

CMD and ENTRYPOINT are fairly straightforward, but they have a hidden, er, "feature" that can cause problems if you are not aware of it. Each of these instructions supports two different syntaxes.

CMD /bin/echo
# or
CMD ["/bin/echo"]

This may not look like it would be an issue, but the devil is in the details, and it will trip you up. If you use the second syntax, where the CMD (or ENTRYPOINT) is an array, it acts exactly like you would expect. If you use the first syntax, without the array, docker prepends /bin/sh -c to your command. This has been in docker for as long as I can remember.

Prepending /bin/sh -c can cause some unexpected issues and behavior that is not easily understood if you did not know that docker modified your CMD. Therefore, you should always use the array syntax for both instructions, so that your command is executed exactly as you intended.

Always use the array syntax for CMD and ENTRYPOINT.
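You can see the difference the /bin/sh -c wrapper makes without docker at all. In the sketch below, the single-quoted string stands in for an array element that docker passes through untouched:

```shell
#!/bin/sh
# Shell form: docker runs your command as /bin/sh -c "<command>",
# so the shell expands variables before your binary ever runs.
shell_form=$(/bin/sh -c 'echo $HOME')

# Exec (array) form: the argv is executed directly with no shell in
# between, so "$HOME" reaches echo as a literal string.
exec_form=$(echo '$HOME')

echo "shell form prints: $shell_form"
echo "exec form prints:  $exec_form"
```

The shell form prints your home directory, while the exec form prints the literal text $HOME. The same difference applies to signal handling: with the shell form, /bin/sh is PID 1 in the container, not your process.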

5. CMD and ENTRYPOINT are better together

In case you don't know, ENTRYPOINT makes your dockerized application behave like a binary. You can pass arguments to the ENTRYPOINT during docker run without worrying about them being overwritten (unlike CMD). ENTRYPOINT is even better when used together with CMD. Let's check out my RethinkDB Dockerfile and see how to use this.

# Dockerfile for Rethinkdb 
# http://www.rethinkdb.com/

FROM ubuntu

MAINTAINER Michael Crosby <michael@crosbymichael.com>

RUN echo "deb http://archive.ubuntu.com/ubuntu precise main universe" > /etc/apt/sources.list
RUN apt-get update
RUN apt-get upgrade -y

RUN apt-get install -y python-software-properties
RUN add-apt-repository ppa:rethinkdb/ppa
RUN apt-get update
RUN apt-get install -y rethinkdb

# Rethinkdb process
EXPOSE 28015
# Rethinkdb admin console
EXPOSE 8080

# Create the /rethinkdb_data dir structure
RUN /usr/bin/rethinkdb create

ENTRYPOINT ["/usr/bin/rethinkdb"]

CMD ["--help"]

This is everything that is required to get RethinkDB dockerized. We have my standard 5 lines at the top to make sure the base image is updated, the ports are exposed, etc. With the ENTRYPOINT set, we know that whenever this image is run, every argument passed during docker run will be an argument to the ENTRYPOINT (/usr/bin/rethinkdb).

I also have a default CMD of --help set in the Dockerfile. This way, in case no arguments are passed during docker run, rethinkdb's default help output is displayed to the user. This is the same functionality you would expect when interacting with the rethinkdb binary directly.

docker run crosbymichael/rethinkdb

Output

Running 'rethinkdb' will create a new data directory or use an existing one,
  and serve as a RethinkDB cluster node.
File path options:
  -d [ --directory ] path           specify directory to store data and metadata
  --io-threads n                    how many simultaneous I/O operations can happen
                                    at the same time

Machine name options:
  -n [ --machine-name ] arg         the name for this machine (as will appear in
                                    the metadata).  If not specified, it will be
                                    randomly chosen from a short list of names.

Network options:
  --bind {all | addr}               add the address of a local interface to listen
                                    on when accepting connections; loopback
                                    addresses are enabled by default
  --cluster-port port               port for receiving connections from other nodes
  --driver-port port                port for rethinkdb protocol client drivers
  -o [ --port-offset ] offset       all ports used locally will have this value
                                    added
  -j [ --join ] host:port           host and port of a rethinkdb node to connect to
  .................

Now let's run the container with the --bind all argument.

docker run crosbymichael/rethinkdb --bind all

Output

info: Running rethinkdb 1.7.1-0ubuntu1~precise (GCC 4.6.3)...
info: Running on Linux 3.2.0-45-virtual x86_64
info: Loading data from directory /rethinkdb_data
warn: Could not turn off filesystem caching for database file: "/rethinkdb_data/metadata" (Is the file located on a filesystem that doesn't support direct I/O (e.g. some encrypted or journaled file systems)?) This can cause performance problems.
warn: Could not turn off filesystem caching for database file: "/rethinkdb_data/auth_metadata" (Is the file located on a filesystem that doesn't support direct I/O (e.g. some encrypted or journaled file systems)?) This can cause performance problems.
info: Listening for intracluster connections on port 29015
info: Listening for client driver connections on port 28015
info: Listening for administrative HTTP connections on port 8080
info: Listening on addresses: 127.0.0.1, 172.16.42.13
info: Server ready
info: Someone asked for the nonwhitelisted file /js/handlebars.runtime-1.0.0.beta.6.js, if this should be accessible add it to the whitelist.

And there it is: a full RethinkDB instance running with access to the db and admin console, and you interact with the image the same way you interact with the binary. Very powerful, and yet extremely simple. I love simple.

ENTRYPOINT and CMD are better together.

I hope this post helps you to get started working with Dockerfiles and building images that we all can use and benefit from. Going forward, I believe that Dockerfiles will be a very important part of what makes docker so simple and easy to use whether you are consuming or producing images. I plan to invest much of my time to provide a complete, powerful, yet simple solution to building docker images via the Dockerfile.
