LXC.CONTAINER.CONF(5) LXC.CONTAINER.CONF(5)
lxc.container.conf - LXC container configuration file
LXC is a well-known and heavily tested low-level Linux container
runtime. It has been in active development since 2008 and has proven
itself in critical production environments worldwide. Some of its core
contributors are the same people who helped implement various
well-known containerization features inside the Linux kernel.
LXC's main focus is system containers. That is, containers which
offer an environment as close as possible to the one you'd get from a
VM but without the overhead that comes with running a separate kernel
and simulating all the hardware.
This is achieved through a combination of kernel security features
such as namespaces, mandatory access control and control groups.
LXC supports unprivileged containers. Unprivileged containers are
containers that are run without any privilege. This requires support
for user namespaces in the kernel that the container is run on. LXC
was the first runtime to support unprivileged containers after user
namespaces were merged into the mainline kernel.
In essence, user namespaces isolate given sets of UIDs and GIDs. This
is achieved by establishing a mapping between a range of UIDs and
GIDs on the host to a different (unprivileged) range of UIDs and GIDs
in the container. The kernel will translate this mapping in such a
way that inside the container all UIDs and GIDs appear as you would
expect from the host whereas on the host these UIDs and GIDs are in
fact unprivileged. For example, a process running as UID and GID 0
inside the container might appear as UID and GID 100000 on the host.
The implementation and working details can be gathered from the
corresponding user namespace man page. UID and GID mappings can be
defined with the lxc.idmap key.
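The mapping described above could be written as follows. This is a
minimal sketch: the host range starting at 100000 and the range size
of 65536 are assumed values chosen to match the example in the text,
and the host must actually delegate that range to the user starting
the container (typically through /etc/subuid and /etc/subgid).

```
# Map container UIDs 0-65535 to host UIDs 100000-165535.
lxc.idmap = u 0 100000 65536
# Map container GIDs 0-65535 to host GIDs 100000-165535.
lxc.idmap = g 0 100000 65536
```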
Linux containers are defined with a simple configuration file. Each
option in the configuration file has the form key = value fitting in
one line. The "#" character means the line is a comment. List
options, like capabilities and cgroups options, can be used with no
value to clear any previously defined values of that option.
LXC namespaces configuration keys by using single dots. This means
complex configuration keys such as lxc.net.0 expose various subkeys
such as lxc.net.0.type, lxc.net.0.link, lxc.net.0.ipv6.address, and
others for even more fine-grained configuration.
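As an illustration of the file format, a short fragment might look
like the following. The hostname web01 and the bridge name br0 are
hypothetical values used only for this sketch.

```
# Lines starting with "#" are comments.
lxc.uts.name = web01

# Subkeys of lxc.net.0 all describe the same network.
lxc.net.0.type = veth
lxc.net.0.link = br0

# A list option with no value clears anything set earlier,
# for example by an included file.
lxc.cap.drop =
```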
CONFIGURATION
In order to ease administration of multiple related containers, it is
possible to have a container configuration file cause another file to
be loaded. For instance, network configuration can be defined in one
common file which is included by multiple containers. Then, if the
containers are moved to another host, only one file may need to be
updated.
lxc.include
Specify the file to be included. The included file must be in
the same valid lxc configuration file format.
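A container configuration that pulls in a shared file might look like
this. The path /etc/lxc/common-net.conf is a hypothetical location
chosen for the sketch, not a file shipped by LXC.

```
# Network settings shared by several containers live in one file.
lxc.include = /etc/lxc/common-net.conf

# Container-specific settings follow as usual.
lxc.uts.name = c1
```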
ARCHITECTURE
Allows one to set the architecture for the container. For example,
set a 32-bit architecture for a container running 32-bit binaries on
a 64-bit host. This fixes container scripts which rely on the
architecture to do some work, like downloading packages.
lxc.arch
Specify the architecture for the container.
Some valid options are x86, i686, x86_64, amd64
HOSTNAME
The utsname section defines the hostname to be set for the container.
That means the container can set its own hostname without changing
the one from the system. That makes the hostname private for the
container.
lxc.uts.name
specify the hostname for the container
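For instance, a 32-bit container on a 64-bit host with its own
private hostname could be sketched as follows. The hostname build32 is
a hypothetical value.

```
# Report a 32-bit architecture to scripts running in the container.
lxc.arch = i686
# Hostname seen inside the container only.
lxc.uts.name = build32
```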
HALT SIGNAL
Allows one to specify the signal name or number sent to the
container's init process to cleanly shut down the container. Different
init systems could use different signals to perform a clean shutdown
sequence. This option allows the signal to be specified in kill(1)
fashion, e.g. SIGPWR, SIGRTMIN+14, SIGRTMAX-10 or plain number. The
default signal is SIGPWR.
lxc.signal.halt
specify the signal used to halt the container
REBOOT SIGNAL
Allows one to specify signal name or number to reboot the container.
This option allows signal to be specified in kill(1) fashion, e.g.
SIGTERM, SIGRTMIN+14, SIGRTMAX-10 or plain number. The default signal
is SIGINT.
lxc.signal.reboot
specify the signal used to reboot the container
STOP SIGNAL
Allows one to specify signal name or number to forcibly shut down the
container. This option allows signal to be specified in kill(1)
fashion, e.g. SIGKILL, SIGRTMIN+14, SIGRTMAX-10 or plain number. The
default signal is SIGKILL.
lxc.signal.stop
specify the signal used to stop the container
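The three signal keys can be set together. The sketch below simply
writes out the documented defaults. A container whose init system
expects different signals would replace the values with any name or
number accepted in kill(1) fashion.

```
# Sent to init for a clean shutdown (default).
lxc.signal.halt = SIGPWR
# Sent to init to reboot the container (default).
lxc.signal.reboot = SIGINT
# Sent to forcibly stop the container (default).
lxc.signal.stop = SIGKILL
```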
INIT COMMAND
Sets the command to use as the init system for the containers.
lxc.execute.cmd
Absolute path from container rootfs to the binary to run by
default. This mostly makes sense for lxc-execute.
lxc.init.cmd
Absolute path from container rootfs to the binary to use as
init. This mostly makes sense for lxc-start. Default is
/sbin/init.
INIT WORKING DIRECTORY
Sets the absolute path inside the container as the working directory
for the containers. LXC will switch to this directory before
executing init.
lxc.init.cwd
Absolute path inside the container to use as the working
directory.
INIT ID
Sets the UID/GID to use for the init system, and subsequent commands.
Note that using a non-root UID when booting a system container will
likely not work due to missing privileges. Setting the UID/GID is
mostly useful when running application containers. Defaults to:
UID(0), GID(0)
lxc.init.uid
UID to use for init.
lxc.init.gid
GID to use for init.
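An application container that runs a single program as a non-root
user might combine the init keys as follows. This is only a sketch:
the binary /usr/local/bin/myapp, the directory /srv/myapp and the
numeric ID 1000 are hypothetical and must exist inside the container's
rootfs.

```
# Run one application instead of a full init system.
lxc.init.cmd = /usr/local/bin/myapp
# LXC switches to this directory before executing the command.
lxc.init.cwd = /srv/myapp
# Run the command as an unprivileged user and group.
lxc.init.uid = 1000
lxc.init.gid = 1000
```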
PROC
Configure proc filesystem for the container.
lxc.proc.[proc file name]
Specify the proc file name to be set. The file names available
are those listed under /proc/PID/. Example:
lxc.proc.oom_score_adj = 10
EPHEMERAL
Allows one to specify whether a container will be destroyed on
shutdown.
lxc.ephemeral
The only allowed values are 0 and 1. Set this to 1 to destroy
a container on shutdown.
NETWORK
The network section defines how the network is virtualized in the
container. The network virtualization acts at layer two. In order to
use the network virtualization, parameters must be specified to
define the network interfaces of the container. Several virtual
interfaces can be assigned and used in a container even if the system
has only one physical network interface.
lxc.net
may be used without a value to clear all previous network
options.
lxc.net.[i].type
specify what kind of network virtualization to be used for the
container. Multiple networks can be specified by using an
additional index i after all lxc.net.* keys. For example,
lxc.net.0.type = veth and lxc.net.1.type = veth specify two
different networks of the same type. All keys sharing the same
index i will be treated as belonging to the same network. For
example, lxc.net.0.link = br0 will belong to lxc.net.0.type.
Currently, the different virtualization types can be:
none: will cause the container to share the host's network
namespace. This means the host network devices are usable in
the container. It also means that if both the container and
host have upstart as init, 'halt' in a container (for
instance) will shut down the host.
empty: will create only the loopback interface.
veth: a virtual ethernet pair device is created with one side
assigned to the container and the other side attached to a
bridge specified by the lxc.net.[i].link option. If the
bridge is not specified, then the veth pair device will be
created but not attached to any bridge. Otherwise, the bridge
has to be created on the system before starting the container.
lxc won't handle any configuration outside of the container.
By default, lxc chooses a name for the network device
belonging to the outside of the container, but if you wish to
handle this name yourself, you can tell lxc to set a
specific name with the lxc.net.[i].veth.pair option (except
for unprivileged containers where this option is ignored for
security reasons).
vlan: a vlan interface is linked with the interface specified
by the lxc.net.[i].link and assigned to the container. The
vlan identifier is specified with the option
lxc.net.[i].vlan.id.
macvlan: a macvlan interface is linked with the interface
specified by the lxc.net.[i].link and assigned to the
container. lxc.net.[i].macvlan.mode specifies the mode the
macvlan will use to communicate between different macvlan
interfaces on the same upper device. The accepted modes are
private, vepa, bridge and passthru. In private mode (the
default), the device never communicates with any other device
on the same upper_dev. In vepa (Virtual Ethernet Port
Aggregator) mode, it is assumed that the adjacent bridge
returns all frames where both source and destination are local
to the macvlan port, i.e. the bridge is set up as a reflective
relay. Broadcast frames coming in from the upper_dev get
flooded to all macvlan interfaces in VEPA mode, local frames
are not delivered locally. In bridge mode, it provides the
behavior of a simple bridge between different macvlan
interfaces on the same port. Frames from one interface to
another one get delivered directly and are not sent out
externally. Broadcast frames get flooded to all other bridge
ports and to the external interface, but when they come back
from a reflective relay, we don't deliver them again. Since we
know all the MAC addresses, the macvlan bridge mode does not
require learning or STP like the bridge module does. In
passthru mode, all frames received by the physical interface
are forwarded to the macvlan interface. Only one macvlan
interface in passthru mode is possible for one physical
interface.
phys: an already existing interface specified by the
lxc.net.[i].link is assigned to the container.
lxc.net.[i].flags
Specify an action to do for the network.
up: activates the interface.
lxc.net.[i].link
Specify the interface to be used for real network traffic.
lxc.net.[i].mtu
Specify the maximum transmission unit (MTU) for this interface.
lxc.net.[i].name
The interface name is dynamically allocated, but if another
name is needed because the configuration files being used by
the container use a generic name, eg. eth0, this option will
rename the interface in the container.
lxc.net.[i].hwaddr
The interface MAC address is dynamically allocated by default
to the virtual interface, but in some cases a fixed address is
needed to resolve a MAC address conflict or to always have the
same link-local ipv6 address. Any "x" in the address will be
replaced by a random value, which allows setting hwaddr templates.
lxc.net.[i].ipv4.address
Specify the ipv4 address to assign to the virtualized
interface. Several lines specify several ipv4 addresses. The
address is in format x.y.z.t/m, eg. 192.168.1.123/24.
lxc.net.[i].ipv4.gateway
Specify the ipv4 address to use as the gateway inside the
container. The address is in format x.y.z.t, eg.
192.168.1.123. Can also have the special value auto, which
means to take the primary address from the bridge interface
(as specified by the lxc.net.[i].link option) and use that as
the gateway. auto is only available when using the veth and
macvlan network types.
lxc.net.[i].ipv6.address
Specify the ipv6 address to assign to the virtualized
interface. Several lines specify several ipv6 addresses. The
address is in format x::y/m, eg.
2003:db8:1:0:214:1234:fe0b:3596/64
lxc.net.[i].ipv6.gateway
Specify the ipv6 address to use as the gateway inside the
container. The address is in format x::y, eg. 2003:db8:1:0::1
Can also have the special value auto, which means to take the
primary address from the bridge interface (as specified by the
lxc.net.[i].link option) and use that as the gateway. auto is
only available when using the veth and macvlan network types.
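Putting the network keys together, a container with two interfaces
could be sketched as follows. The bridge br0 and the host interface
eth1 are assumed to exist on the host already, and br0 is assumed to
carry an address in 192.168.1.0/24 so that the auto gateway can be
resolved. All names and addresses are illustrative.

```
# First network: veth pair attached to an existing host bridge.
lxc.net.0.type = veth
lxc.net.0.link = br0
lxc.net.0.flags = up
lxc.net.0.name = eth0
# Each "x" is replaced by a random value.
lxc.net.0.hwaddr = 00:16:3e:xx:xx:xx
lxc.net.0.ipv4.address = 192.168.1.123/24
# Use the primary address of br0 as the gateway.
lxc.net.0.ipv4.gateway = auto

# Second network: macvlan on top of a host interface.
lxc.net.1.type = macvlan
lxc.net.1.link = eth1
lxc.net.1.macvlan.mode = bridge
lxc.net.1.flags = up
lxc.net.1.name = eth1
```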
lxc.net.[i].script.up
Add a configuration option to specify a script to be executed
after creating and configuring the network used from the host
side.
In addition to the information available to all hooks, the
following information is provided to the script:
· LXC_HOOK_TYPE: the hook type. This is either 'up' or 'down'.
· LXC_HOOK_SECTION: the section type 'net'.
· LXC_NET_TYPE: the network type. This is one of the valid
network types listed here (e.g. 'macvlan', 'veth').
· LXC_NET_PARENT: the parent device on the host. This is only
set for network types 'macvlan', 'veth', 'phys'.
· LXC_NET_PEER: the name of the peer device on the host. This
is only set for 'veth' network types. Note that this
information is only available when lxc.hook.version is set
to 1.
Whether this information is provided in the form of environment
variables or as arguments to the script depends on the value of
lxc.hook.version. If set to 1 then information is provided in the
form of environment variables. If set to 0 information is provided as
arguments to the script.
Standard output from the script is logged at debug level. Standard
error is not logged, but can be captured by the hook redirecting its
standard error to standard output.
lxc.net.[i].script.down
Add a configuration option to specify a script to be executed
before destroying the network used from the host side.
In addition to the information available to all hooks, the
following information is provided to the script:
· LXC_HOOK_TYPE: the hook type. This is either 'up' or 'down'.
· LXC_HOOK_SECTION: the section type 'net'.
· LXC_NET_TYPE: the network type. This is one of the valid
network types listed here (e.g. 'macvlan', 'veth').
· LXC_NET_PARENT: the parent device on the host. This is only
set for network types 'macvlan', 'veth', 'phys'.
· LXC_NET_PEER: the name of the peer device on the host. This
is only set for 'veth' network types. Note that this
information is only available when lxc.hook.version is set
to 1.
Whether this information is provided in the form of environment
variables or as arguments to the script depends on the value of
lxc.hook.version. If set to 1 then information is provided in the
form of environment variables. If set to 0 information is provided as
arguments to the script.
Standard output from the script is logged at debug level. Standard
error is not logged, but can be captured by the hook redirecting its
standard error to standard output.
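A configuration that wires up both network scripts and requests the
environment-variable interface might look like this. The script paths
are hypothetical, and the scripts themselves are not shown.

```
# Pass hook information as environment variables (LXC_NET_TYPE,
# LXC_NET_PARENT, LXC_NET_PEER, ...) rather than as arguments.
lxc.hook.version = 1

lxc.net.0.type = veth
lxc.net.0.link = br0
# Run on the host after the network has been created and configured.
lxc.net.0.script.up = /usr/local/bin/c1-net-up.sh
# Run on the host before the network is destroyed.
lxc.net.0.script.down = /usr/local/bin/c1-net-down.sh
```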
NEW PSEUDO TTY INSTANCE (DEVPTS)
For stricter isolation the container can have its own private
instance of the pseudo tty.
lxc.pty.max
If set, the container will have a new pseudo tty instance,
making this private to it. The value specifies the maximum
number of pseudo ttys allowed for a pts instance (this
limitation is not implemented yet).
CONTAINER SYSTEM CONSOLE
If the container is configured with a root filesystem and the inittab
file is set up to use the console, you may want to specify where the
output of this console goes.
lxc.console.buffer.size
Setting this option instructs liblxc to allocate an in-memory
ringbuffer. The container's console output will be written to
the ringbuffer. Note that the ringbuffer must be at least as big
as a standard page size. When passed a value smaller than a
single page size liblxc will allocate a ringbuffer of a single
page size. A page size is usually 4kB. The keyword 'auto'
will cause liblxc to allocate a ringbuffer of 128kB. When
manually specifying a size for the ringbuffer the value should
be a power of 2 when converted to bytes. Valid size suffixes
are 'kB', 'MB', 'GB'. (Note that all conversions are based on
multiples of 1024. That means 'kB' == 'KiB', 'MB' == 'MiB',
'GB' == 'GiB'.)
lxc.console.buffer.logfile
Setting this option instructs liblxc to write the in-memory
ringbuffer to disk. For performance reasons liblxc will only
write the in-memory ringbuffer to disk when requested. Note
that this option is only used by liblxc when
lxc.console.buffer.size is set. By default liblxc will dump
the contents of the in-memory ringbuffer to disk when the
container terminates. This allows users to diagnose boot
failures when the container crashed before an API request to
retrieve the in-memory ringbuffer could be sent or handled.
lxc.console.logfile
Specify a path to a file where the console output will be
written. Note that in contrast to the on-disk ringbuffer
logfile this file will keep growing, potentially filling up the
user's disks if not rotated and deleted. This problem can also
be avoided by using the in-memory ringbuffer options
lxc.console.buffer.size and lxc.console.buffer.logfile.
lxc.console.rotate
Whether to rotate the console logfile specified in
lxc.console.logfile. Users can send an API request to rotate
the logfile. Note that the old logfile will have the same name
as the original with the suffix ".1" appended. Users wishing
to prevent the console log file from filling the disk should
rotate the logfile and delete it if unneeded. This problem can
also be avoided by using the in-memory ringbuffer options
lxc.console.buffer.size and lxc.console.buffer.logfile.
lxc.console.path
Specify a path to a device to which the console will be
attached. The keyword 'none' will simply disable the console.
Note that when 'none' is specified and a device node for the
console is created in the container at /dev/console, or the
host's /dev/console is bind-mounted into the container at
/dev/console, the container will have direct access to the
host's /dev/console. This is dangerous when the container has
write access to the device and should thus be used with caution.
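The two ways of capturing console output could be configured as
sketched below. The file paths are hypothetical. The first variant
bounds memory and disk use through the ringbuffer, while the second
keeps a growing log that has to be rotated.

```
# Variant 1: 128kB in-memory ringbuffer, dumped to disk when the
# container terminates.
lxc.console.buffer.size = auto
lxc.console.buffer.logfile = /var/log/lxc/c1-console.ring

# Variant 2 (commented out): plain logfile that can be rotated
# through an API request.
# lxc.console.logfile = /var/log/lxc/c1-console.log
# lxc.console.rotate = 1
```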
CONSOLE THROUGH THE TTYS
This option is useful if the container is configured with a root
filesystem and the inittab file is set up to launch a getty on the
ttys. The option specifies the number of ttys to be available for the
container. The number of gettys in the inittab file of the container
should not be greater than the number of ttys specified in this
option, otherwise the excess getty sessions will die and respawn
indefinitely giving annoying messages on the console or in
/var/log/messages.
lxc.tty.max
Specify the number of ttys to make available to the container.
CONSOLE DEVICES LOCATION
LXC consoles are provided through Unix98 PTYs created on the host and
bind-mounted over the expected devices in the container. By default,
they are bind-mounted over /dev/console and /dev/ttyN. This can
prevent package upgrades in the guest. Therefore you can specify a
directory location (under /dev) under which LXC will create the files
and bind-mount over them. These will then be symbolically linked to
/dev/console and /dev/ttyN. A package upgrade can then succeed as it
is able to remove and replace the symbolic links.
lxc.tty.dir
Specify a directory under /dev under which to create the
container console devices. Note that LXC will move any bind-
mounts or device nodes for /dev/console into this directory.
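A typical combination of the tty keys might look like this. The value
4 and the directory name lxc are example choices. With them, the
console devices would be created under /dev/lxc and symbolically
linked from /dev/console and /dev/tty1 to /dev/tty4.

```
# Number of ttys available to the container. The container's
# inittab should not start more gettys than this.
lxc.tty.max = 4
# Create the console devices under /dev/lxc instead of directly
# in /dev.
lxc.tty.dir = lxc
```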
/DEV DIRECTORY
By default, lxc creates a few symbolic links (fd,stdin,stdout,stderr)
in the container's /dev directory but does not automatically create
device node entries. This allows the container's /dev to be set up as
needed in the container rootfs. If lxc.autodev is set to 1, then
after mounting the container's rootfs LXC will mount a fresh tmpfs
under /dev (limited to 500k) and fill in a minimal set of initial
devices. This is generally required when starting a container
containing a "systemd" based "init" but may be optional at other
times. Additional devices in the container's /dev directory may be
created through the use of the lxc.hook.autodev hook.
lxc.autodev
Set this to 0 to stop LXC from mounting and populating a
minimal /dev when starting the container.
MOUNT POINTS
The mount points section specifies the different places to be
mounted. These mount points will be private to the container and
won't be visible to the processes running outside of the container.
This is useful to mount /etc, /var or /home, for example.
NOTE - LXC will generally ensure that mount targets and relative
bind-mount sources are properly confined under the container root, to
avoid attacks involving over-mounting host directories and files.
(Symbolic links in absolute mount sources are ignored.) However, if
the container configuration first mounts a directory which is under
the control of the container user, such as /home/joe, into the
container at some path, and then mounts under path, then a TOCTTOU
attack would be possible where the container user modifies a symbolic
link under his home directory at just the right time.
lxc.mount.fstab
specify a file location in the fstab format, containing the
mount information. The mount target location can and in most
cases should be a relative path, which will become relative to
the mounted container root. For instance,
proc proc proc nodev,noexec,nosuid 0 0
Will mount a proc filesystem under the container's /proc,
regardless of where the root filesystem comes from. This is
resilient to block device backed filesystems as well as
container cloning.
Note that when mounting a filesystem from an image file or
block device the third field (fs_vfstype) cannot be auto as
with mount(8) but must be explicitly specified.
lxc.mount.entry
specify a mount point corresponding to a line in the fstab
format. In addition, lxc supports three extra mount options:
optional, to not fail if the mount does not work; create=dir
or create=file, to create the directory (or file) at the mount
point before mounting; and relative, to take the source path
as relative to the mounted container root. For instance,
dev/null proc/kcore none bind,relative 0 0
Will expand dev/null to ${LXC_ROOTFS_MOUNT}/dev/null, and
mount it to proc/kcore inside the container.
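Two further entries illustrate the create= and optional options. The
host directories /srv/data and /opt/tools are hypothetical. The
targets are relative paths, so they resolve under the mounted
container root.

```
# Bind-mount a host directory, creating the target directory in
# the container if it does not exist yet.
lxc.mount.entry = /srv/data srv/data none bind,create=dir 0 0
# Same, but do not fail container startup if the mount does not
# work.
lxc.mount.entry = /opt/tools opt/tools none bind,optional,create=dir 0 0
```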
lxc.mount.auto
specify which standard kernel file systems should be
automatically mounted. This may dramatically simplify the
configuration. The file systems are:
· proc:mixed (or proc): mount /proc as read-write, but remount
/proc/sys and /proc/sysrq-trigger read-only for security /
container isolation purposes.
· proc:rw: mount /proc as read-write
· sys:mixed (or sys): mount /sys as read-only but with
/sys/devices/virtual/net writable.
· sys:ro: mount /sys as read-only for security / container
isolation purposes.
· sys:rw: mount /sys as read-write
· cgroup:mixed: mount a tmpfs to /sys/fs/cgroup, create
directories for all hierarchies to which the container is
added, create subdirectories there with the name of the
cgroup, and bind-mount the container's own cgroup into that
directory. The container will be able to write to its own
cgroup directory, but not the parents, since they will be
remounted read-only.
· cgroup:ro: similar to cgroup:mixed, but everything will be
mounted read-only.
· cgroup:rw: similar to cgroup:mixed, but everything will be
mounted read-write. Note that the paths leading up to the
container's own cgroup will be writable, but will not be a
cgroup filesystem but just part of the tmpfs of
/sys/fs/cgroup
· cgroup (without specifier): defaults to cgroup:rw if the
container retains the CAP_SYS_ADMIN capability, cgroup:mixed
otherwise.
· cgroup-full:mixed: mount a tmpfs to /sys/fs/cgroup, create
directories for all hierarchies to which the container is
added, bind-mount the hierarchies from the host to the
container and make everything read-only except the
container's own cgroup. Note that compared to cgroup, where
all paths leading up to the container's own cgroup are just
simple directories in the underlying tmpfs, here
/sys/fs/cgroup/$hierarchy will contain the host's full
cgroup hierarchy, albeit read-only outside the container's
own cgroup. This may leak quite a bit of information into
the container.
· cgroup-full:ro: similar to cgroup-full:mixed, but everything
will be mounted read-only.
· cgroup-full:rw: similar to cgroup-full:mixed, but everything
will be mounted read-write. Note that in this case, the
container may escape its own cgroup. (Note also that if the
container has CAP_SYS_ADMIN support and can mount the cgroup
filesystem itself, it may do so anyway.)
· cgroup-full (without specifier): defaults to cgroup-full:rw
if the container retains the CAP_SYS_ADMIN capability,
cgroup-full:mixed otherwise.
If cgroup namespaces are enabled, then any cgroup auto-mounting
request will be ignored, since the container can mount the
filesystems itself, and automounting can confuse the container init.
Note that if automatic mounting of the cgroup filesystem is enabled,
the tmpfs under /sys/fs/cgroup will always be mounted read-write (but
for the :mixed and :ro cases, the individual hierarchies,
/sys/fs/cgroup/$hierarchy, will be read-only). This is in order to
work around a quirk in Ubuntu's mountall(8) command that will cause
containers to wait for user input at boot if /sys/fs/cgroup is
mounted read-only and the container can't remount it read-write due
to a lack of CAP_SYS_ADMIN.
Examples:
lxc.mount.auto = proc sys cgroup
lxc.mount.auto = proc:rw sys:rw cgroup-full:rw
ROOT FILE SYSTEM
The root file system of the container can be different than that of
the host system.
lxc.rootfs.path
specify the root file system for the container. It can be an
image file, a directory or a block device. If not specified,
the container shares its root file system with the host.
For directory or simple block-device backed containers, a
pathname can be used. If the rootfs is backed by a nbd device,
then nbd:file:1 specifies that file should be attached to a
nbd device, and partition 1 should be mounted as the rootfs.
nbd:file specifies that the nbd device itself should be
mounted. overlayfs:/lower:/upper specifies that the rootfs
should be an overlay with /upper being mounted read-write over
a read-only mount of /lower. aufs:/lower:/upper does the same
using aufs in place of overlayfs. For both overlayfs and aufs
multiple /lower directories can be specified. loop:/file tells
lxc to attach /file to a loop device and mount the loop
device.
lxc.rootfs.mount
where to recursively bind lxc.rootfs.path before pivoting.
This is to ensure success of the pivot_root(2) syscall. Any
directory suffices, the default should generally work.
lxc.rootfs.options
extra mount options to use when mounting the rootfs.
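The rootfs source forms described above could be written as follows.
All paths are hypothetical, and only one lxc.rootfs.path line would be
active in a real configuration, which is why the alternatives are
commented out.

```
# Plain directory.
lxc.rootfs.path = /var/lib/lxc/c1/rootfs

# Overlay: /var/lib/lxc/c1/delta0 mounted read-write over a
# read-only /var/lib/lxc/base/rootfs.
# lxc.rootfs.path = overlayfs:/var/lib/lxc/base/rootfs:/var/lib/lxc/c1/delta0

# Image file attached to a loop device.
# lxc.rootfs.path = loop:/var/lib/lxc/c1/rootfs.img
```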
CONTROL GROUP
The control group section contains the configuration for the
different subsystems. lxc does not check the correctness of the
subsystem name. This has the disadvantage of not detecting
configuration errors until the container is started, but has the
advantage of permitting any future subsystem.
lxc.cgroup.[subsystem name]
specify the control group value to be set. The subsystem name
is the literal name of the control group subsystem. The
permitted names and the syntax of their values are not
dictated by LXC; instead they depend on the features of the
Linux kernel running at the time the container is started, e.g.
lxc.cgroup.cpuset.cpus
lxc.cgroup.dir
specify a directory or path in which the container's cgroup
will be created. For example, setting
lxc.cgroup.dir = my-cgroup/first for a container named "c1"
will create the container's cgroup as a sub-cgroup of
"my-cgroup". For
example, if the user's current cgroup "my-user" is located in
the root cgroup of the cpuset controller in a cgroup v1
hierarchy this would create the cgroup
"/sys/fs/cgroup/cpuset/my-user/my-cgroup/first/c1" for the
container. Any missing cgroups will be created by LXC. This
presupposes that the user has write access to its current
cgroup.
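As a sketch, a container restricted to two CPUs and 512 MiB of memory
on a cgroup v1 host might use the following. The controller file names
are passed through to the kernel unchanged, so they must match what
the running kernel exposes. The values are examples only.

```
# Pin the container to CPUs 0 and 1.
lxc.cgroup.cpuset.cpus = 0,1
# Limit memory use (cgroup v1 memory controller).
lxc.cgroup.memory.limit_in_bytes = 512M
# Create the container's cgroup below my-cgroup/first.
lxc.cgroup.dir = my-cgroup/first
```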
CAPABILITIES
Capabilities can be dropped in the container if it is run as root.
lxc.cap.drop
Specify the capability to be dropped in the container. A
single line defining several capabilities with a space
separation is allowed. The format is the lower case of the
capability definition without the "CAP_" prefix, eg.
CAP_SYS_MODULE should be specified as sys_module. See
capabilities(7). If used with no value, lxc will clear any
drop capabilities specified up to this point.
lxc.cap.keep
Specify the capability to be kept in the container. All other
capabilities will be dropped. When a special value of "none"
is encountered, lxc will clear any keep capabilities specified
up to this point. A value of "none" alone can be used to drop
all capabilities.
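The two approaches could be sketched as follows. The capability names
are examples taken from capabilities(7), written in the lower-case
form without the "CAP_" prefix. The keep list is shown commented out
as an alternative to the drop list.

```
# Deny-list approach: drop several capabilities on one line,
# then add one more on a separate line.
lxc.cap.drop = sys_module mac_admin mac_override sys_time
lxc.cap.drop = sys_rawio

# Allow-list approach: keep only these, drop everything else.
# lxc.cap.keep = chown setuid setgid net_bind_service
```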
NAMESPACE INHERITANCE
A namespace can be inherited from another container or process.
lxc.namespace.[namespace identifier]
Specify a namespace to inherit from another container or
process. The [namespace identifier] suffix needs to be
replaced with one of the namespaces that appear in the
/proc/PID/ns directory.
To inherit the namespace from another process set the
lxc.namespace.[namespace identifier] to the PID of the
process, e.g. lxc.namespace.net=42.
To inherit the namespace from another container set the
lxc.namespace.[namespace identifier] to the name of the
container, e.g. lxc.namespace.pid=c3.
To inherit the namespace from another container located in a
different path than the standard liblxc path set the
lxc.namespace.[namespace identifier] to the full path to the
container, e.g. lxc.namespace.user=/opt/c3.
In order to inherit namespaces the caller needs to have
sufficient privilege over the process or container.
Note that sharing pid namespaces between system containers
will likely not work with most init systems.
Note that if two processes are in different user namespaces
and one process wants to inherit the other's network namespace
it usually needs to inherit the user namespace as well.
RESOURCE LIMITS
The soft and hard resource limits for the container can be changed.
Unprivileged containers can only lower them. Resources which are not
explicitly specified will be inherited.
lxc.prlimit.[limit name]
Specify the resource limit to be set. A limit is specified as
two colon separated values which are either numeric or the
word 'unlimited'. A single value can be used as a shortcut to
set both soft and hard limit to the same value. The permitted
names are the "RLIMIT_" resource names in lowercase without the
"RLIMIT_" prefix, eg. RLIMIT_NOFILE should be specified as
"nofile". See setrlimit(2). If used with no value, lxc will
clear the resource limit specified up to this point. A
resource with no explicitly configured limitation will be
inherited from the process starting up the container.
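For example, the colon-separated and single-value forms might be used
like this. The numbers are illustrative. An unprivileged container
could only use values at or below the limits it inherits.

```
# Soft limit 1024, hard limit 4096 open file descriptors.
lxc.prlimit.nofile = 1024:4096
# A single value sets both the soft and the hard limit.
lxc.prlimit.nproc = 512
# The word unlimited is accepted in place of a number.
lxc.prlimit.core = unlimited
```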
SYSCTL
Configure kernel parameters for the container.
lxc.sysctl.[kernel parameters name]
Specify the kernel parameters to be set. The parameters
available are those listed under /proc/sys/. Note that not
all sysctls are namespaced. Changing non-namespaced sysctls
will cause the system-wide setting to be modified. See
sysctl(8). If used with no value, lxc will clear the
parameters specified up to this point.
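A short sketch with two parameters follows. Both are chosen on the
assumption that they are namespaced on the host kernel (per network
namespace and per IPC namespace respectively), so that setting them
does not change the system-wide value. This should be verified for the
kernel in use.

```
# Enable IPv4 forwarding inside the container's network namespace.
lxc.sysctl.net.ipv4.ip_forward = 1
# Maximum size of a System V message in the container's IPC
# namespace.
lxc.sysctl.kernel.msgmax = 65536
```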
APPARMOR PROFILE
If lxc was compiled and installed with apparmor support, and the host
system has apparmor enabled, then the apparmor profile under which
the container should be run can be specified in the container
configuration. The default is lxc-container-default-cgns if the host
kernel is cgroup namespace aware, or lxc-container-default otherwise.
lxc.apparmor.profile
Specify the apparmor profile under which the container should
be run. To specify that the container should be unconfined,
use
lxc.apparmor.profile = unconfined
If the apparmor profile should remain unchanged (i.e. if you
are nesting containers and are already confined), then use
lxc.apparmor.profile = unchanged
lxc.apparmor.allow_incomplete
Apparmor profiles are pathname based. Therefore many file
restrictions require mount restrictions to be effective
against a determined attacker. However, these mount
restrictions are not yet implemented in the upstream kernel.
Without the mount restrictions, the apparmor profiles still
protect against accidental damage.
If this flag is 0 (default), then the container will not be
started if the kernel lacks the apparmor mount features, so
that a regression after a kernel upgrade will be detected. To
start the container under partial apparmor protection, set
this flag to 1.
SELINUX CONTEXT
If lxc was compiled and installed with SELinux support, and the host
system has SELinux enabled, then the SELinux context under which the
container should be run can be specified in the container
configuration. The default is unconfined_t, which means that lxc will
not attempt to change contexts. See
/usr/local/share/lxc/selinux/lxc.te for an example policy and more
information.
lxc.selinux.context
Specify the SELinux context under which the container should
be run or unconfined_t. For example
lxc.selinux.context = system_u:system_r:lxc_t:s0:c22
SECCOMP CONFIGURATION
A container can be started with a reduced set of available system
calls by loading a seccomp profile at startup. The seccomp
configuration file must begin with a version number on the first
line, a policy type on the second line, followed by the
configuration.
Versions 1 and 2 are currently supported. In version 1, the policy is
a simple whitelist. The second line therefore must read "whitelist",
with the rest of the file containing one (numeric) syscall number per
line. Each syscall number is whitelisted, while every unlisted number
is blacklisted for use in the container.
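As an illustration only, a version 1 profile might look like the
sketch below. The numbers assume the x86_64 syscall table (0 = read,
1 = write, 60 = exit, 231 = exit_group); they differ on other
architectures, and a list this short is far too restrictive to boot
a real container:

```
1
whitelist
0
1
60
231
```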
In version 2, the policy may be blacklist or whitelist, supports per-
rule and per-policy default actions, and supports per-architecture
system call resolution from textual names.
An example blacklist policy, in which all system calls are allowed
except for mknod, which will simply do nothing and return 0
(success), looks like:
2
blacklist
mknod errno 0
lxc.seccomp.profile
Specify a file containing the seccomp configuration to load
before the container starts.
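A minimal sketch of the corresponding configuration entry. The path
is a hypothetical location chosen for illustration, not a default
shipped by lxc:

```
# Hypothetical path to a profile in the format described above
lxc.seccomp.profile = /etc/lxc/seccomp/mycontainer.seccomp
```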
PR_SET_NO_NEW_PRIVS
With PR_SET_NO_NEW_PRIVS active, execve() promises not to grant
privileges to do anything that could not have been done without the
execve() call (for example, rendering the set-user-ID and set-group-
ID mode bits, and file capabilities non-functional). Once set, this
bit cannot be unset. The setting of this bit is inherited by children
created by fork() and clone(), and preserved across execve(). Note
that PR_SET_NO_NEW_PRIVS is applied after the container has changed
into its intended AppArmor profile or SELinux context.
lxc.no_new_privs
Specify whether the PR_SET_NO_NEW_PRIVS flag should be set for
the container. Set to 1 to activate.
UID MAPPINGS
A container can be started in a private user namespace with user and
group id mappings. For instance, you can map userid 0 in the
container to userid 200000 on the host. The root user in the
container will be privileged in the container, but unprivileged on
the host. Normally a system container will want a range of ids, so
you would map, for instance, user and group ids 0 through 20,000 in
the container to the ids 200,000 through 220,000.
lxc.idmap
Four values must be provided. First a character, either 'u'
or 'g', to specify whether user or group ids are being mapped.
Next is the first id as seen in the user namespace of the
container. Next is the corresponding first id as seen on the
host. Finally, a range indicating the number of consecutive
ids to map.
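As a sketch of the arithmetic, container id c in a mapped range
corresponds to host id (host start + c - container start). The
entries below use the 200000 base from the UID MAPPINGS introduction;
the range of 20000 is an assumption chosen for illustration and
covers container ids 0 through 19999:

```
# Container uids 0..19999 -> host uids 200000..219999
lxc.idmap = u 0 200000 20000
# Container gids 0..19999 -> host gids 200000..219999
lxc.idmap = g 0 200000 20000
```

With these entries, uid 1000 inside the container would appear as
uid 201000 on the host.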
CONTAINER HOOKS
Container hooks are programs or scripts which can be executed at
various times in a container's lifetime.
When a container hook is executed, additional information is passed
along. The lxc.hook.version argument can be used to determine if the
following arguments are passed as command line arguments or through
environment variables. The arguments are:
· Container name.
· Section (always 'lxc').
· The hook type (e.g. 'clone' or 'pre-mount').
· Additional arguments. In the case of the clone hook, any extra
arguments passed to lxc-clone will appear as further arguments to
the hook. In the case of the stop hook, paths to file descriptors
for each of the container's namespaces along with their types are
passed.
The following environment variables are set:
· LXC_CGNS_AWARE: indicator whether the container is cgroup namespace
aware.
· LXC_CONFIG_FILE: the path to the container configuration file.
· LXC_HOOK_TYPE: the hook type (e.g. 'clone', 'mount', 'pre-mount').
Note that the existence of this environment variable is conditional
on the value of lxc.hook.version. If it is set to 1 then
LXC_HOOK_TYPE will be set.
· LXC_HOOK_SECTION: the section type (e.g. 'lxc', 'net'). Note that
the existence of this environment variable is conditional on the
value of lxc.hook.version. If it is set to 1 then LXC_HOOK_SECTION
will be set.
· LXC_HOOK_VERSION: the version of the hooks. This value is identical
to the value of the container's lxc.hook.version config item. If it
is set to 0 then old-style hooks are used. If it is set to 1 then
new-style hooks are used.
· LXC_LOG_LEVEL: the container's log level.
· LXC_NAME: the container's name.
· LXC_[NAMESPACE IDENTIFIER]_NS: path under /proc/PID/fd/ to a file
descriptor referring to the container's namespace. For each
preserved namespace type there will be a separate environment
variable. These environment variables will only be set if
lxc.hook.version is set to 1.
· LXC_ROOTFS_MOUNT: the path to the mounted root filesystem.
· LXC_ROOTFS_PATH: this is the lxc.rootfs.path entry for the
container. Note this is likely not where the mounted rootfs is to
be found, use LXC_ROOTFS_MOUNT for that.
· LXC_SRC_NAME: in the case of the clone hook, this is the original
container's name.
Standard output from the hooks is logged at debug level. Standard
error is not logged, but can be captured by the hook redirecting its
standard error to standard output.
lxc.hook.version
Set to 1 to pass the arguments new-style via environment
variables; set to 0 to pass them as command line arguments.
This setting affects all hook arguments that were
traditionally passed as arguments to the script.
Specifically, it affects
the container name, section (e.g. 'lxc', 'net') and hook type
(e.g. 'clone', 'mount', 'pre-mount') arguments. If new-style
hooks are used then the arguments will be available as
environment variables. The container name will be set in
LXC_NAME. (This is set independently of the value used for
this config item.) The section will be set in LXC_HOOK_SECTION
and the hook type will be set in LXC_HOOK_TYPE. It also
affects how the paths to file descriptors referring to the
container's namespaces are passed. If set to 1 then for each
namespace a separate environment variable LXC_[NAMESPACE
IDENTIFIER]_NS will be set. If set to 0 then the paths will be
passed as arguments to the stop hook.
lxc.hook.pre-start
A hook to be run in the host's namespace before the container
ttys, consoles, or mounts are up.
lxc.hook.pre-mount
A hook to be run in the container's fs namespace but before
the rootfs has been set up. This allows for manipulation of
the rootfs, e.g. to mount an encrypted filesystem. Mounts done
in this hook will not be reflected on the host (apart from
mount propagation), so they will be automatically cleaned up
when the container shuts down.
lxc.hook.mount
A hook to be run in the container's namespace after mounting
has been done, but before the pivot_root.
lxc.hook.autodev
A hook to be run in the container's namespace after mounting
has been done and after any mount hooks have run, but before
the pivot_root, if lxc.autodev == 1. The purpose of this hook
is to assist in populating the /dev directory of the container
when using the autodev option for systemd based containers.
The container's /dev directory is relative to the
${LXC_ROOTFS_MOUNT} environment variable available when the
hook is run.
lxc.hook.start-host
A hook to be run in the host's namespace after the container
has been set up, and immediately before starting the container
init.
lxc.hook.start
A hook to be run in the container's namespace immediately
before executing the container's init. This requires the
program to be available in the container.
lxc.hook.stop
A hook to be run in the host's namespace with references to
the container's namespaces after the container has been shut
down. For each namespace an extra argument is passed to the
hook containing the namespace's type and a filename that can
be used to obtain a file descriptor to the corresponding
namespace, separated by a colon. The type is the name as it
would appear in the /proc/PID/ns directory. For instance for
the mount namespace the argument usually looks like
mnt:/proc/PID/fd/12.
lxc.hook.post-stop
A hook to be run in the host's namespace after the container
has been shut down.
lxc.hook.clone
A hook to be run when the container is cloned to a new one.
See lxc-clone(1) for more information.
lxc.hook.destroy
A hook to be run when the container is destroyed.
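A minimal sketch of how these keys might be combined. All script
paths are hypothetical. The host-side hooks are looked up on the
host, while the lxc.hook.start program must exist inside the
container's root filesystem:

```
# Pass container name, section and hook type via environment variables
lxc.hook.version = 1
# Runs on the host before ttys, consoles or mounts are up
lxc.hook.pre-start = /usr/local/share/lxc/hooks/prepare-storage.sh
# Runs inside the container just before init; path is inside the rootfs
lxc.hook.start = /usr/local/bin/container-early-setup.sh
# Runs on the host after the container has shut down
lxc.hook.post-stop = /usr/local/share/lxc/hooks/cleanup.sh
```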
CONTAINER HOOKS ENVIRONMENT VARIABLES
A number of environment variables are made available to the startup
hooks to provide configuration information and assist in the
functioning of the hooks. Not all variables are valid in all
contexts. In particular, all paths are relative to the host system
and, as such, not valid during the lxc.hook.start hook.
LXC_NAME
The LXC name of the container. Useful for logging messages in
common log environments. [-n]
LXC_CONFIG_FILE
Host relative path to the container configuration file. This
lets the hook reference the original, top level,
configuration file for the container in order to locate any
additional configuration information not otherwise made
available. [-f]
LXC_CONSOLE
The path to the console output of the container if not NULL.
[-c] [lxc.console.path]
LXC_CONSOLE_LOGPATH
The path to the console log output of the container if not
NULL. [-L]
LXC_ROOTFS_MOUNT
The mount location to which the container is initially bound.
This will be the host relative path to the container rootfs
for the container instance being started and is where changes
should be made for that instance. [lxc.rootfs.mount]
LXC_ROOTFS_PATH
The host relative path to the container root which has been
mounted to the rootfs.mount location. [lxc.rootfs.path]
LXC_SRC_NAME
Only for the clone hook. Is set to the original container
name.
LXC_TARGET
Only for the stop hook. Is set to "stop" for a container
shutdown or "reboot" for a container reboot.
LXC_CGNS_AWARE
If unset, then this version of lxc is not aware of cgroup
namespaces. If set, it will be set to 1, and lxc is aware of
cgroup namespaces. Note this does not guarantee that cgroup
namespaces are enabled in the kernel. This is used by the
lxcfs mount hook.
LOGGING
Logging can be configured on a per-container basis. By default,
depending upon how the lxc package was compiled, container startup is
logged only at the ERROR level, and logged to a file named after the
container (with '.log' appended) either under the container path, or
under /usr/local/var/log/lxc.
Both the default log level and the log file can be specified in the
container configuration file, overriding the default behavior. Note
that the configuration file entries can in turn be overridden by the
command line options to lxc-start.
lxc.log.level
The level at which to log. The log level is an integer in the
range of 0..8 inclusive, where a lower number means more
verbose debugging. In particular 0 = trace, 1 = debug, 2 =
info, 3 = notice, 4 = warn, 5 = error, 6 = critical, 7 =
alert, and 8 = fatal. If unspecified, the level defaults to 5
(error), so that only errors and above are logged.
Note that when a script (such as either a hook script or a
network interface up or down script) is called, the script's
standard output is logged at level 1, debug.
lxc.log.file
The file to which logging info should be written.
lxc.log.syslog
Send logging info to syslog. It respects the log level defined
in lxc.log.level. The argument should be the syslog facility
to use, valid ones are: daemon, local0, local1, local2,
local3, local4, local5, local6, local7.
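For example, a container could be configured for debug-level logging
as sketched below. The file path and the choice of the daemon
facility are illustrative assumptions:

```
# 1 = debug, so hook and network script output is also recorded
lxc.log.level = 1
# Hypothetical log file location
lxc.log.file = /var/log/lxc/mycontainer.log
# Send log messages to syslog using the daemon facility
lxc.log.syslog = daemon
```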
AUTOSTART
The autostart options support marking which containers should be
auto-started and in what order. These options may be used by LXC
tools directly or by external tooling provided by the distributions.
lxc.start.auto
Whether the container should be auto-started. Valid values
are 0 (off) and 1 (on).
lxc.start.delay
How long to wait (in seconds) after the container is started
before starting the next one.
lxc.start.order
An integer used to sort the containers when auto-starting a
series of containers at once.
lxc.monitor.unshare
If not zero, the mount namespace will be unshared from the host
before initializing the container (before running any pre-
start hooks). This requires the CAP_SYS_ADMIN capability at
startup. Default is 0.
lxc.group
A multi-value key (can be used multiple times) to put the
container in a container group. Those groups can then be used
(amongst other things) to start a series of related
containers.
AUTOSTART AND SYSTEM BOOT
Each container can be part of any number of groups or no group at
all. Two groups are special. One is the NULL group, i.e. the
container does not belong to any group. The other group is the
"onboot" group.
When the system boots with the LXC service enabled, it will first
attempt to boot any containers with lxc.start.auto == 1 that are
members of the "onboot" group. The startup will be in order of
lxc.start.order. If an lxc.start.delay has been specified, that
delay will be honored before attempting to start the next container
to give the current container time to begin initialization and reduce
overloading the host system. After starting the members of the
"onboot" group, the LXC system will proceed to boot containers with
lxc.start.auto == 1 which are not members of any group (the NULL
group) and proceed as with the onboot group.
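A sketch of the entries for a container that should start at boot as
part of the "onboot" group. The order and delay values are arbitrary
illustrations:

```
# Start this container automatically
lxc.start.auto = 1
# Sort key used when several containers are auto-started
lxc.start.order = 10
# Wait 5 seconds after starting before the next container is started
lxc.start.delay = 5
# Membership in the special "onboot" group
lxc.group = onboot
```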
CONTAINER ENVIRONMENT
If you want to pass environment variables into the container (that
is, environment variables which will be available to init and all of
its descendants), you can use lxc.environment parameters to do so. Be
careful that you do not pass in anything sensitive; any process in
the container which doesn't have its environment scrubbed will have
these variables available to it, and environment variables are always
available via /proc/PID/environ.
This configuration parameter can be specified multiple times; once
for each environment variable you wish to configure.
lxc.environment
Specify an environment variable to pass into the container.
Example:
lxc.environment = APP_ENV=production
lxc.environment = SYSLOG_SERVER=192.0.2.42
In addition to the few examples given below, you will find some other
examples of configuration files in /usr/local/share/doc/lxc/examples.
NETWORK
This configuration sets up a container to use a veth pair device with
one side plugged to a bridge br0 (which has been configured before on
the system by the administrator). The virtual network device visible
in the container is renamed to eth0.
lxc.uts.name = myhostname
lxc.net.0.type = veth
lxc.net.0.flags = up
lxc.net.0.link = br0
lxc.net.0.name = eth0
lxc.net.0.hwaddr = 4a:49:43:49:79:bf
lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
UID/GID MAPPING
This configuration will map both user and group ids in the range
0-9999 in the container to the ids 100000-109999 on the host.
lxc.idmap = u 0 100000 10000
lxc.idmap = g 0 100000 10000
CONTROL GROUP
This configuration will set up several control groups for the
application: cpuset.cpus restricts usage to the defined CPUs,
cpu.shares prioritizes the control group, and devices.allow makes
the specified devices usable.
lxc.cgroup.cpuset.cpus = 0,1
lxc.cgroup.cpu.shares = 1234
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c 1:3 rw
lxc.cgroup.devices.allow = b 8:0 rw
COMPLEX CONFIGURATION
This example shows a complex configuration that builds a complex
network stack, uses control groups, sets a new hostname, mounts
some locations and changes the root file system.
lxc.uts.name = complex
lxc.net.0.type = veth
lxc.net.0.flags = up
lxc.net.0.link = br0
lxc.net.0.hwaddr = 4a:49:43:49:79:bf
lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
lxc.net.0.ipv6.address = 2003:db8:1:0:214:5432:feab:3588
lxc.net.1.type = macvlan
lxc.net.1.flags = up
lxc.net.1.link = eth0
lxc.net.1.hwaddr = 4a:49:43:49:79:bd
lxc.net.1.ipv4.address = 10.2.3.4/24
lxc.net.1.ipv4.address = 192.168.10.125/24
lxc.net.1.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3596
lxc.net.2.type = phys
lxc.net.2.flags = up
lxc.net.2.link = dummy0
lxc.net.2.hwaddr = 4a:49:43:49:79:ff
lxc.net.2.ipv4.address = 10.2.3.6/24
lxc.net.2.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3297
lxc.cgroup.cpuset.cpus = 0,1
lxc.cgroup.cpu.shares = 1234
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c 1:3 rw
lxc.cgroup.devices.allow = b 8:0 rw
lxc.mount.fstab = /etc/fstab.complex
lxc.mount.entry = /lib /root/myrootfs/lib none ro,bind 0 0
lxc.rootfs.path = dir:/mnt/rootfs.complex
lxc.cap.drop = sys_module mknod setuid net_raw
lxc.cap.drop = mac_override
chroot(1), pivot_root(8), fstab(5), capabilities(7)
lxc(7), lxc-create(1), lxc-copy(1), lxc-destroy(1), lxc-start(1),
lxc-stop(1), lxc-execute(1), lxc-console(1), lxc-monitor(1),
lxc-wait(1), lxc-cgroup(1), lxc-ls(1), lxc-info(1), lxc-freeze(1),
lxc-unfreeze(1), lxc-attach(1), lxc.conf(5)
Daniel Lezcano <daniel.lezcano@free.fr>
This page is part of the lxc (Linux containers) project. Information
about the project can be found at ⟨http://linuxcontainers.org/⟩. If
you have a bug report for this manual page, send it to
lxc-devel@lists.linuxcontainers.org. This page was obtained from the
project's upstream Git repository ⟨git://github.com/lxc/lxc⟩ on
2018-02-02. (At that time, the date of the most recent commit that
was found in the repository was 2018-02-01.) If you discover any
rendering problems in this HTML version of the page, or you believe
there is a better or more up-to-date source for the page, or you have
corrections or improvements to the information in this COLOPHON
(which is not part of the original manual page), send a mail to
man-pages@man7.org
2018-02-02 LXC.CONTAINER.CONF(5)