# Device Plugin
Kubernetes v1.8 introduced Device Plugins, an alpha feature that extends support for hardware devices such as GPUs, FPGAs, high-performance NICs, and InfiniBand. With Device Plugins, hardware vendors only need to implement a device plugin against the Device Plugin interface; there is no longer any need to modify the Kubernetes core code.

The feature was promoted to Beta in v1.10.
Before using Device Plugins, the `DevicePlugins` feature gate must be enabled by configuring `--feature-gates=DevicePlugins=true` (it is disabled by default).
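As a sketch, enabling the gate on a kubelet launched from the command line could look like this (all other flags omitted):

```sh
# Enable the DevicePlugins feature gate on the kubelet
# (all other kubelet flags omitted for brevity)
kubelet --feature-gates=DevicePlugins=true ...
```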
A Device Plugin is, in essence, a gRPC service that implements methods such as `ListAndWatch()` and `Allocate()`, serving them over a Unix socket in the `/var/lib/kubelet/device-plugins/` directory, such as `/var/lib/kubelet/device-plugins/nvidiaGPU.sock`.
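For illustration, a minimal device plugin server might look like the following Go sketch. It assumes the v1beta1 device plugin API package (`k8s.io/kubernetes/pkg/kubelet/apis/deviceplugin/v1beta1` at the time; later API revisions add further methods); `examplePlugin`, `example.sock`, `dev-0`, and `EXAMPLE_VISIBLE_DEVICES` are made-up names, and device discovery is stubbed out:

```go
// A minimal sketch of a device plugin gRPC server, assuming the v1beta1
// device plugin API. All names (examplePlugin, example.sock, dev-0,
// EXAMPLE_VISIBLE_DEVICES) are hypothetical; discovery is stubbed out.
package main

import (
	"context"
	"net"
	"os"
	"path"

	"google.golang.org/grpc"
	pluginapi "k8s.io/kubernetes/pkg/kubelet/apis/deviceplugin/v1beta1"
)

type examplePlugin struct{}

// GetDevicePluginOptions reports that no extra options are required.
func (p *examplePlugin) GetDevicePluginOptions(ctx context.Context, e *pluginapi.Empty) (*pluginapi.DevicePluginOptions, error) {
	return &pluginapi.DevicePluginOptions{}, nil
}

// ListAndWatch streams the device list to the Kubelet and keeps the stream
// open so that health changes can be pushed later.
func (p *examplePlugin) ListAndWatch(e *pluginapi.Empty, s pluginapi.DevicePlugin_ListAndWatchServer) error {
	devs := []*pluginapi.Device{
		{ID: "dev-0", Health: pluginapi.Healthy}, // stubbed device
	}
	if err := s.Send(&pluginapi.ListAndWatchResponse{Devices: devs}); err != nil {
		return err
	}
	select {} // block forever; a real plugin re-sends on health changes
}

// Allocate tells the Kubelet how to expose the assigned devices to a
// container, e.g. via environment variables or device mounts.
func (p *examplePlugin) Allocate(ctx context.Context, req *pluginapi.AllocateRequest) (*pluginapi.AllocateResponse, error) {
	resp := &pluginapi.AllocateResponse{}
	for range req.ContainerRequests {
		resp.ContainerResponses = append(resp.ContainerResponses, &pluginapi.ContainerAllocateResponse{
			Envs: map[string]string{"EXAMPLE_VISIBLE_DEVICES": "dev-0"},
		})
	}
	return resp, nil
}

// PreStartContainer is a no-op here; the Kubelet only calls it when
// requested via DevicePluginOptions.
func (p *examplePlugin) PreStartContainer(ctx context.Context, req *pluginapi.PreStartContainerRequest) (*pluginapi.PreStartContainerResponse, error) {
	return &pluginapi.PreStartContainerResponse{}, nil
}

func main() {
	// Serve on a Unix socket under /var/lib/kubelet/device-plugins/.
	sock := path.Join(pluginapi.DevicePluginPath, "example.sock")
	os.Remove(sock)
	lis, err := net.Listen("unix", sock)
	if err != nil {
		panic(err)
	}
	srv := grpc.NewServer()
	pluginapi.RegisterDevicePluginServer(srv, &examplePlugin{})
	srv.Serve(lis) // registration with kubelet.sock happens separately
}
```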
When implementing your own Device Plugin, keep the following in mind:
- On startup, the plugin must register itself with the Kubelet through `/var/lib/kubelet/device-plugins/kubelet.sock`, reporting its own Unix socket, its API version, and its plugin name (in the `vendor-domain/resource` format, e.g. `nvidia.com/gpu`). The Kubelet then exposes these devices in the Node status for the scheduler to use (a minimal registration sketch follows this list).
- After startup, the plugin must keep reporting its device list to the Kubelet, allocate devices on demand, and monitor device health in real time.
- The plugin must also keep watching the Kubelet's status after startup and re-register itself whenever the Kubelet restarts. For example, the Kubelet wipes the `/var/lib/kubelet/device-plugins/` directory when it starts, so a plugin can watch for its own Unix socket being deleted and re-register in response.
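The registration call itself is small. Here is a hedged sketch using the same v1beta1 API package; `example.sock` and `example.com/dev` are placeholders:

```go
// A sketch of plugin self-registration with the Kubelet, assuming the
// v1beta1 device plugin API; socket and resource names are placeholders.
package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"
	pluginapi "k8s.io/kubernetes/pkg/kubelet/apis/deviceplugin/v1beta1"
)

func register() error {
	// kubelet.sock is a Unix socket, so dial it explicitly as one.
	conn, err := grpc.Dial(pluginapi.KubeletSocket,
		grpc.WithInsecure(),
		grpc.WithContextDialer(func(ctx context.Context, addr string) (net.Conn, error) {
			return (&net.Dialer{}).DialContext(ctx, "unix", addr)
		}))
	if err != nil {
		return err
	}
	defer conn.Close()

	// Report the plugin's API version, its socket name under
	// /var/lib/kubelet/device-plugins/, and its resource name.
	_, err = pluginapi.NewRegistrationClient(conn).Register(context.Background(),
		&pluginapi.RegisterRequest{
			Version:      pluginapi.Version,
			Endpoint:     "example.sock",
			ResourceName: "example.com/dev",
		})
	return err
}

func main() {
	if err := register(); err != nil {
		log.Fatalf("registration failed: %v", err)
	}
}
```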
Device Plugins are normally recommended to be deployed as DaemonSets, with `/var/lib/kubelet/device-plugins` mounted into the container as a volume. They can also be deployed manually, but then there is no automatic recovery when they fail.
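A trimmed-down DaemonSet spec illustrating that volume mount might look like this (all names and the image are placeholders):

```yaml
# Sketch of a device plugin DaemonSet; names and image are hypothetical.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-device-plugin
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: example-device-plugin
  template:
    metadata:
      labels:
        name: example-device-plugin
    spec:
      containers:
      - name: example-device-plugin
        image: example.com/device-plugin:1.0
        volumeMounts:
        - name: device-plugin
          # Gives the plugin access to kubelet.sock and lets it create
          # its own socket in the same directory.
          mountPath: /var/lib/kubelet/device-plugins
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins
```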
NVIDIA provides a GPU device plugin built on the Device Plugin interface: NVIDIA/k8s-device-plugin.
To compile:
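The build commands are not preserved in this copy; building an image from the repository presumably looked something like this (the image tag is illustrative):

```sh
git clone https://github.com/NVIDIA/k8s-device-plugin
cd k8s-device-plugin
docker build -t nvidia/k8s-device-plugin:1.10 .
```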
To deploy:
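The deployment command is likewise missing here; the upstream repository ships a DaemonSet manifest, so deployment presumably amounted to something like this (the release tag in the URL is an assumption):

```sh
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.10/nvidia-device-plugin.yml
```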
To request GPU resources when creating a Pod:
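The Pod manifest is also missing from this copy; GPUs are requested through the `nvidia.com/gpu` resource in a container's limits, roughly like this (pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-pod
spec:
  containers:
  - name: cuda-container
    image: nvidia/cuda:9.0-base
    command: ["sleep", "infinity"]
    resources:
      limits:
        nvidia.com/gpu: 1 # request one GPU
```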
**Please note: when using this plugin, you need to configure nvidia-docker 2.0 and set nvidia as the default runtime (i.e., configure the Docker daemon with `--default-runtime=nvidia`).** The installation of nvidia-docker 2.0 (shown below for Ubuntu Xenial; refer to the nvidia-docker documentation for other systems) looks something like this:
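The original commands are not preserved in this copy; based on the upstream nvidia-docker instructions of that era, the installation looked roughly like this:

```sh
# Add the nvidia-docker package repositories (shown for Ubuntu Xenial)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
```

Making nvidia the default runtime then amounts to an `/etc/docker/daemon.json` along these lines, followed by restarting the Docker daemon:

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```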
GCP also provides a GPU device plugin, but it works only on Google Container Engine. See GoogleCloudPlatform/container-engine-accelerators for details.