REALTIMEKIT Realtime Policy and Watchdog Daemon
GIT:
git://git.0pointer.de/rtkit.git
GITWEB:
http://git.0pointer.de/?p=rtkit.git
NOTES:
RealtimeKit is a D-Bus system service that changes the
scheduling policy of user processes/threads to SCHED_RR
(i.e. realtime scheduling mode) on request. It is intended to
be used as a secure mechanism to allow real-time scheduling to
be used by normal user processes.
RealtimeKit enforces strict policies when handing out
real-time scheduling to user threads:
* Only clients with RLIMIT_RTTIME set will get RT scheduling
* RT scheduling will only be handed out to processes with
SCHED_RESET_ON_FORK set to guarantee that the scheduling
settings cannot 'leak' to child processes, thus making sure
that 'RT fork bombs' cannot be used to bypass RLIMIT_RTTIME
and take the system down.
* Limits are enforced on all user-controllable resources: only
a maximum number of users, processes and threads can request RT
scheduling at the same time.
* Only a limited number of threads may be made RT in a
specific time frame.
* Client authorization is verified with PolicyKit
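As an illustration of the first policy above, here is a minimal
sketch of how a client might set RLIMIT_RTTIME on itself before
asking for RT scheduling. The helper name limit_rttime() and the
value of 200000 microseconds used in the note below are assumptions
made for this example; check the daemon's --help output for the
limit that actually applies on your system.

```c
#include <sys/resource.h>

/* Hypothetical helper: cap this process's RLIMIT_RTTIME at `usec`
 * microseconds. Returns 0 on success, -1 on error (errno is set by
 * getrlimit()/setrlimit()). */
static int limit_rttime(rlim_t usec) {
    struct rlimit rl;

    if (getrlimit(RLIMIT_RTTIME, &rl) < 0)
        return -1;

    /* Only ever lower the limits: an unprivileged process may not
     * raise its hard limit again. */
    if (rl.rlim_max == RLIM_INFINITY || rl.rlim_max > usec)
        rl.rlim_max = usec;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > rl.rlim_max)
        rl.rlim_cur = rl.rlim_max;

    return setrlimit(RLIMIT_RTTIME, &rl);
}
```

A client would call something like limit_rttime(200000) once at
startup. Because the helper only lowers limits, it leaves a stricter
limit that is already in place untouched.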
RealtimeKit can also be used to hand out high priority
scheduling (i.e. negative nice level) to user processes.
In addition to this a-priori policy enforcement, RealtimeKit
also provides a-posteriori policy enforcement, i.e. it
includes a canary-based watchdog that automatically demotes
all real-time threads to SCHED_OTHER should the system
overload despite the logic pointed out above. For more
information regarding canary-based RT watchdogs, see the
Acknowledgments section below.
In its duty to manage real-time scheduling *securely*,
RealtimeKit runs as an unprivileged user, and uses capabilities,
resource limits and chroot() to minimize its security impact.
RealtimeKit probably has little use in embedded or server use
cases; use RLIMIT_RTPRIO there instead.
WHY:
If processes that have real-time scheduling privileges enter a
busy loop they can freeze the entire system. To make sure
such runaway processes cannot do this, RLIMIT_RTTIME has been
introduced. Being a per-process limit, it is however easily
circumvented by combining a fork bomb with a busy loop.
RealtimeKit hands out RT scheduling to specific threads that
ask for it -- but only to those and due to SCHED_RESET_ON_FORK
it can be sure that this won't 'leak'.
In contrast to RLIMIT_RTPRIO the RealtimeKit logic makes sure
that only a certain number of threads can be made realtime,
per user, per process and per time interval.
CLIENTS:
To make use of realtime scheduling, clients may request it
through a small D-Bus interface that is accessible as
interface org.freedesktop.RealtimeKit1 on object
/org/freedesktop/RealtimeKit1 of the service
org.freedesktop.RealtimeKit1:
void MakeThreadRealtime(u64 thread_id, u32 priority);
void MakeThreadHighPriority(u64 thread_id, s32 priority);
The thread IDs need to be passed as kernel tids as returned by
gettid(), not as a pthread_t! Only threads belonging to the
calling process can be made realtime/high priority. (Please
note that glibc versions before 2.30 do not provide a gettid()
wrapper; there you need to implement it manually using
syscall(). Consult the reference client implementation for
details.)
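A minimal sketch of such a manual gettid(), assuming Linux and the
SYS_gettid syscall number from <sys/syscall.h>. The helper name
my_gettid() is made up for this example; it is chosen so that it
does not clash with the gettid() that newer glibc versions declare.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the kernel thread ID of the calling
 * thread, i.e. the value RealtimeKit expects as thread_id. */
static pid_t my_gettid(void) {
    return (pid_t) syscall(SYS_gettid);
}
```

In the main thread of a process the returned value equals getpid();
in any other thread it differs.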
A BSD-licensed reference implementation of the client is
available in rtkit.[ch] as part of the package. You may copy
this into your sources if you wish. However given how simple
the D-Bus interface is you might choose to implement your own
client implementation.
It is advisable to try acquiring realtime scheduling with
sched_setscheduler() first, so that systems where RLIMIT_RTPRIO
is set can be supported.
Here's an example using the reference implementation. Replace
this:
<snip>
struct sched_param p;
memset(&p, 0, sizeof(p));
p.sched_priority = 3;
sched_setscheduler(0, SCHED_RR|SCHED_RESET_ON_FORK, &p);
</snip>
by this:
<snip>
struct sched_param p;
memset(&p, 0, sizeof(p));
p.sched_priority = 3;
if (sched_setscheduler(0, SCHED_RR|SCHED_RESET_ON_FORK, &p) < 0
    && errno == EPERM)
        rtkit_make_realtime(system_bus, 0, p.sched_priority);
</snip>
But of course add more appropriate error checking! Also,
falling back to plain SCHED_RR when SCHED_RESET_ON_FORK causes
EINVAL might be advisable.
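The EINVAL fallback could look roughly like the sketch below. The
helper name try_direct_realtime() is hypothetical, and the fallback
definition of SCHED_RESET_ON_FORK is only there for older headers
that lack it. The D-Bus call itself is left out so that the example
stays self-contained.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sched.h>
#include <string.h>

#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* Hypothetical helper: try to switch the calling thread to SCHED_RR
 * directly. Returns 0 on success, or a negative errno value. */
static int try_direct_realtime(int priority) {
    struct sched_param p;

    memset(&p, 0, sizeof(p));
    p.sched_priority = priority;

    if (sched_setscheduler(0, SCHED_RR | SCHED_RESET_ON_FORK, &p) == 0)
        return 0;

    /* A kernel that does not know SCHED_RESET_ON_FORK rejects the
     * flag with EINVAL; retry with plain SCHED_RR in that case. */
    if (errno == EINVAL && sched_setscheduler(0, SCHED_RR, &p) == 0)
        return 0;

    return -errno;
}
```

A caller would treat a return value of -EPERM as the signal to ask
RealtimeKit instead, for example via rtkit_make_realtime() from the
reference client.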
DAEMON:
The daemon is automatically started on first use via D-Bus
system bus activation.
Currently the daemon does not read any configuration file;
however, it can be configured with command line parameters. You
can edit
/usr/share/dbus-1/system-services/org.freedesktop.RealtimeKit1.service
to set those.
Run
/usr/libexec/rtkit-daemon --help
to get a quick overview of the supported parameters and their
defaults. Many of them should be obvious in their meaning. For
the remaining ones see below:
--max-realtime-priority= may be used to specify the maximum
realtime priority a client can acquire through
RealtimeKit. Please note that this value must be smaller than
the value passed to --our-realtime-priority=.
--our-realtime-priority= may be used to specify the realtime
priority of the daemon itself. Please note that this priority
is only used for a very short time while processing a client
request. Normally the daemon will not be running with a
realtime scheduling policy. The real-time priorities handed
out to the user must be lower than this value (see above).
--min-nice-level= may be used to specify the minimum nice
level a client can acquire through RealtimeKit.
--our-nice-level= may be used to specify the nice level the
daemon itself uses most of the time (except when
processing requests, see above). It is probably a good idea to
set this to a small positive value, to make sure that if the
system is overloaded already, handing out further RT scheduling
will be delayed a bit.
--rttime-usec-max= may be used to control which RLIMIT_RTTIME
value clients need to have chosen at minimum before they may
acquire RT scheduling through RealtimeKit.
--users-max= specifies how many users may acquire RT
scheduling at the same time for one or multiple of their
processes.
--processes-per-user-max= specifies how many processes per
user may acquire RT scheduling at the same time.
--threads-per-user-max= specifies how many threads per user
may acquire RT scheduling at the same time. Of course this
value should be set higher than --processes-per-user-max=.
--actions-burst-sec= may be used to influence the rate
limiting logic in RealtimeKit. The daemon will only pass
realtime scheduling privileges to a maximum number of threads
within this timeframe (see below).
--actions-per-burst-max= may be used to influence the rate
limiting logic in RealtimeKit. The daemon will only pass
realtime scheduling privileges to this number of threads
within the time frame configured via
--actions-burst-sec=. When this limit is reached clients need
to wait until that time passes before requesting RT scheduling
again.
--canary-cheep-msec= may be used to control how often the
canary thread shall cheep.
--canary-watchdog-msec= may be used to control how quickly the
watchdog thread expects to receive a cheep from the canary
thread. This value must be chosen larger than
--canary-cheep-msec=. If the former is set to 10s and the latter
to 7s, then the canary thread can trigger and deliver the
cheep with a maximum latency of 3s.
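To make the canary/watchdog relationship more concrete, here is a
small sketch of the general technique. It assumes an eventfd as the
channel between the two threads; that choice and the function names
are illustrative and not taken from the daemon's sources.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Canary side: signal "still alive" by bumping the eventfd counter.
 * Returns 0 on success, -1 on error. */
static int canary_cheep(int efd) {
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t) sizeof(one) ? 0 : -1;
}

/* Watchdog side: wait up to watchdog_msec for a cheep. Returns 1 if
 * a cheep arrived, 0 on timeout (the canary was starved), -1 on
 * error. */
static int watchdog_wait(int efd, int watchdog_msec) {
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    uint64_t count;
    int r;

    r = poll(&pfd, 1, watchdog_msec);
    if (r < 0)
        return -1;
    if (r == 0)
        return 0;

    /* Reading resets the counter, so the next wait needs a fresh
     * cheep. */
    return read(efd, &count, sizeof(count)) == (ssize_t) sizeof(count) ? 1 : -1;
}
```

In such a design the canary would run at normal priority and call
canary_cheep() every --canary-cheep-msec= milliseconds, while the
watchdog would run at a high real-time priority and call
watchdog_wait() with the --canary-watchdog-msec= value. A return
value of 0 is the point where all real-time threads would be demoted
to SCHED_OTHER.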
ACKNOWLEDGMENTS:
The canary watchdog logic is inspired by previous work of
Vernon Mauery, Florian Schmidt, Kjetil Matheussen:
http://rt.wiki.kernel.org/index.php/RT_Watchdog
LICENSE:
GPLv3+ for the daemon
BSD for the client reference implementation
AUTHOR:
Lennart Poettering
REQUIREMENTS:
Linux kernel >= 2.6.31
D-Bus
PolicyKit >= 0.92