Nick Moffitt
on 15 April 2015
When we design our charms, we typically know the sources of information we have in mind (configuration settings, relation data, etc.), and the actions we want the charm to take based on them. As charm complexity grows, we tend to illustrate the resulting state machine’s conditions and actions as a directed graph, plotting out the course changes the charm can take as circumstances change.
When we sit down to write the hooks, though, we need to translate our elegant little design into the grubby details of how Juju events fire. At times this can feel like a completely different language! The “impedance mismatch” between the two ways of thinking about charm hooks is frustrating, and something we’ve been working to solve since the beginning.
A lot has changed for charm authors in the past few years. Early charm hook scripts were often written in bash. The charmhelpers library helped to make coding hooks in Python more comfortable. Some adventurous developers are even experimenting with charms written in Go! Throughout this process, we’ve explored the shape of the tasks we want our hooks to perform, and tried to imagine what our ideal tool would look like.
The most important advance to date is the Services Framework, which is part of the charmhelpers suite of Python tools. For the first time, charm hooks are abstracted away into a mapping from data requirements to resulting actions. This model fits the way we reason about the decisions made in our Juju-driven orchestration, and helps us understand our charms better.
Rewriting Charms with the Services Framework
Canonical IS maintain a number of WordPress instances, and our squad was given the project of modernising the cluster of charms around our deployments. We wanted the services to support theme and plug-in management via Juju, and we needed horizontal scaling of the application servers and front-ends. The scaling feature brought with it a requirement for shared object storage on the application servers, which we added as well.
Despite this increase in capabilities, we found that the new framework-driven wordpress, apache2-subordinate and squid-reverseproxy charms were roughly ⅓ the size of our older Python charms. We were able to prototype more quickly, and we found the new versions to be more reliable in our mojo tests. In many cases we were surprised to discover that it was easier and quicker to re-write an old charm in the Services Framework than it was to add functionality to or fix a bug in an older version!
An Example: squid-reverseproxy
One of the pieces in our Juju-deployed WordPress infrastructure is a front-end squid3 reverse proxy.[1] The front-end proxy is a vital piece for any dynamic Web site exposed to the public, and is often an inexpensive and easy place to scale horizontally when traffic and load increase.
There are several pieces of data that a reverse proxy needs to know, but for now we’ll focus on two in particular:
- The cache policy
- The details of the back-end servers to proxy for
The first piece of data tends to come from configuration choices made by the Juju administrator. One might juju set a number of values to select an aggressive policy for static images, and a more lenient set for dynamic Web content. This tends to be hand-tuned, even if some values are specified in proportion to the cloud unit’s total resources.
The second set of data comes from Juju relations with other services. By adding the relation between your front-end proxy service and your back-end appserver’s service, you can let the charms work out the details of which hosts and ports to connect to, and which hostnames to use for virtual hosting. In this case we’ll assume that the appserver’s service passes two URLs to the front-end proxy: one for the public face of the site, and one for the back-end connection.
Event Hooks are Tricky Business
So now we have two sources of data, both of which should trigger an action that writes out the /etc/squid3/squid.conf file and restarts the daemon. In a traditional hooks.py using straight charmhelpers we would need to manually perform the following:
- Write predicate functions to ensure that our cache policy settings are valid.
- Write predicate functions to ensure that each of the appserver units sent sufficient and valid relation data.
- Write a function to render a template out to /etc/squid3/squid.conf containing the peer stanzas for the appserver units, and ACL/port settings for the sites served by those peers.
- Write control logic in the hook functions to follow the standard charm idiom and sys.exit(0) if either of the predicate functions is unhappy. Otherwise call the rendering function.
- Wire up hook functions to specific events (config-changed, foo-relation-joined, etc.) using a hookenv.Hooks object.
This, of course, assumes you planned the flow of the hook actions from the start. Many such functions grow over time instead, and the inline if/else blocks form a tree of conditions that may span pages. It can be difficult to identify opportunities for refactoring such a system, and the ease of making one more small localised change can delay code clean-ups for some time.
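The manual steps listed above might look roughly like the following sketch. It deliberately avoids charmhelpers so that it runs anywhere: config and units are plain Python data standing in for the charm configuration and the relation data, the relation key names are assumptions, every function name is hypothetical, and the squid directives are illustrative rather than a complete configuration.

```python
from urllib.parse import urlparse

# Hypothetical names: a real charm would read these values via hookenv.
REQUIRED_CONFIG = ('refresh_min', 'refresh_percent', 'refresh_max')
REQUIRED_RELATION_KEYS = ('public_url', 'backend_url')


def config_is_valid(config):
    """Predicate: every cache policy setting is present and an integer."""
    return all(isinstance(config.get(key), int) for key in REQUIRED_CONFIG)


def units_are_valid(units):
    """Predicate: at least one appserver unit, each with both URLs set."""
    return bool(units) and all(
        all(unit.get(key) for key in REQUIRED_RELATION_KEYS) for unit in units)


def render_squid_conf(config, units):
    """Build the configuration text (a real hook would write it to disk)."""
    lines = ['http_port 80 accel']
    for index, unit in enumerate(units):
        backend = urlparse(unit['backend_url'])
        lines.append(
            'cache_peer {host} parent {port} 0 no-query originserver '
            'name=peer{n}'.format(host=backend.hostname,
                                  port=backend.port or 80, n=index))
    lines.append('refresh_pattern . {refresh_min} {refresh_percent}% '
                 '{refresh_max}'.format(**config))
    return '\n'.join(lines) + '\n'


def run_hook(config, units):
    """Control logic repeated in every hook: bail out quietly, or render."""
    if not config_is_valid(config) or not units_are_valid(units):
        return None  # a real hook would sys.exit(0) and wait for more data
    return render_squid_conf(config, units)
```

Even in this reduced form, the readiness checks and the rendering are tangled together in run_hook(), and every new data source adds another condition to it.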
Do What I Mean
It would be far more comfortable if you could write the rendering function above and wire it up to changes in the configuration and relation variables you care about. Let’s say, something like the following:
ServiceManager([{
    'service': 'squid3',
    'ports': [80],
    'required_data': [
        RequiredConfig('refresh_max', 'refresh_min',
                       'refresh_percent'),
        RelationContext(name='website',
                        additional_required_keys=['services'])],
    'data_ready': [
        install_packages,
        render_template(source='squid-conf.j2',
                        target='/etc/squid3/squid.conf')],
}]).manage()
I have a ServiceManager class that accepts a list of dictionaries defining services my charm provides. I could define more for the other things a charm needs to do, like wiring up monitoring or log rotation hooks. I could add a new service based on the peer relation to make use of any clustering features the daemon may provide. I could just make a new “service” that happens to map to a corner case in my infrastructure, where specific action needs to be taken.
This ServiceManager could infer a lot of things from the data provided. It could automatically wire up start and stop events to run service squid3 start or stop. It could register port 80 for public access when you run juju expose squid3. It might compare the old configuration settings and the services data in the website relation across all units, and only fire off the data_ready hooks when something relevant changed. Finally, it could provide all the required_data to the template being rendered.
All of this would remove boiler-plate code that charm authors copy around and pay little attention to. It changes the focus from “Are we in the config-changed hook or the website-relation-changed hook?” to “Did the values we care about change just now?”
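The question “did the values we care about change?” can be pictured with a small comparison of two snapshots. This is only a sketch under the assumption that the previous values were saved somewhere between hook runs; the function name and keys are hypothetical, and it is not a description of how the ServiceManager is implemented.

```python
def relevant_changes(previous, current, watched_keys):
    """Return the watched keys whose values differ between two snapshots."""
    return sorted(key for key in watched_keys
                  if previous.get(key) != current.get(key))


previous = {'refresh_min': 0, 'refresh_max': 4320, 'log_level': 'info'}
current = {'refresh_min': 0, 'refresh_max': 8640, 'log_level': 'debug'}

# Only the cache policy settings matter to the squid.conf template,
# so the change to log_level is ignored.
changed = relevant_changes(previous, current, ['refresh_min', 'refresh_max'])
print(changed)
```

Which hook delivered the new values never enters into it; only the watched keys do.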
Where do I get this?
You’re in luck, because the code above is possible today using the Services Framework. The way the example used it is a bit contrived and less than optimal, so let’s take a look at the parts in detail.
The ServiceManager
The primary driver of the Services Framework is the ServiceManager() class. Its main entry point is the manage() method, which causes it to inspect all of its defined services and fire callbacks when their conditions are met. It has become traditional to create this object in a file called services.py under the hooks/ directory, with the manage() call inside an if __name__ == '__main__': block.
Since the question of which hook is being called is less important now, we can wire up nearly all of the events we expect to receive to this services.py file:
ubuntu@myhost:~/mynewcharm/hooks/$ ls -l
-rw-rw-r-- 1 ubuntu ubuntu 5983 Feb 24 17:27 actions.py
-rw-rw-r-- 1 ubuntu ubuntu 1073 Feb 24 17:27 my_helpers.py
drwxrwxr-x 6 ubuntu ubuntu 4096 Feb 24 17:27 charmhelpers/
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 config-changed -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 install -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 nrpe-external-master-relation-joined -> services.py*
-rw-rw-r-- 1 ubuntu ubuntu 1156 Feb 24 17:27 services.py
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 start -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 stop -> services.py*
drwxrwxr-x 2 ubuntu ubuntu 4096 Jan 21 09:54 tests/
lrwxrwxrwx 1 ubuntu ubuntu 2838 Jan 21 09:54 upgrade-charm*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 webservice-relation-broken -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 webservice-relation-changed -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 webservice-relation-departed -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 webservice-relation-joined -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 website-relation-broken -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 website-relation-changed -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 website-relation-departed -> services.py*
lrwxrwxrwx 1 ubuntu ubuntu 8 Jan 21 09:54 website-relation-joined -> services.py*
Notice that we’ve aimed symbolic links at services.py for all of the relation hooks, the install and config-changed hooks, as well as the start and stop hooks. The only hook we haven’t used the Services Framework for is upgrade-charm. This is because the upgrade-charm hook is far more difficult to write in a declarative style, depending as it does on the path between two particular versions of the charm.
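Creating that many symbolic links by hand is tedious, so a short script can do it. The following is a sketch using only the Python standard library; the link_hooks() helper and the hook list are hypothetical, and the demonstration runs in a scratch directory rather than a real charm.

```python
import os
import tempfile

# Hooks to route through services.py; upgrade-charm is deliberately absent.
HOOKS = ['install', 'config-changed', 'start', 'stop',
         'website-relation-joined', 'website-relation-changed',
         'website-relation-departed', 'website-relation-broken']


def link_hooks(hooks_dir, hook_names, target='services.py'):
    """Point each named hook at services.py with a relative symlink."""
    for name in hook_names:
        path = os.path.join(hooks_dir, name)
        if not os.path.lexists(path):  # leave any existing hook alone
            os.symlink(target, path)


# Demonstrate in a scratch directory standing in for hooks/.
scratch = tempfile.mkdtemp()
open(os.path.join(scratch, 'services.py'), 'w').close()
link_hooks(scratch, HOOKS)
```

Because the links are relative, they keep working wherever the charm directory is unpacked.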
Actions
All of the work done on the unit is bundled into functions called actions. In our data_ready section above, we called render_template(), passing in the source template name and destination path. This action generator actually exists in the services.helpers module, along with other action functions to do things like start and stop services.
But let’s take a look at the simpler install_packages() action for now:
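from charmhelpers import fetch
from charmhelpers.core import hookenv
config = hookenv.config()  # assumed source of the config mapping used below

def install_packages(service_name):
    if 'hold' in config['package_status']:
        fetch.apt_hold(service_name)
    return fetch.apt_install(service_name)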
This function is simple enough and trivially idempotent, so we just let it run every time an event fires. It has its own self-contained logic to determine if it’s meant to put the package on hold.
It also trusts that the service_name parameter will hold the package name. This works in our case as we set the service entry in the ServiceManager()’s data structure to squid3, and that gets passed in to all actions.
More Elaborate Actions
The built-in render_template() action makes use of more advanced features, which it gained by subclassing the ManagerCallback class. We were able to instantiate it with the template location and output path, and the template context dictionary was pre-populated with all of the information that was checked in the required_data section. This gave our templates access to all of the values from the charm configuration (thanks to the RequiredConfig object) and the website relation (thanks to the RelationContext object).
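To see how a callback object can pre-populate a template context, consider the following standalone sketch. It does not use charmhelpers: FakeManager and RenderText are hypothetical stand-ins, the (manager, service_name, event_name) call signature is an assumption modelled on the description above, and string.Template replaces the real templating engine.

```python
from string import Template


class FakeManager(object):
    """Hypothetical stand-in for the manager: looks up service definitions."""
    def __init__(self, services):
        self.services = {s['service']: s for s in services}

    def get_service(self, service_name):
        return self.services[service_name]


class RenderText(object):
    """Callable action that fills a template from every required_data dict."""
    def __init__(self, template):
        self.template = Template(template)
        self.output = None  # a real action would write to a target path

    def __call__(self, manager, service_name, event_name):
        context = {}
        service = manager.get_service(service_name)
        for item in service.get('required_data', []):
            context.update(item)  # each item is a dict, so merging is easy
        self.output = self.template.substitute(context)


render = RenderText(
    'refresh_pattern . $refresh_min $refresh_percent% $refresh_max')
manager = FakeManager([{
    'service': 'squid3',
    'required_data': [{'refresh_min': 0, 'refresh_percent': 20,
                       'refresh_max': 4320}],
    'data_ready': [render],
}])
render(manager, 'squid3', 'data_ready')
print(render.output)
```

The action is configured once, when the service list is built, and only learns which service it belongs to when it is called.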
Required Data
The real magic of the Services Framework, though, comes from its abstraction of the event/hook model. When we can specify the config and relation data we’re waiting for, we don’t need to fire our actions until they’re ready. The way this is achieved by the ServiceManager() is surprisingly elegant.
All of the items in the required_data list are either dicts or subclasses of dict. If any one of these objects is “falsey”[2], then the service is not ready and none of the data_ready actions fire.[3] This allows us to write our own __bool__() methods to let our code decide if our desired conditions are met.
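The readiness rule can be modelled in a few lines. This is a sketch of the idea only, not the framework’s implementation: fire_ready_services() is a hypothetical function, and the actions here receive just the service name.

```python
def fire_ready_services(services):
    """Run data_ready actions for services whose required_data is all truthy."""
    fired = []
    for service in services:
        if all(bool(item) for item in service.get('required_data', [])):
            for action in service.get('data_ready', []):
                action(service['service'])
            fired.append(service['service'])
    return fired


log = []
services = [
    {'service': 'squid3',
     'required_data': [{'refresh_max': 4320}, {'services': 'appserver'}],
     'data_ready': [log.append]},
    {'service': 'waiting',
     'required_data': [{'refresh_max': 4320}, {}],  # empty dict: not ready
     'data_ready': [log.append]},
]
print(fire_ready_services(services))
```

The second service never fires because one of its required_data items is an empty dict, which Python treats as false.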
Apparmor Won’t Always Work in an LXC (yet!)
Writing a custom required data object can be simple. While working on the Apache2 subordinate charm, we discovered that apparmor didn’t behave the way we expected when we deployed our charms in LXC containers on our laptops. To work around this, we made a custom NotLxc() object to put in the required_data list for the apparmor-related actions.
It went a little like this:
class NotLxc(dict):
    """ In LXC containers, the `/run/container_type` file will
    contain the string 'lxc', and outside containers the file
    may not exist at all. """
    def __init__(self):
        try:
            with open('/run/container_type') as f:
                self['container_type'] = f.read().strip()
        except IOError:
            self['container_type'] = 'bare metal'

    def __bool__(self):
        return 'lxc' not in self['container_type']
    __nonzero__ = __bool__
Making custom objects like this is simple:
- Subclass dict, and put some values in self['something'] to flesh out the dictionary.
- Override __bool__() and __nonzero__() to define your own “is it ready?” condition, instead of the default “Is this dict empty?” behaviour.
Now not only do we have a NotLxc class that can prevent actions from running on containers that don’t support them, but we also have placed that container_type variable into the context for the render_template() action. And most of the actual work here was spent handling errors from open()!
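The same two-step recipe works for other conditions. As a further sketch, here is a hypothetical FileExists object (not part of charmhelpers) that holds a service back until a particular file is present; the file name is made up for the demonstration.

```python
import os
import tempfile


class FileExists(dict):
    """Hypothetical required_data item: ready only once a file is present."""
    def __init__(self, path):
        super(FileExists, self).__init__(
            path=path, exists=os.path.exists(path))

    def __bool__(self):
        # Without this override the dict would always be truthy,
        # because it is never empty.
        return self['exists']
    __nonzero__ = __bool__  # Python 2 spelling of the same hook


scratch = tempfile.mkdtemp()
present = os.path.join(scratch, 'theme.tar.gz')
open(present, 'w').close()

print(bool(FileExists(present)))                           # True
print(bool(FileExists(os.path.join(scratch, 'missing'))))  # False
```

As with NotLxc, the values stored in the dict (path and exists) would also become available to any template rendered for the same service.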
The Future of the Framework
The Services Framework is ready right now for you to make writing charms easier, but it has some quirks that you should be aware of.
The one you notice most as your charm grows in complexity is that the framework is written with a sort of “client-side” execution model. The provided_data section (which we didn’t cover here) unconditionally sends all its information to its RelationContext() objects before it considers the required_data. It’s intended to announce information and collect the response from the other side, and then take action based on that.
So the Services Framework doesn’t help the other side of this conversation as much. Writing the “server” side of a relation protocol requires you to call the standard Charm Helpers hookenv.relation_set() function in your actions. It’s not a hardship, by any means, but it definitely feels like a missed opportunity. Doing this in a way that fits the framework best seems to involve making custom notify_relation_with_data() actions to send the responses, and putting them in special services inside your ServiceManager() list.
The New Charm Helpers Framework
Fortunately the authors of the Services Framework are keenly aware of such impedance mismatches, and have been working on refinements to the system. The Juju Big Data Development team have been working on the next version of this framework, which they’re calling The Charm Helpers Framework.
The new framework is exciting, but have no fear: all of the effort put into writing or converting charms to use the existing Services Framework will be rewarded even with the new system. The Charm Helpers Framework is merely a simpler and more flexible implementation of the same coding practices. Both systems help you write your charm hooks in the same way: the way you likely already tend to think about them.
1. If you’re not familiar, a “reverse” proxy is one that sits in front of a particular Web site and caches and/or load-balances requests from all over the Internet. The name arose because the first “forward” Web caches were originally used to speed up a local network’s outbound access to public Web pages.
2. Numerically zero items or empty collection types fail if tests in Python, even if they’re not strictly the False value. These are called “falsey” values in the Python community.
3. It’s not uncommon to add values to a template’s context dictionary by simply dropping a dict into required_data, and an empty dict is a quick way to disable a service unconditionally.