Embeddability? I am quite certain this is in fact an English word, but you probably won’t find it among the other -ilities, qualities software might want to address. Among them there are some boring and pretty self-explanatory ones like maintainability, even some dogmatic ones like correctness[1], but fortunately also funnier ones like extensibility.
And like so often, how a certain quality is best achieved depends on a plethora of things, but according to Wikipedia, one way to achieve extensibility is to use scripting languages, and everything finally comes together: they can be quite embeddable.
So in case you have some time to kill, join me on a lengthy journey through 20+ years of personal FOSS history. We are having a look at different approaches to embedding and also see why this is always a great idea, plus there are memes.
Unbeknownst to my past self, I had my first experience with this kind of extensibility in 2004, when I started my long journey with Xlib. During that time I started a project called deskbar with the lofty goal to print system information like CPU load, battery usage etc. directly onto the root window of the X session. There were plenty of alternatives like GKrellM readily available, but who in their right mind prefers pre-built stuff over rolling your own[2]?
The initial idea was just to include everything in one binary, but I quickly discovered that the ergonomics of re-compiling and shipping everything together are annoying, so I switched to a simple plugin system.
I would have loved to show some screenshots of deskbar in action here, but unfortunately, after messing with the infamous Autotools and trying to compile old C code with a modern compiler, this is as far as I got[3]:
$ ./configure && make
deskbar 0.1
-----------------
Build with ZLIB support.......: yes
Build with PNG support........: yes
Plugins:
Common Plugins................: Clock CPU Date
Battery Plugin................: no
XMMS Plugin...................: no (1)
BMP Plugin....................: no (2)
Debug Plugin..................: no
The binary will be installed in /usr/local/bin,
the lib in /usr/local/lib and the plugins
in /usr/local/lib/deskbar.
Try make now, good luck!
make all-recursive
make[1]: Entering directory '/home/unexist/build/deskbar-0.1'
# --- %< --- snip --- %< ---
/bin/bash ../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I.. -g -O2 -I/usr/include -I/usr/include -MT htable.lo -MD -MP -MF .deps/htable.Tpo -c -o htable.lo htable.c
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I.. -g -O2 -I/usr/include -I/usr/include -MT htable.lo -MD -MP -MF .deps/htable.Tpo -c htable.c -fPIC -DPIC -o .libs/htable.o
In file included from htable.c:2:
/usr/include/string.h:466:13: error: storage class specified for parameter 'explicit_bzero'
466 | extern void explicit_bzero (void *__s, size_t __n) __THROW __nonnull ((1)) (3)
| ^~~~~~~~~~~~~~
/usr/include/string.h:471:14: error: storage class specified for parameter 'strsep'
471 | extern char *strsep (char **__restrict __stringp,
| ^~~~~~
/usr/include/string.h:478:14: error: storage class specified for parameter 'strsignal'
478 | extern char *strsignal (int __sig) __THROW;
| ^~~~~~~~~
# --- %< --- snip --- %< ---
make[2]: *** [Makefile:457: htable.lo] Error 1
make[2]: Leaving directory '/home/unexist/build/deskbar-0.1/libdeskbar'
make[1]: *** [Makefile:479: all-recursive] Error 1
make[1]: Leaving directory '/home/unexist/build/deskbar-0.1'
make: *** [Makefile:374: all] Error 2
| 1 | X Multimedia System (XMMS) |
| 2 | I can only guess what it was supposed to do, since the plugin is just an empty stub that returns NULL |
| 3 | Yes, oh well… |
Nevertheless, this output clearly proves there has been a plugin system with conditional compilation, based solely on linking magic, so we have to move on.
Everything in C is a bit more complicated, so let us ignore the scary memory handling and just talk about the two interesting calls dlopen and dlsym:
DbPlugElement *element = NULL;
element = (DbPlugElement *) malloc (sizeof (DbPlugElement));
snprintf (buf, sizeof (buf), "%s/%s.so", PLUGIN_DIR, file);
element->handle = dlopen (buf, RTLD_LAZY); (1)
if ((err = dlerror ())) (2)
{
db_log_err ("Cannot load plugin `%s'\n", file);
db_log_debug ("dlopen (): %s\n", err);
free (element);
return;
}
/* Get entrypoint and call it */
entrypoint = dlsym (element->handle, "db_plug_init"); (3)
element->data = (*entrypoint) (); (4)
| 1 | Load the named shared object from path |
| 2 | dlerror, the rarely mentioned third call, reports what went wrong in the previous dl* call |
| 3 | Find the address of a named entrypoint |
| 4 | Execute the entrypoint for profit |
The entrypoint here is quite interesting, since the main application cannot know what is included
in the plugin or even what is exported.
Following the idea of convention-over-configuration, the defined contract here
expects a symbol named db_plug_init inside a plugin, which is called on load and must return
a pointer to an initialized struct of type DbPlug:
static DbPlug plugin =
{
"Battery", /* Plugin name */
battery_create, /* Plugin create function */
battery_update, /* Plugin update function */
battery_destroy, /* Plugin destroy function */
&data, /* Plugin data */
NULL, /* Plugin format */
3600 /* Plugin update interval */
};
DbPlug *
db_plug_init (void)
{
plug = &plugin;
return (&plugin); (1)
}
| 1 | Pass the local address back to the main application |
Once loaded, the plugin is called at the given interval and can exchange data with the main application.
void
battery_update (void)
{
int capacity = 0,
percent = 0;
char buf[100], state[20];
/* Get battery info */
if (!fd1)
{
snprintf (buf, sizeof (buf), "/proc/acpi/battery/BAT%d/state", bat_slot); (1)
fd1 = fopen (buf, "r");
memset (buf, 0, sizeof (buf));
}
else
fseek (fd1, 0, SEEK_SET);
/* --- %< --- snip --- %< --- */
}
| 1 | Here the battery plugin checks the battery values from the ACPI interface |
Allowing contributions this way is really easy and powerful, but like so often it comes with a catch. Segmentation faults, the bane of software engineering, don’t halt inside the plugin like they should, but wipe the board and kill the entire application.
I think Torvalds nailed it perfectly and I agree this should never happen:
Mauro, SHUT THE FUCK UP!
WE DO NOT BREAK USERSPACE!
I am kind of surprised how far I went in trying to keep problems in the plugins at bay. The original project included memory management[4] for plugins and also used the two calls I’d like to demonstrate next.
Handling segmentation faults properly is really difficult, and the common wisdom is to catch them and exit gracefully when possible. Still, there are cases when faults can be safely ignored, and a plugin interface is a prime example.
This can be done with the pair setjmp and longjmp, which for most practical purposes behave like a goto on steroids:
static int
save_call (DbPlugElement *element,
DbPlugFunc plugfunc,
const char *name)
{
if (plugfunc)
{
if (setjmp (env) == 0) (1)
plugfunc ();
else
{
db_log_mesg ("Ayyyee! Segmentation fault in plugin %s!\n", element->data->name); (2)
db_log_debug ("Call to %s () failed\n", name);
db_plug_unload (element);
return (1);
}
}
return (0);
}
| 1 | On the first pass, save stack and instruction pointer for later use; when we return here via longjmp, ditch the plugin |
| 2 | Well, different times back then.. |
When the application receives the bad signal SIGSEGV, it checks if there are stored stack
and instruction values and rewinds the stack accordingly:
static void
sig_handler (int sig)
{
switch (sig)
{
case SIGSEGV:
longjmp (env, 1); (1)
db_log_debug ("Something went wrong! Segmentation fault!\n");
db_sig_destroy ();
abort ();
break;
/* --- %< --- snip --- %< --- */
}
| 1 | Check the values and pass control if necessary; otherwise just bail out |
| Ease of use | Richness of API | Language agnostic | Error handling | Performance |
|---|---|---|---|---|
| Low; requires compilation and linking | The API is simple, but can be enriched by the host | No; requires plugins to be in C[5] | Arcane; requires stack unwinding | Runs natively, so pretty fast |
Three years later in 2007, I continued building upon my Xlib skills and started my long-lasting project subtle.
Over the years there have been many major breaking changes, from the initial design to the state it is currently in. Two of the changes relevant to this post were the integration of the scripting language Lua and its replacement with Ruby a few years later in this glorious issue #1.
I am not entirely sure where I picked Lua up, but I never played WoW, so probably from somewhere else, and I can only talk about the state and API from back then.
Adding a scripting language solves quite a few problems:
File loading and parsing can be offloaded to the language core
The language itself comes with a basic subset of things you can do with it
Bonus: Config handling can also be directly offloaded
My attempt of trying to compile the project and provide an actual screenshot this time ended quickly as well:
$ ./configure && make
# --- %< --- snip --- %< ---
subtle 0.7b
-----------------
Binary....................: /usr/local/bin
Sublets...................: /usr/local/share/subtle
Config....................: /usr/local/etc/subtle
Debugging messages........:
Try make now, good luck!
make all-recursive
make[1]: Entering directory '/home/unexist/build/subtle-0.7b'
Making all in src
# --- %< --- snip --- %< ---
if gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -g -O2 -I/usr/include/lua5.1 -g -O2 -MT subtle-event.o -MD -MP -MF ".deps/subtle-event.Tpo" -c -o subtle-event.o `test -f 'event.c' || echo './'`event.c; \
then mv -f ".deps/subtle-event.Tpo" ".deps/subtle-event.Po"; else rm -f ".deps/subtle-event.Tpo"; exit 1; fi
event.c: In function ‘subEventLoop’:
event.c:352:57: error: implicit declaration of function ‘subSubletSift’; did you mean ‘subSubletKill’? [-Wimplicit-function-declaration]
352 | subSubletSift(1);
| ^~~~~~~~~~~~~
| subSubletKill
make[2]: *** [Makefile:310: subtle-event.o] Error 1
make[2]: Leaving directory '/home/unexist/build/subtle-0.7b/src'
make[1]: *** [Makefile:233: all-recursive] Error 1
make[1]: Leaving directory '/home/unexist/build/subtle-0.7b'
make: *** [Makefile:171: all] Error 2
This is kind of embarrassing for an official release and I really have to question the quality in retrospect, but this won’t stop us now.
After a dive into the code there were some obvious problems and also blatant oversights, and if you are interested in the shameful truth, here is the silly patch:
And without further ado here is finally the screenshot of the scripting part in action, before we dive into how this is actually done under the hood:
Starting with the easy part, offloading the config handling was one of the first things I did and this made a config like this entirely possible:
-- Options config
font = {
face = "lucidatypewriter", -- Font face for the text
style = "medium", -- Font style (medium|bold|italic)
size = 12 -- Font size
}
-- Color config
colors = {
font = "#ffffff", -- Color of the font
border = "#ffffff", -- Color of the border/tiles
normal = "#CFDCE6", -- Color of the inactive windows
focus = "#6096BF", -- Color of the focussed window
shade = "#bac5ce", -- Color of shaded windows
background = "#596F80" -- Color of the root background
}
-- --- %< --- snip --- %< ---
Essentially, the C API of Lua is a stack machine, and interaction with it happens through pushing and popping values onto and from the stack.[6]
I’ve removed a bit of the fluff and checks upfront, so we can have a quick glance at the config loading and jump further into nitty-gritty details:
/* --- %< --- snip --- %< --- */
subLogDebug("Reading `%s'\n", buf);
if(luaL_loadfile(configstate, buf) || lua_pcall(configstate, 0, 0, 0)) (1)
{
subLogDebug("%s\n", (char *)lua_tostring(configstate, -1));
lua_close(configstate);
subLogError("Can't load config file `%s'.\n", buf);
}
/* --- %< --- snip --- %< --- */
/* Parse and load the font */
face = GetString(configstate, "font", "face", "fixed"); (2)
style = GetString(configstate, "font", "style", "medium");
size = GetNum(configstate, "font", "size", 12);
/* --- %< --- snip --- %< --- */
| 1 | Internal calls to load the config file and execute it in a safe way via pcall |
| 2 | Once everything is stored inside configstate we fetch required values |
#define GET_GLOBAL(configstate) do { \ (1)
lua_getglobal(configstate, table); \ (2)
if(lua_istable(configstate, -1)) \
{ \
lua_pushstring(configstate, field); \ (3)
lua_gettable(configstate, -2); \
} \
} while(0)
/* --- %< --- snip --- %< --- */
static char *
GetString(lua_State *configstate,
const char *table,
const char *field,
char *fallback)
{
GET_GLOBAL(configstate);
if(!lua_isstring(configstate, -1)) (4)
{
subLogDebug("Expected string, got `%s' for `%s'.\n", lua_typename(configstate, -1), field);
return(fallback);
}
return((char *)lua_tostring(configstate, -1)); (5)
}
| 1 | Blocks in C macros require this fancy hack; probably best to skip over it |
| 2 | We check and fetch a table[7] |
| 3 | Push the field name onto the stack; lua_gettable then replaces it with the corresponding value from the table at index -2 |
| 4 | Check that the value on top of the stack is actually a string |
| 5 | And convert it to our desired format |
Loading plugins at runtime is basically the same as loading the config upfront, so let us just move on to error handling, which is slightly more interesting. It is probably no surprise, but the API is quite rudimentary, and the handling of the stack and calls in case of an actual error is up to the person embedding the engine.
Before we can see how this is done, let us quickly check how our battery plugin evolved from the arcane C version to Lua glory. First of all, plugins have been rebranded to sublets[9], and (at least to me) it became a bit more readable:
-- Get remaining battery in percent
function battery:meter() (1)
local f = io.open("/proc/acpi/battery/BAT" .. battery.slot .. "/state", "r")
local info = f:read("*a")
f:close()
_, _, battery.remaining = string.find(info, "remaining capacity:%s*(%d+).*")
_, _, battery.rate = string.find(info, "present rate:%s*(%d+).*")
_, _, battery.state = string.find(info, "charging state:%s*(%a+).*")
return(math.floor(battery.remaining * 100 / battery.capacity))
end
| 1 | The : here is used as a kind of namespace separator and should be read as a global table called battery with the entry meter |
Once the sublet is loaded and initialized, we can call it analogously to our save_call from before:
void
subLuaCall(SubSublet *s)
{
if(s)
{
lua_settop(state, 0); (1)
lua_rawgeti(state, LUA_REGISTRYINDEX, s->ref);
if(lua_pcall(state, 0, 1, 0)) (2)
{
if(s->flags & SUB_SUBLET_FAIL_THIRD) (3)
{
subLogWarn("Unloaded sublet (#%d) after 3 failed attempts\n", s->ref);
subSubletDelete(s);
return;
}
else if(s->flags & SUB_SUBLET_FAIL_SECOND) s->flags |= SUB_SUBLET_FAIL_THIRD;
else if(s->flags & SUB_SUBLET_FAIL_FIRST) s->flags |= SUB_SUBLET_FAIL_SECOND;
subLogWarn("Failed attempt #%d to call sublet (#%d).\n",
(s->flags & SUB_SUBLET_FAIL_SECOND) ? 2 : 1, s->ref);
}
switch(lua_type(state, -1)) (4)
{
case LUA_TNIL: subLogWarn("Sublet (#%d) does not return any usable value\n", s->ref); break;
case LUA_TNUMBER: s->number = (int)lua_tonumber(state, -1); break;
case LUA_TSTRING:
if(s->string) free(s->string);
s->string = strdup((char *)lua_tostring(state, -1));
break;
default:
subLogDebug("Sublet (#%d) returned unknown type %s\n", s->ref, lua_typename(state, -1));
lua_pop(state, -1);
}
}
}
}
| 1 | A bit of stack setup upfront and retrieval of the stored sublet reference from the registry |
| 2 | Here we call lua_pcall, which abstracts and hides the nasty setjmp and longjmp handling from us |
| 3 | Looks like I discovered bitflags there and utilized it for error handling |
| 4 | Type handling for a more generic interface |
Fast-forwarding with subtle, I replaced Lua with Ruby after a while, which is an entirely different way of integration, but let us just stick to our recipe here and make one mistake after another.
This time we can keep it short and simple, since I am using it daily on several devices and can easily provide screenshots without messing with outdated and broken builds[10].
So when we finally start subtle everything comes together, and we see known pieces from the other projects before, which is more or less entirely the same.
Just feel free to skip the next few listings and join us later and for the ones remaining..
Just kidding, here is the promised triplet of loading info, config and the battery thingy:
$ subtle -d :2 -c subtle.rb -s sublets
subtle 0.12.6606 - Copyright (c) 2005-present Christoph Kappel
Released under the GNU General Public License
Compiled for X11R0 and Ruby 2.7.8
Display (:2) is 640x480
Running on 1 screen(s)
ruby: warning: already initialized constant TMP_RUBY_PREFIX
Reading file `subtle.rb'
Reading file `sublets/battery.rb'
Loaded sublet (battery)
Reading file `sublets/fuzzytime.rb'
Loaded sublet (fuzzytime)
The config looks a bit different, mainly because we are now using a custom DSL, but we are going to cover this part in detail shortly, promised.
# Style for all style elements
style :all do (1)
foreground "#757575"
background "#202020"
icon "#757575"
padding 0, 3
font "-*-*-*-*-*-*-14-*-*-*-*-*-*-*"
#font "xft:sans-8"
end
# Style for the all views
style :views do (2)
# Style for the active views
style :focus do
foreground "#fecf35"
end
# --- %< --- snip --- %< ---
end
| 1 | Ruby is famous for metaprogramming and we obviously make use of it here |
| 2 | Styles are a CSS-like way of configuring colors in subtle - batteries and inheritance included |
And lastly, a quick glimpse into the battery sublet, which naturally also makes use of the mentioned DSL:
on :run do |s|
begin (1)
now = IO.readlines(s.now).first.to_i
state = IO.readlines(s.status).first.chop
percent = (now * 100 / s.full).to_i
# --- %< --- snip --- %< ---
# Select icon for state
icon = case state (2)
when "Charging" then :ac
when "Discharging"
case percent
when 67..100 then :full
when 34..66 then :low
when 0..33 then :empty
end
when "Full" then :ac
else :unknown
end
s.data = "%s%s%s%d%%" % [
s.color_icon ? s.color : s.color_def, s.icons[icon],
s.color_text ? s.color : s.color_def, percent
]
rescue => err # Sanitize to prevent unloading
s.data = "subtle"
p err
end
end
| 1 | Ruby comes with exception handling and this eases the whole scripting part greatly |
| 2 | Aww, this kind of reminds of Rust <3 |
So when we talk about metaprogramming, what exactly is different here? If you have a closer look at the previous examples, we mostly defined data structures and methods there, which were later collected during load and/or actually called by the host application. In other words, our scripts defined an API according to the rules of the host application, which then ran it. With metaprogramming, we turn this around: we define methods and provide an API for our scripts to call.
The Ruby integration in subtle is quite vast, and there are many cool things I’d like to show, but time is precious, as is our attention span, and sobriety is in order. So we have to cut a few corners here and there and follow loads of indirection and abstraction, but I think we better stay with the styles excerpt from above.
Loading styles from the config consists of following basic building blocks:
void subRubyInit(void) {
VALUE config = Qnil, options = Qnil, sublet = Qnil;
/* --- %< --- snip --- %< --- */
config = rb_define_class_under(mod, "Config", rb_cObject); (1)
/* Class methods */
rb_define_method(config, "style", RubyConfigStyle, 1); (2)
/* --- %< --- snip --- %< --- */
}
| 1 | Define a holding class for our method definition |
| 2 | Define the actual method style and bind it to RubyConfigStyle |
void subRubyLoadConfig(void) {
VALUE klass = Qnil;
/* Load supplied config or default */
klass = rb_const_get(mod, rb_intern("Config")); (1)
config_instance = rb_funcall(klass, rb_intern("new"), 0, NULL);
rb_gc_register_address(&config_instance); (2)
if (Qfalse == RubyConfigLoadConfig(config_instance,
rb_str_new2(subtle->paths.config ? subtle->paths.config : PKG_CONFIG))) { (3)
subSubtleFinish();
exit(-1);
} else if (subtle->flags & SUB_SUBTLE_CHECK) {
printf("Syntax OK\n");
}
/* --- %< --- snip --- %< --- */
}
| 1 | Call back our config class and create a new instance |
| 2 | Take care that the internal garbage collector doesn’t get rid of it |
| 3 | Wrap it again and continue in the next snippet |
static VALUE RubyConfigLoadConfig(VALUE self, VALUE file) {
/* --- %< --- snip --- %< --- */
printf("Reading file `%s'\n", buf);
/* Carefully load and eval file */
rargs[0] = rb_str_new2(buf);
rargs[1] = self;
rb_protect(RubyWrapEvalFile, (VALUE) &rargs, &state); (1)
if (state) {
subSubtleLogWarn("Cannot load file `%s'\n", buf);
RubyBacktrace();
return Qfalse;
}
return Qtrue;
} /* }}} */
| 1 | Ruby uses its own version of setjmp and longjmp, so wrap everything up and pass it over |
/* RubyWrapEvalFile */
static VALUE RubyWrapEvalFile(VALUE data) {
VALUE *rargs = (VALUE *) data, rargs2[3] = {Qnil};
/* Wrap data */
rargs2[0] = rb_funcall(rb_cFile, rb_intern("read"), 1, rargs[0]); (1)
rargs2[1] = rargs[0];
rargs2[2] = rargs[1];
rb_obj_instance_eval(2, rargs2, rargs[1]); (2)
return Qnil;
} /* }}} */
| 1 | Then we use the internal symbol rb_cFile to call File#read on our arguments |
| 2 | And then a final eval - see we adhere to the motto! |
Actually we covered this already in the previous section, so nothing to be done here and we better hurry.
During the 2020s lots of weird things happened, and I was forced into my own sort of crisis, being stuck with and on macOS for some years. Needless to say the window management there totally annoyed me, and I started another highly ambitious project aptly named touchjs.
There, I tied the new[11] Touch Bar, basic window management via Accessibility API and a JavaScript integration based on duktape together.
Unfortunately we are back at build problems: Somehow, and totally inexplicably to me, I forgot to check in some essential headers of the project, which led to a full halt:
$ make
clang -c -mmacosx-version-min=10.12 -x objective-c src/touchjs.m -o src/touchjs.o
src/touchjs.m:17:10: fatal error: 'delegate.h' file not found
17 | #include "delegate.h"
| ^~~~~~~~~~~~
1 error generated.
make: *** [src/touchjs.o] Error 1
$ make
clang -c -mmacosx-version-min=10.12 -x objective-c src/touchbar.m -o src/touchbar.o
src/touchbar.m:57:23: error: use of undeclared identifier 'kQuit'
57 | [array addObject: kQuit];
| ^
src/touchbar.m:149:21: warning: class method '+presentSystemModalTouchBar:systemTrayItemIdentifier:' not found (return type defaults to 'id') [-Wobjc-method-access]
149 | [NSTouchBar presentSystemModalTouchBar: self.groupTouchBar
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
150 | systemTrayItemIdentifier: kGroupButton];
| ~~~~~~~~~~~~~~~~~~~~~~~~
src/touchbar.m:150:39: error: use of undeclared identifier 'kGroupButton'; did you mean 'kGroupIcon'?
150 | systemTrayItemIdentifier: kGroupButton];
| ^~~~~~~~~~~~
| kGroupIcon
# --- %< --- snip --- %< ---
Fixing something that isn’t there is quite difficult, and it took me some time and reading of reference manuals to understand what I actually had to restore. When I made the first progress there, I suddenly remembered that I do in fact have a backup of the MacBook Pro from back then.
Although I really had fun playing with it, the project never saw real usage. Luckily I already worked test-driven, so I can show off these test scripts written in JavaScript[12] along with some resulting shots of the Touch Bar:
/* WM */
var wm = new TjsWM(); (1)
tjs_print("wm: trusted=" + wm.isTrusted());
/* Events */
wm.observe("win_open", function (win) {
tjs_print("Open: name=" + win.getTitle() + ", id=" + win.getId() + ", frame=" + win.getFrame()); (2)
});
| 1 | Highly ambitious as I’ve promised |
| 2 | Well, just print some details of windows in the normal state |
And some more with actual UI elements:
var b = new TjsButton("Test")
.setBgColor(255, 0, 0)
.bind(function () {
tjs_print("Test");
});
/* Attach */
tjs_attach(b);
/* --- %< --- snip --- %< --- */
var b4 = new TjsButton("Exec")
.setBgColor(255, 0, 255)
.bind(function () {
var c1 = new TjsCommand("ls -l src/");
tjs_print(c1.exec().getOutput());
});
var s1 = new TjsSlider(0)
.bind(function (value) {
tjs_print(value + "%");
rgb[idx] = parseInt(255 * value / 100);
l1.setFgColor.apply(l1, rgb);
});
var sc1 = new TjsScrubber()
.attach(b1)
.attach(b2)
.attach(b3)
.attach(b4);
/* Attach */
tjs_attach(l1);
tjs_attach(sc1);
tjs_attach(s1);
We could go into detail here about how the loading process and error handling work in Obj-C, but I ultimately replaced Obj-C with Rust and later on also got rid of the MacBook. So, interested in how this can be done in Rust? Bet you are!
Around 2023 I started another pet project under the nice moniker rubtle. I can only guess what my plans for it were, but it might have been a glimpse into the future; more on that later, when we talk about the last project of this blog post. Whatever the plans were, I didn’t spend too much time on it and rubtle isn’t polished in any sense.
So why do I mention it at all, you might ask? Within rubtle I followed a different approach we haven’t covered so far. Instead of inventing my own API, I created a bridge[14] and allowed the scripts to interact directly with the underlying engine:
fn main() {
let args: Vec<String> = env::args().collect();
if 1 < args.len() {
let contents = fs::read_to_string(&args[1]); (1)
let rubtle = Rubtle::new();
init_global(&rubtle);
init_rubtle(&rubtle); (2)
match contents {
Ok(val) => rubtle.eval(&val),
Err(_) => eprintln!("File read failed"),
}
} else {
println!("Usage: {}: <JSFile>", args[0]);
}
}
| 1 | Just file loading, no surprises here yet |
| 2 | Now it is getting exciting - off to the next listing! |
fn init_rubtle(rubtle: &Rubtle) {
#[derive(Default)]
struct UserData {
value: i32,
};
let mut object = ObjectBuilder::<UserData>::new() (1)
.with_constructor(|inv| {
let mut udata = inv.udata.as_mut().unwrap();
udata.value = 1;
})
.with_method("inc", |inv| -> CallbackResult<Value> { (2)
let mut udata = inv.udata.as_mut().unwrap();
udata.value += 1;
Ok(Value::from(udata.value))
})
/* --- %< --- snip --- %< --- */
.build();
rubtle.set_global_object("Rubtle", &mut object); (3)
}
| 1 | Using the builder pattern was really a fight for me back then |
| 2 | Here we are assembling an object by adding some values and methods |
| 3 | And we register this as a global object |
Compiled, ready and armed we can feed this fancy test script into it:
var rubtle = new Rubtle();
rubtle.set(5);
assert(5, rubtle.get(), "Damn"); (1)
rubtle.inc();
assert(6, rubtle.get(), "Damn");
print(rubtle.get()) (2)
| 1 | Seriously no idea.. |
| 2 | Print the final value |
$ RUSTFLAGS=-Awarnings cargo run -- ./test.js
Compiling rubtle-duktape v0.1.0 (/home/unexist/projects/rubtle/rubtle-duktape) (1)
Compiling rubtle-lib v0.1.0 (/home/unexist/projects/rubtle/rubtle-lib) (2)
Compiling rubtle v0.1.0 (/home/unexist/projects/rubtle/rubtle)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 2.03s
Running `target/debug/rubtle ./test.js`
<JS> "6" (3)
Inside rubtle-lib is lots of scary stuff and I don’t want to scare away my dear readers, so the next excerpt is boiled down and absolutely safe to handle:
impl Rubtle {
/* --- %< --- snip --- %< --- */
///
/// Set value to context and assign a global reachable name
///
/// # Arguments
///
/// `name`- Name of the value
/// `rval` - The actual value
///
/// # Example
///
/// use rubtle_lib::{Rubtle, Value};
///
/// let rubtle = Rubtle::new();
/// let rval = Value::from(4);
///
/// rubtle.set_global_value("rubtle", &rval);
///
pub fn set_global_value(&self, name: &str, rval: &Value) {
unsafe {
let cstr = CString::new(to_cesu8(name));
match cstr {
Ok(cval) => {
self.push_value(rval); (1)
ffi::duk_require_stack(self.ctx, 1); (2)
ffi::duk_put_global_lstring(
self.ctx,
cval.as_ptr(),
cval.as_bytes().len() as u64,
);
}
Err(_) => unimplemented!(),
}
}
}
/* --- %< --- snip --- %< --- */
}
unsafe extern "C" fn fatal_handler(_udata: *mut c_void, msg: *const c_char) { (3)
let msg = from_cesu8(CStr::from_ptr(msg).to_bytes())
.map(|c| c.into_owned())
.unwrap_or_else(|_| "Failed to decode message".to_string());
eprintln!("Fatal error from duktape: {}", msg);
process::abort();
}
| 1 | Did I mention duktape is also a stack machine and exposes this type of API? |
| 2 | This is a similar handling of the stack like we’ve seen in Lua |
| 3 | And we essentially provide an error handler to trap fatal errors when they occur |
| Ease of use | Richness of API | Language agnostic | Error handling | Performance |
|---|---|---|---|---|
| Low to complex; depends on the chosen language | You provide the API; can be a full-fledged interface but also just a simple bridge | Absolutely; there usually are many bindings and they can also be created with FFI | Depends a bit on the language, but ranges from easy to complex | Another thing that depends on the embedder and the embeddee[15] |
I think WebAssembly is one of the more interesting topics from the web technology cosmos. It allows creating binaries from a plethora of languages and running them mostly at full speed directly inside stack-based[16] virtual machines. Originally meant for embedding in the web, it can also be utilized in other types of software to provide more flexibility when required, but also raw speed on execution.
There is lots of movement and things might change quite often, but frameworks provide stability here where required. Extism is such a framework and also the one used in my latest project subtle-rs, a re-write in Rust and the spiritual successor of subtle.
subtle-rs is under active development and therefore a piece of cake to demonstrate:
In contrast to the other projects, subtle-rs doesn’t use a scripting language for its config, but relies on a simple TOML file. Therefore it doesn’t make sense to go into detail here. If you are still curious, just check the repository: https://github.com/unexist/subtle-rs/blob/master/subtle.toml
Startup and loading the four existing plugins works like a charm:
$ cargo run -- -d :2 --config-file ./demo.toml
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.12s
Running `target/debug/subtle-rs -d ':2' --config-file ./demo.toml`
[2026-01-25T15:48:20Z INFO subtle_rs] Reading file `"./demo.toml"'
[2026-01-25T15:48:20Z INFO subtle_rs] subtle-rs 0.1.0 - Copyright (c) 2025-present Christoph Kappel <[email protected]>
[2026-01-25T15:48:20Z INFO subtle_rs] Released under the GNU GPLv3
[2026-01-25T15:48:20Z INFO subtle_rs] Compiled for X11
[2026-01-25T15:48:20Z INFO subtle_rs::display] Display (:2) is 640x480
[2026-01-25T15:48:20Z INFO subtle_rs::plugin] Loaded plugin (time) (1)
[2026-01-25T15:48:20Z INFO subtle_rs::plugin] Loaded plugin (fuzzytime) (2)
[2026-01-25T15:48:20Z INFO subtle_rs::plugin] Loaded plugin (mem) (3)
[2026-01-25T15:48:20Z INFO subtle_rs::plugin] Loaded plugin (battery) (4)
[2026-01-25T15:48:20Z INFO subtle_rs::screen] Running on 1 screen(s)
| 1 | Written in Zig - https://github.com/unexist/subtle-rs/tree/master/plugins/time |
| 2 | Written in Go - https://github.com/unexist/subtle-rs/tree/master/plugins/fuzzytime |
| 3 | Written in JavaScript - https://github.com/unexist/subtle-rs/tree/master/plugins/mem |
| 4 | Written in Rust - https://github.com/unexist/subtle-rs/tree/master/plugins/battery |
Under the hood, the integration works a bit differently from the embeddings before. The plugins run alone and isolated in their own virtual machine, and all capabilities besides the ones provided by the language and the wasm target must be exported by the embedding host. On the other side, the plugin can also define and export methods, which in turn can be called by the host.
Creating such exports and loading a plugin is quite easy with Extism:
/* --- %< --- snip --- %< --- */
host_fn!(get_battery(_user_data: (); battery_idx: String) -> String { (1)
let charge_full = std::fs::read_to_string(
format!("/sys/class/power_supply/BAT{}/charge_full", battery_idx))?; (2)
let charge_now = std::fs::read_to_string(
format!("/sys/class/power_supply/BAT{}/charge_now", battery_idx))?;
Ok(format!("{} {}", charge_full.trim(), charge_now.trim()))
});
/* --- %< --- snip --- %< --- */
pub(crate) fn build(&self) -> Result<Plugin> {
let url = self.url.clone().context("Url not set")?;
// Load wasm plugin
let wasm = Wasm::file(url.clone());
let manifest = Manifest::new([wasm]);
let plugin = extism::PluginBuilder::new(&manifest) (3)
.with_wasi(true)
/* --- %< --- snip --- %< --- */
.with_function("get_battery", [PTR], [PTR],
UserData::default(), Self::get_battery) (4)
.build()?;
debug!("{}", function_name!());
Ok(Plugin {
name: self.name.clone().context("Name not set")?,
url,
interval: self.interval.unwrap(),
plugin: Rc::new(RefCell::new(plugin)),
})
}
/* --- %< --- snip --- %< --- */
| 1 | The macro host_fn! allows us to define functions for our WebAssembly guest |
| 2 | Funny how the path of the acpi interface has changed over the years |
| 3 | Extism also provides an easy-to-use loader |
| 4 | Time to register our host function |
And just to complete the usual triplet again, here is what the battery plugin actually does:
#[host_fn("extism:host/user")]
extern "ExtismHost" {
fn get_battery(battery_idx: String) -> String; (1)
}
#[plugin_fn] (2)
pub unsafe fn run<'a>() -> FnResult<String> {
let values: String = unsafe { get_battery("0".into())? }; (3)
info!("battery {}", values);
let (charge_full, charge_now) = values.split(" ") (4)
.filter_map(|v| v.parse::<i32>().ok())
.collect_tuple()
.or(Some((1, 0)))
.unwrap();
Ok(format!("{}%", charge_now * 100 / charge_full))
}
| 1 | This imports the function from the host |
| 2 | Mark this function for export to the host |
| 3 | Sadly the unsafe here is required… |
| 4 | Pretty straightforward - parse and convert with a bit of error checking - one line |
Due to the isolation of the plugins, the error handling happens inside the virtual machine:
/* --- %< --- snip --- %< --- */
impl Plugin {
pub(crate) fn update(&self) -> Result<String> {
let res = self.plugin.borrow_mut().call("run", "")?; (1)
debug!("{}: res={}", function_name!(), res);
Ok(res)
}
/* --- %< --- snip --- %< --- */
}
| 1 | Just a quick call and result check of the plugin function |
| Ease of use | Richness of API | Language agnostic | Error handling |
|---|---|---|---|
| Depends on the language, but you can pick from the list of supported ones | All noteworthy API must be provided by the host, like time | Yes, the list of supported languages is quite nice | Extism offers easy integration and error checking |
Time for a conclusion after such a marathon through many ideas, languages and projects, so we can call it a day. We have seen different approaches to providing an API to essentially shape what a guest or plugin can do in your application. And we have also covered error checking and seen how it can range from being arcane and nasty to being handled entirely by your framework.
I think, applied with care, the integration of scripting languages can be a great way to lower the hurdle of providing new feature sets. It can also allow different audiences, not familiar with the host language or host domain, to enrich it. And additionally, approaches like WebAssembly allow combining the raw processing speed of compiled languages with the ease of use of scripting.
The list of examples is quite long, but please help yourself:
mem.c if you are curious
During my career, facing legacy code has always been an annoying task, and it took me quite some years to understand that oftentimes today’s code is tomorrow’s legacy. Still, legacy code can be a great opportunity to learn something new, especially when you are the original author of the piece.
This post jumps on the bandwagon of rewriting everything in Rust and elaborates a bit on my personal motivation and learnings from rewriting my pet window manager project subtle, which I started ~20 years[1] ago and still use on a daily basis.
Among the many things AI can do for us, migrating code from one language to another is usually a strong selling point, and even without AI there are excellent standalone tools, like C2Rust, to get the job done with just a flick of a finger.
So the why is an excellent question.
One of my main motivators isn’t just to get the job done, as I lamented a bit in my previous blog post, but to have a learning experience and take something from it besides another code base which easily ticks every point of the legacy code checklist.
The manual labor probably isn’t the most controversial aspect of it, but porting an X11 application in the epoch of Wayland might look like a waste of time.
Alas, the reasoning here is basically the same. Plus, I’ve spent many years with X11 learning its core concepts and I still like the system and its capabilities.
On a side note - I am not entirely certain there is a giant switch to get rid of X11 yet, despite how decisions of e.g. the GNOME project[2] might appear.
Porting a codebase, like the one of subtle with 14728 LoC (according to sloccount[3]), brought loads of challenges with it. Some of them were the usual ones like "where to start" and how can this be done in language X, but let us concentrate here on a handful of interesting points.
| The problems are inter-related, and it is sometimes a chicken-and-egg type of problem which one to address first, so please be prepared to jump around a bit if necessary. |
When I started subtle back then, I didn’t even know that this pattern is called God Object or that it is considered a prime example of an anti-pattern. To me it was something I had learned by reading other people’s code, and it looked like a good solution to a problem which is still relevant today.
The main problem is kind of easy to explain and mainly related to software design: Your program needs to keep track of data like state or socket descriptors and many related functions have to access and sometimes mutate them.
There are several ways to tackle it, like moving everything related together, but this can also mean there is basically just one big file, and C isn’t the strongest language at enforcing proper structure and coherence. It was way easier to have a global object which included every bit and was available throughout the program.
This might obviously lead to interesting side-effects in multi-threaded applications, but fortunately the design goal of subtle has always been to be single-threaded, so no other means of locking were required.
What I did not understand back then, and which is more of a concern here, is the implicit coupling of everything to this god object. This means changing the god object may require changes in other parts of the program and may also unknowingly break other parts of the application.
subtle-rs (as its predecessor) is event-driven and many parts revolve around a single connection to the X11 server. This connection must be available to most parts and moving everything into the holding object made proper separation of concerns more difficult.
Like every worthwhile decision this is a classical trade-off, and the original design was kept with the addition of carrying the dependency explicitly through the codebase.
void subClientSetSizeHints(SubClient *c, int *flags) {
...
}
pub(crate) fn set_size_hints(&mut self, subtle: &Subtle, mode_flags: &mut ClientFlags) -> Result<()> { (1)
...
}
| 1 | The signature includes a reference to Subtle. |
Resource acquisition is initialization (RAII) is another programming idiom, which is less of a concern in C-based languages, but can turn into a problem in strict languages like Rust. Simply put, this just means that whenever we initialize something like a holding structure, we also have to initialize all of its members, due to the general idea of predictable runtime behavior and zero-cost abstractions.
This easily turns into a problem, whenever the holding structure contains something, that requires some preparation before it can be initialized - like a socket connection:
typedef struct {
Connection *conn;
} Holder;
Holder *holder = calloc(1, sizeof(Holder)); (1)
holder->conn = MagicallyOpenConnection(); (2)
| 1 | Init the holding structure |
| 2 | Open the actual connection |
Since this is a more general problem in Rust, there exists a bunch of options with different ergonomics. One of the easiest ways is to wrap the connection in an Option, which can be initialized with its default value and set later, but as I’ve said, the ergonomics of mutating[4] something on the inside are bothersome.
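To illustrate the bothersome part, here is a minimal sketch of the Option-based variant - the Holder type and a plain String standing in for a real connection are my own hypothetical stand-ins, not subtle-rs code:

```rust
// Hypothetical holder: the Option starts as None and is filled in later.
struct Holder {
    conn: Option<String>, // a String stands in for a real connection type
}

fn main() {
    let mut holder = Holder { conn: None };

    // The late initialization itself is easy enough...
    holder.conn = Some(String::from("connected"));

    // ...but every single use site has to deal with the possible None:
    match holder.conn.as_ref() {
        Some(conn) => println!("using {}", conn),
        None => eprintln!("not connected yet"),
    }
}
```

Every access pays the unwrap-or-match tax, even long after we know the value is set.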
A better alternative is to let one of the many cells[5] handle this job. OnceCell, as the name implies, offers an easy way to initialize our socket once we are prepped.
typedef struct subtle_t {
...
Display *dpy; //< Subtle Xorg display
...
} SubSubtle;
extern SubSubtle *subtle; (1)
| 1 | God mode - on! |
void subDisplayInit(const char *display) { (1)
...
/* Connect to display and setup error handler */
if (!(subtle->dpy = XOpenDisplay(display))) {
...
}
| 1 | We usually pass the ENV var DISPLAY, but NULL is also an accepted value. |
int main(int argc, char *argv[]) {
...
/* Create subtle */
subtle = (SubSubtle *) (subSharedMemoryAlloc(1, sizeof(SubSubtle))); (1)
...
}
| 1 | This is just calloc with some error handling. |
pub(crate) struct Subtle {
...
pub(crate) conn: OnceCell<RustConnection>,
...
}
impl Default for Subtle { (1)
fn default() -> Self {
Subtle {
...
conn: OnceCell::new(), (2)
...
}
}
}
| 1 | Unfortunately deriving the Default trait doesn’t work for all members of Subtle. |
| 2 | This initializes our OnceCell with its default value. |
pub(crate) fn init(config: &Config, subtle: &mut Subtle) -> Result<()> {
let (conn, screen_num) = x11rb::connect(Some(&*config.display))?;
...
subtle.conn.set(conn).unwrap(); (1)
...
}
| 1 | Error handling here would require more explanation, so let us just forget about it and move on. |
fn main() -> Result<()> {
...
// Init subtle
let mut subtle = Subtle::from(&config); (1)
...
display::init(&config, &mut subtle)?;
...
}
| 1 | Config holds the configured values - a courtesy of clap - and we convert it with the help of our From trait implementation. |
Did you wonder why the (in)famous borrow checker isn’t number one on our list of problems? Well, simply because you can come pretty far without running into beloved errors like E0499 or E0502, and grouping the problems to keep a common thread is quite difficult.
Anyway, back to the topic at hand: Why can’t we just keep a mutable reference of our god object all the time and pass it around?
Interestingly this is again more about software design and Rust’s pragmatic way of handling mutability in contrast to other (functional) languages like Haskell. Please have a look at the next code block:
#[derive(Default)] (1)
struct Counter {
number: u32,
}
impl Counter {
fn increment(&mut self) { (2)
self.number += 1;
}
fn print(&mut self) { (3)
println!("number={}", self.number);
}
}
fn increment_counter(counter: &mut Counter) { (4)
counter.number += 1;
}
fn print_counter(counter: &mut Counter) { (5)
println!("counter={}", counter.number);
}
fn main() {
let mut counter = Counter::default();
counter.increment(); (6)
counter.print(); (7)
increment_counter(&mut counter); (8)
print_counter(&mut counter); (9)
}
| 1 | Derive is one of Rust’s real work horses. |
| 2 | Mut required due to write to binding. |
| 3 | Is mut required here? |
| 4 | Mut! |
| 5 | Mut? |
| 6 | Implied mut! |
| 7 | Implied mut? |
| 8 | Mut! |
| 9 | Why mut? |
If you don’t mind trailing all those terribly explicit mut keywords, the above code runs fine, and if you don’t try to re-borrow anything, the aliasing rules work in your favor.
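To make the re-borrowing point concrete, here is a small sketch of my own (not from subtle-rs): the commented-out call is the kind of aliasing the borrow checker rejects with E0499, while sequential mutable borrows are fine:

```rust
#[derive(Default)]
struct Counter {
    number: u32,
}

// Hypothetical helper taking two mutable references at once.
fn add_both(a: &mut Counter, b: &mut Counter) {
    a.number += b.number;
}

fn main() {
    let mut counter = Counter::default();
    counter.number = 1;

    // This re-borrow would trigger E0499 - two live mutable borrows:
    // add_both(&mut counter, &mut counter);

    // Sequential mutable borrows with disjoint lifetimes are fine:
    let r1 = &mut counter;
    r1.number += 1;
    let r2 = &mut counter; // r1 is no longer used, so this is allowed
    r2.number += 1;

    println!("number={}", counter.number);
}
```

Thanks to non-lexical lifetimes the second borrow is accepted as soon as the first one is done, which is why straightforward pass-it-around code rarely trips the checker.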
A different story is the coupling and the cognitive load: When everything gets a mutable reference, everything is coupled together and you can never be sure about the side-effects of calling a certain function.
The easiest and most naive solution to this kind of problem is to just omit mut wherever possible.
#[derive(Default)]
struct Counter {
number: u32,
}
impl Counter {
fn increment(&mut self) {
self.number += 1;
}
fn print(&self) { (1)
println!("number={}", self.number);
}
}
fn increment_counter(counter: &mut Counter) {
counter.number += 1;
}
fn print_counter(counter: &Counter) { (2)
println!("number={}", counter.number);
}
fn main() {
let mut counter = Counter::default();
counter.increment();
counter.print();
increment_counter(&mut counter);
print_counter(&counter); (3)
}
| 1 | This access is just read-only, so no need for mut, and also a promise of being side-effect free. |
| 2 | See ❶! |
| 3 | See ❶! |
Now it’s getting interesting, and we have to talk about the given promises of immutability and one more time about the ergonomics of our general design.
With the last problem we established the underlying promise that functions which don’t require a mutable reference will never change the object itself, and only changes made through a mutable reference are of any consequence to you.
But what happens when you need to change some internal state which is just required for internal bookkeeping and doesn’t change anything at all for the caller?
Have a look at following contrived[6] example:
use std::time::{SystemTime, UNIX_EPOCH};
#[derive(Default)]
struct Counter {
number: u32,
last_printed: u32,
}
impl Counter {
fn increment(&mut self) {
self.number += 1;
}
fn print(&mut self) { (1)
self.last_printed = SystemTime::now()
.duration_since(UNIX_EPOCH).unwrap().as_secs() as u32; (2)
println!("number={}", self.number);
}
}
fn main() {
let mut counter = Counter::default();
counter.increment();
counter.print();
}
| 1 | To allow our internal bookkeeping the signature must include mut now. |
| 2 | Error checking skipped for brevity - unwrap all the things! |
Here we had to change the method’s signature just to allow the pointless action of storing the last printing time, maybe for big data applications, who knows.
From the caller’s perspective it doesn’t make any sense to pass a mutable reference into the print function, and from the counter’s perspective[7] there wasn’t any actual change of the number.
This is a pretty common problem and Rust provides many different options like Cell and RefCell, Atomic and some more advanced options like the smart pointer Arc for more shenanigans.[8]
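Applied to the contrived counter from before, a Cell is enough to keep the bookkeeping out of the public signature - a minimal sketch of my own, not code from subtle-rs:

```rust
use std::cell::Cell;
use std::time::{SystemTime, UNIX_EPOCH};

#[derive(Default)]
struct Counter {
    number: u32,
    last_printed: Cell<u32>, // interior mutability just for bookkeeping
}

impl Counter {
    fn increment(&mut self) {
        self.number += 1;
    }

    fn print(&self) { // back to &self - the promise to the caller is kept
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH).unwrap().as_secs() as u32;
        self.last_printed.set(now); // Cell::set works through a shared reference
        println!("number={}", self.number);
    }
}

fn main() {
    let mut counter = Counter::default();
    counter.increment();
    counter.print();
}
```

The caller sees an immutable print again, while the timestamp is quietly tucked away inside the Cell.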
In our case Cell works splendidly, since our type comes prepared with the Copy trait:
typedef struct subsubtle_t {
...
int visible_tags; //< Subtle visible tags
...
} SubSubtle;
void subScreenConfigure(void) {
...
/* Reset visible tags, views and available clients */
subtle->visible_tags = 0; (1)
...
/* Set visible tags and views to ease lookups */
subtle->visible_tags |= v->tags;
...
}
| 1 | No one can stop us from just accessing our god object directly. |
pub(crate) struct Subtle {
...
pub(crate) visible_tags: Cell<Tagging>,
...
}
impl Default for Subtle {
fn default() -> Self {
Subtle {
...
visible_tags: Cell::new(Tagging::empty()),
...
}
}
}
pub(crate) fn configure(subtle: &Subtle) -> Result<()> {
...
// Reset visible tags, views and available clients
let mut visible_tags = Tagging::empty(); (1)
...
// Set visible tags and views to ease lookups
visible_tags.insert(view.tags);
...
subtle.visible_tags.replace(visible_tags); (2)
...
}
| 1 | This is a pretty easy case: We introduce a local variable via let binding first. |
| 2 | And then once we are happy with the result we tell the cell to swap-out the content entirely. |
Like with mutability, Rust is similarly annoyingly verbose and explicit about how it handles data and copies of it. It seems that to keep all the guarantees and promises, some work has to be done upfront by every side.
In the next example we just continue with the counter from before, but the repetition of the struct definition and implementation has been removed, since it just distracts from the actual problem:
...
fn print_counter(counter: &Counter) {
counter.print();
}
fn main() {
let mut counter1 = Counter::default();
counter1.increment();
let counter2 = counter1; (1)
print_counter(&counter1);
print_counter(&counter2);
}
| 1 | D’oh! |
The above snippet fails to compile for apparent reasons, still the error message of the compiler is kind of a surprise in its detail and content:
error[E0382]: borrow of moved value: `counter1`
--> src/main.rs:27:19
|
21 | let mut counter1 = Counter::default();
| ------------ move occurs because `counter1` has type `Counter`, which does not implement the `Copy` trait
...
25 | let counter2 = counter1;
| -------- value moved here
26 |
27 | print_counter(&counter1);
| ^^^^^^^^^ value borrowed here after move
|
note: if `Counter` implemented `Clone`, you could clone the value
--> src/main.rs:2:1
|
2 | struct Counter {
| ^^^^^^^^^^^^^^ consider implementing `Clone` for this type
...
25 | let counter2 = counter1;
| -------- you could clone this value
For more information about this error, try `rustc --explain E0382`.
error: could not compile `example` (bin "example") due to 1 previous error
This is just an example of a really overwhelming and also quite helpful error message from our partner in crime - the Rust compiler. What it points out here is that we can just add the Copy marker trait and also derive the Clone trait to satisfy this move.
And like our friendly compiler told us, when we just do as suggested the code runs perfectly fine:
#[derive(Default, Clone, Copy)]
struct Counter {
number: u32,
}
This innocent assignment over there just introduced the concept of move semantics, which Rust uses internally in its affine type system:
An affine resource can be used at most once, while a linear one must be used exactly once.
The definition is quite heavy and somewhat unwieldy, but what it basically says is: every type that doesn’t come with the Copy marker trait is moved, and the ownership is transferred to the recipient. All other types are just copied along the way.
Accessing the object afterward is a violation of the ownership[9] model and hence causes such an error.
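If blanket Copy semantics feel too implicit for a bigger type, deriving only Clone keeps every copy an explicit call - a small sketch of my own continuing the counter example:

```rust
#[derive(Default, Clone)] // Clone only - moves stay the default behavior
struct Counter {
    number: u32,
}

fn main() {
    let mut counter1 = Counter::default();
    counter1.number += 1;

    let counter2 = counter1.clone(); // explicit copy instead of a silent move

    // counter1 is still usable, because we cloned instead of moving:
    println!("{} {}", counter1.number, counter2.number);
}
```

This is also the usual choice for types holding heap data like String or Vec, where a bitwise Copy isn’t possible anyway.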
Writing this blog post has been an interesting experience on its own and helped me to sharpen my understanding of how Rust internally works and also helped me to summarize what I actually learned about it over the course of this project.
Porting such a large codebase from my past into a modern language and also re-visiting many of the design choices made back then has been a great experience so far. And in regard to the legacy code aspect I mentioned initially - there are tests, but still even I don’t understand some of the odd names for variables and steps in the algorithms anymore. Maybe I should have read Clean Code some years earlier [cleancode].
I currently do not dare to use subtle-rs as my daily window manager yet, mainly because some required features are still missing like something simple to bring e.g. a clock into the panel, but I am eagerly looking at Extism for this matter.
Naturally I’ve read some books about Rust if you are looking for inspiration:
Most of the examples were taken from following repositories:
[idiomaticrust] Brenden Matthews, Idiomatic Rust: Code Like a Rustacean, Manning 2024
[coderustpro] Brenden Matthews, Code Like a Pro in Rust, Manning 2024
[asyncrust] Maxwell Flitton, Caroline Morton, Async Rust: Unleashing the Power of Fearless Concurrency, O’Reilly 2024
[effectiverust] David Drysdale, Effective Rust: 35 Specific Ways to Improve Your Rust Code, O’Reilly 2024
[cleancode] Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship, Prentice Hall 2008
It has been roughly two years since my last post regarding my experience with the state of AI (Coding with AI) and I think it is about time to talk about this again.
In contrast to my previous post, I don’t want to dwell on specific products and tools, but talk about some points that I think we should pay close attention to, and why this topic is generally such a mixed bag to me.
Enough chit-chat, let us begin.
When I look back at the last two years I can probably safely say the whole thing gained even more momentum, as we’ve left the early-adopter phase and AI got a lot of traction.
AI has entered the mainstream and almost everything gets AI support.
Content is specially prepared for AI systems apparently without following best practices.
There is an abundance of new tools and new companies sprout like pop-up stores.
There are experiments to replace general human labor and also specific ones like nurses in healthcare.
There are promises AI reduces our daily working time.
So in hindsight everything went according to plan from 2023:
I’ve spent a lot of time reading about the general progression of AI - besides dozens of blog posts and other articles for and against AI, also some books to get a broader perspective.
Among these books are the following:
I somehow begin to wonder: what kind of problems are we really trying to address with AI?
When I look at the business side, I’d say the overall themes are an increase of productivity[1], like reducing tiresome and/or manual labor, and probably the fear of missing out on competitive advantages. And on the personal side I mostly see quality-of-life improvements, like easy access to information with the help of natural interfaces like ChatGPT[2], and generative tooling to create memes and reels more easily.
This short list is non-exhaustive mind you, but is sufficient for the points I’d like to make next.
Increasing productivity and reducing work time with technology isn’t strictly speaking a new idea: blue-collar industrial workers already faced this in the mid-19th century during the industrial age, but for the first time white-collar knowledge workers are impacted, and they are probably not backed by any labour union.
There have been lots of riots and protests according to Wikipedia, so apparently the workforce wasn’t all happy with the outcome, but we aren’t there yet, so let us focus on the promise of improved work-life-balance.
Interestingly during that time a strange phenomenon could be observed:
But rather than allowing a massive reduction of working hours to free the world’s population to pursue their own projects, pleasures, visions, and ideas, we have seen the ballooning of not even so much of the ‘service’ sector as of the administrative sector, up to and including the creation of whole new industries like financial services or telemarketing, or the unprecedented expansion of sectors like corporate law, academic and health administration, human resources, and public relations. And these numbers do not even reflect on all those people whose job is to provide administrative, technical, or security support for these industries, or for that matter the whole host of ancillary industries (dog-washers, all-night pizza delivery) that only exist because everyone else is spending so much of their time working in all the other ones. These are what I propose to call ‘bullshit jobs[3]’.
This is just an excerpt of an article written for a magazine under the umbrella of things nobody else would print, as the author points out in his book Bullshit Jobs [bullshitjobsbook], but the term still hits a mark. There are a lot more examples and explanations in the book of supposedly bullshit jobs and also of jobs that just feel like one, but my key takeaway is the implied question: what can our highly skilled workforce do for a living, when their field of expertise has been replaced with automatons and we haven’t reached a moneyless utopia yet?
Hell is a collection of individuals who are spending the bulk of their time working on a task they don’t like and are not especially good at. Say they were hired because they were excellent cabinet-makers, and then discover they are expected to spend a great deal of their time frying fish.
Another issue I see with increasing productivity is the orientation towards throughput, or rather output in general, in information-related topics. Viewed from the business side, increasing quantity makes sense to me - this is what the business has been created for in the first place - but is output the only and ultimate goal, and can the learning of how to reach and achieve something be totally neglected?
If your media feeds are like mine, there is probably something about AI every two or three posts and depending on the type of media, like e.g. LinkedIn, the posts are full of promises and how the full potential of AI can be unlocked to utilize it for your business.
Oftentimes these posts appear to be written with the help of AI, and especially em-dashes enjoy increasing popularity. I think eating your own dog food is always advisable, so I see no fault there. On the other hand, I can rarely find empirical evidence or any other kind of proof for these theses, and I usually consider this a red flag - skepticism can help.
The current hype and pressure increase, and academia has also started looking into the phenomenon of a developing fear of missing out on AI. And there is an increasing number of posts and voices, besides the ones from the common AI bros, who foretell that if you don’t start to use AI today you are going to lose your edge.
| I won’t cite any of these posts, but if you are curious here is a starter: https://kagi.com/search?q=use+ai+or+lose |
Let us start with something positive: AI does a splendid job of lowering the bar to access information! Hallucinations vary between dangerous and hilarious, and some people are bold enough to state this is an original feature of LLM design, but with our previously established skepticism regarding media consumption this should be fine.
Delivering probability-based answers to questions is only part of the deal; another great application of these models is content generation, and both go perfectly hand in hand:
I personally think we should just stick to the bullet point list instead of applying a "prose-2-text" conversion twice, but I still wonder what happens to the quality of the information underneath. Writing this blog post, or writing in general, is a really time-consuming task. Drafting a new post and trying to fill the intended outline with content is a task which helps me personally to pinpoint what I really want to say[4], and I wouldn’t want to miss this journey.
I am a bit afraid the following is more than true:
After all, who are you writing for? Do you care if anybody reads it and how they respond to it? How can you expect anybody to relate to a piece of writing if it was generated by an AI model? If you can’t be bothered to write the entire article, you can’t really expect anybody else to be bothered to read it.
This is probably the most interesting point, and I think it is really difficult to imagine the world to come; visionaries like Sam Altman play a big role in it. Still, when money gets involved things sometimes turn sour, and I think one of the more recent posts from Altman condenses the problem down really well:
The implied comparison of mass-produced fast fashion with the overgeneralized idea of Software-as-a-Service is interesting by itself, although I think it is not a good one to promote your AI services. For me, two of the pain points of fast fashion are the environmental footprint and the exploitation of people in fabric factories, and according to the media the same is true for the AI industry. There are many reports on the energy requirements of AI, and the references to the Mechanical Turk are also increasing:
Amazon using this name [Amazon Mechanical Turk] or their product is surprisingly on the nose: their system also plays the function of hiding the massive amount of labor needed to make any modern AI infrastructure work. ImageNet, during its development in the late 2000s, was the largest single project hosted on the MTurk platform, according to Li. It took two and a half years and nearly 50,000 workers across 167 countries to create the dataset. In the end, the data contained over 14 million images, labeled across 22,000 categories.
I think the real point he wanted to make is that with the help of AI, cheap software can be mass-produced, instead of paying monthly fees to service providers or for individual solutions to problems. And this actually works well with software, since there is a negligible impact on the environment in contrast to physical products.
Currently, I am not exactly sure where we are on the hype cycle from the beginning of this post, but I hope the next few months and years will show a direction. We are going to see if history repeats itself in the protests of workers and if the dystopian outlook of the movie Idiocracy stays a work of fiction.
I think my personal usage of AI won’t skyrocket any time soon, since most of the time I am interested in discovering how and why something can be done, and rarely just in a fast solution. Given that I am interested in exactly that, I don’t plan on using it beyond this narrow scope: I might ask AI, but would still write it myself.
For any other stuff that can readily be automated, I totally agree with this:
[tamingsiliconvalleybook] Gary F. Marcus, Taming Silicon Valley: How We Can Ensure That AI Works for Us, The MIT Press 2024
[theaiconbook] Emily M. Bender, Alex Hanna, The AI Con: How to Fight Big Tech’s Hype and Create the Future We Want, Harper 2025
[searchesbook] Vauhini Vara, Searches: Selfhood in the Digital Age, Random House 2025
[stupidityparadoxbook] Mats Alvesson, André Spicer, The Stupidity Paradox: The Power and Pitfalls of Functional Stupidity at Work, Profile Books 2016
[bullshitjobsbook] David Graeber, Bullshit Jobs: A Theory, Simon & Schuster 2019
Handling containers is probably something a modern developer can’t and probably should not live without anymore. They provide flexibility, allow easy packaging and also sandboxing of stuff you might not want to have installed on your machine.
Like so often in tech, using something successfully doesn’t imply real understanding of how it works under the hood, but I lived quite happily with this black box, all the greasy details shrouded in mystery behind tooling like Podman. This changed when I started looking for an artifact store for our firmware binary artifacts. I quickly discovered there are many container registries available, but just a few stores for ordinary artifacts that don’t eat large parts of an engineering budget in enterprise license fees. Passing this question to my bubble led to a suggestion from a good friend to have a look at ORAS, which leverages OCI-compliant registries for exactly what I wanted to literally archive. We are already using Harbor, so moving other artifacts there as well aroused my interest.
So over the course of this article we are going to dive into the container world with a short primer of the duality of OCI, talk about basic usage and a few advanced points like SBOM and signing and conclude with my impression on the technology.
| This post includes several introductory chapters before a deep dive into a specific topic. If you are just here for the examples of how to use the tooling, quickly jump ahead - we will wait for you. |
Turns out the Open Container Initiative (OCI) isn’t a single spec by itself, but rather a governance body around several container formats and runtimes - namely:
Runtime Specification (runtime-spec)
Image Specification (image-spec)
Distribution Specification (distribution-spec)
The links lead to the related GitHub projects in case you want to build your own container engine, but I suggest we focus on image-spec, which lays out the structure in all gory details.
If you’ve dutifully studied the spec, the overall structure of an actual container will probably not surprise you. If not, believe me: they are less magical than commonly thought, can be fetched with the help of Podman and easily be dissected on the shell:
$ podman save ghcr.io/oras-project/oras:main -o oras.tar
Copying blob 08000c18d16d done |
...
Writing manifest to image destination
$ tar xvf oras.tar --one-top-level
08000c18d16dadf9553d747a58cf44023423a9ab010aab96cf263d2216b8b350.tar
...
manifest.json
repositories
$ tree oras
oras
├── 08000c18d16dadf9553d747a58cf44023423a9ab010aab96cf263d2216b8b350.tar
...
├── 29ec8736648c6f233d234d989b3daed3178a3ec488db0a41085d192d63321c72
├── json
├── layer.tar -> ../08000c18d16dadf9553d747a58cf44023423a9ab010aab96cf263d2216b8b350.tar
└── VERSION
...
├── manifest.json
└── repositories
6 directories, 23 files
| 1 | Blobs is the main directory with all addressable filesystem layers and their related metadata, defined in the appropriate JSON files config and manifest. The names of the layers are actually digests as well, but to make it easier to follow let us keep the fancy numbers. |
| 2 | Config contains meta information such as the author, runtime information like environment variables, entrypoints, volume mounts etc., as well as info about the specific hardware architecture and OS. |
| 3 | rootfs contains an ordered list of the digests that compose the actual image. |
| 4 | The manifest just links to the actual configuration by digest and to the layers. |
| 5 | And finally the index includes all available manifests and also image annotations. |
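To make the linkage between these files concrete, here is a small Python sketch of the content addressing at play (config and layer content are made up and shortened): every blob is stored under its own digest, and the manifest references config and layers purely by digest and size.

```python
import hashlib
import json

# A blob's address is simply the SHA-256 digest of its content,
# just like the entries in the blobs/ directory of an OCI image layout.
def digest(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()

config = json.dumps({"architecture": "amd64", "os": "linux"}).encode()
layer = b"fake layer tarball"

# Hypothetical blob store: content addressed by its own digest.
blobs = {digest(config): config, digest(layer): layer}

# The manifest references config and layers purely by digest and size.
manifest = {
    "schemaVersion": 2,
    "config": {"mediaType": "application/vnd.oci.image.config.v1+json",
               "digest": digest(config), "size": len(config)},
    "layers": [{"mediaType": "application/vnd.oci.image.layer.v1.tar",
                "digest": digest(layer), "size": len(layer)}],
}

# Resolving the image means looking up each digest in the blob store
# and verifying the content actually hashes to its address.
for ref in [manifest["config"], *manifest["layers"]]:
    blob = blobs[ref["digest"]]
    assert digest(blob) == ref["digest"] and len(blob) == ref["size"]
print("all blobs verified")
```

This also explains why the filenames in the tarball above look like digests: the content *is* its own address.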
Mysteries solved, but there is still one essential piece missing - namely media types.
This probably surprises no one, but media types are also covered by a spec[2] - the media-spec.
There you can see the exhaustive list of the known types and an implementor’s todo list for compliance with the spec. Conversely, this also means that as long as we pick something different, we are free to fill layers with anything to our liking without accidentally triggering specific behaviour.
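That idea can be sketched in a few lines: an engine dispatches on a layer's media type, so an unknown type like application/octet-stream is treated as an opaque blob instead of being unpacked as a filesystem layer. The dispatch function and outcome strings below are purely illustrative, not real Podman internals.

```python
# Known OCI layer media types trigger specific handling (e.g. unpacking
# a tar or tar+gzip filesystem layer); anything else stays opaque.
KNOWN_LAYER_TYPES = {
    "application/vnd.oci.image.layer.v1.tar",
    "application/vnd.oci.image.layer.v1.tar+gzip",
}

def handle_layer(media_type: str) -> str:
    # A sketch of the dispatch a container engine might do.
    if media_type in KNOWN_LAYER_TYPES:
        return "unpack as filesystem layer"
    return "treat as opaque artifact blob"

print(handle_layer("application/vnd.oci.image.layer.v1.tar+gzip"))
print(handle_layer("application/octet-stream"))  # like our binary later on
```

This is exactly the loophole ORAS exploits to store arbitrary artifacts in a plain OCI registry.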
The next few examples require an OCI-compatible registry as well as access to the binaries of oras, cosign and a few more tools. Since installation is usually a hassle, all examples rely on Podman and the well-supported Zot Registry.
Setting up our registry is a piece of cake and shouldn’t raise any eyebrows yet. We pretty much set just the bare essentials - deliberately without any hardening like actual logins.
$ podman run --rm -it --name zot-registry -p 5000:5000 --network=host \
-v ./infrastructure/zot-registry/config.json:/etc/zot/config.json \ (1)
ghcr.io/project-zot/zot-linux-amd64:v2.1.2
| 1 | Apart from host settings we also want to enable the fancy web UI and the CVE scanner - have a glimpse at how this can be done on GitHub: https://github.com/unexist/showcase-oci-registries/blob/master/infrastructure/zot-registry/config.json |
Once started, and after Trivy's vulnerability database update is done, we are dutifully greeted with an empty list:
Time to push our first artifact!
Ultimately I want to push embedded software artifacts to the registry, but since this post is public and my own project heos-dial isn’t ready yet, we are pushing a binary of the Golang version of my faithful todo service:
$ podman run --rm -v .:/workspace -it --network=host \ (1)
ghcr.io/oras-project/oras:main \
push localhost:5000/todo-service:latest \
--artifact-type showcase/todo-service \ (2)
--plain-http \ (3)
todo-service/todo-service.bin:application/octet-stream
✓ Uploaded todo-service/todo-service.bin 26.1/26.1 MB 100.00% 32ms
└─ sha256:cc8ab19ee7e1f1f7d43b023317c560943dd2c15448ae77a83641e272bc7a5dbc
✓ Uploaded application/vnd.oci.empty.v1+json (4)
└─ sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a
✓ Uploaded application/vnd.oci.image.manifest.v1+json
└─ sha256:fb1f02fff7f1406ae3aa2d9ebf3f931910b69e99c95e78e211037f11ec8f1eb6
Pushed [registry] localhost:5000/todo-service:latest
ArtifactType: showcase/todo-service
Digest: sha256:fb1f02fff7f1406ae3aa2d9ebf3f931910b69e99c95e78e211037f11ec8f1eb6
| 1 | The ORAS container allows us to call it this way and directly pass our arguments. |
| 2 | Here we set our custom artifact type so we can distinguish it later. |
| 3 | No need to make our lives miserable with SSL/TLS! |
| 4 | This isn’t a real container, so we must provide a dummy config: https://oras.land/docs/how_to_guides/manifest_config/ |
One-way-success, time to get it back:
Pulling images from container registries is one of the core tasks of Podman:
$ podman pull localhost:5000/todo-service:latest
Trying to pull localhost:5000/todo-service:latest...
Error: parsing image configuration: unsupported image-specific operation on artifact with type "showcase/todo-service" (1)
| 1 | Unsurprisingly Podman doesn’t understand our custom artifact type and hence refuses to do our bidding. |
|
If Podman cannot connect to your local registry and bails out with
|
Let us try again - this time with ORAS.
$ podman run --rm -v .:/workspace -it --network=host \
ghcr.io/oras-project/oras:main \
pull localhost:5000/todo-service:latest --plain-http
✓ Pulled todo-service/todo-service.bin 26.1/26.1 MB 100.00% 38ms
└─ sha256:cc8ab19ee7e1f1f7d43b023317c560943dd2c15448ae77a83641e272bc7a5dbc
✓ Pulled application/vnd.oci.image.manifest.v1+json 586/586 B 100.00% 66µs
└─ sha256:fb1f02fff7f1406ae3aa2d9ebf3f931910b69e99c95e78e211037f11ec8f1eb6
Pulled [registry] localhost:5000/todo-service:latest
Digest: sha256:fb1f02fff7f1406ae3aa2d9ebf3f931910b69e99c95e78e211037f11ec8f1eb6
$ tree todo-service
todo-service
└── todo-service.bin
1 directory, 1 file
There are several commands available to gather information about images on the registry.
$ podman run --rm -v .:/workspace -it --network=host \
ghcr.io/oras-project/oras:main \
manifest fetch --pretty --plain-http \
localhost:5000/todo-service:latest
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"artifactType": "showcase/todo-service",
"config": {
"mediaType": "application/vnd.oci.empty.v1+json", (1)
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2,
"data": "e30="
},
"layers": [
{
"mediaType": "application/octet-stream",
"digest": "sha256:cc8ab19ee7e1f1f7d43b023317c560943dd2c15448ae77a83641e272bc7a5dbc",
"size": 27352532,
"annotations": { (2)
"org.opencontainers.image.title": "todo-service/todo-service.bin"
}
}
],
"annotations": {
"org.opencontainers.image.created": "2025-06-04T11:57:57Z"
}
}
| 1 | This is our empty dummy config - check the size and data fields. |
| 2 | Annotations are supported as well and can be added with oras push --annotation. |
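We can double-check these numbers ourselves: the dummy config really is just the two-byte JSON document {}, and its base64 encoding and SHA-256 digest match the data, size and digest fields of the manifest exactly.

```python
import base64
import hashlib

config = b"{}"  # the empty dummy config ORAS uploads for artifacts

print(len(config))                        # matches the size field: 2
print(base64.b64encode(config).decode())  # matches the data field: e30=
print("sha256:" + hashlib.sha256(config).hexdigest())
# matches the digest field:
# sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a
```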
$ podman run --rm -v .:/workspace -it --network=host \
ghcr.io/oras-project/oras:main \
discover --format tree --plain-http \
localhost:5000/todo-service:latest
localhost:5000/todo-service@sha256:fb1f02fff7f1406ae3aa2d9ebf3f931910b69e99c95e78e211037f11ec8f1eb6
A software bill of materials (SBOM) is a kind of inventory list of an artifact, which details the included software components and assists in securing the software supply chain. This gets more and more attention, as it should, especially since the Log4j vulnerability back in 2021.
There are different formats for SBOM files, like SPDX or CycloneDX, and a broad range of tools is available that supports one or more of them as input and output.
I am kind of fond[3] of Anchore with their tools syft and grype and therefore the next examples are going to make use of both of them.
Since my todo service is written in Golang, syft can easily scan the source code and assemble our SBOM:
$ podman run --rm -v .:/workspace -it --network=host \
-v ./todo-service:/in \
docker.io/anchore/syft:latest \
scan dir:/in -o cyclonedx-json=/workspace/sbom.json (1)
✔ Indexed file system /in
✔ Cataloged contents 86121fea66864109267c361a1fec880ab49dc5f619205b1f364ecb7ba31eb066
├── ✔ Packages [70 packages]
├── ✔ Executables [1 executables]
├── ✔ File digests [1 files]
└── ✔ File metadata [1 locations]
[0000] WARN no explicit name and version provided for directory source, deriving artifact ID from the given path (which is not ideal)
A newer version of syft is available for download: 1.26.1 (installed version is 1.26.0) (2)
$ cat sbom.json | jq '.components | length' (3)
71
| 1 | My pick is entirely based on the cool name though. |
| 2 | Interesting since I am using the latest tag. |
| 3 | Quite a lot of components! |
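The jq one-liner above can be reproduced without jq as well; the sketch below uses a tiny, made-up CycloneDX document, but a real sbom.json from syft exposes the identical top-level components array.

```python
import json

# A minimal, hypothetical CycloneDX document; the real sbom.json has the
# same structure, just with ~70 component entries.
sbom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "golang.org/x/crypto", "version": "v0.15.0"},
    {"type": "library", "name": "golang.org/x/net", "version": "v0.18.0"}
  ]
}
""")

# Equivalent of: jq '.components | length'
print(len(sbom["components"]))
```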
Like Trivy, grype can easily scan from inside a container and provide machine-readable statistics by default:
$ podman run --rm -v .:/workspace -it --network=host \
docker.io/anchore/grype:latest \
sbom:/workspace/sbom.json
✔ Vulnerability DB [updated]
✔ Scanned for vulnerabilities [9 vulnerability matches]
├── by severity: 1 critical, 2 high, 6 medium, 0 low, 0 negligible
└── by status: 9 fixed, 0 not-fixed, 0 ignored
NAME INSTALLED FIXED-IN TYPE VULNERABILITY SEVERITY EPSS% RISK
golang.org/x/crypto v0.15.0 0.17.0 go-module GHSA-45x7-px36-x8w8 Medium 98.45 36.5
golang.org/x/net v0.18.0 0.23.0 go-module GHSA-4v7x-pqxf-cx7m Medium 98.35 33.4
golang.org/x/crypto v0.15.0 0.31.0 go-module GHSA-v778-237x-gjrc Critical 96.91 32.6
google.golang.org/protobuf v1.31.0 1.33.0 go-module GHSA-8r3f-844c-mc37 Medium 46.14 0.1
github.com/jackc/pgx/v5 v5.4.3 5.5.4 go-module GHSA-mrww-27vc-gghv High 38.06 0.1
golang.org/x/crypto v0.15.0 0.35.0 go-module GHSA-hcg3-q754-cr77 High 15.90 < 0.1
golang.org/x/net v0.18.0 0.38.0 go-module GHSA-vvgc-356p-c3xw Medium 5.05 < 0.1
golang.org/x/net v0.18.0 0.36.0 go-module GHSA-qxp5-gwg8-xv66 Medium 1.24 < 0.1
github.com/jackc/pgx/v5 v5.4.3 5.5.2 go-module GHSA-fqpg-rq76-99pq Medium N/A N/A
If we are content with the scanning result[4], let us quickly add this to our image:
$ podman run --rm -v .:/workspace -it --network=host \
ghcr.io/oras-project/oras:main \
attach localhost:5000/todo-service:latest --plain-http \
--artifact-type showcase/sbom \ (1)
sbom.json:application/vnd.cyclonedx+json
✓ Uploaded sbom.json 50.1/50.1 KB 100.00% 2ms
└─ sha256:0690e255a326ee93c96bf1471586bb3bc720a1f660eb1c2ac64bbf95a1bd9693
✓ Exists application/vnd.oci.empty.v1+json 2/2 B 100.00% 0s
└─ sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a
✓ Uploaded application/vnd.oci.image.manifest.v1+json 724/724 B 100.00% 3ms
└─ sha256:5c6bb144aaed7d3e4eb58ac6bcdbf2a68d0409d5328f81c9d413e9301e2517a9
Attached to [registry] localhost:5000/todo-service@sha256:fb1f02fff7f1406ae3aa2d9ebf3f931910b69e99c95e78e211037f11ec8f1eb6
Digest: sha256:5c6bb144aaed7d3e4eb58ac6bcdbf2a68d0409d5328f81c9d413e9301e2517a9
| 1 | This gave me a bit of a headache, because Zot supports SBOM scanning and also propagates the results on the web UI - see the sidepanel for more information. |
And if we run discover again we can see there is a new layer:
$ podman run --rm -v .:/workspace -it --network=host \
ghcr.io/oras-project/oras:main \
discover --format tree --plain-http \
localhost:5000/todo-service:latest
localhost:5000/todo-service@sha256:fb1f02fff7f1406ae3aa2d9ebf3f931910b69e99c95e78e211037f11ec8f1eb6
└── showcase/sbom
└── sha256:5c6bb144aaed7d3e4eb58ac6bcdbf2a68d0409d5328f81c9d413e9301e2517a9
└── [annotations]
└── org.opencontainers.image.created: "2025-06-04T12:40:38Z"
Speaking about security: just adding images without any means, apart from the checksum, of verifying that they are the real deal doesn’t make too much sense to me.
I think the why should be clear, let us talk about how.
Needless to say, topics like encryption, signatures etc. are usually pretty complicated, so I can gladly report there is lots of tooling to ease this for us dramatically. I did the homework for us in preparation for this post and checked our options. While doing that I found lots of references to notary and skopeo, but the full package and overall documentation of cosign just convinced me; it can basically sign anything in a registry.
In this last chapter we are going to sign our image and specific layers via in-toto attestations with the help of cosign.
Cosign comes with lots of useful commands to create and manage identities, signatures and whatnot, but most conveniently it just allows us to select from a list of supported identity providers in our browser at runtime:
$ podman run --rm -v .:/workspace --network=host \
ghcr.io/sigstore/cosign/cosign:v2.4.1 \
sign --yes \
localhost:5000/todo-service:latest
Generating ephemeral keys...
Retrieving signed certificate...
Non-interactive mode detected, using device flow.
Enter the verification code xxxx in your browser at: https://oauth2.sigstore.dev/auth/device?user_code=xxxx (1)
Code will be valid for 300 seconds
Token received!
Successfully verified SCT...
...
By typing 'y', you attest that (1) you are not submitting the personal data of any other person; and (2) you understand and agree to the statement and the Agreement terms at the URLs listed above. (2)
tlog entry created with index: 230160511
Pushing signature to: localhost:5000/todo-service
| 1 | Quickly follow the link and pick one to your liking - we continue with GitHub here. |
| 2 | Glad we added --yes - interactivity in containers is usually a pain. |
And when we check the web UI we can see there is a bit of progress:
Relying on Zot is all well and good, but there are other ways to verify a signature.
It all boils down to another simple call of cosign:
$ podman run --rm -v .:/workspace --network=host \
ghcr.io/sigstore/cosign/cosign:v2.4.1 \
verify \
--certificate-oidc-issuer=https://github.com/login/oauth \ (1)
--certificate-identity=[email protected] \
localhost:5000/todo-service:latest | jq ".[] | .critical" (2)
Verification for localhost:5000/todo-service:latest --
The following checks were performed on each of these signatures: (3)
- The cosign claims were validated
- Existence of the claims in the transparency log was verified offline
- The code-signing certificate was verified using trusted certificate authority certificates
{
"identity": {
"docker-reference": "localhost:5000/todo-service"
},
"image": {
"docker-manifest-digest": "sha256:fb1f02fff7f1406ae3aa2d9ebf3f931910b69e99c95e78e211037f11ec8f1eb6"
},
"type": "cosign container image signature"
}
| 1 | There are several options for verification available - we just rely on issuer and mail. |
| 2 | Apparently this critical is nothing of concern and a format specified by Red Hat. |
| 3 | This is a short summary of the checks that have been performed during the verification. |
Just as a negative test, this is how it looks when the verification actually fails:
$ podman run --rm -v .:/workspace --network=host \
ghcr.io/sigstore/cosign/cosign:v2.4.1 \
verify \
--certificate-oidc-issuer=https://github.com/login/oauth \
--certificate-identity=[email protected] \
localhost:5000/todo-service:latest
Error: no matching signatures: none of the expected identities matched what was in the certificate, got subjects [[email protected]] with issuer https://github.com/login/oauth
main.go:69: error during command execution: no matching signatures: none of the expected identities matched what was in the certificate, got subjects [[email protected]] with issuer https://github.com/login/oauth
First step done - step two is to sign our SBOM as well.
If you have made it this far into this post, I probably shouldn’t bore you with yet another spec about in-toto or the framework around it, and will just provide the examples:
$ DIGEST=`podman run --rm -v .:/workspace -it --network=host \
ghcr.io/oras-project/oras:main \
discover --format json --plain-http \
localhost:5000/todo-service:latest | jq -r ".referrers[].reference"` (1)
$ podman run --rm -v .:/workspace --network=host \
ghcr.io/sigstore/cosign/cosign:v2.4.1 \
attest --yes \ (2)
--type cyclonedx \ (3)
--predicate /workspace/sbom.json \
$DIGEST
Generating ephemeral keys...
Retrieving signed certificate...
Non-interactive mode detected, using device flow.
Enter the verification code xxxx in your browser at: https://oauth2.sigstore.dev/auth/device?user_code=xxxx
Code will be valid for 300 seconds
Token received!
Successfully verified SCT...
Using payload from: /workspace/sbom.json
...
By typing 'y', you attest that (1) you are not submitting the personal data of any other person; and (2) you understand and agree to the statement and the Agreement terms at the URLs listed above.
using ephemeral certificate:
-----BEGIN CERTIFICATE-----
LOREMIPSUMDOLORSITAMETCONSECTETURADIPISCINGELIT
MORBIIDSODALESESTVIVAMUSVOLUTPATSODALESTINCIDUNT
...
-----END CERTIFICATE-----
tlog entry created with index: 232176597
| 1 | We need the digest to identify our artifact for the next steps - so please keep it at hand. |
| 2 | Don’t forget to deal with the interactive prompt here. |
| 3 | Some information about type and name of what cosign is supposed to attest. |
| cosign still supports the older command attach sbom to attach artifacts, but it is deprecated and it is generally advised to use proper attestations. There is a heated debate about its status and maturity though. |
As mentioned before, this is complex, so let us have a closer look at what we actually get back.
$ podman run --rm -v .:/workspace --network=host \
ghcr.io/sigstore/cosign/cosign:v2.4.1 \
download attestation \
$DIGEST | jq "del(.payload)" (1)
{
"payloadType": "application/vnd.in-toto+json", (2)
"signatures": [
{
"keyid": "",
"sig": "MEYCIQDE4/CeQstLjHLE+ZQ+BCH+aaw2wSWSr9i26d7iuazXrwIhAPtly5XBD6C14s/78vTjuHdLOjj2a9TeSgs0yD6YRrZd"
}
]
}
| 1 | We omit the payload data here - feel free to dump your own base64 blob |
| 2 | This is the actual type of the payload that has been transmitted. |
If you want to see the actual content of the payload here is a small exercise for you:
$ podman run --rm -v .:/workspace --network=host \
ghcr.io/sigstore/cosign/cosign:v2.4.1 \
download attestation \
$DIGEST | jq -r .payload | base64 -d | jq .predicate
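For the curious, what that pipeline does can be sketched in a few lines of Python: the downloaded attestation is a DSSE envelope whose payload field holds a base64-encoded in-toto statement. The statement below is a made-up minimal example, not real cosign output.

```python
import base64
import json

# Hypothetical minimal in-toto statement; the real one carries the whole
# CycloneDX SBOM as its predicate.
statement = {
    "_type": "https://in-toto.io/Statement/v0.1",
    "predicateType": "https://cyclonedx.org/bom",
    "predicate": {"bomFormat": "CycloneDX"},
}

# The DSSE envelope as returned by `cosign download attestation`.
envelope = {
    "payloadType": "application/vnd.in-toto+json",
    "payload": base64.b64encode(json.dumps(statement).encode()).decode(),
    "signatures": [{"keyid": "", "sig": "..."}],
}

# Equivalent of: jq -r .payload | base64 -d | jq .predicate
decoded = json.loads(base64.b64decode(envelope["payload"]))
print(decoded["predicate"])
```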
And lastly in the same manner as before the attestation can also be verified by the means of cosign:
$ podman run --rm -v .:/workspace --network=host \
ghcr.io/sigstore/cosign/cosign:v2.4.1 \
verify-attestation \
--type cyclonedx \
--certificate-oidc-issuer=https://github.com/login/oauth \
--certificate-identity=[email protected] \
$DIGEST | jq ".[] | .critical"
$ podman run --rm -v .:/workspace --network=host \
ghcr.io/sigstore/cosign/cosign:v2.4.1 \
verify-attestation \
--type cyclonedx \ (1)
--certificate-oidc-issuer=https://github.com/login/oauth \
--certificate-identity=[email protected] \
$DIGEST > /dev/null (2)
Verification for localhost:5000/todo-service@sha256:5c6bb144aaed7d3e4eb58ac6bcdbf2a68d0409d5328f81c9d413e9301e2517a9 --
The following checks were performed on each of these signatures:
- The cosign claims were validated
- Existence of the claims in the transparency log was verified offline
- The code-signing certificate was verified using trusted certificate authority certificates
Certificate subject: [email protected]
Certificate issuer URL: https://github.com/login/oauth
| 1 | Here we pass some expectations to the checks. |
| 2 | We don’t want to see the exact same content from the previous step again. |
Passing bogus information or trying to verify the wrong digest leads to an error:
$ podman run --rm -v .:/workspace --network=host \
ghcr.io/sigstore/cosign/cosign:v2.4.1 \
verify-attestation \
--type cyclonedx \
--certificate-oidc-issuer=https://github.com/login/oauth \
--certificate-identity=[email protected] \
$DIGEST > /dev/null
Error: no matching attestations: none of the expected identities matched what was in the certificate, got subjects [[email protected]] with issuer https://github.com/login/oauth
main.go:74: error during command execution: no matching attestations: none of the expected identities matched what was in the certificate, got subjects [[email protected]] with issuer https://github.com/login/oauth
Phew, that was quite a lengthy journey to reach this point - time for a small recap.
During the course of this post we have seen how OCI registries can be leveraged to store almost any kind of artifact. The layered structure and format allow adding additional metadata, and ancillary artifacts like Helm charts can be put to rest there as well.
Bills of materials allow quick scans of layers for known vulnerabilities, and combined with proper signing the security of the supply chain can be strengthened even further. Alas, this is no silver bullet either and takes lots of work to get right in automated workflows.
I personally think this is a great addition: it solves my initial hunt for artifact storage and also eases the handling of all the dependencies of different kinds of artifacts in a more secure way. The next stop for me is to compile all this into a shiny new Architecture Decision Record and discuss it with my team.
All examples can be found here hidden in the taskfiles:
The great enemy of communication is the illusion of it.
I think we can all agree that communication is hard, especially when you want to convey something that is perfectly clear to you. One simple explanation can be the curse of knowledge, but this (at least) doesn’t help me in my next struggle to find the right words without getting frustrated first.
This kind of struggle can be, mildly put, interesting in personal communication, but what happens in anything business-related, like the complex requirements of your next big product?
Over the course of this post I want to put emphasis on visual communication, which can help support any narrative and ultimately help you be understood.
|
Like many posts in this blog before, we again use my sample todo application - if you still
haven’t seen it yet you can find an OpenAPI specification here: https://blog.unexist.dev/redoc/#tag/Todo |
Even with business requirements it is possible to start simple and one of the simplest things a user probably wants to do with our application is following:
Simple enough and perfectly straightforward, but the same can be expressed (and supported, not replaced, mind you) with a simple use-case diagram:
I suppose if I asked you for your first thoughts on this example now, I’d probably get something in the range of "this just adds clutter and is completely overkill for such a simple matter".
So why do I still insist this adds benefits?
Targeting the right audience is key here as well, but adding too much technical jargon and information to a use-case defeats the benefit of grasping everything at a quick glance.
UML might offer many niceties, but please ask yourself does the extension of the previous use-case add anything of value?
Let’s move on to a more complex use-case.
No creative hat on today, so I am just going to re-use an idea from my previous [logging/tracing showcase][]:
This just adds a bit more complexity, but with focus on the business side the updated use-case can look like this:
So far this probably doesn’t bring any real benefit business-wise, so let us quickly add a way to actually see the created todo entries and leave our competitors awestruck:
There are many more ways to improve these use-cases and I don’t lack funny ideas, but the main goal here was to demonstrate the power of visual use-cases and the story that can unfold.
Instead of creating all of these use-cases in isolation, we can also carry on with the story idea and actually tell them.
At its heart Domain Storytelling is a workshop format, usually held by a domain expert and a supporting moderator, who share examples how they actually work inside the domain.
While the expert explains the domain, the moderator tries to record the story with a simple pictographic language. Each domain story covers one concrete example and can be directly used to verify if the story has been understood correctly or otherwise adjusted.
This approach allows all participants to learn the domain language (see ubiquitous language), get an understanding of the activities of the domain and also discover boundaries between the different parts (see bounded contexts).
The authors of the book Domain Storytelling [domstory] also provided Egon, a lightweight editor to support the workshop format.
One of my personal favorite features, among others, is the replay button, which blends in the different steps one by one like a good slide deck.
If we translate our last use-case to a simple domain story, one version could be like this:
Writing and evaluating requirements can be a progressive approach as we have seen with the evolution from a single no-brainer requirement to a more complex one. Going even further, the whole process can be done in a conversational and story-telling way and directly improve the understanding of all participants.
Using diagrams for communication isn’t something new, still I rarely see developers using them. I sometimes think this might be a problem of tooling, but with the rise of documentation-as-code this shouldn’t be an excuse anymore.
Domain storytelling is a different approach to the whole idea and even if you don’t follow this approach by detail, your projects can still benefit from the way Egon tells your stories.
If you are interested in this topic and want to read more about it, I highly suggest having a look at these two books:
Domain Storytelling [domstory]
I am getting more and more obsessed with centralized documentation, and this isn’t because I enjoy writing documentation (which, unfortunately, I really do), but more due to the sheer lack of it in my day job and all the related issues we are currently facing.
Pushing ideas like the one from my previous post (Bringing documentation together) certainly helps to make writing docs easier, but there are still some loose ends to follow - like API.
So this post is going to demonstrate how OpenAPI (or formerly Swagger) can be converted into shiny AsciiDoc and be brought into the mix.
There are many ways to document APIs (mind you, any documentation is better than none!), but sticking to established standards like OpenAPI and AsyncAPI (which isn’t too far off) really helps to keep the cognitive churn low while trying to understand what a document is trying to convey.
And from a developer’s perspective there are many low-hanging fruits:
Code-first or API-first - you decide
Many generators in both directions available - like ktor-openapi-tools used in the example
Tools like Swagger UI and Redoc
Comes pre-assembled with a testing tool
Again, there are dozens of options to select from. Since I rely on the Confluence publisher plugin, my initial pick was something with Maven integration as well, but unfortunately swagger2asciidoc has been unmaintained for quite some time. I actually tried to use it, but this was more of an educational endeavor in learning what happens to neglected packages.
The next best option, and probably what should have been my first pick anyway, is OpenAPI Generator with its exhaustive list of generators. It offers a plethora of different ways to convert specs, and thankfully AsciiDoc is among them.
If we omit all nitty-gritty details it boils down to this call:
$ openapi-generator-cli generate -g asciidoc \
--skip-validate-spec \ (1)
--input-spec=src/site/asciidoc/spec/openapi.json \
--output=src/site/asciidoc (2)
| 1 | Let us ignore version handling and maturity of my own spec for now |
| 2 | This is my preferred structure for Maven-based documentations |
| This can also be run from a container, see either openapi-generator-cli or have a look at my containerfile for even more dependencies. |
When everything works well a resulting document like this can be viewed:
One of the strong points of AsciiDoc is surely its extensibility, and this is also true for the generator pipeline we are using now.
By default, the generator offers a lot of different entry points to provide custom content for inclusion in the final document, without fancy hacks like e.g. including the generated document in your own one.
If you have a closer look at the actual generated document you can see lots of commented out includes like:
[abstract]
.Abstract
Simple todo service
// markup not found, no include::{specDir}intro.adoc[opts=optional]
An introduction sounds like a good idea, so we could use the space there to inform our readers about the automatic updates of the document:
$ cat asciidoc/src/site/asciidoc/spec/intro.adoc
[CAUTION]
This page is updated automatically, please do *not* edit manually.
After that we have to tell the generator to actually include our document.
When started, it looks for these templates[1] inside the specDir, something we haven’t set before, but are quite able to.
This only requires a minor change of our previous commandline:
$ openapi-generator-cli generate -g asciidoc \
--skip-validate-spec \ (1)
--input-spec=src/site/asciidoc/spec/openapi.json \
--output=src/site/asciidoc \
--additional-properties=specDir=spec/,useIntroduction=true (2)
| 1 | As before, we skip validation of my spec |
| 2 | Additional properties can be used to pass configuration directly down to the AsciiDoc renderer |
And hopefully, a run of the above rewards with an output like this:
There are many more templates that can be filled and I would gladly supply a list, but at the time of writing I just can offer to grep the document on your own:
$ \grep -m 5 "// markup not found" src/site/asciidoc/index.adoc
// markup not found, no include::{specDir}todo/POST/spec.adoc[opts=optional]
// markup not found, no include::{snippetDir}todo/POST/http-request.adoc[opts=optional] (1)
// markup not found, no include::{snippetDir}todo/POST/http-response.adoc[opts=optional]
// markup not found, no include::{specDir}todo/POST/implementation.adoc[opts=optional]
// markup not found, no include::{specDir}todo/\{id\}/DELETE/spec.adoc[opts=optional]
| 1 | Looks like we can also supply snippets to the example sections - neat! |
|
During my tests I stumbled upon a weird behavior: there are different checks for the index and the generation phase, which have different requirements for the actual path. This made it necessary for me to fix things with a symlink in my builds:
|
I think this is the third time I tease how everything can be pushed to Confluence, but since I don’t run any personal instance just feel teased again:
$ mvn -f pom.xml \
-DCONFLUENCE_URL="unexist.blog" \
-DCONFLUENCE_SPACE_KEY="UXT" \
-DCONFLUENCE_ANCESTOR_ID="123" \
-DCONFLUENCE_USER="unexist" \
-DCONFLUENCE_TOKEN="secret123" \
-P generate-docs-and-publish generate-resources
What have we done here? Strictly speaking this doesn’t bring many advantages, especially when the tooling for OpenAPI looks as polished as this:
The ultimate goal is to create a central place where these specifications can be stored, without too many hurdles for non-dev stakeholders. Developers do well when told the specs can be generated via a Makefile[2], but what about other roles, e.g. testers?
Back then we rolled a special infrastructure container, which basically included SwaggerUI along with the current versions of our specs, but infrastructure is additional work that has to be done and everything that leads to it must be maintained.
Whatever you do, providing easy access to documentation really helps to reach a common understanding and also might help to keep it up-to-date.
All examples can be found here:
The ultimate goal of my previous post about Dagger was to demonstrate combining it with Gitlab and Podman, but unfortunately I ran into so many different problems that I decided to break the post apart.
This is the second part of a small series and explains how to set up Gitlab with Podman-in-Podman, including the various pitfalls along the way.
|
If you are looking for the first part just follow this link over here: Building with Dagger. |
The first step in order to start Gitlab is to provide an SSL cert, but to make this a lot more interesting we rely on a self-signed one:
$ openssl req -newkey rsa:4096 -x509 -sha512 -days 365 -nodes \
-out gitlab.crt -keyout gitlab.key \
-addext "subjectAltName=DNS:gitlab" \ (1)
-subj "/C=DE/ST=DE/L=DE/O=unexist.dev/OU=showcase/CN=gitlab" (2)
| 1 | This line is essential, otherwise Gitlab won’t accept this cert |
| 2 | We are going to use gitlab for the hostname, so make sure to add it to your hosts file |
Next up is the actual config of Gitlab.
There is plenty that can be configured beforehand (especially in memory-constrained environments it is beneficial to disable services like Prometheus), but here we trust in convention over configuration and include only the bare minimum required to run Gitlab:
external_url 'https://gitlab:10443/'
registry_external_url 'https://gitlab:4567' (1)
registry_nginx['enable'] = true
registry_nginx['listen_port'] = 4567 (2)
nginx['ssl_certificate'] = "/etc/gitlab/ssl/gitlab.crt"
nginx['ssl_certificate_key'] = "/etc/gitlab/ssl/gitlab.key"
nginx['listen_port'] = 10443 (3)
| 1 | Setting the ports here causes problems elsewhere, so better also set the ports in <2> and <3> |
| 2 | My initial idea was to use the registry as a cache, but more on that later |
| 3 | Nginx usually picks the port from external_url, which is not what we want to do |
Like Kubernetes, Podman allows us to group or rather encapsulate containers in pods and also to convert them afterward, so let us quickly create one:
$ podman pod create -n showcase --network bridge \
-p 10022:22 `# Gitlab ssh` \
-p 10443:10443 `# Gitlab web` \
-p 4567:4567 `# Gitlab registry`
e91d11fdeb168c5713c9f48a50ab736db59d88ae7e39b807371923dcf4f26199
This can be done with the make target pd-pod-create.
|
Once everything is in place we can fire up Gitlab:
$ podman run -dit --name gitlab --pod=gitlab \
--memory=4096m --cpus=4 \
-v ./gitlab.crt:/etc/gitlab/ssl/gitlab.crt \ (1)
-v ./gitlab.key:/etc/gitlab/ssl/gitlab.key \
-v ./gitlab.rb:/etc/gitlab/gitlab.rb \ (2)
-v ./gitlab-data:/var/opt/gitlab \
-e GITLAB_ROOT_PASSWORD=YourPassword \ (3)
docker.io/gitlab/gitlab-ce:latest
17349b87f81aa9eb7230f414923cf491c84a36a87d61057f8dc2f8f82c7ea60a
| 1 | We pass our new certs via volume mounts to Gitlab |
| 2 | Our previously modified minimal config |
| 3 | Let’s be creative |
This can also be done with the make target pd-gitlab.
|
Once the container is running, Gitlab can be reached at the following address: https://localhost:10443
Great success, but unfortunately Gitlab alone is only half the deal.
Setting up a runner which is able to spawn new containers inside Podman is a bit tricky and requires building a specially configured container first.
Luckily for us other people struggled with the same idea and did the heavy lifting for us:
$ podman build -t $(RUNNER_IMAGE_NAME) -f runner/Containerfile \ (1)
--build-arg=GITLAB_URL=$(GITLAB_URL) \ (2)
--build-arg=REGISTRY_URL=$(REGISTRY_URL) \
--build-arg=PODNAME=$(PODNAME)
| 1 | This relies on the pipglr project |
| 2 | This is an excerpt from the provided Makefile, so please consider the env variables properly set |
This can also be done with the make target pd-runner-podman-build.
|
The current registration process requires us to register a new runner inside Gitlab first, which can be done at: https://localhost:10443/admin/runners
Once submitted, the redirection is going to fail, since our host machine doesn’t know the hostname gitlab.
This can be bypassed by just replacing gitlab with localhost or with a quick edit of the hosts file:
$ grep 127 /etc/hosts
127.0.0.1 localhost
127.0.0.1 meanas
127.0.0.1 gitlab
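Adding that entry can also be scripted idempotently, which is handy if the setup lives in a Makefile anyway. A small sketch; it writes to a temporary copy here, so point it at /etc/hosts with sufficient rights to use it for real:

```shell
# Sketch: append the gitlab entry only if it is not present yet.
# $hostsfile is a stand-in for /etc/hosts in this example.
hostsfile=$(mktemp)
echo "127.0.0.1 localhost" > "$hostsfile"

# The grep guard makes the append safe to run repeatedly
grep -q "gitlab" "$hostsfile" || echo "127.0.0.1 gitlab" >> "$hostsfile"

grep gitlab "$hostsfile"
```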
Registration of the actual runner is a bit more involved, but remember the other people? pipglr, the actual hero of our story, comes prepared and brings some container labels to execute the registration commands.
I took the liberty to throw everything into a Makefile target, and we just call it directly this time:
$ TOKEN=glrt-t1_QnEnk-yx3sdgVT-DYt7i make pd-runner-podman
# This requires Podman >=4.1 (1)
#podman secret exists REGISTRATION_TOKEN && podman secret rm REGISTRATION_TOKEN || true
#podman secret exists config.toml && podman secret rm config.toml || true
Error: no secret with name or id "REGISTRATION_TOKEN": no such secret
Error: no secret with name or id "config.toml": no such secret
1a02dae2a667dbddbdc8bd7b0
Runtime platform arch=amd64 os=linux pid=1 revision=690ce25c version=17.8.3
Running in system-mode.
Created missing unique system ID system_id=s_d3cc561989f6
Verifying runner... is valid runner=t1_QnEnk-
Runner registered successfully. Feel free to start it, but if it's running already the config should be automatically reloaded!
Configuration (with the authentication token) was saved in "/etc/gitlab-runner/config.toml"
# Fix SSL config to contact Gitlab registry
db86c90b8d202682014668223
pipglr-storage
pipglr-cache
8230fd623fc59d7621600304efcf1a11b5c9bf7cec5a8de5237b6d0143edb809 (2)
| 1 | I really need to update this, meanwhile even my Debian machine uses a decent version of Podman |
| 2 | Yay! |
The output looks promising, so let us verify our containers via Podman:
$ podman ps -a --format 'table {{.ID}} {{.Image}} {{.Status}} {{.Names}}'
CONTAINER ID IMAGE STATUS NAMES
bfac4e6acb26 localhost/podman-pause:5.3.2-1737979078 Up 42 minutes e91d11fdeb16-infra
cc6599fdf8db docker.io/gitlab/gitlab-ce:latest Up 42 minutes (healthy) gitlab
8230fd623fc5 localhost/custom-pip-runner:latest Up About a minute pipglr
And there it is, our new runner in the list of Gitlab:
From here everything should be pretty much self-explanatory and there are loads of good articles on how to actually use Gitlab itself, like:
Following the original idea of using Dagger, just another step of preparation is required. Dagger uses another container inside the runner, which adds a bit more complexity to the mix:
The containers are nicely stacked, but this requires a specially crafted one for Dagger in order for it to access files:
FROM docker.io/golang:alpine
MAINTAINER Christoph Kappel <[email protected]>
RUN apk add podman podman-docker curl fuse-overlayfs \
&& sed -i 's/#mount_program/mount_program/' /etc/containers/storage.conf \ (1)
&& curl -sL --retry 3 https://dl.dagger.io/dagger/install.sh | BIN_DIR=/usr/local/bin sh
| 1 | This took me quite a while to figure out |
With so many containers (1x gitlab + 1x runner + 1x builder) the limit of a free tier can be quickly reached, and it is strongly advised to add some kind of caching layer. Gitlab comes with its own registry, which can be used to cache all artifacts locally.
We already did the required configuration in our minimal config, so we just have to push the containers and configure the registry.
$ podman login -u root -p $(GITLAB_PASS) --tls-verify=false https://$(REGISTRY_URL) (1)
$ podman push --tls-verify=false \
$(BUILDER_IMAGE_NAME):latest $(REGISTRY_URL)/root/showcase-dagger-golang/$(BUILDER_IMAGE_NAME):latest
| 1 | Perfectly set-up environment for sure! |
And finally this can be done with the make target pd-gitlab-prepare-cache.
|
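Once the image has been pushed, a pipeline can reference it straight from the local registry. A hypothetical .gitlab-ci.yml excerpt; the job name and image path are placeholders for illustration, not taken from the showcase:

```yaml
# Hypothetical excerpt: use the locally cached builder image instead of
# pulling it from an external registry on every run
build:
  image: gitlab:4567/root/showcase-dagger-golang/custom-builder:latest
  script:
    - dagger version   # sanity check that the cached image ships dagger
```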
Gitlab is by itself a complex system, and adding Podman and Dagger to the mix doesn’t make it easier at all, but probably increases the complexity tenfold.
So what do we actually get?
During my experiments with the trio I quickly ran into many problems, and some of them were really challenging. Although I tried to address some of them in this blog post, to make it easier for fellow readers to get started, the whole thing is still complicated.
My original goal was to benefit from having pipeline knowledge everywhere, since the same pipelines run locally and in the actual CICD, and to be freed from the sales stuff of Docker, but if I consider the cost of this small advantage…
Ultimately I made the decision to postpone every move in this direction for now.
All examples can be found next to the examples from the first post:
Documentation is and always was my strong point, and if I look back upon the year, which is about to close, it also has been a huge part of this blog and my daily job. During the year one critical problem (besides how to create a motivating environment to write documentation) remained:
How can we manage documentation that is scattered among many repositories and documentation systems?
The first problem is easily solved and I also recommended giving Antora a spin for my go-to documentation system AsciiDoc here, but what about the latter?
If you look closely, you can probably find n+1 documentation systems for every language: Javadoc for Java and Rustdoc for Rust, just to name a few I use daily. Visiting all of them is totally beyond the scope of this post, so it focuses on a more general approach with Doxygen, which also better matches my main motivation to align documentation for application and embedded software engineering.
Doxygen was actually the first documentation generator I’ve ever used and even my oldest C project subtle contains configuration for it.
In a nutshell Doxygen collects special comment blocks from the actual source files, takes care of all the symbols and provides various output formats like HTML in the next example:
#include <stdio.h>

#include "lang.h" /* assumed header declaring get_lang() from lang.c */

/**
* @brief Main function (1)
*
* @details (2)
* @startuml
* main.c -> lang.c : get_lang()
* @enduml
*
* @param[in] argc Number of arguments (3)
* @param[in] argv Array with passed commandline arguments
* @retval 0 Default return value (4)
**/
int main(int argc, char *argv[]) {
printf("Hello, %s", get_lang("NL"));
return 0;
}
| 1 | The brief section, as the name implies, briefly describes the method or function |
| 2 | A details block includes more verbose information about the implementation in the source file and can even contain Plantuml diagrams |
| 3 | Parameters should surprise no one, besides maybe the direction information: in, out or both |
| 4 | And lastly return values can also be nicely laid out |
Normally Doxygen commands start with a \, but I personally prefer the Javadoc @ version via the config option JAVADOC_AUTOBRIEF.
|
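To make this reproducible, here is a minimal Doxyfile sketch; the values are assumptions for a setup like the showcase, not its actual configuration:

```
# Minimal Doxyfile sketch (assumed values, not the showcase's real config)
PROJECT_NAME      = "showcase"
INPUT             = src
RECURSIVE         = YES
JAVADOC_AUTOBRIEF = YES   # treat the first line of Javadoc-style comments as @brief
GENERATE_HTML     = YES
GENERATE_XML      = YES   # the XML output is what AsciiDoxy consumes later on
PLANTUML_JAR_PATH = /usr/share/plantuml/plantuml.jar
```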
Doxygen can then be run either locally or even better via container to create the first version of our output:
$ podman run --rm -v /home/unexist/projects/showcase-documentation-asciidoxy:/asciidoxy \
-it docker.io/unexist/asciidoxy-builder:0.3 \
sh -c "cd /asciidoxy && doxygen"
Doxygen version used: 1.11.0
Searching for include files...
Searching for example files...
Searching for images...
Searching for dot files...
...
Generate XML output for dir /asciidoxy/src/
Running plantuml with JAVA...
Generating PlantUML png Files in html
type lookup cache used 8/65536 hits=26 misses=8
symbol lookup cache used 16/65536 hits=50 misses=16
finished...
Once done the generated html pages look like this (in dark mode):
This works well, but unfortunately creates another documentation artifact somewhere and doesn’t move us any closer to an aggregated documentation - yet.
Besides the html output from above, Doxygen can also create xml files which include information about all the found symbols, their documentation and also their relationship to each other. Normally this would be quite messy to integrate into Asciidoc, but this is the gap AsciiDoxy closes as we are going to see next.
Originally created by TomTom, and hopefully still maintained since I’ve opened a bug on Github, it parses the xml files and ultimately provides a short list of AsciiDoc macros for convenient use inside our documents:
${language("cpp")} (1)
${insert("main", leveloffset=2)} (2)
${insert("main", template="customfunc")} (3)
| 1 | Set the language - the Mako templates vary a bit based on the language |
| 2 | Insert an actual symbol |
| 3 | Insert the same symbol again, but use a different template now |
| The initial setup is a bit tricky, especially with the different modules, but refer to the showcase and the official manual if you are stuck. |
The container from before is equipped with the whole chain, so let us quickly fire it up:
$ podman run --rm -v /home/unexist/projects/showcase-documentation-asciidoxy:/asciidoxy \
-it docker.io/unexist/asciidoxy-builder:0.3 \
sh -c "cd /asciidoxy && asciidoxy \
--require asciidoctor-diagram \
--spec-file packages.toml \
--base-dir text \
--destination-dir src/site/asciidoc \
--build-dir build \
--template-dir templates \
-b adoc \
text/index.adoc"
___ _ _ ____ 0.8.7
/ | __________(_|_) __ \____ _ ____ __
/ /| | / ___/ ___/ / / / / / __ \| |/_/ / / /
/ ___ |(__ ) /__/ / / /_/ / /_/ /> </ /_/ /
/_/ |_/____/\___/_/_/_____/\____/_/|_|\__, /
/____/
Collecting packages : 100%|██████████████████████████████████| 1/1 [00:00<00:00, 226.55pkg/s]
Loading API reference : 100%|██████████████████████████████████| 1/1 [00:00<00:00, 47.60pkg/s]
Resolving references : 100%|██████████████████████████████████| 2/2 [00:00<00:00, 1954.48ref/s]
Checking references : 100%|██████████████████████████████████| 1/1 [00:00<00:00, 28149.69ref/s]
Preparing work directory: 100%|██████████████████████████████████| 2/2 [00:00<00:00, 267.69pkg/s]
Processing asciidoc : 100%|██████████████████████████████████| 2/2 [00:00<00:00, 67.52file/s]
Copying images : 100%|██████████████████████████████████| 2/2 [00:00<00:00, 6647.07pkg/s]
Once this step is done AsciiDoxy has expanded all the macros and replaced them with the appropriate AsciiDoc directives, like the following for ${insert("main", leveloffset=2)}:
[#cpp-hello_8c_1a0ddf1224851353fc92bfbff6f499fa97,reftext='main']
=== main
[%autofit]
[source,cpp,subs="-specialchars,macros+"]
----
#include <src/hello.c>
int main(int argc,
char * argv)
----
main
Main function
[plantuml]
....
main.c -> lang.c : get_lang()
....
[cols='h,5a']
|===
| Parameters
|
`int argc`::
Number of arguments
`char * argv`::
Array with passed commandline arguments
| Returns
|
`int`::
|===
| The markup is a bit cryptic, but shouldn’t be too hard to understand with a bit of AsciiDoc knowledge. |
AsciiDoxy can perfectly generate AsciiDoc documents by itself and even supports multipage documents, but we require an intermediate step for the next part.
There is more than one way to convert the prepared document to its final form, but as initially stated the general idea is to bring everything together.
I am not that fond of Confluence, but the goal of collecting everything in one place ranks higher than my taste here. Since rendering just the document doesn’t work here, we are going to rely on the asciidoc-confluence-publisher-maven-plugin from before.
This adds some more dependencies and finally explains why the container is based on Maven.
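A sketch of the relevant plugin declaration; the plugin coordinates and version match the download log further down, while everything else is an assumption about the showcase's pom.xml:

```xml
<!-- Sketch: Asciidoctor conversion hooked into generate-resources (assumed setup) -->
<plugin>
  <groupId>org.asciidoctor</groupId>
  <artifactId>asciidoctor-maven-plugin</artifactId>
  <version>2.1.0</version>
  <executions>
    <execution>
      <id>convert-asciidoc</id>
      <phase>generate-resources</phase>
      <goals>
        <goal>process-asciidoc</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```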
The base call to create the document works in the same manner as before:
$ podman run --rm --dns 8.8.8.8 -v /home/unexist/projects/showcase-documentation-asciidoxy:/asciidoxy \
-it docker.io/unexist/asciidoxy-builder:0.3 \
sh -c "cd /asciidoxy && mvn -f pom.xml generate-resources"
[INFO] Scanning for projects...
[INFO]
[INFO] --------------< dev.unexist.showcase:showcase-documentation-asciidoxy >---------------
[INFO] Building showcase-documentation-asciidoxy 0.1
[INFO] from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
Downloading from central: https://repo.maven.apache.org/maven2/org/asciidoctor/asciidoctor-maven-plugin/2.1.0/asciidoctor-maven-plugin-2.1.0.pom
...
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 2 resources
[INFO] asciidoctor: WARN: index.adoc: line 60: id assigned to section already in use: cpp-hello_8c_1a0ddf1224851353fc92bfbff6f499fa97
[INFO] Converted /asciidoxy/src/site/asciidoc/index.adoc
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 17.596 s
[INFO] Finished at: 2024-12-26T15:51:23Z
[INFO] ------------------------------------------------------------------------
And if we have a look at our final result:
Getting the actual document to Confluence is a nice exercise for my dear readers:
$ podman run --rm --dns 8.8.8.8 -v /home/unexist/projects/showcase-documentation-asciidoxy:/asciidoxy \
-it docker.io/unexist/asciidoxy-builder:$(VERSION) \
-e CONFLUENCE_URL="unexist.blog" \
-e CONFLUENCE_SPACE_KEY="UXT" \
-e CONFLUENCE_ANCESTOR_ID="123" \
-e CONFLUENCE_USER="unexist" \
-e CONFLUENCE_TOKEN="secret123" \
sh -c "cd $(MOUNTPATH) && mvn -f pom.xml -P generate-docs-and-publish generate-resources"
Give it a try, I’ll watch.
Adding Doxygen and AsciiDoxy to the mix allows us to enhance our documentation with rendered meta information directly from the code and supplements the existing features of directly including code by file or tag. Being able to customize the used templates and select per symbol what is included offers great flexibility and still keeps the beautiful look of AsciiDoc.
The additional overhead of the toolchain and the intermediate steps to call Doxygen, AsciiDoxy and AsciiDoc on every change is something to consider, but should be a no-brainer within a proper CICD pipeline.
All examples can be found here:
I can probably cite myself from this blog, but writing documentation (not necessarily good documentation mind you, but any at all) is really difficult and keeping it up-to-date nigh on impossible. To ease the pain, some clever people invented tools to write documentation-as-code, so docs can co-exist next to the source and have a better chance of being touched, whenever something is changed.
Based on my personal experience I can say the same is true for any kind of project decisions and good luck finding any hint about them - until I discovered records.
During the course of this post we are going to do a quick recap of ADR, mostly by pointing to links (DRY, you know?), introduce a new record type for technical debt (aptly named TDR), have a look at some examples with adapted tooling and talk a bit about the power of the idea that isn’t covered by the documents alone.
When I first heard about architecture decisions I was directly intrigued and blogged about them, so there is no need to reiterate that right now, but in hindsight I can say it really took a while for me to actually see their real benefit.
It never came to my mind, but why should we stop here?
Michael Stal pretty much got the gist of it, and his suggestion is to handle technical debt in the same vein as architecture decisions:
Documented as code
Well placed next to the actual code or any other kind of source code repository
With some mandatory fields and an open format as a guide rail.
In comparison with architecture decision records, the format of these new records (especially since it is Markdown) looks a bit different, but we are going to cover that later on.
| I included the descriptions of the fields in the actual document, just because I cannot explain the fields any better. |
Technical Debt Record
====================
Title:
------
A concise name for the technical debt.
Author:
-------
The individual who identified or is documenting the debt.
Version:
--------
The version of the project or component where the debt exists.
Date:
-----
The date when the debt was identified or recorded.
State:
------
The current workflow stage of the technical debt (e.g., Identified, Analyzed, Approved, In Progress, Resolved, Closed, Rejected).
Relations:
----------
Links to other related TDRs to establish connections between different debt items.
Summary:
--------
A brief overview explaining the nature and significance of the technical debt.
Context:
--------
Detailed background information, including why the debt was incurred (e.g., time constraints, outdated technologies).
Impact:
-------
Technical Impact:
- How the debt affects system performance, scalability, maintainability, etc.
Business Impact:
- The repercussions on business operations, customer satisfaction, risk levels, etc.
Symptoms:
---------
Observable signs indicating the presence of technical debt (e.g., frequent bugs, slow performance).
Severity:
---------
The criticality level of the debt (Critical, High, Medium, Low).
Potential Risks:
----------------
Possible adverse outcomes if the debt remains unaddressed (e.g., security vulnerabilities, increased costs).
Proposed Solution:
-------------------
Recommended actions or strategies to resolve the debt.
Cost of Delay:
---------------
Consequences of postponing the resolution of the debt.
Effort to Resolve:
-------------------
Estimated resources, time, and effort required to address the debt.
Dependencies:
-------------
Other tasks, components, or external factors that the resolution of the debt depends on.
Additional Notes:
-----------------
Any other relevant information or considerations related to the debt.
He also provides tooling along with the definition, which is quite nice for a starter.
The format is really close to the one of the ADR, so I did the obvious migration and adapted it to the format already used there.
The drawback of this is that the previous tools cannot handle the new format; in particular, the adr-tools cannot handle TDR yet.
During the course of the last few years I played with the original adr-tools, based on the work of their inventor Nat Pryce, and added some missing features, like the pending Asciidoc support, a simple database layer to speed up some of the generators and simple rss/atom feeds for easier aggregation.
This put me in a perfect position to adapt the tools even further and hack a new format into it under a new umbrella.
I am still playing with the idea to port the shellscripts to Rust - does anyone fancy record-tools-rs?
|
The following examples demonstrate how the record-tools can be used, starting with the basic steps up to deploying rendered versions to a Confluence instance, since it always pays off to include non-tech-savvy folks.
The record-tools include two examples, one of each kind, to kickstart the decision to actually use these formats; they keep the intention of the original along with some shameless self-advertisement:
= 1. Record architecture decisions
:1: https://unexist.blog/documentation/myself/2024/10/22/decision-records.html
|===
| Proposed Date: | 2024-10-24
| Decision Date: | 2024-10-24
| Proposer: | Christoph Kappel
| Deciders: | Christoph Kappel
| Status: | accepted
| Issues: | none
| References: | none
| Priority: | high
|===
NOTE: *Status types:* drafted | proposed | rejected | accepted | deprecated | superseded +
*Priority:* low | medium | high
== Context
We need to record the architectural decisions made on this project.
== Proposed Solution
Architecture Decision Records as {1}[summarised by Christoph] might help us as a format.
== Decision
We will use Architecture Decision Records.
== Consequences
None foreseeable.
== Further Information
== Comments
|
It isn’t strictly necessary to checkout the example, but if you want to play with the tooling:
|
Besides the name, the record-tools basically behave in the same manner as the original version of the tools, and for example a new TDR can be created like this:
$ ../src/record-tdr new Usage of log4j (1)
| 1 | This command creates a new record and opens it in your default $EDITOR |
If you consider the topic of this record, a lot probably comes to mind that you would like to add, but let us shorten this phase, accept the record as-is and save with :w.
Sometimes decisions have to be revised (or superseded), and that couldn’t be more true for technical matters, once more information has been gathered and/or experience with the actual decision has been gained.
$ ../src/record-tdr new -s 2 Usage of zerolog (1)
| 1 | Both are quite incompatible, but zerolog is always worth mentioning |
Under the hood, supersede just overwrites the status of the previous record with superseded and applies links in both directions. This can also be done manually with arbitrary links:
$ ./src/record-tdr link 3 Amends 1 "Amended by" (1)
| 1 | This command links record 3 to 1, along with the relationship of the link forwards and backwards |
There isn’t much direct visible effect besides the addition of the links to the Further Information field, but more on this in the next section:
== Further Information
Any other relevant information or considerations related to the debt.
Supersedes link:0002-usage-of-log4j.adoc[2. Usage of Log4j]
Amends link:0001-technical-debt-decision.adoc[1. Record technical debt decisions]
The tools include various generators that can be used to generate listings, graphs and even feeds.
The table of contents generator creates a nice overview of the known records and can additionally prepend an intro and append an outro, to allow further customization:
$ ../src/record-tdr generate toc -i Intro -o Outro
= TDR records
Intro
* link:0001-technical-debt-decision.adoc[1. Record technical debt decisions]
* link:0002-usage-of-log4j.adoc[2. Usage of log4j]
* link:0003-usage-of-zerolog.adoc[3. Usage of zerolog]
Outro
These two generators should be pretty self-explanatory:
$ ../src/record-tdr generate rss (1)
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>List of all tdr records</title>
<description>List of all created tdr records</description>
<ttl>240</ttl>
<lastBuildDate>2024-10-24 12:05</lastBuildDate>
<generator>record-tools</generator>
<webmaster>[email protected]</webmaster>
<item><title>1. Record technical debt decisions</title><link>0001-technical-debt-decision.adoc</link><category>high</category><pubDate>2024-10-24</pubDate><description>Status: superseded</description></item> <item><title>2. Usage of log4j</title><link>0002-usage-of-log4j.adoc</link><category>low</category><pubDate>2024-10-22</pubDate><description>Status: superseded</description></item> <item><title>3. Usage of zerolog</title><link>0003-usage-of-zerolog.adoc</link><category>low</category><pubDate>2024-10-23</pubDate><description>Status: drafted</description></item>
</channel>
</rss>
| 1 | Use either rss or atom for the specific type |
Both generators create a graph based on dot - the sole difference is the plantuml version just neatly wraps the output between @startdot and @enddot:
$ ../src/record-tdr generate plantuml
... (1)
| 1 | We omit the output here, because it looks way better directly rendered with Plantuml below |
Plantuml doesn’t use the passed links, but when the graph is directly rendered as a vector graphic (svg) it also includes links:
$ ../src/record-tdr generate digraph | dot -Tsvg > graph.svg
And index accumulates all known records, groups them based on different properties like the severity and combines everything into a clickable page.
| This uses the tools quite heavily - or in other words is pretty slow. Therefore it relies on the database to speed things up, which needs to be populated first. |
$ ../src/record-tdr generate database
$ ../src/record-tdr generate index
...
== List of all TDR with high severity
[cols="3,1,1,1,1", options="header"]
|===
|Name|Proposed Date|Decision Date|Status|Severity
|<<technical-debt-records/0001-technical-debt-decision.adoc#, 1. Record technical debt decisions>>|2024-10-24|2024-10-24|superseded|high
|===
== List of all TDR with critical severity
[cols="3,1,1,1,1", options="header"]
|===
|Name|Proposed Date|Decision Date|Status|Severity
|===
== List of all TDR
[cols="3,1,1,1,1", options="header"]
|===
|Name|Proposed Date|Decision Date|Status|Severity
|<<technical-debt-records/0001-technical-debt-decision.adoc#, 1. Record technical debt decisions>>|2024-10-24|2024-10-24|superseded|high
|<<technical-debt-records/0002-usage-of-log4j.adoc#, 2. Usage of log4j>>|2024-10-24|?|superseded|low
|<<technical-debt-records/0003-usage-of-zerolog.adoc#, 3. Usage of zerolog>>|2024-10-24|?|drafted|low
|===
...
This page can be converted via Asciidoctor and its various backends:
$ ../src/record-adr generate database (1)
$ ../src/record-adr generate index > _adr_autogen.adoc (2)
$ asciidoctor -D architecture-decision-records src/site/asciidoc/architecture-decision-records/*.adoc (3)
$ asciidoctor -D . -I architecture-decision-records /site/asciidoc/architecture-decision-records.adoc (4)
$ asciidoctor -r asciidoctor-pdf -b pdf -D . src/site/asciidoc/architecture-decision-records.adoc (5)
| 1 | Generate the database for both types |
| 2 | Generate a neat index page for both types |
| 3 | Render the actual documents now |
| 4 | Render the combined index page |
| 5 | Optional step - just in case a PDF version is required |
Once rendered the pages should look like this:
Another way of generating the page is via Maven, which is quite handy since it is a prerequisite for the next step anyway. Fortunately the example contains all required configuration and all that needs to be done is this:
$ mvn -P generate-docs exec:exec generate-resources (1)
| 1 | The maven exec plugin handles the database generation and index page part |
There is a Makefile included in the example that provides convenience targets for the commands, like make generate and make publish, which will come in handy for the next step.
|
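The two targets could look roughly like this - a sketch wrapping the Maven calls shown in this section, not the showcase's actual Makefile:

```
# Hypothetical Makefile targets wrapping the Maven calls from this section
generate:
	mvn -P generate-docs exec:exec generate-resources

publish:
	mvn -P generate-docs-and-publish exec:exec generate-resources
```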
And finally we want to publish our documents to make them easily accessible for everyone. There are many different options to pick from, but one of the easiest is to use the Confluence Publisher and put our documents into a Confluence instance of our choice.
Spinning up a Confluence instance for this example is quite pointless without a license, so if you really want to see it in action there is some config required in the pom.xml file:
<!-- Confluence config -->
<!-- NOTE: Be careful with the ancestorID, everything will be overwritten -->
<confluence.url>${env.CONFLUENCE_URL}</confluence.url> (1)
<confluence.publishingStrategy>APPEND_TO_ANCESTOR</confluence.publishingStrategy>
<!-- Provide these values from env; don't commit them! -->
<confluence.spaceKey>${env.CONFLUENCE_SPACE}</confluence.spaceKey> (2)
<confluence.ancestorId>${env.CONFLUENCE_ANCESTOR}</confluence.ancestorId> (3)
<confluence.publisherUserName>${env.CONFLUENCE_USER}</confluence.publisherUserName>
<confluence.publisherPassword>${env.CONFLUENCE_TOKEN}</confluence.publisherPassword>
| 1 | The configuration can either be passed via environment variables or be hardcoded - this is up to you |
| 2 | This is normally the two letter abbreviation of the space, which can be found within the space settings |
| 3 | And finally we also need the ancestor id to append our records to. Problems to find it? Just open the page settings and have a look at the address bar of your browser. |
And once everything is set up correctly just fire up following:
$ CONFLUENCE_USER=USER_NAME CONFLUENCE_TOKEN=USER_TOKEN mvn -P generate-docs-and-publish exec:exec generate-resources
Aside from the documentation aspect and the way these records are gently guided toward a common document layout, we haven’t spoken of the real power of this yet.
Records foster active collaboration and work splendidly with all kinds of crowd thinking. They offer a space to experiment, maybe in the form of proof-of-concepts or a simple showcase for a particular technology, or to collect further opinions in Writer’s Workshops.
In this way teams are able to contribute to and suggest changes to the overall architecture in the case of ADR, and point to critical problems with TDR. This can be a culture change for the involved teams, since it allows more active participation in the process, especially if they are involved in the actual (democratic?) decision.
We are still experimenting with the actual documents and formats at work, but my personal feeling is this really moves us forward, allows the team more autonomy and offers additional ways of contributing.
Like always all my examples can be found here:
Finding a good reason to explore different options for monitoring, or better observability, is difficult. Either there hasn’t been the singular impact on production yet that made you lust for better monitoring, and/or it is difficult to understand the merit and the investment of time.
And even when you make the decision to dive into it, it is always a good idea not to start on production, but with a simple example. Simple examples on the other hand rarely show the real powers, so it usually ends in heavily contrived ones, like the one I used in my last post about Logging vs Tracing.
Still, nobody got fired for buying IBM ramping up monitoring, so let us - for the course of this post - put our EFK stack and friends aside and get started with something shiny new in Golang.
If you are like me and you haven’t heard the name SigNoz before, the first and foremost questions are probably what is SigNoz and why not one of these solutions insert random product here.
From a marketing perspective the key selling point for me honestly was the headline on the frontpage:
OpenTelemetry-Native Logs, Metrics and Traces in a single pane
Without knowing that beforehand, this was exactly what I needed, so well done, marketing:
Seems to be FOSS
Single solution to address the three pillars
Nice and complete package
That sounds almost too good to be true, but time to put on my wizard hat and check the claims. Before messing with Docker, I checked the documentation and discovered an architecture overview, and it looks like they hold up their part of the bargain:
| 1 | Apps can directly send data to SigNoz |
| 2 | Otel collectors can transmit data as well |
| 3 | Internally another custom collector provides the endpoints to receive data |
| 4 | I hadn't heard of ClickHouse before either, but columnar storage sounds about right |
| 5 | Some abstraction to query the actual data |
| 6 | Alert Manager keeps track of and handles all the various alerts - glad they haven't reinvented the wheel |
| 7 | And the shiny bit we’ve spoken of before |
Once SigNoz is running, which basically boils down to calling docker-compose, the first question is how to deliver your actual data to it.
OpenTelemetry is the de facto standard for that and offers many ways to gather, collect and transmit data via highly configurable pipelines. The only noteworthy thing here is to pay attention to the size of the generated logs - which may cause some headaches, as it did for me during my vacation.
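For context, such a Collector pipeline roughly wires receivers, processors and exporters together. A minimal sketch could look like this - the endpoint and the plain OTLP exporter are assumptions for illustration, not SigNoz's actual shipped configuration:

```yaml
# Sketch of an OpenTelemetry Collector pipeline (names illustrative):
receivers:
  otlp:
    protocols:
      grpc:
processors:
  batch:
exporters:
  otlp:
    endpoint: "signoz-otel-collector:4317"  # assumed endpoint
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```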
While playing with SigNoz I discovered it doesn't connect each of its containers separately to an OpenTelemetry Collector[1], but passes this task entirely to a container running logspout.
After a quick glance at the Github page marketing did its thing again:
Logspout is a log router for Docker containers that runs inside Docker. It attaches to all containers on a host, then routes their logs wherever you want. It also has an extensible module system.
Alright, this still sounds like a splendid idea and is exactly what we do in the example. In fact, there isn't much we have to configure at all:
Docker needs a minimal config to get us started:
logspout:
  container_name: todo-logspout
  image: "docker.io/gliderlabs/logspout:latest"
  pull_policy: if_not_present
  volumes: (1)
    - /etc/hostname:/etc/host_hostname:ro
    - /var/run/docker.sock:/var/run/docker.sock
  command: syslog+tcp://otelcol:2255 (2)
  depends_on:
    - otelcol
  restart: on-failure
| 1 | Logspout needs access to the Docker socket, plus a hostname mapping for convenience |
| 2 | This configures a connection to a receiver of our otelcol instance and comes up next |
And we have to define a receiver in otelcol:
receivers:
  tcplog/docker:
    listen_address: "0.0.0.0:2255"
    operators: (1)
      - type: regex_parser (2)
        regex: '^<([0-9]+)>[0-9]+ (?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?) (?P<container_id>\S+) (?P<container_name>\S+) [0-9]+ - -( (?P<body>.*))?'
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
      - type: move (3)
        from: attributes["body"]
        to: body
      - type: remove
        field: attributes.timestamp
      - type: filter (4)
        id: logs_filter
        expr: 'attributes.container_name matches "^todo-(postgres|otelcol|logspout)"'
      - type: json_parser
        parse_from: body
| 1 | Operators allow us to parse, modify and filter entries |
| 2 | This is the default format of the messages logspout forwards to otelcol |
| 3 | We basically move our content to the actual body of the entry |
| 4 | There might be lots of different containers running, so we limit the entries based on container names |
There are plenty of explanations and definitions out there, way better than I could ever provide, but just to bring the three pillars back to memory:
| Logging | Historical records of system events and errors |
|---|---|
| Tracing | Visualization of requests flowing through (distributed) systems |
| Metrics | Numerical data like e.g. performance, response time, memory consumption |
The first pillar is probably the easiest one, and there is lots of help and reasoning out there, including on this blog.
So the best we can do is throw in zerolog, add some handling in a Gin-gonic middleware and move on:
logEvent.Str("client_id", param.ClientIP). (1)
Str("correlation_id", correlationId). (2)
Str("method", param.Method).
Int("status_code", param.StatusCode).
Int("body_size", param.BodySize).
Str("path", param.Path).
Str("latency", param.Latency.String()).
Msg(param.ErrorMessage)
| 1 | The essential mapping magic happens here |
| 2 | A correlation id can help to aggregate log messages of the same origin |
SigNoz offers lots of different options to search your data, and if you have any experience with Kibana and the likes you will feel right at home:
There is also no reason to shy away if you require some kind of aggregation and diagrams with fancy bars:
The second pillar is a slightly different beast and requires special code to enhance and propagate a trace - this is generally called instrumentation.
OpenTelemetry provides the required toolkit to start a tracer and also add spans:
func (resource *TodoResource) createTodo(context *gin.Context) {
	tracer := otel.GetTracerProvider().Tracer("todo-resource") (1)
	ctx, span := tracer.Start(context.Request.Context(), "create-todo",
		trace.WithSpanKind(trace.SpanKindServer))
	defer span.End()

	var todo domain.Todo
	if nil == context.Bind(&todo) {
		var err error
		// Fetch id
		todo.UUID, err = resource.idService.GetId(ctx)
		if nil != err {
			context.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
			span.SetStatus(codes.Error, "UUID failed") (2)
			span.RecordError(err) (3)
			return
		}
		// Create todo
		if err = resource.todoService.CreateTodo(ctx, &todo); nil != err {
			context.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
			return
		}
	} else {
		context.JSON(http.StatusBadRequest, "Invalid request payload")
		return
	}

	span.SetStatus(codes.Ok, "Todo created")
	span.SetAttributes(attribute.Int("id", todo.ID), attribute.String("uuid", todo.UUID)) (4)
	context.JSON(http.StatusCreated, todo)
}
| 1 | This fetches a tracer from the global tracer provider |
| 2 | Spans as working unit of a trace can include a status |
| 3 | Error messages can also be thrown in |
| 4 | And they can also include different types of general span attributes |
The above code calls the id-service and demonstrates how traces can be continued and passed between service boundaries:
func (service *IdService) GetId(ctx context.Context) (string, error) {
	tracer := otel.GetTracerProvider().Tracer("todo-service")
	_, span := tracer.Start(ctx, "get-id")
	defer span.End()

	response, err := otelhttp.Get(ctx, fmt.Sprintf("http://%s/id",
		utils.GetEnvOrDefault("APP_ID_HOST_PORT", "localhost:8081"))) (1)
	if err != nil {
		return "", err
	}
	defer response.Body.Close()

	jsonBytes, _ := io.ReadAll(response.Body)
	var reply IdServiceReply
	err = json.Unmarshal(jsonBytes, &reply)
	if err != nil {
		return "", err
	}
	return reply.UUID, nil
}
| 1 | The otelhttp package makes it really easy to propagate traces |
When everything is set up correctly, propagated traces look like this:
The last pillar is one of the most interesting and probably the most troublesome, since there is no easy recipe for what could and what should be done.
Metrics can generally be of the following types:
| Counter | A simple monotonically increasing counter which can be reset |
|---|---|
| Gauge | A single value that can go arbitrarily up and down |
| Histogram | A time series of counter values and a sum |
| Summary | A histogram with a sum and quantiles over a sliding window |
This allows a broad range of measurements, like the count of requests or the average latency between them, and has to be figured out for each service, or rather each service landscape, individually.
Still, when there are metrics they can be displayed on dashboards like this:
Although not directly related to the three pillars, alerts are a nice mechanism to define thresholds and intervals and to receive notifications over various kinds of channels.
The documentation is as usual quite nice and there isn't much to add here, besides the fact that a paid subscription is required to connect SigNoz to Microsoft Teams. There is also a way to fall back to Power Automate, but unfortunately this requires another subscription.
A little hack is to use the connectors for Prometheus, but please consider supporting the good work of the folks at SigNoz:
SigNoz is a great alternative to established solutions like EFK or Grafana, in a single well-rounded package. It is easy to install and, as far as I can tell, easy to maintain, and definitely worth a try.
All examples can be found here: