Tuesday, January 7, 2025

Olly 2.0

 You blink and suddenly there's Observability 2.0. It's a logical conclusion of where the more interesting things were going, and I'm a little disappointed in myself that I didn't see this coming. One of those "obvious in hindsight" things for me.

Related articles:


I do see the main challenge as adoption. I wonder if this can step out of being a niche pattern. Specifically, I think I disagree with the "critical mass" part of this:

a critical mass of developers have seen what observability 2.0 can do. Once you’ve tried developing with observability 2.0, you can’t go back. That was what drove Christine and me to start Honeycomb, after we experienced this at Facebook. It’s hard to describe the difference in words, but once you’ve built software with fast feedback loops and real-time, interactive visibility into what your code is doing, you simply won’t go back.


The market has been changing rapidly and the capabilities provided by the big providers like Datadog, Dynatrace, Grafana Labs etc. are ever expanding: more integrations to import ever more sources of data, more ways to visualise data (Datadog now even visualises step flow executions). Yet at the same time I feel less excited and more confused than when I first started playing with InfluxDB on a client project in 2014 (real-time claims processing data for an insurance company that the business didn't care about because they already had BI systems in place that they were happy with).

It is hard to actually find data that is relevant. Nobody builds custom dashboards; instead you go back and forth between a variety of pre-built, generic service pages that show you a lot of little graphs but very little information. To make up for this, Datadog, Dynatrace, Wavefront etc. all have some clever processing that tries to do some form of root cause analysis and anomaly detection for you. Often useful, always distracting.

I appreciate that it's no longer the default to have no instrumentation at all and to only hear about outages from customer complaints while you frantically try to deduce what's going on from what few actually useful logs you have. I'm just not sure how to advance beyond that.

On the data analytics side it seems to have now become common to have Data Stewards or some such role to ensure that whatever gets dumped into the data lakes by different teams adheres to some shared understanding of the world. Maybe something like that would be useful to agree upon in the dev world.

I've been excited enough about observability to do a conference talk about it. As mentioned in the introduction of that talk, it's because I want to get more people excited about the topic and because I think it's not easy to get started and see the possibilities if you're starting from scratch.

Honeycomb's sandbox examples seem to be the only thing that comes close to getting a glimpse of that. And even those examples are pretty heavily focused on the operational side.

As an aside - maybe there is a space for "open source" software to showcase some higher-level things to do with observability. IIRC Emelia was working on getting some metrics out of Hachyderm and maybe things like that could be expanded on. (Hachyderm also seems large enough to have statistically relevant amounts of data.)

It's my impression that not a lot of developers care all that much about observability. And the pretty decent out-of-the-box support of OpenTelemetry-type agents covers quite a lot of operational concerns. So I'm wondering what the driving cultural change could be that would get developers interested in putting some time into considering what might be interesting to log about whatever piece of code they're currently working on.
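To make "something interesting to log" a bit more concrete, here is a minimal sketch of emitting one wide, structured event per unit of work. It is only an illustration: the `emit_event` helper, the `process_claim` function and all field names (`claim_id`, `customer_tier` and so on) are made up for this example, and a real setup would more likely hand the event to an OpenTelemetry SDK or a logging library than print it.

```python
import json
import time


def emit_event(name, **fields):
    """Serialise one structured event as a single JSON line and print it."""
    event = {"event": name, "timestamp": time.time(), **fields}
    line = json.dumps(event, sort_keys=True)
    print(line)
    return line


def process_claim(claim_id, amount, customer_tier):
    """Hypothetical business function that reports what it did as one wide event."""
    started = time.monotonic()
    approved = amount < 1000  # stand-in for real business logic
    return emit_event(
        "claim_processed",
        claim_id=claim_id,
        amount=amount,
        customer_tier=customer_tier,
        approved=approved,
        duration_ms=round((time.monotonic() - started) * 1000, 3),
    )
```

Fields like the customer tier or the approval outcome are things only the developer of that code knows to record, which is exactly the part a generic agent cannot cover.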

I'm not sure if something equivalent to code coverage checks might be useful, if it were straightforward to build? I've not been a fan of hard coverage targets but as a soft constraint automated coverage and linting checks can be a useful way of automating standards that a team has agreed upon.

These changes might take time, I guess. As an industry we're still not particularly great about code quality. Taking TDD as an example, it's not like proper use of it is at all common. But at the same time doing at least some form of automated testing does seem the default now. I do remember the times when you had to justify the time spent on writing tests. We're clearly in a better place now.

Maybe over time it will become common to expect to be able to interrogate your running software for more interesting data than just how many requests per second some controller is handling.

Tuesday, May 7, 2024

DevConf 2024

 Update: talk is now online at https://www.youtube.com/watch?v=a-EB4d4FsyE

I'm doing a talk on Observability and wanted to add a few links with more information for people taking an interest.

The slides of my talk are on Speaker Deck, and there is also the repo of my playground.

Some useful guides

 Structured logging

A good intro to how rate queries work in Prometheus by Beorn Rabenstein (youtube)

A guide on how to do SLOs that I like

An interesting look into the future: https://hazelweakly.me/blog/redefining-observability/

Honeycomb's sandboxes 

Grafana Docs and playground

(Edit 09.01.2025) This post on structured logging is also really useful and worth adding to this post

Historical section

An old video by Coda Hale on metrics. The video and tech (dropwizard) are dated but I found it a good intro anyway

A bit more on how Netflix used to do deployments.  

Sam Newman's talk as mentioned


Tuesday, August 3, 2021

An issue I had with AWS CodeDeploy

 Hi there

Clearly, I've not really written anything here in 7 years. Let's change that. By writing about an issue we had at Qixxit about 3 years ago. It's pretty specific so really not exciting but I've published a bit of code for it and I wanted to finally do a bit of a write-up on it.

The problem

We ran Qixxit's backend on AWS as an auto-scaling group behind an application load balancer. The auto-scaling group would manage EC2 instances running the application. The auto-scaling group would know about the linked load balancer through a property "target-group" in its configuration. This allows the auto-scaling group to tell the load balancer to stop sending traffic to an instance before removing it.

We used AWS CodeDeploy's blue/green deployment feature to roll out new versions. This would create a new configuration of the auto-scaling group based on a newly built AMI. It would then bring up new instances for that and redirect traffic from the load balancer to these. Eventually, the old configuration is removed.

The new configuration for the auto-scaling group has all the settings of the previous configuration copied over automatically by CodeDeploy. Except for that "target-group" setting.

Now when the auto-scaling group decides that it wants to scale in and remove an instance it will just immediately shut down that instance. (Sidenote: maybe it's because English isn't my native language but I really don't get why it's scale in/scale out and not scale down/scale up. Is this maybe specific to Amazon's documentation?) Anyway, the instance is gone but the load balancer still sends traffic to it.

I don't actually remember what specific problems this caused and whether this was also an issue when scaling out. It wasn't too dramatic and initially we worked around it by just setting the target group manually and filed a support request.

Amazon's support was helpful and acknowledged the issue. But they didn't provide a way to directly track whether this was being worked on (I guess they generally don't?) and from the communication it sounded like it was pretty unlikely anyone would ever actually work on it.

A solution

And so I wrote a bit of JavaScript to solve it for us. CodeDeploy sends notifications for the various steps of the deployment. These can be hooked into in order to set the target group of the auto-scaling group. The whole thing runs as a Lambda function managed via the Serverless Framework.
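For readers who want a feel for the shape of such a hook, here is a rough sketch. It is not the code I published (that was JavaScript); it is a Python illustration under several assumptions. The outer `Records` / `Sns` / `Message` structure is how Lambda receives SNS messages, and `attach_load_balancer_target_groups` is the boto3 Auto Scaling client call for attaching a target group. The `status` and `deploymentId` fields in the message, the choice to react only to `SUCCEEDED`, and the `find_new_asg_name` helper are assumptions made for this example.

```python
import json


def handle_deployment_event(event, autoscaling, find_new_asg_name, target_group_arn):
    """Attach the target group to the auto-scaling group created by a deployment.

    autoscaling: an Auto Scaling client (injected so it can be faked in tests).
    find_new_asg_name: hypothetical helper mapping a deployment id to the name
        of the auto-scaling group that the deployment created.
    """
    attached = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        # Assumed message fields: "status" and "deploymentId".
        if message.get("status") != "SUCCEEDED":
            continue
        asg_name = find_new_asg_name(message["deploymentId"])
        autoscaling.attach_load_balancer_target_groups(
            AutoScalingGroupName=asg_name,
            TargetGroupARNs=[target_group_arn],
        )
        attached.append(asg_name)
    return attached
```

In a real Lambda the client would be something like `boto3.client("autoscaling")`; passing it in as an argument keeps the function easy to test-drive with a fake.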

This was a nice way for me to practice my JavaScript test-driving skills and also to play around with the Serverless Framework. And as of recently also to play around with GitHub Actions. Though I don't actually know if any of it still works or whether it's even still relevant and maybe the bug got fixed by Amazon.


In closing

I don't like how this write-up turned out. It seems impossible to me to describe this in a generic way that doesn't require some knowledge of a bunch of specific AWS products. But maybe that's a good way to keep expectations low for this blog.

Tuesday, March 25, 2014

Looking back on 2013


A lot of things happened for me in 2013 but much of it seems informal and I'm not sure I can make much of it in my usual year in review format. So be it. Also, I wrote this early February and then forgot to publish it.

2013 started with me returning to South Africa after a much needed vacation back home. I arrived with a much clearer head and much of the stress from the initial months in South Africa gone.

The project I was working on during my whole time in Joburg was doing well and we started trying to spread those experiences out to more teams at the client. This proved to have the same difficulties every other enterprise agile transformation seems to struggle with, too. I think personally, I learned heaps of things there in terms of how to communicate better and wish I had taken a more active role in some of the discussions. On the plus side, the client was willing to experiment with processes to some degree, we had some really experienced Thoughtworkers around that I could learn a lot from and for myself I'm happy with what was achieved during the time I was there.

Overall though, I find that the topic of enterprise agile transformation increasingly fills me with despair. There are so many inherent contradictions causing the same kinds of friction, again and again, with huge challenges for making a lasting positive difference for both the companies and the people working for them. I feel new approaches and the courage to experiment with them are required. Lots of that appears to be happening in smaller companies but I haven't witnessed any of that myself. Ok, enough of that. Where was I?

I kept taking part in the Joburg developer community, attending lots of little events and making friends along the way. While the community is relatively small, I do like the vibe and this is one of the reasons why I'm looking forward to spending more time in Johannesburg. But more on that later.

I also had a slightly more active role in some of the events. We had another Black Girls Code event that I ended up facilitating, with heavy support from Thoughtworkers and non-Thoughtworkers alike. I was happy to see (from afar, as I was already back in Germany) that the next BGC event had even more non-Thoughtworkers there. The more people involved in it, the more sustainable and successful it will hopefully become. Curriculum-wise we were working with material we received from the BGC organisers in the US, who were also kind enough to take the time to talk us through it over Skype.

I also did a little workshop at the Developer Usergroup, appropriating the very nice "Taking Babysteps" format. I had learned about it by taking part when Adi ran it in Hamburg a few months earlier.

Just before heading back to Germany my colleague Rouan and I also gave a talk at JSinSA about lineman, a javascript framework that more people should know about. (As an aside, I stumbled upon lineman because I've been twitter-stalking its author Justin Searls for years now, ever since he wrote a maven plugin that stood out because it actually did what it was supposed to.)

I was very happy in the Johannesburg Thoughtworks office and when it came time to return home to Germany, that contrast made it pretty clear that I didn't want to keep working for Thoughtworks in Germany. And so my farewell to South Africa also became my farewell to Thoughtworks.

The next months were incredibly chaotic as I was trying to figure out what I wanted to do next. A good part of that time was spent as a journeyman developer. I blogged about that time extensively so I won't repeat too much of that here. Except to repeat that I'm grateful to have had the opportunity and really enjoyed it.

During this time I also went to the SoCraTes and ALE conferences again. Both events have quickly turned into my favourite events during the year and I'm already sad that I will probably miss both of them in 2014.

Eventually I made up my mind and decided to go back to South Africa. It's pretty difficult to describe what I enjoy about the vibe of both the developer community as well as that of Johannesburg itself but in summary I feel it's worth it to go back for a few years.

I went back for two months at the end of the year to go job-hunting. It was a pretty positive experience overall and I'm also happy and excited about the new job coming up.

While I was back I also took part in the Global Day of Coderetreat again. I really liked the venue and it was nice to just take part without being involved in facilitating.

Going back to Germany, I had to wait for my work permit to come through. This proved to be a pretty rough time as being without a job and without my own place really started to wear on me.

But eventually I got back to Joburg, started a new job, got my own place again and published this blog post. So it all worked out but that's a story for next year's year in review.

Thursday, February 6, 2014

Journeyman weeks - retrospecting

(All my previous posts on this topic can be found via the journeyman weeks label)

With a little bit of distance I will now try to sum up my experience of my little journeyman tour.

The quick summary is: I'm really happy I did this and I'm really grateful that it all came together on such short notice.

Getting started

First I want to elaborate a little bit more on my motivation. At the time of the SoCraTes conference I didn't really know what to do next, work-wise. I had briefly contemplated joining a Berlin start-up as the technical co-founder but ultimately decided against it. I also wasn't really happy with the idea of just working for any German company. I also started toying with the idea of going back to South Africa.

As all my stuff was still in storage and I didn't feel like committing to a new place in Germany it meant I was crashing on people's couches anyway. And I didn't just want to hang around doing nothing while I was trying to make up my mind on what to do next. As such, the idea of the journeyman tour seemed like a perfect fit.

After coming back from SoCraTes I wrote up what I wanted to do in a little blog post. People kindly reshared it on Twitter and generally were very encouraging. I spent a few days doing this and coordinating with people and sorted out a little bit of an itinerary. I had 6 weeks to fill, with most of the first week spent on organising and then a week in between in Bucharest for the ALE conference. The other four weeks thankfully filled up quickly. I'm pretty happy that I didn't really think this through because if I had, I might have just gotten scared. What if nobody was interested? What if, after talking about it so much, I would not actually be able to do it?

This wasn't that unlikely, I now realise, as companies struggled to understand what I was doing. And since I didn't really know either, I wasn't sure how to market myself. In all the four companies that I ended up visiting I was introduced by people I already knew who had faith in me. I wasn't that aggressive in communicating with other companies and in hindsight I'm much more convinced of what I had to offer. So maybe, if I had taken more time to prepare, I would have managed to find companies by cold-calling them. But as it was, I was dependent on other people doing my marketing for me and I am incredibly grateful for the support I got.

Lessons learned

I really wasn't sure what usefulness I could provide in just one week. And while I wasn't asking for much in terms of compensation, finding something for me to do, providing someone for me to pair with and getting me set up definitely required effort by my hosts. And so I was a little bit scared to disappoint the faith put into me. Thankfully, I think it turned out ok and I was actually surprised by how much can be done in a week.

Part of that usefulness falls into the general usefulness of pairing. Having a fresh set of eyes to help work on something, I think, was appreciated by all of my hosts. (This relates well to the "Beginner's Mind" pairing pattern, described in this post.)

I also didn't expect how quickly I was able to become productive in the various new dev environments. With a lot of automation already set up and a pair to help explain the details I was able to focus on the actual code. And with concrete tasks to work on this provided a pretty perfect setting to dig into a new language or tool. This is also what I found most beneficial for my own goal of learning new things during those weeks. If I want to get started with a new language I usually struggle setting up some dummy test app to play around with. Having the basics of building, deploying and baseline architecture already created makes it really easy and fun to explore a new language.

There is probably a lesson in here for companies scared of becoming polyglot, trying to fit all their development work to one language (and one framework and platform, too. Eek.) While there certainly are risks to maintaining a diverse set of software projects, I would say that we tend to underestimate how much of the existing know-how of software development can be carried over to a new language. And this only gets easier the more you learn about what differentiates various languages.

There's probably also a lesson about automation and reducing the time for new people joining a team to actually start writing code. But I think that's a little out of scope of this post.

Apart from providing a nice way to learn new development skills I also appreciated being able to learn more about how different companies work. All four companies were pretty different in size and organisation and it was nice to be able to take a peek into that. And I think this arrangement would be a pretty good way to do hiring and have potential employee and employer get to know each other.

Last but not least, besides languages and companies I also got to work with more people and make new friends. That alone has made this worth it for me. Spending time with people who share some of the same enthusiasm for software development was really refreshing and energising.

Try it yourself?!

So, would I recommend this to others? In general, I would say yes. I found it a good way to learn and to get diverse experience in a short time-span. And for the hosting companies I would say it provides quite a bit of value to have a fresh pair of eyes looking at existing code and processes. But there are some caveats, of course. I'm not sure I would have been as eager to do this if I hadn't been homeless at the time. And the things I learned about were a little bit random and probably not always on the top of my list of things to look at. This didn't bother me much, as my experience has been that a little bit of randomness in my life has always turned out well for me. But it's worth pointing out.

For the host companies I think the benefits are more pronounced. It's a cheap way to have an outsider come in and ask questions and potentially pick up on blind spots. This is something I think would even work for less experienced developers coming in (and would also work for other roles than devs). And fostering pairing is also something that is valuable, even though a lot of companies may not have picked up on that. Plus, I think showing that a company cares about learning is good advertising from a hiring perspective.

Further reading

There are other people who did similar travels who inspired me and who put more effort and energy into it than I ever did. I'm extremely happy to see that this doesn't seem to let up.

First, there is Corey Haines, who I saw talking about craftsman swaps, code retreats and his journeyman tour at QCon in London in 2010. He had a tremendous impact on how I think about my work. (You can watch the talk here.)

Rob Ashton, while probably working from slightly different motivations, was the one whose blog posts came at just the right time to spark off discussing this idea at SoCraTes. A year after he started, he still hasn't settled down and I envy him for it.

Peter Kofler has also done a craftsman tour around Vienna and I'm grateful to have been able to exchange experiences with him. He's also written extensively about it on his blog.

Since then I've also become aware of and impressed by Andy Waite and the guys from The Bakery.

I hope this encourages you to try something similar. If I can be of any help please do contact me either via twitter, by email or just leave a comment.

And, if you're in or around Germany, there are groups for remote pairing and for craftsman swaps on the Softwerkskammer website.

Wednesday, October 2, 2013

Journeyman weeks - week four @ msgGillardon

Read here about last week at soundcloud...

Nicole Rauch had been a strong supporter even before I actually came up with the idea of doing a journeyman tour. The German Softwerkskammer network has been playing with the idea of craftsmen swaps and Nicole worked hard to make that possible in the company she works for. Hopefully the example will inspire others to do similar things.

During SoCraTes Nicole and her partner Andreas Leidig had offered to host me and eventually got the OK from their employer msgGillardon to let me work there for a week. The company is set in the small town of Bretten, close to Karlsruhe. It's a very beautiful little corner of Germany and I was quite happy with the contrast of small town life compared to frantic Berlin the week before.

View from the top of the office

The part of msgGillardon I was working at makes software doing forecasting for financial institutions. Most of that is still in C++, with newer parts now being written in Java. I was a little too intimidated by C++ to work on that, which in hindsight might have been unnecessarily self-limiting. But I actually enjoyed working in Java again, since it's been quite a while. The framework used (Eclipse RAP) and the specifications written in FitNesse provided enough things that were new to me and gave opportunity to learn.

I remain skeptical of frameworks that try to hide the complexity of visually representing an application. In the case of Eclipse RAP it at least seems to be thorough and consistent in hiding representation, which makes it more acceptable than Wicket or JSF (*shudder*). If you're happy to stick with what the framework provides and think in terms of desktop applications, as in this case, RAP seems to be helpful. But one of the drawbacks showed itself when we started to write Selenium tests for the application. The effort required to select elements to interact with made automating them too expensive.

FitNesse seems a little crude in some places but I liked working with it. I was missing support to generate code snippets, as Cucumber/JBehave/SpecFlow all do. Aside from that, writing examples in Slim tables fit very nicely for the domain we were working on. There was a lot of combinatorial complexity in the inputs and it looked a lot more comprehensible to have these in FitNesse rather than in Java unit tests. It also allowed for a slightly easier active conversation with the product owner about the required functionality.

In terms of culture, msgGillardon is quite different to the start-up companies I've visited in previous weeks (and also quite different compared to ThoughtWorks and its clients). As a more traditional medium sized company working for a lot of customers in banking, things were a little bit more formal and the technology less bleeding edge.

Nevertheless, I was very positively surprised by the willingness to try out new things. (And not just the obvious experiment of letting some random guy show up there and work there for a week. With very convenient and simple organisation.) There seems to be a genuine interest in changing and improving and that is not something I'm taking for granted anymore.

The other thing that definitely stuck out was the diversity of the teams, at least in age and gender. I have no idea how that came to be but it was refreshing to see. I'd be curious if people there have found an explanation for why they are doing so much better than everyone else in Germany.

During the week I also had the opportunity to take part in the Softwerkskammer Karlsruhe meet-up. It was nice to see so many familiar faces from the SoCraTes conference. Nicole and Andreas were running a legacy refactoring workshop that was a very nice alternative to the legacy code retreat format. Like all good workshops, it left me with a lot of things to think about on how I would do things differently if I were to do it again.

This ends my journeyman weeks for now. I feel incredibly grateful for the privilege of having been able to do this. I am in the process of summarising the different experiences and comparing and contrasting them. I will hopefully also have some useful information for others who want to try something similar. If you have any specific questions or feedback, please do leave a comment, contact me on twitter or write me an email. I'd love to hear from you.

Pictures from the past couple of weeks are on picasa. Also check out Peter Kofler's blog about doing something similar in Vienna and Andy Waite's for something remarkably global.

Wednesday, September 18, 2013

Journeyman weeks - week three @ soundcloud


To find where I left off go here...

Leaving Bucharest behind

The break in Bucharest was really awesome. The ALE conference was as good as I remembered from Berlin in 2011. There's a certain vibe to this conference that is hard to describe. People generally seem to be very open and trusting and thus the conversations to be had are very interesting and candid. It felt very nice to be around friends old and new. And people were very encouraging of what I'm currently doing and very supportive. Which reminds me to point out the following. If you like what I'm doing please let me know in comments or on Twitter. It means a lot to me.

I stayed at quite historic grounds

I was still processing the impressions from Bucharest when I set out for Berlin. Arriving there I found my way to the interim SoundCloud office in Prenzlauer Berg. I was greeted by my old friend and former colleague Duana Stanley who had arranged my week there. We had never got to work on the same team while we were in ThoughtWorks, so I was looking forward to finally getting to pair up with her.

The interim SoundCloud office

We had briefly talked about what I could do while I was there and one of the things I said I was interested in doing was learning more about Android development. The Android team had been working on the jenkins CI build of their SoundCloud client and were happy to have Duana and me help out on some things there.

Their build consists of a fairly normal maven configuration that gets instrumented by a rake build file. That seemed like a much nicer solution to wrap filesystem and git tasks than trying to do that in maven. As we set to work, I was quite surprised by how much maven knowledge was still in my head. Since I haven't really worked with maven in two years and have been trying to forget its existence for longer than that, this was not what I had expected. (Just to be clear: this is not an implicit endorsement of other build tools.)

We managed to get our initial goal done fairly quickly, which clearly showed the benefits of pairing. Since some of that work was the usual drudgery of trial and erroring our way through different maven settings it was nice to have someone to share in the suffering as we pushed each other forward.

After that we tried to move parts of the build out into its own project. This took more time than I would have liked but eventually reduced the build time by a good minute. Duana quickly calculated that we would need about 1200 builds to make up for the time we invested in doing this. As we calculated a bit more I was surprised by how quickly this would pay back, if you consider that multiple people are running the build, many times a day. This led me to realise that I never really thought much about this in other instances of trying to improve build times and it seems painfully obvious now.

Sure, everyone feels the pain when the build duration passes that magical 5-minute barrier. And we feel compelled to improve things. And even 5 minutes is pretty awfully slow. But if you think about it in more concrete terms it gets highlighted. Improve the build by a minute with 3 pairs trying to build 10 times a day and you already get half an hour. By month's end you have already gained more than a full day. The very unscientific and fuzzy way of coming up with this number is balanced by the fact that you really should be building more than 10 times a day anyway.
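Spelled out as a tiny calculation, the back-of-the-envelope numbers look like this. The 20 working days per month and the reading of "1200 builds" as 1200 minutes invested (one minute saved per build) are my assumptions for the sketch, and the function names are made up.

```python
def minutes_saved_per_day(pairs, builds_per_pair_per_day, minutes_saved_per_build):
    """Total build time saved per day across all pairs, in minutes."""
    return pairs * builds_per_pair_per_day * minutes_saved_per_build


def working_days_to_break_even(minutes_invested, saved_per_day):
    """How many working days until the saved time equals the time invested."""
    return minutes_invested / saved_per_day


per_day = minutes_saved_per_day(pairs=3, builds_per_pair_per_day=10, minutes_saved_per_build=1)
per_month = per_day * 20  # assumes 20 working days per month
break_even = working_days_to_break_even(minutes_invested=1200, saved_per_day=per_day)

print(per_day)     # 30 minutes, i.e. half an hour a day
print(per_month)   # 600 minutes, i.e. 10 hours a month
print(break_even)  # 40.0 working days
```

Ten hours a month is more than a full working day, and the investment pays for itself in about two months even at this modest build frequency.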

Anyway, back to my week. Finally we got to work on a little feature for SoundCloud's Android client. Since Android 4.1, notifications can have a different, expanded view when there is enough space in the notification menu. Getting this working for the SoundCloud client was a nice, small feature that cut through a lot of basic Android concepts. I got to do the layout for the big view and, while refactoring, learned how the notifications fit into the concept of Android's services. We decided not to add any extra functionality to it yet, because we felt that we wouldn't be able to finish it by the end of the week.

We didn't get around to writing many tests (shame on me) but learned that Robotium wouldn't be able to test the notifications anyway. But I think we managed to get the feature to a good starting position.

On Friday afternoons, SoundCloud has a demo session where people from across teams show what they've been working on. This seems to have most of the employees present and is followed by drinks and mingling. Duana and I presented our results and I talked very briefly about what I am doing with this journeyman tour.

Large paintings of your favourite meme

I quite liked this get together and SoundCloud's company culture in general. SoundCloud is decidedly bigger than the other two startups I was at (and they're still growing) and I think it's at a critical stage where it becomes challenging to keep the existing close-knit culture. People were very open and helpful to me and I think there's still that sense of playfulness that makes startups fun. Also, they provide free pistachios and cashews (among other things) which meant I got pretty sick quickly because I have terrible impulse control when it comes to eating those.

While I was there I also got to meet some of the women working for the Rails Girls Summer of Code project. They all seemed really eager and to be doing fine so I was pretty happy to see that the project seems to be going along well.

I also went to another session of xtc Berlin. I think there were about eight people present and I was happy to see that it still lives on. It was also really nice to catch up with some people I hadn't seen in a while. Among them was Stefan Hübner, who reminded me about EuroClojure, which I promptly bought a ticket for.

Next week I'm heading south to Bretten near Karlsruhe. Nicole Rauch and Andreas Leidig have invited me to work with them at msgGillardon. I've never spent any time in that region of Germany so it should be interesting.