architecture, process

Today I want to talk about something I’ve seen in a few places that just frustrates me to no end: the repeated failed strangler pattern.

To make sure we’re all on the same page: The strangler pattern is when you want to overhaul an existing system and you do so by wrapping it in a facade. You swap out components under the facade from old bits to new bits, eventually strangling the old system and removing the facade so only the new stuff is left. The idea is that doing this over time potentially has lower risk than rewriting an existing system from scratch.
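
To make the facade idea concrete, here is a minimal sketch in Python. Everything in it (`StranglerFacade`, `legacy_engine`, `new_service`, the feature names) is a hypothetical stand-in I made up for illustration, not code from any real system. The facade sends each feature to the new implementation once one exists and falls through to the old system otherwise.

```python
from typing import Callable, Dict, Set

Handler = Callable[[str], str]


def legacy_engine(request: str) -> str:
    """Hypothetical stand-in for the old system."""
    return f"legacy:{request}"


def new_service(request: str) -> str:
    """Hypothetical stand-in for a rewritten component."""
    return f"new:{request}"


class StranglerFacade:
    """Routes each feature to its new handler, or to legacy if none exists yet."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, feature: str, handler: Handler) -> None:
        """Swap one feature from the old bits to the new bits."""
        self._migrated[feature] = handler

    def handle(self, feature: str, request: str) -> str:
        return self._migrated.get(feature, self._legacy)(request)

    def is_finished(self, all_features: Set[str]) -> bool:
        # Done only when no feature still falls through to the legacy system.
        return all_features <= set(self._migrated)
```

Until `is_finished` is true for every feature you support, both the facade and the legacy system are still load-bearing.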

There are a couple of articles on this pattern, including one by Martin Fowler.

In general, I think this is a decent pattern. However, I take issue with the implication of this line from the Fowler article (emphasis mine):

They aren’t yet at the point where the old application is strangled - but they’ve delivered valuable functionality to the business that gives the team the credibility to go further. And even if they stop now, they have a huge return on investment - which is more than many cut-over rewrites achieve.

My challenge is around the notion that you can do a partial strangler and that’s an OK - or even good - thing.

My experience directly contradicts this. Let’s take a hypothetical example:

Let’s say you started out a long time ago, in a galaxy far, far away, with a simple string processing engine. It was long before SOAP or REST. It did what it needed to do, you sold it, and you got some customers up and running with it.

String processing engine

So far, so good. SOAP comes along, and XML is pretty awesome, so you decide you want to start moving away from arbitrary string-in, string-out and to something with more formal contracts. Cool! Strangler to the rescue. Except… you do still have some customers that won’t really be able to update right away… you need to leave access to the original string processing engine in place. New folks can take the new interface, though, so that’s good, right?

XML messages wrapping the string processing engine

Turns out the strangler to convert strings to XML wasn’t quite SOAP due to some custom extensions you needed to create to make it work with the string processing engine. You still really want some SOAP wrappers on this thing, though, so you can start decoupling things and iterate faster over individual services/features. Let’s wrap the XML message handling with actual SOAP contracts that are pretty close to but not exactly like the XML messages.

Except… you sold the XML messaging to some customers and they can’t really switch to the slightly modified contracts for the services. And you really haven’t pushed those original string processor customers to upgrade yet because they’re threatening to leave if you create any breakages.

SOAP services wrapping XML messages wrapping the string processing engine

OK, this time for reals - REST is a bit lighter weight and would lend itself better to some of the new prospective clients’ needs. Getting some REST microservice support in there could really get things going, especially since most of the developers you’re hiring now are more versed in REST and that’s the direction your market is going.

(I bet you see where this is going…)

Except… now you have customers on all three previous layers: SOAP, XML messaging, and string messaging. Gotta keep access to all three of those things. No breaking changes! Ever!

REST services wrapping SOAP services wrapping XML messages wrapping the string processing engine

Does that look at all familiar?

Seems like a bit of a design flaw...

This is why I call it “Death Star Architecture.” You’re not finishing the strangler, so instead of getting the benefit of the pattern, you’re just adding layers to your system that all need to be maintained and tested.

Finish your strangler! In the short term it may seem like you’re getting benefits, but long term the unfinished work results in technical debt that will ultimately cause your destruction.

autofac, net

Time to address a frequently asked question I see with Autofac and many other open source libraries I use or work on:

I’ve upgraded the (target framework/base libraries/build/.NET SDK version) for my application and now I see build warnings due to transitive dependencies. Why don’t you upgrade the library dependencies to a later version to address an issue I see in my application?

I see this a lot in the .NET Core realm, where use of many small dependencies rather than a big installed framework is a new thing. It’s not so much a question in places like Node.js where the use of many small, chained dependencies has been around for a while.

What it boils down to is that there’s a difference in how you manage dependencies in libraries and applications.

Libraries Target Compatibility

When you have a library, you want to make sure it’s stable and compatible for your consumers, both at the outset and across upgrades. People want the latest features and bug fixes, right? This means a lot of things including:

  • Interfaces need to be stable. If you change an interface, people consuming your library may not be able to take the next update. That means for any public and protected interfaces and classes you have to be very mindful about changes.
  • Dependencies need to be stable. If the library takes an update to a dependency, at a minimum that means anyone taking the next version of your library is forced to take an indirect update to the dependency. It may mean forcing a breaking change onto the library consumers who may be directly referencing the dependency and can’t take that update.
  • Target the lowest common denominator. That means targeting the lowest version of the .NET Framework or .NET Standard that you can. Lower target version means more compatibility. This also means targeting the lowest version of dependencies you can for the same reason - lower version means generally more compatibility, especially with folks who are already using the transitive dependency.
  • Keep the target framework stable. Increasing that target .NET Standard or framework version means the new version of the library may not be compatible with existing applications - people will be forced to update their applications or may just be locked out of using the new version of the library.

As you can imagine, any changes here can cause unforeseen ripple effects. Upgrading a dependency version may fix one issue but could cause downstream consumers problems you can’t anticipate.

The general rule for dependency/framework versions is to target low and keep it stable.

Applications Target Features

When you write an app, your largest concerns are the features you need and the target environment in which it’s going to run. It means priorities shift as far as compatibility and upgrades are concerned.

  • If you need a new dependency feature, you can just take it. If you see something new you need out of a dependency - a feature, a bug fix, whatever - you can take the upgrade when you want.
  • Breaking API changes are surfaced differently. This may be a REST API, a command line argument interface, or a plugin abstraction, but bringing in a new application dependency generally doesn’t cause a breaking change for application consumers.

Application dependencies generally don’t end up affecting application consumers.

Addressing Library Dependency Scenarios

By and large, when you see an issue with a transitive dependency coming into your application, the solution is to add a direct dependency in your application to the version you want.

Some scenarios you may see in the .NET Core world to help make this concrete:

  • Security updates in .NET Core. When security issues are detected in .NET Core base class/framework libraries, the way you resolve that in your application is to manually take dependency updates to fixed versions. (This doesn’t mean libraries shouldn’t take these updates, but it does illustrate how application developers don’t necessarily get to delegate that responsibility away.)
  • .NET Standard 2.0 Release. When a new .NET Standard release is issued, that doesn’t require every library to move to that release even if that’s what your application targets. If you look at how .NET Standard works, a higher .NET Standard means more APIs are available to you, but it also means fewer existing apps that target lower .NET Standard versions are able to use your library. You can consume lower .NET Standard libraries in higher targeting applications. For example, a library may target .NET Standard 1.3 and you can totally consume that in your .NET Standard 2.0 app.
  • New dotnet SDK/CLI/build system. If you upgrade your development/build systems to use a new dotnet SDK/CLI it can start generating warnings on your applications when it finds transitive dependencies that may be stale. The solution is to add a direct reference to the later version of the stale dependency. The reason that’s the solution is that not everyone has updated to that latest version of the dotnet SDK/CLI. Libraries need to maintain backward compatibility, so forcing the update may indirectly require consumers to upgrade parts of their development or build process that they’re not ready or able to upgrade.
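
The .NET Standard point in the list above boils down to a simple comparison, since each .NET Standard version includes the APIs of the versions before it. This is a rough Python sketch of that rule only. The function names are mine, and it ignores everything else that goes into real target framework compatibility.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.3' into (1, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def can_consume(library_target: str, app_target: str) -> bool:
    """A consumer can use a library that targets the same or a lower .NET Standard."""
    return parse_version(library_target) <= parse_version(app_target)
```

So `can_consume("1.3", "2.0")` is true, while `can_consume("2.0", "1.3")` is false. That's why a library raising its target locks out existing consumers.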

The point is that many of the challenges seen at the application level due to transitive dependencies can be solved by adding direct dependencies; those same challenges may not be solved in the same way at the library level.
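
Here is a deliberately simplified model of why a direct dependency fixes things for the application. It is not NuGet's actual resolver, and the package and library names are hypothetical. It assumes each library only states a minimum version, and that a version the application references directly wins over whatever comes in transitively.

```python
from typing import Dict, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '1.2.0' into (1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(direct: Dict[str, str],
            libraries: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Pick one version per package.

    direct:    package -> version the application references itself.
    libraries: library -> {package: minimum version that library asks for}.
    """
    resolved: Dict[str, str] = {}
    for wanted in libraries.values():
        for package, minimum in wanted.items():
            current = resolved.get(package)
            # Among transitive requests, the highest minimum satisfies them all.
            if current is None or parse(minimum) > parse(current):
                resolved[package] = minimum
    # A direct reference in the application overrides the transitive choice.
    resolved.update(direct)
    return resolved
```

In this model the libraries never change. The application alone decides to move `Shared.Dependency` forward by adding it to `direct`.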

Wrapping It Up

Hopefully this helps clarify why libraries you consume “don’t just take a dependency upgrade” when you notice something in your application.

And, of course, a lot of this is generality. In some cases apps can’t just “take upgrades when they want” because customer environments may not support those changes. In some cases libraries can make changes to public APIs or dependency versions and it doesn’t hurt anything. There’s no hard and fast rule, just a basic understanding of the tradeoffs.

If this sort of discussion interests you, Nick Blumhardt has a similar post on his blog about logging abstraction usage differences between libraries and applications - libraries use abstractions (like Microsoft.Extensions.Logging or LibLog), applications use implementations (like Serilog or log4net).

javascript, tfs

For better or worse, I’m trying to host a Bower package in a Git repository hosted in Team Foundation Server 2013.

Something I noticed when trying to install the package into my project is that the default Git credential manager was totally being ignored. That is, this was happening:

PS> bower install --save http://my.tfs.server:8080/tfs/Collection/Project/_git/my-package
bower my-package#*      not-cached http://my.tfs.server:8080/tfs/Collection/Project/_git/my-package#*
bower my-package#*         resolve http://my.tfs.server:8080/tfs/Collection/Project/_git/my-package#*
bower my-package#*        download http://my.tfs.server:8080/tfs/Collection/Project/_git/my-package
bower my-package#*           EHTTP Status code of 401

I had authenticated to the TFS server before and credentials should have been stored using the Windows Credential Manager. Doing a git ls-remote on the repo didn’t prompt me for credentials.

The answer, as it turns out, is to prefix the URL with git+ and suddenly the credential manager kicks in.

PS> bower install --save git+http://my.tfs.server:8080/tfs/Collection/Project/_git/my-package
bower my-package#*      not-cached http://my.tfs.server:8080/tfs/Collection/Project/_git/my-package#*
bower my-package#*         resolve http://my.tfs.server:8080/tfs/Collection/Project/_git/my-package#*
bower my-package#*        checkout 1.2.3
bower my-package#*        progress Receiving objects:  14% (15702/112154), 71.73 MiB | 9.57 MiB/s
bower my-package#*        progress Receiving objects:  20% (22431/112154), 81.10 MiB | 9.51 MiB/s
...

build, process

Pre-emptive disclaimer: This set of anecdotes doesn’t refer to a specific product or a specific repository. Some things come from past life experience, some come from open source projects, some come from other places I’ve encountered. As they say in the movies, “The story, all names, characters, and incidents portrayed in this production are fictitious. No identification with actual persons (living or deceased), places, buildings, and products is intended or should be inferred.”

Some of what I do involves continuous integration and continuous delivery, build tooling, that sort of thing. A particular pain inflicted on me is when I run into a monolith repository - a huge, multi-project, multi-solution repo with tons of different things that are all intended to work together so they all live in one source code repository.

Autofac was in one of these monoliths until a few years ago in our move to GitHub. There were good reasons for it to be that way, but in the end we split it up due to some of the challenges.

Here are a few of the challenges I’ve seen in working with monolith repos.

Slow Builds

Once a repository gets to a certain size the time it takes to build becomes unbearably long. I’ve seen estimates of about 10 - 15 minutes being a reasonable build time including tests. I think that’s about right. Much longer and folks stop building everything.

Don’t get me wrong, it’s OK to let the build server do its job and offload the task of running huge builds to detect things that are broken. What’s not OK is when your developers can’t build the whole thing - either because the build run takes several hours or because various tooling changes and build optimizations (see below) have made it so only build agents have the right configuration to enable a build to get out the door.

“Views” on the Source

If it takes too long to build, or the source is so big it’s hard to work in, folks end up working in logical “views” over the source. For the most part this is a reasonable thing to consider, but this has an unforeseen effect on how you build, which leads to some of the challenges I’ll explain below: incorrect build optimizations, inconsistent tools, non-modular modules, and so on.

This becomes more challenging when people want their own custom views. For example, a person who might be working on a specific application also wants to be able to refactor things in dependencies of that application. Creating that custom “view” can give the impression that it’s OK to just change things within that view and that it won’t have any impact on anything outside.

Incorrect Build Optimizations

Once you have slow builds going on, folks are going to want to speed the build up, especially for their “view” (if you went that direction).

One way you might try optimizing is to run all the tests in a separate build - build one time to compile, provide feedback on whether it even compiled or not, and then run tests with additional feedback on pass/fail. The problem here is that it’s pretty easy to get something compiling that totally won’t work at runtime. How long do those tests take? Who caused them to fail? If you have too many compilation builds queuing up for tests to run, you’ll end up batching them which means finger pointing. “I’m sure it wasn’t me, so someone else can fix it.”

Another way you might try optimizing is to use package management systems to handle dependencies. Each “view” can build independently and in any order, so you can have separate builds if you want, one for each “view” or logical component. That’s actually a pretty decent idea as long as you do that consistently and with a lot of discipline… and as long as you acknowledge that building everything in the entire repository won’t necessarily ensure that the latest versions of everything work together. With package management, the build output of this part of the build won’t necessarily be the build input to some other part.

Stale and Inconsistent Tools

Once the build gets too large or complicated, it can be an easy pattern to fall into to just tweak one or two code files, check them in, and let the build server do all the work to verify things.

However, if developers never actually run the build, the developers may start using newer tools than the build agent uses, causing the actual build tooling to become stale.

Alternatively, different developers or teams working in different areas of the monolith may want to update the tooling used to build their “view” of the monolith repo. As long as you don’t want to build everything together, that’s fine. Chances are, though, the decision of the team on their component now puts additional requirements on build agents and makes it so all devs can’t build the whole set of components.

A great example of this is MSBuild and Visual Studio: Say you start your project when Visual Studio 2010 is out. As time goes on, developers update their machines to Visual Studio 2012, 2013, and beyond. Parts of the monolith that don’t change much remain back on 2010 while people move on. Eventually no developer has VS 2010 installed, so no one can actually build the solution. In the meantime, the build agent requires all of those Visual Studio versions be installed because there wasn’t a concerted effort to unify on a version.

Another MSBuild / Visual Studio example: Again, say you start your project with Visual Studio 2010. New features and functions get added but in an effort to not change tooling for everyone, people start manually tweaking files used by the tooling - solution and project files. Now you have a situation where if you actually open the solution in VS 2010 you get an error saying there are project types in the solution that aren’t supported… but if you open the solution in a newer version of Visual Studio you get a notice that the solution is on old tooling and must be upgraded to work.

Bad Versioning Strategies

Generally a build of a codebase corresponds to a single version of something. In the case where the code has multiple logical applications or components, the build might correspond to a single release of those things.

In a more microservice scenario, you likely wouldn’t have each microservice building and deploying as part of a monolith repository. Instead you might have a “build” that continuously runs integration tests over the deployed microservices to ensure they’re still working together as expected.

Anyway, since the build number (or build version, or whatever) can really only track one version number at a time, you have three choices, none of which are awesome:

  1. Version everything together. That means if you change one line in one component and the build kicks off, the version on every component in the entire repo changes. If you only ever deploy the entire monolith at the same time and it all builds together (e.g., not through a package management mechanism or using “views” on the repo), that can work.
  2. Implement an alternative independent versioning mechanism. In this case you’d have to figure out a custom way to indicate the version of each component in the system that can “version independently.” That version will not be tied to the build server version/number. You may build the same version of a component multiple times if the overall build gets kicked off and the component version hasn’t been incremented. This gets more complex if you want to see in which build a particular component version originated.
  3. Never change the version. This really only works if it’s a small project and everything is entirely, like, “software as a service” or something where you always only ever have the latest version deployed and you never have to report on it.

We tried ideas one and two in Autofac before splitting the repos in GitHub. I’ve seen idea three in other projects.

Circular Builds

If you build as a monolith, it’s really easy to accidentally create a circular build dependency.

Say you have some custom build tasks or scripts that help you with your build, like custom MSBuild tasks. That’s fine as long as the custom tasks are entirely independent of the code they’re building. However, say you have a custom build task that has a dependency on one of the assemblies being built… and that assembly being built requires the use of the build tasks to succeed.

Bad times. You need to unravel that.

Package management decoupling can also make it a piece of cake in the monolith to create a circular build dependency. Component A takes package B as part of its dependencies. Component A builds, publishes package A. Now component B builds… and takes package A as a dependency. This can be a really hard thing to detect, especially if package A and package B don’t themselves properly declare package dependencies.
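
If you can write down which component consumes which package, you can at least detect this mechanically. Below is a small sketch that does a depth-first search over a dependency map and returns the first cycle it finds. The map format and the function name are my own assumptions. A real build would have to extract this data from project or package metadata first.

```python
from typing import Dict, List, Optional


def find_cycle(dependencies: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of names, or None if there is none."""
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: UNVISITED for name in dependencies}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = IN_PROGRESS
        path.append(node)
        for dep in dependencies.get(node, []):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                # We walked back into something still on the current path.
                return path[path.index(dep):] + [dep]
            if dep_state == UNVISITED:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for name in list(dependencies):
        if state[name] == UNVISITED:
            found = visit(name)
            if found:
                return found
    return None
```

For the scenario above, `find_cycle({"A": ["B"], "B": ["A"]})` reports the loop between A and B. Of course, this only works if the packages declare their dependencies properly in the first place.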

Non-Modular Modules

Once the build gets broken up into logical components, applications, microservices, or modules, it’s far too easy to “just add a dependency” on some other piece of the source code repository and ignore the application and process isolation required to ensure that module actually stays modular. A lot of times you’ll see inconsistent application of dependencies - some come from a package management system, some come from the local repository’s build output from some other component.

Ever-Changing Framework

If the shared libraries or shared dependencies you use are in the same repo as your consuming components, it takes a lot of discipline to not start new functionality by instantly putting new items right in the common code. Your new application or component is going to need to validate phone numbers, why wouldn’t you add a whole shareable framework component for phone number validation? Hey, just throw that in the lowest level dependency so it’s readily available anywhere at any time!

Of course, that means from that point on anyone using your shared library will assume the new functionality is there and removing it will be a breaking change… so… uh… maybe that’s not the best idea.

It Failed, but Really Succeeded!

The build server may say the build failed, but if you build “a view” of the monolith in isolation, that same piece may actually succeed. Which one is right? Is it a problem with the overall ordering of how the repository builds the logical components? Is the build server actually the system of record anymore?

It Succeeded, but Really Failed!

After a certain level of complexity gets introduced, it gets pretty easy to start ignoring warnings that get generated or inadvertently cause errors to get ignored.

For example, say your build uses MSBuild in some areas and PowerShell in others. MSBuild calls a PowerShell script which then ends up calling MSBuild. Errors reported in that innermost MSBuild execution may not actually cause the overall build to fail… which means the build will show as successful even though it’s not.
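
The mechanics of a swallowed failure are easy to show in miniature. This sketch uses Python's `subprocess` module as a stand-in for the MSBuild and PowerShell layers, which is my substitution and not how those tools actually report errors. The "inner step" is just a child process that always exits with code 3.

```python
import subprocess
import sys

# Hypothetical stand-in for the innermost build step: it always fails.
FAILING_STEP = [sys.executable, "-c", "import sys; sys.exit(3)"]


def run_ignoring_errors() -> int:
    """Middle layer that runs the inner step but drops its exit code."""
    subprocess.run(FAILING_STEP)  # result discarded, so the failure vanishes here
    return 0


def run_propagating_errors() -> int:
    """Middle layer that hands the inner exit code back to its caller."""
    return subprocess.run(FAILING_STEP).returncode
```

The outer build only sees what the middle layer returns: 0 from the first function, 3 from the second. Every layer in the chain has to pass the failure along, or the build shows green.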

Illig’s Law of Monolithic Repositories

The amount of discipline required to maintain a build is directly proportional to the size of the source code repository. The amount of discipline actually used is inversely proportional.

Most of the monolith repository problems you see could be avoided with enough developer due diligence and discipline. However, the larger a repository gets, the less personal responsibility folks start to feel for keeping the build performing and running clean. It’s too easy to complain about the size and complexity, passing general housekeeping off as technical debt to be addressed later. Eventually people become complacent (“That’s just how it is, we can’t fix it.”) and nothing ever does get fixed.

It’s the opposite of “too big to fail”: It’s “too big to fix.”

Our receiver, a Marantz SR5010, is in a cabinet. While it supports on-screen display to show current volume levels and input info, it seems to be fairly well known in the community (forums and the like) that getting it to work in conjunction with a 4K display is more luck than skill.

We had no problem with the OSD in standard 1080p HD format, but once we got a 4K TV, the OSD basically stopped appearing. Turning everything off and back on again might see it reappear for a 10 or 15 minute span, but after that it disappears.

The challenge: We like to see the receiver’s power/volume/input status but we don’t want to leave the cabinet hanging open.

The solution: An Arduino-based volume monitor to provide a remote display for the receiver.

Finished volume monitor

Here’s a video where you can see it in action.

Parts

Prices listed are the prices I paid - they may have changed since I bought them, etc.

In that list I didn’t include the box you may or may not want to put the finished product in, or little stuff like solder and a length of wire you’ll need to patch the 1602 shield.

How It Works

Marantz receivers have an HTTP API used for remote control programs and general network interaction. By making a GET request to http://<receiver-ip-address>/goform/formMainZone_MainZoneXml.xml you will get a fairly large XML document that has all the information about the receiver’s current status.

The Arduino volume monitor polls this endpoint and displays values based on the contents of the response.

The basic algorithm is:

  1. Make a request to get the XML status from the receiver.
  2. If the receiver is OFF, wait for five seconds before polling again.
  3. If the receiver is ON…
    1. Parse the XML to get the volume, selected input, and audio type.
    2. Update the 1602 display with the latest information.
    3. Immediately poll again.
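
The steps above can be sketched as a loop. This is Python rather than the Arduino code in the repository, and it is only an outline under stated assumptions: `fetch_status` stands in for the HTTP request plus XML parsing, `update_display` stands in for writing to the 1602, and the `power` key is a name I invented for the example.

```python
from typing import Callable, Dict

OFF_DELAY_SECONDS = 5  # wait this long between polls while the receiver is off


def poll(fetch_status: Callable[[], Dict[str, str]],
         update_display: Callable[[Dict[str, str]], None],
         sleep: Callable[[float], None],
         cycles: int) -> None:
    """Run the polling algorithm for a fixed number of requests."""
    for _ in range(cycles):
        status = fetch_status()          # step 1: request and parse the status
        if status.get("power") != "ON":
            sleep(OFF_DELAY_SECONDS)     # step 2: receiver is off, back off
            continue
        update_display(status)           # step 3: show it, then poll again at once
```

Passing `sleep` and the two callbacks in as parameters keeps the loop easy to try out on a desktop without a receiver on the network.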

The wiki on GitHub where I posted my code has a lot more details on specifically what the Marantz API responses look like and how that works.

Assembling the Hardware

The hardware itself is pretty easy to assemble. We’re going to stack the shields in the order (from bottom to top): Arduino, Ethernet shield, 1602 shield. We’ll do that after we do two things.

The Arduino and shields, ready to stack

First, we need to patch the 1602 shield. The W5100 Ethernet shield needs digital pin 10 on the Arduino for transmission. The problem is, the 1602 shield (at least the one that I bought) also wants pin 10 for control of enable/disable on the display. If you just stack them up now, things go all haywire - the Ethernet shield never really transmits correctly and the 1602 display basically stays disabled all the time.

  1. On the 1602 shield, clip off pin 10. You don’t want the 1602 shield picking up anything from the Ethernet shield. I clipped this off with a small set of flush cutters.
  2. Solder a small jumper wire across the top of the 1602 shield from pin 10 to pin 3. You could choose a different pin if you like, but pin 3 seemed reasonable.

Now if you want to control the display enable/disable toggle, you can write to pin 3. It won’t interfere with the Ethernet shield and it works great.

In the picture below, you can see me pointing with a screwdriver at the clipped-off pin and my purple jumper wire.

Patch the 1602 shield so it doesn't interfere with the Ethernet shield

Second, you need to create some risers out of stackable shield headers. The Ethernet jack on the W5100 shield is too tall to just stack another shield on top. I used stackable shield headers to create some small extensions/risers to ensure the 1602 shield didn’t run into the Ethernet jack.

I did clip the stackable header pins down a bit so they sat nice and flush with the Ethernet shield headers.

Create extensions with stackable headers

Now just stack ‘em up and you’re ready to go!

Installing the Software

I published the software on GitHub. You can head over there, grab it, and upload it to your Arduino.

I used the Visual Micro extension for Visual Studio when developing, so you’ll see some Visual Studio files in the source, but you should be able to load it up in the standard Arduino IDE and use it without issue. If you find a problem, file an issue and let me know.

You may need to adjust the button resistance tolerances. In the DFRobotLCDShield.h I have some input values as the buttons are read on analog pin 0. These don’t match the values I saw in any other code snippet or data sheet they posted. I don’t know if yours will match mine, but if they don’t, you’ll have to tweak it.

Using the Software

When you first start up the Arduino it will get a DHCP address and then try to read the configured IP address for your Marantz receiver. If none is configured, you’ll be sent into a setup routine to configure the receiver’s IP address. Alternatively, you can push the “SELECT” button on the 1602 shield keypad and enter the setup routine.

In setup, use the left and right buttons to move the cursor to the right spot in the IP address and up/down to increment/decrement. When you have the IP address entered, press “SELECT” again to save it and continue.

The Arduino will use the receiver’s IP address as the location to make the HTTP GET request as noted above. When the receiver is off, the display on the 1602 will be off; when the receiver is on, the display will be on and showing current data.

It may take a few seconds between turning the receiver on and seeing the display come on. This is due to the Arduino only polling every five seconds for status when the receiver is off.

The Arduino is not a super-powered CPU so the data may be delayed by half a second or so. The HTTP request and subsequent processing of the response take about a half second, give or take. It polls as fast as it can, but that still means you’ll get maybe two updates a second. As such, if you hold down on the receiver’s volume button, you’ll see the Arduino display “jump” in increments instead of incrementing and decrementing smoothly. It also may be slightly behind.

Say you are holding down the volume button on the receiver so it’s constantly going up:

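| Time | Action                             | Arduino Display | Actual Volume |
|------|------------------------------------|-----------------|---------------|
| 0.0s | Arduino makes HTTP request         | 0               | 0             |
| 0.1s | Receiver sends response            | 0               | 1             |
| 0.2s | Arduino starts processing response | 0               | 2             |
| 0.3s | Arduino still processing           | 0               | 3             |
| 0.4s | Arduino still processing           | 0               | 4             |
| 0.5s | Arduino updates display            | 1               | 5             |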

The Arduino is going to display the volume at the time the receiver sent the response, which may not be the same volume at the time of display. Not to worry, it should catch up on the next request. At most you’ll be about a half second behind, which isn’t so bad.
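
To see how the lag plays out over more than one request, here is a tiny simulation. It assumes the same idealized timing as the example: the volume climbs one step every 0.1 seconds starting from 0, the receiver answers 0.1 seconds after each request, a full cycle takes 0.5 seconds, and the next request starts the moment the display updates. Real timings will wobble.

```python
from typing import List, Tuple

RESPONSE_DELAY_TICKS = 1  # receiver answers one tick (0.1s) after the request
CYCLE_TICKS = 5           # request plus processing takes five ticks (0.5s)


def simulate(cycles: int) -> List[Tuple[float, int, int]]:
    """Return (time in seconds, displayed volume, actual volume) per display update."""
    rows = []
    for cycle in range(cycles):
        request_tick = cycle * CYCLE_TICKS
        # Volume rises one step per tick, so the volume equals the tick count.
        displayed = request_tick + RESPONSE_DELAY_TICKS
        update_tick = request_tick + CYCLE_TICKS
        rows.append((update_tick / 10, displayed, update_tick))
    return rows
```

Under those assumptions `simulate(2)` gives `[(0.5, 1, 5), (1.0, 6, 10)]`: the display trails the real volume by four steps at each update, which is the "jump" you see on the screen.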

Finishing Touches

I put my volume monitor in a box. I used one of those little unfinished boxes from a craft store and stained it dark. I padded the inside with a little craft foam to keep it in place.

The unfinished box

Once it was all put together, it looked pretty good on the shelf!

The finished monitor on the shelf

Interesting Points

I learned a lot while working on this project.

I was going to do dynamic discovery of the receiver during the setup process so you didn’t have to manually enter the receiver IP address, but that requires UDP multicast to use SSDP. I found out the W5100 series shield doesn’t support UDP multicast… or if it does, I couldn’t get it working. There really aren’t any examples out there. All the standard Arduino Ethernet library support for UDP multicast seems to be for the later W5500 shield.

I noticed that a lot of projects skip the “nice UI” thing and hardcode a lot of stuff. For example, most projects seem to hardcode destination IP addresses right in the program. I suppose that’s fine, but if (for whatever reason) I need to change the receiver’s IP address, I really don’t want to have to reprogram the Arduino. That’s why I put the setup UI in there, though it does take some of my program space and RAM to support its existence.

Since you only get 2K of RAM to work with, the Marantz HTTP response being in XML was challenging. Even if it was in JSON, it’d still be far too large to read in its entirety so I had to do some pretty basic string parsing to read the XML and process it as a stream. I’m kind of surprised there aren’t SAX parsers for Arduino, though I suppose these projects generally avoid XML altogether.
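
The idea behind that stream processing can be shown in a few lines. This is a Python illustration of the technique, not the Arduino code from the repo. The element names in the usage note are assumptions for the example. The function keeps only a sliding window as long as the marker it is looking for, plus the short value it collects, so memory use doesn't depend on the size of the response.

```python
from typing import Iterable, Optional


def find_text_after(chunks: Iterable[str], marker: str,
                    max_length: int = 32) -> Optional[str]:
    """Return the text between `marker` and the next '<', reading chunk by chunk.

    Returns None if the marker never appears or the stream ends mid-value.
    """
    window = ""         # only the last len(marker) characters are kept
    collecting = False
    text = ""
    for chunk in chunks:
        for char in chunk:
            if collecting:
                if char == "<":
                    return text
                if len(text) < max_length:
                    text += char  # cap the value so it cannot grow without bound
            else:
                window = (window + char)[-len(marker):]
                if window == marker:
                    collecting = True
    return None
```

Given a response that arrives in pieces like `"<MasterVol"` and `"ume><value>-35.0</value>"`, calling `find_text_after(chunks, "<MasterVolume><value>")` still finds `-35.0`, because the window carries across chunk boundaries.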

The Repository

The code is all free on GitHub. I included a lot of more technical info in the wiki for that repo.

Go check it out!