Joe Glombek

Originally published at 24days.in

Umbracadabra! Defence Against the Dark Arts of Magic Strings in Umbraco

I'm sure many of us have been told at some point in our careers that "magic strings are bad" but why exactly is that and what could go wrong? And what alternatives are there to improve our code?

Why are magic strings so bad?

We use the phrase "magic strings" (and, to a lesser extent, "magic numbers") to refer to a constant string used directly in your code. An example of this that we, as Umbraco developers, may be all too familiar with is Model.GetPropertyValue("bodyText"). Here, "bodyText" is the magic string. The text can be used in multiple places (maybe in the meta tags as well as the page body, or on different templates), has to be in this exact format (I can't swap it out with "Can I have the text for the body please?", as polite as that is) and refers to an external entity (the Umbraco database via the Umbraco APIs in this case). These things make it a particularly code-smelly example of a magic string that needs replacing.

"Magic strings" is a term that covers several scenarios in which a string is used in place of a more robust method.

Typos: the arch enemy of magic strings. If we're having a particularly productive, Hollywood-hacker-style afternoon, smacking away at those keys, you'd be excused for mis-hitting the odd one.

Clip from NCIS showing two people typing manically on one keyboard

They can be particularly difficult to spot too: depending on your font, "bodyIext" could easily go unnoticed. The folks who write our IDEs know this, and will correct us where possible. A string, however, cannot be autocorrected because any sequence of characters is a valid string. As far as the IDE knows, you meant to type "bossyTRex". Typos in strings cause runtime errors rather than the easier-to-spot compile-time issues.
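To make that concrete, here's a minimal sketch of a typo'd magic string compiling happily and only failing when the code runs. The dictionary merely stands in for a page's properties, and PropertyAliases, TypoDemo and GetValue are names made up for illustration, not Umbraco APIs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical constants class: one place to spell each alias correctly.
public static class PropertyAliases
{
    public const string BodyText = "bodyText";
}

public static class TypoDemo
{
    // A dictionary standing in for a page's properties (an assumption for this sketch).
    private static readonly Dictionary<string, string> Properties = new Dictionary<string, string>
    {
        { PropertyAliases.BodyText, "<p>Hello, world!</p>" }
    };

    // Returns null when the alias isn't found.
    public static string GetValue(string alias)
    {
        return Properties.TryGetValue(alias, out var value) ? value : null;
    }

    public static void Main()
    {
        // Compiles fine, but the capital "I" means nothing is found at runtime.
        Console.WriteLine(GetValue("bodyIext") ?? "(nothing found)");

        // A typo in the constant's name (PropertyAliases.BodyIext) wouldn't even compile.
        Console.WriteLine(GetValue(PropertyAliases.BodyText));
    }
}
```

The compiler can't help with the first call, but it would refuse to build a misspelt reference to the constant.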

Fat-fingered-ness aside, magic strings can exclude those who speak a different first language or people who have difficulty reading and writing. Even if you speak the same language, issues can arise when trying to agree how to spell "colour"!

Or how about a piece of logic that sets this.ValidationMethod = "none". We then later decide we need to change the validation method, but it's not obvious what the other possible values are. Magic strings aren't self-documenting and so we rely on reference documents or "there's a comment that explains how to use it somewhere" or "didn't I put that in the README file". This is far from ideal.

And if all that wasn't enough, magic strings make it more difficult to replace a value, test your code or change logic on your test environments compared to production. We can do better!

Is there such a thing as a muggle string?

Are there cases when a string is non-magic? Or perhaps when a magic string is ok? Yes, not all strings are magic strings and not all magic strings are necessarily bad.

If you're typing a string value in code, the blog do you even code, bro? has boiled it down to 4 questions you need to ask yourself:

  • Can you reuse this constant?
  • Can you easily change a few values at once?
  • Does it improve readability?
  • Is it part of configuration?

This won't cover all scenarios, but it's a good place to get started. If any of your answers to these questions is "yes", it's time to look into alternatives.

Did you just casually drop the term "magic numbers" too? I've only just got to grips with magic strings!

Ah, yes. Sorry about that!

Yes, magic numbers are the less-talked-about sibling of magic strings. But they can cause the same sorts of issues.

Take this code as an example:

public void PrepareForCheckout() {
    if(Customer.Country == "United Kingdom") {
        FinalValue = Value * 1.175;
    }
}


It's not immediately obvious what this 1.175 is. You might have picked up that it's the old tax rate in the UK of 17.5%, but it could easily be missed, especially by a non-British developer.

Now that we know this is the VAT rate, we also know it's wrong. The UK changed their VAT rate to 20% a few years back. Correcting this value could be difficult too - we need to replace all instances, but if it's written as value = value + value * (17.5/100) (someone has gone about code clarity in an odd way!), a simple find-and-replace won't suffice.

If, for example, we had a constant of this value somewhere that we reference from everywhere, it would be both clearer what the code was doing and easier to update.

public void PrepareForCheckout() {
    if(Customer.Country == "United Kingdom") {
        FinalValue = Value * UkVatRate;
    }
}


Now, there are many other tweaks we could make to this code, but I've just solved the one we've been talking about.

ModelsBuilder to the rescue

The first example I gave in this article was Model.GetPropertyValue("bodyText"), which, back in the Umbraco 7 days of 2013, Stéphane Gay aka ZpqrtBnk identified as a problem and developed the ModelsBuilder package to solve. ModelsBuilder is now a part of Umbraco and if you're not using it already, you really should be. It removes the need for magic strings when interacting with IPublishedContent across our Umbraco solutions by generating models for each document type. Dave Woestenborghs wrote an article on Getting started with Umbraco Modelsbuilder back in 2016 which still holds up well.

A constant problem

I mentioned creating a constant just now. And that's probably one of the simplest methods of avoiding magic strings. Depending on the size and complexity of your project, it can be as simple as pulling your strings up to constant declarations in your class.

public class Cart {
    private const string BodyTextKey = "bodyText";
    private const decimal UkVatRate = 1.175m; // the "m" suffix makes this a decimal literal

    //...
}


In .NET, constants (const declarations) are actually pretty efficient - the compiler substitutes the value wherever the constant is used, rather than storing it in a field that's read at runtime like a variable, so it's equivalent performance-wise to directly using the string in each location. It does, however, help us out a lot pre-compilation by removing the magic strings.
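Because the compiler knows a const's value up front, a const can be used anywhere C# insists on a compile-time constant, such as a case label, where a static readonly field can't. A small sketch of the difference, with Countries and ConstDemo as made-up names for illustration:

```csharp
using System;

public static class Countries
{
    // const: the value is baked in wherever it's used.
    public const string UnitedKingdom = "United Kingdom";

    // static readonly: a real field, read at runtime.
    public static readonly string France = "France";
}

public static class ConstDemo
{
    public static string Describe(string country)
    {
        switch (country)
        {
            case Countries.UnitedKingdom: // allowed: a const is a compile-time constant
                return "UK";
            // case Countries.France:     // would not compile: not a constant value
            default:
                return country == Countries.France ? "FR" : "unknown";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe("United Kingdom")); // UK
        Console.WriteLine(Describe("France"));         // FR
    }
}
```

One thing to keep in mind: because the value is copied into the calling code, changing a public const in a shared library means recompiling the projects that reference it.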

To make constants reusable across your project you might find it useful to create a Constants.cs file in your project and stick all your relevant strings in there. This can get pretty cluttered on bigger projects, so you may need to categorise your constants further, grouping them into logical classes and namespaces or adding your constants to related services like so:

public static class VatHelper {
    public const decimal UkVatRate = 1.175m;

    //...
}


In fact, with a helper, we could go even further in case we need to apply VAT to other countries in the future:

public static class VatHelper {
    // Members of a static class must themselves be static
    public static decimal GetVatRate(string country) {
        switch (country) {
            case Countries.UnitedKingdom:
                return 1.175m; // "m" suffix: decimal literal
            default:
                return 1;
        }
    }
}

//...

public class Cart {
    private const string BodyTextKey = "bodyText";
    private const decimal UkVatRate = 1.175m;

    //...

    public void PrepareForCheckout() {
        FinalValue = Value * VatHelper.GetVatRate(Customer.Country);
    }

    //...
}


(OK, so in most real-world examples the country and its respective VAT rate would probably both live in a database somewhere, but it works as an example!)
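As a rough sketch of that idea, a dictionary can stand in for the database table. The rate is hard-coded here only to keep the example self-contained, and VatRates, GetMultiplier and Countries are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class Countries
{
    public const string UnitedKingdom = "United Kingdom";
}

public static class VatRates
{
    // Stand-in for a database table of country -> VAT multiplier.
    private static readonly Dictionary<string, decimal> Multipliers = new Dictionary<string, decimal>
    {
        { Countries.UnitedKingdom, 1.175m } // the same (old) UK rate used throughout the article
    };

    // Falls back to 1 (no VAT) for countries we hold no rate for,
    // mirroring the default branch of the switch version.
    public static decimal GetMultiplier(string country)
    {
        return Multipliers.TryGetValue(country, out var multiplier) ? multiplier : 1m;
    }

    public static void Main()
    {
        Console.WriteLine(100m * GetMultiplier(Countries.UnitedKingdom));
        Console.WriteLine(100m * GetMultiplier("Narnia"));
    }
}
```

Adding a country then becomes a data change rather than a code change, which is the point of moving the values out of the switch.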

Enum-ber of other solutions

Constants are great and all, but they can get a little bit repetitive. Enumerated types (that's enums to you and me) are useful for a finite number of constants that won't change.

Enums are a great solution to our ValidationMethod setting we were talking about earlier. We could create an enum with all the possible values and use this enum in place of our strings:

public enum ValidationMethod {
    None,
    Required,
    EmailAddress,
    PhoneNumber,
    Numeric
}

//...

this.ValidationMethod = ValidationMethod.None;

//...

switch (field.ValidationMethod) {
    case ValidationMethod.Required:
        return !string.IsNullOrEmpty(field.Value);
        // ...
}

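If the value arrives as text (from a form or a config file, say), it can be converted once at the boundary and handled as an enum from then on. Here's a minimal sketch using .NET's Enum.TryParse and Enum.IsDefined; the ValidationMethodParser helper is made up for illustration.

```csharp
using System;

public enum ValidationMethod
{
    None,
    Required,
    EmailAddress,
    PhoneNumber,
    Numeric
}

public static class ValidationMethodParser
{
    // Returns true (and the parsed value) only for names defined on the enum.
    public static bool TryParse(string text, out ValidationMethod method)
    {
        // The second argument (ignoreCase) lets "required" match Required.
        // Enum.TryParse also accepts numeric strings such as "42",
        // so Enum.IsDefined is used to reject values that have no name.
        return Enum.TryParse(text, true, out method)
            && Enum.IsDefined(typeof(ValidationMethod), method);
    }

    public static void Main()
    {
        Console.WriteLine(TryParse("required", out var ok) ? ok.ToString() : "invalid");     // Required
        Console.WriteLine(TryParse("requierd", out var typo) ? typo.ToString() : "invalid"); // invalid
    }
}
```

A misspelt value is caught in one place, instead of silently falling through every comparison that follows.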

Enums also work well for statuses:

public enum OrderStatus {
    Draft,
    Paid,
    Pending,
    Shipped,
    ReceivedByCustomer
}

//...

Order.Status = OrderStatus.Paid;

//...

if(Order.Status == OrderStatus.Paid) {
    ProcessOrder();
}


Magic numbers and enums

Under the covers, enums actually store integers for each value. In the case of our OrderStatus enum, Draft is equivalent to 0, Paid is 1, Pending is 2, etc. You can even explicitly set the number applied to each value which can be useful if you need to map a readable name to a number, or if you need to add a value to an existing enum.

public enum OrderStatus {

    /// We want to add a new status at the top,
    /// but also to maintain the mapping of the original enum
    /// so we specify the values

    Cancelled = -1,
    Draft = 0,
    Paid = 1,
    Pending = 2,
    Shipped = 3,
    ReceivedByCustomer = 4
}

//...

/// Readable definitions for all possible
/// response codes from the ACME API docs
public enum AcmeApiResponseCodes {
    Ok = 200,
    BadRequest = 400,
    Unauthorized = 401,
    InvalidClientId = 490,
    ExpiredClient = 491
}

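Moving between the number and the name is just a cast. Here's a short sketch reusing the response-code enum; Describe is a hypothetical helper, and Enum.IsDefined guards against numbers the enum has no name for.

```csharp
using System;

public enum AcmeApiResponseCodes
{
    Ok = 200,
    BadRequest = 400,
    Unauthorized = 401,
    InvalidClientId = 490,
    ExpiredClient = 491
}

public static class ResponseCodeDemo
{
    // Turns a raw status number into a readable name, if we know it.
    public static string Describe(int rawCode)
    {
        if (!Enum.IsDefined(typeof(AcmeApiResponseCodes), rawCode))
        {
            return "Unknown (" + rawCode + ")";
        }

        var code = (AcmeApiResponseCodes)rawCode; // int -> enum
        return code.ToString();
    }

    public static void Main()
    {
        Console.WriteLine((int)AcmeApiResponseCodes.InvalidClientId); // enum -> int: 490
        Console.WriteLine(Describe(491)); // ExpiredClient
        Console.WriteLine(Describe(418)); // Unknown (418)
    }
}
```

The check matters because C# will happily cast any integer to an enum type, whether or not a member with that value exists.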

JSON and enums

As a side effect of the underlying value being an integer, you might notice that by default if you return an enum as a JSON result, you'll end up returning the number rather than the pretty enum.

{
    "orderNumber": 51138461315,
    "status": 2,

    "...": "..."
}


JSON (and JavaScript too, without some workarounds) doesn't support enums. So we have two choices here: return the number, or return the string. If you'd rather return the string, you've got a few options for converting the enum to and from a string.

You can set the converter for your individual property with a simple attribute:

public class Order {

    public long OrderNumber { get; set; }

    [JsonConverter(typeof(JsonStringEnumConverter))]
    public OrderStatus Status { get; set; }

    //...
}


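Alternatively, set the attribute on the enum definition if you want it to be serialized to a string every time you use it. While we're here, you'll notice you can also customise how the enum is rendered as a string with the EnumMember attribute. (Be aware that EnumMember is honoured by Newtonsoft.Json's StringEnumConverter; System.Text.Json's JsonStringEnumConverter ignores it, and .NET 9 adds a JsonStringEnumMemberName attribute for the same purpose.)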

[JsonConverter(typeof(JsonStringEnumConverter))]  
public enum OrderStatus {
    Draft,
    Paid,

    //...

    [EnumMember(Value = "Received by customer")]
    ReceivedByCustomer
}


Or finally, you can set the behaviour globally for all enums by modifying the ConfigureServices method in Startup.cs. We need to add AddMvcAndRazor to get ahold of the MVC config before adjusting the default JSON options.

public void ConfigureServices(IServiceCollection services)
{
    services
        .AddUmbraco(_env, _config)

        // Here's the good stuff

        .AddMvcAndRazor(mvc =>
        {
            mvc.AddJsonOptions(json =>
            {
                json.JsonSerializerOptions.Converters.Add(new JsonStringEnumConverter());
            });
        })

        // That's all, folks!

        .AddBackOffice()
        .AddWebsite()
        .AddComposers()
        .Build();
}


Your configuration may differ slightly if you're using Umbraco 8 or below (with .NET Framework) or if you're using Newtonsoft.Json (rather than System.Text.Json) in your Umbraco 9+/.NET 5+ app, but the principles are the same.
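To see the difference the converter makes outside of a full Umbraco site, here's a small console sketch using System.Text.Json directly. The cut-down Order class and the JsonEnumDemo helper are assumptions for the example, not part of any Umbraco API.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public enum OrderStatus
{
    Draft,
    Paid,
    Pending
}

public class Order
{
    public long OrderNumber { get; set; }
    public OrderStatus Status { get; set; }
}

public static class JsonEnumDemo
{
    // Serializes an order with or without the string enum converter.
    public static string Serialize(Order order, bool enumsAsStrings)
    {
        var options = new JsonSerializerOptions();
        if (enumsAsStrings)
        {
            options.Converters.Add(new JsonStringEnumConverter());
        }
        return JsonSerializer.Serialize(order, options);
    }

    public static void Main()
    {
        var order = new Order { OrderNumber = 51138461315, Status = OrderStatus.Pending };

        Console.WriteLine(Serialize(order, enumsAsStrings: false)); // {"OrderNumber":51138461315,"Status":2}
        Console.WriteLine(Serialize(order, enumsAsStrings: true));  // {"OrderNumber":51138461315,"Status":"Pending"}
    }
}
```

The property names stay in PascalCase here because the sketch uses the serializer's plain defaults rather than the camel-casing that ASP.NET Core applies to web responses.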

Flagged enums

Simple enums work fine for things like statuses, where only one can be true at any one time. But how about where multiple values make sense?

Our ValidationMethod example from earlier applies here. It may be possible to have a field that needs validating as Required but also EmailAddress. With strings, we could have done this as an array and we can do the same with enums if we want...

// How we might have allowed multiple values with magic strings
field.ValidationMethods = new string[] { "email address", "required" };

// We can do the same when using enums too
field.ValidationMethods = new ValidationMethod[] { ValidationMethod.EmailAddress, ValidationMethod.Required };


But we can also do one better with flagged enums to allow an enum to have multiple values.

[Flags]
public enum ValidationMethod {
    None = 0,
    Required = 1,
    EmailAddress = 2,
    PhoneNumber = 4,
    Numeric = 8
}

//...

this.ValidationMethods = ValidationMethod.EmailAddress | ValidationMethod.Required;

//...

if (this.ValidationMethods.HasFlag(ValidationMethod.Required)) {
    return !string.IsNullOrEmpty(field.Value);
}


In this example, you can see we've added the [Flags] attribute to the enum and assigned each enum member a value - this bit is important. We can then assign multiple values using the "bitwise OR operator", or pipe character (|). It's the same character we use two of in a logical or statement (this || that).

To check if an enum has a flag, we can use the HasFlag method on the enum value. I've had to refactor our switch statement from earlier to use an if statement for each value we want to check for.

You might notice the integers I've assigned to each value aren't consecutive. Well, they are in order, but using a doubling sequence: 1, 2, 4, 8... etc. Each number is double the previous value. This is important because a flagged enum still only stores one integer!

It works by using bitwise operations (there's that word again) which, simply put, look at the individual "bits" (and I mean that in the technical sense!) of a binary number. A bitwise "OR" compares two numbers column by column - if either of the bits in that column is 1, it returns a 1, otherwise it returns a 0.

For those curious, let's look at the numbers in the doubling sequence in binary.


0000 (0)
0001 (1)
0010 (2*)
0100 (4)
1000 (8)

...etc


You'll notice that, because they're all powers of 2, each column only ever contains a single 1. Therefore, no matter what combination we make of these in a bitwise OR, we can work out which initial enum values went into creating it.

var a = ValidationMethod.EmailAddress | ValidationMethod.Required;
// That's 1 and 2
//
//    0001
// OR 0010
// -------
//    0011 (the last two columns have a 1 in the first number OR second number)

var b = ValidationMethod.Required | ValidationMethod.EmailAddress | ValidationMethod.PhoneNumber | ValidationMethod.Numeric;
// That's 1, 2, 4 and 8
//
//    0001
//    0010
//    0100
// OR 1000
// -------
//    1111


In both examples above, to check the value has the flag Required, which is 0001, we simply need to check for a 1 in the last column - the HasFlag method is actually using the bitwise AND operator (&) under the covers which does this check.

this.ValidationMethods.HasFlag(ValidationMethod.Required);

// is the same as

(this.ValidationMethods & ValidationMethod.Required) == ValidationMethod.Required;

// is the same as

// Binary literals in C# are prefixed with 0b (C# 7+)
// The brackets matter: == binds more tightly than & in C#
(0b0011 & 0b0001) == 0b0001;

//     0011
// AND 0001
// --------
//     0001 (the last column has a 1 in the first number AND second number)

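Putting those pieces together, here's a sketch of working backwards from a combined value to the individual flags that went into it. The Split helper and FlagsDemo class are made up for illustration; Enum.GetValues and HasFlag are standard .NET.

```csharp
using System;
using System.Collections.Generic;

[Flags]
public enum ValidationMethod
{
    None = 0,
    Required = 1,
    EmailAddress = 2,
    PhoneNumber = 4,
    Numeric = 8
}

public static class FlagsDemo
{
    // Lists the individual flags that were OR-ed together to make `combined`.
    public static List<ValidationMethod> Split(ValidationMethod combined)
    {
        var parts = new List<ValidationMethod>();
        foreach (ValidationMethod flag in Enum.GetValues(typeof(ValidationMethod)))
        {
            // Skip None (0): every value "has" the zero flag.
            if (flag != ValidationMethod.None && combined.HasFlag(flag))
            {
                parts.Add(flag);
            }
        }
        return parts;
    }

    public static void Main()
    {
        var combined = ValidationMethod.EmailAddress | ValidationMethod.Required;

        Console.WriteLine((int)combined);                                      // 3
        Console.WriteLine(Convert.ToString((int)combined, 2).PadLeft(4, '0')); // 0011
        Console.WriteLine(string.Join(", ", Split(combined)));                 // Required, EmailAddress
    }
}
```

Note the special case for None: HasFlag returns true for a zero flag on any value, so it needs excluding explicitly.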

Because of this behaviour, we can also add a value for common combinations, or for all values, into our enum by combining the values of its members (adding and bitwise OR provide the same result in this case because the values are distinct powers of two, but not normally!)

[Flags]
public enum ValidationMethod {
    // Regular items
    None = 0,
    Required = 1,
    EmailAddress = 2,
    PhoneNumber = 4,
    Numeric = 8,

    //Combinations
    RequiredEmail = 3, // 0001 | 0010 = 0011 or cheat by doing 1 + 2 = 3
    All = 15 // 0001 | 0010 | 0100 | 1000 = 1111 or cheat by doing 1 + 2 + 4 + 8 = 15
}


Not that an "All" value makes any sense in this case, but it's good to know!
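Rather than doing the sums by hand, the combinations can be written in terms of the other members, and the powers of two as bit shifts. This is a common C# idiom rather than something from the original example, and CombinationDemo is just a made-up name to make the sketch runnable:

```csharp
using System;

[Flags]
public enum ValidationMethod
{
    None = 0,
    Required = 1 << 0,     // 0001 = 1
    EmailAddress = 1 << 1, // 0010 = 2
    PhoneNumber = 1 << 2,  // 0100 = 4
    Numeric = 1 << 3,      // 1000 = 8

    // Combinations, built from the members above
    RequiredEmail = Required | EmailAddress,
    All = Required | EmailAddress | PhoneNumber | Numeric
}

public static class CombinationDemo
{
    public static void Main()
    {
        Console.WriteLine((int)ValidationMethod.RequiredEmail); // 3
        Console.WriteLine((int)ValidationMethod.All);           // 15
        Console.WriteLine(ValidationMethod.All.HasFlag(ValidationMethod.PhoneNumber)); // True
    }
}
```

Written this way, there's no arithmetic to get wrong and no magic numbers hiding in the enum itself.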

Config your way out of it

One of the questions we asked at the beginning was "is it part of configuration?" This is a good question to ask because it may well change how we deal with it. What is configuration? I like to think of it as anything that can alter an application's behaviour. A configuration value will have been flagged (in a specification or by yourself) as something that's likely to change - be that in the future or per environment. A URL to an API endpoint is a good example of a configuration variable:

  • it sits outside the control of your application and could conceivably change
  • or you may want to point your staging site at a sandbox version of the API.

Now we're in the land of .NET 5, we can even get rid of some of the magic strings we have historically used to pull values out of config by mapping whole configuration sections to C# objects. I've also used a constant to get the configuration section name, to avoid that as a magic string.

public class AcmeConfig
{
    public const string Section = "ACME";

    public string ApiBaseUrl { get; set; }

    //...
}

//...

public class AcmeClient {
    public AcmeClient(IConfiguration configuration) {
        var config = configuration.GetSection(AcmeConfig.Section).Get<AcmeConfig>();

        //...
    }

    //...
}

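To try the binding without a whole site around it, here's a sketch that feeds the same section from an in-memory collection instead of appsettings.json. It assumes the Microsoft.Extensions.Configuration and Microsoft.Extensions.Configuration.Binder NuGet packages are referenced; ConfigDemo, Load and the sandbox URL are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

public class AcmeConfig
{
    public const string Section = "ACME";

    public string ApiBaseUrl { get; set; }
}

public static class ConfigDemo
{
    // Maps the whole "ACME" section onto an AcmeConfig object.
    public static AcmeConfig Load(IConfiguration configuration)
    {
        return configuration.GetSection(AcmeConfig.Section).Get<AcmeConfig>();
    }

    public static void Main()
    {
        // In a real site these values would come from appsettings.json;
        // an in-memory collection keeps the sketch self-contained.
        // Keys use ":" to separate the section name from the property name.
        var configuration = new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string>
            {
                { "ACME:ApiBaseUrl", "https://sandbox.example.com/api/" }
            })
            .Build();

        Console.WriteLine(Load(configuration).ApiBaseUrl);
    }
}
```

The only strings left are the section name, which lives in one constant, and the keys in the config source itself.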

Umbracadabra! Magic-less code

Hopefully I've been able to explain a little of why avoiding these magic strings might be necessary and how we can improve upon them. Don't go crazy - there's no need to replace every variable in your code, but it's worth thinking about in the future each time you open those double-quotes!

Happy const-ing... and enum-ing... and Model Building!

*There are 10 types of people in the world: those who understand binary and those who don't.
