insert_separator

By nmusatti

I recently realized that one problem has kept cropping up in my code for the last few months: transforming a sequence of words into a comma-separated list. Not a big issue in itself, but I find it interesting for two reasons: first, it is almost impossible to specify a solution that matches the regularity and simplicity of the expected outcome; second, it’s a good test case for checking how well different languages support writing a generic solution.

I have to confess that I wouldn’t have given it much thought if this weren’t the kind of problem that really irks my sense of aesthetics; what you want is something as simple as:

string1, string2, string3

A very plain sequence, without any exception or special case. Yet when you try to code it you can’t avoid making a special case of either the first or the last string, as in this Python function:


def insert_separator1(l):
    if not l:
        # An empty list has no first element to special-case.
        return ""
    ret = l[0]
    for s in l[1:]:
        ret += ", "
        ret += s
    return ret
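For completeness, here is a minimal sketch of the other option mentioned above, where the last string is the special case instead of the first. The name insert_separator_last is mine, chosen only for illustration:

```python
def insert_separator_last(l):
    # Hypothetical variant: every string except the last is followed
    # by the separator, and the last one is appended on its own.
    ret = ""
    for s in l[:-1]:
        ret += s
        ret += ", "
    if l:
        # Skipped for an empty list, which yields an empty string.
        ret += l[-1]
    return ret

print(insert_separator_last(["string1", "string2", "string3"]))
```

The loop body is regular, but the exception has only moved from the head of the list to its tail.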

Otherwise you have to insert a conditional inside the loop, which conceptually you should only need to compute once, as in this alternative Python implementation:


def insert_separator2(l):
    ret = ""
    first = True
    for s in l:
        if first:
            first = False
        else:
            ret += ", "
        ret += s
    return ret

By the way, I’m well aware that Python’s standard library provides a ready-made solution:


print(", ".join(["string1", "string2", "string3"]))

While Python is very convenient for illustrating programming concepts, the code where this little problem keeps coming up is actually written in C#. This language also has a library solution, which is very similar to Python’s:


using System;

class InsertSeparator1
{
    static void Main(string[] args)
    {
        string[] l = { "string1", "string2",
                "string3" };
        System.Console.Out.WriteLine(
                string.Join( ", ", l));
    }
}

These library solutions are undoubtedly convenient, but they are limited to the scenario where you want to concatenate your list elements into a single string. Sometimes, however, you would rather output your strings to a file, or you might want to interleave other kinds of things, rather than just strings.
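To make the limitation concrete, here is a small sketch of what happens when the elements are not strings. The helper join_values is a name I made up for this example; the workaround it shows still builds one complete string in memory:

```python
def join_values(values, sep=", "):
    # Hypothetical helper: convert every value to str before joining.
    return sep.join(str(v) for v in values)

try:
    ", ".join([1, 2, 3])  # str.join accepts only strings
    join_failed = False
except TypeError:
    join_failed = True

print(join_failed)
print(join_values([1, 2, 3]))
```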

Let’s see, then, how different languages support a generic solution. Thanks to generators and duck typing, Python makes it very easy:


import sys

def insert_separator4(elements, sep):
    first = True
    for e in elements:
        if first:
            first = False
        else:
            yield sep
        yield e

if __name__ == "__main__":
    for e in insert_separator4([ "string1",
            "string2", "string3" ], ", "):
        sys.stdout.write(e)
    sys.stdout.write( '\n' )
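As a rough illustration of how far this goes, the sketch below feeds the same generator logic to a file-like object and then to a list of numbers. The names interleave and write_separated are my own, and io.StringIO merely stands in for a real file:

```python
import io

def interleave(elements, sep):
    # Same logic as insert_separator4, repeated here so that this
    # example runs on its own.
    first = True
    for e in elements:
        if first:
            first = False
        else:
            yield sep
        yield e

def write_separated(out, elements, sep=", "):
    # Hypothetical helper: works with anything that has a write()
    # method, such as an open text file or sys.stdout.
    for item in interleave(elements, sep):
        out.write(item)

buffer = io.StringIO()
write_separated(buffer, ["string1", "string2", "string3"])
print(buffer.getvalue())

# The elements and the separator need not be strings at all.
print(list(interleave([1, 2, 3], 0)))
```

Nothing is concatenated up front: each element and each separator is handed to the consumer as soon as it is produced.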

C# makes a generic solution slightly less easy, constrained as it is by static typing and by the subordination of generics to inheritance as a means of generalization:


using System.Collections.Generic;

class InsertSeparator2
{
    static IEnumerable<ValueType>
            InsertSeparator<ValueType>(
                    IEnumerable<ValueType> elements,
                    ValueType sep)
    {
        bool first = true;
        foreach (ValueType e in elements)
        {
            if (first)
                first = false;
            else
                yield return sep;
            yield return e;
        }
    }

    static void Main(string[] args)
    {
        string[] l = { "string1", "string2",
                "string3" };
        foreach ( string e in
                InsertSeparator<string>(l, ", " ) )
        {
            System.Console.Out.Write(e);
        }
        System.Console.Out.WriteLine();
    }
}

Note that any sequence you pass to this function must implement the IEnumerable<T> interface; arrays and the standard collections already do.

C++ is also statically typed, but thanks to a form of compile time duck typing supported by templates it doesn’t require an abstract base class:


#include <iterator>
#include <iostream>
#include <string>

template <typename InputIter, typename OutputIter,
        typename ValueType>
void insert_separator(InputIter first,
        InputIter last,
        OutputIter out,
        ValueType sep)
{
    bool initial = true;
    while ( first != last )
    {
        if ( initial )
            initial = false;
        else
            *out++ = sep;
        *out++ = *first++;
    }
}

int main()
{
    std::string l[] = { "string1", "string2",
            "string3" };
    insert_separator(l, l + 3,
            std::ostream_iterator<std::string>(
                    std::cout),
            ", ");
}

At first sight the C++ solution isn’t much better than the C# one. However, generalization is not just a matter of language constructs; it is also a matter of conceptual framework. With little effort on our part, the function above integrates perfectly with the standard library’s iterators and containers, as well as with any user-defined ones, provided they adhere to the standard library’s constraints.

Finally, there is a solution to this problem which satisfies my quest for elegance, but it’s currently late at night and this post is already long enough as it is ;-)
