Wednesday, October 01, 2008

The Idiom Trap

While on leave last week, looking for something to read in the local Borders, I chanced upon Joel Spolsky's Best Software Writing collection of essays. One that struck me, given some recent discussion at work about C++ (and how it makes you stupid, as if C# were a silver bullet), was the one titled C++ - The Forgotten Trojan Horse, which puts the language's success down to its being a superset of 'C'.

This is really also its greatest weakness.  Because C++ gets treated as 'C' with a few flanges bolted on, a lot of code that lives in .cpp files only because it has mutated enough to no longer compile as .c can remain 'C' in spirit, keeping the idioms (the comfort zone, as it were) of the older language.  It was once said that "The determined Real Programmer can write FORTRAN programs in any language."  These days, 'C' would be a more accurate target of the jest than FORTRAN: the resulting style of coding, use of supplied APIs aside, could be pretty much identical whether it carries a C(++), Java or C# label.

Of course, writing in nominal C++ while sticking to old, familiar 'C' idioms can make the code more comprehensible to a wider audience.  Contrast, for example,

while ((line = reader.ReadLine()) != null)...

with

cont<T> sequence( (std::istream_iterator< T >( stream ) ), std::istream_iterator< T >() );
...

So there is a serious trade-off to be made between robustness and being idiomatic in the full language, on the one hand, and going for code that is couched in familiar terms on the other.  Nor is this just a 'C'/C++ issue: C# evolves too, and C#1 idioms (like making two passes to remove items from a List) remain familiar, even though the List<T>.RemoveAll() method, which takes a predicate delegate, has been there since C#2.
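As a minimal sketch of that contrast within C++ itself, here is the same read written both ways. It assumes the input is whitespace-separated integers, and the function names (read_with_loop, read_with_iterators and the two parse_ wrappers) are made up for illustration rather than taken from any library.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Familiar style: an explicit loop that pulls one value at a time.
std::vector<int> read_with_loop(std::istream& in) {
    std::vector<int> values;
    int value;
    while (in >> value) {
        values.push_back(value);
    }
    return values;
}

// Idiomatic style: construct the container from a pair of stream iterators.
// The extra parentheses around the first argument stop the compiler from
// reading the line as a function declaration (the "most vexing parse").
std::vector<int> read_with_iterators(std::istream& in) {
    std::vector<int> values((std::istream_iterator<int>(in)),
                            std::istream_iterator<int>());
    return values;
}

// Hypothetical convenience wrappers so both versions can be tried on a string.
std::vector<int> parse_with_loop(const std::string& text) {
    std::istringstream in(text);
    return read_with_loop(in);
}

std::vector<int> parse_with_iterators(const std::string& text) {
    std::istringstream in(text);
    return read_with_iterators(in);
}
```

Both functions should return the same vector for the same input. The second is shorter and has no loop to get wrong, but the parentheses it needs are exactly the sort of thing that wants an explanatory comment for readers who only know the loop.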

It's just worse with 'C'/C++ because there is more legacy code and more legacy idiom to drag you down.  Anyone who has ever attempted to impose some const-correctness on a real piece of code will testify to that, let alone anyone who has tried to replace PCCHAR arguments with the appropriate choice of std::string& or std::vector<char>&, using checked STL containers that will actually throw on out-of-range access.  In that case, what holds back adoption of the idiom is more than the need for an explanatory comment (in the same way that you'd have to comment why you might be doing odd range bounds computations in the 'C'-ish code).  It is the pervasive nature of the change, usually reaching way beyond the immediate issue being addressed, that is the impediment to positive change.
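To make the checked-access point concrete, here is a small sketch. The function names are hypothetical, and it uses the standard at() member as the checked access, which is only one of the ways a container can be made to throw on an out-of-range index.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// 'C'-flavoured signature: raw pointer, no const, no bounds checking.
// Nothing stops a caller from passing an index past the end of the buffer.
char char_at_unchecked(char* buffer, std::size_t index) {
    return buffer[index];
}

// Idiomatic signature: const reference to a std::string.
// at() throws std::out_of_range instead of reading past the end.
char char_at_checked(const std::string& text, std::size_t index) {
    return text.at(index);
}

// Reports whether the checked version rejects a given index.
bool index_is_rejected(const std::string& text, std::size_t index) {
    try {
        char_at_checked(text, index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With a three-character string, index 2 is accepted and index 3 is rejected, where the unchecked version would quietly read whatever lies at that offset.  The ripple is visible even at this scale: a caller that still holds a raw buffer has to build a std::string (or be converted itself), and making a parameter const can in turn require const on every function it is passed down to.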

C++ has changed a lot since 1990, but has code changed to reflect that? The advice in The Pragmatic Programmer is to learn one new language each year.  Make sure that some years, the language you learn is one you think you already know, be it C++, C#, Ruby, Python, JavaScript, whatever, because they are all moving targets, and what you thought was best practice may have become obsolete.
