Friday, November 6, 2009

Long time no post

Wow, it's been forever since I last posted! So much for *that* New Year's resolution!

Life has been so hectic since my last post! I continued to guide my team through their Agile transition for a few more months. Then my company fell on some hard times, and they made some pretty deep cutbacks. Hard to justify keeping middle management when you have only a handful of developers! So I found myself out of a job for about two days before some old friends called looking for help with a project.

I started contracting with a small startup in the Electrical Power Systems space, building software to help optimize the electrical power grid's transmission facilities. It feels so good to have a real programming job again! Not that I had really stopped programming as a manager, but 10h a week programming just isn't quite the same as an intense project!

Right now we are nearing (I expect us to be there by Christmas) v1.0 of the product. Once that is out, my contract will be up and I'll have to start looking for what to do next. Decisions, decisions!

Saturday, February 21, 2009

That's why they send me, I am expert!

Aha! One of Jeff Atwood's Coding Horror blog posts this week was Are You An Expert? I love it when I read something that so clearly echoes my own personal frustrations! Beware of the self-proclaimed expert! He will be nothing but a hassle for your organization. Instead look for those who see themselves as artisans honing their craft, or as students who want to explore new things. To me those are the most valuable outlooks to have on an engineering team.

To hire self-proclaimed experts into your team is to open your team to feeling like you were just visited by Karl Hungus.

Sunday, February 15, 2009

Is it bigger than a breadbox?

This week saw my team start to actually use a burn down chart to help us measure our progress in our Sprint. It is a great way to track how the team is progressing, and it tells us very early if we are falling behind on our goals. But the whole technique presupposes that you have good and accurate estimates.
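
As a rough illustration of what a burn down chart checks, here is a minimal Python sketch. It assumes the ideal line burns down evenly across the sprint and that the team re-estimates remaining hours once a day; the function names and all of the numbers are made up for the example.

```python
def ideal_remaining(total_hours, sprint_days, day):
    """Hours that should remain after `day` days if work burns down evenly."""
    return total_hours * (sprint_days - day) / sprint_days


def behind_schedule(total_hours, sprint_days, remaining_by_day):
    """Return the days on which actual remaining work exceeds the ideal line.

    remaining_by_day[i] is the team's re-estimated remaining hours at the
    end of day i + 1.
    """
    late_days = []
    for index, remaining in enumerate(remaining_by_day):
        day = index + 1
        if remaining > ideal_remaining(total_hours, sprint_days, day):
            late_days.append(day)
    return late_days


# Hypothetical 10-day sprint with 200 estimated hours.
reported = [175, 160, 150, 125]
print(behind_schedule(200, 10, reported))
```

The check is only as good as the numbers fed into it, which is why the estimates matter so much.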

Something I have learned is that most developers could not estimate how much time it takes to make lunch, let alone develop a piece of software that has some degree of research and prototyping involved.

So it raises the question: How do you get your team to be good at estimating?

Years ago I read a phenomenal article by Joel Spolsky called Painless Software Schedules which has had a profound effect on the way that I estimate and view scheduling software projects. Joel now espouses Evidence Based Scheduling (EBS), which I like very much, but I feel that it is much better suited to long cycle, waterfall style development than Agile development.

As I see them, here are the most important things to do to help your team be good at estimating:
  1. Only someone who is a candidate to actually perform the work should estimate, not managers.
  2. Two heads are better than one - always have more than one person working together on estimating the tasks. Even if only one person knows the code well, having a second set of eyes on the task list and estimates will be a benefit.
  3. Estimates are not guesses. Serious thought needs to be put into estimating.
  4. All features must be broken into tasks. Large tasks are harder to estimate accurately than small tasks, so always try to break tasks down to units of not more than 2 days of effort.
  5. Keep a list of the tasks that your team commonly forgets to estimate (bug fixing time, integration testing, updating wiki pages - whatever your team needs), and ask the team to remember these items when they turn in their estimates.
  6. Every time you have a meeting to report time, focus on re-estimating how much time is remaining on the tasks, and on inputting newly discovered tasks.
  7. Keep track of the initial estimates and how much time actually gets spent on each feature.
There are some caveats. First you must get your team on board with this. They must understand why you are doing it, and how the numbers are going to get used. If they do not buy into the process, you might as well not use it.

Whatever you do, do not attempt to use this technique to verify how many hours your developers are working.


This whole process depends on you trusting your developers to do the right things. If you start using this as a punch card, developers will not be honest with you, and the whole system will be pointless. Resist the urge to total up how many hours individuals are spending in a week; you will only lose out if you do it.

Over time, if you track initial estimates and total time spent on a task, you will notice that your developers will tend to converge towards a consistent level of over- or underestimating. You can then form a personal fudge factor for each developer and apply it to his estimates. Because this will be a consistent factor, by applying it to a developer's estimates you will then be able to get a more accurate prediction of the time that any feature will take to develop.

Consistency is more important than accuracy because you can use that consistency to normalize your data!
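
To make the fudge factor idea concrete, here is a small Python sketch. It assumes the factor is simply total actual hours divided by total estimated hours for each developer, which is one reasonable way to compute it and not the only one; the names and history data are invented for illustration.

```python
from collections import defaultdict


def fudge_factors(history):
    """Map each developer to total actual hours / total estimated hours.

    history is a list of (developer, estimated_hours, actual_hours) tuples.
    A factor above 1.0 means the developer tends to underestimate.
    """
    estimated = defaultdict(float)
    actual = defaultdict(float)
    for developer, estimate, spent in history:
        estimated[developer] += estimate
        actual[developer] += spent
    return {dev: actual[dev] / estimated[dev] for dev in estimated}


def normalized_estimate(developer, estimate, factors):
    """Scale a raw estimate by the developer's factor (1.0 if unknown)."""
    return estimate * factors.get(developer, 1.0)


# Made-up history: "alice" runs 50% over, "bob" comes in 20% under.
history = [
    ("alice", 8, 12), ("alice", 4, 6),
    ("bob", 10, 8), ("bob", 5, 4),
]
factors = fudge_factors(history)
print(normalized_estimate("alice", 16, factors))
print(normalized_estimate("bob", 16, factors))
```

A developer who is off by the same ratio every time is easy to correct for; one whose ratio swings wildly is not, no matter how close the average lands.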

Saturday, January 24, 2009

Fear and Loathing in Massachusetts

Sometimes I feel like my life runs parallel to crazy movies that I have seen. I've always been a bit of a movie buff. Pretty much any movie is a good movie for me. Well... until a recent string of bad movies. But I am constantly amazed at how I get reminded of movies in my everyday life.

This is a real period of change for my team. Next week we plan to start our first ever real sprint. In preparation we decided that we should try to figure out how to do Story Point Estimation on our product backlog. Unfortunately we hadn't done enough up front work on creating clear and crisp User Stories, which made the task all the harder. As a consequence, team members had a lot of trouble understanding exactly what was being asked for, and we spent a lot of time debating the scope of what was being asked.

We had two bags of grass, seventy-five pellets of mescaline, five sheets of high-powered blotter acid, a salt shaker half-full of cocaine and a whole galaxy of multicolored uppers, downers, screamers, laughers.... also a quart of tequila, a quart of rum, a case of Budweiser, a pint of raw ether, and two dozen amyls.... But the only thing that worried me was the ether. There is nothing in the world more helpless and irresponsible than a man in the depths of an ether binge....
--Hunter S. Thompson, Fear and Loathing in Las Vegas

With no clarity on the scope, it was sometimes difficult for the team to focus on the problems that were actually likely to be real problems, and they focused on hypotheticals instead. While the exercise of debating was good for the team dynamics, I can't help but feel like we wasted the developers' time simply because the product managers didn't put the time into making proper user stories.

Saturday, January 17, 2009

Pick something to suck at less

So I've been lazy about posting for a while. I should post more often. I have a thousand excuses, but none of them are good. It just boils down to me being too lazy. My New Year's Resolution is to try to keep up on the blogging.

Anyhow, I have changed roles at my job and am now managing the main development team, so I figured I'd talk a bit about what we are doing and keep a journal of sorts on our new team odyssey: shifting from a waterfall development organization to an agile one.

We have a new QA manager, and I have to admit to really liking the guy. The other day I was talking about all of the different ways that the development team will have to change in the next few months, and he stops me and says something so profound and meaningful that I just had to write about it.
"Remember, you don't have to get good at everything; you just need to pick something to suck less at."

I admit, I am a bit of a perfectionist at times. It drives me nuts to not achieve my goals. And on a conceptual level, he didn't say anything that I didn't already know. But I had never had anyone word it in such a basic way. I think I'm going to adopt it as one of my new development mantras. It helps remind me that improvements need to be incremental.


Friday, October 10, 2008

Inconceivable!

I have been interviewing a lot of engineers lately, and I have to say I am disappointed in the way a lot of people misrepresent their level of knowledge on a topic. I mean, if I ask you how well you know something, don't you naturally expect that I am going to follow up asking questions that I think you should know the answers to if you answer in the affirmative?

The other day I had a fellow who told me he was an expert SQL programmer. So I asked him to tell me what a left outer join is. I didn't think it was a bad question. Anyone who really knows what a join is can pretty much guess it. But no, his answer was so bad, I was almost forced to whip out my favourite Billy Madison quote:
What you've just said is one of the most insanely idiotic things I have ever heard. At no point in your rambling, incoherent response were you even close to anything that could be considered a rational thought. Everyone in this room is now dumber for having listened to it. I award you no points, and may God have mercy on your soul.
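
For reference, a left outer join keeps every row from the left table and fills in NULLs wherever the right table has no match. A minimal sketch using Python's built-in sqlite3 module; the tables, columns, and names are made up for illustration.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE developers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, developer_id INTEGER, title TEXT);
    INSERT INTO developers VALUES (1, 'Ann'), (2, 'Raj');
    INSERT INTO tasks VALUES (10, 1, 'Fix login');
""")

# Every developer appears in the result, even one with no tasks.
rows = conn.execute("""
    SELECT d.name, t.title
    FROM developers AS d
    LEFT OUTER JOIN tasks AS t ON t.developer_id = d.id
    ORDER BY d.id
""").fetchall()

print(rows)
```

Raj has no tasks, so an inner join would drop him; the left outer join keeps him with a NULL (None in Python) for the task title.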
I have also had other people self-apply the term guru. If you want to get on my bad side, please say:
Oh yes, I am a Java Guru.
I know a little bit about quite a number of things. I would never refer to myself as a guru - I am not one. Not on any topic. However, my goal in life is to learn enough about each topic out there to be able to tell who is an expert, who is a guru, and who is a poseur. In general, until I reach that point with a subject, I profess only familiarity with it; I don't claim to know it.

Java is probably my strongest skill set. I like to think of myself as an experienced Java programmer, working my way towards being an expert but not there yet. If you are a guru, you had better know a lot more than I do.

In the past week I have seen gurus who don't understand the Java memory model, don't know the difference between ArrayList and LinkedList, don't know how variables are passed in Java (and when given obvious hints can't figure it out!), and who for some reason think it is just fine to not understand recursion. Gurus. In the words of Inigo Montoya:
You keep using that word. I do not think it means what you think it means.

Saturday, September 20, 2008

But nothing has changed in my environment!

Ah - the rallying cry of many an IT person when phoning an enterprise software company about a problem. Gartner Group claims that 80% of all failures in the enterprise are the result of a change in the environment. My experience jibes with that.
This was working and then it stopped, but we haven't changed a thing in our environment!
Almost universally they are wrong. They aren't lying to you. They just don't actually know what has happened on the systems, so they assume that nothing has changed. I worked for 6 years on a piece of software designed to try to help people detect these changes. While Veritas Configuration Manager can be a big help in many configuration change oriented scenarios, there are lots of scenarios it can't help with.

At my new job, we recently hit just such a scenario with a large customer. Our product connects to Oracle via the OCI driver. After months of functioning on this Linux box, suddenly and without warning, our product was failing consistently with a SIGSEGV. When we examined the stack we saw something that made my head swim: the SIGSEGV was coming from deep within the Oracle OCI calls.

But this made no sense. The customer verified that Oracle hadn't been patched in months. Then the senior member of my team noticed something I hadn't. The printout to stderr contained the following line:

*** glibc detected *** free(): invalid pointer: 0x08056450 ***

I had no idea what that meant. Lucky for us there is Google. The programmer's best friend. It seems that this message comes from the new mcheck() in glibc. And when we examined the details and read the new manpage for malloc() we saw that:

Recent versions of Linux libc (later than 5.4.23) and GNU libc (2.x) include a malloc implementation which is tunable via environment variables. When MALLOC_CHECK_ is set, a special (less efficient) implementation is used which is designed to be tolerant against simple errors, such as double calls of free() with the same argument, or overruns of a single byte (off-by-one bugs). Not all such errors can be protected against, however, and memory leaks can result. If MALLOC_CHECK_ is set to 0, any detected heap corruption is silently ignored and an error message is not generated; if set to 1, the error message is printed on stderr, but the program is not aborted; if set to 2, abort() is called immediately, but the error message is not generated; if set to 3, the error message is printed on stderr and program is aborted. This can be useful because otherwise a crash may happen much later, and the true cause for the problem is then very hard to track down.

It turned out that someone with access to the user that we run as had edited their environment scripts to set MALLOC_CHECK_=3 for other work they were doing. That setting was then inherited by every process the user ran. Then, when someone in the OCI layer double freed a pointer (something that is frowned upon, but not usually fatal), the program started aborting. Having the user unset MALLOC_CHECK_ resolved their issue.
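
The inheritance part is easy to demonstrate. Below is a minimal Python sketch, not our product's code: it starts a child process with MALLOC_CHECK_ set in its environment, then again with the variable stripped out, and reports what the child sees each time. The helper names are hypothetical.

```python
import os
import subprocess
import sys

# Child program: report the MALLOC_CHECK_ value it was started with.
CHILD = "import os; print(os.environ.get('MALLOC_CHECK_', 'unset'))"


def child_sees(env):
    """Start a child Python process with `env` and return what it reports."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def without_malloc_check(env):
    """Return a copy of `env` with MALLOC_CHECK_ removed."""
    return {key: value for key, value in env.items() if key != "MALLOC_CHECK_"}


# Simulate the edited login scripts: the variable is set in the parent.
polluted = dict(os.environ, MALLOC_CHECK_="3")
print(child_sees(polluted))                        # child inherits the setting
print(child_sees(without_malloc_check(polluted)))  # child no longer sees it
```

Nothing in the child asked for the setting; it simply arrived with the environment, which is exactly why "nothing has changed" felt true to the customer.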

Thursday, September 18, 2008

Refactoring

I posted an article for Accurev about Refactoring over at the Accurev Blog. I truly do feel that this is the single most useful trick in my professional bag of tricks. I sometimes feel that I have only one marketable skill - I can grok other people's code with relative ease. And I credit that skill to refactoring having taught me how to drink deeply of the cup of code.


UPDATE: My Refactoring post on the Accurev Blog was listed as #86 on today's Wordpress' Top 100 of the Day.

Saturday, September 13, 2008

AIX Clock Ticks Per Second

In the wake of our failed startup attempt, I've taken on a new job to pay the bills. You have to do that kind of thing when you have kids. I am running the Continuing Engineering group for an enterprise monitoring software company.

Last week we hit an interesting bug that I thought I'd talk about. In our datafiles we were recording a formatted timestamp that had more than 1000 milliseconds in its output. Well, when we tried to load the datafile into the Sybase database, we got an error.

Now obviously the code needs to be more defensive, and our Time class itself needs to prevent people from entering bad values. However, the real question is: how did we ever get a millisecond value greater than 1000? After some digging around, I finally found a piece of code written in 1999 that has a bug in it. This code has worked for coming up on a decade without reported error!

The problem was subtle. The very last line of the description of the times() manpage contains the warning:
Applications should use sysconf(_SC_CLK_TCK) to determine the number of clock ticks per second as it may vary from system to system.

Either the author didn't read the manpage or he simply never believed that anyone would monkey with the Clocks per Second setting in the AIX kernel. Well, someone did!
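
Here is a hedged Python sketch of how this class of bug can arise. It is a hypothetical reconstruction, not the 1999 code: the buggy variant assumes every tick lasts 10 ms (100 ticks per second), while the safe variant takes the tick rate as a parameter, which on a Unix system can be queried with os.sysconf("SC_CLK_TCK"). All function names are invented for the example.

```python
import os


def system_ticks_per_second():
    """Ask the OS for the tick rate instead of assuming one (Unix only)."""
    return os.sysconf("SC_CLK_TCK")


def split_ticks(ticks, ticks_per_second):
    """Convert a tick count to (whole seconds, milliseconds 0..999)."""
    seconds, leftover = divmod(ticks, ticks_per_second)
    return seconds, leftover * 1000 // ticks_per_second


def split_ticks_hardcoded(ticks, real_ticks_per_second):
    """Hypothetical buggy variant: assumes every tick lasts 10 ms."""
    seconds, leftover = divmod(ticks, real_ticks_per_second)
    return seconds, leftover * 10


# At 100 ticks per second both versions agree.
print(split_ticks(250, 100), split_ticks_hardcoded(250, 100))
# At a hypothetical 1000 ticks per second the hard-coded version overflows.
print(split_ticks(2500, 1000), split_ticks_hardcoded(2500, 1000))
```

In real use, split_ticks would be called with system_ticks_per_second() rather than a literal, so a retuned kernel changes the input instead of silently breaking the arithmetic.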

Wednesday, August 6, 2008

Process Agnostic

I have frequently found myself in discussions with people who are rabidly infatuated with some process or other. Pro-Agile, Pro-CMM, Pro-Waterfall, Pro-XP, Pro-SCRUM. I've worked in shops that used all of these processes over the years, and my experience is this:

Every one of those rules and is the greatest thing to ever happen to you.
And at the same time;
Every one of them sucks and you are bound for complete and total failure if you use it.


But how can this be???

The answer is simple. Software engineers are not assembly line workers. Processes and methods that work very well for one developer may not work for another at all. Two different software teams can follow 100% identical processes and get radically different results. One-size-fits-all just does not work when you are dealing with coders. I think it is because coding is something that is more akin to scientific research than it is to manufacturing or industrial design.

In order to have a truly successful process you must design your processes specifically for your team of engineers. If you don't, you could get lucky and pick a process that happens to work well for your team. You might pick a process that leads to failure. Most likely you will pick a process that the team does OK with, but that no one is really happy with.

My advice: invest the time in letting the team refine the processes for themselves. You will not regret it.