I joined Citrix in 2002 as a developer, when we were still a single-product company (MetaFrame) with about 1,800 employees and revenues of approximately $500m. Since then, we’ve had magnificent growth in nearly every possible metric – revenues, employees, customers, market share and strategic importance to IT. Working behind the scenes to make it all happen are XenApp’s 18 Technology Component teams, working around the clock in 5 countries to develop 450 product binaries and rigorously test 320 build layouts throughout the project cycle. With this scale and explosive growth, the question of “How do we deliver products that bring the most value to IT at the lowest cost in the shortest time?” is one that our engineering teams deal with in every product release cycle.
Over the last 10 months on the XenApp Release Team, I learned 10 valuable lessons in software engineering management (in addition to losing some sleep and hair 🙂 ). Although these observations are based on Citrix-specific examples, I believe they contain universal truths that apply not just to software development at Citrix but to any large IT or software project. I also want to give you some insight into what our engineering processes are like.
Lesson 1: Tackle cross-team dependencies first
XenApp Platinum Edition has many components with numerous cross-team dependencies. If your team owns an SDK (even a couple of APIs or modules for another team to consume), focus on that first, before worrying about your internal milestones. Why? There are two reasons. First, the team that consumes your interfaces might in turn have other important milestones that depend on the SDK being delivered on time. You can’t starve them. Second, most problems arise at these touch points: RPC calls, 32-bit consumers of a 64-bit SDK, and the logistics of releasing an SDK, which are not trivial. Build and install issues are often unaccounted for or overlooked. That is why it is important to at least provide a skeletal interface (one that returns dummy data) and fill in the meat later. Remember – it is better to be somewhat correct than precisely wrong.
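To make the "skeletal interface" idea concrete, here is a minimal sketch in Python. Everything in it (`SessionInfo`, `SessionSdk`, `StubSessionSdk`, `count_active`) is a hypothetical name invented for illustration, not a real Citrix API. The point is only that the consuming team can start coding against the contract while the real implementation is still being written.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class SessionInfo:
    """Hypothetical record returned by the SDK (illustrative only)."""
    session_id: int
    user: str
    state: str


class SessionSdk(ABC):
    """The contract the consuming team codes against (hypothetical)."""

    @abstractmethod
    def list_sessions(self) -> List[SessionInfo]:
        """Return the sessions currently known to the server."""


class StubSessionSdk(SessionSdk):
    """Skeletal implementation handed off early: fixed dummy data, no real logic."""

    def list_sessions(self) -> List[SessionInfo]:
        return [
            SessionInfo(session_id=1, user="alice", state="active"),
            SessionInfo(session_id=2, user="bob", state="disconnected"),
        ]


def count_active(sdk: SessionSdk) -> int:
    """Consumer-side code: runs against the stub today, the real SDK later."""
    return sum(1 for s in sdk.list_sessions() if s.state == "active")


if __name__ == "__main__":
    print(count_active(StubSessionSdk()))
```

Because the consumer depends only on `SessionSdk`, swapping the stub for the real implementation later should not require consumer changes, and build and install problems with the hand-off itself surface early.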
Lesson 2: Refactor gradually, not all at once
Our natural tendency as engineers is to try and build the best performing, ideally architected, and most logically modular components. Realistically, it is impossible to score on all these fronts unless you have infinite time. Software engineers are builders and architects. They are extremely passionate about their buildings (code). We often come up with grandiose ideas of re-writing entire components with the latest available technology to try and make it all better and new, this time. But it just doesn’t work like that. Here are the pitfalls in “grand” refactoring –
• Grand refactoring comes at a huge upfront cost that we tend to ignore (most of this is in reverse-engineering legacy code). These costs are hard to justify to managers in a tough economy like this, especially when everything is working as is.
• There is no guarantee that refactored code will perform better. The original engineer who wrote that code made certain design decisions consciously (might have been subtle compromises). Statistically speaking, the new engineers who are doing the refactoring are no smarter than the original authors of the code. So don’t touch something if it is not completely broken.
• By the time you are halfway into refactoring, it is quite likely that a new set of requirements will come in that contradicts your refactoring plan.
I am not against refactoring. Here’s what I think is the best approach to refactoring:
• Identify the top problematic areas (in key measures such as maintainability, performance and security) and start by going after those first.
• Learn from smaller refactoring undertakings before you take a big step (think big but start small).
• Advertise refactoring improvements. If a modest investment got you a big gain, blog about it, share your experiences so others can learn from it.
• Refactoring must be continuous (in my opinion, every release should set aside 5-7% of its budget for refactoring improvements), but don’t overdo it.
Lesson 3: Don’t overestimate demos
Demos are a great way to reveal earned value. They help showcase engineering innovation, secure (or maintain) funding and, most importantly, give confidence in your design. But demos that show PoCs (proofs of concept) should not be mistaken for end products. We take several shortcuts when doing demos. There is a long way between demo/prototype quality and release quality, and you need to account for it in your project plan.
Lesson 4: Don’t underestimate integrations
When a novice engineer says “Oh, it’s easy, it will take 10 minutes to do”, they are almost always wrong. This is especially true of system integration. The convenience of having a VBL (Virtual Build Lab, a sort of private tree for building code) for isolated and disruptive development does not come for free (unless you are isolated by binary-based releases, and even there you may have a cost arising from dependency alignment). Integrations don’t end when your code compiles. It all needs to work (at Citrix, we use a product-wide test automation framework that runs on every build to ascertain quality metrics on a continuous basis). Assign a generous amount of time to integrations.
Lesson 5: Automation is not a silver bullet
There is a sign in the first-floor break room of our main XenApp engineering building here in Ft. Lauderdale that says “Automation is not a silver bullet”. Keep a copy of it in your office. Certain code lends itself well to automation and certain classes of code do not. As with refactoring, here are some thoughts I have on automation:
• Full-fledged automation comes at a huge fixed cost. If your payback period is more than 3 years, rethink it. We are in a fast-paced industry. The scenario or code that you automate now may be far less important (or not even applicable) 2-3 years from now.
• There is code that can be tested very effectively using automation (e.g. session management, capacity and load management), and some that just can’t be (e.g. multimedia). 100% automation is impractical (trying to achieve 100% of anything is somewhat impractical, for that matter).
• Automation also needs continuous maintenance. For example, if you author automated unit tests, you need to make sure a. they keep running release after release and b. they keep passing release after release. As code changes, the way it is tested may also need to change. So while calculating payback, be sure to attach a maintenance cost for every test you author.
• Like refactoring, I believe automation is one of those things that you should budget for in every release without overdoing it. Start by automating whatever gives the biggest bang for the buck.
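The payback reasoning in the bullets above can be sketched as a small calculation. This is a simplified model under my own assumptions: a one-time fixed cost, a constant yearly saving in manual testing effort, and a constant yearly maintenance cost. The function names and all figures are hypothetical, chosen only to illustrate the 3-year rule of thumb.

```python
def payback_years(fixed_cost: float,
                  annual_manual_cost_saved: float,
                  annual_maintenance_cost: float) -> float:
    """Years until the automation investment pays for itself.

    Returns infinity when maintenance eats up all of the yearly saving,
    i.e. the automation never pays back.
    """
    net_annual_saving = annual_manual_cost_saved - annual_maintenance_cost
    if net_annual_saving <= 0:
        return float("inf")
    return fixed_cost / net_annual_saving


def worth_automating(fixed_cost: float,
                     annual_manual_cost_saved: float,
                     annual_maintenance_cost: float,
                     max_years: float = 3.0) -> bool:
    """Apply the rule of thumb: rethink if payback exceeds max_years."""
    years = payback_years(fixed_cost,
                          annual_manual_cost_saved,
                          annual_maintenance_cost)
    return years <= max_years


if __name__ == "__main__":
    # Hypothetical figures: 60,000 up front, 30,000 saved and 5,000 spent per year.
    print(payback_years(60_000, 30_000, 5_000))     # 2.4
    print(worth_automating(60_000, 30_000, 5_000))  # True
    # Same investment, smaller saving: payback stretches to 4 years.
    print(worth_automating(60_000, 20_000, 5_000))  # False
```

Leaving the maintenance cost out of the model makes every automation project look better than it is, which is why it appears as an explicit input here.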
Lesson 6: Keep processes lightweight, yet efficient
As part of the Release Team, I have an obligation to make sure that release processes facilitate rather than burden Technology Component (TC) teams. Citrix’s processes have evolved greatly since we started. When Citrix had fewer than 10 engineers in the late 80s, they kept track of bugs on a single common whiteboard! Simple yet effective. 20 years later, we have sophisticated bug tracking and requirements management systems. Yet we still strive for the same simplicity and effectiveness that we had 20 years ago. To that end, we introduced a number of process improvements:
• Formal requirements authoring processes take a back seat; lightweight feature specs with visuals, screenshots and video demos take center stage. We need to document the things that need to be communicated, not every possible thing one could ask for.
• Formalized Graphical Test Plans only where they make sense (for complete end-to-end features, not for individual components or modules).
• For feature submissions that are surgical changes, you don’t need to produce a full code coverage report (the benefit doesn’t always justify the cost). But make sure you methodically and carefully step through the code changes with various inputs.
• Test automation only where ROI can be justified.
• No more separate low level and high level design documents. They can be combined into one that is readable by both technical and non-technical audiences.
Lesson 7: Cross-team hand-off is more than just being code-complete
With a process methodology that is component-centric, we “deliver” frameworks, “release” SDKs or “hand off” API sets. I get really upset when a team claims to have done one of these things without having a single consumer try it out first. How do you know that your hand-off meets the requirements of the SDK consumer? Does it even work? Did you factor in the time that it takes to write an installer and release a build, tasks that go hand in hand with releasing something? I really think hand-offs must be signed off not by the team that releases them, but by the team that actually consumes them.
Lesson 8: Component complexity is multiplicative, not additive
The number of cross-team dependencies that we have in our product is amazing. This is no different from any other mature product of our size. Don’t underestimate this when taking on a new project. If 20 TC teams each take on a project with a complexity of n, the overall complexity of the release becomes n^20, not 20n. This is due to the high cost of integration and integration testing. Keep this in mind before you take on a new feature. Often, less is more.
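To get a feel for the gap between the two formulas, here is a quick calculation that takes the rule of thumb above literally. The complexity values of 2 and 3 are arbitrary and unitless, picked only for illustration; this is not a measured model of any real release.

```python
def additive_complexity(n: int, teams: int) -> int:
    """Overall complexity if each team's work simply added up: teams * n."""
    return teams * n


def multiplicative_complexity(n: int, teams: int) -> int:
    """Overall complexity if each team's work multiplies the rest: n ** teams."""
    return n ** teams


if __name__ == "__main__":
    teams = 20
    for n in (2, 3):
        print(n,
              additive_complexity(n, teams),
              multiplicative_complexity(n, teams))
    # n = 2: 40 versus 1,048,576
    # n = 3: 60 versus 3,486,784,401
```

Even a small increase in per-team complexity (from 2 to 3) barely moves the additive figure but multiplies the other by more than three thousand, which is the intuition behind "less is more".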
Lesson 9: What quality means…
The definition of quality has changed in the last 2 decades. The 80s and 90s were about trying to achieve high quality control. If you had no Six Sigma or SEI certification, you were considered an outcast. Things have changed since then. In the age of well-designed products like the iPhone, Windows 7, Google Maps, Amazon and Salesforce, strong visual appeal, good design and usability are now taking center stage. This has been true for Citrix as well. In the last few years, our focus has been shifting from a blind, knee-jerk, single-dimension view of “bug counts” to a more practical approach of solving real customer pain points (for example: XA5 for Win2003 IMA resiliency, VM Hosted Apps, etc) and making some smart investments in design and intuitive visual appeal (Dazzle, Receiver, etc). We’ve also been keeping maintenance costs low, and passing the cost benefits on to our customers, by shedding expensive baggage (deprecating high-cost, niche legacy features and removing code that is no longer executed). Keep looking for these opportunities at the code level.
Lesson 10: Balance idealism and practicality
This is really the biggest lesson that I’ve learned. You can’t achieve your ideal vision in one release. Be practical and be patient. Go after hard and known technical problems one at a time. Build on small successes to springboard you to the next level. Enterprise products and quality are built like concentric circles, from the inside out. Create and follow a roadmap to your ultimate vision.