Friday, January 21, 2005

My Development and Deployment Strategies

So I tried to come up with a document that summarizes my Development and Deployment Strategies. Enjoy...

1. Tools in use

1.1. Version Control

Critical to the development of modern applications is strict control of source files and versions. The source code control system is responsible for maintaining a history of file changes and for acting as a central repository for the archival and distribution of the application source code. There are many modern version control systems: some free, some included with other tools, and some very costly (and powerful). For most development shops' needs, the most logical tool to use is one of the following:

1.1.1. CVS - Concurrent Versions System

The CVS system is a very mature, free open-source system in use in most modern development environments. It is most easily hosted on a Unix/Linux system, but can also be deployed on Windows servers. It is under active development and is supported by almost all clients (Windows, Unix, Linux, etc.). Notably, there are a couple of very good Windows clients to ease check-in and check-out. The current release of CVS is 1.12.11 (released on Dec 13, 2004), which can be found at the home page[i]. There is a fairly complete tutorial on CVS server use[ii].

I recommend TortoiseCVS for the client, as it integrates very easily with the Windows Explorer context (right-click) menus. It is a free open-source tool under active development, and is very stable and mature. The current release of TortoiseCVS is 1.8.11 (released on Dec 30, 2004), which can be found at the home page[iii].

1.1.2. Subversion

The Subversion system is a mature, free open-source system designed to replace CVS and augment its abilities with some often-requested features and improvements. Its major advantage over CVS is much better support for atomic operations (either every file in a check-in is committed, or none are).
Additionally, it has much more efficient support for branches and for adding and removing directories and files from a project, while still maintaining an easy interface to the older versions. It is also much more efficient in network use, as it always sends “difference-only” messages, whereas CVS sends entire files from the client to the server. The current release of Subversion is 1.1.3 (released on Jan 14, 2005), and can be found at the home page[iv]. There is an excellent online book about Subversion use and setup[v].

I recommend TortoiseSVN for the client, as it integrates very easily with the Windows Explorer context (right-click) menus. It is a free open-source tool under active development, and is very stable and mature. The current release of TortoiseSVN is kept in sync with the Subversion release and is 1.1.3 (released on Jan 20, 2005), which can be found at the home page[vi].

1.1.3. Visual Source Safe

VSS is a source code control package included with Microsoft Visual Studio. It performs fairly well for smaller teams and is well integrated into the Microsoft development suite. It does not support atomic check-in and has much more limited branch support than Subversion or CVS. Additionally, it is a file-based system, so it requires Microsoft Network sharing. The best reason to use VSS is that it’s included in most setups and developers will usually be quite familiar with it.

1.2. NAnt

NAnt is a free open-source project build tool that automates the process of compiling, linking and deploying builds of projects. It is similar to the Make or NMake tools, and grew out of the Java community’s Ant tool. NAnt is driven by a task list in an XML file. It can control most development tasks, and can even be extended with additional tasks that are unique to a specific environment. The task list is processed with complete dependency checking and task ordering.
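To make “dependency checking and task ordering” concrete, here is a minimal sketch in Python (not NAnt's actual implementation, and not its XML syntax). The target names and the TARGETS table are hypothetical; the point is only that each target runs once, after everything it depends on, and that circular dependencies are rejected.

```python
from typing import Dict, List, Set

# Hypothetical build targets: name -> names of targets that must run first.
TARGETS: Dict[str, List[str]] = {
    "clean": [],
    "compile": ["clean"],
    "test": ["compile"],
    "deploy": ["compile", "test"],
}


def execution_order(goal: str, targets: Dict[str, List[str]]) -> List[str]:
    """Return the targets needed for `goal`, each listed after its dependencies."""
    order: List[str] = []
    finished: Set[str] = set()
    visiting: Set[str] = set()

    def visit(name: str) -> None:
        if name in finished:
            return  # already scheduled; each target runs only once
        if name in visiting:
            raise ValueError(f"circular dependency at {name!r}")
        if name not in targets:
            raise KeyError(f"unknown target {name!r}")
        visiting.add(name)
        for dependency in targets[name]:
            visit(dependency)
        visiting.discard(name)
        finished.add(name)
        order.append(name)

    visit(goal)
    return order


if __name__ == "__main__":
    print(execution_order("deploy", TARGETS))
```

Asking for "deploy" schedules "clean", "compile" and "test" first, even though "compile" is named as a dependency twice.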
NAnt can directly consume and build Microsoft Visual Studio project and solution files and is optimized for use with .Net projects. It is a free open-source tool under active development and can be found at the home page[vii]. The current version is 0.85 (released on Nov 11, 2004). In addition to the documentation on the home site, a very good tutorial on the use of NAnt is available[viii]. A very nice example of a NAnt script is also part of the flexwiki project[ix].

1.3. NUnit

NUnit is a unit-testing framework for .Net. It was derived from the JUnit framework and from ideas developed by the Agile software development methodologists. NUnit allows unit tests to be written in source code; the framework then executes them automatically and reports on their status. This allows automatic testing to occur at every point of the development process. The current version is 2.2 (released on Aug 9, 2004), available at the home page[x]. Very good documentation is available on the home site as well as on many sites dedicated to unit testing.

The value of unit testing cannot be overstated. It allows the developer to “work with a net”, ensuring that changes they make do not break other parts of the system and that requirements captured as unit tests are actually completed. The rate at which unit tests are completed and made to pass provides a good indicator of the progress and status of the project.

1.4. Cruise Control.Net

Cruise Control.Net is a build automation tool for .Net projects that scripts the regular flow of watching the source code control system for changes, triggering a build, and reporting the results. It is a very mature, free open-source product. The current version is 0.8 (released Jan 20, 2005), which is available at the home page[xi]. It is installed as a Windows service to ensure that it is always running after the Build Server is rebooted.
The service is known as CCService, and can be stopped, started and restarted using the standard Windows Control Panel / Administrative Tools / Services task. It can also be stopped and started by executing the appropriate commands from the command prompt on the Build Server (e.g. NET STOP CCService or NET START CCService).

The actions of Cruise Control.Net are driven by an XML configuration file called ccnet.config, which is located in the Cruise Control.Net program’s directory. Cruise Control can do build automation for any number of projects. Each project is described, along with a schedule for builds, the tasks to be run, the source code control system to be monitored, and the people to report build status to (via e-mail). In most cases it is set up to do continuous monitoring and to do a build every time a change is made. Since builds take a while, it would be wasteful to start a build and then need another one seconds later, so you can configure a quiescent period. Typically a wait time of 60 seconds is used. That way, after a change is committed, a build will start, but not until at least 60 seconds have passed with no other commits.

Once the need for a build is detected, the appropriate NAnt task is triggered, which then does the meat of the build. When the build is complete, its status is taken from the return status of the NAnt execution, and the output of the build and unit tests is emitted to the Cruise Control.Net project dashboard page. Cruise Control.Net acts as a web server to allow browsing of the current status and history of builds.

Finally, the build notification e-mails are sent using the list in the ccnet.config file, so if you want someone else to get those notifications, that’s the place to edit the addresses. One thing to be clear about regarding those addresses: recipients tagged with “always” get a notification of every build, while those tagged with “change” get a notification only when the build goes from failed to succeeded or from succeeded to failed.
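The quiescent-period and notification rules above can be sketched as follows. This is an illustrative Python model only, not Cruise Control.Net code or its configuration format; the function names, the timestamp representation and the example.com addresses are all assumptions made for the example.

```python
from typing import Dict, List

QUIET_PERIOD_SECONDS = 60.0  # the typical wait time mentioned above


def should_start_build(commit_times: List[float], last_build_time: float,
                       now: float,
                       quiet_period: float = QUIET_PERIOD_SECONDS) -> bool:
    """Build only if there are unbuilt commits and the newest is old enough."""
    pending = [t for t in commit_times if t > last_build_time]
    if not pending:
        return False  # nothing has changed since the last build
    return now - max(pending) >= quiet_period


def recipients_to_notify(recipients: Dict[str, str], previous_ok: bool,
                         current_ok: bool) -> List[str]:
    """Pick addresses by tag: 'always' every build, 'change' only on a flip."""
    status_changed = previous_ok != current_ok
    return sorted(
        address
        for address, tag in recipients.items()
        if tag == "always" or (tag == "change" and status_changed)
    )


if __name__ == "__main__":
    # Hypothetical recipient list: address -> notification tag.
    team = {"dev@example.com": "always", "lead@example.com": "change"}
    print(should_start_build([100.0, 150.0], last_build_time=0.0, now=160.0))
    print(recipients_to_notify(team, previous_ok=True, current_ok=False))
```

Note that a second commit restarts the wait: with commits at 100 and 150 seconds, no build starts at 160 seconds, only once 210 seconds is reached without further commits.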
Recipients tagged “change” are typically the people you want to raise red flags to when a build first goes bad or has just been fixed.

1.5. Microsoft Application Blocks

All Microsoft Application Blocks are free open-source code libraries that encapsulate and codify common .Net development patterns and practices. They offer the ability to incorporate advanced functionality without having to reinvent the wheel. All are well documented on the Microsoft Patterns & Practices website[xii], with source available. In the near future, a new generation of these application blocks will be released as the Enterprise Library[xiii].

1.5.1. Data Access

The Data Access Application Block is a .NET component that contains optimized data access code that will help you call stored procedures and issue SQL text commands against a SQL Server database. The documentation provides guidelines for implementing an ADO.NET-based data access layer in a multi-tiered .NET application. It focuses on a range of common data access tasks and scenarios and presents guidance to help you choose the most appropriate approaches and techniques. The block encapsulates performance and resource management best practices and can easily be used as a building block in your own .NET application. If you use it, you will reduce the amount of custom code you need to create, test, and maintain. Drivers and adapters for Microsoft SQL Server, Oracle, OLEDB and ODBC data sources are included. The most current version is actually hosted on the GotDotNet web site[xiv] and should be downloaded from there.

1.5.2. Exception Management

Exception Management Application Block for .NET consists of an architecture guide and an application block. The documentation discusses design and implementation guidelines for exception management systems that use .NET technologies. It focuses on the process of handling exceptions within .NET applications in a highly maintainable and supportable manner.
Exception Management Application Block for .NET provides a simple yet extensible framework for handling exceptions. With a single line of application code, you can log exception information to the Event Log, or you can extend the block by creating your own components that log exception details to other data sources or notify operators, without affecting your application code. Exception Management Application Block for .NET can easily be used as a building block in your own .NET application. It can be downloaded from Microsoft[xv].

1.5.3. Logging

Building useful logging capabilities into your applications can be a significant challenge. At the very least, you need to determine what information is appropriate to log, design the events themselves, and make them available for analysis in an appropriate format. Effective logging is useful for troubleshooting problems with an application and provides useful data for analysis, helping to ensure that the application continues to run efficiently and securely.

To help provide effective logging for enterprise applications, Microsoft has designed the latest patterns & practices application block: the Logging Application Block. This block is a reusable code component that uses the Microsoft Enterprise Instrumentation Framework (EIF) and the Microsoft .NET Framework to help you design instrumented applications. It can be downloaded from Microsoft[xvi].

1.5.4. Other Blocks

There are several other application blocks that are more difficult to integrate into projects initially but that address other common issues and may be very useful for some applications. They can be downloaded from the main Microsoft Patterns & Practices website.

· User Interface Process
· Cache Management
· Authorization and Profile

1.5.5. Other Tools

Object Relational Managers

Depending on the complexity of the databases in use, it may be appropriate to use an object-relational data manager.
These tools ease the process of persisting the domain objects into the database. In particular, NHibernate[xvii] is a very complete solution. It is a free open-source package with a good parallel in the Java community.

Page Template / Master Page Frameworks

In most web-based applications, it is important to deliver a consistent look and feel. This is best done in the .Net environment through the use of a framework that exposes the “inner” variant page content as user controls. A good free open-source package is available on CodeProject[xviii].

2. Deployment Processes

2.1. Environment Infrastructure

2.1.1. Server Machines

The server machines in the recommended development environment are not intended to be used directly by any developer. They should only provide the functionality of their intended role and should never be used as a user’s workstation. No development or modifications should ever be performed directly on a server machine. This is in stark contrast to the typical past-generation .ASP and CGI development techniques.

Web Servers

The web servers are designed to run the application presentation logic, and any necessary data access and business logic that makes up an application. The web servers are often configured in a pooled environment to allow for load sharing, though for some projects the workload may not warrant that level of complexity. At minimum, the web servers should have the desired .Net runtime and Framework SDKs installed; typically Windows Server 2003 will be the operating system. In the recommended deployment strategy, it is expected that at least three web servers will be created, one each for the roles of Authoring, Testing and Production. Please refer to section 2.2.2 for details on how these machines are configured and used.

Database Servers

The database server is used to house the databases and should not have any other functionality.
You may use a single machine or (given the appropriate database software support) a pool of fail-over machines. It is anticipated that for many shops, both Oracle and Microsoft SQL Server will be used. Each of the development roles of Authoring, Testing and Production (see section 2.2.2) should have its own database server or database server instance. The configuration component will automatically determine the connection string to be used for the development role.

Application Servers

The application servers host any business logic (typically exposed as Web Services or through .Net Remoting) that should be centralized and isolated from the normal Web Server farm. Additionally, application servers are used to run any “batch processing” tasks that do not need user input or that take a long time to execute. As with all the other servers, there is a role-specific instance for each of the Authoring, Testing and Production uses. It is expected that most business logic will be run on the Web Server, probably with component sharing at the DLL layer with the application server’s batch programs. If a project needs better isolation between the Web Servers and internal resources, the business logic can be coded as Web Services.

Source Code Control Server

The source code repository server hosts the chosen version control software. If Subversion or CVS is chosen, the server can be Windows or Unix/Linux. If Visual Source Safe is used, then it will have to be a Windows machine with Microsoft Networking shares available. There is no need for a role-specific setup for this server.

2.1.2. Build Machine

Daily Builds

Daily builds are performed on a developer-class machine, which has all the normal development tools installed. It is not to be used directly by developers. Rather, it is set up with the Cruise Control.Net system to perform automatic builds. The build server acts as a buffer against a developer’s natural tendency to customize his or her development environment.
Rather than incurring the loss of productivity that denying developers their favorite tools would cause, the build environment is standardized by moving the build process to a dedicated machine. Additionally, the fact that the builds are automatic ensures that the process doesn’t come to a halt just because the “buildmeister” developer is not available. The build server does not need to be mirrored for the Testing and Production roles, as its only use is to create Authoring builds.

This machine should have Visual Studio, the .Net Framework SDK and any custom controls needed (such as third-party UI widgets, database drivers, etc.) to mirror the baseline for a developer machine. It is important to realize that every build starts with a clean slate, ensuring that a build can be rebuilt on a (suitably configured) brand new build server at any time.

Build Repository

The builds are performed regularly and each is assigned a build number. They are then labeled in the source code control system, and the actual builds are archived to the repository of builds. When a build is thought to be a candidate for promotion to the Testing role, the actual build (and the entire source) can be retrieved from the build repository. This need not be a separate server, just some designated storage. Policies about how long builds stay in the repository are determined by how stable the project seems. In the early stages, it is quite appropriate to keep more non-promoted builds in the repository to make it easier to choose a “best known” build to promote for interim testing. Any build that is promoted to Production should be permanently archived.

2.1.3. Development Machine

The development machine is what the individual developers use on a daily basis to develop and support the applications. It should have all the same tools and third-party controls installed as the Build Server. It does not have Cruise Control.Net installed, as the developers are not responsible for the daily build process.
Additionally, no developer should ever copy binaries, images or pages to a server of any role. Doing so bypasses the build process, which sabotages the ability to always rebuild from the sources in the version control system. When a developer checks files into the source code control system, the Cruise Control.Net process running on the Build Server will automatically begin a build.

2.2. Deployment Path

2.2.1. Continuous Integration

The general strategy to follow is known as Continuous Integration and was formalized by Martin Fowler. A great introduction to the principles of Continuous Integration can be found here[xix]. Continuous Integration ensures that developer changes are quickly assimilated into the overall build. This has several benefits:

· Changes are guaranteed to be in the source code control system
· Changes made by one developer are quickly integrated with other developers’ changes and (through unit tests) conflicts are detected
· Any build is a candidate for release, meaning that progress is steady and obvious
· Developers can see and benefit quickly from changes made by other team members

Critical to the success of Continuous Integration is regular check-ins by developers, including database administrators. Unit tests are just as critical, in that they quickly indicate breaking changes where one developer has made changes that impact existing code. Unit tests are automatable, making it possible for them to be executed automatically on the Build Server by NAnt scripts.

2.2.2. Promotion Roles

Essential to managing the large number of potential releases generated by the Continuous Integration process is a well-defined promotion strategy. A current industry best practice is to have the output builds of the Build Server (as archived in the Build Repository) posted to the Authoring Environment. This is easily accomplished by creating additional tasks in the NAnt script for the project.
These additional tasks can either be triggered automatically at the end of the build process (perhaps only if the unit tests pass) or be triggered by a manual invocation. In either case, the promotion technique is usually little more than copying the project build outputs (EXEs, DLLs, pages, images, etc.) to the Authoring Server. As the NAnt task that accomplishes this is contained in the standard build file, it is subject to, and benefits from, all the same source code controls. This means that even the strategy for copying the build outputs is version controlled.

The Authoring Environment is composed of the “set” of servers used in a project, typically at least a Web Server and Database Server, and potentially an Application Server. The environment can be set up quickly for each project as needed and then controlled by a configuration management system to ensure file paths and connection strings are properly managed at an environment level. This is done by placing simple markers in the .Net Framework’s machine.config file. Testing on the Authoring server is intended for daily development work.

An individual developer machine can be used as a “proxy” Authoring environment, which means that programs on the developer’s machine are executed against the Authoring environment’s Database Server. This allows the developer to do daily work and ensure functionality before committing changes to the source code control system, which triggers the Build Server to do an official build.

Once the project code on the Authoring Environment is deemed worthy of promotion, it is copied (using a NAnt task, as always) to the Testing Environment. As with the Authoring Environment, the build outputs are copied to the appropriate server machines.
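Since promotion is “little more than copying” build outputs, the step can be sketched in a few lines. The Python below is a hypothetical stand-in for the NAnt copy task, not a replacement for it; the function name, the one-folder-per-build-number layout of the repository, and the choice to replace the target directory wholesale (so the environment always matches exactly one build) are assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def promote_build(build_repository: Path, build_number: str,
                  target_dir: Path) -> List[str]:
    """Copy one archived build's outputs into an environment's directory."""
    source = build_repository / build_number
    if not source.is_dir():
        raise FileNotFoundError(f"build {build_number!r} is not in the repository")
    if target_dir.exists():
        shutil.rmtree(target_dir)  # assumption: the target holds only build outputs
    shutil.copytree(source, target_dir)
    # Report what was deployed, as paths relative to the target directory.
    return sorted(
        p.relative_to(target_dir).as_posix()
        for p in target_dir.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        repo = Path(root) / "builds"
        (repo / "42" / "bin").mkdir(parents=True)
        (repo / "42" / "bin" / "app.dll").write_text("binary placeholder")
        (repo / "42" / "default.aspx").write_text("page placeholder")
        print(promote_build(repo, "42", Path(root) / "testing"))
```

Because the copy is driven by a build number, the same call can be repeated later to put a known build back onto a server.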
The Testing Environment is intended to be a stable environment where formal system-level testing and user acceptance testing can take place without affecting the Production version (for existing projects undergoing new development) or exposing incomplete projects to the Production environment users. The Testing Environment is only updated on demand, as determined by the testing team, and it is not promoted from the Authoring Environment, which may already have been changed by further development. Rather, the build outputs that were archived in the Build Repository are used. This way, the Testing Environment also always corresponds to a specific build (and thus to a version control system label).

The Testing Environment is the ideal place to record regression test scripts and perform stress tests, as it represents the best image of the final deployment environment but is still under the explicit control of the testing team. Once a build is on the Testing Environment, only the testing team can decide to update it, and only the testing team (with appropriate approvals) can release changes to the Production Environment.

When a build is deemed ready to be released for use by the users, it is promoted to the Production Environment. This process is, once again, driven by a NAnt task to ensure repeatability and auditability of the process. This is extremely important with the recent Sarbanes-Oxley regulations[xx] that apply to financial and accounting information processing. The Production Environment is set up to exactly mirror the Testing Environment, to ensure that the behavior seen in testing reflects the behavior in production. The Production Environment is not suitable for use in testing, as other users may be modifying the data that the test scripts are using.

To allow for reproducibility of user bug reports, at regular intervals as determined by the testing staff, the databases used by the Testing Environment can be replaced with a backup of the Production Environment’s database.
This allows for realistic testing of the application once live data has built up. The same strategy is used to ensure that the Authoring Environment’s databases are mirrored from (possibly a subset of) the Production Environment’s databases. This also provides usable data when developing enhancements and defect corrections that require data conversion, data validation or schema updating. Once the Authoring Database has been repopulated, you can test and retest the SQL scripts to be used when rolling out the next version.

2.3. Hotfixes and Next Generation Development

Hotfixes are a reality of software development. All programs have flaws, either in design or in the execution of the design. At times the flaws will be significant enough to warrant immediate correction. Typically this happens when a critical path of the application no longer works (due to data issues, new uses of the functionality, or simply functionality that was never adequately tested).

The important thing to realize is that when a hotfix is needed, it is usually needed immediately. It’s also likely that the pressure to release quickly is very high. Usually these crises arise after the project is no longer under active development, or after an enhancement phase has begun. All of these factors conspire to make it very difficult to make “surgical strikes” that fix just the newly discovered issue.

With all these pressures against successful hotfixes, it makes sense to practice the process and formalize how the situation should be handled; after all, we are good at what we practice.

With a proper Build Repository, it is very easy to get the exact set of source files (and indeed the build outputs) that makes up the current production release. All that needs to be done is to suspend the regular automated builds while the hotfix is being developed. The source files are restored on the Build Server, then the changes are made against that version.
This process is known as branching, and is a common practice that is well supported by Subversion and CVS, and adequately (but less well) by Visual Source Safe. Once a decision about the source code control system is made, the exact branching procedure can be documented.

What is important to realize is that once a production release is live, it is imperative to optimize the path for hotfixes, as they are typically time critical and not often practiced. Thus it is important to perform the “branch” operation as soon as next-phase development begins. This ensures that the system is already in place for hotfixes when the need arises.

Obviously, this preemptive branching could impede new development. This is especially true when making breaking changes to the schema or data. When this happens, the best thing to do is create an Authoring Next Environment and a Testing Next Environment for the new development. These can simply be new virtual directories and database instances for the deployment, plus a separate Cruise Control.Net build project. By creating a new project, the existing framework for the old (live) version is left in place and can be quickly triggered into action merely by checking files in on the branch version. This means that the source code control checkouts used by Cruise Control.Net would be driven by the branch label.

When the next generation version is released to the Production Environment, the old version’s Authoring Environment and Testing Environment are retired and replaced by the Authoring Next Environment and Testing Next Environment. If a new breaking-changes version is needed for further development, the process of branching and creating a new Authoring Next Environment and Testing Next Environment is repeated.

3. Quality Control

3.1. Version Control Best Practices

3.1.1. Check-in Daily

As development progresses in the project, nothing will give a better guarantee of success than regularly checking source into the version control system. Obviously only code that works should be checked in, but that doesn’t mean it should take more than a day to make a single change. If the tasks are subdivided into day-sized pieces, then every team member can benefit from the ever-increasing functionality. It also guarantees that when a developer is absent (even unexpectedly), nothing is left hanging. Since the automated builds are triggered by the check-in process, it also ensures that all existing unit tests are executed and the source is archived.

3.1.2. Use Version Labels

When setting up the source control system, plan to use version labels for every good build and additional release labels for each build that is released to the Production Environment. This makes it very easy to do delta reports between releases and to restore a known build in case of loss of the Build Repository. Lastly, it aids in generating any Sarbanes-Oxley reports if needed.

3.1.3. Branch At Breaking Changes

Whenever a major change in the data schema or external interfaces is needed, branch and pin the current release’s source to ensure that it is trivial to do hotfixes. The trunk (main version) of the file should always be the next-generation path.

3.1.4. Merge Branches As Soon As Possible

When a branch has been made and hotfixes have been applied on the branch (old) version, merge the change into the trunk (main version) of the file as soon as the fix is released to Production and known to be good. This ensures that the hotfix will not be lost in the next enhancement release of the software.

3.2. Daily Peer Code Review

Every day, have all developers get in the habit of updating their source from the latest version in the source code control system.
Before switching to the new versions of the source, however, make it a daily process to do a complete source comparison. By keeping abreast of the changes committed in previous days, the entire team gains an understanding of the coding techniques in use, the areas currently in flux from other developers, and other parts of the project in general. Peer code reviews need not be formal; rather, they should be oriented toward understanding what code was changed and what the intent of the change was. The best sources for change reasons are the requirements and the comments in the source control system. The peer reviews will catch a lot of programming errors, and daily reviews ensure that the feedback is quick and valuable. A great tool for this is Beyond Compare, an (unfortunately) not free tool which can be purchased here[xxi].

3.3. Pair Programming

The next step beyond daily code reviews is pair programming. This is a very beneficial technique in which all code is developed by two developers working together. While it might seem that this would halve productivity, studies show that the productivity of a pair is actually higher than the cumulative efforts of the same two developers working independently. This is because the pair can quickly bounce ideas around, talk through design and implementation alternatives, and choose the best course from the outset. Additionally, it tends to keep developers focused and prevents the easy distractions of daily development from overwhelming the task at hand. Typically, pair-programmed software shows 40% fewer defects per line of code; two pairs of eyes are always better than one.

3.4. Status Reporting

Status reporting is essential to the success of the project. Typically, in the past, developers have had to rely on memos and notes to keep track of and report where they are within the list of tasks.
If business requirements are captured in unit tests first, then the progress of a project can easily be tracked by the number of failing unit tests. As more functionality is added to a project, unit tests that capture the business requirements are added. The tests initially fail, as there is no implementation of the required logic. As the code is developed, the unit tests will begin to pass one by one and, barring breaking changes, will continue to pass with each build. Thus status reporting becomes a matter of measuring how many of the business requirements are captured in unit tests, and how many of those tests are passing.

When the rate of unit tests being added slows down, the project manager can verify that this is because the requirements have been captured. Then, when the rate of unit tests that are newly passing (the “burn rate”) begins to slow, the project is nearing stability. It is possible to release to production at any point where the remaining failing unit tests are not considered critical, and releases often ship with some outstanding failures because those items are low priority.

When a new defect is reported, the developer should first write a unit test to isolate and reproduce the error. They can then safely fix the error and know that it will stay fixed, because the unit test is never removed.

References

[i] - Current download at
[ii]
[iii] - Current download at
[iv] - Windows Subversion server binaries downloadable at
[v]
[vi] - Current download at
[vii] - Current download at
[viii]
[ix]
[x]
[xi]
[xii]
[xiii]
[xiv]
[xv]
[xvi]
[xvii]
[xviii]
[xix]
[xx]
[xxi]


Unknown said...

Great list Marc.
A formal compilation of all the informal things happening at work.


IDisposable said...

Thanks for the kind words :)

Some people have asked about an RSS feed. Blogger offers an Atom feed, at

Anonymous said...

Your thoughts on bug/issue tracking systems (like Bugzilla) would be a good addition to the list.


IDisposable said...

Excellent point. I'll have to think about it, honestly. I like RMTrack[1], but don't like the fact that it requires an ActiveX control (though that's supposed to be changing). I also like FogBugz[2] a lot.