Maverick Development
Maverick Quality Assurance
for
Agile Development
By Geoffrey Slinker
version 1.32
April 2004
Abstract
Agile development has helped test driven development and unit tests move from the arena of good practices into mainstream development. Customer driven requirements, a customer driven feature development schedule, writing unit tests before code, continuous builds, and short iterations work together to streamline and focus development. Agile development works best for small teams and projects. The principles of Agile development are applied to large teams and large projects, but concessions are made to deal with issues such as increased communication and large sets of unit tests that can take a significant amount of time to run. With large teams on large projects the continuous build becomes less and less continuous and becomes "build as often as possible". Also, Agile development has not been clearly understood in the area of integration testing. Maverick Quality Assurance (MQA) considers the goals of quality assurance (QA) upfront and proposes a method for full QA in an Agile development process.
Goals of QA
QA processes provide balance between quality of product, cost of product, features provided by the product, and release date of the product. QA is fundamentally an aspect of what some refer to as "Big upfront design", but is more accurately described as a predictive process. Because it is predictive, it is not necessarily iterative but more likely will be based on well defined phases such as a Waterfall Methodology.
A very accurate definition of QA was found on a NASA web site.
Software Quality Assurance (SQA) is defined as a planned and systematic approach to the evaluation of the quality of and adherence to software product standards, processes, and procedures. SQA includes the process of assuring that standards and procedures are established and are followed throughout the software acquisition life cycle. Compliance with agreed-upon standards and procedures is evaluated through process monitoring, product evaluation, and audits. Software development and control processes should include quality assurance approval points, where an SQA evaluation of the product may be done in relation to the applicable standards.
As with any Maverick approach the first thing we do with a methodology or process is to question why it exists and to drill down to the reasons the method exists, the problems the method addresses, and the goals or results the method was to supply. It has been said that methodologies only represent the solutions to problems that someone has faced. Hidden in the methodology are the complaints that the process addresses. These complaints are at the core of the issues and need to be clearly stated so that alternative solutions can be correctly identified.
From research, investigation, deduction, education, and reason MQA identifies the goals of QA as follows:
In the interest of the customer, deliver a valid product and verify that the product was built as required. In the interest of the software producer and the customer, deliver this product within budget and on time.
A valid product is defined as a product that correctly performs the function for which it was intended. This implies that if it is performing its intended function correctly, the software is error free.
A verified product is a product that was built in the way that was proposed. This means the product is made of the components that were agreed upon: it was developed in the programming language that was specified, uses the data stores that were specified, and is built from whichever other pieces were selected.
QA methodologies try to ensure the goals are met through a very complete and sophisticated process of deliverables. The task of QA is to help safeguard the interests of the customer and the company developing the software. Because QA serves two masters it becomes a difficult task to decide which party is represented at certain times. If the customer is represented exclusively and the company is worked until it is out of money then what has been accomplished? The goal of delivering the product has been missed. This is just the tip of the iceberg of all of the difficulties and issues facing QA teams.
MQA is an Agile QA process. Traditional QA processes set up standards and measure adherence to these standards through processes such as audits. MQA sets up the simplest standards that meet the goals of QA, and through iterative modifications to the QA process a process emerges that addresses the issues faced by the particular development task. MQA doesn't believe that software development is so dangerous that the managing processes, procedures, and the time to complete their requirements should be larger than the effort to actually perform the development of the software.
In order to deliver on the goals of QA, the QA department has to be an impartial judge at times. The fact that people write software, and that software is therefore subject to all of the human frailties, cannot be ignored. Pressures of layoffs can cause people to hide issues that would be seen as bad or as signs of poor performance. Individuals seeking advancement or recognition can put themselves before the team. Management can hide the real state of the development process, hoping against hope to catch up lost time, effectively hoping that the completion of work left to be done is not constrained by time itself. MQA proposes that the computer is the ultimate impartial judge when it comes to verifying and validating software. People are needed to decide on schedules and features and such, but once the features are decided upon the computer can validate the feature set completely using unit tests, component tests, and an automated test process with a continuous build process. The computer cannot be influenced by gain, honor, or recognition. The computer will not be affected by the daily commute, the pressures of work, or any of the things that affect people and their performance.
NASA follows a very formal and rigorous process for QA. Given that fact, let us remember the Mars Climate Orbiter.
WASHINGTON (AP) -- For nine months, the Mars Climate Orbiter was speeding through space and speaking to NASA in metric. But the engineers on the ground were replying in non-metric English.
It was a mathematical mismatch that was not caught until after the $125-million spacecraft, a key part of NASA's Mars exploration program, was sent crashing too low and too fast into the Martian atmosphere. The craft has not been heard from since.
"We were on the wrong trajectory and our system of checks and balances did not allow us to recognize that," Edward Stone, director of the Jet Propulsion Laboratory, said Wednesday.
The NASA centre in California was in charge of the Mars mission.
Henners of Lockheed Martin Astronautics, the prime contractor for the Mars craft, said at a news conference it was up to his company's engineers to assure the metric systems used in one computer program were compatible with the English system used in another program. The simple conversion check was not done, he said.
"It was overlooked," Henners said.
The mathematical mismatch was identified within days after the spacecraft was lost and the report released Wednesday confirmed the problem.
The Mars Climate Orbiter was launched last Dec. 11 and spent nine months coasting toward Mars.
Stephenson, director of the Marshall Spaceflight Center and head of a NASA investigation team, said the spacecraft was not symmetrical and pressure from the sun caused it to slowly twist or roll as it sped along.
On-board gyroscopes partially controlled the motion but eventually rocket-firings were needed to stabilize the craft, he said. This happened 12-14 times a week over the nine-month voyage.
Engineers on the ground calculated the size of the rocket-firing using feet-per-second of thrust, a value based on the English measure of feet and inches.
The spacecraft computer interpreted the instructions in Newtons-per-second, a metric measure of thrust. The difference is 1.3 metres a second.
"Each time there was a burn (rocket-firing) the error built up," said Stephenson.
As the spacecraft approached its rendezvous with Mars and the engineers prepared for a final rocket-firing, there were indications of something seriously wrong with the navigation but no corrective action was taken, Stephenson said.
When the Mars Climate Orbiter did fire its rockets, the craft went too low into the planet's atmosphere, instead of into a safe orbit. Communication signals stopped when the craft passed behind Mars and have not been heard since.
"We entered the Mars atmosphere at a much lower altitude (than planned)," said Ed Weiler, NASA's chief scientist.
"It (the spacecraft) either burned up in the Martian atmosphere or sped out (into space). We're not sure which happened." said the problem was not with the spacecraft but with the engineers and the systems used to direct it.
"The spacecraft did everything we asked of it," said Stephenson.
He said the mathematical mismatch was "a little thing" that could have been easily fixed if it had been detected.
"Sometimes the little things can come back and really make a difference," he said.
MQA recognizes that poor quality is not an acceptable attribute of software. The final state of a software product is constrained by at least three factors: quality, functionality, and the release date. Any two of the factors can be defined and the third is dependent on those two. For example, Product Marketing can choose the release date and the feature set of the product. Development will then deliver the features on the set date without regard to quality. Suppose the product is a modern word processor. A big trade show is coming up in three months. The development team consists of five individuals. Product Marketing says that the new word processor will have all of the features of the most popular word processor currently available and will be done before the trade show. Development cannot deliver a "real" word processor in that time frame. Some of the features may work, but most will not. Quality is the variable in this example. If Product Marketing instead keeps the same date and sets the quality requirement to high, then Development chooses the feature set. By the time of the trade show there will be a product with some set of features that work correctly, but the feature set will be so limited by the release date that the product is not going to be marketable.
MQA recognizes that software in the 21st century has to be high quality. Quality is not an option anymore. Users will not tolerate software that is buggy or with features that do not work as expected. Therefore, MQA states that quality is a constant and that feature set and release date are the only unknowns. Those that define the product can pick the release date but not the feature set or they can pick the feature set but not the release date.
Domain Knowledge
Domain knowledge is the knowledge and understanding of a specific area or domain. Domain knowledge is communicated in many ways, including verbal communication in a language that those involved understand, use cases or stories, modeling languages, documentation, and code.
MQA recognizes that communication is not perfect and that an opportunity for miscommunication should be avoided whenever it can be. This does not mean working in isolation. It does mean allowing those with domain knowledge to take responsibility for any items that fall into that domain. Specifically, MQA is concerned with the full range of software testing and with having these tests done by those who know best how to do them instead of by external test teams.
Agile development has been adopted in many areas, and unit testing has been adopted along with it. The idea that engineers cannot test their own code is a myth whose origin is hard to trace. I am sure it addressed a problem, and I will guess it was a problem where the engineer's performance was measured by lines of code produced and not by the quality of the code. Therefore when the engineer was asked to test the code he did a poor job because he needed to get back to what really matters, and what really matters is what management measures and brings up in performance reviews.
The developers have the correct domain knowledge to write the unit tests. They also have the correct knowledge to write the component tests. External teams introduced to write component tests require the transfer of domain knowledge. If the developer with the domain knowledge doesn't give proper attention to the test engineers then it will require a new method of transferring the domain knowledge and that method is typically documentation. Now the developers are writing documentation, and the production of this documentation is rarely included as "real" work for the developer so the developer is anxious to get back to what he is measured on which is writing code. In MQA unit tests and component tests use the same "test" portion of the code. Test code is linked with stubs and mock objects and this creates a unit test. The same test code is linked with the "real" components and this creates a component test.
Contexts and Unification Methods
Large software is divided into contexts and these contexts are unified by various means. Domain Driven Design by Eric Evans goes into great detail on this subject and does an excellent job in describing the issues worthy of consideration. Since a large software product is usually divided into contexts for reasons based on the model and the core domain there will be boundaries, relationships, and translation layers that must be dealt with.
Those that have the correct knowledge of the domain and sub-domains and any mapping layers or anticorruption layers should be the ones that write the test code. If an external team is assigned to write the tests then transferring domain knowledge is critical, and MQA believes that this transfer is an opportunity for miscommunication.
There are telltale signs that external test teams are failing to work well in the varied contexts of the system. They include statements like "That is an invalid test", or "It took two weeks for testing to write that test when I (the developer) could have done it in a day", or "I (test engineer) can't get my tests done because I can't figure out what the system should be doing", or "I cannot validate the data object because there is no method or criteria that defines a valid or invalid object."
Because of these problems the traditional QA process requires increased transfer of domain knowledge. The transfer of domain knowledge is the most common solution that addresses the problem without reorganizing teams and changing roles.
Definition of Tests
The definitions of the types of tests can be found at http://www.faqs.org/faqs/software-eng/testing-faq/section-14.html.
The following definitions are from a posting by Boris Beizer on
the topic of "integration testing" in the c.s.t. newsgroup.
The definitions of integration tests are after Leung and White.
Note that the definitions of unit, component, integration, and
integration testing are recursive:
Unit. The smallest compilable component. A unit typically is the
work of one programmer (At least in principle). As defined, it does
not include any called sub-components (for procedural languages) or
communicating components in general.
Unit Testing: in unit testing called components (or communicating
components) are replaced with stubs, simulators, or trusted
components. Calling components are replaced with drivers or trusted
super-components. The unit is tested in isolation.
component: a unit is a component. The integration of one or more
components is a component.
Note: The reason for "one or more" as contrasted to "Two or
more" is to allow for components that call themselves
recursively.
component testing: the same as unit testing except that all stubs
and simulators are replaced with the real thing.
Two components (actually one or more) are said to be integrated when:
a. They have been compiled, linked, and loaded together.
b. They have successfully passed the integration tests at the
interface between them.
Thus, components A and B are integrated to create a new, larger,
component (A,B). Note that this does not conflict with the idea of
incremental integration -- it just means that A is a big component
and B, the component added, is a small one.
Integration testing: carrying out integration tests.
Integration tests (After Leung and White) for procedural languages.
This is easily generalized for OO languages by using the equivalent
constructs for message passing. In the following, the word "call"
is to be understood in the most general sense of a data flow and is
not restricted to just formal subroutine calls and returns -- for
example, passage of data through global data structures and/or the
use of pointers.
Let A and B be two components in which A calls B.
Let Ta be the component level tests of A
Let Tb be the component level tests of B
Tab The tests in A's suite that cause A to call B.
Tbsa The tests in B's suite for which it is possible to sensitize A
-- the inputs are to A, not B.
Tbsa + Tab == the integration test suite (+ = union).
Note: Sensitize is a technical term. It means inputs that will
cause a routine to go down a specified path. The inputs are to
A. Not every input to A will cause A to traverse a path in
which B is called. Tbsa is the set of tests which do cause A to
follow a path in which B is called. The outcome of the test of
B may or may not be affected.
There have been variations on these definitions, but the key point is
that it is pretty darn formal and there's a goodly hunk of testing
theory, especially as concerns integration testing, OO testing, and
regression testing, based on them.
As to the difference between integration testing and system testing.
System testing specifically goes after behaviors and bugs that are
properties of the entire system as distinct from properties
attributable to components (unless, of course, the component in
question is the entire system). Examples of system testing issues:
resource loss bugs, throughput bugs, performance, security, recovery,
transaction synchronization bugs (often misnamed "timing bugs").
These definitions of unit, component, and integration testing will be used as the MQA definition of tests.
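As a rough illustration of the Leung and White definition quoted above, the integration suite for a component A that calls a component B can be modeled with sets. This is only a sketch: the test names and the `calls_b` and `sensitizable_through_a` flags are invented for the example and do not come from the quoted posting.

```python
# Hypothetical component-level suites for A and B.
# Each test in Ta records whether it causes A to call B; each test in Tb
# records whether it can be driven through inputs to A.
Ta = {
    "a_parses_empty_input": {"calls_b": False},
    "a_saves_record": {"calls_b": True},
    "a_updates_record": {"calls_b": True},
}
Tb = {
    "b_writes_row": {"sensitizable_through_a": True},
    "b_rejects_corrupt_page": {"sensitizable_through_a": False},
}

# Tab: the tests in A's suite that cause A to call B.
Tab = {name for name, info in Ta.items() if info["calls_b"]}

# Tbsa: the tests in B's suite for which it is possible to sensitize A.
Tbsa = {name for name, info in Tb.items() if info["sensitizable_through_a"]}

# The integration test suite is the union of the two sets.
integration_suite = Tbsa | Tab
print(sorted(integration_suite))
```

Tests of A that never reach B, and tests of B that cannot be reached through A, stay in the component suites and are not part of the integration suite.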
QA Roles in MQA
The roles of individuals in MQA are aligned with domain knowledge.
MQA requires that the former role of Manager/Director of Test Engineering (the person that manages the traditional external test team) become critical and essential to the software development process in a continuous manner. In traditional QA the software process goal was to make the entire process predictable with distinct deliverables and phases. The Director of Test Engineering would make sure that all requirements were understood upfront and that a test plan and criteria for acceptance were developed, reviewed, and accepted very early in the software life cycle. Then criteria for deciding whether the software was ready to advance to the testing stage were used to move the software from development to testing. This does not meet the needs of Agile development.
The Director of Test Engineering becomes a Director of Requirements Adherence (This title could be improved). This role is more critical in that it requires the management of every software developer. Maverick Agile Development for Large teams requires the development of unit tests first. This good practice was widely introduced in the XP methodology. The unit tests must be driven by data or object generators. MQA requires a correct implementation of a data/object generator or factory. This component is necessary for the MQA methodology to work. It is the Director's responsibility to see that this generator is developed by those with the domain knowledge of the objects. Also, critical to MQA is a data/object validator. Those that understand the context of the data must be able to determine if an object is valid. Because of the 2nd dimension of programming described in Maverick Agile Development there are certain behaviors that objects must have because they exist in the computer realm (2nd Dimension). One of these behaviors is equivalency. In order to validate that an object that is immutable did not mutate the object must be comparable to another object. It is the Director's responsibility to see that the data generator, data validator, and data equivalence verifier are in place.
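The point about equivalence can be made concrete with a small sketch: to show that an operation did not mutate an object, the test needs a way to compare the object with a copy taken beforehand. The `Order` class, the two operations, and the `leaves_unchanged` helper are hypothetical names used only for this illustration, and field-by-field equality is an assumption about what "equivalent" means here.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Order:
    """Hypothetical domain object; the generated __eq__ compares fields."""
    items: list = field(default_factory=list)


def count_items(order):
    """Well-behaved operation: reads the order without changing it."""
    return len(order.items)


def count_items_buggy(order):
    """Faulty operation: silently appends to the order it was given."""
    order.items.append("extra")
    return len(order.items)


def leaves_unchanged(obj, operation):
    """Return True if obj is equivalent before and after operation(obj)."""
    snapshot = copy.deepcopy(obj)
    operation(obj)
    return obj == snapshot


good = leaves_unchanged(Order(["pen", "ink"]), count_items)
bad = leaves_unchanged(Order(["pen", "ink"]), count_items_buggy)
print(good, bad)
```

Without an equivalence check of some kind, the faulty operation would pass any test that only inspected its return value.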
Unit tests call stubs or mock objects. Many developers want to skip this step. The Director of Requirements Adherence is charged with ensuring these stubs and mock objects are created. Through the use of the data generators the difficulty with the mock objects is alleviated. Many unit tests just test the properties of the object (the getters and the setters). Such a test is incomplete and therefore is not accepted as a unit test at all.
Since all tests are written before the code is developed, the role of the Director of Requirements Adherence is more significant still. A traditional Director of Development has not been charged with writing "good" tests. That has been the role of the testing department. Developers may know the fundamentals of writing good tests but they have not written tests often and they have not been evaluated on their ability to write good tests. Therefore the Director of RA has to ensure that good tests are written.
Writing the unit tests will take a significant amount of time and it will be a while before development begins on the other code. Since MQA is proposed for iterative development methodologies, each new iteration that adds a new use case to the system will begin with writing unit tests. As you can see, the role of the Director of RA is ongoing and very demanding. It raises the bar for the person in the role. They have to be able to internalize changes and react correctly to those changes.
If a modification is needed to an object then a test is written first to validate the change when it is implemented. Thus the role of the Director of RA is continuous and essential. Making sure the tests are correct and that the data generation and other infrastructure are kept up to date are difficult tasks.
Metrics
Metrics are necessary to report the many states of software development. MQA requires that metrics and their gathering be kept simple and to the point. Multiple reports that present the same data in various ways should not be generated.
Customer acceptance can be done with a simple rating. A scale from one to ten on satisfaction can capture the acceptance level. When the product reaches a ten rating, the product is accepted. Metrics could be broken down on factors such as on time delivery, feature correctness, and other factors that determine the overall acceptance of a software product.
Verification and Validation of the software for MQA should measure that all unit tests are in place before coding, that the data generators, validators, and equivalence verifiers are in place, and that all tests are passing. Also, from the iteration goals, metrics should be available on the features for the iteration.
Since the continuous build process runs all of the tests (unit and component), there should be a metric from acceptance testing on any bugs that were found but were not caught by the unit and component tests. Any bug found with an acceptance test requires a new unit or component test to be placed into the system. Also, the Director of Requirements Adherence will determine why the test case was missed and inspect the system to see if similar oversights exist elsewhere.
Test Code
Test code is written for each unit in the system. The test code is assembled to make the unit tests and component tests. Test code linked to stubs and mock objects is a unit test. Test code linked to real objects is a component test.
Test code is written before the code is developed. If a change in existing code is needed a test is written first to show the code adheres to the new requirement.
Test code exercises all of the behavior of an object, not just the properties of the object.
Development has the domain knowledge necessary to write the test code.
Unit Tests
Unit tests are compiled with mocked or stubbed objects, and the data generation facilities are used for these stubs and mock objects. The developer has the domain knowledge necessary to write the unit test.
Component Tests
Component tests are the same as unit tests. They are built differently in that the libraries (JAR files/Assemblies) that contain the mock objects and stubs are replaced with the real components. This gives maximum reuse of the test code. This should reduce the issues of "why didn't the unit tests catch this?"
The product is integrated when all of the component tests pass.
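The reuse of one body of test code for both kinds of test might look like the following sketch. It uses plain Python objects in place of swapped libraries, and `StubPriceService`, `RealPriceService`, `Checkout`, and `check_checkout_total` are hypothetical names chosen for the example, not part of MQA itself.

```python
class StubPriceService:
    """Stub collaborator: always returns a fixed, known price."""

    def price_of(self, sku):
        return 10


class RealPriceService:
    """Stands in for the real component: looks prices up in a catalog."""

    def __init__(self, catalog):
        self._catalog = catalog

    def price_of(self, sku):
        return self._catalog[sku]


class Checkout:
    """The unit under test; it depends on a price service."""

    def __init__(self, prices):
        self._prices = prices

    def total(self, skus):
        return sum(self._prices.price_of(sku) for sku in skus)


def check_checkout_total(prices):
    """The shared test code; it does not know which collaborator it was given."""
    checkout = Checkout(prices)
    assert checkout.total([]) == 0
    assert checkout.total(["pen", "pen"]) == 20
    return True


# Linked with the stub, the test code acts as a unit test.
unit_result = check_checkout_total(StubPriceService())

# Linked with the real component, the same test code acts as a component test.
component_result = check_checkout_total(RealPriceService({"pen": 10}))
```

In a compiled language the same effect would come from building the test code against a different library, as the text describes; passing the collaborator in as an argument is simply the shortest way to show the idea here.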
System Tests
System tests are the component tests run in a deployed environment.
The configuration and deployment team has the domain knowledge and access to the deployment architecture.
Performance Tests
Performance tests are run on the deployed environment. It is not reasonable to make optimizations for any platform other than the deployment configuration. Performance test harnesses may be needed to provide timing tests and load tests. These harnesses should call and reuse the component tests. Additional tests will be needed to simulate network outages and other aspects.
The configuration, deployment, and development teams have the domain knowledge to produce the performance tests.
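A timing harness that calls an existing component test, instead of duplicating its logic, could be sketched as below. The `component_test_lookup` function is a stand-in for a real component test, and `time_test` is a hypothetical helper; a real harness would run against the deployed environment and would likely add load generation.

```python
import time


def component_test_lookup():
    """Stand-in for an existing component test (hypothetical)."""
    table = {n: n * n for n in range(1000)}
    assert table[12] == 144


def time_test(test, repetitions=100):
    """Call an existing test repeatedly and return the mean seconds per call."""
    start = time.perf_counter()
    for _ in range(repetitions):
        test()
    elapsed = time.perf_counter() - start
    return elapsed / repetitions


mean_seconds = time_test(component_test_lookup, repetitions=50)
print(f"mean time per run: {mean_seconds:.6f} s")
```

Because the harness only wraps the test, a change to the component test is picked up by the timing run automatically.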
Regression Tests
Regression tests are necessary to identify changes. When functionality changes, the Director of Requirements Adherence is responsible for ensuring that existing test code is not changed to perform an entirely new test, but that an entirely new test is written.
Tests of Third Party Components
The test of such things as database performance, network performance and other third party and external systems is essential.
As with many Agile methodologies the tests are written first. This should be done for third party solutions as well. Reason and prudence must be applied because it may not be possible to write test code for all potential solutions. Acceptance criteria can be specified and the criteria must include items that are verifiable.
The configuration and deployment team, as well as those that selected the third party solutions, know and understand the expectations of these solutions. These individuals will be responsible for writing tests that show these systems meet the requirements of the product. (The bar has been raised for the configuration management teams in that members of their team must know how to test the third party components of the system to the standard that the third party components were expected to meet.)
Mutation Tests
Mutation tests are tests of the tests. Given a unit that passes all of the tests for that unit, the unit is modified in a way that is not correct for the requirements. The unit tests are then run to see whether they fail as expected or do not catch the bug.
The domain knowledge of mutation testing is in development. The Director of Requirements Adherence is responsible for assuring adequate mutation testing is occurring.
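A hand-made mutation test can be sketched in a few lines. The requirement ("18 and over is an adult"), the function names, and the chosen mutation are all assumptions made for the illustration; mutation testing tools generate such variants automatically.

```python
def is_adult(age):
    """Unit under test: the assumed requirement is that 18 and over is adult."""
    return age >= 18


def is_adult_mutant(age):
    """Deliberately incorrect variant: '>=' has been changed to '>'."""
    return age > 18


def run_unit_tests(func):
    """Return True if every check passes for the given implementation."""
    checks = [
        func(17) is False,
        func(18) is True,  # boundary case; this is the check that catches the mutant
        func(30) is True,
    ]
    return all(checks)


original_passes = run_unit_tests(is_adult)
mutant_killed = not run_unit_tests(is_adult_mutant)
print(original_passes, mutant_killed)
```

If the boundary check at 18 were missing, the mutant would pass every test, which is the signal that the unit tests are not adequate.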
Acceptance Tests
Acceptance tests are done by the customer, by a test technician (using black box and scripting approaches), and by beta testers. Any bugs are logged and the first question to be asked is why this bug wasn't caught by the test code.
The domain knowledge for performing these tests is with the customer and the test technician. Specific features will be known by the developer that is responsible for that feature and therefore the developer can do black box testing of their features.
Trusted Data Generation
The data generation for MQA is essential. The data generator is considered a trusted component. The data generated must meet the requirements of the system. Faulty data generation invalidates all tests associated with that data. The Director of Requirements Adherence is responsible to make sure the data generation facility is correct. This component has the status of a critical component like the steering wheel of your car and therefore requires a mature development method similar to methods for critical systems.
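One way to picture a data generator is the sketch below. The `Account` class and its rules (a positive id and a non-negative balance) are assumptions made for the example. The fixed seed is a design choice of this sketch, not something MQA prescribes; it keeps the generated data identical from one build to the next so that a failing test can be reproduced.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """Hypothetical domain object used only for this sketch."""
    account_id: int
    balance_cents: int


def generate_accounts(count, seed=0):
    """Generate `count` accounts that satisfy the assumed requirements:
    a positive id and a balance between 0 and 1,000,000 cents."""
    rng = random.Random(seed)
    return [
        Account(account_id=i + 1, balance_cents=rng.randint(0, 1_000_000))
        for i in range(count)
    ]


accounts = generate_accounts(5, seed=42)
for account in accounts:
    print(account)
```

Stubs and mock objects can then return objects from the generator instead of hand-written literals.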
Data Validation and Equivalence Verification
In order to test the units there must be a way to validate the data and compare the data with standards. If the result of a call into the system produces an object, it must be validated. The developer responsible for the object must know the requirements for a valid object and is responsible for providing a validator. As well, the developer knows what is considered equivalent. Equivalence can be very difficult, and the problem of record linkage addresses one aspect of the equivalence problem. For example, trying to identify if two person objects are equivalent may not be possible. A ranking or some other solution must be developed in order to accommodate the tests. If the matching of persons is part of the system then the matching facilities would be used by the equivalence verifier, and the status of trusted component would have to be removed from the equivalence verifier. Reality is such that these concessions would have to be made.
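A validator and a ranking-style equivalence verifier for person objects might be sketched as follows. The `Person` fields, the validity rules, and the scoring scheme (the fraction of matching fields) are all assumptions for illustration; real record linkage is considerably more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical domain object used only for this sketch."""
    given_name: str
    family_name: str
    birth_year: int


def is_valid(person):
    """Validator: assumed rules are non-empty names and a plausible birth year."""
    return (
        bool(person.given_name.strip())
        and bool(person.family_name.strip())
        and 1850 <= person.birth_year <= 2100
    )


def equivalence_score(a, b):
    """Equivalence verifier that ranks instead of answering yes or no.

    Returns the fraction of fields that match, ignoring case and
    surrounding whitespace in the names.
    """
    matches = [
        a.given_name.strip().lower() == b.given_name.strip().lower(),
        a.family_name.strip().lower() == b.family_name.strip().lower(),
        a.birth_year == b.birth_year,
    ]
    return sum(matches) / len(matches)


alice = Person("Alice", "Nguyen", 1980)
alice_again = Person(" alice ", "NGUYEN", 1980)
someone_else = Person("Alice", "Smith", 1975)

print(is_valid(alice))
print(equivalence_score(alice, alice_again))
print(equivalence_score(alice, someone_else))
```

A test would then compare the score against a threshold agreed on by those who understand the data, rather than demanding exact equality.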
Conclusion
Maverick Quality Assurance (MQA) considers the goals of quality assurance (QA) upfront and proposes a method for full QA in an Agile development process.
Maverick Quality Assurance identifies the goals of the QA team to be the delivery of a valid product that is verified to the customer. As well the QA team is responsible to the employer to identify that the product is on time and within budget.
By developing test code that is deployed in specific ways it is possible to reuse the code for unit, component, system, performance, and other types of tests. A unit test is the test code linked to stubs. A component test is the test code linked to real components.
By moving the code based tests into the area of development where the domain knowledge resides, the problems with the transfer of domain knowledge to an external integration test team are eliminated.
By adding a trusted data generator, validator, and equivalence verifier to the other aspects of Agile development the methods of MQA can be performed. Without these additional components MQA is compromised and does not give the optimal results.