I mightn't be familiar enough with the practice, but I generally don't think of alphas and betas as "post" release testing. To me, post-release testing applies to things like patches and other updates. If the software has been released to the client for general use, then it's not really an alpha/beta anymore, just an undertested and potentially unstable initial release. "Releasing" an alpha/beta to, say, an executive for testing is still considered pre-release. Am I wrong? Ham Pastrami (talk) 21:00, 3 April 2008 (UTC)
Software testing is the process of verifying the correctness of a software application as well as finding defects in it. —Preceding unsigned comment added by 122.167.109.27 (talk) 04:52, 7 June 2008 (UTC)
People seem to be forgetting here that 'Beta Test' is in fact not a Testing Group function. Beta Test is a Marketing function to test the features of the product against what the targeted users want/desire the product to do. It is not intended to find actual development or programming bugs per se. —Preceding unsigned comment added by 205.225.192.66 (talk) 16:48, 6 January 2009 (UTC)
Some parts of the testing process have nothing to do with exercising, so I changed the heading.
Having said that, I need to get some exercise, so signing out. ;-) -- Pashute (talk) 11:10, 25 June 2008 (UTC)
Testing - The process of exercising a system or system component to verify that it satisfies specified requirements and to detect errors. [in DO-178 SOFTWARE CONSIDERATIONS IN AIRBORNE SYSTEMS AND EQUIPMENT CERTIFICATION] Thread-union (talk) 17:25, 4 July 2008 (UTC)
One of the bullet points under the 'Controversy' section says "...and mostly still hold to CMM." What is CMM? There's no mention of this acronym earlier, and no obvious antecedent in the bullet point. Mcswell (talk) 00:49, 13 August 2008 (UTC)
I added a link to the CMMI article and changed "CMM" to "CMMI" NoBot42 (talk) 20:00, 21 August 2008 (MEZ)
The references to CMMI seem to be misplaced. CMMI defines process improvement guidelines independent of the development model used (be it agile, waterfall, or otherwise). This 'controversy' is creating a false dichotomy between agile development and process maturity. There is nothing inherent in CMMI that precludes its implementation in an agile environment. The conflict between 'agile' vs. 'traditional'/waterfall is that agile methods emphasize continuous testing, where the traditional method was to begin testing only at the end of the development process. 76.164.8.130 (talk) 22:24, 21 November 2008 (UTC)
---
The comments above about CMMI are correct. The CMMI is about ensuring that the process is managed; it does not prescribe any mechanics for how to perform the functions. At best it provides examples, but those examples are more to illustrate what is meant for a particular process area than to prescribe how to do it. By implementing a good agile process, you will as a matter of course address most of the disciplines spelled out in the CMMI. It is true that the U.S. Government often mandates CMMI level 2 or 3 compliant processes, but it usually leaves the implementation of that process to the company it contracted with. Many companies are learning the value of applying agile processes to the CMMI. The U.S. Government may impose its implementation of process on a company, but that is not the norm.
One of the chief mechanisms used by all agile processes to mitigate the risk of the costs associated with change during development is tight iterations. Essentially, you break the work down into 1-2 week iterations, taking each slice from specification through integration, with testing lagging one iteration behind. The costs of bad requirements, design, or implementation are thereby reduced to very manageable levels. That is what makes them agile. 137.100.53.254 (talk) 20:17, 16 April 2009 (UTC)
--- I think there needs to be a greater focus on test automation here. When advocates of agile development recommend 100% of all tests, they often mean "I want to have 100% code coverage with the unit tests we are running". Even if you have 100% code coverage, there is still plenty of room for error (usability bugs, integration bugs, etc.). If there is a way to automate usability testing, I am unaware of it.
Why is this important? There are a lot of uninformed PHBs (pointy-headed bosses) who listen to (software automation tool) salesmen and believe that record-playback tools will allow their QA team to completely automate their (black-box) tests. Some mention of this "snake oil" in a highly respected, evolving medium like Wikipedia could put a damper on unreasonable expectations of this type. There should also be a clear distinction made between GUI automation (which is subject to varying degrees of brittleness, and requires as much or more time to maintain than it does to write the initial tests: the argument against automation) and unit test automation (which usually requires less maintenance than GUI automation).
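To make the brittleness point concrete, here is a minimal sketch of the kind of locator-coupled script a record-playback tool produces (the page, field ids and credentials are hypothetical; Selenium is used for illustration). It encodes incidental UI details, so any cosmetic change to the page breaks it:
<syntaxhighlight lang="python">
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_login_via_gui():
    driver = webdriver.Firefox()
    try:
        driver.get("https://example.test/login")              # hypothetical app
        driver.find_element(By.ID, "user").send_keys("alice")
        driver.find_element(By.ID, "pass").send_keys("s3cret")
        driver.find_element(By.ID, "submit").click()          # breaks the moment the id changes
        assert "Welcome" in driver.page_source
    finally:
        driver.quit()
</syntaxhighlight>
A unit test of the same login rule (say, asserting on a hypothetical authenticate("alice", "s3cret") function) touches none of those UI details, which is why it needs far less maintenance.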
(Before this goes any further, I think someone should provide citations showing some prominent Agile advocates who do, in fact, propose 100% coverage. I'm absolutely certain that none of those I've met in real life do, and some of them get very irritated by this claim, so I'm not convinced this whole idea isn't a popular misconception/misunderstanding. 82.71.32.108 (talk) 01:12, 10 February 2009 (UTC))
--
The 100% coverage is usually quoted for unit tests, and not for all tests for a system. At best the 100% coverage is a goal, but it may not be feasible due to limitations of the coverage tool or other mitigating circumstances. Many agile teams do not employ Test Driven Development (that is only one application of the agile processes), so they don't always measure the test coverage. Of the ones that do, 100% code coverage does not necessarily mean 100% of the cases have been tested. [1] 137.100.53.254 (talk) 20:14, 16 April 2009 (UTC)
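As a toy illustration of that last point (not from the cited source), here is a function with 100% line coverage whose untested input class still hides a defect:
<syntaxhighlight lang="python">
def classify(age):
    if age < 18:
        return "minor"
    return "adult"

def test_classify():
    assert classify(10) == "minor"   # executes the if-branch
    assert classify(30) == "adult"   # executes the fall-through
</syntaxhighlight>
Both lines of classify() are now covered, yet classify(-5) quietly returns "minor", which most specifications would call a defect: full coverage, untested case.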
--
Manual testing vs. automated: Some writers believe that test automation is so expensive relative to its value that it should be used sparingly.[47] Others, such as advocates of agile development, recommend automating 100% of all tests. In particular, test-driven development holds that developers should write unit tests of the xUnit type before coding the functionality. The tests can then be considered a way to capture and implement the requirements.
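For illustration, a minimal sketch of the test-first idea; the ShoppingCart requirement is hypothetical, and Python's unittest stands in for any xUnit-family framework. The test is written first and fails ("red") until just enough of the class is implemented to make it pass ("green"):
<syntaxhighlight lang="python">
import unittest

class ShoppingCartTest(unittest.TestCase):
    # Written before ShoppingCart exists, so this fails on first run.
    def test_total_sums_item_prices(self):
        cart = ShoppingCart()            # hypothetical class, not yet implemented
        cart.add("apple", price=3)
        cart.add("pear", price=2)
        self.assertEqual(cart.total(), 5)

if __name__ == "__main__":
    unittest.main()
</syntaxhighlight>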
--
"Should testers learn to work under conditions of uncertainty and constant change or should they aim at process 'maturity'?" This very quote from the "Agile vs. Traditional" approach is a misrepresentation. Essentially the only difference is when testers become involved in the software life cycle. With the traditional approach, testers don't get involved until the software is 100% feature complete (all the requirements have been satisfied). With both agile and iterative approaches testers are involved as soon as an iteration is done. Agile methodologies have either weekly or bi-weekly iterations, which means testing begins in the second or third week of development. The testers are increasing the coverage of their tests to match the requirements (or user stories as agilists tend to call them) that were developed at that time. The benefit being that by the time an agile project has reached the 100% feature complete stage, most of the bugs have already been caught and dealt with.
Agile processes are mature, and their maturity has little to do with working under conditions of uncertainty and constant change. It has to do with tight iteration cycles, which give the team the ability to refocus its efforts when the client needs it to. Let us not also forget that there are other mature processes which espouse many of the same principles, such as the Rational Unified Process (RUP) and other iterative methodologies where the requirements→design→build→test→maintenance cycle is repeated several times before the project is 100% feature complete. 72.83.188.248 (talk) 02:26, 20 April 2009 (UTC)
"Specification based testing is necessary but insufficient to guard against certain risks. [16]". This sentence is completely irrelevant, because
I suppose, this sentence is only there to introduce a link to the author. This is why I remove it. —Preceding unsigned comment added by 80.219.3.124 (talk) 18:20, 21 August 2008 (UTC)
The metrics for the effort of finding and fixing bugs are interesting. I have seen them many times. It occurred to me that the corollary is that it is xx times easier to introduce bugs at the requirements and architecture phases. Is there any literature on this? I.e., what is cause and what is effect? Many times these seem to be used by those used to a waterfall method to justify over-specifying things. 69.77.161.3 (talk) 20:50, 19 November 2008 (UTC)
The trademark is probably inappropriate usage; makes the reference here an advertisement. Tedickey (talk) 01:16, 30 November 2008 (UTC)
From another software tester, stop making things up as you have done on the Software testing article. You're making up terms and concepts. If you don't stop, I'll consider your un-sourced edits as vandalism and report you. For instance, not a single book in my library mentions "Optimistic testing". It sounds like you're defining positive and negative testing. Nothing else. You're giving them elaborate terms, one of which conflicts with another term. Your elaborations on destructive testing, which has a Wikipedia article, are incorrect. Please stop needlessly elaborating on the terms. --Walter Görlitz (talk) 06:34, 21 July 2009 (UTC)
"ad hoc testing should be limited belies your misunderstanding of its use"
. It was ment there in the article, that testing in general, and even more the ad-hoc testing and creativity, may discover so many bugs, as well as no one. You do not agree, that "this easily tends to be endless, costly and no result giving."
? The project has always only limited funding, pareto must be applied, thus also the ad-hoc "creative" testing must be limited in time and scope. You could test a certain part of a SW forever, thus it needs a limit: And such will be set really artificially, yes. ...I was pushed many time to make a "professional-expert time estimation", even without any informations available yet. Is this, what you do not agree?"don't always use positive test cases"
: tests can be constructed as a questions, as "do this, will this appear?" With the expecting, that (a) appearing / b) not appearing) is correct, thus the test can pass as well as fail on "appeared" event. Yes, we both know.Looking for a clear definition of the term code completeness, I run into this passage. Clearly its first sentence is wrong: the paragraph is about completeness of the testing code, not the code being tested, although the latter is one of the things being tested. Rp (talk) 11:16, 18 August 2009 (UTC)
I made a minor edit to the opening definition. The definition was, "'Software testing' is an empirical investigation conducted to provide stakeholders with information about the quality of the product or service under test, with respect to the context in which it is intended to operate." I dropped "with respect to the context in which it is intended to operate" because I think it is confusing and unnecessarily narrowing. People often use products in ways far different from the maker's intention. It often makes perfect sense to test a product's behavior under foreseeable uses or in foreseeable contexts, not just in the intended ones. (Think of life-critical products for example.) CemKaner (talk) 15:39, 29 December 2009 (UTC)
There is a very clear definition of "Software Testing" in the IEEE Standard Glossary of Software Engineering Terminology, IEEE Std 610.12-1990 (paywalled!), which I quote:
- "The process of operating a system or component under specified conditions, observing or recording the results, and making an evaluation of some aspect of the system or component"
- "(IEEE Std 829-1983) The process of analyzing a software item to detect the difference between existing and required conditions (that is, bugs) and to evaluate the features of the SW items."
More simply, in "Seven Principles of Software Testing" (Bertrand Meyer (ETH Zürich and Eiffel Software), in: “IEEE Computer”, August 2008, pp. 99-101 (paywalled!)), the simple definition is given:
"To test a program is to try to make it fail."
The ISTQB (International Software Testing Qualifications Board) does not give a proper definition of testing (in particular, no definition can be found in the "Standard Glossary of Terms used in Software Testing Version 2.0" by the ISTQB) but extends the meaning informally to include what they call "static testing" in complement to "dynamic testing"; these are activities like reviews, code inspections and static analysis, which would generally fall under design and quality management (more power to them, I guess). The following informal definition is given in the "ISTQB Foundation Level Syllabus":
Test activities exist before and after test execution. These activities include planning and control, choosing test conditions, designing and executing test cases, checking results, evaluating exit criteria, reporting on the testing process and system under test, and finalizing or completing closure activities after a test phase has been completed. Testing also includes reviewing documents (including source code) and conducting static analysis.
In particular, in "Point/Counterpoint - Test Principles Revisited" (Bertrand Meyer (ETH Zürich) vs. Gerald D. Everett (American Software Testing Qualifications Board), “IEEE Software”, August 2009, pp. 62-65 (paywalled!)), we read the following by Mr. Meyer:
"Mr. Everett and the ISTQB broaden the definition of testing to cover essentially all of quality assurance. In science one is free to use any term, with a precise definition, to denote anything, but it makes no sense to contradict established practice. The ISTQB’s definition goes beyond dynamic techniques commonly known as testing to encompass static ones. Hundreds of publications discuss static analysis, including proofs, versus tests. Such comparisons are of great relevance (including to me as the originator, with Yuri Gurevich, of the Tests and Proofs conferences, http://tap.ethz.ch), but the differences remain clear. Ask practitioners or researchers about testing; most will describe dynamic techniques. If the ISTQB wants to extend its scope to quality assurance, it should change its name, not try to redefine decades-old terminology."
78.141.139.10 (talk) 17:27, 19 March 2013 (UTC)
Agree with that. Meyer clearly wanted a short and pithy "core definition" that certainly was not intended to cover all aspects of testing. Here is the context, from the above-cited paper:
The only incontrovertible connection is negative, a falsification in the Popperian sense: A failed test gives us evidence of nonquality. In addition, if the test previously passed, it indicates regression and points to possible quality problems in the program and the development process. The most famous quote about testing expressed this memorably: “Program testing,” wrote Edsger Dijkstra, “can be used to show the presence of bugs, but never to show their absence!”
Less widely understood (and probably not intended by Dijkstra) is what this means for testers: the best possible self-advertisement. Surely, any technique that uncovers faults holds great interest for all “stakeholders,” from managers to developers and customers. Rather than an indictment, we should understand this maxim as a definition of testing. While less ambitious than providing “information about quality,” it is more realistic, and directly useful.
Principle 1 (Definition): To test a program is to try to make it fail.
This keeps the testing process focused: Its single goal is to uncover faults by triggering failures. Any inference about quality is the responsibility of quality assurance but beyond the scope of testing. The definition also reminds us that testing, unlike debugging, does not deal with correcting faults, only finding them.
78.141.139.10 (talk) 17:15, 22 March 2013 (UTC)
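To make Principle 1 concrete, a minimal sketch (parse_date() is a hypothetical function under test, and pytest is assumed): each case actively tries to break the program, and the suite passes only if the program fails in the controlled way its specification demands.
<syntaxhighlight lang="python">
import pytest

@pytest.mark.parametrize("hostile", ["", "31/02/2020", "not-a-date", "9999-99-99"])
def test_parse_date_rejects_garbage(hostile):
    # The test's single goal is to trigger a failure; a ValueError here
    # is the specified, controlled failure mode.
    with pytest.raises(ValueError):
        parse_date(hostile)
</syntaxhighlight>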
Under "Testing Methods", we read the following:
Static vs. dynamic testing
There are many approaches to software testing. Reviews, walkthroughs, or inspections are referred to as static testing, whereas actually executing programmed code with a given set of test cases is referred to as dynamic testing. Static testing can be omitted, and unfortunately in practice often is. Dynamic testing takes place when the program itself is used. Dynamic testing may begin before the program is 100% complete in order to test particular sections of code, and is applied to discrete functions or modules. Typical techniques for this are either using stubs/drivers or execution from a debugger environment.
Static testing involves verification whereas dynamic testing involves validation. Together they help improve software quality.
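(Aside: for readers unfamiliar with the "stubs/drivers" technique the quoted text mentions, a minimal sketch with all names hypothetical. The stub stands in for code the module calls that does not exist yet; the driver stands in for code that would eventually call it.)
<syntaxhighlight lang="python">
# Code under test: pricing logic that normally calls a live catalog service.
def price_with_markup(item_id, catalog):
    record = catalog.fetch(item_id)
    return record["price"] + 20        # flat markup, integers for clarity

class StubCatalog:                     # stub: replaces the unfinished service
    def fetch(self, item_id):
        return {"price": 100}

def driver():                          # driver: exercises the module directly
    assert price_with_markup("A-1", StubCatalog()) == 120

driver()
</syntaxhighlight>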
The nonstandard use of "Dynamic Testing" and "Static Testing" comes directly from the ISTQB syllabus. The phrase "Static testing can be omitted, and unfortunately in practice often is." does not make any sense, because "static testing" belongs to design, lifecycle management and quality control. It's not testing. Can it be left out? Is it? Is that unfortunate? The answer is "it depends". Of course doing it, time and money permitting, helps improve software quality; that's what this is all about.
And then:
"Static testing involves verification whereas dynamic testing involves validation. Together they help improve software quality."
NO! On the Wikipedia webpage on V&V, we read:
Validation. The assurance that a product, service, or system meets the needs of the customer and other identified stakeholders. It often involves acceptance and suitability with external customers.
Verification. The evaluation of whether or not a product, service, or system complies with a regulation, requirement, specification, or imposed condition. It is often an internal process.
So-called "static testing" applies therefore to both Validation and Verification. So-called "dynamic testing" definitely to Verification ("does it meet listed requirements") but in a far lesser degree to Validation.
Conclusion: a rewrite is needed. One should clarify the "Static/Dynamic" thing vs. the "Plain Testing" thing; it's very confusing.
78.141.139.10 (talk) 17:32, 22 March 2013 (UTC)
Hi, I've seen that you deleted my subsection on software testing. I didn't know that previously unpublished articles could not be added to Wikipedia. What counts as "previously published"?
I've published at http://experiencesonsoftwaretesting.blogspot.com/2010/01/braguet-testing-discovery-of-lucas.html, so it's not unpublished material anymore. -- Diego.pamio (talk) 15:43, 5 January 2010 (UTC)
Does it make sense? Please let me know.
The discussion of certification cites me for two propositions.
Regarding the first, I have often argued that software engineering (including testing) is not ready for LICENSING because we lack an accepted body of knowledge. (See, for example, John Knight, Nancy Leveson, Lori Clarke, Michael DeWalt, Lynn Elliott, Cem Kaner, Bev Littlewood & Helen Nissenbaum, "ACM task force on licensing of software engineers working on safety-critical software: Draft report", July 2000. See also http://www.acm.org/serving/se_policy/safety_critical.pdf.) However, certification is not licensing. We can certify someone as competent in the use of a tool, the application of a technique, or the mastery of a body of knowledge without also asserting that this is the best tool, the most applicable technique, or the "right" (or the only valid) (or the universally accepted) body of knowledge. I think the current certifications claim too much, that they misrepresent the state of knowledge and agreement in the field, and that in doing so several of them promote an ignorant narrow-mindedness that harms the field. But this is a problem of specifics, a problem of these particular certifications.
Regarding the second, I have not said that certification CANNOT measure these things. Look at Cisco's Expert-level certification, for example. I see this as a clear example of a certification of skill. Similarly, I see no reason to argue that we CANNOT measure a programmer's or tester's productivity or practical knowledge. However, I believe that the current certifications DO NOT attempt to measure these things, nor could anyone reasonably argue that any of these certs does a credible job of making such measurements. CemKaner (talk) 22:30, 9 April 2010 (UTC)
Return on investment (ROI) is an often misunderstood term. Matters get even more complex when measuring the return on an investment in software testing.
How do you evaluate software testing ROI, and how can you improve it?
http://blogs.msdn.com/b/robcaron/archive/2006/01/31/520999.aspx
http://www.mverify.com/resources/Whats-My-Testing-ROI.pdf NewbieIT (talk) 03:53, 10 June 2010 (UTC)
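The break-even arithmetic behind most such ROI discussions is simple; here is a sketch with made-up numbers (not taken from the linked sources):
<syntaxhighlight lang="python">
tool_license       = 5000   # one-off tooling cost
script_authoring   = 20000  # engineer time to write the initial suite
manual_run_cost    = 800    # one full manual regression pass
automated_run_cost = 50     # one automated pass, maintenance amortized

investment     = tool_license + script_authoring       # 25000
saving_per_run = manual_run_cost - automated_run_cost  # 750
print(investment / saving_per_run)  # ~33 regression cycles to break even
</syntaxhighlight>
The whole argument then turns on how often the regression suite actually runs, which is why one-off projects and continuously tested products reach opposite conclusions.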
The heading "CMMI or waterfall" was utterly incorrect. Neither the Capability Maturity Model for Software (SW-CMM), the CMM for Systems Engineering (EIA-731), nor the Capability Maturity Model Integrated (CMMI) have ever mandated the waterfall life cycle. In fact, many of the original CMM authors have lectured on the topic of how the waterfall originated from a mis-quoted and misunderstood speech by Winston Royce in 1973, and therefore arguably was never a legitimate life cycle in its strictest interpretation. Spiral, incremental, iterative, OOPS, sushi, fountain, etc. are all life cycles that may be the basis for project and test execution, and all may be used to address the practices called for in the CMMI. Vic Basily has written an excellent article discussing how test-everything-at-the-end approaches are not "traditional", and how incremental/iterative approaches to development and software testing have been around as long as the industry. The "test first" approach promoted by agile methods is actually a revival of long-standing disciplines. —Preceding unsigned comment added by 63.241.202.8 (talk) 19:31, 9 August 2010 (UTC)
The heading for this section was so jarring that I had to stop and see if anyone had raised the issue. I'm glad to see someone has. The first writer above is 100% correct. The second... CMMI and waterfall don't really suggest things that are in the slightest way similar. This heading appears to have been written by someone who hasn't the slightest clue about CMMI. 65.201.123.212 (talk) 17:16, 6 May 2013 (UTC)
I agree with 65.201.123.212 and Walter Görlitz; CMMI is a completely different model, and what would be better referenced here is TMMI and how it relates to the overall SDLC. This section should instead debate waterfall vs. agile. Sandelk (talk) 11:00, 17 December 2013 (UTC)
I am by no means an expert on the matter, which is why I struggle to understand this concept: why is automated testing so much more expensive? Once you have your automation architecture in place, I see no additional costs. If you are performing testing activities as a one-off thing I can see it being expensive, but in an environment where constant testing takes place, I imagine it's worth the investment? Cronax (talk) 12:21, 28 September 2010 (UTC)
I will actually disagree a little with Walter Görlitz here. Automation can actually prove to be cheaper: it has a really high return on investment, but it requires a lot of initial expense to pay for the respective tools, the engineers to write the scripts, and the additional challenges that will be faced. If done right, after 2-3 years the investment pays off greatly, especially when you consider the time saved through the automation efforts. This article needs to look at automation more objectively and highlight both the pros and cons more effectively. Sandelk (talk) 11:09, 17 December 2013 (UTC)
Should there be a new controversy point: A separate test team vs business analysis team conducting the tests? I'm looking for information on this, and depending on the comments to this topic, a new controversy point may be added. —Preceding unsigned comment added by Culudamar (talk • contribs) 12:21, 20 October 2010 (UTC)
At Yahoo!, a major software company with ~15,000 employees, the terms "Quality Assurance" and "Quality Engineering" are used primarily to refer to what is termed Software Testing on this page. All testing except for Unit Testing falls to a Quality Assurance team, and the members of those teams have Quality Engineer as a part of their job titles. This leads me to wonder: does the distinction made here reflect industry practice? It could be that Yahoo! is an exception, or it could be that the usage of the terminology has shifted. I don't know enough about the issue to say, but I wanted to raise the question.
CopaceticOpus (talk) 04:25, 1 February 2011 (UTC)
Definition does not make sense at all: "Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test."
Bluh. One couldn't be more unspecific. Why not say "Software testing is an activity, that a lot of people earn their money from easily without even understanding the basics of reason, by scrumming up and crying "yeah, we have a profession too!"."?
The "software testing industry" and ISTQB is a religion that absorbs any decent thought. —Preceding unsigned comment added by 115.189.199.174 (talk) 23:10, 22 April 2011 (UTC)
Thank you, Walter.
I state it does not, because it is not a definition. A definition must distinguish the object defined from the rest of the universe, else it is not a definition. I will show this defect by replacing terms of the "definition" without changing the semantics, noting in () for each step why it is correct:
"Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test."
<--> ("stakeholders" can be anyone)
"Software testing is an investigation conducted to provide information about the quality of the product or service under test."
<--> ("quality" is undefined and describes arbitrary features)
"Software testing is an investigation conducted to provide information about the product or service under test."
<--> ("quality" is undefined)
"Software testing is an investigation conducted to provide information about the product or service under test."
<--> ("product or service" can be anything)
"Software testing is an investigation conducted to provide information about the object under test."
<--> ("software testing" is the activity performed)
"Software testing is an investigation conducted to provide information about it's object."
<--> ("investigation" is any activity trying to reveal information on something)
"Software testing is an investigation on it's object."
<--> ("object" is any arbitrary thing)
"Software testing is an investigation on something."
<--> ("something" is an indifferent thing)
"Software testing is an investigation."
<--> ("investigation" is any activity trying to reveal information on something, but there is nothing specified)
"Software testing is an investigation on everything."
<--> ("an investigation on everything" focuses only on esoterics and exoterics)
"Software testing is religion."
<--> ("religion" is irrelevant to software quality, the only thing "Software testing" is relevant to)
"Software testing is irrelevant."
<--> ("irrelevant" means, it has no relation to other things)
"Software testing is!"
<--> (everything is)
"!"
So: not a definition there. The existing article should be replaced with an "!".
A pessimist is someone who states the truth too early. —Preceding unsigned comment added by 115.189.239.224 (talk) 09:15, 3 May 2011 (UTC)
Thank you, WikiWilliamP. I would love to see you try. —Preceding unsigned comment added by 115.189.38.100 (talk) 21:03, 3 May 2011 (UTC)
The article states: "Testing can never completely identify all the defects within software". This would really be bad if it were true. Consider simple programs that can be fully verified for all inputs. In these cases, the testing process can assure that the algorithm operates as expected unless the operating system, or the hardware fails, or data becomes corrupted by other parts of the software. All of these conditions lie outside of the scope of testing a single algorithm or collection of algorithms. However, usually software is too complex to allow for complete verification or proof of correctness. Even so, many bugs can be found through testing. You could say that a bug that cannot be found is one that does not exist, given sufficient time for testing (which may be a lot of time...). While you can never be sure that all errors in an algorithm have been found through testing, unless all input/output combinations are verified, you may still have found all bugs in the software. Therefore, a much more cautious wording is required here. I would propose to say, "Randomized testing cannot ensure that all defects within software have been found." — Preceding unsigned comment added by ScaledLizard (talk • contribs) 17:34, 20 June 2011 (UTC)
That is what "combinatorial explosion" is all about. 78.141.139.10 (talk) 17:17, 22 March 2013 (UTC)
Your dispute seems due to a simple ambiguity. "Testing can never completely identify all the defects within software" can be interpreted as
or
or something in between, e.g.
I think most people would consider the first statement to be false and the second to be true. The third statement seems closest to what is intended and I think it's hard to refute. Rp (talk) 17:41, 28 February 2014 (UTC)
Under the presence of specific testing hypotheses, there exist finite test suites such that, if the implementation under test (IUT) passes them all, then it is necessarily correct with respect to the considered specification. For instance, it is well known that, if we assume that the IUT behavior can be denoted by a deterministic finite-state machine with no more than n states, then the (finite) test suite consisting of all sequences of 2n+1 consecutive inputs is complete (i.e. it is sound and exhaustive). That is, if the IUT passes them all then we know for sure that it is correct (check e.g. "Principles and methods of testing finite state machines" by Lee and Yannakakis). Moreover, the existence of finite complete test suites has been proved for many other (very different) sets of assumed hypotheses; see the hierarchy of testing difficulty in this article for further details. So, the sentence "Testing can never completely identify all the defects within software" is false and should be replaced by something like "Under the absence of appropriate testing hypotheses, testing can never completely identify all the defects within software." In order to explain this addition a little bit and avoid controversy, perhaps a link to the hierarchy of testing difficulty section in this article should be included right after the sentence, so it could be something like: "Under the absence of appropriate testing hypotheses, testing can never completely identify all the defects within software (though complete test suites may exist under some testing hypotheses, see hierarchy of testing difficulty below)." If nobody argues against this change within a few days, I'll do it. --EXPTIME-complete (talk) 20:00, 25 August 2014 (UTC)
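To make the idea of a complete suite concrete, a toy sketch (the real constructions in Lee and Yannakakis are far more refined; this one just enumerates every input sequence of the stated length and compares the outputs of specification and IUT):
<syntaxhighlight lang="python">
from itertools import product

# A machine is {state: {input: (output, next_state)}}, started in state 0.
def run(machine, seq):
    state, outputs = 0, []
    for symbol in seq:
        output, state = machine[state][symbol]
        outputs.append(output)
    return outputs

def conforms(spec, iut, inputs, n):
    # Hypothesis: the IUT is a deterministic FSM with at most n states.
    # Following the claim above, check every sequence of 2n+1 inputs.
    return all(run(spec, seq) == run(iut, seq)
               for seq in product(inputs, repeat=2 * n + 1))
</syntaxhighlight>
The point is that the suite is finite, and passing all of it is conclusive, but only because of the assumed hypothesis about the IUT.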
Since nobody has argued against my point in the previous paragraph, I have changed the text to: “Although testing can precisely determine the correctness of software under the assumption of some specific hypotheses (see hierarchy of testing difficulty below), typically testing cannot completely identify all the defects within software.” I think this is a consensus sentence. On the one hand, it shows that testing cannot guarantee the software correctness if some specific hypotheses cannot be assumed (a typical case indeed). On the other hand, it shows that some hypotheses enable completeness in testing. Note that the power of hypotheses in testing is relevant and worth mentioning here because, in any field, new knowledge can be gathered only if some hypotheses are assumed: mathematicians need to assume axioms, physicists need to assume that observations are correct and “universal rules” will not suddenly change in a few minutes, etc. Software testing is not an exception. Moreover, even when testing cannot guarantee the system correctness (the typical case), many hypotheses are also implicitly or explicitly assumed. --EXPTIME-complete (talk) 8:56, 28 August 2014 (UTC)
There is no mention of how tests should generally be recorded in each category. Some generally accepted guidelines would be useful, such as tester, date+time, title, detail, action, resolution, etc. This depends on the category; for example, performance testing and regression testing are quite different.
It would also be useful to describe how to summarise results on an ongoing basis to developers & managers; where the test timeline stands, proportion of resolved issues, criteria for acceptance (not necessarily 100% success). All essential to managers. SombreGreenbul (talk) 13:07, 20 July 2011 (UTC)
Often, software is portable between platforms, e.g. between Windows ME, Vista and 7. Testing of the software on relevant platforms should be a subsection within Non-Functional Testing. SombreGreenbul (talk) 14:13, 20 July 2011 (UTC)
Moving this from my talk page: Hi, I am unable to understand your ideas about grey box testing. Why are you reverting my edits? I am providing references that support my edits. Please check them and tell me where I am wrong, but please don't undo my edits. — Preceding unsigned comment added by Netra Nahar (talk • contribs) 2011-09-12T17:52:40
"Sometimes, beta versions are made available to the open public to increase the feedback field to a maximal number of future users.[citation needed]" Could this be expanded? This is now quite common practice for web sites and, typically, the larger the site then the longer the duration of the beta testing. Google mail was in beta for 5 years! For citations, how about http://www.slate.com/articles/news_and_politics/recycled/2009/07/why_did_it_take_google_so_long_to_take_gmail_out_of_beta.html or even a much earlier article: http://www.zdnet.com/news/a-long-winding-road-out-of-beta/141230 — Preceding unsigned comment added by 86.19.211.206 (talk) 15:19, 7 October 2011 (UTC)
What are your thoughts on using the term "human testing" or "human performed testing" instead of manual testing? Does it make sense? This in contrast to machine performed testing, robot performed testing or automated testing.
Regards. — Preceding unsigned comment added by Anon5791 (talk • contribs) 22:21, 14 October 2011 (UTC)
After " Dave Gelperin and William C. Hetzel classified in 1988 the phases and goals in software testing in the following stages" a list follows that extends beyond 1988. More explanation is necessary. Jeblad (talk) 20:28, 27 December 2011 (UTC)
Risk-based testing appears to have been written without consideration that it is probably better off just a brief sentence in this article, if independent reliable sources demonstrate such weight is due. In other words, the article appears to be a neologism, WP:POVFORK, and a bit of a soapbox. Can anyone find sources to justify a brief mention in this article, or maybe sources enough to keep Risk-based testing as an article in itself? --Ronz (talk) 03:15, 31 January 2012 (UTC)
This article seems to have nothing of the sort. There would be issues of synchronization, hazards, races, and the use of semaphores or mutexes, which, in addition to being designed more or less correctly, need to be tested. There would be the issue of a fast producer and a slow consumer, and how this is handled by the system or by the application. Sending of pointers to shared data: is it done, and does it work? There would be priority issues of processes or messages, and priority-inversion handling. And these would probably be only some of the factors that would need to be tested. Should there be a separate chapter about this in the article? — Preceding unsigned comment added by Aclassifier (talk • contribs) 12:07, 26 March 2012 (UTC)
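For what such a test might look like, a minimal sketch of a lost-update (race) check; note that tests like this are inherently probabilistic, since a single passing run proves nothing:
<syntaxhighlight lang="python">
import threading

def test_counter_for_lost_updates():
    counter = {"n": 0}
    def worker():
        for _ in range(100000):
            counter["n"] += 1          # read-modify-write, not atomic
    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # A correctly synchronized implementation always yields 400000;
    # anything smaller is evidence of a race.
    assert counter["n"] == 400000
</syntaxhighlight>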
Is this worth mentioning? There may be requirements outlined in standards (like IEC 61508), but there is no mention of ethics. How does one treat or test a situation that would be very rare? What would the consequences be? How much do we tell the end user about what has and has not been tested (in this version)? How do we reply to a question? I don't know much about this, but to me it seems relevant. Should there be a separate chapter about this? Øyvind Teig (talk) 12:16, 26 March 2012 (UTC)
There are two problems with the discussion of non-functional testing:
--AlanUS (talk) 18:06, 31 March 2012 (UTC)
The whole section on "functional" versus "non-functional" is wrong. The distinction alluded to is between verification ("Did we code the thing right?") and validation ("Did we code the right thing?").
Functional refers to the code which is called by the Code Under Test (CUT). Integration refers to the code which calls the CUT. That is probably the single most important distinction in all of software testing, and it's not even part of the vocabulary for most coders.
The reason it's so important is that most people combine integration- and functional-testing. That's actually validation, though most people lazily call that whole enchilada integration-testing. It's the hardest to debug. They should perform integration-testing without functional-testing by faking the code which is called by the CUT, or by using a trivial version of the CUT.
Many people say unit-testing when they mean functional-testing. Proper Unit testing mocks the code called by the CUT so that only the CUT is executed.
-- Cdunn2001 (talk) 17:58, 7 July 2013 (UTC)
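A minimal sketch of the distinction drawn above, using a hypothetical order-approval function as the CUT and Python's unittest.mock to fake its collaborator:
<syntaxhighlight lang="python">
from unittest.mock import Mock

# CUT: decides whether an order can ship.
def approve_order(order, inventory):
    return order["paid"] and inventory.in_stock(order["sku"])

def test_approve_order_as_a_unit():
    inventory = Mock()                    # mock the collaborator the CUT calls
    inventory.in_stock.return_value = True
    assert approve_order({"sku": "X-9", "paid": True}, inventory)
    inventory.in_stock.assert_called_once_with("X-9")
</syntaxhighlight>
Only approve_order() executes; nothing it calls is real, so a failure can only point at the CUT itself.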
Positive and negative test cases redirects here, but neither is explained in the article. -- Beland (talk) 18:16, 5 October 2012 (UTC)
Since 2008 there has been a detailed article on stress testing software, 100% devoted to the topic: Stress test (software). The reference here to that specific article has been reverted back twice to the very general Stress testing article. That's a broad-brush article covering hardware, software, financial (bank stress tests), and may soon cover medical/human stress testing (cardiac, voice, labor & delivery, emotional stress testing, etc.). It seems to make no sense that this article, which is 100% devoted to software, should not point directly to Stress test (software), since the reader here is known for sure to be focused on software. Rick (talk) 03:34, 25 February 2013 (UTC)
The above refers to revert1 and revert2
Where on the page Software testing, the link Stress test (software) has been twice reverted to the more general Stress testing.
Need to pin down what is meant by (it) in:
Also need to pin down exactly what is being referring to in these ()?
This article and other related articles may be subject to editing by inexperienced editors as part of an effort to improve the quality of information on the subject of software testing: http://weekendtesting.com/archives/3095
Please be kind.
Cmcmahon (talk) 22:45, 5 September 2013 (UTC) (I am WMF staff but operating here not in my official capacity)
I think monitoring (for example with Nagios) is part of software testing. The current Wikipedia article does not cover this. There is a section about alpha testing, then about beta testing. The days when programs were written, burned onto a CD/DVD and then sold are long gone. Today most software is server-based, and the programmers are able to care for the software during live execution. As in "DevOps": watching the processes is part of software testing. I am not a native speaker, which is why I don't want to write in the real wiki article. But maybe someone agrees with me and can add something to the real article. — Preceding unsigned comment added by 89.246.192.60 (talk) 19:47, 8 November 2013 (UTC)
I think "Is monitoring part of software testing?" has no definite answer. But I hope all agree: It is **related** to software testing. That's why I think some sentences about monitoring (nagios checks) should be included in the page. Up to now I am too new to wikipedia and don't know how to start. But if someone starts, I would love to give feedback. Guettli (talk) 07:25, 6 November 2015 (UTC)
In my context, monitoring is part of testing. The canonical (but maybe not best) reference is this decade-old talk from Ed Keyes, where he says "Sufficiently Advanced Monitoring is Indistinguishable from Testing" (video link). Angryweasel (talk) 23:18, 17 November 2017 (UTC)
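For readers unfamiliar with the mechanics, a minimal sketch of a Nagios-style probe; the endpoint and thresholds are made up, while the exit-code convention (0 = OK, 1 = WARNING, 2 = CRITICAL) is Nagios's own:
<syntaxhighlight lang="python">
#!/usr/bin/env python3
import sys
import time
import urllib.request

URL = "https://example.test/health"    # hypothetical endpoint

try:
    start = time.monotonic()
    with urllib.request.urlopen(URL, timeout=5) as resp:
        elapsed = time.monotonic() - start
        if resp.status != 200:
            print("CRITICAL: HTTP %d" % resp.status)
            sys.exit(2)
    if elapsed > 2.0:
        print("WARNING: slow response (%.2fs)" % elapsed)
        sys.exit(1)
    print("OK: responded in %.2fs" % elapsed)
    sys.exit(0)
except Exception as exc:
    print("CRITICAL: %s" % exc)
    sys.exit(2)
</syntaxhighlight>
Run by a scheduler every minute, this is effectively a tiny automated test executing continuously in production, which is the sense in which monitoring and testing blur together.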
I am a software author with more than 40 years' experience of testing. This article is massively oversized for what is essentially a "simple" process. Most people understand what a knife is used for and would recognize cutting implements of different kinds from the Stone Age up to the present time. A Stone Age "tester" would perform the task of "testing" his product in much the same way as a modern-day butcher. Does the cutting implement do what it is supposed to do? If not, why not? How can it be fixed? The article differentiates debugging from testing despite the fact that testing is the most obvious way of identifying errors. I believe there should be a history section that clearly identifies at what stage each advance in manual or automatic program validation techniques occurred. The present article would have us believe that there were numerous "arcane" subdivisions of software testing from the outset. This is so not true. There have been paradigm shifts in the art of testing and debugging that are reflected in the commercial tools that have evolved to assist the process right up to the present day. To some extent the vast number of programming paradigms, languages and hardware platforms has hampered the development of universal testing tools, but the concept of assisted testing has existed since at least the 1970s and is understated here. It is a truly dreadful article. — Preceding unsigned comment added by 81.154.101.27 (talk) 09:44, 4 January 2014 (UTC)
Good programmers are lazy. That's at least my opinion. Yes, there is a big amount of theory, but what's the goal of this article? To do in-depth theoretical academic work, or to give a good overview? For me the overview is more important than the details. I would like a much shorter article, too. If some parts need more in-depth explanations, then a new page needs to be created, for example for "security testing". I guess only 0.0001% of all developers work in an environment which needs security testing. Yes, it is important for some people, but only very few. PS: I am talking about "security testing"; "security concerns" are something else, and need to be addressed by every developer daily. Guettli (talk) 07:33, 6 November 2015 (UTC)
Article claims that "Several certification programs exist to support the professional aspirations of software testers and quality assurance specialists. No certification now offered actually requires the applicant to show their ability to test software. No certification is based on a widely accepted body of knowledge.", but what is the actual basis of claiming so? There are certainly others that don't believe this to be true, see for example ISTQB, "The scheme relies on a Body of Knowledge (Syllabi and Glossary) and exam rules that are applied consistently all over the world, with exams and supporting material being available in many languages.". Added citation needed-template. Slsh (talk) 15:34, 30 January 2014 (UTC)
Negative test is a disambig. The software meaning links here, but this page doesn't contain the word "negative". --Dan Wylie-Sears 2 (talk) 01:48, 11 April 2014 (UTC)
The article desperately needs a definition of the term or a link to another article in which it is defined.— Preceding unsigned comment added by 68.183.37.170 (talk) 20:02, 21 May 2014 (UTC)
One does not need access to logs or databases to understand an algorithm or internal data structure and vice versa. So, there is nothing that actually distinguishes gray box from white or black box testing.— Preceding unsigned comment added by 68.183.37.170 (talk) 20:02, 21 May 2014 (UTC)
Is it a level or type of testing? If the answer is "both", then the notions of "level" and "type" overlap and those two sections would have to be combined.— Preceding unsigned comment added by 68.183.37.170 (talk) 20:02, 21 May 2014 (UTC)
I've removed most of the entries in the certification provider section. Wikipedia is WP:NOTDIRECTORY, and the section was getting awfully spammy with all the entries lacking articles or secondary sources. If these testing certifications are actually provided by noteworthy organizations, their inclusion should be supported by either an article, or a WP:SECONDARY source. If no such sources can be found, it becomes impossible to tell the difference between a legitimate service and a certification-mill. Regardless, Wikipedia is not a platform for advertising, which is what this amounted to. I think the few remaining entries should also be removed, unless there are any objections. Grayfell (talk) 20:28, 6 September 2014 (UTC)
The NIST study isn't a credible source for the economic estimate of the costs of software defects to the economy. It comes up with weird results, like "on average a minor software error has a cost of four million dollars" or "minor errors can cost more than major ones" (both from Table 6-11). It has unreasonably low sample sizes - fewer than 15 software developers, and even though the user portion of the study had 179 plus 98 respondents, that represents a dismally low response rate that would have resulted in tossing the study in most academic publications. Most crucially, it isn't based on any actual in-house measurements but on a 25-minute survey which asked people to guess what bugs were costing their company.
More details here: https://plus.google.com/u/1/+LaurentBossavit/posts/8QLBPXA9miZ — Preceding unsigned comment added by LaurentBossavit (talk • contribs) 14:32, 13 September 2014 (UTC)
I noticed that the first 3-4 paragraphs in the very first section of this page repeat themselves. If you read it, you'll see what I mean. It could really use a cleanup. I would gladly do it, but I don't want to just jump in and take care of it without bringing it up here first, and since I have no idea how long it might take for this process to play out, someone else will probably want to do it, at least if anyone cares about how intelligent the article should appear to be, considering the subject matter. — Preceding unsigned comment added by 184.78.188.225 (talk) 03:50, 14 September 2014 (UTC)
I've noticed that two sections use the exact same sentences when discussing two different types of testing:
Can someone who understands this topic better please clean that up? — Mayast (talk) 22:12, 23 November 2014 (UTC)
I think that back-to-back testing is missing & should be mentioned in this article.--Sae1962 (talk) 14:59, 16 June 2015 (UTC)
Hello fellow Wikipedians,
I have just added archive links to one external link on Software testing. Please take a moment to review my edit. If necessary, add {{cbignore}}
after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}}
to keep me off the page altogether. I made the following changes:
When you have finished reviewing my changes, please set the checked parameter below to true to let others know.
This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}}
(last update: 5 June 2024).
Cheers. —cyberbot IITalk to my owner:Online 16:54, 27 August 2015 (UTC)
The changes made by @NoahSussman: were too much. The addition of sources was good, but not all meet WP:RS. Elisabeth Hendrickson is but Kate Falanga is not and going from 73 references to 36 references isn't an improvement. Removing common terms such as Black-box and white-box testing is incomprehensible. It is too much to review in a single sitting. Walter Görlitz (talk) 14:01, 15 March 2017 (UTC)
Walter Görlitz It would be nice if you would consider Noah Sussman's work ongoing, and criticize it or update it point-by-point if necessary, rather than revert these changes wholesale. This page has been a shambles for years, and now that finally someone competent is updating it, the software testing community would appreciate as much support as Wikipedia can give. If it makes a difference, I was the QA Lead at WMF for about three years, and I can vouch that no one in this conversation is a sock puppet. Cmcmahon (talk) 18:16, 15 March 2017 (UTC)
As a reminder, you have not justified your reversion of Noah's edits; you've simply stated that the revisions weren't, in your opinion, "helpful." Which specific revisions weren't helpful? Why? If half of the content is moved to other pages, has it been removed or simply edited? If half the content is moved, wouldn't one expect half of the references to be removed as well?
The language you use here ("I am considering it", "It is too much to review in a single sitting") is very reminiscent of WP:OWNBEHAVIOR; please remember WP:OWNERSHIP. — Preceding unsigned comment added by Cyetain (talk • contribs) 19:34, 15 March 2017 (UTC)
I will readily admit that I made massive edits to see what would happen. I am sympathetic to anyone who would like their work chunked at the smallest grain that is practical :) I will redo the edits in small chunks. HOWEVER the "wholesale removal" is INACCURATE AND WRONG as I MOVED the content in question to a new page, which is linked from the old location of the content. I intend to apply this change again. IT IS NOT REMOVAL OF BLACK AND WHITE BOX TESTING I am simply complying with the "too large" box that I *found in place* on the page. I am trying to follow the extant instructions for improving wikipedia and making the page smaller by extracting list content into a "list of things" page. So I fully expect not to get pushback on that change when I re-implement it in the near future. Thank you and I look forward to continuing the discussion / your thoughts / your further feedback NoahSussman (talk) 10:38, 16 March 2017 (UTC)
"73 references to 36 references isn't an improvement." it is if half the references are to marketing material, out of date, badly written material or material that is all three at once. As is the case here. Too many unreliable / marketing links on this page is a serious credibility problem. Again though I will now challenge one reference at a time rather than attempting any more bulk deletion. No more bulk deletions. But the references on this page blow chunks and I will END THEM NoahSussman (talk) 10:45, 16 March 2017 (UTC)
"I'm not sure how a technique can become outdated" - well, this is the crux of problem with this page, I think. The whole page is based on an idea of Software Testing that emerged decades ago, while software development techniques have moved on. Many of us software testers have extended/adapted our methods over the last decade or so, and I agree with Noah that major changes are necessary to the whole thing. I do understand the reluctance to throw away large parts of the page (and references) and I hope we do get a better page out of this discussion. Rutty (talk) 15:00, 16 March 2017 (UTC)
I'd like to see some discussion of generative testing, supplemented by a link to the QuickCheck page. RichMorin (talk) 03:29, 28 September 2017 (UTC)
Does this section provide value as a reference on software testing roles? I'm questioning whether it can be removed or merged into another section.
Furthermore, the list of "roles" in this section is just a snapshot of some software testing titles, and there are so many of these that I don't think listing a few of them would help any reader. — Preceding unsigned comment added by Angryweasel (talk • contribs) 23:22, 17 November 2017 (UTC)
I don't see a strong precedent for the date format in citations. I see things like "2012-01-13" and "July 1, 2009". Is there a preference? I think "July 1, 2009" is more readable. Faught (talk) 19:23, 21 November 2017 (UTC)
So I went with the US style that was prevalent in the body of the article. But perhaps you'd want to use ISO format only in the citations? Not sure whether you'd want to make it different just in the citations. Faught (talk) 00:47, 28 November 2017 (UTC)
Hello fellow Wikipedians,
I have just modified 2 external links on Software testing. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.
An editor has reviewed this edit and fixed any errors that were found.
Cheers.—InternetArchiveBot (Report bug) 08:17, 2 December 2017 (UTC)
How should we choose between American English or British English? Does the spelling on any related pages matter?
On Software testing I see American spellings like artifacts, behavior, and unrecognized, and British spellings like artefacts, grey-box, unauthorised, and organisational. Faught (talk) 20:57, 6 December 2017 (UTC)
{{Use American English|date=September 2024}}
. Of course, this will change each month and year that it's on the talk page or in the archive because of the embedded formula. Walter Görlitz (talk) 01:58, 7 December 2017 (UTC)
Can anyone identify what this reference is? "Regarding the periods and the different goals in software testing,[1]..." Perhaps the article they co-authored, "The Growth of Software Testing"? If we can't identify what this is referring to, we should delete the reference. Faught (talk) 18:24, 18 December 2017 (UTC)
There's a rather odd citation in the History section: Company, People's Computer (1987). "Dr. Dobb's journal of software tools for the professional programmer". Dr. Dobb's journal of software tools for the professional programmer. M&T Pub. 12 (1–6): 116.
Can anyone intuit a title and author for this article? Faught (talk) 19:19, 20 December 2017 (UTC)
I published an edit to the Certifications section with several minor improvements, and adding certifications from the International Software Certifications Board (better known as QAI).
I deleted two of them: 1) Certified Quality Improvement Associate (CQIA), which is not specific to software, and 2) ISEB, which is not a certification, but hints at the fact that ISEB markets the ISTQB certifications already listed here. If there is any controversy about those deletions, let's add them back without losing the other changes.
The ISTQB offers a richer set of certifications than is indicated here, but the way they organize them makes it difficult to list all the variations as separate certifications. Maybe someone could find a good solution for this page.
One more thing - I want to delete the "Software testing certification types" information entirely. This is already covered at Certification#In_software_testing and doesn't need to be hashed out on the software testing page. Any objections?
Also, I want to consider moving most of the first paragraph to the Controversies section. I don't think it has a neutral point of view. Faught (talk) 22:20, 30 January 2018 (UTC)
I added a [citation needed] tag to the following line in the Testing Levels section:
"There are generally four recognized levels of tests: unit testing, integration testing, component interface testing, and system testing."
I do see a few web mentions (mostly from sites selling their wares) defining "the 4 levels" as unit, integration, system, and acceptance. I feel like that may be the better edit, but I'm not certain where the initial reference of those 4 levels comes from. I'll keep digging, but throwing it here in the meantime. For example, there's an explanation on the test-institute dot org website (coincidentally blocked by wikipedia) - but I don't want to reference a site that is focused on selling certifications. I've thumbed through my library of test books, but haven't found the original source yet.
I also found a reference to Component, integration, system, and acceptance testing in _Foundations of Software Testing ISTQB Certification_ by Rex Black and Dot Graham.
I am wondering (out loud) if anything about Testing Levels rises to the level of being worth mentioning in Wikipedia. Angryweasel (talk) 19:49, 3 April 2018 (UTC)
The blurb about outsourcing links to an article touting a particular firm's services and offers little actual evidence for its claims; I suggest it be removed. Sinfoid (talk) 02:52, 7 May 2018 (UTC)
We can find the tests to get an overview, and I asked my students to review this overview. 103.228.159.104 (talk) 15:47, 6 December 2021 (UTC)
This article was the subject of an educational assignment supported by Wikipedia Ambassadors through the India Education Program.
The above message was substituted from {{IEP assignment}}
by PrimeBOT (talk) on 20:08, 1 February 2023 (UTC)
User MrOllie has recently removed my contributions about the Testability Hierarchy on the grounds of citation spam. I would like to ask a volunteer to review the relevance to this article of the Testability Hierarchy section as it existed before it was reverted by MrOllie at 22:27, 20 February 2024 (UTC). EXPTIME-complete (talk) 23:33, 20 February 2024 (UTC)
In particular, this is the text I would like to introduce again:
Based on the number of test cases required to construct a complete test suite in each context (i.e. a test suite such that, if it is applied to the implementation under test, then we collect enough information to precisely determine whether the system is correct or incorrect according to some specification), a hierarchy of testing difficulty has been proposed.[1][2] It includes the following testability classes:
It has been proved that each class is strictly included in the next. For instance, testing when we assume that the behavior of the implementation under test can be denoted by a deterministic finite-state machine, for some known finite sets of inputs and outputs and with some known number of states, belongs to Class I (and all subsequent classes). However, if the number of states is not known, then it only belongs to the classes from Class II on. If the implementation under test must be a deterministic finite-state machine failing the specification for a single trace (and its continuations), and its number of states is unknown, then it only belongs to the classes from Class III on. Testing temporal machines, where transitions are triggered if inputs are produced within some bounded real-valued interval, only belongs to the classes from Class IV on, whereas testing many non-deterministic systems (but not all; some even belong to Class I) only belongs to Class V. The inclusion in Class I does not require the simplicity of the assumed computation model, as some testing cases involving implementations written in any programming language, and testing implementations defined as machines depending on continuous magnitudes, have been proved to be in Class I. Other elaborated cases, such as the testing framework by Matthew Hennessy under must semantics, and temporal machines with rational timeouts, belong to Class II.
EXPTIME-complete (talk) 09:01, 21 February 2024 (UTC)
I'm planning to delete the section Faults and failures since, although not wrong, it is off topic. I am also planning to delete the paragraph with "software product caters", since it is also far off topic.
@Furkanakkurt8015: I wanted to tag you since I see you modified the fault section recently. Hope you don't mind if I ax it. Stevebroshar (talk) 20:27, 28 April 2024 (UTC)