Information on snowpack instability is crucial for assessing avalanche risk in backcountry operations as well as for operational forecasting of the regional avalanche danger. Since slab avalanche release requires both fracture initiation and fracture propagation in a weak snowpack layer, field observations should ideally provide reliable information on the propensity for both fracture processes. Even simple field observations that do not require digging a snow pit can provide useful information. Traditional snowpack tests include the shovel shear test, the shear frame test, the compression test (CT) and the rutschblock test (RB). Interpretation of CT and RB results has been improved by considering the appearance or type of the fracture in addition to the score. More recently, two tests have been developed that focus on fracture propagation rather than initiation: the extended column test (ECT) and the propagation saw test (PST). We compare the sensitivity, specificity and unweighted average accuracy of various stability tests. Comparative studies indicate that the RB, ECT and PST have comparable accuracy. For most test methods, the unweighted average accuracy of a single test was 70–90%, depending on the dataset. Test methods such as the RB, ECT and PST, which fracture an area large enough to include fracture propagation, are generally more accurate than test methods that fracture smaller areas (e.g. the CT). The threshold-sum method was also less accurate. Even with very experienced observers, an error rate of at least about 5–10% has to be expected for the RB, ECT and PST. Performing a second, adjacent test on the same slope improves test reliability.
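The three performance measures named above can be illustrated with a minimal sketch. It assumes the usual definitions, which the abstract itself does not spell out: sensitivity is the fraction of unstable slopes that a test classifies as unstable, specificity is the fraction of stable slopes that it classifies as stable, and the unweighted average accuracy is the mean of the two. The class name, function names and counts are hypothetical and chosen only for illustration; they are not data from any of the compared studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of test outcomes against observed slope stability (hypothetical)."""
    unstable_detected: int    # slope unstable, test indicated unstable
    unstable_missed: int      # slope unstable, test indicated stable
    stable_confirmed: int     # slope stable, test indicated stable
    stable_false_alarm: int   # slope stable, test indicated unstable


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of unstable slopes correctly identified as unstable."""
    return c.unstable_detected / (c.unstable_detected + c.unstable_missed)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of stable slopes correctly identified as stable."""
    return c.stable_confirmed / (c.stable_confirmed + c.stable_false_alarm)


def unweighted_average_accuracy(c: ConfusionCounts) -> float:
    """Mean of sensitivity and specificity (assumed definition).

    Each class counts equally, regardless of how many stable and
    unstable slopes the dataset contains.
    """
    return 0.5 * (sensitivity(c) + specificity(c))


# Made-up counts for illustration only, not results from the paper.
example = ConfusionCounts(
    unstable_detected=40,
    unstable_missed=10,
    stable_confirmed=30,
    stable_false_alarm=20,
)

print(f"sensitivity: {sensitivity(example):.2f}")
print(f"specificity: {specificity(example):.2f}")
print(f"unweighted average accuracy: {unweighted_average_accuracy(example):.2f}")
```

Under this definition, a test cannot score well simply by always predicting the more frequent class, which matters when datasets differ in their share of unstable slopes.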