## multidimensional scaling Quick reference

### A Dictionary of Statistics (3 ed.)

... is called the stress (also called Kruskal stress) of the solution; a low value suggests a good solution.

| | Great Britain | Ireland | Netherlands | Belgium | France | Italy | Spain |
|---|---|---|---|---|---|---|---|
| Great Britain | 20 | 11 | 11 | 6 | 7 | 2 | 6 |
| Ireland | 11 | 20 | 11 | 10 | 11 | 10 | 10 |
| Netherlands | 11 | 11 | 20 | 12 | 11 | 6 | 6 |
| Belgium | 6 | 10 | 12 | 20 | 12 | 11 | 9 |
| France | 7 | 11 | 11 | 12 | 20 | 10 | 12 |
| Italy | 2 | 10 | 6 | 11 | 10 | 20 | 13 |
| Spain | 6 | 10 | 6 | 9 | 12 | 13 | 20 |

The proximity matrix shows, for seven European countries, the numbers of food products (out of twenty) that were found to similar extents (i.e. common in both countries or scarce in both...

## tetrahedral number Quick reference

### The Concise Oxford Dictionary of Mathematics (5 ed.)

...number An integer of the form $\tfrac{1}{6}n(n+1)(n+2)$, where $n$ is a positive integer. This number equals the sum of the first $n$ triangular numbers. The first few tetrahedral numbers are 1, 4, 10 and 20, and the reason for the name can be seen from the...
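
The relationship with triangular numbers can be checked numerically. The sketch below sums triangular numbers directly and compares the result with the standard closed form n(n+1)(n+2)/6; the function names are illustrative only.

```python
def triangular(k: int) -> int:
    """k-th triangular number: 1 + 2 + ... + k."""
    return k * (k + 1) // 2


def tetrahedral(n: int) -> int:
    """n-th tetrahedral number, built as the sum of the first n triangular numbers."""
    return sum(triangular(k) for k in range(1, n + 1))


def tetrahedral_closed_form(n: int) -> int:
    """Standard closed form n(n+1)(n+2)/6 (always an integer)."""
    return n * (n + 1) * (n + 2) // 6


first_four = [tetrahedral(n) for n in range(1, 5)]
print(first_four)
```

The first four values should match the 1, 4, 10 and 20 quoted in the entry.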

## equiangular spiral Quick reference

### The Concise Oxford Dictionary of Mathematics (5 ed.)

...property that the angle α between OP and the tangent at P is constant. In fact, k = cot α. The equation can be written ln r = kθ + b, and the curve is also called the logarithmic spiral. http://jwilson.coe.uga.edu/EMT668/EMAT6680.F99/Erbas/KURSATgeometrypro/related%20curves/related%20curves.html Animations of related curves, and an interactive manipulation of the spiral in Geometer’s...

## box plot Quick reference

### The Concise Oxford Dictionary of Mathematics (5 ed.)

...together with lines, sometimes called ‘whiskers’, showing the maximum and minimum observations in the sample. The median is marked on the box by a line. Box plots are particularly useful for comparing several samples. The figure shows box plots for three samples, each of size 20, drawn uniformly from the set of integers from 1 to...

## Anderson–Darling test Quick reference

### A Dictionary of Statistics (3 ed.)

...sample equivalents. In some cases, as shown in the following table, an adjusted test statistic is required.

| | Test statistic | Upper tail probability 0.10 | 0.05 | 0.025 | 0.01 |
|---|---|---|---|---|---|
| Specified distribution | $A^2$ | 1.933 | 2.492 | 3.070 | 3.857 |
| Normal, estimated mean ($n > 20$) | $A^2$ | 0.894 | 1.087 | 1.285 | 1.551 |
| Normal, estimated variance ($n > 20$) | $A^2$ | 1.743 | 2.308 | 2.898 | 3.702 |
| Normal, estimated mean and variance | | 0.631 | 0.752 | 0.873 | 1.035 |
| Exponential, estimated mean | | 1.062 | 1.321 | 1.591 | ... |

## decision tree Quick reference

### A Dictionary of Statistics (3 ed.)

...tree A graphical representation of the alternatives in a decision-making problem. As an example, suppose that we are choosing between two machines. One costs 100 units and has a 20% probability of breaking down within a year. The other costs 120 units and has a 5% breakdown probability. A breakdown costs 60 units. Ignoring the possibility of multiple breakdowns, which machine should we buy? Reading the decision tree from right to left, we see that it is cheaper to buy the 100-unit machine. In machine learning the ‘decisions’ are, in effect, the...
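
The comparison in this example amounts to an expected-cost calculation at each branch. A minimal sketch, assuming the only costs are the purchase price plus the breakdown cost weighted by its probability (the helper name `expected_cost` is hypothetical):

```python
BREAKDOWN_COST = 60  # cost of one breakdown, in the same units as the price


def expected_cost(price: float, p_breakdown: float) -> float:
    """Purchase price plus the probability-weighted cost of a single breakdown."""
    return price + p_breakdown * BREAKDOWN_COST


costs = {
    "100-unit machine": expected_cost(100, 0.20),
    "120-unit machine": expected_cost(120, 0.05),
}
best = min(costs, key=costs.get)  # option with the smaller expected cost
print(costs, best)
```

Under these assumptions the expected costs are about 112 and 123 units, which agrees with the entry's conclusion that the 100-unit machine is cheaper.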

## hypothesis test Quick reference

### A Dictionary of Statistics (3 ed.)

... $N(\mu, 9)$ and it is desired to test $H_0: \mu = 20$ against $H_1: \mu > 20$, using a sample of size 25. An appropriate statistic is the sample mean $\bar{X}$, which has distribution $N(20, 9/25)$ under $H_0$, or its standardized value $Z = (\bar{X} - 20)/(3/5)$, which has distribution $N(0, 1)$. If the desired significance level is 5%, the critical region, from tables of critical values for the standard normal distribution (see appendix iv), is $Z > 1.645$. In terms of $\bar{X}$, the critical region is $\bar{X} > 20 + 1.645 \times \tfrac{3}{5}$ or, equivalently, $\bar{X} > 20.99$. The probability of making a Type II error is...
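
The critical value can be reproduced with the Python standard library. This is a sketch under the entry's stated setup (variance 9, sample size 25, one-sided 5% test); the function `type_ii_error` and its argument `mu_true` are hypothetical additions that show how a Type II error probability would follow for an assumed true mean.

```python
from statistics import NormalDist

mu0 = 20.0      # mean under the null hypothesis
sigma = 3.0     # population standard deviation (variance 9)
n = 25          # sample size
alpha = 0.05    # significance level for the one-sided test

se = sigma / n ** 0.5                      # standard error of the sample mean: 3/5
z_crit = NormalDist().inv_cdf(1 - alpha)   # upper 5% point of N(0, 1)
xbar_crit = mu0 + z_crit * se              # critical value on the scale of the sample mean


def type_ii_error(mu_true: float) -> float:
    """P(sample mean does not exceed the critical value) when the true mean is mu_true."""
    return NormalDist(mu_true, se).cdf(xbar_crit)


print(round(z_crit, 3), round(xbar_crit, 2))
```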

## rank correlation coefficient Quick reference

### A Dictionary of Statistics (3 ed.)

... variables are related. The original data may be ranks, or measurements that are converted to ranks. For example, in a flower show, six sunflower exhibits (A–F) might be given the ranks 1, 3, 2, 5, 6, and 4. If the heights of these exhibits are 2.4, 1.9, 2.2, 1.8, 1.85, and 2.0 m then we might suppose that the heights affected the judge's ranking, since on converting the heights to ranks we get a similar pattern to the judge's ranks:

| Exhibit | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| Judge's rank | 1 | 3 | 2 | 5 | 6 | 4 |
| Height (m) | 2.4 | 1.9 | 2.2 | 1.8 | 1.85 | 2.0 |
| Height rank (1 = tallest) | 1 | 4 | 2 | 6 | 5 | 3 |

An attraction of using a rank correlation coefficient is that the calculations are simple. Commonly used coefficients are ...
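
The conversion of heights to ranks can be sketched as follows. Two assumptions are made here that the snippet does not state: the tallest exhibit receives rank 1, and Spearman's coefficient (computed from squared rank differences, valid when there are no ties) is used as the illustrative rank correlation.

```python
judge_ranks = [1, 3, 2, 5, 6, 4]               # exhibits A-F
heights_m = [2.4, 1.9, 2.2, 1.8, 1.85, 2.0]    # exhibits A-F


def ranks_descending(values):
    """Rank 1 for the largest value, 2 for the next, and so on (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


height_ranks = ranks_descending(heights_m)

# Spearman's coefficient from the sum of squared rank differences.
n = len(judge_ranks)
d_squared = sum((a - b) ** 2 for a, b in zip(judge_ranks, height_ranks))
spearman = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(height_ranks, round(spearman, 3))
```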

## *L*-moments Quick reference

### A Dictionary of Statistics (3 ed.)

... that are computed from linear combinations of the ordered data values $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ (with mean $\bar{x}$, which is the first *L*-moment). Defining the quantities $b_1, b_2, \ldots, b_{n-1}$ by

$$b_r = \frac{1}{n}\sum_{j=r+1}^{n} \frac{(j-1)(j-2)\cdots(j-r)}{(n-1)(n-2)\cdots(n-r)}\, x_{(j)},$$

the next three *L*-moments are $2b_1 - \bar{x}$, $6b_2 - 6b_1 + \bar{x}$, and $20b_3 - 30b_2 + 12b_1 - \bar{x}$. The second *L*-moment is related to the Gini statistic. An advantage of *L*-moments is that they can be calculated even when distributions have infinite...
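
A short sketch of the calculation, assuming the usual sample definition of the quantities b_r as weighted averages of the ordered values (this definition is taken from the standard literature on *L*-moments, not from the truncated snippet). The function name `sample_l_moments` is illustrative.

```python
def sample_l_moments(data):
    """Return the first four sample L-moments of `data` (needs at least 4 values)."""
    x = sorted(data)
    n = len(x)

    def b(r):
        # b_r = (1/n) * sum over j = r+1..n of
        #       [(j-1)...(j-r)] / [(n-1)...(n-r)] * x_(j), with j counted from 1.
        total = 0.0
        for j in range(r + 1, n + 1):
            weight = 1.0
            for i in range(1, r + 1):
                weight *= (j - i) / (n - i)
            total += weight * x[j - 1]
        return total / n

    xbar = b(0)  # b_0 is the ordinary mean, the first L-moment
    b1, b2, b3 = b(1), b(2), b(3)
    return (
        xbar,
        2 * b1 - xbar,
        6 * b2 - 6 * b1 + xbar,
        20 * b3 - 30 * b2 + 12 * b1 - xbar,
    )


l1, l2, l3, l4 = sample_l_moments([1, 2, 3, 4])
print(l1, l2, l3, l4)
```

For symmetric data such as 1, 2, 3, 4 the third *L*-moment should come out as zero (up to rounding), which is a quick sanity check on the weights.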

## Pascal's triangle Quick reference

### The Concise Oxford Dictionary of Mathematics (5 ed.)

...r, with r = 0, 1, …, n. With the numbers set out in this fashion, it can be seen how the number $\binom{n+1}{r}$ is equal to the sum of the two numbers $\binom{n}{r-1}$ and $\binom{n}{r}$, which are situated above it to the left and right. For example, $\binom{7}{3}$ equals 35, and is the sum of $\binom{6}{2}$, which equals 15, and $\binom{6}{3}$, which equals 20. http://www.mathsisfun.com/pascals-triangle.html An illustration of some of the patterns in Pascal's...
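
The addition rule can be illustrated by building the rows one at a time. This sketch uses a hypothetical helper `next_row` and checks the result against `math.comb` from the standard library.

```python
from math import comb


def next_row(row):
    """Build the next row: each interior entry is the sum of the two entries above it."""
    return [1] + [a + b for a, b in zip(row, row[1:])] + [1]


rows = [[1]]                 # row 0
for _ in range(7):
    rows.append(next_row(rows[-1]))

# The entry 35 in row 7 sits below 15 and 20 in row 6.
print(rows[6])
print(rows[7])
```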

## stem‐and‐leaf plot Quick reference

### The Concise Oxford Dictionary of Mathematics (5 ed.)

...plot A method of displaying grouped data, by listing the observations in each group, resulting in something like a histogram on its side. For the observations 45, 25, 67, 49, 12, 9, 45, 34, 37, 61, 23, grouped using class intervals 0–9, 10–19, 20–29, 30–39, 40–49, 50–59 and 60–69, a stem‐and‐leaf plot is shown on the left. If, as in this case, the class intervals are defined by the digit occurring in the ‘tens’ position, the diagram is more commonly written as shown on the right. See figure. stem‐and‐leaf...
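
Since the figure is not reproduced here, the following sketch prints a text version of the second form described, with the tens digit as the stem and the units digit as the leaf. The layout (a vertical bar between stem and leaves) is one common convention and is an assumption, not necessarily the one used in the original figure.

```python
observations = [45, 25, 67, 49, 12, 9, 45, 34, 37, 61, 23]


def stem_and_leaf(values):
    """Map each tens digit (stem) to the sorted list of units digits (leaves)."""
    stems = {s: [] for s in range(min(values) // 10, max(values) // 10 + 1)}
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return stems


plot = stem_and_leaf(observations)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Empty classes (here 50–59) are kept as stems with no leaves so that the shape still resembles a histogram on its side.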

## Cochran-Armitage test Quick reference

### A Dictionary of Statistics (3 ed.)

...after Cochran and Armitage) for trend in a 2 × k contingency table where the columns correspond to categories of an ordinal variable. Denote the frequency in row $i$ and column $j$ by $f_{ij}$ with $i$ = 1 or 2, and $j = 1, \ldots, J$, denote the row totals by $f_{10}$ and $f_{20}$, the column totals by $f_{01}, \ldots, f_{0J}$, and the grand total by $f_{00}$. The test examines the null hypothesis that, as $j$ increases, the population proportion for a given row consistently increases (or decreases). The test statistic is $T$ given by

where $t_1, \ldots, t_k$ are...

## birthday problem Quick reference

### A Dictionary of Statistics (3 ed.)

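...is not 183, but 23. The complementary event is that all $n$ people have different birthdays. The probability of this, $p_n$, is

$$p_n = \frac{365}{365}\times\frac{364}{365}\times\cdots\times\frac{365-n+1}{365},$$

which reduces surprisingly quickly as $n$ increases:

| $n$ | 3 | 5 | 9 | 13 | 16 | 19 | 22 | 23 | 26 | 30 | 34 | 40 | 46 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $p_n$ | 0.99 | 0.97 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | 0.49 | 0.40 | 0.29 | 0.20 | 0.11 | ... |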
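
The tabulated probabilities can be reproduced with a short loop. The sketch assumes 365 equally likely birthdays and ignores leap years; the function name is illustrative.

```python
def p_all_different(n: int, days: int = 365) -> float:
    """Probability that n people all have different birthdays."""
    p = 1.0
    for i in range(n):
        p *= (days - i) / days   # person i+1 must avoid the i birthdays already taken
    return p


# Smallest group size for which a shared birthday is more likely than not.
smallest = next(n for n in range(1, 366) if p_all_different(n) < 0.5)

for n in (3, 5, 9, 13, 16, 19, 22, 23, 26, 30, 34, 40):
    print(n, round(p_all_different(n), 2))
```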

## Platonic solid Quick reference

### The Concise Oxford Dictionary of Mathematics (5 ed.)

..., with 4 triangular faces (p = 3, q = 3), (ii) the cube, with 6 square faces (p = 4, q = 3), (iii) the regular octahedron, with 8 triangular faces (p = 3, q = 4), (iv) the regular dodecahedron, with 12 pentagonal faces (p = 5, q = 3), (v) the regular icosahedron, with 20 triangular faces (p = 3, q...
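
The face counts listed can be recovered from p and q alone. The sketch below assumes the usual reading, which the truncated snippet does not spell out: each face has p edges and q faces meet at each vertex. Combining this with Euler's formula V − E + F = 2 gives E = 2pq / (4 − (p − 2)(q − 2)).

```python
def solid_counts(p: int, q: int):
    """Return (vertices, edges, faces) for a regular solid with p-sided faces, q at each vertex."""
    denom = 4 - (p - 2) * (q - 2)
    if denom <= 0:
        raise ValueError("no convex regular solid for this (p, q)")
    edges = 2 * p * q // denom
    faces = 2 * edges // p       # each edge borders two faces
    vertices = 2 * edges // q    # each edge joins two vertices
    return vertices, edges, faces


solids = {
    "tetrahedron": (3, 3),
    "cube": (4, 3),
    "octahedron": (3, 4),
    "dodecahedron": (5, 3),
    "icosahedron": (3, 5),
}
counts = {name: solid_counts(p, q) for name, (p, q) in solids.items()}
for name, (v, e, f) in counts.items():
    print(name, v, e, f)
```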

## binary prefixes Quick reference

### A Dictionary of Computer Science (7 ed.)

...In computing, it became common to use the prefix ‘kilo-’ to mean $2^{10}$, so one kilobit was 1024 bits (not 1000 bits). This was extended to larger prefixes, so ‘mega-’ in computing is taken to be $2^{20}$ (1 048 576) rather than $10^6$ (1 000 000). However, there is a variation in usage depending on the context. In discussing memory capacities megabyte generally means $2^{20}$ bytes, but in disk storage (and data transmission) megabyte is often taken to mean $10^6$ bytes. (In some contexts, as in the capacity of a floppy disk, it has even been quoted as 1 024 000 bytes,...
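
The three competing values of "mega" can be compared directly. The constant names are illustrative, and reading 1 024 000 as 1000 × 1024 is simple arithmetic offered as an observation rather than something the snippet states.

```python
KILO_BINARY = 2 ** 10          # 1024
MEGA_BINARY = 2 ** 20          # 1 048 576
MEGA_DECIMAL = 10 ** 6         # 1 000 000
MEGA_MIXED = 1000 * 1024       # 1 024 000, the floppy-disk figure

# Relative excess of the binary megabyte over the decimal one.
excess = (MEGA_BINARY - MEGA_DECIMAL) / MEGA_DECIMAL
print(KILO_BINARY, MEGA_BINARY, MEGA_DECIMAL, MEGA_MIXED, f"{excess:.1%}")
```

The binary and decimal megabytes differ by a little under 5%, which is why the context matters when capacities are quoted.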

## reliability Quick reference

### A Dictionary of Statistics (3 ed.)

.... Alternatively, if a variety of scores are possible on test $j$, let $s_j^2$ be the variance of those obtained. An approximation to the reliability coefficient is provided by Cronbach’s alpha, given by

and other approximations are provided by the Kuder–Richardson formulae KR 20 and KR 21 (named after the equation numbers in Kuder and Richardson's 1937 paper):

An alternative approach is the split-half method, in which each test provides two scores (for example, the score on the even questions and the score on the corresponding odd questions). Let ...

## runs test Quick reference

### A Dictionary of Statistics (3 ed.)

...null hypothesis, for reasonably large $m$ and $n$, this has an approximate normal distribution, with mean and variance equal to

$$1 + \frac{2mn}{m+n} \quad\text{and}\quad \frac{2mn(2mn - m - n)}{(m+n)^2(m+n-1)},$$

respectively. Since the count is an integer, a continuity correction of 0.5 will be needed. As an example, suppose that the reaction times, in ms, of 20 girls and 25 boys were:

Girls: 428, 444, 446, 479, 492, 513, 522, 533, 544, 545, 560, 566, 581, 582, 590, 595, 599, 612, 634, 655

Boys: 415, 439, 442, 477, 500, 512, 523, 532, 577, 580, 613, 614, 622, 633, 670, 671, 680, 688, 701, 703, 722, 730, 744, 750, 777

The null hypothesis is...
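
The run count for this example can be obtained by pooling and sorting the two samples. The sketch below assumes the standard two-sample runs test moments, mean 1 + 2mn/(m+n) and variance 2mn(2mn − m − n)/((m+n)²(m+n−1)), and applies the 0.5 continuity correction towards the mean; which tail the entry goes on to use is not visible in the snippet, so only the standardized value is computed.

```python
girls = [428, 444, 446, 479, 492, 513, 522, 533, 544, 545,
         560, 566, 581, 582, 590, 595, 599, 612, 634, 655]
boys = [415, 439, 442, 477, 500, 512, 523, 532, 577, 580, 613, 614, 622,
        633, 670, 671, 680, 688, 701, 703, 722, 730, 744, 750, 777]

# Pool the observations, sort by time, and keep only the group labels.
pooled = sorted([(t, "G") for t in girls] + [(t, "B") for t in boys])
labels = [label for _, label in pooled]

# A new run starts wherever the label changes.
runs = 1 + sum(a != b for a, b in zip(labels, labels[1:]))

m, n = len(girls), len(boys)
mean = 1 + 2 * m * n / (m + n)
variance = 2 * m * n * (2 * m * n - m - n) / ((m + n) ** 2 * (m + n - 1))

# Continuity correction: move the observed count 0.5 towards the mean.
correction = 0.5 if runs < mean else -0.5
z = (runs + correction - mean) / variance ** 0.5
print(runs, round(mean, 2), round(variance, 2), round(z, 2))
```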

## Mantel–Haenszel test Quick reference

### A Dictionary of Statistics (3 ed.)

...The test statistic, $M$, is computed as follows. Denote by $f_{jkl}$ the number of patients in class $l$ (= 1, 2, …, $L$) who experience outcome $j$ (= 1 or 2) when given treatment $k$ (= 1 or 2). Write $f_{0kl} = f_{1kl} + f_{2kl}$, $f_{j0l} = f_{j1l} + f_{j2l}$, and $f_{00l} = f_{10l} + f_{20l}$. Then

Under the null hypothesis, the distribution of $M$ is approximated by a chi-squared distribution with one degree of freedom. When $L = 1$ the test is equivalent to the Yates-corrected chi-squared test (see two-by-two table). The ½ term is a continuity correction....

## wavelet Quick reference

### A Dictionary of Statistics (3 ed.)

...most used were introduced by Daubechies in 1988. These wavelets have fractal properties. Other families of wavelets include symmlets and coiflets. Mother wavelets. The D 2 wavelet is the Haar wavelet. The wavelets illustrated here are the D 4, D 12, and D 20 wavelets. All are members of the Daubechies...

## chi-squared test Quick reference

### A Dictionary of Statistics (3 ed.)

...if there are too many small expected frequencies. As an example, suppose it is hypothesized that a type of sweet pea occurs in shades of white, red, pink, and blue, with proportions ¼, p, (¾ − 3p), and 2p, respectively. A random sample of 120 seeds is sown. All germinate with 20 having white flowers, 10 having red flowers, 40 pink, and 50 blue. The question is whether these results are consistent with the hypothesis. In this case the maximum likelihood estimate of p is 0.15, so the expected frequencies are 30, 18, 36, and 36. Thus

$$X^2 = \frac{(20-30)^2}{30} + \frac{(10-18)^2}{18} + \frac{(40-36)^2}{36} + \frac{(50-36)^2}{36} = 12.78.$$

There are 4−1−1=2...
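
The expected frequencies and the test statistic for this example can be recomputed as follows. The sketch takes the quoted estimate p = 0.15 as given rather than deriving it, and uses the usual sum of (observed − expected)² / expected; variable names are illustrative.

```python
observed = {"white": 20, "red": 10, "pink": 40, "blue": 50}
p_hat = 0.15   # maximum likelihood estimate quoted in the entry

n = sum(observed.values())
proportions = {
    "white": 0.25,
    "red": p_hat,
    "pink": 0.75 - 3 * p_hat,
    "blue": 2 * p_hat,
}
expected = {colour: n * q for colour, q in proportions.items()}

# Pearson chi-squared statistic.
chi_squared = sum(
    (observed[c] - expected[c]) ** 2 / expected[c] for c in observed
)

# Categories minus 1, minus 1 more for the estimated parameter p.
degrees_of_freedom = len(observed) - 1 - 1
print(expected, round(chi_squared, 2), degrees_of_freedom)
```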