Median Routine

Derive has numerous built-in statistical functions. At the time of writing, it does not have a median function. Writing one is a very simple routine that demonstrates some basic array (also known as vector) manipulation. Such vector manipulation is very important in many aspects of mathematical programming.

The three main measures of central tendency of a group of data are the mean (average), the median and the mode. Derive already has an AVERAGE() function. Other useful built-in Derive functions are SUM, DIM, SORT and SUB.

Suppose we have a one-dimensional matrix. This is also called a vector, or an array. For the purposes of this exercise, it does not matter whether it is a column or a row vector. However, measures of central tendency assume we are dealing with numbers. Take the vector:

a := [4, 5, -12, 1, -9, 20]

The mean, median and mode are intuitive. To find the mean (average), Derive simply adds up the elements and divides by the number of elements. Here the elements sum to 9 and there are 6 of them. Hence,

AVERAGE(a) = 9/6 = 1.5
DIM(a) = 6
a SUB 3 = -12

Here, DIM(a) is the DIMENSION, or number of elements, of 'a'. The 'SUB' command identifies the individual vector elements: a SUB 3 is the third element of vector a, in this case -12.

If the dimension of a vector is odd, the median is the middle value of the ordered vector. If the number of elements is even, we take the two middle values of the sorted vector and find their average. To find the median, we sort the vector, find its dimension, and then take either the middle element (odd dimension) or the average of the two middle elements (even dimension).
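Such a function could be coded, in the first instance, as a one-line entry into the Derive author line:

median(a, n) := PROG(a := SORT(a), n := DIM(a), IF(ODD?(n), a SUB ((n + 1)/2), (a SUB (n/2) + a SUB (n/2 + 1))/2))

Now let's break down the program to see how it works. SORT(a) puts the elements in order, and DIM(a) stores the number of elements in n. The IF function then returns the middle element when n is odd, and the average of the two middle elements otherwise.

A more imaginative approach, again as a one-line entry into the Derive author line, would be:

median(a, m) := PROG(a := SORT(a), m := (DIM(a) + 1)/2, (a SUB FLOOR(m) + a SUB CEILING(m))/2)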
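This version removes the need for an IF function in the program, but uses the FLOOR() and CEILING() functions. When the dimension is odd, m is a whole number, so FLOOR(m) and CEILING(m) both select the middle element and the average is that element itself. When the dimension is even, m falls halfway between two integers, so FLOOR(m) and CEILING(m) select the two middle elements. The CEILING(m) function simplifies to the smallest integer greater than or equal to m, and FLOOR(m) to the largest integer less than or equal to m. For example, CEILING(3.141) = 4 but CEILING(-3.141) = -3.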
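As a cross-check outside Derive, the IF-based version and the FLOOR/CEILING version can be sketched in Python. This is only an illustrative sketch: the function names median_if and median_floor_ceiling are made up for this example, and because Python lists are indexed from 0 while Derive's SUB counts from 1, every index is shifted down by one.

```python
import math


def median_if(values):
    """IF-based version: branch on whether the element count is odd."""
    a = sorted(values)              # plays the role of SORT(a)
    n = len(a)                      # plays the role of DIM(a)
    if n % 2 == 1:
        # Derive: a SUB ((n + 1)/2); subtract 1 for 0-based indexing
        return a[(n + 1) // 2 - 1]
    # Derive: (a SUB (n/2) + a SUB (n/2 + 1))/2
    return (a[n // 2 - 1] + a[n // 2]) / 2


def median_floor_ceiling(values):
    """Branch-free version: average the FLOOR(m)-th and CEILING(m)-th elements."""
    a = sorted(values)
    m = (len(a) + 1) / 2            # whole number for odd counts, x.5 for even counts
    return (a[math.floor(m) - 1] + a[math.ceil(m) - 1]) / 2


a = [4, 5, -12, 1, -9, 20]
print(median_if(a), median_floor_ceiling(a))  # prints: 2.5 2.5
```

For the example vector, the sorted order is [-12, -9, 1, 4, 5, 20], so both versions average the third and fourth elements, 1 and 4. Neither sketch handles an empty list.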
Or we could cunningly program the median, in 1-D entry line format, with a version that uses UPDATE operators and the REVERSE(v) function, which reverses the elements of the vector v, e.g. REVERSE([1, 2, 3]) = [3, 2, 1]. In that version, CEILING(DIM(a), 2) is exactly the same as CEILING(DIM(a)/2).

It may be a good idea to test whether any of the elements of the vector are not numbers. A quick way to do this would be to insert the line IF(DIM(VARIABLES(a)) > 0, RETURN false) at the beginning of the program. The VARIABLES() command returns a vector of all the variables in a, so if there are no variables in a, i.e. all the elements are numbers, then DIM(VARIABLES(a)) = 0. If there are variables or strings in a, then DIM(VARIABLES(a)) > 0.
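The same guard can be sketched in Python as a rough analogue. This is not a translation of VARIABLES(): it assumes that "is a number" can be approximated by a type check, the name median_checked is hypothetical, and rejecting an empty vector is an extra assumption that the text above does not make.

```python
from numbers import Real


def median_checked(values):
    """Return the median, or False if any element is not a real number."""
    # Analogue of IF(DIM(VARIABLES(a)) > 0, RETURN false): give up early
    # when an element is not numeric. bool is excluded explicitly because
    # Python treats True and False as integers.
    if not values or not all(
        isinstance(x, Real) and not isinstance(x, bool) for x in values
    ):
        return False
    a = sorted(values)
    n = len(a)
    # 0-based indices of the two middle elements; they coincide when n is odd
    return (a[(n - 1) // 2] + a[n // 2]) / 2


print(median_checked([4, 5, -12, 1, -9, 20]))  # prints: 2.5
print(median_checked([4, "x", 1]))             # prints: False
```

Returning False mirrors the RETURN false in the Derive line. Since False compares equal to 0 in Python, a caller that needs to tell a rejected input from a median of 0 should test the result with `is False`.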